Volume 91 Number 2 Published monthly by the American Psychological Association
August 2006
ISSN 0022-3514
Journal of Personality and Social Psychology
www.apa.org/journals/psp.html

ATTITUDES AND SOCIAL COGNITION
Charles M. Judd, Editor
Dacher Keltner, Associate Editor
Anne Maass, Associate Editor
Bernd Wittenbrink, Associate Editor
Vincent Yzerbyt, Associate Editor

INTERPERSONAL RELATIONS AND GROUP PROCESSES
John F. Dovidio, Editor
Daphne Blunt Bugental, Associate Editor
Beverley Fehr, Associate Editor
Jacques-Philippe Leyens, Associate Editor
Antony Manstead, Associate Editor
Jeffry A. Simpson, Associate Editor
Scott Tindale, Associate Editor
Jacquie D. Vorauer, Associate Editor

PERSONALITY PROCESSES AND INDIVIDUAL DIFFERENCES
Charles S. Carver, Editor
Tim Kasser, Associate Editor
Mario Mikulincer, Associate Editor
Eva M. Pomerantz, Associate Editor
Richard W. Robins, Associate Editor
Gerard Saucier, Associate Editor
Thomas A. Widiger, Associate Editor
The Journal of Personality and Social Psychology publishes original papers in all areas of personality and social psychology. It emphasizes empirical reports but may include specialized theoretical, methodological, and review papers. The journal is divided into three independently edited sections:

• ATTITUDES AND SOCIAL COGNITION addresses those domains of social behavior in which cognition plays a major role, including the interface of cognition with overt behavior, affect, and motivation. Among topics covered are the formation, change, and utilization of attitudes, attributions, and stereotypes, person memory, self-regulation, and the origins and consequences of moods and emotions insofar as these interact with cognition. Of interest also is the influence of cognition and its various interfaces on significant social phenomena such as persuasion, communication, prejudice, social development, and cultural trends.

• INTERPERSONAL RELATIONS AND GROUP PROCESSES focuses on psychological and structural features of interaction in dyads and groups. Appropriate to this section are papers on the nature and dynamics of interactions and social relationships, including interpersonal attraction, communication, emotion, and relationship development, and on group and organizational processes such as social influence, group decision making and task performance, intergroup relations, and aggression, prosocial behavior and other types of social behavior.

• PERSONALITY PROCESSES AND INDIVIDUAL DIFFERENCES publishes research on all aspects of personality psychology. It includes studies of individual differences and basic processes in behavior, emotions, coping, health, motivation, and other phenomena that reflect personality. Articles in areas such as personality structure, personality development, and personality assessment are also appropriate to this section of the journal, as are studies of the interplay of culture and personality and manifestations of personality in everyday behavior.

Manuscripts: Submit manuscripts to the appropriate section editor according to the above definitions and according to the Instructions to Authors. Section editors reserve the right to redirect papers among themselves as appropriate unless an author specifically requests otherwise. Rejection by one section editor is considered rejection by all; therefore a manuscript rejected by one section editor should not be submitted to another. The opinions and statements published are the responsibility of the authors, and such opinions and statements do not necessarily represent the policies of APA or the views of the editors. Section editors’ addresses appear below:
ATTITUDES AND SOCIAL COGNITION Charles M. Judd, Editor c/o Laurie Hawkins Department of Psychology University of Colorado UCB 345 Boulder, CO 80309
INTERPERSONAL RELATIONS AND GROUP PROCESSES John F. Dovidio, Editor Department of Psychology University of Connecticut 406 Babbidge Road Storrs, CT 06269-1020
PERSONALITY PROCESSES AND INDIVIDUAL DIFFERENCES Charles S. Carver, Editor ATTN: JPSP: PPID Department of Psychology University of Miami P.O. Box 248185 Coral Gables, FL 33124-0751

Change of Address: Send change of address notice and a recent mailing label to the attention of the Subscriptions Department, American Psychological Association, 30 days prior to the actual change of address. APA will not replace undelivered copies resulting from address changes; journals will be forwarded only if subscribers notify the local post office in writing that they will guarantee periodicals forwarding postage.

Electronic access: APA members who subscribe to this journal have automatic access to a 3-year file of the journal in the PsycARTICLES® full-text database. See http://members.apa.org/access.

Reprints: Authors may order reprints of their articles from the printer when they receive proofs.

Single Issues, Back Issues, and Back Volumes: For information regarding back issues or back volumes write to Order Department, American Psychological Association, 750 First Street, NE, Washington, DC 20002-4242.

Microform Editions: For information regarding microform editions, write to University Microfilms, Ann Arbor, MI 48106.

Copyright and Permission: Those who wish to reuse APA-copyrighted material in a non-APA publication must secure from APA and the author of reproduced material written permission to reproduce a journal article in full or journal text of more than 500 words. APA normally grants permission contingent upon like permission of the author, inclusion of the APA copyright notice on the first page of reproduced material, and payment of a fee of $20 per page. Permission from APA and fees are waived for those who wish to reproduce a single table or figure from a journal for use in a print product, provided the author’s permission is obtained and full credit is given to APA as copyright holder and to the author through a complete citation. (Requesters requiring written permission for commercial use of a single table or figure will be assessed a $25 service fee.) Permission and fees are waived for authors who wish to reproduce their own material for personal use; fees only are waived for authors who wish to use more than a single table or figure of their own material commercially (but for use in edited books, fees are waived for the author only if serving as the book editor). Permission and fees are waived for the photocopying of isolated journal articles for nonprofit classroom or library reserve use by instructors and educational institutions. A permission fee may be charged to the requester if students are charged for the material, multiple articles are copied, or large-scale copying is involved (e.g., for course packs). Access services may use unedited abstracts without the permission of APA or the author. Libraries are permitted to photocopy beyond the limits of U.S. copyright law: (1) post-1977 articles, provided the per-copy fee in the code for this journal (0022-3514/06/$12.00) is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923; (2) pre-1978 articles, provided that the per-copy fee stated in the Publishers’ Fee List is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923. Address requests for reprint permission to the Permissions Office, American Psychological Association, 750 First Street, NE, Washington, DC 20002-4242.

APA Journal Staff: Susan J. A. Harris, Senior Director, Journals Program; Greg Long, Production Account Manager; Julie Palmer-Hoffman, Manuscript Editor; Jodi Ashcraft, Advertising Sales Manager.
Journal of Personality and Social Psychology (ISSN 0022-3514) is published monthly in two volumes per year by the American Psychological Association, 750 First Street, NE, Washington, DC 20002-4242. Subscriptions are available on a calendar year basis only (January through December). The 2006 rates follow: Nonmember Individual: $421 Domestic, $464 Foreign, $491 Air Mail. Institutional: $1,249 Domestic, $1,340 Foreign, $1,367 Air Mail. APA Member: $202. Write to Subscriptions Department, American Psychological Association, 750 First Street, NE, Washington, DC 20002-4242. Printed in the U.S.A. Periodicals postage paid at Washington, DC, and at additional mailing offices. POSTMASTER: Send address changes to Journal of Personality and Social Psychology, 750 First Street, NE, Washington, DC 20002-4242.
The paper in this journal meets or exceeds EPA guidelines for recycled paper. Since 1986, this journal has been printed on acid-free paper.
APA DICTIONARY OF PSYCHOLOGY Editor-in-Chief: Gary R. VandenBos 2006. 1,008 pages. Hardcover. List: $49.95 • APA Member/Affiliate: $39.95 ISBN 1-59147-380-2 ISBN-13: 978-1-59147-380-0 Item # 4311007
BECOMING CULTURALLY ORIENTED Practical Advice for Psychologists and Educators Nadya A. Fouad and Patricia Arredondo 2007. 208 pages. Hardcover. List: $49.95 • APA Member/Affiliate: $39.95 ISBN 1-59147-424-8 ISBN-13: 978-1-59147-424-1 Item # 4317114
SPIRITUAL APPROACHES IN THE TREATMENT OF WOMEN WITH EATING DISORDERS P. Scott Richards, Randy K. Hardman, and Michael E. Berrett 2007. 312 pages. Hardcover. List: $59.95 • APA Member/Affiliate: $49.95 ISBN 1-59147-393-4 ISBN-13: 978-1-59147-393-0 Item # 4317103
GRADUATE STUDY IN PSYCHOLOGY, 2007 2007. 832 pages. Paperback. List: $24.95 • APA Member/Affiliate: $21.95 ISBN 1-59147-423-X ISBN-13: 978-1-59147-423-4 Item # 4270090
DISORDERS OF THE SELF A Personality-Guided Approach Marshall L. Silverstein 2007. 320 pages. Hardcover. List: $69.95 • APA Member/Affiliate: $49.95 ISBN 1-59147-430-2 ISBN-13: 978-1-59147-430-2 Item # 4317116
HANDBOOK OF COUNSELING AND PSYCHOTHERAPY WITH LESBIAN, GAY, BISEXUAL, AND TRANSGENDER CLIENTS Second Edition Edited by Kathleen J. Bieschke, Ruperto M. Perez, and Kurt A. DeBord 2007. 464 pages. Hardcover. List: $79.95 • APA Member/Affiliate: $49.95 ISBN 1-59147-421-3 ISBN-13: 978-1-59147-421-0 Item # 4317113
INTERVENING IN CHILDREN’S LIVES An Ecological, Family-Centered Approach to Mental Health Care Thomas J. Dishion and Elizabeth A. Stormshak 2007. 320 pages. Hardcover. List: $69.95 • APA Member/Affiliate: $49.95 ISBN 1-59147-428-0 ISBN-13: 978-1-59147-428-9 Item # 4317115
PREVENTING YOUTH SUBSTANCE ABUSE Science-Based Programs for Children and Adolescents Edited by Patrick Tolan, José Szapocznik, and Soledad Sambrano 2007. 264 pages. Hardcover. List: $69.95 • APA Member/Affiliate: $49.95 ISBN 1-59147-307-1 ISBN-13: 978-1-59147-307-7 Item # 4316058
SCIENTIFIC JURY SELECTION Joel D. Lieberman and Bruce D. Sales 2007. 264 pages. Hardcover. List: $79.95 • APA Member/Affiliate: $49.95 ISBN 1-59147-427-2 ISBN-13: 978-1-59147-427-2 Item # 4316081
PRIMATE PERSPECTIVES ON BEHAVIOR AND COGNITION Edited by David A. Washburn 2007. 368 pages. Hardcover. List: $79.95 • APA Member/Affiliate: $49.95 ISBN 1-59147-422-1 ISBN-13: 978-1-59147-422-7 Item # 4318035
CHILD DEVELOPMENT AND SOCIAL POLICY Knowledge for Action Edited by J. Lawrence Aber, Sandra J. Bishop-Josef, Stephanie M. Jones, Kathryn Taaffe McLearn, and Deborah A. Phillips 2007. 352 pages. Hardcover. List: $79.95 • APA Member/Affiliate: $49.95 ISBN 1-59147-425-6 ISBN-13: 978-1-59147-425-8 Item # 4318036
1-800-374-2721 • WWW.APA.ORG/BOOKS
apa books
Available July 2006
A Landmark Reference That Defines the Lexicon of Psychology
APA Dictionary of Psychology Editor-in-Chief: Gary R. VandenBos, PhD The American Psychological Association is proud to announce the publication of an invaluable addition to your reference shelf, one that represents a major scholarly and editorial undertaking. With over 25,000 terms and definitions, the APA Dictionary of Psychology encompasses all areas of research and application, and includes coverage of concepts, processes, and therapies across all the major subdisciplines of psychology. Ten years in the making and edited by a distinguished editorial board of nearly 100 psychological scholars, researchers and practitioners, the Dictionary is destined to become the most authoritative reference of its kind. Academicians, researchers, clinicians, undergraduates and graduate students, and professionals in allied mental health, education, medicine, and law, as well as academic and public libraries, will find the Dictionary essential. 2006. Hardcover. 1,008 pages. List: $49.95 | APA Member/Affiliate: $39.95 ISBN 1-59147-380-2 | Item # 4311007
The APA Dictionary of Psychology includes
• 25,000 entries offering clear and authoritative definitions
• Thousands of incisive cross-references directing the user to synonyms and antonyms, acronyms and abbreviations, and related terms and concepts that deepen the user’s understanding of related topics
• Balanced coverage of over 100 subject areas across the field of psychology including clinical, experimental, neuropsychology, cognitive, personality and social, developmental, health, psychopharmacology, methodology and statistics, and many others
• Entries include nearly 8,000 terms from the APA’s Thesaurus of Psychological Index Terms® which helps students and researchers refine their APA database searches (such as the flagship PsycINFO® bibliographic database’s 2+ million records)
• “A Guide to Use” and “Quick Guide to Format” that together explain important stylistic and format features to help readers most effectively use the Dictionary
• Each of four appendices gathers terms into a thematic summary listing, covering (1) biographies; (2) institutions, associations and organizations; (3) psychological therapies and interventions; and (4) psychological tests and assessment instruments
800-374-2721 www.apa.org/books
Journal of Personality and Social Psychology
www.apa.org/journals/psp.html August 2006 VOLUME 91 NUMBER 2
Copyright © 2006 by the American Psychological Association
Attitudes and Social Cognition

205 The Role of Task Demands and Processing Resources in the Use of Base-Rate and Individuating Information
Woo Young Chun and Arie W. Kruglanski

218 Everyday Magical Powers: The Role of Apparent Mental Causation in the Overestimation of Personal Influence
Emily Pronin, Daniel M. Wegner, Kimberly McCarthy, and Sylvia Rodriguez

232 Subgoals as Substitutes or Complements: The Role of Goal Accessibility
Ayelet Fishbach, Ravi Dhar, and Ying Zhang

243 Distinguishing Stereotype Threat From Priming Effects: On the Role of the Social Self and Threat-Based Concerns
David M. Marx and Diederik A. Stapel

Interpersonal Relations and Group Processes

255 Does Who You Marry Matter for Your Health? Influence of Patients’ and Spouses’ Personality on Their Partners’ Psychological Well-Being Following Coronary Artery Bypass Surgery
John M. Ruiz, Karen A. Matthews, Michael F. Scheier, and Richard Schulz

268 From Automatic Antigay Prejudice to Behavior: The Moderating Role of Conscious Beliefs About Gender and Behavioral Control
Nilanjana Dasgupta and Luis M. Rivera

281 Going Along Versus Going Alone: When Fundamental Motives Facilitate Strategic (Non)Conformity
Vladas Griskevicius, Noah J. Goldstein, Chad R. Mortensen, Robert B. Cialdini, and Douglas T. Kenrick

295 Evidence for Strong Dissociation Between Emotion and Facial Displays: The Case of Surprise
Rainer Reisenzein, Sandra Bördgen, Thomas Holtbernd, and Denise Matz

Personality Processes and Individual Differences

316 The Evolutionary Significance of Depressive Symptoms: Different Adverse Situations Lead to Different Depressive Symptom Patterns
Matthew C. Keller and Randolph M. Nesse

331 It’s Not Just the Amount That Counts: Balanced Need Satisfaction Also Affects Well-Being
Kennon M. Sheldon and Christopher P. Niemiec

342 Expect the Unexpected: Ability, Attitude, and Responsiveness to Hypnosis
Grant Benham, Erik Z. Woody, K. Shannon Wilson, and Michael R. Nash

351 Conceptual Beliefs About Human Values and Their Implications: Human Nature Beliefs Predict Value Importance, Value Trade-Offs, and Responses to Value-Laden Rhetoric
Paul G. Bain, Yoshihisa Kashima, and Nick Haslam

Other

217 American Psychological Association Subscription Claims Information
315 E-Mail Notification of Your Latest Issue Online!
242 Instructions to Authors
ii Subscription Order Form
ATTITUDES AND SOCIAL COGNITION CHARLES M. JUDD, Editor University of Colorado at Boulder ASSOCIATE EDITORS DACHER KELTNER University of California, Berkeley ANNE MAASS Università di Padova, Padova, Italy BERND WITTENBRINK University of Chicago VINCENT YZERBYT Catholic University of Louvain, Louvain-la-Neuve, Belgium CONSULTING EDITORS ICEK AJZEN University of Massachusetts
ALICE H. EAGLY Northwestern University
NIRA LIBERMAN Tel Aviv University, Tel Aviv, Israel
LINDA SKITKA University of Illinois at Chicago
NICHOLAS EPLEY University of Chicago
DIANE M. MACKIE University of California, Santa Barbara
JOHN SKOWRONSKI Northern Illinois University
RUSSELL H. FAZIO Ohio State University
NEIL MACRAE Dartmouth College
ELIOT R. SMITH Indiana University Bloomington
LISA FELDMAN BARRETT Boston College
TONY MANSTEAD Cardiff University, Cardiff, Wales
SUSAN T. FISKE Princeton University
THOMAS MUSSWEILER Universität Köln, Cologne, Germany
DIEDERIK STAPEL University of Groningen, Groningen, the Netherlands
BARBARA L. FREDRICKSON University of Michigan
JAMES M. OLSON University of Western Ontario, London, Ontario, Canada
WENDI GARDNER Northwestern University
MAHZARIN BANAJI Harvard University
BERNADETTE M. PARK University of Colorado at Boulder
DANIEL GILBERT Harvard University
MONICA BIERNAT University of Kansas
RICHARD E. PETTY Ohio State University
THOMAS GILOVICH Cornell University
IRENE V. BLAIR University of Colorado at Boulder
NEAL J. ROESE University of Illinois at Urbana–Champaign
ANTHONY G. GREENWALD University of Washington
GALEN V. BODENHAUSEN Northwestern University
DAVID L. HAMILTON University of California, Santa Barbara
MARKUS BRAUER LAPSCO, Université Blaise Pascal, Clermont-Ferrand, France
EDWARD R. HIRT Indiana University Bloomington
MARILYNN B. BREWER Ohio State University
TIFFANY ITO University of Colorado at Boulder
JOHN T. CACIOPPO University of Chicago
YOSHIHISA KASHIMA University of Melbourne, Victoria, Australia
OLIVIER CORNEILLE Catholic University of Louvain, Louvain-la-Neuve, Belgium
KARL CHRISTOPH KLAUER Albert-Ludwigs-Universität Freiburg, Freiburg, Germany
PATRICIA DEVINE University of Wisconsin—Madison AP DIJKSTERHUIS University of Amsterdam, Amsterdam, the Netherlands DAVID DUNNING Cornell University
MYRON ROTHBART University of Oregon LAURIE RUDMAN Rutgers, The State University of New Jersey MARK SCHALLER University of British Columbia, Vancouver, British Columbia, Canada TONI SCHMADER University of Arizona NORBERT SCHWARZ University of Michigan
ARIE W. KRUGLANSKI University of Maryland
GÜN R. SEMIN Free University, Amsterdam, the Netherlands
ALAN LAMBERT Washington University in St. Louis
JEFFREY W. SHERMAN University of California, Davis
JENNIFER LERNER Carnegie Mellon University
STEVEN J. SHERMAN Indiana University Bloomington
FRITZ STRACK Universität Würzburg, Würzburg, Germany ABRAHAM TESSER University of Georgia YAACOV TROPE New York University THERESA K. VESCIO Pennsylvania State University WILLIAM VON HIPPEL University of New South Wales, Sydney, Australia DUANE T. WEGENER Purdue University DANIEL M. WEGNER Harvard University DIRK WENTURA Saarland University, Saarbrücken, Germany DANIEL WIGBOLDUS Radboud University Nijmegen, Nijmegen, the Netherlands TIMOTHY D. WILSON University of Virginia PIOTR WINKIELMAN University of California, San Diego MARK P. ZANNA University of Waterloo, Waterloo, Ontario, Canada
ASSISTANT TO THE EDITOR—LAURIE HAWKINS
INTERPERSONAL RELATIONS AND GROUP PROCESSES JOHN F. DOVIDIO, Editor University of Connecticut ASSOCIATE EDITORS DAPHNE BLUNT BUGENTAL University of California, Santa Barbara BEVERLEY FEHR University of Winnipeg, Winnipeg, Manitoba, Canada JACQUES-PHILIPPE LEYENS Catholic University of Louvain, Louvain-la-Neuve, Belgium ANTONY MANSTEAD Cardiff University, Cardiff, United Kingdom JEFFRY A. SIMPSON University of Minnesota, Twin Cities Campus
ARTHUR ARON State University of New York at Stony Brook
RUPERT BROWN The University of Kent at Canterbury, Canterbury, England
XIMENA ARRIAGA Purdue University
LORNE CAMPBELL University of Western Ontario, London, Ontario, Canada
WINTON W. T. AU The Chinese University of Hong Kong, Shatin, Hong Kong MARK BALDWIN McGill University, Montreal, Quebec, Canada KIM BARTHOLOMEW Simon Fraser University, Burnaby, British Columbia, Canada C. DANIEL BATSON University of Kansas
SCOTT TINDALE Loyola University Chicago
B. ANNE BETTENCOURT University of Missouri—Columbia
JACQUIE D. VORAUER University of Manitoba, Winnipeg, Manitoba, Canada
GERD BOHNER Universität Bielefeld, Bielefeld, Germany
CONSULTING EDITORS DOMINIC ABRAMS University of Kent at Canterbury, Canterbury, England
NIALL BOLGER Columbia University
CHRIS AGNEW Purdue University
JONATHON D. BROWN University of Washington
NYLA R. BRANSCOMBE University of Kansas
SERENA CHEN University of California, Berkeley MARGARET CLARK Yale University CARSTEN DE DREU University of Amsterdam, Amsterdam, the Netherlands STÉPHANIE DEMOULIN Catholic University of Louvain, Louvain-la-Neuve, Belgium, and Belgian National Fund for Scientific Research, Brussels, Belgium
KLAUS FIEDLER University of Heidelberg, Heidelberg, Germany GARTH FLETCHER University of Canterbury, Christchurch, New Zealand SHELLY GABLE University of California, Los Angeles LOWELL GAERTNER University of Tennessee, Knoxville SAMUEL L. GAERTNER University of Delaware ADAM GALINSKY Northwestern University PETER GLICK Lawrence University STEPHANIE A. GOODWIN Purdue University
DAVID DESTENO Northeastern University
MARTIE G. HASSELTON University of California, Los Angeles
STEVE DRIGOTAS Johns Hopkins University
S. ALEXANDER HASLAM University of Exeter, Exeter, United Kingdom
ELISSA S. EPEL University of California, San Francisco VICTORIA ESSES University of Western Ontario, London, Ontario, Canada
(editors continue)
VERLIN HINSZ North Dakota State University GORDON HODSON Brock University, St. Catharines, Ontario, Canada
MICHAEL A. HOGG University of Queensland, Brisbane, Australia
LAURA J. KRAY University of California, Berkeley
ANDREA B. HOLLINGSHEAD University of Southern California JOHN G. HOLMES University of Waterloo, Waterloo, Ontario, Canada RICK H. HOYLE University of Kentucky
JAMES R. LARSON JR. University of Illinois at Chicago COLIN WAYNE LEACH University of Sussex, Sussex, United Kingdom JOHN LEVINE University of Pittsburgh JOHN E. LYDON McGill University, Montreal, Quebec, Canada
JOLANDA JETTEN University of Exeter, Exeter, United Kingdom
JON K. MANER Florida State University
JAMES D. JOHNSON University of North Carolina at Wilmington TATSUYA KAMEDA Hokkaido University, Sapporo, Japan BENJAMIN R. KARNEY RAND Corporation, Santa Monica, California YOSHI KASHIMA University of Melbourne, Victoria, Australia
BRENDA MAJOR University of California, Santa Barbara CRAIG MCGARTY Australian National University, Canberra, Australia WENDY BERRY MENDES Harvard University RICHARD MORELAND University of Pittsburgh
DEBORAH A. KASHY Michigan State University
SABINE OTTEN University of Groningen, Groningen, the Netherlands CRAIG D. PARKS Washington State University LOUIS A. PENNER Wayne State University PAULA PIETROMONACO University of Massachusetts at Amherst
CHRISTINE SMITH Grand Valley State University HEATHER J. SMITH Sonoma State University RUSSELL SPEARS Cardiff University, Cardiff, Wales CHARLES STANGOR University of Maryland GARY L. STASSER Miami University—Ohio
TOM POSTMES University of Exeter, Exeter, United Kingdom
WALTER STEPHAN New Mexico State University
FELICIA PRATTO University of Connecticut
WILLIAM B. SWANN JR. University of Texas at Austin
HARRY T. REIS University of Rochester
JANET SWIM Pennsylvania State University
W. STEVEN RHOLES Texas A&M University
LEIGH L. THOMPSON Northwestern University
JENNIFER A. RICHESON Northwestern University
TOM TYLER New York University
MARK SCHALLER University of British Columbia, Vancouver, British Columbia, Canada
JEROEN VAES University of Padova, Padova, Italy
KERRY KAWAKAMI York University, Toronto, Ontario, Canada
JANICE R. KELLY Purdue University
DACHER KELTNER University of California, Berkeley
DAVID A. KENNY University of Connecticut
BRIAN MULLEN University of Kent at Canterbury, Canterbury, England
AMÉLIE MUMMENDEY Friedrich-Schiller-Universität Jena, Jena, Germany
MARK MURAVEN University at Albany, State University of New York
DAVID A. SCHROEDER University of Arkansas
KEES VAN DEN BOS University of Utrecht, Utrecht, the Netherlands
CONSTANTINE SEDIKIDES University of Southampton, Southampton, England
PAUL A. M. VAN LANGE Free University, Amsterdam, Amsterdam, the Netherlands
PHILLIP R. SHAVER University of California, Davis
LAURIE R. WEINGART Carnegie Mellon University
J. NICOLE SHELTON Princeton University
GWEN M. WITTENBAUM Michigan State University
DOUGLAS T. KENRICK Arizona State University
SANDRA L. MURRAY State University of New York at Buffalo
MARGARET SHIH University of Michigan
NORBERT L. KERR Michigan State University
STACEY SINCLAIR University of Virginia
LISA A. NEFF University of Toledo
ASSISTANT TO THE EDITOR—CHRISTINE KELLY
WENDY L. WOOD Texas A&M University MICHAEL ZÁRATE University of Texas at El Paso
PERSONALITY PROCESSES AND INDIVIDUAL DIFFERENCES CHARLES S. CARVER, Editor University of Miami ASSOCIATE EDITORS TIM KASSER Knox College
GEORGE A. BONANNO Teachers College, Columbia University
MARIO MIKULINCER Bar-Ilan University, Ramat-Gan, Israel
EVA M. POMERANTZ University of Illinois at Urbana–Champaign
RICHARD W. ROBINS University of California, Davis
GERARD SAUCIER University of Oregon
THOMAS A. WIDIGER University of Kentucky
AVSHALOM CASPI King’s College, London
EDWARD C. CHANG University of Michigan
SERENA CHEN University of California, Berkeley A. TIMOTHY CHURCH Washington State University JAMES COAN University of Wisconsin—Madison M. LYNNE COOPER University of Missouri—Columbia
EDDIE HARMON-JONES Texas A&M University
DANIEL W. RUSSELL Iowa State University
TODD HEATHERTON Dartmouth College
OLIVER C. SCHULTHEISS University of Michigan
JUTTA HECKHAUSEN University of California, Irvine
SUZANNE C. SEGERSTROM University of Kentucky
STEVEN J. HEINE University of British Columbia, Vancouver, British Columbia, Canada
KENNON M. SHELDON University of Missouri—Columbia
RICHARD KOESTNER McGill University, Montreal, Quebec, Canada
C. R. SNYDER University of Kansas SANJAY SRIVASTAVA University of Oregon
DAVID LUBINSKI Vanderbilt University
TIMOTHY STRAUMAN Duke University
MICHAEL EID University of Geneva, Geneva, Switzerland
RICHARD E. LUCAS Michigan State University
MICHAEL J. STRUBE Washington University
ROBERT R. MCCRAE National Institute on Aging, Baltimore
JERRY SULS University of Iowa
ANDREW J. ELLIOT University of Rochester
WENDY BERRY MENDES Harvard University
WILLIAM B. SWANN JR. University of Texas at Austin
LISA FELDMAN BARRETT Boston College
RODOLFO MENDOZA-DENTON University of California, Berkeley
HOWARD TENNEN University of Connecticut Health Center
WILLIAM FLEESON Wake Forest University
DANIEL K. MROCZEK Fordham University
MICHAEL C. ASHTON Brock University, St. Catharines, Ontario, Canada
SUZANNE THOMPSON Pomona College
R. CHRIS FRALEY University of Illinois at Chicago
STEPHEN A. PETRILL Pennsylvania State University
OZLEM AYDUK University of California, Berkeley
ANTONIO L. FREITAS State University of New York at Stony Brook
RALPH L. PIEDMONT Loyola College in Maryland
ROBERT J. VALLERAND Université du Québec à Montréal, Montreal, Quebec, Canada
CONSULTING EDITORS STEPHAN A. AHADI American Institutes for Research, Washington, DC JAMIE ARNDT University of Missouri—Columbia JENS B. ASENDORPF Humboldt-Universität Berlin, Berlin, Germany
E. ASHBY PLANT Florida State University
ROY F. BAUMEISTER Florida State University VERÓNICA BENET-MARTÍNEZ University of California, Riverside
DAVID C. FUNDER University of California, Riverside STEVEN W. GANGESTAD University of New Mexico
BRENT ROBERTS University of Illinois at Urbana–Champaign
APRIL L. BLESKE-RECHEK University of Wisconsin—Eau Claire
CAROL L. GOHM University of Mississippi
MICHAEL D. ROBINSON North Dakota State University
ASSISTANT TO THE EDITOR—JESSICA LILLESAND
KATHLEEN D. VOHS University of Minnesota DAVID WATSON University of Iowa BARBARA WOIKE Columbia University REX A. WRIGHT University of Alabama at Birmingham
ATTITUDES AND SOCIAL COGNITION
The Role of Task Demands and Processing Resources in the Use of Base-Rate and Individuating Information

Woo Young Chun, Hallym University
Arie W. Kruglanski, University of Maryland
This article addresses the process that governs the use of base-rate and individuating information. Five experiments demonstrated that, for both, informational length and order of presentation (determining processing difficulty) interact with the recipients’ processing resources to determine use. In cases in which the base-rate or the individuating information is brief and/or is presented early, the tendency to use it is greater under limited cognitive resources (cognitive load) than under ample cognitive resources. In contrast, in cases in which the base-rate or the individuating information is lengthy and/or is presented late in the informational sequence, the tendency to use it is greater under ample versus limited resources. These results suggest the appropriateness of conceptually decoupling informational contents (having to do with base rates or individuating descriptions) from the task demands (processing ease or difficulty) that a given judgmental problem presents and that may require different amounts of processing resources.

Keywords: task demands, processing resources, base rate, individuating information, cognitive load
The concept of cognitive resources has figured importantly in social judgment models of the past several decades. In the 1970s and 1980s this concept was tied to the cognitive miser model implicitly adopted by many researchers in this domain. As Fiske and Taylor (1984) described it:
The idea is that people are limited in their capacity to process information, so they take shortcuts wherever they can. People adopt strategies that simplify complex problems; the strategies may not be normatively correct or produce normatively correct answers, but they emphasize efficiency. The capacity limited thinker searches for rapid adequate solutions rather than slow accurate solutions. Consequently, errors and biases stem from inherent features of the cognitive system. (p. 12)
Of interest, such rapid solutions often were assumed to use specific types of information differing in their contents from other, “normatively correct” types of information. In this vein, Kahneman and Tversky (1973; Tversky & Kahneman, 1974) highlighted the notion of judgmental heuristics and juxtaposed it to the use of statistical information. A heuristic that has received a considerable amount of research attention is that of representativeness. In essence, representativeness pertains to the degree to which an individuating description of a target is similar to, or fits within, a given category. For instance, consider the description of Steve as “very shy and withdrawn, invariably helpful, but with little interest in people, or in the world of reality. A meek and tidy soul, he has a need for order and structure, and a passion for detail” (Fiske & Taylor, 1991, p. 382). Such a portrayal is assumed to fit the category of “librarian” better than the categories of “farmer,” “trapeze artist,” “salvage diver,” or “surgeon.” In research described by Tversky and Kahneman (1974; for a review, see Kahneman, 2003), the use of such representativeness information was contrasted with the use of base-rate information, assumed in the normative model to determine the prior probability of outcomes. Thus,
If Steve lives in a town with lots of chicken farmers and only a few libraries, one’s judgment that he is a librarian should be tempered by this fact; that is, it is simply more likely that he is a chicken farmer than a librarian. Nonetheless, people . . . ignore prior probabilities and instead base their judgments solely on similarity, for example, the fact that Steve resembles a librarian. (Fiske & Taylor, 1991, pp. 382–383)
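The normative role of the base rate in the Steve example can be illustrated with Bayes' rule. The sketch below is only an illustration: the town composition (10% librarians, 90% chicken farmers), the assumption that the description is four times as typical of a librarian as of a farmer, and the function name `posterior_librarian` are all hypothetical and do not come from the article.

```python
def posterior_librarian(prior_librarian, likelihood_ratio):
    """Posterior P(librarian | description) in a two-category world.

    likelihood_ratio = P(description | librarian) / P(description | farmer).
    """
    prior_odds = prior_librarian / (1.0 - prior_librarian)
    posterior_odds = prior_odds * likelihood_ratio  # Bayes' rule in odds form
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical inputs: 10% librarians, description 4x more typical of librarians.
p = posterior_librarian(prior_librarian=0.10, likelihood_ratio=4.0)
print(round(p, 3))
```

With these made-up inputs the posterior stays below one half (about 0.31), so a judge who follows the normative model would still consider "chicken farmer" the more likely profession despite the librarian-like description.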
This research was supported by National Science Foundation Grant 0314291/0313483 and a research grant from Hallym University, Korea. We thank Rayoung Yoo, Susan Kurian, Jennifer Lacey, and Sondi Carter for their assistance in data collection. Correspondence concerning this article should be addressed to Woo Young Chun, Department of Psychology, Hallym University, 39 Hallymdaehak-Gil, Chunchen-Si, Gangwon-Do 200-702, Korea, or Arie W. Kruglanski, Department of Psychology, University of Maryland, College Park, MD 20742. E-mail: [email protected] or [email protected]
As already noted, the heuristic information differs in its contents from the statistical information. The narrative about Steve’s retiring personality, for example, is quite distinct contentwise from the information about the base rates of chicken farmers in Steve’s hometown. A reasonable question, therefore, is whether such informational contents should not be conceptually decoupled from the ease with which given information may be used. For instance, for some individuals, and/or for most individuals under some
Journal of Personality and Social Psychology, 2006, Vol. 91, No. 2, 205–217 Copyright 2006 by the American Psychological Association 0022-3514/06/$12.00 DOI: 10.1037/0022-3514.91.2.205
CHUN AND KRUGLANSKI
conditions, some statistical information might be easier to use than some individuating information. In such a case, a strict application of the cognitive miser model might predict the preferential use of statistical information over that of individuating (representativeness) information. Admittedly, these days the cognitive miser model is a thing of the past. In Fiske and Taylor’s (1991) words, As the cognitive miser viewpoint has matured, the importance of motivations and emotions has again become evident. The emerging view of the social perceiver, then, might best be termed the motivated tactician, a fully engaged thinker who has multiple cognitive strategies available and chooses among them based on goals, motives, and needs. Sometimes the motivated tactician chooses wisely, in the interest of adaptability and accuracy, and sometimes the motivated tactician chooses defensively in the interest of speed or self-esteem. (p. 13)
Unlike the cognitive miser model that views people in general as unwilling or unable to engage in a thorough processing of information, the motivated tactician perspective assumes a more flexible process wherein the availability of motivational and cognitive resources may vary across persons as well as across situations. Whereas in the absence of resources individuals may be drawn to various inferential shortcuts and heuristics, in the presence of resources they may be prepared to engage in a more extensive inferential work, utilizing for that purpose whatever information may furnish the most (subjectively) relevant evidence for their judgments (Pierro, Mannetti, Kruglanski, & Sleeth-Keppler, 2004).
Motivated Tactician Perspective in Models of Social Judgment
Several influential models in the realm of social judgment have implicitly embraced the motivated tactician perspective; they include the elaboration likelihood model (ELM; Petty & Cacioppo, 1986) and the heuristic systematic model (HSM; Chaiken, Liberman, & Eagly, 1989), both in the domain of persuasion; the impression formation models of Brewer (1988) and Fiske and Neuberg (1990; Fiske, Lin, & Neuberg, 1999); Kahneman’s (2003) recent model of intuitive versus rational judgments; and other similar frameworks (for a source book, see Chaiken & Trope, 1999). These models suggest that, depending on the availability of (cognitive or motivational) resources, individuals may use informational strategies affording brief and shallow or, alternatively, extensive and deep processing of the information given en route to a judgment.
Persuasion
For instance, in the ELM or the HSM, under conditions of low interest in the issue and/or low cognitive resources, individuals are assumed to engage in “peripheral” or “heuristic” processing, respectively; that is, they are assumed to base their judgments not on the persuasive message or the issue but on the communicator’s perceived expertise, attractiveness, the prevailing consensus, or the like. By contrast, under high issue interest and ample cognitive resources, individuals are assumed to engage in “central” or “systematic” processing (in the ELM and the HSM, respectively); that is, they are presumed to base their judgments on a thorough processing of information contained in the message or pertaining to the issue (for reviews, see Albarracin, Johnson, & Zanna, 2005; Kruglanski, Pierro, Mannetti, Erb, & Spiegel, in press; Kruglanski & Thompson, 1999a, 1999b).
Impression Formation
In impression formation models (Brewer, 1988; Brewer, Feinstein, & Harasty, 1999; Fiske et al., 1999; Fiske & Neuberg, 1990), the use of social category information is assumed to constitute the default, occurring on the perceiver’s initial encounter with the social stimulus and irrespective of motivational involvement conditions. Going beyond the categorization stage and on to the consideration of personal information about the target requires a sufficient degree of self-involvement and, hence, of processing motivation.
Judgment Under Uncertainty
Recently, Kahneman (2003) systematized the classic work on biases and heuristics (Kahneman & Tversky, 1973; Tversky & Kahneman, 1974) by distinguishing between “two generic modes of cognitive function: an intuitive mode in which judgments and decisions are made automatically and rapidly and a controlled mode, which is deliberate and slower” (Kahneman, 2003, p. 697). The former mode was referred to as System 1 processing and the latter as System 2 processing. As in the models reviewed above, the use of System 1 versus System 2 processing depends on the availability of resources. Thus, judgments mediated by System 1 are independent of resources insofar as they “are rapid, automatic and effortless” (Kahneman, 2003, p. 700). By contrast, judgments mediated by System 2 are resource dependent insofar as they constitute “slow serial and effortful operations that people need a special reason to undertake” (Kahneman, 2003, p. 700). In this conception, the use of heuristics is a definitional property of System 1 reasoning (Kahneman, 2003, p. 707). Of particular interest in the present context, Kahneman stated the following: “The word heuristic is used in two senses. The noun refers to the cognitive process, and the adjective in heuristic attribute specifies the attribute that is substituted in a particular judgment” (Kahneman, 2003, p. 707). As Kahneman recognized, then, the notion of System 1 reasoning refers both to a cognitive process involved in judgment and to informational contents of a specific kind. In other words, information about heuristic attributes is assumed to be processed heuristically; that is, swiftly, unconsciously, and automatically, presumably because of the ease of its processing stemming from its accessibility (Kahneman, 2003, p. 699). The implicit conflation of specific contents (or types) of information with that information’s ease of processing is common to numerous social judgment models.
For instance, the ELM and the HSM assume that in a given situation, peripheral cues or heuristic information, respectively, differ in their contents from message and issue information (they are, in fact, contrasted with message and issue information). Presumably, peripheral or heuristic cues are also easier to process than message and issue information in that they are assumed to be utilized under conditions of low-
USE OF BASE-RATE AND INDIVIDUATING INFORMATION
processing resources, whereas the message and issue information is presumed to be more difficult to process and, hence, to be utilized under conditions of high-processing resources. In the impression formation models, social category information is assumed to be “essentially perceptual, [and] rapid” (Fiske & Neuberg, 1990, pp. 5– 6); hence, it is easy to process. By contrast, the personal or individuating information about the target is assumed to require a greater investment of resources; hence, it presumably is more difficult to process. Again, these two types of information (social category and personal attribute information) clearly differ in their contents—as well as in their presumed ease of processing.
Decoupling Informational Contents From Processing Ease
Given the considerable centrality in social judgment models of the issue of processing resources, it seems important to ascertain (a) whether the ease of processing given information may not be conceptually and empirically separated from its contents and, if so, (b) which dimension authentically interacts with resources to determine the use of information in judgment formation. If ease can be decoupled from contents, and if ease rather than contents is what interacts with resources, a couple of interesting implications follow. The first is that the ease and/or difficulty of given informational contents may be altered. If so, information typically associated with ease could be rendered more difficult; hence, it could exert judgmental impact only under high motivational or cognitive resources, whereas typically it may have done so under low-resource conditions. Second, it would be important to disentangle whether in prior research in which some types of information exerted judgmental impact under limited resource conditions, whereas other types of information exerted judgmental impact under ample resource conditions, this was due to processing difficulty or due to informational contents. In this vein, Erb et al. (2003) observed that often in persuasion research the type of the information (i.e., peripheral or heuristic cues vs. message arguments) was confounded with processing ease. Because the message arguments were typically lengthier, more complex, and placed later in the informational sequence, their processing may have imposed higher processing demands than were imposed by the processing of cues that were invariably brief, simple, and presented up front.
When these confoundings were experimentally removed, the previously found differences between conditions under which the cues versus the message arguments (or vice versa) exerted their persuasive effects were eliminated (Erb et al., 2003; Kruglanski, Pierro, et al., in press; Kruglanski & Thompson, 1999a; Pierro, Mannetti, Erb, Spiegel, & Kruglanski, 2005). In particular, when message argument information was presented briefly and up front, and hence was relatively easy to process, it exerted persuasive impact under low-resource conditions, mimicking the effects of peripheral or heuristic information in prior research. Similarly, when communicator expertise information (typically assumed to pertain to the “experts are correct” heuristic) was presented in a lengthier manner or subsequent to the message information, and hence was more difficult to process, it exerted persuasive impact under high-resource conditions (i.e., under high issue motivation and/or in the absence of cognitive load), thus mimicking the effects of message argument information obtained in prior research (for a recent review, see Kruglanski, Pierro, et al., in press). These results are compatible with the possibility that what mattered in prior persuasion research was the ease or difficulty of information processing rather than the type of information processed (e.g., whether its contents were contained in the message, and hence classified as a message argument, or were external to the message, classifying it as a cue).
The Present Research: Ease-of-Processing Effects in the Base-Rate Neglect Paradigm
The present research carries the same logic to the realm of judgment under uncertainty, specifically addressing the phenomenon of base-rate neglect. We examined the possibility that one reason that research participants tend to use the individuating (representativeness) information and to neglect the base-rate information provided stems from ease-of-processing concerns. Consider the classic lawyer and engineer problem, in which base-rate neglect is often demonstrated. In a typical study addressing this issue, participants are provided with individuating (or representative) information about a target as well as with information about the base rates of engineers and lawyers in the sample from which the target was drawn. In judging whether the target is an engineer, for example, participants might use a representativeness rule that they deem relevant to a likelihood judgment, whereby if a target has characteristics a, b, and c, he or she is likely/unlikely to be an engineer. Alternatively, the participant might use a base-rate rule, if he or she deems it relevant, whereby if the base rate in the sample is X, the target is likely/unlikely to be an engineer. In the original demonstrations of base-rate neglect, participants were much more likely to base their likelihood judgment on the representativeness rule rather than on the base rates, evincing considerable base-rate neglect. One reason for such “favoritism” could be the relation between processing ease and the availability of resources considered earlier. Consider that in a typical lawyer and engineer study (e.g., Kahneman & Tversky, 1973), the base-rate information was presented to participants via a single sentence appearing up front. In contrast, the individuating (or representativeness) information usually followed the base-rate information and was conveyed via a relatively lengthy vignette.
As a consequence, the base-rate information may have been relatively easy to process compared with the individuating information. If one assumes that participants in the typical base-rate neglect studies possess sufficiently high degrees of processing motivation and cognitive capacity to consider all of the information provided, then it is possible that they process fully the more complex individuating information and, hence, are able to appreciate its relevance to the judgment at hand. A focus on the later, more complex information may have reduced the accessibility of the earlier information (Higgins, 1996), resulting in its “neglect,” just as in persuasion studies the early and easy-to-process cue information has often been neglected (exerting no significant effect on attitudes) under high-motivation and capacity conditions (e.g., Chaiken et al., 1989; Kruglanski & Thompson, 1999a, 1999b; Petty & Cacioppo, 1986). However, if processing difficulty matters, we should be able to increase or
decrease the use of statistical or individuating information by appropriately varying its processing difficulty and participants’ processing resources.
Base-Rate Utilization Research
The present focus on processing resources may contribute to our understanding of conditions affecting base-rate utilization. A considerable amount of research on this topic has been carried out since the base-rate neglect effect was strikingly demonstrated in Kahneman and Tversky’s seminal studies (Kahneman & Tversky, 1973; Tversky & Kahneman, 1974; see also Meehl & Rosen, 1955). A major moderator yielded by years of research was the perceived relevance of the base-rate information to the requisite judgments (cf. Bar-Hillel, 1983, 1990; Borgida & Brekke, 1980). Such relevance could be enhanced by imbuing the base rates with causal significance (Ajzen, 1977) or by making them more specific to the target judgments (e.g., Bar-Hillel, 1980; Lynch & Ofir, 1989). As Bar-Hillel (1990) aptly summarized:
Base-rates are by and large neglected if and when they are considered to be irrelevant to the prediction at hand . . . . [furthermore] in the . . . tasks that dominate laboratory studies of base-rate neglect—base rates provide only a general informational background on which other information, which typically pertains more directly or specifically to the target case, is added . . . . Such information . . . tends to render the arbitrary base rates subjectively irrelevant. (p. 201)
Though subjective relevance is essential, it appears that other moderators also play a role. Specifically, the same base rates that are neglected in the presence of individuating information are generally taken into account in its absence (Bar-Hillel, 1983, 1990; Borgida & Brekke, 1980; Trope & Ginossar, 1988), suggesting that people are basically aware of their judgmental relevance. Furthermore, even in the presence of individuating information, base rates may be used when conversational considerations (Grice, 1975) draw attention to their relevance (Schwarz, Strack, Hilton, & Naderer, 1991) or when such attention is engendered by a within-subject design in which participants judge one target in the context of two samples with different base rates (Fischhoff, Slovic, & Lichtenstein, 1979; for reviews, see Hilton, 1995; Hilton & Slugoski, 2001). These findings imply that even though the base-rate information given may be potentially relevant to the judgment from the participants’ perspective, participants may not appreciate its actual relevance in circumstances in which their attention is directed elsewhere (e.g., by conversational means). That potentially relevant information needs to be heeded or processed before it can exert a judgmental impact is attested by research on the impact of vividness or salience on base-rate utilization. The thesis that the utilization of base-rate information may increase as a function of its increased vividness or saliency (Nisbett & Borgida, 1975; Nisbett, Borgida, Crandall, & Reed, 1976) has been supported in work by Christensen-Szalanski and Beach (1982); Fischhoff, Slovic, and Lichtenstein (1979); and Ginossar and Trope (1987). Of interest, Manis, Dovalina, Aviv, and Cardoze (1980) failed to support the vividness hypothesis in a problem, according to Bar-Hillel (1983), where it has been established that people consider base rates to be irrelevant (p. 48).
It appears then that the vividness or saliency of information may not compensate for its perceived irrelevance, yet it may well enhance the use of information generally regarded as relevant to the targeted judgment. The foregoing review is consistent with our general formulation that potentially relevant information needs to be processed (hence, its relevance needs to be recognized) if it is to impact judgments in accordance with its implications. Whereas various (conversational- or saliency-based) means of attracting attention to the information given may increase the likelihood of it being processed, other factors may do so as well. Indeed, our major thesis presented earlier is that the likelihood of processing any given information, hence of its utilization in judgment, is a function of its processing difficulty relative to the individual’s resources. This notion is assumed to apply to any potentially relevant information, be it of a statistical (e.g., base rates) or of an individuating variety. Note that the present analysis has important novel implications. Contrary to the received view whereby heuristics are generally used under low-resource conditions, we suggest that in some circumstances, which are actually exemplified by a typical base-rate neglect study, they may be utilized under high-resource conditions, allowing the careful processing of the representativeness information. It also follows that should the individuals’ resources be reduced, individuals in a typical base-rate paradigm might increase their use of base-rate information; that is, they might exhibit a rational or normatively appropriate strategy under low-resource conditions, which is in contrast to the typical assumption that normative rationality is more likely under high-resource conditions.
Finally, the decoupling of ease of processing from the contents of the information processed suggests that varying the ease of processing the base-rate information should have the same effect on use of that information under varying resource conditions as does varying the ease of processing the individuating information. This too is a novel implication that has not been explored.
The Present Research: An Overview
We conducted five experimental studies to investigate these notions. In Experiment 1, we replicated the typical lawyer and engineer paradigm (Kahneman & Tversky, 1973) in one condition by presenting brief base-rate information up front followed by lengthier individuating information. In a contrasting condition, we reversed these relations by presenting brief individuating information first, followed by lengthier and more complex base-rate information. If our analysis has merit, the former condition should replicate the typical finding of base-rate neglect, whereas the latter condition should evince base-rate utilization.
Experiment 2 replicated Experiment 1 and added a manipulation of cognitive load. We expected to replicate the results of Experiment 1 in the low-load condition but to reverse them in the high-load condition. Regardless of information type, under high load we expected the brief and up-front information to be utilized to a greater extent than the lengthy and subsequent information, whereas under low load we expected the lengthy and subsequent information to be utilized more.
Experiment 3 investigated the effects of order while controlling for informational length. In one condition, base-rate information preceded the individuating information, whereas in a second condition, it followed the individuating information. Orthogonally, we manipulated cognitive load. We predicted that the early information (whether of the base-rate or the individuating variety) would exert greater effect under high load, whereas the later information (again irrespective of kind) would exert greater effect under low load.
Experiment 4 used two types of individuating information, which were presented in two sequences. In one sequence, brief information consistent with the engineer stereotype was followed by lengthy information consistent with the lawyer stereotype. In the alternative sequence, brief information consistent with the lawyer stereotype was followed by lengthy information consistent with the engineer stereotype. We predicted that a heightened cognitive load would prompt a reliance on the brief and up-front stereotype, whereas the absence of load would prompt a reliance on the lengthier and subsequent stereotype.
Experiment 5 presented participants with information about two samples. In one condition, the first sample consisting of 30% engineers and 70% lawyers (the 30/70 sample) was followed by a second sample consisting of 70% engineers and 30% lawyers (the 70/30 sample). In addition, we manipulated load. We predicted that the first sample would be relied upon more under high load and the second under low load. Hence, in the 30/70 sample, the judged likelihood that the target is an engineer should be lower under high (vs. low) load, whereas in the 70/30 sample, the judged likelihood of the target being an engineer should be higher under high (vs. low) load.
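The core prediction for Experiment 2 can be written out as a small lookup. This is only a schematic restatement of the hypothesis as described in the overview, not the authors' model or analysis; the function and dictionary names are hypothetical, and the rule is deliberately all-or-none, whereas the article speaks of greater or lesser use.

```python
def predicted_dominant_slot(load):
    """Return which information slot is predicted to carry more weight.

    Under high load the brief, up-front information is expected to dominate;
    under low load the lengthy, later information is expected to dominate.
    """
    if load not in ("high", "low"):
        raise ValueError("load must be 'high' or 'low'")
    return "early_brief" if load == "high" else "late_lengthy"


# Which kind of information occupies each slot in the two conditions.
CONDITIONS = {
    "replication": {"early_brief": "base rate", "late_lengthy": "individuating"},
    "reversal": {"early_brief": "individuating", "late_lengthy": "base rate"},
}


def predicted_information(condition, load):
    """Kind of information predicted to be used more in a given design cell."""
    return CONDITIONS[condition][predicted_dominant_slot(load)]


for condition in CONDITIONS:
    for load in ("low", "high"):
        print(condition, load, "->", predicted_information(condition, load))
```

The point of the lookup is that the prediction depends only on the slot (early and brief versus late and lengthy), not on whether the content is statistical or individuating.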
Experiment 1
Participants in one condition received the lawyer and engineer problem in the typical format in which brief base-rate information preceded more extensive individuating information. In a second condition, this pattern was reversed, so that brief individuating information preceded extensive base-rate information. If participants’ motivation in such a situation is sufficient for them to peruse the entire informational set—and hence to process thoroughly (and thus increase the accessibility of) the lengthy and late-appearing information—the typical base-rate neglect finding should obtain in the first condition and a proper utilization of the base rates should be manifest in the second condition.
Method
Participants. Ninety-two University of Maryland undergraduates (39 men and 53 women) served as participants in fulfillment of a course requirement. There were no significant differences between the genders on our dependent measures, so this variable is not considered further.
Procedure. Participants were informed that they would receive a description of a person named Dan and be asked to divine his profession using an 11-point scale ranging from 0% to 100%. In the replication condition, as in typical base-rate neglect studies (Kahneman & Tversky, 1973), brief base-rate information was followed by extensive individuating information. Specifically, the description in this condition (with alternative versions in brackets) read as follows:
We collected data regarding a group of people. 70% [30%] of the group members are engineers, and the remaining 30% [70%] are lawyers. One of the members of this group was Dan. He was drawn randomly from that group of people. He is 45 years old. He is married and has four children. He is generally conservative, careful and ambitious. He shows no interest in political and social issues and spends most of his free time on his many hobbies, which include home carpentry, sailing and mathematical puzzles.
By contrast, in the reversal condition participants were told the following:
We collected data regarding a group of people. One of the members of this group was Dan. His hobbies are home carpentry, sailing and mathematical puzzles. He was drawn randomly from that group of people. The group included 14% electrical engineers, 6% chemical engineers, 9% divorce lawyers, 4% nuclear engineers, 10% civil engineers, 11% criminal lawyers, 12% sound engineers, 8% genetic engineers, 10% trade lawyers, 16% mechanical engineers. [The group included 14% criminal lawyers, 6% trade lawyers, 9% mechanical engineers, 4% patent lawyers, 10% human rights lawyers, 11% electrical engineers, 12% public defense lawyers, 8% divorce lawyers, 10% nuclear engineers, 16% tax lawyers.]
Thus, in the reversal condition, the individuating information, consisting of the typical engineer stereotype in the lawyer and engineer problem, was conveyed via a single sentence, whereas the subsequent base-rate information was extensive.
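The subdivided base rates in the reversal condition can be checked against the 70%/30% split used in the replication condition. The sketch below simply sums the percentages quoted in the two versions of the vignette; the variable names and the `totals` helper are introduced here for illustration only.

```python
# Percentages as listed in the reversal-condition vignette (first version).
VERSION_A = [
    (14, "electrical engineers"), (6, "chemical engineers"),
    (9, "divorce lawyers"), (4, "nuclear engineers"),
    (10, "civil engineers"), (11, "criminal lawyers"),
    (12, "sound engineers"), (8, "genetic engineers"),
    (10, "trade lawyers"), (16, "mechanical engineers"),
]

# Percentages as listed in the bracketed alternative version.
VERSION_B = [
    (14, "criminal lawyers"), (6, "trade lawyers"),
    (9, "mechanical engineers"), (4, "patent lawyers"),
    (10, "human rights lawyers"), (11, "electrical engineers"),
    (12, "public defense lawyers"), (8, "divorce lawyers"),
    (10, "nuclear engineers"), (16, "tax lawyers"),
]


def totals(group):
    """Sum the listed percentages separately for engineers and lawyers."""
    result = {"engineers": 0, "lawyers": 0}
    for percent, label in group:
        profession = label.split()[-1]  # last word: 'engineers' or 'lawyers'
        result[profession] += percent
    return result


print(totals(VERSION_A))
print(totals(VERSION_B))
```

The first version adds up to 70% engineers and 30% lawyers and the bracketed version to the reverse, so the two reversal vignettes carry the same aggregate base rates as the two replication vignettes.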
Results and Discussion
Participants’ likelihood estimates were subjected to a 2 (condition: replication, reversal) × 2 (base rate of engineers: 70%, 30%) analysis of variance (ANOVA). The findings are summarized in Table 1. The interaction between condition and base rates is significant, F(1, 88) = 5.96, p < .05. As can be seen, the base rates were all but neglected in the replication condition, in which the likelihood estimates were based almost entirely on individuating information consistent with the engineer stereotype (M = 76.52 in the 70% condition and M = 72.69 in the 30% condition), t(88) = 0.73, p > .46. A very different picture emerges in the reversal condition, in which participants evinced considerable sensitivity to the base rates. Specifically, whereas the estimate in the 70% condition is close to that figure (M = 67.00), it drops considerably (to M = 44.35) in the 30% condition, t(88) = 4.03, p < .0001.
The above results support the notion that participants in a typical lawyer and engineer paradigm possess sufficient resources to process the information given in its entirety, as a consequence of which they tend to base their judgments on the later, more-difficult-to-process information. However, an alternative interpretation of these findings is possible: namely, that the breakdown of the sample into its various subcategories in the reversal condition somehow highlighted the relevance of the subdivided base rates, perhaps because of conversational considerations (e.g., Hilton, 1995; Hilton & Slugoski, 2001; Schwarz et al., 1991) and/or “unpacking” of the base rates (Tversky & Koehler, 1994). To investigate this possibility, we conducted another study in which we replicated Experiment 1 with one addition. Orthogonally to the condition and base-rate variables, we manipulated cognitive load designed to deplete participants’ cognitive resources. If our analysis is correct, we should replicate the findings of Experiment 1 in the absence of load but obtain the opposite data pattern under load: Because in the replication condition the base rates are brief, appear early, and are therefore easy to process, the tendency to utilize them should be particularly apparent under load. By contrast, in the reversal condition, base rates are lengthy, appear late, and are therefore relatively difficult to process; therefore, they should be utilized to a much lesser degree under load. Note that if the breakdown of the sample into its components (or the unpacking of the base rates) lends it (conversational) relevance, then the judgmental impact of the broken down base rates should be independent of load because the subdivision of the sample should be highly noticeable to participants in all conditions. In addition, if probability judgments based on the number of subcategories are a kind of heuristic, then according to accepted views (embodied in the cognitive miser and motivated tactician approaches), its use should increase under cognitive load.

Table 1
Likelihood Estimation (Chances Out of 100) That the Target Is an Engineer as a Function of Condition and Base Rate (in Experiment 1)

                              Condition
Engineer base rate    Replication    Reversal
70%                   76.52          67.00
30%                   72.69          44.35
Difference             3.83          22.65***

*** p < .001.
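The Difference row of Table 1 follows directly from the cell means reported in the Experiment 1 results. The short sketch below recomputes it, along with the difference between the two base-rate effects, which is the descriptive quantity behind the condition by base-rate interaction. The variable and function names are illustrative, and no inferential statistics are recomputed because the raw data are not available in the text.

```python
# Cell means from Table 1 (chances out of 100 that Dan is an engineer).
TABLE_1 = {
    "replication": {70: 76.52, 30: 72.69},
    "reversal": {70: 67.00, 30: 44.35},
}


def base_rate_effect(condition):
    """Mean estimate under the 70% base rate minus that under the 30% base rate."""
    cell = TABLE_1[condition]
    return round(cell[70] - cell[30], 2)


effects = {condition: base_rate_effect(condition) for condition in TABLE_1}

# How much larger the base-rate effect is in the reversal condition.
interaction_contrast = round(effects["reversal"] - effects["replication"], 2)

print(effects)
print(interaction_contrast)
```

The recomputed differences (3.83 and 22.65) match the table, and the base-rate effect is 18.82 points larger in the reversal condition than in the replication condition.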
Experiment 2
Method
Participants. Ninety-nine University of Maryland undergraduates (58 men and 41 women) participated in the study either in return for $5 remuneration or in fulfillment of a course requirement. Neither gender nor mode of recruitment exerted significant effects on any of our dependent measures; therefore, they are omitted from further consideration.
Procedure. In most relevant details, the procedure of the present study followed that of Experiment 1, except for an added load manipulation. After receiving the general information about the experiment, participants under high cognitive load were told that our additional objective was to investigate how well people can perform two different tasks simultaneously. To that end, before receiving the vignette they were asked to silently rehearse a nine-digit number (854917632) so that they could reproduce it when asked to do so at the end of the experiment. At that point, participants were allowed to view the number in question for 30 s. No similar instructions or viewing experiences were made available to participants in the low-cognitive-load condition. Participants then received the lawyer and engineer vignette. In this case, the base rate of the engineers only (either 70% or 30%) was mentioned. Specifically, the relevant statement read, “70% [or 30%] of the group members are engineers and the rest are lawyers.” To control for exposure time across conditions, we allowed participants to read the engineer and lawyer vignette for 30 s, a sufficient time according to a pretest. Prior to responding to the dependent measures, participants in the high-load condition were asked to reproduce in writing the number they had been previously asked to rehearse. All participants then estimated the likelihood of Dan being an engineer.
To check on the efficacy of the cognitive load manipulation, we had participants rate how difficult it was for them to concentrate on the experimental materials and to what extent they were distracted by other thoughts while examining the vignette. Responses to both measures were recorded on a 9-point scale that ranged from 1 (not at all) to 9 (extremely).
Results and Discussion
Of the 49 participants in the load condition, 3 recalled fewer than four digits. Following Gilbert and Hixon’s (1991) suggestion that this may reflect a failure of the load manipulation, we excluded these participants from further analysis.
Efficacy of the load manipulation. The two items pertaining to the difficulty in concentrating and the presence of distracting thoughts were highly correlated (α = .76). Consequently, we averaged them to form an index of experienced cognitive load. A 2 (condition: replication, reversal) × 2 (base rate of engineers: 70%, 30%) × 2 (cognitive load: high, low) ANOVA performed on this index yielded two main effects: (a) an effect of condition such that persons in the reversal versus the replication condition reported feeling greater cognitive load, F(1, 88) = 6.46, p < .05, and (b) an effect of cognitive load, such that participants in the high-load condition felt greater cognitive load than those under low load, F(1, 88) = 4.71, p < .05. The latter effect attests that our cognitive load manipulation was effective.
Likelihood estimation. The same 2 × 2 × 2 ANOVA was also performed on our likelihood estimation measure (see Table 2). This analysis yielded a significant main effect of base rate, F(1, 88) = 16.48, p < .0001, such that the likelihood of Dan being an engineer was judged as higher when the engineer base rate was 70% versus when it was 30%. The main effect of condition also was significant, F(1, 88) = 9.78, p < .01. Namely, participants in the replication condition judged the likelihood of Dan being an engineer as higher than did participants in the reversal condition. Finally, the load main effect was significant, F(1, 88) = 4.87, p < .05, indicating that the likelihood estimates of participants under high load were lower than those of participants under low load.
Of greater present interest, the predicted three-way interaction between condition, base rates, and load was significant, F(1, 88) = 4.15, p < .05. To further probe this interaction, we conducted several contrast analyses. Consistent with our prediction, in the replication condition the load manipulation induced participants to base their likelihood estimates on the base rates. Specifically, under high cognitive load, replication participants in the 70% engineer condition judged the likelihood of Dan being an engineer as higher (M = 63.64) than did their counterparts in the 30% engineer condition (M = 43.00), t(88) = 2.40, p < .05. In the low-cognitive-load condition, however, there was no significant difference between the two conditions (M = 75.00 and M = 63.33), t(88) = 1.45, p > .14. To the contrary, in the reversal condition, under high load participants in the 70% condition did not judge the likelihood of Dan being an engineer (M = 50.00) as significantly different than did their counterparts in the 30% condition (M = 45.38), t(88) = 0.59, p > .55. Under low load, however, reversal participants in the 70% condition judged the likelihood of Dan being an engineer as significantly higher (M = 63.85) than did their counterparts in the 30% condition (M = 35.38), t(88) = 3.69, p < .0001.

Table 2
Likelihood Estimation (Chances Out of 100) That the Target Is an Engineer as a Function of the Condition, Cognitive Load, and Base Rate (in Experiment 2)

                          Replication              Reversal
Engineer base rate    High load   Low load    High load   Low load
70%                     63.64      75.00        50.00      63.85
30%                     43.00      63.33        45.38      35.38
Difference              20.64*     11.67         4.62      28.47***

* p < .05. *** p < .001.

These findings support our analysis that the differential tendency to rely on brief, up-front, and easy-to-process information versus extensive, subsequent, and more-difficult-to-process information is related to processing resources. Specifically, even though the information presented to participants was exactly the same within each experimental condition, the high (vs. low) cognitive load completely reversed participants’ tendency to rely on the base-rate versus the individuating information. That is, the heightened cognitive load led participants to base their judgments on the brief, hence easy to process, information, whereas under low cognitive load participants apparently had sufficient resources to process the later presented information to a greater extent, allowing it to dominate the brief and early information.
It also seems unlikely that the foregoing results are explicable in terms of the notion that the subdivided base rates were perceived as more (conversationally) relevant than the simple base rates. The subdivided nature of the sample should have been readily apparent to participants. Thus, if subdivision communicated relevance, participants should have focused their attention on the subdivided base rates regardless of load. In addition, if the use of the number of subcategories represents a heuristic, it should be facilitated under high cognitive load. Instead, the brief up-front base rates were appropriately utilized under load, whereas the divided, more demanding base rates were utilized only in the absence of load.
These findings are consistent with the view that it is the relation between the ease or difficulty of information processing and the presence of informational resources that determines the use of the information in the judgmental process.
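As a reading aid, the pattern behind the three-way interaction in Experiment 2 can be recomputed from the cell means reported in Table 2. The sketch below is illustrative only: the dictionary layout and the function names (`base_rate_effect`, `load_shift`, `three_way_contrast`) are assumptions introduced here, and the arithmetic uses the published means rather than raw data, so it reproduces the Difference row of the table but not the reported F or t statistics.

```python
# Cell means from Table 2 (chances out of 100 that Dan is an engineer).
# The layout is an assumption made for illustration:
# (condition, load) -> (mean at 70% engineers, mean at 30% engineers)
TABLE_2 = {
    ("replication", "high"): (63.64, 43.00),
    ("replication", "low"): (75.00, 63.33),
    ("reversal", "high"): (50.00, 45.38),
    ("reversal", "low"): (63.85, 35.38),
}


def base_rate_effect(condition, load):
    """Mean at the 70% base rate minus mean at the 30% base rate."""
    m70, m30 = TABLE_2[(condition, load)]
    return round(m70 - m30, 2)


def load_shift(condition):
    """Change in the base-rate effect when moving from low to high load."""
    high = base_rate_effect(condition, "high")
    low = base_rate_effect(condition, "low")
    return round(high - low, 2)


def three_way_contrast():
    """Difference between the two conditions in their load shifts."""
    return round(load_shift("replication") - load_shift("reversal"), 2)


if __name__ == "__main__":
    for condition, load in TABLE_2:
        print(condition, load, base_rate_effect(condition, load))
    print("load shift, replication:", load_shift("replication"))
    print("load shift, reversal:", load_shift("reversal"))
    print("three-way contrast:", three_way_contrast())
```

On these means, load enlarges the base-rate effect in the replication condition (a shift of 8.97 points) and shrinks it in the reversal condition (a shift of −23.85 points), which is the crossover the contrast analyses describe.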
Experiment 3
Our next study equated the length of the base-rate and the individuating information that differed in the preceding two studies (as well as in most prior studies in the lawyer and engineer paradigm). In previous studies, length was confounded with order of presentation, which in and of itself could be related to processing difficulty (the later information being more subjectively difficult than the earlier information, given the resource-depleting effort that has taken place already). If processing difficulty plays a critical role in driving the observed base-rate effects, then the order of presentation alone should also interact with cognitive resources: Under limited cognitive resources, the early information should be accorded more weight in determining judgments than the later-appearing information, whereas under ample cognitive resources, the later-appearing information should prevail. To test these notions, we created two informational sequences. In one sequence, the base-rate information preceded the equally long individuating information, and in another sequence, it followed the individuating information. Orthogonally, we varied cognitive load.
Method
Participants. Ninety-one University of Maryland undergraduates (38 men and 53 women) participated in this experiment either in return for $5 remuneration or in fulfillment of a course requirement. Neither gender nor mode of recruitment had significant effects on our dependent variables; hence, they are not discussed further.
Procedure. Except for one detail, the present procedure was similar to that of Experiment 2. That detail concerned informational length. Specifically, unlike Experiments 1 and 2, in which the base-rate and individuating information differed in length, in the present study the base-rate and individuating information were equal in length. In the 30% [70%] engineers condition, the description of the base rate stated the following:
We collected data regarding a group of people. The group included 14% criminal lawyers, 6% trade lawyers, 9% mechanical engineers, 4% patent lawyers, 10% human rights lawyers, 11% electrical engineers, 12% public defense lawyers, 8% divorce lawyers, 10% nuclear engineers, 16% tax lawyers. [The group included 14% electrical engineers, 6% chemical engineers, 9% divorce lawyers, 4% nuclear engineers, 10% civil engineers, 11% criminal lawyers, 12% sound engineers, 8% genetic engineers, 10% trade lawyers, 16% mechanical engineers.]
The description of the individuating information stated the following:
One of the group members is Dan. He was drawn randomly from that group of people. He is 45 years old. He is married and has four children. He is generally conservative, careful and ambitious. He spends most of his free time on his many hobbies, which include home carpentry, sailing, and mathematical puzzles.
In one condition, the base-rate information came first and was followed by the individuating information. In the second condition, the sequence was reversed. Participants read the vignette for 40 s, which according to a pretest is ample time for a thorough perusal.
Prior to receiving these materials, participants in the high-load condition were given a nine-digit number, were allowed to view it for 30 s, and were asked to rehearse it. No similar request was made to participants in the low-load condition. Participants then responded to the dependent measure and manipulation checks used in Experiment 2.
Results and Discussion
Of the 47 participants in the high-cognitive-load condition, 6 recalled fewer than four digits. Consistent with the exclusion criterion of Experiment 2, their data were removed from further analyses.
Efficacy of the load manipulation. As in Experiment 2, the two items serving as manipulation checks on the cognitive load induction, namely the presence of distracting thoughts and experienced difficulty in concentrating, were highly correlated (α = .75); hence, they were averaged to form a combined index of cognitive load. A 2 (order of presentation: base rate first, individuating information first) × 2 (cognitive load: high, low) × 2 (base rate: 70%, 30%) ANOVA performed on these data yielded a significant main effect of the load factor, F(1, 77) = 8.89, p < .01. As expected, participants under high cognitive load indeed reported greater load (M = 5.78) than did those under low load (M = 4.44). No other effects were significant in this analysis.
Likelihood estimation. The likelihood estimation item was subject to the 2 × 2 × 2 ANOVA described above. The pertinent findings are summarized in Table 3. This analysis yielded two main effects and an interaction. Specifically, participants in the 70% base-rate condition estimated the overall likelihood that Dan is an engineer as higher (M = 66.19) than did participants in the 30% base-rate condition (M = 47.21), F(1, 77) = 15.15, p < .0001. The load main effect also proved significant, F(1, 77) = 8.56, p < .01, in that participants under no load judged the likelihood of Dan being an engineer as higher (M = 63.34) than did those under load (M = 49.02). Of greater theoretical interest, the predicted three-way interaction between the order, load, and base-rate variables was significant, F(1, 77) = 6.17, p < .05. As suggested by our analysis, in the high-load condition, when the base-rate information was presented prior to the individuating information, participants’ likelihood estimates were highly sensitive to the base rates, such that participants in the 70% condition estimated the likelihood as 72% on average, whereas participants in the 30% condition estimated it as 29% on average, t(77) = 4.23, p < .0001. The sensitivity to base rates was all but eliminated in the low-load condition (M = 69.09 for the 70% condition, and M = 65.45 for the 30% condition), t(77) = 0.38, p > .70. By contrast, when the individuating information was followed by the base-rate information, participants were less sensitive to base rates under high (vs. low) load. Thus, under low load participants judged the likelihood of Dan being an engineer as M = 70.00 in the 70% condition and as M = 50.00 in the 30% condition, t(77) = 2.06, p < .05. Under high load, however, participants did not exhibit differential likelihood estimates in the 70% condition (M = 53.00) versus the 30% condition (M = 42.73), t(77) = 1.03, p > .30.

Table 3
Likelihood Estimation (Chances Out of 100) That the Target Is an Engineer as a Function of Order of Presentation, Cognitive Load, and Base Rate (in Experiment 3)

                        Base rate first       Individuating information first
Engineer base rate    High load   Low load        High load   Low load
70%                     72.00      69.09            53.00      70.00
30%                     29.00      65.45            42.73      50.00
Difference              43.00***    3.64            10.27      20.00*

* p < .05. *** p < .001.

These findings support our analysis that order of presentation matters and that later-presented information requires more informational resources to be processed than earlier-presented information. Again, it is unlikely that our findings can be explained in terms of conversational relevance.
Although participants might view the later-appearing information as more conversationally relevant than the earlier information (Krosnick, Li, & Lehman, 1990), the order of information’s appearance is readily noticeable and should have been equally apparent to participants under high and low load. That the later-appearing information failed to affect judgments under high load seems, therefore, attributable to participants’ lacking sufficient resources to process the information and to appreciate its relevance rather than to conversational relevance per se.
Experiment 4
A paradigmatic feature of studies of base-rate neglect is the juxtaposition of base-rate and individuating information. However, if processing difficulty is critical to previously found differences in the utilization of these two types of information, we should be able to replicate such differential use with two types of individuating information as well as with two types of base-rate information. Our next two studies submitted these notions to empirical test. Specifically, in the present study we investigated the use of two types of individuating information, one easy to process, the second more difficult to process. As in Experiments 2 and 3, we orthogonally varied cognitive load. To test this notion, we created two sequences, one in which brief and early-appearing information consistent with the engineer stereotype was followed by lengthier and later-appearing information consistent with the lawyer stereotype (the engineer-lawyer sequence). In the second sequence, brief information consistent with the lawyer stereotype was followed by lengthier information consistent with the engineer stereotype (the lawyer-engineer sequence). We predicted that in the engineer-lawyer sequence, the likelihood estimates of the target being an engineer would be increased by load, whereas in the lawyer-engineer sequence, the likelihood estimates of the target being an engineer would be decreased by load.
Method
Participants. Eighty University of Maryland students (44 men and 36 women) participated in the study in fulfillment of a course requirement. They were randomly assigned to the four conditions of a 2 × 2 design including the factors of (a) sequence (engineer-lawyer, lawyer-engineer) and (b) cognitive load (high, low). Gender of participants did not produce significant effects on any of our dependent variables and is not considered further.
Procedure. Upon arrival at the experimental site, participants were advised that the experiment they were about to take part in was a study of social perception during which they would be receiving information about an individual named Dan, about whom they subsequently would be asked to answer some questions and give their impressions. Participants in the high-load condition were then asked to rehearse a nine-digit number so that they would be able to reproduce it later. Participants in the low-load condition were not subjected to a similar request. Two different versions of the engineer and lawyer problem were then given to participants in accordance with experimental condition. The initial sentence to all participants read, “We collected data regarding a group of people. One of them is Dan. He was drawn randomly from this group of people.” From that point on, the information diverged. Participants in the engineer-lawyer condition received a brief passage consistent with the engineer stereotype followed by a lengthier passage consistent with the lawyer stereotype. Specifically, this information stated the following:
His hobbies are home carpentry, sailing and mathematical puzzles [engineer stereotype]. He is of high intelligence, is quite self-confident, and tends to be argumentative. He is very involved in his work, and tends to work long hours. He is generally well dressed, even when not at work. He is highly articulate in his oral expression and his writing is very convincing [lawyer stereotype].
In contrast, participants in the lawyer-engineer condition received a brief passage consistent with the lawyer stereotype followed by a lengthier passage consistent with the engineer stereotype. This information stated the following:
He is highly articulate in his oral expression and his writing is very convincing [lawyer stereotype]. He likes orderly systems in which each item has its proper place. He is generally conservative and careful. He shows no interest in political and social issues and spends most of his free time on his many hobbies, which include home carpentry, sailing and mathematical puzzles [engineer stereotype].
As in Experiment 3, we controlled for exposure by letting participants read the information given for 30 s. In the load condition, participants were allowed to view the nine-digit number for 25 s and asked to reproduce it prior to responding to the dependent measure. Participants in the low-load condition responded to the dependent measure directly. The dependent measures and manipulation checks used in Experiment 3 were used here as well.
Results and Discussion
Of the 39 participants in the load condition, 3 recalled fewer than four digits. Because the failure to recall four of nine digits may reflect a failure of the load manipulation, we excluded these participants’ data from further analysis.
Efficacy of the load manipulation. As in our previous studies, the two items that concerned distracting thoughts and difficulties in concentration were highly correlated (α = .97); consequently, we averaged them into an overall index of experienced load. We performed on these data a 2 (cognitive load: high, low) × 2 (sequence: engineer stereotype, lawyer stereotype) ANOVA. The results indicated that participants under load indeed experienced greater load (M = 5.97) than did their counterparts not exposed to load (M = 4.20), F(1, 73) = 11.15, p < .01. No other main effects or interactions were significant in this analysis.
Likelihood estimation. The likelihood estimation item was subjected to the 2 × 2 ANOVA described above. The relevant data are displayed in Table 4. The only significant effect in this analysis was the predicted interaction between the load and the sequence variables, F(1, 73) = 5.92, p < .05. Consistent with our hypothesis, participants in the lawyer-engineer condition estimated the likelihood of Dan being an engineer as lower under high load (M = 56.25) than under low load (M = 71.50), t(73) = 2.04, p < .05. By contrast, participants in the engineer-lawyer condition tended to estimate the likelihood of Dan being an engineer as higher (M = 70.50) under high load than under low load (M = 60.95), t(73) = 1.37, p < .17.

Table 4
Likelihood Estimation (Chances Out of 100) That the Target Is an Engineer as a Function of Sequence of Information and Cognitive Load (in Experiment 4)

                       Sequence of information
Cognitive load    Lawyer → engineer    Engineer → lawyer
High                   56.25                70.50
Low                    71.50                60.95
Difference            −15.25*                9.55

* p < .05.

These findings are consistent with the notion that a sequence consisting of two types of individuating information (the first of which is brief and the second lengthier) behaves in the same way as does a sequence of brief base-rate followed by lengthier individuating information in classic base-rate neglect research. Under low load, the second portion of the sequence (in this case consisting of one type of individuating information) served as a more influential basis of likelihood estimation than the first portion, whereas under high load, the first portion of the sequence was more influential than the second.
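The error degrees of freedom reported so far can be cross-checked against the sample sizes and exclusions given in each Method and Results section. The sketch below assumes a fully between-subjects factorial design in which the error degrees of freedom equal the number of retained participants minus the number of design cells, and it assumes no missing data beyond the stated exclusions; the `STUDIES` table and the `error_df` helper are names made up for this illustration.

```python
# Values taken from the text of Experiments 2-4:
# experiment -> (participants recruited, participants excluded,
#                number of design cells, error df reported in the F tests)
STUDIES = {
    2: (99, 3, 2 * 2 * 2, 88),
    3: (91, 6, 2 * 2 * 2, 77),
    4: (80, 3, 2 * 2, 73),
}


def error_df(recruited, excluded, cells):
    """Error df for a between-subjects factorial ANOVA: retained N minus cells."""
    return (recruited - excluded) - cells


if __name__ == "__main__":
    for experiment, (recruited, excluded, cells, reported) in STUDIES.items():
        computed = error_df(recruited, excluded, cells)
        status = "matches" if computed == reported else "differs from"
        print(f"Experiment {experiment}: computed df = {computed}, "
              f"{status} reported df = {reported}")
```

Under these assumptions the computed values agree with the denominators of the reported F and t statistics in all three experiments.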
Experiment 5
Our fifth experiment considered a sequence composed of two types of base-rate information that were based on two imagined samples. In one condition, the first sample consisted of 30% engineers and 70% lawyers, and the second sample consisted of 70% engineers and 30% lawyers. In the second condition, the first sample consisted of 70% engineers and 30% lawyers, whereas the second sample consisted of 30% engineers and 70% lawyers. As in the foregoing experiments, cross-cutting these manipulations, we varied cognitive load.
In this experiment, we equated the number of lawyer and engineer categories. This allowed us to explore whether the sensitivity to base rates in the lengthy base-rate conditions of our prior experiments was determined by the size of the base rate itself, as we have argued, rather than by the number of categories into which the sample is divided. Because there were more lawyer (vs. engineer) categories in our prior experiments when the base rate of lawyers was higher, and the opposite was true when the base rate of engineers was higher, it is possible to argue that participants performed an aggregation of similarity judgments on the basis of the differential number of lawyer and engineer categories and that this mimicked their sensitivity to the base rates. If, however, participants’ sensitivity to base rates was determined by their processing ease or difficulty in relation to individuals’ processing resources, we should find the same pattern of results as we found in the prior experiments, even though the number of lawyer and engineer categories was now equal. Specifically, our reasoning was that in the 30% engineers–70% engineers sequence, the high (vs. low) cognitive load would increase the utilization of the first informational sample, resulting in a rather low likelihood estimate of the target being an engineer.
By the same token, in the 70% engineers–30% engineers sequence, the high cognitive load should heighten the likelihood estimates of the target being an engineer compared with the low-cognitive-load condition.
Method
Participants. Forty-seven University of Maryland students (18 men and 29 women) participated in the experiment in fulfillment of a course requirement. Participants’ gender exerted no significant effects on our dependent variables and is not discussed further.
Procedure. In most relevant details, the procedure of this experiment resembled that of Experiment 4, with the exception that the two portions of the present informational sequences now consisted of base-rate (rather than individuating) information. At the outset, participants were informed that the aim of the experiment was to investigate how people develop impressions of persons on the basis of statistical information. Specifically, participants were told they would receive information about two samples from a population to which the target person belonged. Their task was to answer some questions about this target person at the end of the experiment. After receiving these instructions, participants in the high-load condition were asked to rehearse for 30 s the nine-digit number used in Experiment 2. No similar request was presented to participants in the low-load condition. For the next 60 s participants received information about two samples drawn from the population to which Dan, the target person, also belonged. In one condition, the first sample consisted of 30% engineers and 70% lawyers, and the second sample consisted of 70% engineers and 30% lawyers. Specifically, this information was conveyed as follows:
The first sample includes 34% criminal lawyers, 10% genetic engineers, 12% human rights lawyers, 15% electrical engineers, 24% tax lawyers, and 5% sound engineers. The second sample includes 26% chemical engineers, 11% patent lawyers, 10% nuclear engineers, 15% divorce lawyers, 34% mechanical engineers, and 4% public defense lawyers.
In the second condition, the order of these two samples was reversed such that the first sample now consisted of 70% engineers and 30% lawyers, whereas the second sample now consisted of 30% engineers and 70% lawyers. Therefore, in both conditions there were three lawyer and three engineer categories. As in the foregoing studies, participants in the high-load condition were asked to reproduce the nine-digit number prior to responding to the dependent measures, which were identical to those used in Experiment 4, whereas participants in the low-load condition responded to the same measures directly, immediately after exposure to the samples’ information.
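The design point made above, that the subdivided percentages aggregate to 30% versus 70% while the number of categories per profession is held equal, can be verified by summing the stimuli. The sketch below is a minimal illustration: the percentages are copied from the Experiment 3 and Experiment 5 vignettes, whereas the `aggregate` helper and the rule of classifying a category by whether its label ends in "engineers" are assumptions introduced here.

```python
# Subdivided base rates as printed in the vignettes (label, percent).
EXP3_30_PERCENT_ENGINEERS = [
    ("criminal lawyers", 14), ("trade lawyers", 6), ("mechanical engineers", 9),
    ("patent lawyers", 4), ("human rights lawyers", 10),
    ("electrical engineers", 11), ("public defense lawyers", 12),
    ("divorce lawyers", 8), ("nuclear engineers", 10), ("tax lawyers", 16),
]
EXP5_FIRST_SAMPLE = [
    ("criminal lawyers", 34), ("genetic engineers", 10),
    ("human rights lawyers", 12), ("electrical engineers", 15),
    ("tax lawyers", 24), ("sound engineers", 5),
]
EXP5_SECOND_SAMPLE = [
    ("chemical engineers", 26), ("patent lawyers", 11),
    ("nuclear engineers", 10), ("divorce lawyers", 15),
    ("mechanical engineers", 34), ("public defense lawyers", 4),
]


def aggregate(sample):
    """Return {profession: (total percent, number of categories)}."""
    totals = {"engineer": [0, 0], "lawyer": [0, 0]}
    for label, percent in sample:
        # Illustrative rule: classify each category by its label suffix.
        profession = "engineer" if label.endswith("engineers") else "lawyer"
        totals[profession][0] += percent
        totals[profession][1] += 1
    return {name: tuple(values) for name, values in totals.items()}


if __name__ == "__main__":
    print("Experiment 3, 30% condition:", aggregate(EXP3_30_PERCENT_ENGINEERS))
    print("Experiment 5, first sample:", aggregate(EXP5_FIRST_SAMPLE))
    print("Experiment 5, second sample:", aggregate(EXP5_SECOND_SAMPLE))
```

In the Experiment 3 list the 30% of engineers are spread over three categories and the 70% of lawyers over seven, so category count and base rate move together; in both Experiment 5 samples each profession has exactly three categories, which is what separates the two explanations.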
Results and Discussion
Of the 24 participants in the load condition, 1 person recalled fewer than four digits. On the basis of our usual criterion, this person’s data were excluded from further analysis.
Efficacy of the load manipulation. As in our preceding studies, the two items concerning distraction and difficulty concentrating were highly correlated (α = .86) and were therefore averaged to form a combined index of perceived load. A 2 (sequence: 30% engineers, 70% engineers; 70% engineers, 30% engineers) × 2 (cognitive load: high, low) ANOVA yielded only the expected effect of the load manipulation. Specifically, participants in the high-load condition reported having experienced load to a significantly higher degree (M = 6.67) than did participants in the low-load condition (M = 5.11), F(1, 42) = 7.08, p < .05. No other effects were significant. As in our prior experiments, the manipulation of cognitive load seems to have been successful.
Likelihood estimates. The likelihood estimation item was subjected to the 2 × 2 ANOVA described above. The relevant data are summarized in Table 5. The only significant effect to emerge in this analysis was the predicted interaction between the sequence and the load variables, F(1, 42) = 11.13, p < .01. Consistent with our hypothesis, a heightened cognitive load induced participants to base their judgments on the earlier rather than the later sample base rates. Specifically, when the 70% engineer sample was followed by the 30% engineer sample, participants under high load estimated the likelihood of Dan being an engineer as higher (M = 61.67) than did those under low load (M = 45.00), t(42) = 2.63, p < .05. By contrast, when the 30% engineer sample was followed by the 70% engineer sample, participants under high load estimated the likelihood of Dan being an engineer as lower (M = 44.17) than did those under low load (M = 56.67), t(42) = 2.07, p < .05.
The results of Experiment 5 suggest that reliance on base-rate information is affected by processing difficulty (determined by the information’s order of appearance) and its relation to the individual’s processing resources in the same way as is reliance on individuating information (investigated in Experiment 4). In both cases, the earlier (and hence the easier to process) information appears to have a greater impact on judgments in the presence of processing constraints (e.g., in the form of a cognitive load), whereas given a sufficient amount of cognitive resources, it is the later (hence more difficult to process) information that seems to have the greater judgmental impact. It is also important to note that although Experiment 5 controlled for the number of lawyer and engineer categories, setting them to be equal, its results are consistent with our prior findings, indicating that participants’ sensitivity to base rates in the lengthy base-rate conditions of our prior experiments is due to a sensitivity to the size of base rates as such rather than to an aggregation of similarity judgments with a differential number of lawyer and engineer categories.

Table 5
Likelihood Estimation (Chances Out of 100) That the Target Is an Engineer as a Function of Order of Base-Rate Information and Distraction (in Experiment 5)

                              Sequence of information
Cognitive load    70% engineer → 30% engineer    30% engineer → 70% engineer
High                        61.67                          44.17
Low                         45.00                          56.67
Difference                  16.67*                        −12.50*

* p < .05.

General Discussion
The Present Research
The present research, conducted within the classic base-rate neglect paradigm, explored the implications of separating the ease of information processing from contents of the information processed. Our departure point was the observation that in typical studies of the lawyer and engineer type, the base-rate information was presented in a brief form and early in the informational sequence. This may have rendered such information relatively easy to process. By contrast, the individuating (representativeness) information typically was lengthier and came later, which may have made it more challenging to process. Accordingly, earlier findings of base-rate neglect might have been a result of the relation between processing ease and processing resources. If participants had sufficient mental and motivational resources to process the entire informational sequence, they might have thoroughly examined the later-appearing individuating information. The implications of that information might therefore have been highly accessible and, hence, capable of exerting considerable influence on participants’ judgments.
From the separation of processing ease from informational contents, it follows that given the same processing resources that typically have yielded base-rate neglect, it should be possible to increase base rate utilization if the base-rate information were made more challenging to process. It similarly follows that in the typical paradigm, in which the base rates are given briefly and up front, base rate utilization might be increased by reducing participants’ processing resources. Finally, if ease rather than contents matters, then brief and/or early-appearing individuating (representativeness) information should have a processing advantage under limited resources, whereas lengthy and/or late-appearing information should have a processing advantage under ample resources.
The present results are consistent with all these implications. Specifically, Experiment 1 replicated the usual finding of base-rate neglect in a condition in which the order and/or difficulty of the base-rate versus the individuating information followed the typical pattern (of brief base-rate information being followed by relatively extensive individuating information). More important, in a condition in which the order-difficulty combination was reversed, such that the individuating information came early and was brief, whereas the base-rate information was extensive and came later, the base-rate neglect effect was eliminated and participants exhibited considerable sensitivity to base rates. Experiment 2 showed additionally that the presence of load favors the early and/or brief over the subsequent and/or lengthy information, irrespective of whether the earlier or the later information consisted of base rates or individuating information.
It is of interest that when the base rates were brief and presented early, their use—typically regarded as the more “normative” or rational response—was enhanced under conditions of reduced mental capacity, represented by the cognitive load. Experiment 3 demonstrated that the order of presentation alone (unconfounded by informational length) is in and of itself capable of producing the effects observed in Experiment 2; hence, it does affect the ease of processing as hypothesized. Experiment 4 generalized the pattern of findings obtained in Experiments 2 and 3 to a case in which two sequential types of individuating information were presented to participants, and Experiment 5 generalized further to the case in which two different sequences of base-rate information were provided. We have argued that the reduction of cognitive resources via our load manipulation conferred an advantage on the easier-to-process and early-presented information and reduced the thoroughness of processing the more-difficult-to-process and later-appearing information; this may have prevented participants from recognizing the relevance of the later information and, hence, reduced their tendency to use it in judgment. Prior research, however, has attested that under restricted cognitive resources the easier-to-process information is utilized even though it is presented concomitantly with the moredifficult-to-process information. For instance, Petty and Cacioppo (1984) found that participants who were relatively uninvolved in the attitude issue based their attitudes on the number of arguments presented, that is, on easier-to-process information rather than on the contents of those arguments. Chun, Spiegel, and Kruglanski (2002) found that when behavior identification information was salient (and hence easier to process), it was utilized under cognitive load, whereas when it was nonsalient (hence more difficult to process), its use was
interfered with by load. And Trope and Gaunt (2000) found that the discounting of contextual demands from the requisite judgments was unaffected by load when the information was salient (and hence easy to process) and was interfered with by load when it was nonsalient (and hence more difficult to process). These effects are consistent with the notion that the presently observed tendency to rely on the early and/or brief information under load reflects the general propensity of individuals to base judgments on easily processed information when their processing resources are constrained. At any rate, the fact that the more complex and later-presented (versus the simpler and earlier) information, whether of the statistical or the individuating variety, tends to be relied on more in the absence of load suggests that in the typical base-rate neglect paradigm participants possess sufficient processing resources for a thorough processing of all the information provided, allowing them to glean the relevance of the later-appearing information (of whatever content) and, hence, let it affect their judgments appropriately. These findings suggest that the early demonstrations of base-rate neglect might have had little to do with the contents of statistical versus individuating information per se, but they instead may be explicable by the fact that participants in those studies had sufficient processing resources to divine the judgmental relevance of the later-appearing information.
Implications
The present body of findings makes two essential points: First, the experienced task demands (e.g., manipulated via the order or length of information) are crucial, and they interact with the participants’ cognitive resources to determine the judgmental impact of the information given and, second, the way processing resources matter is unrelated to the content or type of information. A counterintuitive implication of these results is that information typically regarded as intuitive or heuristic (namely, information about the stereotypic characteristics of lawyers or engineers) may be utilized more than the base-rate information when the recipients’ resources are ample rather than limited, in situations in which the heuristic information is lengthier and more difficult to process than the base-rate information (as may have been the case in much classic research in the base-rate neglect paradigm). The possibility that heuristic information tends to be processed more under ample resource conditions is contrary to the usual assumption that heuristic information is generally relied upon more under conditions of restricted resources. Our findings here dovetail nicely with those in the persuasion domain (Kruglanski & Thompson, 1999a, 1999b; Pierro et al., 2004, 2005) that information which is identified as heuristic by its contents (e.g., as information related to the expertise heuristic or to the consensus heuristic) may exert greater persuasive impact under ample processing resources when its format and mode of presentation (e.g., its length and ordinal position) render it relatively difficult to process. Another counterintuitive (and ironic!) implication of our findings, already commented on, is that when the base rates are presented briefly and up front, their use may actually be enhanced by cognitive load. Therefore, information that has been regarded as representing the more appropriate, rational, or normative approach to judgment under uncertainty may, under the appropriate circumstances, actually carry greater weight when individuals’ mental capacity is restricted.
In this connection, it is, finally, important to note that none of the variables revealed (in the present as well as past research) to affect base-rate utilization are unique to base rates per se. Other informational types, manipulated in different research, have exhibited similar effects. Thus, Pierro et al. (2004) found that the potential relevance of both the consensus heuristic and message arguments was only appreciated when the individuals’ motivational resources were adequate to cope with the difficulty of processing each of these two types of information (manipulated via presentation order). As a consequence, each of these information types exerted the judgmental effects commensurate with its relevance only in conditions in which its processing difficulty was combined with motivation of a sufficient magnitude to overcome the hardship involved. The present work thus suggests that to understand the conditions under which any information given affects judgment requires the consideration of several parameters intersecting in a specific situation. These parameters include, first, the potential relevance (or diagnosticity) to the individual of such information; second, the task demands that determine the ease or difficulty of appreciating such relevance (e.g., as represented by information’s salience, its length, or its ordinal position); and, third, the processing resources (cognitive and motivational) that determine individuals’ ability to glean such relevance from the complex of stimuli by which they are confronted (for a discussion, see Kruglanski, Erb, Pierro, Mannetti, & Chun, in press).
References
Ajzen, I. (1977). Intuitive theories of events and the effects of base rate information on prediction. Journal of Personality and Social Psychology, 35, 303–314.
Albarracin, D., Johnson, B. T., & Zanna, M. P. (2005). The handbook of attitudes. Mahwah, NJ: Erlbaum.
Bar-Hillel, M. (1980). The base-rate fallacy in probability judgments. Acta Psychologica, 44, 211–233.
Bar-Hillel, M. (1983). The base-rate fallacy controversy. In R. W. Scholz (Ed.), Decision making under uncertainty (pp. 39–61). Amsterdam, the Netherlands: Elsevier.
Bar-Hillel, M. (1990). Back to base-rates. In R. M. Hogarth (Ed.), Insights in decision making (pp. 200–216). Chicago: University of Chicago Press.
Borgida, E., & Brekke, N. (1980). The base rate fallacy in attribution and prediction. In J. H. Harvey, W. J. Ickes, & R. F. Kidd (Eds.), New directions in attribution research (Vol. 3, pp. 63–95). Hillsdale, NJ: Erlbaum.
Brewer, M. B. (1988). A dual process model of impression formation. In T. K. Srull & R. S. Wyer (Eds.), Advances in social cognition (Vol. 1, pp. 1–36). Hillsdale, NJ: Erlbaum.
Brewer, M. B., Feinstein, A. S., & Harasty, A. S. (1999). Dual processes in the cognitive representation of persons and social categories. In S. Chaiken & Y. Trope (Eds.), Dual-process theories in social psychology (pp. 255–270). New York: Guilford Press.
Chaiken, S., Liberman, A., & Eagly, A. H. (1989). Heuristic and systematic processing within and beyond the persuasion context. In J. S. Uleman & J. A. Bargh (Eds.), Unintended thought (pp. 212–252). New York: Guilford Press.
Chaiken, S., & Trope, Y. (Eds.). (1999). Dual-process theories in social psychology. New York: Guilford Press.
Christensen-Szalanski, J. J. J., & Beach, L. R. (1982). Experience and the base rate effect. Organizational Behavior and Human Performance, 29, 270–278.
Chun, W. Y., Spiegel, S., & Kruglanski, A. W. (2002). Assimilative behavior identification can also be resource dependent: A unimodel perspective on personal-attribution phases. Journal of Personality and Social Psychology, 83, 542–555.
Erb, H.-P., Kruglanski, A. W., Chun, W. Y., Pierro, A., Mannetti, L., & Spiegel, S. (2003). Searching for commonalities in human judgement: The parametric unimodel and its dual mode alternatives. European Review of Social Psychology, 14, 1–47.
Fischhoff, B., Slovic, P., & Lichtenstein, S. (1979). Subjective sensitivity analysis. Organizational Behavior and Human Performance, 23, 339–359.
Fiske, S. T., Lin, M., & Neuberg, S. L. (1999). The continuum model: Ten years later. In S. Chaiken & Y. Trope (Eds.), Dual-process theories in social psychology (pp. 231–254). New York: Guilford Press.
Fiske, S. T., & Neuberg, S. L. (1990). A continuum model of impression formation, from category-based to individuating processes: Influences of information and motivation on attention and interpretation. In M. P. Zanna (Ed.), Advances in experimental social psychology (Vol. 23, pp. 1–74). New York: Academic Press.
Fiske, S. T., & Taylor, S. E. (1984). Social cognition. Reading, MA: Addison-Wesley.
Fiske, S. T., & Taylor, S. E. (1991). Social cognition. New York: McGraw-Hill.
Gilbert, D. T., & Hixon, J. G. (1991). The trouble of thinking: Activation and application of stereotypic beliefs. Journal of Personality and Social Psychology, 60, 509–517.
Ginossar, Z., & Trope, Y. (1987). Problem solving in judgment under uncertainty. Journal of Personality and Social Psychology, 52, 464–476.
Grice, H. P. (1975). Logic and conversation. In P. Cole & J. L. Morgan (Eds.), Syntax and semantics: Vol. 3. Speech acts (pp. 41–58). New York: Academic Press.
Higgins, E. T. (1996). Knowledge activation: Accessibility, applicability and salience. In E. T. Higgins & A. W. Kruglanski (Eds.), Social psychology: A handbook of basic principles (pp. 133–168). New York: Guilford Press.
Hilton, D. J. (1995). The social context of reasoning: Conversational inference and rational judgment. Psychological Bulletin, 118, 248–271.
Hilton, D. J., & Slugoski, B. R. (2001). Conversational processes in reasoning and explanation. In A. Tesser & N. Schwarz (Eds.), Blackwell handbook of social psychology: Vol. 1. Intrapersonal processes (pp. 181–206). Oxford, England: Blackwell.
Kahneman, D. (2003). A perspective on judgment and choice: Mapping bounded rationality. American Psychologist, 58, 697–720.
Kahneman, D., & Tversky, A. (1973). On the psychology of prediction. Psychological Review, 80, 237–251.
Krosnick, J. A., Li, F., & Lehman, D. R. (1990). Conversational conventions, order of information acquisition, and the effect of base rates and individuating information on social judgments. Journal of Personality and Social Psychology, 59, 1140–1152.
Kruglanski, A. W., Erb, H. P., Pierro, A., Mannetti, L., & Chun, W. Y. (in press). On parametric continuities in the world of binary either ors. Psychological Inquiry.
Kruglanski, A. W., Pierro, A., Mannetti, L., Erb, H. P., & Spiegel, S. (in press). Persuasion according to the unimodel. Journal of Communication Research.
Kruglanski, A. W., & Thompson, E. P. (1999a). The illusory second mode or, the cue is the message. Psychological Inquiry, 10, 182–193.
Kruglanski, A. W., & Thompson, E. P. (1999b). Persuasion by a single route: A view from the unimodel. Psychological Inquiry, 10, 83–110.
Lynch, J. G., & Ofir, C. (1989). Effects of cue consistency and value on base-rate utilization. Journal of Personality and Social Psychology, 56, 170–181.
Manis, M., Dovalina, I., Avis, N. E., & Cardoze, S. (1980). Base rates can affect individual predictions. Journal of Personality and Social Psychology, 38, 231–248.
Meehl, P. E., & Rosen, A. (1955). Antecedent probability and the efficiency of psychometric signs, patterns, or cutting scores. Psychological Bulletin, 52, 194–216.
Nisbett, R. E., & Borgida, E. (1975). Attribution and the psychology of prediction. Journal of Personality and Social Psychology, 32, 932–943.
Nisbett, R. E., Borgida, E. E., Crandall, R., & Reed, H. (1976). Popular induction: Information is not always informative. In J. S. Carroll & J. W. Payne (Eds.), Cognition and social behavior (pp. 23–45). Potomac, MD: Erlbaum.
Petty, R. E., & Cacioppo, J. T. (1984). The effects of involvement on response to argument quantity and quality: Central and peripheral routes to persuasion. Journal of Personality and Social Psychology, 46, 69–81.
Petty, R. E., & Cacioppo, J. T. (1986). The elaboration likelihood model of persuasion. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 19, pp. 123–205). San Diego, CA: Academic Press.
Pierro, A., Mannetti, L., Erb, H. P., Spiegel, S., & Kruglanski, A. W. (2005). Informational length and order of presentation as determinants of persuasion. Journal of Experimental Social Psychology, 41, 458–469.
Pierro, A., Mannetti, L., Kruglanski, A. W., & Sleeth-Keppler, D. (2004). Relevance override: On the reduced impact of cues under high motivation conditions of persuasion studies. Journal of Personality and Social Psychology, 86, 252–264.
Schwarz, N., Strack, F., Hilton, D., & Naderer, G. (1991). Base rates, representativeness, and the logic of conversation: The contextual relevance of “irrelevant” information. Social Cognition, 9, 67–84.
Trope, Y., & Gaunt, R. (2000). Processing alternative explanations of behavior: Correction or integration? Journal of Personality and Social Psychology, 79, 344–354.
Trope, Y., & Ginossar, Z. (1988). On the use of statistical and nonstatistical knowledge: A problem solving approach. In D. Bar-Tal & A. W. Kruglanski (Eds.), The social psychology of knowledge (pp. 209–230). Cambridge, England: Cambridge University Press.
Tversky, A., & Kahneman, D. (1974, September 27). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124–1131.
Tversky, A., & Koehler, D. J. (1994). Support theory: A nonextensional representation of subjective probability. Psychological Review, 101, 547–567.
Received April 12, 2005
Revision received October 26, 2005
Accepted November 4, 2005
Journal of Personality and Social Psychology 2006, Vol. 91, No. 2, 218–231
Copyright 2006 by the American Psychological Association 0022-3514/06/$12.00 DOI: 10.1037/0022-3514.91.2.218
Everyday Magical Powers: The Role of Apparent Mental Causation in the Overestimation of Personal Influence
Emily Pronin, Princeton University
Daniel M. Wegner and Kimberly McCarthy, Harvard University
Sylvia Rodriguez, Princeton University
These studies examined whether having thoughts related to an event before it occurs leads people to infer that they caused the event, even when such causation might otherwise seem magical. In Study 1, people perceived that they had harmed another person via a voodoo hex. These perceptions were more likely among those who had first been induced to harbor evil thoughts about their victim. In Study 2, spectators of a peer’s basketball-shooting performance were more likely to perceive that they had influenced his success if they had first generated positive visualizations consistent with that success. Observers privy to those spectators’ visualizations made similar attributions about the spectators’ influence. Finally, additional studies suggested that these results occur even when the thought-about outcome is viewed as unwanted by the thinker and even in field settings where the relevant outcome is occurring as part of a live athletic competition.
Keywords: magical beliefs, causal inference, self-perception, apparent mental causation, conscious will
Most of us make our way through life without any magical powers. Unlike Harry Potter, Superman, or other characters in fantasy, we find we can barely get the lid off the peanut butter jar, let alone levitate a villain or produce a banquet at the wave of a wand. There are some circumstances, though, in which we do find ourselves doing rather remarkable things. Every so often, we may learn that someone we have wished ill actually has become ill, for instance, or that the sports team for which we are cheering has in fact gone and won the game. When such things happen, although we are far from causal, we may nonetheless experience a sense of authorship—a feeling that we caused the events we had imagined. This feeling need not be particularly magical, however, as it may arise from the normal processes by which we infer the operation of our own causal influence in the world. The present experiments were designed to examine whether and when such experiences of everyday magical powers might arise.
Magical Thinking in Everyday Life
Belief in the ability to influence events at a distance with no known physical explanation has been termed magical thinking (e.g., Eckblad & Chapman, 1983; Nemeroff & Rozin, 2000; Woolley, 1997; Zusne & Jones, 1989). Perhaps because this definition focuses on striking departures from normative reasoning, many of us would probably deny believing in magic. But many would not be surprised to learn that magical thinking has been found among people living in tribal cultures (e.g., Golden, 1977), people experiencing psychosis from schizophrenia or bipolar disorder (e.g., Thalbourne & French, 1995), and young children who have yet to learn the principles of science (Piaget, 1929; Woolley, 1997). What people probably would be surprised to learn is that glimmers of magical thinking appear even in ordinary people and circumstances when events conspire to promote it.
Research has shown several manifestations of everyday magical thinking. Studies by Rozin and colleagues have shown, for example, that people hold magical beliefs about contagion and contamination that can lead them to decline consuming a glass of juice that once had a sterilized roach in it or to decline sipping sugar water arbitrarily labeled “Sodium Cyanide” (Rozin, Millman, & Nemeroff, 1986; see Nemeroff & Rozin, 2000, for a review). Subbotsky (2004) has found that people sometimes behave as though they fear the operation of magical forms of causation. For example, people are reluctant to put their hand in a box when the suggestion is made that the box could cause harm to their hand, albeit through no known physical mechanism. Finally, superstition and magical thinking are observed in circumstances involving stressful and uncertain events. For example, college athletes show superstitious behaviors in sports competitions (Bleak & Frederick,
Emily Pronin, Department of Psychology, Princeton University; Daniel M. Wegner and Kimberly McCarthy, Department of Psychology, Harvard University; Sylvia Rodriguez, Department of Psychology, Princeton University. This research was supported by National Institute of Mental Health Grant MH-63524 to Emily Pronin and Grant MH-49127 to Daniel M. Wegner as well as by Princeton University start-up funds to Emily Pronin and Harvard College Research Program funds to Kimberly McCarthy. We thank Steven Koh, Alyssa Knotts, Awilda Mendez, Manish Pakrashi, Patrice Ryce, Joshua Savage, Luke Stoeckel, and Lyle Williams for research assistance. We thank Nicholas Epley for helpful discussions about this research. Correspondence concerning this article should be addressed to Emily Pronin, Department of Psychology, Princeton University, Green Hall, Princeton, NJ 08540. E-mail:
[email protected]
1998; Ciborowski, 1997), and inhabitants of war zones report magical beliefs about their personal safety (Keinan, 1994). Research on magical thinking has offered a number of theoretical explanations for why it is that magical thinking is found among individuals who are well versed in concepts of physical science. One set of explanations, building on primitive laws of contagion and similarity (Frazer, 1890/1959; Mauss, 1902/1972), suggests that people act as though they believe in those laws. An implicit belief in these laws leads people to behave, in spite of a rational analysis to the contrary, as though physical contact between objects leads to the transfer of “essence” between the objects and as though the transfer of such essence creates a connection between the two entities, thus, for example, making us unwilling to wear a washed garment once worn by a Nazi (Rozin et al., 1986; Rozin & Nemeroff, 2002). But why might people act as though they subscribe to magical beliefs, such as beliefs in the laws of contagion and similarity, even though these beliefs defy rational, scientific analysis? Developmental psychologists (e.g., Piaget, 1929; Subbotsky, 2000, 2004; Woolley, 1997; also Freud, 1913/1950) have suggested that magical thinking could be a holdover from infancy when scientific conceptions of causality are less well understood (and less culturally ingrained). Consistent with the idea that reliance on scientific explanations rather than magical ones increases with psychological development, adults often refuse to verbally endorse magical beliefs even though, like young children, they may behave as though they hold those beliefs (Nemeroff & Rozin, 2000; Subbotsky, 2004). Magical beliefs could also be the result of common cognitive errors involving the use of mental shortcuts, or heuristics (e.g., Gilovich, Griffin, & Kahneman, 2002; Kahneman, Slovic, & Tversky, 1982). 
One such heuristic involves the inference that the conceptual similarity of two events implies that one caused the other. This heuristic, known as the representativeness heuristic (Kahneman & Tversky, 1973), the resemblance criterion (Nisbett & Ross, 1980), or the assumption that likeness implies likelihood (Shweder, 1977) can lead to magical beliefs (e.g., “the sun is yellow and the sky is blue, so together they make the grass green”). It can also lead to rational inferences (e.g., “the paint can is green and the wet puddle on the floor is green, so the puddle must be from the paint”). This account of magical thinking thus allows for the same cognitive processes to govern both magical beliefs and commonplace causality assessments. Coming from a more motivational perspective, another explanation for magical thinking suggests that it occurs, particularly in times of uncertainty or stress, to serve a motivational need for control. Support for this explanation comes from studies showing that people display signs of magical thinking when they are faced with a combination of uncertainty about an outcome and a desire for control over that outcome (e.g., Bleak & Frederick, 1998; Friedland, Keinan, & Regev, 1992; Keinan, 1994, 2002; Matute, 1994). For example, magical thinking has been documented among people such as inhabitants of Germany in the interwar period living in an environment of high unemployment and political instability (Padgett & Jorgensen, 1982), police officers with jobs that put them in dangerous situations (Corrigan, Pattison, & Lester, 1980), HIV-infected men lacking desired agency over their health (S. E. Taylor, Kemeny, Reed, Bower, & Gruenewald, 2000), and even lottery players who possess “illusions of control”
regarding their ability to influence chance gambles (Langer, 1975). Even when people recognize that control over life events may be impossible to achieve, magical beliefs may arise out of a motivation to find “meaning” in that which they cannot control (Pepitone & Saffiotti, 1997).
Perceptions of Magical Powers
The foregoing review of research on magical thinking provides evidence for some forms of magical thinking and also for how and why such thinking might arise. In so doing, it also sheds light on our present concerns about how people might come to believe in their own magical powers. Previous research on magical thinking has suggested that people may act as though they believe they possess magical powers even when they might rationally deny that belief. It further has suggested that belief in magical powers could be traceable to basic cognitive errors involving the perception of causal relationships when only noncausal associations are present. It also has suggested a more motivational explanation for magical perceptions of control, involving a need to perceive oneself as able to attain desired outcomes in uncontrollable situations. With this prior theorizing in mind, we now explore in more detail a theory for how everyday magical powers could emerge.
The belief that one has exercised magical powers necessarily involves erroneous perceptions of one’s own actions. Because human action often originates from mental processes and contextual cues that are beyond conscious awareness (Nisbett & Wilson, 1977; Wegner & Bargh, 1998), people’s perceptions of the origin of their own actions are frequently subject to error (Nisbett & Ross, 1980; Nisbett & Wilson, 1977). Rather than pleading ignorance regarding the causes of their actions, however, people may infer that they have personally caused or willed an action whenever they draw a causal inference linking themselves to the action (Wegner, 2002, 2003; Wegner & Wheatley, 1999).
Consider the experience of thinking ill of someone just before that person falls victim to an unpleasant fate. One grumbles bitterly about Grandma just before she falls and breaks a hip or expresses anger toward the greedy landlord the day before he is arrested for tax evasion. 
Without a shred of evidence about who caused their troubles, one may still feel implicated. Returning to our analysis of the prior literature, this sense may range from an outright belief in personal responsibility for the bad outcome to a nagging feeling of responsibility that persists despite the rational belief that one is not actually responsible. In any case, the inference involves an erroneous perception of causality: The occurrence of an “evil thought” before a conceptually related negative event induces a sense of authorship for that event. It is also worth noting that the inference does not appear to be motivated by a desire to have attained the relevant outcome; the occurrence of an unwanted outcome (such as Grandma’s fall) could also elicit feelings of responsibility (and even accompanying feelings of guilt over that responsibility). On a more positive note, consider the experience of rooting for a favorite basketball team at its home stadium. A person watching with fingers crossed, while silently reciting a mantra for the team’s least reliable free-throw shooter, may feel deserving of some credit when the shot gracefully falls through the net. The point is similar to the one already stated: Generating consistent thoughts related to an event just prior to its occurrence may be sufficient to induce feelings of authorship for the event.
Apparent Mental Causation
The hypothesis we propose is that belief in magical powers can arise when individuals infer that they have personally caused events on the basis of perceptions of the relation between their thoughts and subsequent events. No prior studies on magical thinking have examined the role of thoughts in magical perceptions of influence. The idea that relevant thoughts could elicit feelings of personal causality does relate, however, to more general theorizing about causal inference processes. Causal theorists have emphasized the role of perceived covariance and of perceived consistency in causal inference (e.g., Hume, 1739/2000; Kahneman & Tversky, 1973; Kelley, 1972; Nisbett & Ross, 1980). The theory of “apparent mental causation” (Wegner, 2002; Wegner & Wheatley, 1999) combines these theoretical principles involving inferences about causation to offer the novel suggestion that these principles also explain how one makes inferences about mental causation. Having thoughts prior to an action that are consistent with that action, and that occur in the absence of other obvious causes, can lead one to infer that he or she caused the action.
This hypothesis about perceived mental causation begins with the idea that inferences about one’s own causal power can arise in the same way as inferences about physical causation. People infer that a particular physical event has caused an effect if it appears closely prior to the effect, is consistent with the effect, and appears exclusive of alternative causes of the effect (Alloy & Tabachnik, 1984; Einhorn & Hogarth, 1986; Michotte, 1946/1963). Perceptions of personal agency could similarly arise from apparent mental effects. 
Although thoughts are not the only possible source of information about personal authorship of action (other sources could include proprioceptive, positional, visual, and environmental cues), the fact that people often think about actions before performing them makes prior thought a regular and important cue to authorship (Aarts, Custers, & Wegner, 2005; Wegner & Sparrow, 2004; Wegner, Sparrow, & Winerman, 2004). Observing associations between external actions and one’s own thoughts, desires, and intentions may thus sometimes lead to the incorrect inference that one has somehow caused the actions to occur. This error in causal inference may underlie “illusions of control” (Langer, 1975; Matute, 1996; Thompson, Armstrong, & Thomas, 1998), in which people overestimate their causal impact on chance events by conceptualizing them as attributable to their own influence (although, it should be noted, no studies on the illusion of control have measured perceptions of the influence of thoughts).
People may be overly prone to entertain their thoughts, desires, and intentions as possible causes of action because these mental experiences are salient to them and, perhaps consequently, overweighted in terms of their diagnostic relevance (Jones & Nisbett, 1972). People tend to show an “introspection illusion” (Pronin, Gilovich, & Ross, 2004), whereby they treat introspective information about their thoughts, intentions, and motives as a sovereign source of self-understanding. This has been shown to contribute to our concluding that we have not engaged in actions that we did not think about or intend to engage in (e.g., actions involving self-serving commissions of bias). It could also contribute to our concluding that we have engaged in actions that we did think about or intend to engage in.
Consider again the case of Grandma and her hip-breaking fall. If, prior to her fall, one was overcome by momentary but mean-spirited thoughts about her, the present theorizing suggests that these thoughts could induce in that person a feeling of responsibility for her fall. Let’s imagine, though, that instead of grumbling about poor Grandma before her fall, one instead heard his or her cousin express irritation with her. In that case, a person might feel that Grandma’s fall was somehow the cousin’s fault, and that he should not have had such evil thoughts about Grandma. Inferences about personal agency that arise from prior and consistent mental activity may not be limited to perceptions of one’s own agency. They may typically involve ourselves as the relevant agent, though, as it is our own mental activity of which we are most likely to be aware.
The Present Research

If people attribute authorship for action on the basis of their own thoughts, this process may explain how they come to overestimate their personal influence in a variety of happenstance events. People may come to believe in the effectiveness of occult machinations such as voodoo curses, as well as in the influence of fan support in sports, by inferring that thoughts consistent with events are responsible for the events when these events occur. Our studies investigated these two examples. We examined the effects of individuals’ private thoughts on their perceived influence on external outcomes involving physical health symptoms (Study 1) and athletic performance (Study 2). Our first study tested whether belief in having harmed another person via a voodoo curse indeed arises when individuals are led to have prior thoughts that are consistent with the harm. Our second study tested whether belief in having helped another person via one’s spectatorship arises when people are led to have prior thoughts consistent with the help. It also explored whether an observer privy to these prior thoughts would arrive at the same belief. In Study 3, we examined whether this spectatorship effect would occur in a field setting. Study 4 was a correlational study that looked at whether these spectatorship effects would extend to people observing an outcome that they perceived as unwanted.
Study 1: The Witch Doctor’s Voodoo Curse

This study tested whether college students might come to believe that they had caused another person pain through a voodoo curse when they had thoughts about the person consistent with such harm. Experimental participants assumed the role of “witch doctor” in an ostensible voodoo enactment involving a confederate as their “victim.” To examine the influence of evil thoughts about the victim, we arranged for participants to encounter either a victim who was offensive or one who was neutral. After this encounter, participants were instructed to stick pins in a voodoo doll representing the victim, in the victim’s presence. The victim subsequently responded by reporting a slight headache, and participants were queried about their reactions to this symptom. This paradigm allowed for the investigation of whether participants who think ill of a “victim” are more likely than neutral-thinking participants to perceive that they caused the victim’s harm. We did not predict that our evil-thinking participants would feel more guilt, regret, and related negative affect, however, because we suspected that the victim’s ill fate would seem deserved on account of his offensive personality and behavior.
Method

Participants. Thirty-six individuals (16 men and 20 women) were randomly assigned to either the neutral thoughts condition or the evil thoughts condition. Participants were Harvard summer school students or other residents of Cambridge responding to participant recruitment flyers.

Procedure. The experimenter greeted the participant and confederate (a 22-year-old man) in a waiting area and escorted them to the laboratory. She seated them at a table, distinguished only by a handmade twig-and-cloth voodoo doll lying on it, and asked them to read and sign a sheet indicating informed consent. She explained that the experiment concerned “psychosomatic symptoms, physical health symptoms that result from psychological factors” and that the study was “investigating this question in the context of Haitian Voodoo.” (Although genuine Haitian Voodoo does not involve dolls, they were used here to conform to participants’ expectations about voodoo practice.) For background, the experimenter furnished both individuals with an abridged version of Cannon’s (1942) “Voodoo” Death. This scientific account of how voodoo curses might impact physical health (i.e., by inducing fear-associated psychological stress and acute hypotensive shock on the part of the intended victim) was included to bolster the plausibility of curse effects.

It was during these initial stages of the procedure that the experimental manipulation was delivered. In the condition designed to induce evil thoughts, the confederate arrived at the experiment 10 min late, thus keeping the participant and experimenter waiting. (If the participant was late, the confederate arranged to be even later.) When the experimenter politely commented that she was really glad he made it, as she was beginning to worry, he muttered (with apparent condescension): “What’s the big deal?” He wore a T-shirt emblazoned with the phrase “Stupid people shouldn’t breed,” and he chewed gum with his mouth open.
When the experimenter informed the participant and confederate that they had been given an extra copy of the consent form “to keep,” the confederate crumpled up his copy and tossed it toward the garbage can; he missed, shrugged, and left it on the floor. Finally, while he and the participant read the “Voodoo” Death article, he slowly rotated his pen on the tabletop, making a noise just noticeable enough to be grating. Postexperimental interviews indicated that participants in the evil thoughts condition indeed picked up on many of these annoyances and found themselves disliking the confederate. Although the confederate was, of course, aware of these adjustments in his behavior, he was otherwise uninformed about the study’s hypotheses. After reading “Voodoo” Death, participant and confederate were asked to pick slips from a hat to determine who would be “witch doctor” and who would be “victim.” Both slips were labeled witch doctor, but the confederate pretended that his said victim. The confederate victim was then asked to write his name on a slip of paper to be affixed to the doll. Both victim and witch doctor then completed a page entitled “Baseline Symptom Questionnaire” that asked them to indicate whether they currently had any of 26 physical symptoms (e.g., runny nose, sore muscles, headache), with space at the bottom for written elaboration. The confederate circled “No” for every symptom and elaborated with “Fine. No problems.” To ensure that the participant knew the victim’s purported health status, the experimenter verbally confirmed that he currently had no symptoms. At this point, the experimenter informed both individuals that “reported cases of voodoo” suggest that the witch doctor should have some time alone to “direct attention toward the victim, and away from external distractions” (before placing the curse by pricking the voodoo doll), and she escorted the victim from the room. 
The participant was then asked to generate vivid and concrete thoughts about the victim but not to say them aloud. After this minute, the experimenter returned with the victim, who was again seated across from the participant. The participant was then instructed to stick the five available pins into the doll in the locations of the “5 major weaknesses of the body: the head, the heart, the stomach, the left side, and the right side.” Once he or she was finished, and the doll was thus appropriately pierced, the victim was asked to complete a second symptom questionnaire (identical to the first, but titled “Current Symptom Questionnaire”). This time, the victim invariably circled one symptom: a headache. He elaborated at the bottom of the page: “I have a bit of a headache now.” When asked to confirm this symptom, he averred with a slightly uncomfortable facial expression and the response “Yeah.” The experimenter then stated that she would like to take some time with the victim to question him in detail about his symptoms but that she would first quickly ask the witch doctor some questions about his or her experiences in the experiment and provide some debriefing information. Thus, with the victim escorted from the room, the participant was presented with the dependent measures.

Dependent measures and debriefing. The participant’s questionnaire began by stating that one needed to complete it only if the victim reported physical health symptoms during the experiment (otherwise, it stated that a subject number atop the page would suffice). Attached to the page were the victim’s two symptom questionnaires. Our primary measure consisted of three items probing for participants’ feelings and beliefs about whether they harmed the victim (Cronbach’s α = .83). These were “Did you feel like you caused the symptoms that the ‘victim’ reported, either directly or indirectly?” “Do you feel that your practice of voodoo affected the victim’s symptoms?” (both anchored by 1 = not at all, 5 = somewhat, 9 = yes, definitely) and “How much do you feel like you tried to harm the victim?” (1 = not very much, 5 = somewhat, 9 = very much). A secondary set of measures assessed affective responses. Participants were asked to rate their current feelings of guilt, surprise, sadness, regret, anxiety, and happiness on scales anchored at 1 (not at all) and 9 (extremely).
An additional item directly dealt with perceptions of guilt, asking participants, “Do you feel that sticking the pins in the doll was a bad thing to do?” (1 = not at all, 5 = somewhat, 9 = yes, definitely). As a manipulation check on whether participants had generated appropriately malevolent or neutral thoughts, two final items were included (Cronbach’s α = .86): “Did any negative thoughts about the victim pop into your head during the minute you had to yourself before the voodoo exercise?” and “Did you have any negative thoughts toward the victim before (or while) you did the pin pricks?” (both anchored by 1 = definitely not, 5 = somewhat, 9 = definitely yes).

To probe for accurate suspicions that might render a participant’s data invalid, the experimenter preceded the debriefing by asking, “Do you think there was anything in this experiment that was not what it seemed?” In response to this probe, 5 participants (2 in the neutral thoughts condition, and 3 in the evil thoughts condition) accurately suspected that the victim was a confederate and/or had been told to report a headache, and they were thus excluded from further analyses. Finally, the participant was thoroughly debriefed about our hypotheses and deceptions, and the reasons for both, and was given course credit or monetary payment.
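The internal-consistency coefficients reported for the composite measures (Cronbach’s α) can be illustrated with a minimal sketch. The function name and the ratings below are hypothetical and are not data from the study; the sketch assumes the standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of summed scores), applied to complete responses.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one list of scores per questionnaire item; every list
    has one score per respondent, in the same respondent order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if any(len(scores) != n for scores in items):
        raise ValueError("every item needs one score per respondent")

    # Summed composite score for each respondent.
    totals = [sum(person) for person in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical 1-9 ratings from five respondents on three items.
ratings = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [1, 3, 3, 5, 5],
]
print(round(cronbach_alpha(ratings), 3))
```

Items that rise and fall together across respondents, as in this made-up example, yield a coefficient near 1; unrelated items yield a coefficient near 0.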
Results

Evil thinking. The evil thoughts condition successfully led participants to think ill of their victim. Participants in the evil thoughts condition reported more negative thoughts about the victim (M = 5.00) than did those in the neutral thoughts condition (M = 2.19). This difference was significant according to Welch’s analysis of variance (ANOVA), F(1, 30) = 13.52, p = .001, η² = .31. Welch’s ANOVA was used because Levene’s test indicated inequality of variances, F(1, 30) = 22.46, p < .0001.

Perceived causality. As predicted, the participants led to generate evil thoughts about their victim were more likely than the neutral-thinking participants to believe that they caused his headache. On our three-item measure of feelings and beliefs about causing harm to the victim, participants felt more responsible for the harm if they had first generated evil thoughts (M = 3.94) rather than neutral thoughts (M = 2.02), F(1, 30) = 5.29, p = .03, η² = .20. These feelings of responsibility were apparent on each of the individual items in the composite, indicating that evil-thinking participants were more likely to feel that they had tried to harm their victim and also that they had in fact caused such harm (Fs ranged from 3.58 to 6.17, p-values ranged from .02 to .07). The presence of evil thoughts was related to perceptions of causing the harm across all participants, as the correlation between the summed manipulation check items and the summed measures of causation was substantial, r(29) = .38, p = .03.

Supplemental (affective) responses. Participants’ affective reactions revealed no signs of guilt or negative affect, despite their sense of having harmed their victim. Factor analysis (with varimax rotation) of the affective responses revealed two factors, one involving guilt (i.e., guilt, sadness, regret, anxiety), and the other involving pleasant surprise (i.e., happiness, surprise). Participants prompted to think evil thoughts reported no more guilt than those prompted to think more neutrally (F < 1). Rather, they reported more pleasant surprise, F(1, 29) = 6.16, p = .02, η² = .18. In addition, the item asking whether participants felt that they had done a “bad thing” revealed no differences between the two thoughts conditions (F < 1). Perhaps participants saw the victim’s headache as a just reward for his unpleasant behavior, and so they were not upset at having caused him pain.
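To make the logic of the Welch test concrete, the following minimal sketch computes Welch’s statistic for two independent groups with unequal variances. With only two groups, Welch’s F equals the square of Welch’s t, and the error degrees of freedom follow the Welch-Satterthwaite approximation. The function name and the scores are hypothetical, not the study’s data, and the sketch stops at the test statistic because the standard library offers no F distribution for a p value.

```python
from statistics import mean, variance


def welch_f_two_groups(group_a, group_b):
    """Welch's F (= Welch's t squared) and its error df for two groups."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error contributed by each group's mean.
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b

    f_ratio = (mean(group_a) - mean(group_b)) ** 2 / (se2_a + se2_b)
    # Welch-Satterthwaite approximation to the error degrees of freedom.
    df_error = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return f_ratio, df_error


# Hypothetical ratings: a low-variance group and a high-variance group.
neutral = [1, 2, 3, 4, 5]
evil = [2, 4, 6, 8, 10]
f_ratio, df_error = welch_f_two_groups(neutral, evil)
print(f"F(1, {df_error:.2f}) = {f_ratio:.2f}")
```

Because the error degrees of freedom shrink as the group variances diverge, the test is more conservative than a pooled-variance ANOVA when the equal-variance assumption fails.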
Follow-Up Study of Instructed Thoughts

Two aspects of these findings prompted us to conduct a follow-up experiment in this paradigm. First, we were concerned that the manipulation of negative thoughts, although effective as assessed by the manipulation checks, could also have been a manipulation of negative affect. It might have been that the negative feelings toward the victim engendered by his untoward behavior were the active ingredient that enhanced participants’ feelings of causality in harming him, and we were curious whether enhanced causality would be found if participants’ evil thoughts were manipulated without such an affective instigation. Second, we suspected that if participants did not actually dislike the victim, and yet felt that they had caused the victim harm, they might express the guilt for their action that was not observed in this study.

The follow-up study set aside the manipulation of the victim’s behaviors as a way of inducing evil thoughts, and instead manipulated evil thoughts directly through verbal instructions. Participants (61 Harvard undergraduates) were randomly assigned to direct their attention toward a female victim either by thinking about her “worst possible fate” and reciting an evil chant about her (in the condition designed to induce evil thoughts) or by thinking about “what she may be like” and reciting a benign verse about her (in the condition designed to induce neutral thoughts). As in Study 1, they then stuck pins in a doll representing the victim, and she responded with “a headache.” Participants completed the same measures of their feelings of responsibility and/or causality and of their affective state in response to the victim’s reported symptom. Again, some participants were excluded from analyses because they were accurately suspicious (n = 9). Others were excluded because they failed to respond to the instructional manipulation; that is, they reported no negative thoughts in the evil-thoughts condition or highly negative thoughts in the neutral-thoughts condition (n = 8).

The results of the follow-up revealed that participants who followed instructions to have evil thoughts about a victim, as compared with those who followed instructions to have neutral thoughts, felt more responsibility for her pain on our index of three items probing for feelings and beliefs about having caused harm to the victim (Ms = 3.94 vs. 2.76), F(1, 42) = 5.14, p = .03, η² = .11. In addition, however, our affective measures indicated that the malevolent-thinking participants felt no more happiness and surprise than the neutral-thinking participants (Ms = 3.89 vs. 3.45), F < 1, whereas they did feel more guilt and negative affect than those participants (Ms = 4.60 vs. 3.24), F(1, 42) = 4.05, p = .05, η² = .09. Although the influence of the instructional manipulation in the follow-up study was not as strong on the cognitive measure as was the behavioral manipulation in the main experiment, participants apparently felt more guilt and negative affect about what they did in the case of the instructional study, perhaps because their evil thoughts about the victim were unjustified by any untoward behavior on her part.
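The effect sizes that accompany the F ratios in the follow-up study can be related to the test statistics with a short sketch. It assumes the conventional conversion for a single effect, η² = (F × df_effect) / (F × df_effect + df_error); whether the reported values were computed exactly this way is an assumption, and the function name is hypothetical.

```python
def eta_squared_from_f(f_ratio, df_effect, df_error):
    """Proportion of variance explained, recovered from an F ratio.

    Uses SS_effect / (SS_effect + SS_error), rewritten in terms of F.
    """
    explained = f_ratio * df_effect
    return explained / (explained + df_error)


# F ratios reported for the follow-up study, both with (1, 42) df.
print(round(eta_squared_from_f(5.14, 1, 42), 2))  # responsibility index
print(round(eta_squared_from_f(4.05, 1, 42), 2))  # guilt and negative affect
```

Under this assumption, the two conversions round to .11 and .09, the values reported for the responsibility index and for guilt and negative affect, respectively.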
Discussion

This study found that participants who had been induced to think ill of their victim were likely to feel that they had caused the victim’s symptoms and that their practice of voodoo had affected these symptoms. Those in a control condition that did not elicit ill thoughts were less likely to hold these beliefs. This difference between the two conditions is striking given that participants in both of the conditions observed the same correlation between their actions and the victim’s symptoms. Given the population of students sampled for our study, though, it is not at all surprising that this mere correlation was not sufficient to induce a large proportion of them to believe that they had just placed a voodoo hex (even given their reading of Walter Cannon’s article). What is surprising, however, is that they were more inclined to believe in the effectiveness of voodoo when their practice of it was accompanied by ill will.

In both conditions, participants placed a voodoo hex on their victim by sticking pins in a voodoo doll affixed with the victim’s name, and they observed the victim report a headache shortly thereafter. However, although participants in both conditions engaged in the same voodoo activities, those in the evil-thoughts condition felt somewhat more that they had tried to harm their victim. Perhaps their prior evil thoughts toward the rude and unpleasant confederate led them to feel more zealous about the voodoo task at hand. This could also explain why these participants experienced more pleasant surprise following the voodoo enactment. They may have viewed his suffering as a just punishment for his bad behavior.

This experiment involved causal inferences elicited by the awareness of one’s malevolent thoughts toward someone before harm befalls this person. But is this result limited to causal perceptions deriving from negative thoughts?
Perhaps the perception that one has personally influenced a relevant outcome could also derive from positive thoughts, such as “healing thoughts” directed toward an ailing loved one or “hopeful thoughts” directed toward a friend in need. Perhaps the sports spectator’s inner cheer operates in the same way, leading the spectator to feel a bit of everyday magical power when the game goes as hoped.
Study 2: The Spectator’s Inner Cheer

The inspiration for this experiment came from sporting-event spectators who perceive themselves as playing a role in their team’s performance even when their own participation involves nothing more than intense thoughts of hope or confidence, or perhaps a little armchair coaching. We sought to test the hypothesis that success thoughts directed toward a target (a basketball shooter) before his successful performance would lead to the perception of having influenced that performance. This experiment included a set of observer conditions designed to test whether observers, aware of actor participants’ thoughts, would arrive at the same causal conclusions as those actors. Some hints from prior research led us to suspect that they might. Specifically, this work has suggested that when observers are privy to the same internal information (such as thoughts, feelings, and intentions) as actors, they may come to the same conclusions as those actors (Buehler, Griffin, & Ross, 1994; Nisbett & Ross, 1980, Chapter 9).

This experiment used a mock basketball court with a confederate shooter trained to make successful shots and pairs of participants watching the confederate take those shots. “Spectator” participants were instructed to produce thoughts (i.e., mental visualizations) that were either consistent or inconsistent with the shooter’s success, and “witness” participants were provided with access to these thought instructions. The main prediction of the study was that participants who had consistent thoughts before a series of successful shots would feel more responsibility for the success of those shots than would participants who had inconsistent thoughts before the shots. The study also tested whether a witness who had access to the spectator’s visualization instructions would similarly view the spectator as responsible for the shooter’s success.
Method

Participants. One hundred two high school and college students (29 men and 73 women, all age 18 years or older) attending Harvard summer school received course credit or monetary payment for their participation.

Experimental setup. To ensure consistent success in the shooter’s performance, the role of the shooter was played by a confederate (a male undergraduate). His task involved shooting a toy basketball (10 cm in diameter) while blindfolded into a 20-cm-diameter basket that was 1.37 m away and 1.52 m high. The blindfold was used to increase the apparent difficulty of the task (so that our participants would not perceive the shooter’s success as inevitable). It was actually semitransparent, thereby allowing the confederate to shoot quite successfully.

Procedure. The experimenter greeted the participants and confederate and told them that the experiment was about “the effects of spectator influence on athletic performance.” The participants and confederate were then asked to provide written informed consent, and the experiment began. The experimenter explained that the study required a shooter, a spectator, and a witness. She then stated that each of these roles would be randomly assigned using the last four digits of the participants’ social security numbers so that the participant with the lowest number would be shooter, the next-lowest spectator, and the highest witness. The 2 participants were randomly assigned to their roles via this method, with the exception that the confederate always secured the role of shooter (by offering his number last so that he could offer the lowest number of the three).
The experimenter next described the three roles. To the confederate, she said: “You will be playing the role of the shooter. You will attempt 8 shots and your performance will be recorded. In order to vary the difficulty of the task across participants, we will have you wear this blindfold while shooting.” To the spectator, she said: “You will play the role of the spectator. You will be asked to visualize something different before each shot. The details about each visualization are given to you in this packet. Before each shot, you will read and memorize the visualization. Then, you will close your eyes and visualize the action described.” Finally, to the witness, she said: “You will play the role of the witness. You are asked to simply observe the spectator and shooter as they perform their roles. Just to let you know everything that will be going on, please take a minute to read this packet to familiarize yourself with the visualization instructions that will be given to the spectator.” The experimenter then gave the visualization packet to the witness and asked the confederate to step to a line on the floor to be blindfolded. Once the witness finished reading, the spectator was instructed to read the first visualization. Participant pairs were randomly assigned to the consistent thoughts or inconsistent thoughts conditions. The confederate was never informed of his condition. Examples of the eight visualizations (one for each shot) provided to participants in the consistent thoughts condition included the following: (a) the shooter releases the ball and it swooshes through the net, (b) the shooter’s arm extends and the ball falls into the hoop, and (c) the shooter tosses the ball and it falls through the net. 
Examples of visualizations in the inconsistent thoughts condition included the following: (a) the shooter’s arm curls to lift the dumbbell to his/her shoulder, (b) the shooter’s elbow bends to lift the dumbbell to his/her shoulder, and (c) the shooter pulls the dumbbell up from thigh to shoulder level. While the spectator read these instructions, the confederate took three practice shots. He was trained to make only one of these shots, as a way of demonstrating to participants that he was not naturally brilliant at this task. The experimenter then said:

The main part of the experiment will now begin. In order to be as unobtrusive as possible, I will not speak during the experiment except to cue the spectator and shooter. First I will say “OK,” which will cue the spectator to flip to the next page and begin the visualization. About 10 seconds later, I will say “shoot,” which will cue the spectator to open his/her eyes and watch, and will cue the shooter to take a shot. I will record whether the shot is successful or not for each trial. The witness will simply observe both the spectator and the shooter while they perform their tasks. The spectator and witness should keep track of how many of the 8 shots go in.

After ensuring that everyone understood, the experimenter said “OK” to start the trials. The confederate was trained to make 6 of the 8 shots. Because he was not perfectly able to control his performance (he averaged 5.4 successful shots), it was recorded each time.¹ After the eight trials, the shooter was told to remove his blindfold. Shooter, spectator, and witness were told that they would be interviewed before they would be dismissed. The shooter was told that his interview would be longer and that for that reason, the other participants would be interviewed first. He was asked to wait in the hall while these interviews were conducted. The participants each then received the dependent measure questionnaire.

Dependent measures and debriefing.
As a check on participants’ attention to the shooter’s performance, the questionnaire began by asking, “How many shots did the shooter make?” (with “____/8” as the response stimulus).² The next two questions constituted our measure of perceived visualization clarity (Cronbach’s α = .86). These items were necessarily worded differently for the spectator than for the witness. The spectator was asked, “How clearly did you visualize each of the actions you were asked to?” and “How vivid were your visualizations?”; the witness was asked, “How clearly do you think the spectator visualized each of the actions he/she was asked to?” and “How vivid do you think the spectator’s visualizations were?”

The next set of questions involved perceptions of the spectator’s influence on the shooter’s successful shots (Cronbach’s α = .89). These items involved the feeling that the spectator’s thoughts influenced the shooter’s success (“Did you feel like [your/the spectator’s] thoughts influenced the success of the shooter’s shots?” “Did you feel that [your/the spectator’s] visualizations affected the shooter’s performance?”), the belief that the spectator somehow caused the shooter’s success (“How much responsibility do you think [you deserve/the spectator deserves] for the shooter’s successful shots?”; “Did [you/the spectator] cause the shooter’s successful shots?”), and the perception that the spectator had intended to influence the shooter’s success (“How much do you feel like [you/the spectator] tried to influence the shooter’s performance?”). All items, except for the question regarding the shooter’s hit rate, were accompanied by 7-point response scales (anchored at 1 = not at all and 7 = very, with the midpoint of 4 labeled somewhat).

To probe for accurate suspicions that might render a participant’s data invalid, the experimenter began the debriefing with the question, “In some psychological experiments, not everything is exactly what it seems. Was there anything in this study that you thought may not have been what it seemed?” and followed this up with the questions “Did you have any suspicions about the study? If so, what specifically?” On the basis of this probe, 1 spectator in the consistent thoughts condition, 2 witnesses in the consistent thoughts condition, 3 spectators in the inconsistent thoughts condition, and 1 witness in the inconsistent thoughts condition were excluded from analyses because they correctly suspected the nature of the study’s hypothesis. (Less pointed suspicions that the shooter might be a confederate or that his blindfold might be inadequate revealed no associations with condition and/or role, and thus participants with either of those suspicions were kept in our sample.) Finally, the participant was thoroughly debriefed about our purposes, predictions, and deceptions.

¹ The number of successful shots made by the confederate differed by about one half of one shot between the consistent thoughts condition (M = 5.61) and the inconsistent thoughts condition (M = 5.15), F(1, 101) = 7.16, p < .01. The reported effects and their statistical significance were unaffected when the confederate’s success rate was used as a covariate.
Results

Spectators’ causal influence. Our primary prediction was that spectators who generated thoughts consistent with the shooter’s success would feel they had more causal impact than spectators who generated thoughts irrelevant to it. We also wondered whether yoked observers privy to the spectators’ visual thoughts would also see the spectators as more responsible when they generated consistent rather than inconsistent thoughts. To test these predictions, we performed a two-way (Thought: Consistent vs. Inconsistent × Role: Spectator vs. Witness) ANOVA on our five-item spectator’s influence composite measure. It revealed that spectators and witnesses in the consistent thoughts condition attributed more causal influence to the spectator (M = 2.38) than did spectators and witnesses in the inconsistent thoughts condition (M = 1.63), F(1, 92) = 11.45, p = .001, η² = .11. These differences were apparent for items involving the feeling that the spectator’s thoughts influenced the shooter, the belief that the spectator caused his success, and the perception that the spectator tried to influence his performance (F values ranged from 6.35 to 8.94, p-values ranged from .01 to .004). There was no significant main effect of role, and there was no significant Thought × Role interaction (Fs = 1.39 and 0.37, respectively).

Visualization clarity. Visualizations that were consistent with the shooter successfully making shots were reported as clearer (M = 5.36) than were inconsistent visualizations involving him lifting a dumbbell (M = 4.79), F(1, 92) = 6.63, p = .01, η² = .06. Perceived clarity was also associated with perceptions of spectators’ causal influence on our measure of spectator influence, F(1, 92) = 5.53, p = .02. The results of the subsequent analyses, and their statistical significance, were not changed when visualization clarity was included as a covariate.
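The structure of the Thought × Role analysis can be sketched as a 2 × 2 between-subjects ANOVA. The sketch below is a simplified illustration that assumes equal cell sizes (the actual cells were unequal after exclusions), and the function name and ratings are hypothetical rather than the study’s data. It returns F ratios only, since each effect has 1 degree of freedom and the standard library has no F distribution for p values.

```python
from statistics import mean


def two_by_two_anova(cells):
    """Balanced 2 x 2 between-subjects ANOVA.

    `cells` maps (thought, role) -> list of scores, equal n per cell.
    Returns F ratios for both main effects and the interaction,
    plus the error degrees of freedom.
    """
    thoughts = sorted({thought for thought, _ in cells})
    roles = sorted({role for _, role in cells})
    n = len(next(iter(cells.values())))
    if len(thoughts) != 2 or len(roles) != 2 or len(cells) != 4:
        raise ValueError("expected a complete 2 x 2 design")
    if any(len(scores) != n for scores in cells.values()):
        raise ValueError("this sketch requires equal cell sizes")

    all_scores = [x for scores in cells.values() for x in scores]
    grand = mean(all_scores)
    cell_mean = {key: mean(scores) for key, scores in cells.items()}
    thought_mean = {t: mean([cell_mean[(t, r)] for r in roles]) for t in thoughts}
    role_mean = {r: mean([cell_mean[(t, r)] for t in thoughts]) for r in roles}

    # Each marginal mean is based on 2 * n scores.
    ss_thought = 2 * n * sum((m - grand) ** 2 for m in thought_mean.values())
    ss_role = 2 * n * sum((m - grand) ** 2 for m in role_mean.values())
    ss_cells = n * sum((m - grand) ** 2 for m in cell_mean.values())
    ss_interaction = ss_cells - ss_thought - ss_role
    ss_error = sum(
        (x - cell_mean[key]) ** 2 for key, scores in cells.items() for x in scores
    )

    df_error = len(all_scores) - 4
    ms_error = ss_error / df_error
    # Every effect has df = 1, so each mean square equals its sum of squares.
    return {
        "thought": ss_thought / ms_error,
        "role": ss_role / ms_error,
        "interaction": ss_interaction / ms_error,
        "df_error": df_error,
    }


# Hypothetical 1-7 influence ratings, two participants per cell.
ratings = {
    ("consistent", "spectator"): [4, 6],
    ("consistent", "witness"): [5, 7],
    ("inconsistent", "spectator"): [1, 3],
    ("inconsistent", "witness"): [2, 4],
}
print(two_by_two_anova(ratings))
```

In this made-up example the thought manipulation shifts both roles by the same amount, so the interaction F is zero, which mirrors the pattern of a main effect of thought without a Thought × Role interaction.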
Discussion

Spectators of a basketball shooter perceived themselves, and were perceived by others, as more responsible for the player’s success when they generated positive visualizations consistent with that success prior to its occurrence. The spectating participants themselves, as well as witnesses privy to the spectators’ visualizations, fell victim to this error in causal perception. This misperception was observable for questions involving both participants’ feelings that the spectator’s thoughts had affected the shooter and their belief that the spectator was somehow responsible for the shooter’s success. Apparently, generating visual thoughts about the success of the basketball-playing confederate, in his immediate presence, led to the inference that one contributed to his subsequent success.

This is noteworthy given that our general conception of the way in which spectators influence athletes involves more direct and less “magical” routes than positive visualizations. Both laypeople speculating about spectator influence, and researchers studying it, focus on the importance of a visible (and often loud) fan presence in motivating athletes by reminding them that their fans care and by providing moral support (Agnew & Carron, 1994; Baumeister & Steinhilber, 1984). In the current experiment, however, participants in both thought conditions provided a visible spectator presence, and participants in neither condition engaged in any observable cheering. The difference between the two conditions involved the private thoughts being entertained in the mind of the spectator. The impact of such private thoughts on perceptions of causal influence could in part explain why people sometimes exhibit an obstinate determination to watch their favorite team play in a crucial game or to stay glued to the television, cheering as forcefully as possible, during especially critical moments of play.
The experience of everyday magical powers makes people wary of cutting off their support at such key times, as they put off trips to the fridge and even avoid bathroom breaks in the pursuit of their team’s success. The present experiment examined how people feel about the spectator influence of themselves and of other people. Spectators engaged in a sort of mental cheerleading before each of the shooter’s shots, whereas witnesses were aware of the spectators’ mentation but were not instructed to mimic it. Nevertheless, witnesses of spectators who generated relevant thoughts were more likely to feel that the spectators had contributed to the shooter’s success. Witnesses seemed to be persuaded of spectators’ influence in much the same way that those spectators were (that is, by the spectators’ antecedent thoughts). The question of how a third-party observer perceives an individual’s causal influence over another person’s actions need not be investigated in as magical a context as the one here. The question has previously been asked by researchers who used the teacher–learner paradigm, in which teachers and observers assign credit for a student’s success to the teacher or to the student (Beckman, 1970; Frieze & Weiner, 1971; Johnson, Feigenbaum, & Weiby, 1964; Ross, Bierbrauer, & Polly, 1974). The present experiment suggests that teachers will overassign credit to themselves when they generate prior thoughts relevant to a student’s success (e.g., when they think of the answer to the math problem the student is computing just before the student writes it down). It also suggests that observers will similarly overassign credit to such teachers if they are made aware of the teachers’ thoughts prior to the student’s success.
2 Participants’ accuracy regarding the number of shots made by the confederate was measured by taking the absolute deviation of the number of shots the participant thought the shooter successfully made from the number of shots that he actually successfully made. Overall deviation from accuracy was quite low (M = 0.27). Out of 102 participants, only 7 misestimated by more than one shot (21 misestimated by exactly one shot). There were no significant effects on accuracy of participants’ assigned thought condition or their role as spectator versus witness; there was also no interaction between these two independent variables.
Study 3: The Fan’s Thoughtful Contributions (a Field Study) The studies we have reported thus far reveal that individuals are more likely to see themselves as having influenced outcomes when they have first generated thoughts relevant to those outcomes. Our next two studies bring our interests out of the laboratory and into the field to see whether our results obtain in less-controlled but more ecologically valid settings. Study 3 involved spectators at a college basketball game. The spectators were Princeton fans attending a critical match-up against Harvard, in which the Princeton team sought both to avenge a road loss against Harvard earlier in the season and to maintain a 15-game winning streak at home against the Harvard team. The study examined whether spectators’ perceptions of their influence on the game would vary depending on an experimental manipulation of the contents of their thoughts. This manipulation led them to think about how specific key players on their school’s team could contribute to the team’s play that day. In a control condition, participants were also led to think about those same key players, but this time in terms of how each one could be identified in a crowd. Our prediction was that in the context of a live basketball game, spectators would feel more responsible for the outcome of the game if they had, before the start of it, entertained outcome-relevant thoughts about how each player could contribute to the game.
Method Participants. Participants were 67 people (31 women and 36 men) in attendance at a men’s college basketball game at Princeton University’s Jadwin Gymnasium. Their median age was 49 years (range = 18 to 85). Procedure. Before the game began, spectators who had taken their seats in the stadium were approached by one of four experimenters and asked to provide consent to participate in the study. They were then given a pencil and a two-page survey (with an initial page of instructions). The first of the two survey pages introduced our experimental manipulation. The second page, which was folded over, stapled, and sealed from the
participant’s view, contained our dependent measures. Participants were also provided with the verbal (and written) instruction that they “complete the first page only” and that an experimenter would return to let them know “when to continue to the stapled page.” On the visible portion of the stapled page, further instructions reiterated, “Do not open this page until instructed by experimenter (the experimenter will return during a time-out).” An experimenter returned to each participant during a time-out and informed the participant that it was time to break the staple and finish the final part of the survey. Each participant was approached for a third time by an experimenter who collected the participant’s survey, offered some candy in appreciation, and answered any questions the participant had about the study. Experimental manipulation. The first page of participants’ surveys contained our experimental manipulation. This page was identical across conditions except for the instructions at the top. The page provided a photograph and an accompanying set of facts for each of seven different players on the Princeton team. These players were chosen because we expected (accurately) that the five starters would be selected from among them. The information provided about each player included his name, number, position, Princeton class, height, weight, high school, hometown, and major. In the player contribution thoughts condition, participants received the following instructions at the top of their page: The players below often start for the Princeton team. For each player, list one or two ways that you think that player could contribute to the team’s play today (based on the information listed about the player, or any other information you have). 
For example, next to one player you might list “Good 3-point shooter; quick rebounder” or “Looks big, could be strong on defense.” In the player identification thoughts condition, participants instead received the following instructions atop their page: The players below often start for the Princeton team. For each player, list one or two characteristics that could be used to identify this person in a crowd (based on the player’s appearance, the information listed about the player, or any other information you have). For example, next to one player you might list “Crew cut hair; thin build” or “Brown hair; shy.” This difference in instructions constituted our experimental manipulation. Dependent measures. Participants completed a page of dependent measures entitled “Final Survey.” Our primary measure consisted of three items probing for participants’ feelings and beliefs about whether they had influenced the athletes’ playing (thus far). These items were “How responsible do you feel for how the players have been playing so far?” “Up to this point, do you feel like you have affected how the players have performed?” and “How much do you feel like you have tried to influence the outcome of the game (up until now)?” All three items were on 6-point scales (1 = not at all, 2 = very slightly, 3 = a little bit, 4 = somewhat, 5 = a fair amount, 6 = a great deal). Cronbach’s alpha coefficient of reliability was .72. Participants were also asked to report their gender and age, the number of other Princeton men’s basketball games they had attended that season, whether they knew any players on the team, the current game score for both teams, and which team they thought would win.
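The reliability coefficient reported for the three-item responsibility measure can be illustrated with a short computation. The Python sketch below uses the standard formula for Cronbach's alpha, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The ratings are invented purely for illustration (they are not the study's data), and the function and variable names are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of ratings
    (one rating per respondent, same respondent order in every item)."""
    k = len(item_scores)                      # number of items
    n = len(item_scores[0])                   # number of respondents
    # Total score per respondent across all items.
    totals = [sum(item[i] for item in item_scores) for i in range(n)]
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))


# Hypothetical 6-point ratings from six respondents (NOT the study's data).
responsible = [1, 2, 2, 3, 1, 4]
affected = [1, 1, 2, 3, 2, 4]
tried = [2, 2, 3, 3, 1, 5]

alpha = cronbach_alpha([responsible, affected, tried])
print(f"alpha = {alpha:.2f}")
```

Alpha rises as the items covary more strongly relative to their individual variances, which is why it is used as an index of how well the items hang together as one composite.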
Results As predicted, the participants led to generate thoughts about how the team’s leading seven players could contribute to the game were more likely to feel that they influenced how the team played than were participants who thought about how those players could be identified in a crowd. On our three-item measure of feelings and
PRONIN, WEGNER, MCCARTHY, AND RODRIGUEZ
beliefs about influencing the athletes’ playing, participants felt more responsible for the team’s play if they had first thought about how certain key players could contribute to it (M = 2.24) rather than about how those team members could be identified (M = 1.57), F(1, 66) = 6.93, p = .01, η² = .10. This result was apparent for each of the individual items in the responsibility composite (Fs ranged from 3.90 to 4.80, p-values ranged from .03 to .05). We also measured a number of background variables to compare their effects with those of our manipulated independent variable. None of the following variables were associated with perceptions of having influenced the game: (a) prior attendance at the team’s games that season, (b) personal acquaintance with any of the players, (c) beliefs about which team would win, or (d) participant gender. (Pearson’s correlations ranged from −.08 to .16, ps > .21.) The only variable other than experimental condition that was associated with perceived responsibility was age, r(64) = .31, p = .01. Given the impact of this particular variable, our initial analysis involving perceived responsibility was also conducted with age as a covariate; the reported result was unaffected when effects of age were controlled for statistically, F(1, 66) = 8.69, p = .005, η² = .11.
Discussion People sitting in a stadium and watching a live college basketball game felt more responsible for the players’ performance in that game if they had first thought about how those players could contribute to that game rather than about how the players could be identified in a crowd. A small manipulation delivered before the start of the game influenced spectators’ perceptions of their influence over its outcome, even though their perceptions were measured more than 30 min later and participants had been engrossed in the game in the interim. A number of details of the experimental manipulation used in this study lend to the power of the present results. For one, participants in this study were not watching the game in a laboratory environment in which no other factors were likely to be in place to influence their perceptions of responsibility. On the contrary, the participants varied in their familiarity with the team, their dedication to attending its games, their motivation to cheer the team on, and their interest and ability to focus on the game (in the face of obstacles such as bored children, annoying bleachermates, and uncomfortable seats). Furthermore, the outcome of the game itself was not experimentally controlled, and thus participants were faced with a more ambiguous outcome than in our laboratory settings. It is important to note, with respect to all of the details of this study that were necessarily left free to vary, that only one of them (participant age) was associated with feelings of responsibility for the game’s outcome. The sorts of things that one might think would lead people to feel more influential in the game, such as personally knowing some of the players or attending many of the team’s games, had no noticeable effect. What did have an effect, by contrast, were participants’ prior thoughts about the team before they began watching the game. 
Perhaps the most important detail to note about our experimental manipulation of participants’ thoughts is how little it differed between conditions. In both conditions, participants thought in
detail about seven of the most critical players on the team. The precise content of their thoughts, in fact, was sometimes fairly similar: In both conditions, participants were likely to have thought about the height and weight of the players and, perhaps, whether they seemed sly, shy, or aggressive. The key difference, of course, was that in the condition predicted to induce feelings of responsibility, participants’ thoughts specifically focused on what the players could contribute to that day’s game. Thus whereas participants in both conditions may have thought about a player as being tall and heavyset, the participant who had such thoughts in the context of how the player could contribute to the game was more likely to feel responsible for the team’s performance than the participant who had such thoughts in the context of how the player could be identified in a crowd.
Study 4: The Armchair Quarterback’s Losing Thoughts When placing evil hexes on people, one typically hopes that some harm will befall them. Similarly, when rooting for favorite athletic teams, one typically hopes that they will win. If the hexes (or cheers) are met with success, one thereby finds oneself experiencing outcomes that were not only thought about but also wanted. In these situations, our results suggest, people are likely to perceive that they have contributed to the relevant outcomes. Do similar feelings of responsibility accompany the occurrence of outcomes that people have thought about, but not wanted? Such instances are commonplace. After taking an important exam, one might find one’s self thinking, “I failed it.” Or, after hearing about a faraway earthquake or tropical storm with as-yet-undetermined consequences, people might find themselves thinking that it will cause immense harm. In these cases, our thoughts are quite different from our desires. The present theorizing suggests that people feel more responsible for a failing grade on an exam or for death and destruction following a natural disaster if they have had thoughts related to those outcomes before their occurrence (rather than if they had been thinking about something else). This study examined whether individuals who generate thoughts relevant to an outcome feel more responsible for that outcome than their peers, even if they consider the outcome to be undesirable. The example investigated in this study involved watching a high-stakes athletic competition in which a favorite team is unable to take the lead. In this situation, we expected, a person would be likely to have many thoughts about the team’s prospects for winning or losing the game. In our theories, we suggested that in this situation one would feel more responsible for the team’s eventual loss than would a person who had not been thinking about the game. 
By examining whether individuals view themselves as responsible for a losing outcome, we address the possibility that perceptions of responsibility for thought-about outcomes simply reflect a self-serving tendency to take personal responsibility for desirable outcomes and to deny responsibility for undesirable ones. This study made use of a real event about which our participants felt passionately: The Super Bowl. Participants were residents of Princeton, New Jersey, which is less than 50 miles from Philadelphia. They had gathered to watch the 39th Super Bowl, in which the Philadelphia Eagles were taking on the defending Super Bowl champion New England Patriots. In this study, we sought to test whether fans who had just finished watching the game on television would feel differing degrees of responsibility for the outcome
of the game depending on how much they had thought about it—regardless of whether they perceived the outcome of the game as desirable or undesirable. This study also differed from our previous studies in that it measured participants’ own perceptions of their thoughts rather than manipulating those thoughts via external means.
Method Participants. Participants were 58 people (17 women, 39 men, and 2 who did not indicate gender) who had been watching the Super Bowl on a big-screen television in the student center at Princeton University. Procedure. Three experimenters distributed surveys to participants as the game came to a close, with the New England Patriots triumphing over the Philadelphia Eagles, 24 to 21. Participants were instructed to complete the one-page survey individually, and upon returning it, they were provided with candy as compensation. Survey. The survey was entitled “Super Bowl XXXIX Survey.” Two items measured participants’ amount of thinking about the game (Cronbach’s α = .59). These were “During this Super Bowl, how often did you think in advance about whether the upcoming play would be a run or a pass?” (1 = didn’t think about it before any of the plays, 5 = thought about it before approximately 10 of the plays) and “During this Super Bowl, what percent of the time were you thinking about the game?” (1 = 1%, or less [none of the time], 5 = 60%, or more [most of the time]).3 Two items measured participants’ perceived responsibility for the outcome of the game (Cronbach’s α = .91). These were “How responsible do you feel for the outcome of this Super Bowl Game?” and “Do you feel like you tried to influence the outcome of this Super Bowl?” (for both items, 1 = not at all, 2 = very slightly, 3 = a little bit, 4 = somewhat, 5 = a fair amount, 6 = a great deal). Participants were also asked to place a checkmark next to either the Patriots or the Eagles in response to the question, “Which team were you rooting for?”
Results Perceived causality. Consistent with our experimental findings, the more participants perceived themselves as having thought about the game, the more they felt responsible for the game’s outcome. This correlation was significant, r(56) = .40, p = .002. Indeed, whereas participants in the bottom quartile for their perceived degree of thinking about the Super Bowl did not view themselves as being at all responsible for its outcome (M = 1.14), participants in the top quartile actually viewed themselves as slightly responsible for the outcome of the game they had just watched on television (M = 2.48), F(1, 41) = 9.91, p = .003. The relevant correlations were significant for both the responsibility item concerning perceptions of trying to influence the game (r = .45, p = .0004) and for the item concerning actually having influenced it (r = .30, p = .02). Effects of seeing the outcome as a win versus a loss. Our primary interest in this study concerned whether participants who thought more about the game would perceive themselves as having had a greater impact on its outcome, even when the outcome was the opposite of what they had wanted. To investigate this question, we conducted correlational analyses separately for the 16 fans of the winning team and the 39 fans of the losing team. (Only 3 participants claimed not to have rooted for either team, and they are not included in these analyses.) The more that fans of the winning team reported thinking about the game, the more responsible they felt for its outcome, r(14) = .49, p = .05. More central to our present predictions, we expected to see similar results among participants who had experienced an unwanted loss. Consistent with our expectations, these losing participants also felt more responsible for the game’s outcome the more they had thought about the game, r(37) = .44, p = .005. The correlation between thoughts and perceived influence did not differ depending on whether the participant achieved a desired or undesired outcome, according to a z test of the difference between correlations, z = .19, p = .85. Finally, participants did not show any general tendency to make attributions of personal responsibility in a self-serving way. Their reported feelings of responsibility for the outcome of the game did not differ depending on whether they had experienced a win (M = 1.63) or a loss (M = 1.79), F < 1. (There also was no difference between the winners and losers in their reported amounts of thinking about the game, F < 1.)
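A z test of the difference between two independent correlations, such as the one comparing fans of the winning and losing teams, is commonly carried out with Fisher's r-to-z transformation. The article does not state which procedure was used, so the Python sketch below assumes that standard method and takes the group sizes (16 and 39) from the reported degrees of freedom. The function name is illustrative only.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Two-tailed z test for the difference between two independent
    Pearson correlations, using Fisher's r-to-z transformation."""
    z1 = math.atanh(r1)                       # Fisher transform of r1
    z2 = math.atanh(r2)                       # Fisher transform of r2
    # Standard error of the difference between the transformed values.
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    # Two-tailed p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Rounded correlations from the text: winners r(14) = .49, losers r(37) = .44.
z_stat, p_value = fisher_z_difference(0.49, 16, 0.44, 39)
print(f"z = {z_stat:.2f}, p = {p_value:.2f}")
```

With the rounded correlations from the text, this gives a z of roughly 0.2 and a two-tailed p of roughly .84, in line with the reported z = .19, p = .85. A gap of that size is what one would expect from rounding of the input correlations.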
Discussion This study provides two main results supportive of our theorizing. First, it provides evidence consistent with the experimental findings of our previous studies. Although the present results are correlational, and thus do not yield the opportunity to make causal inferences, it is noteworthy that when participants’ thoughts were left unperturbed by experimental manipulation, the same pattern evident in our experimental studies was again present. In this study, viewers of the Super Bowl on television who reported more thoughts about the game also reported feeling more responsible for its outcome. Even though they knew that the game they were watching was taking place hundreds of miles away, they nevertheless felt as though they had influenced it. A second finding of this study is equally important. That is, some of the participants had been rooting for the team that won and some (in fact, most) had been rooting for the team that lost. This allowed us to compare the association between thoughts and perceived influence for people who observed a desired outcome versus an undesired outcome. No difference between these two groups was apparent. Even those who witnessed the opposite of what they had hoped for felt more responsible for the outcome of the game the more they had thought about it. This result suggests that the tendency to take more credit for thought-about outcomes is not simply an artifact of a self-serving bias. That is, the experience of a causal link between thoughts and events does not seem to be a mere reflection of the tendency for people to take credit for things that they want (and to thereby also take credit for things that they think about, when those are also things that are wanted). Even when people have thoughts that are later followed by an unwanted outcome, they still are more likely to feel responsible for that outcome if their prior thoughts were related to it.
3 An alternative set of labels was also used for these scales, in which the first question was anchored at 1 (thought about it before 10 of the plays, or less) and 5 (thought about it before almost every play) and the second question was anchored at 1 (50%, or less [about half the time, or less]) and 5 (100% [constantly, the entire time]). Because these label differences are not the focus of the current study, they are not discussed in this article.
General Discussion These studies illustrate one means by which people may come to experience magical powers, or the feeling of having caused events they did not actually control. The particular means by which people may come to hold such beliefs is their inference of an association between an observed outcome and prior thoughts that are conceptually related to that outcome. In our first study, the outcome in question involved a peer’s adverse physical symptoms, and in the second study, it involved a peer’s successful athletic performance. In our third and fourth studies, the outcomes involved real athletic competitions. In each study, the relevant outcome occurred regardless of participants’ thoughts (it was experimentally predetermined in our first two studies, and it was part of a live sporting event in our second two studies). However, in each study, participants were more likely to feel and to believe that they were responsible for the relevant outcome if they had generated prior thoughts related to it. Participants in Study 1 were more likely to report having harmed their victim and caused his negative health symptoms if they had first been in a position to think ill of him. Comparison participants who had encountered no inducement to think ill of their victim were less likely to feel responsible for his reported plight. A follow-up study indicated that the tendency to feel responsible for harm to the victim even surfaced when participants merely followed instructions to think negative thoughts. Participants in Study 2 were more likely to report having aided their peer in his athletic success if they had first been asked to generate visual thoughts consistent with that success. Comparison participants who had been asked to generate visual thoughts unrelated to that success were less likely to feel and believe that they had somehow contributed to it. 
Furthermore, this effect was similar for participants who were not themselves asked to generate any such visualizations but who were rather exposed to a fellow participant’s visualizations and then asked about that participant’s effect on the athlete’s performance. In Study 3, fans of a college basketball team felt more responsible for the team’s performance if they had first thought about how specific key players could contribute to that performance (rather than if they had thought about how those players could be identified in a crowd). Finally, the results of our fourth study showed that participants who had thought more about a Super Bowl game they had been watching on television took more credit for its outcome— even if they viewed that outcome as a loss. Could the results of these studies be explained by experimental demand? Perhaps participants’ reported feelings of responsibility were prompted by their inference that this was what the experimenter expected. In experiments that are purportedly about the impact of voodoo curses on physical health or about the impact of spectatorship on athletic performance, it might be reasonable for a participant to infer that the experimenter expects such effects. That inference may have been further encouraged by our dependent measures, which directly asked participants about their feelings of personal responsibility. However, although these details may have suggested the experimenter’s expectations, that suggestion was clearly not sufficient to lead participants to claim responsibility. Participants in our control conditions were exposed to the same experimental cover story, and to identically worded dependent measures, but they readily denied responsibility for the outcomes
in question. It was specifically those participants who had generated outcome-relevant thoughts who expressed such responsibility. It would be difficult to argue that these two sets of participants held different beliefs about the experimenter’s expectations. For example, in the voodoo experiment, both sets of participants were told that the experimenter was interested in voodoo as a way of testing whether psychological factors can affect physical health. Both sets were asked to direct their thoughts toward their victim before placing a hex on him, and both sets were explicitly asked about their responsibility for the victim’s headache. The only difference was whether the victim himself behaved in a way that prompted the participant to think ill of him. Although our manipulations may have suggested to participants that it would be acceptable for them to express feelings of responsibility, it appears that when they did express such feelings it was because of their own thoughts and not because of any felt demand from the experimenter. Although our experiments did not create a demand that was sufficient to induce participants to claim responsibility for external outcomes, they likely did license participants to express such feelings when they had them. It is possible that people often have an intuitive sense of responsibility when external outcomes are preceded by their own relevant thoughts but that their rational mind leads them to disregard those feelings. Participants in the present experiments may have been less inclined to override their intuitions for the reasons described. To that end, an interesting direction for future work would be to use more spontaneous or indirect measures of responsibility. 
These could involve implicit measures of construct accessibility (e.g., accessibility of pride-related feelings in reaction to generating positive thoughts prior to an athlete’s success) or nonobvious measures of felt responsibility (e.g., willingness to lend a headache-stricken victim $5 to buy some aspirin). Such measures could shed light on the extent of participants’ magical beliefs. On a related note, it would be useful to use implicit manipulations of thought content that could not be plausibly linked to the experimenter’s goals or expectations. Such manipulations have been shown to induce causal inferences regarding ordinary outcomes (see Aarts et al., 2005); it would be interesting to see whether they would also induce such inferences regarding seemingly magical ones. Our studies dealt with positive thoughts and negative thoughts as well as with outcomes likely to be perceived as both desirable and undesirable. Evidence from a follow-up study using our voodoo scenario suggested that participants do not need to have desired the attending outcome to feel more responsible for it after generating relevant thoughts. In that study, evil-thinking participants felt more responsible for their victim’s negative health even though they also felt more guilt about that outcome (presumably because their victim had done nothing to merit their ill will). The results of our fourth study are also consistent with these results. Fans of the losing Super Bowl team felt just as responsible for the game’s outcome as winning fans. The more both sets of fans reported thinking about the game, the more responsibility they felt over it. Notably, participants in this study were likely to have differed in the precise content of their game-related thoughts. Some may have thought mostly about their team winning, others may have thought mostly about their team losing, and still others may have had both thoughts. 
The results of the study do not shed light on the question of whether these different thoughts may have
affected participants’ feelings of responsibility (e.g., such that envisioning the correct outcome engendered greater feelings of responsibility). The results of our two experiments involving athletic performance make it clear that thinking about athletes’ performance, rather than about something less relevant to that performance (such as their lifting a dumbbell, in Study 2, or their identifiability in a crowd, in Study 3), affects perceptions of personal influence. However, those experiments did not examine whether participants feel more responsible for athletes’ performance after generating “winning” versus “losing” thoughts about athletes’ subsequent winning versus losing performance. The question of whether people feel more responsible for outcomes they have thought about, even when their thoughts run contrary to the valence of the outcome, is a useful one for future research. One interesting case of this involves the feeling of having “jinxed” a desired outcome by thinking about it. People sometimes have the experience of generating positive thoughts about something in the hopes that it will happen and then blaming themselves afterward when it does not, as though their premature and overly positive thoughts must have brought on the bad luck. The admonition against “counting one’s chickens before they hatch” may reflect not only pragmatic wisdom about not investing too many resources in an uncertain outcome but also a more magical concern that such “mental counting” may decrease the likelihood that the desired outcome will occur. Displays of defensive pessimism (e.g., Norem, 2001; Norem & Cantor, 1986; K. M. Taylor & Shepperd, 1998) may in part be a way of preventing this; in these cases, people may intentionally envision negative outcomes rather than positive ones to avoid the possibility of “jinxing” themselves. 
The present studies support the notion that everyday processes of causal inference can lead normal people to develop the perception that they have magical powers. In so doing, this research also supports the theory of apparent mental causation (Wegner, 2002; Wegner & Wheatley, 1999) regarding how people perceive the cause of action more generally. According to this theory, people perceive human agents and their thoughts to be the cause of physical outcomes in much the same way that they perceive physical objects to be the cause of contiguous physical outcomes. That is, we conclude that the relevant agent has been causal when its thoughts are conceptually consistent with, and apparent prior to, the relevant physical outcome. Although our focus in this research has been on the magical nature of participants’ causal perceptions, it is worth placing that emphasis in cultural perspective. The beliefs in personal responsibility reported by participants in our studies seem to defy any known scientific mechanism of causation. However, some of our participants (and, perhaps, some of our readers) might defend their beliefs in the harmful power of voodoo or the positive power of athletic fans’ positive thinking. Cultural, national, and religious differences are likely to underlie such differences in beliefs. In future research, it would be interesting to examine whether such differences in background account for some of the effects observed in magical thinking experiments such as those reported here. Although many of our participants were White American college students, not all of them were, and it could be that some of our effects are attributable to the beliefs of a small subset of our participants who were from backgrounds in which belief in voodoo (or positive visualization) would not be considered magical.
Indeed, the present research suggests how certain cultures and groups might come to adopt beliefs in forms of causation for which Western science knows no physical mechanism. It could be that the adoption of magical beliefs involves the psychological mechanism we have put forth, combined with either a temporary or chronic disregard for (or unawareness of) Western scientific causal principles. For most people, magical beliefs may come to mind when their thoughts precede relevant external outcomes, but they may suppress them when concerns about scientific rationality are prominent. Participation in a scientific experiment in a psychology laboratory could be one such condition. In this regard, it is worth noting that participants in our studies were loath to use the top half of our scales when expressing beliefs in their personal responsibility for external outcomes. Those who generated evil thoughts about a voodoo victim before he got a headache or those who generated positive visualizations about a basketball shooter before he succeeded at making his shots acknowledged feeling some responsibility for the outcome—in comparison to their peers who did not report any such feelings— but their responses indicated that they attributed even more of that responsibility to some other source. In a sense, all of our participants were reluctant to express beliefs in personal mental causation; what is noteworthy is that those who generated relevant thoughts were more likely to overcome such reluctance. Although the focus of these studies is belief in personal causation, the results of one study (Study 2) suggest that a similar mechanism influences beliefs about other people’s causation. In that study, yoked observers were more likely to view spectators as responsible for an athlete’s success when those observers were privy to the spectators’ prior relevant thoughts. 
This tendency to infer others’ causal responsibility on the basis of their prior thoughts has numerous consequences. It suggests one reason why verdicts in murder cases often take into account a person’s thoughts prior to the killing. Premeditated killings involve prior relevant thoughts, and their agents are deemed more responsible (and hence subject to tougher sentencing). Such causal inferences need not only involve the placement of blame. People may be given far more personal credit for successes when they have thought long and hard about how to achieve those successes rather than when those successes simply have fallen in their lap. In such cases, mental effort, much like physical effort, may help a person lay claim to a subsequent success. Although the nature of one’s assessments of one’s own and others’ causal agency may both be responsive to the presence of relevant mental events, important differences in the causal assessments made are likely to arise because people are far less aware of others’ mental events than of their own. As a consequence, attention to mental events in making causal assessments may be a cause of seeing oneself as far more causally responsible than others for external outcomes (Ross & Sicoly, 1979). These studies focused on voodoo hexes and athletic fandom. The results also suggest, though, that healing thoughts directed at an ailing loved one or motivational thoughts of oneself succeeding at impending challenges would also yield the perception that one has personally caused the relevant outcomes, should they occur. Hopes and prayers, like curses and armchair cheers, may contribute to inflated estimates of personal agency when they occur just before the events they portend. In this sense, this research is not really about voodoo spells or athletic spectatorship at all. Although
its point of departure was the odd circumstance of a magical hex, it might be better understood as an examination of the inference processes that underlie the experience of authorship of action. Such inference processes can lead modern American college students to believe they have hurt someone with a voodoo doll, so perhaps they play a role in the self-perception of action more generally.
References Aarts, H., Custers, R., & Wegner, D. M. (2005). On the inference of personal authorship: Enhancing experienced agency by priming effect information. Consciousness and Cognition, 14, 439 – 458. Agnew, G. A., & Carron, A. V. (1994). Crowd effects and the home advantage. International Journal of Sport Psychology, 25, 53– 62. Alloy, L. B., & Tabachnik, N. (1984). Assessment of covariation by humans and animals: The joint influence of prior expectations and current situation information. Psychological Review, 91, 112–149. Baumeister, R. F., & Steinhilber, A. (1984). Paradoxical effects of supportive audiences on performance under pressure: The home field disadvantage in sports championships. Journal of Personality and Social Psychology, 47, 85–93. Beckman, L. (1970). Effects of students’ performance on teachers’ and observers’ attributions of causality. Journal of Educational Psychology, 61, 76 – 82. Bleak, J., & Frederick, C. M. (1998). Superstitious behavior in sport: Levels of effectiveness and determinants of use in three collegiate sports. Journal of Sport Behavior, 21, 1–15. Buehler, R., Griffin, D., & Ross, M. (1994). Exploring the “planning fallacy”: Why people underestimate their task completion times. Journal of Personality and Social Psychology, 67, 366 –381. Cannon, W. B. (1942). “Voodoo” death. American Anthropologist, 44, 182–190. Ciborowski, T. (1997). “Superstition” in the collegiate baseball player. Sport Psychologist, 11, 305–317. Corrigan, R. S., Pattison, L., & Lester, D. (1980). Superstition in police officers. Psychological Reports, 46, 830. Eckblad, M., & Chapman, L. J. (1983). Magical ideation as an indicator of schizotypy. Journal of Consulting and Clinical Psychology, 51, 215– 225. Einhorn, H. J., & Hogarth, R. M. (1986). Judging probable cause. Psychological Bulletin, 99, 3–19. Frazer, J. G. (1959). The golden bough: A study in magic and religion. New York: Macmillan. (Original work published in 1890) Freud, S. (1950). 
Totem and taboo: Some points of agreement between the mental lives of savages and neurotics (J. Strachey, Trans.). New York: W. W. Norton. (Original work published in 1913) Friedland, N., Keinan, G., & Regev, Y. (1992). Controlling the uncontrollable: Effects of stress on illusory perceptions of controllability. Journal of Personality and Social Psychology, 63, 923–931. Frieze, I., & Weiner, B. (1971). Cue utilization and attributional judgments for success and failure. Journal of Personality, 39, 591–606. Gilovich, T., Griffin, D., & Kahneman, D. (2002). Heuristics and biases: The psychology of intuitive judgment. New York: Cambridge University Press. Golden, K. M. (1977). Voodoo in Africa and the United States. American Journal of Psychiatry, 134, 1425–1427. Hume, D. (2000). A treatise of human nature (D. Norton & M. Norton, Eds.). London: Oxford University Press. (Original work published in 1739) Johnson, T. J., Feigenbaum, R., & Weiby, M. (1964). Some determinants and consequences of the teacher’s perception of causation. Journal of Educational Psychology, 55, 237–246. Jones, E. E., & Nisbett, R. E. (1972). The actor and the observer:
Divergent perceptions in the causes of behavior. In E. E. Jones, D. E. Kanouse, H. H. Kelley, R. E. Nisbett, S. Valins, & B. Weiner (Eds.), Attribution: Perceiving the causes of behavior (pp. 79 –94). Morristown, NJ: General Learning Press. Kahneman, D., Slovic, P., & Tversky, A. (1982). Judgment under uncertainty: Heuristics and biases. New York: Cambridge University Press. Kahneman, D., & Tversky, A. (1973). On the psychology of prediction. Psychological Review, 80, 237–251. Keinan, G. (1994). Effects of stress and tolerance of ambiguity on magical thinking. Journal of Personality and Social Psychology, 67, 48 –55. Keinan, G. (2002). The effects of stress and desire for control on superstitious behavior. Personality and Social Psychology Bulletin, 28, 102– 108. Kelley, H. H. (1972). Causal schemata and the attribution process. In E. E. Jones, D. E. Kanouse, H. H. Kelley, R. E. Nisbett, S. Valins, & B. Weiner (Eds.), Attribution: Perceiving the causes of behavior (pp. 151– 174). Morristown, NJ: General Learning Press. Langer, E. J. (1975). The illusion of control. Journal of Personality and Social Psychology, 32, 311–328. Matute, H. (1994). Learned helplessness and superstitious behavior as opposite effects of uncontrollable reinforcement in humans. Learning and Motivation, 25, 216 –232. Matute, H. (1996). Illusion of control: Detecting response-outcome independence in analytic but not naturalistic conditions. Psychological Science, 7, 289 –293. Mauss, M. (1972). A general theory of magic (R. Brain, Trans.). New York: W. W. Norton. (Original work published in 1902) Michotte, A. (1963). The perception of causality (T. R. Miles & E. Miles, Trans.). New York: Basic Books. (Original work published in 1946) Nemeroff, C., & Rozin, P. (2000). The makings of the magical mind: The nature and function of sympathetic magical thinking. In K. S. Rosengren, C. N. Johnson, & P. L. 
Harris (Eds.), Imagining the impossible: Magical, scientific, and religious thinking in children (pp. 1–34). New York: Cambridge University Press Nisbett, R. E., & Ross, L. (1980). Human inference: Strategies and shortcomings of social judgment. Englewood Cliffs, NJ: Prentice-Hall. Nisbett, R. E., & Wilson, T. D. (1977). Telling more than we can know: Verbal reports on mental processes. Psychological Review, 84, 231–259. Norem, J. K. (2001). The positive power of negative thinking: Using defensive pessimism to manage anxiety and perform at your peak. New York: Basic Books. Norem, J. K., & Cantor, N. (1986). Defensive pessimism: Harnessing anxiety as motivation. Journal of Personality and Social Psychology, 51, 1208 –1217. Padgett, V. R., & Jorgensen, D. O. (1982). Superstition and economic threat: Germany 1918 –1940. Personality and Social Psychology Bulletin, 8, 736 –741. Pepitone, A., & Saffiotti, L. (1997). The selectivity of nonmaterial beliefs in interpreting life events. European Journal of Social Psychology, 27, 23–35. Piaget, J. P. (1929). The child’s conception of the world. London: Routledge & Kegan Paul. Pronin, E., Gilovich, T., & Ross, L. (2004). Objectivity in the eye of the beholder: Divergent perceptions of bias in self versus others. Psychological Review, 111, 781–799. Ross, L., Bierbrauer, G., & Polly, S. (1974). Attribution of educational outcomes by professional and nonprofessional instructors. Journal of Personality and Social Psychology, 29, 609 – 618. Ross, M., & Sicoly, F. (1979). Egocentric biases in availability and attribution. Journal of Personality and Social Psychology, 37, 322–336. Rozin, P., Millman, L., & Nemeroff, C. (1986). Operation of the laws of sympathetic magic in disgust and other domains. Journal of Personality and Social Psychology, 50, 703–712. Rozin, P., & Nemeroff, C. (2002). Sympathetic magical thinking: The
contagion and similarity “heuristics.” In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and biases: The psychology of intuitive judgment (pp. 201–216). New York: Cambridge University Press. Shweder, R. A. (1977). Likeness and likelihood in everyday thought: Magical thinking in judgments about personality. Current Anthropology, 18, 637–658. Subbotsky, E. (2000). Causal reasoning and behaviour in children and adults in a technologically advanced society: Are we still prepared to believe in magic and animism? In K. J. Riggs & P. Mitchell (Eds.), Children’s reasoning and the mind (pp. 327–347). Hove, England: Psychology Press/Taylor & Francis (UK). Subbotsky, E. (2004). Magical thinking in judgments of causation: Can anomalous phenomena affect ontological causal beliefs in children and adults? British Journal of Developmental Psychology, 22, 123–152. Taylor, K. M., & Shepperd, J. A. (1998). Bracing for the worst: Severity, testing, and feedback timing as moderators of the optimistic bias. Personality and Social Psychology Bulletin, 24, 915–926. Taylor, S. E., Kemeny, M. E., Reed, G. M., Bower, J. E., & Gruenewald, T. L. (2000). Psychological resources, positive illusions, and health. American Psychologist, 55, 99–109. Thalbourne, M. A., & French, C. C. (1995). Paranormal belief, manic-depressiveness, and magical ideation: A replication. Personality and Individual Differences, 18, 291–292. Thompson, S. C., Armstrong, W., & Thomas, C. (1998). Illusions of control, underestimations, and accuracy: A control heuristic explanation. Psychological Bulletin, 123, 143–161.
Wegner, D. M. (2002). The illusion of conscious will. Cambridge, MA: MIT Press. Wegner, D. M. (2003). The mind’s best trick: How we experience conscious will. Trends in Cognitive Sciences, 7, 65–69. Wegner, D. M., & Bargh, J. A. (1998). Control and automaticity in social life. In D. T. Gilbert, S. T. Fiske, & G. Lindzey (Eds.), Handbook of social psychology (4th ed., Vol. 1, pp. 446–496). New York: McGraw-Hill. Wegner, D. M., & Sparrow, B. (2004). Authorship processing. In M. Gazzaniga (Ed.), The new cognitive neurosciences (3rd ed., pp. 1201–1209). Cambridge, MA: MIT Press. Wegner, D. M., Sparrow, B., & Winerman, L. (2004). Vicarious agency: Experiencing control over the movements of others. Journal of Personality and Social Psychology, 86, 838–848. Wegner, D. M., & Wheatley, T. P. (1999). Apparent mental causation: Sources of the experience of will. American Psychologist, 54, 480–492. Woolley, J. D. (1997). Thinking about fantasy: Are children fundamentally different thinkers and believers from adults? Child Development, 68, 991–1011. Zusne, L., & Jones, W. H. (1989). Anomalistic psychology: A study of magical thinking (2nd ed.). Hillsdale, NJ: Erlbaum.
Received April 28, 2005 Revision received November 7, 2005 Accepted November 28, 2005
Journal of Personality and Social Psychology 2006, Vol. 91, No. 2, 232–242
Copyright 2006 by the American Psychological Association 0022-3514/06/$12.00 DOI: 10.1037/0022-3514.91.2.232
Subgoals as Substitutes or Complements: The Role of Goal Accessibility
Ayelet Fishbach, University of Chicago
Ravi Dhar, Yale University
Ying Zhang, University of Chicago
The self-regulation process often involves breaking an ongoing goal (e.g., keeping in shape) into many individual, constituent subgoals that monitor actual actions (e.g., eating healthy meals, going to the gym). The article examines how pursuing each of these subgoals may influence subsequent goal pursuit. The authors show that when people consider success on a single subgoal, additional actions toward achieving a superordinate goal are seen as substitutes and are less likely to be pursued. In contrast, when people consider their commitment to a superordinate goal on the basis of initial success on a subgoal, additional actions toward achieving that goal may seem to be complementary and more likely to be pursued. These predictions were tested in four studies that explored the conditions under which subgoal attainment has a counterproductive versus favorable effect on further pursuit of similar actions.
Keywords: goal, subgoal, self-regulation, self-control, priming
Setting goals and monitoring progress toward goal achievement are fundamental to theories of self-regulation, and examining the properties of goals that enhance self-regulation has been a major focus of past research (e.g., Bargh & Barndollar, 1996; Carver & Scheier, 1998; Higgins, 1989). More recently, research has addressed the question of how individuals evaluate and choose among several different means or subgoals that are linked to the same overriding goal (Kruglanski et al., 2002; Shah & Kruglanski, 2003). For example, a student who has the goal to do well academically can pursue it through different means such as studying in the library and attending tutorials. The question that arises is under what conditions prior pursuit of a means or a subgoal (e.g., attending a tutorial) might increase or decrease the likelihood of pursuing additional subgoals related to the same ongoing goal (e.g., spending time studying in the library).
In general, the successful attainment of a subgoal may increase, decrease, or have no effect on the pursuit of other similar subgoals that are linked to the same goal. The present article addresses the effect of successful subgoal attainment on subsequent self-regulation, depending on whether the individual focuses on the attainment of a specific subgoal or on the commitment to a superordinate goal. We expected an inhibitory relationship between subgoals connected to a superordinate goal when the focus is on subgoal attainment. The intuition behind this prediction is that subgoal attainment elicits a sense of accomplishment, which justifies temporary disengagement from the superordinate goal. We further predicted a reinforcing relationship between subgoals when the focus is on one’s commitment to the corresponding superordinate goal, because success in an initial task increases the subsequent commitment.
As an illustration, consider the example of a person whose goal is to have an attractive figure. This goal might be pursued through several different subgoals (e.g., exercising, eating healthy food). If the individual focuses on subgoal attainment, a successful workout might temporarily lower the likelihood of consuming healthy food as it would be viewed as a substitutable activity for a goal that has progressed. However, if the individual focuses on the superordinate goal, a successful workout enhances the commitment to the goal of having an attractive figure, such that exercising complements rather than substitutes for the consumption of healthy food.
This article proposes a novel analysis of self-regulation by breaking a goal into subgoals. In what follows, we examine the process of self-regulation through subgoals, which leads to our prediction that subgoals can be seen as either complementary or substitutable, depending on the accessibility of a superordinate goal.
Self-Regulation Through Subgoals
A number of previous studies have examined the effect of setting subgoals on effective self-regulation (Carver & Scheier, 1998; Emmons, 1992; Locke & Latham, 1990; Vallacher & Wegner, 1987). For instance, Gollwitzer and his colleagues have shown that individuals who set subgoals, in the form of a mental link between situational cues and anticipated goal-related actions, are more likely to successfully pursue these actions in the face of situational obstacles (Gollwitzer, 1999; Gollwitzer & Brandstaetter, 1997). A limitation of past goal research is that it has been conducted under the assumption that the individual has a single goal that is connected to a single set of attainment means. However, in many real-life situations, people hold multiple ongoing goals, which are in turn connected to multiple, low-level subgoals
Ayelet Fishbach and Ying Zhang, Graduate School of Business, University of Chicago; Ravi Dhar, Yale School of Management, Yale University. Correspondence concerning this article should be addressed to Ayelet Fishbach, University of Chicago, Graduate School of Business, 5807 South Woodlawn Avenue, Chicago, IL 60637. E-mail: ayelet.fishbach@chicagogsb.edu
(Fishbach, Shah, & Kruglanski, 2004; Kruglanski et al., 2002). For example, high-level goals of attaining academic success and having an active social life may each be connected to a specific set of activities serving as its attainment means (e.g., studying in the library and attending classes to achieve academic success). In this configuration, the successful attainment of one subgoal may motivate further pursuit of similar actions, or it may also justify temporary disengagement to pursue competing goals. We propose that on the basis of the initial goal-related action, a person can infer either goal commitment or goal progress (i.e., partial attainment). If one interprets one’s subgoal achievement in terms of general level of goal commitment (e.g., Bem, 1972; Festinger, 1957), it is likely to increase one’s motivation toward similar complementary actions and inhibit competing goals (Shah, Friedman, & Kruglanski, 2002). If, however, one interprets the same subgoal achievement in terms of general level of goal progress (e.g., Carver & Scheier, 1998), it serves as a justification for one to move away from the focal goal to pursue other goals. These two dynamics— goal commitment versus goal progress— were recently illustrated by Fishbach and Dhar (2005), who found, for example, that when initial academic success indicated greater commitment to academic goals, students were subsequently more interested in similar academic tasks and less interested in incongruent social activities. Yet, this same level of academic performance decreased interest in academic tasks and increased interest in social activities if students inferred that progress had been made on the academic goals. Furthermore, a failure to pursue a subgoal can also indicate either lack of sufficient commitment or lack of progress to an already committed goal. 
If a person interprets initial failure in terms of low goal commitment, we expected this person to subsequently disengage from the goal (Soman & Cheema, 2004). If, however, a person interprets initial failure in terms of lack of progress toward a goal to which commitment remains intact, we expected this person to be motivated to work harder toward the goal by choosing compensatory subgoals or by perpetuating the same subgoal (Brunstein & Gollwitzer, 1996; Steele, 1988; Wicklund & Gollwitzer, 1982). Thus, for example, failing to study might decrease the subsequent motivation to pursue similar actions if it signals low commitment to doing well academically, but it might increase the motivation to study if it signals the absence of progress on the goal of academic excellence.
When Subgoals Lead to Disengagement or Reinforcement
The degree to which individuals interpret subgoal attainment in terms of progress or commitment depends on their attention to the relatively concrete subgoal in comparison to the corresponding, relatively abstract superordinate goal. When individuals consider the attainment of the subgoal itself, they may experience some of the benefits associated with goal fulfillment, which motivates moving temporarily away from a goal (Dhar & Simonson, 1999; Monin & Miller, 2001; Tesser, Martin, & Cornell, 1996). The adverse effect of initial success (vs. failure) on pursuing similar actions further reflects individuals’ need to consider various goals, desires, needs, and aspirations (e.g., Emmons & King, 1988; Higgins, Strauman, & Klein, 1986; Markus & Ruvolo, 1989). In the course of pursuing multiple goals (e.g., weight loss and food enjoyment), partial fulfillment of a focal goal suggests to the
individual that other objectives are somewhat neglected, further motivating disengagement from that goal. On the other hand, when the focus is on the superordinate goal, the same level of successful attainment highlights commitment to that overall goal. By drawing attention to the superordinate goal, the success (vs. failure) toward reaching a subgoal thus alters a person’s self-identity and provides evidence for a person’s higher (vs. lower) commitment to the superordinate goal more than it indicates goal progress. For example, when the goal to have an attractive figure is highly accessible, an initial success at losing weight strengthens the commitment to this goal as well as related activities toward that end (e.g., working out). An initial failure to lose weight, however, signals low commitment to the overall goal of having an attractive figure, further impairing the dieter’s commitment to other actions toward this goal.1 Several factors might underlie whether individuals process any subgoal at a more concrete level, focusing on the subgoal itself, or at a more abstract level, focusing on the link to the superordinate goal. First, although certain superordinate goals might chronically be more accessible, the interpretation of any subgoal in terms of a higher order goal might also be elicited directly through priming of the corresponding superordinate goal. Second, the relative focus on the superordinate goal can also be elicited by referring to subgoal pursuit in the distant (vs. proximal) future. Because actions that are scheduled in the distant future are represented in more abstract terms (e.g., Liberman & Trope, 1998; Trope & Liberman, 2003), considering the pursuit of subgoals in the distant future (e.g., several months from now) should have similar effects as focusing on a primed superordinate goal. That is, it should lead to inferences of commitment, which increase the motivation to pursue complementary subgoals after initial success. 
However, considering the pursuit of subgoals in the proximal future (e.g., tomorrow) leads to more concrete processing, which has similar effects as focusing on subgoal attainment. Hence, it encourages temporary disengagement with similar subgoals after initial success.
Abstract Framing Versus Goal Priming
We propose that contextual cues for a superordinate goal increase the focus on overall goal commitment, which may or may not lead to greater choice of goal-related actions, depending on a person’s initial performance. Thus, for example, an achievement prime (e.g., the words achieve or success presented in an unrelated task) increases the focus on one’s general achievement motivation while performing a cognitive test. Those who succeed on the test, in turn, would infer high commitment and be more interested in similar achievement tasks, whereas those who failed initially would infer lower commitment, which further impairs their subsequent motivation to choose similar achievement tasks. Whereas previous goal research finds that contextual cues for a superordinate goal (through priming) increase choice of congruent means (e.g., Aarts & Dijksterhuis, 2000; Chartrand & Bargh, 1996), the present research addresses subsequent self-regulation after initial performance on the same goal. The effect of goal priming in successive choice situations depends on a person’s initial performance, such that failure decreases the likelihood of subsequently choosing similar actions because it indicates low commitment, whereas success increases the likelihood of subsequently choosing similar actions because it indicates high commitment. In general terms, the focus on an accessible superordinate goal promotes behavioral consistency between initial success (vs. failure) and subsequent choice of congruent (vs. incongruent) subgoals. It further implies that initial success facilitates choice of complementary means in the presence of an accessible superordinate goal when it signals commitment, but initial failure to meet a subgoal is more motivating in the absence of a superordinate goal when it signals the absence of progress.
1 This analysis concerns situations in which individuals infer commitment on the basis of goal performance (e.g., self-perception theory; Bem, 1972) and should be distinguished from situations in which individuals succeed or fail in accomplishing goals to which they are already committed (e.g., when a professional dancer performs poorly). In line with self-completion theory (Brunstein & Gollwitzer, 1996; Wicklund & Gollwitzer, 1982) and self-affirmation theory (Steele, 1988), we predicted that if commitment is already established, failure (more than success) should increase the motivation to seek complementary subgoals to a goal because it indicates low progress.
Research Overview
Four studies tested whether, when the focus is on subgoal attainment, initial success has a counterproductive effect on further pursuit of complementary actions, whereas initial failure has a favorable effect on pursuing these actions. In contrast, situations that focus on a superordinate goal were expected to be more likely to elicit consistency in the choice of initial and subsequent actions. Under these conditions, initial success was expected to have a favorable effect on pursuing complementary actions, and initial failure was expected to have a counterproductive effect on pursuing these actions. These studies manipulated the pursuit of an initial subgoal and the focus on a superordinate goal and then assessed participants’ interest in complementary subgoals. Specifically, Study 1 tested the general assertion that subgoals inhibit each other when the focus is on subgoal completion but reinforce each other when the focus is on the superordinate goal. Studies 2 and 3 directly manipulated success and failure on the initial subgoal task. Thus, Study 2 used a field setting, which involved actual self-regulation toward the goal of keeping in shape. Study 3 tested for the time and effort that participants invested in attempting to work on an academic test after completing another test of the same ability. Finally, Study 4 assessed whether subgoals that are scheduled in the distant future signal commitment to a superordinate goal, whereas subgoals that are scheduled in the proximal future signal their own attainment, and whether these inferences (of commitment vs. progress) mediate subsequent choice of additional actions to the same superordinate goal.
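The crossover pattern predicted in this overview can be written out as a small lookup. The following Python sketch is only an illustration of the stated predictions, not part of the authors' materials; the function name, the level labels, and the "higher"/"lower" coding (relative to the opposite initial outcome under the same focus) are all assumptions introduced here.

```python
from itertools import product

# Hypothetical labels for the two factors described in the overview.
FOCUS_LEVELS = ("subgoal", "superordinate")
OUTCOME_LEVELS = ("success", "failure")


def predicted_interest(focus: str, outcome: str) -> str:
    """Predicted interest in complementary subgoals.

    The result is relative to the opposite initial outcome under the
    same focus ("higher" or "lower"); it is a direction, not a magnitude.
    """
    if focus not in FOCUS_LEVELS or outcome not in OUTCOME_LEVELS:
        raise ValueError(f"unknown condition: {focus!r}, {outcome!r}")
    if focus == "subgoal":
        # Progress inference: success justifies disengagement,
        # failure signals that more effort is needed.
        return "lower" if outcome == "success" else "higher"
    # Commitment inference: success signals high commitment,
    # failure signals low commitment.
    return "higher" if outcome == "success" else "lower"


def prediction_table() -> dict:
    """All four cells of the focus x initial-outcome crossover."""
    return {
        (focus, outcome): predicted_interest(focus, outcome)
        for focus, outcome in product(FOCUS_LEVELS, OUTCOME_LEVELS)
    }


if __name__ == "__main__":
    for (focus, outcome), direction in prediction_table().items():
        print(f"focus={focus:13s} initial={outcome:7s} -> {direction} interest")
```

Coding the predictions as directions rather than numbers keeps the sketch faithful to the text, which states only which initial outcome should produce more subsequent interest under each focus.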
Study 1: Substitute and Complement Subgoals
Participants’ initial choice was expected to influence their subsequent choice of similar actions, such that initial subgoal attainment would substitute for similar actions when the focus was on one’s initial action but would reinforce the choice of similar actions when the focus was on a superordinate goal. We tested this prediction across different goal domains (e.g., preventing sun damage) and using several subgoals (e.g., wearing a sun hat and applying sunscreen).
Method
Participants
Ninety-nine University of Chicago undergraduates (57 women and 42 men) participated in the experiment in exchange for $4. The gender of participants did not yield any effects here or in subsequent studies and is therefore omitted from further consideration.
Procedure
The study used a 2 (superordinate goal prime: present vs. absent) × 2 (subgoal: present vs. absent) × 3 (vignette: preventing sun damage vs. keeping in shape vs. studying) design. The first two factors varied between subjects, and the third factor varied within subjects. On the basis of our pilot data, in which University of Chicago undergraduates stated that they pursued goals pertaining to doing well academically, preventing sun damage, and keeping in shape (see also Fishbach & Dhar, 2005), we presented scenarios corresponding to these three goals. Participants completed a series of supposedly unrelated experimental tasks that manipulated the accessibility of the superordinate goal by means of a scrambled-sentence task (cf. Bargh & Chartrand, 2000) and the completion of an initial subgoal to this goal, in a seemingly unrelated manner. We then assessed interest in pursuing subsequent subgoals. The study consisted of three vignettes, each corresponding to one of the three goals listed above. In each vignette, participants were first handed a scrambled-sentence task that was presented as being part of a lexical study on the evaluation of discrete words within a sentence. Participants were told that because of the length of the scrambled-sentence task, it was divided into three separate sessions to be administered at the beginning of each part of the experiment. The participant’s task was to unscramble five word sets into coherent and meaningful sentences. Each scrambled-sentence task was followed by a short scenario describing the pursuit (or nonpursuit) of an initial subgoal before participants indicated their interest in pursuing another subgoal to an overall goal. The specific scenarios for preventing sun damage, studying, and keeping in shape goals are described below. The order of these vignettes was randomized.
Preventing sun damage vignette.
In this vignette, participants who were assigned to the superordinate goal prime condition unscrambled five sentences containing health-related words (i.e., skin, cancer, gym, medical, and ultraviolet) and which included, for example, “she is going to medical school this fall” and “Cancer is one of 12 astrological signs.” Participants in the no-prime condition unscrambled five neutral sentences including, for example, “she is going to business school this fall” and “Leo is one of 12 astrological signs.” After completing this task, participants were handed a survey titled “Product Usage.” Half of the participants, who were assigned to the subgoal attained condition, read the following scenario: “On a bright sunny summer afternoon, you have to walk in the sun for 45 min. You are wearing a sun hat and have a bottle of sunscreen with you.” The rest of the participants, in the subgoal absent condition, read a similar scenario that did not specify that they were already wearing a hat: “On a bright sunny summer afternoon, you have to walk in the sun for 45 min. You have a bottle of sunscreen with you.” The dependent variable referred to participants’ interest in applying the sunscreen, which constituted a second, complementary subgoal toward the focal goal of avoiding sun damage. They were asked to indicate on a 7-point scale how likely they were to use the sunscreen (1 ⫽ not at all likely, 7 ⫽ very likely). To conceal the purpose of the study, we embedded this item among other irrelevant items (e.g., “Do you enjoy walking in the sun?”).
Academic performance vignette. Half of the participants in this vignette were asked to unscramble five sentences that contained academic concepts (i.e., honor, diligent, achieve, excellent, and hard) and were used to prime an academic achievement goal. These sentences included, for example, “most stores honor credit cards” and “we fight to achieve liberty.” The rest of the participants unscrambled similar, neutral sentences, including, for example, “most stores accept credit cards” and “we fight to obtain liberty.” Following this task, participants were handed a survey titled “Study Plans.” The participants assigned to the subgoal attained condition were asked to imagine that “the final exam is a week away, and today you studied very hard during the day,” whereas the rest of the participants, in the subgoal absent condition, were told to imagine that “the final exam is a week away, and today you studied as usual during the day.” The dependent variable referred to participants’ intention to study at night. They were asked to indicate on a 7-point scale how likely they were to study during the night (1 = not at all likely, 7 = very likely). This item was embedded among other irrelevant items.

Keeping in shape vignette. Half of the participants in this vignette were asked to unscramble five sentences, which contained concepts related to keeping in shape (i.e., slim, weighted, fat, fit, figure, and workout). These sentences included, for example, “I cannot figure out how it works” and “he has a fat investment account.” The remaining participants unscrambled neutral sentences, including, for example, “I cannot tell how it works” and “he has an empty investment account.” Following this task, participants were handed a survey titled “Meal Combination.” Participants in the subgoal attained condition were asked to imagine that “on a random day you had a light lunch,” whereas the rest of the participants, in the subgoal absent condition, were asked to imagine that “on a random day you had a regular lunch.” The dependent variable referred to participants’ plans to have a light and healthy dinner. They rated the likelihood of getting a light dinner on a 7-point scale (1 = not at all likely, 7 = very likely). This item was embedded among other irrelevant items. On completion of the survey, participants were debriefed and dismissed.
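The mixed design described in the Procedure can be illustrated with a minimal sketch. It assumes simple independent randomization of the two between-subjects factors and a shuffled vignette order for each participant; the article does not report the actual assignment scheme, and the helper name `assign` is hypothetical.

```python
import random

# The three within-subject vignettes named in the Procedure.
VIGNETTES = ["preventing sun damage", "academic performance", "keeping in shape"]

def assign(participant_id, rng):
    """Hypothetical helper: draw one participant's cell and vignette order.

    Assumption: simple randomization; the article does not describe
    how participants were allocated to conditions.
    """
    order = VIGNETTES[:]          # copy so the shared list is not mutated
    rng.shuffle(order)            # the within-subject factor is presented in random order
    return {
        "id": participant_id,
        "goal_prime": rng.choice(["present", "absent"]),  # between subjects
        "subgoal": rng.choice(["present", "absent"]),     # between subjects
        "vignette_order": order,
    }

rng = random.Random(0)            # fixed seed so the sketch is reproducible
plan = [assign(i, rng) for i in range(1, 100)]  # 99 participants, as in Study 1
```

Each participant contributes one rating per vignette, so the analysis has three repeated measures nested within a 2 × 2 between-subjects layout.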
Results and Discussion

The ratings of interest in pursuing the goal-congruent actions were analyzed as a function of Goal Prime × Subgoal × Vignette. An analysis of variance (ANOVA) of these ratings yielded a main effect for goal prime, F(1, 95) = 4.27, p < .05, indicating greater interest in pursuing the second subgoal when the superordinate goal was primed (M = 4.14, SE = 0.19) than not primed (M = 3.63, SE = 0.19), and a main effect for vignette, F(1, 95) = 6.16, p < .05, indicating that participants were more interested in studying (M = 4.25, SE = 0.21) than applying sunscreen (M = 3.92, SE = 0.20) or having a light dinner (M = 3.52, SE = 0.17). These main effects were qualified by the predicted two-way interaction between goal prime (present vs. absent) and the initial subgoal (present vs. absent), F(1, 95) = 12.60, p < .001. No other main effect or interaction (including the three-way interaction) emerged in this analysis (Fs < 1).

To explore the two-way interaction, we used planned contrasts to compare the effect of completing an initial subgoal on further self-regulation as a function of goal priming. The results are displayed in Figure 1. As shown, in the absence of the superordinate goal prime, completing an initial subgoal resulted in lower interest in pursuing another substitutable subgoal (M = 3.25, SE = 0.21, in the presence of an initial subgoal; M = 4.12, SE = 0.21, in the absence of an initial subgoal), t(48) = 2.98, p < .01. However, when the superordinate goal was primed, completing an initial subgoal increased interest in pursuing another subgoal toward the same goal (M = 4.42, SE = 0.21, in the presence of an initial subgoal; M = 3.81, SE = 0.21, in the absence of an initial subgoal), t(47) = 2.05, p < .05.

Figure 1. Interest in a subgoal as a function of superordinate goal prime and initial goal pursuit.

This pattern of results provides initial support for our hypothesis that in itself, subgoal attainment leads to disengagement from similar actions unless the superordinate goal is the focus, in which case subgoal attainment increases interest in pursuing similar means. Importantly, we find this pattern with compensatory subgoals (i.e., sunscreen and sun hat, which both serve a health goal) as well as with persistence on similar activities (i.e., studying during the day and at night or having a light lunch and a light dinner), which yielded similar patterns of substitution or reinforcement as a function of overall goal prime. We infer that the same action implies that other actions are needed when one focuses on one’s commitment to an overall goal but that more actions are redundant when the focus is on the attainment of a specific subgoal.

This initial demonstration is limited to hypothetical choices and to scenarios that manipulate the presence versus absence of subgoals. Because people’s actual behavior may deviate from their responses to these questions, a second study was conducted to test for the effect of subgoal attainment on subsequent choice of action in a real-life setting involving actual self-regulation. A second objective of the next study was to examine how failures in subgoal pursuit may influence subsequent self-regulation. We predicted that in the presence of contextual cues for a superordinate goal, the focus is on commitment. Therefore, failure hurts subsequent performance by indicating lower commitment. However, when the superordinate goal is not primed, the focus is on goal progress, and therefore failure increases interest in other similar actions by indicating lack of progress.
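The crossover in the Study 1 cell means can be made concrete with a small arithmetic sketch. It uses only the four means reported above; the variable and function names are illustrative, and the difference of differences computed here is a descriptive summary, not the authors' test statistic.

```python
# Cell means reported for Study 1 (7-point interest ratings).
means = {
    ("no prime", "subgoal present"): 3.25,
    ("no prime", "subgoal absent"): 4.12,
    ("prime", "subgoal present"): 4.42,
    ("prime", "subgoal absent"): 3.81,
}

def subgoal_effect(prime):
    """Simple effect of completing an initial subgoal within one priming condition."""
    return means[(prime, "subgoal present")] - means[(prime, "subgoal absent")]

effect_no_prime = subgoal_effect("no prime")  # negative: subgoals act as substitutes
effect_prime = subgoal_effect("prime")        # positive: subgoals act as complements

# Difference of differences: how far the simple effect shifts under priming.
interaction_contrast = effect_prime - effect_no_prime

print(round(effect_no_prime, 2), round(effect_prime, 2), round(interaction_contrast, 2))
```

The opposite signs of the two simple effects are what the Goal Prime × Subgoal interaction captures.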
Study 2: Working Out and Eating Healthy: Substitutes or Complements?

This study tested for the effect of exercising at the gym on subsequent interest in pursuing a healthy diet as well as repeating the exercising behavior (i.e., compensation on a different subgoal and perpetuation on the same subgoal). We first conducted a pretest to confirm that working out and consuming low-fat food are often stated as two possible subgoals for pursuing the goal of keeping in shape; therefore, a successful workout may influence one’s subsequent interest in consuming low-fat food. The specific effect of working out (either increase or decrease in healthy eating and exercising) was expected to vary as a function of contextual cues for the superordinate goal of keeping in shape.

A sense of subgoal accomplishment was manipulated through social comparison. We assumed that when subgoals are subjectively defined, individuals often obtain valuable feedback regarding their attainment through comparison to others (e.g., Mussweiler, 2003). Specifically, a comparison to a low social standard suggests that one has successfully pursued a subgoal, whereas a comparison to a high social standard suggests that one has not yet accomplished the subgoal. A standard of social comparison that attests to the merit of one’s own workout was therefore expected to provide a greater sense of subgoal attainment, which should influence interest in healthy eating as well as perpetuation of exercising, as a function of contextual cues for the superordinate goal.

Participants in this study were specifically asked to list the amount of time that they had spent working out at the gym over the past week, and they listed this information on a survey form that had been previously filled out, presumably by another participant, and partially erased. Following a procedure developed by Simonson, Nowlis, and Simonson (1993), in this “partially filled-out” survey a fictitious participant listed either a small or a large amount of exercising time, which induced downward and upward social comparison, respectively. When the superordinate goal was contextually cued, making downward social comparison (i.e., success on a subgoal) was expected to increase interest in similar subgoals more than upward social comparison (i.e., failure on a subgoal), because downward social comparison indicates high commitment.
Conversely, when the superordinate goal was not cued, making upward social comparison was expected to increase the motivation to pursue similar subgoals more than downward social comparison, because upward social comparison indicates lack of progress.
Method

Participants

Eighty-four University of Chicago undergraduates (45 women and 39 men) volunteered to participate in the experiment. They were all approached at the exit of the university’s gym facilities after completing their workout.

Procedure

The study used a 2 (fitness prime: present vs. absent) × 2 (social comparison: high vs. low) between-subjects design. An experimenter, who was unaware of the purpose of the study or the specific hypotheses, approached each participant at the exit of the university’s gym facility and handed him or her an experimental survey. Depending on experimental condition, the survey was either clipped to a hardcover book titled “Fitness and Health,” featuring two (male and female) joggers on its front page, or to a hardcover phonebook. These different books, which participants used as clipboards, served either to unobtrusively prime the superordinate goal of keeping in shape or as a control. To ensure that participants saw the book covers, the experimental surveys were placed between the cover and the first page.

At the beginning of the experimental survey, participants were asked to specify the amount of time that they spent exercising over the last week. They completed their answers on a survey form that was partially filled by a fictitious participant. They were told that because that person only completed the first item, we could save paper by using this survey again. On the basis of pilot data indicating that gym members work out about 5 hr per week on average, the fictitious respondent listed either 1 hr (low standard) or 10 hr (high standard), depending on experimental condition. These responses were crossed out but were clearly visible. After providing their estimated times, participants were further asked to assess the amount of time that they were planning to spend exercising in that particular week, and then they were thanked for their participation.

Next, in a supposedly unrelated survey regarding students’ food consumption habits, participants were asked to rate the extent to which they were interested in consuming each of the following food items during that day: (a) fresh fruits, (b) green vegetables, (c) a bottle of mineral water, and (d) pizza (the last item was reverse coded). They provided their ratings on 7-point scales (1 = not at all, 7 = very much). After completing their ratings, participants were fully debriefed and dismissed. None of them expressed any suspicion regarding the priming manipulations or the purpose of the study.

Results and Discussion

In support of the manipulation, participants in this study reported having spent 5.46 hr a week on average at the gym (SD = 3.08). The low versus high comparison standards (1 vs. 10 hr) were therefore calibrated for the tested population.

To test our hypothesis, we collapsed participants’ ratings of interest in consuming healthy foods across the different items and analyzed them as a function of Fitness Prime × Social Comparison. An ANOVA of this index yielded the predicted Fitness Prime × Social Comparison interaction, F(1, 80) = 9.05, p < .01. As shown in Figure 2, in the absence of fitness prime, comparison to a low standard (and the resulting sense of subgoal attainment) elicited lower interest in healthy eating (M = 4.73, SE = 0.26) than comparison to a high standard (M = 5.60, SE = 0.25), t(37) = 2.63, p = .01. But in the presence of the fitness prime, comparison to a low standard elicited greater interest in healthy eating (M = 5.36, SE = 0.24) than comparison to a high standard (M = 4.77, SE = 0.24), t(43) = 1.68, p < .05 (one-tailed). No main effects were obtained in this analysis.

Figure 2. Interest in healthy eating as a function of fitness priming and social comparison standard.

To further test whether activation of a superordinate goal motivates behavioral consistency, including greater interest in additional subgoals following success and lower interest after failure, we next compared the effect of goal primes in each social comparison condition. In support of the hypothesis, we found that goal priming (vs. no priming) increased interest in a subsequent subgoal in the low-standard (subgoal attainment) condition, t(43) = 2.22, p < .05. However, goal priming (vs. no priming) decreased interest in another subgoal in the high-standard (subgoal nonattainment) condition, t(37) = 2.05, p < .05. Focusing on the superordinate goal apparently increases behavioral consistency: Under this condition, initial success increases interest in other actions that favor this goal, but initial failure decreases interest in similar actions that favor this goal.

Our theory predicted a similar pattern for compensation on different subgoals and perpetuation on the same subgoal. In line with this prediction, we observed similar effects on participants’ subsequent interest in repeating the focal subgoal (i.e., exercising later that week). On the basis of participants’ reported workout times, a difference score was calculated, representing the amount of change in workout time between the previous and upcoming week for that particular person. High scores on this index indicated greater interest in exercising. An ANOVA of this index yielded a Fitness Prime × Social Comparison interaction, F(1, 80) = 12.78, p < .01. In the absence of fitness prime, comparison to a low standard (subgoal attainment) elicited less interest in exercising (M = −0.38 hr, SE = 0.30) than comparison to a high standard (subgoal nonattainment; M = 1.20 hr, SE = 0.30), t(37) = 4.23, p < .001. However, in the presence of fitness prime, comparison to a low standard elicited directionally more interest in exercising (M = 1.00 hr, SE = 0.28) than comparison to a high standard (M = 0.45 hr, SE = 0.28), t(43) = 1.16, p = .25. No other effects emerged in this analysis.
This suggests that when a single activity is broken down into subgoals that are repeated over time (e.g., exercising on different days), the pursuit of one action may affect one’s interest in repeating this activity in the immediate future (see also Camerer, Babcock, Loewenstein, & Thaler, 1997).

Study 2 indicates that a comparison to a low (vs. high) social standard and the resulting positive (vs. negative) feedback regarding a person’s own performance on a subgoal have opposite effects on the choice of other subgoals as a function of the accessibility of a superordinate goal. In support of our hypotheses, a comparison to others provided valuable feedback regarding one’s own level of goal attainment as well as one’s overall commitment to the corresponding superordinate goal. The relative focus on either the attainment of a subgoal or commitment to a superordinate goal, in turn, determined whether participants chose to disengage subsequently from similar goal-related actions. This study further extends the results of Study 1 to a field setting involving real goal-related activities and actual experience of success at self-regulation.

However, some questions remain. First, the studies thus far did not directly manipulate success versus failure feedback on self-regulation through subgoals. Thus, it is not yet clear whether providing such feedback would have similar effects on the choice of subgoals. Second, it is also unclear whether participants’ subsequent intentions are reflected in their actual behavior, including persistence on subgoals. To further address the dynamics of self-regulation through subgoals, our next study tested for actual persistence on a subgoal after an initial (successful or failed) pursuit of similar actions and as a function of contextual cues for a superordinate goal.
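The two Study 2 indices, the healthy-eating composite with its reverse-coded pizza item and the workout difference score, can be sketched as follows. The sketch assumes the item ratings were averaged (the article says only that ratings were collapsed across items) and that the difference score is planned hours minus last week's hours, so that higher scores mean greater interest in exercising. The function names and the example ratings are hypothetical.

```python
def healthy_eating_index(fruit, vegetables, water, pizza, scale_min=1, scale_max=7):
    """Average four 7-point ratings after reverse coding the pizza item.

    Assumption: 'collapsed across items' is taken to mean a simple mean.
    """
    pizza_reversed = scale_max + scale_min - pizza  # 1 <-> 7, 2 <-> 6, ...
    return (fruit + vegetables + water + pizza_reversed) / 4

def workout_change(hours_last_week, hours_planned):
    """Difference score: positive values mean planning to exercise more."""
    return hours_planned - hours_last_week

# Hypothetical respondent, for illustration only.
example_index = healthy_eating_index(fruit=6, vegetables=5, water=7, pizza=2)
example_change = workout_change(hours_last_week=5.0, hours_planned=6.5)
```

Reverse coding keeps all four items pointing in the same direction, so a higher composite always means more interest in healthy eating.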
Study 3: Initial Achievement and Subsequent Persistence

People persist more on academic tasks when they experience greater commitment to the superordinate academic goal but persist less if they experience greater academic accomplishment. Therefore, the effect of initial academic success on subsequent pursuit of similar academic tasks should vary as a function of the inference of commitment versus progress that is made on the basis of initial goal pursuit, which depends on contextual cues for an overall academic goal during the initial performance. To test this hypothesis, we gave participants in Study 3 an opportunity to work on two independent verbal ability tests that represented subgoals to an academic achievement goal. The first test had correct solutions, whereas the second test was unsolvable. We predicted that success feedback on the first test would decrease participants’ motivation to persist on another unsolvable test unless the overall achievement goal was primed. When the achievement goal was primed, success feedback was expected to increase the motivation to persist on the second test.
Method

Participants

Sixty-five University of Chicago undergraduates (34 women and 31 men) participated in the experiment in exchange for $7.
Procedure

The study used a 2 (achievement prime: present vs. absent) × 2 (success on an initial test: high vs. low) between-subjects design. It was completed on desktop computers. The first part of the experiment included the first academic test, which either included contextual cues for the overall academic achievement goal or not. Participants read that in this study they were going to take “verbal reasoning” tests that were described as reliable tests of college students’ verbal ability, which pertains to academic success. The first test was said to include a set of scrambled sentences, which participants were to unscramble into coherent sentences. Participants then read that they would be presented with several sets of five words, and their task was to pick exactly four words that form a sentence out of each set. They were further informed that their performance was based on their total number of correct solutions as well as their reaction times for each problem. The first test had 16 problems.

Following a procedure developed by Bargh and his colleagues (cf. Bargh & Chartrand, 2000), participants assigned to the achievement prime condition were asked to unscramble sentences that included words related to academic achievement (e.g., “firm the door succeed must,” “orange the he master was,” and “accomplished pianists very lot are”; the achievement-related primes in these examples are succeed, master, and accomplished). Participants in the control prime condition completed a set of similar sentences that did not include achievement-related concepts (e.g., “firm the door open must,” “orange the he color was,” and “musical pianists very decide are”). On completion of the first test, participants received computerized feedback on their test performance.
Depending on experimental condition, the computer program announced that on the basis of an analysis of their response times and number of correct solutions they had performed very well compared with others or that on the basis of these data they had performed below average on this test. Participants were then handed a short filler task before moving to the second test.

Our main dependent variable referred to the time spent on the second test. This test comprised a set of eight scrambled sentences that had no correct solutions. Participants were asked to pick exactly seven words from each set of eight words to form a complete sentence (e.g., “ball the hoop tosses normally iron often bounce”). Because these sentences had no correct solution, performance was indicated by the time participants persisted on this frustrating task (e.g., Muraven, Tice, & Baumeister, 1998). On completion of this test, participants were thoroughly debriefed and probed for possible suspicion. None of them reported having been aware of the achievement priming manipulation.
Results and Discussion

An ANOVA of the time participants spent on the second test yielded the predicted Superordinate Goal Prime × Success interaction, F(1, 61) = 8.60, p < .01. As shown in Figure 3, in the absence of achievement priming, participants were less likely to persist on the second test after receiving high (M = 6.34 min, SE = 1.06) compared with low (M = 9.96 min, SE = 1.06) success feedback on the first test, t(32) = 2.06, p < .05. By contrast, with achievement priming participants persisted more on the second test after receiving high (M = 9.97 min, SE = 1.12) compared with low (M = 7.24 min, SE = 1.10) success feedback on the first test, t(29) = 2.28, p < .05. No main effect emerged in this analysis. These results support our prediction that, in itself, initial success decreases the motivation to persist on a similar task, unless the superordinate goal is highly accessible.

The studies thus far manipulated the experience of an initial action (in terms of its own attainment or commitment to a superordinate goal) by changing the relative focus on the action itself or its relationship with the corresponding goal. We assumed that contextual cues for a superordinate goal rendered the self-regulatory outcome as evidence for a person’s general level of goal commitment. However, we have not yet tested directly for the inference that is made on the basis of subgoal attainment and which may refer to either greater goal progress when the focus is on subgoal attainment or greater goal commitment when the focus is on the superordinate goal. In addition, the first studies focused on one variable that determines the relative focus on attainment versus commitment (i.e., contextual cues for an overall goal).
Another such variable refers to the temporal distance from subgoal pursuit: When people consider the pursuit of a subgoal in the proximal future, they focus on the concrete action, whereas when they consider the pursuit of this subgoal in the distant future, they focus on its higher order essence (e.g., Trope & Liberman, 2003). The pursuit of a subgoal in the proximal future should therefore highlight its concrete features (i.e., the how), which leads to inferences of subgoal attainment, whereas the pursuit of this same action in the distant future should highlight its abstract characteristics (i.e., the why), which corresponds to overall goal priming and leads to inferences of goal commitment. A final study was designed to test these possibilities. This study tested whether individuals frame proximal actions in terms of subgoal attainment (i.e., goal progress) but frame distant actions in terms of commitment to a superordinate goal. These framings in terms of progress or commitment were expected to influence the subsequent interest in complementary subgoals.

Figure 3. Persistence on the unsolvable test as a function of success on an initial test and achievement priming.
Study 4: Subgoals in the Proximal and Distant Future

This study tested whether the greater focus on commitment to a superordinate goal when utilizing a distant (vs. proximal) framing of initial subgoal pursuit encourages the pursuit of additional subgoals that contribute to the same overall goal. Unlike previous studies, the focus on a superordinate goal versus specific subgoals was manipulated through temporal distance. Those in the proximal condition were asked to consider pursuing subgoals in the proximal future, whereas those in the distant condition considered pursuing these same actions in the distant future (Liberman & Trope, 1998; Trope & Liberman, 2003). We predicted that subgoal completion in the distant future would signal greater commitment to the superordinate goal, whereas subgoal completion in the near future would signal greater progress on subgoals themselves. Participants’ inferences (of commitment vs. progress), in turn, were expected to mediate their interest in additional subgoals to the overall goal.
Method

Participants

One hundred thirty-nine University of Chicago undergraduates (75 women and 64 men) participated in the experiment in return for $2.
Procedure

This study used a 2 (temporal distance: proximal vs. distant) × 2 (goal vignette: workout vs. study) between-subjects design. Following research on temporal distance, each participant read one of the two scenarios that described subgoal pursuit in either the proximal or the distant future.

Study vignette. The study vignette instructed participants to imagine studying in the library for two unrelated exams for two courses. In the proximal future condition, participants were asked to imagine that they have two exams coming up tomorrow and were told that “you are now studying in the library for the first exam and have studied for four hours.” In the distant future condition, participants read that these two exams are scheduled to take place a month from tomorrow and that they are now one day before the exams. The rest of the instructions were identical. The framing of the first activity (i.e., studying for the first exam) was then determined by the extent to which participants agreed with eight framing statements, four of which described subgoal attainment, that is, they focused on progress from subgoal pursuit (e.g., “Studying that much means I am getting closer to my academic objectives” and “Studying that much would really improve my academic performance”). The other four statements described goal commitment, that is, they focused on the experience of commitment from subgoal pursuit (e.g., “Studying that much, I am committed to doing well academically” and “Studying that much, I must really care about my academic performance”). All ratings were given on 7-point scales (1 = strongly disagree, 7 = strongly agree), and the order of statements was mixed to avoid possible ordering effects on subgoal framing. After rating the extent to which they agreed with the framing statements, participants indicated the number of hours they would spend studying for the second exam. Three filler questions preceded this question, and they were added to conceal the purpose of the study (e.g., “Do you prefer to take exams in the morning or in the afternoon?”).

Workout vignette. The workout vignette was similar to the study vignette, except it instructed participants to imagine working out in the gym for 3 hr during next week (proximal condition) versus during a week that is 3 months from today (distant condition). The framing of the subgoal was then determined by the extent to which participants agreed with eight statements, which were similar to those in the studying vignette. Four statements described subgoal attainment or progress (e.g., “Working out that much means I am making progress to my health objectives” and “Working out that much would really improve my health”), and four statements described goal commitment (e.g., “Working out that much, I am committed to my health objectives” and “Working out that much, I must really care about my health”). All ratings were given on 7-point scales (1 = strongly disagree, 7 = strongly agree), and the order of statements was mixed. After rating the extent to which they agreed with those statements, participants listed the number of additional hours they intended to spend in the gym working out during the specified week, among three other filler questions (e.g., “Do you prefer to drink water during or after the workout?”). On completion of the experimental survey, participants were debriefed and dismissed.
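Four-item rating sets like the progress and commitment statements above are typically averaged into a composite and checked for internal consistency with Cronbach's alpha. The following is a minimal sketch of that computation using the standard formula; the ratings are invented for illustration and are not data from the study, and `cronbach_alpha` is a hypothetical helper name.

```python
from statistics import mean, variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of ratings
    given by the same respondents in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # each respondent's sum score
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Invented ratings: 4 statements (rows) x 5 respondents (columns), 7-point scale.
progress_items = [
    [5, 4, 6, 3, 5],
    [4, 4, 6, 2, 5],
    [5, 3, 7, 3, 4],
    [6, 4, 6, 3, 5],
]

alpha = cronbach_alpha(progress_items)
composites = [mean(scores) for scores in zip(*progress_items)]  # one index per respondent
```

A composite is interpretable as a single index only when its items covary, which is what alpha summarizes.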
Results and Discussion

The statements of subgoal framing were averaged across vignettes into two composite indices of goal progress (i.e., the extent to which participants focused on the attainment from a subgoal, α = .79) and goal commitment (i.e., the extent to which participants focused on commitment to the superordinate goal, α = .78). These separate indices were moderately and positively correlated (r = .42, p < .01). An ANOVA of Index (goal progress vs. goal commitment) × Time (proximal vs. distant) × Vignette (workout vs. study) yielded the predicted two-way Index × Time interaction, F(1, 135) = 33.77, p < .001, indicating that participants focused on the progress from subgoals in the proximal future (M = 4.72, SE = 0.16) more than in the distant future (M = 4.12, SE = 0.16), F(1, 135) = 7.16, p < .01. In addition, participants focused on the commitment to superordinate goals in the distant future (M = 4.94, SE = 0.15) more than in the proximal future (M = 4.28, SE = 0.15), F(1, 135) = 9.60, p < .01. No other effect emerged in this analysis and, in particular, there was no effect for vignette (Fs < 1). These results demonstrate that subgoals that are scheduled in the near future signal their own attainment, whereas subgoals that are scheduled in the distant future signal commitment to a superordinate goal.

Next, we tested for the effect of time frame on interest in additional subgoals to the superordinate goal. The amount of time participants intended to invest in additional subgoals (i.e., study for an unrelated exam and work out again during the assigned week) was analyzed as a function of time frame and vignette. An ANOVA yielded the predicted main effect for time frame, F(1, 135) = 12.17, p < .01, indicating a greater intention to pursue additional subgoals in the distant (M = 4.01 hr, SE = 0.23) than in the proximal (M = 2.94 hr, SE = 0.23) future. There was also a main effect for vignette, F(1, 135) = 5.05, p < .05, indicating greater intention to invest time in working out (M = 3.83 hr, SE = 0.24) than studying (M = 3.17 hr, SE = 0.22); however, as expected, framing (commitment vs. progress) and vignette did not interact (Fs < 1).

Next, to test whether the framings of subgoals mediated the effect of temporal distance on participants’ interest in pursuing additional subgoals, we created a composite measure of subgoal framing, which reflects the simple contrast of framing (i.e., the difference between commitment and progress ratings). A higher score on this variable represents a general tendency to focus on commitment to a superordinate goal rather than on progress on specific subgoals. As shown in Figure 4, temporal distance directly increased participants’ interest in pursuing additional subgoals, β = .27, t(137) = 3.25, p < .01. However, indirectly, temporal distance increased participants’ tendency to focus on the commitment to subgoals rather than on goal progress, β = .45, t(137) = 5.89, p < .01, and this focus in turn enhanced the interest in pursuit of additional subgoals, β = .31, t(137) = 3.85, p < .01. Most crucial to the current analysis, controlling for the focus on commitment versus progress, the path from temporal distance to interest in pursuing additional subgoals became nonsignificant (β = .16, ns). A Sobel test indicated that this focus significantly mediated the effect of temporal distance on interest in pursuing additional subgoals (z = 3.22, p < .01). This analysis confirms the hypothesis that people’s interest in pursuing additional subgoals in the distant future stems from their general tendency to focus on the commitment to subgoals rather than on goal progress.

Taken together, the results of this study suggest that people construe their subgoal attainment as commitment to a superordinate goal when the subgoal is temporally distant but as progress toward the subgoal when the subgoal is temporally proximal.
Moreover, the framing of successful subgoal pursuit as commitment accounts for the tendency to choose additional subgoals in the distant versus proximal future. In other words, temporally distant plans are more motivating than proximal plans after initial success. Further analysis revealed that both vignettes, if analyzed separately, produced the same mediational relations. This suggests that an abstract framing of goal pursuit in the distant future facilitates both the pursuit of several different subgoals (i.e., studying for unrelated exams) and the perpetuation of the same subgoal (i.e., working out on different days).
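As an illustration of the mediation test reported above, the following minimal sketch recomputes a Sobel statistic from the reported t values alone. It relies on the algebraic identity that the Sobel z, ab / sqrt(b²·SE_a² + a²·SE_b²), equals t_a·t_b / sqrt(t_a² + t_b²) when t_a = a/SE_a and t_b = b/SE_b. The function names are hypothetical, and we assume that t(137) = 5.89 corresponds to the path from temporal distance to framing and t(137) = 3.85 to the path from framing to interest in additional subgoals; the text does not state which regression model produced the latter value, so this is a plausibility check rather than a reproduction of the original analysis.

```python
import math


def sobel_z_from_t(t_a: float, t_b: float) -> float:
    """Sobel z computed from the t statistics of path a (X -> M) and path b (M -> Y).

    Equivalent to a*b / sqrt(b^2 * se_a^2 + a^2 * se_b^2) after dividing
    numerator and denominator by se_a * se_b.
    """
    return (t_a * t_b) / math.sqrt(t_a ** 2 + t_b ** 2)


def two_sided_p(z: float) -> float:
    """Two-sided p value for z under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Assumed mapping of the reported statistics (see the caveat in the lead-in):
# t_a = 5.89 (temporal distance -> framing), t_b = 3.85 (framing -> interest).
z = sobel_z_from_t(5.89, 3.85)
p = two_sided_p(z)
print(f"Sobel z = {z:.2f}, two-sided p = {p:.4f}")
```

Under these assumptions the sketch yields z of about 3.22, which agrees with the reported Sobel statistic and falls below the .01 threshold.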
Figure 4. Path model of the influence of temporal distance on interest in additional subgoals. The number in parentheses is the zero-order standardized beta.
FISHBACH, DHAR, AND ZHANG
General Discussion
Past research has identified the importance of breaking a goal into subgoals as an adaptive means of self-regulation (cf. Carver & Scheier, 1990; Gollwitzer, 1999; Shah & Kruglanski, 2003; Vallacher & Wegner, 1987). The current investigation focused on the effect of initial subgoal pursuit on subsequent self-regulation. We proposed that the effect of subgoal attainment would change systematically when focusing on the subgoal versus the superordinate goal it is assumed to serve. When the focus is on subgoal attainment, the pursuit of the initial subgoal would hinder the selection of similar means, and failure on this subgoal would encourage selection of similar means. Conversely, when the focus is on a superordinate goal, the pursuit of an initial subgoal would increase commitment to similar means that favor the same overriding goal, whereas failure would decrease commitment to similar means. The focus on a salient superordinate goal thus would moderate the effect of subgoal attainment on subsequent self-regulation.

This self-regulatory process through subgoals was explored in four studies involving different goal domains (e.g., academic goals, health objectives) and with different experimental techniques, including hypothetical scenarios, field studies, and lab experiments. In these studies, we found consistent support for our hypothesis that, in themselves, subgoals are substitutable, but they reinforce each other when the focus is on the commitment to a superordinate goal. Specifically, participants in Study 1 chose to disengage from a goal-related activity (e.g., applying sunscreen) after successfully pursuing an initial subgoal toward this aim (e.g., wearing a sun hat). This pattern was reversed when constructs representing the superordinate goal were primed outside of conscious awareness in a scrambled-sentence task. Studies 2 and 3 extended the basic effect to real-life decisions, which involve actual success and failure at pursuing subgoals. 
In Study 2, participants who learned that they exercise more (vs. less) than others had a greater sense of subgoal attainment, and they were subsequently less interested in maintaining a healthy diet and exercising again. However, when the overriding goal of keeping in shape was primed, those who learned that they exercise more (vs. less) than others were more interested in maintaining a healthy diet as well as exercising again. Study 3 found that initial success (vs. failure) on a test decreased the motivation to persist on another test of the same cognitive ability, unless participants were primed with a superordinate achievement goal, which then led to greater persistence following initial success (vs. failure) on the test. These studies further found that success is more motivating when the focus is on the superordinate goal, that is, when it indicates commitment, but initial failure is more motivating when the focus is on subgoal attainment, that is, when it indicates low goal progress. Finally, to address more closely the framing of subgoals in terms of partial goal attainment or commitment to a superordinate goal, we designed Study 4 to assess the framing of temporally proximal versus distant subgoals. Study 4 found a tendency to focus on the progress from a proximal subgoal attainment, which leads to moving away from similar subgoals, but to infer commitment from a distant subgoal attainment, which motivates pursuit of additional subgoals toward the superordinate goal.
Although previous research on cognitive consistency has documented a general tendency to choose actions that are similar to a person’s previous actions (e.g., Bem, 1972; Cialdini, Trost, & Newsom, 1995), the current investigation provides evidence for both disengagement and reinforcement following initial choice. This investigation is therefore consistent with research in other domains, which provides evidence for both disengagement and reinforcement following initial actions. For example, our research is consistent with previous findings regarding the liberating effect of nondiscriminatory behaviors on subsequent discriminatory actions (e.g., Monin & Miller, 2001; Steele, 1988). In our terms, nondiscriminatory behaviors signal that the goal is met and therefore they justify incongruent, discriminatory actions. However, our analysis further implies that when individuals attribute the meaning of their initial behavior to their central values and beliefs, they are more likely to infer commitment to egalitarian values and avoid discriminatory actions. In general, our research provides a framework that can account for substitution as well as reinforcement in the regulation of multiple goals.
Implications for Research on Self-Regulatory Failures

Breaking a goal into subgoals is often an effective means of self-regulation. Relatively little is understood about subsequent behavior on actions relating to the same superordinate goal. The current research suggests that having subgoals may backfire and lead to poorer self-regulation in certain situations. For instance, we found that following initial success, students were less likely to persist on another similar academic test and that positive feedback regarding one’s exercising habits reduced one’s motivation to eat healthy food or exercise again that week. We believe, however, that the mechanism of balancing between subgoals (cf. Dhar & Simonson, 1999; Fishbach & Dhar, 2005) allows multiple goals to be selected and pursued. Because many life situations involve striving toward multiple goals that may be inconsistent with each other (e.g., studying and traveling), it is adaptive to express a certain degree of balancing between different motivational tendencies. By making incongruent choices, individuals not only secure the pursuit of multiple personal goals but further maximize the successful pursuit of their entire goal set.

However, inconsistent actions can become maladaptive whenever individuals fail to attain important goals that are in conflict with some low-level desires or temptations (e.g., Fishbach & Trope, 2005; Loewenstein, 1996; Metcalfe & Mischel, 1999; Muraven & Baumeister, 2000). We find that in the absence of an accessible superordinate goal, individuals tend to move away from a superordinate goal following successful pursuit of a subgoal. Possibly then, when individuals naturally think in terms of subgoal attainment rather than focusing on the overriding goal, they tend to move away from a goal too quickly and in favor of immediate temptations. 
Such maladaptive self-regulatory patterns were documented before in research on choice bracketing (Camerer et al., 1997; Read, Loewenstein, & Rabin, 1999). For example, in one study cab drivers stopped working once they reached their subgoal for the day, even though their overall goal of making money would have been better served by working longer hours on these days (Camerer et al., 1997). This suggests that thinking purely in terms of subgoal attainment interferes with adequate self-regulation, in particular, when people mistake their subgoals
for an overriding goal such that they focus solely on the pursuit of separate actions (e.g., eating healthy today as opposed to leading a healthy lifestyle in general).
Patterns of Sequencing

This research suggests that voluntarily chosen actions can potentially elicit congruent or incongruent subsequent choices. In this respect, our findings support the notion that people seek consistency in choice sequences as well as the notion that people are driven by an inherent desire to appear flexible and variety seeking. Specifically, a number of previous studies have shared the underlying assumption that individuals are driven by a general desire to appear consistent, both in the eyes of others and in their own eyes (see Aronson, 1997; Bem, 1972; Cialdini, 2001; Cooper & Fazio, 1984; Steele, 1988). Moreover, researchers have assumed that behavioral consistency is desirable and rewarded by society; thus, people should prefer to pursue actions that mostly resemble their previously chosen actions. However, there are also other studies, which have attested to the inherent value of diversity or variety seeking, and these studies have found a general desire to make inconsistent choices to maximize choice variety, even in situations in which one choice alternative clearly dominates others (Loewenstein & Prelec, 1992; Thaler, 1991). On the basis of that research, people should prefer actions that are mostly different from their previously chosen actions.

Our research distinguishes between the conditions that facilitate consistency and inconsistency in choice sequences. We suggest that when the focus is on subgoal attainment, people believe that they should act differently, but when the focus is on the commitment to and identification with a superordinate goal, people believe that they should choose other consistent actions. It appears that the psychological meaning of choice matters. Rather than assuming a universal tendency to pursue consistency or diversity in choice sequence, our analysis suggests that initial actions can motivate a need for consistency as well as a need for diversity. 
Finally, our research is also relevant to the study of the self-regulation of discrete actions that are spread over time from a standpoint of maximizing mental resources (e.g., Aspinwall & Taylor, 1997; Trope, Ferguson, & Raghunathan, 2001). We have explored the undermining versus motivating effect that an initial goal pursuit may have on subsequent pursuit of any action. However, it is further possible that goal pursuits influence the specific type of inconsistent actions that may follow. For example, as some goals (e.g., studying) are mentally depleting, pursuit of these goals is often followed by withdrawal and the choice of more relaxing activities (Muraven & Baumeister, 2000). On the other hand, some goal pursuits are resource fulfilling (e.g., watching a light comedy), and pursuing these goals may facilitate the pursuit of other, more effortful goals. Future research would have to examine the relationship between inconsistent goal pursuits in a sequence and their underlying principles.
References

Aarts, H., & Dijksterhuis, A. (2000). Habits as knowledge structures: Automaticity in goal-directed behavior. Journal of Personality and Social Psychology, 78, 53–63.
Aronson, E. (1997). The theory of cognitive dissonance: The evolution and vicissitudes of an idea. In C. McGarty & S. A. Haslam (Eds.), The message of social psychology: Perspectives on mind in society (pp. 20–35). Cambridge, MA: Blackwell.
Aspinwall, L. G., & Taylor, S. E. (1997). A stitch in time: Self-regulation and proactive coping. Psychological Bulletin, 121, 417–436.
Bargh, J. A., & Barndollar, K. (1996). Automaticity in action: The unconscious as repository of chronic goals and motives. In P. M. Gollwitzer & J. A. Bargh (Eds.), The psychology of action (pp. 457–481). New York: Guilford Press.
Bargh, J. A., & Chartrand, T. L. (2000). The mind in the middle: A practical guide to priming and automaticity research. In H. T. Reis & C. M. Judd (Eds.), Handbook of research methods in social and personality psychology (pp. 253–285). New York: Cambridge University Press.
Bem, D. J. (1972). Self-perception theory. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 6, pp. 1–62). New York: Academic Press.
Brunstein, J. C., & Gollwitzer, P. M. (1996). Effects of failure on subsequent performance: The importance of self-defining goals. Journal of Personality and Social Psychology, 70, 395–407.
Camerer, C., Babcock, L., Loewenstein, G., & Thaler, R. (1997). Labor supply of New York City taxi drivers: One day at a time. Quarterly Journal of Economics, 112, 407–441.
Carver, C. S., & Scheier, M. F. (1990). Principles of self-regulation: Action and emotion. In E. T. Higgins & R. M. Sorrentino (Eds.), Handbook of motivation and cognition: Foundations of social behavior (Vol. 2, pp. 3–52). New York: Guilford Press.
Carver, C. S., & Scheier, M. F. (1998). On the self-regulation of behavior. New York: Cambridge University Press.
Chartrand, T. L., & Bargh, J. A. (1996). Automatic activation of impression formation and memorization goals: Nonconscious goal priming reproduces effects of explicit task instructions. Journal of Personality and Social Psychology, 71, 464–478.
Cialdini, R. B. (2001). Influence: Science and practice (4th ed.). Boston: Allyn & Bacon.
Cialdini, R. B., Trost, M. R., & Newsom, J. T. (1995). Preference for consistency: The development of a valid measure and the discovery of surprising behavioral implications. Journal of Personality and Social Psychology, 69, 318–328.
Cooper, J., & Fazio, R. H. (1984). A new look at dissonance theory. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 17, pp. 229–264). Orlando, FL: Academic Press.
Dhar, R., & Simonson, I. (1999). Making complementary choices in consumption episodes: Highlighting versus balancing. Journal of Marketing Research, 36(1), 29–44.
Emmons, R. A. (1992). Abstract versus concrete goals: Personal striving level, physical illness, and psychological well-being. Journal of Personality and Social Psychology, 62, 292–300.
Emmons, R. A., & King, L. A. (1988). Conflict among personal strivings: Immediate and long-term implications for psychological and physical well-being. Journal of Personality and Social Psychology, 54, 1040–1048.
Festinger, L. (1957). A theory of cognitive dissonance. Evanston, IL: Row, Peterson.
Fishbach, A., & Dhar, R. (2005). Goals as excuses or guides: The liberating effect of perceived goal progress on choice. Journal of Consumer Research, 32, 370–377.
Fishbach, A., Shah, J. Y., & Kruglanski, A. W. (2004). Emotional transfer in goal systems. Journal of Experimental Social Psychology, 40, 723–738.
Fishbach, A., & Trope, Y. (2005). The substitutability of external control and self-control. Journal of Experimental Social Psychology, 41, 256–270.
Gollwitzer, P. M. (1999). Implementation intentions: Strong effects of simple plans. American Psychologist, 54, 493–503.
Gollwitzer, P. M., & Brandstaetter, V. (1997). Implementation intentions and effective goal pursuit. Journal of Personality and Social Psychology, 73, 186–199.
Higgins, E. T. (1989). Self-discrepancy theory: What patterns of self-beliefs cause people to suffer? In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 22, pp. 93–136). San Diego, CA: Academic Press.
Higgins, E. T., Strauman, T., & Klein, R. (1986). Standards and the process of self-evaluation: Multiple affects from multiple stages. In R. M. Sorrentino & E. T. Higgins (Eds.), Handbook of motivation and cognition: Foundations of social behavior (pp. 23–63). New York: Guilford Press.
Kruglanski, A. W., Shah, J. Y., Fishbach, A., Friedman, R., Chun, W. Y., & Sleeth-Keppler, D. (2002). A theory of goal systems. In M. P. Zanna (Ed.), Advances in experimental social psychology (Vol. 34, pp. 331–378). San Diego, CA: Academic Press.
Liberman, N., & Trope, Y. (1998). The role of feasibility and desirability considerations in near and distant future decisions: A test of temporal construal theory. Journal of Personality and Social Psychology, 75, 5–18.
Locke, E. A., & Latham, G. P. (1990). A theory of goal setting & task performance. Upper Saddle River, NJ: Prentice Hall.
Loewenstein, G. (1996). Out of control: Visceral influences on behavior. Organizational Behavior and Human Decision Processes, 65, 272–292.
Loewenstein, G., & Prelec, D. (1992). Anomalies in intertemporal choice: Evidence and an interpretation. In G. Loewenstein & J. Elster (Eds.), Choice over time (pp. 119–145). New York: Russell Sage Foundation.
Markus, H., & Ruvolo, A. (1989). Possible selves: Personalized representations of goals. In L. A. Pervin (Ed.), Goal concepts in personality and social psychology (pp. 211–241). Hillsdale, NJ: Erlbaum.
Metcalfe, J., & Mischel, W. (1999). A hot/cool-system analysis of delay of gratification: Dynamics of willpower. Psychological Review, 106, 3–19.
Monin, B., & Miller, D. T. (2001). Moral credentials and the expression of prejudice. Journal of Personality and Social Psychology, 81, 33–43.
Muraven, M., & Baumeister, R. F. (2000). Self-regulation and depletion of limited resources: Does self-control resemble a muscle? Psychological Bulletin, 126, 247–259.
Muraven, M., Tice, D. M., & Baumeister, R. F. (1998). Self-control as a limited resource: Regulatory depletion patterns. Journal of Personality and Social Psychology, 74, 774–789.
Mussweiler, T. (2003). Comparison processes in social judgment: Mechanisms and consequences. Psychological Review, 110, 472–489.
Read, D., Loewenstein, G., & Rabin, M. (1999). Choice bracketing. Journal of Risk and Uncertainty, 19(1), 171–197.
Shah, J. Y., Friedman, R., & Kruglanski, A. W. (2002). Forgetting all else: On the antecedents and consequences of goal shielding. Journal of Personality and Social Psychology, 83, 1261–1280.
Shah, J. Y., & Kruglanski, A. W. (2003). When opportunity knocks: Bottom-up priming of goals by means and its effects on self-regulation. Journal of Personality and Social Psychology, 84, 1109–1122.
Simonson, I., Nowlis, S. M., & Simonson, Y. (1993). The effect of irrelevant preference arguments on consumer choice. Journal of Consumer Psychology, 2, 287–306.
Soman, D., & Cheema, A. (2004). When goals are counter-productive: The effects of violation of a behavioral goal on subsequent performance. Journal of Consumer Research, 31(1), 52–62.
Steele, C. M. (1988). The psychology of self-affirmation: Sustaining the integrity of the self. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 21, pp. 261–302). New York: Academic Press.
Tesser, A., Martin, L. L., & Cornell, D. P. (1996). On the substitutability of self-protective mechanisms. In P. M. Gollwitzer & J. A. Bargh (Eds.), The psychology of action: Linking cognition and motivation to behavior (pp. 48–68). New York: Guilford.
Thaler, R. H. (1991). Quasi rational economics. New York: Russell Sage Foundation.
Trope, Y., Ferguson, M., & Raghunathan, R. (2001). Mood as a resource in processing self-relevant information. In J. P. Forgas (Ed.), Handbook of affect and social cognition (pp. 256–274). Hillsdale, NJ: Erlbaum.
Trope, Y., & Liberman, N. (2003). Temporal construal. Psychological Review, 110, 403–421.
Vallacher, R. R., & Wegner, D. M. (1987). What do people think they’re doing? Action identification and human behavior. Psychological Review, 94, 3–15.
Wicklund, R. A., & Gollwitzer, P. M. (1982). Symbolic self-completion. Hillsdale, NJ: Erlbaum.
Received February 1, 2005
Revision received November 22, 2005
Accepted November 24, 2005
Journal of Personality and Social Psychology 2006, Vol. 91, No. 2, 243–254
Copyright 2006 by the American Psychological Association 0022-3514/06/$12.00 DOI: 10.1037/0022-3514.91.2.243
Distinguishing Stereotype Threat From Priming Effects: On the Role of the Social Self and Threat-Based Concerns

David M. Marx and Diederik A. Stapel
University of Groningen

It has been argued that priming negative stereotypic traits is sufficient to cause stereotype threat. The present research challenges this assumption by highlighting the role of the social self and targets’ concerns about confirming a negative group-based stereotype. Specifically, in 3 experiments the authors demonstrate that stereotype threat adversely affects the test performance and threat-based concerns of targets (but not nontargets) because only targets’ social self is linked to the negative group stereotype. Trait priming, however, harms the test performance of both targets and nontargets but has no effect on their threat-based concerns because trait priming does not require such a link between the social self and the group stereotype. Moreover, the authors show that merely increasing the accessibility of the social self in nonthreatening situations leads to the underperformance of targets but has no meaningful effect on nontargets’ test performance.

Keywords: stereotype threat, priming, threat-based concerns, social identity, performance
To date, considerable research has investigated the adverse effects of stereotype threat on targets’ test performance (Steele, 1997; Steele, Spencer, & Aronson, 2002). This predicament, known as stereotype threat, refers to situations in which stereotyped targets underperform when a negative stereotype is relevant to their performance. For example, if women take a math test that they think is diagnostic of their mathematical ability, they typically have lower scores than men because the diagnosticity manipulation (see Steele & Aronson, 1995) makes the stereotype relevant to women’s performance. When the same math test is described in nonthreatening terms, however, women and men perform similarly because the negative stereotype is irrelevant to women’s performance. Stereotype threat is now well documented among a variety of groups and performance domains (e.g., Croizet & Claire, 1998; Gonzales, Blanton, & Williams, 2002; Inzlicht & Ben-Zeev, 2000; Marx & Roman, 2002; Spencer, Steele, & Quinn, 1999; Steele & Aronson, 1995; Stone, Lynch, Sjomeling, & Darley, 1999). Given the ubiquity of stereotype threat effects, the next step is to clarify how stereotype threat differs from other processes that also lead to stereotype confirming behavior, namely stereotype priming effects. Because there is a growing body of research investigating priming and stereotype threat, it appears particularly important and timely to distinguish between these two accounts. Moreover, because the behavioral outcomes of priming and stereotype threat can look the same (poor performance), a better understanding of stereotype-based performance seems needed.
Priming and Stereotype Threat

One of the primary aspects of stereotype threat is that a negative stereotype (e.g., “women are bad at math”) must be activated to harm targets’ test performance (e.g., women’s poor math test performance). This has led some researchers to view the priming of negative stereotypic traits as a sufficient means to create a stereotype threat situation, because a stereotype threat situation is also a situation in which stereotype activation leads directly to stereotype-related behaviors (Ambady, Paik, Steele, Owen-Smith, & Mitchell, 2004; Dijksterhuis & Bargh, 2001; Dijksterhuis & Corneille, 2004; Gladwell, 2005; Oswald & Harvey, 2000; Shih, Ambady, & Pittinsky, 1999; Wheeler, Jarvis, & Petty, 2001). As several experiments have shown, when participants are primed with labels or stereotypic traits, they behave in a stereotype-consistent manner, such as walking slower down a hallway when primed with the stereotype of an older person (Bargh, Chen, & Burrows, 1996) or performing less intelligently on general knowledge tests when thinking about soccer hooligans (Dijksterhuis & van Knippenberg, 1998).

In contrast to this priming perspective, which suggests that stereotype threat could be viewed as a general priming effect (i.e., stereotype activation → stereotype-confirming test performance), others have argued that stereotype threat is more than a mere priming effect. For instance, according to Marx, Brown, and Steele (1999), targets’ poor test performance is due to the “situational pressure posed by the prospect of being seen or treated through the lens of a negative group stereotype” (p. 493). This situational pressure, in turn, increases targets’ worry about being judged in
David M. Marx and Diederik A. Stapel, Department of Social and Organizational Psychology, University of Groningen, Groningen, the Netherlands. This research was supported in part by “Pionier” Grant “Making Sense of Hot Cognition” from the Dutch National Science Foundation (Nederlandse Organisatie voor Wetenschappelijk Onderzoek) and a research grant from the Heymans Institute of the University of Groningen awarded to Diederik A. Stapel. We thank Sei Jin Ko for her helpful comments on earlier versions of this article. Correspondence concerning this article should be addressed to David M. Marx, who is now at the Psychology Department, San Diego State University, 5500 Campanile Drive, San Diego, CA 92182-4611, or to Diederik A. Stapel, Department of Social and Organizational Psychology, University of Groningen, Grote Kruisstraat 2/1, Groningen 9712 TS, the Netherlands. E-mail: [email protected] or [email protected]
terms of the stereotype associated with their stereotyped identity if they underperform (e.g., Steele, 1997; Steele et al., 2002). Consider a situation in which women and men are taking a difficult math test. If the women perform poorly, this could be perceived as evidence for the stereotype that “women cannot do math.” Consequently, in stereotype threat experiments, unlike in priming experiments, there is a clear relationship between the negative stereotype and targets’ (but not nontargets’) performance on the test. Even though there is a large amount of research exploring the effects of stereotype threat and priming on participants’ performance, a direct empirical comparison between these two accounts of stereotype-based performance has not been made until now. Thus, at present, we do not know to what extent stereotype threat and stereotype priming effects are empirically similar or different (but see Wheeler and Petty, 2001, for a review of stereotype threat and priming effects on behavior). In this article, we argue and demonstrate that stereotype threat is more than a general priming effect: Stereotype threat is a situational predicament that links one’s performance to the concern about confirming a negative group stereotype (i.e., “I worry that my test performance may confirm the negative stereotype about my group”). In other words, although priming effects typically rely on perception-to-behavior effects that refer to relatively global, nonspecific links between activation and behavior (e.g., priming soccer hooligans causes aggressive behavior for anyone because this category label activates the relevant concept, aggressiveness), stereotype threat relies on perception-to-behavior effects that are less global and more specific. Stereotype priming can affect anyone, whereas stereotype threat, by definition, only occurs for those people who are targeted by the relevant stereotype. 
Of course, in both cases the stereotype needs to be activated, but it is how people view the stereotype that distinguishes between these two accounts of stereotype-based performance. Thus, whereas stereotype knowledge (e.g., associating older people with the trait slow or associating hooligans with the trait aggressive) is sufficient for stereotype activation to result in general behavioral effects (e.g., walking slower, acting aggressively), such knowledge is not enough for stereotype threat effects to occur. For stereotype threat, both knowing and being are necessary (see Marx, Stapel, & Muller, 2005; Steele, 1997; Steele et al., 2002). For example, in stereotype threat situations, targets (but not nontargets) are affected because they know the group stereotype (“women are bad at math”) and because they are members of the group that is targeted by the stereotype (“I am a woman”). Hence, self-relevance of a stereotype is one of the keys to stereotype threat effects. Priming self-relevant or other-relevant stereotypes can lead to stereotype-consistent behavior (e.g., Shih, Ambady, Richeson, Fujita, & Gray, 2002; Wheeler et al., 2001), but only when the stereotype is self-relevant may the primed behavior be related to the stereotype threat experience. Thus, on the surface, negative trait priming and threat appear somewhat indistinguishable: Both can lead to the same outcome (poor performance). However, if one looks below the surface, the differences are clear. For trait priming effects, it does not matter who one is; for threat effects, it does. That is, if one is targeted by a negative stereotype, then one’s concerns about confirming that stereotype are likely to be increased. Not only does stereotype threat induce stereotype-consistent behavior, it also induces worry and concern (Spencer et al., 1999; Steele et al.,
2002). It is interesting that none of the research that has looked at stereotype threat from a priming perspective has included such a measure of worry or concern. Rather, the relevant research to date has focused primarily on test performance and not on participants’ self-related, threat-based concerns (Ambady et al., 2004; Shih et al., 1999, 2002; Wheeler et al., 2001).
Threat-Based Concerns
Although at first glance the results from many priming experiments appear consistent with the notion that priming stereotypic traits may be sufficient for causing stereotype threat effects on targets’ performance, we think that priming procedures are insufficient because they bypass one of the hallmarks of stereotype threat: the worry about confirming a negative stereotype associated with one’s group (Major & O’Brien, 2005; Marx et al., 1999; Steele et al., 2002). It is also important to note that because much of the research exploring the impact of negative traits on performance has not included both targets and nontargets (see, e.g., Ambady et al., 2004; Bargh et al., 1996; Dijksterhuis & van Knippenberg, 1998; Wheeler et al., 2001; but see Shih et al., 2002, for work on positive trait priming among targets and nontargets), it remains difficult to determine whether negative trait priming affects the performance of targets and nontargets in the same way and whether trait priming has any effect on participants’ threat-based concerns (see also Wheeler & Petty, 2001). We contend, therefore, that what is needed to assess this knowing-and-being aspect of stereotype threat are measures of participants’ threat-based concerns, as this type of measure can show whether targets link their performance in the testing situation to their stereotyped identity. That no measure of participants’ threat appraisals has been included in past priming research only helps to fuel the misperception that negative trait priming can elicit stereotype threat. By including such a measure, it may be possible to show that despite similar performance outcomes the primes have very different effects on participants’ threat-based concerns (high concern = stereotype threat, low or no concern = priming).
The Role of the Social Self
This knowing-and-being logic of stereotype threat also suggests that stereotype threat situations are especially likely to increase the accessibility of the social identity that is most relevant to the stereotype. Recent research by Marx et al. (2005) provides an illustration of this point. These researchers showed that female participants’ gender group (relative to other social groups, such as friends and family) is more accessible in a math stereotype threat situation than in a nonthreat situation. This finding shows that stereotype threat situations increase the accessibility of the relevant social self (gender).¹ In the present research, we test whether
¹ Some may view this reasoning as similar to research on identity bifurcation by Pronin, Steele, and Ross (2004). On the contrary, we see the Pronin et al. research and the present research as being rather distinct. Pronin et al. focused on how the situation (either a stereotype threat or neutral one) can lead to differences in targets’ identification with certain characteristics that are stereotype relevant, whereas we focus on how one’s social identity in combination with the situational cues can trigger stereotype threat (see also Marx et al., 2005).
STEREOTYPE THREAT AND PRIMING
the reverse is also true: Will increased accessibility of a generalized social self change a nonthreatening testing situation into a threatening one if targets are given a relevant situational cue (seeing math problems)? Our reasoning behind this notion is that increasing accessibility of a generalized social self (“we-ness”) may thus enhance targets’ threat-based concerns— even in nonthreatening testing situations— because the relevant cues likewise increase the accessibility of the group stereotype associated with the situation (Major & O’Brien, 2005; Marx et al., 1999, 2005; Steele et al., 2002). Although we are not the first to discuss the role of the social self in stereotype-based performance effects, we are the first to demonstrate empirically how the social self can help distinguish between stereotype threat and priming effects. For instance, Wheeler and Petty (2001) have also detailed how the social self plays a role in stereotype threat, but they, unlike us, allowed the social self to play a smaller role than stereotype activation plays. We think the roles should be reversed, such that stereotype activation plays a supporting role to the leading role of the social self. Therein lies the fundamental difference between priming and stereotype threat. Stereotype threat is a situational predicament in which negative stereotypes are activated. Then, depending on who one is, concerns about confirming the negative stereotype may be increased if the stereotype is self-relevant. Although stereotype priming also relies on stereotype activation, the similarities stop there. Once the negative stereotype becomes linked to the social self and how a person performs, it is no longer a priming effect: It becomes a stereotype threat effect. In short, we argue that stereotype threat involves knowing as well as being the stereotype. 
Thus, stereotype threat occurs for people who know and are targeted by the relevant stereotype, leading them to feel threatened (“I worry about my performance because I am a woman, and I know women are perceived to be bad at math”), whereas priming effects can occur for anyone because no such link is necessary. Therefore, when distinguishing between these two routes to stereotype-based performance (stereotype priming, stereotype threat), it seems critical to include targets and nontargets, stereotype threat conditions (i.e., when targets’ performance is relevant to the group stereotype) and nonstereotype threat conditions (i.e., when targets’ performance is irrelevant to the group stereotype), as well as measures of participants’ threat-based concerns in the research design. By including all of these factors, we can assess how priming of negative traits and stereotype threat differentially affect stereotyped and nonstereotyped participants’ test performance and threat-based concerns, something that prior research on negative trait priming was unable to do (e.g., Ambady et al., 2004; Oswald & Harvey, 2000; Wheeler et al., 2001).
Research Overview
In three experiments, we compared and contrasted priming with stereotype threat so that we could test the knowing-and-being logic of stereotype threat. The first two experiments pitted the effects of priming against stereotype threat on participants’ math test (Experiment 1) and emotion test (Experiment 2) performance and their threat-based concerns. The final experiment was conducted to examine further our reasoning regarding the knowing-and-being logic of stereotype threat and how merely heightening the accessibility of a generalized social self can cause targets, but not
nontargets, to have lower test performance and higher threat-based concerns as a result. In other words, we tested the notion that accessibility of the social self just before taking a nonthreatening test is sufficient for causing stereotype threat among targets, but not nontargets, because of the negative stereotype associated with targets’ social self.
Experiment 1: Math Test Performance
The primary goal of Experiment 1 was to highlight the distinction between priming and stereotype threat. To do this, we focused on the stereotype that women possess less math ability than men. Accordingly, male and female participants took a difficult math test under stereotype threat or nonstereotype threat conditions. In addition, half the participants were primed with the negative trait dumb and its semantic associates before they took the test, whereas the other half were not primed. On the basis of our knowing-and-being reasoning, we expected that we would find the typical stereotype threat pattern (i.e., targets having lower scores when the group stereotype is relevant to their performance) on targets’ test performance and threat-based concerns in the stereotype threat conditions. We further predicted that in the nondiagnostic conditions both male and female participants would have lower scores in the primed than in the not primed condition but that there would be no difference in their threat-based concerns between these conditions, because priming should not affect threat-based concerns. In other words, we should find a general priming effect only on participants’ test performance, but not on their threat-based concerns.
Method
Participants and Design
Participants were 60 female and 50 male Dutch undergraduates who took part in exchange for course credit. For this experiment, we used a 2 (gender of participant: female, male) × 2 (test description: diagnostic, nondiagnostic) × 2 (stereotype prime: not primed, primed) between-participants design.
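The factorial design just described can be enumerated mechanically. The following is a minimal sketch and not part of the study materials: the factor labels come from the design description, whereas the function and variable names are illustrative assumptions. It lists the eight between-participants cells and derives the error degrees of freedom that a fully crossed ANOVA would have (total N minus the number of cells), assuming no missing data.

```python
from itertools import product

# Factor levels as described in the design; the dictionary keys are illustrative names.
FACTORS = {
    "gender": ("female", "male"),
    "test_description": ("diagnostic", "nondiagnostic"),
    "stereotype_prime": ("not primed", "primed"),
}

def design_cells(factors):
    """Return every combination of factor levels (one tuple per cell)."""
    return list(product(*factors.values()))

def error_df(n_participants, factors):
    """Error df for a fully crossed between-participants ANOVA: N minus cells."""
    return n_participants - len(design_cells(factors))

cells = design_cells(FACTORS)
n_total = 60 + 50  # 60 female and 50 male participants
print(len(cells))                  # number of cells in the design
print(error_df(n_total, FACTORS))  # denominator df for each F test
```

Under these assumptions the sketch yields 8 cells and 102 error degrees of freedom, which agrees with the F(1, 102) tests reported for this experiment.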
Procedure
On entering the laboratory, participants were informed that they would be involved in a series of short, unrelated tasks.
Scrambled sentence task. Participants in the primed condition unscrambled sentences that were designed to activate the trait dumb or its semantic associate (e.g., unintelligent, foolish). To do this, participants were given a list of 20 scrambled five-word groups. Participants were then instructed to reorganize and write out the word groups into meaningful sentences, using at least four words from each group (cf. Srull & Wyer, 1979). Twelve of the word groups contained a concept that was relevant to the focal trait (dumb). Filler word groups containing behaviors not related to the trait (e.g., “a packed trip suitcase for”) were interspersed among the word groups containing the priming stimuli. The participants in the not primed conditions completed the same number of sentences with only the filler word groups.
Test description manipulation. For all conditions, the test format resembled a standard Graduate Record Exam (GRE) math section but varied in a number of important ways. In the diagnostic condition, the test was described as being diagnostic of math ability as well as one that can identify a person’s mathematical strengths and weaknesses. Moreover, written on the cover of the test booklet was the name of a fictitious testing
Table 1
Means and Standard Deviations of Math Test Performance and Threat-Based Concerns as a Function of Stereotype Prime, Test Description, and Participant Gender

Stereotype prime        Not primed                      Primed
Test description        Diagnostic      Nondiagnostic   Diagnostic      Nondiagnostic
Measure and gender      M       SD      M       SD      M       SD      M       SD
Math test performance
  Female participant    10.67a  1.97    12.29b  1.57    9.27d   1.79    10.63a  1.54
  Male participant      13.50c  1.45    12.25b  0.87    12.50b  1.45    10.86a  1.17
Threat-based concerns
  Female participant    3.03b   0.58    2.08a   0.53    2.87b   0.63    2.02a   0.66
  Male participant      2.31a   0.56    1.92a   0.38    1.78a   0.64    2.21a   0.61

Note. The letter after each mean is its subscript. All means that do not share a common subscript differ at p < .05.
center, “Massachusetts Aptitude Assessment Center (MAAC),” followed by the label “Diagnostic Exam.” This procedure has successfully created a situation of stereotype threat in previous research (Gonzales et al., 2002; Marx et al., 2005; Steele & Aronson, 1995). In the nondiagnostic condition, the same test was described as a reasoning exercise, thus purposefully not activating the negative stereotype about women and math. Furthermore, in both conditions, participants were told that they would receive feedback about their test performance at the conclusion of the experiment (no feedback was actually given). Participants had 30 min to complete the 20-problem math test. Performance could range from 0 to 20.
Threat-based concerns. To assess whether our manipulations affected the participants’ threat-based concerns, we had them indicate how much they agreed with the following three statements: “I worry that my ability to perform well on math tests is affected by my gender”; “I worry that if I perform poorly on this test, the experimenter will attribute my poor performance to my gender”; “I worry that, because I know the negative stereotype about women and math, my anxiety about confirming that stereotype will negatively influence how I perform on math tests.” Responses were recorded on a 7-point scale anchored with the terms (1) strongly disagree and (7) strongly agree. We averaged the participants’ responses to form a single threat-based concern score (α = .74).² When participants were finished, they were debriefed and thanked for their time.
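The scoring of the threat-based concern measure (averaging three items and reporting an internal-consistency coefficient) can be illustrated with a short sketch. This is not the authors' analysis script: the ratings below are hypothetical, the function names are assumptions, and the standard formula for Cronbach's alpha is used on the assumption that this is the coefficient reported.

```python
from statistics import mean, variance

def composite_score(responses):
    """Average one participant's item ratings into a single concern score."""
    return mean(responses)

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of participants' item ratings.

    rows: list of equal-length lists, one list of item ratings per participant.
    """
    k = len(rows[0])                  # number of items
    items = list(zip(*rows))          # transpose: one tuple of ratings per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical ratings on the 7-point scale (NOT data from the experiment).
ratings = [
    [2, 3, 2],
    [4, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
    [5, 4, 5],
]
scores = [composite_score(r) for r in ratings]
alpha = cronbach_alpha(ratings)
print(scores)
print(round(alpha, 2))
```

Each composite score stays on the original 1 to 7 metric because the items are averaged rather than summed, which keeps the cell means in the table directly interpretable against the scale anchors.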
Results and Discussion
Math Test Performance
The participants’ math test performance was analyzed by means of a 2 (participant gender) × 2 (test description) × 2 (stereotype prime) analysis of variance (ANOVA; see Table 1). This analysis revealed main effects for participant gender, F(1, 102) = 28.67, p < .05, η² = .22, and stereotype prime, F(1, 102) = 21.86, p < .05, η² = .18. We also found a Participant Gender × Test Description interaction, F(1, 102) = 25.32, p < .05, η² = .20 (all other effects, Fs < 1.00). Table 1 shows, as hypothesized, that within the primed–diagnostic condition, female participants (M = 9.27, SD = 1.79) underperformed relative to male participants (M = 12.50, SD = 1.45), F(1, 102) = 30.21, p < .05, η² = .23. This same pattern occurred for female (M = 10.67, SD = 1.97) and male (M = 13.50, SD = 1.45) participants in the not primed–diagnostic condition, F(1, 102) = 20.87, p < .05, η² = .17. We also found that female participants in the not primed–diagnostic condition (M = 10.67, SD = 1.97) had lower math scores than female participants in the not primed–nondiagnostic condition (M = 12.29, SD = 1.57), F(1, 102) = 8.02, p < .05, η² = .07.³ On the whole, these results demonstrate the standard stereotype threat effect (i.e., targets performing poorly when a group stereotype is relevant to their performance).
Our results showed that male participants in the primed–nondiagnostic condition (M = 10.86, SD = 1.17) had lower math scores than male participants in the not primed–nondiagnostic condition (M = 12.25, SD = 0.87), F(1, 102) = 5.43, p < .05, η² = .05. This pattern also occurred in the nondiagnostic condition for female participants in the primed (M = 10.63, SD = 1.54) compared with the not primed conditions (M = 12.29, SD = 1.57), F(1, 102) = 9.88, p < .05, η² = .09. Taken together, these results are consistent with a general priming effect. These results also indicate that when the social self is not accessible, priming harms targets’ test performance in the same way that it does nontargets.
Interestingly, we also found a lift effect (Walton & Cohen, 2003) in the primed condition, such that male participants performed better in the diagnostic condition (M = 12.50, SD = 1.45) than in the nondiagnostic condition (M = 10.86, SD = 1.17), F(1, 102) = 7.55, p < .05, η² = .07. Male participants in the not primed condition also performed better in the diagnostic (M = 13.50, SD = 1.45) relative to the nondiagnostic condition (M = 12.25, SD = 0.87), F(1, 102) = 4.07, p < .05, η² = .04. Next, we turned to the question of whether priming has an effect on targets’ threat-based concerns, as this is also a critical test of the knowing-and-being logic of stereotype threat (Marx & Stapel, in press; Marx et al., 2005; Steele, 1997; Steele et al., 2002).
² It is important to note that, as in all the experiments reported in this article, separate analyses for each of the threat-based concern items yielded a similar pattern of effects. For reasons of parsimony, we only reported the effects on the composite scores.
³ For Experiments 1 and 2, we did not compare targets’ test performance between the diagnostic and nondiagnostic conditions within the primed condition because targets’ test performance should be negatively affected in the primed–nondiagnostic condition (as would be expected from a priming perspective); thus, their test performance may not differ reliably from targets’ test performance in the diagnostic condition. Moreover, given that priming should negatively affect targets’ performance in the nondiagnostic condition, it is unclear what the comparison between the diagnostic and nondiagnostic condition would mean within the primed condition.
Threat-Based Concerns
Participants’ threat-based concerns were analyzed by means of a 2 (participant gender) × 2 (test description) × 2 (stereotype prime) ANOVA (see Table 1). This analysis revealed main effects for participant gender, F(1, 102) = 13.09, p < .05, η² = .12, and test description, F(1, 102) = 17.62, p < .05, η² = .14. There was also a Participant Gender × Test Description interaction, F(1, 102) = 17.47, p < .05, η² = .14, and a marginally reliable Test Description × Stereotype Prime interaction, F(1, 102) = 3.77, p < .06, η² = .04 (all other effects, ps > .11). As can be seen in Table 1, within the primed–diagnostic condition female participants (M = 2.87, SD = 0.63) experienced more concern relative to male participants (M = 1.78, SD = 0.64), F(1, 102) = 23.36, p < .05, η² = .18. Within the not primed–diagnostic condition, we also found that female participants (M = 3.03, SD = 0.58) had higher concern scores than male participants (M = 2.31, SD = 0.56), F(1, 102) = 9.18, p < .05, η² = .08. Moreover, female participants in the primed condition had higher concern scores in the diagnostic (M = 2.87, SD = 0.63) than in the nondiagnostic conditions (M = 2.02, SD = 0.66), F(1, 102) = 16.50, p < .05, η² = .14. Female participants in the not primed condition likewise experienced more concern in the diagnostic (M = 3.03, SD = 0.58) than in the nondiagnostic condition (M = 2.08, SD = 0.53), F(1, 102) = 18.73, p < .05, η² = .15. In the diagnostic condition, there was no reliable difference in female participants’ threat-based concern between the primed (M = 2.87, SD = 0.63) and not primed conditions (M = 3.03, SD = 0.58, F < 1.00). Furthermore, there were no differences between the female and male participants’ threat-based concern scores in the nondiagnostic conditions (ps > .20).
Together, these results demonstrate the typical stereotype threat pattern on targets’ threat-based concerns and show that priming does not have any meaningful effect on their threat appraisals. In short, our findings highlight how stereotype threat and negative trait priming effects differ. In the diagnostic condition, female participants underperformed compared with male participants. Results also showed that female participants had lower math scores in the diagnostic condition relative to female participants in the nondiagnostic condition. Moreover, in the nondiagnostic condition we found the typical priming effect on participants’ math test performance (stereotype activation → stereotype-consistent test performance), such that both male and female participants performed in a stereotype-consistent manner. It is important that, as predicted, the prime did not elevate female participants’ threat-based concerns. These results support the notion that the activation of stereotypic traits is not enough for stereotype threat to occur. For stereotype threat to occur, targets need to make the connection between the stereotype and how well they perform in a specific testing situation; hence, in line with the knowing-and-being logic of stereotype threat, they need to link what they know about the stereotype to who they are.
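The effect size printed after each F test in this section can be related to the F ratio itself. The sketch below assumes, since the text does not say so explicitly, that the reported coefficient is partial eta squared, which for a between-participants effect can be recovered as F × df_effect / (F × df_effect + df_error). The function name is illustrative; the F values are those reported for math test performance.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F ratios for math test performance as reported in the text, each with df = (1, 102).
reported = {
    "participant gender": 28.67,
    "stereotype prime": 21.86,
    "gender x test description": 25.32,
}
for effect, f_value in reported.items():
    print(effect, round(partial_eta_squared(f_value, 1, 102), 2))
```

For these three effects the recovered values round to .22, .18, and .20, in line with the printed effect sizes. Values recomputed this way for other tests may differ from the printed ones in the second decimal, for example because the F ratios are themselves rounded.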
Experiment 2: Emotion Test Performance
The purpose of Experiment 2 was to replicate the effects of Experiment 1 as well as to generalize our knowing-and-being logic of stereotype threat effects to domains other than academics (e.g., Aronson et al., 1999; Leyens, Desert, Croizet, & Darcis, 2000; Marx & Stapel, in press). We also examined whether threat and trait priming lead to the same level of endorsement regarding the negative stereotype about men and emotional insensitivity, as it could be reasoned that participants’ stereotype endorsement may contribute to possible performance differences (cf. Blanton, Christie, & Dye, 2002). Inclusion of this item also serves as a manipulation check of sorts, in that we could assess whether there was equal endorsement of the stereotype as well as whether the participants were even aware of the stereotype about men and emotional insensitivity. This issue seemed important given that we were focusing on a relatively less studied stereotype than the stereotypes used in past work on stereotype threat and priming (e.g., women and math). To investigate these issues, we used procedures similar to the ones used in Experiment 1 except that participants took an emotion test and were primed with the negative trait emotional insensitivity and its semantic associates. We made the same predictions as in Experiment 1, but in this experiment male participants were the targets of a negative stereotype.
Method
Participants and Design
Participants were 54 female and 51 male Dutch undergraduates who took part in exchange for course credit. For this experiment we used a 2 (gender of participant: female, male) × 2 (test description: diagnostic, nondiagnostic) × 2 (stereotype prime: not primed, primed) between-participants design.
Procedure
On entering the laboratory, participants were informed that they would be involved in a series of short, unrelated tasks.
Priming manipulation. We used the scrambled sentence task from Experiment 1 but modified it so that the priming stimuli activated the trait emotional insensitivity as well as its semantic associates (e.g., inconsiderate, cold). The filler word groups were identical to those used in Experiment 1.
Emotion test. We used the same test description manipulation and test format from Experiment 1, but for this experiment we adapted them so that they fit with the stereotype about men and emotional insensitivity. Moreover, written on the cover of the diagnostic test booklet was the label “Emotional Sensitivity Exam” and written on the cover of the nondiagnostic test booklet was the label “Emotional Exercise.” The emotion test comprised several types of problems that were loosely based on other emotion measures and exercises (see Bar-On, 1997; Schutte et al., 1998). For instance, participants had to answer problems about which emotion is best captured by a particular facial expression, indicate which two basic emotions (e.g., joy, expectation) make a more complex emotion (optimism), and answer problems about how emotions typically develop (e.g., “If you feel more and more guilty and you lose your feeling of self-worth then you feel?”; shame). A version of this emotion test has been used in our past work on emotional stereotype threat (for details, see Marx & Stapel, in press). Participants had 20 minutes to complete the 10-problem emotion test. Test performance could range from 0 to 10.
Table 2
Means and Standard Deviations of Emotion Test Performance and Threat-Based Concerns as a Function of Stereotype Prime, Test Description, and Participant Gender

Stereotype prime        Not primed                      Primed
Test description        Diagnostic      Nondiagnostic   Diagnostic      Nondiagnostic
Measure and gender      M       SD      M       SD      M       SD      M       SD
Emotion test performance
  Female participant    8.92a   1.00    6.64b   1.21    6.92b   1.38    5.28c   1.07
  Male participant      5.08c   1.38    7.17b   0.94    4.07d,c 1.86    4.83c   1.03
Threat-based concerns
  Female participant    1.56a   0.62    1.64a   1.04    1.90a   1.37    1.78a   1.12
  Male participant      2.67b   0.84    1.39a   0.49    3.31b   0.81    1.72a   1.05

Note. The letter after each mean is its subscript. All means that do not share a common subscript differ at p < .05.
Threat-based concerns and stereotype endorsement. To assess participants’ threat-based concerns, we modified the concern measure from Experiment 1 so that it was appropriate for the stereotype about men and emotional insensitivity (α = .72). After completing the threat-based concern items, participants answered an item about their endorsement of the stereotype about men and emotional insensitivity (“Women are more emotionally sensitive than men.”).⁴ Scores on this item could range from 1 (not at all) to 9 (very much), with higher numbers indicating more endorsement of the stereotype. Threat-based concerns and the stereotype endorsement item were not strongly related (r = .12, p = .22). On completion of these items, the participants were debriefed and thanked for their time.
Results and Discussion
Stereotype Endorsement
We analyzed participants’ stereotype endorsement scores using a 2 (participant gender) × 2 (test description) × 2 (stereotype prime) ANOVA. We found, as anticipated, no main or interactive effects (ps > .27). That we found no differences suggests that neither our threat nor our priming manipulations affected participants’ level of endorsement regarding the stereotype about men and emotional insensitivity. It is also clear that participants in this experiment endorsed the stereotype equally. Hence, if we find effects as a function of our manipulations, it would be unlikely that those effects were due to differences in participants’ endorsement of the stereotype.⁵
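The awareness check described in Footnote 5 (the share of participants scoring below, at, or above the midpoint of the 9-point endorsement item) amounts to a simple tabulation. The sketch below uses hypothetical ratings and an illustrative function name; only the midpoint of 5 is taken from the footnote.

```python
from collections import Counter

def classify_by_midpoint(scores, midpoint=5):
    """Return the percentage of scores below, at, and above the scale midpoint."""
    labels = Counter(
        "below" if s < midpoint else "at" if s == midpoint else "above"
        for s in scores
    )
    n = len(scores)
    return {key: 100 * labels[key] / n for key in ("below", "at", "above")}

# Hypothetical endorsement ratings on the 1-9 scale (NOT data from the experiment).
example_scores = [7, 8, 5, 9, 6, 3, 7, 8, 6, 5]
print(classify_by_midpoint(example_scores))
```

Reporting the three percentages separately, rather than collapsing scores at the midpoint into one of the other groups, keeps participants who gave a neutral rating distinguishable from those who rejected the stereotype.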
Emotion Test Performance
The participants’ emotion test performance was analyzed by means of a 2 (participant gender) × 2 (test description) × 2 (stereotype prime) ANOVA (see Table 2). This analysis revealed main effects for participant gender, F(1, 97) = 44.02, p < .05, η² = .31, and stereotype prime, F(1, 97) = 48.77, p < .05, η² = .34. We also found a Participant Gender × Test Description interaction, F(1, 97) = 43.86, p < .05, η² = .31, and a marginally reliable three-way interaction, F(1, 97) = 3.82, p < .06, η² = .04 (all other effects, ps > .24). Table 2 shows, as predicted, that in the primed–diagnostic condition female participants (M = 6.92, SD = 1.38) outperformed male participants (M = 4.07, SD = 1.86), F(1, 97) = 33.80, p < .05, η² = .26. In the not primed–diagnostic condition, we also found that female participants (M = 8.92, SD = 1.00) had higher scores than male participants (M = 5.08, SD = 1.38), F(1, 97) = 56.80, p < .05, η² = .37. Moreover, when we compared male participants’ test performance within the not primed condition we found that they performed better in the nondiagnostic condition (M = 7.17, SD = 0.94) than in the diagnostic condition (M = 5.08, SD = 1.38), F(1, 97) = 16.83, p < .05, η² = .15. These comparisons again demonstrate the typical stereotype threat pattern, but this time on male participants’ emotion test performance.
Within the nondiagnostic conditions, male participants had lower scores in the primed condition (M = 4.83, SD = 1.03) than in the not primed condition (M = 7.17, SD = 0.94), F(1, 97) = 20.28, p < .05, η² = .17. This was also the case for female participants in the primed (M = 5.28, SD = 1.07) and not primed conditions (M = 6.64, SD = 1.21), F(1, 97) = 7.80, p < .05, η² = .07. Again, these effects underscore our contention that when targets do not have to contend with a negative group stereotype, priming harms their test performance in the same way that it does nontargets.
Just as in Experiment 1 for math test performance, we found a lift effect for female participants in emotion test performance such that in the primed–diagnostic conditions (M = 6.92, SD = 1.38) female participants performed better than female participants in the primed–nondiagnostic conditions (M = 5.28, SD = 1.07), F(1, 97) = 12.53, p < .05, η² = .12. The same effect occurred in the not primed condition for female participants in the diagnostic (M = 8.92, SD = 1.00) and nondiagnostic conditions (M = 6.64, SD = 1.21), F(1, 97) = 18.42, p < .05, η² = .16. Next, we turned to the question of whether targets’ threat-based concerns are affected by priming of stereotypic traits.
⁴ Of course, we are aware that this statement specifically asks participants about their endorsement of the positive stereotype about women and emotional sensitivity. However, this statement also implies that men possess less emotional sensitivity than do women; thus, it serves as a reasonable proxy for endorsement of the stereotype about men and emotional insensitivity.
⁵ In addition to examining whether our stereotype endorsement item was affected by our experimental manipulations, we also examined whether participants knew the stereotype about men and emotional insensitivity. To do this, we looked at the percentage of participants who had scores above the midpoint of our scale (5), as this would indicate that participants were aware of the stereotype, whereas scores below the midpoint would indicate that participants were unaware of the stereotype. We found that 4% of participants were below the midpoint, 12% were at the midpoint, and 84% were above the midpoint.
Threat-Based Concerns
The participants’ threat-based concerns were analyzed by means of a 2 (participant gender) × 2 (test description) × 2 (stereotype prime) ANOVA (see Table 2). We found main effects for participant gender, F(1, 97) = 9.03, p < .05, η² = .08, test description, F(1, 97) = 14.64, p < .05, η² = .13, and stereotype prime, F(1, 97) = 4.59, p < .05, η² = .04. We also found a Participant Gender × Test Description interaction, F(1, 97) = 13.90, p < .05, η² = .12 (all other effects, Fs < 1.00). Table 2 shows that within the primed–diagnostic condition, male participants (M = 3.31, SD = 0.81) had higher concern scores relative to female participants (M = 1.90, SD = 1.37), F(1, 97) = 14.33, p < .05, η² = .13. The same effect occurred in the not primed–diagnostic condition such that male participants (M = 2.67, SD = 0.84) were more concerned than female participants (M = 1.56, SD = 0.62), F(1, 97) = 8.22, p < .05, η² = .08. We also found that male participants in the primed–diagnostic condition (M = 3.31, SD = 0.81) felt more concern than male participants in the primed–nondiagnostic conditions (M = 1.72, SD = 1.05), F(1, 97) = 17.47, p < .05, η² = .15. This was also the case for male participants in the not primed–diagnostic conditions (M = 2.67, SD = 0.84) compared with male participants in the not primed–nondiagnostic condition (M = 1.39, SD = 0.49), F(1, 97) = 10.93, p < .05, η² = .10. Within the diagnostic conditions, there was a marginal difference in male participants’ threat-based concern scores between the primed (M = 3.31, SD = 0.81) and not primed conditions (M = 2.67, SD = 0.84), F(1, 97) = 2.95, p = .09, η² = .03, indicating that our priming procedure had a slight effect on the male participants’ concern scores. It is important that, as would be predicted from stereotype threat theory, there were no differences between male and female participants’ threat-based concern scores in the nondiagnostic conditions, Fs < 1.00.
Taken together, the findings from Experiment 2 provide additional support for how stereotype threat and priming effects differ. As before, we found the standard stereotype threat pattern on participants’ test performance. In the nondiagnostic condition, we found the typical priming effect (stereotype activation → stereotype-consistent test performance), such that both targets and nontargets performed in a stereotype-confirming manner. But what is probably more critical to our reasoning regarding the knowing-and-being aspect of stereotype threat is the fact that the prime in the nondiagnostic condition did not elevate targets’ threat appraisals nor did our manipulations lead to differences in participants’ stereotype endorsement. In sum then, these results show that activation of stereotypic traits is not sufficient for targets to make the connection between the stereotype and their performance in the testing situation: Participants need to know as well as be the stereotype in order for the
stereotype to affect their threat-based concerns (Major & O’Brien, 2005; Marx et al., 1999, 2005; Steele et al., 2002). This reasoning is also supported by the fact that when we correlated participants’ emotion test performance with their stereotype endorsement and threat-based concern scores, controlling for the experimental variables (i.e., gender, test description, priming), we only found a reliable partial correlation between test performance and threat-based concerns (r = −.20, p < .04). The partial correlation between test performance and stereotype endorsement was not reliable (r = −.02, p = .83); thus, stereotype endorsement and threat-based concerns may co-occur, but only threat-based concerns should be affected by being the stereotype. Simply knowing the stereotype is not enough to cause stereotype threat; targets need to link being the stereotype to how they perform in the situation.
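The partial correlations reported above can be illustrated with a minimal sketch. It assumes the standard recursive formula for partial correlation, which removes one control variable at a time; the function names and the toy data are hypothetical and are not taken from the experiments. The experimental variables would enter as numerically coded controls.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, controls):
    """Correlation of x and y with every variable in `controls` partialled out.

    Uses the recursion
        r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)),
    where each r on the right is itself partialled on the remaining controls.
    Undefined (division by zero) if a control perfectly predicts x or y.
    """
    if not controls:
        return pearson(x, y)
    z, rest = controls[0], controls[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical toy data: x and y are related only through the shared control z.
z = [0, 0, 1, 1]
x = [1, -1, 3, 1]
y = [1, -1, 1, 3]

print(pearson(x, y))            # zero-order correlation: 0.5
print(partial_corr(x, y, [z]))  # approximately 0 once z is controlled
```

In the toy data the zero-order correlation of .50 disappears once the shared variable is controlled, which is the same logic as asking whether test performance still covaries with threat-based concerns after the manipulated factors are held constant.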
Experiment 3: The Social Self and Stereotype Threat
Now that we have demonstrated that priming and stereotype threat yield a different pattern of effects on participants’ test performance and threat-based concerns, we turn to the question of whether increasing accessibility of the social self just prior to taking a test would create a stereotype threat experience for targets, but not nontargets, even when the test is nonthreatening in nature (when it is presented as a reasoning exercise). This notion is based on our prior work showing that stereotype threat leads to greater accessibility of targets’ stereotyped identity compared with other equally important social identities (e.g., friends and family; Marx et al., 2005). Thus, we argue that it should be possible to cause poor performance among targets, not only by using a standard stereotype threat manipulation (i.e., test diagnosticity) but also by making the social self accessible before taking a nonthreatening test. Of course, activating a general social self (we-ness) may make many social identities accessible, but once the context is factored in (e.g., I am a person taking a math test) then the social identity (gender) that is most relevant to the situation (taking a math test) may win out in the end (e.g., Major & O’Brien, 2005; Onorato & Turner, 2004; Turner, Hogg, Oakes, Reicher, & Wetherell, 1987). And if this social identity is linked to a negative stereotype, then it seems reasonable to suggest that one’s threat-based concerns may also be affected (e.g., Marx et al., 1999, 2005). Although our manipulation of the social self is akin to manipulations used in previous research on stereotype-based performance (see Shih et al., 1999; Steele & Aronson, 1995), our approach deviates from this past work in that we do not ask participants to indicate their stereotyped identity prior to taking a test. Thus, we do not purposefully remind targets about the associated group stereotype.
This approach therefore allows us to highlight the critical role of the situation and how the situational cues may actually shape a general awareness of one’s social self into a specific awareness of a particular social identity, namely the identity that is most closely aligned with the situation and associated stereotype (e.g., Onorato & Turner, 2004; Turner et al., 1987; see also Major & O’Brien, 2005; Marx et al., 1999, for discussions of contextual effects in stereotype threat).
To test this line of reasoning regarding the interaction of the social self and the situation, we asked male and female participants to take a math test under one of four conditions. In the first condition, we increased the accessibility of participants’ social self before they took a nonthreatening math test. In the second condition, we primed participants with stereotypic traits (see Experiment 1) before they took a nonthreatening math test. In the third condition, the test was described as diagnostic of math ability (stereotype threat), and in the fourth condition participants simply took a nonthreatening test (no stereotype threat). By including these four conditions, we were able to accomplish two things. One, we could replicate the effects from Experiments 1 and 2, and two, we could highlight how knowing the stereotype (e.g., “women are bad at math”) and being from the stereotyped group (e.g., “I am a woman”) in combination with the situational cues can lead to higher threat-based concerns and lower test performance among targets, but not nontargets.
For this last experiment, we expected to find the typical stereotype threat pattern on targets’ math test performance and threat-based concerns in the diagnostic and nondiagnostic conditions. We also anticipated that female participants would have lower scores and higher threat-based concerns than male participants in the social self–nondiagnostic condition but that they would perform equally poorly and not differ in their threat-based concerns in the dumb prime–nondiagnostic condition. We further predicted that male and female participants’ test performance would be lower in the dumb prime–nondiagnostic conditions than in the nondiagnostic conditions but that there would be no difference in their threat-based concern scores. And finally, we hypothesized that the pattern of performance and threat-based concerns would be the same in both the social self–nondiagnostic and the diagnostic conditions. This predicted pattern serves as the critical test of our reasoning regarding the knowing-and-being aspect of stereotype threat, as it would show that underperformance can occur for targets via the combination of social self activation and the situational cues or via more traditional stereotype threat manipulations (test diagnosticity).
That is, both manipulations make those aspects of the social self that are most associated with the stereotype particularly accessible at the time, thus leading to underperformance for targets, but not nontargets.
Method
Participants and Design
Participants were 57 female and 46 male Dutch undergraduates who took part in exchange for course credit. For this experiment, we used a 2 (gender of participant: female, male) × 4 (type of condition: social self–nondiagnostic test, dumb prime–nondiagnostic test, diagnostic test, nondiagnostic test) between-participants design.
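As a quick consistency check on this design, the degrees of freedom of a two-way between-participants ANOVA follow directly from the sample size and the number of cells. The sketch below assumes that all 103 participants were retained in the analyses and that every cell was populated; the function name is hypothetical.

```python
def two_way_anova_dfs(n_total, levels_a, levels_b):
    """Degrees of freedom for a fully crossed two-way between-participants ANOVA."""
    df_a = levels_a - 1
    df_b = levels_b - 1
    return {
        "gender": df_a,
        "condition": df_b,
        "interaction": df_a * df_b,
        # Error df = participants minus the number of design cells.
        "error": n_total - levels_a * levels_b,
    }

n_total = 57 + 46  # female + male participants
dfs = two_way_anova_dfs(n_total, levels_a=2, levels_b=4)
print(dfs)  # {'gender': 1, 'condition': 3, 'interaction': 3, 'error': 95}
```

Under these assumptions the error term has 95 degrees of freedom, which agrees with the F(1, 95) and F(3, 95) tests reported for this experiment.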
Procedure
On entering the laboratory, participants were informed that they would be involved in a series of short, unrelated tasks. One quarter of the participants completed a scrambled sentence task and then took a nondiagnostic math test. Another quarter of the participants completed a word search task and then took a nondiagnostic math test. Another quarter took a diagnostic math test, and the final quarter took a nondiagnostic math test.
Scrambled sentence task. We used the scrambled sentence task from Experiment 1.
Word search task. For this task, participants were told that as part of a proofreading and word search exercise they would read a short paragraph detailing a trip to the city. They were further told that as they read the paragraph they should circle all the pronouns that appeared in the text. The pronouns in the text were we, our, ourselves, and us (for details, see Brewer & Gardner, 1996). Past research has shown that this procedure is effective in increasing feelings of we-ness or accessibility of the social self (Brewer & Gardner, 1996; Stapel & Koomen, 2001).
Test description manipulation. We used the test description manipulation (diagnostic math test or reasoning exercise) and math test from Experiment 1.
Threat-based concerns. We used the threat-based concerns measure from Experiment 1 (α = .74). On completion of this measure, participants were debriefed and thanked.
Results and Discussion
Math Test Performance
The participants’ math test performance was analyzed by means of a 2 (participant gender) × 4 (condition type) ANOVA (see Table 3). We found main effects for participant gender, F(1, 95) = 9.20, p < .05, η² = .09, and condition type, F(3, 95) = 6.40, p < .05, η² = .06. We also found an omnibus interaction, F(3, 95) = 2.92, p < .05, η² = .08. Table 3 shows that within the diagnostic condition female participants (M = 10.54, SD = 0.97) underperformed relative to male participants (M = 12.31, SD = 1.97), F(1, 95) = 8.88, p < .05, η² = .09. Moreover, female participants in the diagnostic condition (M = 10.54, SD = 0.97) had lower scores than female participants in the nondiagnostic condition (M = 12.47, SD = 1.64), F(1, 95) = 11.33, p < .05, η² = .11. As expected, there was no difference between male and female participants’ test performance in the nondiagnostic conditions (F < 1.00). These results are thus consistent with a typical stereotype threat effect.
Critical to our reasoning about accessibility of the social self and the contextual cues, we found that in the social self–nondiagnostic condition female participants (M = 11.07, SD = 1.90) had lower scores than male participants (M = 12.91, SD = 0.70), F(1, 95) = 9.10, p < .05, η² = .09. This result is similar to what we found in the diagnostic condition, demonstrating that accessibility of the social self leads to the same effect as the test diagnosticity manipulation because the context helps shape a general awareness of the social self into a more specific awareness of the social self (gender) that is relevant to the situation (cf. Shih et al., 1999; Steele & Aronson, 1995; see also Onorato & Turner, 2004; Turner et al., 1987). Our final comparison showed that male and female participants had lower math scores in the dumb prime–nondiagnostic condition than in the nondiagnostic condition, F(1, 95) = 17.15, p < .05, η² = .15, thus underscoring our point that when the group stereotype is not linked to how participants perform, priming has a similar effect on targets’ and nontargets’ performance. We next assessed whether our manipulations affected participants’ threat-based concerns, as this measure could further show that targets are linking how they perform on the test to the stereotype associated with their social self.

Table 3
Means and Standard Deviations of Math Test Performance and Threat-Based Concerns as a Function of Condition Type and Participant Gender

                         Diagnostic       Nondiagnostic    Social self      Dumb prime
Measure and gender       M        SD      M        SD      M        SD      M        SD
Math test performance
  Female participants    10.54a   0.97    12.47b   1.64    11.07a   1.90    10.67a   1.76
  Male participants      12.31b   1.97    12.45b   0.82    12.91b   0.70    10.73a   1.42
Threat-based concerns
  Female participants    2.87b    0.52    1.93a    0.47    2.76b    0.76    1.93a    0.63
  Male participants      1.90a    0.53    1.85a    0.31    1.61a,c  0.47    2.06a    0.53

Note. All means that do not share a common subscript differ at p < .05.
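To give a rough sense of the size of the gender gaps in math performance, the reported means and standard deviations can be converted into a standardized mean difference. This is only an illustrative sketch: it assumes equal cell sizes (per-cell ns are not reported in this section), so the pooled standard deviation is the root mean of the two variances, and the function name is hypothetical. These values are not statistics reported by the authors.

```python
import math

def cohens_d_equal_n(m1, sd1, m2, sd2):
    """Standardized mean difference, pooling the SDs under an equal-n assumption."""
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (m1 - m2) / pooled_sd

# Male minus female math scores, using the means and SDs reported in the text.
d_diagnostic = cohens_d_equal_n(12.31, 1.97, 10.54, 0.97)
d_social_self = cohens_d_equal_n(12.91, 0.70, 11.07, 1.90)

print(round(d_diagnostic, 2))   # about 1.14
print(round(d_social_self, 2))  # about 1.29
```

On these assumptions the gap in the social self–nondiagnostic condition is of roughly the same magnitude as the gap in the diagnostic condition, which is consistent with the claim that the two manipulations produce a similar pattern.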
Threat-Based Concerns
We analyzed the participants’ threat-based concerns using a 2 (participant gender) × 4 (condition type) ANOVA (see Table 3). There were main effects for participant gender, F(1, 95) = 25.54, p < .05, η² = .21, and condition type, F(3, 95) = 5.71, p < .05, η² = .06. We also found an omnibus interaction, F(3, 95) = 9.55, p < .05, η² = .23. Table 3 shows that in the diagnostic condition, female participants (M = 2.87, SD = 0.52) had higher concern scores compared with male participants (M = 1.90, SD = 0.53), F(1, 95) = 22.82, p < .05, η² = .19. This effect highlights our main contention that when the stereotype (knowing) is relevant to one’s social self (being), targets’ threat-based concerns are elevated compared with nontargets. The same effect occurred in the social self–nondiagnostic condition, such that female participants (M = 2.76, SD = 0.76) had higher concern scores relative to male participants (M = 1.61, SD = 0.47), F(1, 95) = 30.40, p < .05, η² = .24. This last result provides additional support for the notion that when a stereotype is relevant to targets’ test performance, either via the test diagnosticity manipulation or by increasing accessibility of the social self before taking a nonthreatening test, targets experience more concern than when the stereotype is irrelevant to their performance or when their social self is not as accessible. There were no differences, as expected, in participants’ threat-based concerns between the nondiagnostic and dumb prime–nondiagnostic conditions (Fs < 1.00).
On the whole, these analyses make it quite apparent that when the stereotype is relevant to the social self (either because the test is presented as diagnostic or because participants’ social self is cognitively accessible) only stereotyped targets underperform, but when the stereotype is irrelevant to how participants perform (i.e., the nondiagnostic condition) both targets and nontargets react in a corresponding manner to the situation: When primed with stereotypic traits, they underperform; when not primed with such traits, they perform better. It is important that trait priming did not influence targets’ threat-based concerns, thus highlighting the link between accessibility of the social self and targets’ concerns about confirming a negative stereotype that is associated with their group.
General Discussion
The three experiments presented here show how to distinguish between priming and stereotype threat effects. Throughout this article, we have argued that even though the outcomes of stereotype threat and priming often look the same (stereotype-confirming poor performance), they are in fact different. Stressing this distinction seemed particularly important in light of the literature on stereotype-based performance, which has made the assumption that mere negative stereotype activation can lead to stereotype threat (Ambady et al., 2004; Dijksterhuis & Bargh, 2001; Dijksterhuis & Corneille, 2004; Gladwell, 2005; Oswald & Harvey, 2000; Wheeler et al., 2001; Wheeler & Petty, 2001). It is interesting that despite this growing body of research, no direct empirical comparison of stereotype threat and negative trait priming had been made until now. Moreover, we believe that it is important to demonstrate the distinction between these accounts so that there is a better understanding of stereotype-based underperformance in general. As the present research shows, participants’ threat-based concerns make the difference between priming and stereotype threat effects quite clear. Indeed, we found very consistent results using two types of tests (a math test and an emotions test) and stereotypes (i.e., “women are bad at math,” “men are emotionally insensitive”).
Taken together, the results from the first two experiments demonstrate that stereotype threat adversely affects targets’, but not nontargets’, test performance and threat-based concerns because only targets’ social self is linked to the negative group stereotype. For example, in stereotype threat situations, targets (but not nontargets) are affected because they know the group-based stereotype (“women are bad at math”; “men are emotionally insensitive”) and because they are members of the group that is targeted by the stereotype (“I am a woman”; “I am a man”).
This issue of knowing-and-being the stereotype, however, has received little to no attention in research examining the relationship between priming of negative traits and stereotype threat (e.g., Ambady et al., 2004; Dijksterhuis & Corneille, 2004; Wheeler et al., 2001; but see Shih et al., 2002, for work on priming positive traits). In the nondiagnostic condition, we found the standard priming effect (stereotype activation → stereotype-consistent poor test performance), such that both targets and nontargets performed in a stereotype-confirming manner. But what is probably most important to our reasoning regarding the knowing-and-being aspect of stereotype threat is the fact that the prime did not elevate targets’ concern scores in either the diagnostic or nondiagnostic conditions. Our final experiment revealed strong support for the notion that when the social self is accessible, only stereotyped targets underperform because accessibility of the social self is also linked to targets’, but not nontargets’, threat-based concerns (see Marx et al., 2005). This is not the case when targets are primed with negative traits. Simple trait priming procedures do not activate
such a link; thus, they have comparable effects on targets’ and nontargets’ test performance. In summary, the findings from these experiments not only increase our understanding of stereotype threat, they also advance stereotype threat theory by showing that accessibility of the social self in a challenging testing situation is associated with the same threat-based concerns that accompany more typical manipulations of stereotype threat (i.e., test diagnosticity). These results also have several implications for stereotype threat and priming research.
Implications for Priming Effects
It is widely accepted that priming effects are rather straightforward, such that perceiving leads to behaving. And for the most part, this may be an accurate portrayal of priming effects, but as recent research has shown, the conditions for this priming effect may not be that simple. For example, Spears, Gordijn, Dijksterhuis, and Stapel (2004) have demonstrated that when group differences are salient, the influence of the prime works differently from the usual perception-to-behavior effect for those participants who are not targeted by the stereotype. That is, when participants are aware of their group membership, they have a tendency to distance themselves from stereotypes that are not relevant for their group: College students in this case would walk faster rather than slower down a hallway when primed with traits associated with the stereotype of an older person, whereas older people would walk slower down the hallway when primed with the same traits. As Spears et al. (2004) argue, their research suggests that priming effects are not always unmediated, unidirectional effects: seeing often leads to doing, but sometimes seeing leads to doing the opposite, such as when group memberships are relevant to the seeing–doing sequence. In other words, when trying to understand priming effects in social contexts, it is important to take into account the notion that priming and behavioral outcomes seldom occur in a social vacuum. Perceivers often relate to the primed information in a certain way. Perceivers may view the primed information as in-group or out-group information, as stereotypical or nonstereotypical information, and this may dramatically affect the outcome of priming effects (see Spears et al., 2004). This line of reasoning also fits well with the present research.
For instance, we showed that in the nondiagnostic conditions when participants are not particularly aware of their social identity, they behaved in a stereotype-consistent manner after being primed with negative traits. However, when participants’ social self (being) was accessible, then the situation (seeing math problems) triggered the negative group stereotype (knowing) for targets such that they performed in a stereotype-consistent manner, even though the test was described in nonthreatening terms. This did not occur for nontargets because they did not have to contend with a negative stereotype, nor is the stereotype particularly relevant to how they performed. As we argued earlier, this result came about because in stereotype threat situations, targets are worried about confirming a negative stereotype about their group if they underperform. Thus, we believe that the present findings demonstrate that general priming effects (Bargh et al., 1996; Dijksterhuis & van Knippenberg, 1998; Kawakami, Young, & Dovidio, 2002; Levy, 1996) may not be as straightforward as they seem. That is, simple priming explanations may only be appropriate descriptions of what is happening in those situations in which participants are unable to
draw the connection between their behavior and the stereotype. In light of this, we argue that priming experiments cannot and should not be used as evidence for stereotype threat, as these experiments bypass one of the hallmarks of stereotype threat: the worry about confirming a negative stereotype associated with one’s group (Major & O’Brien, 2005; Marx et al., 2005; Steele et al., 2002).
Implications for Stereotype Threat
As we have argued throughout this article, knowledge of a group stereotype is necessary but not sufficient for causing stereotype threat; both knowing and being are critical (see Marx & Stapel, in press; Marx et al., 2005; Steele, 1997; Steele et al., 2002). This knowing-and-being logic suggests that threatening testing situations are especially likely to increase the accessibility of the social identity that is most relevant to the stereotype. In light of this, we demonstrated that increased accessibility of the social self in combination with relevant situational cues (i.e., seeing math problems) could change a normal testing situation into a threatening one, such that targets underperform relative to nontargets (e.g., Major & O’Brien, 2005; Marx et al., 1999). Moreover, we focused on what Steele and his colleagues (Marx et al., 1999; Steele, 1997; Steele et al., 2002) have argued is one of the core principles of the theory (the worry about being judged in terms of the stereotype associated with one’s group) so that we could highlight the difference between stereotype threat and priming effects. Indeed, stereotype threat and trait priming may have the same effect on targets’ performance, but they work via different routes.⁶ When performance is not relevant to the group stereotype (a nonstereotype threat situation), targets and nontargets perform similarly after stereotype priming. But, when performance is relevant to the group stereotype (a stereotype threat situation), the prime does not change the typical stereotype threat pattern of effects. Given this, it seems reasonable to suggest that participants’ threat-based concerns could differentiate between these two outcomes, as this measure shows whether participants are making the link between their social self and their performance in the testing situation.
The present research also highlights the situational nature of stereotype threat by showing that cues in the testing session (e.g., seeing math problems in combination with increased accessibility of the social self) can undermine targets’ test performance even when the test is described in neutral terms. This finding underscores the importance of the social self in stereotype threat (Major & O’Brien, 2005; Marx et al., 2005; see also Wheeler & Petty, 2001).
⁶ It may be tempting here to argue for statistical mediation of our stereotype threat and priming effects. We believe that testing for mediation would be problematic in these experiments because the order in which we measured our variables prevents any strong causal arguments. Indeed, measuring participants’ threat-based concerns prior to taking a test may substantially alter the testing situation (e.g., artificially raise concerns, or even diffuse the threat by allowing participants to vent their feelings) such that it would be difficult to make any tenable claims. Because of this, we attempted to highlight the mediating process of threat-based concerns on stereotype threat effects via experimentation. That is, we manipulated the testing situation so we could show that in stereotype threat situations targets’ threat-based concerns are elevated, and in nonthreat situations their concerns are lower. We also showed that negative trait priming had no such effect on targets’ threat-based concerns.
Separating the Present Work From Past Work on Stereotype-Based Performance
The last few years have seen a growing interest in the effects of stereotype activation on the performance of stereotyped and nonstereotyped targets (e.g., Ambady et al., 2004; Shih et al., 2002; Wheeler & Petty, 2001; Wheeler et al., 2001). In fact, some may be tempted to view the present work as simply replicating or extending this past work. We see clear differences, however. For example, the most obvious difference between the past work and the current work is that we focus on negative stereotypes, whereas other past research has focused on positive stereotypes (Shih et al., 2002). Furthermore, prior research examining negative trait priming (e.g., Ambady et al., 2004; Wheeler et al., 2001) has not included both stereotyped and nonstereotyped targets, nor has it included stereotype threat and nonthreat manipulations. And finally, none of the prior experiments that have looked at stereotype threat effects from a priming perspective have included a measure of worry or concern. In short, there are clear differences between the current work and past work on priming and threat.
From a theoretical perspective, there are a number of parallels between the present work and relevant past work, yet there are also a number of distinctions (see Wheeler & Petty, 2001). For example, in a recent review of the priming and threat literature, Wheeler and Petty (2001) suggested that stereotype threat could be due to the simple activation of stereotype-relevant concepts. We agree with this interpretation to some extent; however, our results clearly show that stereotype threat also involves the interaction of situational cues and a person’s social identity. Therefore, stereotype activation is necessary, but it is not sufficient to cause stereotype threat.
That is, once the stereotype becomes linked to the social self and how one performs in the situation, we argue that it is no longer a simple priming effect. Stereotype activation alone cannot cause stereotype threat, unless the activation likewise leads to an increase in a person’s concerns about confirming that stereotype. In short, whereas previous discussions about the effects of stereotype activation and performance have been somewhat elusive about the relationship between priming and threat (e.g., Wheeler & Petty, 2001), we argue and demonstrate quite clearly that stereotype threat and priming effects both rely on stereotype activation, but once the social self and threat-based concerns are factored into the equation, the paths to poor performance diverge depending on who you are and how you view the stereotype (e.g., Major & O’Brien, 2005; Marx et al., 2005).
Coda
To date, the distinction between stereotype threat and priming effects has not been made explicitly nor has it been tested systematically; thus, some researchers have treated priming and stereotype threat as being one and the same (Ambady et al., 2004; Dijksterhuis & Bargh, 2001; Dijksterhuis & Corneille, 2004; Gladwell, 2005; Oswald & Harvey, 2000; Wheeler et al., 2001; Wheeler & Petty, 2001). Because of this, we felt it critical to show when and how these two accounts explain stereotype-based performance, particularly because many stereotype priming experiments often do not include all of the necessary factors (i.e., stereotype threat and nonstereotype manipulations, stereotyped and nonstereotyped targets, measures of threat-based concerns) to
make empirical comparisons between stereotype threat and priming effects. In the current research, we included such factors and showed that priming of stereotypic traits can lead to underperformance for both targets and nontargets (when their social self is not involved), but it had no meaningful effect on targets’ threat-based concerns. In sum, by highlighting how the social self and targets’ threat-based concerns can distinguish between these two accounts of stereotype-based performance, our results add to the growing literature on stereotype threat as well as the already vast stereotype priming literature.
References
Ambady, N., Paik, S. K., Steele, J., Owen-Smith, A., & Mitchell, J. P. (2004). Deflecting negative self-relevant stereotype activation: The effects of individuation. Journal of Experimental Social Psychology, 40, 401–408.
Aronson, J., Lustina, M. J., Good, C., Keough, K., Steele, C. M., & Brown, J. L. (1999). When White men can’t do math: Necessary and sufficient factors in stereotype threat. Journal of Experimental Social Psychology, 35, 29–46.
Bargh, J. A., Chen, M., & Burrows, L. (1996). Automaticity of social behavior: Direct effects of trait construct and stereotype activation on action. Journal of Personality and Social Psychology, 71, 230–244.
Bar-On, R. (1997). The emotional intelligence inventory (EQ-I). Technical manual. Toronto, Ontario, Canada: Multi-Health Systems.
Blanton, H., Christie, C., & Dye, M. (2002). Social identity versus reference frame comparisons: The moderating role of stereotype endorsement. Journal of Experimental Social Psychology, 38, 253–267.
Brewer, M. B., & Gardner, W. (1996). Who is this “we”? Levels of collective identity and self-representations. Journal of Personality and Social Psychology, 71, 83–93.
Croizet, J.-C., & Claire, T. (1998). Extending the concept of stereotype threat to social class: The intellectual underperformance of students from low socioeconomic backgrounds. Personality and Social Psychology Bulletin, 24, 588–594.
Dijksterhuis, A., & Bargh, J. A. (2001). The perception–behavior expressway: Automatic effects of social perception on social behavior. Advances in Experimental Social Psychology, 33, 1–40.
Dijksterhuis, A., & Corneille, O. (2004). On the relations between stereotype activation and intellectual underperformance. Unpublished manuscript.
Dijksterhuis, A., & van Knippenberg, A. (1998). The relation between perception and behavior, or how to win a game of trivial pursuit. Journal of Personality and Social Psychology, 74, 865–877.
Gladwell, M. (2005). Blink: The power of thinking without thinking. New York: Little, Brown.
Gonzales, P. M., Blanton, H., & Williams, K. J. (2002). The effects of stereotype threat and double-minority status on the test performance of Latino women. Personality and Social Psychology Bulletin, 28, 659–670.
Inzlicht, M., & Ben-Zeev, T. (2000). A threatening intellectual environment: Why females are susceptible to experiencing problem-solving deficits in the presence of males. Psychological Science, 11, 365–371.
Kawakami, K., Young, H., & Dovidio, J. F. (2002). Automatic stereotyping: Category, trait, and behavioral activation. Personality and Social Psychology Bulletin, 28, 3–15.
Levy, B. (1996). Improving memory in old age through implicit self-stereotyping. Journal of Personality and Social Psychology, 71, 1092–1107.
Leyens, J. P., Desert, M., Croizet, J.-C., & Darcis, C. (2000). Stereotype threat: Are lower status and history of stigmatization preconditions of stereotype threat? Personality and Social Psychology Bulletin, 26, 1189–1199.
Major, B., & O’Brien, L. T. (2005). The social psychology of stigma. Annual Review of Psychology, 56, 393–421.
Marx, D. M., Brown, J. L., & Steele, C. M. (1999). Allport’s legacy and the situational press of stereotypes. Journal of Social Issues, 55, 491–502.
Marx, D. M., & Roman, J. S. (2002). Female role models: Protecting female students’ math test performance. Personality and Social Psychology Bulletin, 28, 1185–1197.
Marx, D. M., & Stapel, D. A. (in press). It depends on your perspective: The role of self-relevance in stereotype-based underperformance. Journal of Experimental Social Psychology.
Marx, D. M., Stapel, D. A., & Muller, D. (2005). We can do it: The interplay of construal orientation and social comparisons under threat. Journal of Personality and Social Psychology, 88, 432–446.
Onorato, R. S., & Turner, J. C. (2004). Fluidity in the self-concept: The shift from personal to social identity. European Journal of Social Psychology, 34, 257–278.
Oswald, D. L., & Harvey, R. D. (2000). Hostile environments, stereotype threat, and math performance among undergraduate women. Current Psychology: Developmental, Learning, Personality, Social, 19, 338–356.
Pronin, E., Steele, C. M., & Ross, L. (2004). Identity bifurcation in response to stereotype threat: Women and mathematics. Journal of Experimental Social Psychology, 40, 152–168.
Schutte, N. S., Malouff, J. M., Hall, L. E., Haggerty, D. J., Cooper, J. T., Golden, C. J., & Dornheim, L. (1998). Development and validation of a measure of emotional intelligence. Personality and Individual Differences, 25, 167–177.
Shih, M., Ambady, N., & Pittinsky, T. (1999). Stereotype susceptibility: Identity salience and shifts in quantitative performance. Psychological Science, 10, 80–83.
Shih, M., Ambady, N., Richeson, J. A., Fujita, K., & Gray, H. M. (2002). Stereotype performance boosts: The impact of self-relevance and the manner of stereotype activation. Journal of Personality and Social Psychology, 83, 638–647.
Spears, R., Gordijn, E., Dijksterhuis, A., & Stapel, D. A. (2004). Reaction in action: Intergroup contrast in automatic behavior. Personality and Social Psychology Bulletin, 30, 605–616.
Spencer, S. J., Steele, C. M., & Quinn, D. (1999). Stereotype threat and women’s math performance. Journal of Experimental Social Psychology, 35, 4–28.
Srull, T. K., & Wyer, R. S., Jr. (1979). The role of category accessibility in the interpretation of information about persons: Some determinants and implications. Journal of Personality and Social Psychology, 37, 1660–1672.
Stapel, D. A., & Koomen, W. (2001). I, we, and the effects of others on me: How self-construal level moderates social comparison effects. Journal of Personality and Social Psychology, 80, 766–781.
Steele, C. M. (1997). A threat in the air: How stereotypes shape intellectual identity and performance. American Psychologist, 52, 613–629.
Steele, C. M., & Aronson, J. (1995). Stereotype vulnerability and the intellectual test performance of African Americans. Journal of Personality and Social Psychology, 69, 797–811.
Steele, C. M., Spencer, S. J., & Aronson, J. (2002). Contending with group image: The psychology of stereotype and social identity threat. In M. Zanna (Ed.), Advances in Experimental Social Psychology (Vol. 34, pp. 379–440). San Diego, CA: Academic Press.
Stone, J., Lynch, C. I., Sjomeling, M., & Darley, J. M. (1999). Stereotype threat effects on Black and White athletic performance. Journal of Personality and Social Psychology, 77, 1213–1227.
Turner, J. C., Hogg, M. A., Oakes, P. J., Reicher, S. D., & Wetherell, M. S. (1987). Rediscovering the social group: A self-categorization theory. Oxford, England: Blackwell.
Walton, G. M., & Cohen, G. L. (2003). Stereotype lift. Journal of Experimental Social Psychology, 39, 456–467.
Wheeler, S. C., Jarvis, B. G., & Petty, R. E. (2001). Think unto others: The self-destructive impact of negative racial stereotypes. Journal of Experimental Social Psychology, 37, 173–180.
Wheeler, S. C., & Petty, R. E. (2001). The effects of stereotype activation on behavior: A review of possible mechanisms. Psychological Bulletin, 127, 797–826.
Received May 9, 2005
Revision received November 24, 2005
Accepted November 28, 2005
INTERPERSONAL RELATIONS AND GROUP PROCESSES
Does Who You Marry Matter for Your Health? Influence of Patients’ and Spouses’ Personality on Their Partners’ Psychological Well-Being Following Coronary Artery Bypass Surgery

John M. Ruiz and Karen A. Matthews, University of Pittsburgh
Michael F. Scheier, Carnegie Mellon University
Richard Schulz, University of Pittsburgh

Research suggests that presurgical personality attributes influence postsurgical well-being in both patients and their spouses in the context of coronary artery bypass grafting (CABG) surgery. The authors hypothesized that a spouse’s characteristics would influence a partner’s psychological well-being, regardless of whether he or she was the patient or the caregiver. In this study, 111 male patients and their caregiver spouses completed measures of neuroticism, optimism, perceived marital satisfaction, and depression prior to elective CABG. Follow-up was conducted at 18 months. As expected, higher caregiver presurgical neuroticism predicted higher patient depressive symptoms at follow-up, with caregiver’s concurrent 18-month affect controlled for. Likewise, higher patient presurgical neuroticism predicted higher caregiver depressive symptoms at follow-up. Additionally, higher patient presurgical depressive symptoms and lower presurgical optimism contributed to greater caregiving burden. Relationship satisfaction moderated these effects. These results suggest that partners’ personality traits are important determinants of both patients’ and their caregiving spouses’ well-being.

Keywords: marital satisfaction, neuroticism, optimism, depression, coronary artery bypass grafting
John M. Ruiz and Karen A. Matthews, Department of Psychiatry, University of Pittsburgh; Michael F. Scheier, Department of Psychology, Carnegie Mellon University; Richard Schulz, University Center for Social and Urban Research, University of Pittsburgh. Portions of this article were presented at the 2003 meeting of the American Psychosomatic Society, Phoenix, Arizona. This research was supported in part by National Institutes of Health Grants HL07560, HL065111, HL065112, HL076852, and HL076858. We thank the participants, without whom this research would not have been possible. Correspondence concerning this article should be addressed to John M. Ruiz, who is now at the Department of Psychology, Washington State University, P.O. Box 644820, Pullman, WA 99164-4820. E-mail: [email protected]

Journal of Personality and Social Psychology, 2006, Vol. 91, No. 2, 255–267. Copyright 2006 by the American Psychological Association. 0022-3514/06/$12.00 DOI: 10.1037/0022-3514.91.2.255

Coronary heart disease is the leading cause of death in the United States and most industrialized Western nations (American Heart Association, 2002). With more than 570,000 procedures performed annually, coronary artery bypass grafting (CABG) is the most common invasive treatment with a procedural mortality rate of 1%–2% (American Heart Association, 2002). A host of factors including age at time of surgery, left ventricular function, time of intervention, number of grafts, and comorbid medical conditions are predictive of subsequent morbidity and mortality (Eagle et al., 1999). Health quality of life following CABG is significantly improved in the majority of patients (Eagle et al., 1999; Herlitz et al., 2001; Wahrborg, 1999). Although distress is common prior to surgery, with nearly half of all patients reporting moderate to severe levels of fear and anxiety (Koivula, Paunonen-Ilmonen, Tarkka, Tarkka, & Laippala, 2001), psychological well-being is understandably improved following successful intervention (Brorsson, Bernstein, Brook, & Werko, 2001; Herlitz et al., 2001). Despite these improvements, clinically significant levels of depressive symptoms are reported in 20%–30% of patients 6 months following surgery (Boudrez & DeBacker, 2001; Pirraglia, Peterson, Williams-Russo, Gorkin, & Charlson, 1999).

Depression is increasingly recognized as an important risk factor for cardiac morbidity and all-cause mortality (Smith & Ruiz, 2002; Suls & Bunde, 2005). For example, prospective research has demonstrated that following myocardial infarction, depressive symptoms are predictive of future myocardial infarction and mortality (Bush et al., 2001; Carney et al. for the ENRICHD Investigators, 2004; Frasure-Smith, Lesperance, & Talajic, 1993; Ladwig, Kieser, Konig, Breithardt, & Borggrefe, 1991). In addition, CABG patients who report more depressive symptoms after surgery appear to be at higher risk of mortality than those patients reporting fewer symptoms (Blumenthal et al. for the NORG Investigators, 2003; Borowicz et al., 2002; Connerney, Shapiro, McLaughlin, Bagiella, & Sloan, 2001; Peterson et al., 2002; Ruiz, Matthews, Scheier, Wortman, & Schulz, 2005). Hence, postsurgical depressive symptoms are an important measure of quality of life as well as risk of future morbidity and mortality.
Influence of Patients’ Presurgical Characteristics on Their Own Postsurgical Well-Being

Patients’ premorbid emotional status is increasingly recognized as an important predictor of mental and physical outcomes after CABG. Presurgical distress and poor quality of life prior to surgery predict increases in postsurgical depressive symptoms, lower quality of life, greater functional impairment, and slower return to work at follow-up (Duits, Boeke, Taams, Passchier, & Erdman, 1997; Perski et al., 1998; Rumsfeld et al., 1999; Soderman, Lisspers, & Sundin, 2003). Presurgical depressive symptoms are particularly relevant to postsurgical cardiovascular health, as measured by subsequent cardiac events (Burg, Benedetto, Rosenberg, & Soufer, 2003; Perski et al., 1998; Saur et al., 2001) and postsurgical mortality (Baker, Andrew, Schrader, & Knight, 2001; Burg, Benedetto, & Soufer, 2003).

In addition to premorbid emotional status, patients’ presurgical personality traits, particularly neuroticism, are associated with higher levels of presurgical distress and with increases in depression and anxiety over a 6-month postsurgical follow-up (Duits et al., 1999). Moreover, neuroticism also appears to moderate the strength of the relationship between presurgical anxiety and later depressive symptoms, such that people with more neuroticism experience greater acute distress, which predicts later depression (Duits et al., 1999). In contrast to negative personality traits, higher optimism is associated with lower rehospitalization rates following difficult medical procedures such as CABG (Scheier et al., 1999). Hence, patients’ neuroticism, optimism, and distress appear to be important determinants of presurgical emotional distress as well as postsurgical physical and psychological outcomes. These relationships are noted in Figure 1, Pathway A.
Spouse Adaptation to CABG Surgery

The emotional impact of illness extends beyond patients to include close social network members, particularly spouses (Delon, 1996; Fengler & Goodrich, 1979; Han & Haley, 1999). Spouses routinely experience significant emotional distress in response to patients’ CABG (Artinian, 1991, 1992; Karmilovich, 1994; Langeluddecke, Tennant, Fulcher, Barid, & Hughes, 1989; Stanley & Frantz, 1988). Spouses commonly report anxiety, depressive symptoms, increased irritability, and difficulties with sleep prior to CABG (Bengtson, Karlsson, Wahrborg, Hjalmarson, & Herlitz, 1996; Langeluddecke et al., 1989) and experience significant distress during the acute rehabilitation phase (O’Farrell, Murray, & Hotz, 2000). Moreover, spousal distress often remains elevated 12 months postsurgery while perceived caregiving burden increases during the 1st year (Artinian, 1992). Spouses report worries about prognosis, follow-up care, patient distress, changes in roles, changes in social relationships, financial concerns, sexual concerns, and patient helplessness or apathy (Artinian, 1992; Bengtson et al., 1996; Cozac, 1988; O’Farrell et al., 2000).

Figure 1. The transitive model illustrates three types of effects. First, individuals’ own presurgical traits influence their own postsurgical well-being (within-person effects; Pathways A and B). Second, the current affect of one spouse influences the current affect of the partner (Pathway C). Third, caregivers’ presurgical traits influence their partners’ postsurgical well-being (transitive effects; Pathways D and E). Finally, presurgical marital satisfaction is hypothesized to moderate the within-person and transitive effects of presurgical personality traits (Pathway F).
Influence of Spouses’ Presurgical Characteristics on Their Own Postsurgical Well-Being

As with patients, spouses’ personality traits influence their own presurgical distress as well as postsurgical adaptation to patients’ CABG surgery (Patrick & Hayden, 1999). Numerous studies across illness domains indicate that spouses’ trait neuroticism predicts their own caregiving burden and distress (Bookwala & Schulz, 1998; Hooker, Monahan, Shifren, & Hutchinson, 1992; Nijboer, Tempelaar, Triemstra, van den Bos, & Sanderman, 2001; M. F. Reis, Gold, Andres, Markiewicz, & Gauthier, 1994; Vedhara, Shanks, Wilcock, & Lightman, 2001). Conversely, more optimistic spousal caregivers report greater resilience to the demands of caregiving, including fewer depressive symptoms over time (Given et al., 1993; Hooker et al., 1992; Kurtz, Kurtz, Given, & Given, 1997). These relationships are noted in Figure 1, Pathway B.
Influence of One Spouse’s Affect on Partner’s Concurrent Affect

In addition to within-person effects, spouses may also influence the experience of one another. Relationship experts suggest not only that interpersonal influences occur between individuals but that the frequency and magnitude of these effects are greater between people who share a close relationship, such as married couples (i.e., interdependence theory; cf. Rusbult & Van Lange, 2003). For example, research supports the idea that the current affect of one spouse may influence the current affect of his or her partner, a phenomenon known as contagion effects (Nieboer et al., 1998; Schulz, Bookwala, Knapp, Scheier, & Williamson, 1996). In a study of 1,040 older couples, Bookwala and Schulz (1996) found that the current affect of one spouse was associated with the partner’s affect. These results suggest that if one spouse is distressed following CABG surgery, the partner is likely to be distressed as well. Although the mechanisms for this phenomenon are not clear, contagion effects may be an additional important determinant of post-CABG depressive symptoms in both patients and their spouses. These relationships are noted in Figure 1, Pathway C.
Transitive Influence of Couples’ Presurgical Characteristics on Their Partners’ Adaptation

Although contagion effects support a cross-sectional influence of one spouse on the partner’s experience, spousal influences may also take place longitudinally. However, less is known about the potential influence of one person’s presurgical personality on the spouse’s postsurgical well-being. We propose the transitive model to illustrate these potential cross-sectional and longitudinal interpersonal influences among couples in the context of illness (see Figure 1). Derived from the clinical and interpersonal literature, the term transitive refers to the influence one person has on a second person (Benjamin, 1996; Kiesler, 1996; Sullivan, 1953). For example, someone smiling at us may shift our mood from neutral to positive. The person’s smile is transitive in that it causes our change in mood. Moreover, if that person’s positive personality motivated the smile, we might say that the person’s personality had a transitive influence on our experience. It is notable that this transitive model does not abandon the traditional within-person relationship whereby we expect that our affect and experience are influenced by our own personality (see Figure 1, Pathways A and B). Rather, the transitive model emphasizes an interpersonal framework in which each individual also influences the experience of the other dyad member (see Figure 1, Pathways D and E).

Support for the Transitive Perspective

Research examining transitive interpersonal effects is sparse in the health literature and more generally. Despite interest in interpersonal processes among social psychologists, few prospective studies independently assess both an individual and members of the social network. Rather, researchers often rely on structural measures of social environments (e.g., size of one’s social network or marital status) and a perception of interpersonal attributes and influence (e.g., perceived characteristics, perceived support and support quality). However, the few studies that have examined such relationships appear to support the transitive hypothesis. For example, Beach and colleagues found that lower marital satisfaction in one spouse predicted later depressive symptoms in the partner (Beach, Katz, Kim, & Brody, 2003), although a similar study of newlyweds did not find such a relationship (Fincham, Beach, Harold, & Osborne, 1997). In another line of research, Agnew, Loving, and Drigotas (2001) found that in a sample of college-age couples, individual perceptions of relationship closeness predicted future relationship breakups. In the context of caregiving and health, a literature review of caregivers of patients with Alzheimer’s disease concluded that patient depression was strongly associated with distress and depression in their caregivers (Teri, 1997). Subsequent findings across several illness domains also support patient distress as a significant source of caregiver burden (Dyck, Short, & Vitaliano, 1999; Fang, Manne, & Pape, 2001; Northouse, Templin, & Mood, 2000; Scholte op Reimer, de Haan, Rijnders, Limburg, & van den Bos, 1998). However, we found no prospective studies examining spousal characteristics as predictors of patients’ health and well-being, no studies examining these relationships in the context of a discrete event such as CABG surgery (vs. longitudinal studies of cancer or dementia), and no studies looking at dispositional characteristics such as personality, as opposed to disease severity or emotional distress.

Marital Satisfaction as a Potential Moderator

Presurgical relationship quality is an important predictor of postsurgical well-being for couples facing CABG (Kulik & Mahler, 1989, 1993; Lindsay, Hanlon, Smith, & Wheatley, 2000; Pirraglia et al., 1999). For example, Elizur and Hirsh (1999) found that patients who reported higher preoperative marital quality displayed better psychological recovery 2 months following CABG procedures. Allen, Young, and Xu (1998) found that female patients who reported better marital quality prior to CABG displayed better functional status 6 and 12 months post-CABG. Because of a lack of research examining both constructs as simultaneous predictors, it is not clear that marital satisfaction predicts health independent of personality. Interpersonal theories along with supporting evidence suggest that personality traits impact trajectories of marital quality (cf. Karney & Bradbury, 1997; Kiecolt-Glaser & Newton, 2001). This evidence suggests that personality supersedes marital satisfaction with implications for health. A plausible alternative to both as predictors is that marital satisfaction may moderate the relationship between personality and health. H. T. Reis and Collins (2004) and H. T. Reis, Collins, and Berscheid (2000) hypothesized that the relational context of interpersonal behavior influences the impact of such behavior on the interpersonal target. For example, in two studies of couples coping with kidney transplants, Frazier, Tix, and Barnett (2003) demonstrated that unsupportive spousal-caregiver behaviors were associated with higher patient distress but only when the patient was less satisfied with the relationship. Thus, we hypothesized that the degree to which a person is satisfied or dissatisfied with his or her relationship moderates the extent to which that person is influenced by the spouse’s personality (Figure 1, Pathway F). That is, the impact of one person’s personality on the partner’s outcomes may be more or less strong depending on the level of satisfaction that the person experiences.
In summary, CABG surgery is an efficacious intervention for advanced coronary heart disease. However, the stress of the procedure and the underlying disease is associated with significant distress in both patients and caregivers. Characteristics such as neuroticism, optimism, presurgical distress, and relationship quality predict adjustment in both patients and their caregivers. In addition, interpersonal theories coupled with sparse evidence suggest that presurgical characteristics may have transitive interpersonal effects such that the characteristics of one person affect the partner’s adaptation, particularly when illness occurs in the context of close relationships such as marriage. However, no published studies have prospectively examined these relationships in a discrete medical event context such as CABG surgery nor have studies examined these relationships as a function of dispositional characteristics. In addition, we are unaware of any published studies having examined both marital satisfaction and personality traits as concurrent predictors or studies examining marital satisfaction as a possible moderator of transitive relationships between personality and health. Given the importance of close relationships such as marriage, particularly during an individual’s time of need, transitive interpersonal effects of personality may be a fundamental part of adjustment to major life stressors.
Current Study

One hundred and eleven male patients and their caregiver spouses completed measures of neuroticism, optimism, depressive symptoms, and perceived marital satisfaction prior to elective CABG surgery. Follow-up was conducted at 18 months postsurgery. For the current study, postsurgical well-being was conceptualized as depressive symptoms for both patients and caregivers, and caregivers’ well-being also encompassed caregiving burden and strain. Our major expectations were as follows: We expected patients’ depressive symptoms at follow-up to be associated with (a) higher presurgical depressive symptoms and neuroticism as well as lower optimism; (b) higher caregiver presurgical depressive symptoms, higher neuroticism, and lower optimism; and (c) the interaction between patients’ presurgical marital satisfaction and their caregivers’ personality. Caregiver-spouse predictions were analogous to patient hypotheses. In addition, caregivers completed measures of caregiving burden and strain. Thus, we expected caregivers’ postsurgical burden to be associated with (a) higher presurgical burden, neuroticism, and depressive symptoms as well as lower optimism; (b) higher patient presurgical depressive symptoms, higher neuroticism, and lower optimism; and (c) the interaction between caregivers’ presurgical marital satisfaction and patients’ personality. Parallel predictions were made for caregiver strain.
Method

Sample

Participants for this study were drawn from a larger pool of patients participating in the Pittsburgh Bypass Project. Participants were recruited from 528 consecutive patients scheduled for elective CABG surgery at Allegheny General Hospital in Pittsburgh, Pennsylvania, between June 1992 and January 1994 (Scheier et al., 1999). Eligibility criteria were (a) first-time referral for coronary artery bypass surgery with no concurrent procedures, (b) no acute chest pain at the time of the initial interview, (c) 1-day minimum between time of scheduled procedure and actual procedure, (d) not admitted to the intensive care unit, (e) English speaking, and (f) residence within 125 miles (201.17 km) of Allegheny General Hospital. The baseline patient sample consisted of 309 participants (215 men, 69.6%). The caregiver was identified by the patient as someone who would help or take care of him when he returned home. Of the original patient sample, 287 reported having a caregiver, of whom 206 were contacted, with 144 (128 women, 88.9%) participating at baseline. Participants for the present study consisted of the 111 married couples in which the husband was the patient and the wife participated as the caregiver. There were insufficient numbers of female patients/husbands as caregiver dyads to perform any meaningful analyses (n = 13). Husbands were generally older than their wives (see Table 1). Of the original 111 husband/patient couples, 97 participated in the 18-month data collection (11 couples either could not be located or declined further participation, and 3 patients had died, resulting in the loss of the dyad).

Table 1
Participant Baseline Characteristics

Variable                        Patient   Caregiver
N                               111       111
Age (years)
  M                             61.05     57.55
  SD                            10.32     10.84
Education
  <HS diploma                   25        16
  HS diploma or some college    55        75
  College degree                31        16
  Missing                       0         4
Employed
  Yes                           46        47
  No                            65        64

Note. Difference between patient and caregiver mean baseline age was significant, t(110) = 8.31, p < .001. HS = high school.
Procedure

The office of the cardiothoracic surgeon made all referrals. Face-to-face baseline interviews were scheduled and conducted by trained interviewers between 1 and 20 days before the scheduled surgery. During the course of this study, surgical practices changed so that elective surgeries were scheduled within 1 or 2 days of diagnosis as opposed to 7–10 days, accounting for the variability in time for administration of the presurgical interviews. In all cases, informed consent was obtained prior to conducting the interviews. Follow-up interviews were conducted at 6–8 days, 6 months, and 18 months postoperatively. Caregivers were followed up at 6 months and 18 months postoperatively.
Psychosocial Measures

All participants completed measures of dispositional optimism, depressive symptoms, neuroticism, and relationship satisfaction. Optimism was measured with the 6-item Revised Life Orientation Test (Scheier, Carver, & Bridges, 1994). Higher total score was interpreted as greater trait optimism. Neuroticism was measured with a 10-item version of the Neuroticism scale of the Eysenck Personality Questionnaire (Eysenck, 1958; Goh, King, & King, 1982). Depressive symptoms were measured with a 10-item version of the Center for Epidemiologic Studies Depression Scale (Radloff, 1977). The current version excluded somatic symptoms because these could be influenced by differences in cardiac/medical symptoms. These exclusions limit interpretation of scores with respect to their clinical significance. However, the measure is useful as an index of depressive symptoms. Alpha reliabilities for this version have been reported in the range of .82–.93 (Nieboer et al., 1998; Scheier et al., 1999).

Finally, the Dyadic Relationship Scale (Skinner, Steinhauer, & Santa-Barbara, 1983) is a 14-item self-report measure of relationship satisfaction with a specific other. Individuals rate the degree to which they agree with statements regarding their partner’s communication, affective expression, and affective involvement. Example items include “Even if this person disagrees, he or she listens to my point of view” and “When I am upset, I know this person really cares.” Items are rated on a 4-point Likert scale (1 = strongly disagree, 4 = strongly agree) and are totaled to yield an overall score. In the current study, husbands and wives completed the measure with respect to each other, which was interpreted as a measure of marital satisfaction. Alpha reliabilities for patients and spouses ranged from .88 to .90.

Caregivers also completed measures of caregiver burden and strain prior to surgery and at 6 and 18 months post-CABG. The Burden scale is based on the Zarit Burden Interview (Zarit, Reever, & Bach-Peterson, 1980) and assesses how often the individual feels oppressed by various aspects of caregiving. Participants responded using a 3-point Likert scale ranging from 1 (never) to 3 (often). Example items include “I feel useful in my interactions with this person” and “I feel resentful of other relatives who could but who do not do things with or for this person.” Alpha reliabilities were .81 for presurgical burden and .84 at the 18-month follow-up. The Zarit Burden scale assesses the frequency with which spouses experience certain feelings. To complement this frequency scale, a 20-item Caregiver Strain Scale was developed for this project to assess the magnitude of burden associated with common aspects of caregiving. Participants were given the stem “Please indicate how much strain each of these has caused for you in the past month” and asked to respond to 20 items on a 3-point Likert scale (1 = none, 3 = a great deal). Items included “lack of gratitude,” “social isolation,” and “insurance concerns.” Alpha reliabilities were .90 for presurgical strain and .94 at the 18-month follow-up. Pearson correlations between the Burden and Strain scales were .77 (p < .001) prior to surgery and .69 (p < .001) at the 18-month follow-up.
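The alpha reliabilities reported for these scales are Cronbach's alpha coefficients. As a minimal sketch of how such a coefficient is computed from item-level ratings, the function below applies the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name and the ratings are hypothetical and serve only as illustration; they are not data from this study.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: one list of k item ratings per respondent."""
    k = len(item_scores[0])
    # Sample variance of each item across respondents
    item_vars = [variance([resp[i] for resp in item_scores]) for i in range(k)]
    # Sample variance of the summed scale score
    total_var = variance([sum(resp) for resp in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings on a 4-point scale (5 respondents, 3 items)
ratings = [[2, 3, 3], [4, 4, 3], [1, 2, 2], [3, 3, 4], [4, 3, 4]]
alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```

Alpha approaches 1 as the items covary more strongly relative to their individual variances, which is why it is read as an index of internal consistency.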
Medical Information

Medical information was gathered from several sources including the cardiac catheterization report, the operative report, and inpatient medical records. The catheterization report yielded the following information: (a) number of grafts, (b) number of coronary vessels occluded 50% or more, and (c) ejection fraction less than 40%. Hospital medical records yielded (a) current smoking status (smoker or not), (b) history of hypertension, (c) history of diabetes mellitus, and (d) presurgical serum cholesterol. A separate record contained the patient’s preoperative New York Heart Association classification as well as current or immediately preoperative episodes of acute myocardial infarction, congestive heart failure, or unstable angina.
Analysis Strategy

Linear regression was used to test hypotheses regarding the influence of each person’s presurgical traits (optimism, depressive symptoms, neuroticism, and relationship satisfaction) on the spouse’s 18-month well-being (e.g., depressive symptoms). First, simple correlations were performed between presurgical and postsurgical outcome variables. Second, we calculated partial correlations in order to identify candidate presurgical predictors for inclusion in the overall multivariate analyses. Postsurgical outcomes were correlated with each patient and spouse presurgical predictor, with presurgical values of the outcome controlled for. For patients, none of the eight medical (five preoperative and three postoperative) or two demographic (age and income) variables were associated with 18-month depressive symptoms and, thus, were not included in the regression models. Similar null results were found for patient medical covariates and spouse-caregiver demographics on caregivers’ 18-month depressive symptoms and measures of caregiving burden.¹

¹ One medical/demographic covariate was correlated with caregiver outcomes. Partial correlations revealed that higher caregiver age was associated with lower 18-month perception of burden when we controlled for presurgical burden (r = −.22, p < .05). Caregiver age was entered into the first step of the regression model but did not remain significant when additional caregiver traits were added. Thus, caregiver age was dropped from the final model.

In the third analysis phase, we regressed the 18-month postsurgical outcome on each significant presurgical predictor to emerge from the second phase of analysis. Linear regressions were conducted in steps. First, the outcome-matched presurgical term along with any of the individual’s own significant presurgical predictors were entered simultaneously. This approach allowed us to account for as much variance in the postsurgical outcome as could be predicted by the individual’s own presurgical characteristics prior to examining any additional variance explained by the spouse’s presurgical characteristics. In the second step, any significant spouse presurgical predictors were entered in a stepwise manner to determine the best partner predictors of additional variance. In describing the analyses, we report the unstandardized (B), standard error, and standardized (β) regression coefficients. We also report the overall multiple correlation squared (R²) for the final model as well as the R² for the model reflecting the individual’s own predictors and the ΔR² for each additional model.

In separate analyses, linear regression was used to examine marital satisfaction as a potential moderator of transitive relationships on both patient and caregiver outcomes. First, all predictors were centered. Next, four sets of interaction terms were created, two for examining caregiver effects on patients’ well-being (patients’ perceived marital satisfaction with caregiver traits and caregivers’ perceived marital satisfaction with their own traits) and two for examining patient effects on caregivers’ well-being (caregivers’ perceived marital satisfaction with patient traits and patients’ marital satisfaction with their own traits). Consistent with the preceding analytic methodology, patient and caregiver predictors were identified through partial correlations. Centered terms were created for these analyses. Regressions were performed in three steps: (a) the individual’s own traits, (b) the partner’s traits, and (c) the interaction terms entered stepwise. Variables in the first two steps were forced in using the “enter” method in order to account for their variance before the interaction terms were examined (Aiken & West, 1991). Significant interactions were plotted and examined using Aiken and West’s (1991) approach.
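The analysis phases described above (partial-correlation screening, followed by regression in ordered steps with centered interaction terms) can be illustrated with a minimal sketch. It is not the authors' analysis code: all variable names and data are simulated for illustration, the partner and interaction blocks are forced in rather than selected stepwise, and the regression is solved with a small pure-Python routine so that the example needs no external libraries.

```python
import random

def ols(X, y):
    """Least squares with an intercept, solved through the normal equations.
    X: one list of predictor values per couple; y: list of outcome scores.
    Returns (coefficients with intercept first, residuals, R^2)."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, resid, 1.0 - sum(e * e for e in resid) / ss_tot

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5

def partial_corr(x, y, control):
    """Correlation of x and y after removing what `control` predicts in each."""
    z = [[v] for v in control]
    return pearson(ols(z, x)[1], ols(z, y)[1])

def center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def hierarchical_r2(blocks, y):
    """Enter predictor blocks cumulatively; return (R^2, delta R^2) per step."""
    cols, out, prev = [], [], 0.0
    for block in blocks:
        cols.extend(block)
        r2 = ols([list(row) for row in zip(*cols)], y)[2]
        out.append((r2, r2 - prev))
        prev = r2
    return out

# Simulated (hypothetical) scores; n matches the 97 couples retained at 18 months
random.seed(0)
n = 97
own_dep = [random.gauss(0, 1) for _ in range(n)]              # own presurgical symptoms
own_sat = center([random.gauss(0, 1) for _ in range(n)])      # own marital satisfaction
partner_neu = center([random.gauss(0, 1) for _ in range(n)])  # partner neuroticism
dep_18 = [0.5 * d + 0.3 * nz - 0.3 * nz * s + random.gauss(0, 1)
          for d, nz, s in zip(own_dep, partner_neu, own_sat)]
interaction = [a * b for a, b in zip(partner_neu, own_sat)]

# Phase 2: partial correlation, controlling for the presurgical value of the outcome
print("screening r:", round(partial_corr(partner_neu, dep_18, own_dep), 3))
# Phase 3: (a) own traits, (b) partner trait, (c) centered interaction term
steps = [[own_dep, own_sat], [partner_neu], [interaction]]
for i, (r2, delta) in enumerate(hierarchical_r2(steps, dep_18), start=1):
    print(f"Step {i}: R2 = {r2:.3f}, delta R2 = {delta:.3f}")
```

Entering the individual's own predictors first means that the ΔR² at the second step reflects only variance the partner's trait explains beyond them, which is the logic behind the ordering described in the text.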
Results

Spousal Comparison on Traits and Outcomes

Table 2 reports the means and standard deviations for the psychosocial measures at baseline and at 18 months for patients and spouse-caregivers. Table 2 also reports the results of paired-sample t tests conducted to assess whether the members of each dyad differed significantly on these measures, as well as the correlations between dyad members on the measures assessed. As shown, patient and caregiver traits were similar, with the exception of depressive symptoms. Prior to surgery, caregivers reported significantly more depressive symptoms than did patients but reported similar levels of optimism, neuroticism, and marital satisfaction. After surgery, couples reported similar levels of depressive symptoms. Finally, for each of the variables assessed except optimism, a significant positive correlation existed between the ratings made by the two members of a dyad.
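As a brief illustration of the two statistics reported in Table 2, the sketch below computes a paired-sample t statistic and a Pearson correlation for a handful of dyads. The scores are invented for the example and the function names are hypothetical. Only the logic is meant to carry over: the t test operates on within-couple differences with df = n − 1, and the correlation treats the couple as the unit of analysis.

```python
import math


def paired_t(x, y):
    """Paired-sample t statistic and degrees of freedom for matched scores."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean_d / math.sqrt(var_d / n), n - 1


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(ssx * ssy)


# Hypothetical depressive-symptom scores for five couples (not study data).
caregiver_scores = [20, 18, 25, 17, 22]
patient_scores = [16, 17, 19, 15, 21]

t_value, df = paired_t(caregiver_scores, patient_scores)
r_value = pearson_r(caregiver_scores, patient_scores)
print(f"t({df}) = {t_value:.2f}, r = {r_value:.2f}")
```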
Patient Outcomes

We hypothesized that patient and caregiver traits measured prior to surgery would predict patients’ postsurgical depressive symptoms. In univariate analyses, patients’ presurgical and postsurgical depressive symptoms were positively correlated (r = .47, p < .001). However, patients’ depressive symptoms did not change significantly between presurgical levels and the 18-month follow-up (see Table 2). Partial correlations controlling for presurgical depressive symptoms revealed significant associations between patient 18-month depressive symptoms and both patient and caregiver presurgical predictors. Specifically, higher patient 18-month depressive symptoms were significantly associated with lower patient presurgical optimism, lower marital satisfaction, and higher neuroticism (see Table 3). In addition, lower caregiver presurgical optimism and higher presurgical neuroticism were also significantly associated with higher patient 18-month depressive symptoms. Hence, these patient and caregiver predictors were included in the overall regression model.

Multivariate analyses showed that higher 18-month depressive symptoms were predicted by patients’ higher presurgical depressive symptoms and lower presurgical optimism, accounting for 34% of the variance in scores. As expected, higher caregiver presurgical neuroticism accounted for an additional 6% of the variance (Table 4, Model 1). Thus, we found some support for the transitive hypothesis for patient depressive symptoms.

Next, we examined patient and caregiver presurgical marital satisfaction as potential moderators of caregiver traits on patient depressive symptoms. Multivariate analyses showed that patient but not caregiver presurgical marital satisfaction emerged as a moderator of caregiver presurgical neuroticism on patient postsurgical depressive symptoms (Table 4, Model 2; see Figure 2A). Post hoc analyses revealed that postsurgical depressive symptoms were higher among patients who were less satisfied in their marriage prior to surgery and whose caregiving spouse was higher in presurgical neuroticism (B = 1.13, p < .001).
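The partial correlations used to screen predictors can be obtained from three zero-order correlations. The sketch below shows the standard first-order formula; the input values in the example are hypothetical and are not taken from the study, which reports only the resulting partial correlations (Table 3).

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z partialled out of both.

    x: a presurgical predictor, y: the 18-month outcome,
    z: the presurgical level of that outcome (the control variable).
    """
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical zero-order correlations for illustration only.
example = partial_r(r_xy=0.60, r_xz=0.50, r_yz=0.50)
print(f"partial r = {example:.3f}")
```

When the predictor is unrelated to the baseline level of the outcome (r_xz = 0) and the baseline is unrelated to the outcome (r_yz = 0), the partial correlation reduces to the zero-order correlation.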
Table 2
Mean Comparisons of Patient and Caregiver Psychosocial Measures and Their Associations

Time of assessment and measure  Scale    Patient       Caregiver     df    t         r
Presurgical
  Depressive symptoms           CES-D    16.5ᵃ (5.9)   19.8ᵇ (6.7)    99    4.13***  .23*
  Marital satisfaction          DRS      45.2 (5.8)    45.4 (6.6)    100   −0.22     .39***
  Optimism                      LOT      21.9 (3.6)    21.9 (3.7)    104   −0.04     .17
  Neuroticism                   Eysenck   2.6 (2.5)     2.9 (2.4)    102   −1.11     .31***
18-month follow-up
  Depressive symptoms           CES-D    15.1ᵃ (5.4)   15.6ᵇ (5.4)    86   −0.90     .37***

Note. Standard deviations are in parentheses. CES-D = Center for Epidemiologic Studies Depression Scale; DRS = Dyadic Relationship Scale; LOT = Life Orientation Test; Eysenck = Eysenck Personality Questionnaire.
ᵃ Comparison between patients’ presurgical and 18-month depressive symptoms, t(90) = 1.54, ns.
ᵇ Comparison between caregivers’ presurgical and 18-month depressive symptoms, t(83) = 5.96, p < .001.
* p < .05. *** p < .001.

Table 3
Partial Correlations Between Patient and Caregiver Presurgical Traits and 18-Month Patient and Caregiver Outcomes, Controlling for Presurgical Levels of the Outcome Variable

Variable                  Patient         Caregiver       Caregiver       Caregiver
                          depressive      depressive      burden          strain
                          symptoms        symptoms
Patient presurgical
  Depressive symptoms                      .24* (77)       .48*** (78)     .42*** (77)
  Optimism                −.38*** (87)    −.09 (78)       −.43*** (80)    −.07 (79)
  Marital satisfaction    −.24* (84)      −.23* (76)      −.15 (78)       −.23* (77)
  Neuroticism              .37*** (86)     .47*** (77)     .38*** (78)     .30** (77)
Caregiver presurgical
  Depressive symptoms      .16 (85)                        .05 (79)       −.05 (78)
  Optimism                −.27** (87)     −.34** (80)     −.14 (82)       −.25* (82)
  Marital satisfaction    −.06 (86)       −.25* (80)      −.13 (81)       −.35*** (80)
  Neuroticism              .30** (85)      .46*** (78)     .14 (80)        .24* (79)

Note. Numbers of couples are in parentheses.
* p < .05. ** p < .01. *** p < .001.

Caregiver Outcomes

For caregivers, we hypothesized that patient and caregiver traits measured prior to surgery would predict caregivers’ postsurgical depressive symptoms and caregiving burden and strain.

Depressive symptoms. In univariate analyses, caregivers’ presurgical and postsurgical depressive symptoms were positively correlated (r = .52, p < .001). In contrast to patient findings, caregivers reported a significant improvement in depressive symptoms over time, t(83) = 5.96, p < .001 (see Table 2). Partial correlations controlling for presurgical depressive symptoms revealed significant associations between spouse 18-month depressive symptoms and both caregiver and patient presurgical predictors. Specifically, higher caregiver 18-month depressive symptoms were significantly associated with lower caregiver presurgical optimism and marital satisfaction and higher caregiver neuroticism (see Table 3). Higher patient presurgical neuroticism, higher depressive symptoms, and lower marital satisfaction were also associated with higher caregiver depressive symptoms at 18-month follow-up. Hence, these patient and caregiver predictors were included in the overall regression model. Multivariate analyses showed that higher caregiver 18-month depressive symptoms were predicted by their own higher presurgical depressive symptoms and higher neuroticism, accounting for 47% of the variance in scores. Consistent with our hypotheses, higher patient presurgical neuroticism accounted for an additional 12% of the variance (Table 5, Model 1). Multivariate analyses revealed that the relationship between patients’ presurgical traits and caregivers’ postsurgical depressive symptoms was not moderated by either patient or caregiver presurgical marital satisfaction.

Caregiving burden. In univariate analyses, caregivers’ presurgical and postsurgical caregiving burden were positively correlated (r = .60, p < .001). No significant change was found between presurgical and 18-month postsurgical burden. Partial correlations
controlling for presurgical burden revealed no significant associations between caregiver 18-month burden and any other of their own presurgical predictors. Higher caregiver 18-month burden was significantly associated with higher patient depressive symptoms, higher neuroticism, and lower optimism (see Table 3). Hence, these patient predictors were included in the overall regression model. In multivariate analyses, higher 18-month burden scores were predicted by higher presurgical burden, accounting for 38% of the variance in burden scores (Table 5, Model 2). Consistent with expectations, higher patient presurgical depressive symptoms emerged as a significant predictor, accounting for an additional 13% of the variance. Finally, lower patient optimism also emerged as a significant predictor, accounting for an additional 5% of the variance in caregivers’ 18-month burden scores. Multivariate analyses showed that both patient (Table 5, Model 3; see Figure 2B) and caregiver (Table 5, Model 4; see Figure 2C)
Table 4
Multiple Regression Models Predicting Patient 18-Month Well-Being by Patient and Caregiver Presurgical Traits

Model and patient 18-month depressive symptoms    B        SE B    β
Model 1ᵃ
  Patient presurgical depressive symptoms          0.34    0.09     .36***
  Patient presurgical optimism                    −0.51    0.15    −.33**
  Caregiver presurgical neuroticism                0.57    0.20     .25**
Model 2ᵇ
  Patient presurgical depressive symptoms          0.41    0.10     .41***
  Patient presurgical optimism                    −0.44    0.17    −.29**
  Patient presurgical neuroticism                 −0.09    0.28    −.04
  Patient presurgical marital satisfaction        −0.14    0.10    −.15
  Caregiver presurgical optimism                  −0.04    0.16    −.03
  Caregiver presurgical neuroticism                0.46    0.26     .20
  Caregiver presurgical marital satisfaction       0.16    0.09     .19
  Caregiver Presurgical Neuroticism ×
    Patient Presurgical Marital Satisfaction      −0.12    0.04    −.31***

Note. Only models demonstrating a significant transitive effect are shown. All coefficient values reflect the final regression model.
ᵃ Overall R² = .40; patient predictors, R² = .34; caregiver predictor, ΔR² = .06, p < .01.
ᵇ Overall R² = .47; interaction term, ΔR² = .08, p < .001.
** p < .01. *** p < .001.
Figure 2. Interactions between presurgical traits and marital satisfaction on patient and caregiver outcomes. A: Interaction between patients’ presurgical marital satisfaction and caregivers’ presurgical neuroticism for patient postsurgical depressive symptoms. B: Interaction between caregivers’ presurgical marital satisfaction and patients’ presurgical neuroticism for caregivers’ postsurgical burden. C: Interaction between patients’ presurgical marital satisfaction and their own presurgical depressive symptoms for caregivers’ postsurgical burden. D: Interaction between caregivers’ presurgical marital satisfaction and patients’ presurgical neuroticism for caregivers’ postsurgical strain.
marital satisfaction emerged as moderators of patient traits on caregiver burden. Post hoc analyses revealed that caregiving burden was higher among caregivers who were less satisfied in their marriage prior to surgery and who were caring for a patient-spouse higher in presurgical neuroticism (B = 0.56, p < .05). Caregiving burden was also higher among caregivers who were caring for patient-spouses who themselves were less satisfied in their marriage and who reported higher presurgical depressive symptoms (B = 0.43, p < .001).

Caregiving strain. Univariate analyses revealed that caregivers’ presurgical and postsurgical strain were positively correlated (r = .65, p < .001). No significant change was found between presurgical and 18-month postsurgical strain. Partial correlations controlling for presurgical strain revealed significant associations between caregivers’ 18-month strain and both caregiver and patient presurgical predictors. Specifically, higher 18-month strain was significantly associated with lower caregiver presurgical optimism, lower marital satisfaction, and higher neuroticism (see Table 3). Higher patient presurgical depressive symptoms and neuroticism as well as lower marital satisfaction were also associated with higher caregiver strain at 18-month follow-up. Hence, these caregiver and patient predictors were included in the overall regression model. Multivariate analyses showed that higher 18-month caregiving strain scores were predicted by higher caregiver presurgical strain, higher neuroticism, and lower marital satisfaction, accounting for 56% of the variance in scores (Table 5, Model 5). In addition, higher patient presurgical depressive symptoms also predicted
Table 5
Multiple Regression Models Predicting Caregiver 18-Month Well-Being by Patient and Caregiver Presurgical Traits

Model and presurgical trait                       B        SE B    β

Caregiver 18-month depressive symptoms
Model 1ᵃ
  Caregiver presurgical depressive symptoms        0.15    0.08     .18*
  Caregiver presurgical neuroticism                1.05    0.24     .43***
  Patient presurgical neuroticism                  0.93    0.21     .37***

Caregiver 18-month burden
Model 2ᵇ
  Caregiver presurgical burden                     0.45    0.09     .44***
  Patient presurgical depressive symptoms          0.23    0.07     .30***
  Patient presurgical optimism                    −0.33    0.12    −.25**
Model 3ᶜ
  Caregiver presurgical burden                     0.27    0.10     .27**
  Caregiver presurgical marital satisfaction      −0.12    0.07    −.17
  Patient presurgical depressive symptoms          0.25    0.08     .29**
  Patient presurgical optimism                    −0.22    0.12    −.17
  Patient presurgical neuroticism                  0.11    0.19     .06
  Patient presurgical marital satisfaction        −0.04    0.07    −.06
  Patient Neuroticism ×
    Caregiver Presurgical Marital Satisfaction    −0.07    0.02    −.26**
Model 4ᵈ
  Caregiver presurgical burden                     0.34    0.10     .33***
  Caregiver presurgical marital satisfaction      −0.02    0.07    −.03
  Patient presurgical depressive symptoms          0.27    0.08     .32**
  Patient presurgical optimism                    −0.26    0.13    −.20*
  Patient presurgical neuroticism                  0.16    0.19     .09
  Patient presurgical marital satisfaction        −0.11    0.07    −.14
  Patient Presurgical Depressive Symptoms ×
    Patient Presurgical Marital Satisfaction      −0.03    0.01    −.20*

Caregiver 18-month strain
Model 5ᵉ
  Caregiver presurgical strain                     0.38    0.10     .39***
  Caregiver presurgical marital satisfaction      −0.28    0.09    −.23**
  Caregiver presurgical neuroticism                0.81    0.30     .25**
  Patient presurgical depressive symptoms          0.50    0.11     .33***
Model 6ᶠ
  Caregiver presurgical strain                     0.36    0.10     .37***
  Caregiver presurgical optimism                  −0.10    0.19    −.05
  Caregiver presurgical neuroticism                0.75    0.35     .23*
  Caregiver presurgical marital satisfaction      −0.30    0.10    −.25**
  Patient presurgical depressive symptoms          0.56    0.13     .37***
  Patient presurgical neuroticism                 −0.35    0.30    −.11
  Patient presurgical marital satisfaction        −0.07    0.11    −.05
  Patient Presurgical Neuroticism ×
    Caregiver Presurgical Marital Satisfaction    −0.08    0.04    −.16*

Note. Only models demonstrating a significant transitive effect are shown. All coefficient values reflect the final regression model.
ᵃ Overall R² = .59; caregiver predictors, R² = .47; patient predictor, ΔR² = .12, p < .001.
ᵇ Overall R² = .56; caregiver predictors, R² = .38; patients’ presurgical depressive symptoms, ΔR² = .13, p < .001; patients’ presurgical optimism, ΔR² = .05, p < .01.
ᶜ Overall R² = .62; interaction term, ΔR² = .05, p < .005.
ᵈ Overall R² = .61; interaction term, ΔR² = .04, p < .05.
ᵉ Overall R² = .66; R² for individuals’ own predictors = .56; patient predictors, ΔR² = .10, p < .001.
ᶠ Overall R² = .69; interaction term, ΔR² = .02, p < .05.
* p < .05. ** p < .01. *** p < .001.
higher caregiver strain at 18 months, accounting for an additional 10% of the variance. Finally, patient presurgical neuroticism interacted with caregivers’ presurgical marital satisfaction, suggesting that among caregivers who reported higher presurgical marital satisfaction, higher patient presurgical neuroticism was associated with lower post-CABG caregiving strain (Table 5, Model 6; see Figure 2D). Post hoc analyses revealed a paradoxical relationship such that caregiving strain was higher among caregivers who were more satisfied in their marriage prior to surgery and who were caring for a patient-spouse lower in presurgical neuroticism (B = −0.85, p < .05).
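The post hoc coefficients reported for the four interactions can be approximated from the tabled values. The article states only that interactions were examined with Aiken and West’s (1991) approach; the sketch below assumes this means simple slopes evaluated one standard deviation above or below the mean of the centered moderator, and it assumes the full-sample presurgical standard deviations of marital satisfaction from Table 2 (5.8 for patients, 6.6 for caregivers). Under those assumptions, the simple slope of a trait at moderator value z is B(trait) + B(interaction) × z.

```python
def simple_slope(b_predictor, b_interaction, moderator_value):
    """Slope of the predictor at a given value of the centered moderator."""
    return b_predictor + b_interaction * moderator_value


# (label, B for trait, B for interaction, moderator value in SD units, reported B)
# Trait and interaction coefficients come from Tables 4 and 5; the moderator
# values are +/- 1 SD of presurgical marital satisfaction (an assumption).
cases = [
    ("Table 4, Model 2: caregiver neuroticism, low patient satisfaction",
     0.46, -0.12, -5.8, 1.13),
    ("Table 5, Model 3: patient neuroticism, low caregiver satisfaction",
     0.11, -0.07, -6.6, 0.56),
    ("Table 5, Model 4: patient depressive symptoms, low patient satisfaction",
     0.27, -0.03, -5.8, 0.43),
    ("Table 5, Model 6: patient neuroticism, high caregiver satisfaction",
     -0.35, -0.08, 6.6, -0.85),
]

slopes = []
for label, b_trait, b_inter, z, reported in cases:
    slope = simple_slope(b_trait, b_inter, z)
    slopes.append(slope)
    print(f"{label}: computed {slope:.2f}, reported {reported:.2f}")
```

The computed slopes fall within about 0.03 of the reported values, which is consistent with the assumption, although coefficient rounding and the use of full-sample rather than analysis-sample standard deviations would be expected to produce small discrepancies of this kind.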
Spouses’ Current Versus Prospective Factors in Predicting Partner Depressive Symptoms

Current affect of one spouse may influence the current affect of the partner (contagion effects; Figure 1, Pathway C). Contagion effects argue for congruent affective experiences among dyad members. Hence, as an alternative to our prospective findings, we examined the possibility that 18-month outcomes were better accounted for by partners’ 18-month depressive symptoms.2 In univariate analyses, patients’ and caregivers’ 18-month depressive symptoms were positively correlated (r = .37, p < .001). In addition, partial correlations revealed significant associations between patients’ and spouses’ 18-month depressive symptoms when either patients’ (r = .27, p < .05) or caregivers’ (r = .29, p < .01) presurgical depressive symptoms were controlled for. These findings are consistent with Figure 1, Pathway C.

Linear regression was used to test concurrent affect versus prospective factors of one spouse on the partner’s 18-month depressive symptoms. The regression model, including predictors and order of entry, was identical to those previously used for patient and caregiver 18-month depressive symptoms. To test whether the presurgical factors or concurrent affect of one spouse was a better predictor of the partner’s 18-month depressive symptoms, we also entered the predictor spouse’s concurrent 18-month depressive symptoms in the first step of the regression model. Concurrent affect was not a significant predictor for either patient or caregiver outcomes and did not significantly alter previously observed prospective findings. These findings suggest that presurgical factors and not concurrent affect are the stronger predictors of partner 18-month depressive symptoms among couples facing CABG surgery.
Discussion

The current study supports the hypothesis that personality characteristics are important prospective risk factors for the well-being of both individuals and their spouses. First, the findings demonstrate the importance of presurgical personality traits as within-person predictors of postsurgical adaptation. For patients, lower presurgical optimism predicted higher postsurgical depressive symptoms. For caregiver spouses, higher presurgical neuroticism predicted more postsurgical depressive symptoms and, along with lower presurgical marital satisfaction, contributed to greater postsurgical caregiving strain.

Second, and perhaps more interesting, the current study demonstrates the importance of personality traits as independent predictors of the partner’s adaptation in the context of marriage and illness. As predicted, the partner’s presurgical neuroticism predicted higher postsurgical depressive symptoms for both patients and caregivers after we accounted for the individual’s own personality characteristics. Importantly, we demonstrated that these transactional effects were not better explained by the concurrent affect of the spouse. In addition, higher patient presurgical depressive symptoms predicted higher caregiver postsurgical burden and strain, whereas higher patient optimism was associated with lower postsurgical caregiver burden. Together, these findings suggest that over and above a patient’s own risk factors, spouses’ psychosocial characteristics contribute unique risk to patient well-being.

Consistent with expectations, marital satisfaction emerged as a consistent moderator of transitive neuroticism effects. We generally found that the more dissatisfied people were with their marriage prior to surgery, the more damaging was their spouses’ neuroticism to their own well-being after surgery. These findings suggest that for couples who are less happy in their marriage and who are facing CABG surgery, higher neuroticism in one spouse is predictive of more postsurgical difficulties in the partner. Curiously, we found one exception to this pattern: Caregivers who were more satisfied in their marriage and who were caring for more neurotic patient-spouses reported less postsurgical strain. Future research should investigate whether this is a replicable finding or simply an anomaly in the data. Despite this finding, higher neuroticism emerged as an important interpersonal risk factor, either alone or as moderated by lower marital satisfaction, for all patient and caregiver outcomes examined here. These findings also clarify the role of presurgical marital satisfaction as a moderator of transitive personality influence rather than as a unique predictor of later well-being.

Finally, analyses revealed that the prospective effects of personality remained significant after concurrent affect was accounted for. This finding suggests that personality is the more important predictor of partner depressive symptoms, at least in this data set, perhaps because of the stability of personality traits compared with the more transient nature of mood variables such as depressive symptoms. In the absence of prior examples comparing such effects, our finding is somewhat exploratory. However, our rationale for the prediction was based on the idea that in established relationships, personality represents not only a consistent pattern of overt affect and behavior but a history of interacting with that pattern. This is not to say that transient mood is not important, but, at least in these data, experience outweighed the episodic.

Our results support a broader transitive conceptualization of the patient/caregiver-spouse relationship and, particularly, the role of personality traits within the model.
Extending findings of neuroticism as an important predictor of individual depressive symptoms, a spouse’s neuroticism appears to be a critical contributor to the partner’s well-being, for both patients and their caregivers, in the context of a stressful event such as CABG surgery. Prior research suggests that more distressed individuals have more difficulty adapting. Perhaps neuroticism activates or exacerbates partner distress through greater interpersonal expressions of worry, communicating cause for concern, and pulling for greater attention from the caregiver. Higher patient depression prior to surgery was associated with more within-person depressive symptoms at follow-up and also with greater perceived burden and strain among caregiving spouses. It is conceivable that patient depressive symptoms may communicate to the caregiver cause for concern and pull for the spouse to assist in the patient’s recovery. Future research should attempt to explicate these factors. Regardless of the transitive mechanism, clinical interventions aimed at reducing depressive symptoms are likely to benefit the health and well-being of both spouses.

We should note that these findings also speak to the controversy regarding dispositional optimism and neuroticism as possibly two sides of the same coin (Scheier et al., 1994; Smith, Pope, Rhodewalt, & Poulton, 1989). Previously, Scheier et al. (1994) demonstrated that optimism accounted for a significant amount of variance in depressive symptoms when they controlled for neuroticism. Our findings broaden this support by demonstrating that when examined together, optimism and neuroticism display unique as opposed to redundant interpersonal effects, at least within the context of marital relationships and health.

2 Analyses focused only on 18-month depressive symptoms because of the clear partner-matched variable (concurrent 18-month depressive symptoms). Caregiver burden and strain were not examined because of the lack of a patient-matched variable.

More broadly, these findings support the transitive interpersonal model as an important concept for studying close relationships in the context of illness. This model has the potential to advance understanding of the complexities of individual health and well-being outcomes by considering the common social context in which they occur. The explicit emphasis on the transitive properties of personality traits may yield far more accurate predictions about patient–caregiver relationships than reliance on traditional assumptions of caregiver spouses as inherently supportive and patients as burdensome. Instead, the model suggests that each individual can improve as well as make more difficult the experience of his or her dyadic partner. Finally, in addition to considering the individual attributes of two people, the model also emphasizes the importance of relationship quality, with our findings supporting its role as a moderator of the partner’s attributes on future well-being.

Despite these advances, the current study is not without its limitations. Although the traits examined here were sufficient for demonstrating the interpersonal properties of personality traits in coping with illness, future research should use a wider taxonomy of traits or target traits with known interpersonal consequences such as hostility (McCrae & Costa, 1987; Trapnell & Wiggins, 1990; cf. Smith, Glazer, Ruiz, & Gallo, 2004). In addition, the current study was limited to dyads in which the patient was the husband, leaving open the question as to whether the observed effects are truly representative of patient/spouse-caregiver relationships, an effect of gender, or the interaction between gender and illness roles.
Although unlikely, it is also possible that our findings are a result of a shared reporting style specific to couples facing a crisis such as CABG surgery. Comparisons with a control group not facing a crisis may have helped to examine this issue. We also used a number of nonindependent tests, which, although based on a priori hypotheses, may have inflated the risk of Type I error. Finally, it is unclear whether transitive processes are specifically a health phenomenon, a shared-challenge phenomenon, or a more general relationship phenomenon. It is possible that health challenges motivate couples to interact more intensely in order to meet shared goals and responsibilities and that these acute interaction periods expose couples to more frequent and longer lasting transitive effects. However, most couples are faced with various shared challenges at one time or another that require greater interaction (Karney & Bradbury, 1995; Kiecolt-Glaser & Newton, 2001). Hence, it may be the case that any period of heightened interaction increases exposure to transitive effects. Alternatively, established relationships represent repeated interpersonal interactions with the same person that may exert cumulative effects independent of any specific challenge. Clearly, these issues were beyond the scope of the current study but warrant future research to explore the generalizability of transitive processes.

These qualifications aside, this study supports the transitive model as an important step in understanding adaptation to disease. In addition to addressing the aforementioned limitations, future research should examine the interpersonal influence of a wider range of caregivers. For example, the patient’s primary physician, surgeon, and health care staff may engender confidence or worry with implications for the trajectory of physical and emotional recovery of both the patient and the spouse. Similar attention should also be given to important social network members including family, friends, and coworkers, as each group may influence the individual’s self-perception of recovery and a return to normal life or a need for continued concern. In addition, transactional processes also deserve more direct attention. For example, do patients higher in neuroticism transactionally create their own depression by overburdening their caregivers, causing them to be less supportive? Additional research on the transitive processes may facilitate a more complete understanding of psychosocial influences in the illness process and guide the development of appropriate interventions.
References Agnew, C. R., Loving, T. J., & Drigotas, S. M. (2001). Substituting the forest for the trees: Social networks and the prediction of romantic relationship state and fate. Journal of Personality and Social Psychology, 81, 1042–1057. Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage. Allen, J. K., Young, D. R., & Xu, X. (1998). Predictors of long-term change in functional status after coronary artery bypass graft surgery in women. Progressive Cardiovascular Nursing, 13, 4 –10. American Heart Association. (2002). 2002 heart and stroke statistical update. Dallas, TX: Author. Artinian, N. T. (1991). Stress experience of spouses of patients having coronary artery bypass during hospitalization and 6 weeks after discharge. Heart and Lung, 20, 52–59. Artinian, N. T. (1992). Spouse adaptation to mate’s CABG surgery: 1 year follow-up. American Journal of Critical Care, 1, 36 – 42. Baker, R. A., Andrew, M. J., Schrader, G., & Knight, J. L. (2001). Preoperative depression and mortality in coronary artery bypass surgery: Preliminary findings. Annals of the New Zealand Journal of Surgery, 71, 139 –142. Beach, S. R. H., Katz, J., Kim, S., & Brody, G. H. (2003). Prospective effects of marital satisfaction on depressive symptoms in established marriages: A dyadic model. Journal of Social and Personal Relationships, 20, 355–371. Bengtson, A., Karlsson, T., Wahrborg, P., Hjalmarson, A., & Herlitz, J. (1996). Cardiovascular and psychosomatic symptoms among relatives of patients waiting for possible coronary revascularization. Heart and Lung, 25, 438 – 443. Benjamin, L. S. (1996). Interpersonal diagnosis and treatment of personality disorders. New York: Guilford Press. Blumenthal, J. A., Lett, H. S., Babyak, M. A., White, W., Smith, P. K., Mark, D. B., et al. for the NORG Investigators. (2003). Depression as a risk factor for mortality after coronary artery bypass surgery. The Lancet, 362, 604 – 609. 
Bookwala, J., & Schulz, R. (1996). Spousal similarity in subjective wellbeing: The Cardiovascular Health Study. Psychology and Aging, 11, 582–590. Bookwala, J., & Schulz, R. (1998). The role of neuroticism and mastery in spouse caregivers’ assessment of and response to a contextual stressor. Journal of Gerontology Series B, Psychological Sciences and Social Sciences, 53, 155–164. Borowicz, L., Jr., Royall, R., Grega, M., Selnes, O., Lyketsos, C., & McKhann, G. (2002). Depression and cardiac morbidity 5 years after coronary artery bypass surgery. Psychosomatics, 43, 464 – 471. Boudrez, H., & DeBacker, G. (2001). Psychological status and the role of coping style after coronary artery bypass graft surgery. Results of a prospective study. Quality of Life Research, 10, 37– 47. Brorsson, B., Bernstein, S. J., Brook, R. H., & Werko, L. (2001). Quality of life of chronic stable angina patients 4 years after coronary angio-
Received August 5, 2004
Revision received November 30, 2005
Accepted December 11, 2005
Journal of Personality and Social Psychology 2006, Vol. 91, No. 2, 268–280
Copyright 2006 by the American Psychological Association 0022-3514/06/$12.00 DOI: 10.1037/0022-3514.91.2.268
From Automatic Antigay Prejudice to Behavior: The Moderating Role of Conscious Beliefs About Gender and Behavioral Control

Nilanjana Dasgupta and Luis M. Rivera
University of Massachusetts, Amherst

Two experiments tested whether the relation between automatic prejudice and discriminatory behavior is moderated by 2 conscious processes: conscious egalitarian beliefs and behavioral control. The authors predicted that, when both conscious processes are deactivated, automatic prejudice would elicit discriminatory behavior. When either of the 2 processes is activated, behavioral bias would be eliminated. The authors assessed participants’ automatic attitudes toward gay men, conscious beliefs about gender, behavioral control, and interactions with gay confederates. In Experiment 1, men’s beliefs about gender were heterogeneous, whereas women’s beliefs were mostly egalitarian; men’s responses supported the predictions, but women’s responses did not. In Experiment 2, the authors recruited a sample with greater diversity in gender-related beliefs. Results showed that, for both sexes, automatic prejudice produced biased behavior in the absence of conscious egalitarian beliefs and behavioral control. The presence of either conscious process eliminated behavioral bias.

Keywords: implicit social cognition, automaticity, attitudes toward homosexuals, prejudice, gender roles
The nature of prejudice in the United States has changed substantially in the past century. Attitudes and behavior toward several disadvantaged groups, especially racial minorities and women, have become significantly more tolerant (Cafferata, Horn, & Wells, 1997; Dovidio, 2001; Huddy, Neely, & Lafay, 2000; Schuman, Steeh, Bobo, & Kryson, 1997). Even in the case of sexual minorities, public opinion polls indicate that people are relatively supportive of basic civil rights for gays and lesbians (Herek, 2000; Sherrill & Yang, 2000; Yang, 1997), and their attitudes, at least within certain demographic groups (e.g., younger and educated populations), have become less negative over the past few decades (Herek, 1984; Herek & Capitanio, 1996). Despite these sweeping changes in attitudes, subtle forms of discrimination continue in many areas of everyday life, including employment, housing, health care, and the justice system (Badgett, 1996; Ellis & Riggle, 1996; Hebl, Foster, Mannix, & Dovidio, 2002; Portwood, 1995; Ridgeway, 1997; Rubenstein, 1996; Rudman & Glick, 2001; Stohlberg, 2002).

In recognition of the changing nature of prejudice, social psychologists have responded with new theories and evidence that highlight subtle forms of prejudice and discrimination (Dovidio, 2001; Fazio & Towles-Schwen, 1999; Gaertner & Dovidio, 1986; Greenwald et al., 2002). As a case in point, theories of automatic prejudice focus on negative attitudes toward outgroups that may become spontaneously activated in memory without perceivers’ awareness or control (for reviews, see Dasgupta, 2004; Fazio & Towles-Schwen, 1999; Greenwald & Banaji, 1995; Wilson, Lindsey, & Schooler, 2000). These automatically activated attitudes have the capacity to shape behavior in significant ways.

The pernicious impact of automatic prejudice on behavioral outcomes was first demonstrated in a 1995 study by Fazio and colleagues (Fazio, Jackson, Dunton, & Williams, 1995). Several related publications followed closely on the heels of the first report (Dovidio, Kawakami, & Gaertner, 2002; Dovidio, Kawakami, Johnson, Johnson, & Howard, 1997; McConnell & Leibold, 2001; for a review, see Dasgupta, 2004). Collectively, these demonstrated that automatic racial attitudes predict people’s subtle behavior toward racial minorities better than controlled attitudes, especially when the behaviors involve nonverbal and paralinguistic responses that people are typically unaware of, unable to control, or not motivated to control (e.g., eye contact, body posture, speech errors).
This research was supported by the Wayne F. Placek Award granted by the American Psychological Foundation to Nilanjana Dasgupta. We are indebted to Pavita Krishnaswamy for many conversations during which this research was conceptualized. We are grateful to Matthew Boynes, Christopher Monahan, Jared Steinberg, Sean Thompson, and Joe Gallegos for serving as confederates. We thank Nixzaliz Baez, Michael Berg, Jessie Day, Sarah Gallo, Hilary Khalagher, Christopher Monahan, Yael Nillni, Jaime Schilling, Eric Uhlmann, and Julie Weismantel for their help with stimuli selection, data collection, data entry, and videotape coding. We also thank David DeSteno and Aline Sayer for statistical advice. Finally, we thank our colleagues in the Developmental Psychology Division at the University of Massachusetts for graciously allowing us to use the Child Study Center in Springfield, MA, where Study 2 was conducted. Correspondence concerning this article should be addressed to Nilanjana Dasgupta, Department of Psychology, Tobin Hall, University of Massachusetts, Amherst, MA 01003, or to Luis Rivera, who is now at the Department of Psychology, California State University, 5500 University Parkway, San Bernardino, CA 92407. E-mail:
[email protected] or
[email protected]
Moderators of the Link Between Automatic Prejudice and Discriminatory Behavior

Although automatically activated prejudice can bias behavior, this effect is not obligatory; it depends on how aware people are of the possibility of bias, how motivated they are to correct potential bias, and how much control they have over their judgment or behavior. Just as automatic attitudes have been found to be remarkably malleable (e.g., Dasgupta & Asgari, 2004; Dasgupta & Greenwald, 2001; Wittenbrink, Judd, & Park, 2001; for a review, see Blair, 2002), so too behaviors are likely to be quite malleable depending on the extent to which motivation and control are at play.
Motivation as a Moderator of the Link Between Automatic Prejudice and Discrimination

The role of motivation in moderating the link between automatic attitudes and behavior was first proposed by Fazio (1990), who argued that, when people have the motivation and opportunity to be mindful, their controlled attitudes are likely to override their automatic attitudes to predict behavior. However, when people lack the motivation or opportunity to be mindful, automatically activated attitudes ought to be the primary predictor of behavior. Consistent with this idea, Dunton and Fazio (1997) found that, among participants who were not motivated to control prejudice, stronger automatic prejudice predicted less favorable judgments of a Black college student. However, among those who were highly motivated to control prejudice, stronger automatic prejudice predicted more favorable judgments of the same student, suggesting that motivated participants were overcorrecting their judgment to avoid potential bias (see also Olson & Fazio, 2004; Towles-Schwen & Fazio, 2003). These studies illustrate that the influence of automatic prejudice on social judgment is conditional upon people’s motivation to be nonprejudiced.

Other research has focused on differentiating the source of motivation: whether it emanates from the desire to adhere to one’s personal standards or to social normative standards (Plant & Devine, 1998). Taken together, the extant research has articulated the role of intrinsic versus extrinsic motivation to be nonprejudiced and has examined the influence of these motivations on self-reported judgments and emotions directed at racial minority groups.
Conscious Egalitarian Beliefs as a Source of Motivation

People’s motivation to be nonprejudiced is often rooted in their consciously held beliefs and values about egalitarianism. If so, strongly held egalitarian beliefs ought to provide the motivation to attenuate the impact of automatic prejudice on behavior. For example, people who hold egalitarian beliefs about gender roles are motivated to reject gender-based societal demarcations that prescribe “appropriate” roles, behavior, personality, appearance, and (hetero)sexuality for women and men, which in turn, ought to influence their behavior toward gender-nonconforming individuals (e.g., gay men). By contrast, others who hold traditional beliefs about gender roles are motivated to preserve gender-based demarcations between men and women because they view these as beneficial to society rather than instances of unfair discrimination (Bem, 1981, 1984; Deaux & Kite, 1987; Deaux & Lewis, 1984; Kite & Deaux, 1987). Some research suggests that endorsement of gender-based inequalities is linked to people’s own gender identity (Spence, 1993). Those who believe that traditional gender norms are fair and ought not to change are more likely to describe their own self-concept in a traditionally masculine (for men) or feminine (for women) manner, compared with others who reject traditional gender norms as unfair and requiring change. Moreover, people who endorse traditional beliefs are likely to feel threatened while interacting with gender-nonconforming individuals whose presence questions the cherished social order (Kite & Whitley, 1996; LaMar & Kite, 1998).

The present research contributes to the existing literature by testing whether conscious egalitarian beliefs about gender roles serve as a source of motivation to attenuate the relation between automatic prejudice and behavioral bias toward gay men. We assessed gender role beliefs using a newly developed measure that captures people’s conscious commitment to reject or uphold normative gender roles and gender identity. Because this measure was tailored to assess beliefs about gender conformity, we expected it to predict people’s motivations toward gender-nonconforming target groups better than other measures of motivation that have been validated specifically for racial groups and thus may not generalize to other groups (e.g., Dunton & Fazio, 1997; Plant & Devine, 1998).

In the present research, we also sought to extend previous research in a second important way. Past studies on the effect of motivation on prejudice-related outcomes have exclusively focused on clearly controllable outcomes, such as self-reported judgments (Dunton & Fazio, 1997; Olson & Fazio, 2004) and self-reported emotions in response to hypothetical situations (Plant & Devine, 1998; Towles-Schwen & Fazio, 2003). These studies did not examine whether motivation to be egalitarian can attenuate biases in spontaneous actions that have limited controllability, because they occur rapidly in real time with little attention from social actors. We sought to fill this gap in the literature by focusing specifically on participants’ spontaneous nonverbal and paralinguistic actions during interactions with outgroup members that were evaluated by third-party observers, not by social actors themselves. Our goal was to test whether people’s conscious motivation to be egalitarian can circumvent the impact of automatic prejudice on subtle behavior toward stigmatized others.
Behavioral Control as a Moderator of the Link Between Automatic Prejudice and Discrimination

Behavioral control refers to individuals’ ability to monitor and modify their public behavior to fit with prevailing social norms or to ease social interactions independent of their conscious endorsement or nonendorsement of egalitarian ideals. People vary widely in the degree to which they are aware of and able to control their subtle behavior. For example, consider nonverbal and paralinguistic cues such as smiling, eye contact, spatial distance, and friendliness toward interaction partners. Some people are relatively unaware of the nonverbal cues they communicate and are unskilled at correcting them, whereas others are remarkably aware of and practiced at controlling such body language.

Most of the research on behavioral control has been conducted within the framework of self-monitoring theory, which refers to individual differences in expressive control and impression management in public situations (Gangestad & Snyder, 2000; Snyder, 1974; Snyder & Gangestad, 1986). However, the measure derived from self-monitoring theory does not specifically assess individual differences in the ability to control subtle nonverbal and paralinguistic cues that people express spontaneously in social interactions (e.g., facial expressions, body posture, eye contact). In addition, the self-monitoring measure does not assess behavioral control in the context of interactions across group boundaries. Because the present research is geared toward understanding subtle forms of behavioral discrimination toward outgroup members, we developed a three-item measure to assess people’s awareness of and control over their subtle behavior during interactions with outgroup members (in this case, gay men).
Overview of the Present Research

Our primary goal in the present studies was to identify the conditions under which automatic prejudice in the mind translates into behavioral discrimination. We predicted that two conscious processes will influence the strength of association between automatic prejudice and behavior: (a) people’s motivation to be egalitarian on the basis of their conscious beliefs and (b) their control over subtle behavioral cues. When both conscious processes are deactivated (i.e., when individuals are not motivated by egalitarian beliefs and cannot control their subtle behavior), a strong connection will emerge between automatic attitudes and behavior. That is, automatic prejudice will result in biased behavior. In contrast, when either of the two conscious processes (egalitarian motivation or behavioral control) is activated, the connection between automatic attitudes and behavior will be short-circuited. That is, automatic prejudice in the mind will no longer result in biased behavior.

Our prediction is similar to Fazio and colleagues’ model on motivation and opportunity as determinants of behavior (MODE model), which proposes that either conscious motivation or opportunity to control one’s responses ought to eliminate the effect of automatic attitudes on self-reported judgments and behavior (for reviews, see Fazio, 1990; Fazio & Towles-Schwen, 1999). Studies conducted by Fazio and colleagues testing the MODE model have demonstrated that motivation moderates the link between automatic attitudes and self-reported judgments (Dunton & Fazio, 1997; Olson & Fazio, 2004; Towles-Schwen & Fazio, 2003); however, these studies have not examined whether opportunity to control one’s responses has a similar moderating effect. Moreover, these studies have focused on outcome variables that were self-reported and deliberative rather than spontaneous.

Our hypotheses were based on the logic that conscious processes such as egalitarian motives and behavioral control ought to exert an effect on the behavioral output at the “downstream end” of the attitude–behavior relation. People who are highly motivated to uphold egalitarian beliefs about gender roles are likely to be mindful in social interactions with gay men. As such, their behavior toward gay men is predicted to be positive regardless of any automatic attitudes they acquired passively by immersion in the larger society. Similarly, people who are highly practiced at controlling their subtle behaviors are also likely to convey positive behavioral cues to ease social interactions with gay individuals regardless of their automatic attitudes. However, people who are not motivated by egalitarian beliefs and not able to control their actions online are predicted to be most susceptible to act in accordance with their automatic attitudes; the more antigay prejudice they harbor, the more discriminatory their actions will be.

We conducted two experiments to test our predictions. Experiment 1 was conducted in a small college town in Massachusetts. Although this experiment recruited a community sample, because the local population is politically liberal, we expected that participants would endorse relatively egalitarian beliefs about gender roles. This is particularly likely in the case of women (Eagly & Mladinic, 1989; Helmreich, Spence, & Gibson, 1982; Lottes, 1993; McBroom, 1987; Spence & Hahn, 1997; Stark, 1991) because the local community is home to several grass-roots feminist organizations and women’s colleges. Given these demographic constraints, we anticipated that the male sample would provide a better test of our hypotheses than the female sample. To provide a stronger test of our hypotheses, we conducted Experiment 2 in a large city where the population is more heterogeneous in terms of social beliefs about gender roles. This time, participants of both sexes who had low motivation and low control were expected to show a strong connection between automatic prejudice and antigay behavior. Other participants who had high motivation or high control over their actions were expected to exhibit positive behavior regardless of their automatic attitudes.
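The predicted pattern (automatic prejudice relates to behavior only when both egalitarian motivation and behavioral control are low) has the form of a three-way interaction. The sketch below illustrates that logic only. It assumes dummy-coded moderators (0 = low, 1 = high) and made-up coefficient values; the function name, the coding scheme, and the numbers are hypothetical and are not the authors' analysis or estimates.

```python
def simple_slope(b_p, b_pm, b_pc, b_pmc, motivation, control):
    """Slope of behavior on automatic prejudice at given moderator levels.

    Derived from a hypothetical regression of the form
    behavior = ... + b_p*P + b_pm*P*M + b_pc*P*C + b_pmc*P*M*C,
    where M (motivation) and C (control) are coded 0 = low, 1 = high.
    """
    return b_p + b_pm * motivation + b_pc * control + b_pmc * motivation * control


# Illustrative coefficients chosen so that the slope is negative only when
# both moderators are low (more prejudice, less positive behavior).
COEFS = dict(b_p=-1.0, b_pm=1.0, b_pc=1.0, b_pmc=-1.0)

slopes = {
    (m, c): simple_slope(motivation=m, control=c, **COEFS)
    for m in (0, 1)
    for c in (0, 1)
}

for (m, c), slope in slopes.items():
    print(f"motivation={m}, control={c}: slope={slope}")
```

Under this coding, the hypothesis amounts to a nonzero slope in the low-motivation, low-control cell and a slope near zero in the other three cells.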
Experiment 1

Method

Participants

A community sample of 82 residents of a small town in Massachusetts (52 women, 30 men) participated in exchange for $15. We conducted recruitment using advertisements placed in community newspapers and flyers posted at local businesses. Of the participants, 71% were White, 9% were Black, 7% were Asian, 6% were Hispanic, 5% were multiracial, and 2% did not answer the question. Participants’ ages ranged from 17 to 65 (M = 26.12 years, SD = 11.98). None of the participants identified as gay or lesbian; their mean self-rating was 9.99 on an 11-point scale on which 11 represented exclusive heterosexual identification.
Measures and Manipulations

Manipulation of confederates’ apparent sexual orientation. Participants received information about each confederate’s sexual orientation role, but the confederates remained unaware of the manipulated role to ensure that their behavior would not change inadvertently as a function of the role. Sexual orientation was manipulated between participants so that both confederates played both roles. Because the confederates rotated roles and because they remained unaware of their own role during any given experimental session, we were assured that any systematic differences in participants’ behavior toward the gay versus straight confederate must be due to participants’ perception of the confederates’ sexual orientation rather than any other confounding variable.

Before the interviews, participants were given two folders, each of which included a photograph and a résumé that ostensibly belonged to the interviewer (confederate) whom they would meet shortly. Each résumé described the academic interests, work experience, and extracurricular activities of one of the two confederates. The two résumés were equated in terms of competence and likeability. Listed under extracurricular activities was a sentence indicating the confederate’s involvement in a campus organization: he was described as a member of either the gay students’ alliance at the university (gay role) or a campus fraternity (heterosexual role).

In addition to counterbalancing confederates’ sexual orientation, we also counterbalanced two other variables to ensure that they would not confound the results: the order in which participants encountered the individual confederates, and the order in which they encountered the allegedly gay versus heterosexual person. Finally, the confederates were similar in appearance, dress, attractiveness, race (both were White), and outward personality. They were trained to behave in a friendly and professional manner during the interviews.
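The counterbalancing described above can be sketched by enumerating the session types it implies. This is a minimal illustration, assuming two confederates labeled "A" and "B" (hypothetical labels) and assuming each participant meets both confederates, one in each role; how sessions were actually assigned to participants is not stated here.

```python
from itertools import product

CONFEDERATES = ("A", "B")  # hypothetical labels for the two confederates


def session_cells():
    """Enumerate session types: who plays the gay role and who is met first."""
    cells = []
    for gay_role, first in product(CONFEDERATES, repeat=2):
        # The role seen first follows from the other two choices.
        first_role = "gay" if first == gay_role else "heterosexual"
        cells.append(
            {
                "gay_role": gay_role,
                "first_confederate": first,
                "first_role": first_role,
            }
        )
    return cells


cells = session_cells()
for cell in cells:
    print(cell)
```

In this sketch, crossing role assignment with confederate order yields four session types, and each role order (gay first or heterosexual first) appears equally often across them.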
Measurement of nonverbal behavior. Participants’ nonverbal behavior toward each confederate was measured in two ways. First, each confederate rated participants’ behavior toward him. Second, the interviews were also videotaped with a camera hidden among a pile of books and papers positioned on a bookshelf facing the participant’s chair. These videotapes were later judged by two coders who were unaware of the experimental hypotheses and the manipulation of sexual orientation. Six dimensions were used to code behavior. These have been successfully used in past research as indicators of positive and negative nonverbal behavior toward others (DePaulo & Friedman, 1998; Dovidio et al., 1997; Fazio et al., 1995; LaFrance, 1985; McConnell & Leibold, 2001). Three of the items focused on specific behaviors: (a) how much eye contact participants made with each confederate, (b) how much they smiled, and (c) their body posture. Three other items focused on global behavior: (a) participants’ overall friendliness, (b) how comfortable they appeared, and (c) how interested they appeared in the interaction. Coders and confederates rated all behaviors on 11-point scales ranging from not at all (1) to very much (11).

Measurement of automatic attitudes toward gay men. Participants’ automatic attitudes toward gay men, compared with heterosexuals, were measured using an Implicit Association Test (IAT; Greenwald, McGhee, & Schwartz, 1998). The IAT is a computerized task that measures the relative strength with which two target groups (e.g., gays vs. heterosexuals) are associated with good versus bad concepts (represented by words such as paradise vs. poison) using response latency to operationalize attitude strength. Pictures of same- and different-sex couples were used to represent gay and heterosexual men. The stimuli were selected to ensure that the couples appeared to be lovers, not platonic friends.
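The latency-based logic of the IAT can be illustrated with a small sketch. It assumes a plain difference of mean response latencies between the two key pairings; the function name, the labels "compatible" and "incompatible", and the latency values are hypothetical, and this simple difference is not necessarily the scoring procedure the authors used.

```python
from statistics import mean


def iat_effect_ms(compatible_ms, incompatible_ms):
    """Mean latency difference (incompatible minus compatible), in milliseconds."""
    return mean(incompatible_ms) - mean(compatible_ms)


# Hypothetical latencies for one participant (illustrative values, not study data).
compatible = [600, 650, 700]    # heterosexual + good and homosexual + bad share keys
incompatible = [800, 850, 900]  # homosexual + good and heterosexual + bad share keys

effect = iat_effect_ms(compatible, incompatible)
print(effect)  # positive values mean faster responses in the compatible pairing
```

A larger positive difference in this sketch corresponds to a stronger relative association of heterosexual with good and homosexual with bad.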
Participants saw four types of stimuli presented one at a time on a computer screen (homosexual and heterosexual pictures, good and bad words). Their task was to categorize each stimulus using one of two designated response keys. Participants’ response latencies are typically faster when highly associated pictures and words share the same key than when weakly associated pictures and words share the same key. Given the pervasive presence of antigay sentiments in U.S. society, we predicted that response latencies would be significantly faster when heterosexual and good stimuli shared one response key and when homosexual and bad stimuli shared the other response key. By contrast, response latencies would be substantially slower for opposite combinations of stimuli. The order in which these two stimulus combinations were administered was counterbalanced between participants. Measurement of conscious beliefs about gender roles and gender identity. A review of the social psychological literature on gender-related beliefs revealed that most of the popular and well-validated measures focus on a few specific gender-related domains—mostly women’s employment, heterosexual romantic relationships, and stereotypic personality traits (Glick & Fiske, 1996; Spence & Hahn, 1997; Spence & Helmreich, 1972; Swim, Aikin, Hall, & Hunter, 1995). Typically, these measures do not assess gender-related beliefs in other domains such as parenting, family life, and physical appearance. However, collectively, all these domains involve prescriptive gender norms about roles, behaviors, traits, and physical appearance that are seen as “appropriate” for women and men, and people vary in the degree to which they explicitly endorse such traditional prescriptions. Second, most gender-related scales concentrate on beliefs and attitudes toward women in particular rather than both sexes. 
Third, with regard to gender identity, the two most popular and well-validated scales—the Bem Sex Role Inventory (Bem, 1974) and the Personal Attributes Questionnaire (Spence, Helmreich, & Stapp, 1974)—focus exclusively on people’s self-descriptions in terms of gendered personality traits, but they do not assess how people present their gender and sexual identity to others and to the self. Because our interest focused on the influence of both traditional beliefs about gender roles (across a wide array of domains) and gender identity, we designed a new scale called Traditional Beliefs about Gender and Gender Identity (TBGI) to assess individual differences
in such beliefs. Here we provide relevant evidence of scale construction and scale validation (Rivera & Dasgupta, 2006). Using six student and community samples we constructed a 15-item scale comprising two subscales (see Appendix). One subscale, Traditional Beliefs about Gender (TBG; 8 items), focuses on the degree to which people endorse traditional prescriptive gender norms in various life domains including parenting, professional life, social interactions, and physical appearance. The other subscale, Traditional Beliefs about Gender Identity (TBI; 7 items), focuses on the degree to which people are invested in emphasizing their heterosexual identity to others and to themselves. Participants indicated their agreement or disagreement with each statement on a 7-point scale ranging from strongly disagree (1) to strongly agree (7). Exploratory factor analyses indicated that the scale items loaded onto two separate factors (TBI accounted for 25% of the variance in responses, and TBG accounted for 16% of the variance). The scale showed robust internal consistency across six independent samples (αs ranged from .84 to .90). The two factors were moderately correlated, r(1985) = .53, p < .001. To validate the fit of the two-factor model, we conducted a confirmatory factor analysis (CFA; Jöreskog & Sörbom, 1996) on a new sample. The one-factor model did not fit the data well (goodness-of-fit index [GFI] = .88, adjusted goodness-of-fit index [AGFI] = .83, root mean square error of approximation [RMSEA] = .10, 90% confidence interval [CI] = 0.09, 0.10), whereas the two-factor model fit significantly better (GFI = .95, AGFI = .93, RMSEA = .057, 90% CI = 0.05, 0.06), Δχ²(1, N = 1251) = 492.41, p < .001. To rule out the possibility that the TBGI is simply a measure of explicit attitudes toward gays and lesbians, we performed two additional CFAs using a community sample (N = 136).
In the first CFA, TBGI and three measures assessing attitudes toward gay men and lesbians (two feeling thermometers and Herek’s [1988] Attitudes Toward Lesbians and Gay Men scale) were modeled as a single construct. This model did not fit the data well, χ²(14, N = 136) = 31.71, p < .005; GFI = .77, AGFI = .48, RMSEA = .07, 90% CI = 0.0, 0.11. In the second CFA, attitudes toward gays and lesbians and the TBGI were modeled as two distinct but correlated constructs. This model fit the data well, χ²(13, N = 136) = 8.37, p = .81; GFI = .98, AGFI = .93, RMSEA = .00, 90% CI = 0.0, 0.05. Moreover, the two-factor model was a significant improvement over the one-factor model, Δχ²(1, N = 136) = 23.34, p < .01. These analyses suggest that beliefs about gender roles and gender identity represent a construct that is independent from attitudes toward gay men and lesbians. Finally, the TBGI was significantly correlated in the predicted direction with other theoretically related constructs such as attitudes toward women (average r = .53, p < .01), attitudes toward gay men and lesbians (average r = .57, p < .01), authoritarianism (r = .63, p < .01), and social dominance (r = .18, p < .05), but not with unrelated constructs such as social desirability (r = −.04, ns) and self-esteem (r = −.05, ns).
Measurement of behavioral control. We created three items to assess individual differences in the degree to which people are aware of and able to control their subtle nonverbal behavior during interpersonal interactions.
These items included the following: (a) “While talking to another person I’m conscious of what I communicate silently with my ‘body language’”; (b) “I try to keep an eye on my own actions when I’m interacting with others so that I don’t behave in a discriminatory manner without thinking”; and (c) “When I’m in the presence of a gay or lesbian person, I pay attention to my own behavior so that they don’t get the impression that I’m prejudiced against them.” For each item, participants indicated their agreement or disagreement on a 7-point scale ranging from strongly disagree (1) to strongly agree (7). These items were designed to capture people’s ability to control their public behavior independent of their beliefs about gender roles and gender identity. The reliability coefficients are similar for all three items together and for only the two items specific to prejudice control (αs ranged from .64 to .70).
Measurement of self-reported sexual orientation. Self-reported sexual orientation was measured with one item embedded in a demographic
DASGUPTA AND RIVERA
questionnaire. Participants were asked, “In terms of sexual preference, how do you self-identify?” They marked a position on an 11-point scale anchored by I identify as gay or lesbian exclusively (1), I identify as bisexual (6), and I identify as heterosexual exclusively (11). We chose a single-item measure as a simple way of ensuring that our sample included only heterosexual participants. Past research reveals that there is no clear consensus on how to assess self-reported sexual orientation (for reviews, see Chung & Katayama, 1996; Coleman, 1987; Sell, 1997). For example, sexual orientation has been measured using a single categorical variable with three response options (i.e., heterosexual, bisexual, and homosexual), a single continuous variable (i.e., ranging from exclusively heterosexual to exclusively homosexual), and multiple variables (based on identity, behavior, sexual fantasy, etc.). Our single-item measure is most similar to Kinsey, Pomeroy, and Martin’s (1948) original scale.
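The chi-square difference test reported for the two community-sample CFAs earlier in this section can be checked with simple arithmetic. The sketch below is only an illustration: the function names are ours, and the p-value helper assumes exactly 1 degree of freedom, which matches the reported comparison. The input values are the model chi-squares and degrees of freedom quoted in the text.

```python
from math import erfc, sqrt


def chi_square_difference(chi2_restricted, df_restricted, chi2_general, df_general):
    """Return (delta chi-square, delta df) for two nested models."""
    return chi2_restricted - chi2_general, df_restricted - df_general


def p_value_df1(chi2_value):
    """Upper-tail p value for a chi-square statistic with exactly 1 df.

    With 1 df the statistic is a squared standard normal deviate, so
    P(X > x) = erfc(sqrt(x / 2)). This helper is not valid for other df.
    """
    return erfc(sqrt(chi2_value / 2.0))


# Values reported in the text (community sample, N = 136):
# one-construct model chi2(14) = 31.71, two-construct model chi2(13) = 8.37.
delta_chi2, delta_df = chi_square_difference(31.71, 14, 8.37, 13)
p = p_value_df1(delta_chi2)

print(round(delta_chi2, 2), delta_df, p < 0.01)
```

The difference of the two reported chi-squares reproduces the reported 23.34 on 1 degree of freedom, and the resulting p value falls below the reported .01 threshold.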
Procedure
Participants took part in two ostensibly unrelated studies that in reality were two sessions of the same study. The two sessions were separated by 1 week to minimize suspicion and to enhance the “two separate studies” cover story. In the “first study” the female experimenter told participants that they would complete a number of tasks for her undergraduate research project. After signing the consent form, participants completed a brief demographic form, followed by a gay male IAT, the TBGI scale, and behavioral control items (the last two measures were presented in counterbalanced order). Once they were done, participants were reminded that they had signed up for another study scheduled for the following week (in reality, the “second study” was the behavioral session). One week later, participants arrived at a different location where they were greeted by a new female experimenter. She informed them that they would be interviewed by two undergraduate students for their senior theses on public opinions about politics and the economy. She went on to explain that the honors college at the university wanted to profile the accomplishments of students in the honors program so they had compiled a brief folder describing each honors student. The experimenter then gave participants two folders to read that ostensibly belonged to the two interviewers while they waited for the first interviewer to arrive. Each folder contained a résumé and a photograph of the interviewer. In reality, of course, the interviewers were confederates whose sexual orientation was manipulated through information in the résumé (see Measures and Manipulations section for details). The experimenter removed these folders before the confederate entered the room; thus, confederates always remained unaware of their manipulated role. Each confederate conducted a one-on-one interview with the participant for about 10 min in a small private room.
The topic of one interview involved participants’ opinion of the economy and how it was affecting their lives. The other interview was about presidential politics and participants’ voting preferences. Both topics were selected to be unrelated to prejudice. The set of questions asked by each confederate was always fixed; however, as mentioned earlier, we counterbalanced (a) confederates’ sexual orientation between participants, (b) the order in which participants encountered the person in the gay versus heterosexual role, and (c) the order in which participants met each individual confederate. Participants’ behavior was rated by the two confederates immediately after the interviews and by two independent judges who later watched the videos taped by the hidden camera. After the interviews were over, the female experimenter returned and asked participants some final questions to assess their level of suspicion, their awareness of the hypotheses, and their awareness of confederates’ sexual orientation. None of the participants in our sample guessed the hypotheses. The experimenter then debriefed participants, requested their permission to use their videotaped interviews, and paid them for their time.
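The counterbalancing scheme described above can be enumerated explicitly. The sketch below rests on our own reading, not on a statement in the text: with two confederates, choosing who plays the gay role and who interviews first also fixes which role is encountered first, so the three counterbalanced factors yield four distinct conditions. The labels "A" and "B" are hypothetical.

```python
from collections import Counter
from itertools import product

CONFEDERATES = ("A", "B")  # hypothetical labels for the two confederates


def build_conditions():
    """Enumerate every combination of role assignment and interview order."""
    conditions = []
    for gay_role, first_confederate in product(CONFEDERATES, repeat=2):
        hetero_role = "B" if gay_role == "A" else "A"
        # The role encountered first follows from the other two choices.
        first_role = "gay" if first_confederate == gay_role else "heterosexual"
        conditions.append(
            {
                "gay_role": gay_role,
                "hetero_role": hetero_role,
                "first_confederate": first_confederate,
                "first_role": first_role,
            }
        )
    return conditions


conditions = build_conditions()

# Each level of each factor should appear equally often across conditions.
balance = {
    key: Counter(c[key] for c in conditions)
    for key in ("gay_role", "first_confederate", "first_role")
}
for key, counts in balance.items():
    print(key, dict(counts))
```

Assigning equal numbers of participants to the four conditions would balance all three factors at once under this reading.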
Results and Discussion
Automatic Attitudes
We calculated automatic attitudes toward gay men relative to heterosexuals by subtracting the average latency for pro-heterosexual combinations (heterosexual + good and homosexual + bad) from the average latency for pro-gay combinations (homosexual + good and heterosexual + bad). The larger this difference score or IAT effect, the stronger the automatic preference for heterosexuals and relative bias against gay men. A t test comparing the average IAT effect to zero revealed that participants expressed substantial automatic prejudice against gay men, compared with heterosexuals (mean IAT effect = 249 ms; d = .96), t(81) = 10.63, p < .0009. There was no significant difference between men’s and women’s automatic attitudes (IAT effect for men = 220 ms; IAT effect for women = 266 ms), t(80) = −1.23, p = .22. This finding is consistent with previous research showing that, when implicit attitudes toward gay men are examined, male and female participants often exhibit similar levels of antigay bias (Banse, Seise, & Zerbes, 2001; Steffens & Buchner, 2003).
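The difference-score logic described above can be sketched in a few lines. Everything in the example is illustrative: the latencies are invented, the function names are ours, and the effect size is computed as mean divided by standard deviation, which is one common convention and not necessarily the one used for the d values reported in the text.

```python
from math import sqrt
from statistics import mean, stdev


def iat_effect(pro_gay_latencies, pro_hetero_latencies):
    """Mean latency (ms) for pro-gay pairings minus mean for pro-heterosexual pairings."""
    return mean(pro_gay_latencies) - mean(pro_hetero_latencies)


def one_sample_t(scores, mu=0.0):
    """Return (t, df) for a one-sample t test of the scores against mu."""
    n = len(scores)
    t_value = (mean(scores) - mu) / (stdev(scores) / sqrt(n))
    return t_value, n - 1


# Hypothetical latencies: (pro-gay block, pro-heterosexual block) per participant.
participants = [
    ([900, 950, 1000], [700, 720, 740]),
    ([800, 820, 840], [650, 700, 750]),
    ([1000, 1100, 1200], [800, 850, 900]),
    ([750, 760, 770], [740, 760, 780]),
]

effects = [iat_effect(gay, hetero) for gay, hetero in participants]
t, df = one_sample_t(effects)
d = mean(effects) / stdev(effects)  # assumed effect-size convention

print(effects, round(t, 2), df, round(d, 2))
```

A positive effect means slower responding when gay and good share a key, which the text interprets as relative bias against gay men.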
Conscious Beliefs About Gender Roles and Gender Identity
Beliefs about gender roles and gender identity revealed significant differences between men and women. Responses on the TBGI scale as a whole (α = .91) showed that men endorsed significantly more traditional beliefs about gender roles and gender identity (M = 3.61, SD = 1.06) than women (M = 2.84, SD = 1.14), F(1, 80) = 9.12, p = .003. This pattern emerged for both subscales. On the TBG subscale (α = .87), men were more likely to endorse traditional gender roles (M = 3.32, SD = 1.20) than women (M = 2.39, SD = 1.29), F(1, 80) = 10.32, p = .002. Similarly, on the TBI subscale (α = .90) men were more invested in making their normative gender identity apparent to others and to the self (M = 3.92, SD = 1.38) than were women (M = 3.34, SD = 1.43), F(1, 80) = 3.28, p = .07.
Behavioral Control
The three behavioral control items were averaged into a single index (α = .70) in which higher numbers indicated that perceivers were more practiced at controlling their interpersonal behavior. Results showed that men and women were equally skilled at controlling behavior (men: M = 4.37, SD = 1.43; women: M = 4.46, SD = 1.39), F < 1.
Nonverbal Behavior
Participants’ behavior was rated by the two confederates and two independent judges. All raters made six behavioral judgments (smiling, eye contact, body posture, friendliness, comfort, and interest). These behaviors were analyzed in two ways: (a) as a single averaged behavioral index that captured participants’ global interaction style and (b) as individual behaviors. The global index was created in the following manner. First, confederates’ ratings were averaged into two behavioral indices, one for the gay confederate and the other for the heterosexual confederate so that higher numbers indicated more favorable behavior (average α = .84). Second, because the two judges’ ratings were well correlated, r(80) = .73, p < .001, these ratings were
collapsed into a single index for actions directed at the gay confederate and another index for actions directed at the heterosexual confederate (average α = .88). Finally, confederates’ and judges’ ratings were combined because they were significantly correlated, r(80) = .53, p < .0001, and yielded the same pattern of findings. A Participant Sex × Confederate Role between-participants analysis of variance (ANOVA) indicated that, on average, there was no difference between participants’ spontaneous behavior toward the confederate in the gay versus the heterosexual role (Ms = 6.48 and 6.44, respectively), F < 1.
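The index construction described above can be illustrated with a short sketch. All ratings below are invented, the function names are ours, and combining confederate and judge ratings by a simple unweighted mean is an assumption, because the text does not spell out the combination rule. The reliability function uses the standard Cronbach's alpha formula.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item ratings per participant."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def behavioral_index(confederate_ratings, judge1_ratings, judge2_ratings):
    """Average the six ratings per rater, pool the judges, then pool with the confederate."""
    judges = mean([mean(judge1_ratings), mean(judge2_ratings)])
    return mean([mean(confederate_ratings), judges])  # assumed unweighted mean


# Hypothetical 11-point ratings on the six behaviors for four participants.
ratings = [
    [7, 8, 7, 8, 7, 8],
    [5, 6, 5, 5, 6, 5],
    [9, 9, 8, 9, 9, 10],
    [3, 4, 4, 3, 4, 3],
]
alpha = cronbach_alpha(ratings)
index = behavioral_index([6, 7, 8, 6, 7, 8], [5, 6, 7, 5, 6, 7], [7, 8, 9, 7, 8, 9])

print(round(alpha, 3), index)
```

Higher index values correspond to more favorable behavior, matching the scoring direction stated in the text.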
Relationship Between Automatic Antigay Attitudes and Subtle Behavior
To test whether automatic antigay attitudes, the TBGI, and behavioral control had any effect on participants’ overall behavior, we conducted a hierarchical regression in which behavior directed at the gay confederate was used as the outcome variable. Behavior directed at the heterosexual confederate and participants’ age were controlled in the first step of the regression equation. Using the heterosexual confederate as a control variable allowed us to partial out individual differences in participants’ general social skills (which should affect their behavior toward heterosexual and gay men equally) and instead only focus on the variance in behavior that was directed at gay men in particular. We also sought to control the possible confounding influence of participants’ age because past research has documented that older people tend to be more prejudiced against gay men and lesbians than younger people (Britton, 1990; Herek, 1988, 1994; Hudson & Ricketts, 1980). This is particularly relevant in the present study because of the wide age range in our community sample. In the second step of the regression equation, we included the predictor variables (automatic attitudes [gay IAT], conscious beliefs about gender and identity [the TBGI], behavioral control, and participant sex), followed by the two-, three-, and four-way interaction variables in subsequent steps. Results revealed a significant four-way IAT × TBGI × Behavioral Control × Participant Sex interaction, omnibus F(17, 65) = 1.77, p = .06; ΔF(1, 65) = 4.43, p = .04; ΔR² = .05; β = −.35, p = .04. To disaggregate this interaction effect, we examined the data for male and female participants separately.
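The ΔF and ΔR² statistics reported for each regression step follow the standard formulas for comparing nested regression models. The sketch below shows those formulas with invented R² values. The function names and inputs are ours, and the example is not a reanalysis of the reported data.

```python
def delta_f(r2_reduced, r2_full, n_added, df_residual):
    """F change when n_added predictors raise R-squared from r2_reduced to r2_full.

    df_residual is the residual degrees of freedom of the full model.
    """
    delta_r2 = r2_full - r2_reduced
    return (delta_r2 / n_added) / ((1.0 - r2_full) / df_residual)


def omnibus_f(r2, n_predictors, df_residual):
    """Overall F for a model with n_predictors predictors and the given residual df."""
    return (r2 / n_predictors) / ((1.0 - r2) / df_residual)


# Hypothetical values: one interaction term added in the final step,
# 17 predictors in the full model, 65 residual degrees of freedom.
step_f = delta_f(r2_reduced=0.20, r2_full=0.25, n_added=1, df_residual=65)
full_f = omnibus_f(r2=0.25, n_predictors=17, df_residual=65)

print(round(step_f, 2), round(full_f, 2))
```

The example shows why a single added term can produce a sizable ΔF while the omnibus F stays modest: the omnibus test spreads the explained variance over all 17 predictors.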
Because the community from which this sample was drawn was home to several feminist organizations and women’s colleges, we anticipated that female participants would report mostly egalitarian beliefs, whereas male participants would be more heterogeneous in their beliefs, thereby providing a better test of our hypotheses. Specifically, we predicted that automatic antigay attitudes would result in antigay behavior for men who were not motivated by conscious egalitarian beliefs and not able to control their subtle behavior. However, other men who were either highly motivated to be egalitarian or highly skilled at controlling behavior would not exhibit antigay behavior.
Male participants. A regression using men yielded a significant three-way IAT × TBGI × Behavioral Control interaction, indicating that men’s behavior toward the gay confederate was influenced by their automatic attitudes, beliefs about gender, and behavioral control, omnibus F(9, 20) = 5.99, p = .001; ΔF(1, 20) = 4.78, p = .04; ΔR² = .07; β = .39, p = .04. To examine the direction of this effect, traditional versus nontraditional men were disaggregated through a median split of their TBGI scores (Mdn = 3.47).
Traditional men (low motivation to be egalitarian). As shown in Figure 1, Panel A, traditional men’s data showed a significant IAT × Behavioral Control interaction revealing that automatic antigay prejudice produced discriminatory behavior when male participants were not motivated by egalitarian beliefs and not able to control their behavior, omnibus F(5, 10) = 5.35, p = .03; ΔF(1, 10) = 6.39, p = .05; ΔR² = .17; β = .61, p = .05. To explore this two-way interaction more carefully, we separately examined the responses of traditional men who were high versus low in automatic prejudice (Mdn IAT effect = 119 ms). Results showed that men who exhibited strong automatic prejudice behaved less favorably if they were unable to control their behavior, compared with their peers who were able to control behavior, omnibus F(3, 5) = 18.91, p = .02; ΔF(1, 5) = 16.64, p = .03; ΔR² = .21; β = 1.17, p = .03. In contrast, men who exhibited no automatic prejudice behaved similarly regardless of behavioral control, ps > .25. We interpret these data cautiously given the small sample of traditional men. These findings await replication in the following experiment.
Traditional men’s individual behaviors. We conducted similar regressions using each of the six behavioral indicators as dependent variables. Three of these behaviors produced significant effects: Among traditional participants who were automatically prejudiced, low behavioral control resulted in less eye contact with the gay interviewer, β = .60, p = .06, less comfort in his presence, β = .89, p = .002, and less interest in the conversation, β = .38, p = .10.
Nontraditional men (high motivation to be egalitarian). For nontraditional men who were highly motivated to be egalitarian, automatic prejudice in the mind did not produce biased action in terms of the overall behavioral index (ΔF < 1, p = .42; see Figure 1, Panel B) or individual behaviors (all ps > .17).
Female participants.
For women, automatic attitudes, gender-related beliefs, and behavioral control did not predict behavior (all ps = ns), which is not surprising given that the vast majority of female participants in this study expressed highly egalitarian beliefs about gender roles (94%) and gender identity (65%) on the TBGI.
Experiment 2
Experiment 2 sought to provide a stronger test of our hypothesis that the relation between automatic antigay prejudice and behavior is guided by the degree to which people (both men and women) are motivated by egalitarian beliefs and able to control their behavior. To that end, we recruited a community sample from a large city where people tend to be more heterogeneous in terms of their gender-related beliefs, compared with the small college town where the previous experiment had been conducted. We predicted that this time, for both sexes, automatic prejudice would produce subtle antigay behavior if participants were not motivated by egalitarian beliefs and not able to control their behavior. In contrast, when either egalitarian motivation or behavioral control was activated, the relation between automatic prejudice and biased behavior would become attenuated.
Figure 1. A: Traditional male participants: Effect of automatic prejudice and behavioral control on subtle behavior toward gay men. This interaction effect was plotted by calculating values for each of the two predictor variables that were 1 standard deviation above and below the mean (Aiken & West, 1991). B: Nontraditional male participants: Effect of automatic prejudice and behavioral control on subtle behavior toward gay men. IAT = Implicit Association Test.
Method
Participants
A community sample of 67 participants (39 women, 28 men) was recruited from a city with the help of advertisements in local newspapers and flyers at local businesses and community colleges. All participants were paid $15–$20. Seventy percent of the participants were White, 12% were Black, 9% were Hispanic, 1.5% were Native American, 1.5% were multiracial, and 6% indicated that they belonged to other unspecified groups. Participants’ age ranged from 17 to 71 (M = 37.33 years, SD = 13.08). None of the participants identified as gay or
lesbian (mean self-rating ⫽ 10.69 on an 11-point scale). The procedure was identical to Experiment 1 with two exceptions. First, in Experiment 2, we used three confederates who rotated between gay and heterosexual roles (instead of two confederates). Second, the TBGI and behavioral control measures were administered at the end of Session 2.
Results and Discussion
Automatic Attitudes
A t test comparing the IAT effect to zero revealed that, as a group, participants expressed significant antigay prejudice (mean IAT effect = 341 ms), t(66) = 12.18, p < .0009. In addition, men exhibited more automatic bias (IAT effect for men = 445 ms, d = .79) than women (IAT effect for women = 269 ms, d = .52), t(65) = −2.62, p = .01. Moreover, participants in this study showed more antigay prejudice (mean IAT effect in Experiment 2 = 341 ms) than those in the previous study (mean IAT effect in Experiment 1 = 249 ms), t(142) = 2.21, p < .03.
Traditional Beliefs About Gender Roles and Gender Identity
Beliefs about gender roles and gender identity revealed significant differences between men and women. Responses on the TBGI as a whole (α = .87) showed that men endorsed more traditional beliefs about gender roles and gender identity (M = 3.93, SD = 1.01) than women (M = 3.02, SD = 1.14), F(1, 66) = 11.48, p = .001. This pattern emerged for both subscales. On the TBG subscale (α = .78), men favored the separation of gender roles (M = 3.31, SD = 1.22) more than women (M = 2.50, SD = 1.21), F(1, 66) = 7.35, p = .009. Similarly, on the TBI subscale (α = .81), men were more invested in making their normative gender identity apparent to others and to the self (M = 4.62, SD = 1.26) than were women (M = 3.61, SD = 1.45), F(1, 66) = 8.90, p = .004. Overall, participants in this experiment reported more traditional beliefs on the TBGI (M = 3.43, SD = 1.15), compared with their counterparts in Experiment 1 (M = 3.04, SD = 1.14), t(146) = 2.04, p < .05.
Behavioral Control
Responses on the three behavioral control items indicated that, on average, male and female participants were equally able to control their subtle behaviors (men: M = 4.21, SD = 1.23; women: M = 4.38, SD = 1.19; F < 1).
Nonverbal Behavior
As in the previous experiment, behaviors were analyzed in two ways: (a) as a single averaged behavioral index that captured participants’ global interaction style and (b) as individual behaviors. The global index was created in the following manner. First, the confederates’ ratings were averaged into two behavioral indices, one for the gay confederate and the other for the heterosexual confederate so that higher numbers indicated more favorable behavior (average α = .74). Second, because the two judges’ ratings were significantly correlated, r(66) = .60, p < .001, these ratings were collapsed into one index capturing behavior toward the gay confederate and another capturing behavior toward the heterosexual confederate (average α = .75). Finally, confederates’ and judges’ ratings were combined, r(66) = .45, p < .0001. A Confederate Role × Participant Sex between-participants ANOVA revealed a significant two-way interaction, F(1, 66) = 6.37, p = .01, which indicated that, compared with women, men were less friendly toward the allegedly gay confederate (Ms = 6.78 and
5.50, respectively), t(66) = −2.50, p = .02, but both were equally friendly toward the heterosexual confederate (Ms = 6.05 and 6.20, respectively), t < 1.
Relationship Between Automatic Antigay Attitudes and Subtle Behavior
To test whether automatic antigay attitudes, the TBGI, and behavioral control had any effect on spontaneous behavior toward the gay confederates, we conducted a hierarchical regression using overall behavior toward the gay confederate as the outcome variable. After we controlled for the effect of participants’ age, their behavior toward heterosexual confederates, and participant sex in the first step of the regression equation, gay IAT scores, TBGI, and behavioral control were entered as predictor variables in the second step, followed by the two- and three-way interaction variables in subsequent steps. Results revealed a marginally significant effect of participant sex, indicating that, overall, male participants behaved less positively than female participants, F(3, 64) = 5.86, p < .0009; β = .20, p = .09. More important, a significant TBGI × IAT × Behavioral Control interaction emerged; omnibus F(10, 57) = 3.47, p = .001; ΔF(1, 57) = 3.03, p = .04; ΔR² = .09; β = .32, p = .007. All other effects were nonsignificant. To test the direction of the three-way interaction, we separately examined the data for traditional and nontraditional participants based on a median split (Mdn = 3.43).
Traditional participants (low motivation to be egalitarian). Using traditional participants only, we tested whether the IAT and behavioral control predicted people’s behavior toward gay men. A significant IAT × Behavioral Control interaction revealed that automatic antigay prejudice resulted in discriminatory behavior only among participants who were not motivated by egalitarian beliefs and not able to control their behavior; omnibus F(5, 27) = 2.73, p = .04; ΔF(1, 27) = 4.95, p = .04; ΔR² = .12; β = .37, p = .04 (see Figure 2, Panel A). All other effects were nonsignificant.
To explore this two-way interaction more carefully, we separately examined the responses of traditional participants who were high versus low in automatic prejudice. Similar to Experiment 1, results showed that those who exhibited high levels of automatic prejudice behaved less favorably if they had little behavioral control, compared with their peers who had a great deal of behavioral control; omnibus F(2, 20) = 2.53, p = .10; ΔF(1, 20) = 3.71, p = .07; ΔR² = .15; β = .41, p = .07. In contrast, traditional participants who exhibited low levels of automatic prejudice behaved similarly toward gay men regardless of behavioral control, F < 1, p > .40.
Traditional participants’ individual behaviors. We conducted similar regressions using each of the six behavioral indicators as dependent variables. Four of the six behaviors produced significant effects. Specifically, among traditional participants who were prejudiced, low behavioral control resulted in less relaxed body posture in the presence of gay men, β = .46, p = .03; less friendliness toward gay men, β = .74, p = .005; less comfort in the presence of gay men, β = .50, p = .01; and less interest in the conversation, β = .33, p = .09.
Figure 2. A: Traditional participants (men and women): Effect of automatic prejudice and behavioral control on subtle behavior toward gay men. This interaction effect was plotted by calculating values for each of the two predictor variables that were 1 standard deviation above and below the mean (Aiken & West, 1991). B: Nontraditional participants (men and women): Effect of automatic prejudice and behavioral control on subtle behavior toward gay men. IAT = Implicit Association Test.
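The plotting procedure mentioned in the figure captions (values at 1 standard deviation above and below the mean of each predictor) can be sketched as follows. The regression coefficients are invented for illustration, both predictors are assumed to be standardized, and the function names are ours. None of the numbers come from the reported analyses.

```python
def predicted(b0, b_x, b_z, b_xz, x, z):
    """Predicted outcome from a regression with a two-way interaction term."""
    return b0 + b_x * x + b_z * z + b_xz * x * z


def plot_points(b0, b_x, b_z, b_xz, mean_x, sd_x, mean_z, sd_z):
    """Predicted values at 1 SD below and above the mean of each predictor."""
    points = {}
    for x_label, x in (("low", mean_x - sd_x), ("high", mean_x + sd_x)):
        for z_label, z in (("low", mean_z - sd_z), ("high", mean_z + sd_z)):
            points[(x_label, z_label)] = predicted(b0, b_x, b_z, b_xz, x, z)
    return points


def simple_slope(b_x, b_xz, z):
    """Slope of x on the outcome at a fixed value z of the moderator."""
    return b_x + b_xz * z


# Hypothetical coefficients: x = automatic prejudice (IAT), z = behavioral control.
points = plot_points(6.0, -0.5, 0.3, 0.4, mean_x=0.0, sd_x=1.0, mean_z=0.0, sd_z=1.0)
slope_low_control = simple_slope(-0.5, 0.4, z=-1.0)
slope_high_control = simple_slope(-0.5, 0.4, z=1.0)

for key, value in sorted(points.items()):
    print(key, round(value, 2))
print(round(slope_low_control, 2), round(slope_high_control, 2))
```

With these made-up coefficients, the slope of prejudice on behavior is steeper at low behavioral control than at high behavioral control, which is the kind of pattern the panels are meant to display.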
Nontraditional participants. Another regression tested whether the IAT and behavioral control predicted the behavior of participants who endorsed nontraditional beliefs. All effects were statistically nonsignificant for this group for both the behavioral index, Fs < 1, ps > .50 (see Figure 2, Panel B), and individual behaviors, all ps > .21.
General Discussion
Although automatic bias in the mind may predispose people to behave in a subtly discriminatory fashion, the present research illustrates that such behavior is by no means inevitable. People’s
nonverbal and verbal behaviors toward stigmatized individuals such as gay men are guided by a blend of automatic and controlled processes including automatically activated attitudes, conscious egalitarian beliefs, and ability to control behavior.
The Moderating Effect of Conscious Egalitarian Beliefs and Behavior Control
The present studies demonstrated that automatic antigay prejudice resulted in discrimination against gay men only when conscious motivation and control were absent. Experiment 1 showed that for men who were not motivated by egalitarian beliefs and who were unable to control their subtle behavior, stronger automatic prejudice produced more antigay discrimination. However, others who endorsed egalitarian beliefs or who were skilled at controlling their actions did not discriminate, regardless of their automatic attitudes. Because Experiment 1 included a disproportionate number of egalitarian women, we conducted Experiment 2 to actively recruit a more heterogeneous urban sample with greater diversity in gender-related beliefs. This experiment demonstrated that for both men and women, conscious processes such as egalitarian beliefs and behavioral control moderated the relation between automatic prejudice and discrimination. Specifically, automatic antigay prejudice in the mind translated into biased action only for people who were not motivated by egalitarian beliefs and not skilled at behavior control. However, others who favored egalitarian beliefs or who were skilled at managing their behavior showed no outward discrimination, regardless of their automatic attitudes. In fact, implicitly biased participants who were behaviorally skilled overcorrected their behavior and acted more favorably toward gay men than their less skilled peers. This finding is consistent with conceptually similar effects reported by Fazio and colleagues, who found that implicitly prejudiced participants who were highly motivated to control racial bias overcorrected their evaluations of African American individuals (Dunton & Fazio, 1997; Olson & Fazio, 2004).
Our data illustrate that certain types of nonverbal behaviors (smiling, eye contact, body posture, global friendliness and comfort) can be controlled with practice. However, other behaviors (e.g., eyeblinks, startle responses) may be more difficult to control. In addition, high vigilance during interactions with stereotyped outgroups may have cognitive costs for social actors—that is, after such interactions people may feel cognitively depleted in keeping with Richeson and Shelton’s (2003) findings. However, an interesting alternative possibility is that people who are highly practiced at monitoring and modifying their subtle behaviors may not show cognitive depletion after intergroup interactions if this skill has become automatized. An examination of individual differences in the cognitive consequences of behavioral control promises to be an intriguing avenue of future research. In addition to the moderating role of behavioral control, we also tested the role of conscious egalitarian beliefs as a source of motivation to behave in a nonprejudiced manner. In our research, egalitarian beliefs about gender roles and gender identity were the source of motivation that short-circuited the translation of automatic antigay bias from thoughts into action. For other stigmatized groups besides gay men, the specific nature of the egalitarian belief system may vary, but as a general principle, conscious beliefs in
favor of equality ought to exert a moderating influence on the automatic attitude–behavior link because such beliefs motivate people to be mindful in social interactions. We speculate that intrinsically motivated egalitarian beliefs, rather than extrinsically motivated beliefs, ought to be effective in attenuating the link between automatic prejudice and discriminatory action because intrinsically motivated people are likely to have accumulated greater practice at avoiding bias across many types of situations, whereas extrinsically motivated people are only likely to be mindful if situational norms demand it.
A Caveat
In our research, we manipulated sexual orientation quite subtly by briefly indicating in the confederate’s résumé that he belonged to a gay students’ alliance on campus (gay role) or a fraternity (heterosexual role). Nevertheless, one may argue that this manipulation might have led participants to perceive the gay confederate as politically active, which in turn might have biased their behavior. Similarly, one may argue that the heterosexual confederate might have been perceived as stereotypical because he was a fraternity member, which also might have biased participants’ behavior. Although both possibilities may have introduced nonsystematic error variance in the behavioral data, these critiques do not provide a clear alternative explanation that accounts for the specific interaction effect involving automatic prejudice, behavioral control, and egalitarian beliefs that was observed across two studies.
Conclusion
In summary, the present studies are the first to show that, although spontaneous behavior toward stigmatized others may be driven by automatically activated prejudice under some conditions, conscious processes such as the motivation to be egalitarian and behavioral control can circumvent the effect of automatic prejudice on outward behavior. In other words, these studies show that the relation between automatic attitudes and social behavior is malleable to the extent that the type of behavior under consideration can be shaped by downstream conscious processes such as egalitarian motivation and behavior control. Whereas previous research has shown that practice and vigilance can attenuate automatic biases in attitude activation (Kawakami, Dovidio, & Moll, 2000), the present research extends this logic by demonstrating that practice and vigilance can also attenuate discriminatory actions that typically unfold quickly in real time. Moreover, these data complement other research on motivation to control prejudice (e.g., Dunton & Fazio, 1997; Plant & Devine, 1998) by showing that such motivation is often rooted in people’s consciously held beliefs and values about equality. Conscious egalitarian beliefs can override the effect of automatically activated prejudice and prevent certain forms of behavioral bias toward outgroups. Whereas the present data illustrate that relatively spontaneous interpersonal actions can be modified by motivation and control, future research might investigate whether the effect of such conscious processes generalizes to other types of actions and decisions that are more constrained by cognitive load or time pressure.
References
Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Thousand Oaks, CA: Sage.
Badgett, M. V. L. (1996). Employment and sexual orientation: Disclosure and discrimination in the workplace. In A. L. Ellis & E. D. B. Riggle (Eds.), Sexual identity on the job: Issues and services (pp. 29–52). New York: Haworth Press.
Banse, R., Seise, J., & Zerbes, N. (2001). Implicit attitudes towards homosexuality: Reliability, validity, and controllability of the IAT. Zeitschrift für Experimentelle Psychologie, 48, 145–160.
Bem, S. L. (1974). The measurement of psychological androgyny. Journal of Consulting and Clinical Psychology, 42, 155–162.
Bem, S. L. (1981). Gender schema theory: A cognitive account of sex typing. Psychological Review, 88, 354–364.
Bem, S. L. (1984). Androgyny and gender schema theory: A conceptual and empirical integration. In T. B. Sonderegger (Ed.), Nebraska Symposium on Motivation: Psychology and gender (pp. 179–226). Lincoln: University of Nebraska Press.
Blair, I. V. (2002). The malleability of automatic stereotypes and prejudice. Personality and Social Psychology Review, 6, 242–261.
Britton, D. M. (1990). Homophobia and homosociality: An analysis of boundary maintenance. The Sociological Quarterly, 31, 423–439.
Cafferata, P., Horn, M. I., & Wells, W. D. (1997). Gender role changes in the United States. In L. R. Kahle & L. Chiagouris (Eds.), Values, lifestyles, and psychographics (pp. 249–261). Mahwah, NJ: Erlbaum.
Chung, Y. B., & Katayama, M. (1996). Assessment of sexual orientation in lesbian/gay/bisexual studies. Journal of Homosexuality, 30, 49–62.
Coleman, E. (1987). Assessment of sexual orientation. Journal of Homosexuality, 14, 9–24.
Dasgupta, N. (2004). Implicit ingroup favoritism, outgroup favoritism, and their behavioral manifestations. Social Justice Research, 17, 143–169.
Dasgupta, N., & Asgari, S. (2004).
Seeing is believing: Exposure to counterstereotypic women leaders and its effect on automatic gender stereotyping. Journal of Experimental Social Psychology, 40, 642–658.
Dasgupta, N., & Greenwald, A. G. (2001). On the malleability of automatic attitudes: Combating automatic prejudice with images of admired and disliked individuals. Journal of Personality and Social Psychology, 81, 800–814.
Deaux, K., & Kite, M. E. (1987). Thinking about gender. In B. B. Hess & M. M. Ferree (Eds.), Analyzing gender: A handbook of social science research (pp. 92–117). Thousand Oaks, CA: Sage.
Deaux, K., & Lewis, L. (1984). Structure of gender stereotypes: Interrelationships among components and gender label. Journal of Personality and Social Psychology, 46, 991–1004.
DePaulo, B. M., & Friedman, H. S. (1998). Nonverbal communication. In D. T. Gilbert & S. T. Fiske (Eds.), The handbook of social psychology: Vol. 2 (4th ed., pp. 3–40). New York: McGraw-Hill.
Dovidio, J. F. (2001). On the nature of contemporary prejudice: The third wave. Journal of Social Issues, 57, 829–849.
Dovidio, J. F., Kawakami, K., & Gaertner, S. L. (2002). Implicit and explicit prejudice and interracial interaction. Journal of Personality and Social Psychology, 82, 62–68.
Dovidio, J. F., Kawakami, K., Johnson, C., Johnson, B., & Howard, A. (1997). On the nature of prejudice: Automatic and controlled processes. Journal of Experimental Social Psychology, 33, 510–540.
Dunton, B. C., & Fazio, R. H. (1997). An individual difference measure of motivation to control prejudiced reactions. Personality and Social Psychology Bulletin, 23, 316–326.
Eagly, A. H., & Mladinic, A. (1989). Gender stereotypes and attitudes toward women and men. Personality and Social Psychology Bulletin, 15, 543–558.
Ellis, A. L., & Riggle, E. D. B. (1996). Sexual identity on the job: Issues and services. New York: Haworth Press.
Fazio, R. H. (1990). Multiple processes by which attitudes guide behavior:
The MODE model as an integrative framework. In M. P. Zanna (Ed.), Advances in experimental social psychology: Vol. 23 (pp. 75–109). Orlando, FL: Academic Press.
Fazio, R. H., Jackson, J. R., Dunton, B. C., & Williams, C. J. (1995). Variability in automatic activation as an unobtrusive measure of racial attitudes: A bona fide pipeline? Journal of Personality and Social Psychology, 69, 1013–1027.
Fazio, R. H., & Towles-Schwen, T. (1999). The MODE model of attitude-behavior processes. In S. Chaiken & Y. Trope (Eds.), Dual process theories in social psychology (pp. 97–116). New York: Guilford Press.
Gaertner, S. L., & Dovidio, J. F. (1986). The aversive form of racism. In J. F. Dovidio & S. L. Gaertner (Eds.), Prejudice, discrimination, and racism (pp. 61–89). San Diego, CA: Academic Press.
Gangestad, S. W., & Snyder, M. (2000). Self-monitoring: Appraisal and reappraisal. Psychological Bulletin, 126, 530–555.
Glick, P., & Fiske, S. T. (1996). The Ambivalent Sexism Inventory: Differentiating hostile and benevolent sexism. Journal of Personality and Social Psychology, 70, 491–512.
Greenwald, A. G., & Banaji, M. R. (1995). Implicit social cognition: Attitudes, self-esteem, and stereotypes. Psychological Review, 102, 4–27.
Greenwald, A. G., Banaji, M. R., Rudman, L. A., Farnham, S. D., Nosek, B. A., & Mellott, D. S. (2002). A unified theory of implicit attitudes, stereotypes, self-esteem, and self-concept. Psychological Review, 109, 3–25.
Greenwald, A. G., McGhee, D. E., & Schwartz, J. L. K. (1998). Measuring individual differences in implicit cognition: The Implicit Association Test. Journal of Personality and Social Psychology, 74, 1464–1480.
Hebl, M. R., Foster, J. B., Mannix, L. M., & Dovidio, J. F. (2002). Formal and interpersonal discrimination: A field study of bias toward homosexual applicants. Personality and Social Psychology Bulletin, 28, 815–825.
Helmreich, R. L., Spence, J. T., & Gibson, R. H. (1982). Sex role attitudes: 1972–1980.
Personality and Social Psychology Bulletin, 9, 656–663.
Herek, G. M. (1984). Attitudes toward lesbians and gay men: A factor analytic study. Journal of Homosexuality, 10, 39–51.
Herek, G. M. (1988). Heterosexuals’ attitudes toward lesbians and gay men: Correlates and gender differences. Journal of Sex Research, 25, 451–477.
Herek, G. M. (1994). Assessing attitudes toward lesbians and gay men: A review of empirical research with the ATLG scale. In B. Greene & G. M. Herek (Eds.), Lesbian and gay psychology (pp. 206–228). Thousand Oaks, CA: Sage.
Herek, G. M. (2000). Sexual prejudice and gender: Do heterosexuals’ attitudes toward lesbians and gay men differ? Journal of Social Issues, 56, 251–266.
Herek, G. M., & Capitanio, J. P. (1996). “Some of my best friends”: Intergroup contact, concealable stigma, and heterosexuals’ attitudes toward gay men and lesbians. Personality and Social Psychology Bulletin, 22, 412–424.
Huddy, L., Neely, F., & Lafay, M. R. (2000). The polls–trends: Support for the women’s movement. Public Opinion Quarterly, 64, 309–350.
Hudson, W. W., & Ricketts, W. A. (1980). A strategy for the measurement of homophobia. Journal of Homosexuality, 5, 357–372.
Jean, P. J., & Reynolds, C. R. (1980). Development of the Bias in Attitudes Survey: A sex-role questionnaire. Journal of Psychology: Interdisciplinary and Applied, 104, 269–277.
Jöreskog, K. G., & Sörbom, D. (1996). LISREL 8: User’s reference guide. Chicago: Scientific Software International.
Kawakami, K., Dovidio, J. F., & Moll, J. (2000). Just say no (to stereotyping): Effects of training in the negation of stereotypic associations on stereotype activation. Journal of Personality and Social Psychology, 78, 871–888.
Kinsey, A. C., Pomeroy, W. B., & Martin, C. E. (1948). Sexual behavior in the human male. Oxford, England: W. B. Saunders.
Kite, M. E., & Deaux, K. (1987). Gender belief systems: Homosexuality and the implicit inversion theory. Psychology of Women Quarterly, 11, 83–96.
Kite, M. E., & Whitley, B. E. Jr. (1996). Sex differences in attitudes toward homosexual persons, behaviors, and civil rights: A meta-analysis. Personality and Social Psychology Bulletin, 22, 336–353.
LaFrance, M. (1985). Postural mirroring and intergroup relations. Personality and Social Psychology Bulletin, 11, 207–217.
LaMar, L., & Kite, M. E. (1998). Sex differences in attitudes toward gay men and lesbians: A multidimensional perspective. Journal of Sex Research, 35, 189–196.
Lottes, I. L. (1993). Nontraditional gender roles and the sexual experiences of heterosexual college students. Sex Roles, 29, 645–669.
McBroom, W. H. (1987). Longitudinal change in sex role orientations. Differences between men and women. Sex Roles, 16, 439–452.
McConnell, A. R., & Leibold, J. M. (2001). Relations among the Implicit Association Test, discriminatory behavior, and explicit measures. Journal of Experimental Social Psychology, 37, 435–442.
Olson, M. A., & Fazio, R. H. (2004). Trait inferences as a function of automatically activated racial attitudes and motivation to control prejudiced reactions. Basic and Applied Social Psychology, 26, 1–11.
Plant, E. A., & Devine, P. G. (1998). Internal and external motivation to respond without prejudice. Journal of Personality and Social Psychology, 75, 811–832.
Portwood, S. G. (1995). Employment discrimination in the public sector based on sexual orientation: Conflicts between research evidence and the law. Law and Psychology Review, 19, 113–152.
Richeson, J. L., & Shelton, J. N. (2003). When prejudice does not pay: Effects of interracial contact on executive function. Psychological Science, 14, 287–290.
Ridgeway, C. L.
(1997). Interaction and the conservation of gender inequality: Considering employment. American Sociological Review, 62, 218–235.
Rivera, L. M., & Dasgupta, N. (2006). Traditional beliefs about gender and gender identity. Manuscript in preparation.
Rubenstein, W. B. (1996). Lesbians, gay men, and the law. In R. C. Savin-Williams & K. M. Cohen (Eds.), The lives of lesbians, gays, and bisexuals: Children to adults (pp. 331–344). Orlando, FL: Harcourt Brace.
Rudman, L. A., & Glick, P. (2001). Prescriptive gender stereotypes and backlash toward agentic women. Journal of Social Issues, 57, 743–762.
Schuman, H., Steeh, C., Bobo, L., & Kryson, M. (1997). Racial attitudes in America: Trends and interpretations. Cambridge, MA: Harvard University Press.
Sell, R. L. (1997). Defining and measuring sexual orientation: A review. Archives of Sexual Behavior, 26, 643–658.
Sherrill, K., & Yang, A. (2000). From outlaws to in-laws: Anti-gay attitudes thaw. Public Perspective, 11, 20–23.
Snell, W. E. (1986). The Masculine Role Inventory: Components and correlates. Sex Roles, 15, 443–455.
Snyder, M. (1974). Self-monitoring of expressive behavior. Journal of Personality and Social Psychology, 30, 526–537.
Snyder, M., & Gangestad, S. (1986). On the nature of self-monitoring: Matters of assessment, matters of validity. Journal of Personality and Social Psychology, 51, 125–139.
Spence, J. T. (1993). Gender-related traits and gender ideology: Evidence for a multifactorial theory. Journal of Personality and Social Psychology, 64, 624–635.
Spence, J. T., & Hahn, E. D. (1997). The Attitudes Toward Women Scale and attitude change in college students. Psychology of Women Quarterly, 21, 17–34.
Spence, J. T., & Helmreich, R. L. (1972). The Attitudes Toward Women Scale: An objective instrument to measure attitudes towards the rights and roles of women in contemporary society. JSAS: Catalog of Selected Documents in Psychology, 2, 66–67.
Spence, J. T., Helmreich, R. L., & Stapp, J. (1974). The Personal Attributes Questionnaire: A measure of sex role stereotypes and masculinity-femininity. Catalog of Selected Documents in Psychology, 4, 43–44.
Stark, L. P. (1991). Traditional gender role beliefs and individual outcomes: An exploratory analysis. Sex Roles, 24, 639–650.
Steffens, M. C., & Buchner, A. (2003). Implicit Association Test: Separating transsituationally stable and variable components of attitudes toward gay men. Experimental Psychology, 50, 1618–3169.
Stohlberg, S. G. (2002, March 21). Minorities get inferior care, even if insured, study finds. The New York Times, p. 1.
Swim, J. K., Aikin, K. J., Hall, W. S., & Hunter, B. A. (1995). Sexism and racism: Old-fashioned and modern prejudices.
Journal of Personality and Social Psychology, 68, 199–214.
Towles-Schwen, T., & Fazio, R. H. (2003). Choosing social situations: The relation between automatically activated racial attitudes and anticipated comfort interacting with African Americans. Personality and Social Psychology Bulletin, 29, 170–182.
Wilson, T. D., Lindsey, S., & Schooler, T. Y. (2000). A model of dual attitudes. Psychological Review, 107, 101–126.
Wittenbrink, B., Judd, C. M., & Park, B. (2001). Spontaneous prejudice in context: Variability in automatically activated attitudes. Journal of Personality and Social Psychology, 81, 815–827.
Yang, A. S. (1997). The polls—trends: Attitudes toward homosexuality. Public Opinion Quarterly, 61, 477–507.
Appendix
Traditional Beliefs about Gender and Gender Identity Scale
1. It’s important that men appear masculine and that women appear feminine.
2. It is inappropriate for a man to use clear nail polish on his fingernails.
3. If the aims of women’s liberation are met, men will lose more than they will gain.
4. A woman needs the support of a man to advance professionally.
5. Children raised by single mothers are usually worse off compared to children raised by married couples.
6. Men who end up gay probably didn’t have strong male role models during their childhood.
7. A man who is vulnerable is a sissy.
8. Openly expressing my affection to another person of my own sex is difficult for me because I don’t want others to think I’m gay.
9. I would feel comfortable attending social functions where the majority of people are homosexuals of my own sex. (R)
10. I would feel comfortable knowing that members of my sex found me attractive. (R)
11. If a member of my sex made a sexual advance toward me I would feel angry.
12. I would be comfortable if I found myself attracted to a member of my sex. (R)
13. I would feel nervous being in a group of homosexuals of my own sex.
14. I would feel at ease conversing alone with a homosexual person of my own sex. (R)
15. I would feel comfortable with being labeled as homosexual. (R)
Note. Items 1–8 assess traditional beliefs about gender; Items 9–15 assess traditional gender identity. When presented to participants, these items were randomly intermixed. (R) indicates reverse-coded items. Five of the above items were borrowed from existing scales (Hudson & Ricketts, 1980; Jean & Reynolds, 1980; Snell, 1986).
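The reverse-coding rule in the Note can be illustrated with a short scoring sketch. The Appendix does not state the response format or how item scores were combined, so the 1–7 rating range, the use of means, and every name below (score_scale, REVERSE_CODED, and so on) are assumptions made for illustration only, not the authors' procedure.

```python
# Illustrative scoring sketch for the 15-item scale in the Appendix.
# Assumptions (not stated in the Appendix): a 1-7 response format and
# mean-based subscale scores. All names here are hypothetical.

REVERSE_CODED = {9, 10, 12, 14, 15}   # items marked (R) in the Appendix
BELIEF_ITEMS = range(1, 9)            # Items 1-8: traditional beliefs about gender
IDENTITY_ITEMS = range(9, 16)         # Items 9-15: traditional gender identity


def score_scale(responses, scale_min=1, scale_max=7):
    """Return mean scores after reverse-coding the (R) items.

    responses maps item number (1-15) to a raw rating.
    """
    if set(responses) != set(range(1, 16)):
        raise ValueError("expected ratings for items 1 through 15")
    for item, rating in responses.items():
        if not scale_min <= rating <= scale_max:
            raise ValueError(f"item {item}: rating {rating} is out of range")

    # Reverse-coding flips a rating around the scale midpoint.
    recoded = {
        item: (scale_min + scale_max - rating) if item in REVERSE_CODED else rating
        for item, rating in responses.items()
    }

    def mean(items):
        items = list(items)
        return sum(recoded[i] for i in items) / len(items)

    return {
        "beliefs": mean(BELIEF_ITEMS),
        "identity": mean(IDENTITY_ITEMS),
        "total": mean(range(1, 16)),
    }
```

Under these assumptions, a respondent who rates every item 1 keeps a 1 on the ten non-reversed items and receives a 7 on each of the five reversed items, so higher scores consistently indicate more traditional beliefs and identity.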
Received May 26, 2005 Revision received December 9, 2005 Accepted December 14, 2005
Journal of Personality and Social Psychology 2006, Vol. 91, No. 2, 281–294
Copyright 2006 by the American Psychological Association 0022-3514/06/$12.00 DOI: 10.1037/0022-3514.91.2.281
Going Along Versus Going Alone: When Fundamental Motives Facilitate Strategic (Non)Conformity
Vladas Griskevicius, Noah J. Goldstein, Chad R. Mortensen, Robert B. Cialdini, and Douglas T. Kenrick
Arizona State University
Three experiments examined how 2 fundamental social motives—self-protection and mate attraction—influenced conformity. A self-protective goal increased conformity for both men and women. In contrast, the effects of a romantic goal depended on sex, causing women to conform more to others’ preferences while engendering nonconformity in men. Men motivated to attract a mate were particularly likely to nonconform when (a) nonconformity made them unique (but not merely a member of a small minority) and when (b) the topic was subjective versus objective, meaning that nonconformists could not be revealed to be incorrect. These findings fit with a functional evolutionary model of motivation and behavior, and they indicate that fundamental motives such as self-protection and mate attraction can stimulate specific forms of conformity or nonconformity for strategic self-presentation.
Keywords: nonconformity, mating goals, fear, self-presentation, social influence
Whenever you find yourself on the side of the majority, it is time to pause and reflect. —Mark Twain
Imagine that Solomon, a young professor, and three of his male colleagues meet for dinner at a new restaurant. Inspecting the slate of delectable dishes on the menu, the young professor soon finds himself in a dilemma: What should he order? His new colleagues, however, are unanimous in their selections: Eerily reminiscent of a scene from a classic social psychological study, one by one, each man confidently orders the same item. Considering the choices of the group, how do you think Solomon will order?
Over half a century of research on conformity informs us that people are heavily influenced by the actions and beliefs of others (Asch, 1956; Cialdini, Reno, & Kallgren, 1990; Moscovici, 1985; Sherif, 1936). Given that the young professor is likely motivated to gain the approval of his colleagues (Baumeister & Leary, 1995) and to make a good decision (White, 1959), conformity would help him realize each of these general goals (Cialdini & Trost, 1998; Deutsch & Gerard, 1955; Goldstein & Cialdini, in press). In fact, the restaurant predicament is teeming with factors that make conformity especially probable: The decision is public (Argyle, 1957; Campbell & Fairey, 1989); the professor finds the group desirable (Dittes & Kelley, 1956); the group is composed of no fewer than three individuals (Asch, 1956; Milgram, Bickman, & Berkowitz, 1969); the group’s opinion is unanimous (Asch, 1956); the other group members are similar to the professor (Festinger, 1954; Goldstein, Cialdini, & Griskevicius, 2006; Hornstein, Fisch, & Holmes, 1968); and he is uncertain about his decision (Tesser, Campbell, & Mickler, 1983).
However, what if, in the process of ordering, the young professor’s attention is suddenly drawn to the beautiful waitress awaiting his selection? Despite the presence of numerous factors known to spur conformity, going along with his rivals in front of a potential mate is unlikely to draw her attention or impress her (Buss, 2003; Gangestad & Simpson, 2000). In fact, the goal of attracting a romantic partner may be more effectively served through deliberate nonconformity, which can make a man stand out as independent and assertive (Baumeister & Sommer, 1997; Simpson, Gangestad, Christensen, & Leck, 1999). Now consider what would happen if the group was composed of young women who were being served by an attractive male waiter. Would a woman dining with her female colleagues also nonconform when she is motivated to attract a potential romantic partner?
Sizable literatures indicate that people harbor predilections both to stand out and to fit in (e.g., Brewer, 1991; Maslach, Stapp, & Santee, 1985; Snyder & Fromkin, 1980). Given these often-competing tendencies, the present research examines how certain powerful human motives can influence people’s tendency to stand out through nonconformity or to fit in through conformity. More specifically, in three experiments, we investigate how conformity and nonconformity may be influenced by two fundamental social motives: the goal to attract a mate and the goal to protect oneself from danger (Kenrick, Li, & Butner, 2003; Maner et al., 2005). In addition to examining potential sex differences, the studies also aim to elucidate the psychological processes by which fundamental motives can elicit differential tendencies to conform.
Vladas Griskevicius, Noah J. Goldstein, Chad R. Mortensen, Robert B. Cialdini, and Douglas T. Kenrick, Department of Psychology, Arizona State University. This research was facilitated by National Science Foundation Graduate Research Fellowships awarded to Vladas Griskevicius and Noah J. Goldstein and by National Institutes of Health Grant 5R01MH64734 awarded to Douglas T. Kenrick. We thank Josh Ackerman for his helpful comments on a previous version of this article. Correspondence concerning this article should be addressed to Vladas Griskevicius, Department of Psychology, Arizona State University, Tempe, AZ 85287-1104. E-mail:
[email protected]
Conformity and Motivation
Conformity is behavioral change designed to match or imitate the beliefs, expectations, or behaviors of real or imagined others (Cialdini & Trost, 1998). Decades of research have shown that conformity is highly prevalent (see Cialdini & Goldstein, 2004)
and that the tendency to imitate is sometimes so swift and mindless that it is almost automatic (Bremner, 2002; Chartrand & Bargh, 1999; Gopnik, Meltzhoff, & Kuhl, 1999). One reason why conformity is so ubiquitous is that it is often adaptive: Following others often leads to better and more accurate decisions, especially when we face uncertainty (Cialdini, 2001; Crutchfield, 1955; Mackie, 1987). This kind of accuracy-based conformity is known as informational influence (Deutsch & Gerard, 1955), and it persists because in many cases it is the most efficient form of behaving (Gigerenzer & Todd, 1999). Consistent with the underlying accuracy function of informational influence, when people have an elevated motivation to be accurate and find themselves in relatively ambiguous situations, conformity becomes increasingly likely (Baron, Vandello, & Brunsman, 1996; Levine, Higgins, & Choi, 2000). A second underlying reason why people tend to conform is that going along with or mimicking another person tends to produce liking (Chartrand & Bargh, 1999; Hatfield, Cacioppo, & Rapson, 1993). This kind of approval-based conformity is known as normative influence (Deutsch & Gerard, 1955), and it serves to facilitate the goal of affiliation (Baumeister & Leary, 1995; Insko, Drenan, Solomon, Smith, & Wade, 1983; Martin & Hewstone, 2003). Normative influence is especially potent because people who deviate from the group are more likely to be punished, ridiculed, and even rejected by other group members (Janes & Olson, 2000; Kruglanski & Webster, 1991; Levine, 1989; Miller & Anderson, 1979; Schachter, 1951). For example, in the classic Asch (1956) line studies, participants tended to conform with the group not necessarily because they believed the consensus of the group reflected the correct response but often because it was easier to go with the crowd than to face the consequences of going against it (Crutchfield, 1955). 
Correspondingly, when people have a heightened desire to affiliate with a group, mimicry tends to increase (e.g., Lakin & Chartrand, 2003). Although conformity can confer numerous benefits on an individual, nonconformity can also be advantageous (e.g., Argyle, 1957; Hollander, 1958). Nonconformity includes two types of behavior: (a) independence, or resisting influence; and (b) anticonformity, or rebelling against influence (Nail, MacDonald, & Levy, 2000; Willis, 1963). Both types of nonconformity tend to be effective in differentiating people from others, which can satisfy a need for individuation or uniqueness (Maslach et al., 1985; Snyder & Fromkin, 1980). For example, when a person’s uniqueness is threatened by an encounter with a highly similar individual, such a situation increases the tendency to nonconform (Duval, 1972; Weir, 1971, as cited in Snyder & Fromkin, 1980). Given that both conformity and nonconformity can be beneficial, this duality raises an important question: What contexts will lead to the emergence of conformity, and what situations will facilitate nonconformity? The answer may depend on the person’s currently active goal.
Fundamental Social Motives
Our perceptions, cognitions, and behavior are profoundly influenced—both consciously and nonconsciously—by a large variety of goals and need states (e.g., Bargh, 1990; Chartrand & Bargh, 2002; Simpson et al., 1999). From an evolutionary perspective, the goals and motives having the most immediate impact on behavior are likely to be those that, over the course of human evolutionary history, have been most closely linked to adaptive outcomes in social groups, such as attracting and retaining mates, protecting oneself from danger, and attaining and maintaining status (Bugental, 2000; Kenrick, Li, & Butner, 2003). Empirical investigations based on this perspective have addressed various questions in psychology and have found evidence consistent with this framework (e.g., Cosmides & Tooby, 1992; Gangestad & Simpson, 2000; Haselton & Buss, 2000; Maner et al., 2005; Todd & Gigerenzer, 2000). Although there are good theoretical reasons to believe that an evolutionary perspective could enrich the understanding of social influence processes, there is thus far almost no empirical work that has done so (Sundie, Cialdini, Griskevicius, & Kenrick, in press). The present research aimed to bridge social influence research and evolutionary psychological models by examining how two fundamental social motives—protecting oneself from harm and seeking a romantic partner—influence people’s tendency to conform. Self-protection and mating goals are central to survival and reproduction, and as we discuss below, each goal may lead to different patterns of responding to social influence attempts.
Self-Protective Motivation and Conformity
We are here today because our ancestors were successful at navigating through the dangers posed by everyday life, making decisions that served their self-protective interests. A long history of research suggests that stimuli indicating the presence of danger acutely activate a self-protective goal and an associated pattern of affect (Plutchik, 1980); this goal then efficiently facilitates perceptions, cognitions, and behaviors associated with greater survival success in ancestral environments (Maner et al., 2005; Öhman & Mineka, 2001; Schaller, 2003; Schaller et al., 2004). Many self-protective behaviors involve group-cohesive processes (Taylor et al., 2000). To increase the probability of survival, many species of animals, for instance, often strategically mimic others (Wickler, 1968), and individuals tend to herd together to be less conspicuous when threatened by a predator (Hamilton, 1971). Mimicry and imitation have been posited to serve a similar safety-enhancing function in humans (Dijksterhuis, Bargh, & Miedema, 2000), suggesting that a motive to protect oneself from danger may facilitate actions designed to avoid standing out of a crowd. Dangerous situations also induce stress and anxiety, which tend to increase the need to affiliate in both human and nonhuman animals (e.g., Schachter, 1959; Taylor et al., 2000). The need to affiliate in times of danger is consistent with findings from terror management theory, which show that people’s desire to affiliate tends to increase after they consider the frightening thought of their own death (Pyszczynski, Greenberg, & Solomon, 1997; Wisman & Koole, 2003). In summary, research in several areas suggests that when a self-protective motive is active, people should be more likely to go along with the group either to affiliate or to avoid drawing attention to themselves.
Mate-Attraction Motivation and Conformity
Survival is necessary, but not sufficient, for evolutionary success. Besides surviving, our ancestors were also all successful at reproduction. Not surprisingly, people’s cognitions and behaviors are strongly affected by motivational states specifically linked to reproduction. Stimuli indicating the potential for reproductive success tend to activate a mating goal and its associated affective
responses (Scott, 1980); this goal in turn facilitates perceptions, cognitions, and behaviors associated with greater mating success in ancestral environments (Griskevicius, Cialdini, & Kenrick, 2006; Maner et al., 2005; Roney, 2003; Wilson & Daly, 2004). One key to successfully attracting a mate is taking opportunities to positively differentiate oneself from one’s rivals (Buss, 2003); and nonconforming can be an effective method to attract attention and to show a distinction between a person and the larger group (Ridgeway, 1978; Schachter, 1951). Thus, it is possible that a mating motive could lead people to go against the group in order to stand out. Because men and women tend to prefer slightly different characteristics in a romantic partner, men and women seeking to attract a mate may also differ in exactly how and to what extent they will attempt to stand out from their rivals (Barkow, 1989). Traits that women prefer in a mate include willingness to take risks, decisiveness, assertiveness, independence, and general characteristics of leadership (Buss, 2003; Sadalla, Kenrick, & Vershure, 1987). Notably, these are all characteristics that can be conveyed by nonconforming with a group of potential rivals (e.g., by disagreeing with the group). In contrast, traits that men prefer in a mate focus less on social dominance and more on agreeableness and the mate’s ability to facilitate group cohesion (Campbell, 2002). Not only may the successful display of these traits be undermined by going against the group, but conforming more to the group may actually lead a woman to appear more agreeable while facilitating group cohesiveness. Consistent with these differentially preferred characteristics in men and women, research indicates that women are more concerned than men about the quality of interpersonal relationships, group cohesiveness, and the development of shared norms in a group (Eagly, 1978; Eder & Sandford, 1986). 
Correspondingly, not only do men have a higher drive to display independence and distinctiveness in a group (Baumeister & Sommer, 1997; Cross & Madson, 1997), but women are much quicker to shun female group mates who act against group norms (Goodwin, 1990). Thus, given differing mate preferences for men and women, it is likely that a motive to attract a mate should produce nonconformity for men, but a mate-attraction motive should actually produce more conformity for women.
Positive and Negative Group Judgments

When one faces the choice of publicly going along with or going against the preferences of the group, this decision is likely to depend on the nature of the group’s preference. Consider, for example, a situation in which a person is visually inspecting an unusual painting at a museum with a group of acquaintances. Before the person decides to conform to or deviate from the group’s opinion of the painting, it may be important for him or her to consider first whether the others’ consensus is that they like or dislike the painting, that is, whether the group judgment is positive or negative. For the individual in the museum, stating that he likes a unique painting is likely to convey positive dispositional information (i.e., “I am generally positive about novel things like paintings”), whereas stating that he dislikes the painting may convey a negative disposition (i.e., “I am generally negative about novel things like paintings”). Given that a mating motive is likely to make people sensitive to self-presentation (Leary, 1995; Schlenker, 2003), and given that
both sexes value some degree of agreeableness in a mate (Green & Kenrick, 1994), mating motives are likely to lead both men and women to present themselves as positive and likable individuals. However, the ability to convey positive dispositional information through conformity or nonconformity hinges on whether the judgment of the group is positive or negative. Consider again the museum situation from a man’s perspective. If the group decries the painting as plebeian and amateur (a negative judgment), the man can convey a positive disposition by going against the group. However, if the group praises the painting’s penetrating genius (a positive judgment), going against the group does not convey a positive disposition. Thus, although a mate-attraction motive should produce male nonconformity when the group judgment is negative (thereby allowing a man to convey both independence and positive dispositional information by going against the group), the effects of the mating motive should be muted for men when the group judgment is positive (resulting in a conflict between wanting to appear independent and to appear positive).

Whether the group judgment is positive or negative should also influence when mating motives lead women to conform more. When the group judgment is positive, a woman can convey a positive disposition by going along with the group. However, when the group judgment is negative, going along with the group does not convey positive information. Thus, although a mate-attraction motive should lead women to conform more when the group judgment is positive (thereby allowing a woman to convey positive dispositional information by going along with the group), the effects of the mating motive for women should be muted when the group judgment is negative.
Study 1

The initial study examined how two fundamental social goals, a motive for self-protection and a motive to attract a mate, influence men’s and women’s tendency to conform in a same-sex group (as compared with people primed with neutral motives). Self-protection and mate-attraction motives were primed through short imagination scenarios. Afterward, conformity was measured by the degree to which the positive versus negative judgment of the group influenced participants’ ratings of a painting (see Mucchi-Faina, Maass, & Volpato, 1991). We hypothesized that, when a self-protective mindset was primed, men’s and women’s conformity would increase. Moreover, this increase in conformity was predicted to persist regardless of whether the group judgment was positive or negative. Regarding mate-attraction motives, different predictions were made for men and women. For men, we predicted that a mating mindset would produce nonconformity primarily when the group judgment is negative, which would enable men who go against the group to appear independent and convey a positive disposition. For women, we predicted that a mating motive should produce more conformity primarily when the group judgment is positive, which would allow women who go along with the group to appear more agreeable and convey a positive disposition.
Method

Participants

Two hundred thirty-seven participants (113 male, 124 female) were recruited from introductory psychology classes as partial fulfillment of their class requirement. All participants came to the lab in same-sex groups of 3–6 and were seated at private computers that were visually shielded from others by partitions. The mean age for women was 19.2 (SD = 1.6), and the mean age for men was 19.8 (SD = 1.9).
Design and Procedure

The study design was a between-participants 2 (participant sex) × 4 (motive prime: mate attraction, self-protection, “scenario” control, or “no-prime” control) × 2 (group judgment: positive vs. negative) design. In the first part of the study, participants rated the attractiveness of multiple images that they believed were used to establish their aesthetic preferences. After the ratings, they underwent one of the four priming manipulations. After the prime, participants entered a computer chat room with 3 same-sex individuals with whom they believed they would later have a face-to-face discussion on aesthetic preferences. In the chat room, they publicly rated one of the images that they had previously rated on how interesting or uninteresting they believed it to be. Half of the time the ratings of the other 3 group members were programmed to be positive, and half of the time the group judgment was negative. The chat room was arranged so that the participant was always the last person in the group to provide a public rating.

Conformity measure. The purpose of the first part of the study was to ascertain the participants’ actual private preferences for a specific artistic image that would later serve as the key image of interest in the chat room (with the initial private rating of the image serving as a covariate for the chat-room rating of the image). To reduce pressures to be consistent between the private and the public ratings, and to decrease possible suspiciousness, participants also rated 39 distracter images on the extent to which they thought each image was interesting. The images were collected from the Internet and consisted of various complex and simple graphic artistic designs and abstract paintings. Ratings were provided on a 9-point scale ranging from 1 (not at all interesting) to 9 (very interesting).
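As a reading aid, the between-participants crossing named at the start of this section can be enumerated programmatically. The following is a minimal sketch, not the authors' procedure: the factor labels are taken from the text, while the function name `design_cells` and the even-allocation figure are illustrative assumptions (the article does not report per-cell sample sizes).

```python
from itertools import product

# Factor levels as named in the Design and Procedure section.
SEX = ("male", "female")
MOTIVE = ("mate attraction", "self-protection", "scenario control", "no-prime control")
GROUP_JUDGMENT = ("positive", "negative")


def design_cells():
    """Return every cell of the 2 x 4 x 2 between-participants design."""
    return list(product(SEX, MOTIVE, GROUP_JUDGMENT))


cells = design_cells()
print(len(cells))  # number of cells in the full crossing

# Hypothetical: average cell size if the 237 participants had been
# spread evenly; actual cell sizes are not given in the article.
print(237 / len(cells))
```

Each tuple in `cells` identifies one condition a participant could have been assigned to, for example `("male", "self-protection", "negative")`.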
Participants were led to believe that the 40 images were part of a much larger set and that other participants were likely rating a different set of images. Their ratings for the 40 images indicated a wide range of preferences. However, the mean rating for the key image was 5.00 (SD = 1.71), which was at the midpoint of the scale.

After the private ratings, participants were informed that there was another group of participants in a different room that was also currently working on the same study. They were then told that they had been randomly assigned to a group of four same-sex participants from the two rooms, and the group was linked together by computer in a virtual chat room. Participants were told that in the second half of the study, all 4 members of their group would meet face to face to discuss their individual aesthetic preferences. The chat room was ostensibly the first step in the group discussion and served to publicly ascertain everyone’s aesthetic preferences, which would be the focus of the later discussion. This part of the procedure was designed to ensure that participants were accountable for their responses in the chat room because they might later need to justify their responses in the face-to-face discussion.

In the chat room, participants again rated their preferences for the key image. They were led to believe that the image was randomly chosen by the computer and that it might not have been previously seen by them or their 3 group mates. However, it was arranged so that, as participants rated the image, they could see on the screen the ratings of their group members, who were programmed to provide their ratings before the participants. Half of the time, the group judgment was positive (8, 8, 7), indicating that they thought the image was highly interesting; the other half of the time, group judgment was negative (2, 2, 3), indicating that they thought the image was very uninteresting.
The rating of the image constituted the dependent measure of the study. Given that participants had no prior interaction with their group mates, their public rating of the image in the chat room was the first piece of information they conveyed about themselves to the group.

Priming procedure. Just before participants entered the chat room, they underwent a focusing task that served as the motive prime manipulation. In the task, they read one of three short scenarios that were designed to activate a self-protection, a mate-attraction, or a neutral motive. Each of the three scenarios was of similar length (about 850 words) and contained the same instructions: “Please carefully read the following scenario. As you read, try to imagine yourself in the scenario and create a vivid mental picture.” In the self-protective scenario, participants imagined being in a house alone late at night. As the scenario progressed, they overheard scary noises outside and believed that someone had entered the house. After calling out and receiving no reply, the story ended as someone was about to enter the bedroom. In the mate-attraction scenario, participants imagined being on vacation with their friends. While on vacation, the reader met a highly desirable person of the opposite sex and spent a romantic day with the new romantic interest. The scenario ended as the two people were passionately kissing on a moonlit beach and feeling a strong desire to be with each other.¹

The study had two separate control conditions: a scenario control and a no-prime control. In the scenario control, participants read a scenario similar in length to the other two scenarios, except that it was devoid of threat- or romance-inducing content. In the control scenario, participants imagined getting ready to go to a much-anticipated concert with a same-sex friend. They imagined that, during the night of the show, they could not find the concert tickets. Later, the friend arrived with the tickets, and they both headed off to the show anticipating a delightful musical experience. In the no-prime control, participants went to the chat room without reading any scenario. The no-prime control was not expected to produce different levels of conformity, compared with the scenario control.
However, having both control conditions ensured that any potential differences in conformity between the control and the substantive motive conditions were not produced by the specific contents of the control scenario.

To assess whether the three different scenarios were effective at inducing the desired motives and their associated affective states, a separate group of 46 male and female participants underwent one of the three scenario prime manipulations. Immediately afterward, they indicated the extent to which they were experiencing threat, a desire to protect themselves, romantic arousal, and a desire to attract a romantic partner. Responses to these items were measured using 7-point Likert scales ranging from 1 (not at all) to 7 (very much). There were no interactions or main effects involving participant sex, indicating that the scenarios had a similar effect on men and women. As seen in Table 1, the self-protection scenario elicited significantly more feelings of threat and a stronger desire to protect oneself, compared with either the control condition or the mate-attraction condition (ps < .001). Conversely, the mate-attraction scenario elicited significantly more romantic arousal and a stronger desire to attract a romantic partner, compared with either the control condition or the self-protection condition (ps < .001). Thus, both the self-protection and the mate-attraction scenarios were effective at inducing the intended motives and associated affective states.
¹ The mate-attraction prime did not suggest that this encounter was a brief romantic fling, nor did the prime suggest that the encounter was the beginning of a meaningful relationship.

Table 1
Mean Self-Reported Affect and Motivation for All Motive Prime Scenarios

Affect/motivation item                 Control     Self-protection   Mate attraction
                                       (n = 16)    (n = 15)          (n = 15)
Threat
  M                                    2.00        5.20*             1.47
  SD                                   0.89        1.74              0.92
Desire to protect yourself
  M                                    2.31        5.53*             2.07
  SD                                   1.30        1.64              1.67
Romantic arousal
  M                                    1.63        1.53              5.00*
  SD                                   0.96        1.25              1.81
Desire to attract romantic partner
  M                                    1.94        1.20              5.33*
  SD                                   1.57        0.56              1.91

* p < .001, indicates difference from control scenario.

Results

We measured the extent of participants’ conformity by examining the degree to which their public ratings of the target image were influenced by the ratings of their 3 group mates. Half the time, the ratings of the group were high (8, 8, 7), indicating a positive group judgment; half the time, the ratings were low (2, 2, 3), indicating a negative group judgment. Conformity by the participants in the former case was signified by higher ratings; conformity in the latter case was signified by lower ratings. For the statistical analyses, all ratings were standardized, whereby a higher rating by participants always constituted more conformity, regardless of whether group judgment was positive or negative. The means for the conformity measure in the control conditions were all above the midpoint of 5.0, indicating that there was some degree of conformity in the control conditions as would be expected. Analyses indicated that there were no significant differences in conformity in either of the two control conditions between men and women. As expected, the two control conditions also did not significantly differ from one another, and the control conditions were thus combined for the remainder of the analyses. To test the specific hypotheses of the study, we performed a series of planned contrasts, all using the preimage ratings as covariates.

Conformity and Self-Protection

It was predicted that a self-protective prime (compared with a control condition) would produce a significant increase in conformity for both sexes. As seen on the left side of Figure 1, a planned contrast comparing conformity in the control and the self-protection conditions indicated that this was indeed the case, F(1, 160) = 4.78, p = .030, η² = .029. Also consistent with predictions, the effects of the self-protection prime did not differ for men and women, and the effects of the prime remained similar regardless of whether the group judgment was positive or negative (ps > .50). Thus, a state of threat produced an increase in conformity for both men and women, and this increase was unaffected by the valence of the group judgment.

Conformity and Mate Attraction

The effects of a mate-attraction prime (compared with a control) were predicted to be different for men and women. Consistent with this prediction, results indicated a significant two-way interaction with motive and participant sex, F(1, 177) = 6.33, p = .013, η² = .035. For men, it was predicted that a mating prime would produce less conformity, compared with the control, when group judgment was negative but not necessarily produce less conformity when group judgment was positive. Consistent with this prediction, results indicated a two-way interaction with motive and group
judgment for men, F(1, 173) = 6.62, p = .011, η² = .037. As seen on the right side of Figure 1, when group judgment was negative, a mate-attraction prime led men to conform significantly less than men in the control condition, F(1, 44) = 9.57, p = .003, η² = .179. However, when the group judgment was positive, there was no difference between the mating and the control conditions for men (p > .95). Thus, mating motives led men to go against the group specifically when group judgment was negative, meaning that nonconformity could be used to convey positive dispositional information.

For women, it was predicted that a mating prime would lead them to conform more primarily when group judgment was positive. Although the two-way interaction with motive and group judgment for women was not significant, F(1, 173) = 1.64, p = .20, as seen in Figure 1, women in the mating condition did conform somewhat more than women in the control condition when group judgment was positive, F(1, 41) = 3.61, p = .064, η² = .081. However, when group judgment was negative, the romantic prime had no effect on women’s conformity relative to the control (p > .70). Thus, a romantic mindset led women to conform somewhat more primarily when group judgment was positive, meaning that higher conformity could convey positive dispositional information about the women.

Figure 1. Effects of self-protection or mate-attraction motives on conformity depending on whether group judgment was positive versus negative (Study 1, adjusted means). Positive values denote an increase in conformity relative to the control; negative values denote a decrease in conformity relative to the control, or nonconformity. (Vertical axis: extent of conformity, from -2.00 to +1.00, with 0 equal to control. Horizontal axis: group evaluation, positive vs. negative, within each primed motive, self-protection and mate-attraction. Values are plotted separately for men and women.)
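The effect sizes reported alongside each F test in this section can be checked against the test statistics themselves. The sketch below assumes the reported values are partial eta squared, computed as F × df_effect / (F × df_effect + df_error); the article does not state the formula, but the recomputed values agree with the printed ones to three decimals. The function and variable names are illustrative, not the authors'.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    effect_term = f_value * df_effect
    return effect_term / (effect_term + df_error)


# (label, F, df_effect, df_error, effect size printed in the text)
STUDY1_TESTS = [
    ("self-protection vs. control", 4.78, 1, 160, 0.029),
    ("motive x participant sex", 6.33, 1, 177, 0.035),
    ("motive x group judgment, men", 6.62, 1, 173, 0.037),
    ("men, negative group judgment", 9.57, 1, 44, 0.179),
    ("women, positive group judgment", 3.61, 1, 41, 0.081),
]


def recompute(tests):
    """Pair each recomputed effect size (rounded to 3 decimals) with the reported one."""
    return [
        (label, round(partial_eta_squared(f, d1, d2), 3), reported)
        for label, f, d1, d2, reported in tests
    ]


for label, recomputed, reported in recompute(STUDY1_TESTS):
    print(f"{label}: recomputed {recomputed:.3f}, reported {reported:.3f}")
```

Because each contrast has a single numerator degree of freedom, the calculation reduces to F / (F + df_error), which makes the values easy to verify by hand.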
Discussion

Study 1 showed that temporarily activating different fundamental social motives produced different and theoretically meaningful tendencies toward conformity and nonconformity for men and women. As predicted, a motive to protect oneself from danger, even imagined danger, led both men and women to conform more. Being in a state of fear produced more conformity regardless of whether the group judgment was positive or negative. That is, participants conformed more regardless of whether they would be conveying positive or negative dispositional information.

In contrast to a self-protection goal, a motive to attract a mate not only produced different effects for men and women, but each of these effects was qualified by whether the group judgment was positive or negative. For men, a romantic prime produced nonconformity specifically when the judgment of the rest of the group was negative. That is, a mating motive led men to go against the group when nonconformity could convey positive information about the men (e.g., “I am the type of person who generally likes novel things and I am independent”). However, when the group judgment was positive and nonconformity could not be used to convey positive information, the power of the mating motive to engender nonconformity was muted. For women, a romantic prime produced a trend toward more conformity specifically when the judgment of the group was positive. That is, a mating motive led women to go along with the group somewhat more when conformity could convey positive information about them. However, when group judgment was negative and conformity could not convey positive information, the power of the mating motive to increase women’s conformity was muted. These findings are consistent with an evolutionary functional perspective of social influence (Sundie et al., in press).
It is also notable that a mating prime produced these sex-specific (non)conformity effects even when the group was composed of same-sex individuals. That is, even in a situation that did not directly involve attracting a mate, simply being in a mate-attraction mindset produced functional patterns of conformity. This finding suggests that priming a fundamental social motive, such as mate attraction, may activate a specific mental set that serves to facilitate cognitions and behaviors in a relatively automatic manner. It also is consistent with the possibility that males compete with one another for status, and that females are not so much directly attracted to the competitiveness per se but to the indirect result, that is, to status as reflected in relative standing among other males (Sadalla et al., 1987).
Study 2

The initial study showed that, when men were primed with a motive to attract a mate, they tended to go against the group (at least when group judgment was negative). Although this tendency to nonconform for men makes sense from a consideration of sex differences between mating and the desire to appear independent, men’s nonconformity is nevertheless puzzling. Given that conformity is generally adaptive because it leads to increased accuracy in decision making, men’s tendency to nonconform indicates that mating motives appear to lead men to behave less adaptively by disregarding any potential gains in accuracy afforded by conformity.

From a functional perspective, however, this perplexing dilemma might be better understood if one considers the content of the topic on which a person is likely to be nonconforming. A closer look at conformity and minority influence research reveals a potentially crucial distinction in the kinds of content that are generally used across studies: Sometimes the topic is subjective (e.g., preferences, opinions), and at other times it is objective (e.g., trivia questions; see Maass, Volpato, & Mucchi-Faina, 1996). Conformity pressures operate both when topics are subjective (e.g., Allen, 1975; Santee & Maslach, 1982) and when they are objective (e.g., Sherif, 1936). However, there is a key difference between the two types of content: A subjective quandary by definition does not have a verifiably correct answer, whereas an objective predicament does. For instance, consider the objective dilemma in the TV game show Who Wants to Be a Millionaire?: Contestants unsure of an answer to a multiple-choice question can poll audience members for their responses; although the audience is never unanimous, the response favored by the majority tends to be correct over 90% of the time, and it is almost always chosen by the contestant (Surowiecki, 2005).
A situation with an objectively optimal solution, therefore, introduces the powerful self-presentational consideration of being perceived as right (as well as the objective benefit of avoiding a faulty choice). Indeed, the self-presentational and objective benefits of being verifiably correct on an issue should outweigh those of being merely like or unlike the majority. After all, someone who nonconforms on a topic, but is shown to be objectively wrong in his or her choice, is hardly likely to make a favorable impression on a romantic candidate. Thus, we may expect that, in contrast to Study 1, in which the topic was subjective, when a topic has an objective, demonstrably correct position, mating motives should lead both men and women to conform more to the majority view, because the majority typically counsels correctly in such matters (Laughlin, Zander, Knievel, & Tan, 2003; Surowiecki, 2005).

Study 2 tested how a motive to attract a mate would influence men’s and women’s conformity on subjective versus objective topics (compared with participants primed with a neutral motive). Unlike in Study 1, in which the group could indicate a positive or a negative judgment, the conformity situations in the present study were constructed in a way that neutralized the role of whether (non)conformity would convey positive versus negative dispositional information. As in Study 1, it was predicted that when the topic was subjective, a mating prime would lead men to nonconform and would lead women to conform more. In contrast, when the topic was objective, it was predicted that a mating motive would lead to a general increase in conformity. To broaden the findings from the initial study, we used a new set of conformity measures. In addition, we examined whether the predicted effects would persist when responses were private and could not directly be seen by others.
Method

Participants

Sixty-nine participants (38 male, 31 female) were recruited from introductory psychology classes as partial fulfillment of a class requirement. As in Study 1, participants came in groups and were seated at a computer.
Design and Procedure

The study used a 2 (participant sex) × 2 (motive prime: mating vs. control) × 2 (topic: subjective vs. objective) mixed-factorial design; participant sex and prime were between-participants factors, and topic was a within-participants factor. After they entered the laboratory, participants underwent the same mating and control prime procedure from Study 1. (Given that there was no difference between the no-prime and scenario controls in Study 1, only the scenario control was used in Study 2.) After the prime manipulation, participants responded to a six-question survey in which they could see the percentages of previous survey takers who had selected certain responses. Participants’ responses to the survey items constituted the dependent measure of conformity in the study.

Of the six survey items, three items were subjective and three were objective. All of the subjective items asked participants for their preference between two choices that, within our sample population, were deemed relatively similar to each other: (a) a Mercedes-Benz or a BMW luxury car; (b) a silver or forest green car color; and (c) a Ferrari or a Lamborghini sports car. Asking participants to select a preference between two similarly desirable items enabled us to neutralize the positivity/negativity dimension that moderated the effects of the mating prime in Study 1. That is, in Study 2, neither conformity nor nonconformity could convey positive versus negative information about the participant. Each of the three objective items asked participants a factual question, and they were provided with two possible responses, one of which was correct: (a) Do you think it’s more expensive to live in New York City or in San Francisco? (b) Which airline has more on-time arrivals, Southwest or America West? (c) Which color shirt is better at keeping a person cool in the sun, green or blue?
These items were chosen because any given participant in our sample would generally not know the correct answer to these questions, but he or she should believe that a majority response would likely constitute the correct answer. All six items were presented in random order, and participants had to indicate their responses on a 7-point scale ranging from 1 (definitely Option A) to 7 (definitely Option B) at the endpoints. Participants were informed that over 100 students had already taken the survey and that the responses of previous students would be visible during the time of the survey. They were told that this information was simply a by-product of the survey software and that they should be free to ignore it. For each item, participants could see the percentages of respondents who had chosen either of the two possible options for a given question (e.g., 70%/30%). The percentages for the six items indicated that a substantial majority (between 72% and 89%) had selected one of the two responses.
The pairings of the majority responses with the specific survey items and the specific responses within each item were counterbalanced.
Results

As in the first study, all the counterbalanced items were standardized, whereby a higher number indicated more conformity regardless of which particular response was favored by the majority. A test of possible sex differences in the control condition indicated no significant differences in conformity for men and women. It was predicted that the mating prime would produce different patterns of conformity for men and women and that these patterns would be qualified by whether the topic was objective or subjective. Consistent with this prediction, a repeated-measures analysis of variance (ANOVA) with participant sex, motive, and topic produced a significant three-way interaction, F(1, 65) = 15.20, p < .001, η² = .190. To test the specific hypotheses of the study, we performed a series of planned contrasts.
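The direction coding described above, in which a higher number always means more conformity regardless of which response the majority favored, can be illustrated with simple reverse scoring. This is a minimal sketch under a stated assumption: the article says only that items were "standardized" and does not give the exact transformation, so the mirroring rule and the name `conformity_score` are hypothetical.

```python
def conformity_score(response, majority_high, scale_min=1, scale_max=7):
    """Recode a scale response so that higher values mean more conformity.

    response      -- the participant's rating on the original scale
    majority_high -- True if the majority favored the high end of the scale
    scale_min/max -- endpoints of the rating scale (7-point by default)

    Assumption: responses are mirrored around the scale midpoint when the
    majority favored the low end; the article does not specify this rule.
    """
    if not scale_min <= response <= scale_max:
        raise ValueError("response lies outside the rating scale")
    if majority_high:
        return response
    return scale_min + scale_max - response


# Study 2 style item: 1 (definitely Option A) to 7 (definitely Option B).
print(conformity_score(6, majority_high=True))   # majority chose Option B
print(conformity_score(2, majority_high=False))  # majority chose Option A

# Study 1 style rating on the 9-point interest scale with a negative group judgment.
print(conformity_score(2, majority_high=False, scale_min=1, scale_max=9))
```

Under this coding, a response at the scale midpoint maps to itself, so scores above the midpoint indicate movement toward the majority in either counterbalancing condition.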
Conformity on Subjective Items

When topics were subjective, it was predicted that a mate-attraction motive would lead men to nonconform and would lead women to conform more. Consistent with this prediction, an ANOVA with participant sex and motive revealed a significant two-way interaction, F(1, 65) = 12.14, p = .001, η² = .157. As seen in Figure 2, men in the mating condition conformed significantly less than men in the control condition, F(1, 67) = 5.19, p = .026, η² = .072. Conversely, women in the mating condition conformed significantly more than women in the control, F(1, 67) = 7.36, p = .008, η² = .099. This pattern for men and women on subjective topics conceptually replicates the findings from Study 1.

Figure 2. Effects of mate-attraction motives on conformity depending on whether content was subjective versus objective (Study 2). Positive values denote an increase in conformity relative to the control; negative values denote a decrease in conformity relative to the control, or nonconformity.
Conformity on Objective Items

When topics were objective, it was predicted that a mate-attraction prime would lead men and women to conform more. As seen in Figure 2, men and women both tended to conform more on the objective items in the mating condition compared with the control, F(1, 65) = 5.16, p = .026, η² = .074. Although the mating prime increased conformity somewhat more for men than for women, the Motive × Participant Sex interaction was not statistically significant, F(1, 65) = 3.54, p = .064. Thus, when topics were objective, a mate-attraction motive tended to produce a general increase in conformity, with a somewhat (but not significantly) larger increase for men than for women.
Discussion

Despite the fact that this study used conformity measures different from those in the initial study, the results of Study 2 conceptually replicated and extended the findings of Study 1. When the topic was subjective, mating goals led men to nonconform and led women to conform more. In contrast, when the topic was objective, mating motives produced the predicted increase in conformity for men and women, as being objectively wrong is unlikely to make a favorable impression on a romantic candidate. Thus, mating motives lead men to show independence only on topics that are subjective, when they do not risk the self-presentational consequences of being proven wrong. Notably, the effects of the mating prime persisted although participants’ responses were ostensibly private. These findings further support the notion that priming fundamental social motives appears to activate specific mental sets that automatically facilitate functional cognitions and behaviors. That is, a relevant audience, or even any audience, did not appear to be necessary to produce the effects.
Study 3

Although the results from the first two studies provide preliminary evidence indicating how fundamental social motives influence conformity, it is not fully clear exactly why mating motives produce the specific patterns of behavior. As discussed earlier, we hypothesized that, for men, a mating motive should produce nonconformity when it enables men to be relatively unique and appear assertive and independent, desirable traits in male romantic partners and high-status men (Barkow, 1989; Baumeister & Sommer, 1997; Buss, 2003). In larger groups, such as a group of over 100 people, a man could achieve relative uniqueness by going against the preferences of the majority, even if that majority is not unanimous. As in Study 2, a man who is 1 of 10 people to prefer a BMW can still appear relatively distinct if 100 other men prefer a Mercedes. In fact, it would be rare and possibly disturbing if everyone had the same exact preference in a large group. However, to be distinctive in a small group (e.g., 5 individuals), a man is likely to be highly sensitive to the degree of consensus on a given topic. That is, it is difficult to be distinct when a man is 1 of the 2 people who prefer a BMW, compared with 3 people who prefer a Mercedes. Note that in Study 1, in which groups consisted of 4 persons, mating motives led men to nonconform when the majority preference between two alternatives was unanimously one-sided. However, would men still have nonconformed if consensus opinion was split into a majority of 2 and a minority of 1? According to the present perspective, if a majority in a small group is not unanimous, nonconformity is unlikely to enable a man effectively to appear unique or assertive; instead, the man may merely appear to be a follower of a minority of 1.²

For women, we hypothesized earlier that a mating motive would lead to more conformity because it would allow women to appear agreeable and as someone interested in fostering group cohesion, desirable traits in a female romantic partner (Barkow, 1989; Buss, 2003; Campbell, 2002). In large groups of people, a woman could appear agreeable by conforming with the majority even if that majority is not unanimous. However, just as for men, women in a small group are likely to be sensitive to the degree of consensus on a topic. In Study 1, for example, mating motives led women to conform more when the majority was unanimous. However, if the group was split into a majority of 2 and a minority of 1, going along with 2 people (and going against 1 person) is less successful at conveying agreeableness to the group members or fostering group cohesion.

Study 3 tested how a mating motive would influence men’s and women’s conformity depending on whether the majority in a small group (5 people) was unanimous versus split. It was predicted that mating motives would produce nonconformity for men and produce conformity for women primarily when the majority was unanimous but not when it was split. In line with the first two studies, these outcomes were only predicted to occur on topics that were subjective. When topics were objective, it was predicted that mating motives would generally lead people to increase their conformity, especially when a small majority was unanimous, as this would be a much stronger indicator of a correct response (Insko, Smith, Alicke, Wade, & Taylor, 1985).

Method

Participants
Two hundred fifteen participants (118 male, 97 female) were recruited from introductory psychology classes as partial fulfillment of their class requirement. As in the first two studies, participants came in groups and were seated at private computers.
² It is also consistent with the present perspective that if there were more than two options in such a situation, a mating motive may be effective at spurring men to select a third (or any other) option, which would enable them to stand out and assert their independence (see Santee & Maslach, 1982).

Design and Procedure
In this study we used a 2 (participant sex) × 2 (motive prime: mating vs. control) × 2 (topic: subjective vs. objective) × 2 (majority type: unanimous [4/0] vs. split [3/1]) mixed-factorial design. Participant sex and prime were between-participants factors, and topic and majority type were within-participants factors.

The procedure was very similar to that of Study 2, except for several small changes. First, participants responded to 10 instead of 6 survey items. Of the 10 items, 4 were subjective, 4 were objective, and 2 of the items served as fillers. For the subjective items, the same 3 items from the previous study were used along with one new item: Would you prefer to have a painting by Van Gogh or Monet? For the objective questions, along with the 3 previous items, the fourth item asked: Which country do you think has more consumers, Finland or Norway?

As in Study 2, participants could see the responses of previous survey takers. However, it appeared that only 4 individuals had thus far completed the survey. Because participants were told that the survey questions changed rather frequently, they had no reason to be suspicious of the low number of respondents. The viewable responses of the 4 previous survey takers were strategically arranged. The two filler items were always split 2/2 (i.e., 2 people had indicated a preference toward one response, whereas 2 other people had indicated a preference for the opposing response). One of these filler items always appeared first on the survey to decrease suspiciousness. Of the four subjective and four objective items, half had previous responses that were unanimous (4/0), and half of the items had previous responses showing that the majority was split (3/1). The pairings of the two types of majorities with the specific survey items and the specific responses within each item were counterbalanced.
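The arrangement of the ostensible prior responses described above can be sketched in code. This is only a minimal Python illustration, not the authors' survey software: the function name `build_display`, the response labels "A" and "B", the seeded random shuffling that stands in for systematic counterbalancing, and the random ordering of the nine items after the first filler are all assumptions made for the example.

```python
import random


def build_display(seed=0):
    """Build one hypothetical 10-item survey display for Study 3.

    Each item shows the ostensible answers of 4 previous survey takers.
    Labels "A"/"B" and random assignment are illustrative assumptions.
    """
    rng = random.Random(seed)
    items = []
    for topic in ("subjective", "objective"):
        # Four items per topic: two unanimous (4/0) and two split (3/1).
        majority_types = ["unanimous", "unanimous", "split", "split"]
        rng.shuffle(majority_types)  # stands in for counterbalancing
        for majority_type in majority_types:
            favored = rng.choice(["A", "B"])  # response backed by the majority
            other = "B" if favored == "A" else "A"
            n_majority = 4 if majority_type == "unanimous" else 3
            responses = [favored] * n_majority + [other] * (4 - n_majority)
            rng.shuffle(responses)
            items.append({
                "topic": topic,
                "majority_type": majority_type,
                "favored": favored,
                "responses": responses,
            })
    # The two filler items are always split 2/2.
    fillers = [
        {"topic": "filler", "majority_type": "tie", "favored": None,
         "responses": ["A", "A", "B", "B"]}
        for _ in range(2)
    ]
    # One filler always appears first; the order of the rest is assumed random.
    remaining = items + fillers[1:]
    rng.shuffle(remaining)
    return [fillers[0]] + remaining


if __name__ == "__main__":
    for position, item in enumerate(build_display(seed=1), start=1):
        print(position, item["topic"], item["majority_type"], item["responses"])
```

A real counterbalancing scheme would rotate the majority types and favored responses across participants systematically rather than drawing them at random; the sketch only shows the constraints each display must satisfy.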
Results As in the first two studies, regardless of the particular choice advocated by the majority, participants’ responses were converted into a conformity index for each item, whereby higher numbers indicated a higher degree of conformity. There were no significant sex differences in conformity in the control condition. To test the specific hypotheses of the study, we performed a series of planned contrasts for the subjective items and the objective items.
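One way to picture the conversion of raw responses into a conformity index is a simple reverse-scoring rule. The sketch below is an assumption-laden illustration rather than the authors' scoring procedure: it supposes a bipolar rating scale running from 1 (fully option "A") to 9 (fully option "B"), and the function name `conformity_index` is hypothetical.

```python
def conformity_index(response, majority_side, scale_min=1, scale_max=9):
    """Score a rating so that higher values mean more agreement with the majority.

    Assumes a bipolar scale where scale_min means "fully option A" and
    scale_max means "fully option B" (an assumption, not from the source).
    """
    if not scale_min <= response <= scale_max:
        raise ValueError("response is outside the rating scale")
    if majority_side == "B":
        # The majority backs the high end, so the raw rating is already the index.
        return response
    if majority_side == "A":
        # The majority backs the low end, so the rating is reverse-scored.
        return scale_min + scale_max - response
    raise ValueError("majority_side must be 'A' or 'B'")


if __name__ == "__main__":
    print(conformity_index(8, "B"))  # strong agreement with a majority favoring B
    print(conformity_index(8, "A"))  # the same rating against a majority favoring A
```

Scoring this way makes the index independent of which particular option the majority advocated, which is the property the text describes.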
Conformity on Subjective Items
When topics were subjective, it was predicted that mating goals would (differentially) influence men and women’s conformity depending on whether the majority was unanimous versus split. Consistent with this prediction, results indicated a three-way interaction with participant sex, motive, and majority type, although this interaction was not conventionally significant, F(1, 213) = 2.84, p = .093, η² = .013.

When the majority was unanimous, it was predicted that a mating prime would lead men to nonconform and a mating prime would produce more conformity for women. Consistent with this prediction, results indicated a significant Participant Sex × Motive interaction when the majority was unanimous, F(1, 213) = 9.30, p = .003, η² = .042. As seen on the left side of Figure 3, when the majority was unanimous, men showed a significant decrease in conformity in the mating condition, compared with the control, F(1, 213) = 7.45, p = .007, η² = .034. In contrast, a mating prime led women to conform somewhat more, although this difference was not conventionally significant, F(1, 213) = 2.22, p = .138, η² = .010.

When the majority was split, it was predicted that the effects of the mating motive on subjective conformity would be muted. Indeed, as seen in Figure 3, there were no significant interactions with participant sex and motive, main effects, or simple effects when the majority was split (all ps > .70). Thus, in summary, when topics were subjective, a mating prime led men to nonconform in a small group when the group was unanimous, that is, when going against the group could make the men distinct. For women, a mating prime produced somewhat higher conformity in a small group when the majority was unanimous, that is, when going along with the group would be particularly effective at displaying agreeableness and fostering group cohesion for women.

Conformity on Objective Items
When topics were objective, it was predicted that a mating prime would produce an increase in men and women’s conformity primarily when the majority was unanimous, but not necessarily when the majority was split. Consistent with this prediction, results indicated a Motive × Majority Type interaction for objective items, although this interaction was not conventionally significant, F(1, 213) = 3.44, p = .065, η² = .016. As seen in Figure 3, when the majority was unanimous, a mating prime produced a significant increase in conformity for men and women, compared with the control, F(1, 211) = 3.88, p = .050, η² = .018. When the majority was split, however, a mating prime failed to produce a difference from control for men or women (all ps > .75). Thus, a mating prime produced an increase in conformity on objective items only when the majority was unanimous, which is precisely when men and women could have more confidence in the accuracy of the majority position.

Figure 3. Effects of mate-attraction motives on conformity depending on whether content was subjective versus objective, and on whether the majority was unanimous or split (Study 3). Positive values denote an increase in conformity relative to the control; negative values denote a decrease in conformity relative to the control, or nonconformity.
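The effect sizes reported in these results can be checked against the accompanying F statistics. The text does not say which variant of eta squared was computed, so the following Python sketch assumes partial eta squared and uses the standard identity η²p = (F × df_effect) / (F × df_effect + df_error). The function name and the row labels are illustrative only.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, effect size as reported in the text)
REPORTED = [
    ("Sex x Motive x Majority, subjective", 2.84, 1, 213, 0.013),
    ("Sex x Motive, unanimous, subjective", 9.30, 1, 213, 0.042),
    ("Men, mating vs. control, unanimous", 7.45, 1, 213, 0.034),
    ("Women, mating vs. control, unanimous", 2.22, 1, 213, 0.010),
    ("Motive x Majority, objective", 3.44, 1, 213, 0.016),
    ("Mating vs. control, unanimous, objective", 3.88, 1, 211, 0.018),
]

if __name__ == "__main__":
    for label, f_value, df1, df2, reported in REPORTED:
        computed = partial_eta_squared(f_value, df1, df2)
        print(f"{label}: computed {computed:.3f}, reported {reported:.3f}")
```

Under this assumption each recomputed value agrees with the reported one to three decimal places, which suggests, but does not prove, that the reported statistic is partial eta squared.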
Discussion
The results of Study 3 conceptually replicated and extended the findings from the previous two studies by illuminating the processes by which mating motives differentially influence men’s and women’s conformity. First, as in Study 2, when the content was objective, mating motives tended to produce an increase in men and women’s conformity. As would be expected, this increase was strongest when the majority of 4 was unanimous, which is a stronger indicator of the correct response compared with a split consensus of 3 to 1. Second, when content was subjective and the majority of 4 was unanimous, mating motives led men to nonconform and produced a pattern of higher conformity for women. However, when consensus opinion in the group was split into a majority of 3 and a minority of 1, mating motives failed to influence either men’s or women’s conformity.

This unanimous-only finding for men in small groups supports the notion that mating motives lead men to go against the group likely because they motivate men to appear unique and assertive. Each of these self-presentational goals can be optimally achieved through nonconformity primarily when the majority in a small group is unanimous. When consensus is split into a majority of 3 and a minority of 1, going against the group is less likely to make men look unique or assertive. The unanimous-only pattern for women supports the idea that mating motives are likely to lead women to conform more in part because they motivate them to appear agreeable and foster group cohesion. Each of these self-presentational goals can be optimally achieved through conformity primarily when the majority in a small group is unanimous. When consensus is split into a majority of 3 and a minority of 1, going along with the group is less effective at enabling women to appear agreeable and fostering group cohesion for the entire group.

The findings of this study may initially appear at odds with the findings from Study 2. In that study, mating motives led men to nonconform and led women to conform more even though the majority was not unanimous. However, there is a key methodological difference between the studies: In Study 2, the ostensible “group” consisted of over 100 individuals, whereas in this study, the group consisted of only 5 individuals, including the participant. Given that mating motives should produce male nonconformity when going against the group enables men to be relatively distinct, the effectiveness of the mating prime to produce nonconformity should depend on the size of the group and the size of the majority. In larger groups, relative distinctiveness can be achieved via nonconformity even if the majority is not unanimous; that is, a person can appear relatively distinct if he is one of 10 people who prefer Option A compared with 100 people who prefer Option B. In a small group, however, being 1 of the 2 people who prefer Option A compared with the 3 people who prefer Option B is much less effective at achieving distinctiveness. A similar rationale also applies to women: The effectiveness of conformity to convey agreeableness or group cohesion depends on the size of the group and the size of the majority, whereby conformity is more effective at achieving these self-presentational goals in small groups when the majority is unanimous. Thus, the seeming inconsistency between Studies 2 and 3 does not undermine the theoretical grounding of the predictions or the actual findings. Indeed, the findings appear to indicate that people are understandably sensitive to the size of the group and the size of the majority when opting to (non)conform.
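The group-size argument rests on simple proportions, which a few lines of Python make concrete. Treating the share of the whole group that ends up on the dissenting side as a rough proxy for distinctiveness is an assumption of this sketch, not a measure used in the studies, and the function name `minority_share` is hypothetical.

```python
from fractions import Fraction


def minority_share(minority, majority):
    """Fraction of the whole group that holds the minority preference."""
    return Fraction(minority, minority + majority)


# Large group (Study 2 style): 10 people prefer Option A, 100 prefer Option B.
large_group = minority_share(10, 100)

# Small group, split majority: the participant joins a lone dissenter (2 vs. 3).
small_split = minority_share(2, 3)

# Small group, unanimous majority: the participant dissents alone (1 vs. 4).
small_unanimous = minority_share(1, 4)

if __name__ == "__main__":
    for label, share in [
        ("large group, 10 vs. 100", large_group),
        ("small group, 2 vs. 3", small_split),
        ("small group, 1 vs. 4", small_unanimous),
    ]:
        print(f"{label}: {share} of the group ({float(share):.0%})")
```

On this crude reading, a dissenter in the large group belongs to roughly a tenth of the group, whereas joining a lone dissenter in a group of 5 places him in a bloc that is 40% of the group; dissenting alone against 4 others halves that share.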
General Discussion The present research examined how the temporary activation of two fundamental social motives—a motive to protect oneself from danger and a motive to attract a mate—influenced tendencies to conform. Findings indicated that a self-protective mindset led both men and women to conform more. That is, when people were motivated to avoid threat and to protect themselves from danger, they tended to go along with the group. In contrast, a mating mindset generally produced different effects for men and women. For men, the goal to attract a mate generally led them to go against the preferences of others; whereas for women, the goal to attract a mate generally tended to increase the likelihood that women would conform to the group. However, these general effects of mating motives on (non)conformity were qualified by three key factors. First, as seen in Study 1, the effects of mating motives depended on whether the judgment of the group was positive or negative. That is, one’s decision to (non)conform depended on whether the group opinion was essentially thumbs up or thumbs down. The valence of the group’s judgment of a novel stimulus strongly influences what kind of dispositional information would be conveyed by a person’s (non)conformity. For men, a romantic prime produced nonconformity specifically when the judgment of the group was negative. However, when group judgment was positive—and nonconformity could not be used to convey positive information—the effect of the mating motive to engender nonconformity was muted. For women, a romantic prime tended to produce somewhat more conformity specifically when the judgment of the group was positive. However, when group judgment was negative—and conformity could not convey positive information—any effect of the mating motive for women was muted. 
Notably, the valence of the group’s judgment had no influence on the effectiveness of self-protection goals to increase conformity, suggesting that self-protection goals are less sensitive to concerns of positive or negative self-presentation. Second, as seen in Studies 2 and 3, mating goals led men to nonconform only on topics that were subjective. That is, men went against the group only when they couldn’t be proven to be incorrect and when going against the crowd could not result in choices that were less accurate. In contrast, when the topic was objective, mating motives actually caused men to conform more. This finding makes sense given that going against the majority opinion on a matter of objective fact is not likely to be the most adaptive behavior, and is often subject to being verified as foolish as opposed to independent. The effects of mating motives for women, however, who tended not to take stands against group opinion, did not depend on whether the topic was subjective or objective. Third, as seen in Study 3, when in a small group, mating goals led men to nonconform and led women to show somewhat of an increase in conformity only when the majority of the group was
unanimous. When group opinion was split into a majority of 3 and a minority of 1, the effects of mating motives were muted for men and women. This finding for men supports the assumption that mating motives lead men to desire to appear unique and assertive, which are desirable characteristics in a male mate. The same motive, in contrast, seems to lead women to appear agreeable and foster group cohesion, which is a desirable characteristic in a female mate. However, as seen in Study 2, when the group consists of many individuals (e.g., over 100 people), mating motives will lead men to nonconform and lead women to conform, even when the (large) majority is not completely unanimous.
Fundamental Motivations and Strategic Self-Presentation
The findings from all three studies fit with a functional domain-specific model of motivation and behavior. Moreover, the results indicate that fundamental motives, such as mate attraction, can stimulate specific forms of conformity or nonconformity in the service of strategic self-presentation. Notably, the effects of the mating motive were obtained even when groups consisted of same-sex individuals (Study 1) and when people’s responses were ostensibly private (Studies 2 and 3). That is, a mating mindset led men to go against the group and led women to go along with the group even when such behavior could not produce tangible benefits for the (non)conformist. Consistent with other research that activates similar motives (e.g., Maner et al., 2005; Wilson & Daly, 2004), the activation of these fundamental social motives appears to activate specific mental sets that serve to facilitate functional perceptions, cognitions, and behaviors that often occur automatically and outside of the awareness of the participant. For example, Roney (2003) found that men reported greater ambition and desire to earn more money in the presence of desirable women or when the men merely looked at photos of desirable women. Although the presence of a relevant audience may strengthen the tendency for functional behaviors, a relevant audience (or even any direct audience) appears unnecessary to elicit the motive-driven behaviors.
Alternative Explanations
Although the present research has adopted a functional evolutionary framework to examine the relationship between various social motives and conformity, it would be possible to derive predictions regarding how various primes would affect conformity from several other theoretical perspectives. However, none of these alternative approaches seems to offer as straightforward an account of the pattern of results obtained in this series of studies. For example, it is possible that the effects of a mating prime for men arose because the prime produced more positive affect and arousal for men than for women. Although it is unlikely that the prime produced more positive mood or arousal for men (see Griskevicius, Cialdini, & Kenrick, 2006), even if the mating scenario did produce more positive affect for men, such a finding would not constitute a particularly compelling alternative explanation of the effects. In particular, the affect explanation would suggest that positive affect leads men to nonconform in some circumstances while leading them to conform more in other situations. Although the possibility of higher positive affect for men would suggest a potential mechanism for why these (non)conformity effects occur for men, it would raise the question of why and how a mating prime would produce more conformity for women.

It is also possible that the link between mating, self-protection, and conformity is due to simple mechanisms of associative priming (Srull & Wyer, 1979; see Higgins, 1996, for a review). Research has shown, for example, that when people are primed with scrambled sentences alluding to conformity, they tend to conform more to social pressure (Epley & Gilovich, 1999). Although priming people with self-protective or romantic scenarios may very well activate conformity- or nonconformity-related concepts, it is difficult to see how an associative model framework could account for the very specific pattern of sex differences and similarities in nonconformity as well as conformity that was observed in this research. Moreover, such a perspective would have difficulty explaining why the primes produced responding that was highly sensitive to the specific features of a given situation in ways that supported a more articulated interaction with different goals.

The functional framework used in this research is by no means an alternative to the associative network model of cognition. Both models imply that there are certain links between motivation, cognition, and behavior. However, the functional model does more than just assert that priming specific ideas will lead to the activation of associatively linked semantic and affective categories. Rather, the functional model leads to articulated predictions regarding how activating specific functional goals should lead to specific goal-consistent (and sex-consistent) cognitive and behavioral responses (Maner et al., 2005).

A social learning model may suggest that men and women have been differentially rewarded for their conformity or nonconformity, although it is again difficult to predict from this perspective the precise pattern of sex differences and similarities, as well as the sensitivity of the behaviors to specific contexts, that we found.
Social role theory may suggest that men are taught and rewarded for being tough and resolute. However, in this research, cues connoting danger, which may be predicted to provide men a perfect opportunity to show their toughness and stout independence, caused men to be highly conforming, which is inconsistent with appearing tough and independent. Social role theory may also suggest that, in order to attract mates, men are taught to present themselves as independent and autonomous from the judgments of others. Indeed, although men displayed such behaviors some of the time, a mating motive actually led men to become less independent and less autonomous when topics were objective, a specific prediction clearly derived from a functional perspective. Neither social role theories nor social learning theories are mutually exclusive with functional evolutionary accounts, since evolutionary theorists presume that social roles across societies are a function of evolutionary constraints on men and women and that many behaviors involve an adaptive interplay of learning and evolved predispositions (Kenrick, Trost, & Sundie, 2004; Öhman & Mineka, 2001). We are not aware, however, of predictions made by social role or social learning theories for the very specific patterns of results obtained here, patterns that follow directly from considerations of how different fundamental social goals can be achieved through specific self-presentation behaviors for men and women.
Limitations and Future Directions
One of the limitations of the present research is that it did not examine conformity in face-to-face interactions. Although we conducted Study 1 using a virtual group chat room setting with an expectation of a face-to-face discussion, the functional perspective suggests that the effects of fundamental motives are likely to be even stronger in real groups, where people would have more to gain through strategic self-presentation. People’s everyday experiences of conformity are also partly shaped by their cultural context (Bond & Smith, 1996; Kim & Markus, 1999). Although the present studies examined how fundamental motives influence self-presentation via conformity and nonconformity in one culture, an evolutionary functionalist perspective holds that mate-attraction motives should activate a desire to positively differentiate oneself from one’s rivals and to present oneself in a positive light in all cultures. However, the specific contexts in which conformity or nonconformity is seen as an appropriate way to achieve these goals will surely depend on cultural or local norms (Norenzayan & Heine, 2005; Norenzayan, Schaller, & Heine, in press). The mating prime used in the current research is likely to have aroused feelings related to lust rather than to stable attachment. It would be interesting to explore in future research what kinds of behaviors would be produced by eliciting feelings of stable love or attachment. For example, a prime of an elderly affectionate couple is unlikely to produce the same (non)conformity effects as in the present research because it is unlikely to sufficiently activate motives related to mate attraction. However, thoughts of stable attachment may lead men to conform more than they may otherwise because a desire for attachment may produce a desire to belong to a group. The romantic prime used in the present work was ambiguous regarding whether it activated a desire to attract a short-term versus a long-term mate. 
Given that the type of mating strategy one is pursuing is related to strategically different self-presentation (e.g., Buss & Schmitt, 1993), it would be interesting to explore in future research whether activating an explicit short-term versus a long-term mating goal would have a different effect on men and women’s (non)conformity. Given that leadership qualities in men are valued in both short-term and long-term mates, it seems likely that both types of mating goals would lead men to go against the group. However, to the extent that women are under more pressure to display agreeableness and group cohesion to a potential long-term mate, the desire to attract a long-term romantic partner may produce more conformity for women than a desire to attract a short-term mate. In addition, certain individual differences, such as one’s sociosexual orientation (Simpson & Gangestad, 1991, 1992) and romantic relationship status, may also influence a person’s self-presentational tactics (Simpson et al., 1999).
Conclusion There has been a long-standing debate about whether men are more nonconforming than women. The research we have presented here suggests that the answer depends in part on the goal that is currently active for a man or a woman deciding whether to go along or to go alone. It further suggests that being a conformist or a nonconformist is not simply a trait of men or women that manifests itself without regard to situational inputs. Self-protective motivation leads both men and women to increase their general
tendency to conform with a group’s opinions. Mating motivation, on the other hand, leads to a particular and very functional pattern of nonconformity for males—who will go it alone against a group, but only if such independence cannot be objectively proven to be erroneous and if they are not following another individual who has already defied the group.
References
Allen, V. L. (1975). Social support for nonconformity. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 8, pp. 1–43). New York: Academic Press.
Argyle, M. (1957). Social pressure in public and private situations. Journal of Abnormal and Social Psychology, 54, 172–175.
Asch, S. E. (1956). Studies of independence and conformity: I. A minority of one against a unanimous majority. Psychological Monographs, 70(9, Whole No. 416).
Bargh, J. A. (1990). Auto-motives: Preconscious determinants of social interaction. In E. T. Higgins & R. M. Sorrentino (Eds.), Handbook of motivation and social cognition: Foundations of social behavior (Vol. 2, pp. 93–130). New York: Guilford Press.
Barkow, J. H. (1989). Darwin, sex, and status: Biological approaches to mind and culture. Toronto, Ontario, Canada: University of Toronto Press.
Baron, R. S., Vandello, J., & Brunsman, B. (1996). The forgotten variable in conformity research: Impact of task importance on social influence. Journal of Personality and Social Psychology, 71, 915–927.
Baumeister, R. F., & Leary, M. R. (1995). The need to belong: Desire for interpersonal attachments as a fundamental human motivation. Psychological Bulletin, 117, 497–529.
Baumeister, R. F., & Sommer, K. L. (1997). What do men want? Gender differences and two spheres of belongingness. Psychological Bulletin, 122, 38–44.
Bond, R., & Smith, P. B. (1996). Culture and conformity: A meta-analysis of studies using Asch’s line judgment task. Psychological Bulletin, 119, 111–137.
Bremner, J. G. (2002). The nature of imitation by infants. Infant Behavior & Development, 25, 65–67.
Brewer, M. B. (1991). The social self: On being the same and different at the same time. Personality and Social Psychology Bulletin, 17, 475–482.
Bugental, D. B. (2000). Acquisition of the algorithms of social life: A domain-based approach. Psychological Bulletin, 126, 187–219.
Buss, D. M. (2003). The evolution of desire: Strategies of human mating (2nd ed.). New York: Basic Books.
Buss, D. M., & Schmitt, D. P. (1993). Sexual strategies theory: An evolutionary perspective on human mating. Psychological Review, 100, 204–232.
Campbell, A. (2002). A mind of her own: The evolutionary psychology of women. Oxford, England: Oxford University Press.
Campbell, J. D., & Fairey, P. J. (1989). Informational and normative routes to conformity: The effect of faction size as a function of norm extremity and attention to the stimulus. Journal of Personality and Social Psychology, 57, 457–468.
Chartrand, T. L., & Bargh, J. A. (1999). The chameleon effect: The perception–behavior link and social interaction. Journal of Personality and Social Psychology, 76, 893–910.
Chartrand, T. L., & Bargh, J. A. (2002). Nonconscious motivations: Their activation, operation, and consequences. In A. Tesser, D. Stapel, & J. Wood (Eds.), Self and motivation: Emerging psychological perspectives (pp. 13–41). Washington, DC: American Psychological Association.
Cialdini, R. B. (2001). Influence: Science and practice (4th ed.). New York: Allyn & Bacon.
Cialdini, R. B., & Goldstein, N. J. (2004). Social influence: Conformity and compliance. Annual Review of Psychology, 55, 591–621.
Cialdini, R. B., Reno, R. R., & Kallgren, C. A. (1990). A focus theory of normative conduct: Recycling the concept of norms to reduce littering in public places. Journal of Personality and Social Psychology, 58, 1015–1026.
Cialdini, R. B., & Trost, M. R. (1998). Social influence: Social norms, conformity, and compliance. In D. T. Gilbert & S. T. Fiske (Eds.), The handbook of social psychology: Vol. 2 (4th ed., pp. 151–192). Boston: McGraw-Hill.
Cosmides, L., & Tooby, J. (1992). Cognitive adaptations for social exchange. In J. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind (pp. 163–228). New York: Oxford University Press.
Cross, S. E., & Madson, L. (1997). Models of the self: Self-construals and gender. Psychological Bulletin, 122, 5–37.
Crutchfield, R. S. (1955). Conformity and character. American Psychologist, 10, 191–198.
Deutsch, M., & Gerard, H. B. (1955). A study of normative and informational social influences upon individual judgment. Journal of Abnormal and Social Psychology, 51, 629–636.
Dijksterhuis, A., Bargh, J. A., & Miedema, J. (2000). Of men and mackerels: Attention and automatic behavior. In H. Bless & J. P. Forgas (Eds.), Subjective experience in social cognition and behavior (pp. 36–51). Philadelphia: Psychology Press.
Dittes, J. E., & Kelley, H. H. (1956). Effects of different conditions of acceptance upon conformity to group norms. Journal of Abnormal and Social Psychology, 53, 100–107.
Duval, S. (1972). Conformity of a visual task as a function of personal novelty on attitudinal dimensions and being reminded of the object status of the self. Unpublished doctoral dissertation, University of Texas.
Eagly, A. H. (1978). Sex differences in influenceability. Psychological Bulletin, 85, 86–116.
Eder, D., & Sandford, S. (1986). The development and maintenance of interactional norms among early adolescents. Sociological Studies of Child Development, 1, 283–300.
Epley, N., & Gilovich, T. (1999). Just going along: Nonconscious priming and conformity to social pressure. Journal of Experimental Social Psychology, 35, 578–589.
Festinger, L. (1954). A theory of social comparison processes. Human Relations, 7, 117–140.
Gangestad, S. W., & Simpson, J. A. (2000). On the evolutionary psychology of human mating: Trade-offs and strategic pluralism. Behavioral and Brain Sciences, 23, 573–587.
Gigerenzer, G., & Todd, P. M. (Eds.). (1999). Simple heuristics that make us smart. London: Oxford University Press.
Goldstein, N. J., & Cialdini, R. B. (in press). Using social norms as a lever of social influence. In A. Pratkanis (Ed.), The science of social influence: Advances and future progress. Philadelphia: Psychology Press.
Goldstein, N. J., Cialdini, R. B., & Griskevicius, V. (2006). A room with a viewpoint: Using normative appeals to motivate environmental conservation behaviors in a hotel setting. Manuscript under review.
Goodwin, M. H. (1990). He said, she said. Bloomington: Indiana University Press.
Gopnik, A., Meltzoff, A. N., & Kuhl, P. K. (1999). The scientist in the crib: Minds, brains, and how children learn. New York: Morrow.
Green, B. L., & Kenrick, D. T. (1994). The attractiveness of gender-typed traits at different relationship levels: Androgynous characteristics may be desirable after all. Personality and Social Psychology Bulletin, 20, 244–253.
Griskevicius, V., Cialdini, R. B., & Kenrick, D. T. (2006). Peacocks, Picasso, and parental investment: The effects of romantic motives on creativity. Journal of Personality and Social Psychology, 91, 63–76.
Hamilton, W. D. (1971). Geometry for the selfish herd. Journal of Theoretical Biology, 31, 295–311.
Haselton, M. G., & Buss, D. M. (2000). Error management theory: A new perspective on biases in cross-sex mind reading. Journal of Personality and Social Psychology, 78, 81–91.
Hatfield, E., Cacioppo, J. T., & Rapson, R. L. (1993). Emotional contagion. New York: Cambridge University Press.
Higgins, E. T. (1996). Knowledge activation: Accessibility, applicability, and salience. In E. T. Higgins & A. W. Kruglanski (Eds.), Social psychology: Handbook of basic principles (pp. 133–168). New York: Guilford Press.
Hollander, E. P. (1958). Conformity, status, and idiosyncrasy credits. Psychological Review, 65, 117–127.
Hornstein, H. A., Fisch, E., & Holmes, M. (1968). Influence of a model’s feeling about his behavior and his relevance as a comparison other on observers’ helping behavior. Journal of Personality and Social Psychology, 10, 222–226.
Insko, C. A., Drenan, S., Solomon, M. R., Smith, R., & Wade, T. J. (1983). Conformity as a function of the consistency of positive self-evaluation with being liked and being right. Journal of Personality and Social Psychology, 19, 341–358.
Insko, C. A., Smith, R. H., Alicke, M. D., Wade, J., & Taylor, S. (1985). Conformity and group size: The concern with being right and with being liked. Personality and Social Psychology Bulletin, 11, 41–50.
Janes, L. M., & Olson, J. M. (2000). Jeer pressure: The behavioral effects of observing ridicule of others. Personality and Social Psychology Bulletin, 26, 474–485.
Kenrick, D. T., Li, N. P., & Butner, J. (2003). Dynamical evolutionary psychology: Individual decision rules and emergent social norms. Psychological Review, 110, 3–28.
Kenrick, D. T., Trost, M. R., & Sundie, J. M. (2004). Sex roles as adaptations: An evolutionary perspective on gender differences and similarities. In A. H. Eagly, A. Beall, & R. Sternberg (Eds.), Psychology of gender (pp. 65–91). New York: Guilford Press.
Kim, H., & Markus, H. R. (1999). Uniqueness or deviance, harmony or conformity: A cultural analysis. Journal of Personality and Social Psychology, 77, 785–800.
Kruglanski, A. W., & Webster, D. M. (1991). Group members’ reactions to opinion deviates and conformists at varying degrees of proximity to decision deadline and of environmental noise. Journal of Personality and Social Psychology, 61, 212–225.
Lakin, J., & Chartrand, T. L. (2003). Using nonconscious behavioral mimicry to create affiliation and rapport. Psychological Science, 14, 334–339.
Laughlin, P. R., Zander, M. L., Knievel, E. M., & Tan, T. K. (2003). Groups perform better than the best individuals on letters-to-numbers problems: Informative equations and effective strategies. Journal of Personality and Social Psychology, 85, 684–694.
Leary, M. R. (1995). Self-presentation: Impression management and interpersonal behavior. Madison, WI: Brown & Benchmark.
Levine, J. M. (1989). Reaction to opinion deviance in small groups. In P. B. Paulus (Ed.), Psychology of group influence (2nd ed., pp. 187–231). Hillsdale, NJ: Erlbaum.
Levine, J. M., Higgins, E. T., & Choi, H. S. (2000). Development of strategic norms in groups. Organizational Behavior and Human Decision Processes, 82, 88–101.
Maass, A., Volpato, C., & Mucchi-Faina, A. (1996). Social influence and the verifiability of the issue under discussion: Attitudinal versus objective items. British Journal of Social Psychology, 35, 15–26.
Mackie, D. M. (1987). Systematic and non-systematic processing of majority and minority persuasive communications. Journal of Personality and Social Psychology, 53, 41–52.
Maner, J. K., Kenrick, D. T., Becker, D. V., Robertson, T. E., Hofer, B., Neuberg, S. L., et al. (2005). Functional projection: How fundamental social motives can bias interpersonal perception. Journal of Personality and Social Psychology, 88, 63–78.
Martin, R., & Hewstone, M. (2003). Social-influence process of control and change: Conformity, obedience to authority and innovation. In M. A. Hogg & J. Cooper (Eds.), The Sage handbook of social psychology (pp. 347–366). Thousand Oaks, CA: Sage.
294
GRISKEVICIUS ET AL.
Maslach, C., Stapp, J., & Santee, R. T. (1985). Individuation: Conceptual analysis and assessment. Journal of Personality and Social Psychology, 49, 729 –738. Milgram, S., Bickman, L., & Berkowitz, O. (1969). Note on the drawing power of crowds of different size. Journal of Personality and Social Psychology, 13, 79 – 82. Miller, C. E., & Anderson, P. D. (1979). Group decision rules and the rejection of deviates. Social Psychology Quarterly, 42, 354 –363. Moscovici, S. (1985). Social influence and conformity. In G. Lindzey & E. Aronson (Eds.), The handbook of social psychology: Vol. 2 (3rd ed., pp. 347– 412). New York: Random House. Mucchi-Faina, A., Maass, A., & Volpato, C. (1991). Social influence: The role of originality. European Journal of Social Psychology, 21, 183–197. Nail, P. R., MacDonald, G., & Levy, D. A. (2000). Proposal of a fourdimensional model of social response. Psychological Bulletin, 126, 106 –116. Norenzayan, A., & Heine, S. J. (2005). Psychological universals across cultures: What are they and how do we know? Psychological Bulletin, 131, 763–784. Norenzayan, A., Schaller, M., & Heine, S. J. (in press). Evolution and culture. In M. Schaller, J. Simpson, & D. Kenrick (Eds.), Evolution and social psychology. New York: Academic Press. ¨ hman, A., & Mineka, S. (2001). Fears, phobias, and preparedness: O Toward an evolved module of fear and fear learning. Psychology Review, 108, 483–522. Plutchik, R. (1980). A general psychoevolutionary theory of emotion. In R. Plutchik & H. Kellerman (Eds.), Emotion: Theory, research, and experience: Vol. 1. Theories of emotion (pp. 3–33). New York: Academic Press. Pyszczynski, T., Greenberg, J., & Soloman, S. (1997). Why do we need what we need? A terror management perspective on the roots of human social motivation. Psychological Inquiry, 8, 1–20. Ridgeway, C. L. (1978). Conformity, group-oriented motivation, and status attainment in small groups. Social Psychology, 41, 175–188. Roney, J. R. (2003). 
Effects of visual exposure to the opposite sex: Cognitive aspects of mate attraction in human males. Personality and Social Psychology Bulletin, 29, 393– 404. Sadalla, E. K., Kenrick, D. T., & Vershure, B. (1987). Dominance and heterosexual attraction. Journal of Personality and Social Psychology, 52, 730 –738. Santee, R. T., & Maslach, C. (1982). To agree or not to agree: Personal dissent amid social pressure to conform. Journal of Personality and Social Psychology, 42, 690 –700. Schachter, S. (1951). Deviation, rejection, and communication. Journal of Abnormal and Social Psychology, 46, 190 –207. Schachter, S. (1959). The psychology of affiliation: Experimental studies of the sources of gregariousness. Stanford, CA: Stanford University Press. Schaller, M. (2003). Ancestral environments and motivated social perception: Goal-like blasts from the evolutionary past. In S. J. Spencer, S. Fein., M. P. Zanna, & J. M. Olson (Eds.), Motivated social perception (pp. 215–231). Mahwah, NJ: Erlbaum. Schaller, M., Faulkner, J., Park, J. H., Neuberg, S. L., & Kenrick, D. T. (2004). Impressions of danger influence impressions of people: An
evolutionary perspective on individual and collective cognition. Journal of Cultural and Evolutionary Psychology, 2, 231–247. Schlenker, B. R. (2003). Self-presentation. In M. R. Leary & J. P. Tangney (Eds.), Handbook of self and identity (pp. 492–518). New York: Guilford Press. Scott, J. P. (1980). The function of emotions in behavioral systems: A systems theory analysis. In R. Plutchik & H. Kellerman (Eds.), Emotion: Theory, research, and experience (Vol. 1). New York: Academic Press. Sherif, M. (1936). The psychology of social norms. New York: Harper. Simpson, J. A., & Gangestad, S. W. (1991). Individual differences in sociosexuality: Evidence for converging and discriminant validity. Journal of Personality and Social Psychology, 67, 870 – 883. Simpson, J. A., & Gangestad, S. W. (1992). Sociosexuality and romantic partner choice. Journal of Personality, 60, 31–51. Simpson, J. A., Gangestad, S. W., Christensen, P. N., & Leck, K. (1999). Fluctuating asymmetry, sociosexuality, and intrasexual competitive tactics. Journal of Personality and Social Psychology, 76, 159 –172. Snyder, C. R., & Fromkin, H. L. (1980). Uniqueness: The human pursuit of difference. New York: Plenum Press. Srull, T. K., & Wyer, R. S. (1979). The role of category accessibility in the interpretation of information about persons: Some determinant and implications. Journal of Personality and Social Psychology, 37, 1660 – 1672. Sundie, J. M., Cialdini, R. B., Griskevicius, V., & Kenrick, D. T. (in press). Evolutionary social influence. In M. Schaller, J. Simpson, & D. Kenrick (Eds.), Evolution and social psychology. New York: Academic Press. Surowiecki, J. (2005). The wisdom of crowds. New York: Doubleday Press. Taylor, S. E., Klein, L. C., Lewis, B. P., Gruenewald, T. L., Gurung, R. A. R., & Updegraff, J. A. (2000). Biobehavioral responses to stress in females: Tend-and-befriend, not fight-or-flight. Psychological Review, 107, 411– 429. Tesser, A. Campbell, J., & Mickler, S. (1983). 
The role of social pressure, attention to the stimulus, and self-doubt in conformity. European Journal of Social Psychology, 13, 217–233. Todd, P. M., & Gigerenzer, G. (2000). Precis of simple heuristics that make us smart. Behavioral & Brain Sciences, 23, 727–780. Weir, H. B. (1971). Deprivation of the need for uniqueness and some variables moderating its effects. Unpublished doctoral dissertation, University of Georgia. White, R. W. (1959). Motivation reconsidered: The concept of competence. Psychological Review, 66, 297–333. Wickler, W. (1968). Mimicry in plants and animals. London: World University Library. Willis, R. H. (1963). Two dimensions of conformity–nonconformity. Sociometry, 26, 499 –513. Wilson, M., & Daly, M. (2004). Do pretty women inspire men to discount the future? Proceedings of the Royal Society, 271B(Suppl. 4), 177–179. Wisman, A., & Koole, S. L. (2003). Hiding in the crowd: Can mortality salience promote affiliation with others who oppose one’s worldviews? Journal of Personality and Social Psychology, 84, 511–526.
Received August 26, 2005
Revision received December 19, 2005
Accepted December 26, 2005
Journal of Personality and Social Psychology 2006, Vol. 91, No. 2, 295–315
Copyright 2006 by the American Psychological Association 0022-3514/06/$12.00 DOI: 10.1037/0022-3514.91.2.295
Evidence for Strong Dissociation Between Emotion and Facial Displays: The Case of Surprise

Rainer Reisenzein
University of Greifswald

Sandra Bördgen and Thomas Holtbernd
University of Bielefeld

Denise Matz
University of Bochum

Eight experiments examined facial expressions of surprise in adults. Surprise was induced by disconfirming a previously established schema or expectancy. Self-reports and behavioral measures indicated the presence of surprise in most participants, but surprise expressions were observed only in 4%–25%, and most displays consisted of eyebrow raising only; the full, 3-component display was never seen. Experimental variations of surprise intensity, sociality, and duration/complexity of the surprising event did not change these results. Electromyographic measurement failed to detect notably more brow raisings and, in one study, revealed a decrease of frontalis muscle activity in the majority of the participants. Nonetheless, most participants believed that they had shown a strong surprise expression.

Keywords: emotion, facial expression, surprise
Since the publication of Darwin’s (1872/1998) book on the expression of emotions in man and animals, the relation between emotional states and facial displays has been a controversial topic. Most empirical research focused on the inference of emotions from posed facial expressions (for reviews, see Elfenbein & Ambady, 2002; Russell, 1994). This research revealed substantial agreement among judges, both intra- and cross-culturally, that a few emotions—in particular, happiness, sadness, fear, anger, disgust, and surprise—are associated with specific facial displays (e.g., Ekman, Friesen, et al., 1987; for a recent example, see Tracy, Robins, & Lagattuta, 2005). Many believe that these findings constitute strong evidence for the existence of phylogenetically determined, discrete emotion mechanisms that comprise motor programs for emotion-specific facial displays as core components. Indeed, for some emotion theorists, the existence of a facial display that, unless inhibited or masked, is shown whenever one has the emotion comes close to a defining condition for emotions (e.g., Ekman, 1997; Izard, 1991; Leventhal, 1984; Tomkins, 1962).

However, in recent years, objections have been raised against this “emotions view of faces” (Fridlund, 1994, p. 124) or, as we call it here, the affect program theory (APT) of facial displays. First, it has been argued that the agreement among judges on the association between facial expressions and emotions is not as high as the proponents of APT have claimed (Russell, 1994; for a reply, see Ekman, 1999). Second, it has been recalled to the scientific public’s mind that the facial judgment studies that constitute the central piece of evidence for APT are first and foremost studies of folk–psychological beliefs about the association between emotions and facial displays (Reisenzein, 2000a; Russell, Bachorowski, & Fernández-Dols, 2003). Although the intra- and intercultural consistency of these beliefs suggests that they contain a kernel of truth, they (or at least their usual interpretation) could be erroneous in theoretically important respects. A salient possibility is that these beliefs reflect not the modal association between emotions and facial displays but their association in ideal-type cases, in which all components of the emotion syndrome are present (Horstmann, 2002). These ideal types could, however, be rarely exemplified in everyday life.

Supporting this possibility, research on the actual association between emotions and facial displays suggests that this association is not as strong as APT seems to imply. In particular, facial expressions of emotion are often absent in situations in which, at first sight at least, APT would predict them to occur (e.g., Fernández-Dols & Ruiz-Belda, 1997; Fischer, Manstead, & Zaalberg, 2003; Russell et al., 2003). In addition, if they occur, facial expressions of emotion seem to be more often partial than complete (e.g., Carroll & Russell, 1997; Reisenzein, 2000a). Finally, at least some facial displays, notably smiling, seem to be as strongly pulled forth by the presence of other people as by the emotional state of the person (e.g., Fridlund, 1991; Holodynski, 2004; Kraut & Johnston, 1979; Ruiz-Belda, Fernández-Dols, Carrera, & Barchard, 2003; for reviews, see Fischer et al., 2003; Parkinson, 2005; Wagner & Lee, 1999).
Rainer Reisenzein, Institute of Psychology, University of Greifswald, Greifswald, Germany; Sandra Bördgen and Thomas Holtbernd, Department of Psychology, University of Bielefeld, Bielefeld, Germany; Denise Matz, Department of Medical Psychology and Medical Sociology, University of Bochum, Bochum, Germany.
We thank Alfons Hamm and Almut Weike for their assistance with the EMG measurement in Experiment 7 and Gernot Horstmann, Michael Niepel, and Achim Schützwohl for their comments on earlier versions of the article.
Correspondence concerning this article should be addressed to Rainer Reisenzein, Institute of Psychology, University of Greifswald, Franz-Mehring-Straße 47, 17487 Greifswald, Germany. E-mail: rainer [email protected]
However, the issues are clearly not yet decided. First, as discussed in more detail later, APT has conceptual resources that allow this theory to explain many cases of reported dissociations between emotions and facial displays (Ekman, 1997; Rosenberg & Ekman, 1994). Second, the available evidence suggests that emotional states affect facial displays at least in addition to other factors (e.g., Hess, Banse, & Kappas, 1995; Jakobs, Manstead, & Fischer, 2001; Reisenzein, 2000a; Ruch, 1995). Third, research has so far concentrated on the facial display of smiling and associated emotions such as happiness or amusement, whereas the relation of other basic emotions to facial displays has been much less studied (for research on disgust, see, e.g., Rosenberg & Ekman, 1994; for research on sadness, see, e.g., Jakobs et al., 2001; Mauss, Levenson, McCarter, Wilhelm, & Gross, 2005; research on surprise is reviewed below). It is conceivable that APT holds true for some emotions but not for others. Therefore, before sweeping generalizations are made, a wider range of basic emotions and associated facial displays must be taken into view. Motivated by these considerations, the aim of the studies reported here was to examine the affect program theory of facial displays, plus several modifications of this theory, for the case of surprise.1 Surprise recommended itself as a suitable object of inquiry for several reasons. First, at least since Darwin, surprise has been associated with a biologically determined facial display consisting of eyebrow raise, widening of the eyes, and opening of the mouth/jaw drop; and since Darwin, surprise appears on the lists of most basic-emotion theorists, including all proponents of APT. 
Indeed, if one accepts the standard view of (biologically) basic emotions— organized response syndromes that are typically elicited by cognitive appraisals of situations, and that evolved as “solutions” of recurrent adaptive problems (e.g., Cosmides & Tooby, 2000; Ekman, 1999)—surprise seems to be as good an example of a basic emotion as one can find (see Meyer & Niepel, 1994). Second, the relation between emotion and facial expression has not yet been studied in depth for the case of surprise. Third, compared with other basic emotions such as anger or fear, the study of surprise offers a number of advantages (Reisenzein, 2000a). In particular, surprise can be easily induced in laboratory settings with excellent control of its onset, intensity, and the timing of its measurement. In addition, apart from subjective reports of surprise, a variety of other self-report and nonfacial behavioral indicators of surprise are available, and ethical problems associated with the induction of other emotions can be largely avoided. However, before we give an overview of our studies, we briefly review previous relevant research.
1 Following Darwin (1872/1998), we conceptualize surprise as a mental state or process, and speak of the surprise display as being caused by, and in this sense as expressing, the mental state of surprise. However, we leave it open whether the proximate mental cause of the surprise display is the feeling of surprise, the appraisal of unexpectedness that we take to be its cognitive antecedent (cf. Meyer et al., 1997), or some other surprise-related mental process or combination of processes. Surprise must be distinguished from the startle reaction elicited by sudden intense stimuli, in particular loud noises (Ekman, Friesen, & Simons, 1985; Koch, 1999). Whereas surprise is caused by the appraisal of unexpectedness, startle is a reflexlike reaction to intense sensory input.

Review of Research

Studies With Children

The majority of the researchers who studied surprise expressions in children investigated infants. To induce surprise, they typically used a repetition–change paradigm (i.e., a salient stimulus change after a series of no-change trials), often in combination with a “magical” event. Hiatt, Campos, and Emde (1979) and Camras, Meng, Ujiie, et al. (2002) secretly switched a hidden toy that the infants previously had repeatedly retrieved; Parrott and Gleitman (1989) changed the identity or location of the experimenter in a peek-a-boo game after a series of no-change trials; and Scherer, Zentner, and Stern (2004) changed the experimenter’s voice by means of digital filtering after a period of normal talking. Hiatt et al. (1979) also attempted to induce surprise by staging the instant disappearance of a musical Ferris wheel that the infants were watching, whereas Bennett, Bendersky, and Lewis (2002) and Reissland, Shepherd, and Cowie (2002) confronted infants with a jack-in-the-box. In older children, researchers sought to induce surprise by a “magical” change of the color or number of marbles after a series of no-change trials (Charlesworth, 1964) and by means of unexpected loud noises (a loud clock buzzer or a cycle horn went off during story time; Blurton Jones & Konner, 1971; Wheldall & Mittler, 1976).

In most studies, fewer than half of the children showed at least some evidence of a surprise display (i.e., at least one component of the expression) in response to the presumed unexpected event. The maximum frequency of a single surprise component reported in any one study was 60% (Hiatt et al., 1979; for eye widening and Ferris wheel); the maximum frequency of a two-component display (eyebrow raising plus eye widening) was 52% (Bennett et al., 2002). In addition, when a surprise display occurred, it was nearly always incomplete (i.e., it consisted of only one or two components of the traditional surprise expression). For example, only 7% complete surprise displays were observed in the study by Camras et al. (2002), and only 3.5% were observed in the study by Scherer et al. (2004). Furthermore, in two studies, surprise displays occurred with the same (low) frequency in the surprise trial as in a presumably neutral baseline period (Camras et al., 2002; Scherer et al., 2004); and in one study, they occurred as often to surprising events as to events that were not intended to elicit surprise (arm restraint and the approach of a stranger; Bennett et al., 2002). Finally, one study obtained suggestive evidence for the context dependence of the link between surprise and facial expression: Blurton Jones and Konner (1971) observed about 50% eyebrow raisings in response to a clock buzzer if the clock was hidden behind an object, but they observed only approximately 5% when it was in full view.

Although these results provide some support for the assumption of APT that surprise is associated with a characteristic facial display, they also suggest that this association is far from perfect. However, as mentioned earlier, APT has conceptual resources that allow this theory to explain many cases of observed emotion–face dissociations. Exploiting these resources to the maximum, proponents of APT can argue with some plausibility that the reviewed findings are inconclusive. First, at least in some studies, the surprising events may have induced other strong emotions or facial reactions that interfered with the surprise display. Conversely, nominally unsurprising situations in which surprise displays were seen may in fact (also) have elicited surprise (e.g., Bennett et al., 2002). Second, in all studies, the facially unresponsive children may simply not have been surprised by the eliciting events, either because they failed to attend to them or because they did not appraise them as unexpected. This possibility is difficult to exclude because most studies did not include an independent indicator of surprise, and those that did found evidence for surprise only in a subsample (e.g., Camras et al., 2002; Hiatt et al., 1979). Third, again in all studies, the experimenter, parents, or peers were present as onlookers; therefore, at least in the studies with older children, surprise expressions may have been suppressed or masked in accordance with display rules (e.g., Charlesworth, 1964). Fourth, at least in infants, the evolutionary module responsible for the surprise display may not yet be fully developed (cf. Bennett et al., 2002). When taken together, these factors may well explain the observed dissociations between surprise and its facial display in the studies with children.
Studies With Adults

Studies of emotional expressions in adults allow researchers to avoid several of the problems inherent to studies with infants and small children. Nevertheless, only four studies appear to have examined adults’ facial reactions to potentially surprising events, and two of these (Ekman, 1972; Ekman, Friesen, & Ancoli, 1980) did not have surprise as their main focus. The first study was conducted by Landis (1924) as part of his pioneering if controversial research on facial expressions of emotion. As in the studies with children reviewed earlier, only a minority of Landis’s participants showed unambiguous components of the traditional surprise display in potentially surprising situations. For example, about 30% showed eyebrow raising and about 20% eye widening when a firecracker was dropped behind their chair. However, the conclusions that can be drawn from Landis’s research are limited, as his study is open to most of the objections mentioned in the discussion of the studies with children and suffers from additional problems, such as an inadequate measurement of facial reactions (see Ekman, Friesen, & Ellsworth, 1982; Woodworth & Schlosberg, 1954).

Ekman (1972) coded the spontaneous facial expressions of 50 participants who, while alone, watched a stressful film and a neutral film. One hundred twenty-six pure (i.e., unblended) surprise expressions were coded during the stress film and 28 during the neutral film. However, neither the number of surprising incidents shown in the films nor the participants’ subjective reactions to these incidents were measured; therefore, the strength of association between surprise and facial expression is difficult to evaluate.2 Ekman, Friesen, and Ancoli (1980) also investigated spontaneous facial expressions during a stressful film. Although retrospective ratings suggested that some surprise was experienced during the film, facial expressions of surprise were infrequent (Ekman et al., 1980, p. 1131; the exact data were not reported). However, the film used in this study seems to have primarily elicited feelings and facial reactions of disgust that may have interfered with the expression of surprise. Therefore, this study, too, does not permit firm conclusions about the relation between surprise and facial expression.
In contrast to these earlier investigations, Reisenzein (2000a) focused explicitly on surprise, which in this case was induced by confronting participants with unexpected solutions to selected items in a computerized quiz. Although subjective ratings and behavioral measures (reaction time [RT] delay on a parallel task) attested to the effectiveness of the surprise induction, maximally 34% of the participants showed a facial surprise display (at least one component) to any one item. Also, the observed surprise expressions were mostly one- or two-component displays. Hence, only moderate coherence between surprise and its facial expression was again found even when many potential problems of previous studies were controlled (for details, see Reisenzein, 2000a). Two possible remaining objections against this study, however, are that the intensity of surprise elicited by the unexpected quiz solutions was frequently too low to result in a facial display and that many participants inhibited their surprise expressions as the experimenter remained in the room during the quiz. Taken together, the available studies on surprise expressions in children and adults suggest three main conclusions. First, at least partial surprise displays in response to the theoretically predicted elicitors (unexpected events) have been observed in controlled laboratory situations. However, so far it has not been demonstrated that the traditional expression of surprise is shown in such situations by even the majority of surprised people. Second, most, if not all, observed cases of dissociation between surprise and facial expression can be attributed by proponents of APT to methodological problems (e.g., failure to induce surprise) or substantive factors (e.g., control of expression). 
Third, with the exception of the studies by Blurton Jones and Konner (1971) and Reisenzein (2000a), no attempt has so far been made to empirically examine the viability of these or other possible explanations for the observed dissociations between surprise and facial expression.
Aims and Overview of the Present Studies

The experimental induction and the measurement of surprise used in the present studies were based on a cognitive–psychoevolutionary model of surprise (e.g., Meyer, Reisenzein, & Schützwohl, 1997; Reisenzein, 2000b). The core of this model, which is depicted in Figure 1, concerns the mental processes elicited by (ultimately) surprising events. According to the model, these processes begin with (a) the appraisal of a cognized event as schema-discrepant or unexpected. Disconfirmation of an explicit or an implicit expectancy has been posited by practically all surprise theorists as the primary, if not the only, cognitive elicitor of surprise and is so regarded in common sense also (see Reisenzein, 2000b; Ruffman & Keenan, 1996). The appraisal of an event as unexpected then causes (b) the occurrence of a surprise feeling and, simultaneously, the interruption of ongoing information processing and the reallocation of processing resources to the unexpected event. The function of interruption and resource reallocation is to prepare the individual for (c) the subsequent analysis and evaluation of the unexpected event plus—if the results of this analysis indicate so—the updating of the relevant schemas. The first two steps in this series of mental processes are identified with the workings of the surprise mechanism proper, which is taken to be a phylogenetically old mechanism whose main evolutionary function is to monitor and update the person’s schemas or belief system in the face of unexpectedness (belief disconfirmation, schema discrepancy). Although this model of surprise shares an evolutionary focus with APT, it does not contain specific assumptions about the facial expression of surprise except for the assumption that this expression is caused by one or several of the processes posited in the model, possibly in conjunction with other mental events.

In accord with the described model, surprise was induced in the present studies by first establishing and then disconfirming a schema or set of beliefs concerning the experimental events. Several variants of this paradigm were used (e.g., changing the appearance of task-irrelevant stimuli in a choice reaction task after a prolonged series of no-change trials; violating a rule that a color sequence followed after participants had detected this rule). To check the effectiveness of the surprise induction, we sought to verify the occurrence of the surprise-related mental processes postulated in the model. These processes were inferred, in different studies, as shown in Figure 1.

Figure 1. Model of surprise-related mental processes and their indicators in Experiments 1–8. RT = reaction time.

2 For example, if the participants were three times surprised during the stressful film and showed a surprise display each time, the relative frequency of surprise expressions would be a high 84%. However, if they were five times surprised and showed one surprise display to a subjectively unsurprising event, the relative frequency would be a low 30%.
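The two percentages in Footnote 2 can be reproduced from the counts reported for Ekman (1972): 50 participants and 126 pure surprise expressions during the stress film. The following is a minimal sketch of one reading that yields the footnote's figures. It assumes that every participant was surprised equally often and, in the second scenario, that each participant produced exactly one display to an unsurprising event; the function and variable names are hypothetical and not part of the original article.

```python
# Source figures (Ekman, 1972, as reported in the text).
N_PARTICIPANTS = 50
N_EXPRESSIONS = 126  # pure surprise expressions coded during the stress film


def relative_frequency(surprises_per_person, misplaced_per_person=0):
    """Share of surprise episodes that were accompanied by a display.

    Assumption (not from the article): all participants had the same number
    of surprise episodes and the same number of displays shown to
    subjectively unsurprising events ("misplaced" displays).
    """
    episodes = N_PARTICIPANTS * surprises_per_person
    displays_to_surprise = N_EXPRESSIONS - N_PARTICIPANTS * misplaced_per_person
    return displays_to_surprise / episodes


scenario_a = relative_frequency(3)     # 126 displays / 150 episodes
scenario_b = relative_frequency(5, 1)  # (126 - 50) displays / 250 episodes
print(f"{scenario_a:.0%} vs. {scenario_b:.0%}")
```

Under these assumptions the same 126 coded expressions are compatible with both a high and a low expression rate, which is the point the footnote makes about the missing denominator.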
Experiment 1: Taking a First Look The experimental paradigm used in Experiment 1 was based on the work of Meyer, Niepel, Rudolph, and Schützwohl (1991). Participants worked on a choice reaction task. After a series of uniform trials that served to establish a schema for the experimental events, there was an unannounced change of appearance of the distractor stimuli intended to induce surprise. The effectiveness of the surprise induction was checked through self-reports of surprise and RT delay at the choice reaction task. The design of the experiment was a 5 (extremity of the unexpected stimulus change) × 2 (first vs. second presentation of the stimulus change) factorial with repeated measures on the second factor. Both manipulations were intended to influence the intensity of surprise.

Method Participants. Participants were 60 students (40 female, 20 male) at the University of Bielefeld whose mean age was 25.5 years. Thirty-two were introductory psychology students who participated as part of their study requirements. The rest were students of various other disciplines; some were volunteers, and some were paid (about $3).3 Participants were randomly assigned to the five stimulus change conditions.

3 Care was taken to ensure that no participant took part in more than one study: (a) We sampled students of different academic disciplines, different year groups, and in one case a different university (Experiment 7); (b) at the end of each study, we asked the participants whether they had ever participated in a similar study; and (c) we checked the videos for double appearances. To avoid advance suspicions that the studies dealt with surprise, they were announced under varying, inconspicuous titles (e.g., as studies on “visual perception” or on “ability to concentrate”).

Procedure. The experiment was described to the participants as a visual reaction task. Stimuli were presented on a personal computer with a color monitor and a mouse. Each trial began with the presentation of a fixation point at the center of the participant’s monitor, inside a 14 cm × 8 cm frame. The screen area outside the central frame (the background) was filled with a number pattern. After 1400 ms, the fixation cross was replaced by two words, one above the other; 500 ms later, a small dot appeared for the duration of 100 ms either above the upper word or below the lower word. The participants’ task was to react as quickly as possible to the position of the dot by pressing the right or left mouse button. The button press caused the words to disappear, thereby ending the trial. Trials were separated by 1000 ms. RT was recorded by the computer. In Trial 38, the first surprise trial, a salient change in the appearance of the display occurred. Depending on experimental condition, this change consisted of (a) the display of one of the two words in reverse video (RV) mode, that is, as white letters against a black background; (b) a slight change of the background pattern; (c) a pronounced change of the background (broad black and white stripes); or (d) the combination of the RV display of one word with either the slight or the strong background change. Previous research using similar repetition–change paradigms found that such stimulus changes cause moderate feelings of surprise, as well as an RT delay, in most participants (e.g., Meyer et al., 1991; Schützwohl, 1998). We expected, furthermore, that at least the extremes of the stimulus changes would differ in surprisingness (cf. Schützwohl, 1998). Immediately after Trial 38, the participants were interviewed about the nature of the stimulus changes, after which they were asked “Did the stimulus changes occurring in the last trial surprise you? If yes, how strongly?” Answers were given on a rating scale ranging from 0 (not at all surprised) to 10 (as surprised as one can be). The experiment then continued for nine more standard trials that were followed, in Trial 48, by a repetition of the surprise event. Once the trials had begun, the experimenter sat at a table located behind the participant and oriented 90° away, where he or she busied himself or herself with other tasks.
Thus, the situation was minimally social in character (“mere presence” paradigm; Guerin, 1986). At the end of the experiment, the participants were informed about the purpose of the study and the fact that they had been videotaped, and they were asked for permission to analyze the videos. This debriefing protocol was also followed in all subsequent studies. The participants’ expressive reactions were unobtrusively videotaped from the adjoining room (see Reisenzein, 2000a, for details). The camera took a frontal picture of the participant’s head and shoulders slightly from above. From the original video recordings, we constructed a master tape showing the two surprise trials and two randomly selected baseline trials. Each film clip began with the onset of the fixation cross and ended with the offset of the distractor words following the participant’s reaction. The clips were coded by two independent observers who were unaware of the identity of the trials (baseline or surprise) and of the aims of the experiment. Each clip was shown twice before it was coded. A present/absent coding scheme with eight categories was used. Four categories referred to facial or vocal surprise displays. A full facial display of surprise was defined to consist of three components (see Darwin, 1872/1998; Ekman, Friesen, & Hager, 2002): raising of the eyebrows (action units [AUs] AU1–AU2 of Ekman et al.’s [2002] Facial Action Coding System), widening of the eyes (accomplished by raising of the upper eyelid; AU5), and jaw drop/opening of the mouth (AUs 26 –27). Each facial expression component was coded as present if it occurred within the coding time window (on average 2.7 s in the first critical trial), regardless of how intense it was or how long it lasted (e.g., even the tiniest and briefest upward movement of an eyebrow was coded as “eyebrow raising present”). 
A fourth coding category captured surprise vocalizations such as “oh” or “wow.” Mouth opening was coded in conjunction with a surprise vocalization only if it preceded the vocalization and, thus, was apparently not simply the by-product of the vocalization. The remaining four coding categories comprised smiling (AU12) and laughter, as well as various nonverbal and verbal responses reflecting cognitive reactions to the surprising event: nodding, verbal acknowledgment (“aha”), and affirmative vocalizations (e.g., “see!”). These additional categories were suggested by Reisenzein’s (2000a) study.
We trained the coders using written descriptions of the coding categories, pictures of prototypical surprise expressions, and videotapes of comparatively expressive participants from the previous study by Reisenzein (2000a).
Results To reduce the effect of outliers, we fixed RTs > 2000 ms at this value (Fazio, 1990; Ratcliff, 1993; see also Meyer et al., 1991). RTs from false responses (12% in Trial 38 and 10% in Trial 48) were retained in the analyses. To obtain a baseline RT for each participant, we averaged the RTs of the nine trials immediately preceding Trial 38 and the nine trials between Trials 38 and 48.

Surprise feelings and RTs. In Trial 38, all but one participant reported at least minimal surprise (rating > 0) about the stimulus change (M = 5.4, SD = 2.3). Also, all but one of the participants showed an RT increase from his or her individual baseline; for 68%, it exceeded 2.5 standard deviations. Mean RT increase from baseline was statistically significant both overall (M = 435 ms, SD = 434), t(59) = 7.75, p < .001; and for each of the five experimental groups considered separately, with ts(11) > 3.05, ps at least < .02. In Trial 48, when the stimulus changes occurred for the second time, 53 of the 60 participants still felt minimally surprised (rating > 0) although, as predicted, less so than in Trial 38 (M = 3.4, SD = 2.5), t(59) = 5.2, p < .001. RTs in Trial 48 were still significantly above baseline overall (M = 376 ms, SD = 469), t(59) = 6.20, p < .001, and within each experimental group (p at least < .05, one-tailed).4 In four of the five groups (all but the “weak background change only” group), the RT increase in Trial 48 was less pronounced than in Trial 38, as predicted. When we excluded the exceptional group, the RT reduction from Trial 38 to Trial 48 became significant (M = −188 ms), t(47) = 2.9, p < .01.5 We had also predicted differences in surprise intensity between the five experimental conditions (stimulus changes), at least between the extreme groups.
In Trial 38, the obtained pattern of means conformed to predictions: Surprise ratings were lowest for the “weak background change only” condition (M = 4.5), and highest for the two background change-plus-RV word display conditions (both 5.9), with the other two conditions lying in between (5.3, 5.4). A comparison of the extreme groups confirmed the a priori hypothesis, t(34) = 1.8, p < .05 (one-tailed). The RT delays (increases from baseline) showed a parallel pattern, and again the comparison of the extreme groups was significant, t(34) = 2.1, p < .05 (one-tailed). In Trial 48, the differences between experimental groups in self-reports of surprise and (if the “weak background change only” group is excluded) in RT delay were no longer significant.

Facial displays. Because of equipment breakdown, the video recordings of three participants were missing. Of the remaining 57 participants, 3 (5.3%) showed one component of the facial surprise display (two brow raises, one eye widening) in the first surprise trial. No surprise expressions were coded in the second critical trial, and one (brow raising) was coded in the baseline trials. Smiling/laughter occurred once in the first and three times in the second surprise trial. Agreement between the two coders was perfect for the surprise displays; there was one disagreement for smiling. Satisfactory reliability of the coding system was also documented in Reisenzein’s (2000a) study, where a higher number of facial expressions were available for reliability estimation. To make sure that the coding interval in the critical trials had not been too short, the original tape recordings were re-examined. Again the same results were obtained.

4 Guided by the principle that statistical tests should fit research hypotheses as closely as possible, we report test results as significant if they meet the 5% criterion for a two-tailed test or at least for a one-tailed test in the case of a priori, directed hypotheses (see Furr & Rosenthal, 2003). One-tailed tests are marked as such.

5 The “weak background change only” group showed an unexpected RT increase compared with Trial 38, t(11) = −2.9, p < .05. Additional analyses suggested that most participants in this group could not identify the exact nature of the background change when it first occurred and therefore delayed their button press (which made the stimuli disappear) in Trial 48 to take a closer look.
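The RT preprocessing and the overall baseline comparisons reported for Experiment 1 can be retraced with a short sketch. This is only an illustration under stated assumptions, not the authors' analysis script: the function names are hypothetical, the 2.5-SD criterion is assumed to refer to the sample standard deviation of each participant's own baseline trials, and the overall tests are assumed to be one-sample t tests on the RT increase scores.

```python
from math import sqrt
from statistics import mean, stdev

RT_CAP_MS = 2000  # RTs above this value are fixed at the cap, as described in the text


def cap_rt(rt_ms, cap=RT_CAP_MS):
    """Fix an RT at the cap if it exceeds it."""
    return min(rt_ms, cap)


def rt_increase(baseline_rts, surprise_rt):
    """Return (increase in ms, increase in baseline-SD units).

    Assumption: the SD is the sample SD of the participant's own
    (capped) baseline trials.
    """
    capped = [cap_rt(rt) for rt in baseline_rts]
    base_mean = mean(capped)
    increase = cap_rt(surprise_rt) - base_mean
    return increase, increase / stdev(capped)


def t_from_summary(mean_diff, sd_diff, n):
    """One-sample t for a mean difference score (df = n - 1)."""
    return mean_diff / (sd_diff / sqrt(n))


# Summary values reported for Trial 38 and Trial 48 (N = 60)
print(round(t_from_summary(435, 434, 60), 2))
print(round(t_from_summary(376, 469, 60), 2))
```

Computed from the rounded means and standard deviations, the two values come out within about 0.02 of the reported t(59) = 7.75 and t(59) = 6.20, which is the size of discrepancy one would expect from rounding alone.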
Discussion The central finding of Experiment 1 is the extremely low incidence of facial displays of surprise, coupled with evidence for the presence of surprise from self-ratings and RT delay. Judged by the latter measures, most participants were surprised about the unexpected stimulus changes in both critical trials. By contrast, only 5% of the participants showed evidence of a surprise expression in Trial 38, and all of them showed only a single component of the expression. In addition, the manipulation of surprise intensity (achieved by varying the extremity of the stimulus changes and by repeating the stimulus changes) affected self-reports of surprise and RT delay in the predicted manner, but it did not affect facial expression.
Experiment 2: Taking a Closer Look Experiment 2 was part of a series of studies conducted to test the hypothesis that the intensity of felt surprise is influenced by the degree of mental interference caused by an unexpected event (see Reisenzein, 2000b). This experiment provided a convenient opportunity to verify the findings of Experiment 1 using (a) a somewhat different surprise paradigm that included an alternative manipulation of the intensity of experienced surprise; (b) several additional measures of the subjective experience of surprising events; and (c) a physiological indicator of surprise, pupillary dilation. Pupillary dilation is commonly listed as a component of the physiological orienting reaction that is held to be elicited by novel, unexpected, and significant events (e.g., Rohrbaugh, 1984), and it has been empirically demonstrated to occur in response to unexpected stimulus changes (e.g., Maher & Furedy, 1979). Because pupillary dilation is also a sensitive indicator of processing load or mental effort (Beatty & Lucero-Wagoner, 2000), we hypothesized that its occurrence in response to unexpected events reflects (directly) the exploration and analysis of these events (Meyer et al., 1997; see also Experiment 5). Klix, van der Meer, and Preuß (1984) even proposed that the dilation of the pupil in response to mental load may reflect an evolutionary adaptation to surprising events that occurred in the twilight, where pupillary dilation can presumably improve vision. Pupil size was determined
in Experiment 2 from video recordings of one eye. The magnified recording of the eye region needed for this purpose simultaneously allowed (d) an excellent observation of even slight movements of the eyebrow and eyelid. In this way, Experiment 2 provided for an additional methodological control of the results of Experiment 1.
Method Participants. The final sample consisted of 25 students (13 female, 12 male) from the same pool as in Experiment 1, who were between 20 and 28 years old. Four additional participants were excluded from the data analyses, three because of missing data (e.g., too-dark eyes), and one because she did not comply with the instructions. The participants were randomly assigned to one of two experimental conditions (high vs. low task difficulty). Procedure. The experiment was described as a study on pupil size changes during text comprehension. The participant was asked to put his or her head on a chin rest in front of a computer monitor and to read a text (the beginning of a novel) presented in three-word chunks inside a small window. There were two experimental conditions differing in task difficulty. The task difficulty manipulation was intended as an alternative method to influence the intensity of felt surprise, as we previously found that an unexpected event is experienced as more surprising if it interferes more strongly with ongoing activities (see Reisenzein, 2000b). In the difficult-task condition, the text was presented quickly (each three-word chunk for 300 ms, followed by a 50-ms pause), whereas in the easy task condition, it was presented slowly (each chunk for 900 ms, followed by a 300-ms pause). The potentially surprising event occurred immediately after the last word of the text. It consisted of a computer-generated random tone sequence of 2.5-s duration and the simultaneous appearance of a meaningless sequence of ASCII signs within the text window, after which the program abruptly terminated. Thus, the overall appearance of the stimulus changes was rather like that of a computer breakdown. A few seconds later, the participants were presented with a set of index cards, each of which showed a question and a corresponding rating scale. 
The questions asked, among other things, for felt surprise and for the degree of interference and distraction caused by the surprising event, and they were answered on scales ranging from 0 (not at all) to 100 (extremely). In addition, as an alternative measurement of experienced surprise, we asked the participants to graph the time course of their surprise feelings on a Time (from the onset of the surprise event until 4.7 s later, in 100-ms steps) × Intensity (0 [not at all surprised] to 100 [extremely surprised]) grid sheet (cf. Sonnemans & Frijda, 1994). These “surprise curves” were later digitized with the help of a graphic tablet. During the experiment, the right eye of the participant including the eyebrow region was filmed with a video camera fitted with a strong zoom lens. This allowed us to record pupil size changes as well as movements of the brow and eyelid. Other facial movements were not considered in this study. The video recordings were later manually scored for pupil size changes in 400-ms intervals at several points before and until 2.4 s after the onset of the surprise event. To this end, we replayed the video on a large TV monitor and froze the picture at each measurement point. We then fitted transparencies with different-sized measurement rings (1 mm apart in diameter) over the enlarged pupil until a best fit was achieved, and we recorded the corresponding diameter. If a blink occurred at the selected measurement point, we scored the nearest previous or following frame showing a clear image of the pupil. To estimate the reliability of the measurements, a second coder repeated the measurements for three participants. The agreement of the measurements was nearly perfect, interrater r = .96. Eye widening and brow raising were coded by a different observer in the interval beginning with the appearance of the tone and ending with the beginning of the ratings. Slow-motion display and repeated viewing were used.
In view of the finding in Experiment 1 that only one instance of brow raising was observed during the two coded baseline trials, only the facial reactions to the surprise event were coded.
Results Feelings of surprise and interference. According to the direct surprise ratings, 24 of the 25 participants were at least minimally surprised (rating > 0; M = 38, SD = 22, Mdn = 30). Similarly, the average of the individual maxima of the Time × Intensity “surprise curves” drawn by the participants was 47 on the 100-point scale (SD = 23, Mdn = 49). Substantial correlations of the direct surprise ratings with key parameters of the graphs (up to r = .88, obtained for the individual maxima) suggest that the measurement of surprise was reliable. Further confirming the effectiveness of the surprise induction on the experiential level, 19 (76%) of the 25 participants said that the surprise event distracted them at least minimally (rating > 0), and 16 (64%) said the event interfered at least minimally with the processing of the text.

Pupillary dilation. The mean of the measurements during the text reading and immediately before the surprise event served as a baseline for each participant. The pupil began to dilate after the surprise event, reaching its maximum 400–800 ms after the beginning of the tone sequence, after which it declined again and reached near-baseline at the last measurement point (2.4 s later). Paired t test (df = 24) comparisons with the baseline showed that the increase in pupil size was significant (ts at least > 1.9; ps at least < .05, one-tailed), with the exception of the last two measurement points. All participants showed some degree of pupil dilation in response to the surprise event, and 11 (44%) showed an increase > 2.5 standard deviations from their individual baseline.

Facial displays. No eyebrow raises were observed in response to the surprise event, but 1 of the 25 participants showed eye widening.

Effects of the experimental manipulation.
In accord with the hypothesis that the intensity of felt surprise is influenced by the degree of interference caused by an unexpected event, participants in the difficult-task condition felt more surprised than those in the easy-task condition. We obtained this effect for both the direct surprise ratings, t(23) = 1.9, p < .05 (one-tailed), and the surprise graphs, which were significantly higher in the difficult-task group from the second to the seventh measurement interval, ts(23) > 2.6, ps < .05. The participants also felt more strongly interrupted by the surprise event in the difficult-task condition, t(23) = 1.8, p < .05 (one-tailed). In contrast, the single case of eye widening was observed in the easy-task condition. The experimental manipulation also had no significant effects on pupil dilation in response to the unexpected event. This result is in line with the suggestion, made in the introduction to this study, that pupillary dilation mainly reflects processes concerned with the analysis of the unexpected event; for presumably these processes were similarly demanding in both experimental conditions.
Discussion Experiment 2 replicated the findings of Experiment 1 and added to them in three ways. First, the results of Experiment 1 were confirmed for a somewhat different surprise paradigm that included a different manipulation of surprise intensity. Second, we confirmed the
effectiveness of the surprise induction using a different method of measuring felt surprise (the drawing of a surprise curve), ratings of experienced interference and distraction, and pupillary dilation. Third, because of the magnified recording of the eye region, even very slight facial movements could be detected, ruling out another possible source of bias in Experiment 1. In contrast to some of the earlier studies (cf. the introduction), the strong dissociation between surprise and facial expression observed in Experiments 1 and 2 cannot be plausibly attributed to interfering muscular movements (practically none occurred), suboptimal recording or coding of (visible) facial expressions (see Experiment 2 in particular), or obvious measurement problems concerning the indicators of surprise (e.g., failure to relate the self-report questions to a clearly specified event, or substantially delayed self-reports). Nor do the findings seem to be attributable to a failure to induce surprise in many participants: Clear instances of the theoretically posited elicitors of surprise (unexpected events) were focally presented, and both self-reports and behavioral measures indicated that the surprise induction was successful for most participants. Hence, purely methodological explanations of the failure to observe a (visible) surprise display in most participants of Experiments 1 and 2 seemed implausible.6 In Experiment 3, we therefore examined in more detail the two substantive explanations of the findings of Experiments 1 and 2 available to APT: inhibition of facial displays and insufficient surprise intensity.
Experiment 3: Reducing Sociality, Increasing Intensity The first substantive explanation that can be offered by APT for the findings of Experiments 1 and 2 is that motor-expressive tendencies (signals to the face; Ekman, 1997) were elicited but were suppressed in an attempt to conform to internalized social or personal norms concerning appropriate expressive behavior (i.e., display rules). We tested this hypothesis in Experiment 3 by leaving half of the participants alone during the experiment. Although this manipulation of sociality still fails to rule out the presence of an “implicit audience” (e.g., Chovil, 1991; Fridlund, 1994), this is not required for a test of the display-rules hypothesis. That is, to test this hypothesis, one need not assume that display rules have no effect in solitary situations; only that their effect is reduced. APT predicts this to occur because the stimuli that presumably activate display rules (other people) are not present in solitary situations or are present only in symbolic form (see also Chovil, 1991). The second substantive explanation of the low incidence of surprise displays observed in Experiments 1–2 available to APT is this: A surprise display occurs only if surprise exceeds a certain threshold of intensity, and the facially nonreactive participants of Experiments 1 and 2 were not surprised enough (cf. Ekman, 1997; Tassinary & Cacioppo, 1992). This insufficient-intensity hypothesis covers two possibilities (see also Larsen, Norris, & Cacioppo, 2003): (a) The intensity of surprise was too low to cause a visible display, although invisible muscle changes did occur; (b) surprise was too weak to even elicit a motor signal to the face. To test the insufficient-intensity hypothesis, one can increase either the intensity of surprise or the sensitivity of facial measurement (by using electromyographic [EMG] recordings; cf. Tassinary & Cacioppo, 1992). In Experiment 3, we used the first method (increasing the intensity of surprise). This allowed us to test both versions of the insufficient-intensity hypothesis simultaneously: If surprise is strong enough, both the threshold of elicitation of the motor signal and the threshold of visibility of the resulting muscle movements should be exceeded. The method of surprise induction used in Experiment 3 was inspired by a study by Horstmann and Schützwohl (1998), who found that strong surprise can be induced by first making the participants believe that the stimulus events follow an invariable rule and then disconfirming this rule. This experimental paradigm also provided for an unobtrusive behavioral index of the participants’ expectations of the events occurring in the upcoming trials.

6 To examine the possibility that the self-reports of surprise were influenced by experimental demand characteristics (Orne, 1962), we conducted two additional studies (details are available from Rainer Reisenzein). No evidence for demand effects was obtained.
Method Participants. The final sample consisted of 22 students (12 female, 10 male) from the same pool as in Experiments 1 and 2, whose mean age was 23.2 years. The participants were randomly assigned to one of two experimental conditions (social vs. alone).

Procedure. Similar to Experiment 1, the participants worked on a choice RT task in which they had to react as quickly as possible to the position of a dot. However, in Experiment 3, the choice RT task was embedded into a second, rule-detection task: The dot appeared on the screen above or below a bar that was either red, green, blue, or yellow. The participants were told that the sequence of colors across trials followed an experimenter-specified rule and that their second task was to detect this rule. Beginning with Trial 9, they had to predict the color of the bar in the following trial. Although the rule underlying the color sequence was simple (red-green-blue-yellow), its detection was not trivial, because the colors had to be memorized and because the choice reaction task imposed an additional mental load. Pretesting suggested that 25–30 trials were needed to detect the rule. On this basis, 40 trials were presented. The first 39 trials consisted of nine repetitions of the four-color sequence plus its 10th repetition up to the next-to-last color. In Trial 40, the surprise trial, a black bar was presented instead of the yellow bar predicted by the rule. Immediately after Trial 40, the participants were asked to rate how surprised they felt about the occurrence of the black bar on the scale already used in Experiment 1. Two experimental conditions were compared. In the (minimally) social condition (n = 10), the experimenter remained in the room during the experiment. In the alone condition (n = 12), the experimenter left the room after the first two or three trials and returned only after the surprise trial.
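The trial structure of the rule-detection task can be made concrete with a small sketch. It only restates the design described above (a repeating red-green-blue-yellow sequence for 39 trials, then a black bar in Trial 40); the names `RULE`, `build_color_sequence`, and the constants are hypothetical and are not taken from the authors' experimental software.

```python
from itertools import cycle, islice

RULE = ["red", "green", "blue", "yellow"]  # repeating color rule from the text
N_RULE_TRIALS = 39                         # trials that follow the rule
SURPRISE_COLOR = "black"                   # shown in Trial 40 instead of the predicted color


def build_color_sequence():
    """Return (colors for Trials 1-40, color the rule predicts for Trial 40)."""
    rule_trials = list(islice(cycle(RULE), N_RULE_TRIALS))
    predicted = RULE[N_RULE_TRIALS % len(RULE)]
    return rule_trials + [SURPRISE_COLOR], predicted


sequence, predicted = build_color_sequence()
print(len(sequence), sequence[-2], sequence[-1], predicted)
```

Thirty-nine rule-following trials correspond to nine full cycles plus three colors, so Trial 39 shows blue and the rule predicts yellow for Trial 40, where the black bar appears instead.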
Results and Discussion Rule detection (pre-event expectancies). The correct prediction of the entire color sequence immediately before the critical Trial 40 was used as the criterion for rule detection. According to this criterion, all but two participants had detected the rule by Trial 39. On this basis, it can be assumed that the subsequent appearance of the black bar was rule-discrepant for nearly everybody. In addition, the black bar was also visually discrepant to the preceding colors (cf. Experiments 1–2).

Surprise feelings and RT delay. Replicating Horstmann and Schützwohl (1998), most participants rated themselves as strongly surprised by the appearance of the black bar: On the scale ranging from not at all surprised (0) to as surprised as one can be (10), all but three participants (86%) gave ratings ≥ 6, and 50% gave ratings ≥ 8 (M = 7.0, SD = 2.2). t-test comparisons showed that this mean was significantly higher than in Experiments 1 and 2 (ps < .01). We obtained no significant difference in felt surprise between the social group and the alone group, t(20) < 1. The RT data were analyzed analogously to Experiment 1. We found a significant RT increase from baseline in Trial 40 in both the social group, t(9) = 3.7, p < .01; and in the alone group, t(11) = 5.2, p < .001. All but two participants showed an RT increase from their individual baseline, and 13 (59%) showed an increase > 2.5 SD (M = 435 ms; SD = 434). We found no significant difference between the experimental groups in RT increase, t(20) = 1.1, p = .28.

Facial displays. Brow raising occurred in 2 (9%) of the 22 participants (11% of those with surprise ratings ≥ 6). Both belonged to the social condition, but this difference is not significant (Fisher exact probability test). Other components of the surprise display were not observed. The obtained frequency of surprise displays is not significantly different from that in Experiment 1 (Fisher exact probability test, p = .43) or from that in Experiments 1 and 2 combined (p = .39). Two additional participants in the social condition reacted with a frown in response to the unexpected color change, possibly reflecting puzzlement (Darwin, 1872/1998).
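For readers who want to retrace the Fisher exact comparisons of display frequencies, the following sketch computes the test directly from the hypergeometric distribution. It is an illustration, not the authors' procedure: the function names are hypothetical, and the text does not say whether one- or two-sided probabilities were reported, so both are shown. The counts used are those given in the text (2 of 10 social vs. 0 of 12 alone participants in Experiment 3; 2 of 22 in Experiment 3 vs. 3 of 57 in Experiment 1).

```python
from math import comb


def hypergeom_pmf(k, n_a, total_hits, total_n):
    """P(exactly k displays in group A) with all table margins fixed."""
    return comb(n_a, k) * comb(total_n - n_a, total_hits - k) / comb(total_n, total_hits)


def fisher_one_sided(hits_a, n_a, hits_b, n_b):
    """P(group A shows at least hits_a displays) under the null hypothesis."""
    total_hits, total_n = hits_a + hits_b, n_a + n_b
    upper = min(n_a, total_hits)
    return sum(hypergeom_pmf(k, n_a, total_hits, total_n)
               for k in range(hits_a, upper + 1))


def fisher_two_sided(hits_a, n_a, hits_b, n_b):
    """Sum of the probabilities of all tables no more likely than the observed one."""
    total_hits, total_n = hits_a + hits_b, n_a + n_b
    lower, upper = max(0, total_hits - n_b), min(n_a, total_hits)
    p_obs = hypergeom_pmf(hits_a, n_a, total_hits, total_n)
    probs = [hypergeom_pmf(k, n_a, total_hits, total_n) for k in range(lower, upper + 1)]
    return sum(p for p in probs if p <= p_obs * (1 + 1e-9))


# Social (2 of 10) vs. alone (0 of 12) in Experiment 3
print(fisher_one_sided(2, 10, 0, 12), fisher_two_sided(2, 10, 0, 12))
# Experiment 3 (2 of 22) vs. Experiment 1 (3 of 57)
print(fisher_one_sided(2, 22, 3, 57), fisher_two_sided(2, 22, 3, 57))
```

Under these assumptions, the social versus alone comparison is far from significant either way, and the one-sided probability for the comparison with Experiment 1 comes out near the reported p = .43.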
Conclusions Experiment 3 revealed that the incidence of surprise displays in an alone condition did not differ significantly from that in a (minimally) social condition. This finding speaks against the hypothesis that the low frequency of surprise expressions in this study, as well as in Experiments 1 and 2, was due to inhibition or masking. Experiment 3 also revealed that even an event rated as highly surprising by most participants did not result in significantly greater expressivity. This speaks against the insufficient-intensity hypothesis.
Experiment 4: Increasing the Duration and Complexity of the Surprise Event and Examining Beliefs About Expression If the conclusions drawn from Experiment 3 are correct, then the APT of facial expressions of surprise in its original form is untenable. However, this does not mean that the more general assumption of this theory (that there is an evolutionary link between surprise and facial expression) has to be discarded as well. It is possible to expand or modify APT in ways that preserve this central idea but provide new conceptual resources to explain the dissociation findings (whether researchers should still call these modifications “variants of APT” is another question). The general principle underlying these modifications is to make emotional displays (here the surprise display) depend on factors in addition to the presence of the emotion and the absence of deliberate control (Reisenzein, Meyer, & Schützwohl, 1996; see also Ekman, 1993, for a number of suggestions in this direction). This move necessarily weakens the link between emotion and expression, but in contrast to more radical alternatives (e.g., Fridlund, 1994), it does not completely sever this link. In Experiments 4–6, we tested several possible modifications of the APT of surprise. In Experiment 4, we tested what is perhaps the most conservative modification of APT. Precisely speaking, this modification covers a whole set of hypotheses that have in common that they view duration as a critical variable for the elicitation of the surprise display: It is assumed that surprise manifests itself in the face only given some minimal duration of (a) the event that causes a schema discrepancy, (b) the schema discrepancy itself (i.e., the time needed for resolving the discrepancy and for schema update), or (c) the feeling of surprise caused by the schema discrepancy. Assuming that any one of these time spans is critical for a surprise display to occur, it may have been too brief in the preceding studies. To test this hypothesis, we simultaneously varied the duration and complexity of the surprising event. In addition to this main goal, we had two other aims in Experiment 4. First, we wanted to provide further evidence for the presence of surprise in our participants. For this purpose, in Experiment 4 we included a fairly comprehensive set of self-report measures tapping different aspects of the subjective experience of surprising events that are predicted to occur by the model described in the introduction. Second, we queried what the participants themselves thought about their facial displays.
Method Participants. The participants were 22 students (14 female, 8 male) from the same pool as in the previous experiments, with a mean age of 23.9 years. They were randomly assigned to one of two experimental conditions (short/simple vs. extended/complex).

Procedure. A short-term memory paradigm served as the parallel task during which participants were distracted by a surprising event. The memory task comprised 54 trials. Each trial began with the presentation of a fixation cross at the center of the computer monitor. Five hundred milliseconds later, seven different, randomly selected consonants were simultaneously shown for 4 s. This was followed by a rehearsal period of 4 s, symbolized on the monitor by a 20-step countdown that began with the number 20 and ended with a question mark in place of the zero. The numbers were presented successively at the center of the screen for 200 ms, and each number was accompanied by a brief but fairly loud tone. The participants were asked to memorize the letters and to report back as many as they could when the question mark appeared. The recalled letters were noted down by the experimenter. There were five critical trials: Trials 20, 28, 34, 50, and 54. The potentially surprising event consisted of several of the countdown numbers being shown in reverse video mode (white against black), a one-octave pitch change of the accompanying tones, or both stimulus changes combined. In one of the two experimental conditions, the surprise event was kept short (affecting Steps 18–15 of the countdown = 800 ms) and configurally simple. In the other condition, the stimulus changes were more extended (affecting several steps of the countdown beginning with the first and ending with the last = 4 s) than those in the short/simple condition and in the previous experiments (on average, 1.6 s in Experiment 1 and 2.5 s in Experiment 2), and they deviated in a complex, unpredictable way from the audiovisual pattern presented during the baseline. We reasoned that this complex deviation might be less readily accommodated to the previously established schema. After each surprise trial, the participants were presented with 11 index cards asking for the occurrence of surprise and surprise-related processes postulated in the surprise model described in the introduction (e.g., interference, attention capture, and forgetting of the letters caused by the
surprising event). In addition, as in Experiment 2, the participants were again asked to graph the course of their surprise feeling across time. Given the dearth of surprise displays observed in Experiments 1–3, we were curious what the participants themselves thought about their facial reactions to the surprise event. Therefore, after the first two questions (asking for experienced surprise and interference), they were also asked the following: “Do you think your surprise showed on your face? (yes/no). If yes, how did it show? (My eyebrows went up; my eyes widened; my jaw dropped/my mouth went open; I blinked; other)”.
Results and Discussion Surprise phenomenology and memory performance. Results concerning the experience of the surprise event replicated and extended those obtained in the previous experiments. The mean surprise rating in the first surprise trial was 53; all participants were at least minimally surprised (rating > 0), and 68% had ratings > 50. A 2 (experimental condition: long/complex vs. short/simple) × 5 (repetition of the surprise event) analysis of variance (ANOVA) with repeated measures on the second factor revealed that surprise intensity declined significantly across the critical trials from M = 53 in the first trial to M = 13 in the third trial, after which it remained at this level, F(4, 80) = 26.1, p < .001, Huynh–Feldt ε = .81. In contrast, experimental condition (short/simple vs. long/complex surprise event) had no significant effect, and the interaction was also nonsignificant (Fs < 1). Parallel results were obtained in the first surprise trial for the maxima of the individual surprise curves (M = 67), as well as for the ratings of experienced interference (M = 47), distraction (50), confusion (43), attention capture (50), and forgetting (52). In all cases, the ratings declined significantly with the repetition of the stimulus changes (all repetition effects were significant at p < .01 or better) and reached a low bottom level at around the third repetition. The effects of experimental condition and the interaction effects were nonsignificant. In contrast to expectations, there was also no significant effect of experimental condition on the estimated duration of surprise (M = 1.14 s according to the direct rating), although this measure, too, declined significantly with the repetition of the surprise event, F(4, 80) = 7.3, p < .01. Finally, startle [German: “erschreckt”] was rated as low (M = 28 in the first critical trial).
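As a reading aid, the degrees of freedom reported for the repetition effect can be reconstructed from the design (two between-subjects groups, N = 22, five repeated measurements). The symbols k, g, and N are labels introduced here for illustration. The Huynh–Feldt-adjusted values in the last line are our own arithmetic under the usual convention of multiplying both degrees of freedom by ε; the text reports only the unadjusted values.

```latex
\begin{align*}
df_{\text{repetition}} &= k - 1 = 5 - 1 = 4,\\
df_{\text{error}} &= (N - g)(k - 1) = (22 - 2) \times 4 = 80,\\
\varepsilon \, df_{\text{repetition}} &= .81 \times 4 = 3.24, \qquad
\varepsilon \, df_{\text{error}} = .81 \times 80 = 64.8.
\end{align*}
```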
The effectiveness of the surprise induction was additionally confirmed by the analysis of the expectedness ratings (“Did you expect a change of stimulus presentation in this trial?”). In the first critical trial, the participants clearly had not expected the surprise event to happen (M = 10), whereas in the second and in the following critical trials, the mean expectancy ratings ranged between 60 and 70. Finally, after the first occurrence of the surprise event, 36% of the participants said that they wondered about what had happened, and 70% said that they spontaneously inferred that the event was a part of the experiment. The participants’ reports about forgetting, interference, attention capture, and distraction were partly supported by their performance at the memory task. In the first surprise trial, memory performance decreased from an average of 66% correctly recalled letters during a baseline period (the 10 trials prior to the first surprise trial and all trials between the critical trials) to 57% correct, t(21) = 2.0, p < .05 (one-tailed). Fifteen (68%) of the participants showed a performance decline. In the second and the
subsequent critical trials, the performance decline was no longer significant. Facial displays. Two (9%) of the 22 participants showed a component of the surprise display (one eyebrow raising, one eye widening) in the first surprise trial, and two more showed eyebrow raising to, respectively, the second and fourth repetition of the surprise event. Of the additional coding categories, only smiling/laughter was observed, which occurred in 11 (50%) of the participants in the first and twice each in the second, third, and fourth critical trial. This may reflect that the participants found the surprise event more amusing than in the previous experiments, that the level of sociality was higher, or both (cf. Hess et al., 1995). Beliefs about expression. In striking contrast to the objective codings, 77% of the participants in the first surprise trial believed that they had shown one or more components of the facial surprise display (59% eyebrow raising, 55% eye widening, 14% mouth opening). In addition, 36% thought that they had blinked, and 5% reported that they had shown other movements in response to the surprising event, although this, too, was not confirmed by the video codings. In the second to fifth critical trial, the percentage of participants who believed that they had shown at least one component of the surprise display (of those who still felt surprised) was 46%, 32%, 45%, and again 32%, respectively. The effect of repetition was significant, F(4, 80) = 4.73, p < .01, Huynh–Feldt ε = 1.0, whereas the effect of experimental condition (short/simple vs. long/complex) was not, F(1, 20) = 1.39, p = .25, nor was the interaction (F < 1).
Conclusions Experiment 4 added to the previous experiments in three ways. First, the insufficient-duration hypothesis was not supported: There was no evidence for a greater probability of facial displays of surprise in response to a longer and more complex unexpected event. Second, we obtained additional support for the presence of surprise in most participants: Particularly in the first surprise trial, the participants reported considerable interference, distraction, confusion, attention capture, and forgetting of task material induced by the surprising event. This finding was verified by an objective decline of performance on the memory task. In addition, the participants rated the surprising event in the first surprise trial as highly unexpected, and they reported the occurrence of investigative processes and causal attributions. Third, we found that despite the low incidence of visible facial surprise expression, the majority of the participants believed that they had shown at least one component of the surprise display. We delay discussion of this (surprising) finding to Experiment 5.
Experiment 5: Embedding Surprise Displays Into Orienting Movements Today, the most widely assumed evolutionary function of the facial displays associated with emotions is communication to conspecifics (e.g., Ekman, 1997; cf. Fridlund, 1994). In contrast, Darwin (1872/1998) proposed that emotional displays evolved primarily because of their nonsocial functions. With respect to surprise, Darwin suggested that eye widening and eyebrow raising evolved primarily to aid the rapid localization and visual investigation of an unexpected event. Similar proposals have been made by others (e.g., Andrew, 1963; Fridlund, 1994; see also the reviews and discussions in Ekman, 1979; Smith & Scott, 1997). On the basis of these considerations, it appears possible that facial displays of surprise occur preferably if the localization and direct investigation of the surprise-eliciting event require a visual search including rapid reorientation of the eyes, head, or body toward the event. In support of this hypothesis, Blurton Jones and Konner (1971) found that brow raises in children in response to a clock buzzer that suddenly went off during story time were more frequent if the clock was hidden behind some object than when it was clearly in view. This hypothesis, which can be regarded as another modification of APT (but see the General Discussion), was reexamined for adults in Experiment 5. In a manner similar to that in Experiment 4, the participants worked on a memory task in which they were surprised by an unannounced sequence of tones. However, in contrast to Experiment 4, the tones were played through a loudspeaker located to the right and above the eye level of the participants. As a consequence, eye and head movements to the right and slightly upward were necessary to visually explore the sound source in an optimal way. We also attempted to influence (facilitate vs. inhibit) this visual exploration tendency experimentally (see Method). An additional goal of Experiment 5 was to verify the findings of Experiment 4 concerning participants’ beliefs about their surprise displays.
Method Participants. Participants were 20 students (13 female, 7 male) with a mean age of 23.7 years from the same pool as in the previous experiments. They were randomly assigned to the two experimental conditions (facilitation vs. inhibition of the visual exploration tendency). Procedure. A short-term memory paradigm similar to that in Experiment 4 was used. The memory task comprised 20 trials, with the surprise event occurring in the last trial. In contrast to Experiment 4, the surprise event was exclusively auditory in nature: It consisted of an irregular sequence of 20 high and low tones, which were played during the 20-step rehearsal countdown through a small loudspeaker located approximately 60° to the right of the participant and 30° above eye level. The speaker was partly hidden behind a wall poster to make the detection and exploration of the sound source more difficult. Also in contrast to Experiment 4, no tones were played during the baseline trials so that the location of the speaker would not be revealed prematurely. The experimental manipulation of the visual exploration tendency was based on the assumption that the surprising event would elicit two different investigatory tendencies: a nonsocial one (visual exploration of the sound source) and a social one (asking the experimenter about the significance of the event). In the facilitation condition, the experimenter sat at a table to the right of the participant and, hence, in a direction congruent with the location of the hidden speaker. In the inhibition condition, the experimenter sat to the left of the participant. As a consequence, in the facilitation condition, the movements suggested by the nonsocial exploration tendency were compatible with those suggested by the social one, in that both involved turning to the right. In contrast, in the inhibition condition, the two exploratory tendencies suggested incompatible movements.
Immediately after the surprise trial, the participants were presented with three index cards asking for the intensity of felt surprise, perceived facial changes, and the duration of felt surprise. Behaviors were coded as before, but two new categories were added: eye or head movements toward the loudspeaker and turning to or asking the experimenter about the significance of the event.
Results and Discussion With the exception of estimated surprise duration, there were no significant differences between the two experimental conditions; therefore, experimental condition is ignored. Experience of surprise; interference with the parallel task. Mean intensity of felt surprise was 61.3 (SD = 21.6, Mdn = 61), with all but one participant checking numbers > 0 and all but two checking numbers ≥ 50 on the 100-point scale. Estimated duration of surprise was 1.8 s (SD = 1.1). Memory performance decreased from an average of 79% correctly recalled letters during baseline (the 10 trials before the critical one) to 59% correct in the surprise trial, t(19) = 3.74, p < .001, with 15 (75%) of the 20 participants showing a performance decrement. Investigative activities. In line with expectations, the majority of the participants (12, or 60%) showed at least one of the two hypothesized investigative activities. Seven participants looked to the source of the surprise event. Eleven turned to and looked at the experimenter; nine of these asked the experimenter about the significance of the tones. Six participants showed both behaviors. Facial displays. We observed one instance each of eyebrow raising and mouth opening and three cases of eye widening, each shown by a different participant; hence, 5 participants (25%) showed a component of the surprise display. This percentage is significantly higher (Fisher exact probability test, p < .05) than that observed in Experiments 1–4 combined (8 of 124 participants, or 6.3%), although only marginally higher (p < .09, one-tailed) than in Experiment 4 (9%). In addition, the surprise displays were preferentially shown by participants who looked to the loudspeaker: 4 of these 7 participants showed a facial component of surprise, as compared with 1 of 13 who did not look to the speaker; Fisher exact probability test, p < .05. Smiling or laughter (mostly the former) was observed in most participants (75%).
All participants who showed a surprise expression also smiled, but in each case the smiling occurred only 2–3 s after the surprise display. Therefore, it is unlikely that smiling interfered with the surprise display in the other participants. Beliefs about expression. Closely replicating the findings of Experiment 4, 16 (80%) of the participants said they believed that their surprise had shown on the face in one or more of the following forms: eyebrow raising (60%), eye widening (45%), jaw drop or mouth opening (30%). To aid the interpretation of the self-reports, we additionally asked the last 17 participants whether other people would have noticed their facial expressions if they had closely watched. Thirteen of these participants believed that they had shown a surprise display; 12 of them said that others could have noticed.
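The two Fisher exact comparisons reported above can be retraced from the cell counts given in the text. The following is a minimal sketch, not the authors' analysis script: the function name `fisher_exact_one_sided` is hypothetical, and it computes a one-sided (upper-tail) probability, whereas the text does not state which tail convention was used for these two tests.

```python
from math import comb


def fisher_exact_one_sided(a, b, c, d):
    """Upper-tail Fisher exact probability for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric distribution with all
    margins fixed (hypothetical helper written for this illustration).
    """
    row1 = a + b          # size of the first group
    col1 = a + c          # total number of "display shown" cases
    n = a + b + c + d     # total sample size
    total = comb(n, row1)
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, upper + 1))
    return tail / total


# Looked to the loudspeaker: 4 of 7 showed a display; did not look: 1 of 13.
p_look = fisher_exact_one_sided(4, 3, 1, 12)

# Experiment 5: 5 of 20 showed a display; Experiments 1-4 combined: 8 of 124.
p_pooled = fisher_exact_one_sided(5, 15, 8, 116)

print(f"looked vs. did not look: p = {p_look:.3f}")
print(f"Experiment 5 vs. Experiments 1-4: p = {p_pooled:.3f}")
```

Under these assumptions, both probabilities fall below .05, in line with the reported results.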
Conclusions Experiment 5 yielded two main findings. First, in line with the visual exploration hypothesis, the frequency of surprise displays increased significantly compared to the previous studies if the direct exploration of the surprising event required visual search. This result replicates the findings of Blurton Jones and Konner (1971) for adults. However, the incidence of surprise displays was still small (25%) and only one-component displays were observed. In evaluating these results, it must be considered that the attempt
to instigate visual search was only partly successful. When visual exploration occurred, surprise displays were more frequent (57%). Second, Experiment 5 replicated and extended the results from Experiment 4 of a dissociation between displays of surprise and participants’ beliefs about their displays. Theoretically, this finding can mean two things. First, it could mean that most participants reacted with minute, invisible surprise expressions to the stimulus changes (cf. Tassinary & Cacioppo, 1992), and people are sensitive observers of even such invisible expressions. Second, it could mean that the participants’ self-reports were based on a different source of information. What comes to mind here are, in particular, generalized beliefs about the association between surprise and expression (cf. Ekman et al., 1987). The finding of Experiment 5, that participants believed that their surprise expressions were visible to others, supports the second explanation. At least, participants believed that their facial expressions were more intense than they in fact were.
Experiment 6: Generalization to a Different Surprise Event The main aim of Experiment 6 was to test whether the findings of the previous studies are restricted to surprising events of the kind staged in these studies (simple audiovisual changes) or can be generalized to other kinds of surprising events. In choosing a generalization event, we wanted in particular to meet any remaining concerns that the surprise induced in the previous studies, including Experiment 3, was still not intense enough to elicit a facial display. To address this concern, we staged a surprise situation that seemed intuitively powerful but still permitted strict experimental control: Participants were secretly photographed while they rated a series of pictures of faces on the monitor, and their own picture was presented to them as the last in the series. In a pretest, where we described this event to 33 participants and asked them to estimate their likely reactions, it received not only high surprise ratings, but most participants also believed that their surprise would show strongly on their face (rating of M = 70 on a 0–100-point scale). The pretest also suggested that the described surprising event would be experienced as highly amusing, and thus as a pleasant surprise. Furthermore, confrontation with one’s own face is held by some authors to be a powerful social stimulus (e.g., Wicklund & Frey, 1980) that should therefore have higher personal relevance than the surprise events staged in the preceding studies. The second aim of Experiment 6 was to retest the display-rules hypothesis (cf. Experiment 3), again by varying the level of sociality. We used this opportunity to simultaneously test yet another possible modification of APT: that the surprise display occurs only when the level of sociality is high (more detail is given in the Method section).
Finally, to further clarify the findings of Experiments 4 and 5 concerning beliefs about facial expression, we asked the participants of Experiment 6 to rate the intensity with which surprise had shown on their face and to describe the facial changes in their own words.
Method Participants. Participants were again 23 students (13 female, 10 male) at the University of Bielefeld, whose mean age was 25.3 years. They were randomly assigned to one of two experimental conditions, with 11 in the nonsocial and 12 in the social condition. Seven additional participants had to be excluded from the data analyses, 3 because they did not recognize themselves on the monitor and 4 because of equipment problems. Procedure. Participants were told that the goal of the experiment was to test whether photographs of faces are judged differently when presented in different media. In the first phase of the experiment, they judged 16 pictures of faces presented on the monitor. To ensure that the participants paid attention to the identity of the depicted person, they first indicated via a keypress whether or not the face appeared familiar to them. Subsequently, they rated the person on two trait scales (conscientious and well balanced). The photographs were black-and-white and color pictures of faces showing mostly a neutral expression and included a few pictures of well-known politicians and actors. A spy camera hidden in a book case next to the monitor transmitted an image of the participant’s face to the adjoining room. There, a confederate made a photograph of the participant’s face from the incoming video stream with the help of video capturing software and edited the picture to make it similar in appearance to the other photographs. For the second phase of the experiment, the participants were seated at a different table, where they judged 16 printed photographs on the same scales. This allowed the experimenter to unobtrusively download the participant’s picture to the experimental computer. In the third phase of the experiment, the participants judged another series of faces on the monitor. The seventh and last face of this series was their own. Subsequently, they completed a set of rating scales and questions. 
In addition to items asking for surprise and surprise-related processes, these included 14 emotion or mood items (e.g., happy, angry, startled, nervous, embarrassed, tired, wakeful). Sociality was manipulated as follows: In the alone condition, the experimenter left the room after the instruction and returned only when the participant, after the first and second series of photographs, pressed a signal button. The experimenter then briefly explained the next phase and left again. After the surprise trial, he waited for 30 seconds before he returned to the room. In the social condition, the experimenter remained in the room throughout the experiment. However, different from the manipulation of sociality used in Experiment 3, he sat to the side of the monitor table facing the participant, and noted down his or her picture ratings, that had to be made verbally in this condition. This established a continuous face-to-face interaction between experimenter and participant.
Results and Discussion Evidence for surprise. Sociality had no significant effect on most variables and will therefore only be mentioned for the exceptional cases. Most participants judged the appearance of their own face on the monitor as very surprising, with all 23 scoring ≥ 30 on the 0 (not at all surprised) to 100 (extremely surprised) scale, and 16 ≥ 60; M = 69 (SD = 21, Mdn = 70). Similar results were obtained on scales asking for astonishment (M = 67) and amazement (M = 66). To obtain more information on the meaning of the surprise ratings, the participants were also asked to recall a “highly surprising” event from their past and to compare it with the experimental event. On average, participants judged the intensity of surprise caused by the experimental event to be 70% of that of the recalled event (SD = 38); 10 gave percentage scores ≥ 70% and 5 > 100%. The effectiveness of the surprise induction was further supported by the ratings of attention capture (M = 72) and
confusion (55), and by reports about spontaneous explanatory search (87% of the participants). Also, the participants typically had not expected that anything unusual would happen during the experiment, M = 31. The average estimated duration of surprise was 4 s (with 8 responses in the “> 5-s” category fixed at 6 s). The surprising effect of the appearance of one’s own face was also reflected behaviorally: (a) The response to the “familiar face” question was significantly retarded relative to baseline (the average RT of the 10 pictures preceding the critical item), t(22) = 4.3, p < .001. Eighteen participants showed an RT increase from baseline; of those who did not, 2 recognized themselves only during the first trait judgment. (b) Ten of the 12 participants in the social and 4 in the nonsocial condition made spontaneous verbal exclamations suggestive of surprise, such as “hey” or “that’s me,” p < .05 (Fisher exact probability test). (c) Seventy-four percent of the participants showed evidence of either a visual search (e.g., taking a second look at the picture) or, more typically, a verbal search (asking the experimenter). Facial expression of surprise and beliefs about expression. We coded the first 10 s after the onset of the surprise event. We observed one case of eyebrow raising and one case of eyebrow raising plus mouth opening, one in the social and the other in the nonsocial condition (8.7%). Nonetheless, all participants believed that they had shown a surprise expression. In this study, we first asked them to state how strongly their surprise had shown on the face on a scale ranging from 0 (did not show at all) to 100 (showed extremely strongly). The mean rating on this scale was M = 78 (SD = 18); 19 (82%) of the participants had scores ≥ 70. Second, we asked the participants to describe in their own words how surprise had shown on the face (in a way visible to others).
The most frequently named expression was smiling/laughter (34%), followed by “wide eyes” (30%; this description may have been meant to include raised eyebrows), brow raising (26%), and mouth opening/jaw drop (9%). At least one of the three classical surprise components was named by 52%. Other affects and nonverbal behaviors. As suggested by the pretest, the other strong emotion elicited by the sudden appearance of one’s own face was amusement, M = 75 (SD = 19). This was presumably also reflected in a high happiness rating on the mood questionnaire (M = 72). With the exception of a set of items concerned with wakefulness and relaxation, the means for all other emotion and mood items were low (e.g., angry = 4, embarrassed = 27, and startled = 30 on the 100-point scale). The most frequently observed facial expression was smiling/laughter, which occurred in 22 of the 23 participants. However, as in Experiment 5, the two participants who showed a partial surprise display smiled only several seconds after this display. The intensity of the mirth expression was higher in the social condition (11 cases of laughter) than in the alone condition (5 times). If smiling is coded as 1 and laughter as 2, the difference between the means of the two conditions (1.9 vs. 1.3) is significant, t(21) = 2.6, p < .05. Because the occurrence of amusement was independently ascertained in this study, this finding suggests that the presence of the experimenter increased the tendency to express amusement (cf. Hess et al., 1995).
Conclusion Experiment 6 tested whether the previous findings could be generalized to a different, intuitively powerful surprise situation. Confirming this intuition, the unexpected appearance of their own face on the monitor was, on average, rated as having 70% of the intensity of a recalled, highly surprising event. Nonetheless, the results concerning facial expression replicated all of the central findings of the previous experiments. First, only 2 of the 23 participants (9%) showed a surprise display, and in both cases it was partial only. This finding refutes once again the insufficient-intensity hypothesis (Experiment 3). Second, the manipulation of sociality had no effect on the surprise display. This finding speaks once more against the hypothesis that surprise displays were inhibited or masked (Experiment 3); a conclusion that receives additional support from the finding that sociality did affect spontaneous verbal exclamations suggestive of surprise, as well as mirth reactions. Furthermore, the finding that even a face-to-face interaction failed to bring forth more of a surprise display speaks against yet another hypothesis: that too low a level of sociality prevented the surprise expression from occurring in the preceding studies. Third, the finding that the estimated duration of surprise (on average, 4 s) was much longer than in Experiments 4 and 5 (1.1 s and 1.8 s, respectively) speaks once more against the insufficient-duration hypothesis (cf. Experiment 4). Finally, as in Experiments 4 and 5, the participants typically believed that they had shown components of the surprise display. The frequency of spontaneously mentioned surprise components was about 30% less than that obtained by the checklist method in Experiments 4 and 5, but they were still listed by the majority. Furthermore, most participants believed that their surprise expression had been intense.
This speaks further against the hypothesis that the self-reports about facial expressions were based on invisible, minute facial changes. Finally, we found that, apart from amusement, other strong emotions were not elicited by the surprising event staged in Experiment 6. However, because the amusement ratings were as high as those of surprise, proponents of APT could at this point raise the objection that the expression of surprise did not occur with higher frequency because the feeling of amusement, or the facial display occasioned thereby, overruled surprise or the associated expression. Regardless of the merits of this explanation in other cases, we do not regard it as convincing in the present case for both theoretical and empirical reasons. That is, APT does not predict that just any facial movement or feeling that co-occurs with surprise interferes with the facial display of surprise; only incompatible movements and strong incompatible emotions do. Of the facial components of surprise, at least eyebrow raising is, however, not incompatible with smiling. Also, the feeling of surprise is not incompatible with that of amusement; on the contrary, surprise is often regarded as a precondition or a component of amusement (e.g., Suls, 1971; see also Deckers, 1993). For this case (the co-occurrence of two compatible emotions), APT predicts that signals for both facial displays are sent to the face, resulting in a facial blend (e.g., raised eyebrows in a smiling face; Ekman, 1972). However, this was not observed. Furthermore, in both Experiments 5 and 6, the surprise displays (of the few participants who showed one) preceded smiling by at least a second, suggesting
that the feeling of surprise was present in pure form at least briefly before amusement set in—long enough, we suggest, to manifest itself on the face.
Experiment 7: Testing for Invisible Brow Raisings Strictly speaking, the results of Experiments 1–6 pertain only to surprise displays that are visible to observers. Although it may seem implausible, given the high intensity of surprise induced in Experiments 3 and 6, it is still conceivable that many participants showed minute surprise expressions that were below the coders’ threshold of awareness (Tassinary & Cacioppo, 1992). We conducted Experiments 7 and 8 to examine this possibility by measuring facial EMG. EMG recordings are able to detect muscle movements that are invisible to the naked eye (Fridlund & Cacioppo, 1986). Experiment 7 was a replication of Experiment 6 (without the sociality manipulation), whereas in Experiment 8 we used a surprise paradigm comparable with those used in Experiments 1–6.
Method Participants. The final sample of Experiment 7 consisted of 28 students (13 female, 15 male) with a mean age of 22.4 years of various disciplines—mostly nonpsychology—at the University of Greifswald. Eight additional participants were excluded from the data analyses, four because they did not recognize their face on the monitor, three because of problems with the EMG measurement, and one because of a procedural error. Procedure. The experiment was conducted by two experimenters, one male and one female. The procedure was similar to that of Experiment 6. To distract the participants from their facial muscles, we told them that we were interested in subtle changes of blood flow in the face during picture viewing. Although subjectively unnoticeable, these physiological reactions could presumably be detected by temperature sensors that would be placed on selected places of the face. After the participants were seated in front of the computer monitor, a snapshot of their face was secretly taken from the incoming video stream of the spy camera. One experimenter (always the same sex as the participant) then attached the EMG electrodes, while the other experimenter, who was separated from the participant by a room partition, edited the picture and transferred it to the experimental computer through a parallel link. After a relaxation period, 38 black-and-white photographs were presented. The participant’s task was to view the photographs and to indicate whether the depicted person appeared familiar. Pictures were separated by an intertrial interval of 3.3 s, during which a blank screen was shown. Level of sociality was kept constant at a low level, as it had no effect in Experiment 6 (nor in Experiment 3). That is, the same-sex experimenter stayed in the room behind the partition and busied himself or herself with supervising the EMG apparatus. The picture of the participant’s face was shown in Trial 38 and remained on the screen for 10 s. 
Subsequently, the experimenter moved to the participant’s table and asked the postexperimental questions, which were largely the same as those used in Experiment 6, with three changes: The mood scale was omitted, the free listing of perceived facial surprise components was again replaced by checking components on a list, and a question asking for the time when the face had been recognized was included. After the experiment, the participants were debriefed about the true nature of the physiological recordings.

EMG measurement. Of the muscles involved in the surprise display, we only considered the frontalis muscle, responsible for eyebrow raising. The muscle responsible for eye widening (musculus levator palpebrae superioris) retracts over the eyeball into the orbit; therefore its activity cannot be measured with surface EMG (although it can be measured with
needle electrodes; cf. Aramideh, Ongboer de Visser, Devriese, Bour, & Speelman, 1994). We also neglected mouth opening/jaw drop because there are no established guidelines for its EMG measurement and because a pretest (using voluntary jaw dropping) in which we attempted to index this facial movement by the relaxation of the masseter muscle was unsuccessful. However, given that eyebrow raising was the most frequent visible facial surprise component observed in Experiments 1–6 and in the previous study by Reisenzein (2000a), the frontalis muscle seemed the most promising place to look for invisible surprise displays. In addition, we recorded EMG activity over corrugator supercilii (responsible for brow knitting), partly to control for possible artifacts in the frontalis EMG that may have been due to corrugator movements, and we recorded EMG activity over zygomaticus major to detect possible invisible smiles.

We recorded the EMG signals using the Vitaport II recorder (Temec Instruments B.V., the Netherlands). Miniature (0.3-cm) bipolar Ag/AgCl electrodes were placed on the left side of the face in accordance with the guidelines of Fridlund and Cacioppo (1986). Amplifiers were set at a theoretical resolution of 0.23 μV. We filtered the EMG signals with an 8-Hz high-pass and a 400-Hz low-pass hardware filter and digitized them at 1,024 samples per second. Offline, we filtered the recorded EMG signals again with a 16-Hz high-pass filter to attenuate blink and eye movement artifacts (van Boxtel, 2001), as well as with a 50-Hz notch filter to eliminate possible power line interference. Subsequently, the EMG signals were full-wave rectified and then smoothed with a flat 10-Hz low-pass filter. For each channel, we computed the mean EMG amplitudes for the 20 consecutive 0.5-s intervals following stimulus onset in the surprise trial.
To estimate the baseline variability of EMG activity, we computed the standard deviation of the means of the 11 baseline (1-s prestimulus) periods consisting of the surprise trial and the 10 preceding trials. Using this standard deviation estimate and the mean of the 1-s baseline immediately preceding the critical trial, we then individually standardized the poststimulus EMG means (cf. Hess & Blairy, 2001).
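The interval averaging and baseline standardization described above can be illustrated with a minimal sketch. It assumes that the signal has already been filtered and rectified, that the 11 baseline means are ordered with the critical (surprise) trial last, and that the sample standard deviation is used; the text does not state which estimator was applied. The function names `interval_means` and `standardize` are hypothetical and serve illustration only.

```python
from statistics import mean, stdev

def interval_means(rectified, fs=1024, interval_s=0.5, n_intervals=20):
    """Mean amplitude in consecutive poststimulus intervals.

    rectified: rectified EMG samples starting at stimulus onset.
    fs: sampling rate in samples per second (1,024 in Experiment 7).
    """
    step = int(fs * interval_s)  # samples per interval
    return [mean(rectified[i * step:(i + 1) * step]) for i in range(n_intervals)]

def standardize(post_means, baseline_means):
    """Express poststimulus means as z values relative to baseline.

    baseline_means: means of the 1-s prestimulus periods of the 10
    preceding trials plus the critical trial (assumed to be last).
    The reference level is the baseline of the critical trial; the
    scale is the standard deviation of all baseline means
    (sample SD is an assumption of this sketch).
    """
    sd = stdev(baseline_means)
    reference = baseline_means[-1]
    return [(m - reference) / sd for m in post_means]
```

For example, with baseline means of 1.0, 2.0, and 3.0 (SD = 1.0, reference = 3.0), poststimulus means of 5.0, 3.0, and 1.0 are standardized to 2.0, 0.0, and −2.0.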
Results

Evidence for surprise. The subjective data replicated those of Experiment 6. Mean rated surprise on the 0–100-point scale was 79 (SD = 14; Mdn = 80), and relative surprise in comparison to an “extremely surprising” remembered experience was 68% (SD = 63, Mdn = 50). Participants had not expected anything unusual to happen during the experiment (M = 24), were astonished (M = 68) and confused (M = 74) by the appearance of their own face, found their attention strongly captured by it (M = 74), and typically searched for an explanation (61%). Again, participants were also strongly amused (M = 82) by the surprising event. Again, the surprise stimulus caused a significant RT increase of the familiarity judgment relative to the baseline in the preceding 10 trials, t(27) = 1.75, p < .05 (one-tailed); and again, most participants eventually addressed the experimenter to ask for an explanation. Finally, according to the retrospective reports, the identity of the face was detected, on average, 1.3 s after picture onset (SD = 1.3, Mdn = .75).

Video codings and beliefs about expression. The videos of the 10-s period of the surprise trial were digitized and coded for facial expressions with a software media player. We observed four brow raises and two eye widenings; at least one of these displays was shown by 5 participants (18%). This is not significantly different from the frequency of surprise expressions obtained in Experiment 6 (9%), Fisher exact probability test, p = .30. Also similar to Experiment 6, we observed smiling/laughing in most participants (86%); with the exception of two participants who broke into a
laughing fit, it consisted of smiling only, similar to the nonsocial condition of Experiment 6. Again similar to Experiment 6, the participants believed that their surprise had shown strongly on the face (M = 86, SD = 15, Mdn = 85). Similar to Experiments 4 and 5, in which a checklist method had been used, 64% believed that they had shown brow raising, 71% eye widening, and 57% mouth opening/jaw drop; 93% checked at least one surprise component. The correlation between observed and perceived expression components was close to zero; r = −.12 (brow raising) and .12 (at least one component shown).

EMG activity. Before the data analysis, we scanned the EMG records for movement artifacts using the video recordings and onscreen displays of the EMG. Most movement artifacts occurred in the second half of the 10-s picture presentation period and were due to the fact that about one third of the participants turned to and addressed the experimenter (which involved eye, head, and body movements and talking) before the end of the observation period. For these participants, only the first 4–9 s of the 10-s period could be evaluated. Two other participants, as mentioned, broke into a laughing fit; for these, we had to discard the remainder of the trial. One participant briefly looked to the ceiling during the later part of the surprise trial and another lowered his head and peered at the picture “from below lowered eyes,” both of which resulted in an increase of frontalis EMG activity; these periods (about 1–2 s) were also discarded. Because the main aim of the EMG measurement was to detect possible invisible frontalis activity, the statistical analysis centered on the 24 participants who did not show visible brow raising (as expected, the visible brow raisings were reflected in highly significant increases of the frontalis EMG).
For these participants, on average, 8 s of artifact-free EMG were available; 11 had complete protocols for the whole 10-s period, and 19 had complete protocols for the first 5 s. The mean change of frontalis and corrugator EMG activity during the 10 s of picture presentation is shown in Figure 2. We obtained nearly
Figure 2. Average frontalis and corrugator electromyographic (EMG) responses during the surprise trial, Experiment 7.
identical findings when we only considered the 11 participants with complete protocols. As can be seen, overall there was a decrease of frontalis activity across time, as well as a decrease of corrugator activity after a slight elevation after stimulus onset. Dependent t tests comparing the (unstandardized) mean in each poststimulus interval with the prestimulus baseline mean revealed significant decreases of frontalis EMG (ps < .01, dfs between 23 and 12) between 1.5 s and 8.5 s and again from 9–9.5 s. The corrugator EMG also showed a significant decline from 3 s to 8.5 s; the initial increase at 0.5 and 1 s (see Figure 2) was marginally significant (ps = .05 and .08, respectively).

Because the mean data could have masked significant frontalis EMG increases in individual participants, we next examined the individual changes of EMG activity across time. z values > 2 were scored as significant increases from baseline, and z values < −2 were scored as significant decreases. Using this criterion, the individual graphs could be classified into three nonoverlapping groups: Fifteen (54%) participants showed a significant decrease of frontalis EMG during at least one of the poststimulus periods but no significant increase; 7 (25%) had neither a significant increase nor a significant decrease, and 2 participants showed a significant increase but no significant decrease. Thus, the typical temporal pattern of the frontalis EMG change was either a decrease or no change. Finally, the Spearman rank correlation (used to account for possible nonlinearities) between the standardized frontalis EMG and self-rated surprise was close to zero for all 20 measurement intervals.

A significant increase of zygomaticus activity (z > 2) was detected in 24 of the 28 participants, the same participants who also showed a visible smile.
With one exception, the frontalis EMG increase preceded the zygomaticus response by 1–3 s, confirming the impression gleaned from the videos and suggesting that amusement set in only a few seconds after surprise. Consistent with this interpretation, the rank correlation between the standardized zygomaticus EMG and self-rated amusement became significant (r = .48, p < .01) 4.5 s after stimulus onset, after which it remained significant or close to significant until the end of the measurement period.
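The scoring rule used for the individual EMG records (z > 2 as a significant increase, z < −2 as a significant decrease) can be sketched as follows. The function name `classify_response` and the category labels are hypothetical; the "mixed" category is included only for completeness, because the three groups reported above imply that no participant showed both an increase and a decrease.

```python
def classify_response(z_values, threshold=2.0):
    """Classify one participant's standardized poststimulus EMG series.

    z_values: standardized interval means (artifact periods removed).
    Returns one of "increase", "decrease", "no change", or "mixed".
    """
    increased = any(z > threshold for z in z_values)
    decreased = any(z < -threshold for z in z_values)
    if increased and decreased:
        return "mixed"  # not observed in the reported data
    if increased:
        return "increase"
    if decreased:
        return "decrease"
    return "no change"
```

Applied to each of the 24 records without visible brow raising, a rule of this kind yields the three groups reported in the text (15 decrease, 7 no change, 2 increase).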
Discussion

Experiment 7 largely replicated the results of Experiment 6. The unexpected appearance of their own face on the monitor was judged as strongly surprising by most participants, but only 18% showed a visible component of the surprise display. The EMG measurement suggested two additional instances of brow raising, thus raising the frequency of brow raising from 14% (video) to 21% (EMG). At the same time, however, the EMG measurement revealed a significant decrease of frontalis muscle activity in the majority of the participants with no visible surprise displays, and the correlation between frontalis EMG and self-rated surprise was essentially zero.

Although not predicted, the observed decrease of frontalis (as well as corrugator) activity in the majority of the participants is consistent with recent results reported by other authors. Stekelenburg and van Boxtel (2002) examined psychophysiological reactions to novel sounds (e.g., animal sounds, human talk, industrial and environmental noises) that were presented from time to time
during a text-reading task without forewarning and thus were presumably at least somewhat surprising (subjective measures were not taken). In one of their experiments (Experiment 1), they found, exactly as we did, that the stimuli caused a decrease of frontalis activity, as well as a decrease of corrugator activity after a small initial increase. Camras et al. (2002) and Scherer et al. (2004) reported that expectancy violations in infants led to a temporary cessation of facial movements in many children. Post hoc, these findings fit well with the inhibitory effect of unexpected events on ongoing mental processes—and, consequently, the behaviors controlled by these processes—postulated in our surprise model (cf. the introduction). As noted, this model assumes that surprise-induced response inhibition serves to prepare the organism for the analysis of unexpected events (Meyer et al., 1997). Finally, the EMG findings further clarify the interpretation of the participants’ reports about perceived surprise displays. As noted, one possible explanation of these self-reports is that they were based on minute, invisible, facial changes. This hypothesis was already thrown into doubt by the finding that the participants believed their surprise displays to be visible to others (Experiments 5 and 6) and by the high perceived intensity of the expression (Experiment 6, replicated in Experiment 7). Further refuting this hypothesis, Experiment 7 revealed that there was no correlation between observed and reported brow raisings, that 92% of the facially unresponsive participants showed no significant frontalis EMG increase, and that 63% even showed a significant decrease.
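The comparison between the frequency of visible surprise displays in Experiment 7 (5 of 28) and Experiment 6 (2 of 23; see Table 1) can be recomputed with a short sketch of the Fisher exact probability. The text does not state whether the reported p value is one- or two-tailed; the sketch below computes the one-sided probability, which is an assumption, and this yields a value of roughly .30. The function name `fisher_one_sided` is hypothetical.

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact probability for the 2 x 2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric distribution with all
    margins fixed, where X is the count in the upper left cell.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)
    upper = min(row1, col1)
    numerator = sum(comb(row1, k) * comb(row2, col1 - k)
                    for k in range(a, upper + 1))
    return numerator / denominator

# Experiment 7: 5 of 28 with a display; Experiment 6: 2 of 23 with a display.
p = fisher_one_sided(5, 23, 2, 21)
```

The two-sided version would additionally sum the probabilities of all tables in the opposite tail that are no more likely than the observed one, and would therefore give a larger value.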
Experiment 8: Once More With EMG

To test whether the findings for the frontalis EMG obtained in Experiment 7 generalize to the surprise paradigms used in Experiments 1–5, we conducted a final experiment. In this study, we again varied surprise intensity and sociality to examine their possible effects on EMG activity. Participants were randomly assigned to the resulting four experimental conditions.
Method

Participants. The participants were 23 students at the University of Bielefeld. The surprise induction method was similar to that used in Experiment 1. The main difference was that the participants worked on a numerical addition task instead of a choice reaction task. In each trial, they had to add three numbers that appeared on the screen for 5.3 s. Subsequently, a solution number was presented for 3 s, and the participants decided whether it was correct. Trials were separated by 3 s, during which a blank screen was shown. In Trial 25, 2.5 s after the presentation of the numbers, a salient change of the mode of stimulus presentation occurred similar to the strong-background-change condition of Experiment 1: a repeated change of the color of the screen background and an inversion of the text color, accompanied by a sequence of tones.

Design and procedure. The experiment had a 2 (sociality: experimenter present vs. absent) × 2 (task difficulty: low [one-digit numbers] vs. high [two-digit numbers]) design. We varied task difficulty to influence the degree of interference caused by the surprising event and, thereby, the intensity of felt surprise (see Experiment 2). Observable facial reactions were coded as before. We measured the frontalis EMG with the Vitaport I, a precursor model of the recorder used in Experiment 7. This recorder features a dedicated EMG channel that integrates (rectifies and smoothes) EMG online. The integrated signals were digitized at 256 Hz. In this study,
we only measured activity over the frontalis muscle. We evaluated the frontalis EMG from the onset of the surprise event in the critical trial until the end of the number presentation (3 s later) in 0.5-s intervals. The means of each measurement interval were individually standardized using the mean of the 1-s interval preceding the surprise event, and the standard deviation of the means of the 11 baseline (1-s prestimulus) periods consisting of the surprise trial and the 10 preceding trials. Before the statistical analyses, we scanned the EMG records and excluded periods with artifacts. Artifacts that were due to body, head, or eye movements were very rare in this study, but we had to exclude periods with blinks (on average, 1.5 per participant), because no high-pass filtering of the raw EMG had been used.
Results and Discussion

Because of a procedural error, surprise ratings were not collected in this study. However, Reisenzein and Studtmann (2006, Experiment 2), who used the same paradigm, found that the stimulus changes caused surprise in most participants and that higher surprise was felt in the difficult than in the easy-task condition (M = 75 vs. 48, p < .001). Furthermore, the stimulus changes caused a significant increase of RT and a significant performance reduction relative to baseline, ts(22) > 2.73, ps < .05. Video codings revealed 5 participants (22%) who showed at least one surprise component (four eyebrow raises, one eye widening). Three other participants, instead of raising their eyebrows, frowned in response to the surprising event; these participants were excluded from the subsequent analyses. We then examined the EMG protocols of the 16 participants with no visible eyebrow movements for invisible frontalis activity, using the same procedure as in Experiment 7. This analysis suggested one additional case of eyebrow raising (standardized frontalis EMG, z > 2) in the person who showed visible eye widening. In contrast to Experiment 7, the analysis of the EMG protocols of the remaining participants and of the average EMG indicated unchanged activity. One possible, if speculative, explanation for this difference from Experiment 7 is that the surprising stimulus changes were too brief to allow further sensory exploration that may have benefited from “facial stilling.”

Smiling occurred in 6 participants, 2 in the nonsocial and 4 in the social condition. There was no significant difference in the frequency of visible surprise displays between the social and the alone condition (2 vs. 3), nor between the easy- and difficult-task conditions (3 vs. 2). There were also no significant effects of sociality and task difficulty on the frontalis EMG in the six 0.5-s poststimulus intervals.
General Discussion

The first goal of the studies reported here was to provide further evidence on the relation between surprise and facial expression. As concerns this issue, the results of the studies provide evidence for several types of dissociation: (a) between the mental state of surprise and its traditional facial display, (b) between surprise and expression with respect to their reactivity to experimental manipulations, (c) between the different components of the surprise display, and (d) between the occurrence of surprise displays and participants’ beliefs about their occurrence.
Dissociation Between Surprise and Facial Expression

This is the theoretically most important type of dissociation found. The pertinent evidence can best be summarized by referring
to the theoretical model of surprise described in the introduction, on which the induction and the measurement of surprise were based (see Figure 1). Without repeating the details, self-report and behavioral data collected in the different studies suggested that all of the surprise-related processes postulated in the model (the appraisal of unexpectedness, the feeling of surprise, the interruption of processing, and so forth) occurred in the majority of our participants.⁷ In addition, the presence of surprise is supported by the nature of the experimental inductions of surprise, which can claim theoretical and intuitive validity. On the basis of these data and theoretical considerations, the principle of “inference to the best explanation” (Harman, 1989) warrants the conclusion that the mental state of surprise was indeed present in most participants: No alternative hypothesis provides for a better explanation of the complete pattern of subjective and (nonfacial) behavioral data.

At the same time, visible or EMG-detected facial expressions of surprise occurred only in a small minority of our participants. Table 1 summarizes the pertinent findings. As can be seen, overall only 11% of the 220 participants showed a visible facial surprise expression (at least one component), with a range of 4% (Experiment 2) to 25% (Experiment 5). Results from Experiments 7 and 8 indicate that the low incidence of visible expressions cannot be plausibly attributed to invisible displays: Measurement of the frontalis EMG suggested only one to two additional invisible brow raises; and in Experiment 7, 54% of the participants even showed a significant decrease of frontalis activity, exactly the opposite of what APT predicts. The present findings therefore document an even more extreme dissociation between surprise and facial expression in adults than Reisenzein’s (2000a) study.
The dissociation between surprise and facial expression was also reflected in their differential reactivity to experimental manipulations of surprise intensity (degree of schema discrepancy, Experiment 1; task difficulty, Experiments 2 and 8; repetition of the surprise event, Experiments 1 and 5). Whereas these manipulations had the predicted effects on subjective and on nonfacial behavioral measures of surprise, they had no statistically reliable effects on facial expression.
Dissociation Between the Components of the Surprise Expression

Of the 27 observed cases of visible or invisible surprise displays, 24 consisted of a single component: most frequently, eyebrow raising; the remaining 3 were two-component expressions. The “complete” surprise face was never seen (see Table 1). These results are again similar to, if more extreme than, previous findings by Reisenzein (2000a), who observed 54% single-component (mostly brow raising), 31% two-component (mostly brow raising and eye widening), and only 15% three-component displays. The findings are also in accord with data by Carroll and Russell (1997) on the surprise displays of movie actors (although these were posed rather than spontaneous expressions).

⁷ Additional support for this conclusion stems from psychophysiological studies which found that unexpected events of the type staged in Experiments 1–5 and 8 also cause physiological orienting responses (skin conductance responses and heart rate changes; e.g., Maher & Furedy, 1979; Niepel, 2001; Siddle & Jordan, 1993).
Table 1
Summary of the Main Results of Experiments 1–8

Experiment  Surprise paradigm      Experimental manipulations  nᵃ
1           Audiovisual change     Surprise intensity          57
2           “Computer breakdown”   Surprise intensity          25
3           Rule violation         Sociality                   22
4           Audiovisual change     Duration of event           22
5           Audiovisual change     Ease of visual orienting    20
6           Picture of own face    Sociality                   23
7           Picture of own face    —                           28
8           Audiovisual change     Sociality, intensity        23
1–8                                                            220

Surprise expression
Experiment  Brow     Eye  Jaw  ≥1                ≥2
1           2        1    0    3 (5%)            0
2           0        1    0    1 (4%)            0
3           2        0    0    2 (9%)            0
4           1        1    0    2 (9%)            0
5           1        3    1    5 (25%)           0
6           1        1    1    2 (9%)            1
7           4/6ᶜ     2    0    5/7 (18/25%)ᶜ     1
8           4/5ᶜ     1    0    5/5 (22/22%)ᶜ     0/1ᶜ
1–8         15/18ᶜ   10   2    25/27 (11/12%)ᶜ   2/3ᶜ

Beliefs about expression (% of participants)
Experiment  Brow  Eye  Jaw  ≥1
1           —     —    —    —
2           —     —    —    —
3           —     —    —    —
4           59    55   14   77
5           60    45   30   80
6           26ᵇ   30ᵇ  9ᵇ   52ᵇ
7           64    71   57   93
8           —     —    —    —
1–8         52    50   28   76

Note. For Experiments 1 and 5, data from the first critical trial are used. Dashes indicate that data were not collected.
ᵃ For which video data were available. ᵇ Free listings of perceived expression components. ᶜ Video only/frontalis EMG included.
It appears that, as originally formulated, APT does not allow for incomplete emotion expressions except in the sense that components of a display are selectively inhibited or masked or are too weak to be visible. Correspondingly, in their FACS Investigator’s Guide, Ekman et al. (2002, p. 174) did not list single-component displays among the expressions of surprise, and several previous investigators also required the presence of at least two of the three facial surprise components to code surprise (e.g., Bennett et al., 2002; Reissland et al., 2002). According to this stricter criterion, only 3 (1.3%) of our 220 participants showed a surprise display. The predominance of partial surprise displays may therefore signal the need for yet another modification of APT, for it could mean that the different components of the surprise expression are controlled by separate mental processes rather than by a unitary motor program (e.g., Ortony & Turner, 1990; Smith & Scott, 1997). However, note that an alternative explanation more in line with APT is possible: The different components of the surprise display could have different response thresholds, with eyebrow raising appearing first. Assuming that the hypothetical additional factor necessary for a surprise display (in addition to the presence of surprise and the absence of inhibition) was not present in our experiments, one would then expect this display to occur not only infrequently in these situations but also in partial form. In any case, it is interesting to note that lay people do allow for the occurrence of partial surprise displays (Experiments 4 –7).
Dissociation Between Surprise Displays and Beliefs About the Displays

A fourth type of dissociation, between surprise displays and beliefs about them, was documented in Experiments 4–7. These studies found consistently that the participants grossly overestimated their surprise expressions: In contrast to the results of the video codings and the analyses of the EMG data (Experiment 7), most participants believed (a) that their surprise had strongly shown on the face (mean intensity ratings ≈ 80 on the 0–100-point scale; Experiments 6–7), or in any case in a way visible to others (Experiments 5 and 6), and (b) that the surprise expression included one or more of the traditionally posited features (i.e., eyebrow raising, eye widening, mouth opening/jaw drop).

Converging evidence from Experiments 4–7 indicates that the reports about perceived surprise expressions were not based on visible or invisible facial displays (see the discussion of Experiment 7). Therefore, the participants must have relied on a different source of information. As we hinted earlier, we believe that the participants based their expression reports on schemas or generalized beliefs about the emotion–face association (see also Rimé, Philippot, & Cisamolo, 1990). More precisely, we propose that they inferred their probable facial expression from their feelings of surprise (minor premise) and from generalized beliefs about the facial expression associated with surprise (major premise): They reasoned that, because they felt surprised, and because surprise is associated with a characteristic facial display, they must have shown this display. Experimental support for this hypothesis was obtained by Reisenzein and Studtmann (2006), who found that an experimental manipulation of surprise intensity, although not influencing expression, significantly affected participants’ beliefs
about their facial expression. Apparently, then, people are relatively insensitive to their own facial displays of surprise (or at least their absence) but highly susceptible to schema-based beliefs about the emotion–face association.

This conclusion, if correct, may help to explain the discrepancy between both folk-psychological and scientific beliefs about the relation between surprise and facial expression (cf. the introduction) and the present findings. As already suggested in the introduction, the cognitive representation or schema of surprise to which people resort when they judge the association between surprise and facial expression (e.g., Ekman et al., 1987) does not seem to reflect the statistically modal but the ideal case of surprise, where the surprise syndrome is present in full-fledged form (Horstmann, 2002). The question then is how this ideal-type schema is acquired in the first place and why it is not corrected by experience. The present findings suggest a partial answer to these questions: If people are not sensitive to their own facial displays (e.g., simply because they usually do not attend to them), then a central source of information available for acquiring veridical beliefs about the emotion–face association is neglected. In particular, people will then miss cases where surprise is present but facial expression is not. Worse, reliance on the ideal-type schema even in personal experiences of surprise leads into a self-reinforcing cycle: Prospectively, one expects surprise displays to occur even in the statistically modal cases of surprise; retrospectively, one surmises them to have been present in these cases (see also Schützwohl & Krefting, 2001), thereby apparently confirming one’s expectations.
Explanations of the Emotion–Face Dissociation

The second goal of the studies reported here was to explore possible explanations for the dissociation between the emotion of surprise and its facial display. First, we considered method problems related to the induction and measurement of surprise. On the basis of Experiments 1 and 2, we concluded that the method artifact hypothesis can be ruled out.

Next, we examined the two substantive explanations for the observed dissociation available to APT (as originally stated): inhibition or masking of facial displays due to display rules, and insufficient surprise intensity. In our view, the display-rules hypothesis cannot explain the results, because the incidence of surprise displays was no higher in nonsocial than in social situations (Experiments 3, 6, and 8). The insufficient-intensity hypothesis, too, was not supported. First, the surprise display was insensitive to manipulations of surprise intensity (Experiments 1, 2, and 8). Second, even high induced surprise (Experiments 3, 6, and 7) did not result in much more of a surprise display. Third, additional analyses revealed that the frequency of surprise expressions in highly surprised participants (those with ratings ≥ 7 on the 0–10-point surprise scale or ≥ 70 on the 0–100-point scale) was nearly identical (12.6%) to that of surprise expressions in the total sample (11%). Fourth, EMG measurements (Experiments 7 and 8) detected only very little invisible facial activity related to surprise, and in one study (Experiment 7) they even revealed a decrease of frontalis muscle activity in the majority of the participants.

Finally, we turned to modifications of APT, obtained from the original theory by adding the assumption that some other factor X, apart from surprise and the absence of deliberate control, is needed
for the facial surprise display to occur. Three such modifications were examined: the insufficient-duration hypothesis, the visual-orienting hypothesis, and the insufficient-sociality hypothesis. The insufficient-duration hypothesis was not supported: An experimental manipulation of the duration of unexpected stimulus changes did not affect facial expression (Experiment 5) and even comparatively long-lasting surprise (Experiment 6) did not produce surprise expressions. The insufficient-sociality hypothesis was also unsupported (Experiment 6): Although a high level of sociality increased spontaneous verbal exclamations suggestive of surprise as well as mirth reactions, it did not increase the frequency of surprise expressions.

Only the visual-orienting hypothesis found some support (Experiment 5). Admittedly this support was weak; however, the findings agree with previous results of Blurton Jones and Konner (1971). In addition, the higher frequency of (partial) surprise displays observed in some of the surprise situations staged in previous studies (cf. the introduction) may also be explainable by the visual-orienting hypothesis. On the other hand, visual orienting does not seem to be generally necessary for surprise displays (e.g., Reisenzein, 2000a). To reconcile these findings with the visual-orienting hypothesis, one would need to add something like Darwin’s (1872/1998) assumption that, due to “the force of association” (p. 281), surprise displays are eventually shown even to events that do not require visual search.

Finally, it needs to be pointed out that, whatever the merits of the visual-orienting hypothesis may be, as an adjunct to or a modification of APT, this hypothesis is problematic. To fit the proposed schema of an APT modification (i.e., given absence of inhibition: surprise + factor X → facial display), the visual-orienting hypothesis must be read as follows: surprise + need for visual orienting → facial display.
However, the more natural explication of this hypothesis—which seems to have been endorsed by Darwin (1872/1998), Andrew (1963), and Blurton Jones and Konner (1971)—is that surprise instigates visual orienting and that the surprise display is the result or by-product of the latter (i.e., surprise 3 visual orienting 3 facial display). This latter formulation of the visual-orienting hypothesis is more accurately classified as a variant of componential theories of facial expression (e.g., Ortony & Turner, 1990; Smith & Scott, 1997), according to which the different components of facial expressions are partly controlled by separate processes, not all of which are necessarily emotional in character. According to this formulation of the visual-orienting hypothesis, surprise is but one of the conditions that instigate visual search, and the resulting facial display is not fundamentally different from similar displays that occur as the result of visual orienting due to other causes, such as when people are required to quickly look up (Reisenzein & Studtmann, 2006; see also Bennett et al., 2002; Camras, Lambrecht, & Michel, 1996). The modifications of APT tested in our studies are not the only possible ones. However, at least two further conceivable modifications of APT can already be eliminated on the basis of our results (as well as those of previous studies): that the onset of the surprising event must be sudden rather than gradual and that the surprising event must be novel rather than familiar. All of the surprising events staged in our studies had a sudden onset, and all were novel in two salient senses of this word: First, they had not occurred before in the experiment; and second, they caused the
revision of existing, and thus the acquisition of, novel beliefs (see Ruffman & Keenan, 1996). Two other possible modifications of APT still need to be tested more thoroughly: that unexpected events, in addition to causing surprise, must also be pleasant or unpleasant rather than hedonically neutral and/or that they must be important to people’s goals or welfare. The hedonic hypothesis is thrown into doubt by the findings of Experiments 6 and 7, in which surprise was coupled with high amusement. However, probably none of the surprising events staged in our studies was very important to the participants’ goals or welfare, so the importance hypothesis remains untested. Note, though, that even if future research were to support this or some other as-yet untested modification of APT, this would not change the conclusion that, in contrast to the original formulation of APT, surprise displays are elicited only by a (possibly small) subset of the events that cause surprise. And of course it could turn out that even the proposed modification of APT (surprise + absence of inhibition + factor X → expression) is untenable, because there is no factor X. In this case, at the latest, proponents of APT could argue that, on second thought, surprise should be excluded from the domain of applicability of APT. Indeed, as mentioned in the introduction, it is possible that APT holds true for some emotions but not for others. Note, however, that this move is also not without problems. In particular, it is unsatisfactory without an explanation of why surprise does not fit APT. Given that the standard definition of basic emotions as appraisal-induced biological response syndromes (cf. the introduction) does not suggest a straightforward difference between surprise and other emotions, such an explanation may not be easy to come by.
Implications of the Dissociation Results

Regardless of how the observed dissociation between surprise and its facial expression is ultimately explained, the mere existence of this dissociation has important theoretical and practical implications. To conclude the article, we briefly mention two of them. First, the present findings speak against any strong version of the facial feedback theory of emotional experience in the case of surprise: that facial feedback is necessary for the feeling of surprise, or that it is a major determinant of this feeling (e.g., Izard, 1977; Laird & Bresler, 1992). Because facial expressions of surprise typically did not occur in our participants, they could not have influenced their feeling of surprise. Second, on a more practical level, our findings suggest caution in using facial expression to diagnose surprise in both research and applied settings. Perhaps the presence of a facial surprise display, or of components of that display, reliably indicates surprise in many situations (see also Reisenzein, 2000a), although certainly not in all (e.g., Camras et al., 1996; Ekman, 1979; Reisenzein & Studtmann, 2006). However, our findings suggest that, even when suppression or masking is not at work, the reverse does not hold: The absence of a facial display is no strong reason to infer a lack of surprise.
References

Andrew, R. J. (1963). The evolution of facial expression. Science, 141, 1034–1041.
Aramideh, M., Ongboer de Visser, B. W., Devriese, P. P., Bour, L. J., & Speelman, J. D. (1994). Electromyographic features of levator palpebrae superioris and orbicularis oculi muscles in blepharospasm. Brain, 117, 27–38.
Beatty, J., & Lucero-Wagoner, B. (2000). The pupillary system. In J. J. T. Caccioppo, L. G. Tassinary, & G. G. Berntson (Eds.), Handbook of psychophysiology (2nd ed., pp. 142–162). Cambridge, United Kingdom: Cambridge University Press.
Bennett, D. S., Bendersky, M., & Lewis, M. (2002). Facial expressivity at 4 months: A context by expression analysis. Infancy, 3, 97–113.
Blurton Jones, N. G., & Konner, M. J. (1971). An experiment on eyebrow-raising and visual searching in children. Journal of Child Psychology and Psychiatry, 11, 233–240.
Camras, L. A., Lambrecht, L., & Michel, G. F. (1996). Infant “surprise” expressions as coordinative motor structures. Journal of Nonverbal Behavior, 20, 183–195.
Camras, L. A., Meng, Z., Ujiie, T., Dharamsi, S., Miyake, K., Oster, H., et al. (2002). Observing emotions in infants: Facial expression, body behavior, and rater judgments of responses to an expectancy-violation event. Emotion, 2, 179–192.
Carroll, J. M., & Russell, J. A. (1997). Facial expressions in Hollywood’s portrayal of emotion. Journal of Personality and Social Psychology, 72, 164–176.
Charlesworth, W. R. (1964). Instigation and maintenance of curiosity behavior as a function of surprise versus novel and familiar stimuli. Child Development, 35, 1169–1186.
Chovil, N. (1991). Social determinants of facial displays. Journal of Nonverbal Behavior, 15, 141–154.
Cosmides, L., & Tooby, J. (2000). Evolutionary psychology and the emotions. In M. Lewis & J. M. Haviland-Jones (Eds.), Handbook of emotions (2nd ed., pp. 91–115). New York: Guilford Press.
Darwin, C. (1998). The expression of the emotions in man and animals (Edited by Paul Ekman). London: Fontana Press (Original work published 1872).
Deckers, L. (1993). On the validity of a weight-judging paradigm for the study of humor. Humor: International Journal of Humor Research, 6, 43–56.
Ekman, P. (1972). Universals and cultural differences in facial expressions of emotion. In J. Cole (Ed.), Nebraska Symposium on Motivation: Vol. 19 (pp. 207–283). Lincoln: University of Nebraska Press.
Ekman, P. (1979). About brows: Emotional and conversational signals. In M. von Cranach, K. Foppa, W. Lepenies, & D. Ploog (Eds.), Human ethology (pp. 169–248). Cambridge, United Kingdom: Cambridge University Press.
Ekman, P. (1993). Facial expression and emotion. American Psychologist, 48, 384–392.
Ekman, P. (1997). Expression or communication about emotion. In N. L. Segal, G. E. Weisfeld, & C. C. Weisfeld (Eds.), Uniting psychology and biology: Integrative perspectives on human development (Vol. 48, pp. 384–392). Washington, DC: American Psychological Association.
Ekman, P. (1999). Facial expressions. In T. Dalgleish & M. Power (Eds.), Handbook of cognition and emotion (pp. 301–321). New York: Wiley.
Ekman, P., Friesen, W. V., & Ancoli, S. (1980). Facial signs of emotional experience. Journal of Personality and Social Psychology, 39, 1125–1134.
Ekman, P., Friesen, W. V., & Ellsworth, P. (1982). Does the face provide accurate information? In P. Ekman (Ed.), Emotion in the human face (2nd ed., pp. 56–110). Hillsdale, NJ: Erlbaum.
Ekman, P., Friesen, W. V., & Hager, J. V. (2002). Facial action coding system (2nd ed.). Salt Lake City, UT: Research Nexus eBook.
Ekman, P., Friesen, W. V., O’Sullivan, M., Chan, A., Diacoyanni-Tarlatzis, I., Heider, K., et al. (1987). Universals and cultural differences in the judgments of facial expressions of emotion. Journal of Personality and Social Psychology, 53, 712–717.
Ekman, P., Friesen, W. V., & Simons, R. C. (1985). Is the startle reaction an emotion? Journal of Personality and Social Psychology, 49, 1416–1426.
Elfenbein, H. A., & Ambady, N. (2002). On the universality and cultural specificity of emotion recognition: A meta-analysis. Psychological Bulletin, 128, 203–235.
Fazio, R. H. (1990). A practical guide to the use of response latency in social psychological research. In C. Hendrick & M. S. Clark (Eds.), Research methods in personality and social psychology (pp. 74–97). Newbury Park, CA: Sage.
Fernández-Dols, J.-M., & Ruiz-Belda, M.-A. (1997). Spontaneous facial behavior during intense emotional episodes: Artistic truth and optical truth. In J. A. Russell & J.-M. Fernández-Dols (Eds.), The psychology of facial expression (pp. 255–274). Cambridge, United Kingdom: Cambridge University Press.
Fischer, A. H., Manstead, A. S. R., & Zaalberg, R. (2003). Social influences on the emotion process. In W. Stroebe & M. Hewstone (Eds.), European review of social psychology (Vol. 14, pp. 171–201). Hove, United Kingdom: Psychology Press.
Fridlund, A. J. (1991). Sociality and solitary smiling: Potentiation by an implicit audience. Journal of Personality and Social Psychology, 60, 229–240.
Fridlund, A. J. (1994). Human facial expression: An evolutionary view. San Diego, CA: Academic Press.
Fridlund, A. J., & Cacioppo, J. T. (1986). Guidelines for human electromyographic research. Psychophysiology, 23, 567–589.
Furr, R. M., & Rosenthal, R. (2003). Evaluating theories efficiently: The nuts and bolts of contrast analysis. Understanding Statistics, 2, 45–67.
Guerin, B. (1986). Mere presence effects in humans: A review. Journal of Experimental Social Psychology, 22, 38–77.
Harman, G. (1989). The inference to the best explanation. In R. Brody & R. Grandy (Eds.), Readings in the philosophy of science (pp. 323–328). Englewood Cliffs, NJ: Prentice Hall.
Hess, U., Banse, R., & Kappas, A. (1995). The intensity of facial expression is determined by underlying affective state and social situation. Journal of Personality and Social Psychology, 69, 280–288.
Hess, U., & Blairy, S. (2001). Facial mimicry and emotional contagion to dynamic emotional facial expressions and their influence on decoding accuracy. International Journal of Psychophysiology, 40, 129–141.
Hiatt, S. W., Campos, J. J., & Emde, R. N. (1979). Facial patterning and infant emotional expression: Happiness, surprise, and fear. Child Development, 50, 1020–1035.
Holodynski, M. (2004). The miniaturization of expression in the development of emotional self-regulation. Developmental Psychology, 40, 16–28.
Horstmann, G. (2002). Facial expressions of emotion: Does the prototype represent central tendency, frequency of instantiation, or an ideal? Emotion, 2, 297–305.
Horstmann, G., & Schützwohl, A. (1998). Zum Einfluß der Verknüpfungsstärke von Schemaelementen auf die Stärke der Überraschungsreaktion [Effect of strength of association of schema elements on the surprise reaction]. Zeitschrift für Experimentelle Psychologie, 45, 203–217.
Izard, C. E. (1977). Human emotions. New York: Plenum Press.
Izard, C. E. (1991). The psychology of emotions. New York: Plenum Press.
Jakobs, E., Manstead, A. S. R., & Fischer, A. H. (2001). Social context effects on facial activity in a negative emotional setting. Emotion, 1, 51–69.
Klix, F., van der Meer, E., & Preuß, M. (1984). Semantische Relationen: Erkennungsaufwand und psychophysiologische Reaktionstendenzen [Semantic relations: Recognition expenditure and psychophysiological reaction tendencies]. In F. Klix (Ed.), Gedächtnis, Wissen, Wissensnutzung (pp. 156–172). Berlin: Deutscher Verlag der Wissenschaften.
Koch, M. (1999). The neurobiology of startle. Progress in Neurobiology, 59, 107–128.
Kraut, R. E., & Johnston, E. E. (1979). Social and emotional messages of smiling: An ethological approach. Journal of Personality and Social Psychology, 37, 1539–1553.
Laird, J. D., & Bresler, C. (1992). The process of emotional experience: A self-perception theory. In M. S. Clark (Ed.), Review of personality and social psychology (Vol. 13, pp. 213–234). Thousand Oaks, CA: Sage.
Landis, C. (1924). Studies of emotional reactions: II. General behavior and facial expression. Journal of Comparative Psychology, 4, 447–509.
Larsen, J. T., Norris, C. J., & Cacioppo, J. T. (2003). Effects of positive and negative affect on electromyographic activity over zygomaticus major and corrugator supercilii. Psychophysiology, 40, 776–785.
Leventhal, H. (1984). A perceptual-motor theory of emotion. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 17, pp. 117–182). New York: Academic Press.
Maher, T. F., & Furedy, J. J. (1979). A comparison of the pupillary and electrodermal components of the orienting reflex in sensitivity to initial stimulus presentation, repetition, and change. In H. D. Kimmel, E. H. van Olst, & J. F. Orlebeke (Eds.), The orienting reflex in humans (pp. 381–391). Hillsdale, NJ: Erlbaum.
Mauss, I. B., Levenson, R. W., McCarter, L., Wilhelm, F. H., & Gross, J. J. (2005). The tie that binds? Coherence among emotion experience, behavior, and physiology. Emotion, 5, 175–190.
Meyer, W.-U., & Niepel, M. (1994). Surprise. In V. S. Ramachandran (Ed.), Encyclopedia of human behavior (Vol. 4, pp. 353–358). Orlando, FL: Academic Press.
Meyer, W.-U., Niepel, M., Rudolph, U., & Schützwohl, A. (1991). An experimental analysis of surprise. Cognition and Emotion, 5, 295–311.
Meyer, W.-U., Reisenzein, R., & Schützwohl, A. (1997). Towards a process analysis of emotions: The case of surprise. Motivation and Emotion, 21, 251–274.
Niepel, M. (2001). Independent manipulation of stimulus change and unexpectedness dissociates indices of the orienting response. Psychophysiology, 38, 84–91.
Orne, M. T. (1962). On the social psychology of the psychological experiment. American Psychologist, 17, 776–783.
Ortony, A., & Turner, W. (1990). What’s basic about “basic” emotions? Psychological Review, 97, 315–331.
Parkinson, P. (2005). Do facial movements express emotions or communicate motives? Personality and Social Psychology Review, 9, 278–311.
Parrott, W. G., & Gleitman, H. (1989). Infants’ expectations in play: The joy of peek-a-boo. Cognition and Emotion, 3, 291–311.
Ratcliff, R. (1993). Methods for dealing with reaction time outliers. Psychological Bulletin, 114, 510–532.
Reisenzein, R. (2000a). Exploring the strength of association between the components of emotion syndromes: The case of surprise. Cognition and Emotion, 14, 1–38.
Reisenzein, R. (2000b). The subjective experience of surprise. In H. Bless & J. P. Forgas (Eds.), The message within: The role of subjective experience in social cognition and behavior (pp. 262–279). Philadelphia: Psychology Press.
Reisenzein, R., Meyer, W.-U., & Schützwohl, A. (1996). Reactions to surprising events: A paradigm for emotion research. In N. Frijda (Ed.), Proceedings of the 9th conference of the International Society for Research on Emotions (pp. 292–296). Toronto, Ontario, Canada: ISRE.
Reisenzein, R., & Studtmann, M. (2006). On the expression and experience of surprise: No evidence for facial feedback, but evidence for a reverse self-perception effect. Manuscript under review.
Reissland, N., Shepherd, J., & Cowie, L. (2002). The melody of surprise: Maternal surprise vocalizations during play with her infant. Infant and Child Development, 11, 271–278.
Rimé, B., Phillipot, P., & Cisamolo, D. (1990). Social schemata of peripheral changes in emotion. Journal of Personality and Social Psychology, 59, 38–49.
Rohrbaugh, J. W. (1984). The orienting reflex: Performance and central nervous system manifestations. In R. Parasuraman & D. R. Davies (Eds.), Varieties of attention (pp. 323–373). Orlando, FL: Academic Press.
Rosenberg, E. L., & Ekman, P. (1994). Coherence between expressive and experiential systems in emotion. Cognition and Emotion, 8, 201–229.
Ruch, W. (1995). Will the real relationship between facial expression and affective experience please stand up: The case of exhilaration. Cognition and Emotion, 9, 33–58.
Ruffman, T., & Keenan, T. R. (1996). The belief-based emotion of surprise: The case for a lag in understanding relative to false belief. Developmental Psychology, 32, 40–49.
Ruiz-Belda, M.-A., Fernández-Dols, J.-M., Carrera, P., & Barchard, K. (2003). Spontaneous facial expressions of happy bowlers and soccer fans. Cognition and Emotion, 17, 315–326.
Russell, J. A. (1994). Is there universal recognition of emotion from facial expression? A review of the cross-cultural studies. Psychological Bulletin, 115, 102–141.
Russell, J. A., Bachorowski, J.-A., & Fernández-Dols, J. M. (2003). Facial and vocal expressions of emotion. Annual Review of Psychology, 54, 329–349.
Scherer, K. R., Zentner, M. R., & Stern, D. (2004). Beyond surprise: The puzzle of infants’ expressive reactions to expectancy violation. Emotion, 4, 389–402.
Schützwohl, A. (1998). Surprise and schema strength. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 1182–1199.
Schützwohl, A., & Krefting, E. (2001). Die Struktur der Intensität von Überraschung [The structure of the intensity of surprise]. Zeitschrift für Experimentelle Psychologie, 48, 41–56.
Siddle, D. A. T., & Jordan, J. (1993). Effects of intermodality change on electrodermal orienting and on the allocation of processing resources. Psychophysiology, 30, 429–435.
Smith, C. A., & Scott, H. S. (1997). A componential approach to the meaning of facial expression. In J. A. Russell & J. Fernández-Dols (Eds.), The psychology of facial expression (pp. 229–254). Cambridge, United Kingdom: Cambridge University Press.
Sonnemans, J., & Frijda, N. H. (1994). The structure of subjective emotional intensity. Cognition and Emotion, 8, 329–350.
Stekelenburg, J. J., & van Boxtel, A. (2002). Pericranial muscular, respiratory, and heart rate components of the orienting response. Psychophysiology, 39, 707–722.
Suls, J. M. (1971). A two-stage model for the appreciation of jokes and cartoons: An information-processing analysis. In J. H. Goldstein & P. E. McGhee (Eds.), The psychology of humor (pp. 81–100). New York: Academic Press.
Tassinary, L. G., & Cacioppo, J. T. (1992). Unobservable facial actions and emotion. Psychological Science, 3, 28–33.
Tomkins, S. S. (1962). Affect, imagery, consciousness: Volume I. The positive affects. New York: Springer.
Tracy, J. L., Robins, R. W., & Lagattuta, K. H. (2005). Can children recognize pride? Emotion, 5, 251–257.
van Boxtel, A. (2001). Optimal signal bandwidth for the recording of surface EMG activity over facial, jaw, oral, and neck muscles. Psychophysiology, 38, 22–34.
Wagner, H., & Lee, V. (1999). Facial behavior alone and in the presence of others. In P. Philippot, R. S. Feldman, & E. J. Coats (Eds.), The social context of nonverbal behavior (pp. 262–286). New York: Cambridge University Press.
Wheldall, K., & Mittler, P. (1976). Eyebrow-raising, eye widening and visual search in nursery school children. Journal of Child Psychology and Psychiatry, 17, 57–62.
Wicklund, R. A., & Frey, D. (1980). Self-awareness theory: When the self makes a difference. In M. Wegner & R. R. Vallacher (Eds.), The self in social psychology (pp. 31–54). New York: Oxford University Press.
Woodworth, R. S., & Schlosberg, H. (1954). Experimental psychology. New York: Holt.
Received April 4, 2005
Revision received January 30, 2006
Accepted February 6, 2006
PERSONALITY PROCESSES AND INDIVIDUAL DIFFERENCES
The Evolutionary Significance of Depressive Symptoms: Different Adverse Situations Lead to Different Depressive Symptom Patterns
Matthew C. Keller, Virginia Commonwealth University
Randolph M. Nesse, University of Michigan at Ann Arbor
Although much depression may be dysfunctional, the capacity to experience normal depressive symptoms in response to certain adverse situations appears to have been shaped by natural selection. If this is true, then different kinds of situations may evoke different patterns of depressive symptoms that are well suited to solving the adaptive challenges specific to each situation. The authors called this the situation–symptom congruence hypothesis. They tested this hypothesis by asking 445 participants to identify depressive symptoms that followed a recent adverse situation. Guilt, rumination, fatigue, and pessimism were prominent following failed efforts; crying, sadness, and desire for social support were prominent following social losses. These significant differences were replicated in an experiment in which 113 students were randomly assigned to visualize a major failure or the death of a loved one.
Keywords: depression subtypes, depressive symptoms, evolutionary psychology, psychopathology, Darwinian psychiatry
We could never learn to be brave and patient if there were only joy in the world. —Helen Keller

Depression is not a unitary phenomenon. Different depressive episodes often have different symptom profiles, even within the same person across time (Oquendo et al., 2004), and the precipitants of depression vary widely, from deaths of loved ones to failures at major goals to chronic stress (Kendler, Gardner, & Prescott, 2002). Thus, a central challenge in depression research has been to disaggregate it into meaningful subtypes, generally based on symptom profiles, precipitating causes, or both. In the present article, we review previous approaches for subtyping depression, and then introduce and test a new framework for understanding how and why depressive symptoms differ across episodes. Our focus is on unipolar depressive symptoms (hereafter, depressive symptoms) such as sadness, fatigue, pessimism, and so forth, but unless noted, our usage is agnostic as to whether these symptoms cross a clinical threshold of severity or duration. We provide evidence that different precipitants cause different depressive symptom patterns that are consistent with an evolutionary account of their origins.
Previous Approaches for Subtyping Depression

One straightforward way to subdivide depression is based on the depressive symptoms themselves. For instance, the subtype depression with melancholia is characterized by anhedonia, fatigue, chronically depressed mood, early morning insomnia, weight loss, and guilt (Diagnostic and Statistical Manual of Mental Disorders, 4th edition, text rev. [DSM–IV–TR]; American Psychiatric Association, 2000). Its previous designation, endogenous depression, was largely abandoned when it became clear that these symptoms were as likely to be precipitated by life events as other types of depression and that much depression originally reported as having “no cause” was found to be precipitated by events, often of an embarrassing nature (Leff, Roatch, & Bunney, 1970). Another reliably occurring cluster of symptoms, atypical depression, is in some ways the opposite of depression with melancholia, characterized by increased appetite and sleeping, heavy-feeling limbs, rejection sensitivity, and perhaps mood reactivity (American Psychiatric Association, 2000). Depression has also been subtyped by the kind of events that precipitate an episode (hereafter adverse situations or precipitants). For instance, seasonal affective disorder (SAD) is recurrent depression with typical onset in the fall/winter; it is characterized
Matthew C. Keller, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University; Randolph M. Nesse, Department of Psychology, Department of Psychiatry, and Research Center for Group Dynamics, Institute for Social Research, University of Michigan at Ann Arbor. Matthew C. Keller was supported by a National Science Foundation Graduate Research Fellowship; a fellowship from the University of California, Los Angeles, Center for Society and Genetics; and National Research Service Award T32 MH-20030 from the National Institutes of Health (principal investigator, M. C. Neale). We thank Barbara Fredrickson, Bobbi Low, Oscar Ybarra, Michael Neale, Paul Andrews, Steven Aggen, and Kenneth Kendler for help and suggestions. We also appreciate the hard work of research assistants Gloria Jen and Danelle Filips. Correspondence concerning this article should be addressed to Matthew C. Keller, Virginia Institute for Psychiatric and Behavioral Genetics, Biotech 1, 800 East Leigh St., Richmond, VA 23219. E-mail:
[email protected]
Journal of Personality and Social Psychology, 2006, Vol. 91, No. 2, 316–330
Copyright 2006 by the American Psychological Association 0022-3514/06/$12.00 DOI: 10.1037/0022-3514.91.2.316
EVOLUTION AND DEPRESSIVE SYMPTOMS
by fatigue, increased appetite and sleeping, and carbohydrate craving (Rosenthal et al., 1984; Young, 1991). Some SAD symptoms occur in most people in northern latitudes during the winter (Dam, Jakobsen, & Mellerup, 1998), suggesting that SAD may be an extreme of normal wintertime behavioral changes. Bereavement is a dysphoric reaction precipitated by the death of a loved one. Common symptoms include a profound sense of loss, emotional pain, crying, and loneliness (Archer, 1999). Bereavement is not considered pathological by DSM–IV–TR standards if it fits the expected symptom profile and lasts less than 2 months. Finally, diathesis-stress models posit that depression subtypes arise from interactions between adverse situations and stable dispositional factors (Abramson, Metalsky, & Alloy, 1989; Beck, 1967). People who characteristically attribute adverse situations to stable, global causes are at higher risk for hopelessness depression following adverse situations (Abramson et al., 1989). Some evidence indicates that hopelessness depression is characterized by negative cognitions, decreased motivation, fatigue, psychomotor retardation, sleep disturbances, sadness, poor concentration, and suicidal ideation (Alloy, Just, & Panzarella, 1997; Joiner, 2001). Another diathesis-stress model posits that people who are high in need for approval (sociotropes) are at risk for depression following social losses, whereas people high in need for personal achievement (autonomous individuals) are at risk for depression following failures (Beck, Epstein, & Harrison, 1983). Beck et al. (1983) suggested that depression in sociotropes is characterized by emotional lability, helplessness, crying, anxiety, and concern over social desirability, whereas depression in autonomous individuals is characterized by pessimism, guilt, irritability, and social withdrawal.
This symptom-specificity hypothesis generally has been confirmed for sociotropic but not autonomous depression (Burke & Haslam, 2001; Sato & McCann, 2000). Although these and other previous attempts to subtype depression have captured important dimensions along which depressive reactions differ, we suggest that several factors limit their explanatory power. First, an individual’s depressive symptoms are by no means consistent across different depressive episodes (e.g., Coryell et al., 1994; Oquendo et al., 2004), a finding that undermines symptom-specificity models based on individual trait differences, including symptom-specific predictions of diathesis-stress models. Second, previous attempts to subtype depression are not based on a unifying theoretical framework. The symptoms of hopelessness depression, for example, provide little insight into what symptoms we should expect in bereavement, SAD, or sociotropic depression. The symptoms of such descriptive subtypes have been thoroughly documented, but an answer to why these particular symptoms coexist remains elusive. Finally, previous attempts to subtype depression often focus on clinical depression. However, clinical depression requires the co-occurrence of a number of prespecified symptoms, which may artificially impose symptom uniformity and obscure heterogeneity that exists in less severe but more common (Judd, Akiskal, & Paulus, 1997) subclinical depressive episodes. The following two sections introduce a theoretical framework grounded in evolutionary theory that attempts to explain why the capacity for depressive symptoms exists in the first place. Although we focus on normally expressed depressive symptoms—those that most people experience following adverse situations—rather than on clinical depression per se, we believe that this
framework might also provide some theoretical coherence to the subtypes of depression reviewed above.
Evolutionary Explanations for Depressive Symptoms

Unpleasant and disabling states, such as fever, pain, nausea, and inflammation, are often assumed to be abnormal even though they are commonly aroused by specific negative situations. However, it has become increasingly clear that these states are controlled by evolved regulation systems that express the response when cues indicate the presence of particular kinds of situations (Nesse, 2005; Nesse & Williams, 1994). Affect states likewise were shaped by selection to deal with the challenges posed by certain situations (Nesse, 1990). In particular, depressive symptoms are consistently aroused in response to certain adverse situations (Monroe & Simons, 1991), and they appear closely regulated. This suggests that depressive symptoms are not necessarily maladaptive but rather can be useful in the types of negative situations that arouse them. Several previous evolutionary hypotheses have argued for domain-specific functions of depression, such as a signal of submissiveness following a loss of status (Price, Sloman, Gardner, Gilbert, & Rhode, 1994), a strategy to conserve energy and resources (Engel, 1980), a way to avoid social losses (Allen & Badcock, 2003), a means of social manipulation (Hagen, 1999; Watson & Andrews, 2002), or a way to analyze complex social problems (Watson & Andrews, 2002). A more inclusive model suggests that depressive symptoms can be useful in unpropitious situations in any domain (Klinger, 1975; Nesse, 2000). These evolutionary models do not hypothesize that depressive symptoms are always adaptive but rather that the capacity to express them in certain adverse situations increased fitness among human ancestors, and so these capacities continue to be a part of human nature today. Our own hypothesis can be differentiated from most previous evolutionary hypotheses in at least two ways.
First, we focus on normally expressed depressive symptoms rather than on clinical depression per se, not only because clinical depression may conceal important symptom heterogeneity but also because clinical depression is more likely to be an inappropriate and maladaptive response (see also Allen & Badcock, 2003). Intense and prolonged depressive symptoms (depression) may sometimes be normal, nonpathological responses to chronic or severe precipitants (whether useful or not in the individual instance), but at other times may reflect defects or maladaptive “noise” in the mechanisms responsible for regulating normal depressive symptoms (Keller & Miller, in press). Second, we do not argue that depressive symptoms have a unitary cause or serve a unitary function (see also Watson & Andrews, 2002). Rather, given that highly varied situations can arouse depressive symptoms and that many depressive symptoms have little in common (e.g., crying vs. fatigue vs. pessimism), we hypothesize that different symptoms serve related but nevertheless distinguishable functions. We see depressive symptoms as partially differentiated branches on a phylogenetic tree (Nesse, 2004).
The Situation–Symptom Congruence Hypothesis
If different depressive symptoms serve different functions, then different precipitants should give rise to different symptom patterns that increase the ability to cope with the adaptive challenges
specific to each situation. We term this predicted match situation–symptom congruence (SSC). Specific patterns of SSC can be predicted from the potential functions of 11 depressive symptoms and the utility of these functions in different situations.
1. Emotional pain or sadness should occur in response to losses of resources valuable to fitness (Nesse, 2000). As with somatic pain, the aversiveness of emotional pain is its raison d'être: It draws attention to and stimulates withdrawal from currently harmful situations, and it motivates avoidance of future actions that could lead to similar losses (Carver, 2004; Nesse, 2004). Given that social bonds have probably been among the most fitness-relevant resources throughout human evolution, social losses should be especially painful. Situations that do not represent a loss per se, and in which a specific and potentially avoidable event is not the cause, should elicit less emotional pain.
2. Crying, like many emotional signals, is expressed via configurations of facial musculature and vocal behaviors, and it elicits specific reactions in receivers of the signal, in this case empathy and comforting behaviors (Hill & Martin, 1997). It seems likely therefore that crying requests and secures aid. Given that crying appears to strengthen social bonds (Frijda, 1986), we predict that crying will be especially prominent when social bonds themselves are threatened, lacking, or lost. This hypothesis may seem at odds with evidence that depression is often met with interpersonal rejection (e.g., Segrin & Abramson, 1994). However, these conclusions are relevant to extreme depressive symptoms rather than crying in appropriate contexts. Moreover, this research generally indicates that depression elicits rejection from strangers and loose associates; depression appears to elicit solicitous responses from people close to the depressed person (Sheeber, Hops, Andrews, Alpert, & Davis, 1998).
3. Desire for social support would also be adaptive when help is needed. As with crying, the motivation for forming/strengthening social bonds may be especially high following social losses to replace lost bonds. When the loss is not social, however, forming social bonds should be less important.
4. Fatigue refers to physical or mental weariness. Normally, fatigue results from exertion and motivates conserving energy and disengaging motivation. It is parsimonious to assume that fatigue serves the same functions when continued striving is unlikely to be rewarded, such as following failures (given that continued striving at failed goals is maladaptive), when one is unable to cope with all they are attempting to do, or when physical exertion should be minimized to conserve energy, such as might have occurred during ancestral winters.
5. Pessimism is the tendency to expect unfavorable future outcomes. Some evidence suggests that such depressive shifts are actually away from a baseline optimistic bias and toward more realistic appraisals (Alloy & Ahrens, 1987), although in certain domains, pessimism is clearly unrealistically negative (e.g., Stone, Dodrill, & Johnson, 2002). Given that goal pursuit reflects the perceived likelihood of success (Carver & Scheier, 2001), pessimism should diminish initiative and withdraw the organism from current and potential goals (Klinger, 1975) and should be most prominent when future efforts are unlikely to succeed.
6. Guilt refers to feelings of self-reproach and worthlessness. Guilt might motivate an individual to try to figure out how his or her actions led to the situation, and so should be prominent in proportion to the degree of control the individual had in the situation.
7. Rumination, or the obsessive replaying of negative events, feelings, and implications of those feelings, is a common concomitant of depression (Beck, 1967). Numerous studies have concluded that rumination is maladaptive, based on evidence that it increases other depressive symptoms (Nolen-Hoeksema, 1991). However, this conclusion only holds if other depressive symptoms are indeed maladaptive. Moreover, research in emotion regulation stresses the importance of working through rather than avoiding negative emotions (Stanton, Kirk, Cameron, & Danoff-Burg, 2000). Along with other theorists (Martin & Tesser, 1996; Watson & Andrews, 2002), we hypothesize that rumination aids in understanding the causes and consequences of the adverse situations to avoid such situations in the future and to reconsider strategies and goals themselves. If so, rumination should be most prominent when the best future course of action is uncertain or after an untoward event that is potentially avoidable and could recur.
8. Anhedonia refers to diminished mood reactivity and a decreased ability to experience positive emotions. Positive emotions, according to numerous theorists, facilitate approach behavior and increase risk-taking (see Fredrickson, 2001). An inability to experience positive emotions should decrease these tendencies and should be prominent when the environment is unpropitious.
9. Anxiety is a painful state of uneasiness or nervousness about possible future losses. Anxiety promotes wariness and hypervigilance, particularly toward potential threats, and so should be adaptive in threatening situations (Marks & Nesse, 1994).
10. Appetite changes can increase or decrease food intake during depressive episodes. In the most serious cases of depression, appetite is diminished (Beck, 1967). This lack of response to normally pleasurable cues can be seen as a concomitant of anhedonia. A temporary decrease in foraging could have adaptively reduced energy expenditure and risk exposure in unpropitious situations in which efforts would likely be wasted. An increase in appetite, on the other hand, might have been adaptive when food is in short supply, such as during ancestral winters.
11. Sleep increases or decreases often occur during depressive episodes. It seems possible, but tenuous at best, that wakefulness is a form of nocturnal hypervigilance in risky situations. More sleep, on the other hand, could adaptively conserve energy in unpropitious situations.
The 11 depressive symptoms above are ordered according to our subjective confidence about the proposed functions of each symptom. Certain depressive symptoms may simply be epiphenomena with no adaptive utility. For example, changes in sleep and appetite may be byproducts of more general changes in arousal. We have emphasized strictly functional accounts, in part, because they are more easily falsified. Support for nonadaptive explanations increases to the degree that empirical support for functional hypotheses is weak. We also note that functional accounts are not alternatives to proximate explanations about responsible mechanisms, either psychological or neurophysiological. Both proximal and ultimate/evolutionary explanations are essential for a full understanding of depressive symptoms.
The prediction made by the SSC hypothesis is not that symptoms will be present or absent depending on the situation but only that they should be more or less pronounced in the predicted patterns. The fuzzy boundaries arise in part because the varied precipitants have multiple overlapping adaptive challenges. For instance, over evolutionary time, failures were probably preceded more often by other failures than by social losses, so we predict
more fatigue, pessimism, and rumination following failures than following social losses. Nevertheless, social losses were also probably associated with future failures to some degree, and so also should elicit some degree of these symptoms. Depressive symptom patterns should differ quantitatively, not qualitatively, across precipitants.
Such partial differentiation of response specificity has precedents in other biological domains. Antigens arouse general responses (inflammation, fever, and malaise) helpful in fighting a wide range of infections, as well as specific responses that depend on the kind of threat (eosinophils to parasites, interferon to viral invasion, and natural killer cells to cancerous cells). Subtypes of anxiety may also be partially differentiated to cope with different kinds of dangers (Marks & Nesse, 1994). The fit between situation and response supports the hypothesis that the immune response and anxiety are defensive reactions that maximized ancestral fitness in negative situations. Likewise, evidence for a fit between different kinds of situations and specific depressive symptoms would support the hypothesis that depressive symptoms aided ancestral fitness during such situations.
In a previous study (Keller & Nesse, 2005), we found that students reported more fatigue and pessimism following failures or during the wintertime, and they reported more crying and sadness following social losses. These results are consistent with the SSC hypothesis but are preliminary for two reasons. First, symptom scales were derived from the Center for Epidemiologic Studies–Depression Scale (CES–D; Radloff, 1977) using face validity, meaning that many symptoms could not be assessed, and those that could had unknown reliabilities. Second, symptom–situation associations were based solely on retrospective reports that are open to noncausal explanations.
To circumvent problems inherent to using existing depression scales, we developed a scale designed to measure different depressive symptoms in Study 1. In Study 2, participants used this scale to retrospectively report the symptoms that followed a recent adverse situation. To increase confidence that different situations cause different symptom patterns, we randomly assigned participants in Study 3 to imagine either the death of a loved one or the failure of a major goal prior to reporting current depressive symptoms.
Study 1: Depressive Symptoms Scale Development
Depression inventories, such as the CES–D, Beck Depression Inventory (BDI; Beck, Steer, & Garbin, 1988), and Hamilton Depression Rating Scale (Hamilton, 1967), are designed and validated to measure a single underlying latent construct, depression severity, so they are poorly suited to measure specific symptoms of depression. The only scale designed to measure separate depressive symptoms, the Multiscore Depression Inventory (Berndt, Petzel, & Berndt, 1980), does not assess many of the symptoms for which we made predictions and was not developed using modern statistical methods. To investigate whether adverse situations lead to different symptom patterns, it was necessary to create our own scale of depressive symptoms, the Depressive Symptoms Scale (DSS).
Method
Participants
Because we wished to study normal reactions that follow adverse situations in ordinary people, we used nonclinical populations to validate the
DSS. The exploratory sample, also used in Study 2, consisted of undergraduate students who completed the study for course credit. At the beginning of one fall and one winter semester, we prescreened 2,664 introductory psychology students (57% female) for the experience of a 2-week period when they felt “down, sad, or disturbed” during the previous 12 months; 1,127 of the 2,664 students (42%) reported such a period. They then indicated the situations (if any) they thought caused this episode from among the following: general stress or inability to cope (46.7%), social isolation (39.5%), romantic breakup (25.4%), failure at an important goal (19.7%), a specific situation not mentioned above (18.9%), death of a loved one (13.0%), the wintertime (9.7%), and no specific cause (8.4%). These categories were derived from our earlier study (Keller & Nesse, 2005), and the percentages sum to more than 100% because participants could choose more than one category. To ensure adequate sample sizes for each precipitant category (important for Study 2), we prescreened participants to oversample those who experienced less common precipitants (deaths of loved ones, wintertime blues, and failures). We invited 623 of the 1,127 eligible students to participate for course credit, of whom 473 agreed, and 456 of these (96%) completed the study. A further 11 responses (2.5%) were later dropped because they had incomplete data, indicated on a probing question that they had not taken the survey seriously (see Measures), or visited the debriefing Web page before completing the survey. Of the 445 complete responses, 283 were female, 162 were male, and ages ranged from 18 to 23 years (M = 18.8, SD = 1.0). We supplemented the exploratory sample with a cross-validation sample of 311 participants who volunteered to participate on a Web site dedicated to online psychological studies.
Unlike the exploratory sample, the cross-validation sample was used solely in Study 1 because information regarding the precipitating situation was not collected. The data for 22 participants (7%) were dropped prior to any analysis for the same reasons mentioned for the exploratory sample. Of the remaining 289 eligible participants, 221 were female and 68 were male; 269 lived in North America, 14 lived in Europe, and 6 lived in Asia; ages ranged from 18 to 58 years (M = 27.1, SD = 9.9).
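The participant flow reported above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only counts stated in the text; the dictionary keys and variable names are illustrative and are not taken from the study materials.

```python
# Participant flow as reported in the Participants section.
exploratory = {"completed": 456, "dropped": 11, "female": 283, "male": 162}
cross_validation = {"volunteered": 311, "dropped": 22, "female": 221, "male": 68}

exploratory_n = exploratory["completed"] - exploratory["dropped"]
cross_validation_n = cross_validation["volunteered"] - cross_validation["dropped"]

# The retained counts should match the reported gender breakdowns.
assert exploratory_n == exploratory["female"] + exploratory["male"]
assert cross_validation_n == cross_validation["female"] + cross_validation["male"]

# Regions reported for the cross-validation sample should sum to the retained n.
regions = {"North America": 269, "Europe": 14, "Asia": 6}
assert sum(regions.values()) == cross_validation_n

print(exploratory_n, cross_validation_n)  # 445 289
```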
Procedure
Participants who met our prescreening criteria in the exploratory sample completed the survey over the Internet at a private location (usually at home) after receiving an e-mail with the Web address. Participants in the cross-validation sample chose the present study among many other study titles by clicking on a link titled “25 Minute Psychology Survey.” After reading the consent form and filling out a short demographic questionnaire, participants from both samples identified the weeklong period when they felt the worst emotionally in the previous 12 months. To refresh their memories of this period, participants wrote a free-format paragraph about what events or situations, if any, they thought precipitated the depressive symptoms and another paragraph about how they felt during the weeklong period when they felt the worst. Participants in the exploratory sample provided additional information (see Measures, Study 2). Finally, all of the participants responded to items from the DSS, the CES–D, and then the BDI scales regarding the symptoms that occurred during the week when they felt the worst, and then answered a final probing question. The last page fully debriefed the participants.
Measures
DSS. Participants answered 66 questions, written by Matthew C. Keller, which assessed 11 depressive symptoms (6 questions per symptom): Emotional Pain, Anhedonia, Fatigue, Pessimism, Rumination, Crying, Guilt, Anxiety, Changes in Eating Habits, Changes in Sleeping Habits, and Desire for Social Support. The first 10 symptoms are commonly identified as symptoms of depression (Beck, 1967); Desire for Social
Support was added based on predictions of the SSC hypothesis. When awkward wording could be avoided, an equal number of positively and negatively worded items were included for each scale. Because participants reported on how they felt during a weeklong period, we used the duration response format from the CES–D (rarely or none of the time = 1; some of the time = 2; a moderate amount of the time = 3; most or all of the time = 4). We attempted to assess a range of symptom intensities across the 6 questions for each symptom. Items were presented in random order within groups of 11 items, such that each symptom was represented only once within each group, and groups were then randomly ordered. The DSS instructions read as follows:
Please think carefully about how you felt during the weeklong period when you felt the worst. After each statement, indicate how often you felt the ways described. Remember: (a) all responses are completely anonymous, so be as honest as possible; (b) answer each item separately from all others, even if some questions seem redundant; (c) there are no right or wrong answers; try to indicate how you actually felt rather than how you think you “should” have felt.
Other depression inventories. Participants also filled out the BDI and CES–D. The BDI is a 21-item measure of depression that uses a 4-point, statement-anchored response format. The CES–D scale is a 20-item measure of depression that uses a 4-point, frequency response format. The CES–D is often considered a more sensitive measure of less severe, subthreshold depression compared with the BDI. The BDI and CES–D items were reworded to be in the past tense for this study.
Probing questions. Participants answered the question “How seriously have you answered questions on this survey up to this point (all responses are 100% anonymous . . . knowing this really helps us out)” using a 5-point, description-anchored scale (very seriously = 1 to not seriously at all = 5).
Ninety-eight percent of participants indicated that they had taken the task seriously or very seriously.
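The item ordering described for the DSS (random order within groups of 11 items, one item per symptom in each group, groups then randomly ordered) can be sketched as a blocked randomization. This is an illustration under stated assumptions, not the survey software used in the study: the function name `blocked_item_order` is hypothetical, and the text does not say how each symptom's six items were allocated to groups, so the sketch assumes that allocation was also random.

```python
import random

SYMPTOMS = [
    "Emotional Pain", "Anhedonia", "Fatigue", "Pessimism", "Rumination", "Crying",
    "Guilt", "Anxiety", "Changes in Eating Habits", "Changes in Sleeping Habits",
    "Desire for Social Support",
]
ITEMS_PER_SYMPTOM = 6


def blocked_item_order(symptoms, items_per_symptom, rng):
    """Return a list of groups; each group holds one (symptom, item_index) per symptom."""
    # Assumption: which item of a symptom lands in which group is itself random.
    allocation = {s: rng.sample(range(items_per_symptom), items_per_symptom)
                  for s in symptoms}
    groups = []
    for g in range(items_per_symptom):
        group = [(s, allocation[s][g]) for s in symptoms]
        rng.shuffle(group)   # random order within the group
        groups.append(group)
    rng.shuffle(groups)      # random order of the groups themselves
    return groups


order = blocked_item_order(SYMPTOMS, ITEMS_PER_SYMPTOM, random.Random(0))
print(len(order), "groups of", len(order[0]), "items")
```

Blocking in this way spreads the six items of any one symptom across the whole questionnaire, so no symptom's items appear together.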
Analysis
The analysis proceeded in three phases. In the first phase, conducted on responses from the exploratory sample, we used exploratory factor analysis (EFA) and then confirmatory factor analysis (CFA) to uncover the latent structure of the 66 depressive symptom items and to drop items and factors that did not fit this structure. In the second phase, we cross-validated the final (primary) model from the exploratory sample on the cross-validation sample using CFA. In the final phase, we compared the primary model with several alternative models. We treated item responses as ordinal rather than continuous data by fitting all factor models using robust weighted least squares on polychoric correlations with Mplus 3 software (L. K. Muthén & Muthén, 1998). The data set from the combined sample (N = 724) contained 71 missing values out of a total of 47,784 (724 × 66) possible values (.0015 of the total data frame). We imputed the missing values using PRELIS 2 (Jöreskog & Sörbom, 1996). To check that this imputation did not alter the model fits, we reran final models using listwise deletion. Changes in fit were extremely minor and are not reported.
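The missing-data proportion quoted above follows directly from the reported counts. A minimal sketch of that arithmetic, with variable names chosen only for illustration:

```python
# Counts as reported in the Analysis section.
n_participants = 724
n_items = 66
n_missing = 71

total_cells = n_participants * n_items        # size of the participant-by-item data frame
missing_fraction = n_missing / total_cells    # share of cells that were imputed

print(total_cells, round(missing_fraction, 4))  # 47784 0.0015
```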
Results and Discussion
Refinement of the Primary Model in the Exploratory Sample
We determined the number of factors using scree plots from principal-axis EFA with oblique promax rotation. As expected, the first eigenvector explained a large amount of the variation (31%) in item responses. Nevertheless, much of the latent factor substructure was not captured by this single factor; solutions of between 11 and 13 latent factors fit the data best (accounting for 69% to 74% of the overall variation). The 13-factor solution was the most interpretable, having factors corresponding to nine of the symptoms that we had expected, as well as four factors from questions we originally thought would tap into just two symptoms: sleepiness and quality of sleep (from Changes in Sleeping Habits questions) and loss of appetite and increased eating/weight gain (from Changes in Eating Habits questions).
We used CFA to refine the 13-factor solution suggested by EFA. Because part of our interest was to understand how depressive symptoms relate to each other, we allowed the 13 latent factors to be intercorrelated. Items that loaded poorly on the factors (standardized loadings < .50), that were factorially complex (as judged by modification indices showing large cross-loadings to other factors), or for which the model explained little item variation (R² < .30) were dropped sequentially, and the model was rerun. Factors that ended up having fewer than 3 items (quality of sleep and increased eating/weight gain) were dropped. This procedure was iterated until no more items or factors could be dropped. The primary model for the DSS in the exploratory sample retained 47 items that loaded onto 11 depressive symptoms (see Table 1). Table 2 shows the maximum likelihood correlation matrix as well as the descriptive statistics for the 11 DSS symptom scales from the exploratory sample. The average coefficient alpha (Cronbach, 1951) for the 11 DSS subscales was .86. We conducted a second-order EFA with oblique promax rotation on the 11 × 11 correlation matrix shown in Table 2.
Fatigue, Anhedonia, Emotional Pain, Pessimism, Crying, Low Appetite, and Sleepiness loaded most highly (in order) onto the first factor (Overall Dysphoria); Guilt, Rumination, Pessimism, and Anxiety loaded most highly onto the second factor (Brooding/Agitation); Crying, Emotional Pain, and Desire for Social Support loaded most highly onto the third factor (Signal for Help). We used the factor loadings from the Overall Dysphoria factor to create an Overall Dysphoria score for each participant, which appeared to tap into the same construct as the overall scores of the BDI and CES–D (see Table 2).
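The item-screening rules used during model refinement (loading below .50, factorial complexity, R² below .30, and fewer than 3 items per factor) can be expressed as a single screening pass. The sketch below is illustrative only: the `ItemFit` class, the `screen_items` function, and the example values are hypothetical, the thresholds are applied to the magnitude of the loading on the assumption that reverse-keyed items load negatively, and the refitting of the model after each drop that the authors describe is not reproduced.

```python
from dataclasses import dataclass


@dataclass
class ItemFit:
    name: str
    factor: str
    loading: float      # standardized loading from a fitted CFA
    r_squared: float    # item variance explained by the model
    cross_loads: bool   # flagged as factorially complex (e.g., via modification indices)


def screen_items(items, min_loading=0.50, min_r2=0.30, min_items_per_factor=3):
    """One screening pass: drop weak or complex items, then factors left too small."""
    kept = [i for i in items
            if abs(i.loading) >= min_loading
            and i.r_squared >= min_r2
            and not i.cross_loads]
    counts = {}
    for i in kept:
        counts[i.factor] = counts.get(i.factor, 0) + 1
    return [i for i in kept if counts[i.factor] >= min_items_per_factor]


# Hypothetical fit results for two factors (values invented for illustration).
example = [
    ItemFit("a1", "A", 0.82, 0.67, False),
    ItemFit("a2", "A", -0.71, 0.50, False),   # reverse-keyed item
    ItemFit("a3", "A", 0.64, 0.41, False),
    ItemFit("a4", "A", 0.38, 0.14, False),    # loads too weakly
    ItemFit("b1", "B", 0.77, 0.59, False),
    ItemFit("b2", "B", 0.69, 0.48, False),
    ItemFit("b3", "B", 0.66, 0.44, True),     # factorially complex
]
retained = [i.name for i in screen_items(example)]
print(retained)  # ['a1', 'a2', 'a3']
```

In the example, factor B is removed entirely because only two of its items survive screening, which mirrors how the quality of sleep and increased eating/weight gain factors were dropped.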
Cross-Validation of the Primary Model
In the second phase of the analysis, we assessed the degree to which the primary model developed with the exploratory sample explained item covariance in the cross-validation sample. The chi-square statistic is not a good index of fit in this case because even trivial lacks of fit tend to be significant with large sample sizes. Better fit indices are the Tucker–Lewis index (TLI), the comparative fit index (CFI), and the root-mean-square error of approximation (RMSEA). For both continuous and categorical data (Yu & Muthén, 2001), TLI > .95, CFI > .95, and RMSEA < .06 suggest “good fits” (Hu & Bentler, 1999), whereas TLI > .90, CFI > .90, and RMSEA < .10 have historically suggested “acceptable fits” (Browne & Cudeck, 1993; though see Hu & Bentler, 1999). For the exploratory sample, one of the indices from the primary model indicated a good fit and two indicated acceptable fits (see Table 3, row 1). We expected a decent fit for the exploratory sample because the model was refined using this sample; better information about a model’s generalizability comes from the same model run on an independent
Table 1
Means and Factor Loadings for the Depressive Symptoms Scale

                                                                        Exploratory    Confirmatory
                                                                        sample         sample
Scale and item                                                             M     SL       M     SL
Emotional Pain
  I felt really sad                                                     3.39    .87    3.38    .80
  I “hurt” inside, even though the pain wasn’t physical                 3.28    .74    3.32    .78
  I felt fine emotionally                                               1.42   −.87    1.41   −.90
  I was in agony                                                        2.40    .77    2.87    .82
  I was free from emotional pain                                        1.34   −.71    1.31   −.65
Pessimism
  Things seemed hopeless                                                2.98    .85    3.00    .91
  I felt pessimistic about the future                                   2.94    .81    2.98    .80
  I felt like things were going to turn out really well                 1.49   −.73    1.61   −.86
  I felt discouraged about things                                       3.23    .76    3.23    .77
  I felt hopeful for the future                                         1.75   −.74    1.76   −.81
Fatigue
  I felt as energetic as I normally do                                  1.58   −.70    1.58   −.72
  Everything seemed like such an effort                                 2.72    .64    2.99    .77
  I felt active and full of “pep”                                       1.30   −.80    1.33   −.74
  I could not “get going”                                               2.71    .76    2.91    .82
  It was easy to get a lot of things done                               1.53   −.58    1.59   −.62
Anhedonia
  I was still able to feel happy                                        2.00   −.79    1.86   −.81
  I enjoyed life                                                        1.69   −.87    1.66   −.89
  Nothing could make me smile                                           2.26    .76    2.46    .78
  Things that normally gave me joy continued to give me joy             2.15   −.68    1.91   −.74
  I was incapable of feeling anything pleasant                          2.09    .81    2.57    .78
Rumination
  I couldn’t “let go” of certain thoughts                               3.52    .72    3.52    .76
  I was able to clear problems from my mind                             1.52   −.61    1.52   −.67
  I thought about how I could have done things differently              3.26    .61    3.24    .62
  I would catch myself thinking about the same issue                    3.55    .64    3.58    .81
Crying
  I felt like crying                                                    3.21    .91    3.12    .92
  I cried really hard                                                   2.43    .94    2.48    .92
  I got teary-eyed                                                      2.82    .89    3.02    .89
  I sobbed                                                              2.40    .95    2.64    .97
  It took effort to fight off tears                                     2.64    .83    3.00    .73
Guilt
  I felt ashamed                                                        2.41    .77    2.53    .82
  I felt guilt-free                                                     1.79   −.60    1.81   −.63
  I was angry at myself                                                 2.68    .88    2.75    .77
  Rational or not, I blamed myself                                      2.76    .83    2.80    .85
Low Appetite
  The thought of food was not appealing                                 2.10    .89    2.39    .85
  I lost my appetite                                                    2.01    .93    2.47    .91
  Food didn’t taste as good as it usually did                           2.13    .86    2.58    .93
Anxiety
  I was free from fear                                                  1.86   −.54    1.73   −.61
  Things made me nervous                                                2.26    .68    2.65    .80
  I was free from worrying                                              1.34   −.76    1.41   −.83
  I was more afraid than usual                                          2.08    .88    2.61    .83
  I felt anxious                                                        2.38    .85    2.71    .78
Sleepiness
  I wanted to sleep all day                                             2.62    .66    2.48    .60
  I slept more than I normally do                                       2.31    .97    2.16   1.03
  I felt sleepy even when I had gotten plenty of sleep                  2.69    .86    2.62    .72
Desire for Social Support
  I felt like having a heart-to-heart with a close friend or relative   2.65    .92    2.58    .45
  I wanted to share how I felt with someone                             2.75    .88    2.71   1.70
  I wanted to be with close friends or family for support               2.63    .82    2.50    .37

Note. Item means are reported on a 1–4 scale. SL = standardized loadings from freely estimated pathways in threshold confirmatory factor models.
Table 2
Correlations and Descriptive Statistics Among 11 Depressive Symptoms Scale (DSS) Subscales, Overall Dysphoria, the CES-D, and the BDI

Scale                              1     2     3     4     5     6     7     8     9    10    11    12    13    14
1. Emotional Pain                  —
2. Pessimism                     .82     —
3. Fatigue                       .71   .77     —
4. Anhedonia                     .81   .79   .82     —
5. Rumination                    .81   .75   .63   .64     —
6. Crying                        .79   .50   .46   .54   .54     —
7. Guilt                         .52   .64   .47   .46   .81   .26     —
8. Low Appetite                  .53   .44   .53   .57   .42   .41   .33     —
9. Anxiety                       .56   .55   .43   .34   .59   .37   .60   .35     —
10. Sleepiness                   .36   .44   .73   .39   .27   .30   .25   .39   .25     —
11. Desire for Social Support    .08  −.16  −.16  −.25   .12   .22  −.04  −.04   .13  −.07     —
12. Overall Dysphoria            .94   .96   .93   .92   .86   .68   .65   .61   .61   .58  −.11     —
13. CES-D                        .82   .83   .84   .83   .76   .59   .60   .61   .57   .54  −.13   .88     —
14. BDI                          .75   .80   .78   .77   .75   .51   .68   .61   .57   .52  −.17   .82   .86     —
No. items in scale                 5     5     5     5     4     5     4     3     5     3     3    47    20    21
M                               3.27  3.19  3.21  2.71  3.46  2.70  2.77  2.08  2.71  2.55  2.66  2.89  2.65  2.08
SD                              0.64  0.69  0.58  0.66  0.55  0.94  0.84  0.96  0.73  0.98  1.00  0.74  0.55  0.50
Coefficient α                    .89   .87   .82   .88   .73   .95   .84   .91   .84   .88   .90   n/a   .89   .89

Note. Significant correlations (p < .01) in bold. The DSS is made up of Subscales 1–11. Statistics for scales and subscales are based on participants’ means of the questions making up the scales after relevant items were reversed. CES-D = Center for Epidemiologic Studies–Depression Scale; BDI = Beck Depression Inventory.
sample. We ran two types of models on the cross-validation sample. The factor pattern invariant model (Table 3, row 2a) fixed the pathways to be the same as they were in the exploratory sample but allowed the loadings to vary. The fits of this model were similar to the fit for the exploratory sample, with the RMSEA index showing a slight decrement. The factor loading invariant model (Table 3, row 2b) fixed both the pathways and their loadings to be the same as in the exploratory sample. The fits of this more stringent model were somewhat degraded. Taken together, the cross-validation results indicate that the same basic latent factor structure existed in both samples but that the specific values of the factor loadings differed slightly between them. Because the patterns were similar but loadings differed between the samples, we combined the two samples for further
analyses but allowed the factor loadings as well as the means of the latent variables to differ between them.
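The fit criteria cited in this section can be applied mechanically to the values reported in Table 3. The sketch below is only an illustration of those cutoffs: the function name `rate_fit` is hypothetical, and the label "poor" is a convenience label for values that meet neither cutoff, not a term used by the authors.

```python
def rate_fit(cfi, tli, rmsea):
    """Rate each index against the cutoffs cited in the text."""
    def higher_is_better(value):
        # CFI and TLI: > .95 "good", > .90 "acceptable".
        return "good" if value > 0.95 else "acceptable" if value > 0.90 else "poor"

    def lower_is_better(value):
        # RMSEA: < .06 "good", < .10 "acceptable".
        return "good" if value < 0.06 else "acceptable" if value < 0.10 else "poor"

    return {"CFI": higher_is_better(cfi),
            "TLI": higher_is_better(tli),
            "RMSEA": lower_is_better(rmsea)}


# (CFI, TLI, RMSEA) for the primary model, as listed in Table 3.
table3 = {
    "1. Exploratory sample": (0.971, 0.913, 0.073),
    "2a. Pattern invariant": (0.964, 0.902, 0.098),
    "2b. Loading invariant": (0.947, 0.947, 0.119),
}
ratings = {row: rate_fit(*values) for row, values in table3.items()}
for row, rating in ratings.items():
    print(row, rating)
```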
Comparisons With Alternative Models
As recommended by Cliff (1983), we compared the primary model with several plausible alternative models to gauge the uniqueness of the primary model and to better understand the structure of the items and latent factors. All items loaded directly onto a single Overall Dysphoria factor in the first alternative model (Table 3, row 4). The very poor fit of this model indicates that items from similar symptoms do indeed cluster together. The second alternative model (Table 3, row 5) is the same as the primary model except that items from four symptoms having the
Table 3
Goodness-of-Fit Summaries for Confirmatory Factor Models

Model                                                                    χ²     df    CFI    TLI   RMSEA
Primary model: Eleven intercorrelated subscales
  1. Exploratory sample (n = 446)                                      596*    176   .971   .913    .073
  2. Cross-validation sample (n = 289)
     2a. Pattern invariant                                             381*    101   .964   .902    .098
     2b. Loading invariant                                             187*     37   .947   .947    .119
  3. Full sample (N = 735)                                             927*    265   .969   .908    .083
Alternative models (full sample)
  4. No subscales; all items load directly onto Overall Dysphoria    3,606*    150   .532   .818    .202
  5. Eight intercorrelated subscales model                           1,285*    265   .858   .952    .103
  6. Eleven subscales correlate only through Overall Dysphoria         892*    163   .874   .954    .101
  7. Three second-order intercorrelated latent factors               1,005*    245   .962   .895    .093

Note. χ² and df statistics are approximations due to fitting of robust weighted least squares using polychoric correlations. The df statistics differ between identical models (rows 1 and 3) because sample sizes are used in these df approximations (B. O. Muthén, 2004). See text for explanations of terms and descriptions of models. CFI = comparative fit index; TLI = Tucker-Lewis index; RMSEA = root-mean-square error of approximation.
* p < .001.
highest intercorrelations—Emotional Pain, Fatigue, Anhedonia, and Pessimism—loaded instead onto a single latent factor, Core Dysphoria. The poor fit of this model indicates that although these four symptoms appear to be core depressive symptoms, items tapping into them are not interchangeable, and these four symptoms are differentiable. In the third alternative model (Table 3, row 6), the items related to the symptoms as in the primary model, but the 11 symptoms loaded onto a single Overall Dysphoria factor, such that the intercorrelations between the symptoms were explained only by their associations with Overall Dysphoria. The relatively poor fit of this model indicates that there is substructure even between the 11 symptoms themselves; the relationships between symptoms cannot be captured along a single dimension. The fourth alternative model (Table 3, row 7) was similar to the third, except that the 11 symptoms loaded onto the three intercorrelated second-order latent factors (Overall Dysphoria, Brooding/Agitation, and Signal for Help) that were suggested by the second-order factor analysis described above. The similar fit of this model compared with the primary model suggests that the relationships between symptoms fall mainly along three dimensions, and this model might be considered a viable alternative to the primary model.
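For readers who want to relate the RMSEA column of Table 3 to the reported χ² and df values, the following sketch applies the standard single-sample point estimate, RMSEA = sqrt(max(χ² − df, 0) / (df × (N − 1))). This is an assumption about the computation rather than a description of what the software did, and the table note cautions that χ² and df are themselves approximations. The sketch is restricted to the rows fitted to a single sample (rows 1, 2a, and 2b).

```python
import math


def rmsea(chi2, df, n):
    """Standard single-sample RMSEA point estimate (assumed formula, see lead-in)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# (chi-square, df, sample size) as listed in Table 3 for single-sample models.
single_sample_rows = {
    "1": (596, 176, 446),
    "2a": (381, 101, 289),
    "2b": (187, 37, 289),
}
estimates = {row: round(rmsea(*values), 3) for row, values in single_sample_rows.items()}
print(estimates)  # {'1': 0.073, '2a': 0.098, '2b': 0.119}
```

Under this assumption, the recomputed values agree with the RMSEA entries reported for those three rows.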
Summary
The DSS primary model fit the exploratory sample adequately to well, depending on the criterion. The differences in fits between the exploratory and cross-validation samples were not large, which is noteworthy given the major demographic differences between the two samples. The factor structure of the DSS also appeared preferable to several plausible alternative factor structures. Thus, the basic DSS structure reported here, 11 intercorrelated subscales, should generalize to the nonclinical populations from which these samples were drawn. Although the DSS is sufficiently reliable for the present study, the present version should not be considered final. Future revisions of the DSS need to improve the model’s overall fit by rewording items with low loadings and including more items for symptoms with few items.
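The subscale statistics in Table 2 can be related to the item statistics in Table 1 with a short calculation. The sketch below reverse-scores the Emotional Pain items on the 1–4 response scale and averages them, and it averages the 11 subscale alphas. It assumes that the items with negative loadings in Table 1 are the reverse-keyed ones, and it works from rounded item means rather than raw responses, so only approximate agreement with Table 2 should be expected.

```python
SCALE_MIN, SCALE_MAX = 1, 4


def reverse(score):
    """Reverse-score a response on the 1-4 duration scale."""
    return SCALE_MIN + SCALE_MAX - score


# Emotional Pain item means from the exploratory sample (Table 1).
# True marks items with negative loadings, assumed here to be reverse-keyed.
emotional_pain = [(3.39, False), (3.28, False), (1.42, True), (2.40, False), (1.34, True)]
scale_mean = sum(reverse(m) if rev else m for m, rev in emotional_pain) / len(emotional_pain)

# Coefficient alphas for DSS Subscales 1-11 (Table 2).
alphas = [0.89, 0.87, 0.82, 0.88, 0.73, 0.95, 0.84, 0.91, 0.84, 0.88, 0.90]
mean_alpha = sum(alphas) / len(alphas)

print(round(scale_mean, 2), round(mean_alpha, 2))  # 3.26 0.86
```

The recomputed scale mean is close to the 3.27 reported for Emotional Pain in Table 2, and the mean alpha matches the reported average of .86.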
Study 2: Test of the Situation–Symptom Congruence Hypothesis In Study 2, we used the DSS to investigate whether the patterns of depressive symptoms differed depending on the precipitating situation, and if so, whether these patterns were consistent with the SSC hypothesis.
Method Participants and Procedure The participants of Study 2 were the exploratory sample described in Study 1. The procedure is also described above, although participants provided more detailed information, as described below.
Measures Categorical precipitants. After writing the free-format paragraphs about the causes of their depressive symptoms (see Procedure, Study 1), participants chose the single most likely cause from among the following eight (mutually exclusive) precipitants: Death of Loved One (n = 44),
Romantic Loss (n = 92), Social Isolation (n = 112), Failure at Important Goal (n = 44), Stress or Difficulty Coping (n = 83), Wintertime (n = 30), No Cause (n = 13), and Other Cause (n = 27). Privacy protections did not allow matching prescreening data to participants’ responses collected during the study, and so we could not assess the degree to which these responses corresponded to their earlier, nonmutually exclusive responses provided during prescreening. Properties of precipitants. Using 6-point scales (not at all = 1 to completely = 6), participants answered the following five questions about the precipitant: (a) “To what degree was the situation due to a social loss (e.g., losing someone close to you through a death or a breakup, losing a friend after a fight)?” (Social Loss; M = 3.6, SD = 1.8); (b) “To what degree was the situation caused by your effort at something not working?” (Failed Effort; M = 3.0, SD = 1.6); (c) “To what degree was the situation due to being shamed?” (Shamed; M = 2.1, SD = 1.4); (d) “To what degree did the situation occur suddenly (vs. gradually)?” (Suddenness; M = 3.7, SD = 1.6); and (e) “To what degree did you have control over the situation?” (Control; M = 2.5, SD = 1.4). Participants then indicated the date when the precipitant occurred or when they began to feel bad if no precipitant occurred. Time Since Precipitant (M = 39.5, SD = 30.4) was defined as the number of weeks between this date and when the survey was completed. Depressive symptom scores. We obtained 11 standardized symptom scores for each participant from the primary DSS model on the exploratory sample (see Study 1) using the SAVE = FSCORES command in Mplus 3. We also obtained an overall dysphoria score based on a weighted combination of the 11 symptoms (see Study 1). Information on present mood, antidepressant usage, and depression history. 
Following completion of the DSS, CES–D, and BDI surveys (see Measures, Study 1), participants rated their mood over the last week using a description-anchored, 9-point scale (completely depressed = 1 to completely euphoric = 9; M = 5.4, SD = 1.7). Participants then indicated how often they had been depressed in their life from among the following: “I have never been depressed” (10%), “I have been depressed once in my life” (17%), “I have been depressed a few times in my life” (65%), “I have been depressed most of my life” (7%), and “I have been depressed for as long as I can remember” (1%). Finally, participants indicated whether they were currently taking antidepressant medication (10% were).
Analysis The global prediction that different precipitants would be associated with different depressive symptom patterns was tested by the Precipitant × Symptom interaction term in a repeated measures multivariate analysis of variance (MANOVA) using the GLM command in SPSS software. The 11 depressive symptoms from the DSS were within-subject dependent variables, and the 7 categorical precipitants served as between-subjects independent variables. The Other Cause precipitant was not included in these analyses, reducing the sample size from 445 to 418. This repeated measures MANOVA is similar to a mixed (split-plot) analysis of variance (ANOVA), with one between-subjects variable (precipitant type, 7 levels) and one within-subject term (symptom type, 11 levels), but it does not require the rarely met statistical assumption of sphericity (Tabachnick & Fidell, 2001). We tested predictions of the SSC hypothesis using both within- and between-subjects follow-up ANOVA contrasts. We used structural equation modeling (SEM) in Mplus 3 software to test whether the ratings on the degree to which the precipitants involved social loss and failed effort (from Properties of Precipitants) were differentially related to the 11 DSS symptoms. With sample sizes as large as the present one, multivariate normality is not crucial for statistical inference with MANOVA or SEM, but the presence of outliers can be problematic (Tabachnick & Fidell, 2001). We found no multivariate outliers using a conservative p < .001 criterion for Mahalanobis distances, which compared the highest scores and those
expected from a chi-square distribution with degrees of freedom equal to the number of variables. Because the sensitive Box’s M test indicated that the assumption of equality of the variance–covariance matrices was violated in the omnibus MANOVA analyses, we used Pillai’s approximation to the F (hereafter Pillai’s F) for omnibus tests, which is robust to violations of this assumption (Olson, 1979).
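The outlier screen described above can be illustrated with a short sketch. It is not the authors' SPSS procedure: the function names (chi2_sf, solve, mahalanobis_outliers) and the toy data are assumptions made for illustration only. Each case's squared Mahalanobis distance from the sample centroid is referred to a chi-square distribution whose degrees of freedom equal the number of variables, and cases with p < .001 are flagged.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate with integer df."""
    z = x / 2.0
    if df % 2 == 0:
        tail, a = math.exp(-z), 1.0             # exact tail for df = 2
    else:
        tail, a = math.erfc(math.sqrt(z)), 0.5  # exact tail for df = 1
    # Each pass raises the degrees of freedom by 2 (incomplete-gamma recurrence).
    while 2 * a < df:
        tail += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1.0
    return tail


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def mahalanobis_outliers(rows, alpha=0.001):
    """Return indices of rows whose squared Mahalanobis distance has p < alpha."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    centered = [[r[j] - means[j] for j in range(k)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(k)]
           for i in range(k)]
    flagged = []
    for idx, c in enumerate(centered):
        d2 = sum(ci * wi for ci, wi in zip(c, solve(cov, c)))  # c' S^-1 c
        if chi2_sf(d2, k) < alpha:
            flagged.append(idx)
    return flagged


# Toy data (hypothetical): a 5 x 6 grid of ordinary cases plus one extreme case.
rows = [[float(i % 5), float(i // 5)] for i in range(30)] + [[50.0, -50.0]]
flagged = mahalanobis_outliers(rows)
print(flagged)
```

Because the sample covariance matrix includes the case being tested, squared distances are bounded by (n − 1)²/n, so this criterion can only flag cases when the sample is reasonably large relative to the number of variables.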
Results and Discussion Tests of the Situation–Symptom Congruence Hypothesis The Precipitant × Symptom MANOVA interaction term was highly significant across the 11 DSS symptoms, Pillai’s F(60, 2448) = 4.77, p < .001, partial η² = .11, indicating that different precipitants aroused different patterns of depressive symptoms. Controlling for gender, time since the precipitant, number of previous dysphoric episodes, antidepressant usage, and mood in the last week, the Precipitant × Symptom interaction remained significant, Pillai’s F(60, 2406) = 4.64, p < .001, partial η² = .10. We used the hypothesized functions of each symptom (outlined in the introduction) to predict which symptoms should be prominent following the particular precipitants investigated in Study 2 (see Table 4). Anxiety was left out because none of its predictions corresponded well to the six precipitant categories. The fourth column of Table 4 shows 10 between-subjects contrast tests, one per symptom, which compared the mean symptom levels of precipitants expected to have high levels of that symptom versus the mean of the other precipitants. SSC predictions were supported for 4 of 10 symptoms. However, main effects of precipitants can obscure precipitant differences within symptoms (Tabachnick & Fidell, 2001). For example, fatigue levels may not have differed between the romantic loss and winter precipitants, as predicted, simply because virtually all symptoms were higher for romantic loss (see Figure 1). Controlling for overall dysphoria, the predicted precipitants had significantly higher mean levels, relative to their
overall symptom levels, for 8 of the 10 symptoms (fifth column, Table 4). An alternative to testing if precipitants differ within each symptom is to test if symptom levels differ within each precipitant. This approach does not suffer from the analogous problem discussed in the previous paragraph (main effects of symptom levels obscuring symptom pattern differences, in this case) because symptoms, having means of zero, necessarily had no main effects. For each precipitant, we performed a repeated measures contrast test that compared the combination of symptoms expected to be prominent versus the combination of all other symptoms. Except for the symptom pattern following an inability to cope, SSC predictions were supported (contrasts shown in Figure 1). These effects held or grew stronger after controlling for the same five variables described in the previous paragraph. The SSC-inspired contrasts predicted a substantial amount of the variation in depressive symptom patterns (η² = .03 to .33, depending on the precipitant). This is impressive given that different people must react differently to similar situations and that each participant had to choose a single, mutually exclusive precipitant, which may not always reflect reality.
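One common way to compute a within-subject contrast of this kind is sketched below. This is an illustration under stated assumptions, not the authors' SPSS analysis: the function name contrast_t, the symptom labels, and the four toy participants are hypothetical. Each participant receives a contrast score (mean of the predicted symptoms minus mean of the remaining symptoms), and the scores are tested against zero with a one-sample t test; the effect size uses the standard conversion η² = t² / (t² + df).

```python
import math
import statistics


def contrast_t(participants, predicted, others):
    """One-sample t test on per-person contrast scores (predicted minus others)."""
    scores = []
    for person in participants:
        high = statistics.mean(person[s] for s in predicted)
        low = statistics.mean(person[s] for s in others)
        scores.append(high - low)
    n = len(scores)
    standard_error = statistics.stdev(scores) / math.sqrt(n)
    t = statistics.mean(scores) / standard_error
    df = n - 1
    eta_squared = t * t / (t * t + df)
    return t, df, eta_squared


# Hypothetical standardized symptom scores for four participants who share
# one precipitant; only the symptom names echo the text.
participants = [
    {"crying": 1.0, "social_support": 1.0, "guilt": 0.0, "fatigue": 0.0},
    {"crying": 2.0, "social_support": 2.0, "guilt": 0.0, "fatigue": 0.0},
    {"crying": 3.0, "social_support": 3.0, "guilt": 0.0, "fatigue": 0.0},
    {"crying": 2.0, "social_support": 2.0, "guilt": 0.0, "fatigue": 0.0},
]
t, df, eta_squared = contrast_t(
    participants, ["crying", "social_support"], ["guilt", "fatigue"]
)
print(round(t, 2), df, round(eta_squared, 2))
```

The sketch omits the covariate adjustment mentioned in the text; adding covariates would require a regression of the contrast scores on those variables.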
SEM Tests of Situation–Symptom Congruence In addition to assessing whether symptom patterns differ between mutually exclusive precipitant categories, the SSC hypothesis can also be tested by assessing whether symptom patterns differ as a function of the degree to which relevant dimensions were perceived to play a role in causing the depressive symptoms. We chose to focus on two dimensions, Social Loss and Failed Effort (from Properties of Precipitants), because these dimensions are predicted by the SSC hypothesis to lead to much different symptom profiles. To test this, we began with a fully saturated SEM model in which Social Loss and Failed Effort had pathways
Table 4
Results of Follow-Up Between-Subjects Contrast Tests for Each Symptom
The last two columns answer the question: Are mean symptom levels significantly higher among predicted precipitants?
Symptom | Proposed function of symptom | Symptom should be prominent following | Not controlling for overall dysphoria (a) | Controlling for overall dysphoria (b)
Emotional pain | To make fitness-relevant losses aversive | Death, romantic loss, social isolation, failure | No, t = 0.73, p = .532, η² = .00 | Yes, t = 5.33, p < .001, η² = .07
Crying | To signal a need for help and succor | Death, romantic loss, social isolation | Yes, t = 2.91, p = .004, η² = .02 | Yes, t = 5.27, p < .001, η² = .06
Social support | To make or re-form social bonds | Death, romantic loss, social isolation | Yes, t = 4.19, p < .001, η² = .04 | Yes, t = 4.06, p < .001, η² = .04
Fatigue | To down-regulate effort | Failure, can’t cope, winter | No, t = −1.76, p = .186, η² = .00 | Yes, t = 3.17, p = .002, η² = .03
Pessimism | To give up on failing goals | Failure, can’t cope | No, t = 0.93, p = .352, η² = .00 | Yes, t = 5.09, p < .001, η² = .06
Guilt | To learn from one’s role in current situations | Romantic loss, failure, can’t cope | Yes, t = 5.99, p < .001, η² = .08 | Yes, t = 6.88, p < .001, η² = .10
Rumination | To analyze current situations to avoid similar future situations | Romantic loss, failure, can’t cope | Yes, t = 2.45, p = .015, η² = .01 | Yes, t = 5.41, p < .001, η² = .07
Anhedonia | To decrease approach behaviors | Failure, can’t cope, winter | No, t = −3.12, p = .002, η² = .02 | No, t = −1.64, p = .101, η² = .01
High appetite | To increase calories | Winter | No, t = 1.75, p = .110, η² = .01 | No, t = 0.796, p = .426, η² = .00
Sleepiness | To conserve energy | Winter | No, t = 0.06, p = .951, η² = .00 | Yes, t = 3.02, p = .003, η² = .02
(a) For t tests, the degrees of freedom are 412. (b) For t tests, the degrees of freedom are 411.
to all 11 depressive symptoms, and we then dropped pathways that did not reach marginal significance (p < .10) one at a time, rerunning until all such pathways had been dropped. The final model (see Figure 2) shows that Failed Effort related significantly to (in order of strength of association) Guilt, Rumination, Pessimism, Fatigue, Anxiety, Sleepiness, Anhedonia, and Emotional Pain. Social Loss, on the other hand, related significantly to Desire for Social Support, Crying, and Emotional Pain and was negatively associated with Guilt. The fit of this final model was almost perfect, χ²(9) = 7.29, p = .61 (CFI = 1.00, TLI = 1.00, RMSEA = .00), because all 9 degrees of freedom came from pathways that were dropped for failing to reach marginal significance in previous models. In a second model controlling for three likely mediating variables, we essentially replicated these results (the Failed Effort–Anxiety pathway was nonsignificant), indicating that Shamed, Suddenness, and Lack of Control do not mediate the relationships in Figure 2.
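The pruning procedure described above amounts to backward elimination over pathways. The loop below is a sketch only: prune_paths and toy_fit are hypothetical names, toy_fit returns fixed p values instead of refitting an SEM, and the rule of dropping the pathway with the largest p value first is an assumption, since the text states only that pathways were dropped one at a time.

```python
def prune_paths(paths, fit, threshold=0.10):
    """Drop one pathway per refit until every remaining pathway has p < threshold.

    `fit` is a caller-supplied function (hypothetical here) that refits the
    model with the given pathways and returns a dict mapping pathway -> p value.
    """
    kept = list(paths)
    while kept:
        p_values = fit(kept)
        worst = max(kept, key=lambda path: p_values[path])
        if p_values[worst] < threshold:
            break  # every remaining pathway is at least marginally significant
        kept.remove(worst)
    return kept


# Toy stand-in for an SEM refit; these p values are invented for illustration.
TOY_P = {
    ("Failed Effort", "Guilt"): 0.001,
    ("Failed Effort", "Crying"): 0.45,
    ("Social Loss", "Crying"): 0.002,
    ("Social Loss", "Fatigue"): 0.30,
}


def toy_fit(current_paths):
    return {path: TOY_P[path] for path in current_paths}


kept = prune_paths(list(TOY_P), toy_fit)
print(kept)
```

In a real analysis the p values change after each refit, which is why the model is rerun after every single deletion instead of removing all weak pathways at once.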
Study 3: Depressive Symptoms Following Random Assignment to Imagined Precipitants Study 2 found that retrospective reports of depressive symptom patterns differed depending on the precipitant in ways consistent
Figure 2. Structural equation model relating 11 depressive symptoms to the degree to which social losses and failed efforts played roles in causing the depressive symptoms. Path coefficients for the bold, dotted, and dashed pathways are significantly positive (p < .05), marginally positive (p < .10), and significantly negative (p < .05), respectively. Pathways that were not marginally significant (p > .10) were dropped from the model.
with predictions based on a functional hypothesis of depressive symptoms. However, these data could not assess the direction of causation. People who are characteristically fatigued and pessimistic may be more likely to fail at goals, for example. Study 3 attempted to control for such third-variable and reverse causation explanations by an experimental manipulation using imagined precipitants.
Method Participants Because imagining depressing scenarios has less emotional impact than the real-world situations investigated in Study 2, we preselected participants who had a good chance of being emotionally affected by the experimental manipulations. To this end, we prescreened 1,211 introduction to psychology students to identify 509 participants who rated both accomplishments of goals and personal attachments as being important or very important and who had not participated in Study 2. Of the 353 invited to participate, 129 responded, and 116 (90%) completed the study for course credit. Of these participants, 64 were female, 52 were male, and ages ranged from 18 to 22 years (M = 18.5, SD = 0.93).
Figure 1. Mean levels (boxes and stars) within each precipitant (Low Appetite was reversed for visual clarity). Stars represent symptoms predicted to be prominent in that precipitant, according to the situation–symptom congruence hypothesis. Error bars represent 95% confidence intervals. Repeated measures contrasts (t tests) compare the mean of the predicted symptoms with the mean of all other symptoms for each precipitant. Pessm. = pessimism; Anhed. = anhedonia; Rumin. = rumination; Hi Appet. = High Appetite. * p < .05, ** p < .01, *** p < .001.
Procedure Participants completed the survey over the Internet at a private location (usually at home) between 10 a.m. and 7 p.m. After reading the consent form and filling out a demographic questionnaire, participants clicked on a link that randomly assigned them to either identify their most important goal over the next 3 years (failure condition, n = 60) or to identify the person whom they felt closest to (death condition, n = 56). We dropped 1 participant in the failure condition and 2 participants in the death condition because they indicated on a probing question (see Measures) that they had
not taken the task seriously. Participants in the death condition read the following instructions: The first task of this experiment will be for you to write a fictional first-person story. The reason for writing this story is to induce an emotional reaction in you, so we encourage you to let yourself go emotionally and to allow yourself to feel any and all emotions that writing about this event elicits. In particular, we would like for you to imagine that you receive news in November that the person whose initials you placed in the box above has been diagnosed with brain cancer. Over the next few months the doctors try several promising procedures. However, by March, it becomes clear that things are not going well, and in late March this close person to you dies. Keep the following in mind as you write: (a) Write the story in the first person, and be as descriptive as possible. (b) The story should begin by describing the person you placed in the box above, why this person is important to you, and what you and this person have been doing (in the fictional future) together. The story should end in April, 2 weeks after you have learned of this person’s death. (c) Make the story as realistic as possible—something that could actually happen in the future. (d) Try to cultivate the emotional reaction that your story elicits. Instructions for the failure condition were similar, except that aspects related to the loved one were replaced by the important goal that participants had identified, and aspects related to the death were replaced by a definitive failure at the goal. Enough clarifying information was given to ensure that the failure participants only identified goals that they could conceivably fail at over the next few years (e.g., abstract goals, such as “world peace,” were not allowed). The time frames were made explicit (the death/failure occurring 6 months after the stories began) to remove this as a potential confound. 
All participants were asked to write their fictional account until the input box was full (about 350 words). Participants then completed a modified DSS questionnaire and answered two probing questions. Participants were fully debriefed on the final page of the study. The study took 15–40 min to complete.
Measures Depressive symptom scores. The DSS item wordings, instructions, and response format were altered for Study 3. DSS items were worded in the present tense. Questions from the Rumination, Sleepiness, and Low Appetite scales (symptoms unlikely to meaningfully change over the brief course of this study) were omitted. Likewise, the intensities of several questions were altered when judged necessary (e.g., “I feel like I could cry really hard” rather than “I cried really hard”). The modified DSS instructions read: “Think carefully about how you actually feel right now compared to how you felt on average today. We are interested in what types of feelings and emotions that you are experiencing, not in how you think that you should or would feel” (italics in original). We also altered the response anchor descriptions (a lot less than before the study = 1 to a lot more than before the study = 5). Such self-perceived deviations of mood state scales have been shown to be reliable and to correlate highly with repeatedly measured mood states (Eid, Schneider, & Schwenkmezger, 1999). Moreover, self-perceived deviations of mood state scales effectively control for stable interpersonal differences (Eid et al., 1999), which was important given the interpersonal differences in baseline symptoms likely to exist in any sample. Residual factor covariance matrices were not positive definite in CFA models, a common situation when the ratio of sample size to number of ordinal items (113:44) is as low as it is in the present sample (Flora & Curran, 2004). Rather than saving factor scores from Mplus, we computed DSS symptom scales as the standardized sums of relevant items. Probing questions. Using the same probing question from Study 1, 93% of participants indicated that they took the task “seriously” or “very
seriously.” We also asked participants to “choose the number that describes how much of an emotional effect writing the story had on you” (big effect = 1 to no effect = 5; M = 2.64, SD = 1.02).
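The scale scoring described in the Measures section (standardized sums of relevant items) can be sketched as follows. The function name, the item names cry1 and cry2, and the three toy respondents are hypothetical; the use of the sample standard deviation (n − 1 denominator) is also an assumption, because the text does not say which denominator was used.

```python
import statistics


def standardized_scale_scores(responses, scale_items):
    """Sum each person's responses to a scale's items, then z-score the sums."""
    totals = [sum(person[item] for item in scale_items) for person in responses]
    mean = statistics.mean(totals)
    sd = statistics.stdev(totals)  # sample SD; an assumption, see lead-in
    return [(total - mean) / sd for total in totals]


# Hypothetical 5-point responses to two items of one symptom scale.
responses = [
    {"cry1": 1, "cry2": 2},
    {"cry1": 2, "cry2": 3},
    {"cry1": 3, "cry2": 4},
]
crying_scores = standardized_scale_scores(responses, ["cry1", "cry2"])
print(crying_scores)
```

Standardizing each scale separately puts symptoms measured with different numbers of items on a common metric, which is what allows symptom profiles to be compared within a condition.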
Analysis We found no violations of the assumption of equal variance–covariance matrices. We ran analyses both including and excluding 2 participants who had outlying Mahalanobis distances using the same criterion from Study 2, but because there was little difference between these models, only the models including both are reported. Participants reported being more emotionally involved in writing the death story (M = 2.3, SD = 0.90) than the failure story (M = 3.0, SD = 1.0), t(108) = 3.60, p < .001, so this variable was statistically controlled in analyses. Gender had no additive or interactive effects in the analyses and so was not included as a covariate.
Results and Discussion The Precipitant × Symptom interaction was highly significant across the eight symptoms assessed in this study, Pillai’s F(7, 105) = 5.97, p < .001, partial η² = .29, indicating that visualizing the death of a loved one led to a different pattern of depressive symptoms than visualizing a major failure. Controlling for how emotionally involving writing the story had been, the effect remained strong, Pillai’s F(7, 103) = 4.02, p = .001, partial η² = .22. The symptom profiles and repeated measures contrast tests for the two conditions are shown in Figure 3. Although both deaths of loved ones and failures were predicted to lead to emotional pain, we predicted that social losses would lead to more emotional pain than would failures. The results of the contrast tests (see Figure 3) support the hypothesis that deaths of loved ones and failures cause patterns of depressive symptoms that are consistent with the SSC hypothesis.
General Discussion This study tested the hypothesis that depressive symptom patterns differ depending on the precipitating cause in ways consistent
Figure 3. Mean depressive symptom levels of participants randomly assigned to visualize the death of the person they are closest to or the failure at their most important life goal. Stars represent symptoms predicted to be more prominent in that condition compared with the other condition. Error bars represent 95% confidence intervals. Repeated measures contrasts (t tests) compare the mean of the predicted symptoms with the mean of all other symptoms for each condition after controlling for emotional involvement in visualizing the scenario. Pessm. = pessimism; Anhed. = anhedonia. ** p < .01.
with a functional account of different depressive symptoms (the SSC hypothesis). Using the measure of depressive symptoms developed in Study 1, Study 2 found that retrospective reports of depressive symptom patterns matched the precipitants as predicted by the SSC hypothesis. Emotional pain, which makes losses painful and should thereby motivate avoidance of them, was common to all the precipitants except for the winter season, but it was especially prominent following social losses. We hypothesize that this is because social bonds have been especially important to fitness throughout human evolution. Social losses were also strongly associated with crying and a desire to be with friends and family, responses that may help establish or strengthen lost social bonds. Failing efforts were most strongly associated with guilt, rumination, pessimism, and fatigue—reactions that may have been shaped by natural selection to minimize wasted effort and to reassess failing strategies. Anhedonia, fatigue, sleepiness, and (unexpectedly) desire for social support were prominent symptoms of the winter blues. Such “hibernation” symptoms may have protected against starvation and exposure during ancestral winters. Reactions of participants in Study 3, who were randomly assigned to imagine either the death of a loved one or the failure of a major goal, were very similar to reactions reported by participants who actually experienced these situations. Taken together, these results provide strong evidence that different precipitants cause different depressive symptom patterns, and they are consistent with the hypothesis that depressive symptoms serve situation-specific functions. This supports the more global thesis that depressive symptoms are defensive reactions designed by natural selection to cope with certain kinds of adverse situations. 
Whether or not one agrees with this interpretation, we hope to have demonstrated that evolutionary approaches can stimulate the formation of testable and useful hypotheses in psychiatry and psychology. Our findings are relevant to normally expressed depressive symptoms—symptoms that most people would feel in response to adverse situations—and may or may not generalize to depression per se. Nevertheless, we do not think that it is a coincidence that the patterns of depressive symptoms found in our studies resemble several depression subtypes previously uncovered in psychiatric research. Symptoms that we found to be aroused by deaths of loved ones, romantic losses, and social isolation resemble bereavement and share some features with sociotropic depression. Symptoms that we found to be aroused by failures resemble symptoms of depression with melancholia, hopelessness depression, and autonomous depression. Symptoms that we found to be aroused by the winter season are generally consistent with SAD symptoms. Although we think that finding previously identified symptom clusters in our own data bolsters confidence in our findings, our results are not simply replications of previous findings or confirmations of previous theories of depression subtypes. First, we have investigated a broader array of both situations and symptoms than has previously been done, allowing us to test symptom pattern differences systematically. Moreover, we have introduced a unifying framework that may help explain why particular symptoms often co-occur and that may also provide a novel way to subtype depression based on the precipitating situation. Along with evidence that depressive symptom patterns show little within-person stability (Coryell et al., 1994; Oquendo et al., 2004), and contrary to many previous theories of depression subtypes, our results
suggest that situational rather than dispositional factors may be central to explaining symptom pattern differences between episodes.
Limitations The conclusions from the present set of studies are subject to several limitations. First, the SSC hypothesis did not predict a number of findings (e.g., loss of appetite following romantic breakups), and much variation in symptom patterns therefore remains unexplained. We also recognize that alternative explanations of our results exist, and we hope that such alternatives make new predictions that discriminate between the SSC and alternative explanations. Second, Studies 1 and 2 used retrospective reports of symptoms and precipitants. Although this is the norm in life events research, including longitudinal research (Kessler, 1997), several studies have found that retrospective reports that were taken as soon as a week after concurrent reports suffer from poor reliability and contamination due to self-enhancement and anchoring biases (e.g., Henry, Moffitt, Caspi, Langley, & Silva, 1994; Smith, Leffingwell, & Ptacek, 1999). It is important to know if the present results replicate when symptoms are measured using daily assessment procedures. Nevertheless, retrospective recall bias was not a limitation of Study 3, which substantively replicated two symptom patterns observed in Study 2. Third, the self-reported data on the precipitants could have been unreliable or even biased. The single forced-choice response format gave data on precipitant categories that were no doubt less reliable than what could be obtained by extensive life event interviews, such as those collected by Brown and Harris (1978). Moreover, strictly speaking, we investigated the participants’ causal attributions and not necessarily the true causes of their depressive symptoms. Although one can argue that attributions are the most relevant criteria for testing our hypothesis, we cannot rule out the possibility that experiencing certain symptoms altered the participants’ causal attributions. Once again, however, this limitation does not apply to Study 3. 
A fourth limitation, potentially more important because it also applies to Study 3, is that the self-reported data on symptoms could have been biased (see, e.g., Rottenberg, Gross, Wilhelm, Najmi, & Gotlib, 2002). In our studies, participants may have been more likely to remember or incorrectly report certain symptoms in conjunction with certain precipitants. For example, participants might associate the death of a loved one with crying because crying behavior is intertwined with the memory of a funeral. Although the vast majority of research on depression and affect involves self-report data, different methods of data collection would be required to address this issue. Fifth, self-report of certain symptoms may necessarily overlap with self-report of certain precipitants. For example, reporting that social isolation caused the depression is not much different from reporting a desire for social support during this period. Similarly, failing at a goal may have been perceived as being similar to certain pessimism items. Nevertheless, inspection of the symptom profiles makes this an unlikely explanation for the totality of our results. Sixth, the present research was conducted on student samples and may not generalize to other populations experiencing depressive symptoms. Failures to replicate these findings in different age groups or cultures using the types of symptoms and precipitants investigated here might indicate that our findings were somehow unique to student populations and would be problematic for our evolutionary hypothesis.
Future Research This research needs to be replicated in different populations and cultures. Such studies could also investigate depressive symptom patterns for several precipitants predicted by the SSC but not investigated in the present study, such as postpartum depressive symptoms, physical illness, and being shamed. For example, both epidemiological and experimental studies show that the body’s own defensive response to infections—specifically, cytokines secreted by immune system cells— can cause depressive symptoms (Schiepers, Wichers, & Maes, 2005; Yirmiya et al., 2000). The SSC hypothesis predicts that symptoms related to reduced energy expenditure, such as fatigue and anhedonia, will be prominent during illnesses or following cytokine administration, whereas other depressive symptoms, such as crying, emotional pain, guilt, and rumination, will be much less prominent. Tests of SSC predictions are but one way to assess the more global hypothesis that depressive symptoms are functional. A seemingly more direct test would be to measure whether depressive symptoms increase fitness or lead to positive outcomes. However, such investigations would likely be inconclusive. First of all, fitness in modern environments, replete with birth control, medication, and other evolutionary novelties, may correlate poorly with ancestral fitness, which is the relevant criterion (Tooby & Cosmides, 1990). More important, depressive symptoms are only hypothesized to be useful given already adverse situations. Comparing the outcomes of people suffering from depressive symptoms with those not suffering them would be as meaningless as comparing the outcomes of people suffering from fever with those of healthy controls. Virtually any biological defensive reaction would appear maladaptive by such a standard. 
The correct comparison would be between people who did and did not have depressive symptoms in the same adverse situation, but correctly equating adversity across situations may be devilishly difficult. Another obvious next step in testing the SSC hypothesis is to investigate whether depressive symptoms have the effects hypothesized. For example, do fatigue, anhedonia, and pessimism reduce motivation, goal pursuit, and energy expenditure? Second, in the same way that blocking fever may prolong infections (Nesse & Williams, 1994), blocking normal depressive symptoms with antidepressant medication could increase the risk of chronic negative life situations or poorer outcomes in such situations, even as the sufferers feel better. Similarly, individuals who lack a capacity for depressive symptoms (who have pathological euthymia) should be more likely to lose valuable attachments, more likely to persist at unachievable pursuits, less able to learn from mistakes, and less able to recruit friends during adverse situations.
Conclusions Researchers and clinicians routinely presume that depressive symptoms, or at least extreme ones, are maladaptive. However, many aversive biological defenses, such as pain, are highly functional, in part because they are aversive. The fact that they cause disability and death does not undermine this argument; diarrhea is a useful defense that nonetheless is related to thousands of deaths each year. We propose that the genes of those ancestors who responded to deaths, failures, and losses with indifference tended to be displaced by the genes of those ancestors who responded to these precipitants with emotional pain, crying, anhedonia, guilt, pessimism, fatigue, and rumination. Such depressive symptoms appear to be neither abnormal nor spontaneous; in our study, 42% of college undergraduates reported experiencing them in the previous year, and 92% identified a specific cause. The patterns of depressive symptoms depended on the precipitating situation in a way consistent with the hypothesis that depressive symptoms serve specific functions during adverse situations. Depending on the situation, some or even many episodes of depression may be normal reactions to highly adverse situations. Individual differences in tendencies to get depressive symptoms may have the same significance as variations in tendencies to get a fever during a cold. This in no way implies that depression is “good” or that treating it is “bad.” Patients wanting treatment may not care, understandably, that depressive responses to adverse situations helped their ancestors survive and have offspring. Moreover, the desire to support friends, loved ones, and (in modern environments) patients in times of need, and to extricate them from adverse situations, may be as natural and adaptive as the depressive symptoms themselves. While an evolutionary approach raises questions about the wisdom of routinely blocking depressive symptoms as opposed to treating their causes, the scientific basis for distinguishing pathological from useful depressive symptoms will require a much better understanding of how they were shaped by natural selection.
Received March 21, 2005
Revision received February 24, 2006
Accepted February 27, 2006
Journal of Personality and Social Psychology 2006, Vol. 91, No. 2, 331–341
Copyright 2006 by the American Psychological Association 0022-3514/06/$12.00 DOI: 10.1037/0022-3514.91.2.331
It’s Not Just the Amount That Counts: Balanced Need Satisfaction Also Affects Well-Being
Kennon M. Sheldon, University of Missouri-Columbia
Christopher P. Niemiec, University of Rochester
The basic psychological needs for autonomy, competence, and relatedness have been found to have unique additive effects on psychological well-being (see E. L. Deci & R. M. Ryan, 2000). In the present study, the authors extended these findings by examining whether the balance in the satisfaction of these 3 needs is also important. The results of 4 studies showed that people who experienced balanced need satisfaction reported higher well-being than those with the same sum score who reported greater variability in need satisfaction. This finding emerged for multiple measures of needs and adjustment and was independent of neuroticism. Moreover, results were obtained consistently across concurrent, prospective, daily diary, and observer-report study designs. Discussion focuses on the psychological meaning and functional implications of balanced need satisfaction.
Keywords: life balance, need satisfaction, well-being
Kennon M. Sheldon, Department of Psychology, University of Missouri-Columbia; Christopher P. Niemiec, Department of Clinical and Social Sciences in Psychology, University of Rochester. Correspondence concerning this article should be addressed to Kennon M. Sheldon, Department of Psychology, 210 McAlester Hall, University of Missouri-Columbia, Columbia, MO 65211. E-mail: [email protected]
Psychological needs theories have had a long and checkered history in psychology, dating back to McDougall (1908), Murray (1938), and Maslow (1971). Although the concept of psychological needs provides a promising framework for understanding the antecedents of human thriving, disagreements on the definition and conceptualization of needs have slowed the progress in realizing this potential (Ryan, 1995). For example, needs theorists differ widely in their view of whether psychological needs are variable in their importance for different people, versus largely invariant; whether needs are expressive motives that impel people toward certain types of incentives in the environment, versus experiential requirements necessary for people to thrive; whether needs are acquired during the process of individual development, versus evolved and inherited; and whether needs are few in number, versus multitudinous in number (Sheldon, Ryan, & Reis, 1996).
Recently, research based in the self-determination theory (SDT; Deci & Ryan, 1985, 2000) tradition has led to an upsurge in interest in the concept of psychological needs (see also Baumeister & Leary, 1995; Brewer, 1991; Sheldon, 2004). According to SDT, psychological needs are evolved experiential requirements that all people must have in order to grow to their fullest potential, in the same way that plants require key nutrients (i.e., soil, sun, water) to thrive (Ryan, 1995). SDT postulates the existence of three basic psychological needs (autonomy, competence, and relatedness) and proposes that each is a distinct necessity for psychological health (Ryan, 1995). The need for autonomy (deCharms, 1968) refers to the experience that behavior is owned and endorsed “at the highest level of reflection” (see also Deci & Ryan, 2000). Competence (White, 1959) refers to the need to feel effective, efficient, and masterful vis-à-vis the environment. Relatedness (Baumeister & Leary, 1995) refers to the need to feel understood, connected with, and appreciated by close others.
Considerable research now supports the SDT proposal that all three needs are important. For example, the simultaneous experience of autonomy, competence, and relatedness has been shown to contribute to people’s reports of fulfillment from satisfying events (Sheldon, Elliot, Kim, & Kasser, 2001), good days (Sheldon et al., 1996), secure attachments (LaGuardia, Ryan, Couchman, & Deci, 2000), and college classroom experiences (Filak & Sheldon, 2003). Moreover, need satisfaction contributes to the experience of heightened psychological and physical health in a variety of domains, including the workplace (e.g., Baard, Deci, & Ryan, 2004; Ilardi, Leone, Kasser, & Ryan, 1993; Vansteenkiste et al., in press), athletics (e.g., Gagné, Ryan, & Bargmann, 2003), and general health practices (e.g., Williams et al., 2006; Williams, McGregor, Zeldman, Freedman, & Deci, 2004), as well as across the life span, from adolescents (e.g., Niemiec et al., in press) to older persons (e.g., Kasser & Ryan, 1999). Finally, the proposition that all three needs are essential has been supported in both Eastern and Western cultures (e.g., Deci et al., 2001; Sheldon et al., 2001), at both within- and between-person levels of analysis (Reis, Sheldon, Gable, Roscoe, & Ryan, 2000), and in both cross-sectional and longitudinal designs (Sheldon & Elliot, 1999).
The SDT approach provides a framework for resolving many perennial questions that concern psychological needs. The first question, which was alluded to above, is as follows: If a posited need varies drastically in its importance for different people, then is it really a necessary antecedent for psychological health, or is it merely an idiosyncratic desire? According to SDT, the fundamental needs should not vary much in their importance for different people, a position bolstered by the increasing prominence of evolutionary perspectives that focus on the universal and species-typical features of human nature (Buss, 2000; Deci & Ryan, 2000). In other words, all people require certain types of experiences to approximately the same extent; what varies is the extent to which they manage to get such satisfaction. Thus, the SDT perspective
differs considerably from the motive disposition (or social motive) tradition (McClelland, 1985), which focuses on acquired individual differences in the strength of motives such as achievement, power, intimacy, or affiliation (McAdams, 2001). Simply stated, in the SDT model, needs are invariant required inputs, rather than varying motivated outputs. A second perennial question, which is related to the first, is as follows: If needs are adaptive motives that facilitate psychological health, then why do some people focus their energies in ways that are unsatisfying and even maladaptive? According to SDT, needs are experiential requirements, not behavioral motives. Because there is no a priori connection between motivated behavior and resultant need satisfaction, people can sometimes pursue the “wrong” goals—that is, goals that do not meet their needs (Sheldon, 2004). For example, Niemiec, Ryan, and Deci (2006) found that changes in the attainment of intrinsic, but not extrinsic, life goals during a 1-year period were strongly related to changes in need satisfaction and psychological health during the same period of time, which supports the SDT position that not all goal attainment is equally beneficial for need satisfaction and well-being (Ryan, Sheldon, Kasser, & Deci, 1996). A third perennial question is as follows: If needs really are necessary for psychological health, then shouldn’t a high level of need satisfaction predict well-being? Some previous theoretical approaches have not incorporated this idea, and indeed some previously postulated needs may actually be damaging to human integrity (e.g., Murray’s [1938] proposed “needs” for abasement and aggression). From the SDT perspective, needs are essential for well-being and therefore should be validated, in part, by their association with psychological health (Baumeister & Leary, 1995; Ryan, 1995). 
Thus, if a particular type of experience does not facilitate the adjustment of nearly everyone who has it, then that experience is probably not a psychological need.
The Balance of Need Satisfaction
We hoped to contribute new information to this emerging synthesis by examining the balance of need satisfaction in addition to the total amount of need satisfaction. Although balance has received no prior research attention, there is reason to hypothesize that it may be an important determinant of well-being (see Sheldon, 2004, p. 199). Again, SDT posits that the satisfaction of all three needs is essential for the experience of psychological health, a proposal that has been supported in many studies using a wide variety of outcomes. Often, the basic psychological needs will be satisfied to an approximately equal extent. However, at some times, people’s lives may become configured such that they experience an imbalance in their levels of need satisfaction. For example, suppose that the owner of a new business must work very long hours to pursue his dream. He is his own boss, and thus he experiences very good satisfaction of his need for autonomy (e.g., a score of 6 on a scale ranging from 1 to 7). Moreover, his business has grown quite successful, and thus he experiences very good satisfaction of his need for competence (e.g., a 6). However, he is unable to spend much time with his family and friends, and thus he experiences low satisfaction of his need for relatedness (e.g., a 3). Therefore, the sum score of need satisfaction for the entrepreneur is 15. In contrast, consider a mother who has put aside her career to raise her children. In planning her own days and enjoying her
children, she experiences good satisfaction of her needs for autonomy and relatedness (i.e., 5s on both). She also works a 15-hr per week job at which she is successful, and thus she experiences good satisfaction of her need for competence (e.g., a 5). Note that the sum score of need satisfaction for the mother is also 15. An important question thus concerns whether the greater balance in need satisfaction experienced by the mother is more facilitative of psychological health, even though both she and the entrepreneur experience the same total amount of need satisfaction.
Although no previous research has examined the relation of balanced need satisfaction to well-being, work in other domains suggests that internal variability is problematic for psychological health. For example, Paradise and Kernis (2002) found that unstable self-esteem was associated with less positive psychological functioning, particularly for people with high self-esteem. Self-verification research (Swann, 1990) has shown that people desire, and benefit from, consistency in their self-concept, and self-discrepancy research has shown that people are happier when they do not perceive discrepancies between how they and others see them (Campbell, Assanand, & Di Paula, 2003). Donahue, Robins, Roberts, and John (1993) found that people experienced heightened ill-being when their self-concepts were more differentiated or variable across roles, and Sheldon and Emmons (1995) found that greater differentiation across personal goals predicted negative goal outcomes. Thus, it appears that within-person variability, over time or across contexts, in self-relevant constructs is detrimental to well-being.
The research cited above only indirectly addresses the balance hypothesis examined in the present research, which concerns variability among psychological needs that are presumed to be unique.
Thus, we drew from the burgeoning literature on the consequences of life balance for psychological health to provide a more specific theoretical rationale for our hypothesis. According to the scarcity hypothesis (Chapman, Ingersoll-Dayton, & Neal, 1994), there exists a limited amount of time and energy that people can devote to different areas in their lives (e.g., work, family). The allocation of these resources to certain life domains leaves less time and energy available for other domains. Thus, when allocation of these resources is imbalanced across life domains (either because of inappropriate allocation or circumstances that mandate imbalanced allocation), there is insufficient time and energy that can be allocated to other important areas, creating stress and conflict that result in a variety of negative consequences for health and well-being (Adams, King, & King, 1996; Arye, 1992; Grant-Vallone & Donaldson, 2001; Noor, 2004; Rice, Frone, & McFarlin, 1992). In a similar manner, we propose that imbalance among the satisfaction of the psychological needs reflects inappropriate allocations of resources across the different domains of life, which may induce stresses and conflicts that ultimately detract from well-being.
Of course, there are other explanations for the balance effect, should it emerge. For example, balanced need satisfaction may reflect a person’s engagement in harmonious, rather than obsessive, passions (e.g., Vallerand et al., 2003); emerging research suggests that obsessive passions can consume a person’s life, engendering stress and role conflict that detract from well-being (Seguin-Levesque, Lalliberte, Pelletier, Blanchard, & Vallerand, 2003). The balance prediction may also be derived from eudaemonic (i.e., meaning-based), rather than the hedonic (i.e., pleasure-based), conceptions of thriving (Ryan & Deci, 2000; Sheldon, 2004; Waterman, 1993). Many eudaemonic philosophies espouse
balance, harmony, and temperance, whereas hedonic philosophies typically espouse intensity, quantity, and extremity. Finally, demonstrating the importance of balance would also support personality theories that focus on developing all sides of the self (Jung, 1971) and that advise people not to put all their eggs in one basket (Linville, 1987).
Study 1
To provide an initial test of these ideas, we assessed in Study 1 the psychological well-being of a large sample of participants using several measures of well-being, and we also assessed participants’ concurrent satisfaction of autonomy, competence, and relatedness. Our first hypothesis was that satisfaction of the three needs would positively correlate with the well-being measures, replicating prior research showing that need satisfaction facilitates well-being. Our second hypothesis was that balance in satisfaction across the three needs would also correlate positively with well-being.
As we discuss in greater detail in the Method section, we computed balance in terms of the total divergence among the measures of satisfaction of autonomy, competence, and relatedness, where less divergence indicates more balance. Notably, however, as the total satisfaction of the three needs increased, there was less possible variation among the measures of the three needs. This would likely yield a statistical confound (i.e., a ceiling effect) that could have obscured the relation of balance to psychological health. To avoid this confound, we included the three measures of need satisfaction as main effects in the regression models, thereby controlling for the total amount of need satisfaction. Conveniently, this procedure enabled an evaluation of both of our Study 1 hypotheses within the same model.
As a second way of evaluating the robustness of the hypothesized relation, we also tested for curvilinear effects involving the three need satisfaction variables. We hoped to show that balance has a positive relation that persists even when response extremity (i.e., the tendency to give very low or very high ratings) is controlled. Such a tendency should be captured by including a squared product term for each of the three need satisfaction variables, thereby controlling for any nonlinear effects of response extremity on well-being.
We made no predictions concerning the curvilinear effects of need satisfaction on well-being because these associations have not been reported previously. If balance remained a significant predictor after controlling for the nonlinear effects, it would suggest that balance is equally important for people with low, moderate, and high levels of need satisfaction.
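The ceiling confound described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each need score is an integer from 1 to 7 (the actual subscale scores could take non-integer values) and that divergence is the sum of absolute pairwise differences among the three scores. The function and variable names are hypothetical.

```python
from itertools import combinations, product


def divergence(scores):
    """Sum of absolute pairwise differences among the need scores."""
    return sum(abs(a - b) for a, b in combinations(scores, 2))


# For every possible total, record the largest divergence that any
# combination of three integer scores (1-7) with that total can reach.
max_divergence = {}
for triple in product(range(1, 8), repeat=3):
    total = sum(triple)
    max_divergence[total] = max(max_divergence.get(total, 0), divergence(triple))

for total in (15, 18, 20, 21):
    print(total, max_divergence[total])
# 15 12
# 18 6
# 20 2
# 21 0
```

Under these assumptions, the room for imbalance shrinks as the total approaches the top of the scale, which is why the level of each need is controlled before balance is interpreted.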
Method
Participants and Procedure
Participants were 315 students (64% women) in an introductory psychology course at the University of Missouri who participated as part of a course requirement.¹ The mean age of the participants was 19 years, with a range from 18 to 44. The majority of the sample identified themselves as Caucasian (88%), and the rest of the sample was composed of African American (3%), Hispanic (4%), and other (4%). After signing up for the study, they were e-mailed a link to a survey that they completed online.
Measures
Well-being. We used the Positive Affect Negative Affect Scale (PANAS; Watson, Tellegen, & Clark, 1988) to assess participants’ positive (10 items; e.g., “interested,” “strong”) and negative (10 items; e.g., “ashamed,” “irritable”) affect, using a scale ranging from 1 (not at all) to 5 (all the time). We used the Satisfaction With Life Scale (Diener, Emmons, Larsen, & Griffin, 1985) to assess participants’ life satisfaction (five items; e.g., “I am satisfied with my life”), using a scale ranging from 1 (strongly disagree) to 5 (strongly agree). Together, these scales assessed both the emotional and cognitive facets of well-being. All three measures asked participants about their “life in general.” The reliability for each measure was as follows: positive affect α = .88, negative affect α = .88, and life satisfaction α = .84. We computed an aggregate subjective well-being (SWB) score (α = .90) by summing the positive affect and life satisfaction items and subtracting the negative affect items, following the recommendations of Diener and Lucas (1999) and the procedures of Sheldon and Elliot (1999) and Sheldon and Kasser (1998, 2001). We used the Subjective Happiness Scale (Lyubomirsky & Lepper, 1999) to assess participants’ overall level of happiness (four items; e.g., “I consider myself a very happy person”), using a scale ranging from 1 (strongly disagree) to 5 (strongly agree). The reliability for this measure was α = .83.
Need satisfaction. The need satisfaction items used by Sheldon et al. (2001) assessed general experiences of autonomy (three items; e.g., “In my life, my choices are based on my true interests and values”), competence (three items; e.g., “In my life, I feel very capable in what I do”), and relatedness (three items; e.g., “In my life, I feel close and connected with other people who are important to me”). Responses were made on a 7-point Likert-type scale, ranging from 1 (strongly disagree) to 7 (strongly agree).
The reliability for each subscale was as follows: autonomy α = .76, competence α = .84, and relatedness α = .88. To assess the balance of need satisfaction, we computed the difference between each pair of needs and then summed the absolute values of the three difference scores, which yielded a measure of the total divergence among the three scores. Similar statistical methodologies (e.g., focusing on the variance or on the standard deviation across the three scores) correlated highly with the absolute difference measure (i.e., r > .93) and yielded very similar results. Given the 7-point scale, the balance score could range from 0 (indicating equal satisfaction among the three needs) to 12 (indicating the maximum summed difference among the needs; e.g., as yielded by scores of 1, 4, and 7). The balance score was transformed by subtracting each participant’s score from the highest observed score of 9; this created a variable in which higher scores corresponded to greater balance among the three needs.
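The balance computation described above can be sketched in a few lines. This is an illustrative reading of the verbal description, not the authors' code: the function names are hypothetical, and the example inputs are the entrepreneur (6, 6, 3) and mother (5, 5, 5) profiles from the introduction rather than study data.

```python
from itertools import combinations


def divergence(autonomy, competence, relatedness):
    """Sum of the absolute values of the three pairwise difference scores."""
    scores = (autonomy, competence, relatedness)
    return sum(abs(a - b) for a, b in combinations(scores, 2))


def balance_scores(profiles, ceiling=None):
    """Reverse divergence so that higher values mean greater balance.

    ceiling is the value each divergence is subtracted from; if omitted,
    the highest divergence observed in profiles is used.
    """
    divergences = [divergence(*p) for p in profiles]
    top = max(divergences) if ceiling is None else ceiling
    return [top - d for d in divergences]


entrepreneur = (6, 6, 3)
mother = (5, 5, 5)

print(divergence(*entrepreneur))   # 6
print(divergence(*mother))         # 0
print(divergence(1, 4, 7))         # 12, the scale maximum

# Using the highest divergence reported for the sample (9) as the ceiling:
print(balance_scores([entrepreneur, mother], ceiling=9))   # [3, 9]
```

Both profiles sum to 15, yet the reversed score separates them, which is the distinction the balance hypothesis turns on.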
Results and Discussion
Preliminary Analyses
Table 1 presents the means, standard deviations, and intercorrelations of the three measures of need satisfaction, the two measures of well-being, and the balance score. As expected, the three measures of need satisfaction and the balance score were positively correlated with the well-being measures as well as with each other. The fact that balance correlated positively with the measures of need satisfaction underscored the importance of controlling for level of satisfaction in the regression models, so as to establish the unique contribution of balance.
¹ Sheldon and Hoon (2006) also used this sample to report on the relation of need satisfaction to subjective well-being (SWB). However, they had very different theoretical purposes; did not examine the balance issue, only the amount issue; and used a different measure of need satisfaction than the one reported herein.
Table 1
Means, Standard Deviations, and Intercorrelations of Study Variables: Study 1

Measure             M      SD     1     2     3     4     5     6
Need
  1. Autonomy      5.56   0.98    --
  2. Competence    5.40   1.08   .67    --
  3. Relatedness   5.81   1.06   .51   .54    --
  4. Balance       6.89   1.72   .26   .37   .12    --
Well-being
  5. SWB           0.00   2.22   .55   .60   .51   .29    --
  6. Happiness     8.74   2.81   .40   .44   .43   .30   .68    --

Note. All correlations were significant at the p < .05 level or greater. SWB = subjective well-being.
Primary Analyses
Our most important hypothesis was that balance has a main effect on well-being that is independent of the total amount of need satisfaction. We conducted two hierarchical regression analyses, one using SWB and the other using happiness as the dependent variable (DV). In Step 1, the DV was regressed on the three measures of need satisfaction, with the balance score being entered in Step 2. Using SWB as the DV, autonomy, competence, and relatedness had standardized coefficients of .20, .35, and .21, respectively (all ps < .01); for Step 1, F(3, 311) = 78.3, p < .001. This replicates past research on the independent relations of the three needs to SWB (e.g., Reis et al., 2000; Sheldon et al., 2001). In Step 2, the balance score was a significant positive predictor (ΔR² = .01), F(1, 310) = 3.90, p < .05. Together, the four measures accounted for 44% of the variance in SWB. Using happiness as the DV, autonomy, competence, and relatedness had standardized coefficients of .13, .22, and .24, respectively (all ps < .05); for Step 1, F(3, 311) = 35.2, p < .001. In Step 2, the balance score was a significant positive predictor (ΔR² = .028), F(1, 310) = 12.25, p < .001. Together, the four measures accounted for 28% of the variance in happiness. Table 2 presents the results of these analyses from Study 1 as well as from the other three studies.
To test for curvilinear relations, we entered the three squared product terms in a third step in both analyses. None of the three product terms reached significance in either analysis (all ps > .14). More important, in the SWB analysis, the impact of balance was essentially unchanged (going from β = .09 to β = .08, p = .12). In a similar manner, in the happiness analysis, the impact of balance was essentially unchanged (going from β = .18 to β = .16, p = .004). These results indicated that the effect of balance was not reducible to the influence of participants who were simply very high or very low in their satisfaction of one or more of the three needs.
In sum, Study 1 provided initial support for the hypothesis that balanced need satisfaction is beneficial for well-being and is independent of the level of need satisfaction. The balance effect emerged using two different measures of well-being and was not reducible to the main or curvilinear effects of the amounts of need satisfaction.
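As a rough check on how the Step 2 statistics reported above fit together, the F test for a change in R² can be sketched as follows. This is an illustrative sketch, not the authors' analysis code: the function name f_change and its arguments are hypothetical, and it assumes the standard F-change formula for nested ordinary least squares models.

```python
def f_change(delta_r2, r2_full, df_added, df_residual):
    """F statistic for the increase in R^2 when predictors are added.

    delta_r2    -- increase in R^2 from the earlier step to the later step
    r2_full     -- R^2 of the model that includes the added predictors
    df_added    -- number of predictors added in the later step
    df_residual -- residual degrees of freedom of the later model
    """
    explained_per_df = delta_r2 / df_added
    unexplained_per_df = (1.0 - r2_full) / df_residual
    return explained_per_df / unexplained_per_df


# Rounded values reported for the happiness analysis in Study 1:
# delta R^2 = .028, total R^2 = .28, df = (1, 310).
happiness_f = f_change(0.028, 0.28, 1, 310)
print(round(happiness_f, 2))
```

With these rounded inputs the sketch gives an F of roughly 12.06, close to the reported F(1, 310) = 12.25; the small gap presumably reflects rounding in the published ΔR² and R² values.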
Table 2
Results of Regressions Used to Test the Primary Hypotheses: Studies 1–4

Study     Dependent measure       Step 1 ΔR²   Step 1 p   Step 2 ΔR²   Step 2 p
Study 1   SWB                     .45          < .01      .01          < .05
Study 1   Happiness               .25          < .01      .03          < .01
Study 2   SWB                     .51          < .01      .02          < .05
Study 3   SWB                     .43          < .01      .01          < .01
Study 4   Oppositional–defiant    .31          < .01      .03          < .05
Study 4   Impulsive               .25          < .01      .03          < .05

Note. In Step 1, autonomy, competence, and relatedness were entered; in Step 2, the balance score was entered. In Study 2, Time 1 subjective well-being (SWB) was also entered in Step 1.
Study 2
In Study 2 we used a short-term longitudinal design to reexamine the models tested in Study 1. As mentioned in the introduction, Sheldon and Elliot (1999), as well as Reis et al. (2000), showed that the satisfaction of autonomy, competence, and relatedness over time predicted positive change in global well-being, which is consistent with bottom-up models of well-being that focus on the accumulation of small positive experiences over time (Diener, 1994). These findings suggest that need satisfaction is not only a stable personality disposition that can predict stable components of well-being; levels of need satisfaction also vary over time within individuals, producing corresponding fluctuations in well-being (Lyubomirsky, Sheldon, & Schkade, 2005). Thus, we examined whether the balance of need satisfaction can also influence short-term variations in well-being. Specifically, we assessed well-being both at the beginning and at the end of a college semester and attempted to predict changes in well-being during that period. Our hypotheses were again twofold. First, we sought to replicate past longitudinal research linking need satisfaction to enhanced well-being (Reis et al., 2000; Sheldon & Elliot, 1999). Second, we sought to demonstrate that balance is facilitative of positive changes in well-being over and above the main effects of need satisfaction. Conceptually, demonstrating this relation using both between-subjects (Study 1) and within-subjects (Study 2) designs would provide strong support for the importance and generalizability of the balance hypothesis (e.g., see Reis et al., 2000; Sheldon et al., 1996). Also, such a longitudinal finding would suggest that trying to alter the balance of need satisfaction in people’s lives may be a useful strategy for enhancing their well-being.
There was an additional new feature of Study 2. Specifically, we measured and controlled for neuroticism, which refers to the tendency to experience negative emotional extremes and volatility (Costa & McCrae, 1987). We did this because it is incumbent upon researchers to show that the effects of new personality constructs are not reducible to the effects of known constructs (Fortunato & Goldblatt, 2002), and relevant constructs derived from the Big Five model of personality have become the standard for this endeavor (Mroczek, Spiro, Aldwin, & Ozer, 1993). We examined neuroticism in particular because it is associated with inconsistent or variable responding (Robinson & Tamir, 2005). Because either characteristic might potentially generate and explain the (im)balance finding, showing that balance remains a significant predictor of well-being even after controlling for individual differences in neuroticism would support the incremental validity of the new construct.
Method
Participants and Procedure
Participants were 145 students (78% women) at the University of Missouri who participated in a 1-year study of goals, adjustment, and well-being.2 The mean age of the participants was 18 years, with a range from 18 to 24. The majority of the sample identified themselves as Caucasian (90%), and the rest of the sample was composed of African American (8%) and other (2%). The data herein reported were collected during and just after the first semester. We assessed participants’ well-being and neuroticism at a laboratory session at the beginning of the semester. At four times during the semester, we mailed the participants questionnaires that assessed their current need satisfaction. Near the end of the semester, we sent participants a questionnaire that again assessed their well-being.
Measures
Well-being. As in Study 1, we used the PANAS (Watson et al., 1988) and Satisfaction With Life Scale (Diener et al., 1985) to assess participants’ positive affect, negative affect, and life satisfaction, using the same response scales. We administered the measures at both the beginning and the end of the semester, and we computed two aggregate indices of SWB by summing the relevant positive affect and life satisfaction scores and subtracting the relevant negative affect score. The reliabilities for this measure at Time 1 and Time 2 were α = .89 and .90, respectively.
Need satisfaction. We used three items to assess autonomy (e.g., “Feeling choiceful and self-expressive in my everyday behavior”), three to assess competence (e.g., “Feeling competent and effective in my everyday behavior”), and three to assess relatedness (e.g., “Feeling related and connected to those who are important to me”). At four different times during the semester, participants rated their experience of each need during the previous 3 weeks, using a 7-point Likert-type scale, ranging from 1 (not at all) to 7 (very much). We computed semester autonomy, competence, and relatedness need satisfaction scores by averaging these measures across the four assessments (αs = .74, .79, and .68, respectively). We computed the balance score for the semester in the same way as in Study 1, by first creating three difference scores, then summing and reversing the absolute values of the three differences. Presumably, because of the greater averaging involved in Study 2 relative to Study 1, the balance score ranged from 0 to 4 in Study 2, rather than from 0 to 9 as in Study 1.
Neuroticism. We used the Neuroticism Scale from the NEO Five Factor Inventory (Costa & McCrae, 1989) to assess participants’ neuroticism (12 items; e.g., “I feel inferior to others”). Responses were made on a 5-point Likert-type scale, ranging from 1 (strongly disagree) to 5 (strongly agree). The reliability for this measure was α = .86.
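The SWB composite described under Well-being above (positive affect plus life satisfaction minus negative affect) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' scoring script: the function names zscores and swb_composite are hypothetical, each component is assumed to be standardized across participants before summing (as the article describes for the daily composite in Study 3), and the sample standard deviation is assumed.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize a list of scores (assumption: sample SD is used)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def swb_composite(positive_affect, life_satisfaction, negative_affect):
    """Return one SWB score per participant: z(PA) + z(LS) - z(NA)."""
    pa = zscores(positive_affect)
    ls = zscores(life_satisfaction)
    na = zscores(negative_affect)
    return [p + l - n for p, l, n in zip(pa, ls, na)]


# Made-up scores for three hypothetical participants.
scores = swb_composite([3.0, 4.0, 5.0], [4.0, 5.0, 6.0], [2.0, 1.0, 3.0])
print(scores)
```

Because each component is centered before it is combined, the composite has a mean of zero in the sample it is computed on, which is in line with the near-zero SWB means reported in the tables.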
Results and Discussion
Preliminary Analyses
Table 3 presents the means, standard deviations, and intercorrelations of the three measures of need satisfaction, the two measures of well-being, and the balance score. As in Study 1, the three measures of need satisfaction and the balance score were positively correlated with the well-being measures. Also notable was that semester-long balance was positively correlated with final SWB but not with initial SWB, consistent with our dynamic change perspective.
Primary Analyses
Again, our primary hypothesis was that balance has a main effect on changes in well-being that is independent of the total amount of need satisfaction. We conducted a hierarchical regression analysis in which final SWB (the DV) was regressed onto initial SWB and the three need satisfaction scores in Step 1 and then onto balance in Step 2. In Step 1, F(4, 140) = 36.22, p < .01, autonomy, competence, and relatedness had standardized coefficients of .17 (p = .11), .21 (p = .06), and .08 (p = .28), respectively, and the test-retest coefficient for initial SWB was .38 (p < .01). Notably, although the needs did not reach significance in the change analyses, each was significantly correlated with SWB at the zero-order level. In Step 2, the balance score was also a significant and positive predictor (ΔR² = .018), F(1, 139) = 5.22, p = .024. Together, the five measures accounted for 53% of the variance (see Table 2 for a summary of the results).
To test for curvilinear relations, we entered the three squared product terms in Step 3. These three product terms did not reach significance (all ps > .34), and balance remained significant (β = .15, p = .024). This underscores the importance of balance even after the influence of extreme scores is removed.
In a second control analysis, we entered neuroticism in Step 1 along with initial SWB and the three need satisfaction scores before entering the balance score in Step 2. In Step 1, neuroticism had a marginally significant negative association with change in SWB (β = −.14, p = .075); more important, balance remained significant in Step 2 (ΔR² = .018, p = .023). This finding indicates that balance is independent of the relations of neuroticism (or trait negative affectivity).
2 Sheldon and Hoon (2006) also used this sample to report on the relation of need satisfaction to longitudinal SWB, but again, they had a very different theoretical purpose, and they did not examine the balance issue, only the amount issue.
Table 3
Means, Standard Deviations, and Intercorrelations of Study Variables: Study 2

Measure            M       SD     1     2     3     4     5     6
Need
  1. Autonomy      5.44    0.92   -
  2. Competence    5.32    0.98   .82   -
  3. Relatedness   5.45    0.91   .56   .62   -
  4. Balance       3.58    0.69   .01   .19   .30   -
Well-being
  5. Initial SWB   0.05    2.35   .55   .53   .48   .07   -
  6. Final SWB     −0.01   2.40   .60   .60   .49   .22   .63   -

Note. All correlations significant at the p < .01 level or greater, except those between balance and initial subjective well-being (SWB) and between balance and autonomy need satisfaction (ps > .10).
In sum, Study 2 replicated the primary findings of Study 1 using a short-term longitudinal analysis of changes in SWB. We again demonstrated that the relation of balance to SWB was independent of both the linear and the curvilinear relations of need satisfaction; moreover, in Study 2 we found that this relation was robust when controlling for neuroticism, thus indicating that balance is different than (the absence of) trait negative affectivity or emotional volatility. Finally, the fact that balanced need satisfaction predicted positive changes in well-being over a 3-month period suggests that trying to change the overall balance of autonomy, competence, and relatedness need satisfaction in one’s life may be a defensible happiness-increasing strategy.
Study 3
Whereas in Study 1 we examined global satisfaction and well-being at one point in time and in Study 2 we examined changes in well-being from the beginning to the end of a 3-month period, in Study 3 we examined within-subject variations in need satisfaction and well-being over multiple, shorter periods. Specifically, in Study 3 we examined the balance hypothesis using a daily diary methodology. Participants rated their need satisfaction and well-being experienced during the last 24 hours at eight different times during a college semester. In this study, we sought (a) to replicate previous work by Reis et al. (2000) that showed that satisfaction of the three needs at the day-to-day level predicted daily fluctuations in well-being and (b) to demonstrate that balance would also predict greater well-being at the day-to-day level. Such findings would support our assumption that there is a dynamic process at work and suggest that trying to alter the balance of need satisfaction may be useful for enhancing well-being.
Method
Participants and Procedure
Participants were 91 students (79% women) at the University of Rochester who participated as part of an extra credit opportunity.3 The mean age of the participants was 19 years, with a range from 17 to 35. The majority of the sample identified themselves as Caucasian (62%), and the rest of the sample was composed of African American (16%), Hispanic (5%), Asian (12%), and other (5%). At eight points during the semester, approximately once every 10 days, participants rated their satisfaction of autonomy, competence, and relatedness, as well as their well-being, during the previous 24 hr (Reis et al., 2000; Sheldon et al., 1996). Participants were included in the final dataset only if they had at least five reports; thus, 8 of the participants were omitted from analyses, leaving a total sample of 83. The majority of participants (64%) had 8 reports, and there were 609 reports in the day-level sample.
3 This sample was also used in Sheldon and Elliot (1999) to test hypotheses concerning goals, need satisfaction, and well-being. However, in contrast to the present study, Sheldon and Elliot only reported aggregated diary data that was (a) averaged over the eight reports, with no within-subjects results; (b) focused on semester well-being, not daily well-being; and (c) used different items from the daily questionnaires to assess need satisfaction. Thus, none of the measures in the current study have been reported before.
Measures
Well-being. We used items developed by Emmons (1991) to assess participants’ daily positive (4 items; e.g., “happy”) and negative (5 items; e.g., “frustrated”) mood, and we used items developed by Brunstein (1993) to assess participants’ daily life satisfaction (2 items; e.g., “Today, I am completely satisfied with my life”). For mood, responses were made on a 7-point Likert-type scale, ranging from 1 (not at all) to 7 (very much), and for life satisfaction the scale ranged from 1 (completely disagree) to 7 (completely agree). The reliability for each measure was as follows: positive mood α = .89, negative mood α = .82, life satisfaction α = .78. As in Study 1, we computed a daily SWB score by standardizing the measures and summing positive mood and life satisfaction and subtracting negative mood (Diener & Lucas, 1999; Sheldon & Elliot, 1999). The reliability for this composite was α = .89.
Need satisfaction. In each of the eight questionnaires, participants responded to three items to assess autonomy (e.g., “Feeling generally autonomous and choiceful in what I do”), three to assess competence (e.g., “Feeling generally competent and able in what I attempt”), and three to assess relatedness (e.g., “Feeling generally related and connected to the people I spend time with”). Responses were made on a 7-point Likert-type scale, ranging from 1 (very little) to 7 (very much). We created the daily balance score in the same way as in the previous studies: by summing and reverse scoring the absolute differences among the three needs. This score represents the extent to which participants experienced commensurate amounts of satisfaction on particular days. In Study 3, the balance score ranged from 0 to 12, rather than from 0 to 9 as in Study 1 or from 0 to 4 as in Study 2. This range likely reflects the fact that satisfaction was assessed on a large number of single days with no averaging, allowing for more variability in satisfaction.
Results and Discussion
Preliminary Analyses
Table 4 presents the means, standard deviations, and intercorrelations of the three measures of need satisfaction, the measure of daily well-being, and the balance score. In general, the need satisfaction measures were positively correlated with balance and with SWB.
Table 4
Means, Standard Deviations, and Intercorrelations of Study Variables: Study 3

Measure                  M      SD     1     2     3      4     5
Need
  1. Daily autonomy      4.86   1.39   -
  2. Daily competence    4.89   1.25   .52   -
  3. Daily relatedness   5.31   1.20   .29   .26   -
  4. Daily balance       8.88   2.37   .45   .35   −.05   -
Well-being
  5. Daily SWB           0.00   2.48   .49   .61   .33    .34   -

Note. All correlations significant at the p < .01 level or greater, except that between relatedness need satisfaction and balance. SWB = subjective well-being.
Primary Analyses
We used SAS proc mixed to test our primary hypotheses at the day level (Singer, 2002). This software accounted for the nesting of multiple diary reports within participants and focused the analysis on within-subjects variation around the participants’ own mean (for the reader’s information, proc mixed is similar to HLM and other multilevel modeling software). Specifically, we predicted daily SWB from both the daily satisfaction scores and the daily balance score, while controlling for participant-level mean differences on the variables of interest. We present the nonstandardized coefficients below. As expected, the intercept term representing the sample mean was significantly different from zero (B = 4.85, p < .01). In addition, autonomy, competence, and relatedness each had significant relations to daily SWB (Bs = .39, 1.07, and .55, all ps < .01), indicating that day-level fluctuations in all three of the needs made independent contributions. Most important for the primary study hypothesis, daily balance also had a significant positive relation to daily SWB (B = .14, p < .01; see Table 2 for a summary of the results).
To test for curvilinear relations, we reconducted the analysis, this time also entering three squared product terms. None of the three product terms reached significance (all ps > .28), and balance remained significant (p < .01). This finding again indicates that balance remains important even after extreme scores are considered.
In sum, Study 3 replicated the primary findings of Studies 1 and 2, using a design that focused on predicting within-subjects fluctuations in SWB around the participants’ own means. Once again, the association between balanced need satisfaction and SWB was independent of both the linear and the curvilinear effects of the amount of need satisfaction. These results go beyond the 3-month-long design of Study 2, suggesting that the balance of need satisfaction varies even in the short-term and has predictable effects on daily well-being.
Study 4
A potential limitation of the previous studies was that all data were collected from the same source, namely the participants. Thus, it is possible that the observed relations were obtained because of the common (self-report) method variance shared between the measures of need satisfaction and well-being. It was therefore important to assess whether the balance hypothesis was supported when the data came from two different respondents. Thus, in Study 4 we examined the balance hypothesis using a non-self-report dependent variable. Specifically, in Study 4 participants reported their experiences of support for their psychological need satisfaction that were provided to them by their mothers. In addition, mothers reported on their children’s general engagement in disruptive behaviors (i.e., impulsive and oppositional-defiant behaviors). Because psychological needs are defined as necessary for a variety of adjustment and psychological health outcomes (Deci & Ryan, 2000), we assumed that need satisfaction and balance should affect both intrapsychic and behavioral indices of mental health. Thus, we hypothesized that positive behavioral conduct would be predicted both by the total amount of need satisfaction and by the balance of need satisfaction. In other words, participants who receive inadequate and/or imbalanced need satisfaction from their mothers should evidence more signs of disruptive or defiant behaviors.
Method
Participants and Procedure
Participants were 200 students (50% women) at the University of Rochester who participated in the study for extra course credit. The mean age of the participants was 19 years, with a range from 17 to 34. The majority of the sample identified themselves as Caucasian (79%), and the rest of the sample was composed of African American (4.5%), Hispanic (5%), Asian (8.5%), and other (3%). A majority of the sample lived with their mother while school was not in session (90.5%), and of those, 99% lived with their biological mother. Participants completed a battery of questionnaires and subsequently addressed an envelope to their mother that contained the measure of their child’s behavioral conduct. Mothers were assured that their responses would remain confidential and that their voluntary participation in the study would in no way affect their own or their child’s standing at the university. Eighty-one percent of participants’ mothers returned completed questionnaires, yielding a total of 162 responses from mothers. Mothers’ and participants’ data were matched using a randomly assigned 4-digit code number that was used on both sets of data.
Measures
Behavioral conduct. We used the Disruptive Behavior Disorder Scale (Pelham, Gnagy, Greenslade, & Milich, 1992) to assess mothers’ perceptions of their children’s engagement in oppositional-defiant (8 items; e.g., “Often argues with adults”) and impulsive (9 items; e.g., “Often talks excessively”) behaviors. Responses were made on a 4-point Likert-type scale, ranging from 1 (not at all) to 4 (very much). The reliability for each subscale was as follows: oppositional-defiant behaviors α = .88 and impulsive behaviors α = .80.
Need satisfaction. We used a modified Need Satisfaction Scale (LaGuardia et al., 2000) to assess participants’ perceptions of their mothers’ support for autonomy (2 items; e.g., “When I am with my mother, I have a say in what happens and I can voice my opinions”), competence (2 items; e.g., “When I am with my mother, I feel very capable and effective”), and relatedness (2 items; e.g., “When I am with my mother, I feel a lot of closeness and intimacy”). Responses were made on a 7-point Likert-type scale, ranging from 1 (not at all true) to 7 (very true). The reliability for each subscale was as follows: autonomy α = .73, competence α = .75, and relatedness α = .74. The balance score was created in the same way as in the previous studies, by summing and reverse scoring the absolute differences among the three needs. As in Study 1, the balance score ranged from 0 to 9 in Study 4.
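The balance score described above (summing the absolute differences among the three needs, then reverse scoring) can be sketched as follows. This is an illustrative sketch, not the authors' scoring procedure: the function name balance_score and the ceiling parameter are hypothetical, and the reversal is assumed to subtract the summed discrepancy from 12, the largest total discrepancy possible on a 1-to-7 response scale. The article does not state the reversal constant, and the reported score ranges differ across studies.

```python
from itertools import combinations


def balance_score(autonomy, competence, relatedness, ceiling=12.0):
    """Higher values mean more evenly satisfied needs.

    ceiling is an assumed reversal constant (hypothetical): the maximum
    summed pairwise discrepancy for three scores on a 1-to-7 scale.
    """
    needs = (autonomy, competence, relatedness)
    # Three pairwise absolute differences: A-C, A-R, and C-R.
    total_discrepancy = sum(abs(x - y) for x, y in combinations(needs, 2))
    return ceiling - total_discrepancy


# Two made-up profiles: one perfectly even, one slightly uneven.
even_profile = balance_score(4, 4, 4)
uneven_profile = balance_score(5, 4, 4)
print(even_profile, uneven_profile)
```

Under these assumptions a perfectly even profile receives the ceiling value, and every point of discrepancy between a pair of needs lowers the score by one, so the uneven profile scores lower even though its total satisfaction is higher.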
Results and Discussion
Preliminary Analyses
Independent samples t tests with Bonferroni protection revealed no significant differences in participants’ reports of support for autonomy, competence, and relatedness between those participants whose mothers returned the measure of behavioral conduct and those whose mothers did not. Table 5 presents the means, standard deviations, and intercorrelations of the three measures of need satisfaction, the balance score, and the measures of behavioral conduct. Consistent with Studies 1, 2, and 3, the need satisfaction measures were positively correlated with balance. Also, as would be expected from the results of the three previous studies, the measures of need satisfaction were negatively correlated with disruptive behaviors. In addition, balance was also negatively correlated with disruptive behaviors.
Table 5
Means, Standard Deviations, and Intercorrelations of Study Variables: Study 4

Variable                     M      SD     1      2      3      4      5     6
Need measures
  1. Autonomy                5.46   1.55   -
  2. Competence              5.83   1.44   .76    -
  3. Relatedness             6.04   1.19   .57    .64    -
  4. Balance                 9.53   2.09   .69    .62    .45    -
Disruptive behaviors
  5. Oppositional–defiant    1.49   0.55   −.52   −.53   −.42   −.49   -
  6. Impulsive               1.34   0.45   −.46   −.48   −.29   −.46   .76   -

Note. All correlations significant at the p < .001 level.
Primary Analyses
Our primary hypothesis was that balance, in addition to the three psychological needs, would negatively relate to mothers’ reports of oppositional-defiant and impulsive behaviors. We conducted two hierarchical regression analyses, one using oppositional-defiant behaviors and the other using impulsive behaviors as the DV. In Step 1, we regressed the DV onto the three measures of need satisfaction; we added the balance score in Step 2; and we added the curvilinear terms in Step 3. Using oppositional-defiant behaviors as the DV, autonomy, competence, and relatedness had standardized coefficients of −.25 (p < .05), −.26 (p < .05), and −.12 (ns); for Step 1, F(3, 157) = 23.77, p < .001. In Step 2, the balance score was also a significant negative predictor (ΔR² = .03), F(1, 156) = 5.34, p < .05. Together, the four measures accounted for 34% of the variance in oppositional-defiant behaviors. In Step 3, none of the curvilinear terms reached significance (all ps > .08). Using impulsive behaviors as the DV, autonomy, competence, and relatedness had standardized coefficients of −.23 (p < .05), −.34 (p < .01), and .05 (ns); for Step 1, F(3, 157) = 17.71, p < .001. In Step 2, the balance score was also a significant negative predictor (ΔR² = .03), F(1, 156) = 6.45, p < .05 (see Table 2 for a summary of the results). Together, the four measures accounted for 28% of the variance in impulsive behaviors. In Step 3, none of the curvilinear terms reached significance (all ps > .10).
In sum, Study 4 replicated the earlier studies and also extended them by (a) focusing on a behavioral, as opposed to a well-being, outcome; (b) measuring the outcome (i.e., mother-rated disruptive behavior) independently of the need satisfaction predictors, thus eliminating self-report method variance as an alternate explanation of the findings; and (c) extending the balanced need satisfaction effect to the social realm (i.e., how mother treats me), going beyond the prior focus upon private experience alone. Thus, Study 4 strengthens the case for the idea that balanced need satisfaction is an important and substantive predictor of psychological health and adjustment.
General Discussion
The purpose of the present research was to attempt to contribute new information to the evolving synthesis that concerns the nature and implications of psychological need satisfaction. This article began with a discussion of three perennial questions about psychological needs: whether the needs vary in their importance across individuals, whether need satisfaction necessarily follows from motivated behavior, and whether the level of need satisfaction predicts psychological health. SDT posits that all people require certain experiential inputs (viz., autonomy, competence, and relatedness) for optimal health and functioning, although this view is in contrast with theories that conceptualize psychological needs as acquired individual differences that do not necessarily promote well-being when attained. Many recent findings support the SDT position and suggest that the psychological needs have considerable explanatory power for understanding both the social and the personality factors that enable psychological health (e.g., Deci & Ryan, 2000; Sheldon, 2004). The present research confirmed these past findings and, moreover, identified an important construct that has not been examined within SDT: the balance in satisfaction of the psychological needs. Through four studies that used a diverse set of methodologies and measures, we examined the hypothesis that balanced need satisfaction contributes to psychological health over and above the total amount of need satisfaction. Consistent with past research, our findings indicate that the people who are happiest in life are those who endorse their actions, feel effective, and feel connected to close others, thereby satisfying their needs for autonomy, competence, and relatedness. More important, we found that the balance of need satisfaction, in addition to the total amount, is also important for psychological health.
Thus, regarding the example described in the introduction, our data suggest that the stay-at-home mother may experience higher well-being relative to the entrepreneur because her level of satisfaction is more balanced, even though both had the same overall amount of need satisfaction. Admittedly, the effect sizes for the positive impact of balance on well-being were modest (ranging in absolute value from .09 to .24), whereas the three needs themselves had larger effects. However, the balance effect emerged consistently across diverse methodologies and designs (i.e., cross-sectional, prospective, daily diary, and multiple reporter) and also emerged consistently across
multiple measures of well-being and adjustment. In addition, the hypothesized relation was obtained when multiple alternative factors were controlled, including the total amount of need satisfaction, the curvilinear effects of need satisfaction, and neuroticism. The robustness of the findings controlling for curvilinear effects of satisfaction, in particular, suggests that balance is important for those at both the bottom and at the top of the scale, rather than becoming important only beyond some initial threshold of satisfaction.4 As discussed in the introduction, ours is not the first study to find a connection between personal variability and ill-being. For example, self-concept differentiation (Donahue et al., 1993; Sheldon, Ryan, Rawsthorne, & Ilardi, 1997), unstable self-esteem (Paradise & Kernis, 2002), high self-other discrepancies (Campbell et al., 2003), and overly differentiated goals (Sheldon & Emmons, 1995) have also been shown to carry risks. The present research may be viewed as adding to this general tradition, in which statistical measures of system properties are shown to provide information that goes beyond the mere content and extremity of responses (Block, 1961; Campbell et al., 2003). However, ours is the first study to demonstrate that within-person variation across important types of experience (i.e., psychological needs) can have negative effects, going beyond past research that has focused only on within-person variation across important self-concepts. Because the focus of the present research was on demonstrating the importance of balanced need satisfaction for psychological health, and not on identifying possible mediators of this effect, we cannot definitively state which mechanisms account for our findings. Some possible answers are suggested, however, by the burgeoning life balance literature, particularly within the area of industrial-organizational psychology (Greenblatt, 2002). 
This research focuses specifically on imbalances between work and family life that, over time, can exact a toll on workers’ emotional and psychosocial well-being. Considerable research now supports this idea, and thus life balance workshops and intervention programs have become near-standard in many corporate settings (Green & Skinner, 2005). Theoretical explanations of the negative effects of imbalance for well-being have focused on the stress, strain, burnout, and role conflict occasioned by such a lifestyle (Crooker, Smith, & Tabak, 2002; Greenblatt, 2002; Russell, 2005). We propose that imbalanced need satisfaction reflects a similar set of dynamics, as in the earlier example of the overworked entrepreneur. In other words, such discrepancies may lead to chronic stress and role conflict (Donahue et al., 1993), which in turn detract from well-being via mechanisms that are distinct from need satisfaction itself. Thus, although the entrepreneur experiences the same total amount of need satisfaction as the stay-at-home mother, his well-being may suffer because of the stresses and conflicts occasioned by his mode of deriving satisfaction. Another possible explanation for the imbalance effect is suggested by Vallerand and colleagues’ recent work showing that so-called harmonious passions are more salubrious than so-called obsessive passions (Seguin-Levesque et al., 2003; Vallerand et al., 2003). Although people truly want to engage in obsessive passions (e.g., road-biking, Web-surfing, gambling), such activities can become addictions that create conflict and imbalance in people’s lives. Of course, future research will be needed to test the hypotheses that stress, strain, role conflict, or obsessive tendencies account for the imbalance effect.
An intriguing question that also warrants future empirical consideration concerns whether, and how, well-being is affected by an increase in the total amount of need satisfaction that also results in a more imbalanced need profile. We propose that when a person changes to a more imbalanced state, he or she begins to accrue life stresses and role conflicts that eventually, if not immediately, detract from well-being. Thus, a person who shifts from a 4–4–4 profile of need satisfaction to a more imbalanced 5–4–4 profile may experience heightened well-being in the short term, but this increase in well-being is likely to be tempered in the long term because of the added stress that is associated with imbalanced need satisfaction. Obviously, it will be important for future well-being research to investigate both the short-term and long-term effects of changing one’s need satisfaction profile. The present research has important implications for the debate concerning eudaemonic and hedonic approaches to well-being (Ryan & Deci, 2000). Consistent with the eudaemonic perspective, our research suggests that trying to maximize certain needs while ignoring others is likely to be detrimental for well-being. Instead, psychological health is supported through the development of a multifaceted personality and through the commensurate satisfaction of all three psychological needs. This brings to mind Aristotle’s concept of the golden mean from his Nicomachean Ethics. The golden mean is the path of “not too little, not too much, but just enough”; when people seek the golden mean, they learn to live in harmony and order with themselves. Future intervention-based research might seek to enhance people’s well-being by helping them to alter the balance of satisfaction of their psychological needs. 
For example, after an initial assessment, customized interventions might focus participants on making life changes to boost needs that are inadequately satisfied, while ignoring needs that are already adequately satisfied. In this way, participants’ well-being might benefit the most because their aggregate level of need satisfaction is raised at the same time that variability is reduced. Specific interventions for boosting need satisfaction might include relationship and interpersonal therapies to facilitate the satisfaction of relatedness; skills training and performance therapies to increase the satisfaction of competence; and motivational interviewing and insight therapy to bolster the satisfaction of autonomy (Markland, Ryan, Tobin, & Rollnick, 2005; Sheldon, Williams, & Joiner, 2003).
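The distinction between the total amount of need satisfaction and its balance can be made concrete with a small numerical sketch. The code below is illustrative only and is not the scoring procedure used in these studies: it assumes that the three needs (autonomy, competence, relatedness) are scored on a common scale and, as a stated assumption, indexes imbalance as the sum of absolute pairwise differences among the three scores. The function names are hypothetical.

```python
from itertools import combinations


def total_satisfaction(profile):
    """Aggregate need satisfaction: the sum of the three need scores."""
    return sum(profile)


def imbalance(profile):
    """Illustrative imbalance index (an assumption, not the authors' stated
    formula): sum of absolute differences over all pairs of need scores."""
    return sum(abs(a - b) for a, b in combinations(profile, 2))


# Profiles are (autonomy, competence, relatedness) scores.
balanced = (4, 4, 4)
shifted = (5, 4, 4)  # higher total satisfaction, but a less balanced profile
print(total_satisfaction(balanced), imbalance(balanced))  # 12 0
print(total_satisfaction(shifted), imbalance(shifted))    # 13 2

# Boosting only the least satisfied need raises the total AND reduces imbalance.
before = (5, 4, 3)
after = (5, 4, 4)
print(total_satisfaction(before), imbalance(before))  # 12 4
print(total_satisfaction(after), imbalance(after))    # 13 2
```

Under this index, the shift from 4–4–4 to 5–4–4 raises the total while introducing imbalance, whereas an intervention aimed at the least satisfied need moves both quantities in the favorable direction, which is the logic of the customized interventions proposed above.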
Limitations and Conclusion

There were several limitations to the present research. First, all the participants were from the United States and were rather homogenous in age and background. Thus, the replicability of these results to people of different ages and in different cultures must be established. Second, we do not yet know which factors mediate or account for the balance effects. Does balanced need satisfaction influence well-being by reducing stress, by enhancing personal resources, and/or by buffering against momentary failures? These process questions await future research attention. Third, our primary finding remains to be demonstrated within laboratory settings that experimentally manipulate the three needs. Such context-focused research would go beyond the maternal need provision data reported in Study 4. Fourth, the longitudinal findings of Studies 2 and 3 remain to be replicated over longer periods of time. Might shifting the balance of satisfaction in one’s life promote positive changes that last over the long term? Fifth, it remains to be demonstrated whether the balance effect generalizes to other theories of needs and motives. We believe it should apply primarily to theories that focus on needs as universally required experiences for well-being (e.g., Deci & Ryan, 2000), rather than those that view needs as acquired individual differences in behavioral motives (e.g., McClelland, 1985). For now, however, it appears that it’s not just the amount that counts; balance also matters.

4 Although the curvilinear analyses generally address the threshold issue, to examine it more rigorously we excluded those in the bottom 10% of the sample on overall need satisfaction in each of the four studies, and the results were essentially unchanged. In addition, we examined the interaction between balance and the three needs (low vs. high) in each study. Although some interactions emerged, they were not obtained at a level above chance and they did not form a clear pattern. Thus, there is no evidence that balance matters less for those who are low in need satisfaction. Indeed, balance may matter more for such people because need deprivation is more salient for them.
References

Adams, G. A., King, L. A., & King, D. W. (1996). Relationship of job and family involvement, family social support, and work-family conflict with job and life satisfaction. Journal of Applied Psychology, 81, 411–420. Arye, S. (1992). Antecedents and outcomes of work-family conflict among married professional women: Evidence from Singapore. Human Relations, 45, 813–837. Baard, P. P., Deci, E. L., & Ryan, R. M. (2004). Intrinsic need satisfaction: A motivational basis of performance and well-being in two work settings. Journal of Applied Social Psychology, 34, 2045–2068. Baumeister, R. F., & Leary, M. R. (1995). The need to belong: Desire for interpersonal attachments as a fundamental human motivation. Psychological Bulletin, 117, 497–529. Block, J. (1961). Ego-identity, role variability, and adjustment. Journal of Consulting and Clinical Psychology, 25, 392–397. Brewer, M. B. (1991). The social self: On being the same and different at the same time. Personality and Social Psychology Bulletin, 17, 475–482. Brunstein, J. (1993). Personal goals and subjective well-being: A longitudinal study. Journal of Personality and Social Psychology, 65, 1061–1070. Buss, D. M. (2000). The evolution of happiness. American Psychologist, 55, 15–23. Campbell, J. D., Assanand, S., & Di Paula, A. (2003). The structure of the self-concept and its relation to psychological adjustment. Journal of Personality, 71, 115–140. Chapman, N. J., Ingersoll-Dayton, B., & Neal, M. B. (1994). Balancing the multiple roles of work and caregiving for children, adults, and elders. In G. W. Keita & J. J. Hurrell (Eds.), Job stress in a changing workforce: Investigating gender, diversity, and family issues (pp. 283–300). Washington, DC: American Psychological Association. Costa, P. T., Jr., & McCrae, R. R. (1987). Neuroticism, somatic complaints, and disease: Is bark worse than bite? Journal of Personality, 55, 299–316. Costa, P. T., Jr., & McCrae, R. R. (1989). NEO-PI/FFI manual Suppl. 
Odessa, FL: Psychological Assessment Resources. Crooker, K. J., Smith, F. L., & Tabak, F. (2002). Creating work-life balance: A model of pluralism across life domains. Human Resource Development Review, 1, 387–419. deCharms, R. (1968). Personal causation. New York: Academic Press. Deci, E. L., & Ryan, R. M. (1985). Intrinsic motivation and self-determination in human behavior. New York: Plenum Press.
Deci, E. L., & Ryan, R. M. (2000). The “what” and “why” of goal pursuits: Human needs and the self-determination of behavior. Psychological Inquiry, 11, 227–268. Deci, E. L., Ryan, R. M., Gagné, M., Leone, D. R., Usunov, J., & Kornazheva, B. P. (2001). Need satisfaction, motivation, and well-being in the work organizations of a former Eastern Bloc country. Personality and Social Psychology Bulletin, 27, 930–942. Diener, E. (1994). Assessing subjective well-being: Progress and opportunities. Social Indicators Research, 31, 103–157. Diener, E., Emmons, R., Larsen, R., & Griffin, S. (1985). The Satisfaction With Life Scale. Journal of Personality Assessment, 47, 1105–1117. Diener, E., & Lucas, R. E. (1999). Personality and subjective well-being. In D. Kahneman, E. Diener, & N. Schwartz (Eds.), Well-being: The foundations of hedonic psychology (pp. 213–229). New York: Sage. Donahue, E. M., Robins, R. W., Roberts, B. W., & John, O. P. (1993). The divided self: Concurrent and longitudinal effects of psychological adjustment and social roles on self-concept differentiation. Journal of Personality and Social Psychology, 64, 834–846. Emmons, R. (1991). Personal strivings, daily life events, and psychological and physical well-being. Journal of Personality, 59, 453–472. Filak, V., & Sheldon, K. M. (2003). Student psychological need satisfaction and college teacher-course evaluations. Educational Psychology, 23, 235–247. Fortunato, V. J., & Goldblatt, A. M. (2002). Construct validation of revised Strain-Free Negative Affectivity Scale. Educational and Psychological Measurement, 62, 45–63. Gagné, M., Ryan, R. M., & Bargmann, K. (2003). Autonomy support and need satisfaction in the motivation and well-being of gymnasts. Journal of Applied Sport Psychology, 15, 372–390. Grant-Vallone, E. J., & Donaldson, S. I. (2001). Consequences of work-family conflict on employee well-being over time. Work & Stress, 15, 214–226. Green, P., & Skinner, D. (2005). 
Does time management training work? An evaluation. International Journal of Training and Development, 9, 124–139. Greenblatt, E. (2002). Work/life balance: Wisdom or whining. Organizational Dynamics, 31, 177–193. Ilardi, B. C., Leone, D., Kasser, R., & Ryan, R. M. (1993). Employee and supervisor ratings of motivation: Main effects and discrepancies associated with job satisfaction and adjustment in a factory setting. Journal of Applied Social Psychology, 23, 1789–1805. Jung, C. (1971). The portable Jung. New York: Viking Press. Kasser, V. M., & Ryan, R. M. (1999). The relation of psychological needs for autonomy and relatedness to health, vitality, well-being and mortality in a nursing home. Journal of Applied Social Psychology, 29, 935–954. LaGuardia, J. G., Ryan, R. M., Couchman, C. E., & Deci, E. L. (2000). Within-person variation in security of attachment: A self-determination theory perspective on attachment, need fulfillment, and well-being. Journal of Personality and Social Psychology, 79, 367–384. Linville, P. (1987). Self-complexity as a cognitive buffer against stress-related illness and depression. Journal of Personality and Social Psychology, 52, 663–676. Lyubomirsky, S., & Lepper, H. S. (1999). A measure of subjective happiness: Preliminary reliability and construct validation. Social Indicators Research, 46, 137–155. Lyubomirsky, S., Sheldon, K., & Schkade, D. (2005). Pursuing happiness: The architecture of sustainable change. Review of General Psychology, 9, 111–131. Markland, D., Ryan, R. M., Tobin, V. J., & Rollnick, S. (2005). Motivational interviewing and self-determination theory. Journal of Social and Clinical Psychology, 24, 811–831. Maslow, A. (1971). The farther reaches of human nature. New York: Viking Press. McAdams, D. P. (2001). The person: An integrated introduction to personality psychology. Fort Worth, TX: Harcourt.
McClelland, D. (1985). Human motivation. Glennville, IL: Scott Foresman. McDougall, W. (1908). Introduction to social psychology. London: Methuen. Mroczek, D. K., Spiro, A., Aldwin, C. M., & Ozer, D. J. (1993). Construct validation of optimism and pessimism in older men: Findings from the normative aging study. Health Psychology, 12, 406–409. Murray, H. (1938). Explorations in personality. New York: Oxford University Press. Niemiec, C. P., Ryan, R. M., & Deci, E. L. (2006). The path taken: Consequences of attaining intrinsic and extrinsic aspirations in postcollege life. Manuscript submitted for publication. Niemiec, C. P., Lynch, M. F., Vansteenkiste, M., Bernstein, J., Deci, E. L., & Ryan, R. M. (in press). The antecedents and consequences of autonomous self-regulation for college: A self-determination theory perspective on socialization. Journal of Adolescence. Noor, N. M. (2004). Work-family conflict, work- and family-role salience, and women’s well-being. The Journal of Social Psychology, 144, 389–405. Paradise, A. W., & Kernis, M. H. (2002). Self-esteem and psychological well-being: Implications of fragile self-esteem. Journal of Social and Clinical Psychology, 21, 345–361. Pelham, W. E., Gnagy, E. M., Greenslade, K. E., & Milich, R. (1992). Teacher ratings of DSM–III-R symptoms for the disruptive behavior disorders. Journal of the American Academy of Child & Adolescent Psychiatry, 32, 210–218. Reis, H. T., Sheldon, K. M., Gable, S. L., Roscoe, R., & Ryan, R. M. (2000). Daily well-being: The role of autonomy, competence, and relatedness. Personality and Social Psychology Bulletin, 26, 419–435. Rice, R. W., Frone, M. R., & McFarlin, D. B. (1992). Work-non-work conflict and the perceived quality of life. Journal of Organizational Behavior, 13, 155–168. Robinson, M. D., & Tamir, M. (2005). Neuroticism as mental noise: A relation between neuroticism and reaction time standard deviations. 
Journal of Personality and Social Psychology, 89, 107–114. Russell, J. E. A. (2005). Work and life integration—Organizational, cultural, and individual perspectives. Personnel Psychology, 58, 1065– 1069. Ryan, R. M. (1995). Psychological needs and the facilitation of integrative processes. Journal of Personality, 63, 397– 427. Ryan, R. M., & Deci, E. L. (2000). On happiness and human potentials: A review of research on hedonic and eudaimonic well-being. Annual Review of Psychology, 52, 141–166. Ryan, R. M., Sheldon, K. M., Kasser, T., & Deci, E. L. (1996). All goals are not created equal: An organismic perspective on the nature of goals and their regulation. In P. M. Gollwitzer & J. A. Bargh (Eds.), The psychology of action: Linking cognition and motivation to behavior (pp. 7–26). New York: Guilford Press. Seguin-Levesque, C., Lalliberte, M. L. N., Pelletier, L. G., Blanchard, C., & Vallerand, R. (2003). Harmonious and obsessive passion for the Internet: Their associations with the couple’s relationship. Journal of Applied Social Psychology, 33, 197–221. Sheldon, K. M. (2004). Optimal human being: An integrated multilevel perspective. Mahwah, NJ: Erlbaum. Sheldon, K. M., & Elliot, A. J. (1999). Goal striving, need satisfaction, and longitudinal well-being: The self-concordance model. Journal of Personality and Social Psychology, 76, 482– 497. Sheldon, K. M., Elliot, A. J., Kim, Y., & Kasser, T. (2001). What’s satisfying about satisfying events? Comparing ten candidate psychological needs. Journal of Personality and Social Psychology, 80, 325–339.
Sheldon, K. M., & Emmons, R. A. (1995). Comparing differentiation and integration within personal goal systems. Personality and Individual Differences, 18, 39 – 46. Sheldon, K. M., & Hoon, T. H. (2006). The multiple determination of well-being: Independent effects of positive traits, needs, goals, selves, social supports, and cultural contexts. Manuscript submitted for publication. Sheldon, K. M., & Kasser, T. (1998). Pursuing personal goals: Skills enable progress, but not all progress is beneficial. Personality and Social Psychology Bulletin, 24, 1319 –1331. Sheldon, K. M., & Kasser, T. (2001). Getting older, getting better? Personal strivings and personality development across the life-course. Developmental Psychology, 37, 491–501. Sheldon, K. M., Ryan, R. M., Rawsthorne, L., & Ilardi, B. (1997). “True” self and “trait” self: Cross-role variation in the Big Five traits and its relations with authenticity and well-being. Journal of Personality and Social Psychology, 73, 1380 –1393. Sheldon, K. M., Ryan, R. M., & Reis, H. R. (1996). What makes for a good day? Competence and autonomy in the day and in the person. Personality and Social Psychology Bulletin, 22, 1270 –1279. Sheldon, K. M., Williams, G., & Joiner, T. (2003). Self-determination theory in the clinic: Motivating physical and mental health. Hartford, CT: Yale University Press. Singer, J. D. (2002). Fitting individual growth models using SAS proc mixed. In D. S. Moskowitz & S. L. Hershberger (Eds.), Modeling intraindividual variability with repeated measures data: Methods and applications (pp. 135–170). Mahwah, NJ: Erlbaum. Swann, W. B., Jr. (1990). To be adored or to be known? The interplay of self-enhancement and self-verification. In E. T. Higgins & R. M. Sorrentino (Eds.), Handbook of motivation and cognition: Foundations of social behavior (Vol. 2, pp. 408 – 448). New York: Guilford Press. Vallerand, R. J., Blanchard, C., Mageau, G. A., Koestner, R., Ratelle, C., Leonard, M., et al. (2003). 
Les passions de l’âme: On obsessive and harmonious passion. Journal of Personality and Social Psychology, 85, 756–767. Vansteenkiste, M., Neyrinck, B., Niemiec, C. P., Soenens, B., de Witte, H., & Van den Broeck, A. (in press). On the relations among work value orientations, psychological need satisfaction, and job outcomes: A self-determination theory approach. Journal of Occupational and Organizational Psychology. Waterman, A. S. (1993). Two conceptions of happiness: Contrasts of personal expressiveness (eudaemonia) and hedonic enjoyment. Journal of Personality and Social Psychology, 64, 678–691. Watson, D., Tellegen, A., & Clark, L. (1988). Development and validation of brief measures of positive and negative affect: The PANAS Scales. Journal of Personality and Social Psychology, 54, 1063–1070. White, R. W. (1959). Motivation reconsidered: The concept of competence. Psychological Review, 66, 297–333. Williams, G. C., McGregor, H. A., Sharp, D., Levesque, C., Kouides, R. W., Ryan, R. M., & Deci, E. L. (2006). Testing a self-determination theory intervention for motivating tobacco cessation: Supporting autonomy and competence in a clinical trial. Health Psychology, 25, 91–101. Williams, G. C., McGregor, H. A., Zeldman, A., Freedman, Z. R., & Deci, E. L. (2004). Testing a self-determination theory process model for promoting glycemic control through diabetes self-management. Health Psychology, 23, 58–66.
Received February 2, 2005 Revision received March 13, 2006 Accepted March 14, 2006
Journal of Personality and Social Psychology 2006, Vol. 91, No. 2, 342–350
Copyright 2006 by the American Psychological Association 0022-3514/06/$12.00 DOI: 10.1037/0022-3514.91.2.342
Expect the Unexpected: Ability, Attitude, and Responsiveness to Hypnosis

Grant Benham, University of Texas—Pan American
Erik Z. Woody, University of Waterloo
K. Shannon Wilson and Michael R. Nash, University of Tennessee

Participants’ expectancies and hypnotic performance throughout the course of a standardized, individually administered hypnotic protocol were analyzed with a structural equation model that integrated underlying ability, expectancy, and hypnotic response. The model examined expectancies and ability as simultaneous predictors of hypnotic responses as well as hypnotic responses as an influence on subsequent expectancies. Results of the proposed model, which fit very well, supported each of the 4 major hypothesized effects: Expectancies showed significant stability across the course of the hypnosis protocol; expectancies influenced subsequent hypnotic responses, controlling for latent ability; hypnotic responses, in turn, affected subsequent expectancies; and a latent trait underlay hypnotic responses, controlling for expectancies. Although expectancies had a significant effect on hypnotic responsiveness, there was an abundance of variance in hypnotic performance unexplained by the direct or indirect influence of expectation and compatible with the presence of an underlying cognitive ability.

Keywords: hypnosis, hypnotic suggestibility, structural equation modeling, expectancy theory
As defined by the American Psychological Association, Division 30 (Green, Barabasz, Barrett, & Montgomery, 2005), and others (Kihlstrom, 2003; Killeen & Nash, 2003), a hypnotic procedure occurs when “one person (the subject) is guided by another person (the hypnotist) to respond to suggestions for changes in subjective experience, alterations in perception, sensation, emotion, thought or behavior” (Green et al., 2005, p. 262). Logistically, there are three components to a hypnotic procedure. First, the subject is told that he or she is going to be administered “some suggestions for imaginative experiences.” Second, the subject is administered the induction, which is no more or less than “an extended initial suggestion” (Green et al., 2005, p. 262). The nature and specific wording of this initial suggestion do not seem crucial; for example, most standard hypnotic inductions include suggestions to relax, but some do not. However, all current standard protocols include at least one explicit introductory suggestion—for example, eye closure in the Stanford Scale of Hypnotic Susceptibility, Form C (SHSS:C; Weitzenhoffer & Hilgard, 1962). Finally, after this introductory suggestion that constitutes the induction, a series of test suggestions (typically 12) are administered to determine the extent to which the subject behaviorally responds to the hypnosis procedure.1

It turns out that people differ markedly in the extent to which they respond to hypnosis. These individual differences are both very consistent, as seen in the strong positive correlations in responsiveness to the different hypnotic suggestions in a protocol, and very stable, as seen in the strong positive correlations in test performance across time (e.g., Piccione, Hilgard, & Zimbardo, 1989). Explaining the genesis of these striking individual differences in hypnotic responding is arguably the first and most fundamental challenge to any scientific explanation of hypnosis. At issue here is explaining an aspect of human performance, in this case hypnotic performance. Not surprisingly, then, aptitude and attitude (and how they are configured causally) are central to explanatory models of hypnotic performance, just as they are to models of intellectual, athletic, and artistic performance. Indeed, generally speaking there are those theories of hypnosis that afford the central causal role to aptitude (i.e., ability) and those theories that grant center stage to attitude (e.g., expectation, motivation). Below, we broadly sketch the contours of these two types of theory, we propose a model that integrates the insights of both of them, and we describe a study that tested this model.
Grant Benham, Department of Psychology and Department of Anthropology, University of Texas—Pan American; Erik Z. Woody, Department of Psychology, University of Waterloo, Waterloo, Ontario, Canada; K. Shannon Wilson and Michael R. Nash, Department of Psychology, University of Tennessee. This research was supported by an Operating Grant RGPGP 283352 from the Natural Sciences and Engineering Research Council of Canada to Erik Z. Woody. Correspondence concerning this article should be addressed to Michael R. Nash, Department of Psychology, University of Tennessee, 307 Austin Peay Building, Knoxville, TN 37996-0900. E-mail:
[email protected]
1 The APA definition describes the same sequence of events for self-hypnosis: “Persons can learn self-hypnosis, which is the act of administering hypnotic procedures on one’s own” (Green et al., 2005, p. 262). In addition, a hypnotic procedure can be administered via audiotape (Shor & Orne, 1962), interactive computer software (Grant & Nash, 1995), and videotape (American Sign Language; Repka & Nash, 1995).
Aptitude-Centered Theories of Hypnotic Responsiveness

Aptitude-centered theories posit that the highly consistent individual differences in hypnotic performance reflect the direct and substantial operation of a latent cognitive ability. Much as there are aptitudes that explain substantial variability in athletic, artistic, and intellectual performance, so there is a putative aptitude that substantially explains variability in hypnotic performance. These theorists have generally characterized this cognitive ability as a capacity to alter the experience of agency such that there are transient disconnects (dissociations) between intent–action, implicit–explicit memory, and implicit–explicit perception (Kihlstrom, 1998; Woody & Farvolden, 1998). They further have speculated that, whatever the nature of this capacity, it may be rooted in genetic (Morgan, 1973; Raz, 2005), cognitive (Tellegen & Atkinson, 1974), and neural substrates (Graffin, Ray, & Lundy, 1995; Horton, Crawford, Harrington, & Downs, 2004). However, the underlying nature of such an ability remains somewhat elusive. Aptitude-centered theorists acknowledge the additional influence of attitude on hypnotic performance, according it a role in determining the extent of hypnotic response (e.g., Shor, 1971). For instance, when people who have had no personal experience with hypnosis estimate how responsive they will be, this expectation usually correlates with actual hypnotic performance in the r = .25–.35 range (Barber & Calverley, 1969; Derman & London, 1965; Shor, 1971; Shor, Pistole, Easton, & Kihlstrom, 1984); similarly, when people estimate how hypnotically responsive other people in general are, this too correlates with their own actual hypnotic performance (r = .22; Shor, 1971). 
However, aptitude-centered theorists have pointed out that the relatively modest contribution of attitude leaves plenty of room for the operation of ability (Kihlstrom, 2003; see also Katsanis, Barnard, & Spanos, 1988; Spanos, Brett, Menary, & Cross, 1987). Further, people have the capacity to reflect on their own performance and revise expectations according to their own response history (Bandura, 1977, 1997; Mischel, Cantor, & Feldman, 1996; Olson, Roese, & Zanna, 1996; Wilson, Lisle, Kraft, & Wetzel, 1989). Aptitude-centered theorists expect a person’s prior performance on hypnotic tasks to inform his or her estimates (i.e., expectancies) of success on future hypnotic tasks; however, the primary determinant of performance on those future hypnotic tasks would be aptitude (a latent trait) acting directly on performance itself, not acting on performance through expectancies or attitudes per se.
Attitude-Centered Theories of Hypnotic Responsiveness

In contrast, attitude-centered theorists view hypnotic responsiveness as based primarily (and perhaps even exclusively) on the direct operation of social learning, or social–cognitive, variables (expectations, motivation, attitude, and role enactment; Spanos, 1991). For instance, Kirsch (1991) has argued that response to hypnotic suggestions proceeds directly from the individual’s response expectancies about hypnosis:

The effectiveness of a hypnotic induction appears to depend entirely on people’s beliefs about its effectiveness . . . . In other words, response expectancy may be the sole determinant of the situations in which hypnotic responses occur, and also of the nature of the responses that occur in those situations. (pp. 460–461)
For such theorists, people’s beliefs about their own hypnotic responsiveness exert a profound effect on the extent of hypnotic performance and may in fact constitute “the ‘essence’ of hypnosis” (Kirsch, 1991, p. 461). If aptitude-centered theorists have failed to empirically identify the contours of the purported cognitive ability underlying hypnotizability, attitude theorists have not established whether experimentally instilling high expectations of success leads to the predicted enhanced hypnotic responsiveness. In two studies, Kirsch and his colleagues (Kirsch, Wickless, & Moffitt, 1999; Wickless & Kirsch, 1989) found that participants who were administered one type of enhanced-expectation manipulation tended to be more responsive to hypnosis than control participants. Critically, in neither study were participants’ expectations actually measured (i.e., there was no manipulation check to confirm that expectations were in fact different between experimental and control groups). When this design flaw was corrected in a new study in a separate laboratory (Benham, Bowers, Nash, & Muenchen, 1998), a measurable increase in expectation due to manipulation was secured, but this did not lead to higher hypnotic responsiveness scores. Across each of two independent samples, hypnotic responsiveness of participants receiving the enhanced-expectations manipulation was no different than the hypnotic responsiveness of control participants who did not receive the manipulation. Attitude-centered theorists explain the stability of hypnotic responsiveness not as a reflection of enduring individual differences in ability but as a function of expectancies that have been stabilized by repeated testing. 
With each test of hypnosis (presumably each administration of a suggestion), “The subjects reach a conclusion about the degree to which they are hypnotized, and this conclusion elicits altered and more confidently held expectations about their responses to subsequent suggestions” (Kirsch & Council, 1992, p. 287). These altered expectations then directly influence the extent of response to the subsequent suggestions, which in turn leads to further consolidation of expectancies and hence stability of responsiveness. Indeed, Council, Kirsch, and Hafner (1986) queried subjects twice about their expectations: before they were administered the initial suggestions of the hypnotic procedure (i.e., prior to the induction) and after having been administered the initial suggestions (i.e., after the induction). Once subjects had a chance to observe their own response to the initial suggestions, their expectations correlated .55 with final performance, up from a correlation of .21 between initial expectations and final performance. The effect of altered expectation on responsiveness is posited to be directly causal, of substantial magnitude, and not merely due to observation of one’s own prior performance (Braffman & Kirsch, 1999; Council et al., 1986). Attitude-centered theorists tend to be open to the possibility that there could be an aptitude component to hypnotic responsiveness. However, these theorists have expressed concern that any role of aptitude is readily confounded with the strong effects of expectancy: “Once expectancy effects are eliminated, there may be nothing left” (Kirsch, 1991, p. 461; see also Braffman & Kirsch, 1999). Further addressing whether ability variables are important, Kirsch states “The question that needs to be asked about these variables is whether they can be shown to produce effects that are independent of subjects’ expectations . . . or whether their effects are entirely mediated by response expectancy” (Kirsch, 1991, p. 462).
A Structural Model Integrating Ability, Attitude, and Hypnotic Response

As indicated in the foregoing review, each of the two major types of hypnosis theory raises important questions about the main evidence for the alternative theory. To summarize, according to aptitude-centered theorists, the fact that expectancies predict responsiveness does not necessarily imply that they have a causal role. This is because expectancies may be based on accurate self-observation of ability: Through their experience of hypnosis, subjects may assess their level of ability and hence correctly predict their likelihood of responding to future suggestions. Although it is certainly possible that positive expectancies have some additional feed-forward effect, aptitude-centered theorists imply that this additional effect can be assessed accurately only if underlying ability is controlled for. Likewise, according to attitude-centered theorists, the fact that hypnotic responsiveness tends to be highly consistent and stable does not necessarily imply an underlying aptitude. This is because expectancies are another major potential source of response consistency. Although it is certainly possible that an underlying aptitude may have some additional effect (beyond expectancy), attitude-centered theorists imply that this additional effect can be assessed accurately only if expectancies are controlled for. These two positions are actually complementary, and their integration is straightforward. We need a model in which both expectancy and ability serve as simultaneous predictors of hypnotic responsiveness. In this way, we can assess the effect of expectancy controlling for ability, as the aptitude-centered theorists require, and we can assess the effect of ability controlling for expectancy, as the attitude-centered theorists require. The two types of theorist appear to concur on the requirement for one additional feature of an appropriate model: Hypnotic
performance, rather than simply being an outcome to be predicted, should also serve, in turn, as a cause. According to aptitude-centered theorists, an important determinant of subjects’ expectancies might be their preceding performance because it provides them with information about their level of underlying ability. Similarly, attitude-centered theorists recognize the possibility that expectancies derive, at least in part, from subjects’ preceding experience of hypnosis and hence may be fairly labile early in their experience of hypnosis (although eventually expectancies are theorized to stabilize with repeated testing). Thus, we need a model in which subjects’ ongoing response during the hypnosis session is allowed to influence (or update) their expectancy for future responsiveness.

Evaluating such a model integrating ability, expectancy, and hypnotic response requires measuring and tracking expectations and performance throughout a hypnotic protocol, preferably with participants who have never before experienced hypnosis. In this article, we report such an analysis on the basis of data collected during the course of a broad-based programmatic effort that nonetheless shares a common structure and protocol. In all cases, hypnosis-naive participants were administered the SHSS:C, which consists of a hypnotic induction followed by a sequence of 12 suggestions. Further, at six points throughout the administration of the SHSS:C, the experimenter probed for participants’ expectations regarding success with future hypnosis.

All the features of our proposed integrative model are readily encapsulated in a structural equation model, as depicted in Figure 1. Expect 1 through Expect 6 represent expectancies measured at six time points over the course of the SHSS:C, and SHSS 1–3 through SHSS 10–12 represent responsiveness to five subsets (or parcels) of hypnotic suggestions administered between successive expectancies. Except for Expect 1, which is the expectancy measured after the induction but before any of the test suggestions,² all expectancies have two determinants: the previous expectancy, the effect of which is represented by the paths labeled a, and the participant’s immediately preceding hypnotic performance, the effect of which is represented by the paths labeled c. This is a lag-one structure, the essence of which is that participants continually update their expectancies on the basis of their recent experience. We believe such a structure faithfully represents Kirsch’s (1991) arguments that “unlike personality traits, expectancies can be quite labile” (p. 457) and that they can be affected by the experience of hypnosis. Note that the relative sizes of paths a and c represent a range of possibilities: The larger the a paths relative to the c paths, the more stable expectancies would be; in contrast, the larger the c paths relative to the a paths, the more strongly participants’ expectancies would be updated by their ongoing response to hypnosis. The disturbance or residual variables E1 through E6 represent all other sources of variance in expectancies.

Figure 1. Model of response to hypnotic suggestions as a function of expectancies and hypnotic susceptibility. Expect 1 through Expect 6 represent expectancies measured at six time points over the course of the Stanford Scale of Hypnotic Susceptibility (SHSS), Form C (Weitzenhoffer & Hilgard, 1962); SHSS 1–3 through SHSS 10–12 represent responsiveness to five subsets (or parcels) of hypnotic suggestions administered between successive expectancies; the disturbance or residual variables E1 through E6 represent all other sources of variance in expectancies; the disturbance or residual variables E7 through E11 represent all other sources of variance in hypnotic performance besides those specified by the model; Hypnotizability represents the aptitude-centered hypothesis that there is a stable ability underlying hypnotic responsiveness.

Response to hypnotic suggestions, broken into five sets of items over the course of the SHSS:C, likewise has two determinants: the immediately preceding expectancy, the effect of which is represented by the paths labeled b, and an unchanging latent trait (or general factor), the effect of which is represented by the paths labeled d. The general factor, labeled Hypnotizability in the structural diagram, represents the aptitude-centered hypothesis that there is a stable ability underlying hypnotic responsiveness. It is defined in the model as a unitary latent variable underlying the residual covariation, controlling for expectancies, among the responses to subsets (or parcels) of hypnotic suggestions.
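The path structure described in this section can be illustrated with a small simulation. The following is a minimal sketch, not the authors' analysis: it assumes linear effects, a single value for each of the a, b, c, and d paths (the model itself lets these vary by time point unless constrained), standardized-style noise, and hypothetical coefficient values chosen only for illustration. The function name `simulate_participant` and its parameters are inventions for this example.

```python
import random


def simulate_participant(a, b, c, d, rng, n_parcels=5, noise_sd=1.0):
    """Simulate one participant under a toy version of the lag-one model.

    Assumed (illustrative) equations:
        SHSS_t       = b * Expect_t + d * hypnotizability + residual
        Expect_{t+1} = a * Expect_t + c * SHSS_t          + residual
    Returns (expectancies, responses): 6 expectancies and 5 parcel responses.
    """
    hypnotizability = rng.gauss(0.0, 1.0)  # latent trait, constant within a person
    expect = rng.gauss(0.0, 1.0)           # Expect 1 (postinduction expectancy)
    expectancies = [expect]
    responses = []
    for _ in range(n_parcels):
        # Response depends on the immediately preceding expectancy and the trait.
        shss = b * expect + d * hypnotizability + rng.gauss(0.0, noise_sd)
        responses.append(shss)
        # Expectancy is updated from its previous value and the latest response.
        expect = a * expect + c * shss + rng.gauss(0.0, noise_sd)
        expectancies.append(expect)
    return expectancies, responses


if __name__ == "__main__":
    rng = random.Random(2006)
    # Hypothetical coefficients: stable expectancies (large a), weak updating (small c).
    exp_vals, shss_vals = simulate_participant(a=0.8, b=0.1, c=0.1, d=0.5, rng=rng)
    print(len(exp_vals), len(shss_vals))
```

Raising a relative to c in this sketch makes simulated expectancies more stable, and raising d relative to b makes simulated responses depend more on the latent trait, which mirrors the range of possibilities the text describes.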
As before, the relative sizes of paths b and d represent a range of possibilities: The larger the b paths relative to the d paths, the more strongly expectancies would determine hypnotic responsiveness; in contrast, the larger the d paths relative to the b paths, the more participants’ responses would be due to a nonexpectancy-related latent trait. Of course, it is also possible that the contributions of the two determinants could be approximately equal. The disturbance or residual variables E7 through E11 represent all other sources of variance in hypnotic performance besides those specified by the model.

To summarize, our structural equation model integrates four major questions:

1. How stable are expectancies across the course of the SHSS:C? (coefficients a1 through a5)
2. Do expectancies influence subsequent hypnotic responses, even when the effects of a latent ability are taken into account? (coefficients b1 through b5)
3. Do hypnotic responses, in turn, influence subsequent expectancies? (coefficients c1 through c5)
4. Does a latent trait underlie hypnotic responses, even when the effects of expectancies are taken into account? (coefficients d1 through d5)

Finally, it is important to note that we can evaluate a variety of other possible structural equation models that serve as plausible alternatives to our hypothesized model. For example, we can look at whether the relationships among expectancies are fit more convincingly by a latent-factor model than by a lag-one model and whether a lag-one model fits the relationships among successive hypnotic performances more convincingly than a latent-factor model. If we can successfully reject such alternatives, the case for the present model is strengthened.

Method

Participants

Participants were 90 undergraduate psychology students (69 females, 21 males) recruited with an offer of extra credit for participation in the initial phase of an ongoing research effort (Wilson, 2001). Only individuals reporting no previous hypnotic experience were recruited.

Measures and Procedure

All participants were administered the SHSS:C, as per Wickless and Kirsch (1989) and Benham et al. (1998). This standardized scale is widely recognized as the best available measure of hypnotic responsiveness. The experimenter greeted the participant and escorted him or her to a comfortable, pleasantly lit, and quiet room. The experimenter explained the nature of the study and obtained the participant’s informed consent. The participant was seated in a comfortable recliner chair with the experimenter seated slightly behind and to the left. Before commencing the SHSS:C, the experimenter introduced the idea of an expectancy probe by asking the question “If at some future time we were to give you 20 suggestions, at that time (knowing what you know now) how many of those 20 suggestions do you think you would respond to?” Half of the participants then received the induction portion of the SHSS:C unaltered. The other half received an extended induction (of 4 extra minutes) taken in detail from Wickless and Kirsch (1989) and previously used in our lab (Benham et al., 1998).

Periodically throughout the administration of the SHSS:C, participants were asked to estimate how much they expected to respond to hypnosis in the future. The expectancy measures were taken at six times during the course of the SHSS:C administration: (1) immediately following the hypnotic induction (i.e., immediately prior to administration of the 1st SHSS:C item); (2) immediately following the 3rd SHSS:C item; (3) immediately following the 5th SHSS:C item; (4) immediately following the 7th SHSS:C item; (5) immediately following the 9th SHSS:C item; and (6) immediately following the 12th SHSS:C item (following termination of hypnosis). The expectancy probe was the following question: “If we were to give you 20 suggestions at some future time, how many of those do you now think you would respond to?” For each expectation probe, a response ranging from 0 to 20 was recorded.
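The probe schedule and the item parcels used in the model can be made concrete with a short sketch. It assumes that each of the 12 SHSS:C items is recorded as pass (1) or fail (0), which the text does not state explicitly, and the names `PARCELS`, `PROBE_AFTER_ITEM`, and `parcel_scores` are hypothetical helpers introduced only for this illustration.

```python
# Item ranges (1-based, inclusive) for the five parcels named in Figure 1.
PARCELS = {
    "SHSS 1-3": (1, 3),
    "SHSS 4-5": (4, 5),
    "SHSS 6-7": (6, 7),
    "SHSS 8-9": (8, 9),
    "SHSS 10-12": (10, 12),
}

# Expectancy probes follow the induction (coded 0) and items 3, 5, 7, 9, and 12.
PROBE_AFTER_ITEM = [0, 3, 5, 7, 9, 12]


def parcel_scores(item_passes):
    """Sum assumed pass/fail item scores (0 or 1) within each parcel."""
    if len(item_passes) != 12:
        raise ValueError("expected scores for exactly 12 SHSS:C items")
    return {
        name: sum(item_passes[first - 1:last])
        for name, (first, last) in PARCELS.items()
    }


if __name__ == "__main__":
    # A made-up response pattern, for illustration only.
    example = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1]
    print(parcel_scores(example))
```

Each probe in `PROBE_AFTER_ITEM` sits between two successive parcels, which is what lets every parcel be predicted from the expectancy measured just before it.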
Results

Induction Type

Compared with the unaltered SHSS:C induction, the extended induction had negligible effects in this study. In particular, the mean expectancy scores for the two conditions were not significantly different at any of the six times they were given during administration of the SHSS:C, nor were expectancies different when averaged over all six times: for the unaltered induction, M = 14.98; and for the extended induction, M = 14.74, t(88) = 0.251, ns. Likewise, type of induction had no significant effect on overall SHSS:C scores: for the unaltered induction, M = 6.15; and for the extended induction, M = 6.43, F(1, 86) = 0.243, ns. The obvious implication is that we may collapse across type of induction.

Nonetheless, as an additional check, we ascertained whether the network of associations, as represented in Figure 1, was basically the same regardless of whether the extended induction was used. This is a straightforward hypothesis to test with structural equation modeling: We analyzed the two groups simultaneously and compared two models. In one model, all the paths were allowed to be unequal across the two groups; in the other model, all pairs of respective paths were set to be equal across the two groups (a1 in the extended group equal to a1 in the standard group, etc.). If the latter model fits significantly worse than the former one, we have evidence to reject the hypothesis that the phenomena are the same in the two groups. This structural equation model, as well as all others we report, was evaluated with Amos 4.0 (Arbuckle & Wothke, 1999). In the present case, constraining all paths to be equal across groups led to no loss of fit, Δχ²(19, N = 90) = 20.69, ns. Furthermore, the model constraining the two induction groups to the same solution fit extremely well, χ²(89, N = 90) = 86.18, ns (comparative fit index [CFI] = 1.00, root-mean-square error of approximation [RMSEA] = .00).³ Hence, we felt justified in moving ahead with the combined sample. Indeed, the foregoing analysis indicates that rather than being dependent on any particular induction protocol, the results we report below generalize across two rather different induction procedures. (The correlation matrix, means, and standard deviations of the variables for the combined sample appear in the Appendix.)

² Although preinduction expectancies were also measured in the experimental procedure, we excluded them from our model in accordance with Kirsch’s (1991) position that they are of questionable relevance. Kirsch noted that the failure to find strong relationships of expectancies to hypnotic performance in some previous research is due to the fact that the expectancies were measured prior to the induction rather than after it. Because the experience of an induction changes expectancies, Kirsch argued that it is postinduction expectancies that are the “major determinants of hypnotic responding” (Council, Kirsch, & Hafner, 1986, p. 188).
Equality Constraints

There are other equality constraints that are well worth examining. In particular, there is no a priori reason to believe that the effect of expectancy on performance would change across the administration of the scale; thus, a reasonable hypothesis is that all b paths are equal. We can test this hypothesis by comparing a model in which the b paths are allowed to be unequal with one in which they are all constrained to be equal (b1 = b2 = . . . = b5). Constraining these paths to be equal led to no significant loss of fit, Δχ²(4, N = 90) = 1.97, ns. Likewise, there is no a priori reason to believe that the effect of performance on expectancy would change across the administration of the scale; indeed, setting all c paths to be equal led to no significant loss of fit, Δχ²(4, N = 90) = 6.87, ns. Thus, in the solution we present below, these empirically verified equality constraints have been set. They are highly advantageous because they give us much better power in estimating the respective effects.

We also examined whether equality constraints could be imposed on the remaining sets of paths in the model. That is, could the a paths be set equal and the d paths be set equal, in addition to the b and c paths? However, these additional constraints resulted in significant lack of fit, Δχ²(8, N = 90) = 16.53, p < .05, and were therefore rejected.
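The nested-model comparisons in this section rest on referring a chi-square difference to a chi-square distribution whose degrees of freedom equal the number of constraints added. The sketch below is one way to compute the upper-tail probability with only the Python standard library, using the closed-form recurrence for integer degrees of freedom. The function name `chi2_sf` is hypothetical, and nothing here is taken from Amos.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df >= 1.

    Uses Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1) with y = x / 2,
    starting from Q(1, y) = exp(-y) for even df or Q(1/2, y) = erfc(sqrt(y)) for odd df.
    """
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        # Add the next term of the recurrence, computed on the log scale.
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q


if __name__ == "__main__":
    # Chi-square differences and constraint counts as reported in the text.
    for label, diff, df in [
        ("b paths equal", 1.97, 4),
        ("c paths equal", 6.87, 4),
        ("a and d paths also equal", 16.53, 8),
    ]:
        p = chi2_sf(diff, df)
        print(f"{label}: p {'<' if p < 0.05 else '>='} .05")
```

Under this calculation the first two constraints are retained and the third is rejected at the .05 level, in line with the decisions reported above.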
Estimated Parameters

Figure 2 presents the estimated parameters for the model, each of which is statistically significant at the p < .05 level. (The parameters are provided in standardized form. The equality constraints apply to the unstandardized coefficients; coefficients set equal become very slightly different from one another once they are standardized because the variables have slightly different standard deviations.) This model fit the data very well, χ²(43, N = 90) = 48.93, ns (CFI = .99, RMSEA = .04, pclose = .60).⁴ Examining the path coefficients in this model, one can see that expectancies appear to have been highly stable; only the very last expectancy was somewhat less strongly predicted by the previous one. All these expectancies were rather weakly (but significantly) updated according to the previous hypnotic response. Likewise, although both expectancies and the latent trait appear consistently to have had significant effects on hypnotic responses, the effect of the latent trait was always considerably larger.
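The RMSEA value reported for this model can be approximately recovered from the chi-square statistic, its degrees of freedom, and the sample size. The sketch below uses the common single-sample point estimate, sqrt(max(χ² − df, 0) / (df (N − 1))). Whether the software applies exactly this formula, particularly for multigroup models, is an assumption here, and the function name `rmsea` is hypothetical.

```python
import math


def rmsea(chi_square, df, n):
    """Single-sample RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


if __name__ == "__main__":
    # Chi-square, df, and N as reported for the model in Figure 2.
    print(round(rmsea(48.93, 43, 90), 2))
```

A chi-square at or below its degrees of freedom yields an estimate of zero, which is why a very well-fitting model can show an RMSEA of .00.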
Evaluation of Alternative Models

As mentioned in the introduction, we examined some plausible alternative models. One interesting alternative was to model the relationships among expectancies not with a lag-one structure but instead with a latent factor. Such a model involved deleting all the paths labeled a in Figure 1 and adding an Expectancy general factor with paths pointing from it to each of the six specific expectancy measures. This alternative model clearly fit poorly, χ²(42, N = 90) = 120.59, p < .001 (CFI = .87, RMSEA = .14, pclose < .001). In addition, allowing the two factors Hypnotizability and Expectancy to be correlated did not improve this alternative model significantly, Δχ²(1, N = 90) = 3.26, ns. Thus, the relationships among expectancies were clearly fit better by a lag-one model than by a latent-factor model.

We also tested an alternative model in which the latent trait of Hypnotizability was omitted (there was no latent factor and none of the paths labeled d). Unsurprisingly (given the results shown in Figure 2), this model showed significant lack of fit, χ²(48, N = 90) = 102.60, p < .001 (CFI = .91, RMSEA = .11, pclose < .01). A more interesting alternative model is one in which the latent factor is again omitted, but a lag-one structure is added among the hypnotic responses: a path from SHSS 1–3 to SHSS 4–5, another path from SHSS 4–5 to SHSS 6–7, and so on. This model also tended to show significant lack of fit, χ²(44, N = 90) = 71.92, p < .01 (CFI = .95, RMSEA = .08, pclose = .06). In addition, we used the expected cross-validation index (ECVI; Browne & Cudeck, 1993) to compare this alternative lag-one model of hypnotic responses with the latent-trait model shown in Figure 2. The alternative model yielded a larger ECVI value (1.38) than the ECVI obtained with the model in Figure 2 (1.02), supporting the superiority of the latent-factor model.⁵

In summary, these tests of alternative models lend support to our proposed structural model (shown in Figures 1 and 2). Specifically, the relationships among expectancies are fit better by a lag-one model than by a latent-factor model, whereas the relationships among successive hypnotic performances are fit better by a latent-factor model than by a lag-one model.

Figure 2. Standardized coefficients for the structural equation model. Expect 1 through Expect 6 represent expectancies measured at six time points over the course of the Stanford Scale of Hypnotic Susceptibility (SHSS), Form C (Weitzenhoffer & Hilgard, 1962); SHSS 1–3 through SHSS 10–12 represent responsiveness to five subsets (or parcels) of hypnotic suggestions administered between successive expectancies; the disturbance or residual variables E1 through E6 represent all other sources of variance in expectancies; the disturbance or residual variables E7 through E11 represent all other sources of variance in hypnotic performance besides those specified by the model; Hypnotizability represents the aptitude-centered hypothesis that there is a stable ability underlying hypnotic responsiveness. All path coefficients are significant, p < .05.

³ Widely acknowledged criteria for excellent model fit are a nonsignificant chi-square test, CFI greater than .95, and RMSEA less than .05 (Arbuckle & Wothke, 1999; Hu & Bentler, 1999).

⁴ The parameter pclose evaluates the hypothesis that the fit of the model to the data is close enough that any lack of fit is attributable to sampling error (Browne & Cudeck, 1993). Hence, a relatively large probability (e.g., .50) suggests that any lack of fit might plausibly be attributed to chance, whereas a relatively small probability (e.g., .05 or smaller) indicates that the obtained lack of fit is not likely due to chance.
Discussion

An Integrative Perspective

Our results provide clear evidence in support of each of the four major effects integrated into our proposed structural equation model: (a) Expectancies showed significant stability across the course of the hypnosis protocol; (b) expectancies influenced subsequent hypnotic responses, even controlling for latent ability; (c) hypnotic responses, in turn, affected subsequent expectancies; and (d) a latent trait underlay hypnotic responses, even controlling for expectancies. As such, the results indicate a fairly complex but quite plausible web of effects interconnecting these important variables.

However, the data of the present study also suggest that the four types of linkage are of different strengths. Expectancies were a very strong predictor of later expectancies, whereas hypnotic responses were a fairly weak predictor. Thus, expectancies were less labile than one might have anticipated (Kirsch, 1991). Likewise, the underlying latent trait was a strong predictor of hypnotic responses (with coefficients of .39–.68), whereas expectancy was a more modest predictor (with coefficients around .12). In the present model, the latent trait was essentially defined by the internal consistency among parcels of the performance measures, controlling for the influences of expectancies. The strong role of
this latent variable is consistent with the hypothesis that there is an important unitary ability underlying hypnotic performance. Because the latent variable is in a sense a default source of variation, it is possible that the results may overstate the role of ability somewhat; nonetheless, they clearly show that a large portion of the systematic variance in hypnotic performance operates independently of ongoing expectation. Hence, Kirsch’s (1991) speculation about hypnotic response that “Once expectancy effects are eliminated, there may be nothing left” (p. 461) seems too extreme. Although in the present study there were genuine effects of expectancies on hypnotic responsiveness, they were of relatively modest importance. The stability of expectancies dropped somewhat toward the end of the SHSS:C (from Expect 5 to Expect 6). Perhaps this suggests a somewhat state-dependent quality to expectancies, given that the last expectancy was taken after participants had been alerted from hypnosis. Because this last expectancy precedes no further responses, its significance is somewhat indeterminate. Otherwise, the coefficients obtained along the various stages of the administration of the SHSS:C paint a very consistent picture: Expectancies are neither irrelevant nor definitive. That is, whereas expectancies shift somewhat as hypnosis progresses, the ranking of individuals in responsiveness remains quite stable over time.
⁵ The smaller the ECVI, the better the obtained parameter estimates would generalize to new samples. Browne and Cudeck (1993) recommended, given alternative models, selecting the one with the smallest value.

Strengths and Weaknesses of the Present Study

A major strength of the present approach using structural equation modeling is that it integrates multiple hypothesized underlying mechanisms into one dynamic process unfolding across time. This approach avoids simply assuming that, on one hand, hypnotic response represents an ability-like maximum performance or that, on the other hand, it represents an attitude-like typical tendency consolidated through ongoing experience (cf. Dennis, Sternberg, & Beatty, 2000). Instead, our methodology allows different hypotheses, such as trait and expectancy theories of hypnotic response, to provide a context for each other. Such a multivariate approach seems to address theoretical disputes with much greater verisimilitude, showing explicitly, for example, how more than one mechanism may be simultaneously involved. Another strength is that the relative importance of these mechanisms as predictors of performance can be compared. Accordingly, we like to believe that the present research may serve as a model for future work on hypnotic responsiveness, in which performance over time is predicted by multiple hypothesized mechanisms.

One aspect of our structural equation model that might strike some aptitude-centered hypnosis theorists as a possible weakness is the portrayal of the traitlike aspects of hypnotic response with a single-factor model. For example, Hilgard (1965) advanced a three-factor model of responses to the SHSS:C. Nonetheless, Balthazard and Woody (1985) raised the possibility that one or two of these factors might be artifacts of the factoring procedures that Hilgard and others have used. More recently, Sadler and Woody (2004) applied a sophisticated item response theory-based factoring method to the Waterloo–Stanford Group C Scale (Bowers, 1993, 1998), which was very closely modeled after the SHSS:C. They found that the items of the Waterloo–Stanford Group C Scale closely approximated unidimensionality. Woody, Barnier, and McConkey (2005) applied the same method to an extensive item pool that included the SHSS:C.
Although they were able to isolate separable subcomponents of ability, there was a very strong second-order general factor. Given that in the present model hypnotic responsiveness was represented by parcels of items rather than by single items, such a general factor is likely the major common contributor to performance; indeed, there were no problems of model fit to suggest otherwise.

From the point of view of an advocate for the response-expectancy theory of hypnosis, the chief possible weakness of the present study might have to do with the measurement of expectancies. Although we believe our measurement of expectancies is reasonable and faithful to the theory (self-expectations of success), it is not the only way to measure them. One valuable strategy for future research would be to measure expectancies at each time point in two or three different ways and use these multiple measures in the structural equation model to correct for imperfect reliability. However, although it is possible that another measurement approach could yield stronger effects for expectancies, the effects would need considerable improvement to approach the magnitude of the role of the latent trait.

A final point about our model is that although expectation was measured directly, ability was measured only indirectly. It would be preferable, if possible, to index ability with pure indicators that are arguably unaffected by expectancies and any other nonability influences; however, there is not yet any consensus about what such indicators would be. For example, Balthazard and Woody (1992) proposed that the Tellegen Absorption Scale (Tellegen & Atkinson, 1974) might possibly serve as such a pure indicator of underlying ability, but it has been argued that even the relationship between absorption and hypnotizability may reflect shared expectancies (Kirsch & Council, 1992).
Conclusions

Citing Isaiah Berlin’s essay “The Hedgehog and the Fox,” Kihlstrom (2003) commented on the tension between monism and pluralism in hypnosis theory:

There never was any necessary incompatibility between cognitive and social psychological approaches to hypnosis, and it is a mistake to believe there was or to act as if there was. There is plenty of hypnosis to go around for everyone, and everyone can make a positive contribution so long as nobody makes a claim that his or her theory is universal and sufficient. (p. 177)
We view the results of our structural equation modeling as fully congruent with this sentiment. On one hand, decades of empirical work have documented that the individual’s attitude, expectations, and motivation have an impact on hypnotic response. Indeed, in the present study there were tangible effects of expectancies on hypnotic responsiveness. On the other hand, the strength of the linkages we observed between expectation and performance did not support the more extreme position that expectations are the “sole determinants” or the “essence” of hypnosis (Kirsch, 1991, pp. 460–461). To the contrary, our analysis demonstrated that there is an abundance of residual variability in hypnotic performance that is unexplained by the direct or indirect influence of expectation. Further, our results are compatible with the notion that there is a latent trait or cognitive ability that underlies hypnotic response and that, along with prior attitudes, strongly codetermines the nature and extent of an individual’s response to hypnotic suggestion.
References

Arbuckle, J. L., & Wothke, W. (1999). Amos 4.0 user’s guide. Chicago: SPSS/SmallWaters.
Balthazard, C. G., & Woody, E. Z. (1985). The “stuff” of hypnotic performance: A review of psychometric approaches. Psychological Bulletin, 98, 283–296.
Balthazard, C. G., & Woody, E. Z. (1992). The spectral analysis of hypnotic performance with respect to “absorption.” International Journal of Clinical and Experimental Hypnosis, 40, 21–43.
Bandura, A. (1977). Social learning theory. Oxford, England: Prentice Hall.
Bandura, A. (1997). Self-efficacy: The exercise of control. New York: Freeman/Times Books/Henry Holt.
Barber, T. X., & Calverley, D. S. (1969). Multidimensional analysis of “hypnotic” behavior. Journal of Abnormal Psychology, 74, 209–220.
Benham, G., Bowers, S., Nash, M., & Muenchen, R. (1998). Self-fulfilling prophecy and hypnotic response are not the same thing. Journal of Personality and Social Psychology, 75, 1604–1613.
Bowers, K. S. (1993). The Waterloo–Stanford Group C (WSGC) Scale of Hypnotic Susceptibility: Normative and comparative data. International Journal of Clinical and Experimental Hypnosis, 41, 35–46.
Bowers, K. S. (1998). Waterloo–Stanford Group Scale of Hypnotic Susceptibility, Form C: Manual and response booklet. International Journal of Clinical and Experimental Hypnosis, 46, 250–268.
Braffman, W., & Kirsch, I. (1999). Imaginative suggestibility and hypnotizability: An empirical analysis. Journal of Personality and Social Psychology, 77, 578–587.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136–162). Newbury Park, CA: Sage.
Council, J. R., Kirsch, I., & Hafner, L. P. (1986). Expectancy versus absorption in the prediction of hypnotic responding. Journal of Personality and Social Psychology, 50, 182–189.
Dennis, M. J., Sternberg, R. J., & Beatty, P. (2000). The construction of “user-friendly” tests of cognitive functioning: A synthesis of maximal- and typical-performance measurement philosophies. Intelligence, 28, 193–211.
Derman, D., & London, P. (1965). Correlates of hypnotic susceptibility. Journal of Consulting Psychology, 29, 537–545.
Graffin, N. F., Ray, W. J., & Lundy, R. (1995). EEG concomitants of hypnosis and hypnotic susceptibility. Journal of Abnormal Psychology, 104, 123–131.
Grant, C. D., & Nash, M. R. (1995). The Computer-Assisted Hypnosis Scale: Standardization and norming of a computer-administered measure of hypnotic ability. Psychological Assessment, 7, 49–58.
Green, J. P., Barabasz, A. A., Barrett, D., & Montgomery, G. H. (2005). Forging ahead: The 2003 APA Division 30 definition of hypnosis. International Journal of Clinical and Experimental Hypnosis, 53, 259–264.
Hilgard, E. R. (1965). Hypnotic susceptibility. New York: Harcourt, Brace & World.
Horton, J. E., Crawford, H. J., Harrington, G., & Downs, J. H. (2004). Increased anterior corpus callosum size associated positively with hypnotizability and the ability to control pain. Brain, 127, 1741–1747.
Hu, L.-T., & Bentler, P. M. (1999). Cut-off criteria for fit indices in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1–55.
Katsanis, J., Barnard, J., & Spanos, N. P. (1988). Self-predictions, interpretational set and imagery vividness as determinants of hypnotic responding. Imagination, Cognition and Personality, 8, 63–77.
Kihlstrom, J. F. (1998). Dissociations and dissociation theory in hypnosis: Comment on Kirsch and Lynn (1998). Psychological Bulletin, 123, 186–191.
Kihlstrom, J. F. (2003). The fox, the hedgehog, and hypnosis. International Journal of Clinical and Experimental Hypnosis, 51, 166–189.
Killeen, P. R., & Nash, M. R. (2003). The four causes of hypnosis. International Journal of Clinical and Experimental Hypnosis, 51, 195–231.
Kirsch, I. (1991). The social learning theory of hypnosis. In S. J. Lynn & J. W. Rhue (Eds.), Theories of hypnosis: Current models and perspectives (pp. 439–465). New York: Guilford Press.
Kirsch, I., & Council, J. R. (1992). Situational and personality correlates of hypnotic responsiveness. In E. Fromm & M. R. Nash (Eds.), Contemporary hypnosis research (pp. 267–291). New York: Guilford Press.
Kirsch, I., Wickless, C., & Moffitt, K. H. (1999). Expectancy and suggestibility: Are the effects of environmental enhancement due to detection? International Journal of Clinical and Experimental Hypnosis, 47, 40–45.
Mischel, W., Cantor, N., & Feldman, S. (1996). Principles of self-regulation: The nature of willpower and self-control. In A. W. Kruglanski & E. T. Higgins (Eds.), Social psychology: Handbook of basic principles (pp. 329–360). New York: Guilford Press.
Morgan, A. H. (1973). The heritability of hypnotic susceptibility in twins. Journal of Abnormal Psychology, 82, 55–61.
Olson, J. M., Roese, N. J., & Zanna, M. P. (1996). Expectancies. In A. W. Kruglanski & E. T. Higgins (Eds.), Social psychology: Handbook of basic principles (pp. 211–238). New York: Guilford Press.
Piccione, C., Hilgard, E. R., & Zimbardo, P. G. (1989). On the degree of stability of measured hypnotizability over a 25-year period. Journal of Personality and Social Psychology, 56, 289–295.
Raz, A. (2005). Attention and hypnosis: Neural substrates and genetic associations of two converging processes. International Journal of Clinical and Experimental Hypnosis, 53, 237–258.
Repka, R. J., & Nash, M. R. (1995). Hypnotic responsivity of the deaf: The development of the University of Tennessee Hypnotic Susceptibility Scale for the Deaf. International Journal of Clinical and Experimental Hypnosis, 43, 316–331.
Sadler, P., & Woody, E. (2004). Four decades of group hypnosis scales: What does Item Response Theory tell us about what we’ve been measuring? International Journal of Clinical and Experimental Hypnosis, 52, 132–158.
Shor, R. E. (1971). Expectancies of being influenced and hypnotic performance. International Journal of Clinical and Experimental Hypnosis, 19, 154–166.
Shor, R. E., & Orne, E. C. (1962). Harvard Group Scale of Hypnotic Susceptibility, Form A. Palo Alto, CA: Consulting Psychologist Press.
Shor, R. E., Pistole, D. D., Easton, R. D., & Kihlstrom, J. F. (1984). Relation of predicted to actual hypnotic responsiveness with special reference to posthypnotic amnesia. International Journal of Clinical and Experimental Hypnosis, 32, 376–387.
Spanos, N. P. (1991). A sociocognitive approach to hypnosis. In S. J. Lynn & J. W. Rhue (Eds.), Theories of hypnosis: Current models and perspectives (pp. 324–361). New York: Guilford Press.
Spanos, N. P., Brett, P. J., Menary, E. P., & Cross, W. P. (1987). A measure of attitudes toward hypnosis: Relationships with absorption and hypnotic susceptibility. American Journal of Clinical Hypnosis, 30, 139–150.
Tellegen, A., & Atkinson, G. (1974). Openness to absorbing and self-altering experiences (“absorption”), a trait related to hypnotic susceptibility. Journal of Abnormal Psychology, 83, 268–277.
Weitzenhoffer, A. M., & Hilgard, E. R. (1962). Stanford Hypnotic Susceptibility Scale, Form C. Palo Alto, CA: Consulting Psychologists Press.
Wickless, C., & Kirsch, I. (1989). Effects of verbal and experiential expectancy manipulations on hypnotic susceptibility. Journal of Personality and Social Psychology, 57, 762–768.
Wilson, K. S. (2001). Expectations and hypnotic response. Unpublished manuscript.
Wilson, T. D., Lisle, D. J., Kraft, D., & Wetzel, C. G. (1989). Preferences as expectation-driven inferences: Effects of affective expectations on affective experience. Journal of Personality and Social Psychology, 56, 519–530.
Woody, E. Z., Barnier, A. J., & McConkey, K. M. (2005). Multiple hypnotizabilities: Differentiating the building blocks of hypnotic response. Psychological Assessment, 17, 200–211.
Woody, E. Z., & Farvolden, P. (1998). Dissociation in hypnosis and frontal executive function. American Journal of Clinical Hypnosis, 40, 206–216.
(Appendix follows)
BENHAM, WOODY, WILSON, AND NASH
Appendix

Correlation Matrix, Means, and Standard Deviations of the Variables

Variable               1       2       3       4       5       6       7       8       9      10      11
 1. Expect 1           –
 2. SHSS 1–3        .271       –
 3. Expect 2        .794    .300       –
 4. SHSS 4–5        .106    .311    .184       –
 5. Expect 3        .747    .317    .855    .232       –
 6. SHSS 6–7        .158    .226    .144    .388    .209       –
 7. Expect 4        .656    .348    .766    .239    .894    .233       –
 8. SHSS 8–9        .146    .267    .033    .383    .041    .300    .055       –
 9. Expect 5        .617    .416    .699    .271    .813    .219    .913    .174       –
10. SHSS 10–12      .062    .223    .112    .372    .137    .372    .128    .251    .154       –
11. Expect 6        .471    .354    .467   −.022    .456    .058    .478    .028    .572   −.034       –
M                 15.078   2.678  15.478   1.222  15.622   0.978  15.700   0.589  15.856   0.833  13.544
SD                 5.016   0.647   5.091   0.742   4.936   0.789   5.587   0.713   5.523   0.860   6.877
Note. Expect 1 through Expect 6 represent expectancies measured at six time points over the course of the Stanford Hypnotic Susceptibility Scale (SHSS), Form C (Weitzenhoffer & Hilgard, 1962); SHSS 1–3 through SHSS 10–12 represent responsiveness to five subsets (or parcels) of hypnotic suggestions administered between successive expectancies.
Received August 15, 2005
Revision received February 15, 2006
Accepted February 27, 2006
Journal of Personality and Social Psychology 2006, Vol. 91, No. 2, 351–367
Copyright 2006 by the American Psychological Association 0022-3514/06/$12.00 DOI: 10.1037/0022-3514.91.2.351
Conceptual Beliefs About Human Values and Their Implications: Human Nature Beliefs Predict Value Importance, Value Trade-Offs, and Responses to Value-Laden Rhetoric

Paul G. Bain
University of Melbourne and Murdoch University

Yoshihisa Kashima and Nick Haslam
University of Melbourne
Beliefs that may underlie the importance of human values were investigated in 4 studies, drawing on research that distinguishes natural-kind (natural), nominal-kind (conventional), and artifact (functional) beliefs. Values were best characterized by artifact and nominal-kind beliefs, as well as a natural-kind belief specific to the social domain, “human nature” (Studies 1 and 2). The extent to which values were considered central to human nature was associated with value importance in both Australia and Japan (Study 2), and experimentally manipulating human nature beliefs influenced value importance (Study 3). Beyond their association with importance, human nature beliefs predicted participants’ reactions to value trade-offs (Study 1) and to value-laden rhetorical statements (Study 4). Human nature beliefs therefore play a central role in the psychology of values.

Keywords: human values, human nature, natural kinds, trade-offs, rhetoric
The study of human values has been dominated by the issue of value importance. The relative importance of values, whether ranked (e.g., Rokeach, 1973) or rated (e.g., S. H. Schwartz, 1992), has served as the basis for examining how people use values to understand and act in the world. However, there has been relatively little investigation into the psychological bases of value importance. Why, psychologically, do some people choose to pursue wealth or power as ultimate goals in life, while others devote their lives to creative pursuits or helping others? This question can also be extended to cultures: Given that cultures exhibit differences in value priorities, what psychological factors might underlie the importance of values for people within those cultures?

The most straightforward approach to this question would be simply to ask people why their values are important. This was done in pioneering work by Maio and colleagues (Bernard, Maio, & Olson, 2003; Maio & Olson, 1998). However, they found that people lacked explicit reasons for the importance of values, and they concluded that many values are seen as self-evident truisms. As a result, explicit reasons do not appear to be able to help us explain how people distinguish their value priorities.

In this article, we investigate another possible psychological basis for value importance: basic, often implicit, beliefs about the nature of values. We draw on extensive research on beliefs about the nature of objects, which provide a psychological basis for people’s understanding of objects in the physical world (e.g., Barton & Komatsu, 1989; Bloom, 1996; Gelman, 2003; Keil, 1989; Malt, 1990). This research distinguishes beliefs about three types of objects: natural kinds, artifacts, and nominal kinds. Natural kinds are biological and physical entities that are believed to possess an inherent essence arising from nature. Artifacts are created by humans, arising from human intentions and usually serving functions and purposes. Nominal kinds arise from human conventions and definitions. Such beliefs provide a fundamental basis on which people understand and act toward the physical world (Boyd, 1999; Gelman & Hirschfeld, 1999; Kalish, 2002; Rips, 1989). In this article, we call such beliefs conceptual beliefs because they relate to mental representations of categories, or concepts (Komatsu, 1992).

We propose that people have analogous conceptual beliefs about the nature of human values, which helps them distinguish values that are fundamentally important from those that are not. That is, the importance of values may be based on conceptual beliefs that they are an integral part of nature (natural kind), serve functions and purposes (artifact), or are conventionally shared within our social and cultural groups (nominal kind). In addition, we propose a fourth kind of conceptual belief especially relevant to values: that they are based in human nature.

In the present research, we compare conceptual beliefs about values with typical natural kind, nominal kind, and artifact concepts, drawn from long-standing research about objects and from emerging research in the social domain. We show that one conceptual belief in particular, centrality to human nature, appears to underlie the importance of values within and across two cultures (Australia and Japan). We also demonstrate that conceptual beliefs have implications beyond their association with value importance, specifically for how people react to value trade-offs and for the effectiveness of value-laden political rhetoric.

Paul G. Bain, Department of Psychology, University of Melbourne, Melbourne, Australia, and School of Psychology, Murdoch University, Perth, Australia; Yoshihisa Kashima and Nick Haslam, Department of Psychology, University of Melbourne. This article was supported by an Australian Postgraduate Award to Paul G. Bain and composes part of his doctoral dissertation. We thank Shalom Schwartz, Ngaire Donaghue, Helen Davis, and Melanie Freeman for comments on earlier versions of this article. Thanks also to Miki Fukunaga for coordinating data collection in Japan for Study 2, and to Jen Whelan and Pene Newitt for their assistance in collecting the data for Study 3. Correspondence concerning this article should be addressed to Paul G. Bain, School of Psychology, Murdoch University, Murdoch, 6150 Western Australia. E-mail: [email protected]

BAIN, KASHIMA, AND HASLAM
Conceptual Beliefs About Physical Objects

The distinction between natural kinds, nominal kinds, and artifacts was first proposed by Locke (1975/1700), who argued that natural kinds (e.g., animals, plants) are distinct from nominal kinds (e.g., triangle) and artifacts (e.g., pencil) because only natural kinds may have a nonhuman underlying nature, which is beyond human shaping. Moreover, Locke argued that artifacts are based on the intentions of their designer, distinguishing them from nominal kinds, which are based on shared human conventions.

These distinctions from erudite philosophy appear to characterize laypeople’s beliefs about many concepts. People believe that natural kind objects have underlying characteristics or essences that are responsible for their observable features (Barton & Komatsu, 1989; Diesendruck, 2001; Gelman, 2003; Gelman & Hirschfeld, 1999; Medin & Ortony, 1989). For instance, a raccoon may be believed to possess a “raccoon essence,” causing it to have the features common to raccoons, and an animal believed to have raccoon “insides” will remain so even if its surface appearance is modified (Keil, 1989). Although people usually lack detailed knowledge about underlying essences of natural kinds (Medin & Ortony, 1989), Putnam (1975) argued that laypeople display deference to experts about natural kinds, turning to experts when they need to know more about a natural kind’s essence.

Artifacts are commonly believed to serve functions and purposes (Keil, 1989). Barton and Komatsu (1989) showed that changes in functions were more likely to change category membership for artifacts than for natural kinds (e.g., a mirror that could no longer reflect images). However, functions do not exclusively determine artifact categories (Malt & Johnson, 1992), and the designer’s intention about how they want an object to be categorized may be critical in some cases (Bloom, 1996, 1998; Rips, 1989). This helps explain why some artifacts belong to a category even though they do not serve the usual function (e.g., a chair intended for display in a museum).

Nominal kinds are believed to be based on conventions, agreements, and definitions (Keil, 1989; Malt, 1990; S. P. Schwartz, 1978). They range from strictly defined terms (e.g., triangles) to human-created categories with no clear boundaries (e.g., the color red). Malt (1990) showed that participants rated the term by definition more meaningful for nominal kinds than for natural kinds and artifacts. However, socially defined, conventional aspects come to the fore where nominal kinds lack clear definitions. Thus, Malt (1990, Experiment 4) found that people chose the option “call it whichever you want” when asked to assign one of two categories to ambiguous nominal kinds (e.g., a number halfway between a prime number and an odd number). Here, categories can be seen as determined by agreement within relevant social/cultural groups.
Conceptual Beliefs About Social Things

These beliefs about objects appear to have counterparts in the social world. People appear to make a distinction analogous to that between natural kinds and artifacts/nominal kinds in their understanding of social categories (Haslam, Rothschild, & Ernst, 2000, 2002; Hirschfeld, 1996; Rothbart & Taylor, 1992). In particular, natural-kind social categories are imbued with an underlying essence that is seen as responsible for category membership (Haslam et al., 2000) and involve deference to experts in deciding category membership (Yzerbyt, Rocher, McGarty & Haslam, 1997; reported in Yzerbyt, Corneille, & Estrada, 2001).

In the area of social rules, Turiel (1998) argued for a distinction between social rules analogous to the distinction between natural kinds and nominal kinds. Moral social rules, such as those prohibiting hitting and stealing, are believed to be universally applicable and independent of human conventions. In contrast, conventional rules such as wearing a uniform at school rely on situational and group-based norms (Smetana, 1981, 1985; Tisak, 1995). That is, hitting may be seen as wrong in essence, even if rules permitting it exist, but not wearing a school uniform is wrong only if the school in question has a compulsory school uniform rule. Thus, there is some basis for the idea that these conceptual beliefs exist in the social domain.

However, we propose that, in the social domain, people may think about nature in a way that is distinct from the ways they think about the nature of objects. Although humans can be viewed as physical or biological objects, they also are seen to have a specifically “human” nature. Human and physical nature have long been distinguished in philosophy and religion. Since at least the time of Galileo, philosophers have distinguished the mind of the thinking human from the rest of the physical world, with a corresponding distinction between nature as it applies to humans and nature as it applies to objects (Collingwood, 1960). This distinction is also present in religions such as Christianity, in which humans are described as transcending and controlling (physical) nature (Stevenson & Haberman, 1998).

A specifically human nature has been invoked in a variety of social–psychological research areas. For instance, beliefs about human nature are said to underpin macrolevel political theories (Tetlock & Goldgeier, 2000), influence interpersonal interactions and attitudes (Wrightsman, 1992), and guide intergroup relations (Leyens et al., 2003). People have been shown to deny aspects of human nature to outgroups (Leyens et al., 2000, 2001) and individuals (Haslam, Bain, Douge, Lee, & Bastian, 2005), and personality traits rated high on human nature appear to be especially important for personal identity and interpersonal relations (Haslam, Bastian, & Bissett, 2004).

In summary, just as conceptual beliefs about objects provide a psychological basis for understanding the physical world, conceptual beliefs about social things may provide a psychological basis for understanding the social world. Although some beliefs about social things may be directly analogous to beliefs about the nature of objects, a belief that is intrinsic to the social domain, human nature, may also play a role.
Present Research: Conceptual Beliefs About Human Values

In the present studies, we examine how human values are associated with natural-kind, nominal-kind, artifact, and human nature beliefs, and how these beliefs about values are related to value importance. In addition, we investigate whether these beliefs may be related to how people use their values, specifically in their reactions to value trade-offs and value-laden rhetoric.
BELIEFS ABOUT VALUES
Some theories of human values have invoked ideas similar to artifact and nominal-kind beliefs to explain why some values are seen to be more important than others. For instance, functional approaches to values (S. H. Schwartz, 1992, 1994) imply that values are important because they serve useful social functions (values as artifacts). Other approaches argue that values are important to members of a culture because of shared socialization and conventions (Rokeach, 1973; values as nominal kinds). An ingroup’s values may be seen as core aspects of human nature, as S. H. Schwartz and Struch (1989; Struch & Schwartz, 1989) showed that people regarded outgroups with highly dissimilar value priorities as less human.

In this article, we examined these possibilities and especially the role of human nature beliefs in value judgments. First, we developed a battery of measures of these conceptual beliefs. We conducted two pilot studies to establish their validity. Second, we examined people’s beliefs about the nature of human values in comparison with other physical and social objects in Study 1 to determine whether values were believed to be more like natural kinds, artifacts, or nominal kinds. In Study 2, we extended this examination to two cultures that have been found to differ somewhat in the structure of their values (Australia and Japan; S. H. Schwartz & Sagiv, 1995). Third, we examined how people’s beliefs about the nature of values are associated with their judgments of value importance. In Studies 1 and 2, we examined these associations correlationally and identified human nature belief as showing the strongest associations with value importance. In Study 3, we experimentally manipulated people’s beliefs about the extent to which values are based in human nature and examined whether ratings of value importance were influenced. Fourth, we sought to establish the significance of human nature beliefs for other value-related phenomena. As part of Study 1, we examined the role of human nature beliefs in people’s reactions to value trade-offs, and in Study 4, we examined the role of these beliefs in people’s reactions to rhetorical justifications of values.
Pilot Studies: Developing Measures of Conceptual Beliefs

To examine conceptual beliefs about the nature of physical as well as social objects, we developed two instruments to measure conceptual beliefs. The first was designed to measure conceptual beliefs for both physical objects and social concepts and was called the Resolving Issues About Concepts (RIC) questionnaire (see Appendix A). The second instrument was designed to measure human nature beliefs.
RIC Questionnaire

This questionnaire is based on two premises. First, there can be uncertainty about whether some “thing” (physical or social) belongs to a particular category. For instance, when we see an object that looks like a mirror, we can debate whether it is really a mirror. Similarly, we can argue about whether a person’s actions are consistent with a certain human value (e.g., whether their conduct reflects “ambition”). Second, the information people find most relevant to resolve this categorization issue will reflect their beliefs about the object’s conceptual basis. For example, they should find information about functions and intentions to be most relevant for artifacts and information about underlying essential characteristics most relevant for natural kinds.

In the RIC questionnaire, a category membership issue is presented, stating that there are opposing views about how an object/a person’s conduct should be classified. Participants rate the usefulness of different sources of information for deciding which view is more justified. A stronger reliance on a particular information source implies that the respondent regards the object as a natural kind, a nominal kind, or an artifact. The order of indicators was randomized, but they were presented in the same order for each issue. Usefulness ratings are made on a 5-point scale on which 1 = Not at all useful, 3 = Moderately useful, 5 = Very useful, and na is used if the information is not applicable or does not make sense for that issue. The version of the RIC in Appendix A contains only the validated indicators of each concept type.

We conducted two pilot studies to examine the validity of these indicators, using both physical objects and social concepts. In Study A, the RIC questionnaire was administered to 82 undergraduate students at an Australian university (77.8% female). The version of the RIC in Study A differed slightly from the version in Appendix A, in that the functions and purposes artifact indicators were combined into a single item, and additional indicators were used: For artifacts, “The intention behind the creation of [concept]”; and for nominal kinds, “A definition of [concept].” Ratings on the RIC were made for two natural-kind objects (goat, gold), two artifacts (pencil, mirror), and two nominal kinds (red, majority), which were drawn from previous studies (Barton & Komatsu, 1989; Keil, 1989; Malt, 1990). In addition, ratings were made about social categories identified by Haslam et al.
(2000) as involving natural-kind beliefs (e.g., blind, Hispanic) and a category not involving natural-kind beliefs (e.g., movie buff).

Study B was used to replicate the findings of Study A and improve the RIC. The functions and purposes artifact indicators were separated, an additional natural-kind indicator was included, “Results from a systematic investigation of [concept],” and the nominal-kind indicator “definition” was substituted with “Information about the typical characteristics of [concept].” Therefore, in Study B there were three indicators of each concept type. Data were collected from 69 participants (68.1% female), with different participants rating object concepts and social concepts so that results cannot be explained by participants noticing differences between object and social concepts. No age or gender differences were observed in those completing the two questionnaire versions. One nominal-kind concept was substituted (news for majority). This study also included moral social rules (e.g., hitting, stealing) and conventional social rules (e.g., wearing uniforms, using teachers’ first names), adapted from Smetana (1981) and presented as an issue of determining whether someone’s behavior had breached school rules.

Object concepts. Table 1 shows the mean rating for each concept type on the indicators in both pilot studies. We used repeated measures analyses of variance (ANOVAs) with planned comparisons to determine whether indicators of each concept type were rated higher for exemplars of that concept type (Table 1; means shown in boldface) than for the average of exemplars of other concept types. For example, if a natural-kind indicator is valid, it should be rated significantly higher for natural kinds than for other concept types. Likewise, artifact concepts should be rated significantly higher on artifact indicators, and nominal-kind concepts significantly higher on nominal-kind indicators. Table 1 shows that most indicators of each concept type were valid, reliably distinguishing concept types (comparisons within rows). Natural kind objects were rated significantly higher than other objects on expert analysis and underlying characteristics. Artifact objects were rated significantly higher than other objects on functions and purposes (combined in Study A). Nominal-kind objects were rated higher than other objects on people’s opinions and general survey. Thus, these pilot studies identified two valid indicators of each concept type.

Table 1
Comparison of Exemplars of Object Concepts on the RIC in Two Pilot Studies

Indicator                        Natural kinds  Artifacts   Nominal kinds  Planned comparison: Own type vs. others
Study A
  Natural kind
    Expert analysis              4.7 (0.6)      3.5 (0.9)   3.8 (0.9)      F(1, 65) = 109.9**
    Underlying characteristics   4.4 (0.8)      3.8 (1.0)   3.6 (1.0)      F(1, 65) = 41.2**
  Artifact
    Functions/purposes           3.2 (1.1)      3.8 (1.1)   2.9 (1.0)      F(1, 43) = 30.6**
    Intentions                   3.3 (1.1)      3.3 (1.0)   2.9 (0.9)      F(1, 30) = 1.8, ns
  Nominal kind
    People’s opinions            2.2 (0.9)      2.5 (1.0)   3.0 (0.9)      F(1, 74) = 58.7**
    General survey               2.2 (0.9)      2.8 (1.2)   3.0 (0.9)      F(1, 66) = 32.3**
    Definition                   4.1 (0.8)      4.4 (0.7)   4.2 (0.8)      F(1, 75) = 0.5, ns
Study B
  Natural kind
    Expert analysis              4.9 (0.3)      4.6 (0.5)   4.0 (0.7)      F(1, 28) = 36.6**
    Underlying characteristics   4.2 (0.6)      4.0 (0.8)   3.9 (0.7)      F(1, 30) = 4.9*
    Systematic investigation     3.8 (0.7)      3.7 (0.8)   3.3 (0.9)      F(1, 25) = 3.6, ns
  Artifact
    Purposes                     3.1 (0.8)      4.1 (0.7)   3.6 (0.6)      F(1, 23) = 21.4**
    Functions                    3.9 (0.7)      4.1 (0.8)   3.4 (0.8)      F(1, 24) = 8.9**
    Intentions                   3.0 (0.9)      3.2 (0.8)   3.1 (1.0)      F(1, 20) = 1.7, ns
  Nominal kind
    People’s opinions            3.0 (0.8)      3.1 (0.8)   3.5 (0.7)      F(1, 32) = 13.4**
    General survey               2.2 (0.8)      2.4 (0.8)   3.1 (0.8)      F(1, 26) = 28.9**
    Typical features             4.0 (0.8)      3.8 (0.8)   3.6 (0.9)      F(1, 28) = 5.0*^a

Note. Values are means with standard deviations in parentheses. For each indicator, planned comparisons test for the difference between the boldfaced means (those in the column matching the indicator’s own concept type) and the (averaged) means for the other two concept types. Boldfaced means should be higher if the indicator is valid. RIC = Resolving Issues About Concepts.
^a Difference was opposite to expected direction.
* p < .05. ** p < .01.

Social concepts. Table 2 shows the mean ratings of natural and nonnatural social concepts (from Study A) and of moral and conventional social rules (from Study B). Natural social categories were rated significantly higher than the nonnatural social category on the natural-kind indicators and were rated significantly lower on all other indicators. Thus, the RIC appears to distinguish natural and nonnatural social categories. However, the findings were weaker for moral and conventional rules. As expected, moral rules were rated significantly higher than conventional rules on the natural-kind indicator expert analysis, but no difference was observed for underlying characteristics. There were no differences between moral and conventional rules on the nominal kind or artifact indicators. An explanation for this finding is that participants may have focused attention on their status as rules, seeing both types of rules (correctly) as human conventions created for the purpose of constraining behavior; that is, involving both nominal-kind and artifact beliefs. Within these general beliefs about social rules, however, it appears that moral rules are seen as more natural than conventional rules.

Table 2
Comparison of Social Categories and Social Rules on the RIC

                                 Study A: Social categories                     Study B: Social rules
Indicator                        Natural    Non-natural  Comparison             Moral      Conventional  Comparison
Natural kind
  Expert analysis                4.3 (0.8)  3.0 (1.2)    F(1, 78) = 102.4**     3.3 (1.0)  2.6 (0.9)     F(1, 30) = 15.1**
  Underlying characteristics     3.9 (0.9)  3.5 (1.3)    F(1, 58) = 9.9**       3.9 (0.7)  3.9 (0.8)     F(1, 27) = 0.1, ns
Artifact
  Functions/purposes             3.0 (1.3)  3.5 (1.1)    F(1, 40) = 8.8**
  Functions                                                                     3.8 (0.7)  3.9 (0.7)     F(1, 30) = 0.5, ns
  Purposes                                                                      4.2 (0.7)  4.1 (0.8)     F(1, 32) = 0.3, ns
Nominal kind
  People’s opinions              2.3 (1.0)  3.3 (1.2)    F(1, 78) = 61.8**      2.7 (1.0)  2.9 (1.2)     F(1, 33) = 1.6, ns
  General survey                 2.7 (1.1)  3.5 (1.2)    F(1, 79) = 40.5**      2.5 (0.9)  2.5 (0.8)     F(1, 29) = 0.4, ns

Note. For each indicator, planned comparisons test for the difference between the boldfaced means and the (averaged) means for the other two concept types. Boldfaced means should be higher if the indicator is valid. RIC = Resolving Issues About Concepts.
** p < .01.
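The planned comparisons reported in Tables 1 and 2 can be illustrated with a short sketch. It assumes, as is standard for a single-degree-of-freedom within-subject contrast tested against its own error term, that the F statistic equals the squared one-sample t on each participant's contrast score (own-type rating minus the average of the other two). The function name, the handling of "na" responses as missing values, and the ratings below are hypothetical and are not taken from the study data.

```python
from math import sqrt
from statistics import mean, stdev


def own_vs_others_contrast(own, other_a, other_b):
    """Return (F, error df) for the contrast: own type vs. mean of the other two.

    Each argument holds one rating per participant; None stands for an
    "na" response, and participants with any missing rating are dropped
    (an assumption about how missing responses were handled).
    """
    rows = [(o, a, b) for o, a, b in zip(own, other_a, other_b)
            if None not in (o, a, b)]
    # Contrast score per participant: own-type rating minus average of others.
    scores = [o - (a + b) / 2 for o, a, b in rows]
    n = len(scores)
    t = mean(scores) / (stdev(scores) / sqrt(n))
    return t * t, n - 1  # F(1, n - 1) = t squared


# Hypothetical usefulness ratings of "expert analysis" for three object types.
natural = [5, 4, 5, 5, 4, None, 5, 4]
artifact = [4, 3, 4, 3, 3, 4, 4, 2]
nominal = [4, 4, 3, 4, 3, 3, None, 3]

F, df = own_vs_others_contrast(natural, artifact, nominal)
print(f"F(1, {df}) = {F:.1f}")
```

Dropping participants with "na" responses is one possible reason the error degrees of freedom in Tables 1 and 2 differ from indicator to indicator, although the article does not state this.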
Human Nature Ratings

The RIC questionnaire was designed to measure conceptual beliefs about the nature of things in both the physical and social domains. A separate measure of human nature beliefs was developed because human nature is tied specifically to the social domain. Human nature is something all humans are expected to possess but not nonhumans; therefore, we used a scenario comparing humans with beings that are humanlike in every way, except for their knowledge about the concept in question. These humanlike beings lived in a place called Twin Earth (following Putnam, 1975), which was described in the following way:

    Twin Earth is very much like Earth. In fact, apart from a few differences we will specify, you may suppose that Twin Earth is exactly like Earth. On Twin Earth live beings that are very much like people on Earth, that are similar to people in almost all ways. They even speak [language of participants]. However, they differ from people on Earth in that they don’t possess or understand some of the concepts that we do on Earth. Not only do they lack the concept in any discernible form, they also have no idea what the concept means.
Following this scenario, the list of concepts was presented in an order different from that in the RIC questionnaire. Participants were directed to think about each concept independently and consider the following question: “If Twin-Earthians do not understand or regard as important each of the concepts, how different would their nature be from humans on Earth?” Ratings on a 5-point scale were as follows: 1 (not at all different), 2 (a little different), 3 (moderately different), 4 (very different), and 5 (extremely different).

Human nature ratings were obtained for 22 social and object concepts in Study B, including those mentioned earlier and human values. The mean rating for social concepts (M = 3.69, SD = 0.64) was significantly higher than for object concepts (M = 2.67, SD = 0.73), t(68) = 11.4, p < .001. In addition, human nature ratings for concepts were distinct from natural-kind beliefs. Using mean ratings (n = 22), a correlation of −.66 (p < .01) was observed between human nature and expert analysis, and the correlation between human nature and underlying characteristics was −.42 (ns). These results show that human nature beliefs were more relevant to concepts in the social domain than to objects and were distinct from natural-kind beliefs. Further validation of the human nature measure is presented in Study 1.
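The correlations above are computed across concepts (n = 22 concept means) rather than across participants. A minimal sketch of that concept-level analysis follows; the five concepts and their mean ratings are invented for illustration and are not the study's data, and `pearson_r` is a hypothetical helper written out so the block needs only the standard library.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical concept-level means (one value per concept, already averaged
# over participants): human nature rating and "expert analysis" usefulness.
human_nature = {"goat": 2.1, "mirror": 2.6, "red": 3.0,
                "stealing": 3.9, "freedom": 4.4}
expert_analysis = {"goat": 4.8, "mirror": 4.5, "red": 4.0,
                   "stealing": 3.4, "freedom": 3.0}

concepts = sorted(human_nature)  # align the two measures by concept name
r = pearson_r([human_nature[c] for c in concepts],
              [expert_analysis[c] for c in concepts])
print(f"r = {r:.2f} across {len(concepts)} concepts")
```

Because the unit of analysis is the concept, the sample size for the significance test is the number of concepts, which is why a correlation of −.42 can be nonsignificant with n = 22.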
Study 1

Having established that the RIC and human nature measures assess distinct conceptual beliefs about values, we used them to address questions about values and value importance in Study 1, which had three aims. To identify the typical conceptual beliefs held about human values, we first compared ratings on the conceptual beliefs for values with those for concepts from the physical domain (natural-kind, artifact, and nominal-kind objects) and the social domain (social categories and social rules). Second, we conducted a preliminary investigation of the relationship between conceptual beliefs and value importance. Third, we examined whether conceptual beliefs about values are associated with people’s reactions to situations where values are traded off for monetary or economic gain.

Previous research has shown that some values are treated as protected (Baron & Leshner, 2000; Baron & Spranca, 1997) or sacred (Tetlock, Kristel, Elson, Green, & Lerner, 2000; Tetlock, McGraw, & Kristel, 2004), such that some people resist or deny the possibility of forsaking a value for economic or other rewards (e.g., those who hold the value protecting the environment strongly may say that no amount of economic gain justifies harvesting pristine rainforest for timber). When value trade-off decisions are made by others, those who hold a value sacred will express anger and evaluate decision makers harshly and may praise decision makers who protect values despite the temptation of monetary gains.

An obvious explanation for why values might be protected or sacred is that these values are highly important to people. However, we propose an alternative explanation. Values may be sacred or protected because they are believed to reflect an essential, inalienable aspect of what it means to be human; that is, they are seen as central to human nature. Thus, sacrificing such values may be seen as reprehensible because it desecrates people’s humanity, and protecting these values may be seen as praiseworthy for upholding humanity in the face of threat. Hence, it is expected that reactions to value trade-off situations will be stronger when people believe the value involved is more central to human nature.
Method Participants and procedure. Participants were 109 undergraduate psychology students (79% female, 21% male) from an urban Australian university who completed the questionnaire as part of a research participation requirement. Participants completed a questionnaire pack in small groups of up to 7 per group. Materials. The questionnaire pack contained the RIC questionnaire, human nature measure, a measure of value importance, and a value trade-off scenario. The concepts used in the RIC questionnaire contained six examples of six conceptual types: natural-kind objects, artifact objects, nominal-kind objects, social categories, social rules, and human values (see Table 3). The object concepts were chosen at two levels of abstraction— basic or species level (e.g., goat) and subordinate level (e.g., blow-fly), as the level of abstraction can influence conceptual properties (Malt, 1990; Rosch, Mervis, Gray, Johnson, & Boyes-Braem, 1976). For human values and social rules, the RIC was framed in terms of behavior, as in the pilot studies. Two additional concepts (juggling and hunting), called “activity” concepts, were included and were framed in terms of behavior, similar to social rules and values. These concepts enabled us to examine whether
Table 3
Social and Object Concepts Used in Study 1

Concept type: Concepts
Object
  Natural kinds: Orchid (nat1), water (nat2), goat (nat3), gold* (nat4), blowfly* (nat5), elm* (nat6)
  Artifacts: Pencil (art1), glue (art2), ice-axe (art3), tv antenna* (art4), mirror* (art5), formal shirt* (art6)
  Nominal kinds: Red (nom1), prime number (nom2), equilateral triangle (nom3), purple* (nom4), circle* (nom5), hot* (nom6)
Social
  Categories
    Natural: Asian (scn1), Hispanic (scn2), blind* (scn3)
    Non-natural: Blue-collar (sc1), movie-buff* (sc2), middle-class* (sc3)
  Rules
    Moral: Stealing (mr1), cheating* (mr2), hitting* (mr3)
    Conventional: Wearing uniform (cr1), using computer at lunchtime (cr2), using teacher’s first name* (cr3)
  Human values: Humble (v1), social order (v2), ambition (v3), enjoying life (v4)ᵃ, freedom* (v5), exciting life* (v6)
  Activities: Juggling (act1), hunting* (act2)

Note. Concepts marked with an asterisk were included in the same questionnaire version. Numbers after concepts relate to their use in Figure 1.
ᵃ “Enjoying life” was included in both questionnaire versions.
methodological differences in framing of the issues for objects and social concepts were responsible for differences in ratings on the indicators. Ratings of human nature and value importance were also obtained. For the human nature ratings, some slight wording changes were required for grammatical reasons; for example, “hot” was rephrased as “heat.” We obtained value importance ratings for a subset of the values, which we measured using the scale from the Schwartz Value Survey (SVS; S. H. Schwartz, 1992). This involved rating values “as a guiding principle in my life” on a 9-point scale, on which 7 = of supreme importance, 6 = very important, 3 = important, 0 = not important, and −1 = opposed to my values. To minimize participant fatigue, we split the 38 concepts shown in Table 3 between two questionnaires, which were completed by different participants. To check for equivalence in responses for values across questionnaires, we included one value (enjoying life) in both questionnaires. In addition, human nature ratings were obtained for all concepts in both versions of the questionnaire. Participants were randomly allocated to the two versions of the questionnaire, with a similar number completing each version (55 and 54 for Versions 1 and 2, respectively). Across questionnaire versions, participants were similar in male-to-female ratio (82% and 76% female), χ²(1, N = 109) = .57, ns, and in age (20.7 years and 20.1 years), t(106) = .52, ns, and a multivariate ANOVA on the six RIC ratings of enjoying life showed no multivariate difference across versions, F(6, 82) = 1.44, ns. The value trade-off scenario involved a short vignette in which the director of a policy-making body was considering recommending a policy to government that would economically stimulate an industry but would result in sacrificing a human value (see Appendix B for scenario details).
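The equivalence check on gender composition reported above is a Pearson chi-square test on a 2 × 2 table (questionnaire version by gender). The sketch below shows the calculation. The cell counts are not reported in the text; they are an assumption, inferred from the rounded percentages (45 of 55 is about 82%, and 41 of 54 is about 76%), so the result is only an illustration of the method.

```python
def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of version and gender.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# ASSUMED counts (not reported in the article): rows are questionnaire
# Versions 1 and 2, columns are (female, male).
counts = [[45, 10], [41, 13]]
chi2 = chi_square_2x2(counts)
print(round(chi2, 2))
```

With one degree of freedom, the 5% critical value of chi-square is 3.84, so a statistic well below 1 is consistent with the nonsignificant result reported.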
Experimental manipulations included whether the policy was recommended (sacrificing the value) or not recommended (protecting the value). The value at stake (enjoying life, freedom, exciting life) and the ease of the decision (whether the decision was easy or difficult to make) were also factorially varied. Participants evaluated the decision on the seven semantic differential items used in Tetlock et al. (2000). These items were good–bad, wise–foolish, positive–negative, moral–immoral, irrational–rational, compassionate–cruel, and crazy–sane and were rated on 7-point scales. A principal-components analysis indicated that these items formed a single factor, explaining 66.7% of the variance. After reversing items so that higher scores reflected more positive evaluations, we combined the seven items into a decision evaluation scale (Cronbach’s alpha = .92).
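The scale construction described above (reverse-scoring some items, combining the seven items, and reporting Cronbach's alpha) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the ratings are invented, the 1 to 7 response range is assumed, the choice of which items are reverse keyed (`REVERSE_KEYED`) is hypothetical, and averaging the items is only one way of combining them.

```python
from statistics import variance


def reverse_score(rating, low=1, high=7):
    """Flip a rating on a low..high scale (2 becomes 6 on a 1-7 scale)."""
    return low + high - rating


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item ratings (one row per respondent)."""
    k = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# HYPOTHETICAL data: four respondents by seven items, in the order
# good-bad, wise-foolish, positive-negative, moral-immoral,
# irrational-rational, compassionate-cruel, crazy-sane.
REVERSE_KEYED = {4, 6}  # assumption: these two items run the other way
raw = [
    [6, 6, 7, 6, 2, 6, 1],
    [5, 4, 5, 5, 3, 5, 3],
    [2, 3, 2, 2, 6, 3, 6],
    [4, 4, 4, 3, 4, 4, 5],
]
scored = [
    [reverse_score(r) if i in REVERSE_KEYED else r for i, r in enumerate(row)]
    for row in raw
]
scale_scores = [sum(row) / len(row) for row in scored]  # mean of the 7 items
alpha = cronbach_alpha(scored)
print(scale_scores, round(alpha, 2))
```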
Results and Discussion As the RIC indicators and the human nature measure have different scales and characteristics, results for these measures are presented separately. Multidimensional scaling (MDS) was conducted on the mean ratings of the six RIC indicators for the concepts shown in Table 3. Two subordinate-level, definition-based nominal-kind items (“equilateral triangle” and “prime number”) were rated low on nominal-kind indicators and were excluded from subsequent analyses. For the remaining 36 concepts, a two-dimensional solution exhibited an acceptable fit to the data (stress = .12; RSQ = .93) and is shown in Figure 1. To understand these dimensions, the coordinates of concepts on each dimension were correlated with scores on the indicators. The horizontal dimension was labeled a nominal-kind–natural-kind dimension, as the strongest positive correlations were with the natural-kind indicators “expert analysis” (r = .97, p < .01) and “underlying characteristics” (r = .45, p < .01), and there were strong negative correlations with the nominal-kind indicators “people’s opinions” (r = −.68, p < .01) and “general survey” (r = −.45, p < .01). The vertical dimension was labeled an artifact dimension, as strong negative correlations were found for purposes (r = −.95, p < .01) and functions (r = −.87, p < .01). Thus, concepts closer to the base of the figure showed more artifact-like properties. To compare conceptual beliefs about the different concept types, we used ellipses in Figure 1 to indicate the area occupied by the concepts of each type. As expected from the pilot studies, natural kinds, artifacts, and nominal kinds occupied distinct areas of multidimensional space, with a single exception for the nominal kind “hot” (nom6).
This suggests that the conceptual beliefs measured by the RIC fully partition concepts into types, and thus, that these concepts are “linearly separable” (Medin & Schwanenflugel, 1981).¹ Natural kinds and nominal kinds were differentiated along the horizontal dimension, with artifact concepts falling between, which corresponded to the continuum of concept types suggested by Keil (1989). However, in contrast to Keil’s unidimensional model, a second dimension distinguishing object concepts was identified, relating specifically to artifact beliefs. Social concepts were mainly clustered toward the left bottom quadrant of the MDS solution, corresponding to a combination of nominal-kind and artifact beliefs. There was little overlap in the position of values and social rules, also suggesting linear separability. In contrast, social categories were distributed across multidimensional space. This was particularly pronounced for natural social categories, which involved beliefs similar to natural-kind objects (blind; SCN3) or nominal-kind objects (racial categories: Asian [SCN1] and Hispanic [SCN2]). Although Hirschfeld (1996) has shown that race is often seen as a natural kind, this does not appear to be the case on these measures. Social rules were rated as more like artifacts than most other social concepts, consistent with
¹ We thank an anonymous reviewer for pointing out this connection.
Figure 1. Two-dimensional multidimensional scaling solution for object and social concepts on the six RIC indicators (Study 1). RIC = Resolving Issues About Concepts questionnaire.
an emphasis on their status as rules, created specifically to serve the function/purpose of constraining behavior. However, moral rules were rated as more natural than conventional rules, consistent with Turiel’s (1998) propositions and Study B. Human values were clustered toward the left of the solution and showed little overlap with any other concept type. They were characterized more by nominal-kind beliefs than most object concepts but showed a higher level of artifact beliefs than nominal-kind objects. This suggests that people believe values are purpose based and functional and see them as deriving from social sources such as our community or culture. They were characterized least by natural-kind beliefs, meaning that people do not conceptualize values as natural in the same way that they conceptualize natural objects. Although beliefs about values were in an area similar to other social concepts, particularly social rules, they were distinct from social rules through being seen as less like artifacts. The position of activity concepts and natural social categories allows us to discount some methodological explanations for these results. As evidence against the results being due to the framing of issues, the activity concept hunting (ACT2) was presented in terms of conduct in the RIC but was rated more natural than all nominal kinds and several artifact objects. Similarly, it is unlikely that ratings were made on the basis of participants noticing a distinction between object and social concepts, as the natural social category “blind” (SCN3) was rated very similarly to natural-kind objects and as more natural than almost all other concepts, and Asian (SCN1) and Hispanic (SCN2) were rated similarly to nominal kinds. These results add confidence to the conclusion that values are believed to exhibit a unique combination of nominal-kind and artifact properties. Human nature versus nature of objects.
The average ratings of human nature beliefs were not correlated with average ratings of the natural-kind indicators across the items used in the study (r = .01, ns, for both expert analysis and underlying characteristics). With a focus on the six values, human nature ratings were only correlated with natural-kind indicators in 1 of 12 tests (with underlying characteristics for the value “enjoying life,” r = .24, n = 104, p < .05). Human values received the highest ratings on the human nature measure of all concept types (means: values, 4.19; social categories, 3.46; natural kinds, 3.35; nominal kinds, 2.95; artifacts, 2.81; social rules, 2.66; activities, 2.28). A ranking of means revealed that all values were in the top 30% of concepts on this measure, with freedom, enjoying life, and social order exhibiting the highest ratings of all. It is clear from these results that human nature is a highly relevant belief for values and is a belief distinct from natural kind beliefs about objects. Conceptual beliefs and value importance. Value importance ratings were obtained for a subset of four values, and correlations between conceptual beliefs and value importance were computed. As values were split into two questionnaires (except for “enjoying life”), correlations involving RIC beliefs for all values except for enjoying life were based on approximately half the total sample size (total N = 109); however, human nature belief ratings were made for all values. Human nature ratings were significantly correlated with value importance for all four values tested: exciting life (r = .29, p < .05), social order (r = .22, p < .05), freedom (r = .19, p < .05), and enjoying life (r = .29, p < .05). The correlations involving human nature beliefs were based on a greater n than for the other conceptual beliefs and therefore cannot be compared directly.
However, no correlations between other conceptual beliefs and value importance were significant (in the same order of values as above): expert analysis, .11, .18, −.01, −.03; underlying characteristics, .02, −.13, .16, .11; purposes, .10, .01, .13, .09; functions, .00, .10, .08, .10; people’s opinions, .12, .06, .22, .00; and general survey, .06, .02, −.07, −.04. These results provide initial evidence that human nature beliefs may be crucial as justifications for the importance of values. Conceptual beliefs and value trade-offs. We conducted a 2 (decision to protect vs. sacrifice a value) × 3 (value) × 2 (ease of decision) ANOVA on the decision evaluation scale as a manipulation check. A significant main effect was identified for decision, F(1, 98) = 87.5, p < .001, with no other significant main effects or interactions. Therefore, only the decision variable was included in later analyses. Responses on the decision evaluation scale were all above the midpoint in the protect condition and all below the midpoint for the sacrifice condition. To enable responses from these conditions to be analyzed together, we reversed scores for the sacrifice condition, which resulted in a single evaluation intensity scale across both conditions, with higher scores reflecting more intense reactions to the scenario (positive for the protect condition, negative for the sacrifice condition). This evaluation intensity measure was regressed onto human nature, the two natural-kind indicators, the two artifact indicators, the two nominal-kind indicators, value importance, interaction terms for all predictors with condition (with condition effect coded −1 for protect and 1 for sacrifice), as well as condition itself (dummy coded as 0 for protect and 1 for sacrifice). Although value importance is not a conceptual belief, it was included to assess a traditional explanation for reactions to trade-offs; that is, people will react more intensely to value trade-offs when a value is more important to them. The regression model with 16 predictors explained a significant amount of variance, R² = .37, F(16, 58) = 2.12, p = .019, and there was no evidence of problematic multicollinearity (all VIFs < 3.7).
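The variable coding described above can be sketched in a few lines: scores are reversed in the sacrifice condition to give an evaluation intensity score, condition enters once as a 0/1 dummy, and each predictor is multiplied by an effect-coded condition (−1 or 1) to form its interaction term. The function and key names are hypothetical, and the sketch assumes the decision evaluation scale is the mean of the items on a 1 to 7 range, which the text does not state.

```python
def evaluation_intensity(evaluation, sacrificed, low=1, high=7):
    """Reverse the evaluation in the sacrifice condition so that higher
    scores mean a more intense reaction in both conditions."""
    return (low + high - evaluation) if sacrificed else evaluation


def design_row(predictors, sacrificed):
    """Build one regression row: main effects, a 0/1 condition dummy, and
    predictor-by-condition terms using effect coding (-1 protect, +1 sacrifice)."""
    effect_code = 1 if sacrificed else -1
    row = dict(predictors)
    row["condition"] = 1 if sacrificed else 0
    for name, value in predictors.items():
        row[name + "_x_condition"] = value * effect_code
    return row


# HYPOTHETICAL participant in the protect condition.
beliefs = {"human_nature": 4.0, "general_survey": 2.5}
print(evaluation_intensity(5.5, sacrificed=False))
print(design_row(beliefs, sacrificed=False))
```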
However, most predictors were far from significant, so a subsequent regression model was fitted using only the five predictors that showed at least a trend to significance (p < .1; all other predictors were p > .25). This model is shown in Table 4 and was also significant, R² = .31, F(5, 74) = 6.70, p < .001, with no problematic multicollinearity (all VIFs < 2.7). Table 4 shows that greater intensity of evaluations occurred when the value was protected from trade-offs and where values were believed to be central to human nature. However, the Human Nature × Condition interaction term indicated that the relationship between human nature and evaluation intensity was stronger when values were protected. Evaluations were less intense where values were seen as based in social conventions (“general survey,” a nominal-kind indicator). These findings demonstrate that values were more protected when they were seen as central to human nature and less protected when they were believed to be based in social conventions. However, in contrast to previous research on taboo trade-offs, participants indicated stronger responses where values were upheld rather than sacrificed. This difference may be partly due to the scenario being in the political domain, where it is possible that stereotypes of politicians suggest that sacrificing values might be more typical and expected and, hence, protecting values especially worthy of praise.

Table 4
Predictors of Evaluation of Policy Decision

Predictor                                 B       β      t
Condition (0 = protect; 1 = sacrifice)   −0.74   −.35   3.38**
Human nature                              0.45    .37   2.36*
Human Nature × Condition                 −0.36   −.34   2.18*
Underlying characteristics               −0.21   −.19   1.93
General survey                           −0.29   −.28   2.88**

* p < .05. ** p < .01.

In summary, Study 1 showed that human values involve nominal-kind and artifact beliefs and are seen as central to human nature. This combination of beliefs distinguished them from both object concepts and other social concepts. However, only the belief that a value was central to human nature was correlated consistently with value importance. Human nature beliefs also showed the strongest relationships with people’s reactions to value trade-offs. These findings suggest that human nature beliefs may provide a significant psychological basis for human values.
Study 2 Study 2 extends the findings of Study 1 in several ways. First, with a larger and more representative set of human values, it focuses on the conceptual beliefs that distinguish different types of values. Second, a more comprehensive test of the associations between conceptual beliefs about values and value importance is conducted. Third, both beliefs about values and their relationships with value importance are examined cross-culturally, with samples from Australia and Japan, who have been shown to differ somewhat in the structure of their value systems (S. H. Schwartz & Sagiv, 1995). This permits examination of whether beliefs about values vary across cultures and whether the beliefs that underlie value importance are culture specific; for example, the importance of equality might be based on natural-kind beliefs for Australians but on artifact beliefs for Japanese.
Method Participants. Participants were 66 Australian-born undergraduate students residing in a large Australian city (71% female, mean age = 18.9 years, SD = 1.1), and 95 Japanese-born students (34% female; mean age = 19.3, SD = 1.6), residing in a large Japanese city. Australian participants completed these measures in small groups as part of a research participation program. Japanese participants completed questionnaires in groups or in their own time, on a voluntary basis. Both Australian and Japanese participants completed the 56-item Schwartz Value Survey (SVS) (S. H. Schwartz, 1992) in their first language as part of a larger questionnaire pack. In the SVS, each value is rated “as a guiding principle in my life” on a 9-point scale, on which 7 = of supreme importance, 6 = very important, 3 = important, 0 = not important, and −1 = opposed to my values. Japanese value terms were taken directly from the Japanese version of the SVS (S. H. Schwartz, personal communication, March 20, 2002). Procedure. Two values were selected from each of Schwartz’s 10 motivational value domains (freedom, creativity, exciting life, daring, enjoying life, pleasure, ambition, successful, social power, wealth, social order, national security, obedience, politeness, respect for tradition, humility, honesty, helpfulness, equality, protecting the environment) plus self-respect, which usually is found near the center of Schwartz’s circumplex, making a total of 21 values. We obtained RIC and human nature ratings for these values, using measures identical to those in Study 1. There were two versions of the questionnaire, with the order of the issues in the RIC and concepts in the human nature measure reversed. To develop the Japanese questionnaire, the English version was translated into Japanese by one bilingual and back-translated by another bilingual to establish equivalence of meaning.
Figure 2. Two-dimensional multidimensional scaling solution for values for (a) Australian and (b) Japanese samples.
Results Conceptual structure of human values. For both Australian and Japanese samples, a two-dimensional solution of MDS analysis of the mean RIC ratings for values was acceptable (Australian: stress = .07, RSQ = .98; Japanese: stress = .10, RSQ = .96), and these models are shown in Figure 2. Coordinates for the values on the two dimensions were correlated with RIC indicator scores for interpretation of the dimensions of the models. The horizontal coordinates for values were highly correlated with all natural-kind and artifact beliefs in both samples (all ns = 21, all ps < .01; Australia: expert analysis, r = .98; underlying characteristics, r = .63; purposes, r = .86; functions, r = .83; Japan: expert analysis, r = .97; underlying characteristics, r = .69; purposes, r = .68; functions, r = .87). Vertical coordinates were highly correlated
with nominal-kind indicators in both samples (all ns = 21, all ps < .01; Australia: people around, r = .94, general survey, r = .89; Japan: people around, r = .97, general survey, r = .97). The positive association between natural kind and artifact beliefs for values may imply that values are seen as inherent goals, corresponding to Aristotle’s (1967) claim that people are goal oriented by nature. However, these beliefs are independent from believing values are socially defined. Figure 2 shows groupings of values corresponding to the ends of the two bipolar value dimensions of S. H. Schwartz’s (1992) value model. The overlapping nature of these clusters indicates that different types of values in Schwartz’s model do not show distinct sets of conceptual beliefs. At most, openness values have lower levels of natural kind/artifact beliefs than other value types in both cultures. Although values are not identically located in the Australian and Japanese MDS solutions, their relative positions along the horizontal dimension were very similar (Spearman’s ρ = .81, n = 21, p < .001). For instance, in both cultures national security and protecting the environment exhibited relatively high levels of natural kind/artifact beliefs, whereas enjoying life and exciting life exhibited low levels. On the vertical dimension, there was relatively less agreement across cultures, although the relationship was still strong (Spearman’s ρ = .60, n = 21, p < .005). On this dimension politeness was relatively high on nominal-kind beliefs in both samples, and enjoying life was relatively low. For Australians, tradition and exciting life were more like nominal kinds than for Japanese, whereas Japanese regarded power and obedience as more like nominal kinds.² To understand the relationship between RIC indicators and human nature beliefs, we examined correlations between the mean human nature and RIC ratings.
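The cross-cultural agreement figures reported above are Spearman rank correlations computed over the 21 values: each value's coordinate in one sample is ranked, and the ranks are correlated with the ranks from the other sample. A minimal sketch of that calculation follows. The coordinates are invented for illustration and the function names are hypothetical; ties receive average ranks.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# HYPOTHETICAL horizontal MDS coordinates for five values in two samples.
australia = [1.2, 0.9, -0.4, -1.1, 0.1]
japan = [1.0, 1.3, -0.2, -0.9, -0.3]
print(round(spearman_rho(australia, japan), 2))
```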
The human nature rating did not correlate with RIC natural-kind ratings in either sample (Australia: expert analysis, r = .03, ns; underlying characteristics, r = .06, ns; Japan: expert analysis, r = .07, ns; underlying characteristics, r = .15, ns), replicating Study 1. Moreover, it was not correlated with artifact and nominal-kind ratings in either sample. There was very high agreement across cultures in human nature ratings for the values (Spearman’s ρ = .70, n = 21, p < .001), although there were also some notable differences. For Australians, freedom, equality, and honesty were most central to human nature, whereas for the Japanese, social order, pleasure, and equality were most central. In both cultures, exciting life and daring (both stimulation values in S. H. Schwartz’s, 1992, model) were rated least central to human nature. Conceptual beliefs and value importance. By investigating a relatively large number of values, we could examine the relationship between conceptual beliefs and value importance in two ways. First, the relationship can be examined separately for each value, as was done in Study 1. That is, do those people who hold a particular conceptual belief about a value rate that value as more important? This will be called the individual-level analysis (where individual relates to individual values). The second way of analyzing the relationship between conceptual beliefs and value importance is to examine whether the relative importance of values in a particular group (such as a culture) is associated with the group’s conceptual beliefs about those values. That is, are values about which group members hold a particular conceptual belief especially important in that group? This question can be addressed by correlating the mean ratings of
value importance with mean ratings of conceptual beliefs within a group or culture. This is called the group-level analysis. Analyses at individual and group levels are distinct and independent (Howard, 1994; Snijders & Bosker, 1999). However, multilevel modeling procedures allow simultaneous examination of relationships at both levels (Bryk & Raudenbush, 1992; Snijders & Bosker, 1999). At the individual level, multilevel modeling can be used to fit a separate regression model for each value and enables the assessment of whether these regression slopes are similar or different across values. At the group level, mean ratings of the conceptual belief variables can be used to predict mean ratings of value importance.³ We conducted multilevel modeling using HLM 5.04 (Raudenbush, Bryk, & Congdon, 2001), with separate models fitted for Australian and Japanese samples. For each value, importance ratings were regressed on ratings of human nature and the RIC indicators (all ratings were centered around their mean for each value). Initially, regression coefficients were allowed to vary for each value, but tests of the variance in regression coefficients showed that they did not vary significantly across values in either sample; all variance components < .03, all χ²(20) > .05.⁴ Therefore, simpler models were fitted, assuming that the regression coefficients were the same for all values. Thus, the individual-level coefficients reported in Table 5 estimate the average relationship across the 21 values. The group-level analysis uses mean ratings of conceptual beliefs to predict variation in the mean importance of values. As the number of values was small relative to the number of predictors, we followed Bryk and Raudenbush’s (1992) recommendation to first fit submodels based on theoretical criteria and then compute an overall model using only the significant predictors from the submodels.
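The data preparation implied by the two levels of analysis can be sketched as follows: each rating is centered around the mean for its value (the individual-level predictor), and the per-value means are kept as the group-level predictor. This is only an illustration of the centering step, not the HLM procedure itself; the function name and the ratings are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def center_within_value(ratings):
    """ratings: list of (value_name, rating) pairs, one per participant rating.

    Returns the ratings centered around each value's own mean, plus the
    per-value means that serve as group-level predictors."""
    by_value = defaultdict(list)
    for value_name, rating in ratings:
        by_value[value_name].append(rating)
    value_means = {name: mean(rs) for name, rs in by_value.items()}
    centered = [(name, rating - value_means[name]) for name, rating in ratings]
    return centered, value_means


# HYPOTHETICAL human nature ratings from two participants for two values.
human_nature = [("freedom", 5), ("freedom", 7), ("wealth", 2), ("wealth", 4)]
centered, value_means = center_within_value(human_nature)
print(centered)
print(value_means)
```

Centering in this way separates the two questions the text distinguishes: whether a person who rates a value above that value's average on a belief also rates it as more important, and whether values with higher average belief ratings are more important on average.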
Hence, three submodels were fitted first, using (a) natural-kind indicators and human nature, (b) nominal-kind indicators, and (c) artifact indicators. The model shown in Table 5 includes only those predictors that were significant in these submodels. Variance estimates were computed using equations from Snijders and Bosker (1999). Table 5 shows that human nature beliefs predicted value importance at both the individual and group levels in both Australian and Japanese cultures. First, individuals who regarded particular values as central to human nature were more likely to rate those values as important. Second, values seen as central to human nature by the group as a whole were seen as relatively important in that group. At the individual level only, beliefs emphasizing the purposes of values also predicted value importance in both samples. For the Australian sample only, the natural-kind belief expert analysis predicted the importance at the individual level. Although this was not a significant predictor in the Japanese sample, a supplementary analysis including expert analysis as a predictor (B = .05; 95% CI [−.02, .12]) indicated that it was not significantly different from the coefficient for the Australian sample.

Table 5
Predicting the Importance Ratings of Values at Individual and Group Levels

Australia
Value               Coefficient (95% CI)    df     t         Explained variance
Individual level                                             13.5%
  Human natureᵃ     0.32 (0.23, 0.42)       1196   6.88***
  Purposes          0.15 (0.06, 0.24)       1196   3.22***
  Expert analysis   0.10 (0.02, 0.17)       1196   2.51*
Group level                                                  24.9%
  Human nature      1.61 (0.44, 2.77)       19     2.76*

Japan
Value               Coefficient (95% CI)    df     t         Explained variance
Individual level                                             8.6%
  Human natureᵃ     0.12 (0.05, 0.18)ᵃ      1735   3.55***
  Purposes          0.13 (0.06, 0.21)       1735   3.58***
Group level                                                  32.3%
  Human nature      1.47 (0.54, 2.40)       19     3.15**

ᵃ 95% confidence interval (CI) estimates differ between Australian and Japanese samples.
* p < .05. ** p < .01. *** p < .005.

² As the gender composition of the samples differed in Australia and Japan, differences identified in cultures may be attributable to gender. Hence, we computed t tests to test for gender differences for ratings of the six indicators on the 21 values within each culture (total of 126 tests per culture). At α = .05, gender differences would be expected to be found by chance between six and seven times. Significant gender differences were found four times in the Japanese sample and eight times in the Australian sample. This lack of reliable gender difference in ratings suggests that gender composition is not responsible for differences found across cultures.
³ Typically, multilevel models are used to group people into categories, such as students into classes. The present model does not group people into categories but rather individual ratings of values into group ratings of values.
⁴ A typical multilevel model used in this analysis is shown below (assessing artifact indicators), allowing for a random intercept (U0) and random slopes in the independent variables (U1 and U2):
Level-1 model (individual):
  Value importance = B0 + B1 × (functions) + B2 × (purposes) + R
Level-2 model (group):
  B0 = G00 + G01 × (functions[mean]) + G02 × (purposes[mean]) + U0
  B1 = G10 + U1
  B2 = G20 + U2
Variation in regression slopes is assessed by chi-square tests on the random slope parameter for each Level-1 predictor (in this case, U1 and U2). As these were not significant in any of the submodels, they were removed from subsequent analyses (in this case, resulting in B1 = G10 and B2 = G20). This results in the model estimating a single parameter to describe the relationship between conceptual beliefs and value importance for all values at Level 1.
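Substituting the Level-2 equations of Footnote 4 into the Level-1 equation gives a single combined equation, which may make the fixed and random parts of the model easier to see. The subscripts i (an individual rating) and j (a value) are notation added here for illustration and do not appear in the footnote; the overbar stands for the mean rating of a value, written as [mean] in the footnote.

```latex
\begin{align*}
\text{Importance}_{ij}
  &= G_{00}
   + G_{01}\,\overline{\text{functions}}_{j}
   + G_{02}\,\overline{\text{purposes}}_{j} \\
  &\quad + (G_{10} + U_{1j})\,\text{functions}_{ij}
   + (G_{20} + U_{2j})\,\text{purposes}_{ij}
   + U_{0j} + R_{ij}
\end{align*}
```

When the random slopes are dropped, as the footnote describes, the terms U1 and U2 vanish, so G10 and G20 are single slopes shared by all 21 values, while G01 and G02 carry the group-level relationships.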
Discussion This study showed that the conceptual space for values could be described using a natural-kind/artifact belief dimension, and a nominal-kind belief dimension. These dimensions were remarkably consistent across Australian and Japanese samples, as were the values that scored high and low on these dimensions. However, these dimensions were largely orthogonal to the centrality of values to human nature. Human nature beliefs were most consistently related to value importance. At the individual level, human nature beliefs were a significant predictor of value importance in both cultures, although the confidence intervals suggest that this relationship was somewhat stronger in Australia than in Japan. At the group level, human nature beliefs were the only significant predictor of value importance in both cultures. It is perhaps most remarkable that cultures were similar in the types of conceptual beliefs that predicted value importance: human nature at both individual and group levels, and purposes at the individual level. Overall, this study clearly shows that human nature beliefs are associated with value importance, predicting the importance of individual values and the relative importance of values in Australia and Japan. Although a subtle cultural difference in the strength of
associations was found, the findings demonstrate that similar conceptual beliefs form a basis for value importance across these two cultures.
Study 3 Studies 1 and 2 demonstrated that human nature beliefs were consistently associated with the importance of human values, suggesting that these beliefs might provide a psychological foundation for value importance. In Study 3, we conducted a stronger test of this hypothesis by manipulating human nature beliefs and observing the effects on value importance. If the belief that a value is underpinned by human nature contributes to its importance, then inculcating or strengthening this belief should lead to higher ratings of that value’s importance. Moreover, given that values do not exist in isolation but are linked in compatible and incompatible relationships with other values within people’s value systems (S. H. Schwartz, 1992, 1994), we examined whether manipulating human nature beliefs about a value could also produce changes in the wider value system, specifically in the importance of an incompatible value. That is, by leading people to believe that a value is based in human nature, not only should the importance of that (target) value increase, but the importance of an incompatible value should decrease (and vice versa).
Method Participants. Data were collected from 143 participants (59 male, 76 female, 8 not indicated), who were approached in public areas on the campus of an Australian university and were asked to complete a short 5–10-min survey. Sixty-two percent were born in Australia, 14% in Asia, 14% in Europe, and 10% elsewhere. Procedure. The survey was titled “Reactions to Advances in Scientific Research in Psychology,” printed on two sides of a sheet of paper. On the front page, a short paragraph outlined the nature–nurture debate in psychology and stated that, although in most cases these factors combine to produce behavior, recent scientific advances in some areas have resolved these controversies in favor of nature or nurture. The study was presented as examining how the general public accepts such conclusions from the scientific community and was followed by a paragraph that summarized research in favor of a certain value being either part of human nature, or not being part of human nature but rather a social/cultural product. Four values
were used: obedience, ambition, power, and helpfulness, taken from S. H. Schwartz’s (1992, 1994) value domains of conformity, achievement, power, and benevolence. Obedience and ambition were treated as incompatible values, as were power and helpfulness. Each paragraph was headed in boldface type: “Is [value] due to nature or society?” followed by a general description of research supporting one or the other view. An example of a nature statement is “A number of recent articles in prominent journals in psychology and sociology have demonstrated that [value] is a universal aspect of human experience, as it has been shown to be valued in a diverse array of cultures, and across very different historical periods.” An example of a society statement is “A number of recent articles in prominent journals in psychology and sociology have demonstrated that [value] is a value that depends on a complex system of social rewards and punishments.” A final, boldfaced sentence in each paragraph concluded that “the weight of evidence is firmly in favor of the view that [value] is an integral part of our human nature” (human nature manipulation), or “it can now be safely concluded that [value] is not an integral part of our human nature, with the weight of evidence firmly in favor of the view that it arises from social and cultural norms” (society manipulation). On the reverse side of the page, participants were requested to make ratings of the target value (i.e., the value about which human nature beliefs were manipulated), the incompatible value, and four other filler values on the value rating scale from the SVS (S. H. Schwartz, 1992). They then rated an item asking about the plausibility of the research findings (“The conclusions arising from this research are plausible”) and several filler items on 7-point Likert scales. The final section asked for demographic variables, including previous university-level study in psychology. 
On completion of the survey, but before it was collected, participants were debriefed about the study and told that they did not have to return the survey. No participant declined to return the survey.
Results and Discussion
As the nature of the experimental manipulation required that participants know little about actual psychological research, participants who had studied psychology at university for a year or more (n = 20) were excluded from analyses. Five participants who rated the findings as implausible (ratings of less than 3 for the item “The conclusions arising from this research are plausible”) were also excluded, and data were missing from one case, leaving a total sample of 117. These cases were approximately equally spread across the four values (ambition = 27; helpfulness = 31; obedience = 30; power = 29). A repeated-measures ANOVA was conducted, with the importance ratings of the target value and the incompatible value as a within-participant variable (value type), and target value (each of the four values was used as a target) and research finding (human nature vs. not human nature) as between-participants variables. The expected interaction was found between the research finding and value type, F(1, 109) = 7.32, p < .01. When target values were presented as part of human nature, they were rated as more important (M = 4.12, SE = 0.22) than their incompatible values (M = 3.49, SE = 0.21), and when they were presented as not part of human nature they were rated as less important (M = 3.63, SE = 0.23) than incompatible values (M = 3.98, SE = 0.22). Simple effects analyses (using one-tailed tests of the directional hypotheses) showed that, as expected, values were rated more important when they were presented as part of human nature than as a social product, t(110) = 1.68, p < .05, and that incompatible values were rated as more important when the target value was presented as a social product than as part of human nature,
t(109) = 1.87, p < .05. The difference between the importance ratings of target and incompatible values depended on which value was the target, F(3, 109) = 36.20, p < .001. Helpfulness and ambition were generally rated as more important than their incompatible values regardless of whether they were the target or incompatible value. These results confirm the findings from Study 2 by demonstrating a causal relationship between human nature beliefs and value importance. Moreover, they extend these findings by showing that not only do human nature beliefs influence the importance of values, but they also influence the importance of other values in people’s value systems in expected ways.
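The sample accounting and the direction of the interaction can be re-derived from the counts and cell means reported for Study 3. The following is a minimal sketch of that arithmetic only; the variable names are illustrative, the label "society" is shorthand for the not-human-nature condition, and the code does not reproduce the ANOVA or its significance tests.

```python
# Sample accounting, using the counts reported in the text.
recruited = 143
excluded_psychology = 20   # studied psychology at university for a year or more
excluded_implausible = 5   # plausibility rating below 3
missing = 1
analyzed = recruited - excluded_psychology - excluded_implausible - missing

# Cases per target value, as reported.
cell_counts = {"ambition": 27, "helpfulness": 31, "obedience": 30, "power": 29}

# Reported mean importance ratings, keyed by (research finding, value type).
means = {
    ("human nature", "target"): 4.12,
    ("human nature", "incompatible"): 3.49,
    ("society", "target"): 3.63,
    ("society", "incompatible"): 3.98,
}


def target_advantage(finding):
    """Mean importance of the target value minus that of the incompatible value."""
    return means[(finding, "target")] - means[(finding, "incompatible")]


nature_gap = target_advantage("human nature")
society_gap = target_advantage("society")

# The interaction corresponds to the difference between the two gaps.
interaction_contrast = nature_gap - society_gap

print(analyzed, sum(cell_counts.values()))
print(round(nature_gap, 2), round(society_gap, 2), round(interaction_contrast, 2))
```

The target value's advantage is positive under the human nature manipulation and negative under the society manipulation, which is the crossover pattern that the reported interaction describes.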
Study 4
Having demonstrated that conceptual beliefs about values may underlie their importance and reactions to value trade-offs, in Study 4 we addressed a further consequence of holding conceptual beliefs about values: Does holding some conceptual beliefs about values make people more susceptible to social influence? To address this question, we examined the relationship between conceptual beliefs about values and the perceived credibility of value-laden rhetoric. Although values are a common tool of mass influence in the public domain (Gamson & Modigliani, 1989), surprisingly little research in social psychology has investigated rhetoric about values (except for how the use of kin terms influences processing of family-values rhetoric; Garst & Bodenhausen, 1996). However, value-laden rhetoric has received more attention in sociology, communication science, and political science, in which research has concentrated more on the effects of the presentation (or framing) of values on how people apply values to an issue (e.g., Brewer, 2002, 2003; Cloud, 1998; Nelson, Clawson, & Oxley, 1997). For example, Brewer (2002) showed that people were more likely to explain their views on an issue (e.g., gay rights) using the values invoked in a rhetorical presentation. Although this shows that rhetoric about values can influence the dimensions of debate, research is lacking about what makes value-laden rhetoric credible, which is crucial as rhetoric is intended to directly influence people’s opinions. In this study, we tested two hypotheses about the link between conceptual beliefs and the effectiveness of rhetorical justifications. One is the “belief-rhetoric match” hypothesis. In line with research on framing, canny politicians (or their speechwriters) may construct rhetorical justifications of values designed to resonate with, or match, the audience’s beliefs.
If this strategy is effective, rhetorical statements should appear more credible to the extent that they correspond to people’s beliefs about a value. That is, a positive relationship would be expected between natural-kind beliefs and the credibility of statements presenting a value as a natural kind, between artifact beliefs and the credibility of an artifact-type statement, and so on. Alternatively, if people hold certain values dear on the basis of a particular conceptual belief, those values may be seen to be important regardless of the rhetorical justification given. Rather, rhetoric may simply provide explicit reasons for justifying values that are judged important on other implicit grounds. We propose that this may be especially the case for human nature beliefs. For example, values such as liberty and freedom may implicitly be
Table 6
Multilevel Model Parameters Predicting the Credibility of Natural Kind, Nominal Kind, and Artifact Statements

Statement type and predictor    Coefficient (95% CI)      df      T          Explained variance
Natural kind
  Individual level                                                           11.7%
    Human nature                0.11 (0.05, 0.18)         1251    3.60***
    People’s opinions           0.09 (0.03, 0.15)         1251    3.12***
  Group level                                                                67.6%
    Human nature                0.96 (0.65, 1.27)         18      6.15***
    General survey              −0.85 (−1.47, −0.23)      18      −2.73*
Artifact
  Individual level                                                           22.9%
    Human nature                0.17 (0.10, 0.24)         847     5.02***
    Expert analysis             0.11 (0.05, 0.17)         847     3.90***
  Group level                                                                68.2%
    Human nature                1.18 (0.75, 1.62)         17      5.40***
    Purposes                    −1.27 (−2.01, −0.54)      17      −3.49***
    General survey              −1.06 (−1.94, −0.68)      17      −2.39*
Nominal kind
  Individual level                                                           25.9%
    Functions                   0.14 (0.07, 0.22)         988     3.45***
  Group level                                                                56.5%
    Human nature                0.40 (0.10, 0.69)         17      4.23***
    People’s opinions           −1.02 (−1.78, −0.25)      17      −3.96***
    General survey              1.84 (1.00, 2.69)         17      5.57***

* p < .05. *** p < .005.
seen as central to human nature—inherent and inalienable to humans—and therefore function as self-evident truisms (Maio & Olson, 1998). To the extent that a value is believed to reflect human nature, any rhetorical justification may therefore be seen as credible. Hence, this human nature hypothesis proposes a positive relationship between human nature beliefs and the credibility of rhetorical statements, regardless of the nature of these claims (e.g., invoking nominal- vs. natural-kind beliefs).
Method
Participants in this study were the same as the Australian sample in Study 2 (the Japanese sample did not complete this section because of time constraints). We constructed three types of rhetorical justification that corresponded to the three types of conceptual beliefs about values (natural kind, artifact, and nominal kind):
“[Value] is a value that is right and true and unchanging for all people everywhere” (natural kind).
“[Value] is a value because it tends to promote happiness” (artifact).
“[Value] is a value that is a product of a cultural process” (nominal kind).
Each justification was presented at the top of a page, with the 21 values listed below in randomized order. Participants were instructed to imagine that a person had made the statement about each value and to rate how credible they would find that statement on a 5-point scale (1 = not at all credible, 3 = moderately credible, and 5 = very credible).
Results and Discussion
Multilevel modeling was used to examine the relationship between conceptual beliefs and statement credibility at both the individual and group levels. A separate model was fitted for each of the three statements. In all models, the dependent variable was
the credibility rating of statements, with mean ratings of credibility for each value used at the group level (models were essentially the same as in Footnote 1 but with statement credibility as the dependent variable). Predictor variables were as follows: (a) human nature ratings of values, (b) the two natural-kind indicators, (c) the two artifact indicators, and (d) the two nominal-kind indicators. Means for each predictor on each value were used for estimating the group-level model. As in Study 2, four submodels were fitted first, each containing one of the four groups of aforementioned predictors. In Table 6, we report final models that were calculated using only significant predictors from the submodels. Regression slopes were initially allowed to vary, but as these did not vary significantly across values, random slope terms for individual-level predictors were omitted. Table 6 shows that clear support was found for the human nature hypothesis, as human nature beliefs predicted statement credibility for all three statement types both at the individual and group levels. The only exception was for the nominal-kind statement at the individual level. The “match” hypothesis (i.e., people evaluate as more credible those statements that match their conceptual beliefs) was not supported. Few relationships consistent with the match hypothesis were identified, and some significant relationships were in the opposite direction. Support for this hypothesis came from the positive association between human nature and the credibility of the natural-kind statement and between the nominal-kind indicator “general survey” and nominal-kind statement credibility at the group level. However, no relationships were found between other natural-kind indicators and the natural-kind statement; an association opposite to expectations was found for the artifact indicator “purposes” and credibility of the artifact statement at the group
level; and the nominal-kind indicator “people’s opinions” was also negative, opposite to expectations. Although the present results do not conclusively refute the match hypothesis, as there was only a single rhetorical statement of each type, they do lend credence to the human nature hypothesis. Different types of rhetorical justifications for values were seen as more credible to the extent that those values were believed to be more central to human nature. This suggests that the effectiveness of political rhetoric is likely to depend more on the values public figures choose to invoke than on the specific justifications used to support them. Specifically, the selection of values that are more central to human nature may lead us to accept rhetorical claims less critically. This finding provides an avenue for extending framing research on political rhetoric by examining whether invoking values that are more central to human nature leads people to more readily accept a rhetorical position.
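The distinction between the group level (means for each value) and the individual level (variation among participants rating the same value) can be made concrete with a small sketch. It is not the HLM analysis reported in Table 6: the ratings are hypothetical, the helper names are illustrative, and the use of within-value deviations with ordinary least squares is an assumption chosen only to show how the two levels separate.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format records: (participant, value, human_nature, credibility).
ratings = [
    ("p1", "freedom", 6, 5), ("p2", "freedom", 5, 4), ("p3", "freedom", 7, 5),
    ("p1", "power", 2, 2), ("p2", "power", 3, 1), ("p3", "power", 1, 2),
    ("p1", "honesty", 5, 4), ("p2", "honesty", 6, 4), ("p3", "honesty", 4, 3),
]


def value_means(records):
    """Average the predictor and the outcome within each value (group level)."""
    by_value = defaultdict(list)
    for _, value, human_nature, credibility in records:
        by_value[value].append((human_nature, credibility))
    return {
        value: (mean(h for h, _ in pairs), mean(c for _, c in pairs))
        for value, pairs in by_value.items()
    }


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


group = value_means(ratings)

# Group level: one point per value, mean credibility against mean human nature.
group_slope = ols_slope(
    [h for h, _ in group.values()],
    [c for _, c in group.values()],
)

# Individual level: deviations from each value's own means, pooled across values.
dev_x = [h - group[v][0] for _, v, h, _ in ratings]
dev_y = [c - group[v][1] for _, v, _, c in ratings]
individual_slope = ols_slope(dev_x, dev_y)

print(round(group_slope, 3), round(individual_slope, 3))
```

In this toy data both slopes are positive, but they answer different questions: the group-level slope asks whether values rated as more central to human nature attract more credible rhetoric on average, whereas the individual-level slope asks whether, for a given value, participants who see it as more central to human nature also find the statement more credible.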
General Discussion
This article examined people’s conceptual beliefs about human values and their relationships with value importance and value-related phenomena. Using a newly developed measure of conceptual beliefs, we found that values were characterized by a combination of nominal-kind and artifact beliefs and were seen as relatively central to human nature. People’s beliefs about values themselves differed along three dimensions: (a) the extent to which they exhibit artifact and natural-kind beliefs, two types of belief which overlap for values; (b) the extent to which they are seen as nominal kinds; and (c) their centrality to human nature. Human nature beliefs emerged as the most critical predictor of the importance of values for both individual differences in the importance of particular values and for the relative importance of values within two cultures. Study 3 showed that these beliefs exerted a causal influence on value importance. Human nature beliefs were also associated with people’s evaluations of value trade-off situations and predicted the credibility of rhetoric about values. Other conceptual beliefs were relevant in specific areas (e.g., artifact beliefs for the importance of values, nominal-kind beliefs for approval of value trade-offs), but none of these rivaled human nature beliefs in the consistency and magnitude of their effects. The fact that human nature emerged as a key belief about values, whereas traditional natural kind beliefs did not, suggests that the construal of nature is different in social and physical domains, which is consistent with theories that propose broader distinctions between them (e.g., Hirschfeld, 1995; Keil, 1995). The significant role of human nature beliefs suggests that this concept warrants more detailed exploration. For instance, does human nature relate to something only humans possess or to basic characteristics that we might share with other creatures (Haslam et al., 2005; Kagan, 2004)?
Rationality, agency (free will), consciousness, and intentionality are also candidates for human nature that can be gleaned from philosophy and religion (Stevenson & Haberman, 1998), which may be translated into lay theories. Unpacking people’s implicit conceptions of human nature is a crucial further step in understanding the importance and use of values. The domain distinction in the construal of natural kinds may also have implications for domain-general measures of beliefs such as the RIC questionnaire. One of the major strengths of the
RIC is its applicability across domains, even beyond the range of concepts investigated in these studies. It was effective in characterizing people’s beliefs about values in terms of these other concepts, but the beliefs it measured were of limited relevance to the importance and use of values. This could be because these beliefs are indeed not associated with value importance, value trade-offs, or value-laden rhetoric, but it may also be that these beliefs are relevant yet need to be measured in a domain-specific way. Although we had less reason to suspect that artifact and nominal-kind beliefs would differ across domains, it would be useful to create measures of these beliefs specifically for the social domain, to establish whether this increases their relevance for values. As such social domain-specific measures have not yet been developed, researchers may have to rely on face validity rather than discriminant validity until more is known about conceptual beliefs in the social domain. Other issues need to be considered when using an instrument like the RIC. First, it is a highly intellectual and challenging task and may not be suitable for all populations. Second, there is a limit to the number of concepts that can be measured before fatigue and administration time become factors, and so the RIC is best suited to a focused investigation of a small number of concepts, which is one reason why we did not investigate all of the values identified by S. H. Schwartz (1992). Third, as a context-free measure, it lacks many contextual features that may be useful in understanding concepts such as values. However, it is an extremely useful technique for understanding how a set of conceptual beliefs applies to different concepts.
For instance, the MDS analysis of conceptual beliefs in Figure 1 indicates that a unidimensional continuum of concepts proposed by Keil (1989), with natural and nominal kinds opposed and artifacts in the middle, may oversimplify distinctions between concepts, as a second dimension specifically of artifact beliefs is important in describing differences between these concepts. Although some evidence of cultural moderation in relationships between beliefs and values was observed, what was most notable was the extent of similarity across cultures in the dimensions of beliefs that characterize values, in the location of values along these dimensions, and in the relationships between beliefs and value importance. In both cultures, there was overlap in natural-kind and artifact beliefs, with similar values occupying each end of this continuum. In both cultures, this dimension of beliefs was independent of nominal-kind beliefs, with the values that typified nominal-kind beliefs showing more variation across cultures. The importance of values, for both the importance of individual values and for their relative importance within each cultural sample, was predicted consistently by human nature beliefs. Such cross-cultural similarities for values may be a product of similar socialization, could arise from endemic cultural processes that result in a similar endpoint (see Cohen, 2001), or may reflect universal cognitive structures. Each of these explanations provides possibilities for further cultural research in beliefs about values. The socialization explanation would be supported if pancultural representations of human values, particularly relating them to human nature, were identified through, for instance, content analyses of public documents and speeches. The socialization explanation would also be supported if there were international conventions on values, similar to those for human rights (i.e., the Universal Declaration of Human Rights; United Nations, 1948),
but such conventions do not appear to exist outside academia. The endemic explanation would be the most challenging and difficult to investigate, as it would involve in-depth analyses of the cultural processes operating in each culture and their effects on values. The universal cognition explanation would be supported if human nature beliefs were found to operate similarly in a wider range of cultures, independent of societal features such as industrialization. The findings that human nature beliefs had a variety of implications for the psychology of values demonstrate that these beliefs are worth studying in greater detail. As the implications studied involved reactions to the use of values by others, one promising extension of this research stream would be to examine how human nature beliefs influence people’s own actions, such as their use of values in negotiation or in conflict and mediation settings. For example, the present findings suggest that, when values are salient in these settings, human nature beliefs would be associated with the types of trade-offs offered and even predict impasses in negotiations (e.g., when someone attempts to negotiate an issue relating to a value central to human nature). The findings on rhetoric credibility were also noteworthy. One disturbing implication is that when we believe a value is part of human nature, we may be susceptible to influence by religious, political, or organizational leaders who invoke that value to further their own aims. Thus, further examination of these implications of our beliefs about values is warranted. On what basis do people determine the relative importance of their values? These studies provide a promising response to this question, which until now has hardly been addressed in the values literature. 
Human nature beliefs were found to be associated with the importance of values for individuals and groups across two cultures, and they were found to have practical implications for people’s reactions to value-laden situations. These findings extend the work of Maio and colleagues (e.g., Maio & Olson, 1998) by showing that, in spite of lacking explicit reasons for values, people have more implicit beliefs that contribute to why values are important and how they are used. Further examination of these beliefs and their implications represents a promising way of advancing knowledge about how and why people use values as guiding principles in their lives.
References
Aristotle. (1967). The ethics of Aristotle: The Nichomachean Ethics (J. A. K. Thomson, Trans.). London: Penguin.
Baron, J., & Leshner, S. (2000). How serious are expressions of protected values? Journal of Experimental Psychology: Applied, 6, 183–194.
Baron, J., & Spranca, M. (1997). Protected values. Organizational Behavior and Human Decision Processes, 70, 1–16.
Barton, M. E., & Komatsu, L. K. (1989). Defining features of natural kinds and artifacts. Journal of Psycholinguistic Research, 18, 433–447.
Bernard, M. M., Maio, G. R., & Olson, J. M. (2003). Effects of introspection about reasons for values: Extending research on values-as-truisms. Social Cognition, 21, 1–25.
Bloom, P. (1996). Intention, history, and artifact concepts. Cognition, 60, 1–29.
Bloom, P. (1998). Theories of artifact categorization. Cognition, 66, 87–93.
Boyd, R. (1999). Homeostasis, species, and higher taxa. In R. A. Wilson (Ed.), Species: New interdisciplinary essays. Cambridge, MA: MIT Press.
Brewer, P. R. (2002). Framing, value words, and citizens’ explanations of their issue opinions. Political Communication, 19, 303–316.
Brewer, P. R. (2003). Values, political knowledge, and public opinion about gay rights: A framing-based account. Public Opinion Quarterly, 67, 173–201.
Bryk, A. S., & Raudenbush, S. W. (1992). Hierarchical linear models: Applications and data analysis methods. Newbury Park, CA: Sage.
Cloud, D. L. (1998). The rhetoric of “family values”: Scapegoating, Utopia, and the privatization of social responsibility. Western Journal of Communication, 62, 387–419.
Cohen, D. (2001). Cultural variation: Considerations and implications. Psychological Bulletin, 127, 451–471.
Collingwood, R. G. (1960). The idea of nature. London: Oxford University Press.
Diesendruck, G. (2001). Essentialism in Brazilian children’s extensions of animal names. Developmental Psychology, 37, 49–60.
Gamson, W. A., & Modigliani, A. (1989). Media discourse and public opinion on nuclear power: A constructionist approach. The American Journal of Sociology, 95, 1–37.
Garst, J., & Bodenhausen, G. V. (1996). “Family values” and political persuasion: Impact of kin-related rhetoric on reactions to political campaigns. Journal of Applied Social Psychology, 26, 1119–1137.
Gelman, S. A. (2003). The essential child: Origins of essentialism in everyday thought. New York: Oxford University Press.
Gelman, S. A., & Hirschfeld, L. A. (1999). How biological is essentialism? In D. L. Medin & S. Atran (Eds.), Folkbiology (pp. 403–446). Cambridge, MA: MIT Press.
Haslam, N., Bain, P., Douge, L., Lee, M., & Bastian, B. (2005). More human than you: Attributing humanness to self and others. Journal of Personality and Social Psychology, 89, 937–950.
Haslam, N., Bastian, B., & Bissett, M. (2004). Essentialist beliefs about personality and their implications. Personality & Social Psychology Bulletin, 30, 1661–1673.
Haslam, N., Rothschild, L., & Ernst, D. (2000). Essentialist beliefs about social categories. British Journal of Social Psychology, 39, 113–127.
Haslam, N., Rothschild, L., & Ernst, D. (2002). Are essentialist beliefs associated with prejudice? British Journal of Social Psychology, 41, 87–100.
Hirschfeld, L. A. (1995). Anthropology, psychology, and the meanings of social causality. In D. Sperber, D. Premack, & A. J. Premack (Eds.), Causal cognition: A multidisciplinary debate (pp. 313–350). New York: Clarendon Press/Oxford University Press.
Hirschfeld, L. A. (1996). Race in the making: Cognition, culture, and the child’s construction of human kinds. Cambridge, MA: MIT Press.
Howard, J. A. (1994). A social cognitive conception of social structure. Social Psychology Quarterly, 57, 210–227.
Kagan, J. (2004). The uniquely human in human nature. Daedalus, 133, 77–88.
Kalish, C. (2002). Gold, jade, and emeruby: The value of naturalness for theories of concepts and categories. Journal of Theoretical & Philosophical Psychology, 22, 45–66.
Keil, F. C. (1989). Concepts, kinds, and cognitive development. Cambridge, MA: MIT Press.
Keil, F. C. (1995). The growth of causal understandings of natural kinds. In D. Sperber, D. Premack, & A. J. Premack (Eds.), Causal cognition: A multidisciplinary debate (pp. 234–267). New York: Oxford University Press.
Komatsu, L. K. (1992). Recent views of conceptual structure. Psychological Bulletin, 112, 500–526.
Leyens, J.-P., Cortes, B., Demoulin, S., Dovidio, J. F., Fiske, S. T., Gaunt, R., et al. (2003). Emotional prejudice, essentialism, and nationalism. European Journal of Social Psychology, 33, 703–717.
Leyens, J.-P., Paladino, P. M., Rodriguez-Torres, R., Vaes, J., Demoulin, S., Rodriguez-Perez, A., et al. (2000). The emotional side of prejudice: The attribution of secondary emotions to ingroups and outgroups. Personality & Social Psychology Review, 4, 186–197.
Leyens, J.-P., Rodriguez-Perez, A., Rodriguez-Torres, R., Gaunt, R., Paladino, M.-P., Vaes, J., et al. (2001). Psychological essentialism and the differential attribution of uniquely human emotions to ingroups and outgroups. European Journal of Social Psychology, 31, 395–411.
Locke, J. (1975). An essay concerning human understanding. Oxford, England: Clarendon Press. (Original work published 1700)
Maio, G. R., & Olson, J. M. (1998). Values as truisms: Evidence and implications. Journal of Personality and Social Psychology, 74, 294–311.
Malt, B. C. (1990). Features and beliefs in the mental representation of categories. Journal of Memory and Language, 29, 289–315.
Malt, B. C., & Johnson, E. C. (1992). Do artifact concepts have cores? Journal of Memory and Language, 31, 195–217.
Medin, D. L., & Ortony, A. (1989). Psychological essentialism. In S. Vosniadou & A. Ortony (Eds.), Similarity and analogical reasoning (pp. 179–195). New York: Cambridge University Press.
Medin, D. L., & Schwanenflugel, P. J. (1981). Linear separability in classification learning. Journal of Experimental Psychology: Human Learning and Memory, 355–368.
Nelson, T. E., Clawson, R. A., & Oxley, Z. M. (1997). Media framing of a civil liberties conflict and its effect on tolerance. American Political Science Review, 91, 567–583.
Putnam, H. (1975). The meaning of “meaning.” Mind, language and reality: Philosophical papers (Vol. 2, pp. 215–271). Cambridge, United Kingdom: Cambridge University Press.
Raudenbush, S. W., Bryk, A. S., & Congdon, R. (2001). HLM for Windows (Version 5.04). [Computer software]. Lincolnwood, IL: Scientific Software International.
Rips, L. J. (1989). Similarity, typicality, and categorization. In S. Vosniadou & A. Ortony (Eds.), Similarity and analogical reasoning (pp. 21–59). New York: Cambridge University Press.
Rokeach, M. (1973). The nature of human values. New York: The Free Press.
Rosch, E., Mervis, C. B., Gray, W. D., Johnson, D. M., & Boyes-Braem, P. (1976). Basic objects in natural categories. Cognitive Psychology, 8, 382–439.
Rothbart, M., & Taylor, M. (1992). Category labels and social reality: Do we view social categories as natural kinds? In G. R. Semin & K. Fiedler (Eds.), Language, interaction, and social cognition (pp. 11–36). Thousand Oaks, CA: Sage.
Schwartz, S. H. (1992). Universals in the content and structure of values: Theoretical advances and empirical tests in 20 countries. In M. P. Zanna (Ed.), Advances in experimental social psychology (Vol. 25, pp. 1–65). San Diego, CA: Academic Press.
Schwartz, S. H. (1994). Are there universal aspects in the structure and contents of human values? Journal of Social Issues, 50, 19–45.
Schwartz, S. H., & Sagiv, L. (1995). Identifying culture-specifics in the content and structure of values. Journal of Cross-Cultural Psychology, 26, 92–116.
Schwartz, S. H., & Struch, N. (1989). Values, stereotypes, and intergroup antagonism. In D. Bar-Tal, C. F. Graumann, A. W. Kruglanski, & W. Stroebe (Eds.), Stereotyping and prejudice: Changing conceptions (pp. 151–167). New York: Springer-Verlag.
Schwartz, S. P. (1978). Putnam on artifacts. The Philosophical Review, LXXXVII, 566–574.
Smetana, J. G. (1981). Preschool children’s conceptions of moral and social rules. Child Development, 52, 1333–1336.
Smetana, J. G. (1985). Children’s impressions of moral and conventional transgressors. Developmental Psychology, 21, 715–724.
Snijders, T. A. B., & Bosker, R. J. (1999). Multilevel analysis: An introduction to basic and advanced multilevel modeling. London: Sage.
Stevenson, L., & Haberman, D. L. (1998). Ten theories of human nature (3rd ed.). New York: Oxford University Press.
Struch, N., & Schwartz, S. H. (1989). Intergroup aggression: Its predictors and distinctness from in-group bias. Journal of Personality and Social Psychology, 56, 364–373.
Tetlock, P. E., & Goldgeier, J. M. (2000). Human nature and world politics: Cognition, identity, and influence. International Journal of Psychology, 35, 87–96.
Tetlock, P. E., Kristel, O. V., Elson, S. B., Green, M. C., & Lerner, J. S. (2000). The psychology of the unthinkable: Taboo trade-offs, forbidden base rates, and heretical counterfactuals. Journal of Personality and Social Psychology, 78, 853–870.
Tetlock, P. E., McGraw, A. P., & Kristel, O. V. (2004). Proscribed forms of social cognition: Taboo trade-offs, blocked exchanges, forbidden base rates, and heretical counterfactuals. In N. Haslam (Ed.), Relational models theory: A contemporary overview (pp. 247–262). Mahwah, NJ: Erlbaum.
Tisak, M. S. (1995). Domains of social reasoning and beyond. In R. Vasta (Ed.), Annals of Child Development (Vol. 11, pp. 95–130). Philadelphia: Jessica Kingsley.
Turiel, E. (1998). The development of morality. In N. Eisenberg (Ed.), Handbook of child psychology (5th ed.). Vol. 3: Social, emotional, and personality development (pp. 863–932). New York: Wiley.
United Nations. (1948). Universal declaration of human rights. New York: Author.
Wrightsman, L. S. (1992). Assumptions about human nature: Implications for researchers and practitioners (2nd ed.). Newbury Park, CA: Sage.
Yzerbyt, V., Corneille, O., & Estrada, C. (2001). The interplay of subjective essentialism and entitativity in the formation of stereotypes. Personality & Social Psychology Review, 5, 141–155.
Appendix A
The Resolving Issues About Concepts (RIC) Questionnaire, With Examples of Object and Value Concepts and Descriptors Used for Each Belief Indicator

Object example: Imagine that an object was found that looks like a mirror, and there is a dispute about whether it really is a mirror. One view is that it is, and another is that it is not. If you were asked to decide which view was more justified, and the following sources were available to you to help you decide, how useful would each be?

Value example: Imagine there is a dispute about whether someone’s conduct reflects the human value of being ambitious. One view is that it does, and another is that it does not. If you were asked to decide which view was more justified, and the following sources were available to you to help you decide, how useful would each be?

Indicator type | Type of information | Descriptor
Natural kind | An analysis of the (object/conduct) from an expert in the area | Expert analysis
Natural kind | Information about whether the underlying characteristics of the (object/conduct) are consistent with (a mirror/being ambitious) | Underlying characteristics
Artifact | Information about whether the (object/conduct) fulfills the purposes of (a mirror/being ambitious) | Purposes
Artifact | Information about whether the functions performed by (a mirror/being ambitious) and the (object/conduct) are consistent | Functions
Nominal kind | Opinions on this issue from people around you | People’s opinions
Nominal kind | Results from a survey of the general public on this issue | General survey
Appendix B
Value Trade-Off Scenario Used in Study 1

The director of an economic policy-making body has been asked by the government to recommend ways of economically stimulating the local textile and manufacturing industries, which are believed to need a policy overhaul due to poor economic performance. The policy-making body has a track record of having its policies accepted by the government, and it is likely that their policy recommendations in this case will also be successful. The policy-making body has reviewed the practices of other countries, taken submissions from interested parties, and conducted economic simulations of different policies. They decided on the policy that was best for the economy and then conducted an extensive survey within the affected industries to investigate the likely effects of the new policy on workers. Their major finding about the effects of the proposed policy was that workers would have to give up a lot of [VALUE] in their lives. In other countries where similar policies are actually in place, they found research that consistently showed that people felt their levels of [VALUE] were very low because of the employment duties that are required of them by the policy. The head of the policy-making body, Steven Wilson, has ultimate authority to recommend this policy to the government or to direct his staff to come up with an alternative proposal. Steven read the policy proposal prepared by his staff, which outlined both the economic benefits and the major costs in terms of [VALUE].
[Protect Value, Easy Decision] Steven felt the decision was a very easy one, and decided quickly. He did not recommend the policy to the government, and in his briefing to his staff to produce an alternative proposal he wrote: “. . . giving up [VALUE] is not an acceptable price for people in the industry to pay for stimulating these areas economically.”
[Sacrifice Value, Difficult Decision] Steven felt the decision was an extremely difficult one, requiring a great deal of contemplation and consideration. In the end, he recommended the policy to the government, and in his briefing to the government he wrote: “. . . giving up [VALUE] is an acceptable price for people in the industry to pay for stimulating these areas economically.”
Received December 3, 2003
Revision received February 28, 2006
Accepted March 20, 2006