RESEARCH METHODOLOGY IN STRATEGY AND MANAGEMENT
RESEARCH METHODOLOGY IN STRATEGY AND MANAGEMENT

Series Editors: David J. Ketchen, Jr. and Donald D. Bergh

Volume 1: Research Methodology in Strategy and Management. Edited by David J. Ketchen, Jr. and Don D. Bergh
Volume 2: Research Methodology in Strategy and Management. Edited by David J. Ketchen, Jr. and Don D. Bergh
RESEARCH METHODOLOGY IN STRATEGY AND MANAGEMENT, VOLUME 3

RESEARCH METHODOLOGY IN STRATEGY AND MANAGEMENT

EDITED BY

DAVID J. KETCHEN, JR.
College of Business, Auburn University, USA

DONALD D. BERGH
Daniels College of Business, University of Denver, USA

Amsterdam – Boston – Heidelberg – London – New York – Oxford
Paris – San Diego – San Francisco – Singapore – Sydney – Tokyo

JAI Press is an imprint of Elsevier
JAI Press is an imprint of Elsevier
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, UK
Radarweg 29, PO Box 211, 1000 AE Amsterdam, The Netherlands
525 B Street, Suite 1900, San Diego, CA 92101-4495, USA

First edition 2006

Copyright © 2006 Elsevier Ltd. All rights reserved

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means electronic, mechanical, photocopying, recording or otherwise without the prior written permission of the publisher.

Permissions may be sought directly from Elsevier's Science & Technology Rights Department in Oxford, UK: phone (+44) (0) 1865 843830; fax (+44) (0) 1865 853333; email: [email protected]. Alternatively you can submit your request online by visiting the Elsevier web site at http://elsevier.com/locate/permissions, and selecting Obtaining permission to use Elsevier material.

Notice
No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. Because of rapid advances in the medical sciences, in particular, independent verification of diagnoses and drug dosages should be made.

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

ISBN-13: 978-0-7623-1339-6
ISBN-10: 0-7623-1339-0
ISSN: 1479-8387 (Series)

For information on all JAI Press publications visit our website at books.elsevier.com

Printed and bound in The Netherlands
06 07 08 09 10 10 9 8 7 6 5 4 3 2 1
CONTENTS

LIST OF CONTRIBUTORS ix

INTRODUCTION xiii

SCHOLARSHIP THAT ENDURES
Sumantra Ghoshal 1

THEORY, PRACTICE, AND SCHOLARSHIP
Raymond E. Miles and Charles C. Snow 11

CONSTRUCTS AND CONSTRUCT MEASUREMENT IN UPPER ECHELONS RESEARCH
Mason A. Carpenter and Gregory P. Reilly 17

SURVEYING THE CORPORATE ELITE: THEORETICAL AND PRACTICAL GUIDANCE ON IMPROVING RESPONSE RATES AND RESPONSE QUALITY IN TOP MANAGEMENT SURVEY QUESTIONNAIRES
Michael K. Bednar and James D. Westphal 37

MANAGERIAL CONSTRAINT: THE INTERSECTION BETWEEN ORGANIZATIONAL TASK ENVIRONMENT AND DISCRETION
Brian K. Boyd and Steve Gove 57

ASSESSING THE EXTERNAL ENVIRONMENT: AN ENRICHMENT OF THE ARCHIVAL TRADITION
C. Chet Miller, dt ogilvie and William H. Glick 97

ANALYSIS OF EXTREMES IN MANAGEMENT STUDIES
Joel A. C. Baum and Bill McKelvey 123

THE ROLE OF FORMATIVE MEASUREMENT MODELS IN STRATEGIC MANAGEMENT RESEARCH: REVIEW, CRITIQUE, AND IMPLICATIONS FOR FUTURE RESEARCH
Nathan P. Podsakoff, Wei Shen and Philip M. Podsakoff 197

INDIVIDUALS AND ORGANIZATIONS: THOUGHTS ON A MICRO-FOUNDATIONS PROJECT FOR STRATEGIC MANAGEMENT AND ORGANIZATIONAL ANALYSIS
Teppo Felin and Nicolai Foss 253

RIGOR AND RELEVANCE USING REPERTORY GRID TECHNIQUE IN STRATEGY RESEARCH
Robert P. Wright 289

STUDYING THE DYNAMICS OF REPUTATION: A FRAMEWORK FOR RESEARCH ON THE REPUTATIONAL CONSEQUENCES OF CORPORATE ACTIONS
Matthew S. Kraatz and E. Geoffrey Love 343

AN ASSESSMENT OF THE USE OF STRUCTURAL EQUATION MODELING IN INTERNATIONAL BUSINESS RESEARCH
G. Tomas M. Hult, David J. Ketchen, Jr., Anna Shaojie Cui, Andrea M. Prud'homme, Steven H. Seggie, Michael A. Stanko, Alex Shichun Xu and S. Tamer Cavusgil 385
LIST OF CONTRIBUTORS

Joel A. C. Baum
Rotman School of Management, University of Toronto, Toronto, Canada
Michael K. Bednar
Department of Management, The University of Texas at Austin, Red McCombs School of Business, Austin, TX, USA
Brian K. Boyd
W.P. Carey School of Business, Arizona State University, Tempe, AZ, USA
Mason A. Carpenter
School of Business, University of Wisconsin–Madison, Madison, WI, USA
S. Tamer Cavusgil
Eli Broad Graduate School of Management, Michigan State University, East Lansing, MI, USA
Anna Shaojie Cui
Department of Marketing and Supply Chain Management, Michigan State University, MI, USA
Teppo Felin
Organizational Leadership & Strategy, Marriott School of Management, Brigham Young University, Provo, UT, USA
Nicolai Foss
Center for Strategic Management and Globalization, Copenhagen Business School, Denmark
Sumantra Ghoshal†
London Business School, London, UK
William H. Glick
Jones Graduate School of Management, Rice University, Houston, TX, USA
Steve Gove
Department of Management and Marketing, University of Dayton, Dayton, OH, USA
G. Tomas M. Hult
Center for International Business Education and Research, Eli Broad Graduate School of Management, Michigan State University, East Lansing, MI, USA
David J. Ketchen, Jr.
College of Business, Auburn University, Auburn, AL, USA
Matthew S. Kraatz
College of Business, University of Illinois at Urbana-Champaign, Champaign, IL, USA
E. Geoffrey Love
College of Business, University of Illinois at Urbana-Champaign, Champaign, IL, USA
Bill McKelvey
The Anderson School of Management at UCLA, Los Angeles, CA, USA
Raymond E. Miles
Haas School of Business, University of California, Berkeley, CA, USA
C. Chet Miller
Babcock Graduate School of Management, Wake Forest University, Winston-Salem, NC, USA
dt ogilvie
Rutgers Business School – Newark and New Brunswick, Rutgers University, Newark, NJ, USA
Philip M. Podsakoff
Department of Management, Indiana University, Bloomington, IN, USA
Nathan P. Podsakoff
Department of Management, University of Florida, Gainesville, FL, USA
Andrea M. Prud’homme
Department of Marketing and Supply Chain Management, Michigan State University, MI, USA
Gregory P. Reilly
School of Business, University of Wisconsin–Madison, Madison, WI, USA
Steven H. Seggie
Department of Marketing and Supply Chain Management, Michigan State University, MI, USA
Wei Shen
Department of Management, University of Florida, Gainesville, FL, USA
Charles C. Snow
Department of Management and Organization, The Pennsylvania State University, University Park, PA, USA
Michael A. Stanko
Department of Marketing and Supply Chain Management, Michigan State University, MI, USA
James D. Westphal
Department of Management, Red McCombs School of Business, The University of Texas at Austin, Austin, TX, USA
Robert P. Wright
Department of Management and Marketing, Faculty of Business, Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
Alex Shichun Xu
Department of Marketing and Supply Chain Management, Michigan State University, MI, USA
INTRODUCTION

Welcome to the third volume of Research Methodology in Strategy and Management. This book series' mission is to provide a forum for critique, commentary, and discussion about key research methodology issues in the strategic management field. Strategic management relies on an array of complex methods drawn from various allied disciplines to examine how managers attempt to lead their firms toward success. The field is undergoing a rapid transformation in methodological rigor, and researchers face many new challenges about how to conduct their research and in understanding the implications that are associated with their research choices. For example, as the field progresses, what new methodologies might be best suited for testing the developments in thinking and theorizing? Many long-standing issues remain unresolved as well. What methodological challenges persist as we consider those matters? This book series seeks to bridge the gap between what researchers know and what they need to know about methodology. We seek to provide wisdom, insight, and guidance from some of the best methodologists inside and outside the strategic management field.

Before we discuss the contents of this volume, let us briefly reflect on its predecessor. Volume 2 debuted at the 2005 Academy of Management meeting in Hawaii. The volume was showcased in a symposium sponsored by the Business Policy and Strategy division. We were thrilled that our large meeting room was filled to capacity. We believe this turnout reflects a strong desire by strategy researchers to improve their methodology skills, a desire we hope this book series is serving capably. We want to thank Phil Bromiley, Scott Johnson, Kent Miller, Caron St. John, Devi Gnyawali, Beverly Tyler, James Combs, T. Russell Crook, and Christopher Shook for offering excellent presentations of their chapters. We hope that Volume 3 also will be well received.

The volume you hold in your hands offers 12 chapters. Sumantra Ghoshal provided the lead chapter shortly before his untimely death. His chapter encourages researchers to aspire to conduct research that offers wisdom that endures, using the work of Raymond E. Miles and Charles C. Snow as an exemplar. Sumantra's writing once again beckons us to think critically and comprehensively about our research, and he challenges us to find the courage to pursue big and broad
questions and to develop useful frameworks within the realities of our institutional and individual constraints. In what is a testimony to his important legacy, the chapter exhorts us to seek enduring scholarship and to embark on a long and meaningful intellectual journey that will help make the world a better place. We are deeply indebted to Professor Ghoshal's perspective and drive to help us define, challenge, and improve our lives.

Next, Miles and Snow react to Ghoshal's insights by discussing their views on the links among theory, practice, and scholarship. They also take the reader "behind the scenes" of the formation of their ideas on strategy that have profoundly shaped the strategy field since the 1970s. Taken together, we believe these companion pieces offer budding and experienced scholars alike many great ideas to guide their research. In remembrance of Sumantra Ghoshal and his many significant contributions to research and practice, the editors and the publisher are donating a portion of the proceeds from this volume to the scholarship fund set up in his name within the Business Policy and Strategy division.

The volume includes two other pairs of chapters that each center on a general theme. The first of these themes is the study of top managers. Mason A. Carpenter and Gregory P. Reilly focus on three key tasks: how to identify upper echelons constructs, how to embed such constructs to build theory, and how to operationalize such constructs. This chapter comprehensively reviews how upper echelons research has been conducted, and it provides a practical and user-friendly process for improving how we study this very important subject. Eliciting strong response rates in surveys of top managers has long been notoriously difficult. In response, Michael K. Bednar and James D. Westphal outline some empirically based guidance for maximizing the effectiveness of top management surveys. They delve into important survey design factors and show how researcher decisions influence the rate and quality of survey responses. This interesting chapter gives practical advice about what to do and what not to do when surveying subjects that tend to have low response histories. We believe that any researcher attempting to study top managers would benefit considerably by following the action plans laid out in these two chapters.

Understanding the environment is the second theme that has attracted two chapters. Brian K. Boyd and Steve Gove describe key conceptual and methodological issues surrounding the investigation of task environments and managerial discretion. Drawing on a review of published studies and original data analysis, they offer important suggestions to guide future research. In sum, the chapter offers a great primer for those interested in the organization/environment interface. C. Chet Miller, dt ogilvie, and William H. Glick seek
to improve how researchers assess the environment using archival measures. Although there is a long tradition of using archival sources to measure the environment, problems involving constitutive definitions and mismatches between constitutive and operational definitions plague this work. In response, Miller et al. clarify existing definitions and propose new ones where needed. They extend our understanding of the archival tradition and provide us with new ideas about how to study a central and critical factor in the field of strategic management. Overall, we are confident that absorbing the insights offered by the above array of scholars about three key issues (enduring scholarship, top managers, and the environment) will prove quite valuable to our colleagues.

The remaining chapters address six diverse topics of importance. Joel A. C. Baum and Bill McKelvey discuss how extreme value theory can enhance management research. As the authors note, the field tends to fixate on central tendencies, but studying unusual events is often more valuable to efforts to understand organizations. They provide detailed and interesting insight into perspectives such as power laws and the concept of normal extreme behavior. These authors offer perhaps the most interesting set of examples we have ever encountered in one chapter, and we are certain that you will be intrigued by the ideas they offer. More generally, they offer very relevant insights into explaining unusual outcomes, a critically relevant and important factor in the field of strategic management.

Like Carpenter and Reilly, Nathan P. Podsakoff, Wei Shen, and Philip M. Podsakoff focus on construct validity. However, the latter authors' focus is on the strategy field as a whole. These authors provide a thorough and expansive discussion of measurement models within the field, identifying common problems and highlighting methods for improving measurement and the integration between concept and variable. They argue convincingly that many key constructs in strategic management would be better modeled as having formative indicators rather than as having reflective indicators. This insight has important implications not just for future research, but also for interpreting the findings of past research.

In Volume 2 of this series, Caron St. John laid out some foundational insights on the role of levels of analysis within strategic management research. In what could be viewed as a companion piece to St. John's chapter, Teppo Felin and Nicolai Foss develop what they refer to as the microfoundations of strategic management. They seek to enhance strategic management theory by explicitly acknowledging that "organizations are made up of individuals, and there is no organization without individuals." The chapter provides new insights into the organizational capabilities concept,
and creatively applies Coleman's General Model of Social Explanations to evaluate management research. The result is a very lively treatise offering novel ideas that promise to greatly enrich future theory development.

While building theory among scholars is vital, understanding the conceptual frameworks used by managers to make sense of their reality is also important. Repertory grid technique is a valuable tool to tap into these frameworks, but few strategy scholars possess detailed knowledge of the technique. In response, Robert P. Wright provides ample examples and guidance on how to use repertory grid technique. He notes that concepts traditionally linked with a cognitive perspective pose important methodological challenges, and he describes several accessible approaches for testing such dimensions.

Next, Matt Kraatz and Geoffrey Love focus on corporate reputation, a topic of growing conceptual and practical importance. The norm for corporate reputation research has been to explain the antecedents of reputation through static or cross-sectional models. In response, Kraatz and Love discuss how to study reputation in a dynamic, longitudinal fashion, as well as how to link changes in corporate reputation to other structures and practices that firms adopt over time (e.g., governance changes, downsizing). They provide insightful recommendations that not only apply to reputation but can also guide researchers interested in understanding intangible assets in general.

Volume 2 of this series offered a chapter on entrepreneurship that signaled that our intended domain reaches beyond the strategy field. Similarly, the final chapter in this volume takes seriously the "and Management" portion of our book series' title. G. Tomas M. Hult and his co-authors enlighten us about the use of structural equation modeling in the international business field. There are interesting parallels between the current development of the international area and that of strategic management research a couple of decades ago. Just as with strategy in the 1980s, international business tackles issues of practical importance, but critics question its theoretical robustness and methodological rigor. The chapter explains the strengths and weaknesses of extant practices, chronicles the effects of certain decisions, and offers suggestions for improvement. As with one of the chapters in Volume 1 of our series, this chapter is the product of a doctoral seminar led by the first author. Such projects are not only valuable to the field but also provide excellent learning experiences for doctoral students.

Overall, the work offered in this volume strives to gently prod the field toward better use of method and theory. If the advice offered is followed, the
consequence will be an enhanced capability to understand how organizations behave and perform. We hope your research benefits from these chapters as much as we enjoyed working with their respective authors. We are very grateful to all of the contributors for their insights and efforts.

David J. Ketchen, Jr.
Donald D. Bergh
Editors
SCHOLARSHIP THAT ENDURES

Sumantra Ghoshal†

As academics, we collectively publish thousands of articles and hundreds of books each year. We spend a large part of our lives producing them – sacrificing, in the process, sleep, time with our families, reading things we want to read, seeing places we wish to see. Most of these books and articles soon vanish without a trace, helping us get tenure perhaps, but taking with them into oblivion very large parts of the best years of our lives. Few – very few – of the outputs of our intellectual endeavors endure. What is it that distinguishes scholarship that endures from scholarship that does not?

Painters hone their skills by copying grandmasters. Doctors and lawyers often learn by observing the greats of their profession at work. Scientists, too, analyze the work of those who have blazed new trails, in order to learn from their styles. In the same spirit, I describe some of the features of the research presented by Professors Raymond Miles and Charles Snow (M&S, in future references) in their book Organizational Strategy, Structure and Process (OSSP in future references) that, to my mind, still remain valid as hallmarks of "scholarship that endures."
THE ADAPTIVE CYCLE AND THE ORGANIZATIONAL TYPOLOGY

Most readers of this chapter are very familiar with OSSP. The book's basic proposition is that successful companies need to develop consistency among their strategy, the business model they adopt, including the choice of technology, and their organizational capability including human resource
practices. The basic framing of the problem in OSSP is not a static analysis of fit but rather the dynamic problem of adaptation. How do organizations adapt to changing environments? Why do adaptive failures occur? These were the starting questions for M&S.

They saw the adaptive process as consisting of three sets of problems that a company has to solve in a mutually consistent way (see Fig. 1). The first is what they described as the entrepreneurial problem, focused on the choice of product-markets that the company would serve. In an established company, this choice is constrained by the existing activities of the firm: Should it remain within its historical domain, or should it venture out to exploit new opportunities that may lie outside that domain? The second problem in the adaptation process is what M&S described as the engineering problem, involving "the creation of a system which puts into actual operation management's solution to the entrepreneurial problem." In other words, the company has to select an appropriate business model including a choice of technology.
[Fig. 1. The Adaptive Cycle (reproduced from Fig. 2.1 in OSSP). The cycle comprises the Entrepreneurial Problem (choice of product-market domain; selection of areas for future innovation), the Engineering Problem (choice of technologies for production and distribution), and the Administrative Problem (rationalization of structure and processes).]
The administrative problem relates to the structure and processes of the organization: how to build the organizational arrangements necessary to efficiently implement existing activities without jeopardizing the company's ability to create new activities in response to evolving market demands.

Perhaps the most enduring and insightful contribution of M&S lies in their developing a categorization scheme for organizations based on how they respond to the adaptive challenge. In contrast to most work on strategic and organizational "archetypes," which tends to be grounded in static analysis of organizational snapshots, M&S's scheme is based on a view of organizations as "integrated wholes in dynamic interaction with their environments." Basically, they argued that organizations can create the coherence in their strategy, structure, and processes that is needed for effective adaptation in some distinct ways. While theoretically a very large number of different permutations and combinations of the different dimensions was possible, M&S's field research in companies revealed only four archetypes.

Defenders are organizations that focus on a well-defined and narrow market in which they attempt to remain competitive by constantly improving their efficiency and productivity. Prospectors focus on constant innovation. They search for new market opportunities and tend to be highly flexible and entrepreneurial. Analyzers are more complex organizations that combine some aspects of both Defenders and Prospectors. They operate in some businesses that are relatively stable, in which they compete on the Defender's strength of efficiency and low cost, and others that are dynamic, in which they act like Prospectors, searching for innovations. Finally, Reactors are organizations that lack consistency in their strategy, structure, and processes. They are, therefore, unable to create a coherent and sustainable position for themselves, and unable to respond effectively to environmental change, unless somehow forced to do so by external forces.

What is remarkable about this categorization scheme is the comprehensiveness of the organizational attributes – across strategic orientation, organizational features, and management processes – that it captures (see Table 1). It is in the comprehensiveness of the scheme that the M&S framework distinguishes itself from other categorization schemes that relate primarily to the strategy of firms, with limited insights into the associated organizational attributes, such as Michael Porter's generic strategies,1 or to organizational structural attributes, with limited attention to either strategy or process, as in Oliver Williamson's distinctions among U, H, M, and corrupted M forms,2 or even in schemes that span strategy and organization, such as Henry Mintzberg's or Danny Miller's typologies.3
Table 1. The Four Organizational Archetypes (Summarized from Tables 3.1, 4.1, and 5.1 in OSSP).

Entrepreneurial problem
  Defender: Focused and narrow market segment; limited environment scanning; focus on efficiency and productivity; incremental growth through deeper penetration within segment; some product development close to existing products.
  Prospector: Broad market with continuous development; broad environment scanning; growth through product and market development; acquisition to obtain new technologies.
  Analyzer: Multiple markets, some stable others changing; marketing and R&D oriented environmental scanning; steady growth through market penetration and new business development; tendency toward a fast follower strategy.
  Reactor: Inarticulated or ambiguous strategy with no clear direction or prioritizing; adherence to a strategic path rendered unviable by environmental changes.

Engineering problem
  Defender: Single core technology; investment in technology to improve efficiency; tendency toward vertical integration.
  Prospector: Multiple technologies; low degree of routinization; technology embedded in people.
  Analyzer: Dual technological core with stable and flexible components; large and influential applied research group; moderate degree of technical efficiency.
  Reactor: No clear business model.

Administrative problem
  Defender: Functional organization; production and financial control carry most influence; long tenure of top management; centralized control; intensive rather than extensive planning; co-ordination through standardization and scheduling; performance measurement against own history.
  Prospector: Product division organizations; marketing and R&D carry most influence; large, diverse, and transitory top management; proliferation of task forces and project teams; decentralized control; extensive rather than intensive planning.
  Analyzer: Matrix structure; marketing and applied research carry most influence; moderately centralized control with horizontal sharing of information; complex, multidimensional co-ordination; complex planning and multi-attribute performance management.
  Reactor: Organizational features not consistent with strategy; organizational features not coherent among themselves; persistence with an unsuitable strategy–structure fit.

Risks and benefits
  Defender: Difficult for competitors to dislodge, but a major shift in the market can threaten survival; unable to exploit new opportunities.
  Prospector: Effective in dynamic environments, but vulnerable to low profitability and loss of focus.
  Analyzer: Robust but needs constant review of portfolio; can be overwhelmed by internal complexity.
  Reactor: Inability to respond effectively to market changes; poor performance and potential crisis.
A JOURNEY OF ADVENTURE

What can we learn from the enduring relevance and insightfulness of OSSP, about the conduct of scholarship that endures? At one level, an academic career, like any other career, is a career. It has its usual career demands – publishing "A" journal articles to get promoted, getting enough numbers of such articles to obtain tenure, and so on. Even after tenure, the need to maintain a scholarly reputation within the field poses its own ongoing demands. Amid all these demands of the career, it is all too easy to forget what the profession is all about. I believe the profession is all about enduring scholarship.

Admittedly, very few of us are likely to make contributions that will maintain their relevance and salience for over 25 years, as OSSP has done. Therefore, at the individual level, it is perhaps inadvisable to make such contributions the test of one's own worth, because that will only set most of us up for disappointment or failure. Besides, there is considerable value in contributing the incremental pieces that are always useful in the process of knowledge development, even if they are quickly lost from sight as the intellectual edifice builds on top of them. But, at the level of the overall profession, it is enduring scholarship that counts. It is these bursts of insights and ideas that move academic fields forward and that create the energy and momentum that keep the rest of us engaged. And, even at the individual level, while it may be unwise to make it the goal, is it not true that the possibility of making such contributions was what seduced most of us into the profession in the first place?

What does it take? What can OSSP tell us about the path that those who make such contributions follow? In describing himself and his work, Sigmund Freud wrote (p. 297):

You often estimate me too highly. I am not really a man of science, not an experimenter, and not a thinker. I am nothing but by temperament a conquistador – an adventurer, if you want to translate the word – with the curiosity, the boldness, and the tenacity that belong to that type of being. Such people are apt to be treasured if they succeed, if they really discover something; otherwise they are thrown out. And that is not altogether unjust.4
This, it seems to me, is a necessary, though by no means sufficient, condition for enduring scholarship: the courage and curiosity to embark on a long intellectual journey. And it is precisely such a journey that M&S undertook. It started with a study of 16 college textbook publishers that led to some initial ideas about the different organizational types. The process was
inductive and, being so, the data were insufficient to provide a complete specification for each type. A follow-up study of three of those publishers completed the models. But these were all firms in the same industry. To look for inter-industry comparisons and also some empirical validation, the authors turned to their Berkeley colleagues, Henry Coleman Jr. and Alan Meyer, who went out to the field to collect data from 22 electronics firms and 27 food processing companies (Coleman), and then from 19 voluntary hospitals (Meyer).

Although OSSP was published in 1978, the journey continued over the following two decades, as the authors kept on probing into the same question of organizational adaptation.5 Each step in the journey has produced ideas but has also left gaps that have led to the next step, with genuine and relentless curiosity as the driving engine. This is essentially a model of research as detective work that combines information with imagination to search for ideas and insights – much more in line with the recommendations of Karl Weick than with the doctrine of Karl Popper.6

I emphasize this if only because so many of us today, when asked what we are working on, list a diverse range of "papers" that we are writing, typically with different co-authors, perhaps on a variety of relatively unrelated topics. There is little adventure in this process, and little sustained curiosity. Such an approach to research, while perhaps necessary for meeting career needs or personal preferences for variety, is unlikely to create enduring contributions.

Freud highlighted the risks of adventurers: if they do not discover something important, they are thrown out. To untenured colleagues, the prospect of working for a long period of time on one topic with highly uncertain outcomes may give the words "thrown out" some very tangible and immediate meaning. What then can those willing to undertake such adventures do to enhance the likelihood that something worthwhile will really be found?
THE COURAGE TO PURSUE BIG, BROAD QUESTIONS

Exciting adventures require big, broad, and relatively uncharted territories to explore. Again, this does not assure success, but its obverse fails to qualify as adventure. If scholarship that endures is a product of intellectual adventure, then it must start with similarly broad issues to address. M&S started with a set of fundamental and enduring questions. Indeed, if I try to recall any work in the strategy field that has endured over many decades – Chandler (1962), Rumelt (1974), Mintzberg (1973) – this is the one feature all of them shared.7 Each of these scholars identified – and then had
the courage or the foolhardiness to pursue – big, broad questions. It is also not true that they took up such challenges only in their post-tenure years: two of the three pieces I have identified as examples were doctoral theses.

One consequence of focusing on such enduring questions and issues was that each of these authors ended up presenting a broad framework rather than a tightly argued causal theory. M&S highlight the importance of such frameworks in their introduction to the recently published Stanford Business Classics reissue of OSSP:

Of utmost importance is the need for a broad, flexible conceptual framework. Such a framework serves several purposes: it helps to classify and explain what is already known, it allows you to interpret the work of others, and it guides your future research. A truly flexible framework is also never complete; it can always be modified and extended to make it more useful. Indeed, it is the conceptual framework itself that is probably of most value to other researchers, not any single idea derived from, or study spawned by, the framework.
I recognize that such pursuit of big, broad questions runs directly counter to some of the conventions of our career. First, it requires a degree of tolerance for ambiguity and incompleteness that can easily be viewed as a lack of rigor and as a sign of a poor intellect. Second, such work is hard to publish in journals – indeed, as M&S wrote in the preface of OSSP, "During the course of this process, it became clear that we could not compress the entire framework and its related research evidence into a single article or monograph. Therefore, we decided that a book would be more appropriate …" How many of today's promotion review committees will accept such an explanation while reviewing a tenure case?

Beyond the issue of limited space in journal articles, there is also the associated problem of methodology. In exploring big, complex issues, it is hard to maintain methodological purity – particularly if one is unwilling to pay the price of extreme reductionism. As Robert Sutton and Barry Staw wrote (p. 383), "We ask the reader to consider whether the evidence provided by people such as Freud, Marx or Darwin would meet the empirical standards of the top journals in organizational research."8 In the same vein one can ask, would any top journal today publish the key arguments of OSSP, even if M&S found a way to dice up the overall story?

In his book Scholarship Reconsidered, Ernest Boyer described four different kinds of scholarship: the scholarship of discovery (research), the scholarship of integration (synthesis), the scholarship of practice (application), and the scholarship of teaching (pedagogy).9 Over the last two decades, we have collectively narrowed the definition of scholarship to only that of research, and have excluded from it the other three categories. Although the focus on
discovery has yielded great benefits, one of the costs has been the delegitimization of the scholarship of integration. OSSP is, in essence, an outstanding example of the scholarship of integration, and it demonstrates the enormous value of creative and parsimonious synthesis in academic fields connected with professional practice. If we celebrate this book, then surely we would like to see more work of this kind. To do so, perhaps we need to once again legitimize the role of books – not only as complements to, but also, in exceptional cases, as substitutes for, journal articles. Beyond that, perhaps we also need to create a new journal – to be called the Academy of Management Essays, perhaps – as a fully legitimate outlet for the scholarship of integration.
MAKING THE WORLD A BETTER PLACE

All of us live in two worlds. One is our own world, within the academic community, in which we derive great joy from speaking to one another – whether physically or metaphorically, through our papers. The language of these conversations is theory, and its grammar is shaped by rules of logical and empirical rigor. The other world is the real world – of companies, managers, employees, consumers, regulators, students and … our children, most of whom will be residents of that world. The two worlds sometimes connect, as when theory addresses a real-world issue. A lot of the time, however, we keep the two worlds apart.

Enduring scholarship always exists at the intersection of these two worlds. Without theory, there is no scholarship; without the real world, there is no endurance. The ultimate purpose of all scholarship is to help, directly or indirectly, make the world a better place, often by first making it a better understood place. In this sense, scholarship that divorces the endeavor of building positive theory from normative purposes is unlikely to endure. I know that there are many in our profession who disagree with this proposition. In response to them, I offer a quote from James Coleman, perhaps the most pre-eminent sociologist of the last quarter century (p. 14):

The rational reconstruction of society … is now upon us in full force … It is the task of sociologists to aid in that reconstruction, to bring to it the understanding of social processes, to ensure that this reconstruction of society is not naïve but sophisticated; to ensure, one might say, that it is indeed a rational reconstruction of society.10
What Coleman asserted about the tasks of sociology is doubly true for all of us involved in the Academy of Management. This is indeed the promise to
society, on the strength of which all other diverse scholarly interests related to business and management have derived their legitimacy, and because of which the Academy has enjoyed its enormous expansion. The promise is not one of merely observing, describing, or even explaining. It is a promise to help in the reconstruction – in other words, a promise to build an intellectual grounding that can guide the actions of those responsible for effective performance of key social institutions, including companies.

OSSP is normative theory. It is grounded in a rich understanding of the existing literatures, and is informed through the observation of several institutions. But its goal is to help managers understand their existing situations, and to aid them in improving the effectiveness of their organizations. It is not normative in the sense of providing a simple and universal prescription that cures all ills; it is normative in providing perspective, a basis for diagnosis, and a method for reflection on further action.

There are some very practical career benefits that can accrue from this form of scholarship: it allows the scholar to harmonize the different aspects of his or her professional work. While focusing on research and publications, most of us also do some teaching, and perhaps some consulting. Typically, these aspects of our professional work tend to be unconnected. For M&S, however, not only were they connected, but the connections were self-reinforcing. As they wrote, in describing the process of writing the book, "we have used portions of these materials in classes at several universities, with managers in university executive development programs, and in private consulting activities with top-management groups. The verbal and written responses of these individuals were more than adequate encouragement to keep us at our task."

In the world of business schools, this is an increasingly rare form of scholarship – one based on creating a tight link between research, teaching, and consulting. The conventional advice is exactly to the contrary. Yet, such a synthesis is of great advantage for scholarship that attempts to illuminate and improve professional practice. And once again, if I apply the test of long-term endurance, it is this form of scholarship that I believe has the best chance of being remembered and celebrated 25 years after publication.
NOTES

1. Porter, M.E., 1980. Competitive Strategy: Techniques for Analyzing Industries and Competitors. New York: The Free Press; Treacy, M. and F. Wiersema, 1995. The Discipline of Market Leaders. Reading, MA: Addison-Wesley.
2. Williamson, O.E., 1975. Markets and Hierarchies: Analysis and Antitrust Implications. New York: Free Press.
3. Miller, D., 1990. The Icarus Paradox: How Exceptional Companies Bring about Their Own Downfall. New York: Harper Business; Mintzberg, H., 1979. The Structuring of Organizations: A Synthesis of the Research. Englewood Cliffs, NJ: Prentice-Hall.
4. See Jones, E., 1964. The Life and Work of Sigmund Freud. London: Penguin Books.
5. Miles, R.E. and C.C. Snow, 1986. "Network Organizations: New Concepts for New Forms," California Management Review, Vol. 28, 66–73; Miles, R.E. and C.C. Snow, 1994. Fit, Failure and the Hall of Fame: How Companies Succeed or Fail. New York: The Free Press.
6. Popper, K.R., 1968. The Logic of Scientific Discovery. New York: Harper & Row; Weick, K.E., 1989. "Theory Construction as Disciplined Imagination," Academy of Management Review, Vol. 14, No. 4, 516–531.
7. Chandler, A.D., Jr., 1962. Strategy and Structure: Chapters in the History of the American Industrial Enterprise. Cambridge, MA: The MIT Press; Mintzberg, H., 1973. The Nature of Managerial Work. New York: Harper & Row; Rumelt, R., 1974. Strategy, Structure and Economic Performance. Boston, MA: Division of Research, Harvard Business School.
8. Sutton, R.I. and B.M. Staw, 1995. "What Theory is Not," Administrative Science Quarterly, Vol. 40, 371–384.
9. Boyer, E.L., 1990. Scholarship Reconsidered: Priorities of the Professoriate. Princeton, NJ: The Carnegie Foundation for the Advancement of Teaching.
10. Coleman, J.S., 1992. "The Rational Reconstruction of Society," American Sociological Review, Vol. 58, 1–15.
THEORY, PRACTICE, AND SCHOLARSHIP

Raymond E. Miles and Charles C. Snow

In theory, there is no difference between theory and practice. In practice, there is.
– Yogi Berra
While you are pondering this observation on the relationship between theory and practice, let us note that Sumantra Ghoshal has been saying essentially the same thing about theories of management for over a decade now (e.g., Ghoshal & Moran, 1996). His most insightful observations are contained in Ghoshal (2005), where he points out that not only are theory and practice inextricably intertwined, but flawed or "bad" theories can even do damage to good management practices. We are honored and pleased that Ghoshal, who was one of the world's great organization theorists, has cited our work as exemplary scholarship – "good" theory if you will.

Ghoshal identifies four main reasons why he believes that the research reported in our 1978 book has endured. First, he says that we had "… the courage and curiosity to embark on a long intellectual journey." We agree with his observation, though we hasten to add that at the time our curiosity far exceeded our courage. Miles wanted to widen the context of his research on individual managers' philosophies of management to include organization design and development, and Snow (his doctoral student) was interested in the firm as a unit of analysis. Thus, our research collaboration was a natural partnership – Snow was the student learning how to do research, and Miles was expanding his research focus to include the entire organization.
Ghoshal's second observation concerns the scope of the issues addressed by our research: "If scholarship that endures is a product of intellectual adventure, then it must start with similarly broad issues to address." In the year before the Miles–Snow research collaboration began, Snow served as the research assistant to Professor Charles Perrow, who was on sabbatical in Berkeley at the time and writing his book Organizational Analysis: A Sociological View (Perrow, 1970). While searching for examples for Perrow's book, as well as taking a seminar from him, Snow became intrigued with the idea of how organizations interacted with their various environments and the role that top managers' perceptions played in the process. Approaching Miles with the request to serve as his dissertation chair, and after many intellectual conversations that ranged across the topics of managerial decision making, organizational change and development, and management ideologies, Snow came to believe that his dissertation research was about organizational adaptation.

Ghoshal notes that Miles and Snow "… ended up presenting a broad framework rather than a tightly argued causal theory," and we are especially pleased that he attaches as much theoretical importance to our adaptive cycle – a dynamic construct embodying ideas drawn from the work of Chandler (1962), Lawrence and Lorsch (1967), Thompson (1967), and Weick (1977) – as he does to the organizational typology. We were fortunate that two other doctoral students, Alan Meyer and Henry Coleman, arrived at Berkeley soon after we had developed our overall theoretical framework. Their dissertation research provided valuable tests and extensions of the adaptive cycle and organizational typology, and their work helped to make this the rich framework that has served us (and others) well for many years. Thus, we are not suggesting that every organizational researcher should seek to develop a broad theoretical framework. We do believe, however, that any particular piece of research should examine issues of practical significance and should be clearly embedded in a broad theoretical perspective.

Using Boyer's (1990) four types of scholarship, Ghoshal discusses a third reason for the endurance of our framework – the scholarship of integration (or synthesis). Our framework did indeed provide a measure of integration, not because we set out to do so, but rather because we needed to bridge two somewhat separate fields in order to explain our findings. For example, the emerging field of business policy portrayed a firm's strategy as an essentially unique solution to the problems posed by environmental threats and opportunities. The more established field of organization theory, on the other hand, hardly acknowledged strategy at all. Our research showed that (a) patterns in top managers' perceptions and decisions could be identified
across firms in an industry, (b) an evolving industry could support several different competitive strategies, and (c) a particular firm's strategy had to be fitted with an appropriate organization structure and set of management processes in order for the firm to succeed. These are, to use Ghoshal's words, "big, broad" ideas that span fields and stimulate further, more detailed research.

Lastly, Ghoshal says that our theoretical framework is normative: "It is not normative in the sense of providing a simple and universal prescription that cures all ills; it is normative in providing perspective, a basis for diagnosis, and a method for reflection on further action." We strongly believe, as does Ghoshal, that we are members of an applied academic discipline and that our efforts as researchers should be focused heavily on the improvement of management practice. For our current thinking on this point, see Snow et al. (2006), where we argue that the sub-field of organization design should devote more attention to issues such as formulating theories that help managers anticipate new organizational forms, and building diagnostic and investment tools that aid managers in preparing their organizations for the future.

Ghoshal's comments on our dynamic theoretical framework of strategy, structure, and process are so laudatory that we hesitate to draw attention to any omissions on his part. Nevertheless, because one piece of our framework that he did not mention is so closely linked to his own work, we feel that we should point it out. In Chapter 8 of Miles and Snow (1978), we discussed the impact of managers' beliefs and assumptions on their strategic, structural, and process choices. Using Miles' (1975) typology of management theories, we argued that both the prospector strategy and the decentralized decision-making process it required probably would not be successful unless managers use a "human resources" theory of management – that is, unless they have positive views of the motivation and capabilities of those managers and employees below them in the hierarchy and are genuinely committed to joint goal-setting and delegation. Essentially, we argued that the prevailing "traditional" or "human relations" theories constrained managers' strategic choices. Our overall argument was confirmed by Alan Meyer's research, which is reported in Chapter 13 of the book and which is the basis of his highly cited article on how management ideology affects an organization's mode of adaptation to environmental jolts or crises (Meyer, 1982).

Moreover, we have included the link between management theory and organizational form in our subsequent work. In Miles and Snow (1994), we pointed out that multi-firm network organizations required a willingness on
the part of management to invest heavily in the development of new individual skills and collective capabilities. We argued, for example, that managers who attempted to pursue innovation-based strategies supported by new structural arrangements needed a new theory of management, which we called the "human investment" model, a view that expanded the human resources perspective.

Thus, while Ghoshal has passionately described the negative impact of "bad" theory on management practice, we have argued that "good" theory allows managers more freedom to create innovative strategies, structures, and processes. Nevertheless, the theories that Ghoshal and Moran (1996) and Ghoshal (2005) cite as bad for practice – most notably, agency theory (Jensen & Meckling, 1976) and transaction cost economics (Williamson, 1975) – have endured, and progress toward the development of new organizational forms has been slow in coming. Why? Is it possible to clearly specify the factors that separate good theories from bad ones? We'd like to close our commentary by offering our thoughts on this matter.

We agree with Ghoshal's controversial point that agency theory and transaction cost economics are bad for practice – in the sense that strict adherence to the dictates and implications of those approaches can create a self-fulfilling prophecy that results in the widespread use of undesirable management practices. We also agree with his analysis of what is wrong with such theories, especially his notions of partial analysis and the exclusion of human intentionality and choice. For example, both agency theory and transaction cost economics have a narrow view of organizational behavior (e.g., shareholders are the only stakeholders of a firm or costs are the most important consideration in managerial decision making). Therefore, it seems to us, a theory can endure but still be bad if it is too narrowly defined – if it only permits the partial analysis of some important organizational or managerial phenomenon. Such a theory can be valid, and can even garner substantial empirical support, but still not fully or properly reflect management practice.

Also, both agency theory and transaction cost economics have a flawed conception of human motivation and behavior (e.g., people act only in their self-interest and often with guile). Such theories can survive, partly because some managers do in fact behave as the theories presume, but also because the theories are at least somewhat self-fulfilling (Ferraro, Pfeffer, & Sutton, 2005). To the extent that a management theory is self-fulfilling, it crowds out the development of useful management practices (as well as "good" theories) based on positive human attributes such as generosity, trusting in and caring for others, intrinsic motivation, and a desire to cooperate and
collaborate. In short, bad management theories do not adequately incorporate human volition, and they will not, as Ghoshal says, make the world a better place. Indeed, not only may they lead managers to make bad choices, but adherence to them also inhibits managers from experimenting with new and more desirable approaches.

In conclusion, perhaps it bears repeating that researchers need not begin their careers, or launch a particular study, with the objective of producing enduring scholarship. They simply need to focus on important management issues, both theoretical and practical, and study how people go about resolving them.
REFERENCES

Boyer, E. L. (1990). Scholarship reconsidered: Priorities of the professoriate. Princeton, NJ: The Carnegie Foundation for the Advancement of Teaching.
Chandler, A. D., Jr. (1962). Strategy and structure: Chapters in the history of the American industrial enterprise. Cambridge, MA: MIT Press.
Ferraro, F., Pfeffer, J., & Sutton, R. I. (2005). Economics language and assumptions: How theories can become self-fulfilling. Academy of Management Review, 30, 8–24.
Ghoshal, S. (2005). Bad management theories are destroying good management practices. Academy of Management Learning & Education, 4, 75–91.
Ghoshal, S., & Moran, P. (1996). Bad for practice: A critique of the transaction cost theory. Academy of Management Review, 21, 13–47.
Jensen, M., & Meckling, W. (1976). Theory of the firm: Managerial behavior, agency costs and ownership structure. Journal of Financial Economics, 3, 305–360.
Lawrence, P. R., & Lorsch, J. W. (1967). Organization and environment: Managing differentiation and integration. Boston: Harvard Graduate School of Business Administration.
Meyer, A. D. (1982). Adapting to environmental jolts. Administrative Science Quarterly, 27, 515–536.
Miles, R. E. (1975). Theories of management: Implications for organizational behavior and development. New York: McGraw-Hill.
Miles, R. E., & Snow, C. C. (1978). Organizational strategy, structure, and process. New York: McGraw-Hill.
Miles, R. E., & Snow, C. C. (1994). Fit, failure, and the Hall of Fame: How companies succeed or fail. New York: Free Press.
Perrow, C. (1970). Organizational analysis: A sociological view. Belmont, CA: Wadsworth.
Snow, C. C., Miles, R. E., & Miles, G. (2006). The configurational approach to organization design: Four recommended research initiatives. In: R. M. Burton, D. D. Håkonsson, B. Eriksen & C. C. Snow (Eds), Organization design: The evolving state-of-the-art. New York: Springer.
Thompson, J. D. (1967). Organizations in action. New York: McGraw-Hill.
Weick, K. E. (1977). Enactment processes in organizations. In: B. M. Staw & G. R. Salancik (Eds), New directions in organizational behavior (pp. 267–300). Chicago: St. Clair Press.
Williamson, O. E. (1975). Markets and hierarchies: Analysis and antitrust implications. New York: Free Press.
CONSTRUCTS AND CONSTRUCT MEASUREMENT IN UPPER ECHELONS RESEARCH

Mason A. Carpenter and Gregory P. Reilly

ABSTRACT

Upper echelons research considers the relationship of top executives to organizational attributes or outcomes, vis-à-vis their individual or group demographic characteristics such as tenure or experience. The upper echelons perspective is typically associated with the theorizing of Hambrick and Mason in their 1984 Academy of Management Review article, but it also has much broader and deeper organizational theory roots, as demonstrated by Pfeffer's (1983) earlier exhaustive review of organizational demography. Since the early 1980s, hundreds of upper echelons studies have been published – some explicitly invoking the upper echelons theoretical perspective, others employing its underlying methodology of relying on executive demographic characteristics as proxies for executive and top management team (TMT) related constructs. This chapter examines three important features and their related challenges and opportunities in future upper echelons research. Specifically, we focus on (1) the identification of upper echelons constructs, (2) embedding those constructs in a meaningful way to develop new theory or better our understanding of extant theory, and (3) the related operationalization and measurement of those constructs that are eventually included in qualitative and quantitative analyses using TMT demographics. We
conclude our chapter by drawing these three features together to provide a benchmark process to gauge the theoretical and methodological contributions of upper echelons-related work, and ultimately improve the chances of getting such research published.
INTRODUCTION

The purpose of this chapter is to help researchers develop new studies and get their work published when they adopt, either explicitly or implicitly, the upper echelons perspective (UEP) introduced by Hambrick and Mason (1984) (henceforth H&M). In doing so, we outline the challenges and issues related to constructs and construct measurement that must be considered by researchers when embarking upon UEP research. We will approach this task by first describing what novel research looks like, and then move on quickly to describe the UEP model and identify the important constructs that comprise it. Next, we will describe strategies for connecting these constructs to ensure the advancement of theory in the field. After that, we will turn our focus to a discussion of how construct validity can be attained in an UEP study. Finally, we will bring these topics together in a checklist intended to help future UEP researchers select interesting and appropriate constructs for study and to ensure that these constructs are operationalized with high internal and external validity.

Interesting?

Regardless of the domain, UEP or not, your research won't go beyond working paper status if it is not interesting. Is "interesting" subjective? Sure. Is "interesting" purely defined by academic fad and fashion? No. Fortunately, there is a concise and easily accessible starting point, the classic paper by Murray S. Davis (1971), for determining whether your UEP research question stands up to this test. Davis sets up his paper by stating: "It has long been thought that a theorist is considered great because his theories are true, but this is false. A theorist [or theory] is considered great, not because his theories are true, but because they are interesting" (1971, p. 309). Davis asserts a theory to be interesting if it has been given wide circulation (e.g., it is cited in other research and textbooks or taught in courses). As an introductory footnote, we'd say that the UEP would pass Davis' test, given the fact that H&M is cited in over 500 subsequent refereed journal articles, across
domains as diverse as psychology and economics, and the citation rate and domain diffusion of the UEP appear to be continuing at a healthy clip (Carpenter, Sanders, & Geletkanycz, 2004). In general terms, Davis finds that, "A new theory will be noticed only when it denies an old truth, proverb, platitude, maxim, adage, saying, commonplace, etc." (1971, p. 310). He adds, "all interesting theories, at least all interesting social theories, then, constitute an attack on the taken-for-granted world of their audience. … If it does not challenge but merely confirms one of their taken-for-granted beliefs, [the audience] will respond to it by rejecting its value while affirming its truth" (1971, p. 310). Finally, "an interesting proposition [is] always the negation of an accepted one" (1971, p. 311). In summary, then, what Davis is saying is that your UEP theory or study will be considered interesting if it challenges widely shared, but weakly held, assumptions about the world, organizations, and human behavior. Such research is not only interesting, it is also important because it helps us to better understand extant knowledge and theories, and it creates opportunities and motivation for new research.

Valuable, Inimitable, and Rare?

Recall that the UEP is typically considered within the domain of strategic management, and there too you have a useful tool for determining whether your proposed study meets the "that's interesting!" test. Specifically, a recent editorial in the Academy of Management Journal by Don Bergh (2003) walks through the resource-based view (RBV) as a tool to evaluate whether you are making a valuable contribution with your UEP study. While we encourage you to read the full text yourself, the gist of his suggestion is that your research must be considered (a) valuable (would academics and practitioners see enough value in your work to cite or apply it?), (b) inimitable (have you controlled for all reasonable competing explanations of the proposed advance, either through the logical consistency of your theoretical model or through controlling for rival theories in the research method?), and (c) rare (is your contribution novel, surprising, and unexpected?). Although we agree that there remains great promise and opportunity in UEP research, we also strongly endorse the view that "the bar has been raised such that any new UEP study that looks at the effects of demographic characteristic X on organizational outcome Y is likely to be viewed as simply reaffirming that which is already understood – namely, that executives and their demographics matter" (Carpenter et al., 2004, p. 770). In many ways, because UEP relationships are taken as given, particularly in certain quarters of
strategy research, they are no longer novel from a Davis (1971) perspective. Admittedly, replication of existing work does have tremendous academic value, and there are venues that will publish it. However, editors and reviewers at the top management journals will apply the "that's interesting!" and resource-based criteria we summarized above.
THE UPPER ECHELONS PERSPECTIVE AND ITS CONSTITUENT CONSTRUCTS

It should be clear to you that the first step in developing and publishing research that employs the UEP is to pass the "that's interesting!" hurdle. This is a very high, non-trivial hurdle, particularly if you are aiming to publish your work in one of our top academic journals. But you are a savvy social scientist, and let us assume you have done a reasonably good job of identifying an interesting research question. The second step in developing and publishing your UEP research is an understanding of the theoretical model originally developed by H&M, its constituent constructs, and how the model and UEP research have evolved in the past 20 years. Fortunately, much of this review has been undertaken for you already. If your research is a dissertation, then you would be well served by reading H&M, Pfeffer (1983), Jackson (1992), Hambrick (1992), Hambrick (1994), Finkelstein and Hambrick (1996), Carpenter et al. (2004), and the dialogue by several authors in Dansereau and Yammarino (2005, pp. 197–273) on the various multi-level analytical issues embedded in the UEP. Admittedly, this is a long laundry list of readings on the UEP, but if you are aiming for something interesting and publishable, you had best arm yourself with the requisite knowledge of the UEP's history and landscape.

H&M's original model identifies characteristics of a top management team (TMT) as important factors in the determination of overall firm performance. H&M reason that TMT members are the individuals most likely to make the decisions that have the greatest impact on firm actions and, ultimately, on firm performance outcomes. H&M proposed that the personal characteristics of these central decision makers will influence their individual problem-framing, habits, values, and biases and, in turn, affect the decision-making processes of the firm and its success or failure. An updated model provided by Carpenter et al. (2004), which accounts for UEP research to date, is presented in Fig. 1. Additionally, H&M proposed that demographic information about TMT members could be used as a proxy for their personal (and difficult to measure and access) characteristics.
Fig. 1. Carpenter, Geletkanycz, and Sanders' (2004) Stylized Model of the Upper Echelons Perspective. [Figure: antecedents in the external environment (external stakeholders, external managerial labor markets, environmental characteristics) and the organization (firm characteristics, board characteristics, internal labor markets) feed into theoretical constructs proxied by TMT demographics (skills and orientations, cognitions or social cognitions, behavioral propensities, access to information, access to resources, human capital, social capital, relative status within the TMT or across firms, and heir apparent status), which operate through moderators/mediators of TMT demographic effects (power, discretion, incentives, integration, team processes) to shape organizational outcomes: strategic (business, corporate, international, change, strategic interactions, policies), performance (financial, market, social, innovation), and TMT/board (turnover, composition).]
In fact, the ease of collection of both firm performance information and demographic information such as education, experience, age, and functional background may have contributed to the boom in UEP research over the past decades. If you have undertaken the literature review we recommended above, you will know that a typical study specifies one or more measures of TMT demographics and investigates their relationships with a measurable outcome variable (i.e., strategy, action, performance, etc.), using the logic of H&M to build a causal chain and explain why the relationship is found. While early UEP research often focused only on specifying measures for these two constructs in the model, to make a contribution to the field today researchers are more likely to expand their investigations to explicitly measure variables that represent other components of the original model. In this spirit, we present in Fig. 2 an overview of the key constructs in the UEP and their relationships to each other. This model incorporates the conclusions elaborated in the UEP review by Carpenter et al. (2004) and shown in Fig. 1. Fig. 2 is intended to serve as a construct selection guide for future UEP researchers. Specifically, it is a tool that you can use in developing your UEP research agenda. In the remainder of this section we describe potential variables for each major construct area and review examples of actual variables that have been studied in past UEP research.
TMT Demographics

Although their work may be among the best known, H&M are not the only authors, nor the first, to argue for the influence of an organization's demographics on its performance outcomes. Pfeffer's (1983) review of organizational demography illustrates the long history of this approach. He identified organizational characteristics such as growth rate, technology, and human resource practices as important antecedents of organizational demographics. Additionally, he identified overall firm performance and other outcomes (e.g., cohort conflict, form of organizational control) as being affected by demographics. Even prior to that, Song (1982) showed a relationship between TMT characteristics and firm diversification strategies. The UEP's main contributions to the existing demographic literature are its focus on the TMT (rather than on the firm or its employees as a whole) and its treatment of demographic information as a proxy for executive characteristics rather than as a mediating variable between firm characteristics and outcomes. While TMT demographic variables are likely to remain an important part of UEP research, it has already been pointed out in this chapter
that demographics, even novel ones, will be unlikely to give a study any unique advantage or novelty.

Fig. 2. Upper Echelons Constructs. [Figure: demographic proxies for TMT characteristics (the focus of early upper echelons research) stand in for TMT characteristics (individual level, group level), which feed into mediating strategy processes (planning, scanning, etc.) and decision-making processes (TMT-level, lower-level, external), then into mediating operational processes (design, execution, evolution), and finally into organizational outcomes (process, output, overall); the whole chain is embedded in context (antecedents and/or moderators) consisting of organization-level and environment-level characteristics.]
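For readers who want to see what the measurement end of this looks like in practice, the short sketch below computes the two demographic heterogeneity indices most often used in this literature: Blau's index for a categorical attribute such as functional background, and the coefficient of variation for a continuous attribute such as tenure. The roster, attribute values, and Python implementation are our own illustrative assumptions rather than anything prescribed by the UEP literature.

```python
# Hypothetical TMT roster; executives and attribute values are illustrative only.
from collections import Counter
from statistics import mean, pstdev

tmt = [
    {"exec": "CEO", "function": "finance",     "tenure": 12.0},
    {"exec": "COO", "function": "operations",  "tenure": 7.5},
    {"exec": "CFO", "function": "finance",     "tenure": 4.0},
    {"exec": "CMO", "function": "marketing",   "tenure": 2.5},
    {"exec": "CTO", "function": "engineering", "tenure": 9.0},
]

def blau_index(categories):
    """Blau's heterogeneity index: 1 - sum of squared category shares."""
    counts = Counter(categories)
    n = len(categories)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def coefficient_of_variation(values):
    """Dispersion of a continuous attribute relative to its mean."""
    return pstdev(values) / mean(values)

functional_heterogeneity = blau_index([m["function"] for m in tmt])
tenure_heterogeneity = coefficient_of_variation([m["tenure"] for m in tmt])

print(f"Functional background heterogeneity (Blau): {functional_heterogeneity:.3f}")
print(f"Tenure heterogeneity (coefficient of variation): {tenure_heterogeneity:.3f}")
```

Either index would typically enter a model as an explanatory variable; as argued above, however, reporting such an effect on its own is unlikely to clear the "that's interesting!" hurdle.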
TMT Characteristics

This category of UEP constructs is at the heart of the causal model proposed by H&M. The authors view organizational outcomes as "reflections of the values and cognitive bases of powerful actors in the organization" (Hambrick & Mason, 1984, p. 193). Carpenter et al. (2004) enrich this original idea by more specifically describing the characteristics that might have the greatest impact on firm outcomes. Again, these are summarized in Fig. 1. They identify skills, behavioral tendencies, access to resources, and social status as being among the most influential attributes of a top manager. However, these characteristics have only rarely been directly modeled and tested in the UEP literature. While researchers have for the most part used TMT demographics as a proxy for leader or leadership team characteristics, there are some important exceptions. In one notable recent example, Peterson et al. (2003) describe how Big 5 personality characteristics of CEOs affect TMT dynamics and subsequent firm performance. By directly measuring these psychological characteristics of the CEO and the
TMT rather than using demography-based proxies, these authors begin to provide a glimpse inside the causal black box of the UEP.

Attributes of Decision-Making Processes

Although Hambrick and Mason (1984) do not explicitly discuss the causal path between top managers and firm outcomes as being mediated by strategic decisions, Carpenter et al. (2004) point out that such mediation was in the spirit of a first draft toward theory building. Twenty years of research has further defined the story of the UEP to include an understanding that managers' characteristics affect firms through their strategic choices and decision-making processes. It follows, then, that operationalizing characteristics of the decision-making process holds promise and opportunity for interested researchers. Baum and Wally's (2003) exploration of the impact of decision-making speed on performance may provide guidance for future UEP researchers who want to link this construct back to TMT characteristics. Similarly, a study could use a small case-study approach to compare UEP-based predictions derived from demographic characteristics with the actual behaviors manifested over time by the managers being studied.

Mediating Strategy Processes

Another conceptual opportunity that has not been adequately explored is the inclusion of characteristics of other strategy process activities in UEP research. While it may seem logical to assume that characteristics of TMT individuals or the group as a whole will affect firm decision-making, one could argue that a large portion of this effect will be realized through their participation in the planning process. That is, characteristics of the TMT will affect characteristics of strategic plans, and the strategic plans will affect the nature of firm decision-making. Markoczy (2001), for instance, looked at the consensus process, but did not factor in any of the respective players' demographic characteristics. In contrast, West and Schwenk (1996) tried to link demographics, consensus, and performance, but were unable to detect any significant relationships. Most recently, Arendt et al. (2005) introduced a CEO-advisor model of strategic decision making in which they explored how the use of formal versus informal advisory systems, and how advisors were selected, was affected by context. While they did not include an exhaustive set of demographic characteristics in their theorizing, Arendt et al. (2005, p. 690) did propose that longer-tenured CEOs would be more likely to rely on formal advisory systems in making strategic decisions. Such a study
provides a nice example of integrating the constituent constructs of the UEP to develop a novel theoretical perspective.

Organizational Outcomes

While H&M's original model identified performance, growth, variability, and survival as potential outcome variables, UEP researchers have experimented more broadly over the past 20 years with their choices of dependent variable outcome measures. Perhaps mirroring the field of strategy as a whole, researchers have become interested in innovation-related outcomes (e.g., team innovativeness; West & Anderson, 1996), socio-cognitive outcomes (e.g., group conflict; Amason & Sapienza, 1997), and specific functional practices (e.g., HR practices; Collins & Clark, 2003). We would encourage research in this vein, particularly work that shows how the UEP really captures a cascade of decisions and outcomes, with performance as but one important, albeit rather final, outcome.

Mediating Operational Processes

While the use of alternative outcome variables is beneficial for development of the UEP, we suggest that an opportunity also exists in further investigating intermediate operational outcome variables as mediating variables between strategic decision making and overall firm performance. In a recent review of the strategic consensus literature, Kellermans, Walter, Lechner, and Floyd (2005, p. 719) identified the need for "(a) measures of consensus that take account of locus as well as differences in how the context of strategy is perceived by top-, middle-, and lower-level managers, (b) research designs wherein assumptions about the locus and content of consensus govern the choice of antecedents, and (c) more consistent use of moderators." Such an agenda provides the perfect context for integrating and extending the UEP because it can be shown how particular constellations of managers affect the strategic process, and the conditions under which they are most likely to affect particular pieces of the process.

Contextual Forces

Our last category of conceptual pieces of the UEP model comprises contextual variables that describe the organizational and environmental circumstances
under which other relationships occur. These characteristics might be important moderators of the relationship between two previously mentioned types of variables or might provide insight if modeled as antecedents to one specific construct. Previous examples of such work include a firm’s global strategy or the discretion afforded the CEO or members of the TMT (see Carpenter et al., 2004, for a summary). However, just as Kellermans et al. (2005) identified the need for the greater use of moderators in strategy consensus research, so too does there remain tremendous opportunity to understand when, where and why UEP effects play out in certain contexts versus others, particularly if the studies are simultaneously able to shed light on the previously black-boxed dynamics explaining those contingent UEP effects.
EMBEDDING CONSTRUCTS TO CREATE NEW THEORY

Having laid out a framework for helping researchers broadly consider a variety of constructs in their UEP research, we turn now to a discussion of how to select and piece together these elements to create a study that will move the field forward. In a sample of 111 manuscripts that he had personally reviewed "over a four-year period," Daft (1995) identifies the lack of coherent theory as the most significant problem he found. He concludes that more than half the studies he reviewed failed to tell a compelling story and explain why relationships between variables exist as they do. The idea that success in publishing requires scholars to contribute new theory is applicable to research in the UEP. As Carpenter et al. (2004) pointed out, we already understand that demographic characteristics affect firm outcomes. And as Davis (1971) and Bergh (2003) emphasize, the study has to be interesting (and it is inherently not interesting if it is incoherent!). To make a contribution to the field, researchers must explore and relate some richer combination of the constructs identified in the previous section and then tell a compelling story that explains why the relationships exist. In the remainder of this section, we outline a process for theory development that can guide UEP researchers in this effort.

Crafting the Causal Story

Again, we are assuming that you have come up with what you believe to be an interesting theoretical question – you have addressed the questions of
value, inimitability and rareness. Beyond this, Whetten proposes that ‘‘theoretical insights come from demonstrating how the addition of a new variable significantly alters our understanding of the phenomena by reorganizing our causal maps’’ (Whetten, 1989, p. 493). It is not enough just to add new variables described in the first section to existing UEP equations. Instead, future UEP researchers must help us to better understand why the relationships have been found. Daft suggests storytelling as one technique for ensuring that a clear rationale is laid out as to why variable X relates to variable Y. He suggests that ‘‘storytelling explains the ‘why’ of the data and gives meaning to observed relationships’’ (1995, p. 178). In the case of UEP research, explanations of what happens inside the TMT and within the strategy process of the organization to cause specific organizational outcomes have only been offered at a general level of theorizing. Sometimes these stories are simple – for instance, research shows that when individuals are faced with uncertainty they are more likely to fall back on what they know to process information and make decisions. Such an explanation is central to H&M’s original model, was first tested in Carpenter and Fredrickson (2001), and later elaborated upon in Hambrick, Finkelstein, and Mooney (2005) in a broader theory on the job demands facing top executives. Demographic characteristics often provide us with clues about what individuals already know. Thus, a significant opportunity exists to provide the details of a causal story about why particular characteristics of leaders translate into specific plans or decision-making models within an organization. Similarly, researchers who could explain the effect of a specific characteristic on firm actions and link them both to outcomes would advance the field. Also, a contribution can be made by scholars who are better able to explain boundary conditions associated with UEP relationships. We outline one possible tool that you can use to experiment with different possible UEP relationships in Fig. 3. Based on your review of the literature, see if you can fill in the boxes such that you have identified important gaps in our knowledge base. Then imagine, and try to write out the story that links up these pieces of the puzzle so that your newly developed theory satisfies Daft’s (1995) coherence criterion. In effect, this is a tool to structure different UEP causal scenarios or theories. For instance, researchers might profitably test whether there are significant organizational or environmental-level factors that strengthen or weaken cause-effect linkages predicted by the UEP. In developing a causal story, researchers must pay careful attention to how the level-of-analysis issues are managed and to explaining the significance of temporal dimensions of relationships. This attention to detail
in theory development will lend credence to any causal story advanced to explain new proposed UEP relationships.

Fig. 3. Hypothesis Development Map. [Figure: a fill-in template linking a demographic, a TMT characteristic, a mediating strategy process, an attribute of TMT decision making, an attribute of an operating process, and an organizational outcome, with contextual factors (an organization-level characteristic and an environment-level characteristic) shown as potential moderators.]
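To illustrate how one such causal scenario might be structured empirically, the sketch below estimates a simple moderated regression in which an environment-level contextual factor is allowed to strengthen or weaken the link between a TMT characteristic and an organizational outcome. The data are simulated and the variable names (tmt_heterogeneity, dynamism, firm_size) are hypothetical; this is a minimal sketch of the analytical form under our own assumptions, not a prescribed model.

```python
# Hypothetical moderation test: does environmental dynamism condition the
# effect of a TMT characteristic on firm performance?
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n_firms = 200

# Simulated firm-level data standing in for a real archival or survey sample.
df = pd.DataFrame({
    "tmt_heterogeneity": rng.normal(0.5, 0.15, n_firms),
    "dynamism": rng.normal(0.0, 1.0, n_firms),
    "firm_size": rng.normal(8.0, 1.5, n_firms),  # e.g., log employees, as a control
})
df["performance"] = (
    0.3 * df["tmt_heterogeneity"]
    + 0.2 * df["dynamism"]
    + 0.4 * df["tmt_heterogeneity"] * df["dynamism"]  # the moderation of interest
    + 0.1 * df["firm_size"]
    + rng.normal(0.0, 0.5, n_firms)
)

# The interaction term carries the contextual (moderator) hypothesis.
model = smf.ols(
    "performance ~ tmt_heterogeneity * dynamism + firm_size", data=df
).fit()
print(model.summary().tables[1])
```

A significant interaction coefficient would be the statistical expression of a contextual boundary condition of the kind discussed above; the causal story still has to explain why the context matters.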
Link to Existing Research

Development of the UEP is ongoing. Carpenter et al. (2004) found that H&M's original work has been cited in excess of 500 times in journals covering a wide variety of disciplines. Besides minimizing the risk of repeating what has already been done, a careful and appropriate review of relevant work provides evidence to potential reviewers that a researcher can be taken seriously, and this chapter has provided you with ample resources to conduct such a thorough review. Any new UEP work should focus on at least one community of researchers within the field and explicitly aim to extend a conversation they have initiated in the form of articles in recent publications. A causal story that does not explicitly incorporate past knowledge and address issues seen as important by current UEP researchers has little chance of being published. At the same time, researchers must not fall into the trap that Ketchen (2002) described as "argumentation by citation." That is, hypotheses cannot be developed based only on the findings of previous studies. Building on the work of others is necessary but not
sufficient for making a contribution. Successful research must be grounded in what has been done while offering something new within a concise, logical, and compelling story.

Ensure a Contribution

While explicitly building on existing research is an absolute necessity for research in the UEP, development of a compelling causal story requires a researcher to depart somewhat from the accepted wisdom of the field and find a part of the bigger story that has not been adequately explored or explained. Yes, we return here to Davis' "that's interesting!" criterion. One potentially promising approach in UEP research could be classified as "testing black box assumptions." Beginning with H&M's landmark paper, UEP researchers have for the most part described a detailed causal chain similar to our Fig. 2. However, in most cases, they only operationalize demographics and outcome variables. Testing the assumptions found in the intermediate steps of the causal chain would contribute to the field by empirically showing whether previous assumptions about causal linkages were correct. A second approach to making a contribution would be to understand a relationship in some part of the model that does not conform to expectations. Davis' (1971) categorization scheme for identifying approaches to developing "interesting" research helps us identify several possible paths for finding an unexpected result. One approach would be to theorize about constructs in the UEP framework that would be expected to be correlated but turn out to be uncorrelated. In another, researchers might identify a negative correlation between constructs where the more established general UEP model would predict a positive correlation. Finally, UEP scholars could propose causation that contradicts the expected direction of causation in established UEP thinking. Given that few of the "black box" linkages in the UEP model have been tested empirically, there may be significant opportunity for surprising findings.
ENSURING CONSTRUCT VALIDITY IN UEP RESEARCH

Selecting applicable constructs and linking them together via new theory are important, but effective research must also ensure that the operationalization of the constructs is properly executed. Such construct validity is critical to all research endeavors to ensure that theory is tested in as accurate
a manner as is possible. Schwab said that construct validity is "present when there is a high correspondence between cases' scores on a measure and the mental definition of the construct it is designed to represent" (1999, p. 26). In UEP research, there are several components of the generally accepted framework for which high levels of construct validity are especially difficult to attain. The first area of difficulty is the identification of the TMT itself. Sidestepping the fact that prominent UEP researchers themselves take issue with the "team" label (Hambrick, 1994), most research employing the UEP identifies the TMT (a construct in the theory) using those individuals named in public documents. This practice raises two questions. First, would these be the strategic decision makers identified by the CEO if she or he were asked who comprises their top management team? Roberto (2003) and Arendt et al. (2005), among others, suggest not. Roberto (2003), for instance, argues that there is a common core group for all strategic decisions, but that other members come in and out of the team depending on the strategic issue. Arendt et al. (2005) suggest that strategic choices are influenced not only by TMT members, but also by other internal and external advisors. To the extent that the team demographic characteristics obtained from public data diverge from the characteristics of the team members actually involved in strategic processes (and in your theory), such divergence is highly problematic for construct validity. The second area of difficulty is related and has similar validity consequences. Specifically, should team membership (and therefore the TMT demographic characteristics) be defined by the organization's structure (i.e., who is shown to be a member of the team) or by the strategic decision (e.g., a merger or acquisition versus the launch of a new product or diversification)?

Determining Construct Valid Measures

Researchers in the upper echelons perspective, as in most streams of research, have not organized a well-orchestrated process through which to develop valid measures of the constructs in the field. Instead, there have been a variety of approaches to measure development, which have contributed to learning and serve to steer the field toward increasingly valid measurements of its constructs. One implication of this for a new UEP researcher is that it is important to understand the process of developing construct valid measures even though he or she is joining a process already very much underway. Nunnally and Bernstein (1994) describe three primary steps to achieving construct valid measures. First, they suggest that
researchers should theorize about the domain of observables that could represent a construct. Inherent in this idea is that the process will specifically exclude observables thought not to represent the construct. This intuitive theorizing process does not, on its own, yield construct valid measures. Rather, it helps researchers to narrow the group of observables that will be tested. In the UEP, researchers use demographic measures to represent the mental models of top managers. While these measures are useful because they are readily available, researchers recognize that more construct valid measures of TMT mental models can be found (Calori, Johnson, & Sarnin, 1994). A second step in determining construct valid measures is an analysis of the extent to which multiple measures of a single construct go together. This can be tested empirically by assessing the level of inter-correlation between two supposed measures of the same construct. Researchers doing UEP work might compare multiple demographic indicators to understand the extent to which they represent constructs related to TMT decision-making processes. For example, the researcher could employ Agresti and Agresti's (1978) Index of Qualitative Variation (IQV). The IQV indicates demographic dispersion over multiple nominal categories, such as within education (i.e., engineering, science, and arts/humanities). A final step in Nunnally and Bernstein's (1994) framework for developing construct valid measures is to look at existing relationships between focal measures and measures of other constructs. If a particular measure behaves as expected in relation to variables not related to the focal construct, the researcher gains information about the focal measure's construct validity. For example, if an UEP researcher wanted to use a subjective measure of firm performance (such as the response to a survey question) instead of a more typical objective measure (e.g., profitability in the form of return on sales or return on investment), he or she could test the correlation of the subjective measure with an unrelated measure that has been shown to be positively correlated with objective measures of firm performance, such as the degree of a firm's multinationality (Delios & Beamish, 1999).

Challenges to Construct Validity

One way to avoid problems in construct definition in an UEP study is to be aware of the most common threats to construct validity found in published research. Schwab (1999) categorizes these threats as being due to either deficiency (under-representation of the construct), contamination (constructs including surplus irrelevancies), or unreliability (randomness in measurement).
Cook and Campbell (1979) list measure development practices that lead to one or more of the above problems. One practice is poor explication of the construct prior to identification of the measure. If a construct is vague, it is less likely that different researchers will utilize similar measures to operationalize it. One example of this problem can be found in the literature on the goal commitment construct. Goal commitment was originally defined as determination to reach a goal (Locke & Latham, 1990). This vague definition led to disagreement about measures: Hollenbeck et al. (1989) introduced a scale that was attacked by others (Tubbs, 1993), who questioned the correspondence between the measure and the construct. A second practice is the use of a single operation or method to generate measures for variables. As noted by Cook and Campbell, "Since single operations both under-represent constructs and contain irrelevancies, construct validity will be lower in single exemplar research than in research where each construct is multiply operationalized" (1979, p. 65).

Tests of Construct Validity

Schwab notes that "Since the criterion in construct validity is conceptual, direct tests are not possible. As a consequence, construct validity must be inferred on the basis of indirect assessments" (Schwab, 1980, p. 34). In this spirit, he offers five procedures for indirectly testing construct validity (Schwab, 1999). First, a researcher can assess the content validity of a measure by using expert judges to review proposed measures. Second, reliability statistics can be used to assess the extent to which a measure is free from random errors. Third, a researcher can test the convergent validity of multiple indicators that are supposed to measure the same construct. Fourth, tests of discriminant validity can show that measures of different constructs do not converge with measures of a focal construct. Finally, nomological networks of construct relationships can help to provide evidence of construct validity when research provides support for the relationships proposed in the network.
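As a minimal, hypothetical sketch of two of the measurement checks discussed in this section, the code below computes Agresti and Agresti's (1978) IQV for dispersion over nominal educational categories and a simple convergent validity check correlating a subjective performance rating with an objective profitability measure. The data values, category labels, and function names are invented for illustration.

```python
# Hypothetical illustrations of two measurement checks discussed above.
from collections import Counter
import numpy as np

def iqv(categories):
    """Index of Qualitative Variation: K * (1 - sum of squared shares) / (K - 1),
    where K is the number of categories observed in the data."""
    counts = Counter(categories)
    n = len(categories)
    k = len(counts)
    if k < 2:
        return 0.0
    return k * (1.0 - sum((c / n) ** 2 for c in counts.values())) / (k - 1)

# Dispersion of TMT educational backgrounds over nominal categories.
education = ["engineering", "science", "arts", "engineering", "science"]
print(f"Educational background IQV: {iqv(education):.3f}")

# Convergent validity check: a subjective performance rating (e.g., a survey
# item) should correlate with an objective measure such as return on sales.
subjective_rating = np.array([3.2, 4.1, 2.8, 4.5, 3.9, 2.5, 4.0, 3.4])
return_on_sales = np.array([0.04, 0.09, 0.02, 0.11, 0.08, 0.01, 0.07, 0.05])
r = np.corrcoef(subjective_rating, return_on_sales)[0, 1]
print(f"Subjective-objective performance correlation: {r:.2f}")
```

A high correlation in the second check would be only one piece of indirect evidence; as Schwab notes, construct validity is ultimately inferred from a pattern of such assessments rather than from any single test.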
CONCLUDING WITH A BENCHMARK PROCESS FOR UPPER ECHELONS RESEARCH

The aim of this chapter has been to provide specific guidance and tools to students and scholars interested in using the UEP in their research. These frameworks and tools are expected to help ensure the theoretical importance
and internal and external construct validity of their work. In that spirit, we conclude our essay by providing a checklist tool to readers, summarized in Table 1, which categorizes the steps researchers should consider to ensure that the constructs chosen are relevant and valid. The tool provides a reminder for researchers to think broadly about what constructs from the UEP to include in a study, to identify interesting relationships, and to ensure that constructs are validly measured. We remain steadfastly optimistic about the import and bountifulness of future research opportunities in the UEP tradition and expect such research to continue to proliferate, albeit subject to increasingly stringent hurdles in terms of theoretical novelty and empirical rigor. Best wishes with your research.

Table 1. Benchmark Process for Upper Echelons Perspective Research Development.

1. Define UEP Constructs of Interest from the Categories below
   ☐ TMT demographics
   ☐ TMT characteristics
   ☐ Attributes of TMT decision-making
   ☐ Attributes of other strategy processes
   ☐ Organizational outcomes
   ☐ Attributes of operating processes
   ☐ Contextual factors
2. Combine Constructs to Develop Hypotheses
   ☐ Use Fig. 2 to diagram planned hypotheses
   ☐ Describe in words your causal story
3. Test your Construct-Level Idea
   ☐ Does the causal story make sense?
   ☐ Is the proposed relationship linked to existing theory and a current conversation?
   ☐ Would the results of your study be interesting?
4. Select Valid Measures
   ☐ Theorize about the domain of variables
   ☐ Identify potential measures of each construct
   ☐ Check intercorrelation of multiple measures
5. Test Construct Validity of Measures
   ☐ Assess content validity
   ☐ Assess reliability
   ☐ Assess convergent validity
   ☐ Assess discriminant validity
   ☐ Assess fit with nomological network
REFERENCES

Agresti, A., & Agresti, B. (1978). Statistical analysis of qualitative variation. Sociological Methodology, 9, 204–237.
Amason, A. C., & Sapienza, H. J. (1997). The effects of top management team size and interaction norms on cognitive and affective conflict. Journal of Management, 23(4), 495–516.
Arendt, L., Priem, R., & Ndofor, H. (2005). A CEO-advisor model of strategic decision making. Journal of Management, 31(5), 680–699.
Baum, J. R., & Wally, S. (2003). Strategic decision speed and firm performance. Strategic Management Journal, 24(11), 1107–1129.
Bergh, D. (2003). Thinking strategically about contribution. Academy of Management Journal, 46(2), 135–136.
Calori, R., Johnson, G., & Sarnin, P. (1994). CEOs' cognitive maps and the scope of the organization. Strategic Management Journal, 15, 437–458.
Carpenter, M. A., & Fredrickson, J. W. (2001). Top management teams, global strategic posture, and the moderating role of uncertainty. Academy of Management Journal, 44, 533–546.
Carpenter, M. A., Sanders, W. G., & Geletkanycz, M. A. (2004). The upper echelons revisited: The antecedents, elements, and consequences of TMT composition. Journal of Management, 30(6), 749–778.
Collins, C. J., & Clark, K. D. (2003). Strategic human resource practices, top management team social networks, and firm performance: The role of human resource practices in creating organizational competitive advantage. Academy of Management Journal, 46(6), 740–751.
Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis for field settings. Chicago: Rand McNally.
Daft, R. L. (1995). Why I recommended that your manuscript be rejected and what you can do about it. In: L. L. Cummings & P. J. Frost (Eds), Publishing in the organizational sciences (2nd ed., pp. 164–182). Thousand Oaks, CA: Sage.
Dansereau, F., & Yammarino, F. (Eds). (2005). Multi-level issues in strategy and methods. Research in Multi-Level Issues, 4, 197–276.
Davis, M. (1971). That's interesting!: Towards a phenomenology of sociology and a sociology of phenomenology. Philosophy of the Social Sciences, 1, 309–344.
Delios, A., & Beamish, P. W. (1999). Geographic scope, product diversification and the corporate performance of Japanese firms. Strategic Management Journal, 20, 711–728.
Finkelstein, S., & Hambrick, D. (1996). Top executives and their effects on organizations. St. Paul, MN: West Publishing Company.
Hambrick, D. C. (1992). Commentary: Consequences of group composition for the interpersonal dynamics of strategic issue processing. In: P. Shrivastava, A. Huff & J. Dutton (Eds), Advances in strategic management (pp. 383–389). Greenwich, CT: JAI Press.
Hambrick, D. C. (1994). Top management groups: A conceptual integration and reconsideration of the "team" label. In: B. Staw & L. L. Cummings (Eds), Research in organizational behavior (Vol. 16, pp. 171–213). Greenwich, CT: JAI Press.
Hambrick, D. C., Finkelstein, S., & Mooney, A. C. (2005). Executive job demands: New insights for explaining strategic decisions and leader behaviors. Academy of Management Review, 30, 472–491.
Hambrick, D. C., & Mason, P. A. (1984). Upper echelons: The organization as a reflection of its top managers. Academy of Management Review, 9(2), 193–206.
Hollenbeck, J., Williams, C., & Klein, H. (1989). An empirical examination of the antecedents of commitment to difficult goals. Journal of Applied Psychology, 74(1), 18–23.
Jackson, S. (1992). Consequences of group composition for the interpersonal dynamics of strategic issue processing. In: P. Shrivastava, A. Huff & J. Dutton (Eds), Advances in strategic management (pp. 345–382). Greenwich, CT: JAI Press.
Kellermans, F., Walter, J., Lechner, C., & Floyd, S. (2005). The lack of consensus about strategic consensus: Advancing theory and research. Journal of Management, 31(5), 719–737.
Ketchen, D. (2002). Some candid thoughts on the publication process. Journal of Management, 28(5), 585–590.
Locke, E., & Latham, G. (1990). A theory of goal setting and task performance. Englewood Cliffs, NJ: Prentice-Hall.
Markoczy, L. (2001). Consensus formation during strategic change. Strategic Management Journal, 22, 1013–1031.
Nunnally, J., & Bernstein, I. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.
Peterson, R. S., Smith, D. B., Martorana, P. V., & Owens, P. D. (2003). The impact of chief executive officer personality on top management team dynamics: One mechanism by which leadership affects organizational performance. Journal of Applied Psychology, 88, 795–808.
Pfeffer, J. (1983). Organizational demography. In: L. L. Cummings & B. M. Staw (Eds), Research in organizational behavior (Vol. 5, pp. 299–357). Greenwich, CT: JAI Press.
Roberto, M. (2003). The stable core and the dynamic periphery in top management teams. Management Decision, 41, 120–131.
Schwab, D. P. (1980). Construct validity in organizational behavior. Research in Organizational Behavior, 2, 3–43.
Schwab, D. P. (1999). Research methods for organizational studies. Mahwah, NJ: Lawrence Erlbaum.
Song, J. (1982). Diversification strategies and the experience of top executives of large firms. Strategic Management Journal, 3, 377–380.
Tubbs, M. (1993). Commitment as a moderator of the goal–performance relation: A case for clearer construct definition. Journal of Applied Psychology, 78(1), 86–97.
West, C. T., & Schwenk, C. R. (1996). Top management team strategic consensus, demographic homogeneity and firm performance: A report of resounding non-findings. Strategic Management Journal, 17, 571–576.
West, M. A., & Anderson, N. R. (1996). Innovation in top management teams. Journal of Applied Psychology, 81(6), 680–693.
Whetten, D. A. (1989). What constitutes a theoretical contribution? Academy of Management Review, 14(4), 490–495.
SURVEYING THE CORPORATE ELITE: THEORETICAL AND PRACTICAL GUIDANCE ON IMPROVING RESPONSE RATES AND RESPONSE QUALITY IN TOP MANAGEMENT SURVEY QUESTIONNAIRES

Michael K. Bednar and James D. Westphal

ABSTRACT

Survey research of top managers is critical to addressing many contemporary research questions in the field of strategic management. Yet, the threat of low response rates has discouraged many researchers from attempting this type of work, steering the field of strategic management away from issues related to strategic process. This article provides an empirical examination of factors that determine the likelihood and quality of response to top management surveys. More generally, we advance a theoretical perspective on survey response rooted in social influence theory that should help researchers make better choices about the design of their survey questionnaires.
Questionnaire surveys of top management have historically suffered from very low response rates, which increase the risk of sample selection bias (Heckman, 1979; Fowler, 1993). Sample selection bias, in turn, threatens the internal and external validity of statistical tests performed on the data (Berk, 1983). Thus, for those interested in studying top executives, survey research is perceived to be a risky endeavor, and many researchers are deterred from attempting this type of work. Yet questionnaire surveys, especially those aimed at top executives, are critical to addressing many contemporary research questions in the field of strategic management. Many topics of interest simply cannot be addressed with available archival data, and top executives are often the only individuals with the necessary knowledge to answer questions concerning organizational-level phenomena, especially issues related to strategic process (Zajac, 1990; Pettigrew, 1992). The aversion to survey research directed at top executives has thus limited the types of questions that strategy scholars have asked, and it has steered the field of strategic management toward macro and content-oriented research that can be examined with more easily accessible archival data. Some have lamented that issues relating to strategic process and the human side of strategy have been greatly underexplored (Hambrick, 2004). For example, more fine-grained phenomena among top managers, including various social and cognitive processes involved in the formulation and implementation of strategy, remain largely understudied. There are many reasons why executives do not respond to questionnaire surveys. Executives are exceedingly busy individuals who often lack sufficient time to perform critical job demands, let alone respond to an academic survey. In addition, some surveys call for potentially sensitive data about the firm that executives are reluctant to reveal, despite promises of confidentiality. Some executives may not respond due to lack of interest in the study or a company policy against returning questionnaires (Baruch, 1999). Despite these obstacles, however, some research demonstrates that executives can be successfully surveyed with response rates adequate to yield valid measures of theoretically important constructs (Zajac, 1990; Westphal, 1998, 1999; Steensma & Corley, 2001; Christmann, 2004). Nevertheless, most survey research of executives suffers from very low response rates, with the authors citing previous studies with similarly low response rates as justification. Perhaps more importantly, the prospect of low response rates has discouraged researchers from attempting surveys of corporate elites, and has ultimately helped steer the strategy field away from empirical research on micro-behavioral processes. The purpose of this chapter is to take a first step toward understanding how specific survey characteristics affect not only the rate of response
among executives, but also the quality of the responses. We hope that this chapter will serve as a practical guide for researchers interested in conducting survey research on the corporate elite. In addition to practical guidance about specific techniques for increasing survey effectiveness among executives, more theoretical grounding is needed to guide our survey research practices. The literature on survey practices is generally not theory-driven (see Groves, Cialdini, & Couper, 1992, for an exception). In this chapter, we draw on theory from social psychology to provide a better understanding of why executives may or may not fill out questionnaire surveys, and what factors may influence the quality of responses. We develop a theoretical framework that suggests how social influence theory, including the well-established principles of reciprocity, social proof, and legitimacy and authority, can be applied by strategy researchers to improve survey response rates and the quality of responses among the population of corporate elites. We test our theoretical framework with a large-scale executive survey and conclude by discussing the implications of this study and the prospects for future survey research on top managers.
DETERMINANTS OF SURVEY RESPONSE AND RESPONSE QUALITY

From a purely rational perspective, individuals who receive a survey must analyze the costs and benefits of participation. To encourage participation in a survey, researchers should seek to minimize the costs of completing the questionnaire (Dillman, 1991). For example, lengthy surveys require more time and effort to complete and would generally be perceived as more costly to potential respondents than shorter questionnaires. Research on survey methods generally shows that longer questionnaires prompt lower response rates (Yammarino, Skinner, & Childers, 1991; Jobber & Saunders, 1993). Given the time constraints facing executives, we expect that survey length will be an especially important determinant of response rates for corporate elites. Longer surveys may also decrease the quality of response. We speculate that individuals tend to devote a fixed amount of time and effort to filling out a questionnaire, regardless of its length (e.g., an executive may decide to spend five minutes on a survey and will spend that amount of time whether the questionnaire is one page or five pages). Hence, as the survey increases in length, executives may give hurried answers in an attempt to finish in the predetermined amount of time, resulting in less reliable responses.
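As a hedged illustration of how such determinants of response could be examined, the sketch below fits a logistic regression of whether an executive returned a questionnaire on its length and on whether a small incentive was enclosed. The data are simulated and the variable names are our own assumptions; the sketch shows the analytical form only and does not reproduce the survey analysis reported in this chapter.

```python
# Hypothetical model of executive survey response likelihood.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 500

surveys = pd.DataFrame({
    "pages": rng.integers(1, 9, n).astype(float),      # questionnaire length in pages
    "incentive": rng.integers(0, 2, n).astype(float),  # 1 = small gift enclosed
})

# Assumed response process: longer surveys depress response, incentives raise it.
logit_p = 0.5 - 0.35 * surveys["pages"] + 0.8 * surveys["incentive"]
surveys["responded"] = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit_p))).astype(int)

model = smf.logit("responded ~ pages + incentive", data=surveys).fit(disp=0)
print(model.params)  # expect a negative coefficient on pages, positive on incentive
```

Quality of response could be examined in a parallel fashion, for instance by regressing an inter-item reliability measure of each returned questionnaire on the same survey characteristics.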
While the length of the survey is extremely important for inducing quality responses, the decision to complete a questionnaire does not simply involve a rational calculation of costs and benefits. From a social influence perspective, the decision to fill out a survey is an act of compliance, which can be affected by social norms. Researchers can use their knowledge of these social norms to increase the likelihood that executives will respond to their surveys and that they will make the necessary effort to produce high-quality responses.

Norm of Reciprocity

The norm of reciprocity is a nearly universal code of moral conduct wherein individuals perceive an obligation to reciprocate a received benefit in some manner (Gouldner, 1960). Individuals who receive favors, gifts, concessions, or other favorable treatment often experience a psychological sense of indebtedness. Because the sense of indebtedness is unpleasant for most people, they typically seek opportunities to reduce its psychological burden through reciprocation (Greenberg, 1980; Cialdini, 1993; Uehara, 1995; Settoon, Bennett, & Liden, 1996; Cialdini, 2001). In short, when we receive something, there is typically a powerful perceived obligation to give something in return. There is also theory and evidence to suggest that the norm of reciprocity can be initiated by unsolicited favors (Befu, 1980; Cialdini, 1993) and that the substance and value of what is exchanged between parties can vary somewhat (Deckop, Cirka, & Andersson, 2003; Molm, 2003). Thus, small favors can potentially induce greater favors in return. The norm of reciprocity can prompt exchange between parties in situations where exchange does not seem likely from a purely rational or economic perspective. There is evidence that the norm of reciprocity persists even in situations where the recipient has no expectation of receiving further benefits from the giver, as in exchanges between strangers (Hoffman, McCabe, & Smith, 1998; Whatley, Webster, & Smith, 1999; Perugini, Gallucci, Presaghi, & Ercolani, 2003). The powerful nature of this norm is further highlighted by research suggesting that individuals often incur significant economic costs to return a favor (Fehr & Gachter, 2000) and sometimes reciprocate favors from individuals whom they dislike (Regan, 1971; Cialdini, 1993). Individuals who are familiar with the norm of reciprocity can exploit their knowledge of the norm by rendering favors to powerful individuals who have the potential to benefit or harm them. In fact, personal favors aimed at powerful actors in organizations can result in a variety of positive outcomes including improved performance evaluations, higher pay, and increased
likelihood of promotions (e.g., Yukl & Tracey, 1992; Westphal & Stern, 2005) while decreasing the likelihood of negative actions such as firings, demotions, or pay cuts (Westphal, 1998). It is important to note that the norm of reciprocity can break down if the initial favor, gift, or concession is viewed by the other party as a bribe or as a measure to apply pressure to coerce compliance with the request (see Brehm & Cole, 1966; Weiner & Brehm, 1966; Groves et al., 1992). In fact, if the initial behavior is viewed as a bribe, then subsequent compliance with the request may actually be less likely. While there is a strong norm to respond to a genuine gift, no such action is required in response to a sales trick, gimmick, or bribe (Cialdini, 2001). In survey research, the norm of reciprocity provides a theoretical basis for including monetary and non-monetary incentives with a questionnaire. Research has shown that the inclusion of even a small amount of money or other incentive with a survey can significantly increase response rates, even if the reward is not made contingent upon completion of the survey (Church, 1993). Individuals who receive a small gift from the researcher should feel an obligation to reciprocate this treatment by filling out the survey. In addition, potential respondents who receive a gift or a promise of some favor may increase the amount of time and effort they devote to completing the survey, resulting in more considered responses and ultimately higher reliability. Some may ask whether the inclusion of a small monetary incentive can influence the behavior of executives who often earn very high salaries. As mentioned, the norm of reciprocity transcends rational consideration of costs and benefits, such that even small gifts can induce large favors in return (Cialdini, 2001). Owing to the pervasiveness of this norm, we would not expect executives to be exempt from its influence.

Social Proof

Individuals often rely on others as a standard of comparison before making decisions about an appropriate course of action (Festinger, 1954). In their studies of bystander intervention, Latane and Darley (1970) found that bystanders to a possible emergency were influenced by the actions of those around them. When others failed to act in a concerned manner, individuals were less likely to respond to the emergency. Other studies have similarly found that the actions of groups of people can greatly influence the behavior of others. Milgram, Bickman, and Berkowitz (1969) conducted an experiment in which a researcher would stop in a busy street and look up at nothing in particular. When just one individual looked up, the action did
not garner much attention, but when five individuals engaged in the behavior, the research team found that over 80% of those passing by would stop to look up as well. These examples illustrate the powerful principle of social proof. Cialdini (1993) asserts that "we tend to view a behavior as correct in a given situation to the degree to which we see others performing it. Whether the question is what to do with an empty popcorn box in a movie theater, how fast to drive on a certain stretch of highway, or how to eat chicken in a restaurant, the actions of those around us will be important guides in defining the answer" (p. 95). He further states that "if a lot of people are doing the same thing, they must know something we don't. Especially when we are uncertain, we are willing to place an enormous amount of trust in the collective knowledge of the crowd" (Cialdini, 1993, p. 131).
Social proof is derived from the idea that most individuals are followers rather than initiators, and it seems to work best when the proof is provided by the actions of many other people (Cialdini, 2001). Advertisers often use this idea to their advantage by claiming that a product is the best-selling or fastest growing. If lots of people are already using a certain product, we often infer that it must be a quality product. Although the idea of social proof is often associated with inefficient outcomes, as in the bystander intervention studies, the use of social proof as a decision-making tool is not necessarily maladaptive. Very often, many people engage in a behavior because it is the appropriate thing to do, such that looking to others before deciding how to act can lead to effective solutions. Thus, the tendency to rely on the actions of others is a natural decision-making tool in many situations (Cialdini, 2001). The principle of social proof is consistent with organizational scholars' explanation for why firms often imitate others in response to uncertainty in decision-making (Cyert & March, 1963; Rao, Greve, & Davis, 2001). The idea of mimetic isomorphism from institutional theory, for example, suggests that under conditions of uncertainty, organizations will look to others and mimic their behavior (DiMaggio & Powell, 1983). Through the imitation of others, firms can gain legitimacy, which is often an important antecedent to survival. An understanding of social proof can be applied by survey researchers to increase the likelihood that executives will complete a questionnaire and to increase the quality of responses. According to the principle of social proof, to the extent that executives believe that similar others have already completed a survey, they should be more willing to do so. When surveying executives, researchers
can include information in the instructions or on the cover letter indicating that many other executives have already participated in the same survey (or similar surveys conducted by the same researchers). On the basis of this information, some executives should make the assumption, consciously or otherwise, that if many other top managers have filled out the survey, it must be important and worth the time to complete. The fact that similar others have participated in the survey gives it credibility and suggests that taking the time to respond would not be counter normative. If the survey is viewed as credible and important, respondents should devote more time to each question, resulting in more considered, accurate, and reliable responses.

Legitimacy and Authority

People are typically more willing to comply with a request if it is made by an individual or organization perceived as having legitimate authority (Cialdini, 2001). From birth, we are socialized to respond to requests from individuals in positions of authority. Research has demonstrated that this norm of responding to perceived legitimate authority is extremely powerful in affecting behavior. For example, Milgram's famous study demonstrated just how far individuals were willing to go when asked to do something by a person in a position of authority. In his study, Milgram found that individuals were willing to inflict great amounts of pain on others if a "scientist" in a position of legitimate authority made the request to do so (Milgram, 1963). As with the principle of social proof, responding to requests from individuals in positions of authority is often a beneficial decision-making heuristic. People with legitimate authority can often supply access to valuable resources and information. Generally, individuals in positions of authority have satisfied certain requirements pertaining to their education, work experience or professional certification to obtain their position and in the process, have accumulated a wealth of knowledge or expertise. Research from the literature on persuasion demonstrates that individuals possessing expert power, or relevant knowledge and information not possessed by the influence target, are more successful in their attempts at persuasion (Porter, Allen, & Angle, 1981; Raven, 1999). In fact, research suggests that we often take for granted the appropriateness of requests made by legitimate experts (Eagly & Chaiken, 1993; Ziegler, Diehl, Zigon, & Fett, 2004). Individuals often look for cues concerning the authority or legitimacy of others when responding to requests. For example, titles such as educational degrees or professional certification are indicators of legitimate expertise. Requests made by high status individuals or organizations should also be
viewed as more legitimate. For this reason, survey researchers sometimes include reference to a sponsoring institution or include an endorsement of the survey from prominent executives. Research on survey methods suggests that the inclusion of a university sponsor can significantly increase the expected response rate in surveys (Green, Boser, & Hutchinson, 1998; Jobber & O'Reilly, 1998; Greer, Chuchinprakam, & Seshadri, 2000). Sponsorship of the survey by a prominent university should be an important signal to potential survey respondents about the legitimacy and importance of the survey and should provide assurance to the respondent that the survey will be conducted in a professional manner. Endorsement by a prominent executive should also legitimize the survey in the eyes of the potential respondent, increasing the likelihood of response. In addition, respondents who perceive the survey to be legitimate may increase the amount of time and effort they devote to the survey, thus increasing the quality of response.

Helping Norm

In most societies, a norm of helping, often referred to as a norm of social responsibility, exists wherein people feel a moral obligation to help those in need and who are dependent upon them for aid (Berkowitz & Daniels, 1964; Groves et al., 1992). For example, an interviewer standing at a doorstep may be more likely to get into homes on a rainy day than on a sunny day because people feel a greater obligation to help under such circumstances. Researchers can appeal to this norm to potentially increase the rate and quality of response. For example, Mowen and Cialdini (1980) found that response rates in interview surveys increased dramatically simply by adding the phrase "it would really help us out" to the end of the request. Survey researchers are entirely dependent on the potential respondents for the success of the survey. Making this point salient with a plea for help could induce increased participation due to the helping norm. Similarly, respondents may be willing to devote more time and effort to the survey in order to help out the researcher, potentially resulting in higher quality responses.
METHOD

The preceding discussion outlines principles from the social influence literature that can inform our understanding of executive survey response rates and the quality of their responses. We tested these ideas in a large-sample
survey of top executives. Specifically, we sent survey questionnaires to top managers at 500 companies randomly selected from the Reference USA index of mid-sized companies. Companies in the sample frame had between $50 million and $100 million in total revenues. We selected up to seven senior officers from each company with the title of Vice President or higher. If the firm had more than seven senior officers, seven were randomly selected. This resulted in a sample frame of 2,632 top managers. To assess how features of the survey questionnaire affect the likelihood and quality of response, different versions of the questionnaire were randomly assigned to managers in the sample frame. The survey response rate was 36%, resulting in a sample of 958 top managers from 387 companies. We randomly assigned surveys to recipients based on the following characteristics.

Survey Length. The length of the survey may be an especially important determinant of response rates and quality of responses from executives. Shorter questionnaires require less time to complete, thus reducing the perceived cost of response. To test this idea, we developed four versions of the survey, which were randomly assigned to executives in the sample frame. The first version included questions about board monitoring of top management (Westphal, 1999), the provision and seeking of strategic advice from other managers (McDonald & Westphal, 2003), and friendship and advice ties to outside directors (Westphal, 1999). The second version of the questionnaire included survey scales to assess task conflict and relational conflict among members of the top management team; these scales were based on measures developed by Jehn (1995). The third version of the questionnaire included all the questions from version 1 followed by questions from version 2, and the fourth version of the questionnaire included questions from the second version followed by questions from the first version. Pre-testing indicated that the first version required about 10 min to complete, the second version required about 7 min to complete, and the third and fourth versions required about 17 min to complete.

Reciprocity. The norm of reciprocity suggests that individuals should be more likely to fill out a survey, and to produce higher-quality responses, after receiving a favor or gift. We tested this idea in several ways. Two different types of monetary incentives were included in the survey. Some individuals were given a dollar while others were given 50 cents. This manipulation was used to determine how the amount of the monetary incentive affects response. We also sent surveys in which respondents were promised a summary report of the results of the study. To examine whether the conditionality of a gift affects the likelihood or quality of response, we made
half of the promised reports unconditional on the response while telling the other recipients that they would only receive a report upon the successful completion of the survey.

Social Proof. To test our theoretical arguments about the effect of social proof, in one manipulation of the survey, we included a line in the cover letter indicating that many of the executive's peers had responded to prior surveys by the same researchers. Specifically, the cover letter states that "over 5000 top managers and directors have responded to our prior surveys [on corporate governance and strategy]."

Legitimacy and Authority. We included appeals to legitimate authority in two ways. First, in some surveys we included an endorsement of the survey by a prominent executive. The cover letter notes that the survey is endorsed by the executive, and his/her signature is included at the bottom of the letter. Second, some surveys were printed on official letterhead from the McCombs School of Business at the University of Texas at Austin and enclosed in envelopes that contained the university name and printed symbol.

Helping Norm. We tested for possible effects of the helping norm in one condition by including a plea for help from the researchers to the potential respondents. After describing the topic of the survey, the cover letter states: "our knowledge about corporate governance depends on information from top executives such as yourself. Please help us learn more about this important subject by responding to this survey."

Because we had data from multiple executives at the same company, we could test for inter-rater agreement among executives at the same firm. We used inter-rater agreement as a measure of the quality of the survey response. To the extent that executives make an effort to carefully respond to survey items, the level of agreement between executives should increase. Moreover, inter-rater agreement is an indicator of the validity of the survey measures. Thus, if a particular survey characteristic increases inter-rater agreement of a survey measure, it increases the validity of that measure. We measured inter-rater agreement as the average difference between the focal manager's standardized responses to survey items about board monitoring and task and relational conflict and the responses of another manager from the same company to the same set of questions. When there was more than one other respondent from the same company, one respondent was randomly selected.

We used logit regression to estimate the likelihood of responding to the survey, and we used a Heckman selection model to estimate inter-rater agreement. The Heckman model is a two-stage procedure in which the first-stage model estimates the likelihood of responding to the survey with
probit regression and the second-stage model incorporates parameter estimates from the selection equation to predict inter-rater agreement using multiple regression (Heckman, 1979). This procedure ensures that regression estimates are not biased by unmeasured differences between managers who responded to the survey and managers in the larger sample frame.
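For readers less familiar with these procedures, the sketch below illustrates the two pieces of the analysis in Python: the inter-rater agreement score (read here as a mean absolute difference between two managers' standardized responses) and the two-stage Heckman logic, with a probit first stage and an inverse Mills ratio carried into the second-stage regression. This is an illustrative sketch only, not the authors' code; the data frame and column names (responded, agreement, and the condition dummies) are hypothetical stand-ins.

```python
# Illustrative sketch only (hypothetical data and column names), showing the
# general shape of the analysis: a probit first stage on the full sample frame,
# an inverse Mills ratio, and a second-stage regression on respondents.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy.stats import norm

def interrater_agreement(focal_z, peer_z):
    """Mean absolute difference between two managers' standardized responses
    to the same items; smaller values indicate closer agreement."""
    return float(np.mean(np.abs(np.asarray(focal_z) - np.asarray(peer_z))))

def heckman_two_stage(frame: pd.DataFrame, conditions: list):
    """frame: one row per manager in the sample frame, with a 0/1 'responded'
    flag, dummy variables for the survey conditions, and 'agreement' observed
    only for respondents."""
    # Stage 1: probit model of the decision to respond, on the full frame.
    X1 = sm.add_constant(frame[conditions])
    probit_fit = sm.Probit(frame["responded"], X1).fit(disp=False)

    # Inverse Mills ratio from the first-stage linear predictor.
    xb = X1.dot(probit_fit.params)
    frame = frame.assign(mills=norm.pdf(xb) / norm.cdf(xb))

    # Stage 2: regression on respondents only, with the Mills ratio included
    # to adjust for unmeasured differences between respondents and the frame.
    resp = frame[frame["responded"] == 1]
    X2 = sm.add_constant(resp[conditions + ["mills"]])
    return probit_fit, sm.OLS(resp["agreement"], X2).fit()
```

A production analysis would use a full maximum-likelihood Heckman estimator or correct the second-stage standard errors for the generated regressor; the sketch is meant only to convey the two-stage logic described above.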
RESULTS

Results of the logit and Heckman selection models are presented in Table 1. The results show that the length of the questionnaire significantly influenced responses to the survey. Managers who received the long version of the questionnaire were less likely to respond than managers who received the medium-length version. Moreover, inter-rater agreement was lower for the long version of the survey, suggesting that the length of the survey lowered the quality of responses. Individuals receiving the short version were also significantly more likely to respond than individuals receiving the medium-length survey. However, the quality of the responses to the short version was not significantly different from the quality of responses to the medium condition. The comparison of the medium-length survey vs. the short survey is a highly conservative test of the effect of survey length on the likelihood of response and inter-rater agreement, as the medium-length version is only 3 min longer than the short version. Evidence that an increment of 3 min in survey length has a significant effect on the likelihood of response suggests that survey length is a very important determinant of the response decision for this population. In fact, based on the estimated coefficients in the logit analysis, the odds of receiving a response to the short survey were 72% greater than with the medium version. Compared to the medium-length version of the survey, using the longer version decreased the odds of a positive response by 65% for version one and 55% for version two.

The results also provide evidence that the norm of reciprocity influences response to executive surveys. The inclusion of monetary incentives of either 1 dollar or 50 cents was associated with increased likelihood of response. Individuals receiving either type of monetary incentive were approximately 50% more likely to respond than those not receiving the incentive. Inter-rater agreement was also higher for individuals receiving either type of monetary incentive. Interestingly, the size of the incentive did not seem to matter: The magnitude of the incentive effect on the likelihood of response and inter-rater
Table 1. Models of Response Likelihood and Inter-Rater Agreement.

Independent Variable | Logit Regression Model of Survey Response | Heckman Selection Model of Inter-Rater Agreement
Length: short version vs. medium version | 0.542 (0.216) | 0.062 (0.036)
Length: long version (1) vs. medium version | 1.041 (0.156) | 0.109 (0.024)
Length: long version (2) vs. medium version | 0.788 (0.145) | 0.067 (0.017)
UT letterhead | 0.049 (0.111) | 0.014 (0.015)
Endorsement by executive | 1.000 (0.150) | 0.020 (0.016)
Fifty cents included with survey | 0.404 (0.123) | 0.060 (0.017)
Dollar included with survey | 0.446 (0.127) | 0.089 (0.016)
Promise of summary report (if respond) | 0.974 (0.217) | 0.064 (0.028)
Promise of summary report (no strings attached) | 0.674 (0.165) | 0.070 (0.023)
Plea for help | 0.100 (0.093) | 0.021 (0.013)
Note that peers have responded | 0.292 (0.098) | 0.002 (0.013)
Constant | 1.366 (0.230) | 0.129 (0.031)
Wald χ² | 237.16 | 160.86

N = 2,632. Note: z-statistics are one-tailed for hypothesized effects, two-tailed for control variables. Standard errors are in parentheses. *p < 0.05. **p < 0.01. ***p < 0.001.
agreement was not significantly greater for 1 dollar vs. 50 cents. The results suggest that the inclusion of monetary incentives is an effective means of invoking the norm of reciprocity, and that even very small favors can prompt a response. This finding speaks to the strength of the norm of reciprocity and the obligation that individuals feel to reciprocate a gift or favor. Some may be surprised to find that an unsolicited gift of 50 cents to an executive, to whom the money is materially insignificant, significantly
increased the likelihood of response. It also appears that the gift elicited more considered and accurate responses, as evidenced by the effect on interrater reliability. The promise of a summary report also increased the likelihood of response. Those promised a report conditional on their completion of the survey were over two and a half times more likely to respond than those not promised a report. Similarly, for those receiving an unconditional promise of a report, the odds of responding increased by 96%. As for the quality of response, when the report was promised regardless of whether or not the survey was completed, inter-rater agreement was higher than in cases where no promise was given. Interestingly, when the report was made conditional on the completion of the survey, the quality of response actually decreased. Cialdini (1993) suggests that incentives are helpful to the extent that they are seen as gifts and not as bribes. Perhaps in the conditional case, the incentive (i.e., the report) was viewed by the respondent as a bribe to induce compliance and the individual was less likely to devote the effort necessary to produce a quality response. Alternatively, whereas the unconditional case may induce respondents to devote extra effort to filling out the survey by triggering the norm of reciprocity, the conditional case may trigger a rational exchange mindset in which respondents do the minimum required to receive a report. Social proof also appears to have had an effect on the response rate. Indicating on the survey that similar others had responded to the questionnaire had a positive effect on the likelihood of response, increasing the odds of response by 40% compared to surveys with no such inclusion. It seems that the reassurance that many similar individuals had participated in surveys by the same researchers made managers more likely to respond. However, it did not affect inter-rater agreement. Endorsement by an executive was associated with an increased likelihood of response, although inter-rater agreement was unaffected. The coefficient from the logit regression suggests that an endorsement by an executive increased the odds of getting a response by a factor of 2.71. In other words, surveys with an executive endorsement were almost three times more likely to be completed and returned. This result is consistent with Cialdini’s idea that individuals are more likely to respond to a request from an individual or group perceived to be a legitimate authority figure (Cialdini, 1993). The presence of University letterhead had no statistically significant impact on the likelihood of response or inter-rater agreement. Moreover, the inclusion of a plea for help on the survey also did not affect the response rate or the quality of response. In this case, it does not appear that the norm of helping was an important determinant of survey response (Table 2).
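The percentages and odds multipliers quoted in this section follow directly from exponentiating the logit coefficients reported in Table 1. The short sketch below reproduces that arithmetic; it assumes negative signs on the two long-version coefficients, consistent with the 65% and 55% decreases reported in the text, since coefficient signs do not survive reproduction in Table 1.

```python
import numpy as np

# Selected logit coefficients from Table 1; the two long-version coefficients
# are given negative signs here, consistent with the decreases reported in the
# text (signs are not legible in the reproduced table).
coefficients = {
    "Short vs. medium length": 0.542,         # exp ~= 1.72 -> odds up ~72%
    "Long version (1) vs. medium": -1.041,    # exp ~= 0.35 -> odds down ~65%
    "Long version (2) vs. medium": -0.788,    # exp ~= 0.45 -> odds down ~55%
    "Endorsement by executive": 1.000,        # exp ~= 2.72 -> roughly 2.7x the odds
    "Summary report (if respond)": 0.974,     # exp ~= 2.65 -> over 2.5x the odds
    "Summary report (unconditional)": 0.674,  # exp ~= 1.96 -> odds up ~96%
}
for label, b in coefficients.items():
    print(f"{label}: odds ratio = {np.exp(b):.2f}")
```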
Table 2. Summary of Advice about Conducting Surveys of Top Executives.

Limit the number of items on the survey
Get an endorsement from a prominent executive
Include some type of incentive (i.e., small amount of money, promise of summary report)
If you promise a summary report, do so whether or not they respond
Inform respondents that others have responded to similar surveys
Pleas for help and using university letterhead have no effect
CONCLUSION

In this chapter, we have sought to invoke principles of social influence to suggest how certain features of surveys are more likely to elicit a response from top executives. In addition, we have provided initial evidence of how these principles can affect the quality of response, a subject that has received scant research attention in the past. This chapter suggests that survey researchers can use principles of reciprocity, social proof and appeals to legitimate authority to obtain better response rates and higher quality responses from top executives. Consideration of these principles can aid in survey design by indicating what characteristics are important to response rates and quality of response, and what characteristics are not. Thus, in addition to providing a theoretical framework to enhance our understanding of responses to survey questionnaires, our study may serve as a practical guide to researchers who are interested in conducting surveys of corporate elites.

The first finding suggests the importance of keeping executive surveys short in length. Long versions of the survey were less likely to be completed and the quality of response suffered as well. It is important to note that even the long version of the survey in this study, which took only 17 min to complete, is rather short in comparison to most surveys conducted in psychology or organizational behavior. On the basis of the empirical findings of this chapter, it seems that surveys of top executives must include only essential questions.

Results from this study further suggest that the use of incentives is a viable option for increasing response rates from executives. Even small amounts of money, as little as 50 cents, increased response rates significantly. Some research budgets may limit the extent to which monetary incentives can be used. Our results suggest that non-monetary incentives such as the promise of a summary report of the findings can be used to increase
response. To ensure the quality of response, such promised incentives may be more effective when they are given unconditionally (i.e., regardless of whether a survey is returned). These findings suggest that survey characteristics that invoke the norm of reciprocity can be effective in enhancing the rate and quality of survey responses by corporate elites.

Executives also responded to cues about the legitimacy of the survey. Following the logic of social proof, they were more likely to return a survey when it was noted that many peers had already filled out the questionnaire. This information sends a strong signal to the executives that the survey must be worthwhile if others are participating. Researchers can increase executive response rates by including information in the cover letter or instructions about peers who have also responded to the survey. This finding points to the importance of having a meaningful stream of survey research, and may suggest that success begets success when working with executives. Noting that peers have responded is essentially costless, and yet it had a strong impact on the likelihood of response.

Executive endorsement also appears to be a very useful tool in increasing response rates. Endorsement by a prominent executive is a powerful signal about the legitimacy and importance of the research. Survey researchers will benefit by establishing ties with executives who can endorse their work. Although some investment is required to develop and maintain such ties, the returns are considerable: Endorsement by a prominent executive increased the likelihood of response by a factor of 2.7. Of course, researchers must be able to explain the potential importance of their work to executives if they hope to receive such an endorsement.

Two attributes of surveys studied here did not produce a significant increase in response rates for this sample. Specifically, the use of university letterhead and a plea for help did not have a significant effect on response rates or the quality of response. Perhaps affiliation with a university does not send a strong signal of legitimacy and importance to business executives on a par with endorsement by a prominent executive. Alternatively, the use of letterhead may have less impact than an official endorsement. Researchers may be better served by focusing their resources on the other survey attributes mentioned previously.

It is useful to compare our theoretical framework and empirical results with previous research on the total design method, a popular survey design and administration system used by some researchers to increase survey response rates (see Dillman, 1978, 1991 for a review). The total design method provides detailed recommendations for the entire process of survey research. For example, the method gives guidance about the layout and ordering of
questions, the size of the survey booklet, the type of return envelope to be used, the timing of follow-up responses, and many other practical suggestions for administering a survey. When the total design method is strictly followed, research has shown that response rates can be significantly increased (Dillman, 1991). The method is based on the idea that respondents will be most likely to respond to a survey when the perceived costs of participation are less than the perceived benefits. Thus, the recommendations advocated by the total design method represent an attempt to increase the perceived rewards of participation, decrease the perceived costs, and increase the participant’s trust that the rewards will be realized (Dillman, 1991). These principles are consistent with many of the theoretical ideas presented in this chapter although we have not attempted to empirically test the value of the total design method for an executive population. It seems that survey researchers interested in studying top managers would be well served by incorporating aspects of the total design method that reduce costs and increase the perceived benefits of participation for managers. Our study complements the total design method in several ways. Consistent with the total design method logic, we consider the effect of reducing perceived costs to respondents (i.e. shortening the questionnaire length). However, we also provide evidence that the decision to respond to a survey goes beyond a rational calculation of costs and benefits and that invoking social norms and applying other social influence processes can encourage participation and increase response quality. While the total design method focuses on increasing response rate, we examine the effect of several survey characteristics on both response rate and response quality. Moreover, we have included in our study survey characteristics that seem especially pertinent to executives such as endorsement of the survey by a prominent executive and the promise of reports detailing the survey results. Thus, our study is directly applicable to researchers interested in studying executive populations. In addition, while the total design method does not explicitly address the role of incentives in survey research, we have evidence suggesting that several types of incentives can effectively increase the likelihood that executives will respond to questionnaires. Finally, we test social influence mechanisms such as pleas for help and social proof that are not addressed by the total design method. Through an improved understanding and application of the principles outlined in this chapter, we hope that scholars of strategic management will be better equipped to engage in surveys of corporate elites. We realize that executive surveys will always be perceived by some as overly risky, especially in comparison to archival methods, and thus many will be deterred from
attempting this type of work. However, we believe that the potential returns to survey research of top executives are extremely high – both to individual scholars and to the field of strategic management as a whole. The effective use of survey methods could significantly broaden the kinds of research questions that scholars investigate. Surveys of corporate elites are critical to the development of cognitive and micro-social perspectives on the formulation and implementation of strategy. More generally, such research is essential to the study of strategy process phenomena, or what Hambrick (2004, p. 94) called ‘‘the human element’’ of strategy formulation and implementation. Similarly, executive surveys could accelerate progress in addressing a variety of understudied topics in organization theory, including institutional entrepreneurship, symbolic management, and the interface of politics and institutional processes. We hope that the theoretical and practical guidance offered in this chapter will be a first step in that direction.
ACKNOWLEDGMENTS

We thank Dave Ketchen and seminar participants at Penn State University for their helpful comments.
REFERENCES Baruch, Y. (1999). Response rate in academic studies: A comparative analysis. Human Relations, 52, 421–438. Befu, H. (1980). Structural and motivational approaches to social exchange. In: M. Greenberg & R. H. Willis (Eds), Social exchange: Advances in theory and research. New York: Plenum Press. Berk, R. A. (1983). An introduction to sample selection bias in sociological data. American Sociological Review, 48, 386–398. Berkowitz, L., & Daniels, L. R. (1964). Affecting the salience of the social responsibility norm. Journal of Abnormal and Social Psychology, 68, 275–281. Brehm, J. W., & Cole, A. H. (1966). Effect of a favor which reduces freedom. Journal of Personality and Social Psychology, 3, 420–426. Christmann, P. (2004). Multinational companies and the natural environment: Determinants of global environmental policy standardization. Academy of Management Journal, 47, 747–760. Church, A. H. (1993). Estimating the effect of incentives on mail survey response rates: A metaanalysis. Public Opinion Quarterly, 57, 62–79. Cialdini, R. B. (1993). Influence: The psychology of persuasion. New York: Quill. Cialdini, R. B. (2001). Influence: Science and practice (4th ed.) Boston: Allyn & Bacon. Cyert, R. M., & March, J. G. (1963). A behavioral theory of the firm. Englewood Cliffs, NJ: Prentice-Hall.
Deckop, J. R., Cirka, C. C., & Andersson, L. M. (2003). Doing unto others: The reciprocity of helping behavior in organizations. Journal of Business Ethics, 47, 101–113. Dillman, D. A. (1978). Mail and telephone surveys: The total design method. New York: WileyInterscience. Dillman, D. A. (1991). The design and administration of mail surveys. Annual Review of Sociology, 17, 225–249. DiMaggio, P. J., & Powell, W. W. (1983). The iron cage revisited: Institutional isomorphism and collective rationality in organizational fields. American Sociological Review, 48, 147–160. Eagly, A. H., & Chaiken, S. (1993). The psychology of attitudes. Ft Worth, TX: Harcourt Brace Jovanovich College Publishers. Fehr, E., & Gachter, S. (2000). Fairness and retaliation: The economics of reciprocity. Journal of Economic Perspectives, 14, 159–181. Festinger, L. (1954). A theory of social comparison processes. Human Relations, 7, 117–140. Fowler, F. J. (1993). Survey research methods. Newbury Park, CA: Sage. Gouldner, A. W. (1960). The norm of reciprocity: A preliminary statement. American Sociological Review, 25, 161–178. Green, K. E., Boser, J. A., & Hutchinson, S. R. (1998). Response-rate differences and responseenhancement effects by population type. Psychological Reports, 83, 336–338. Greenberg, M. S. (1980). A theory of indebtedness. In: K. J. Gergen, M. S. Greenberg & R. H. Willis (Eds), Social exchange: Advances in theory and research (pp. 3–26). New York: Plenum Press. Greer, T. V., Chuchinprakam, N., & Seshadri, S. (2000). Likelihood of participating in mail survey research: Business respondents’ perspective. Industrial Marketing Management, 29, 97–109. Groves, R. M., Cialdini, R. B., & Couper, M. P. (1992). Understanding the decision to participate in a survey. Public Opinion Quarterly, 56, 475–495. Hambrick, D. (2004). The disintegration of strategic management: It’s time to consolidate our gains. Strategic Organization, 2, 91–98. Heckman, J. J. (1979). Sample selection bias as a specification error. Econometrica, 47, 153–161. Hoffman, E., McCabe, K. A., & Smith, V. L. (1998). Behavioral foundations of reciprocity: Experimental economics and evolutionary psychology. Economic Inquiry, 36, 335–352. Jehn, K. A. (1995). A multimethod examination of the benefits and detriments of intragroup conflict. Administrative Science Quarterly, 40, 256–282. Jobber, D., & O’Reilly, D. (1998). Industrial mail surveys: A methodological update. Industrial Marketing Management, 27, 95–107. Jobber, D., & Saunders, J. (1993). A note on the applicability of the Bruvold–Comer model of mail survey response rates to commercial populations. Journal of Business Research, 26(3), 223–236. Latane, B., & Darley, J. (1970). The unresponsive bystander: Why doesn’t he help? New York: Appleton-Century-Crofts. McDonald, M. L., & Westphal, J. D. (2003). Getting by with the advice of their friends: Ceos’ advice networks and firms’ strategic responses to poor performance. Administrative Science Quarterly, 48, 1–32. Milgram, S. (1963). Behavioral study of obedience. Journal of Abnormal and Social Psychology, 67, 371–378. Milgram, S., Bickman, L., & Berkowitz, L. (1969). Note on the drawing power of crowds of different size. Journal of Personality and Social Psychology, 13, 79–82.
Molm, L. D. (2003). Power, trust, and fairness: Comparisons of negotiated and reciprocal exchange. In: S. R. Thye J. Skvoretz (Eds), Advances in group processes: Power and status organizing processes (Vol. 20, pp. 31–65). Elsevier: Ames. Mowen, J. C., & Cialdini, R. B. (1980). On implementing the door-in-the-face compliance technique in a business context. Journal of Marketing Research, 17, 253–258. Perugini, M., Gallucci, M., Presaghi, F., & Ercolani, A. P. (2003). The personal norm of reciprocity. European Journal of Personality, 17, 251–283. Pettigrew, A. M. (1992). The character and significance of strategy process research. Strategic Management Journal, 13, 5–16. Porter, L. W., Allen, R. W., & Angle, H. L. (1981). The politics of upward influence in organizations. Research in Organizational Behavior, 3, 109–164. Rao, H., Greve, H. R., & Davis, G. F. (2001). Fool’s gold: Social proof in the initiation and abandonment of coverage by wall street analysts. Administrative Science Quarterly, 46, 502–526. Raven, B. H. (1999). Kurt Lewin address: Influence, power, religion, and the mechanisms of social control. Journal of Social Issues, 55, 161–186. Regan, D. T. (1971). Effects of favor and liking on compliance. Journal of Experimental Social Psychology, 7, 627–639. Settoon, R. P., Bennett, N., & Liden, R. C. (1996). Social exchange in organizations: Perceived organizational support, leader-member exchange, and employee reciprocity. Journal of Applied Psychology, 81, 219–227. Steensma, H. K., & Corley, K. G. (2001). Organizational context as a moderator of theories on firm boundaries for technology sourcing. Academy of Management Journal, 44, 271–291. Uehara, E. (1995). Reciprocity reconsidered: Gouldner’s ‘‘moral norm of reciprocity’’ and social support. Journal of Social and Personal Relationships, 12, 483–502. Weiner, J., & Brehm, J. W. (1966). Buying behavior as a function of verbal and monetary inducements. In: J. W. Brehm (Ed.), A theory of psychological reactance. New York: Academic Press. Westphal, J. D. (1998). Board games: How CEOs adapt to increases in structural board independence from management. Administrative Science Quarterly, 43, 511–538. Westphal, J. D. (1999). Collaboration in the boardroom: Behavioral and performance consequences of CEO-board social ties. Academy of Management Journal, 42, 7–24. Westphal, J. D., & Stern, I. (2005). The other pathway to the boardroom: How interpersonal influence behavior can substitute for elite credentials and demographic majority status in gaining access to board appointments. The University of Texas at Austin. Whatley, M. A., Webster, M. J., & Smith, R. H. (1999). The effect of a favor on public and private compliance: How internalized is the norm of reciprocity? Basic Applied Social Psychology, 21, 251–259. Yammarino, F. J., Skinner, S. J., & Childers, T. L. (1991). Understanding mail survey response behavior: A meta-analysis. Public Opinion Quarterly, 55, 613–639. Yukl, G., & Tracey, J. B. (1992). Consequences of influence tactics used with subordinates, peers, and the boss. Journal of Applied Psychology, 77, 525–535. Zajac, E. J. (1990). CEO selection, succession, compensation and firm performance – a theoretical integration and empirical-analysis. Strategic Management Journal, 11, 217–230. Ziegler, R., Diehl, M., Zigon, R., & Fett, T. (2004). Source consistency, distinctiveness, and consensus: The three dimensions of the Kelley ANOVA model in persuasion. Personality and Social Psychology Bulletin, 30, 352–364.
MANAGERIAL CONSTRAINT: THE INTERSECTION BETWEEN ORGANIZATIONAL TASK ENVIRONMENT AND DISCRETION

Brian K. Boyd and Steve Gove

ABSTRACT

Managerial constraint is a central theme in strategic management research. Although discussed using a variety of labels (including choice and determinism) and theoretical perspectives (including resource dependence and population ecology), the common question is the degree to which executives have choices or options when making decisions. Two of the most commonly used approaches for discussing constraint are organizational task environments (Dess & Beard, 1984) and managerial discretion (Hambrick & Finkelstein, 1987). These two papers share substantial commonalities in both their theoretical background and operationalization, raising the question of whether discretion and task environment are indeed separate constructs. This chapter reviews both conceptual and methodological issues associated with the use of task environment and discretion. Drawing on a review of published studies and original data analysis, we offer methodological suggestions for future research.
Do managers matter? And, if so, under what conditions? Managerial constraint is a key component in the debate between population ecology and strategic choice perspectives. Two frameworks, Dess and Beard's (1984) model of organizational task environments and Hambrick and Finkelstein's (1987) managerial discretion, are commonly cited sources for studies of managerial constraint. These two papers also share some interesting characteristics. First, while both papers are widely cited, there are only a small number of empirical applications of each framework. Within this set of empirical studies, there are widely varying practices regarding variable selection and the measurement of these variables; differences that can limit the generalizability of results. Additionally, the two perspectives draw on similar theory and employ similar – and sometimes identical – measures. Thus, although the two papers are rarely cited concurrently, it is not clear whether task environment and discretion are unique, overlapping, or identical constructs.

The purpose of this chapter is to facilitate the use of task environment and discretion variables in future studies. We begin with a review of the two constructs, including theoretical foundations and intended focus. Second, we review empirical studies that have used task environment or discretion variables, including a content analysis of methodological practices. Next, we discuss the implications of different approaches to definition and measurement. We assess these measurement options based on an analysis of data from 130 industry groups. We offer suggestions based on these analyses for future studies. Finally, we evaluate the overlap between discretion and task environment.
CONSTRUCT DEVELOPMENT AND APPLICATION

In this section, we will review two topics. In a theoretical overview, we compare the notions of strategic choice and determinism. Additionally, we examine how both task environment and discretion relate to these perspectives. In the second section, we present content analyses of how task environment and discretion have been used in prior studies. We identified nearly 500 journal articles that have cited either Dess and Beard (1984) or Hambrick and Finkelstein (1987). From these articles, we identified 87 studies that used task environment or discretion variables in their analyses.

Common Roots, Different Applications

What constrains managerial action? Influenced by general systems theory (Bertalanffy, 1968), management scholars began to explore the role of
external forces in shaping a company’s direction. One example of this trend was Emery and Trist’s (1965) depiction of four environmental archetypes of increasing complexity: placid randomized, placid clustered, disturbed reactive, and turbulent field. A prominent research stream during this period was the creation of measures to assess uncertainty due to environmental factors (e.g., Duncan, 1972; Lawrence & Lorsch, 1967). However, a number of methodological limitations were identified with these studies (Downey, Hellriegel, & Slocum, 1975; Tosi, Aldag, & Storey, 1973), resulting in little consensus on either the definition or measurement of firm constraints. Subsequently, competing theoretical models emerged to explain the effect of external factors on organizations. Population ecology (Aldrich, 1979; Hannan & Freeman, 1977) characterized the firm’s interaction with outside forces as a Darwinian model of natural selection: Companies are essentially unable to control their environment, and internal inertia prevents successful adaptation. Survival, therefore, is contingent on having the right set of attributes at the right time. The conclusion that senior managers are largely interchangeable with one another is representative of this perspective (Lieberson & O’Connor, 1972). In contrast, strategists argued that adaptation to external constraints was the key to organizational survival, and that top managers played a central role in this process (Andrews, 1971; Chandler, 1962). A number of strategy models are descendents of strategic choice: Resource dependence theory (Pfeffer & Salancik, 1978) focused on mechanisms that companies could use to help buffer outside forces. Similarly, research on top management teams and upper echelons (Gupta, 1984; Hambrick & Mason, 1984) addressed how attributes of senior managers drove decision-making and subsequent firm performance. Despite this growing interest in external constraint on organizations, there was little consensus – at either conceptual or empirical levels – on the articulation of these factors. As part of the population ecology framework, Aldrich (1979) integrated prior work to propose six dimensions of business environments: capacity, heterogeneity, stability, concentration, domain consensus, and turbulence. Drawing on these common roots, two prominent papers emerged in the 1980s that offered frameworks to characterize factors that constrain managerial action: Dess and Beard’s (1984) study of organizational task environments, and Hambrick and Finkelstein’s (1987) paper on managerial discretion. On the basis of citations, both articles have been very influential: The Social Science Citation Index reports 309 citations to Dess and Beard,1 and 172 citations to Hambrick and Finkelstein. However, influence does not translate directly to application: despite the large number of citations, there
have been only 19 empirical applications of Dess and Beard's framework that have used all three dimensions and comparable variables, with a number of other papers that used either a partial set of dimensions or very different measures. Similarly, there have been only 16 empirical applications of the discretion construct. Of greater concern, there has been little consistency in the use of either construct: Empirical studies have used a wide range of variables, with an equally wide range of parameters to define and measure these variables. Additionally, empirical studies often tap only limited aspects of these two constructs. Looking across the two perspectives, the concepts of task environment and discretion have been used interchangeably: multiple articles, for example, have equated highly munificent industry environments with high levels of discretion (e.g., Datta, Rajagopalan, & Zhang, 2003; Goll & Rasheed, 1997). Similarly, high levels of industry volatility have been equated with high levels of discretion (Haleblian & Finkelstein, 1993). Additionally, similar or identical indicators are often used to measure both discretion and task environment: industry growth rates, for example, have been used to tap both task environments (e.g., Boyd, 1990; Keats & Hitt, 1988) and discretion (e.g., Hambrick & Abrahamson, 1995). Additionally, this same variable, measured at the firm level, has also been used to measure discretion (Finkelstein & Boyd, 1998).

Thus, although the two papers are rarely cited in the same article, the question remains: How distinct are task environments and discretion from one another? Fig. 1 shows some possible answers to this question. First, the lack of co-citation may simply reflect that these are separate and distinct – i.e., they are orthogonal or independent constructs. Alternatively, these may be loosely linked – sharing some aspects, but still largely independent of each other. Finally, the task environment and discretion labels may simply be another case of a common problem: the use of inconsistent labels to describe a common external pressure (Boyd, Dess, & Rasheed, 1993).

Both papers share a common theoretical foundation: the distinction between strategic choice and determinism. However, there are substantial differences in the two frameworks. Dess and Beard's focus is at the industry level, and their intent was to develop a reliable set of indicators to measure how levels of uncertainty varied from industry to industry. In contrast, Hambrick and Finkelstein (1987, p. 370) were interested in constraints at the individual level:

It is important to stress that our focus is on the discretion of top managers – specifically chief executives – and not the discretion of organizations per se. Obviously, managerial discretion is limited by organizational discretion, so part of our analysis will still pertain to those who are interested solely in restrictions on organizations. Our interest in chief
executive discretion stems from related research we are conducting on executive characteristics, compensation, and succession – phenomena we believe will be far better understood if discretion is considered.

Fig. 1. Possible Relationships between Organizational Task Environment and Discretion: (a) Unrelated; (b) Loosely Linked; (c) Virtually Identical.
Both papers treat their respective constructs as multidimensional. Aspects of each are shown in Fig. 2. Dess and Beard integrated work by Aldrich (1979), Child (1972), and others to propose that environmental uncertainty would be shaped by three elements of the task environment for a given industry: Munificence is the availability of resources, and is negatively associated with uncertainty. Munificent environments are resource rich, with ample ability to support organizational growth. Dynamism refers to volatility or unpredictability in the environment, and is positively associated with uncertainty. Finally, complexity refers to the variety of an industry – concentration of inputs, for example, or organizational density. Complexity is positively associated with uncertainty. A number of subsequent articles have focused specifically on the conceptualization and measurement of Dess and Beard’s framework (Castrogiovanni, 2002; Harris, 2004; Sharfman & Dean, 1991). Hambrick and Finkelstein also proposed three factors that would shape the level of discretion2 – generally defined as the latitude of action available to an executive. Their determinants were found at the individual, firm, and industry levels. First, they noted that managerial characteristics would drive discretion. Specifically, executive personality characteristics (e.g., cognitive complexity, commitment), personal power base, and political skills would
serve to create, or limit, options available to a particular manager. Second, internal forces, such as organizational slack, firm inertia, or political power held by others, can also shape discretion. Finally, the task environment was expected to shape discretion as well. As shown in Fig. 2, many of the task environment elements of discretion closely resemble munificence, dynamism, and complexity.

Fig. 2. Conceptualization of Task Environment and Discretion. (a) Dess and Beard's (1984) three elements of task environment, each contributing to environmental uncertainty: munificence (capacity to support growth, organizational slack, resource scarcity), dynamism (unpredictable change), and complexity (organizational density and concentration). (b) Hambrick and Finkelstein's (1987) three elements of discretion: managerial characteristics (aspiration level, commitment, cognitive complexity, internal locus of control, power base, political acumen), internal forces (inertia, resource availability, powerful inside forces), and the task environment (product differentiation, market growth, industry structure, demand instability, quasi-legal constraints, powerful outside forces).

Because Hambrick and Finkelstein's article was a conceptual one, as opposed to Dess and Beard, subsequent researchers had less guidance – one might even say greater latitude of action – in how to define and measure discretion. As a result, in subsequent applications, discretion has been studied at the individual, firm, and industry levels, has been observed directly and inferred indirectly, and measured in a variety of ways, including expert assessment, survey, and archival measures (Boyd & Salamin, 2001). A few studies have focused primarily on the measurement of discretion (Carpenter & Golden, 1997; Finkelstein & Boyd, 1998; Hambrick & Abrahamson, 1995). In the next two sections, we describe empirical studies that have operationalized both the task environment and discretion frameworks.
Empirical Applications of Dess and Beard's Task Environment

Using the Social Science Citation Index, we identified 306 published papers that cited Dess and Beard's (1984) article on task environments. We should note that this article pool is not an exhaustive list of possible uses, as books and book chapters are not included in the SSCI, nor are some journals. We then reviewed each article on the SSCI list, excluding papers that were not available as hard copy journals, electronic journals, or via interlibrary loan. We then classified papers into three categories: articles that cited Dess and Beard, but did not use any of their measures; articles that used a subset of Dess and Beard dimensions (e.g., used munificence and dynamism, but not complexity); and articles that used all three of the Dess and Beard dimensions. Overall, we identified 71 papers that used Dess and Beard constructs in their analyses: 28 studies that used all three dimensions, and 43 studies that used a subset of the dimensions. In three quarters of the papers, task environment variables were used as predictor or contingency variables. The remaining one quarter of papers used these dimensions to control for industry effects. Interestingly, while the Dess and Beard model was framed to study objective aspects of industry constraint, a number of studies measured munificence, dynamism, or complexity with perceptual, or survey, items. In fact, of the 28 papers that used all three dimensions, nine relied solely on
perceptual measures. Therefore, of the 306 papers, we identified only 19 articles that used industry-level data for all three constructs. Attributes of the full-use studies are shown in Table 1, and partial-use studies are shown in Table 2.
Time Horizon

The first aspect of the papers we reviewed was the time horizon. Munificence and dynamism scores are calculated by regressing an industry variable against time. While 5- and 10-year time windows are the most commonly used horizons, studies also used 2-, 3-, 7-, 9-, and even 19-year windows. Studies using five- and ten-year windows typically cited prior work as precedent; otherwise, the rationale for a specific horizon was not emphasized. Additionally, studies did not explore whether different time horizons might affect results.
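As an illustration of this regression-on-time approach, the sketch below computes munificence and dynamism for a single industry under one common archival operationalization: the slope of industry sales regressed on year for munificence and the standard error of that slope for dynamism, each scaled by mean sales (the mean-standardization convention discussed later in the chapter). The sales series and the comparison of 5- and 10-year windows are hypothetical.

```python
# Illustrative sketch only: munificence and dynamism for one industry under a
# common archival operationalization (regression of industry sales on year;
# slope for munificence, standard error of the slope for dynamism, both scaled
# by mean sales). The sales series and window lengths are hypothetical.
import numpy as np
import statsmodels.api as sm

years = np.arange(1995, 2005)                     # hypothetical 10-year series
sales = np.array([412., 430., 455., 461., 470.,
                  502., 515., 508., 531., 560.])  # industry sales ($ millions)

def task_environment_scores(years, sales, window):
    """Munificence and dynamism over the trailing `window` years."""
    y, s = years[-window:], sales[-window:]
    fit = sm.OLS(s, sm.add_constant(y)).fit()
    slope, slope_se = fit.params[1], fit.bse[1]
    return slope / s.mean(), slope_se / s.mean()   # munificence, dynamism

# The choice of time horizon changes the scores: compare 5- and 10-year windows.
for window in (5, 10):
    munificence, dynamism = task_environment_scores(years, sales, window)
    print(f"{window}-year window: munificence={munificence:.4f}, dynamism={dynamism:.4f}")
```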
Industry Variables

The variables used to measure munificence and dynamism varied widely. Industry sales and total employment were the most commonly used measures. The price-cost margin and value added were also used by a number of studies. Less commonly used variables included the number of establishments, capital expenditures, and return on assets. Prior use of a variable was the most common basis for justifying its use. The overwhelming majority of studies represented each dimension with a single indicator – Harris (2004) and Jarley, Fiorito, and Delaney (1997) are the two prominent exceptions. Because of the reliance on single indicators, most studies were not able to report tests of dimensionality for the industry variables used. Among studies that did test dimensionality, three quarters relied on partial (e.g., reliability of individual dimensions) versus extensive (e.g., confirmatory factor analysis) testing. Also, some studies with multiple indicators reported correlations among composite variables versus raw indicators, which limits comparison against other studies (Boyd, Gove, & Hitt, 2005).

Complexity is most frequently measured by economic concentration. Boyd (1990) reported results of the MINL transformation (see Schmalensee, 1977) that yields an approximation of the Herfindahl index based on traditional concentration ratios (4-firm, 8-firm, etc.). The use of H with the MINL transformation is the most widely used approach to operationalize complexity. Other studies, however, have measured this dimension via the concentration or dispersion of other variables, such as value-added or employees.
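To make the concentration-based approach concrete, the snippet below computes a Herfindahl index (H) directly from firm-level sales, which is the quantity the MINL procedure approximates when only published 4-firm or 8-firm concentration ratios are available. The firm sales figures are hypothetical, and the MINL transformation itself follows Schmalensee (1977) and is not reproduced here.

```python
# Illustrative sketch only: Herfindahl index from hypothetical firm-level
# sales. H is the sum of squared market shares; the MINL transformation
# (Schmalensee, 1977) approximates H from published concentration ratios.
import numpy as np

def herfindahl(firm_sales):
    """Sum of squared market shares (approaches 1/N as the industry becomes
    atomistic and equals 1.0 for a monopoly)."""
    shares = np.asarray(firm_sales, dtype=float)
    shares = shares / shares.sum()
    return float(np.sum(shares ** 2))

print(herfindahl([120., 80., 60., 40., 30., 20.]))  # ~0.22 for this industry
```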
Table 1. Studies Using all Three Dess and Beard Dimensions.

(a) Studies using archival measures. For each study, the table reports the variables, time windows, and adjustments (mean or log) used to operationalize munificence, dynamism, and complexity. Studies: Bamford, Dean, and Douglas (2004); Bamford, Dean, and McDougall (2000); Bergh (1998); Berman, Wicks, Kotha, and Jones (1999); Boyd (1990); Boyd (1995); Castrogiovanni (2002); Harris (2004); Jarley et al. (1997); Karimi, Somers, and Gupta (2004); Keats and Hitt (1988); Lawless and Finch (1989); Lepak and Snell (2002); Lepak, Takeuchi, and Snell (2003); McArthur and Nystrom (1991); Pagell and Krause (2004); Palmer and Wiseman (1999); Sharfman and Dean (1991); Snell, Lepak, Dean, and Youndt (2000). Munificence and dynamism were typically based on industry sales (or, less often, bank deposits, employment, value added, the price-cost margin, net income, or the number of establishments), mostly over 5- or 10-year windows with mean or log adjustment; complexity was typically based on concentration measures such as H (MINL), 4-firm ratios, or the concentration or dispersion of sales, value added, employment, or establishments.

(b) Studies using survey measures. For each study, the table reports the number of survey items, the extent of item reporting (none, partial, or full), and the reliability for each dimension. Studies: Baum, Locke, and Smith (2001); Bensaou and Venkatraman (1995); Camison (2004); Chen and Lin (2004); Hart and Banbury (1994); Hart and Quinn (1993); Luo (1999); Luo and Peng (1999); Panayotopoulou, Bourantas, and Papalexandris (2003). The number of items per dimension ranged from 1 to 16, and reported reliabilities ranged from 0.60 to 0.97; Camison (2004) aggregated the survey items for all three dimensions into a single variable with a Cronbach's alpha of 0.93, without reporting the individual items.
Table 2. Studies Using a Subset of Dess and Beard Dimensions.

For each study, the table reports the variables, time windows, and adjustments used for whichever of the munificence, dynamism, and complexity dimensions the study included. Studies: Andersen (2001); Andersen (2004); Bantel (1998); Baucus and Near (1991); Bergh (1993); Bergh and Lawless (1998); Bloom and Michel (2002); Boeker (1997); Buchko (1994); Carpenter and Fredrickson (2001); Chattopadhyay, Glick, Miller, and Huber (1999); David et al. (2002); Dawley et al. (2002); Dean and Sharfman (1993); Dean and Snell (1996); Dean and Sharfman (1996); Delacroix and Swaminathan (1991); Floyd and Wooldridge (1992); Floyd and Wooldridge (1997); Goll and Rasheed (1997); Goll and Rasheed (2004); Goll and Rasheed (2005); Kotha and Nair (1995); Li and Simerly (1998); Li and Ye (1999); Lumpkin and Dess (2001); Luo (2005); Mishina, Pollock, and Porac (2004); Rajagopalan and Datta (1996); Rasheed (2005); Sharfman and Dean (1997); Sheppard (1995); Simerly (1997); Simerly and Li (2000); Spanos, Zaralis, and Lioukas (2004); Stetz and Beehr (2000); Stoeberl et al. (1998); Tushman and Anderson (1986); Wally and Baum (1994); Weinzimmer et al. (1998); Wholey and Brittain (1989); Wiklund and Shepherd (2005); Zhang and Rajagopalan (2004). These studies typically measured munificence and/or dynamism with industry sales (or, occasionally, survey items, employment, capital expenditures, assets, or GNP growth) over windows of roughly 2 to 10 years, usually with mean or log adjustment; the few complexity measures included 4-firm concentration ratios, H (MINL), input–output measures, and counts of competitors or of firm SIC codes.
Standardization
To facilitate comparison across industries, munificence and dynamism scores are standardized using either the mean of the industry variable or a log transform. Keats and Hitt (1988) used the log transform approach, while Boyd (1990) used mean standardization; the latter approach is more widely used. These papers are typically cited as the rationale for whichever approach an article adopts. A number of papers, however, used unstandardized scores; the comparability of scores across industries under this approach is not known.

Perceptual Measures
As noted previously, roughly one-third of the papers that use the full set of task environment variables do so via perceptual measures. These articles are listed in section (b) of Table 1. We also identified two studies that used only a subset of task environment dimensions and perceptual measures: Lumpkin and Dess (2001) and Luo (2005). While all studies report reliabilities of 0.60 or greater for each dimension, there is virtually no consistency in the survey measures used: The number of survey items used for a dimension, for instance, ranges between 1 and 16. Some studies also use very different numbers of indicators for each dimension – both Hart and Banbury (1994) and Panayotopoulou, Bourantas, and Papalexandris (2003), for example, use twice as many survey items for dynamism as for munificence or complexity. Also, while each paper created a unique set of survey items, limited testing was done with regard to reliability and validity. Most studies reported either the survey items themselves or the topics of the survey questions.

Partial Use of Task Environment Dimensions
The majority of studies that include a measure based on Dess and Beard (1984) include only a subset of the munificence, dynamism, and complexity dimensions. These studies typically focus on munificence and dynamism; of the 43 partial studies, only 5 included the complexity dimension. Partial studies are more likely to include task environment as a control variable, and most often rely on single-indicator measures.
Empirical Applications of Hambrick and Finkelstein's Discretion
We identified 172 articles through the Social Sciences Citation Index citing Hambrick and Finkelstein's (1987) article. As with our review of articles citing Dess and Beard's (1984) work, this approach excludes most books and chapters, and we excluded papers not available through our libraries,
electronic means, or via interlibrary loan. This may represent a limiting factor in our review. We examined each article citing Hambrick and Finkelstein (1987) and classified how discretion was utilized in the methodology. In total, we identified 16 papers4 that used discretion in some form in the analysis, some with multiple operationalizations (e.g., Hambrick & Abrahamson, 1995), and some with measures at multiple levels of analysis (e.g., Boyd & Salamin, 2001; Magnan & St. Onge, 1997).
Unlike Dess and Beard's work, discretion evolved in the literature from a conceptual discussion (Hambrick & Finkelstein, 1987). Early applications (Finkelstein & Hambrick, 1990; Haleblian & Finkelstein, 1993; Hambrick, Geletkanycz, & Fredrickson, 1993) measured discretion as ordered categories – i.e., either high/low or high/medium/low discretion settings. These rankings of discretion were based on reviews of archival data by the study authors, but with little explicit information in their methodologies about the analyses conducted, cutoff levels, or other considerations. Additionally, while Hambrick and Finkelstein initially characterized discretion at the level of the individual executive, these high/low groupings were made at the industry level: Neither internal forces nor managerial characteristics were included in these analyses.
As developed by Hambrick and Finkelstein, executive discretion was expected to be shaped by three types of forces: individual level (managerial characteristics), firm level (internal forces), and industry level (task environment). Subsequently, there have been efforts to refine the measurement of discretion at all three levels, with varying degrees of conformance with Hambrick and Finkelstein. At the industry level, Hambrick and Abrahamson (1995) created discretion measures via an unaffiliated panel of expert raters (surveys of academics and securities analysts). They reported a high level of agreement within both groups of experts. Following a multiple-trait, multiple-measure approach, the authors examined the correlation between the expert ratings and quantitative measures from archival sources. Their methodology did not include a factor analysis, but rather used a regression-based approach of predicting the expert panel discretion scores from quantitative indicators. At the firm level, Finkelstein and Boyd (1998) developed a multi-indicator factor model of discretion within a structural equation model predicting CEO pay. Included in their model were indicators of product differentiability, growth, demand instability, capital intensity, regulation, and industry concentration. These indicators can best be described as micro versions of the industry task environment, and differ substantially from Hambrick and Finkelstein's discussion of firm-level forces such as inertia or the presence of powerful inside forces. Finally, at the individual level,
Carpenter and Golden (1997) reported results based on a 15-item scale (α = 0.82) assessing discretion as part of a simulation. Their approach is notable for including sub-scales assessing both low-discretion (5 items; α = 0.74) and high-discretion (10 items; α = 0.78) environments (though the actual items were not reported), for using perceived managerial discretion as a dependent variable, and for employing a lab study based on a simulation. Again, the Carpenter and Golden measures differ substantially from the individual-level forces as presented by Hambrick and Finkelstein. We use these levels (industry, firm, and individual) to summarize the empirical use of discretion, as shown in Table 3. Next, we highlight some of the key characteristics of this research.

Level of Analysis
While conceptualized as a multi-level construct, the vast majority of studies have measured discretion at a single level. As shown in Table 3, the industry level of analysis is dominant, with only a handful of studies tapping either firm- or individual-level aspects of discretion. Only two studies (Boyd & Salamin, 2001; Magnan & St. Onge, 1997) used measures from multiple levels. At the industry level, indicators typically include some of the usual suspects seen in the Dess and Beard applications – i.e., growth and volatility. Unique indicators included the intensity of advertising and R&D to tap differentiation, capital intensity, and regulation. While industry-level studies mainly used archival data, Hambrick and Abrahamson (1995) also used expert ratings of industry discretion. Two different approaches have been used to measure discretion at the firm level. First, Finkelstein and Boyd (1998) used many of the same archival indicators as the industry studies, but measured them at the firm level. So, for example, industry growth became firm growth, and industry concentration became a weighted composite of Herfindahl scores from each of the firm's business segments. Second, Rajagopalan and Finkelstein (1992) extended the scope of discretion by proposing that discretion flowed from company strategy. Using the Miles and Snow (1978) typology, they developed hypotheses based on the premise that executives in Prospector firms would enjoy greater latitude in decision options than Defender executives. Strategic orientation has been used in subsequent studies as well (Boyd & Salamin, 2001; Rajagopalan, 1997). Similarly, Magnan and St. Onge (1997) used aspects of firm strategy (product mix, scope, and internationalization) to measure discretion. Finally, the individual level has seen the least empirical use. Additionally, the measures used in individual-level studies are very different from those proposed by Hambrick and Finkelstein:
Table 3. Studies Using Hambrick and Finkelstein Discretion.

For each study, the table reports the discretion measures used at the industry, firm, and individual levels of analysis. Studies: Abrahamson and Hambrick (1997); Aragon-Correa, Matias-Reche, and Senise-Barrio (2004); Boyd and Salamin (2001); Carpenter and Golden (1997); Datta, Guthrie, and Wright (2005); Datta and Rajagopalan (1998); Datta et al. (2003); Finkelstein and Boyd (1998); Finkelstein and Hambrick (1990); Haleblian and Finkelstein (1993); Hambrick and Abrahamson (1995); Hambrick et al. (1993); Magnan and St. Onge (1997); Rajagopalan (1997); Rajagopalan and Finkelstein (1992). Industry-level measures include archival indicators (growth, R&D intensity, advertising intensity, demand instability, capital intensity, concentration, and regulation), industries ranked into high/medium/low or high/low discretion categories, expert (academic and analyst) ratings, and banking laws that prohibit takeovers or branching. Firm-level measures include strategic orientation and company strategy (product mix, internationalization, and geographic scope). Individual-level measures include hierarchical position, political power based on membership in a dominant coalition, and perceived discretion.
Two studies used hierarchical position (Boyd & Salamin, 2001; Aragon-Correa, Matias-Reche, & Senise-Barrio, 2004) as a proxy of individual power. A third paper (Carpenter & Golden, 1997) developed a perceptual measure of individual discretion.

Measurement Practices
As with the pool of Dess and Beard articles, there was considerable variation in the manner in which discretion variables were operationalized. For example, some studies standardized growth and volatility scores to adjust for differences in industry size, while others did not. The time windows used to measure variables varied as well, including single- and multi-year composites. As with the Dess and Beard articles, the rationale for a specific time window was often not provided when creating growth or volatility measures. The sophistication of measurement also varied widely across studies, including ordinal categories, single measures, and multiple measures. Among the studies with multiple measures, indicators have been aggregated into a single index measure, treated as separate predictors, and loaded onto a multi-indicator latent factor model.
IMPLICATIONS FOR FUTURE RESEARCH

We address three topics in this section. First, as demonstrated in our content analysis, prior studies have used a wide range of options when defining
variables, particularly for growth and volatility scores. Using data from 130 industry groups, we examine how these definitional choices affect the resulting scores. Second, we examine the degree of overlap between discretion and task environment variables. For this analysis, we compare data from a sample of 400 firms with our industry-level measures. Finally, we use the Hrebiniak and Joyce (1985) model of strategic choice and determinism to integrate task environment and discretion.
Recommendations for Measurement
In this section, we review the implications of different choices in the construction of growth and volatility measures, which are central to both the discretion and task environment frameworks. While we illustrate these issues with industry-level data, the analyses are applicable to firm-level measures as well. We collected data for 130 SIC industry groups from the U.S. Industrial Outlook. Based on our content analysis of prior studies, we created munificence and dynamism scores using a variety of approaches, and then compared the correlations among these different measures. Results are reported in Tables 4 and 5.
Prior studies used a wide range of time horizons to calculate scores, ranging from as little as 2 years to as long as 19 years for the Dess and Beard articles. Time periods varied for discretion studies as well. We examined the effect of temporal stability by creating scores with four different time horizons: 3, 5, 7, and 10 years. To facilitate comparison, all time windows end in 1986 – i.e., the 10-year window includes 1977–1986 data, while the 5-year window includes 1982–1986 data. To address the effect of the choice of industry variable, we constructed separate measures based on industry sales (labeled value of shipments in the Industrial Outlook) and industry employment – these were the two most commonly used industry variables as reported in the content analysis of prior studies. Next, we used three different approaches to standardizing scores: mean standardization, log standardization, and unstandardized scores. While mean standardization is the most commonly used approach, Keats and Hitt (1988) standardized via log transforms, and this alternate method has been used in a minority of studies. For mean standardization, a regression model is run using the industry variable (i.e., sales or employment) as the dependent variable, with time as the predictor. The parameter estimate and standard error of the regression slope coefficient are then divided by the mean of the industry variable to create the munificence and dynamism scores, respectively. For log standardization, a log transform is applied to the industry variable prior to regression; the antilogs of the parameter estimate and standard error of the regression slope coefficient are then used as the munificence and dynamism scores, respectively. Unstandardized scores have been used as indicators in both task environment and discretion studies, and are more common in the pool of discretion studies. This approach uses the unadjusted parameter estimate and standard error of the regression slope coefficient.
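The sketch below is our illustration (not the code used in the analyses reported here) of the regression-on-time approach just described; it computes mean- and log-standardized munificence and dynamism scores from a short yearly series. The year range and sales figures are invented.

```python
# Minimal sketch: munificence and dynamism scores from a regression of a
# yearly industry variable on time. Data are hypothetical.
import math

def ols_slope_and_se(x, y):
    """Slope and standard error of the slope from a simple OLS regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(sse / (n - 2) / sxx)
    return slope, se

def mean_standardized_scores(years, values):
    """Munificence = slope / mean; dynamism = SE of slope / mean."""
    slope, se = ols_slope_and_se(years, values)
    mean = sum(values) / len(values)
    return slope / mean, se / mean

def log_standardized_scores(years, values):
    """Regress log(values) on time; report antilogs of the slope and its SE."""
    slope, se = ols_slope_and_se(years, [math.log(v) for v in values])
    return math.exp(slope), math.exp(se)

if __name__ == "__main__":
    years = list(range(1982, 1987))              # 5-year window ending in 1986
    sales = [410.0, 455.0, 430.0, 490.0, 520.0]  # hypothetical industry sales
    print(mean_standardized_scores(years, sales))
    print(log_standardized_scores(years, sales))
```

Packages such as scipy or statsmodels would return the same slope and standard error; the manual computation simply keeps the sketch dependency-free.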
Table 4. Comparison of Munificence Scores.

The table reports descriptive statistics and intercorrelations for 13 munificence scores constructed from the value of shipments (VoS) and total employment (TE), using mean, log, or no standardization, and 3-, 5-, 7-, and 10-year windows. Note: VoS = Value of Shipments; TE = Total Employment. All data are from the U.S. Industrial Outlook. All time windows end with 1986 data – e.g., the 10-year window includes 1977–1986, and the 3-year window includes 1984–1986. Most correlations were significant at 0.001.
Temporal Stability
Munificence and dynamism scores are only minimally affected by the choice of time horizon. For example, munificence scores based on any time window will have a correlation between 0.84 and 0.91 with any adjacent window – e.g., comparing a 5-year window to either the 3- or 7-year windows. Scores are also very similar even when comparing extreme ranges of time windows: Using mean-based standardization, for example, 3- and 10-year windows correlate on average at 0.53. Similarly, mean-standardized dynamism scores correlate at 0.83 on average. The importance of the time window has been raised in a number of articles. Boyd et al. (1993) noted that scores based on a broad time horizon may have limited meaning, as older data may be less relevant to current organizational issues.
Table 5. Comparison of Dynamism Scores.

The table reports descriptive statistics and intercorrelations for 13 dynamism scores constructed from the value of shipments (VoS) and total employment (TE), using mean, log, or no standardization, and 3-, 5-, 7-, and 10-year windows. Note: VoS = Value of Shipments; TE = Total Employment. All data are from the U.S. Industrial Outlook. All time windows end with 1986 data – e.g., the 10-year window includes 1977–1986, and the 3-year window includes 1984–1986. Most correlations were significant at the 0.001 or 0.05 level.
Additionally, Castrogiovanni (2002) reported that munificence, dynamism, and complexity scores tended to decline as an industry matured. However, the close correspondence of scores across time horizons suggests that the choice of a 5- versus 7-, or even 10-year, window may not be a meaningful concern.

Industry Variable
While industry sales is the most widely used variable, a variety of other indicators have been used, including capital expenditures, net income, profitability ratios, and assets. Total employment is the second most widely used industry variable. As shown in Tables 4 and 5, munificence and dynamism scores based on sales and employment data correlated strongly with each other. However, the 5- and 7-year window mean-adjusted scores tracked much more closely than scores based on either the shortest or longest time windows. For munificence, scores based on industry sales and employment correlated at 0.93 and 0.85 for the 7- and 5-year windows, respectively. In comparison, 10-year scores correlated at 0.74, and 3-year scores correlated at 0.66. Dynamism scores based on industry sales and employment reported correlations of 0.88 and 0.76 at the 7- and 5-year windows, respectively. The 10-year window reported a correlation of only 0.32, and the 3-year window correlated at 0.72. Part of this disparity may reflect the upward bias of industry sales – studies typically do not discuss transforming sales into constant dollars to control for inflationary pressures.

Standardization
Comparisons of mean and log standardization were made for the 5- and 10-year windows, using both industry sales and employment. Boyd (1990) had previously reported that the two approaches correlated strongly; however, that analysis was based on a relatively small sample of industry groups. Our analysis is based on a much larger pool of industries, and reports similar, but stronger, results than Boyd (1990). On average, munificence measures correlated at 0.97, and dynamism measures correlated at 0.98. There was no noticeable pattern of variation based on the time window or the choice of industry variable. Unstandardized scores, however, show a very different pattern of correlations. Munificence scores using industry sales and a 5-year window were not significantly correlated with any other estimates at the 5-, 7-, or 10-year windows. The unstandardized munificence score did, however, report a significant correlation of 0.35 with the 3-year estimate also based on industry sales. Unstandardized dynamism scores reported slightly better results, with a correlation of 0.25 with the 5-year mean-adjusted
industry sales score, and a correlation of 0.38 with the 3-year mean-adjusted industry sales score.
Potential for Omitted Variable Problems
As noted in the earlier tables, the majority of studies that use task environment variables use only a subset of the munificence, dynamism, and complexity constructs. The potential for omitted variable problems is substantial for discretion studies as well: Not only are there multiple levels of discretion that need to be addressed, but most studies use a small number of indicators to represent a particular level. Given the nature of both phenomena, researchers should be alert to potential omitted variable problems. We illustrate this issue using data on task environments, but the concern is applicable to discretion as well.
In their article, Dess and Beard (1984) proposed that munificence, dynamism, and complexity would be independent dimensions. Thus, their factor model reported orthogonal rather than oblique factors. It is important to note that there was no a priori theoretical basis for this – rather, Dess and Beard viewed their model as an exploratory one, and reported independent factors solely in the interest of parsimony. In a subsequent replication, Harris (2004) reported significant covariances across all three dimensions.5 Based on our review of prior studies, both the strength and the direction of the covariance among task environment dimensions are highly influenced by the characteristics of a particular sample. Table 6 aggregates the correlations for these dimensions, based on the section (a) articles from Table 1. Using a weighted mean of these articles, the three dimensions would appear to be virtually independent: Munificence and dynamism correlate at 0.11, munificence and complexity at 0.02, and dynamism and complexity at 0.02. However, these scores vary dramatically across studies. The largest correlation between munificence and dynamism was 0.88; at the other extreme, one study reported a negative correlation of 0.46. Similarly, munificence and complexity reported correlations ranging from -0.34 to 0.62. Finally, dynamism and complexity had correlations ranging from -0.46 to 0.48.
Omitted variable problems occur when two predictor variables have overlapping variance, and one of the predictors is excluded from the model. As a result, a statistical model may overestimate the effect of the included predictor on the dependent variable. Given the range of strong correlations that have been reported between munificence, dynamism, and complexity, and the general tendency to use only a subset of these dimensions, there is ample basis to question the accuracy of prior studies.
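The simulation below (our illustration, with invented coefficients) makes the point concrete: when two correlated task environment dimensions both influence an outcome and one is omitted, the coefficient on the retained dimension absorbs part of the omitted dimension's effect.

```python
# Minimal simulation of omitted variable bias with two correlated predictors.
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Two correlated "task environment" predictors, e.g., munificence and dynamism.
munificence = rng.normal(size=n)
dynamism = 0.6 * munificence + rng.normal(scale=0.8, size=n)

# Outcome depends on both dimensions (true coefficients 0.5 and 0.4).
performance = 0.5 * munificence + 0.4 * dynamism + rng.normal(size=n)

def ols_coefs(X, y):
    """OLS coefficients via least squares (intercept column added)."""
    X = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

full = ols_coefs(np.column_stack([munificence, dynamism]), performance)
reduced = ols_coefs(munificence.reshape(-1, 1), performance)

print("Full model coefficient on munificence:    %.2f" % full[1])     # ~0.50
print("Reduced model coefficient on munificence: %.2f" % reduced[1])  # inflated
```

With the invented coefficients above, the reduced model's munificence coefficient converges to roughly 0.74 rather than the true 0.50, overstating its effect by the omitted dimension's contribution.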
Table 6. Correlations between Munificence, Dynamism, and Complexity.

Article                          N     Mun.–Dyn.   Mun.–Compl.   Dyn.–Compl.
Bamford et al. (2004)            490      0.04        0.33          0.01
Bamford et al. (2000)            140      0.14        0.15          0.07
Bergh (1998)                     168      0.21        0.13          0.03
Berman et al. (1999)             486      0.10        0.39         -0.46
Boyd (1990)                      147      0.36        0.58          0.29
Boyd (1995)                      192      0.49        0.02          0.13
Castrogiovanni (2002)             45      0.19        0.38          0.24
Harris (2004)                    247     -0.46        0.17          0.48
Jarley et al. (1997)              50      0.28        0.62          0.20
Karimi et al. (2004)              77      0.35        0.29          0.39
Keats and Hitt (1988)            262      0.06       -0.34          0.14
Lepak and Snell (2002)           206      0.20        0.03          0.13
Lepak et al. (2003)              148      0.22        0.01          0.16
McArthur and Nystrom (1991)      109      0.26        0.03          0.18
Pagell and Krause (2004)         168      0.88        0.04          0.05
Palmer and Wiseman (1999)        235      0.15        0.06          0.13
Snell et al. (2000)               74      0.86        0.21          0.15
Weighted mean                             0.11        0.02          0.02
Minimum                                  -0.46       -0.34         -0.46
Maximum                                   0.88        0.62          0.48
Net Recommendations
The most common approach to building industry-level munificence and dynamism scores is to use a 5-year time window, mean standardization, and industry sales. Continued use of this approach facilitates comparison of results across different samples and analyses. However, if authors have a theoretical basis for either a longer or shorter time window, this change should have only a minimal effect on scores. Similarly, the choice of standardization method should not have an effect on results. Many studies, however, either do not standardize their measures or are unclear about whether they standardized scores. As shown in our comparison, unstandardized growth or volatility scores show, at best, only a tenuous correspondence with either the mean- or log-adjusted measures. Thus, some form of standardization should be included in any study that utilizes a multi-industry sample. The choice of an industry variable is more important than either the time window or the standardization approach, as there is some variation between scores based on industry sales and employment. These differences are likely driven by multiple factors, including the failure to control for inflation in sales-based scores and advances in man-hour productivity, such as lean manufacturing. Scores based on industry sales should ideally be restated in constant dollars. Additionally, the use of multiple measures would be helpful. Finally, studies should consider including all three task environment dimensions, even as control variables, to address potential omitted variable problems. In the case of discretion, studies should include a broader pool of indicators. A summary of measurement recommendations is shown in Table 7.
Different Labels, Common Phenomenon?
Based on our review, there appear to be more similarities than differences in the applied use of task environment and discretion. Although discretion was intended to capture individual psychological traits and organizational factors such as inertia or internal politics, the de facto indicators for discretion, as Table 3 shows, are typically measured at the industry level. Consequently, industry-level growth rates, volatility, economic concentration, and capital intensity have been used to characterize both discretion and task environment. So, are these truly different constructs, or simply different labels?
Boyd et al. (1993) noted that organizational environments can be measured at multiple levels, and with multiple approaches. A modified version of their framework for classifying measures is shown in Fig. 3.
Table 7. Recommended Measurement Practices for Future Studies.

Time window for growth and volatility scores: Five-year windows offer the greatest generalizability to other studies. However, measures are generally comparable if the window size is adjusted up or down by two years.
Standardization for growth and volatility scores: Mean and log standardization yield almost identical results, but mean standardization is the most widely used practice. Unstandardized scores correlate poorly with either mean- or log-standardized scores, and should be limited to single-industry samples.
Choice of variables for growth and volatility scores: Sales offer the greatest generalizability to other studies.
Omitted variables: Correlations between dimensions have varied widely from study to study, creating a realistic potential for omitted variable problems. Task environment studies should include measures for munificence, dynamism, and complexity even if hypotheses relate to just one dimension. Discretion studies should use multiple measures, ideally at different levels – e.g., firm versus individual – when appropriate.
Level of analysis: Scores based on industry- versus firm-level data will covary but still measure different aspects of constraint. Authors should provide explicit rationale for the use of a given level.
Consider a regression slope coefficient for sales, standardized by its mean. When "sales" is measured as industry sales, this is a measure of munificence, an aspect of the industry task environment. When "sales" is measured as firm sales, however, the variable is less macro, and represents the level of discretion facing a firm. In practice, how similar or different are these two measures, separated only by the level of analysis?
Boyd et al. (1993) reported comparative data for the semiconductor industry, based on sales data for 1984–1989. While semiconductors are often described as a "boom or bust" business, this time period was relatively stable, with moderate levels of munificence and dynamism. However, scores based on data for individual members of this industry showed considerably more variation.
Fig. 3. Levels of Environmental Measurement. Source: Boyd et al. (1993). The framework classifies environmental measures by level of analysis (industry versus firm) and by measurement approach (objective versus perceptual): industry-level objective measures (e.g., the slope or mean of sales) correspond to munificence, firm-level measures correspond to discretion, and perceptual measures are drawn from outsiders, insiders (TMT, middle managers), or simulations.
Intel, for instance, reported a munificence score 60 percent higher than that of the industry; in contrast, Siliconix reported a munificence score only one-quarter that of the industry. Additionally, deviation from the industry norm on a single task environment dimension did not guarantee similar abnormality on another: Chips and Technologies, for instance, reported an average level of dynamism, despite having a munificence score five times that of the industry. Overall, Boyd et al. (1993, p. 215) concluded that industry-level measures of uncertainty are "less relevant for characterizing the level and nature of uncertainty felt by individual firms."
Since the comparison by Boyd et al. (1993) was anecdotal, and limited to a single industry, we conducted a more comprehensive analysis of firm- and industry-level munificence and dynamism scores. We obtained firm-level growth and volatility scores for 400 firms; these were a subset of the sample used by Finkelstein and Boyd (1998). We then matched these data via 4-digit SIC codes with the industry-level scores that we reported earlier. Both sets of measures were based on the same time period, and correlations are reported in Table 8.
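The matching step is a simple join on the industry identifier. The sketch below uses hypothetical firms, SIC codes, and scores (not the authors' data) to illustrate merging firm-level munificence scores with industry-level scores on the 4-digit SIC code and then correlating the two.

```python
# Illustrative merge of firm-level and industry-level scores on 4-digit SIC.
import pandas as pd

firm_scores = pd.DataFrame({
    "firm":             ["A", "B", "C", "D"],
    "sic4":             [2834, 2834, 3674, 3674],
    "firm_munificence": [0.09, 0.05, 0.03, 0.02],
})

industry_scores = pd.DataFrame({
    "sic4":                 [2834, 3674],
    "industry_munificence": [0.07, 0.02],
})

matched = firm_scores.merge(industry_scores, on="sic4", how="left")
print(matched)
print("Correlation:",
      matched["firm_munificence"].corr(matched["industry_munificence"]))
```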
Table 8. Comparison of Industry-Level and Firm-Level Scores.

(a) Munificence variables      Basis    1      2      3      4      5
1 VoS                          Mean    1.00
2 TE                           Mean    0.84   1.00
3 VoS                          Log     0.95   0.86   1.00
4 TE                           Log     0.84   0.98   0.91   1.00
5 Firm sales                   Mean    0.45   0.21   0.46   0.18   1.00
Mean:                                  0.06   0.00   1.06   1.00   0.08
Std. Devn:                             0.07   0.05   0.08   0.06   0.12

(b) Dynamism variables         Basis    1      2      3      4      5
1 VoS                          Mean    1.00
2 TE                           Mean    0.36   1.00
3 VoS                          Log     0.97   0.54   1.00
4 TE                           Log     0.43   0.99   0.57   1.00
5 Firm sales                   Mean    0.16   0.19   0.17   0.29   1.00
Mean:                                  0.02   0.01   1.02   1.01   0.03
Std. Devn:                             0.02   0.01   0.02   0.01   0.03

Note: All correlations significant at 0.001. All variables are based on 1982–1986 data. Industry data are from the U.S. Industrial Outlook; firm data are based on a subset of the sample from Finkelstein and Boyd (1998). VoS = Value of Shipments; TE = Total Employment.
Munificence scores based on sales at the two levels correlated at approximately 0.46, and were essentially unaffected by the basis for standardizing the industry-level scores. Firm-level growth scores based on sales correlated less strongly, at approximately 0.20, with industry-level measures based on total employment. On average, dynamism scores were more loosely linked: Firm- and industry-level scores based on sales correlated at approximately 0.17. Interestingly, the firm-level score based on sales correlated more strongly with the industry measures based on employment. While recognizing the limitations of correlational analysis, these results should be considered preliminary evidence that task environment and discretion are loosely linked, rather than independent or totally overlapping, constructs. Future research could help clarify the degree of overlap by reporting results with scores based on different levels of analysis.

Integrating Task Environment and Discretion
Recognizing that task environment and discretion are distinct yet overlapping constructs, what are the implications for management research? One salient issue is a broader form of an omitted variable problem: If a
researcher is interested in firm-level discretion, and includes it as a predictor in a model, an unknown degree of the explained variance may actually be due to an unmeasured, industry-level counterpart. As a result, industry-level studies may have overestimated the role of munificence, dynamism, and complexity, while firm-level studies may have overstated the effect of discretion. One solution may simply be for studies to create both firm- and industry-level variables. A superior approach, however, would be to go beyond integrating measures and instead integrate the task environment and discretion frameworks in future research.
A model developed by Hrebiniak and Joyce (1985) offers an excellent basis for developing integrative hypotheses. Briefly, the authors proposed that choice and determinism were not competing perspectives, but rather complementary ones. A modified version of their model is shown in Fig. 4a. Environmental determinism is driven by structural characteristics of industries – a perfectly competitive market, for instance, is highly deterministic. Alternately, an environment rich in resources – which Hrebiniak and Joyce (1985) described as "benign" – is less deterministic. Independent of these structural characteristics, individual firms will have varying degrees of strategic choice available to them. For example, a firm in a commodity market – e.g., Nucor in steelmaking – may develop innovative ways to build a cost advantage. Alternately, other firms might develop strategies to consolidate industries that were previously heavily fragmented; Blockbuster and video rental, or Service Corp International and funeral homes, are two examples. Hrebiniak and Joyce (1985, p. 342) noted that "The essential point is that external constraints and high environmental determinism need not necessarily prevent individual choice and impact on strategic adaptation."
Hrebiniak and Joyce (1985) proposed that different strategies would be appropriate for each of the choice–determinism combinations, and that firm performance would be affected by the match between strategy and context. Lawless and Finch (1989) conducted the first empirical test of the choice–determinism model, using the task environment variables munificence, dynamism, and complexity. Overall, they reported that the high-determinism quadrants had higher levels of environmental uncertainty; i.e., high determinism translated into less munificence, more dynamism, and more complexity. However, Lawless and Finch were unable to find support for Hrebiniak and Joyce's propositions regarding matching strategy to context, or the accompanying performance effects. A probable explanation for their lack of results was the use of task environment variables to characterize both determinism and strategic choice.
Fig. 4. Using Hrebiniak and Joyce (1985) to Integrate Task Environment and Discretion: (a) Original Model, (b) Diagonal Elements, and (c) Off-Diagonal Elements. The original model crosses strategic choice (high/low) with environmental determinism (low/high), yielding four quadrants: Maximum Choice (low environmental uncertainty, high managerial discretion), Differentiated Choice (high uncertainty, high discretion), Incremental Choice (low uncertainty, low discretion), and Restricted Choice (high uncertainty, low discretion). Panels (b) and (c) highlight, respectively, the diagonal quadrants (Maximum Choice and Restricted Choice), where firm and industry attributes track closely, and the off-diagonal quadrants (Differentiated Choice and Incremental Choice).
Hrebiniak and Joyce (1985) defined determinism as an external constraint, which is consistent with Dess and Beard's (1984) model of organizational task environments. Strategic choice, however, was framed either at the level of the firm or of the CEO. In either case, Hambrick and Finkelstein's (1987) discretion model maps closely onto this dimension of the Hrebiniak and Joyce 2 × 2 model. A subsequent paper (Bamford, Dean, & McDougall, 2000) also used the Hrebiniak and Joyce framework; however, their 21 strategic choice indicators were measures of decisions made at the time of company founding,
versus the latitude of choice available to firms. More recently, Dawley, Hoffman, and Lamont (2002) applied the Hrebiniak and Joyce framework: They used munificence to measure the degree of determinism, and firm slack to measure strategic choice. They found that, for firms emerging from bankruptcy, strategic choice was more important than environmental conditions in predicting firm survival.
For many companies, firm-level discretion will closely mirror broader industry conditions: Settings with low industry uncertainty (e.g., high munificence, low dynamism, and low complexity) will also offer high levels of firm-level discretion, and highly uncertain industry environments will tend to offer less firm-level discretion. These diagonal elements are shown in Fig. 4b, and are the sectors of Maximum Choice and Restricted Choice. The off-diagonal elements are shown in Fig. 4c. From a research perspective, these sectors are more interesting: They include Differentiated Choice, where firms have managed to cultivate discretion despite a highly uncertain environment, and Incremental Choice, where firms have little discretion despite a benign industry structure. In Fig. 4b settings, task environment and discretion variables are likely to have very similar effects. In Fig. 4c conditions, however, the effects of discretion and task environment might vary dramatically. Thus, one avenue for future research would be to include both industry- and firm-level indicators of constraint, and to test hypotheses that integrate the task environment and discretion perspectives.
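As a simple illustration of such an integrated design (our sketch; the scores, cutoffs, and firm labels are hypothetical), the snippet below combines an industry-level uncertainty score with a firm-level discretion score and assigns each firm to one of the four Hrebiniak and Joyce (1985) quadrants using a median split.

```python
# Illustrative quadrant classification combining industry- and firm-level scores.
from statistics import median

def classify(determinism, choice, det_cut, choice_cut):
    """Map (environmental determinism, strategic choice) to a quadrant label."""
    if choice >= choice_cut:
        return "Differentiated Choice" if determinism >= det_cut else "Maximum Choice"
    return "Restricted Choice" if determinism >= det_cut else "Incremental Choice"

if __name__ == "__main__":
    # Hypothetical scores: higher determinism = more uncertain task environment,
    # higher choice = more firm-level discretion.
    firms = {"A": (0.2, 0.9), "B": (0.8, 0.8), "C": (0.3, 0.1), "D": (0.9, 0.2)}
    det_cut = median(d for d, _ in firms.values())
    choice_cut = median(c for _, c in firms.values())
    for name, (det, cho) in firms.items():
        print(name, classify(det, cho, det_cut, choice_cut))
```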
CONCLUSION

Strategic management research has been characterized as placing less emphasis on construct measurement than is warranted (Boyd et al., 2005; Hitt, Boyd, & Li, 2004; Venkatraman & Grant, 1986). This criticism is applicable to research on both organizational task environments and discretion. Prior studies on both topics have used a wide array of variables, and inconsistent approaches to measuring those variables. Greater consistency in measurement will facilitate comparisons of findings and the generalizability of future studies. In addition to greater consistency in measurement, studies should consider using multiple indicators of the respective phenomena. Finally, we would encourage authors to develop hypotheses that explicitly integrate both the discretion and task environment frameworks.
NOTES

1. As of August 2005.
2. It is relevant to note that the concept of discretion did not originate with Hambrick and Finkelstein (1987). Pfeffer and Salancik (1978, pp. 244–247), for example, discussed determinism, constraint, and superior–subordinate relations as determinants of individual discretion. Similarly, Mintzberg (1983) discussed discretion in the context of organizational power.
3. Although studies did not typically report tests of dimensionality, virtually all papers followed the convention of treating munificence, dynamism, and complexity as separate predictors. Exceptions to this practice included Camison (2004), who aggregated 24 survey items into a single environmental uncertainty measure. Similarly, Wally and Baum (1994) reduced the three task environment dimensions to a composite score of environmental uncertainty.
4. We included Datta and Rajagopalan (1998) in this list. Discretion is not developed as a prominent theory in the paper, but the measures and approach are consistent with Hambrick and Finkelstein (1987), and another discretion paper (Hambrick & Abrahamson, 1995) is used to justify the measures.
5. Harris concludes that the Dess and Beard framework does not meet the requirements of construct validity, based on (a) covariance among dimensions, (b) lack of predictive validity, and (c) methods bias. As we noted previously, Dess and Beard reported that their model used orthogonal versus oblique factors because they viewed their analysis as exploratory in nature. Subsequent applications (e.g., Keats & Hitt, 1988; Boyd, 1990) have allowed munificence, dynamism, and complexity to covary. Regarding predictive validity, Harris cites Sharfman and Dean (1991), but none of the 70 other empirical applications of Dess and Beard listed in Tables 1 and 2. Finally, the assessment of methods bias is based on a comparison of two models (Models 4 and 5, on p. 868). While Harris (2004, p. 869) noted that "the final model (5) suggests good fit with the data," the χ² difference between the models is not significant, and the NFI, CFI, and GFI for the two models are identical. Thus, while acknowledging the importance of covariation across dimensions, the conclusions of Harris (2004) appear to overstep the available data.
REFERENCES Abrahamson, E., & Hambrick, D. C. (1997). Attentional homogeneity in industries: The effect of discretion. Journal of Organizational Behavior, 18, 513–532. Aldrich, H. (1979). Organizations and environments. Englewood Cliffs, NJ: Prentice-Hall. Andersen, T. J. (2001). Information technology, strategic decision making approaches and organizational performance in different industrial settings. Journal of Strategic Information Systems, 10(2), 101–119. Andersen, T. J. (2004). Integrating decentralized strategy making and strategic planning processes in dynamic environments. Journal of Management Studies, 41(8), 1271–1299. Andrews, K. R. (1971). The concept of corporate strategy. Homewood, IL: Richard D. Irwin.
Aragon-Correa, J. A., Matias-Reche, F., & Senise-Barrio, M. E. (2004). Managerial discretion and corporate commitment to the natural environment. Journal of Business Research, 57(9), 964–975. Bamford, C. E., Dean, T. J., & Douglas, T. J. (2004). The temporal nature of growth determinants in new bank foundings: Implications for new venture research design. Journal of Business Venturing, 19(6), 899–919. Bamford, C. E., Dean, T. J., & McDougall, P. P. (2000). An examination of the impact of initial founding conditions and decisions upon the performance of new bank start-ups. Journal of Business Venturing, 15(3), 253–277. Bantel, K. A. (1998). Technology-based, ‘‘adolescent’’ firm configurations: Strategy identification, context, and performance. Journal of Business Venturing, 13(3), 205–230. Baucus, M. S., & Near, J. P. (1991). Can illegal corporate-behavior be predicted?: An event history analysis. Academy of Management Journal, 34(1), 9–36. Baum, J. R., Locke, E. A., & Smith, K. G. (2001). A multidimensional model of venture growth. Academy of Management Journal, 44(2), 292–303. Bensaou, M., & Venkatraman, N. (1995). Configurations of interorganizational relationships: A comparison between U.S. and Japanese automakers. Management Science, 41(9), 1471–1492. Bergh, D. D. (1993). Don’t waste your time – the effects of time-series errors in management research – the case of ownership concentration and research-and-development spending. Journal of Management, 19(4), 897–914. Bergh, D. D. (1998). Product-market uncertainty, portfolio restructuring, and performance: An information-processing and resource-based view. Journal of Management, 24(2), 135–155. Bergh, D. D., & Lawless, M. W. (1998). Portfolio restructuring and limits to hierarchical governance: The effects of environmental uncertainty and diversification strategy. Organization Science, 9(1), 87–102. Berman, S. L., Wicks, A. C., Kotha, S., & Jones, T. M. (1999). Does stakeholder orientation matter? The relationship between stakeholder management models and firm financial performance. Academy of Management Journal, 42(5), 488–506. Bertalanffy, L. V. (1968). General Systems Theory. New York: George Braziller. Bloom, M., & Michel, J. G. (2002). The relationships among organizational context, pay dispersion, and managerial turnover. Academy of Management Journal, 45(1), 33–42. Boeker, W. (1997). Strategic change: The influence of managerial characteristics and organizational growth. Academy of Management Journal, 40(1), 152–170. Boyd, B. (1990). Corporate linkages and organizational environment – a test of the resource dependence model. Strategic Management Journal, 11(6), 419–430. Boyd, B. K. (1995). CEO duality and firm performance – a contingency-model. Strategic Management Journal, 16(4), 301–312. Boyd, B. K., Dess, G. G., & Rasheed, A. M. A. (1993). Divergence between archival and perceptual measures of the environment: Causes and consequences. Academy of Management Review, 18(2), 204–226. Boyd, B. K., Gove, S., & Hitt, M. A. (2005). Construct measurement in strategy research: Illusion or reality? Strategic Management Journal, 26(3), 239–257. Boyd, B. K., & Salamin, A. (2001). Strategic reward systems: A contingency model of pay system design. Strategic Management Journal, 22(8), 777–792.
Buchko, A. A. (1994). Conceptualization and measurement of environmental uncertainty – an assessment of the Miles and Snow perceived environmental uncertainty scale. Academy of Management Journal, 37(2), 410–425. Camison, C. (2004). Shared, competitive, and comparative advantages: A competencebased view of industrial-district competitiveness. Environment and Planning, 36(12), 2227–2256. Carpenter, M. A., & Fredrickson, J. W. (2001). Top management teams, global strategic posture, and the moderating role of uncertainty. Academy of Management Journal, 44(3), 533–545. Carpenter, M. A., & Golden, B. R. (1997). Perceived managerial discretion: A study of cause and effect. Strategic Management Journal, 18(3), 187–206. Castrogiovanni, G. J. (2002). Organization task environments: Have they changed fundamentally over time? Journal of Management, 28(2), 129–150. Chandler, A. D. (1962). Strategy and structure: Chapters in the history of American industrial enterprise. Cambridge, MA: MIT Press. Chattopadhyay, P., Glick, W. H., Miller, C. C., & Huber, G. P. (1999). Determinants of executive beliefs: Comparing functional conditioning and social influence. Strategic Management Journal, 20(8), 763–789. Chen, C. J., & Lin, B. W. (2004). The effects of environment, knowledge attribute, organizational climate, and firm characteristics on knowledge sourcing decisions. R & D Management, 34(2), 137–146. Child, J. (1972). Organizational structure, environment and performance. The role of strategic choice. Sociology, 6, 1–22. Datta, D. K., Guthrie, J. P., & Wright, P. M. (2005). Human resource management and labor productivity: Does industry matter? Academy of Management Journal, 48(1), 135–145. Datta, D. K., & Rajagopalan, N. (1998). Industry structure and CEO characteristics: An empirical study of succession events. Strategic Management Journal, 19(9), 833–852. Datta, D. K., Rajagopalan, N., & Zhang, Y. (2003). New CEO openness to change and strategic persistence: The moderating role of industry characteristics. British Journal of Management, 14(2), 101–114. David, J. S., Hwang, Y. C., Pei, B. K. W., & Reneau, J. H. (2002). The performance effects of congruence between product competitive strategies and purchasing management design. Management Science, 48(7), 866–885. Dawley, D. D., Hoffman, J. J., & Lamont, B. T. (2002). Choice situation, refocusing, and postbankruptcy performance. Journal of Management, 28(5), 695–717. Dean, J. W., & Sharfman, M. P. (1993). Procedural rationality in the strategic decision-making process. Journal of Management Studies, 30(4), 587–610. Dean, J. W., & Sharfman, M. P. (1996). Does decision process matter? A study of strategic decision making effectiveness. Academy of Management Journal, 39(2), 368–396. Dean, J. W., & Snell, S. A. (1996). The strategic use of integrated manufacturing: An empirical examination. Strategic Management Journal, 17(6), 459–480. Delacroix, J., & Swaminathan, A. (1991). Cosmetic, speculative, and adaptive organizationalchange in the wine industry – a longitudinal-study. Administrative Science Quarterly, 36(4), 631–661. Dess, G. G., & Beard, D. W. (1984). Dimensions of organizational task environments. Administrative Science Quarterly, 29, 52–73.
Downey, H. K., Hellriegel, D., & Slocum, J. W. (1975). Environmental uncertainty: The construct and its application. Administrative Science Quarterly, 20, 613–629. Duncan, R. (1972). Characteristics of organizational environments and perceived environmental uncertainty. Administrative Science Quarterly, 17, 313–327. Emery, F., & Trist, E. (1965). The causal texture of organizational environments. Human Relations, 18, 21–31. Finkelstein, S., & Boyd, B. K. (1998). How much does the CEO matter? The role of managerial discretion in the setting of CEO compensation. Academy of Management Journal, 41(2), 179–199. Finkelstein, S., & Hambrick, D. C. (1990). Top-management-team tenure and organizational outcomes – the moderating role of managerial discretion. Administrative Science Quarterly, 35(3), 484–503. Floyd, S. W., & Wooldridge, B. (1992). Middle management involvement in strategy and its association with strategic type – a research note. Strategic Management Journal, 13, 153–167. Floyd, S. W., & Wooldridge, B. (1997). Middle management’s strategic influence and organizational performance. Journal of Management Studies, 34(3), 465–485. Goll, I., & Rasheed, A. A. (2004). The moderating effect of environmental munificence and dynamism on the relationship between discretionary social responsibility and firm performance. Journal of Business Ethics, 49(1), 41–54. Goll, I., & Rasheed, A. M. A. (1997). Rational decision-making and firm performance: The moderating role of environment. Strategic Management Journal, 18(7), 583–591. Goll, I., & Rasheed, A. M. A. (2005). The relationships between top management demographic characteristics, rational decision making, environmental munificence, and firm performance. Organization Studies, 26(7), 999–1023. Gupta, A. K. (1984). Contingency linkages between strategy and general manager characteristics: A conceptual examination. Academy of Management Review, 9, 399–412. Haleblian, J., & Finkelstein, S. (1993). Top management team size, CEO dominance, and firm performance – the moderating roles of environmental turbulence and discretion. Academy of Management Journal, 36(4), 844–863. Hambrick, D. C., & Abrahamson, E. (1995). Assessing managerial discretion across industries – a multimethod approach. Academy of Management Journal, 38(5), 1427–1441. Hambrick, D. C., & Finkelstein, S. (1987). Managerial discretion – a bridge between polar views of organizational outcomes. Research in Organizational Behavior, 9, 369–406. Hambrick, D. C., Geletkanycz, M. A., & Fredrickson, J. W. (1993). Top executive commitment to the status-quo – some tests of its determinants. Strategic Management Journal, 14(6), 401–418. Hambrick, D. C., & Mason, P. (1984). Upper echelons: The organization as a reflection of its top managers. Academy of Management Review, 9, 193–206. Hannan, M. T., & Freeman, J. H. (1977). The population ecology of organizations. American Journal of Sociology, 82, 929–964. Harris, R. D. (2004). Organizational task environments: An evaluation of convergent and discriminant validity. Journal of Management Studies, 41(5), 857–882. Hart, S., & Banbury, C. (1994). How strategy-making processes can make a difference. Strategic Management Journal, 15(4), 251–269. Hart, S. L., & Quinn, R. E. (1993). Roles executives play – CEOs, behavioral complexity, and firm performance. Human Relations, 46(5), 543–574.
Hitt, M. A., Boyd, B. K., & Li, D. (2004). The state of strategic management research and a vision of the future. In: D. J. Ketchen & D. D. Bergh (Eds), Research methodology in strategy and management, (Vol. 1, pp. 1–31). Amsterdam: Elsevier. Hrebiniak, L. G., & Joyce, W. F. (1985). Organizational adaptation: Strategic choice and environmental determinism. Administrative Science Quarterly, 30, 336–349. Jarley, P., Fiorito, J., & Delaney, J. T. (1997). A structural, contingency approach to bureaucracy and democracy in US national unions. Academy of Management Journal, 40(4), 831–861. Karimi, J., Somers, T. M., & Gupta, Y. P. (2004). Impact of environmental uncertainty and task characteristics on user satisfaction with data. Information Systems Research, 15(2), 175–193. Keats, B. W., & Hitt, M. A. (1988). A causal model of linkages among environmental dimensions, macro organizational characteristics, and performance. Academy of Management Journal, 31(3), 570–598. Kotha, S., & Nair, A. (1995). Strategy and environment as determinants of performance: Evidence from the Japanese machine-tool industry. Strategic Management Journal, 16(7), 497–518. Lawless, M. W., & Finch, L. K. (1989). Choice and determinism: A test of Hrebiniak and Joyce framework on strategy-environment fit. Strategic Management Journal, 10(4), 351–365. Lawrence, P. R., & Lorsch, J. W. (1967). Organizations and environment. Boston: Harvard University Graduate School of Business Administration. Lepak, D. P., & Snell, S. A. (2002). Examining the human resource architecture: The relationships among human capital, employment, and human resource configurations. Journal of Management, 28(4), 517–543. Lepak, D. P., Takeuchi, R., & Snell, S. A. (2003). Employment flexibility and firm performance: Examining the interaction effects of employment mode, environmental dynamism, and technological intensity. Journal of Management, 29(5), 681–703. Li, M. F., & Simerly, R. L. (1998). The moderating effect of environmental dynamism on the ownership and performance relationship. Strategic Management Journal, 19(2), 169–179. Li, M. F., & Ye, L. R. (1999). Information technology and firm performance: Linking with environmental, strategic and managerial contexts. Information & Management, 35(1), 43–51. Lieberson, S., & O’Connor, J. (1972). Leadership and organizational performance: A study of large corporations. American Sociological Review, 37, 117–130. Lumpkin, G. T., & Dess, G. G. (2001). Linking two dimensions of entrepreneurial orientation to firm performance: The moderating role of environment and industry life cycle. Journal of Business Venturing, 16(5), 429–451. Luo, Y. D. (1999). Environment–strategy–performance relations in small businesses in China: A case of township and village enterprises in southern China. Journal of Small Business Management, 37(1), 37–52. Luo, Y. D. (2005). Transactional characteristics, institutional environment and joint venture contracts. Journal of International Business Studies, 36(2), 209–230. Luo, Y. D., & Peng, M. W. (1999). Learning to compete in a transition economy: Experience, environment, and performance. Journal of International Business Studies, 30(2), 269–295. Magnan, M. L., & St. Onge, S. (1997). Bank performance and executive compensation: A managerial discretion perspective. Strategic Management Journal, 18(7), 573–581.
McArthur, A. W., & Nystrom, P. C. (1991). Environmental dynamism, complexity, and munificence as moderators of strategy-performance relationships. Journal of Business Research, 23(4), 349–361. Miles, R. E., & Snow, C. C. (1978). Organizational strategy, structure, and process. New York: McGraw-Hill. Mintzberg, H. (1983). Power in and around organizations. Englewood Cliffs, NJ: Prentice-Hall. Mishina, Y., Pollock, T. G., & Porac, J. F. (2004). Are more resources always better for growth? Resource stickiness in market and product expansion. Strategic Management Journal, 25(12), 1179–1197. Pagell, M., & Krause, D. R. (2004). Re-exploring the relationship between flexibility and the external environment. Journal of Operations Management, 21(6), 629–649. Palmer, T. B., & Wiseman, R. M. (1999). Decoupling risk taking from income stream uncertainty: A holistic model of risk. Strategic Management Journal, 20(11), 1037–1062. Panayotopoulou, L., Bourantas, D., & Papalexandris, N. (2003). Strategic human resource management and its effects on firm performance: An implementation of the competing values framework. International Journal of Human Resource Management, 14(4), 680–699. Pfeffer, J., & Salancik, G. R. (1978). The external control of organizations: A resource dependence perspective. New York: Harper & Row. Rajagopalan, N. (1997). Strategic orientations, incentive plan adoptions, and firm performance: Evidence from electric utility firms. Strategic Management Journal, 18(10), 761–785. Rajagopalan, N., & Datta, D. K. (1996). CEO characteristics: Does industry matter? Academy of Management Journal, 39(1), 197–215. Rajagopalan, N., & Finkelstein, S. (1992). Effects of strategic orientation and environmentalchange on senior management reward systems. Strategic Management Journal, 13, 127–141. Rasheed, H. S. (2005). Foreign entry mode and performance: The moderating effects of environment. Journal of Small Business Management, 43(1), 41–54. Schmalensee, R. (1977). Using the H-index of concentration with published data. Review of Economics and Statistics, 59, 186–213. Sharfman, M. P., & Dean, J. W. (1991). Conceptualizing and measuring the organizational environment: A multidimensional approach. Journal of Management, 17(4), 681–700. Sharfman, M. P., & Dean, J. W. (1997). Flexibility in strategic decision making: Informational and ideological perspectives. Journal of Management Studies, 34(2), 191–217. Sheppard, J. P. (1995). A resource dependence approach to organizational failure. Social Science Research, 24(1), 28–62. Simerly, R. L. (1997). An empirical examination of the relationship between corporate social performance and firms’ diversification. Psychological Reports, 80(3), 1347–1356. Simerly, R. L., & Li, M. F. (2000). Environmental dynamism, capital structure and performance: A theoretical integration and an empirical test. Strategic Management Journal, 21(1), 31–49. Snell, S. A., Lepak, D. P., Dean, J. W., & Youndt, M. A. (2000). Selection and training for integrated manufacturing: The moderating effects of job characteristics. Journal of Management Studies, 37(3), 445–466. Spanos, Y. E., Zaralis, G., & Lioukas, S. (2004). Strategy and industry effects on profitability: Evidence from Greece. Strategic Management Journal, 25(2), 139–165.
Stetz, T. A., & Beehr, T. A. (2000). Organizations’ environment and retirement: The relationship between women’s retirement, environmental munificence, dynamism, and local unemployment rate. Journals of Gerontology Series B-Psychological Sciences and Social Sciences, 55(4), S213–S221. Stoeberl, P. A., Parker, G. E., & Joo, S. J. (1998). Relationship between organizational change and failure in the wine industry: An event history analysis. Journal of Management Studies, 35(4), 537–555. Tosi, H., Aldag, R., & Storey, R. (1973). On the measurement of the environment: An assessment of the Lawrence and Lorsch environmental uncertainty subscale. Administrative Science Quarterly, 18, 27–36. Tushman, M. L., & Anderson, P. (1986). Technological discontinuities and organizational environments. Administrative Science Quarterly, 31(3), 439–465. Venkatraman, N., & Grant, J. H. (1986). Construct measurement in organizational strategy research: A critique and proposal. Academy of Management Review, 11(1), 71–87. Wally, S., & Baum, J. R. (1994). Personal and structural determinants of the pace of strategic decision-making. Academy of Management Journal, 37(4), 932–956. Weinzimmer, L. G., Nystrom, P. C., & Freeman, S. J. (1998). Measuring organizational growth: Issues, consequences and guidelines. Journal of Management, 24(2), 235–262. Wholey, D. R., & Brittain, J. (1989). Characterizing environmental variation. Academy of Management Journal, 32(4), 867–882. Wiklund, J., & Shepherd, D. (2005). Entrepreneurial orientation and small business performance: A configurational approach. Journal of Business Venturing, 20(1), 71–91. Zhang, Y., & Rajagopalan, N. (2004). When the known devil is better than an unknown god: An empirical study of the antecedents and consequences of relay CEO successions. Academy of Management Journal, 47(4), 483–500.
ASSESSING THE EXTERNAL ENVIRONMENT: AN ENRICHMENT OF THE ARCHIVAL TRADITION

C. Chet Miller, dt ogilvie and William H. Glick

ABSTRACT

Organization theorists and strategy researchers have effectively leveraged archival assessments of the environment to better understand organizational actions and performance. Despite the successes, several issues continue to plague research. Vague constitutive definitions and mismatches between constitutive and operational definitions are among the most pressing of these issues. To further develop the archival tradition, we clarified existing definitions and proposed new definitions where warranted. Our work has implications not only for the selection of concepts and measures in future work but also for interpretations of past research.
The external environment has been a concern of organization theorists for many years. Although not emphasized in any distinct way until after World War II, its status has risen steadily over the past several decades with the advent of environmental contingency theory, institutional theory, and ecological theory. In the relatively young field of strategy, the importance of the environment has been recognized from the very beginning. By its very nature, the strategy field emphasizes an open-system conceptualization of
organizations, which assigns a central role to the environment. From the resource-based view of the firm to upper-echelons research, the external circumstances of an organization have been key considerations. The external environment is a set of elements (organizations, individuals, and so on) that lie beyond the organization’s boundary. It has been formally defined as those elements that are important to the tasks of the organization – those elements ‘‘potentially relevant to goal setting and goal attainment’’ (Dill, 1958). Although frequently considered to be a single construct with several correlated dimensions, the environment is best conceptualized as a multi-construct phenomenon with several connected but clearly distinct constructs. Variation, including dimensions such as instability and unpredictability, is perhaps the most popular of these constructs (for example studies, see Hough & White, 2004; Huber, Miller, & Glick, 1990; Li & Atuahene-Gima, 2001; March, 1991). Complexity, including dimensions such as number and heterogeneity of external elements, is another (for examples, see George, 2005; Keats & Hitt; 1988; Luo & Peng, 1999). Munificence is a third commonly used environmental construct (see Baum & Wally, 2003; Castrogiovanni, 1991; Park & Mezias, 2005). Two distinct strategies have been used to assess the environment. The first utilizes the perceptions of organizational decision makers or outside panels of experts. Informants are asked to respond to survey or interview questions in order to characterize an organization’s external circumstances. Although perceptions can be biased, the perceptual approach has been used successfully in hundreds of studies. From Child’s influential study of strategic choice (Child, 1972), to Miller’s studies of strategy and structure (e.g., Miller, 1988), to Priem’s studies of strategy process (e.g., Priem, Rasheed, & Kotulic, 1995), many important insights have been generated with this approach. The second strategy utilizes archival records. Here, archival data from sources such as Standard and Poor’s Compustat database and the U.S. Census of Manufactures are used to characterize the environments of organizations. One drawback of this approach is data availability. An organization theorist or strategy researcher with a nuanced view of an environmental property may be unable to find archival data suitable for capturing the appropriate phenomena. Despite this and other concerns, many studies have successfully relied upon the archival assessment strategy. Glick, Miller, and Huber (1993), for example, assessed turbulence through Compustat data. They found that comprehensiveness of strategic decisionmaking positively affected firm performance in turbulent industries. This finding complemented existing work based on different methods (Bourgeois
& Eisenhardt, 1988) and helped to motivate subsequent work that supported and elaborated the earlier findings (e.g., Priem et al., 1995; Zahra, Neubaum, & El-Hagrassey, 2002). As another example, Beckman, Haunschild, and Phillips (2004) assessed market volatility, finding that firms tend to exploit and reinforce their existing alliance networks under volatile market conditions. Their work complemented and extended existing research on alliances (e.g., Podolny, 1994). Although both the perceptual and archival traditions have contributed to the development of important organizational and strategic insights, each has shortcomings that must be addressed as research continues to evolve. For the perceptual approach, the propensity for inaccurate assessments must be considered. Bourgeois (1985) studied perceptual errors and found that they negatively impacted firm performance. Boyd, Dess, and Rasheed (1993), Sutcliffe (1994), and others have built on this research by examining sources of perceptual error. For the archival approach, limitations in constitutive definitions and mismatches between constitutive and operational definitions are pressing issues. Although some work has addressed these definitional concerns (e.g., Wholey & Brittain, 1989; Rasheed & Prescott, 1992), a great deal remains to be done. It is this definitional task within the archival tradition to which we now turn. While clarification of definitional issues has direct implications for the perceptional tradition, our focus is the archival tradition where these issues are most acute.
BACKGROUND AND PURPOSE The environment can be characterized in terms of its variation, complexity, and munificence (Aldrich, 1979; Dess & Beard, 1984). For each of these three constructs, multiple dimensions exist, and this is where the difficulties begin. First, organizational and strategy researchers are not always clear about the specific dimension being used to develop theoretical arguments. The fundamental problem is that constitutive definitions (i.e., conceptual definitions used in theory building) often are not specified in precise terms. In some cases, definitions are not explicitly provided (e.g., Anderson, 2001; Goll & Rasheed, 1997; Pelham, 1999). In other cases, constitutive definitions are provided but are overly generic. For example, variation is often labeled instability, dynamism, or volatility and defined in terms of amount or rate of change (e.g., Harris, 2004; Li & Simerly, 1998). Change, however, has specific, fundamental aspects related to magnitude, frequency,
and unpredictability, and this should be taken into consideration. In order for theory development to build from one study to the next, and in order to avoid confusion in the organization theory and strategy communities, more precision is called for. Second, and perhaps more problematic, dimensions of environmental constructs are often unintentionally mishandled within a given study. The primary problem here is that constitutive definitions do not always match operational definitions. Keats and Hitt (1988), for example, defined instability in terms of ‘‘difficult to predict discontinuities’’ (p. 579). Their operational definition, however, corresponded to average magnitude of change, a variable that may or may not be strongly correlated with unpredictability. Similarly, Deepak, Guthrie, and Wright (2005) defined dynamism in terms of unpredictable change but adopted an operational definition focused on average magnitude of change. Although the two studies cited here seem to have produced meaningful results and interpretations, mismatches between constitutive and operational definitions can cause inaccurate or unwarranted conclusions to be drawn from a study. Cumulative science is impossible without greater agreement on core definitions. Previous critiques of archival environmental assessments have been focused on additional issues, such as the lack of consensus concerning the most important dimensions of the environment, inconsistencies in labels applied to various dimensions, the weak connection between perceptual and archival measures, and possible weaknesses in convergent and discriminant validity among common measures, (e.g., Harris, 2004; Rasheed & Prescott, 1992; Sharfman & Dean, 1991). Summing all of the relevant issues, the archival tradition to environmental assessment is in need of attention. One purpose of our work is to address the above problems by clarifying and organizing constitutive definitions of environmental dimensions. A second purpose is to clarify existing operational definitions and to propose new ones where warranted. Implications of our work relate to interpretations of past research as well as future theory building and empirical assessment.
VARIATION Organizations are open systems that engage in transactions with their task environments (Thompson, 1967; Yuchtman & Seashore, 1967). As noted earlier, important constructs associated with the environment include variation, complexity, and munificence. Variation and complexity, the two most popular constructs, are discussed in this chapter, starting with variation.
Instability In simple terms, instability is the extent to which an environment exhibits change. Although this common definition has merit, clarification is necessary in order to highlight fundamental aspects of change. As is well known, instability can be decomposed into magnitude and frequency of change (see, for example, Child, 1972; Wholey & Brittain, 1989). Magnitude is the size of changes while frequency is the number of changes over some time period. Less established in the research literature is the fact that magnitude and frequency have different representations based on outliers versus the full texture of variation. Depending upon the focus and purpose of a particular research study, outlying versus full representations might be more useful. To date, however, the outlier approach has received almost no empirical attention in the archival tradition of environmental assessment. Similarly, frequency as an overall concept has received almost no empirical attention. Amplitude is the representation of magnitude based on outliers. It represents the difference between the most extreme states of the environment within discrete units of time. Differences for each unit of time are averaged to characterize the entire time period of interest (e.g., differences within years can be averaged to characterize a 5-year period). Environments characterized by high amplitude exhibit large changes. Environments that exhibit low amplitude exhibit small changes. Operationally, amplitude corresponds to the largest observed difference per unit of time in the plot of a variable such as industry sales, with differences being averaged across units of time to characterize the overall time period (see, for example, Lea´o, 2005; Wholey & Brittain, 1989). If, for example, changes in industry-level sales are used to indicate industry instability for a 5-year period, then amplitude might be assessed annually as the difference between maximum and minimum quarterly sales and the annual differences then would be averaged across the 5 years. Average magnitude of instability is the overall representation of magnitude of change. It represents the average size of changes in the environment for the time period of interest. This dimension has been by far the most popular in empirical examinations of instability based on the archival tradition. In operational terms, average magnitude corresponds roughly to the mean difference in environmental states. It is typically assessed using the coefficient of variation (e.g., Tosi, Aldag, & Storey, 1973) or the standard error of regression divided by the mean level of the dependent variable for a time series (e.g., Dess & Beard, 1984).
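To make these two operational definitions concrete, the following sketch computes them for a synthetic quarterly industry sales series. It is a minimal illustration under our own assumptions (Python with NumPy, quarterly data, a five-year window), not code from the studies cited above: amplitude is the within-year maximum-minus-minimum difference averaged across years, and average magnitude of instability is computed both as the coefficient of variation and as the standard error of a time-trend regression divided by the mean of the series.

```python
import numpy as np

def amplitude(series, periods_per_year=4):
    """Amplitude: within-year difference between the maximum and minimum
    observed values, averaged across the years of the window."""
    x = np.asarray(series, dtype=float).reshape(-1, periods_per_year)
    return float(np.mean(x.max(axis=1) - x.min(axis=1)))

def coefficient_of_variation(series):
    """Simple average-magnitude measure: standard deviation over the mean."""
    x = np.asarray(series, dtype=float)
    return float(np.std(x, ddof=1) / np.mean(x))

def average_magnitude(series):
    """Average magnitude of instability: standard error of a linear
    time-trend regression divided by the mean level of the series."""
    y = np.asarray(series, dtype=float)
    t = np.arange(len(y), dtype=float)
    slope, intercept = np.polyfit(t, y, 1)          # OLS time trend
    resid = y - (intercept + slope * t)
    std_err = np.sqrt(np.sum(resid ** 2) / (len(y) - 2))
    return float(std_err / np.mean(y))

# Hypothetical quarterly industry sales for a five-year window (20 quarters).
rng = np.random.default_rng(0)
sales = 100 + 2.0 * np.arange(20) + rng.normal(0, 5, 20)

print("Amplitude:", round(amplitude(sales), 2))
print("Coefficient of variation:", round(coefficient_of_variation(sales), 3))
print("Average magnitude of instability:", round(average_magnitude(sales), 3))
```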
Frequency of changes in fortune is the representation of frequency based on outliers. It is the number of valence reversing changes that occur across the relevant time period. Environments characterized by high frequency exhibit many changes that constitute a move from a positive to a negative trajectory or a move from a negative to a positive trajectory. Environments characterized by low frequency exhibit few such changes. Operationally, frequency of changes in fortune corresponds to a count of shifts in direction (e.g., Wholey & Brittain, 1989). If changes in industry-level sales are being used to indicate industry instability, frequency would be the count of positive to negative and negative to positive shifts in the slope of the sales plot over time (controlling for the general trend in sales). Total frequency is the overall representation of frequency. It is the number of changes in an environment, similar to Monge’s (1990) theoretical discussion of rate of change and periodicity. Operationally, total frequency corresponds to a count of changes. Again using sales as an example, this dimension of instability is the count of all shifts in the slope of the sales plot over time. Such shifts would not necessarily entail changes in slope valence, but could instead entail simple points of inflection. These clarifications of instability distinguish four dimensions: two based on outlying data points and two based on the full texture of variation. As a graphical method of clarification, the four dimensions are portrayed in Fig. 1. In this figure, the dotted line exhibits many changes. Because the changes involve many slope reversals that are quite small, both dimensions of frequency are high while both dimensions of magnitude are low. The solid bold line exhibits few but large changes. Thus, both dimensions of magnitude are high while both frequency dimensions are low. Finally, the solid non-emboldened line exhibits a more complex pattern. Here, there are both small and large changes, resulting in amplitude being high but average magnitude being moderate. Because there are a moderate number of changes in fortune (8) but many shifts in the slope (45), total frequency is high while frequency of fortune changes is moderate. Which of the four dimensions of instability is most important? This in fact is the wrong question to ask. Importance of a particular dimension depends upon specific research questions, theoretical issues, and the overall goals of a study. If, for example, managerial perceptions, behavior, and reactions to the environment are the focus, then amplitude and frequency of changes in fortune should be strongly considered. Both amplitude and frequency of fortune changes depend on extreme data points. Amplitude is focused on maximum and minimum values of resource flows in an industry (sales, profits, number of employees, and so on) while frequency is focused on
reversals of fortune in those resource flows. Research on behavioral decision theory (see, for example, Bazerman, 2006), and particularly the availability heuristic (Tversky & Kahneman, 1973, 1974), suggests that these extreme data points are likely to have disproportionate effects on perceptions and behavior, and by implication disproportionate effects on decisions about strategy and organizational design. Upper-echelon managers are more likely to notice and allocate attention to extreme values because they are vivid (Barnes, 1984).

[Fig. 1. Different Patterns of Environmental Instability: resource flows plotted over the overall time period. Dotted line: low amplitude, low average magnitude, high frequency of changes in fortune, and high total frequency. Bold solid line: high amplitude, high average magnitude, low frequency of changes in fortune, and low total frequency. Thin solid line: high amplitude, moderate average magnitude, low frequency of changes in fortune, and high total frequency.]

Amplitude and frequency of changes in fortune have been utilized in only a very limited way in the archival tradition, and this perhaps has been to our collective detriment. For example, studies of congruence between archival assessments of instability and perceptions of instability (e.g., Sharfman & Dean, 1991) probably would result in stronger relationships if the more vivid dimensions of amplitude and frequency of fortune changes were examined. Studies of congruence, however, have typically focused on average magnitude for the archival dimension and a mixed array of dimensions for perceived instability. Average magnitude of instability and total frequency also have important roles to play. By assessing the full texture of variation, these variables
probably capture important trends and cycles more accurately and consistently than do simple amplitude and frequency of changes in fortune. If so, they would be more useful in contingency studies involving performance. Unfortunately, to our knowledge, total frequency has never been used in an empirical study in the archival assessment tradition. Unpredictability Closely related to instability is unpredictability. Unpredictability refers to a lack of regularity in the pattern of change in an environment. In more unpredictable environments, changes are less foreseeable (Hannan & Freeman, 1977), and the probability of the environment being in a particular future state is less knowable. When a pattern is absent, change is difficult to predict (Bourgeois, 1980; Cameron, Kim, & Whetten, 1987; Duncan, 1972; Emery & Trist, 1965; Singh, 1986; Snyder & Glueck, 1982). Similar to the initial definition of instability, this definition of unpredictability requires clarification. Below, two specific dimensions are discussed. Magnitude of unpredictability is the extent to which irregular, nonsystematic change is present in an environment. In an environment with little change, this aspect of unpredictability is low, by definition. In an environment with a great deal of change, this aspect can be low or high depending upon the presence or absence of pattern in the flow of change over time. With regard to operational definitions, magnitude of unpredictability is the average size of fluctuations after controlling for systematic change involving growth, decline, and cyclicality. Despite its importance, this dimension has been neglected in empirical work. Proportional unpredictability is the proportion of change that is irregular or nonsystematic. If an environment exhibits modest change but much of that change does not follow a pattern, then proportional unpredictability would be high. If an environment exhibits a great deal of change, this aspect of unpredictability could be high if most of the change does not follow a pattern. In operational terms, proportional unpredictability roughly corresponds to unsystematic change divided by total change. In the few cases where it has been used in empirical research, proportional unpredictability has been assessed using the inverse of the $R^2$ (i.e., $1 - R^2$) from time-series regression (McCabe & Dutton, 1993; Miller, Burke, & Glick, 1998; Wholey & Brittain, 1989). If changes in sales are the foundation for the measure of unpredictability, then sales in one year are used to predict sales in the next (higher order terms can be used to detect non-linearity). A small $R^2$ indicates unpredictability.
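The frequency and unpredictability dimensions described above can be sketched in the same spirit. In the illustration below (again our own Python/NumPy sketch, not the measurement code of the cited studies), a shift in slope is counted whenever the sign of the period-to-period change reverses, which is one simple reading of "shifts in the slope"; frequency of changes in fortune is counted after removing a linear trend; and the two unpredictability dimensions come from the lagged regression described above, with a squared term for non-linearity. Richer controls for cyclicality would be needed for a full treatment of magnitude of unpredictability.

```python
import numpy as np

def frequency_measures(series):
    """Total frequency (all shifts in the slope of the raw plot) and frequency
    of changes in fortune (positive-to-negative or negative-to-positive slope
    reversals, counted after controlling for the general trend)."""
    y = np.asarray(series, dtype=float)
    raw_signs = np.sign(np.diff(y))
    total = int(np.sum(raw_signs[1:] != raw_signs[:-1]))       # any change in slope
    t = np.arange(len(y), dtype=float)
    slope, intercept = np.polyfit(t, y, 1)                     # general trend
    detrended_signs = np.sign(np.diff(y - (intercept + slope * t)))
    fortune = int(np.sum(detrended_signs[1:] * detrended_signs[:-1] < 0))
    return {"total_frequency": total, "changes_in_fortune": fortune}

def unpredictability_measures(series):
    """Magnitude of unpredictability (average size of the nonsystematic
    fluctuations, here the root mean squared residual) and proportional
    unpredictability (unsystematic change over total change, i.e. 1 - R^2),
    from a regression of each period's value on the prior period's value
    and its square."""
    y = np.asarray(series, dtype=float)
    prev, nxt = y[:-1], y[1:]
    X = np.column_stack([np.ones_like(prev), prev, prev ** 2])
    beta, *_ = np.linalg.lstsq(X, nxt, rcond=None)
    resid = nxt - X @ beta
    ss_res = float(np.sum(resid ** 2))
    ss_tot = float(np.sum((nxt - nxt.mean()) ** 2))
    return {"magnitude": float(np.sqrt(ss_res / len(nxt))),
            "proportional": ss_res / ss_tot}

# Hypothetical annual industry sales over 20 years.
rng = np.random.default_rng(1)
sales = 200 + 5.0 * np.arange(20) + rng.normal(0, 12, 20)

print(frequency_measures(sales))
print(unpredictability_measures(sales))
```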
[Fig. 2. Magnitude of Unpredictability Versus Proportional Unpredictability: resource flows plotted over the time period. Dotted line: low magnitude of unpredictability, high proportional unpredictability. Bold solid line: high magnitude of unpredictability, high proportional unpredictability. Thin solid line: low magnitude of unpredictability, low proportional unpredictability.]
As a final effort to clarify unpredictability, its two dimensions are portrayed graphically in Fig. 2. The thin solid and dotted lines reflect low magnitude of unpredictability because the average size of the unsystematic change is low in both cases. The thin solid line reflects low proportional unpredictability because a large proportion of the total change revolves around a predictable upward trend line. The dotted line, however, reflects high-proportional unpredictability because a higher proportion of the total change revolves around a trend line that is unpredictable. The bold solid line reflects high magnitude of unpredictability because the average size of the unsystematic change is quite large. It also reflects high-proportional unpredictability because a large proportion of the total change revolves around an unpredictable trend line. Many of the constitutive definitions used by previous researchers (e.g., Child, 1972) appear to emphasize proportional unpredictability and this aspect of unpredictability has been emphasized in the few relevant empirical analyses, but magnitude of unpredictability is probably more salient to managers and probably more relevant to organization theorists and strategy researchers. Managers are more likely to be concerned with overall unforeseeable change rather than trivial nonsystematic change that occurs in the context of general stability (i.e., changes that cause proportional unpredictability to be high in the absence of large unforeseeable change). For trivial
changes occurring in the context of general stability, managers can maintain a modest amount of flexibility to handle the minor disturbances. Large unsystematic change, however, is difficult, if not impossible, to plan for and can disrupt proposed managerial actions. It follows that important decisions regarding organizational design and strategy would be more tied to magnitude of unpredictability than to proportional unpredictability. As noted earlier, one of the chief issues in the archival tradition relates to a specific mismatch between constitutive and operational definitions of variation – variation is often defined constitutively as amount and unpredictability of change but then operationalized only as average magnitude of instability (e.g., Deepak et al., 2005; Goll & Rasheed, 2004; Simerly & Li, 2000). The relationship between average magnitude of instability and unpredictability is not, however, certain. For example, average magnitude of instability and magnitude of unpredictability are not the same and may not be strongly correlated in all contexts. In some contexts, there may be large changes that are mostly predictable. This seems to have been the case for the forest products industry studied by Fredrickson and Mitchell (1984) in their work on comprehensiveness (see Bourgeois & Eisenhardt, 1988). Similarly, average magnitude of instability and proportional unpredictability are not the same and may not be strongly correlated. Wholey and Brittain (1989), for example, found a correlation of 0.05 for a sample of manufacturing industries and 0.07 for a sample of cities. In our own analysis of this issue (Glick, ogilvie, & Miller, 1990), 583 industries were assessed using Compustat data. Exploratory factor analysis suggested that average magnitude of instability and proportional unpredictability are two distinct factors (variation was assessed in terms of changes in sales, net income, capital expenditures, return on assets, and total assets). The correlation between the two dimensions was only 0.44.
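A convergence check of this kind is simple to run on any panel of industry series. The sketch below simulates a set of industry sales histories, scores each on average magnitude of instability and on proportional unpredictability, and correlates the two scores. The simulation parameters are arbitrary and the sketch only shows the mechanics of such a check; it does not reproduce the Glick, ogilvie, and Miller (1990) analysis.

```python
import numpy as np

def avg_magnitude(y, t):
    """Standard error of a linear time-trend regression over the series mean."""
    slope, intercept = np.polyfit(t, y, 1)
    resid = y - (intercept + slope * t)
    return float(np.sqrt(np.sum(resid ** 2) / (len(y) - 2)) / y.mean())

def prop_unpredictability(y):
    """1 - R^2 from regressing each year's value on the prior year's value."""
    prev, nxt = y[:-1], y[1:]
    slope, intercept = np.polyfit(prev, nxt, 1)
    resid = nxt - (intercept + slope * prev)
    return float(np.sum(resid ** 2) / np.sum((nxt - nxt.mean()) ** 2))

rng = np.random.default_rng(3)
t = np.arange(20, dtype=float)
magnitude_scores, unpredictability_scores = [], []
for _ in range(200):                          # 200 simulated industries
    trend = rng.uniform(0.0, 8.0)             # industries differ in growth ...
    noise = rng.uniform(2.0, 30.0)            # ... and in nonsystematic change
    sales = 100 + trend * t + rng.normal(0.0, noise, t.size)
    magnitude_scores.append(avg_magnitude(sales, t))
    unpredictability_scores.append(prop_unpredictability(sales))

r = np.corrcoef(magnitude_scores, unpredictability_scores)[0, 1]
print("Correlation between the two dimensions:", round(float(r), 2))
```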
Aggregate Versus Intra-Industry Variation An additional issue that has received very little attention in the research literature involves the specific method used to compute scores for the various dimensions of variation. In the dominant method (e.g., Dess & Beard, 1984; Tushman & Anderson, 1986; Wholey & Brittain, 1989), scores are based on firm-level data aggregated to the industry level without regard for variation among firms within an industry. Fluctuations within an industry are not considered. This method ignores intra-industry competition and dynamics, which is particularly problematic if the industry is in a state of
zero-sum competition. In such an industry, aggregate data would suggest very little industry variation across time while each firm is having many ups and downs in the face of competitor tactics and strategies. When aggregate industry variables such as sales and asset levels are constant, organizations may still face unstable, unpredictable environments if the distribution of sales and assets is shifting rapidly and unpredictably from one competitor to another within an industry. Ruefli, Adaniya, Gallegos, and Limb (1993), in their longitudinal investigation of rank shift behavior at the inter and intra-industry levels, found evidence that patterns of behavior are different at the two levels. For instance, they found stability across industries, but instability from year to year within industries. Although their sample was only two industries, their finding clearly suggests the dangers of using only aggregate data. The patterns observed at the aggregate level are not necessarily duplicated at the intra-industry level (Ruefli et al., 1993). In our analysis of 583 industries (Glick et al., 1990), the dominant method based on aggregate data and an alternative method that included intraindustry variation were compared. For the dominant method, firms’ sales, capital expenditures, total assets, and so on were averaged within an industry for each year, and then average magnitude of instability and proportional unpredictability were assessed based on year to year fluctuations in the averages. For the alternative method, each firm’s sales, capital expenditures, and so on within an industry were used to assess instability and unpredictability for that firm in that industry. Then, these firm-level scores were averaged at the industry level. The latter method captures both firmlevel and industry-level variation. For instability, the shared variance between the two approaches was moderately high (66%), but for unpredictability, the shared variance was lower (46%). Moving forward, significant attention must be paid to this issue. Research based on the common aggregate industry approach misses a great deal of variation.
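The difference between the two computation methods can be illustrated with a toy industry in which total sales are roughly flat while market share shifts among firms. In the sketch below (hypothetical data, our own simplification in Python/NumPy), the dominant method scores instability on the per-year average of firm sales, while the alternative method scores each firm's series separately and averages the firm-level scores.

```python
import numpy as np

def average_magnitude(series):
    """Standard error of a linear time-trend regression over the series mean."""
    y = np.asarray(series, dtype=float)
    t = np.arange(len(y), dtype=float)
    slope, intercept = np.polyfit(t, y, 1)
    resid = y - (intercept + slope * t)
    return float(np.sqrt(np.sum(resid ** 2) / (len(y) - 2)) / y.mean())

# Hypothetical near-zero-sum industry: three firms, ten years, roughly constant
# total sales, with market share shifting among the firms from year to year.
rng = np.random.default_rng(2)
shares = rng.dirichlet(alpha=[1.0, 1.0, 1.0], size=10)        # years x firms
industry_total = 300.0 + rng.normal(0.0, 5.0, 10)
firm_sales = shares * industry_total[:, None]                 # years x firms

# Dominant method: average firm sales within the industry for each year,
# then score instability on the year-to-year series of averages.
aggregate_score = average_magnitude(firm_sales.mean(axis=1))

# Alternative method: score each firm's own series, then average the scores.
firm_scores = [average_magnitude(firm_sales[:, j]) for j in range(firm_sales.shape[1])]
intra_score = float(np.mean(firm_scores))

print("Instability, aggregate method:      ", round(aggregate_score, 3))
print("Instability, intra-industry method: ", round(intra_score, 3))
```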
Summary for Variation Two components of environmental variation, instability and unpredictability, have been clarified (see Table 1). Instability can be decomposed into four separate dimensions – amplitude, average magnitude of instability, frequency of changes in fortune, and total frequency. For those interested in examining the full texture of instability, average magnitude of instability and total frequency seem most appropriate. For those interested in predicting managerial
perceptions and reactions to instability, amplitude and frequency of changes in fortune are likely to be more appropriate. Unpredictability can be decomposed into two dimensions – magnitude of unpredictability and proportional unpredictability. Magnitude of unpredictability is likely to be more useful for both managers and researchers.

Table 1. Constitutive and Operational Definitions of Variation.

Amplitude
Constitutive definition: Difference between the most extreme states of the environment within a given unit of time, with differences being averaged across units of time to characterize the overall time period.
Operational definition: Largest observed difference for a given unit of time in the plot of a variable such as industry sales, with differences being averaged across units of time to characterize the overall time period.

Average magnitude of instability
Constitutive definition: Average size of changes in the environment for the time period.
Operational definition: Standard error of regression divided by the mean level of the dependent variable for a time series regression.

Frequency of changes in fortune
Constitutive definition: The number of valence reversing changes that occur during the time period.
Operational definition: Count of the number of directional shifts in the slope of a plot over time.

Total frequency
Constitutive definition: Number of changes in the environment during the relevant time period.
Operational definition: Count of the number of shifts in the slope of a plot over time.

Magnitude of unpredictability
Constitutive definition: Average size of irregular, nonsystematic changes in the environment during the relevant time period.
Operational definition: Average size of fluctuations after controlling for systematic change involving growth, decline, and cyclicality.

Proportional unpredictability
Constitutive definition: Proportion of change that is irregular/nonsystematic.
Operational definition: Unsystematic change divided by total change.
COMPLEXITY In very simple terms, complexity is the degree to which an environment is difficult to understand and effectively manage at a given moment in time. Four primary dimensions of complexity have been emphasized by organization theorists and strategy researchers: numerosity or the number of relevant elements in the environment (see, for example, Gifford, Bobbitt,
& Slocum, 1979; Huber, 1984; Shortell, 1977; Williamson, 1975); dispersion across the elements (see Dess & Beard, 1984; George, 2005; Pelham, 1999); heterogeneity or diversity of the elements (see Bourgeois, 1980; Child, 1972; Keats & Hitt, 1988; Tung, 1979); and interconnectedness among the elements in the firm’s environment (see Galbraith, 1977; Huber, 1984; Keats & Hitt, 1988; Lawless & Finch, 1989; Tung, 1979). Rather than conceiving of these dimensions as competing representations of complexity, we believe it is more helpful to view them as a developmental progression of complexity. The dimensions move from simple to more elaborate aspects of the complexity construct. Although simpler dimensions such as dispersion have been emphasized in operational definitions, the more elaborate dimensions, particularly when combined as we discuss below, better represent the complexity construct originally developed by Emery and Trist (1965) and Galbraith (1972). Unlike the dimensions of variation, which characterize the common environment experienced by a number of firms (e.g., within an industry), the complexity dimensions characterize the environment experienced by a particular firm. Complexity is experienced differently by different firms based on the number of elements in the environment with which they have relations and the nature and characteristics of the resource flows with those elements. Numerosity A basic definition of complexity is focused on the number of elements in the environment. Shortell (1977) argued that environmental complexity should be based solely on numerosity because inclusion of other dimensions causes a lack of clarity and imprecision, resulting in confusion. This position carries an implicit assumption that each element in the environment is discrete and that transactions with each element are of equal size. Numerosity seems deficient as an indicator of complexity because it ignores variations in the size of the resource flows with each element, which are captured by measures of dispersion. Dispersion Dispersion, a very common dimension of complexity in empirical work, indicates whether a firm’s resource flows (inputs and outputs) involve a large number of external elements and whether the exchanges with these elements vary in size. In comparison with numerosity, dispersion is sensitive to both
the number of elements and the size of transactions with each. This is similar to Pfeffer and Salancik’s (1978) resource concentration – ‘‘the relative number of alternatives available, as well as the size or importance of these alternatives’’ (p. 50). For example, two firms with 100 external elements have equal numerosity but face very different dispersion if one firm has 80% of its transactions with four of the elements while the other has 1% of its transactions with each of the 100 elements. In the first case, the environment is concentrated and is generally considered to be less complex because the firm must monitor and concern itself with only four elements. In the second case, the environment is dispersed and is generally considered to be more complex because issues (1) can come from anywhere and (2) can have an almost random feel as knowledge related to the activities and concerns of any one element may be somewhat superficial. Consider a third case in which a firm has transactions that are of equal size with only 10 external elements. Although numerosity is lower in this environment, the firm faces more dispersion than the first firm discussed above but not the second. In sum, dispersion is a refinement of numerosity that reflects both number of elements and size of transactions with those elements. Increasing the number of elements and decreasing the variation in the size of transactions together contribute to increasing dispersion. Operationally, dispersion has been assessed with the Gibbs–Martin formula (Gibbs & Martin, 1962): $1 - \sum x^2 / (\sum x)^2$ (for an example study, see Dess & Beard, 1984). It can also be measured in terms of entropy by considering a firm, i, selling to and buying from different organizations, j. Let $P_{ij}$ be the proportion of firm i’s total sales to organization j and let $P_{ji}$ be the proportion of firm i’s purchases from organization j. The entropy for the firm is $\sum_{j=1}^{n} P_{ij} \ln(1/P_{ij}) + \sum_{j=1}^{n} P_{ji} \ln(1/P_{ji})$. Note that this equation captures both resource outflows and inflows to reflect the dispersion of the overall environment. It captures dispersion more accurately than the Gibbs–Martin formula because it is more sensitive to variation among smaller environmental elements (Jacquemin & Berry, 1979). As suggested above, dispersion’s conceptual connection to complexity has been based largely on information processing arguments with the common position being that increasing dispersion is associated with increasing complexity (e.g., Harris, 2004; Keats & Hitt, 1988). Several counterarguments can be made, however. One argument suggests a negative relationship – complexity decreases with increases in dispersion. Highly dispersed environments suggest low power and impact for any one external element (Schiller, 1989), and suggest that each element can be ignored to
some degree. Low dispersion suggests a strong need to develop very close relationships and manage very closely the power dynamics associated with each important external element (Pfeffer, 1978; Pfeffer & Salancik, 1978). Thus, low dispersion may present the more complex case. A second and potentially more helpful counter argument suggests that dispersion’s relationship with complexity depends on heterogeneity among the external elements. This argument suggests there is no straightforward relationship between dispersion and complexity, as elaborated below. This is a very important issue, given dispersion’s popularity in empirical research.
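For readers who want the dispersion measures given above in computable form, the following sketch transcribes the Gibbs–Martin formula and the entropy measure for a hypothetical set of transaction volumes. Sales-side and purchase-side proportions are handled separately and summed for the entropy measure; zero proportions are skipped because ln(1/p) is undefined at zero. The example firms and numbers are our own, and the code is only an illustrative sketch.

```python
import numpy as np

def gibbs_martin(transactions):
    """Gibbs-Martin dispersion: 1 - sum(x^2) / (sum(x))^2, computed over raw
    transaction volumes with each external element."""
    x = np.asarray(transactions, dtype=float)
    return float(1.0 - np.sum(x ** 2) / np.sum(x) ** 2)

def entropy(proportions):
    """Entropy: sum of p * ln(1/p) over a vector of transaction proportions."""
    p = np.asarray(proportions, dtype=float)
    p = p[p > 0]                                  # ln(1/p) undefined at zero
    return float(np.sum(p * np.log(1.0 / p)))

def dispersion_entropy(sales_to, purchases_from):
    """Dispersion of firm i's overall environment: entropy of the proportions
    of sales going to each customer plus entropy of the proportions of
    purchases coming from each supplier."""
    s = np.asarray(sales_to, dtype=float)
    b = np.asarray(purchases_from, dtype=float)
    return entropy(s / s.sum()) + entropy(b / b.sum())

# Hypothetical transaction volumes: a concentrated firm vs. a dispersed firm.
concentrated_sales = [80, 10, 5, 5]          # 80% of sales go to one customer
dispersed_sales = [25, 25, 25, 25]
purchases = [40, 30, 30]

print("Gibbs-Martin:", round(gibbs_martin(concentrated_sales), 3),
      "vs.", round(gibbs_martin(dispersed_sales), 3))
print("Entropy-based dispersion:",
      round(dispersion_entropy(concentrated_sales, purchases), 3),
      "vs.", round(dispersion_entropy(dispersed_sales, purchases), 3))
```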
Heterogeneity Heterogeneity has been identified as a key aspect of complexity. In terms of constitutive definitions, this aspect of complexity is most common (see, for example, Castrogiovanni, 2002; Child, 1972; Dess & Beard, 1984; Tung, 1979). Heterogeneity relates to differentiation among elements in the environment – the extent to which these elements are qualitatively different from each other. Qualitative differences among suppliers, customers, and competitors lead to heightened learning requirements and development of multiple interaction routines, which lead to increased complexity. In formal terms, $H_i = \sum_{j=1}^{n} H_{ij}/n + \sum_{j=1}^{n} H_{ji}/n$, where $H_i$ is the heterogeneity faced by firm i, $H_{ij}$ the dissimilarity of firm i to firm j (a customer), and $H_{ji}$ the dissimilarity of firm j (a supplier) to firm i. Entropy measures reflect differences in sizes of transactions with various elements, but not the qualitative differences among types. In contrast with measures of dispersion, heterogeneity measures should be sensitive to the average degree of qualitative difference. Thus, the operational definition of heterogeneity must be based on the dissimilarity of each element relative to other elements. Dissimilarity of the environmental elements might be operationally defined in terms of differences along a variety of dimensions or more simply in terms of different taxonomic categories. Although useful in some ways, neither dispersion nor heterogeneity adequately captures the meaning of complexity, despite one being commonly used in operational definitions and the other being commonly used in constitutive definitions. Dispersion emphasizes the number of elements and variation in the size of transactions with those elements. An implicit assumption associated with the dispersion dimension is that all elements are similar in nature. If, in fact, all of the environmental elements are similar, it is not clear that having smaller or larger transactions with more or fewer
elements contributes much to complexity. The assumption of similarity does not allow for distinguishing qualitative differences, but such differences are no doubt crucial in creating more difficult managerial situations. The heterogeneity dimension is sensitive to important differences in environmental elements, but the implicit assumption underlying heterogeneity is that transactions with each element are of equal size. This may or may not be true in a given case. Dispersion and Heterogeneity Elaborated as Proximal Complexity Dispersion and heterogeneity are conceptually distinct and both are integral aspects of the complexity construct. Thus, the interaction of dispersion and heterogeneity is important, an interaction we label proximal complexity. Formally, this interaction can be calculated as follows:

$$PC_i = \sum_{j=1}^{n} (P_{ij} \ln(1/P_{ij}))(H_{ij}) + \sum_{j=1}^{n} (P_{ji} \ln(1/P_{ji}))(H_{ji})$$
where PCi is the proximal complexity of firm i (the terms corresponding to entropy and dissimilarity in the above equation should be normalized to ensure equal contributions to the variance of the composite value of proximal complexity). This operational definition explicitly incorporates all three elements of complexity from the preceding discussion: numerosity (through the entropy measure of dispersion), dispersion, and heterogeneity. By specifying a multiplicative interaction, this definition implies that the relationship between dispersion and complexity depends on heterogeneity. Although suggesting, in essence, that dispersion be weighted by heterogeneity is in many ways a simple idea, it is important given the nature of dispersion and heterogeneity and the lack of attention to the issue in past practice. In sum, if high levels of heterogeneity are coupled with broad dispersion, then proximal complexity is high. If low levels of heterogeneity are coupled with broad dispersion, then proximal complexity is not high. The interaction of dispersion and heterogeneity is labeled proximal complexity because it captures local rather than total complexity. Proximal complexity focuses on resource flows from the focal organization to the environment and from the environment to the organization. Thus, resource flows among external elements in the environment are assumed to be unimportant. This assumption can be relaxed to consider resource flows among all environmental elements because they are interconnected. Network
research highlights the importance of overall interconnectedness (see, for example, Brass, Galaskiewicz, Greve, & Tsai, 2004). Total Complexity Long ago, Emery and Trist (1965) argued that: A comprehensive understanding of organizational behavior requires some knowledge of each member of the following set, where L indicates some potentially lawful connection, and the suffix 1 refers to the organization and the suffix 2 to the environment:
$$L_{11}, \; L_{12}; \qquad L_{21}, \; L_{22}$$

$L_{11}$ here refers to processes within the organization – the area of internal interdependencies; $L_{12}$ and $L_{21}$ to exchanges between the organization and its environment – the area of transactional interdependencies, from either direction; and $L_{22}$ to processes through which parts of the environment become related to each other – i.e., its causal texture – the area of interdependencies that belong within the environment itself (p. 22).
Galbraith (1972) defined interconnectedness (which he also referred to as interrelatedness) as the ‘‘amount of connectedness or interdependence among the elements’’ (p. 55, emphasis added). Galbraith (1977) later noted that ‘‘the focal organization recognizes that those elements of the task environment on whom it is dependent also have dependence problems of their own’’ (p. 209). Thus, there are two types of environmental interdependence. Most researchers (e.g., Stearns, Hoffman, & Heide, 1987; Baker, 1990) have restricted their examination to transactional interdependence between the organization and its direct exchange partners (L12, L21). Extra-organizational interactions occurring among exchange partners and their partners are, however, very important aspects of interdependence (L22). Total complexity incorporates both types of environmental interdependence in order to augment more rudimentary aspects of complexity. Operationally, total complexity is the sum of (1) the focal firm’s proximal complexity, (2) a weighted average of the proximal complexities of the firm’s direct partners, and (3) a weighted average of the proximal complexities of indirect exchange partners (where the weights for direct partners correspond to relative transaction sizes with the focal firm and the weights for indirect partners depend upon transaction sizes with direct partners). Two final points are relevant to the discussion. First, under our scheme a firm’s competitors do not have a direct effect on complexity because the
complexity definitions are transactionally based. Competitors, however, are included because they are represented through the proximal complexity of a firm’s customers and suppliers. Second, the choice of proximal complexity or total complexity should be made on the basis of research questions and purposes. In cases where managerial perceptions and reactions to the environment are the focus, proximal complexity should be considered for study because direct relationships are likely to be very salient. In cases where, for example, firm performance is the focus, total complexity might be the better alternative. Summary for Complexity Complexity can be decomposed into progressively more encompassing dimensions (see Table 2). Numerosity is the least encompassing, but can be elaborated as dispersion. Dispersion does not have a direct relationship with complexity, but is made useful through weighting by heterogeneity. Numerosity, dispersion, and heterogeneity collectively form proximal complexity for a given firm. Taking into account a focal firm’s proximal complexity and the proximal complexities of its direct and indirect exchange partners yields total complexity for that firm. The choice of proximal versus total complexity should be made on the basis of specific research questions and purposes.

Table 2. Constitutive and Operational Definitions of Complexity.

Numerosity
Constitutive definition: Number of elements in the environment.
Operational definition: Count of number of elements with which transactions occur.

Dispersion
Constitutive definition: The extent to which resource flows are even across external elements.
Operational definition: Entropy for resource flows across elements with which transactions occur.

Heterogeneity
Constitutive definition: Qualitative differences among external elements.
Operational definition: Profile dissimilarity averaged across external elements with which transactions occur.

Proximal complexity
Constitutive definition: The extent to which resource flows are even across elements and the diversity of those elements.
Operational definition: Dispersion weighted by heterogeneity.

Total complexity
Constitutive definition: The extent to which resource flows are even across elements and the diversity of those elements, for the focal firm and for its direct and indirect exchange partners.
Operational definition: Proximal complexity of the focal firm plus weighted average of the proximal complexities of direct and indirect exchange partners, where weight factor involves transaction sizes.
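As a closing illustration for this section, the sketch below strings the complexity definitions together for a toy focal firm: entropy terms for each exchange partner are weighted by a 0-to-1 dissimilarity score and summed into proximal complexity, and a weighted average of direct partners' proximal complexities is then added to approximate total complexity. The partner data and dissimilarity scores are invented, the normalization step recommended above is omitted, and only one tier of partners is included, so this is a sketch of the logic rather than the authors' measurement procedure.

```python
import numpy as np

def proximal_complexity(partners):
    """Sum over exchange partners of (p * ln(1/p)) * dissimilarity, where p is
    the share of the focal firm's transactions (sales or purchases) with the
    partner and dissimilarity is scored 0-1. Normalization of the entropy and
    dissimilarity terms is omitted for brevity."""
    total = 0.0
    for share, dissimilarity in partners:
        if share > 0:
            total += share * np.log(1.0 / share) * dissimilarity
    return total

# Hypothetical focal firm: (transaction share, dissimilarity to the focal firm)
# for customers (sales side) and suppliers (purchase side).
customers = [(0.6, 0.2), (0.3, 0.7), (0.1, 0.9)]
suppliers = [(0.5, 0.4), (0.5, 0.4)]
pc_focal = proximal_complexity(customers) + proximal_complexity(suppliers)

# Total complexity, one tier out: focal proximal complexity plus a weighted
# average of direct partners' own proximal complexities, with weights equal to
# the partners' transaction shares with the focal firm. (The further tier of
# indirect partners described above is omitted here.)
partner_pcs = {"customer_A": 1.1, "customer_B": 0.8, "supplier_A": 1.4}   # assumed values
partner_weights = {"customer_A": 0.6, "customer_B": 0.3, "supplier_A": 0.5}
weights = np.array([partner_weights[k] for k in partner_weights])
pcs = np.array([partner_pcs[k] for k in partner_weights])
total_complexity = pc_focal + float(np.average(pcs, weights=weights))

print("Proximal complexity of focal firm:", round(pc_focal, 3))
print("Total complexity (one tier):", round(total_complexity, 3))
```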
DISCUSSION AND CONCLUSION Our efforts in this chapter represent an attempt to contribute to the archival tradition of environmental assessment. As part of our work, existing constitutive definitions have been discussed, clarified, and organized. Existing operational definitions also have been discussed, and several new ones proposed. Importantly, our clarifications and extensions have implications for the interpretation of past research and for the design of future research. Implications Related to Variation Our clarifications of variation suggest that researchers should select environmental dimensions contingent on whether they are interested in (1) variation that strongly influences managerial perceptions and reactions or (2) the full texture of variation that no doubt plays a role in firm performance. Unlike the full texture of variation, extreme variation, such as that associated with amplitude, is likely to influence managers’ perceptions and reactions in a strong way. If researchers investigating, for example, the connection between perceived environmental uncertainty and archivally assessed variation (e.g., Tosi et al., 1973) had focused on either amplitude or frequency of fortune changes, then stronger relationships might have been observed. Amplitude and frequency of changes in fortune, and also magnitude of unpredictability, are likely to be the aspects of variation that cause what Milliken (1987, p. 137) refers to as state uncertainty among managers. See Table 3 for a summary of applications. Perhaps our most important contribution for variation is the clarification of operational definitions for average magnitude of instability and two types of unpredictability. As it stands, researchers frequently develop theory related to unpredictable change and then measure average magnitude of instability. The result is invalid tests of theory and erroneous interpretations of empirical findings. Going forward, researchers, reviewers, and editors must be alert to this problem. Our discussion of variation suggests several research questions. First, how do effective executives cope with industries that exhibit many changes in
fortune? Strategic planning and organizational buffering are no doubt important. Less obvious mechanisms such as establishing illicit inter-firm cooperation, hiring executives with lower needs for achievement, and keeping stock closely held may also be critical. Second, do firms react similarly to high average magnitude of instability and many changes in fortune? It may be that very different strategies are used to cope with these different aspects of variation. Third, does magnitude of unpredictability influence perceived environmental uncertainty to a greater extent than amplitude or frequency of fortune changes? Related to this, does magnitude of unpredictability influence such variables as inter-firm coordination to a greater extent than do changes in fortune?

Table 3. Environmental Dimensions Matched to Research Goals.

Focus of research: Managerial perceptions, behavior, and reactions to the environment
Suggested environmental dimension(s) and rationale:
Instability: amplitude and frequency of changes in fortune. Rationale: emphasis is on outliers, which are likely to influence managerial perceptions, behavior, and reactions.
Unpredictability: magnitude of unpredictability. Rationale: emphasis is on large unpredictable changes, which are likely to influence managerial perceptions, behavior, and reactions.
Complexity: proximal complexity. Rationale: emphasis is on direct rather than remote transaction partners.

Focus of research: Firm performance
Suggested environmental dimension(s) and rationale:
Instability: average magnitude of instability and total frequency. Rationale: emphasis is on the full texture of variation in the environment (which presents forces to which firms must adapt for success).
Unpredictability: magnitude of unpredictability. Rationale: emphasis is on large-scale nonsystematic variation that a firm must handle effectively for success.
Complexity: total complexity. Rationale: emphasis is on resource flows that are local and distant, where the two types of flows can interact to exert important forces on the firm, and where distant flows can cause unforeseen complications, which must be handled successfully in order to secure positive outcomes.
Implications Related to Complexity Our clarifications of complexity also have a number of implications. Perhaps the most important one relates to the common use of dispersion to characterize the complexity faced by a firm. Dispersion has a complicated relationship with complexity, one that must be viewed in light of prevailing heterogeneity. Use of dispersion alone is a problematic strategy. To effectively capture complexity, researchers should go beyond common but simple definitions and measures. Our work may help to promote richer measures by highlighting proximal and total complexity, the latter focused on broad networks. By focusing on these networks, researchers can include remote disturbances in their work. The effects of remote elements can be dramatic. First, changes occurring in remote elements may be large in magnitude and may strongly affect the focal firm through multiple indirect channels. Second, as system dynamics research (see Forrester, 1971) has shown, small disturbances occurring in interconnected systems can create deviation amplifying chains of events that strongly influence elements systemically distant from the original disturbance. That is, changes in remote elements may become magnified as they influence successive elements. Managers may not be able to control these remote events, but they may be able to design organizations to accommodate them. Our concept of total complexity, reflecting dispersion, heterogeneity, and interconnectedness, suggests a number of research questions. First, how do firms cope with complexity generated by remote, indirectly relevant environmental elements? Are these coping mechanisms the same as those used to cope with complexity generated by more immediate exchange partners? Second, how do effective firms embedded in complex environments protect themselves from inter-firm conflict? Complex environments are characterized by differentiated, interconnected elements and thus are prone to such conflict (see Alter, 1990). Third, do firms embedded in complex environments require different sets of managerial characteristics or skills than firms that operate in less complex environments? What strategies do firms embedded in interconnected, differentiated environments use to manage the many uncontrollable factors they encounter?
CONCLUSION Effectively conceptualizing and assessing the task environment is critical for organization theorists and strategy researchers. This, however, has proven to be a difficult task. In some instances, constitutive definitions have been vague, which harms theory building. In many instances, constitutive definitions and theory building have not matched empirical measures, which harms the value of empirical research. As we move forward, attention to the pixels of environmental assessment is crucial. Despite past problems, a strong platform of research has been built over the years, and this platform can serve as the basis for future contributions.
ACKNOWLEDGEMENTS

Support for this research was provided by the Babcock Graduate School of Management, Wake Forest University; Rutgers Business School – Newark and New Brunswick, Rutgers University; and the Jones Graduate School of Management, Rice University. Helpful comments on earlier versions of this chapter were provided by Jack Brittain, Margaret Duval, Reuben McDaniel, Tim Ruefli, John Slocum, Kathie Sutcliffe, and Doug Wholey.
REFERENCES

Aldrich, H. E. (1979). Organizations and environments. Englewood Cliffs, NJ: Prentice-Hall.
Alter, C. (1990). An exploratory study of conflict and coordination in interorganizational service delivery systems. Academy of Management Journal, 33, 478–502.
Anderson, T. J. (2001). Information technology, strategic decision making approaches and organizational performance in different industrial settings. Journal of Strategic Information Systems, 10, 101–119.
Baker, W. E. (1990). Market networks and corporate behavior. American Journal of Sociology, 96, 589–625.
Barnes, J. H. (1984). Cognitive biases and their impact on strategic planning. Strategic Management Journal, 5, 129–137.
Baum, J. R., & Wally, S. (2003). Strategic decision speed and firm performance. Strategic Management Journal, 24, 1107–1129.
Bazerman, M. H. (2006). Judgment in managerial decision making. Hoboken, NJ: Wiley.
Beckman, C. M., Haunschild, P. R., & Phillips, D. J. (2004). Friends or strangers? Firm specific uncertainty, market uncertainty, and network partner selection. Organization Science, 15, 259–275.
Bourgeois, L. J. (1980). Strategy and environment: A conceptual integration. Academy of Management Review, 1, 25–39.
Bourgeois, L. J. (1985). Strategic goals, perceived uncertainty, and economic performance in volatile environments. Academy of Management Journal, 28, 548–573.
Bourgeois, L. J., & Eisenhardt, K. (1988). Strategic decision processes in high velocity environments: Four cases in the minicomputer industry. Management Science, 34, 816–835.
Boyd, B. K., Dess, G. G., & Rasheed, A. M. A. (1993). Divergence between archival and perceptual measures of the environment: Causes and consequences. Academy of Management Review, 18, 204–226.
Brass, D. J., Galaskiewicz, J., Greve, H. R., & Tsai, W. (2004). Taking stock of networks and organizations: A multilevel perspective. Academy of Management Journal, 47, 795–819.
Cameron, K. S., Kim, M. U., & Whetten, D. A. (1987). Organizational effects of decline and turbulence. Administrative Science Quarterly, 32, 222–240.
Castrogiovanni, G. J. (1991). Environmental munificence: A theoretical assessment. Academy of Management Review, 16, 542–563.
Castrogiovanni, G. J. (2002). Organizational task environments: Have they changed fundamentally over time? Journal of Management, 28, 129–150.
Child, J. (1972). Organizational structure, environment and performance: The role of strategic choice. Sociology, 6, 2–21.
Deepak, K. D., Guthrie, J. P., & Wright, P. M. (2005). Human resource management and labor productivity: Does industry matter? Academy of Management Journal, 48, 135–145.
Dess, G. G., & Beard, D. W. (1984). Dimensions of organizational task environments. Administrative Science Quarterly, 29, 52–73.
Dill, W. R. (1958). Environment as an influence on managerial autonomy. Administrative Science Quarterly, 2, 409–443.
Duncan, R. B. (1972). Characteristics of organizational environments and perceived environmental uncertainty. Administrative Science Quarterly, 17, 313–327.
Emery, F. E., & Trist, E. L. (1965). The causal texture of organizational environments. Human Relations, 18, 21–32.
Forrester, J. W. (1971). Principles of systems. Cambridge, MA: Wright-Allen.
Fredrickson, J. W., & Mitchell, T. R. (1984). Strategic decision processes: Comprehensiveness and performance in an industry with an unstable environment. Academy of Management Journal, 27, 399–423.
Galbraith, J. R. (1972). Organization design: An information processing view. In: J. Lorsch & P. Lawrence (Eds), Organization planning: Cases and concepts (pp. 49–74). Homewood, IL: Irwin-Dorsey.
Galbraith, J. R. (1977). Organization design. Reading, MA: Addison-Wesley.
George, G. (2005). Slack resources and the performance of privately held firms. Academy of Management Journal, 48, 661–676.
Gibbs, J., & Martin, N. (1962). Urbanization, technology, and the division of labor: International patterns. American Sociological Review, 27, 667–677.
Gifford, W. E., Bobbitt, H. R., & Slocum, J. W. (1979). Message characteristics and perceptions of uncertainty by organizational decision makers. Academy of Management Journal, 22, 458–481.
Glick, W. H., Miller, C. C., & Huber, G. P. (1993). The impact of upper-echelon diversity on organizational performance. In: G. Huber & W. Glick (Eds), Organizational change and redesign: Ideas and insights for improving performance (pp. 176–214). New York: Oxford University Press.
Glick, W. H., ogilvie, d., & Miller, C. C. (1990). Assessing dimensions of task environments: Intra-industry and aggregate industry measures. Paper presented at the annual meeting of the Academy of Management, San Francisco, CA.
Goll, I., & Rasheed, A. A. (2004). The moderating effect of environmental munificence and dynamism on the relationship between discretionary social responsibility and firm performance. Journal of Business Ethics, 49, 41–54.
Goll, I., & Rasheed, A. M. A. (1997). Rational decision making and firm performance: The moderating role of environment. Strategic Management Journal, 18, 583–591.
Hannan, M. T., & Freeman, J. (1977). The population ecology of organizations. American Journal of Sociology, 82, 929–964.
Harris, R. D. (2004). Organizational task environments: An evaluation of convergent and discriminant validity. Journal of Management Studies, 41, 857–882.
Hough, J. R., & White, M. A. (2004). Scanning actions and environmental dynamism: Gathering information for strategic decision making. Management Decision, 42, 781–793.
Huber, G. P. (1984). The nature and design of post-industrial organizations. Management Science, 30, 928–951.
Huber, G. P., Miller, C. C., & Glick, W. H. (1990). Developing more encompassing theories about organizations: The centralization-effectiveness relationship as an example. Organization Science, 1, 11–40.
Jacquemin, A. P., & Berry, C. H. (1979). Entropy measure of diversification and corporate growth. The Journal of Industrial Economics, 27, 359–369.
Keats, B. W., & Hitt, M. A. (1988). A causal model of linkages among environmental dimensions, macro organizational characteristics, and performance. Academy of Management Journal, 31, 570–598.
Lawless, M. W., & Finch, L. K. (1989). Choice and determinism: A test of Hrebiniak and Joyce's framework on strategy–environment fit. Strategic Management Journal, 10, 351–365.
Leão, P. (2005). Why does the velocity of money move pro-cyclically? International Review of Applied Economics, 19, 119–135.
Li, H., & Atuahene-Gima, K. (2001). Product innovation strategy and the performance of new technology ventures in China. Academy of Management Journal, 44, 1123–1134.
Li, M., & Simerly, R. L. (1998). The moderating effect of environmental dynamism on the ownership and performance relationship. Strategic Management Journal, 19, 169–179.
Luo, Y., & Peng, M. W. (1999). Learning to compete in a transition economy: Experience, environment, and performance. Journal of International Business Studies, 30, 269–295.
March, J. G. (1991). Exploration and exploitation in organizational learning. Organization Science, 2, 78–87.
McCabe, D. L., & Dutton, J. E. (1993). Making sense of the environment: The role of perceived effectiveness. Human Relations, 46, 623–643.
Miller, C. C., Burke, L. M., & Glick, W. H. (1998). Cognitive diversity among upper-echelon executives: Implications for strategic decision processes. Strategic Management Journal, 19, 39–58.
Miller, D. (1988). Relating Porter's business strategies to environment and structure: Analysis and performance implications. Academy of Management Journal, 31, 280–308.
Milliken, F. J. (1987). Three types of perceived uncertainty about the environment: State, effect, and response uncertainty. Academy of Management Review, 12, 133–143.
Monge, P. R. (1990). Theoretical and analytical issues in studying organizational processes. Organization Science, 1, 406–430.
Park, N. K., & Mezias, J. M. (2005). Before and after the technology sector crash: The effect of environmental munificence on stock market response to alliances of e-commerce firms. Strategic Management Journal, 26, 987–1007.
Pelham, A. M. (1999). Influence of environment, strategy, and market orientation on performance in small manufacturing firms. Journal of Business Research, 45, 33–46.
Pfeffer, J. (1978). Organizational design. Arlington Heights, IL: AHM Publishing.
Pfeffer, J., & Salancik, G. (1978). The external control of organizations: A resource dependence perspective. New York: Harper & Row.
Podolny, J. (1994). Market uncertainty and the social character of economic exchange. Administrative Science Quarterly, 39, 458–483.
Priem, R. L., Rasheed, A. M. A., & Kotulic, A. G. (1995). Rationality in strategic decision processes, environmental dynamism and firm performance. Journal of Management, 21, 913–929.
Rasheed, A. M. A., & Prescott, J. E. (1992). Towards an objective classification scheme for organizational task environments. British Journal of Management, 3, 197–206.
Ruefli, T. W., Adaniya, A. R., Gallegos, J. A., & Limb, S. J. (1993). Longitudinal analysis of industries. In: Y. Ijiri (Ed.), Creative and innovative approaches to management science (pp. 269–298). Westport, CT: Greenwood-Praeger Press.
Schiller, B. R. (1989). The micro economy today (4th ed.). New York: Random House.
Sharfman, M. P., & Dean, J. W., Jr. (1991). Conceptualizing and measuring the organizational environment: A multidimensional approach. Journal of Management, 17, 681–700.
Shortell, S. M. (1977). The role of environment in a configurational theory of organizations. Human Relations, 30, 275–302.
Simerly, R. L., & Li, M. (2000). Environmental dynamism, capital structure and performance: A theoretical integration and an empirical test. Strategic Management Journal, 21, 31–49.
Singh, J. V. (1986). Performance, slack, and risk taking in organizational decision making. Academy of Management Journal, 29, 562–585.
Snyder, N. H., & Glueck, W. F. (1982). Can environmental volatility be measured objectively? Academy of Management Journal, 25, 185–192.
Stearns, T. M., Hoffman, A. N., & Heide, J. B. (1987). Performance of commercial television stations as an outcome of interorganizational linkages and environmental conditions. Academy of Management Journal, 30, 71–90.
Sutcliffe, K. M. (1994). What executives notice: Accurate perceptions in top management teams. Academy of Management Journal, 37, 1360–1378.
Thompson, J. D. (1967). Organizations in action. New York: McGraw-Hill.
Tosi, H., Aldag, R., & Storey, R. (1973). On the measurement of the environment: An assessment of the Lawrence and Lorsch environmental uncertainty subscale. Administrative Science Quarterly, 18, 27–36.
Tung, R. L. (1979). Dimensions of organizational environments: An exploratory study of their impact on organization structure. Academy of Management Journal, 22, 672–693.
Tushman, M. L., & Anderson, P. (1986). Technological discontinuities and organizational environments. Administrative Science Quarterly, 31, 439–465.
Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 4, 207–232.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124–1131.
Wholey, D. R., & Brittain, J. (1989). Characterizing environmental variation. Academy of Management Journal, 32, 867–882.
Williamson, O. E. (1975). Markets and hierarchies: Analysis and antitrust implications. New York: The Free Press.
Yuchtman, E., & Seashore, S. E. (1967). A system resource approach to organizational effectiveness. American Sociological Review, 32, 891–903.
Zahra, S. A., Neubaum, D. O., & El-Hagrassey, G. M. (2002). Competitive analysis and new venture performance: Understanding the impact of strategic uncertainty and venture origin. Entrepreneurship Theory & Practice, 26, 1–28.
ANALYSIS OF EXTREMES IN MANAGEMENT STUDIES

Joel A. C. Baum and Bill McKelvey

ABSTRACT

The potential advantage of extreme value theory in modeling management phenomena is the central theme of this paper. The statistics of extremes have played only a very limited role in management studies despite the disproportionate emphasis on unusual events in the world of managers. An overview of this theory and related statistical models is presented, and illustrative empirical examples provided.
As I am sure almost every geophysicist knows, distributions of actual errors and fluctuations have much more straggling extreme values than would correspond to the magic bell-shaped distribution of Gauss and Laplace.
– John Tukey
Consider the coast of Norway. It appears jagged whether measured in kilometers, meters, centimeters, or millimeters. This is called scalability – no matter what the scale of measurement, the phenomena appear the same. Scalability equates to what Benoit Mandelbrot (1982) calls fractal geometry. A cauliflower is an obvious example. Cut off a floret, cut a smaller floret from the first floret, then an even smaller one, and another one still… Now set them all on a table, in line. Each fractal subcomponent is smaller than the former, but each has the same shape and structure.
Why should we care about the coast of Norway and cauliflower? Because they are exemplars of a broader set of phenomena that we believe are highly relevant to the field of management studies. Many complex systems tend to be self-similar across levels – the same process drives order-creation behaviors across multiple levels of an emergent system (Kaye, 1993; Casti, 1994; West, Brown, & Enquist, 1997). These processes are called scaling laws because they represent empirically discovered system attributes applying similarly across many orders of magnitude despite the scale of measuring or viewing (Zipf, 1949) – from atoms to galaxies and from base-pairs to species in nature. Brock (2000, p. 30) observes that the study of complexity "… tries to understand the forces that underlie the patterns or scaling laws that develop" as newly ordered systems emerge.

Fractal structures are signified by power laws. They exhibit a power-law effect because they shrink by a fixed ratio. Power laws are "fat-tailed," Paretian probability distributions that have been detected in a variety of seemingly unrelated processes in nature and society such as earthquakes, hurricanes, asteroid hits, DNA structures, protein interactions, heartbeats, species abundance, extinction events in the fossil record, population size fluctuations, price fluctuations in stock markets, firm growth, interfirm networks, and consumer product sales. Their fat tails reflect infrequent, large-magnitude events (e.g., stock market crashes) or objects (e.g., highly connected firms in a network). Many such extremes are due to interdependency and positive feedback. Power laws call for scale-free theories because the same theory applies to each of the different levels – i.e., the explanation of the generative process is the same across all levels of analysis. Many scholars now believe that power laws are the best analytical framework to describe the origin and shape of most natural objects (Preston, 1950; MacArthur, 1960; Bak, 1996; Halloy, 1998). We follow McKelvey and Andriani (2005) in suggesting that they apply even more to organizations and other social phenomena.

Much of conventional statistics is concerned not with extremes, however, but rather with problems of the following types:

- Finding the probability distribution most appropriate to describe a set of data
- Estimating or testing hypotheses about key parameters
- Studying relationships among two or more variables
- Estimating change over time in some variable, taking into account correlations among successive time points

Statistical methods employed to address such problems are concerned primarily with what goes on at the center of a statistical distribution and do
not pay particular attention to its tails, or in other words, the most extreme values at either the high or the low end. Indeed, a core belief in the highly studied area of statistics concerned with robust methods is that it is a bad thing for statistical methods to be influenced too greatly by extreme values (Greene, 2002). The field of management relies heavily on such statistical methods. For example, Hamilton and Nickerson's (2003) analysis of empirical methods employed in Strategic Management Journal articles from 1990–2001 revealed that regression analysis of statistical averages is increasingly the dominant empirical method employed.

Management research began with the study of the traits, productivity, and personalities of people. In studying human attributes such as size and productivity, there are natural limits and strong tendencies toward the mean. People do not range in size from mosquitoes to elephants – they cluster around a mean; unlike firms, they cannot grow almost indefinitely in size to produce more. And, yet, in management much of what is important bears on the management of giant plants and firms. There are many situations in which extreme values are the most important part of the problem: earthquakes, hurricanes, brush fires, pandemics, terrorist events, sexual disease transmission, airport congestion, and winning (Washington, Lincoln, Wilson, Roosevelt, and Truman) and losing wars (Presidents from Texas). Management studies is no different: Bill Gates and Michael Dell in entrepreneurship; Xerox and IBM in strategy; the Sony Walkman and Apple iPod in marketing; October 28, 1929 and October 19, 1987 in finance; Enron and Parmalat in accounting; operational, credit, and insurance losses in risk management; blockbuster drugs in R&D and technology management; Silicon Valley and Route 128 in economic geography; and Jack Welch and (oppositely) Ken Lay in building wealth for shareholders. Though California has 10 "average" earthquakes per day, everyone worries about the next "big one." Managers and management researchers also need to worry about extremes, not averages. Indeed, managers seem immersed in a world of power laws and extremes.

Many management scholars have commented on the apparent incongruity between research appearing in academic journals and practitioner-oriented writing in management (e.g., Beyer & Trice, 1982; Lawler, Monty Mohrman, Mohrman, Ledford, & Cummings, 1985; Brief & Dukerich, 1991; Pfeffer, 1993; Anderson, Herriot, & Hodgkinson, 2001; Beer, 2001; Rynes, Bartunek, & Daft, 2001; Weick, 2001; McKelvey, 2003; Bennis & O'Toole, 2005; Ghoshal, 2005; Van de Ven & Johnson, forthcoming). We think an important part of this problem stems from researchers favoring the Gaussian over Paretian distributions (Andriani & McKelvey, 2005).
Virtually all statistics-based research in management journals rests on assumptions of independent events and Gaussian distributions. In contrast, if one scans practitioner-oriented journals and books, one quickly enters a Paretian world of interdependence, interaction, and coevolution – a world In Search of Excellence (Peters & Waterman, 1982), of firms Built to Last (Collins & Porras, 1994), and of managers seeking to take their firms from Good to Great (Collins, 2001). Most of the stories we read in Business Week, Fortune, and The Economist, as well as the case studies we teach our students, are also about extremes – success or failure – but seldom about averages. No wonder there seems to be a disjunction – managers and our students live in the world of extremes; we live in a world of statistics and findings about averages.

If Pareto distributions and power laws – and accompanying downside risks – are more prevalent in organizations and management than we typically acknowledge, it follows that extreme value theory and statistics are also far more relevant than their current usage suggests. The purpose of this chapter is to introduce extreme value theory and a related set of statistical methods specifically designed to quantify the stochastic behavior of a process at unusually large (or small) values. By definition, extreme values are rare, and as a result, estimates are often required for events that are much greater in magnitude than have previously occurred. This requires extrapolation from observed to unobserved levels. Extreme value theory provides a class of models that enable such extrapolation, based on asymptotic argument, to estimate the probability of events more extreme than any previously observed. Many fields have begun to use extreme value theory, and some have been using it for a very long time, including engineering, meteorology, hydrology, oceanography, finance, and insurance (for a review see Castillo, Hadi, Balakrishnan, & Sarabia, 2005).

Extreme events often come with incredible costs. One obvious use of extreme statistics is to get some idea of these costs and develop ways of either avoiding or coping with them before the fact. A second usage is to build from our empirical examples. What we learn about doping in athletics and Intifada fatalities creates interesting extreme event analytical possibilities. A third use is to build from the idea that power laws are mostly based on interactive events. This is particularly relevant to organizations and markets because so much of these phenomena are comprised of people interacting. For example, events such as insider trading before a stock tumble are clearly interactive; this sets up the possibility of using extreme value theory to uncover fraud. We begin with a brief description and empirical examples of the many kinds of power law discoveries, followed by a review of the scale-free
theories developed to explain these phenomena. Next, we set up our transition from Gaussian to lognormal to Paretian distributions and implications for underlying statistical assumptions. The heart of the chapter reviews the three basic models of extreme statistics: (1) generalized extreme value distribution; (2) r-largest order statistics; and (3) threshold exceedence statistics. For illustration, we draw examples from sports, international conflict, stock markets, and insider trading.
THE PREVALENCE OF POWER LAWS

Table 1, adapted from Andriani and McKelvey (2005), identifies examples of power law-distributed phenomena in four realms – physical, biological, social, and managerial/organizational/economic.1 Power laws often take the form of rank/size expressions such as P(k) ∝ k^(–g), where P(k) is the probability of a given rank, k (the variable), and g (the exponent) is a constant. In most exponential equations the exponent is a variable. Power laws call for scale-free theories because the same theory applies to each of the different levels – i.e., the explanation of the generative process is the same across all levels of analysis.
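To make the rank/size form concrete, the following is a minimal sketch in base R using simulated, hypothetical data (not any dataset analyzed in this chapter): a Pareto-type sample drawn by inverse-transform sampling produces an approximately straight line when its empirical survival function is plotted on log–log axes, with slope roughly equal to the negative of the assumed exponent.

  # Illustrative sketch, base R, hypothetical data: draw a Pareto-type sample and
  # check the straight-line power-law signature on log-log axes.
  set.seed(1)
  g    <- 1.5                                # assumed tail exponent
  xmin <- 1                                  # assumed lower bound of the tail
  x <- xmin * runif(10000)^(-1 / g)          # inverse-transform sample from a Pareto tail

  x_sorted <- sort(x, decreasing = TRUE)
  surv <- seq_along(x_sorted) / length(x_sorted)   # empirical P(X > x)

  plot(log10(x_sorted), log10(surv),
       xlab = "log10(x)", ylab = "log10 P(X > x)")
  coef(lm(log10(surv) ~ log10(x_sorted)))    # slope should be close to -g

This regression-on-a-log-log-plot check is only a quick diagnostic; the extreme value models introduced later in the chapter are estimated by maximum likelihood.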
Examples of Power Law Distributions in Management Studies

Firm Growth

Stanley et al. (1996) examine the statistical properties of all publicly traded manufacturing firms listed in Compustat (U.S.) for the period 1975–1991. They start with Gibrat's model of firm growth, which assumes that growth in sales is independent of firm size and uncorrelated in time (i.e., Gaussian). They find that, in reality, variance in growth rate is Paretian not Gaussian, and follows a power law with exponent b: σ(s0) = aS0^(–b), where σ(s0) is the standard deviation of growth per year based on initial sales value S0; the growth rate, r = S1/S0, is measured as changes in yearly sales; s0 = ln S0; a is a constant (6.66); and b is the slope of factors affecting growth, ranging from 0 to 0.5. The equation holds over seven orders of magnitude of firm size. The power law holds when growth is measured as the cost of goods sold (b ≈ 0.16), assets (b ≈ 0.17), property, plant and equipment (b ≈ 0.18), and the number of employees (b ≈ 0.16). Given their findings, Stanley et al. conclude that processes governing growth rates are scale-free. They give an example of a hierarchical
Table 1. Power Laws.

(a) In the physical world: Cities; traffic jams; coast lines; brush-fire damage; water levels in the Nile; hurricanes and floods; earthquakes; magma rising to surface; asteroid hits; sunspots; galactic structure; sand pile avalanches; Brownian motion; music; laser technology evolution; Moore's Law; fractures of materials; structure of airplane routes.

(b) In the biological world: Epidemics; species abundance; protein–protein interaction networks; genomic properties (DNA words); metabolism of cells; sizes of ecosystems; networks in brains; punctuated equilibria; tumor growth; mass extinctions; biodiversity; brain functioning; frequency of DNA base chemicals; predicting premature births; magnitude estimation of sensorial stimuli; fluid circulation; fetal lamb breathing; bronchial structure; cellular substructures; death from heart attack; Willis' Law: number vs. size of plant genera; genetic circuitry; heartbeat rates; phytoplankton.

(c) In the social world: Language word usage; social networks; casualties in war; deaths of languages; distribution of wealth; publications and citations; structure of the WWW; co-authorships; size of villages; actor networks; news website visitation decay patterns; structure of Internet hardware; global terrorism events; delinquency rates; sexual networks; aggressive behavior among boys during recess; macroeconomics effects of zero rational agents; number of hits received from website per day.

(d) In the organizational world: Firm sizes; director interlock structure; job vacancies; supply chains; sales of consumer products; cotton prices; salaries; blockbuster drugs; intra-firm decision events; growth rates of firms; Italian industrial clusters; price movements on exchanges; growth rate of country GDPs; internal structure of firms; Fordist hierarchical power; economic fluctuations; movie profits; world trade relationships among countries.

Source: Drawn from Andriani and McKelvey (2005).
"Fordist"-type organization where the CEO can order an increase in production, causing a Markov chain along the hierarchical levels – each subsequent action step at time t is a replica of action at step t–1. If it is carried out exactly from the top to the bottom of the firm, then the organization is strongly interdependent (b = 0 for total top-down control). But lower level
managers and employees rarely follow orders exactly. If they all ignore the CEO's order, i.e., all parts of the firm operate independently, then b = 0.5. Usually the employees follow orders with some probability. Thus, for a b = 0.15 or so (given the findings by Stanley et al.), we expect a power law effect to be obtained.

Interfirm Networks

Two important families of network structures have emerged in recent studies of interfirm networks (Baum, Rowley, & Shipilov, 2004). The first is small-world network structures characterized by the combination of a high degree of clustering, meaning that there is a heightened probability of two actors being acquainted if they have one or more other acquaintances in common, and short characteristic path length, meaning that there exist short paths through a network between most pairs of actors (Watts & Strogatz, 1999). The second is scale-free network structures in which the degree distribution of the network – the distribution of ties among actors – is highly skewed, with a small number of actors having a disproportionately large number of ties (Barabási, 2002; Barabási & Albert, 1999).

Research suggests the widespread presence of the small world pattern in interfirm networks (e.g., Baum, Shipilov, & Rowley, 2003; Davis, Yoo, & Baker, 2003; Kogut & Walker, 2001; Uzzi, Spiros, & Delis, 2002), but there is limited evidence on whether degree distributions are scale-free in interfirm networks (Uzzi et al., 2002). Baum et al. (2004) examine the structure of investment bank syndicate networks in Canada between 1952 and 1990, and found the connectivity of the network to be highly skewed, with most banks tied to a small set of prominent banks. They fit the degree distribution of syndicate ties among banks to a power-law distribution where the probability P(k) of finding a bank with degree (number of ties) k is P(k) ∝ k^(–g), where the exponent g is typically between one and three (Albert & Barabási, 2002), and the larger the exponent, the more skewed the distribution. Empirical estimation confirms that the degree distributions were very well fit by a power-law distribution and that the estimated exponents are consistent with prior studies of power law-distributed phenomena. Their estimates for –g are 2.36 (S.E. = 0.04) for indegree (i.e., received ties) and 1.79 (S.E. = 0.01) for outdegree (i.e., sent ties).

Powell, White, Koput, and Owen-Smith (2005) conduct a similar analysis of degree distributions of U.S. biotechnology firms' ties with six different types of partners (e.g., universities, pharmaceutical firms, and hospitals). Their estimated exponents ranged from 1.1 to 2.7 – within the range expected for power laws. Uzzi et al. (2002) reported power law exponents for
the degree distribution of ties among participants (directors, producers, and lead actors) in Broadway musical productions. They give exponents for three years: 1900 (g = 1.25), 1945 (g = 1.43), and 1990 (g = 1.60). Again, the values are consistent with a power-law distribution.

Large Firm Profits, Biotech Clusters, and Train Wrecks

To illustrate the range of phenomena that follow a power-law distribution, we give three additional and diverse empirical examples of interest to researchers in the field of management who study firm strategy, innovation, and learning: large firm profitability, biotechnology cluster size, and train accident fatalities.2 These examples also serve to graphically illustrate power-law distributions. The top panels in Figs. 1–3 show the "fat-tailed" probability distributions that characterize each of these phenomena. The lower panels of these figures represent these distributions as log–log histograms of k (the profit of a firm, the number of firms in a cluster, and the number of accident fatalities) vs. the probabilities, P(k), of these values occurring. Note that the axes of these plots – not the values of k and P(k) – are logged. Since power laws are distributed P(k) ∝ k^(–g), if the degree distribution is scale-free, then the slope of the curve on any section of the log–log plot should be the same, and a power law with exponent g will appear as a straight line with slope –g on the log–log plot. The linear trends apparent in these figures suggest that these three distributions are consistent with a power law. Empirical estimation confirms that the observed distributions are well fit by a power-law distribution. The estimated exponents are 1.280 (S.E. = 0.03) for profitability, 1.863 (S.E. = 0.04) for cluster size, and 0.938 (S.E. = 0.04) for accident fatalities, with adjusted R2 statistics of 0.98, 0.99, and 0.97, respectively. Fitted lines with slope –g for these equations are given in Figs. 1–3.
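The exponents and standard errors just quoted come from fitting straight lines to log–log histograms; the sketch below shows the general shape of that procedure in base R with hypothetical count data (it is not the authors' estimation code), including a slope estimate, its standard error, and an adjusted R² of the kind reported in the text.

  # Rough sketch of fitting a power-law exponent to a log-log histogram
  # (hypothetical data; not the authors' code).
  set.seed(2)
  k <- round(runif(5000)^(-1 / 1.3))          # hypothetical skewed counts, e.g., ties per firm

  tab  <- table(k)
  kval <- as.numeric(names(tab))
  pk   <- as.numeric(tab) / sum(tab)          # empirical P(k)

  fit <- lm(log10(pk) ~ log10(kval))
  summary(fit)$coefficients                   # slope estimate (~ -g) and its standard error
  summary(fit)$adj.r.squared                  # adjusted R^2, as reported above

Regression on binned log–log frequencies is the simplest approach and the one sketched here; maximum likelihood estimators of the exponent are often preferred in practice.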
How Data Get from Gaussian to Lognormal to Pareto Distributions

As we have talked about power laws with colleagues, their reactions range from having no idea of what we are talking about, to thinking power laws are truncated lognormal distributions, to suggesting that power laws always indicate underlying interactive phenomena. One discipline that ran into the problem of distinguishing between exponentials, lognormals, and power laws is ecology, in its studies of species abundance – how large are the various species in a particular ecology? Preston (1981) points out that they often mistake the interaction of two
[Fig. 1. Canadian Top 500 Profitability, 2003. Upper panel: frequency distribution of profits, f(x); lower panel: log–log plot of p(Profit) against profit, with fitted power law 0.338*k^(–1.280).]
exponentials for a lognormal. Going from left to right in a plot, for example, an exponential, e^(–0.1x), shows a slowly declining curve with a long tail out to the right, whereas e^(–0.4x) shows a more rapidly declining curve with a much shorter tail out to the right. In population ecology, a typical survival function shows an exponential distribution. Preston shows that the multiplication of two exponentials produces a distribution looking very much like a lognormal. The latter results from the multiplication of a number of independent Gaussian-distributed variables. For example, multiplying some of Kolmogorov's (1941) eight attributes
[Fig. 2. Canadian Biotechnology Firm Cluster Sizes, 2001. Upper panel: frequency distribution of cluster sizes, f(x); lower panel: log–log plot of p(N Firms) against number of firms, with fitted power law 0.619*k^(–1.863).]
increasing the probability of achieving wealth (social background, educational level, type of personality, technical ability, communication skills, motivation, being in the right place at the right time, and willingness to take risks) produces a lognormal distribution.

Halloy (1998) combines lognormals and power laws indiscriminately into what he calls a "POLO" distribution. This is somewhat in error – the two are not really the same. A lognormal is an exponential distribution taking on the appearance of a normal distribution if the x-axis of the plot is a log scale – hence the name "lognormal." As illustrated in Figs. 1–3, the power-law signature requires that a distribution plotted with x- and y-axis as log scales appear as a negatively sloped straight line.
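A small sketch (base R, hypothetical factors) of the multiplicative mechanism just described: the product of several independent positive variables is approximately lognormal, which can be seen by checking that its logarithm is approximately normal.

  # Sketch of the multiplicative route to a lognormal (hypothetical factors).
  set.seed(3)
  n_people  <- 10000
  n_factors <- 8                              # e.g., Kolmogorov-style independent attributes
  factors <- matrix(exp(rnorm(n_people * n_factors, mean = 0, sd = 0.5)),
                    nrow = n_people)
  wealth <- apply(factors, 1, prod)           # multiplicative combination

  hist(log(wealth), breaks = 50)              # roughly bell-shaped: the product is ~ lognormal
  qqnorm(log(wealth)); qqline(log(wealth))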
[Fig. 3. U.S. Railroad Accident Fatalities, 1975–2001. Upper panel: frequency distribution of fatalities, f(x); lower panel: log–log plot of p(Fatalities) against fatalities, with fitted power law 0.243*k^(–0.938).]
Pareto's Law (Pareto, 1897) shows a distribution of wealth in which the vast majority of citizens forming the upper left "fat" tail have little wealth as opposed to a very few very wealthy individuals comprising the fat tail to the lower right. In this distribution the median puts 80% of the wealth to the right with 20% to the left; the mean of wealth is, thus, quite far to the right of the median whereas the mode is quite far to the left.
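A quick numerical illustration of that last point (assumed parameters, hypothetical sample): in a heavy-tailed Pareto sample the mean sits far above the median, whereas in a Gaussian sample the two essentially coincide.

  # Mean vs. median under a Pareto tail (assumed parameters, hypothetical data).
  set.seed(4)
  a <- 1.2                                    # heavy tail: mean exists, variance does not
  pareto <- runif(1e6)^(-1 / a)
  c(median = median(pareto), mean = mean(pareto))

  gauss <- rnorm(1e6, mean = 10, sd = 2)      # Gaussian contrast: mean and median agree
  c(median = median(gauss), mean = mean(gauss))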
Here is where it gets interesting. Generally, the assumptions of neoclassical economists and contemporary econometrics (e.g., Greene, 2002) fit the upper left tail fairly well – here we find poor people who mostly shop as independent individuals (though their purchasing may be subject to exogenous shocks in the form of promotions (Moss, 2002)). At the right we always find the purest form of the power law; and we also find the highest level of interaction among the rich – they are executives, sit on interlocking boards, have common investment advisors, meet at private clubs, have connections to Wall Street investment banks, and may even have collusive relations with executives at other firms. In the middle we may find the "clean" lognormal consisting of independent multiplicative components or we may find a truncated lognormal in which the middle portion appears as a straight line and may be construed to be a power law. For example, it is possible that some of Kolmogorov's (1941) components of wealth are not independent of each other. Given this, the power-law signature seeps back toward the upper left, creating the straight-line truncated lognormal.

In short, we have independence-based exponential distributions at the upper left tail; in the middle we find independent multiplicative lognormals morphing into truncated lognormals based on some interaction; and interaction-based power-law signatures at the lower right tail. The lesson from this is that the more there is a shift from independent data points to interactive ones, the more predominant power laws become.
WHERE DO POWER LAWS COME FROM?

There is much debate as to what behavior power laws signify and what causes them. While fractal structures and interactive data points do give rise to power laws, it is not unheard of for independent data points to do likewise. And it is no longer correct to conclude that power laws indicate only interactive causes (Andriani & McKelvey, 2006) – mostly they do, but not always. Below, we briefly review some of the main accounts for power-law distributions. Table 2 summarizes nine basic rules that underlie power law-distributed phenomena like those listed in Table 1, and the illustrative examples given earlier. Newman (2005) concludes that "preferential attachment" and "self-organized criticality" are the most important among these, but given the occurrence of power laws across all the sciences, we are not convinced of this. We believe that each of these applies to organizations and their management.
Table 2. Some Causes of Power Laws (a).

Square/cube – Cauliflower; villages: In organisms, surfaces absorbing energy grow by the square, but the organism grows by the cube, resulting in an imbalance; fractals emerge to bring surface/volume back into balance (Carneiro, 1987).

Connection costs – Growth unit connectivity; modularity: As cell fission occurs by the square, connectivity increases by n(n–1)/2, producing an imbalance between the gains from fission vs. the cost of maintaining connectivity; consequently, organisms form modules of cells so as to reduce the cost of connectivity (Bykoski, 2003).

Diversity – Heterogeneous agents: Agents seeking out other agents to copy/learn from so as to improve fitness generate networks; there is some probability of positive feedback such that some networks become groups, some groups form larger groups, etc. (Kauffman, 1993).

Preferential attachment – Nodes; networks: Given newly arriving agents into a system, larger nodes with an enhanced propensity to attract agents will become disproportionately even larger, giving rise to the commonly seen rank/frequency Pareto distribution; applies to both nodes and networks (Yule, 1924; Barabási, 2002).

Multiple traits – Wealth; mass extinctions: p(y) ∝ e^(–ay), where y = ln m and m = no. of multiplied elements, transforms exponential and lognormal distributions into a power-law distribution (Pareto, 1897; Raup, 1999; West & Deering, 1995; Newman, 2005).

Phase transitions – Turbulent flows: Exogenous energy impositions cause interaction effects and percolation transitions at a specific energy level – the 1st critical value – such that new interaction groupings form (Stauffer, 1985; Newman, 2005).

Interaction decays – Sales declines: Exogenous shocks cause interaction networks that then decay by a power law (Sornette, 2004).

Self-organized criticality – Sandpiles; forests; heartbeats: Some systems reach a critical state where they maintain stasis by preservative behaviors – such as sand avalanches or forest fires – which vary in size of effect according to a power law (Bak, Tang, & Wiesenfeld, 1987; Drossel & Schwabl, 1992).

Least effort – Language: Word frequency in language is a function of ease of usage by both speaker/writer and listener/reader (Cancho & Solé, 2003).

(a) Additional power law causes are mentioned in West and Deering (1995), Sornette (2004), and Newman (2005).
First, consider McKelvey's seven "1st Principles" (Benbya and McKelvey, forthcoming).3 Collectively, these principles act as deterministic causes under the "multiple traits" heading in Table 2:

1. Prigogine's (1955) Adaptive tension: Environmentally imposed tensions (energy differentials) stimulate adaptive order creation.
2. Ashby's (1956) Requisite variety (complexity): Adaptive order creation occurs only if internal complexity (degrees of freedom) exceeds external complexity.
3. Fisher's (1930) Change rate: Higher internal change rate offers adaptive advantage in changing environments.
4. Simon's (1962) Modular design: Nearly autonomous subunits increase the rate of adaptive response.
5. Maruyama's (1963) Positive feedback: Insignificant instigating events may result in significant order creation.
6. Lindblom's (1959) Causal intricacy: Complexity requires advantageously coping with multiple causes: bottom-up, top-down, horizontal, diagonal, intermittent, and Aristotelian.
7. Dumont's (1966) Coordination rhythms: Rhythmic alternation of causal dominance offers more functional adaptive response than balance.

Like Kolmogorov's (1941) theory of wealth creation, which is based on the multiplicative joint probability of eight causes, the presence of one or more of these principles vastly improves the probability of an organization's adaptive success. They fit the multiple-traits category because, whereas only one or two traits would produce a lognormal distribution, as traits accumulate a Pareto distribution results. Several of these principles may also generate power-law outcomes individually.

First, whereas cauliflower florets are obvious in their fractal structure, as Table 1b shows, there are many fractal phenomena within mammalian bodies that are not so obvious, but nevertheless, critically important to adaptive success. Carneiro's (1987) key point is that absent a variety of what he calls "complexity traits," villages remain small because, if larger, they become unmanageable because of the social equivalent of the square/cube rule (see Table 2). In his analysis, the complexity traits allow small social units such as villages to overcome the limits of surface-like limitations of simple face-to-face communication and town-meeting-like organization. The complexity traits compensate for the limitations of surface communication as the social volume increases. The increase in complexity traits adds degrees of freedom, which meet the demands of Ashby's "law of requisite
variety," which McKelvey and Boisot (2006) update to a "law of requisite complexity." This also means that response to Ashby's law is power law driven. The substitution of organizing methods as increased size makes face-to-face communication ineffective is well known in the management literature (Jones, 2000).

Second, Simon's (1962) principle of "near decomposability" and consequent modularity is a direct function of Bykoski's (2003) "connection costs" rule. As growth units (e.g., employees in an organization) increase, communication costs increase disproportionately. Adaptive efficiency results when modules form (Sanchez & Mahoney, 1996; Schilling, 2000). This also sets up a power-law outcome.

Third, note that the "square/cube" and "connection cost" rules oppose each other. If degrees of freedom grow too much, connectivity costs cause modules to form. But if modules become too large, the square/cube rule holds and fission results so as to bring the system back into balance by increasing the degrees of freedom. Acting together, these two rules set up the condition of Bak, Tang, and Wiesenfeld's (1987) "self-organized criticality" rule – balance is achieved more frequently by small adjustments, but less frequently by more dramatic reorganizations. Taken together, these rules imply that managers live in a world of self-organized criticality.

Fourth, Prigogine's (1955) negentropy4 stimulates creation of new organizational structures (what he called "dissipative structures") from adaptive tensions between supply and demand, high and low cost, old and new technology, etc. (McKelvey, 2004). The "phase-transition" rule results in new structures appearing as a Pareto distribution. Sornette's (2004) study of Amazon book sales shows that exogenous shocks (the phase-transition rule) and endogenous interaction both dissipate through the "interaction decay" rule. His study shows that social interaction growth and decline both show power-law signatures.

Fifth, the "diversity" rule explains Kauffman's (1993) so-called "spontaneous" order creation dynamics in a system composed of heterogeneous agents.5 Simply, the presence of diverse agents sets up some probability that interaction networks will form. Sornette, Deschâtres, Gilbert, and Ageon's (2004) study shows this clearly in a social interaction setting: an increase in the sales of books occurs simply through interactions among people without the exogenous phase-transition effect present.

Sixth, the "self-organized criticality" (SOC) rule has many applications in organizations. We have already mentioned one of them above. Table 1d lists a variety of power-law findings bearing on organizations and management. Many of these result from SOC. Managers, for example, take a variety of
large or small actions to keep a firm on track toward a particular objective, with many more small ones than large ones (e.g., Miller & Friesen, 1984). These will form power laws. Thomas, Kaminska-Labbé, and McKelvey (2005) draw on the recent fractal heartbeat literature (Bigger et al., 1996; Ribeiro et al., 2002) to argue against the prevailing advocacy of aiming for static balance between dualities such as control vs. autonomy, exploitation vs. exploration, and global efficiency vs. country sensitivity. Their view is that dualities should be managed according to Dumont's (1966) theory of societies and Romme's (1999) circular organizing through "irregular oscillation" – like the irregular motions used to balance a bicycle. This is a managerial application of the SOC rule.

Seventh, the "least-effort" rule is demonstrated in managers' attempts to simplify their responsibilities when dealing with too much causal intricacy. Top-down influence and control channels that are the easiest to set up and use will occur the most frequently. More complicated and multidirectional communications will be less frequent, even though more effective. Diatlov (2005) shows that hierarchies and decision events both show power-law signatures. Least effort also contributes to the SOC effect, as managers balance between doing what is easy vs. doing difficult things that are more likely to work.

Eighth, scanning across Table 1, it becomes clear that the "preferential attachment" rule is the most frequent. There are many network-based power-law signatures and this rule, otherwise known as the Matthew effect or rich-get-richer rule, underlies all of them. Managers operate within intra- and inter-organizational networks. The U.S. and Europe both have antitrust laws to prevent what Barabási (2002) terms the "winner take all" rule. As Table 1 shows, parts of systems, the systems themselves, their predators (competitors) and prey (customers), and niche resources are all comprised of fractal structures with preferential attachment dominating. Much M&A activity is fractal balancing – buying firms so as to match other large firms (Target and Sears merging to match Wal-Mart); firms buying or selling lower-level units so as to fractal-balance against competitors and customers from the bottom to the top.

Our analysis suggests that all of the power-law rules outlined in Table 2 apply to managers in organizations. There can be no escaping the conclusion that managers are constantly subject to power-law phenomena. And yet, everything they are supposed to learn from quantitative academic research rests on assumptions denying the existence of Pareto distributions and/or using robustness methods to fit Pareto phenomena into Gaussian assumptions.
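As a toy illustration of the preferential attachment rule discussed above (assumed parameters, not drawn from any study cited here), the following sketch grows a network in which each new node attaches to an existing node with probability proportional to that node's current degree; the resulting degree distribution is highly skewed, with a few hubs and many poorly connected nodes.

  # Toy preferential-attachment ("rich-get-richer") simulation; assumed parameters.
  set.seed(5)
  n_nodes <- 5000
  degree  <- c(1, 1)                          # start with two connected nodes

  for (i in 3:n_nodes) {
    target <- sample(seq_along(degree), size = 1, prob = degree)
    degree[target] <- degree[target] + 1      # existing node gains a tie
    degree <- c(degree, 1)                    # new node enters with one tie
  }

  summary(degree)                             # a few large hubs, many nodes with degree 1
  plot(table(degree), log = "xy",
       xlab = "degree k", ylab = "frequency")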
WHY GAUSSIAN STATISTICS CAN MISLEAD

Power-law phenomena stem from Paretian rather than Gaussian distributions. The fundamental difference lies in assumptions about the correlation among events. In a Gaussian distribution events are assumed to be independent. Independent events generate normal distributions, which sit at the heart of modern statistics. When events are interdependent, normality in distributions is not the norm. Instead, Paretian distributions dominate because extreme events occur more frequently than normal, bell-shaped Gaussian-based statistics would lead us to expect.

As far back as Roethlisberger and Dickson's (1939) Management and the Worker, and perhaps further, management research has built its credibility on a statistics of individuals. Distributions of individuals' attributes are invariably Gaussian with obvious bell-shaped normal distributions. There is just enough variance in a typical "human" distribution to allow correlations of one attribute with another, but the mean is always stable and meaningful and the variance is very finite: too little variance (everyone at the mean) and correlations are impossible; too much variance and the outliers appear to cause invalid conclusions. All well and good for comparing people; but the management world is not just about people. It is full of large, valid, and meaningful size effects – sizes of firms, economies, market crashes, technological and market changes, government effects, wars, droughts, plagues, global effects, and on and on. As soon as large size effects appear, distributions shift away from normal to exponential, lognormal, Paretian, and power-law signatures. Surprisingly, statisticians still want us to believe we live in a world of Gaussian-distributed people instead of Pareto-distributed effects like wealth and firm sizes. Nowhere is this more obvious than in modern econometrics.

The Centrality of Gaussian Assumptions in Current Statistical Analysis

Greene's Econometric Analysis (2002) is the standard reference for many econometricians and other social science researchers. He begins his analysis of more than 900 pages with linear multiple regression and its five endemic assumptions: (1) independence among data points; (2) linear relationships among variables; (3) exogenous independent variables; (4) homoscedasticity and nonautocorrelation; and (5) normal distribution. Mostly, the book focuses on how to make econometric methods work when one or more of these assumptions are untrue of the data. Given nonlinearity, for example,
Greene says: "by using logarithms, exponentials, reciprocals, transcendental functions, polynomials, products, ratios, and so on, this 'linear' model can be tailored to any number of situations" (p. 122). As for the normal distribution assumption:

… large sample results suggest that although the usual t and F statistics are still usable … they are viewed as approximations whose quality improves as the sample size increases … As n increases, the distribution … converges exactly to a normal distribution … This result is based on the central limit theorem and does not require normally distributed disturbances (p. 105).
He observes that:

"heteroscedasticity poses potentially severe problems for inferences based on least squares" [regression analysis] … It is useful to be able to test for homoscedasticity and if necessary, modify our estimation procedures accordingly (p. 222).

Greene does not discuss the probability that interdependent, interacting, connectionist, interconnecting, coevolutionary, or mutual causal effects might occur among social events or agents. Nor does he discuss when independence might shift to interdependence, or the reverse. These possibilities just do not exist in econometricians' assumptions about data. Most explanations for why Pareto distributions occur include a reference to interconnection of some form, however. Pareto distributions occur because there is some probability of a positive feedback or mutual causal progression resulting in an extreme event. None of the robustness adjustments to failing linear multiple regression assumptions that Greene discusses deals with the real world's probable – not just possible – losses of independence.

The bottom line is that the various robustness tests Greene discusses, even including the best and the most widely used one, the generalized autoregressive, conditionally heteroscedastic (GARCH) model (Bollerslev, 1986), give no assurance whatsoever that modern-day researchers account for the effects of extreme events in their statistical analyses.6 With respect to earthquakes, Greene, along with virtually all modern regression modelers, wants Californians building and living in high-rise buildings to think that using a moving average (GARCH) of quake variance over the hundreds of harmless (average) quakes will lead to building codes that protect against "the big one" – a level 8 or 9 quake that moves the ground 30 feet north.

What robustness improvements do accomplish is the narrowing of statistical confidence intervals by avoiding the stretching effects of the Pareto distribution's infinite tails. This considerably improves the likelihood of researchers getting their analyses to the point of meeting the minimum "p < 0.05" standards most journals have for publishing quantitative
results – basically they shift confidence intervals in toward the mean, making them easier to reach. Of course, this creates an incentive to not want to consider the idea that Gaussian distributions may totally misrepresent social and organizational phenomena. But it also brings into question many management research findings and derivative advice to practitioners.

There is an alternative – extreme value theory and statistics – a robust statistical discipline dating back nearly 80 years. It is a discipline aimed at studying Paretian tails. In a managerial world where averages may often be less important than extremes, we now offer an introduction to this field of study, presenting three basic models and empirical examples, drawing illustrations from sports, terrorism, and managerial corruption.
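Before turning to those models, a small numerical contrast makes the preceding section's point concrete (purely illustrative, assumed parameters; the two scales are not strictly comparable, and the point is only the order-of-magnitude difference in tail mass): under a Gaussian model an observation five standardized units above typical values is essentially impossible, while under a Pareto model comparably extreme observations remain quite probable.

  # Tail mass under Gaussian vs. Pareto assumptions (illustrative parameters).
  pnorm(5, lower.tail = FALSE)                # standard normal: P(Z > 5) is about 2.9e-07
  g <- 1.5; xmin <- 1                         # assumed Pareto tail exponent and lower bound
  (5 / xmin)^(-g)                             # Pareto survival P(X > 5) is about 0.09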
FOUNDATIONS OF EXTREME VALUE THEORY AND STATISTICS

Historically, work on extreme value problems can be traced back as far as Nicholas Bernoulli's analysis of mean largest distances in 1709. In the 1920s, a number of individuals simultaneously began systematic development of a general statistical theory of extreme values. Early theoretical breakthroughs were the analysis by Fréchet (1927) of asymptotic distributions of largest values, and the independent analysis by Fisher and Tippett (1928) of the same problem. While Fréchet identified one possible limit distribution for extreme values, Fisher and Tippett showed that extreme limit distributions could only be one of three types. Gnedenko (1943) presented the foundations for extreme value theory by providing necessary and sufficient conditions for Fisher and Tippett's "three types theorem." An early application was the paper of Weibull (1939) on metallurgical failures. This theorem led Gumbel (1941, 1958) to propose a statistical methodology for studying extremes based on fitting the extreme value distributions to data consisting of maxima or minima of some random process over a fixed block or interval of time (e.g., a month or a year). Early applications were in the field of civil engineering, where structures must be designed to withstand forces that might reasonably be experienced. Gumbel's statistical methodology provided a way to anticipate the forces that might be encountered based on historical data. Contemporary methods derived from this early work involve fitting block maxima and minima with the generalized extreme value (GEV) distribution, which combines Fisher–Tippett and Gnedenko's three types of distributions into a single, three-parameter distribution.
More recent developments emphasize methods based on threshold exceedences (shortfalls) rather than block maxima (minima) (Davison & Smith, 1990). This "peaks over thresholds" (POT) approach originates with Pickands (1975), who showed that excesses over a high threshold, asymptotically, follow a generalized Pareto distribution. By taking all exceedences over (shortfalls below) an appropriately high (low) threshold into account, these methods make more efficient use of data by incorporating information on all extreme events in any given block rather than only the maximum among them. They can also be extended to situations where we are interested in how extreme levels of some variable X depend on some other variable Y. There is also an intermediate approach, the r-largest order statistics method, which examines the largest r order statistics in a block, rather than only the largest, again permitting more efficient estimation.

Although the threshold method is the most general and flexible, for completeness, we will also consider block maxima and r-largest methods. We illustrate both univariate and multivariate statistical models. Having said that, by necessity, we provide only a relatively short overview of the essential features of each of these models (for more in-depth treatments, see Castillo et al., 2005; Coles, 2001; Reiss & Thomas, 2001). Table 3 summarizes the key features, requirements, and uses for each empirical approach.

Implementation of these methods requires methods for estimating their parameters and associated standard errors so that their accuracy can be assessed and hypotheses tested. Unfortunately, standard statistical packages do not typically permit estimation of extreme value models. Several programs specifically tailored to extreme values are available, however. The analyses presented here were performed using The Extremes Toolkit (extRemes) (Gilleland & Katz, 2005), which is an interactive program for analyzing extreme value data using the R statistical programming language. The program may be downloaded at: http://www.isse.ucar.edu/extremevalues/evtk.html.
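The chapter's threshold analyses use the extRemes toolkit; the sketch below shows only the underlying idea in base R with hypothetical data (it is not the toolkit's interface, whose function names are not reproduced here): threshold exceedences are extracted and the two-parameter generalized Pareto distribution is fitted by direct maximum likelihood.

  # Peaks-over-threshold sketch in base R (hypothetical data; not the extRemes API):
  # fit a generalized Pareto distribution (GPD) to exceedences over a threshold u.
  set.seed(6)
  losses <- rexp(2000, rate = 0.5)^1.5        # hypothetical right-skewed losses
  u <- quantile(losses, 0.95)                 # assumed threshold
  y <- losses[losses > u] - u                 # exceedences over the threshold

  gpd_nll <- function(par, y) {               # negative log-likelihood of the GPD
    sigma <- par[1]; xi <- par[2]
    if (sigma <= 0) return(Inf)
    z <- 1 + xi * y / sigma
    if (any(z <= 0)) return(Inf)
    length(y) * log(sigma) + (1 / xi + 1) * sum(log(z))
  }

  fit <- optim(c(sigma = 1, xi = 0.1), gpd_nll, y = y, hessian = TRUE)
  fit$par                                     # scale and shape estimates
  sqrt(diag(solve(fit$hessian)))              # approximate standard errors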
Generalized Extreme Value Distribution (GEVD) Model

The extreme value distributions formally arise as limiting distributions for maxima or minima of a sequence of random variables. A formal definition is as follows. Suppose X_1, X_2, \ldots are independent random variables with a common distribution function F:

F(x) = \Pr\{X_j \le x\} \quad \text{for each } j \text{ and } x     (1)
Table 3. Summary of Extreme Value Theory Methods.

Block maxima (minima)
  Distribution: Generalized extreme value distribution (GEVD); a three-parameter (location, scale, shape) limiting distribution that approximates maxima and minima; encompasses the Gumbel, Fréchet, and Weibull distributions depending on the value of the shape parameter.
  Implementation: Data are segmented into "blocks" of equal length (e.g., week, month, year) and a series of block maxima or minima is generated, to which the GEVD can be fitted.
  Considerations: Block size: too short may result in a poor approximation by the limit model, biasing estimation and extrapolation; too long may generate few blocks, resulting in large estimation variance. Pragmatic considerations may necessitate yearly blocks if data are recorded annually.
  Interpretation: Model fit is assessed using graphical analysis (probability plots, quantile plots, return level plots, and probability density plots). When the model accurately fits the data, model parameters can be used to extrapolate from the data the maxima or minima expected over 10-, 25-, or 100-year intervals.
  Complications: Non-homogeneity: instability of the process producing extremes over time, resulting in temporal variation in the location, scale, and shape parameters of the distribution (e.g., seasonality, business cycles, innovation). Addressed by freeing model parameters to vary with time; model likelihood statistics aid in assessing model improvement.

r-Largest (smallest) order statistics
  Distribution: GEVD.
  Implementation: Data are segmented into "blocks" of equal length (e.g., week, month, year), and the largest (smallest) r observations (e.g., the 2, 3, 4, or 5 most extreme values) are recorded, generating a series that includes the r-largest (smallest) order statistics for each block, to which the GEVD can be fitted.
  Considerations: Number of order statistics: too small an r may generate few observations, resulting in large estimation variance; too large an r may include values that violate asymptotic support for the model, biasing estimation and extrapolation.
  Interpretation: Basic interpretation identical to the block maxima method. Selection of r is aided by comparison of diagnostic plots and parameter estimates to assess the validity of the model across a range of values of r. Increasing values of r improve the precision of estimates by inclusion of additional information, until r violates the asymptotic assumptions; model likelihood statistics aid in assessing model improvement.
  Complications: Extremal interdependence: the values of maxima (minima) for two series are correlated (e.g., wind strength and wave height). Addressed by freeing model parameters for each series to vary with values of the other series.

Threshold exceedence (shortfall)
  Distribution: Generalized Pareto distribution (GPD); a two-parameter (scale, shape) limiting distribution that approximates threshold exceedences (shortfalls) for sufficiently large (small) threshold values.
  Implementation: When complete time-series data are available, avoids blocking and makes use of all available information on extremes (e.g., when some blocks contain more extreme events than others).
  Considerations: Threshold level: too low an exceedence threshold may include values that violate asymptotic support for the model, biasing estimation and extrapolation; too high may generate few excesses for precise estimation (the opposite holds for shortfall thresholds).
  Interpretation: Basic interpretation identical to the block-based methods. Threshold selection is aided by diagnostic methods applied prior to estimation and by comparison of model fit across a range of thresholds. Increasing (lowering) the threshold improves the precision of estimates by inclusion of additional information, until the threshold violates the asymptotic assumptions.
  Complications: Dependent exceedences (shortfalls): threshold excesses (shortfalls) are clustered rather than distributed uniformly over time. The extent of clustering is estimated by the "extremal index." Addressed by "declustering" methods that filter dependent observations to obtain a set of approximately independent excesses (shortfalls); the extremal index is used to assess the effectiveness of declustering.
The distribution function of the maximum M_n = \max\{X_1, \ldots, X_n\} is given by the nth power of F:

\Pr\{M_n \le x\} = F^n(x)     (2)

Although this precisely states the distribution of the maximum, it gives no insight into what happens in large samples, that is, as n \to \infty. To do that, we can renormalize, defining scaling constants a_n > 0 and b_n such that

\Pr\{(M_n - b_n)/a_n \le x\} = \Pr\{M_n \le a_n x + b_n\} = F^n(a_n x + b_n) \to H(x) \quad \text{as } n \to \infty     (3)
where H is a probability distribution that is not always either 0 or 1 (Castillo et al., 2005; Coles, 2001). As noted earlier, there are only three possible forms for the limiting distribution H. These are given by

Type I – Gumbel:   H(x) = \exp(-\exp(-x)), \quad -\infty < x < \infty     (4)

Type II – Fréchet: H(x) = \begin{cases} 0 & \text{if } x < 0 \\ \exp(-x^{-\alpha}) & \text{if } 0 < x < \infty \end{cases}     (5)

Type III – Weibull: H(x) = \begin{cases} \exp(-(-x)^{\alpha}) & \text{if } -\infty < x < 0 \\ 1 & \text{if } x > 0 \end{cases}     (6)

In both the Fréchet and Weibull types, \alpha > 0 is some fixed constant. Any F for which the tail takes the power-law form

1 - F(x) \sim c x^{-\alpha}, \quad x \to \infty     (7)

for constants c > 0 and \alpha > 0 is in the domain of attraction of the Fréchet type (i.e., F converges to H of the Fréchet type through Eq. (3) for appropriate a_n and b_n). Any F with a finite endpoint \omega_F, such that F(\omega_F) = 1 but F(x) < 1 for any x < \omega_F, and with power-law behavior as x \uparrow \omega_F, so that

1 - F(\omega_F - y) \sim c y^{\alpha}, \quad y \downarrow 0     (8)
for constants c > 0 and \alpha > 0, is in the domain of attraction of the Weibull type. This type is usually applied to minima and to situations where there is a clear finite lower bound. The most common distributions in the domain of the Gumbel type are those for which the endpoint \omega_F is infinite but the tail 1 - F(x) decays faster than any polynomial tail of the form in Eq. (7). These three types of extreme value distributions can be combined into a single family, known as the GEVD, given by

H(x) = \exp\{-[1 + \xi((x - \mu)/\psi)]^{-1/\xi}\}, \quad \text{with } 1 + \xi(x - \mu)/\psi > 0, \ \xi \ne 0     (9)

where \mu is a location parameter, \psi a scale parameter, and \xi the shape parameter. The location parameter shifts the distribution to the left or right on the horizontal axis, the scale parameter stretches or compresses the distribution, and the shape parameter determines the nature of the tail of the distribution. When \xi > 0 the distribution is of the Fréchet type with \alpha = 1/\xi, when \xi < 0 it is of the Weibull type with \alpha = -1/\xi, and when \xi = 0 it is of the Gumbel type. \xi is thus a key parameter in determining the qualitative behavior of the GEVD. Fig. 4 shows the effects of \psi and \xi on the form of the distribution.

The GEVD provides a model for the distribution of block maxima of a sequence of independent and identically distributed observations X_1, X_2, \ldots. Since F is not typically known, we can use the GEVD as an approximation for the distribution of maxima of the X_i sequence. To implement the model, data are blocked into sequences of equal length (e.g., month, year), generating a series of block maxima, M_{n,1}, \ldots, M_{n,m}, to which the GEVD can be fitted. If the X_i are independent, then the block maxima are also independent; even if the X_i constitute a dependent series, however, independence of the block maxima is likely to be a reasonable approximation, and the GEVD remains useful. Estimates of extreme quantiles of the distribution of block maxima are then obtained by inverting Eq. (9):

M_p = \begin{cases} \mu - (\psi/\xi)\left[1 - \{-\log(1 - p)\}^{-\xi}\right] & \text{for } \xi \ne 0 \\ \mu - \psi \log\{-\log(1 - p)\} & \text{for } \xi = 0 \end{cases}     (10)

where H(M_p) = 1 - p.
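As an aside, Eq. (10) is simple enough to compute directly. The sketch below implements it as a small R function; the function name is our own, and extRemes' qevd() and return.level() provide equivalent (and more complete) functionality.

## Return level M_p implied by Eq. (10): the level exceeded on average once
## every 1/p blocks, given GEVD parameters mu (location), psi (scale), xi (shape).
gev_return_level <- function(mu, psi, xi, p) {
  if (abs(xi) > 1e-8) {
    mu - (psi / xi) * (1 - (-log(1 - p))^(-xi))  # xi != 0 branch of Eq. (10)
  } else {
    mu - psi * log(-log(1 - p))                  # Gumbel (xi = 0) limit
  }
}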
Fig. 4. Effects of the Scale (ψ) and Shape (ξ) Parameters on the Form of the GEVD (curves shown for ψ = 0.25, 0.50, 1, 3 and for ξ < 0, ξ = 0, ξ > 0).
M_p is termed the return level associated with the return period 1/p; that is, assuming annual block maxima, the level M_p is expected to be exceeded on average once every 1/p years.

Choosing the size of the block is important and amounts to a trade-off between bias and variance: too short a block may result in poor approximation
by the limit model, leading to bias in estimation and extrapolation, while too long a block may generate too few blocks, resulting in large estimation variance (Castillo et al., 2005; Coles, 2001). Pragmatic considerations often lead to the use of yearly blocks because annual data are more commonly recorded. Even when this is not the case, estimates based on shorter blocks may be less robust if the conditions of Eq. (9) are violated; daily data, for example, may suffer from seasonality, violating the assumption that the X_i have a common distribution. We now apply GEVD methods to two novel datasets to demonstrate their implementation.

World's Fastest Women and Men

This example is based on the International Association of Athletics Federations (IAAF) list of the 255 fastest men's and 86 fastest women's 100 m sprint times, which includes all men's times of 10.06 s or less and all women's times of 11.03 s or better.8 There are two equivalent approaches to modeling these minima data: either the GEVD for minima can be fitted directly, or the data can be negated and the GEVD for maxima fitted (Coles, 2001). We adopt the latter approach, since The Extremes Toolkit does not include a routine to estimate the GEVD for minima directly. Fig. 5 shows the yearly world's fastest times in the 100 m sprint over the period from 1975 to 2005 for men and from 1981 to 2005 for women. From these data, we will attempt to estimate the fastest times likely to occur over the next 10, 25, or even 100 years by fitting the GEVD.

Maximization of the GEVD log-likelihood (16.597) for the women's data yields estimates for (μ, ψ, ξ) of (−10.885, 0.115, −0.143), with standard errors (0.026, 0.018, 0.138). Approximate 95% confidence intervals for the parameters are thus [−10.937, −10.833] for μ, [0.079, 0.151] for ψ, and [−0.413, 0.127] for ξ. For the men's data the log-likelihood is 33.518 and the estimates for (μ, ψ, ξ) are (−9.953, 0.084, −0.334), with standard errors (0.017, 0.013, 0.163). Approximate 95% confidence intervals for the parameters are thus [−9.985, −9.920] for μ, [0.058, 0.110] for ψ, and [−0.660, −0.008] for ξ.

Although it is impossible to check the validity of an extrapolation based on a GEVD model, its reasonableness can be assessed against the observed data. Four graphical analyses assist with model checking (Castillo et al., 2005; Coles, 2001). Diagnostic plots assessing the accuracy of the GEVD model fitted to these data are shown in Fig. 6.
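A sketch of how this negation approach might be carried out with the current extRemes interface is shown below. 'sprint' is a hypothetical data frame with one row per year and a column 'best' holding that year's fastest time; the code is illustrative rather than a reproduction of the original Toolkit session.

library(extRemes)

z <- -sprint$best                      # negate so that annual minima become maxima
fit_men <- fevd(z, type = "GEV")       # GEVD fit to the negated series
summary(fit_men)                       # location near -9.95, shape near -0.33 for the men's data

## Return levels are estimated on the negated scale and converted back to seconds
rl <- return.level(fit_men, return.period = c(10, 25, 100))
-as.numeric(rl)                        # fastest times expected once per 10, 25, and 100 years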
Fig. 5. One Hundred Meter Sprint – Fastest Recorded Annual Time Worldwide among Women, 1981–2005, and Men, 1975–2005.
Fig. 6. One Hundred Meter Sprint Times – Probability, Quantile, Return-Level, and Density Plots.
Probability plots compare the empirical and fitted distribution functions. If the GEVD model is a good fit, the fitted distribution function should approximate the empirical one and the probability plot should lie close to the diagonal; departures from linearity signal some weakness in the fit of the GEVD model. A limitation of the probability plot for assessing extreme value models is that both the empirical and fitted probability distributions approach one as M increases, while it is the accuracy of the model for large values of M that is of greatest interest. Thus, probability plots are least helpful in the area of greatest concern. This limitation is avoided by quantile plots, which compare the empirical and estimated values of X; the two plots contain the same information, but are presented on different scales. The third type is the return level plot, which plots estimated return levels (M_p) against their associated return periods (1/p) on a logarithmic scale, so that the tail of the distribution is compressed and the return level estimates for long return periods are displayed. The linearity of the plot in this case provides a baseline against which to judge the effect of the estimated shape parameter (ξ). Confidence intervals for the estimated return level can be included on the plot to increase its informativeness, and empirical values can be added for further diagnostic value – if the model estimates and empirical values are not roughly similar, the adequacy of the GEVD model is open to question. Finally, the probability density function of the fitted model can be compared to a histogram of the empirical data. Since the shape of a histogram depends importantly on the choice of groupings, this plot is generally less informative than the probability, quantile, and return level plots.

For both women and men, the near-linearity of the probability and quantile plots gives little cause to doubt the validity of the fitted model. The quantile plot for women does, however, diverge from linearity, with the model estimate lower than the highest empirical value (i.e., the current world record). The return level curves for both men and women asymptote to a finite level, reflecting the negative estimates for ξ. The 95% confidence bounds on the return level plots remain relatively compact for quite high return periods, particularly for men. The corresponding density estimates also seem quite consistent with the histograms of these data. Overall, then, the diagnostic plots provide reasonably strong support for the fitted GEVD model.

Given the fit of the model, and assuming the relative stability of the process producing the fastest times, for men the model predicts the estimated 10-year return level to be 9.820 s with 95% confidence interval [9.767,
9.852], for 25 years 9.788 s with 95% confidence interval [9.723, 9.819], and for 100 years 9.755 s with 95% confidence interval [9.652, 9.791]. For women, the 10-year return level is estimated to be 10.663 s with 95% confidence interval [10.543, 10.731], for 25 years it is 10.681 s with 95% confidence interval [10.620, 10.717], and for 100 years the estimate is 10.496 s [10.221, 10.598]. The asymmetry in the confidence intervals for the return levels reflects the greater uncertainty about the extreme values of the process. Thus, the model predicts little change in the fastest sprint times over the next 10 or even 100 years for either men or women from the current world records of 9.77 s set by Asafa Powell in 2005 and 10.49 s set by Florence Griffith-Joyner in 1988.

The latter of these records is one of the most controversial in athletics. As Fig. 5 shows, no female sprinter has come close to matching Griffith-Joyner's record; and as the return level estimates indicate, her record is indeed the "sprint of the century." Her dramatic physical change and improvement in 1988, rapid subsequent retirement, and shockingly early death have led many to conclude that Griffith-Joyner was using steroids and other banned performance-enhancing drugs.

These observations suggest an intriguing use of the GEVD – detection of cheating. Not included in the men's data is the discredited world-record time of 9.73 s set by Ben Johnson at the 1988 Seoul Olympic Games. When included in Fig. 5 alongside the other men's times, this time appears much like Griffith-Joyner's – an outlier among outliers. Would Ben Johnson's record time be expected based on the annual fastest times for the 1975–1987 period preceding his Seoul run? It appears not. The estimated 10-year return level based on the 1975–1987 data is 9.942 s with 95% confidence interval [9.865, 9.979]; for 25 years the estimate is 9.911 s with 95% confidence interval [9.745, 9.949]. Thus, we should not expect to see a run of Ben Johnson's record time even once in a quarter century; Carl Lewis's second-place time of 9.92 s, which ultimately earned him the gold medal, was well within the expected range. The current world record set in 2005 by Asafa Powell is 9.77 s, which is within the once-in-a-quarter-century prediction range based on the pre-1988 data.

Home Run Kings

The potential of the GEVD to detect performance enhancement led us directly to the historical statistics archive of Major League Baseball – a league over which the use of steroids had cast a pall during the 2004–2005 season. The archive records the yearly home run leader from 1874 to 2004. Given the changes in the number of games played in 1920, we use data from the year
1920 onward – the year Babe Ruth hit 54 home runs for the New York Yankees and the "Curse of the Bambino" began – through 2004, the year the curse ended when the Boston Red Sox won the World Series.9 Fig. 7 shows the major league leading number of home runs hit in each year from 1920 to 2004. From these data, we will attempt to estimate the maximum number of home runs we should expect to see hit once every 10, 25, or 100 years.

The pattern of home runs appears to have changed little over the observation period, with the exception of a spike (1998, 1999, and 2001) late in the period. These "outliers" are central to the controversy alleging steroid use by some players to enhance their performance – including those who were home run leaders in these three years. Should we expect to see such extreme performances? That is, are these values predicted by the pattern of yearly maxima prior to their occurrence, or do they fall outside the range of normal extreme events?

Maximization of the GEVD log-likelihood (294.244) for the home run data for the 1920–1997 period leading up to the recent spike yields estimates for (μ, ψ, ξ) of (42.872, 6.586, −0.271) with standard errors (0.767, 0.533, 0.065). Approximate 95% confidence intervals for the parameters are thus [41.369, 44.375] for μ, [5.541, 7.631] for ψ, and [−0.398, −0.144] for ξ.

The diagnostic plots for assessing the accuracy of the GEVD model fitted to these data are shown in Fig. 8. The near-linearity of both the probability
Fig. 7. Major League Baseball – Yearly Leading Number of Home Runs.
Fig. 8. Baseball Home Runs – Probability, Quantile, Return-Level, and Density Plots.
and quantile plots gives little cause to doubt the validity of the fitted model. The corresponding density estimates also seem consistent with the histogram of the data. The return level curve asymptotes to a finite level, reflecting the negative estimate for ξ, and the 95% confidence bounds on the return level plot remain compact for quite high return periods. Overall, then, the diagnostic plots provide reasonably strong support for the fitted GEVD model.

Given the fit of the model, and assuming the relative stability of the process producing home runs, the model estimates the 10-year return level to be 53.967 home runs with 95% confidence interval [52.308, 56.057], the 25-year return level to be 56.959 home runs with 95% confidence interval [55.151, 59.967], and the 100-year return level to be 60.187 home runs with 95% confidence interval [57.993, 64.841]. Thus, based on the data from 1920 to 1997, once in 100 years we should expect to see a home run leader hit between 58 and 65 home runs. Mark McGwire's 1999 mark of 65 home runs is at the upper bound of the model prediction, but his 70 home runs in 1998, followed by Barry Bonds's 73 home runs in 2001, exceed it. Of course, the occurrence of these values within
such a short time frame is clearly not expected based on the model’s predictions. These are surely extremes beyond normal extreme events!
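As a check on these figures, the reported home-run return levels can be recovered directly from Eq. (10) and the parameter estimates above; the short R snippet below does the arithmetic (no extRemes functions are needed for this step).

## Return levels implied by Eq. (10) for the 1920-1997 home run estimates
mu <- 42.872; psi <- 6.586; xi <- -0.271
p  <- c(0.10, 0.04, 0.01)                        # 10-, 25-, and 100-year return periods
mu - (psi / xi) * (1 - (-log(1 - p))^(-xi))      # approximately 54.0, 57.0, and 60.2 home runs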
A Complication: Non-homogeneity

In contrast to the league-leading numbers of home runs, which are relatively stable over time, there is evidence in Fig. 5 that the pattern of the fastest 100 m times has changed over time, particularly for men (and for women as well, excluding Griffith-Joyner's controversial 1988 result). This trend may reflect, for example, improved training methods, and it seems likely that such improvements will continue to occur (although perhaps with diminishing returns).10 These improvements in athletic performance over time suggest that the distribution is non-homogeneous across years. Because GEVD estimates are only meaningful under the assumption of stability in the process, this casts doubt on the foregoing analysis.

Fortunately, such temporal patterns can be accounted for by permitting any or all of the GEVD parameters (μ, ψ, ξ) to vary over time, t (Castillo et al., 2005; Coles, 2001; Reiss & Thomas, 2001). The plots in Fig. 5 suggest that the changes over time affect the level of the distribution but not its other aspects. We can therefore estimate a model in which ψ and ξ are assumed constant and μ(t) is time varying. Notably, even if ψ and ξ are in fact time-invariant, misspecification of any one parameter can lead to inaccurate estimates of another.

A summary of results for models with constant, linear, quadratic, and log-linear trends in μ(t) is presented in Table 4 (for women, Griffith-Joyner's result was replaced with the fastest time recorded by another athlete in the same year). Confirming the apparent improvement in minima over time, all time-varying specifications improve over the time-constant specification. The simple linear trend implies an improvement of 0.008 s per year (0.25 s over the 1975–2005 period) for both men and women. For men, the model log-likelihood statistics indicate that while all three time-varying models improve on the constant model, the linear model provides the best fit (twice the difference in log-likelihoods is distributed as χ² with degrees of freedom equal to the number of additional parameters estimated). For women, the quadratic and log-linear models are roughly equivalent in fit (given the difference in the number of parameters estimated), although a formal comparison is not possible since the models are not nested. The log-linear model has the benefit, however, of implying a slowing of improvements in race times, but not a worsening of the fastest time.
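A sketch of how such a time-varying location parameter might be specified with the current extRemes interface is given below; the data frame 'sprint' is the same hypothetical stand-in introduced earlier, the time index 't' is our own construction, and lr.test() carries out the χ² comparison just described.

library(extRemes)

sprint$t <- sprint$year - min(sprint$year)    # time index starting at 0
sprint$z <- -sprint$best                      # negated annual fastest times

fit0 <- fevd(z, data = sprint, type = "GEV")                                # constant mu
fit1 <- fevd(z, data = sprint, location.fun = ~ t, type = "GEV")            # linear trend in mu
fit2 <- fevd(z, data = sprint, location.fun = ~ t + I(t^2), type = "GEV")   # quadratic trend

lr.test(fit0, fit1)   # twice the log-likelihood difference, referred to chi-square (df = 1)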
Table 4. GEVD Estimates with Time Variation in μ.

Men
  Constant    LL 33.518;  μ = −9.953 (0.017);  ψ = 0.084 (0.013);  ξ = −0.334 (0.163)
  Linear      LL 55.042;  μ(t): −10.059, 0.008 (0.009, 0.000);  ψ = 0.044 (0.006);  ξ = −0.397 (0.112)
  Quadratic   LL 55.757;  μ(t): −10.026, 0.012, 0.0002 (0.010, 0.000, 0.000);  ψ = 0.049 (0.008);  ξ = −0.632 (0.188)
  Log-linear  LL 47.771;  μ(t): −10.116, 0.076 (0.033, 0.012);  ψ = 0.051 (0.007);  ξ = −0.266 (0.143)

Women
  Constant    LL 21.424;  μ = −10.876 (0.024);  ψ = 0.111 (0.018);  ξ = −0.425 (0.133)
  Linear      LL 25.192;  μ(t): −10.968, 0.008 (0.021, 0.000);  ψ = 0.102 (0.013);  ξ = −0.537 (0.151)
  Quadratic   LL 28.835;  μ(t): −11.089, 0.036, 0.001 (0.028, 0.002, 0.000);  ψ = 0.081 (0.013);  ξ = −0.387 (0.158)
  Log-linear  LL 27.180;  μ(t): −11.064, 0.085 (0.049, 0.021);  ψ = 0.092 (0.016);  ξ = −0.494 (0.167)

Note: Standard errors of the coefficient estimates are in parentheses. Estimates refer to the negated (maxima) scale.
Fig. 9 shows the estimated trends in μ from the constant, linear, quadratic, and log-linear models. As suggested by the likelihood analysis, the linear and log-linear models give the most accurate representations of the apparent time variation in the men's and women's data, respectively. Thus, the prediction from the time-constant model that there would be little improvement in either men's or women's world record times over the next 100 years is improved by taking into account the pattern of time dependence in the location parameter.
r-Largest Order Statistics Model (GEVD)

A difficulty in any extreme value analysis is the scarcity of data for model estimation, which can result in model estimates and return levels with large variance. This challenge has led to the development of characterizations of extreme behavior that permit modeling of data other than block maxima. There are two such approaches (Castillo et al., 2005; Coles, 2001; Reiss & Thomas, 2001). One is based on the behavior of the r-largest (smallest) order
Fig. 9. One Hundred Meter Sprint Times – Estimated Linear, Quadratic, and Log-Linear Time Trends (panels: Women, Men).
statistics within a block, for small values of r; the other is based on exceedences of high (low) thresholds. Here we focus on the model for the r-largest order statistics; the threshold exceedence model is examined below.

As before, starting with a series of independent and identically distributed variables, data are grouped into m blocks. In block i the largest r_i
observations are recorded, generating the series M_i^{(r_i)} = (M_i^{(1)}, \ldots, M_i^{(r_i)}), for i = 1, \ldots, m. The r-largest order statistics model gives a likelihood whose parameters correspond to those of the GEVD for block maxima. The interpretation of the parameters is thus unaltered from the block maxima approach, but the precision of estimates should be improved by including the additional information. Akin to the block length trade-off, the number of order statistics used in each block also constitutes a bias–variance trade-off: too small an r generates few observations and high variance; too large an r may violate the asymptotic support for the model, leading to bias. In practice, r is typically set as large as possible, subject to model diagnostics. Below, we apply this method to demonstrate its application.

Intifada Fatalities

September 29, 2000 saw an outburst of Palestinian anger that has resulted in an ongoing "Intifada," the Arabic term for "a shaking off." On an almost daily basis, innocent Palestinian and Israeli people are killed. This analysis of Intifada fatalities is based on records of Israeli and Palestinian deaths compiled by the Middle East Policy Council (MEPC).11 From these data, we will attempt to estimate the maximum monthly fatalities likely to result from this conflict, as well as to gain an understanding of the conflict's underlying dynamic, by fitting the data to the r-largest order statistics model.

Fig. 10 shows the four largest weekly numbers of Palestinian fatalities for each month of the Intifada; thus, r = 1 gives the most deadly week, r = 2 the second most deadly, and so on. Table 5 gives the log-likelihoods, parameter estimates (standard errors), and return levels (95% confidence intervals) for r = 1, 2, 3, and 4. The estimates for r = 1 give the GEVD fitted to monthly block maxima. As expected, the precision of estimates tends to increase (i.e., standard errors shrink) with increasing values of r, as more data are included in the analysis. If the asymptotic approximation remains valid for a given value of r, then the parameter estimates should be stable when the model is estimated with more order statistics. Parameter estimates are relatively stable for μ and ξ from r = 2 through r = 4, and across all values of r for ψ, supporting the validity of the model. The estimated 10- and 100-month return levels given in Table 5 also appear in line with the observed fatality rates: the lower bound of the 95% confidence interval for the estimated 100-month return level has been exceeded four times, and the lower bound of the 95% confidence interval for the 10-month return level five times, in 56 months.
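The Extremes Toolkit itself does not expose an r-largest routine, but the companion ismev package (the software accompaniment to Coles, 2001) does. The sketch below shows how such a fit might be set up; 'fatal' is a hypothetical matrix whose i-th row holds the four largest weekly fatality counts of month i, ordered largest first, and the function names are assumptions about ismev's interface rather than the chapter's original code.

library(ismev)

fit_r2 <- rlarg.fit(fatal, r = 2)   # use the two largest order statistics per month
fit_r4 <- rlarg.fit(fatal, r = 4)   # use all four
rlarg.diag(fit_r4)                  # probability, quantile, return-level, and density diagnostics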
Fig. 10. Palestinian Intifada Fatalities, September 2000–May 2005: r-Largest Order Statistics.
The validity of the model is reinforced by the diagnostic plots for each value of r, presented in Fig. 11. As the four panels in the figure show, the agreement between the model and the data improves as r increases for all four diagnostic plots, and the confidence intervals on the return level plot become narrower, reflecting the increasing precision. Notably, the return level plots indicate that the process does not asymptote, reflecting the positive values of ξ and suggesting the explosiveness of the Intifada interaction.

We now repeat the analysis for Israeli fatalities. Fig. 12 gives the four largest weekly numbers of Israelis killed for each month of the Intifada, and Table 5 gives the log-likelihoods and parameter estimates (standard errors) for r = 1, 2, 3, and 4. In contrast to expectations, the precision of estimates increases from r = 1 (i.e., the GEVD for monthly maxima) to r = 2, but not for r = 3, and the model would not converge for r = 4. Additionally, the instability of the parameter
Table 5. r-Largest Order Statistics Model Estimates for r = 1, 2, 3, and 4.

Palestinian fatalities
  r = 1  LL 227.366;  μ = 16.749 (1.649);  ψ = 11.015 (1.259);  ξ = 0.147 (0.098);  10-month return level 46.130 [38.096, 60.689];  100-month return level 89.171 [65.015, 146.101]
  r = 2  LL 383.602;  μ = 18.158 (1.356);  ψ = 10.838 (1.083);  ξ = 0.223 (0.083);  10-month 49.889 [42.439, 64.213];  100-month 101.471 [77.646, 161.440]
  r = 3  LL 502.909;  μ = 19.091 (1.273);  ψ = 10.845 (1.051);  ξ = 0.233 (0.073);  10-month 52.524 [42.944, 70.556];  100-month 107.644 [75.042, 185.732]
  r = 4  LL 607.196;  μ = 19.296 (1.262);  ψ = 11.074 (1.002);  ξ = 0.256 (0.073);  10-month 57.690 [49.268, 74.227];  100-month 108.815 [76.637, 188.323]

Israeli fatalities
  r = 1  LL 188.914;  μ = 5.391 (0.844);  ψ = 5.212 (0.702);  ξ = 0.254 (0.152);  10-month 31.205 [26.418, 40.908];  100-month 50.833 [37.478, 77.237]
  r = 2  LL 299.581;  μ = 4.575 (0.609);  ψ = 4.726 (0.662);  ξ = 0.605 (0.147)
  r = 3  LL 345.180;  μ = 3.883 (0.618);  ψ = 5.313 (1.186);  ξ = 1.188 (0.241)
  r = 4  Failed to converge

Note: Standard errors of coefficient estimates in parentheses; 95% confidence intervals of return levels in brackets.
estimates suggests that the asymptotic approximation is violated for the higher order statistics. The frequency with which the r = 3 and r = 4 order statistics are zero in Fig. 12 is consistent with this conclusion. Thus, the validity of the r-largest order statistics model is in doubt for the Israeli fatality data.

Diagnostic plots for each value of r, presented in Fig. 13, reinforce these doubts. While the fit of the GEVD to the monthly block maxima (i.e., r = 1) is good, as the panels for r = 2 and 3 show, the agreement between the model and the data declines as r increases for all four diagnostic plots, and the confidence intervals on the return level plot become wider rather than narrower. Thus, we conclude that the GEVD for monthly block maxima fits the data best. The estimated 10- and 100-month return levels given for this model in Table 5 appear consistent with the Israeli fatality data between September 2000 and May 2005: the lower bound of the 95% confidence interval for the
Fig. 11. Palestinian Intifada Fatalities – Probability, Quantile, Return-Level, and Density Plots.
Fig. 12. Israeli Intifada Fatalities, September 2000–May 2005: r-Largest Order Statistics.
estimated 100-month return level has been exceeded only once, and the lower bound of the 95% confidence interval for the 10-month return level four times, in 56 months. Conclusions based on extrapolated extremes in this conflict are speculative at best, however. The death of the Palestinian leader Yasser Arafat in November 2004 and the subsequent presidential elections are widely seen as turning points in the Middle East peace process that have altered the dynamics of the Intifada. This view is borne out by the decline in both Palestinian and Israeli deaths since March 2005.

A Complication: Interdependent Extremes

Earlier we observed a pattern of time dependence in the fastest 100 m sprint times. There is evidence in Figs. 10 and 12 of a different kind of dependence. In particular, the values of the maxima for Palestinian and Israeli fatalities appear correlated, with the highest values for both sides of the conflict
Fig. 13. Israeli Intifada Fatalities – Probability, Quantile, Return-Level, and Density Plots.
tending to co-occur. Such dependence likely stems from factors including the tendency of each side to seek retribution for recent fatalities, as well as the changing intensity of the conflict over time (the greater the intensity, the greater the fatalities on both sides), which is itself influenced by the retribution-seeking dynamic. Because GEVD estimates are only meaningful under the assumption of stability in the process, this casts doubt on the foregoing analysis. Although we examine multivariate extremes in more detail below, we can also attempt to account for this temporal variation by permitting the GEVD parameters (μ, ψ, ξ) to vary with factors, x, that influence the intensity of the conflict. Figs. 10 and 12 suggest that the changes over time in the intensity of the conflict affect the level of the distribution, but not its other aspects.
164
Table 6.
JOEL A.C. BAUM AND BILL MCKELVEY
GEVD Model Estimates with Conflict Dynamic Variation in m.
Model
LL
m, m(b)
c
x
Constant
227.366
Linear retribution
223.915
Squared retribution
225.337
16.749 (1.649) 13.407, .391 (1.912, 0.141) 15.300, .012 (1.732, 0.005)
11.015 (1.259) 10.316 (1.222) 10.758 (1.264)
0.147 (0.098) 0.151 (0.112) 0.123 (0.113)
5.391 (0.844) 2.152, 0.165 (1.304, 0.057) 3.845, 0.0029 (0.816, 0.0005)
5.212 (0.702) 4.998 (0.647) 5.092 (0.613)
0.254 (0.152) 0.128 (0.148) 0.055 (0.118)
Palestinian fatalities
Israeli fatalities Constant
188.914
Linear retribution
182.759
Squared retribution
181.449
Model coefficient estimate standard errors in parentheses.
Therefore, as before, we can estimate a model in which ψ and ξ are assumed constant and μ varies with x. Given the likely role of retribution in escalating the intensity of the conflict, here we define x to be the maximum weekly fatalities incurred in the prior month by the adversary in the conflict. Table 6 gives estimates of the GEVD parameters based on monthly block maxima; the constant models are the r = 1 models from Table 5.

Confirming the apparent influence of retribution on the intensity of the conflict, specifications in which the location parameter is permitted to vary with linear and squared terms for the adversary's prior-month weekly maximum fatalities improve on the constant specification. For the Israeli data, the squared specification provides the best fit; for the Palestinian data, the linear specification is best. The cross-effects imply roughly two additional Palestinian fatalities per five Israeli fatalities in the prior month, and one additional Israeli fatality per six Palestinian fatalities in the prior month. Although permitting the location parameter to vary in this way does not appear to affect the scale parameter estimates, it does result in a marked reduction in the shape parameter for the Israeli data. Combined with the positive estimate for the "retribution effect," the drop in the shape parameter estimate suggests that the apparently non-asymptotic (i.e., explosive) character of Israeli maxima implied by the initial analysis is the result of
escalating responses to recent fatalities shifting the intensity of the process, rather than the process itself being non-asymptotic. Thus, model misspecification in μ appears to have resulted in inaccurate estimates of ξ.
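A covariate-dependent location parameter of this kind can be specified in the same way as the time trend above. The sketch below again assumes the current extRemes fevd() interface; 'isr' is a hypothetical data frame with each month's maximum weekly Israeli fatalities ('max_fatal') and the adversary's prior-month maximum ('retribution'), names of our own choosing.

library(extRemes)

fit_const <- fevd(max_fatal, data = isr, type = "GEV")                 # constant location
fit_retr  <- fevd(max_fatal, data = isr, type = "GEV",
                  location.fun = ~ I(retribution^2))                   # squared retribution term

lr.test(fit_const, fit_retr)   # does allowing mu to vary with retribution improve the fit?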
Threshold Exceedences Model

Modeling block maxima can be a wasteful approach to extreme value analysis if additional information on the extremes is available. Although the r-largest order statistics model can provide a better alternative, even this method can be wasteful of information if one block happens to contain more extreme events than another. If an entire time series of data is available, as is frequently the case in the field of management, then better use of the data can be made by adopting a threshold exceedence approach that avoids blocking altogether. Below, we describe this empirical approach and demonstrate its application in two management-related empirical settings.

Let X_1, X_2, \ldots be a sequence of independent, identically distributed random variables with a marginal distribution function F. Define as extreme those values of the X_i that exceed some high threshold, say u. The stochastic behavior of extreme values is then given by the conditional probability

\Pr\{X > u + y \mid X > u\} = \frac{1 - F(u + y)}{1 - F(u)}, \quad y > 0     (11)

where X is a value in the X_i sequence. If the distribution F were known, then the distribution of threshold exceedences given in Eq. (11) would also be known. Since F is not typically known, analogous to the use of the GEVD as an approximation for the distribution of maxima of a long sequence, we need a distribution that approximates Eq. (11) well for high values of the threshold. The limit distribution that approximates threshold excesses as u increases is the generalized Pareto distribution (GPD) (Castillo et al., 2005; Coles, 2001). For large enough u, the distribution of (X - u) conditional on X > u is approximately

H(y) = 1 - \left[1 + \frac{\xi y}{\psi + \xi(u - \mu)}\right]^{-1/\xi}, \quad \text{with } y > 0, \ 1 + \frac{\xi y}{\psi + \xi(u - \mu)} > 0     (12)
Notably, the parameters of the GPD for threshold excesses are uniquely determined by the parameters of the associated GEVD for block maxima; in particular, ξ is equivalent to that of the corresponding GEVD. Thus, the shape parameter ξ is crucial in determining the qualitative behavior of the GPD, as it was for the GEVD. If ξ < 0 the excess distribution has a finite upper bound, with X bounded above by u − [ψ + ξ(u − μ)]/ξ = μ − ψ/ξ; if ξ ≥ 0 the distribution has no upper limit. Thus, in contrast to the block maxima approach, in the threshold approach observations are defined as extreme if they exceed some high threshold, u.

U.S. Stock Market Indices

Extreme value theory is becoming increasingly popular in financial applications. Because financial solvency is determined more by extreme changes in market conditions than by typical ones, this is not surprising. Fig. 14 shows the daily closing prices (top panels) of the Dow Jones and NASDAQ stock market indices for the period from January 1, 1990 to June 15, 2005.12 As the closing prices show, the level of the process changed markedly over the observation period – the process is non-stationary. Empirical studies in finance suggest that an approximation of stationarity can be achieved by taking the logarithms of ratios of successive observations (rescaled by 100 for convenience). These "log-daily returns," displayed in the bottom panels of the figure, suggest that the transformation is reasonably successful. For the analysis we negate the log-daily return values to focus on extreme declines.

In financial modeling, extreme quantiles of the daily returns are generally referred to as the "value-at-risk" (VaR) (Coles, 2001; Reiss & Thomas, 2001). VaR is the standard risk measure used to protect portfolio holders against adverse market conditions and to prevent them from taking extraordinary risks. Standard methods for calculating VaR assume normality of the data; unfortunately, this assumption is often strongly violated, as the unconditional distribution of financial time series is known to be fat-tailed. As illustrated below, the GPD threshold model provides a method for estimating the VaR directly and more reliably, and the return level plot graphs VaR against risk (Coles, 2001).

We can now analyze the transformed data using extreme value methods, and in particular threshold exceedence models, to better understand the properties of their extremes. Before proceeding, however, we must first select a threshold level, u. The choice of threshold involves considerations analogous to those for the selection of blocks: too low a threshold is likely to violate the asymptotic foundation of the model; too high a threshold will generate too
Fig. 14. Dow Jones and NASDAQ Stock Indices, 1990–2005 – Closing Price and Log-Daily Returns.
few excesses for precise estimation. Ideally, the threshold selected is the lowest possible while still permitting the limit model to provide a reasonable approximation. Two methods are available for setting the threshold level: one is an exploratory method carried out prior to model estimation; the other assesses the stability of parameter estimates for models fit across a range of thresholds.

The exploratory approach is based on mean residual life plots, which graph the means of the threshold exceedences across a range of values of u (threshold levels). Above a threshold u_0 at which the GPD provides a valid approximation of the excess distribution, the mean residual life plot should be approximately linear in u. This suggests selecting the threshold u_0 as the lowest value of u above which the threshold exceedence means are roughly linear (Castillo et al., 2005; Coles, 2001). Fig. 15 gives mean residual life plots for the transformed Dow Jones and NASDAQ data. The plot is initially linear for the Dow Jones data, but shows substantial curvature in the range −1 ≤ u ≤ 2. For u > 2 the plot is reasonably linear when judged relative to the 95% confidence intervals also shown in the plot. This suggests a threshold of 2, which yields 108 exceedences in the series of 3,898 daily observations. The NASDAQ plot is similar, being reasonably linear for u > 3, for which there are 131 exceedences.

The second, complementary, technique is to fit the GPD across a range of thresholds and to look for stability in the parameter estimates (as we did with the r-largest order statistics model). If the GPD is a good fit for excesses above a threshold u_0, then excesses of a higher threshold, u, should also follow a GPD, and the shape parameters, ξ, of the two distributions should be identical. The scale parameters should also be identical if reparameterized as ψ* = ψ_u − ξu, since the scale parameter changes with u unless ξ = 0 (Castillo et al., 2005; Coles, 2001). Using this method, an appropriate threshold can be selected by plotting ψ* and ξ against u, and choosing u_0 as the lowest value of u for which the estimates remain roughly constant. Fig. 16 plots the parameter estimates for ψ* and ξ across a range of possible threshold values, u, for the Dow Jones and NASDAQ data. The change in the pattern observed in the mean residual life plot at high thresholds is also seen here, but the changes are now seen to be small relative to the 95% confidence intervals. Thresholds of 2 and 3 thus appear reasonable.

Maximization of the GPD likelihood with u = 2 for the Dow Jones data yields parameter estimates for (ψ, ξ) of (0.602, 0.234) with standard errors (0.091, 0.119), and log-likelihood 78.535. Approximate 95% confidence
Fig. 15. Negative Log Daily Returns – Mean Residual Life Plots.
intervals are thus [0.424, 0.780] for ψ and [0.001, 0.467] for ξ. The estimates therefore correspond to an unbounded excess distribution (i.e., ξ > 0). For the NASDAQ data with u = 3, the estimated GPD parameters (ψ, ξ) are (1.006, 0.152), with standard errors (0.147, 0.117), and log-likelihood 151.679. Approximate 95% confidence intervals for the parameters are thus [0.718, 1.294] and [−0.077, 0.381]. Again the shape parameter suggests
Fig. 16. Negative Log Daily Returns – GPD Parameter Estimates against Thresholds.
an unbounded distribution, but the evidence is weaker, with the confidence interval for ξ including 0.

The diagnostic plots for assessing the accuracy of the fitted GPD models are shown in Fig. 17. For both the Dow Jones and the NASDAQ, the near-linearity of the probability and quantile plots gives little cause to doubt the validity of the fitted model. The quantile plot for the Dow Jones does, however, diverge from linearity, with the model estimate lower than several of the highest empirical values. The return level curves do not asymptote for either of the market indices, reflecting the positive estimates for ξ. The 95% confidence
Fig. 17. Negative Log Daily Returns – Probability, Quantile, Return-Level, and Density Plots.
bounds on the return level plots remain relatively compact for quite high return periods. The corresponding density estimates also seem quite consistent with the histograms of the data. Overall, then, the diagnostic plots provide reasonably strong support for the fitted GPD models.

Given the fit of the models, the parameter estimates yield 10-year return levels (measured as declines in log-daily return) of 7.010 for the Dow Jones, with 95% confidence interval [5.429, 9.237], and 10.131 for the NASDAQ, with 95% confidence interval [8.073, 13.186]. The estimated 100-year return levels are 12.427 [7.792, 17.062] for the Dow Jones and 15.893 [10.350, 21.417] for the NASDAQ. Thus, once in 10 years we should expect to see a negative log-daily return of between 5.4 and 9.2 for the Dow Jones and between 8.1 and 13.2 for the NASDAQ. Casting some doubt on these predictions, between 1990 and 2005 the Dow Jones experienced four declines within this range (although none above 7.5), and the NASDAQ experienced two declines within the predicted range (although neither above 10.2).

Martha Stewart and ImClone

Recent investigations have brought to light ethical and legal wrongdoings across the fields of accounting, consulting, insurance, securities analysis, and investment banking. Perhaps the highest profile of these recent events, though clearly not the most consequential, was the case of the biopharmaceutical company ImClone. In this case, Martha Stewart and her stockbroker, Peter Bacanovic, committed illegal insider trading when Stewart sold her stock in ImClone on December 27, 2001, after receiving a tip from Bacanovic, a broker with Merrill Lynch, who also created an alibi for, and made false statements concerning, Stewart's ImClone trades. Bacanovic's tip was that two of his other clients – ImClone's CEO, Samuel Waksal, and Waksal's daughter – had placed orders to sell all the ImClone stock they held at Merrill Lynch. Waksal placed these orders knowing that the U.S. Food and Drug Administration (FDA) was about to reject one of ImClone's key products, a cancer treatment called "Erbitux." The next day, ImClone announced that the FDA had decided not to accept its Erbitux application for filing. By the close of the next trading day, December 31, 2001, the price of ImClone stock had dropped 16 percent to $46 per share. By selling when she did, Stewart avoided losses of $45,673.

Insider information enhances the performance of stock market transactions in much the same way that steroids enhance the performance of athletes, which, as we illustrated earlier, can aid in their detection. Can extreme value statistics also aid in the detection of insider trading?
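Before turning to the ImClone volumes, the sketch below summarizes the threshold-selection and GPD-fitting workflow just illustrated, assuming the current extRemes interface (mrlplot, threshrange.plot, fevd). 'declines' is a hypothetical vector of negated log-daily returns; the threshold of 2 is the Dow Jones value used above, while the threshold range and the assumption of roughly 252 trading days per year are our own illustrative choices.

library(extRemes)

mrlplot(declines)                           # mean residual life plot: look for linearity above u
threshrange.plot(declines, r = c(1, 4))     # parameter stability across candidate thresholds

fit_gp <- fevd(declines, threshold = 2, type = "GP",
               time.units = "252/year")     # ~252 trading days per year, for return periods
summary(fit_gp)                             # scale and shape estimates with standard errors
plot(fit_gp)                                # probability, quantile, return-level, density plots
return.level(fit_gp, return.period = c(10, 100))   # 10- and 100-year VaR-style return levels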
Fig. 18. ImClone "Normal" Daily Trading Volume, 2000–2005.
Fig. 18 shows the daily trading volumes (divided by 1,000 for rescaling) in ImClone stock on the five trading days prior to company announcements, unless those days also fell within five trading days after a company announcement, for the period from January 1, 2000 to September 5, 2005.13 These days should thus reflect "normal" trading activity. The trading volume on December 27, 2001, identified by the black marker, represents the fourth-largest daily trading volume. The two highest days occurred on February 26, 2002, the trading day prior to ImClone announcing discussions with the FDA regarding resubmission of the Erbitux application, and March 28, 2003, the trading day prior to ImClone announcing it would postpone reporting its 2002 fourth-quarter and year-end financial results. The third-highest occurred two trading days prior to ImClone finally reporting those financial results. We were unable to determine whether these pending announcements were reported earlier in the media, triggering the higher trading volumes.

To determine whether we should expect to see such extreme daily trading volumes, we can analyze the data using the threshold exceedence model. As with the stock market indices, however, we must again first select a threshold level. Fig. 19 gives the mean residual life plot for the ImClone daily trading volume data. The plot is initially linear, but shows substantial curvature in the range 0 ≤ u ≤ 4,000. For u > 4,000 the plot is reasonably linear when judged relative to the 95% confidence intervals also shown in the plot.
Fig. 19. ImClone Daily Trading Volume – Mean Residual Life Plot.
A plot of the GPD parameter estimates for ψ and ξ across a range of possible threshold values, shown in Fig. 20, reinforces this conclusion. The change in the pattern observed in the mean residual life plot at high thresholds is again evident, but the changes are now more clearly small relative to the 95% confidence intervals. Taken together, these analyses suggest a threshold of 4,000 (i.e., a trading volume of 4,000,000 shares), which yields 32 exceedences in the series of 389 daily observations.

Maximization of the GPD likelihood with u = 4,000 for the ImClone daily trading data yields parameter estimates for (ψ, ξ) of (1558.691, 0.017) with standard errors (468.382, 0.243), and log-likelihood 266.728. Approximate 95% confidence intervals are thus [621.927, 2,495.455] for ψ and [−0.260, 0.226] for ξ. The estimates therefore correspond to an excess distribution with ξ ≈ 0: the return level plot is approximately linear, neither curving upward without bound (ξ > 0) nor asymptoting to a finite level (ξ < 0).

The diagnostic plots for assessing the accuracy of the fitted GPD model are shown in Fig. 21. The near-linearity of the probability plot gives little cause to doubt the validity of the fitted model. The quantile plot does, however, diverge from linearity, with the model estimate lower than the two highest empirical values, and as a result the 95% confidence bounds on the return level plot become quite large for high return periods. The linearity of the return level plot reflects the statistically insignificant (near-zero) estimate of the shape parameter, ξ. The corresponding density estimates seem quite consistent with the
Fig. 20. ImClone Daily Trading Volume – GPD Parameter Estimates against Threshold.
histograms of the data. Overall, the diagnostic plots provide reasonable support for the fit of the GPD model.

Given the fit of the model, the parameter estimates yield a 10-year return level for daily volume (rescaled by 1,000) of 8,596.587, with 95% confidence interval [7,331.583, 12,657.424]; the estimated 25-year return level is 9,944.622 [8,190.889, 15,196.376]. Thus, once in 10 years we should expect to see a daily volume of between 7.3 million and 12.7 million shares. Notably, between 2000 and 2005, ImClone's daily trading volume exceeded 7.3 million shares four times, including on December 27, 2001. One interpretation of these occurrences is that they cast doubt on the model predictions. The other is that ImClone experienced extreme daily trading volumes that are not predicted by the pattern of otherwise normal daily trading. Unfortunately, as noted above, we were unable to determine whether one or more of the three highest daily volumes resulted from media reports that preceded the company announcements, which would have helped us to decide between these two possibilities.

A Complication: Dependent Exceedences

Earlier we observed time dependence in the fastest 100 m sprint times and interdependence in the extremes of Palestinian and Israeli fatalities during the Intifada. The Dow Jones and NASDAQ stock market indices likely exhibit a third kind of dependence. Although conversions such as the transforma-
Fig. 21. ImClone Daily Trading Volume – Probability, Quantile, Return-Level, and Density Plots.
tion of stock market prices into log-daily returns can effectively reduce non-stationarity, so that the pattern of variation is roughly constant over time, the resulting series is still not independent. These time series are the products of a history of complex, dynamic interactions producing the stock market prices. This can be seen quite clearly in Fig. 22, which shows the pattern of threshold exceedences over time for the Dow Jones (u > 2) and NASDAQ (u > 3). If the exceedences in each series were independent, they should be distributed uniformly over time. Clearly, they are not; instead, they are clustered together. This violation of model assumptions calls into question the validity of the threshold excesses model for the stock market data.

Fortunately, it is possible to deal with the problem of dependent exceedences in the threshold model. The method, declustering, filters the dependent observations to obtain a set of threshold excesses that are approximately independent. This is accomplished by defining clusters of exceedences,
Fig. 22. Negative Log Daily Returns – Threshold Exceedences over Time (top panel: Dow Jones, u > 2; bottom panel: NASDAQ, u > 3).
identifying the maximum excess within each cluster, and then fitting the GPD to the cluster maxima (Castillo et al., 2005; Coles, 2001; Reiss & Thomas, 2001). Clusters are defined by specifying a threshold, u, and treating consecutive exceedences of u as belonging to the same cluster; a cluster ends when r consecutive observations fall below u, and the next exceedence of u then initiates the next cluster. The choice of r is important. If r is too small, the
independence of nearby clusters is dubious; if r is too large, observations that might reasonably have been considered separate clusters will be combined. Also at issue are the questions of bias and variance that earlier affected the selection of block length and threshold level.

How much clustering of exceedences of a threshold occurs in the limit of the distribution can be estimated by the extremal index, θ. If θ = 1, the data are independent; if θ < 1, there is some dependency (clustering) in the limit. There are many possible estimators of the extremal index; the one used in The Extremes Toolkit is the intervals estimator (Ferro & Segers, 2003), which can also be used to estimate the optimal run length for declustering. For the Dow Jones data, θ = 0.461 with an estimated optimal run length of 11 for declustering; for the NASDAQ data, θ = 0.207 with an estimated optimal declustering run length of 22. Reinforcing the conclusion drawn from Fig. 22, these estimates of θ suggest strong history dependence in the log-daily return exceedences, particularly for the NASDAQ.

The results of declustering the log-daily returns data based on the estimated optimal run lengths are shown in Fig. 23. In the figure, the dashed vertical lines demarcate the clusters (49 for the Dow Jones data; 25 for the NASDAQ data). Confirming the effectiveness of the declustering, the values of θ for the declustered data are 0.999 and 1.000 for the Dow Jones and NASDAQ, respectively.

Table 7 compares estimates for the GPD threshold excesses model with and without declustering. As before, u = 2 for the Dow Jones data and u = 3 for the NASDAQ data. Comparisons of the parameter estimates indicate that they are relatively stable, although the shape parameter ξ appears larger after declustering for the NASDAQ data. Such comparisons are made difficult, however, by the larger standard errors for the declustered data (the result of fewer exceedences with which to estimate the model). A clearer sense can be obtained from the return levels. Notably, in each case the confidence interval for the return level shifts to the right, raising the lower confidence bound. This reinforces the earlier doubt cast on the original predictions by the fact that, during the 15-year observation period, the Dow Jones experienced four declines and the NASDAQ two declines in the original 100-year return level ranges.
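A sketch of how the extremal index estimation and declustering might be carried out with the current extRemes interface is shown below; 'declines' is the same hypothetical vector of negated log-daily returns used earlier, and the run length of 11 is the Dow Jones value reported above.

library(extRemes)

extremalindex(declines, threshold = 2)               # intervals estimator of theta and run length
dec <- decluster(declines, threshold = 2, r = 11)    # runs declustering with run length 11

fit_dc <- fevd(dec, threshold = 2, type = "GP", time.units = "252/year")
summary(fit_dc)                                      # compare with the un-declustered GPD fit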
Multivariate Extremes and Extremal Dependence

When studying the extremes of two or more processes, each can be modeled individually using univariate techniques as we have done thus far, but
Fig. 23. Negative Log Daily Return Cluster Groupings.
there is also value in considering extreme value interrelationships as we saw in the analysis of Intifada fatalities. It may be, however, that some combination of processes is of greater interest or importance than any one process alone, or that data for one process can inform our understanding of
180
JOEL A.C. BAUM AND BILL MCKELVEY
Table 7.
Threshold Exceedence Model Estimates with and without Clustering.
Run Length, r Exceedences
LL
c
x
10-Year Return Level
100-Year Return Level
7.010 (5.429, 9.237) 7.494 (6.309, 11.556)
12.427 (7.791, 17.062) 14.667 (8.464, 23.667)
Dow Jones na
108
11
49
0.602 0.234 (0.091) (0.119) 49.811 0.655 0.284 (0.185) (0.199)
78.535
NASDAQ na
131
22
25
1.006 0.152 10.131 15.893 (0.147) (0.117) (8.073, 13.186) (10.350, 21.417) 33.038 1.102 0.424 10.982 22.418 (0.315) (0.298) (8.702, 15.120) (10.567, 42.345)
151.68
Model coefficient estimate standard errors in parentheses; return level 95% confidence intervals in parentheses.
another. To aid in such analyses, there are multivariate analogs to the univariate block maxima and threshold models we have employed to this point. As with all multivariate models, dimensionality creates difficulties for model validation and computation, so we focus on the bivariate case for simplicity.

Bivariate Extreme Value Distribution (BEVD) for Block Maxima

Let (X_1, Y_1), \ldots, (X_n, Y_n), \ldots be a sequence of vectors that are independent versions of a random vector having distribution function F(x, y). As in the univariate case, we can examine the behavior of multivariate extremes based on the limiting behavior of block maxima. This requires us to define the vector of component-wise maxima

M_n = (M_{x,n}, M_{y,n})   (13)

where M_{x,n} = \max_{i=1,\ldots,n}\{X_i\} and M_{y,n} = \max_{i=1,\ldots,n}\{Y_i\}. The asymptotic theory of multivariate extremes analyzes M_n as n \to \infty. Since standard univariate extreme value analysis applies to both \{X_i\} and \{Y_i\}, we can characterize the limiting joint distribution of M_n as n \to \infty using the family of BEVDs, with distribution functions of the form

F(x, y) \approx G(x, y) = \exp\{-V(x, y)\}, \quad x > 0, \; y > 0   (14)
where

V(x, y) = 2 \int_0^1 \max\{w/x, (1 - w)/y\} \, dH(w)   (15)

and H is a distribution function on [0, 1] satisfying the mean constraint

\int_0^1 w \, dH(w) = 1/2   (16)
The complete class of bivariate limits can be obtained by generalizing the marginal distributions according to the GEVD. Specifically, letting

G(x, y) = \exp\{-V(\tilde{x}, \tilde{y})\}   (17)

where

\tilde{x} = [1 + \xi_x (x - \mu_x)/\sigma_x]^{1/\xi_x} \quad \text{and} \quad \tilde{y} = [1 + \xi_y (y - \mu_y)/\sigma_y]^{1/\xi_y}   (18)

provided that [1 + ξ_x(x - μ_x)/σ_x] > 0 and [1 + ξ_y(y - μ_y)/σ_y] > 0, and V satisfies Eq. (15) for some choice of H. The marginal distributions of x and y are GEVD with parameters (μ_x, σ_x, ξ_x) and (μ_y, σ_y, ξ_y), respectively. One standard class of parametric families whose mean is parameter-free, satisfying Eq. (16), and for which the integral in Eq. (15) is tractable, is the logistic family:

G(x, y) = \exp\{-(x^{-1/\alpha} + y^{-1/\alpha})^{\alpha}\}, \quad x > 0, \; y > 0   (19)

for α ∈ (0, 1). The advantage of the logistic family is its flexibility. As α → 1 in Eq. (19), x and y correspond to independent variables. As α → 0, x and y correspond to perfectly dependent variables. The logistic family thus generates BEVDs covering all levels of interdependence from independence to perfect dependence (Castillo et al., 2005; Coles, 2001).

Consider the Intifada fatality data. As we noted earlier, there is evidence that extreme values for the Israeli and Palestinian fatality data are correlated. Fig. 24, which plots the monthly block maxima data together, reveals further the tendency for large numbers of Israeli and Palestinian fatalities to co-occur. Earlier, we accounted for this correspondence by permitting the GEVD location parameter to vary with the intensity of the conflict. Now, instead, we will account for this dependence by fitting the data to the BEVD.
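As a rough illustration of these pieces – component-wise block maxima, GEV margins, transformation to the unit Fréchet scale, and the logistic dependence function of Eq. (19) – the sketch below uses simulated daily data. It is not the authors' procedure (their estimates come from the R-based Extremes Toolkit); to_frechet and logistic_bevd are hypothetical helper names, and note that scipy parameterizes the GEV shape as c = −ξ.

```python
import numpy as np
from scipy.stats import genextreme

# Two correlated daily series, reshaped into 60 blocks of 30 "days" each.
rng = np.random.default_rng(1)
x_daily = rng.gumbel(size=(60, 30))
y_daily = 0.5 * x_daily + rng.gumbel(size=(60, 30))

# Component-wise block maxima, Eq. (13).
mx, my = x_daily.max(axis=1), y_daily.max(axis=1)

# GEV margins (scipy shape c = -xi, so heavy tails show up as negative c).
cx, mux, sigx = genextreme.fit(mx)
cy, muy, sigy = genextreme.fit(my)

def to_frechet(z, c, mu, sig):
    # Probability integral transform to the unit Frechet scale: -1 / log F(z).
    return -1.0 / np.log(genextreme.cdf(z, c, loc=mu, scale=sig))

def logistic_bevd(xt, yt, alpha):
    # Logistic BEVD of Eq. (19) evaluated on unit-Frechet-scale arguments.
    return np.exp(-((xt ** (-1.0 / alpha) + yt ** (-1.0 / alpha)) ** alpha))

xt, yt = to_frechet(mx, cx, mux, sigx), to_frechet(my, cy, muy, sigy)
print(logistic_bevd(xt, yt, alpha=0.67)[:5])   # joint non-exceedence probabilities
```

Setting alpha = 0.67 simply echoes the Table 8 estimate; in practice α would itself be estimated by maximizing the bivariate likelihood rather than fixed in advance.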
Fig. 24. Palestinian and Israeli Intifada Fatalities, Monthly Maxima – September 2000–May 2005.
Table 8. BEVD Estimates for Block Maxima.

                         LL        μ (location)     σ (scale)        ξ (shape)       α (dependence)
Israeli fatalities       407.535   5.365 (0.831)    5.097 (0.681)    0.246 (0.155)   0.670 (0.084)
Palestinian fatalities             16.720 (1.673)   11.146 (1.280)   0.193 (0.102)

Note: Model coefficient estimate standard errors in parentheses.
Assuming the logistic dependence structure specified in Eq. (19), the BEVD yields the estimates in Table 8. The logistic dependence parameter estimate α is 0.670 with standard error 0.084, corresponding to moderate dependence between the sequences. The marginal parameters are similar to those obtained in the univariate analyses reported in Tables 5 and 6; however, the shape parameter for the Israeli data is much closer to the constant GEVD model estimate than to the "retribution" model in which the location parameter was permitted to vary with the intensity of the conflict.
The diagnostic plots for assessing the accuracy of the fitted BEVD model are shown in Fig. 25. For both Israeli and Palestinian fatalities, the near-linearity of the probability and quantile plots supports the validity of the fitted model. The quantile plot is now a much better fit to the Palestinian data than was the case for the GEVD fit presented in Fig. 11 (see panel r = 1). As before, neither of the return level plots asymptotes, reflecting the positive estimates for ξ. Overall, then, the diagnostic plots provide reasonably strong support for the fitted BEVD model, and suggest some improvement in the fit for the Palestinian data after taking into account the extremal dependence of the two processes.

Bivariate Extreme Value Distribution (BEVD) for Threshold Exceedences

A bivariate threshold exceedence model is also possible. In the univariate case, we used the GPD as a limit distribution that approximates threshold excesses as u increases. Now we need a distribution G(x, y) with which to approximate a joint distribution F(x, y) for exceedences x > u_x, y > u_y, for large values of u_x and u_y. We can characterize the limiting joint distribution by specifying the threshold exceedences according to the GPD (Castillo et al., 2005; Coles, 2001). Specifically, letting

F(x, y) \approx G(x, y) = \exp\{-V(\tilde{x}, \tilde{y})\}, \quad x > u_x, \; y > u_y   (20)

where

\tilde{x} = -\left(\log\{1 - [1 + \xi_x (x - \mu_x)/\sigma_x]^{-1/\xi_x}\}\right)^{-1} \quad \text{and} \quad \tilde{y} = -\left(\log\{1 - [1 + \xi_y (y - \mu_y)/\sigma_y]^{-1/\xi_y}\}\right)^{-1}   (21)

provided that the thresholds u_x and u_y are sufficiently large to substantiate the limit distribution as an approximation (see Coles, 2001, pp. 154–156 for a detailed derivation of this model). A complication of this model is that a bivariate pair may exceed the threshold on one component but not the other, so that x > u_x but y ≤ u_y. F(x, y) is not applicable in such cases, but there is information in such cases concerning the x component of the pair (although not the y component). Consequently, it is necessary to estimate a censored likelihood in which
Fig. 25. Intifada Fatalities – Probability, Quantile, Return-Level, and Density Plots.
the contribution of such pairs is

\Pr\{X = x, \, Y \le u_y\} = \partial F(x, u_y) / \partial x   (22)

Now let us return to the stock market index data. Fig. 22, which plots the (negated) log-daily returns for the Dow Jones and NASDAQ together, reveals a tendency for large negative (and positive) returns to co-occur. For the estimation, we adopt three threshold levels set at the 90%, 95%, and 97% quantiles of the data, corresponding to u = 1.11, 1.59, and 2.01 for the Dow Jones data, and u = 1.71, 2.56, and 2.99 for the NASDAQ data. Thus, the 97% quantile threshold corresponds to the original values of u (Dow Jones > 2, NASDAQ > 3) used for the univariate analysis. The 97% quantile threshold is superimposed on the bottom panel of Fig. 26.

Again assuming a logistic dependence structure, the BEVD threshold exceedence model yields the estimates in Table 9. Estimates for the logistic dependence parameter range from 0.594 at the 90% quantile threshold to 0.751 at the 97% quantile threshold. These values of α correspond to a moderate level of dependence. The scale parameters (σ) are similar to those obtained in the univariate analyses (reproduced in the first two rows of the table). The shape parameters (ξ), while remaining positive in direction, are generally smaller in magnitude. This suggests that the apparently non-asymptotic character of these market indices' maxima implied by the initial analysis is the result of extremal dependence between the two indices rather than of either excess process itself being unbounded (i.e., ξ > 0).
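The bookkeeping behind this censored likelihood – classifying each pair by which thresholds it exceeds – is easy to mimic on simulated data. The sketch below is illustrative only: the two series are random stand-ins for the indices, not the Yahoo data used in the chapter, and the final column is a crude empirical dependence measure (the conditional probability that one series is extreme given that the other is).

```python
import numpy as np

rng = np.random.default_rng(2)
z = rng.standard_t(df=4, size=(3800, 2))
dj = z[:, 0]                          # stand-in for Dow Jones negated log returns
nq = 0.6 * z[:, 0] + 0.8 * z[:, 1]    # stand-in for NASDAQ, correlated with dj

for q in (0.90, 0.95, 0.97):
    ux, uy = np.quantile(dj, q), np.quantile(nq, q)
    both   = np.sum((dj > ux) & (nq > uy))    # full joint-density contribution
    x_only = np.sum((dj > ux) & (nq <= uy))   # censored in y: contributes dF(x, u_y)/dx, Eq. (22)
    y_only = np.sum((dj <= ux) & (nq > uy))   # censored in x
    chi = both / np.sum(dj > ux)              # empirical P(nq extreme | dj extreme)
    print(f"q={q:.2f}  u_x={ux:.2f}  u_y={uy:.2f}  "
          f"joint={both}  x-only={x_only}  y-only={y_only}  chi={chi:.2f}")
```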
CONCLUSION

Extreme value theory is about, well, extremes. Natural disasters over the past 12 months appear to have been the worst in decades – as many as 270,000 dead from the December 2004 Indian Ocean tsunami, more than 75,000 dead from the October 2005 Pakistan earthquake, and Katrina, Rita, and Wilma, all three Category 5 hurricanes, hitting the Gulf States in autumn 2005, with Katrina easily the most expensive natural disaster in history. We now worry most about a possible repeat of the 1918 avian flu pandemic that killed between 50 and 100 million people. Extreme value theory gained momentum in the 1930s, when the design of the Hoover Dam required projections about 20-, 100-, and 500-year storms (Gumbel, 1958): How big a flood should it be designed to withstand? This use is aimed at estimating the possible effects, damages, and costs of floods, earthquakes, hurricanes, and fires as far in advance as possible.
Fig. 26. Log Daily Return Threshold Exceedences, Dow Jones (u > 2), NASDAQ (u > 3).
Table 9. BEVD Estimates for Threshold Exceedences.

                                       σ (scale)        ξ (shape)        α (dependence)
Univariate GPD
  Dow Jones (u > 2)                    0.602 (0.091)    0.234 (0.119)    na
  NASDAQ (u > 3)                       1.006 (0.147)    0.152 (0.117)
u = 90% quantile (LL = 1745.814)
  Dow Jones                            0.667 (0.048)    0.181 (0.056)    0.594 (0.019)
  NASDAQ                               1.141 (0.082)    0.103 (0.054)
u = 95% quantile (LL = 1189.110)
  Dow Jones                            0.660 (0.068)    0.173 (0.075)    0.694 (0.026)
  NASDAQ                               1.068 (0.115)    0.107 (0.078)
u = 97% quantile (LL = 946.434)
  Dow Jones                            0.637 (0.087)    0.193 (0.097)    0.751 (0.032)
  NASDAQ                               1.129 (0.161)    0.070 (0.100)

Note: Model coefficient estimate standard errors in parentheses.
We began by showing just how widespread evidence of power-law phenomena has become – power laws appear in each of the physical, biological, social, and organizational worlds. Behind every power law there lies a cause that is usually explained by use of a scale-free theory. We briefly considered nine of these, noting that each applies to management and organizational phenomena. Our basic argument is summarized in the following logical sequence:

1. Power laws stem from non-independent behavior.
2. Non-independent behavior is ever-present in social contexts.
3. Pareto distributions are as prevalent as Gaussian distributions.
4. Therefore, extremes are easily as important as averages.
5. Managers worry most about extremes – from Enron and WorldCom at one end to GE and Microsoft at the other, with all of the human competence pluses and minuses in between.
6. Extreme statistics should be a much larger component of the management researcher's toolkit than is currently the case.
The statistical treatment of extreme values dates back to Fréchet (1927), Fisher and Tippett (1928), Weibull (1939), Gumbel (1941, 1958), and Gnedenko (1943). Though their origin stems from trying to anticipate and cope with natural and physical events, we turned to human behaviors to illustrate the use of three basic models in their univariate and multivariate forms:

1. The generalized extreme value distribution (GEVD), which combines Fisher–Tippett and Gnedenko's three types of distributions into a single, three-parameter distribution, and involves fitting block maxima and minima. We illustrated this model with data on the world's fastest men and women, and Major League Baseball home run champions.
2. The r-largest order statistics model, which extends the GEVD model by examining the largest r order statistics in a block, rather than only the largest, permitting more efficient estimation. We modeled Intifada fatalities to illustrate this model.
3. The threshold exceedences model, which, based on the generalized Pareto distribution (GPD), incorporates information on all extreme events above (or below) an appropriately high (low) threshold, making it the most general, flexible, and efficient model. We illustrated this model with management-relevant examples such as stock market behavior and insider trading.

By necessity, we have provided only an overview of the essential features of each of these models. Readers interested in learning more are encouraged to turn to excellent recent texts by Coles (2001), Reiss and Thomas (2001), and Castillo et al. (2005).

As is obvious from the examples we use, extreme value theory and statistics offer us another kind of window – one into human corruption. Ben Johnson, who was stripped of his 100 m sprint gold medal for use of performance-enhancing drugs, is also the only male sprinter whose time is outside the range of extreme values that might be expected once in a quarter century. We also show that Florence Griffith Joyner's world record of 10.49 s in the 100 m sprint in the 1988 Olympic trials was the run-of-a-century. This indication that she was outside normal extreme behavior, coupled with her physical changes, her sudden jump from a silver medal in the 200 m to a world record in the 100 m sprint, her retirement shortly after the 1988 Olympics, and her unexpected death 10 years later, combine to suggest that her performance was drug-enhanced, as many have concluded. Our analysis also shows that baseball sluggers Barry Bonds and Mark McGwire are far enough outside the statistical window of normal extreme behavior to suggest that their achievements were enhanced as well. And then there is Martha Stewart.

Since Mandelbrot's classic paper of 1963, recently reconfirmed (Sornette, 2003, 2004; Mandelbrot & Hudson, 2004), we have known that power laws reflect non-independent behavior among data points (Andriani & McKelvey, 2005). Starting from a lognormal distribution, we know that the stronger the interdependence effect, the stronger, straighter, and more pronounced is the power-law signature. This is especially characteristic of the "wealth" end of Pareto's distribution of wealth. Our analysis shows that Stewart and other inside traders on December 27, 2001 were even more interaction-based than the interdependence that lies behind normal stock market trading. If we apply Gaussian statistics to earthquakes, magnitude 8 and 9 quakes are incredible outliers. But if we accept Pareto distributions and extreme statistics, they fall on the power-law signature's straight line – right on track, as it were. Our use of extreme statistics allows us to define what is really normal extreme behavior in a Paretian world. This translation of extreme events into paths of normal behavior allows us to start predicting extreme events rather than worrying about whether events are average or not. Since much that is commanding the attention of managers is about extreme positives and negatives, our demonstration of extreme statistics illustrates how scholars may conduct predictive analyses about the unusual rather than the average – and offer significant practitioner relevance besides.

What could be even more important to managers is the fact that we can also analyze what is beyond the usual extremes to be expected in Pareto-distributed phenomena. The problem for statistical analysis in this circumstance is to find ways of identifying what is outside normal expectations based on Pareto distributions – as opposed to Gaussian-based averages. Our foregoing sports, Intifada, and management illustrations of the three models, based as they are on "extremes beyond extremes", are good examples of this possibility. Might extreme value theory, for example, have saved the space shuttles Challenger and Columbia? Using extreme value theory and statistics, one could have concluded from available data on damage to the O-rings sealing the joints between sections of the shuttle's solid rocket boosters that one should not launch Challenger at such a cold temperature, despite having no measurements at such a low temperature. An analysis of data on damage resulting from insulating foam (and perhaps ice) flaking off the external tanks during liftoff would likely have revealed that Columbia, which was known to have experienced more extensive damage to insulating tiles during liftoff since 1997 when a new Freon-free insulating foam
was first used, was doomed. And what about Hurricane Katrina? A major hurricane had long been predicted to hit New Orleans. Experts knew the city was unprepared and painted an accurate picture of the devastation and cost; local, state, and federal officials are rightly blamed for ignoring them.

Also of critical importance to managers is the idea that power-law slopes and accompanying statistical windows may be compared. For example, consider storms ranging from small thunderstorms to Category 5 hurricanes. They form a rank/frequency power law. The cost of these storms also forms a power law. Assuming that the Category 5 storm is at the lower right of the plot, then, is the cost slope steeper or flatter? If government is doing its job and enforcing appropriate designs and building codes, the slope should be steeper – costs of extreme storms are moderated by design and enforcement. We already see this with earthquakes – since the loss of life and buildings in California and Japan is much less than in Pakistan, Indonesia, or China, the power-law slope of damage is much steeper than the quake slope. Has the Securities and Exchange Commission (SEC) done enough to reduce the cost of stock market crashes? We can now answer this question. We can also develop statistical windows within which to put the cost of corruption in relation to the size of the crime and ask whether the SEC is matching earthquake preparedness in California or Pakistan.

Some scholars suggest that power laws are taking on the stature of natural laws such as gravity (Preston, 1950; MacArthur, 1960; Bak, 1996; Halloy, 1998). If this is the case, then extreme statistics may be used to explore the relation between power-law descriptions of entire industries, economies, and the wealth of continents, and whether individual industries, economies, or regions are progressing at a steeper or flatter economic power-law slope. Such use of extreme statistics moves beyond describing the possible damage of the next big storm to measuring the wealth and well-being of economies and regions and the effectiveness of their governments. Or one could examine collusion in an industry by, for example, making power-law slope comparisons between petroleum prices, gas prices, and industry profits. In the latter case, we wonder whether the widely discussed recent record profits of Exxon–Mobil fall above or below what the petroleum price power-law slope would lead us to expect.

Our essential argument is that manager-relevant statistical methods need to change from Gaussian to Pareto so as to keep company with equivalently changing managerial worlds. Unfortunately, econometrics and virtually all managerial/organizational statistical methods have not. We have argued that whenever tails of probability distributions are of interest, methods based around assumptions of normal distributions are likely to underestimate extremes – large or small – and that it is natural to consider applying extreme value statistical methods. Extreme value theory provides a statistical approach to an inherently difficult problem – predicting the size of a rare event. We have introduced several very general and easily employed statistical methods, and indicated how these methods may be adapted to management-related problems. We hope that our demonstrations will persuade readers that extreme value theory has an important role to play in the development of management studies, and that extreme value statistics will soon have a place in every management researcher's toolkit.
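The slope comparisons discussed in this conclusion can be approximated with very little code. The sketch below is purely illustrative: event sizes and costs are simulated, rank_frequency_slope is a hypothetical helper name, and an ordinary least-squares fit on a log–log rank plot is only a crude estimator of a power-law exponent (maximum-likelihood estimators are preferred in careful work). Under the assumed mitigation – costs growing more slowly than event size – the cost slope comes out steeper than the size slope, the pattern the text associates with effective design codes and enforcement.

```python
import numpy as np

rng = np.random.default_rng(3)
size = rng.pareto(a=1.0, size=2000) + 1.0                    # Pareto-like event sizes
cost = size ** 0.7 * rng.lognormal(sigma=0.3, size=2000)     # mitigated costs: sub-linear in size

def rank_frequency_slope(x):
    """OLS slope of log(rank) on log(value): the straight-line power-law signature."""
    xs = np.sort(x)[::-1]
    ranks = np.arange(1, len(xs) + 1)
    slope, _ = np.polyfit(np.log(xs), np.log(ranks), 1)
    return slope

print("size slope:", round(rank_frequency_slope(size), 2))   # roughly -1 here
print("cost slope:", round(rank_frequency_slope(cost), 2))   # steeper (more negative)
```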
NOTES

1. References in support of each kind of power law are given in Andriani and McKelvey (2005).
2. These data are compiled from the following sources: profitability: 2003 Canadian Business list of Canada's 500 largest publicly traded companies; biotechnology: 2001 Canadian Biotechnology Directory listings; and train accident fatalities: U.S. Federal Railroad Administration's Rail Equipment Accident/Incident Reports, 1975–2001.
3. First principles are one logic step up in truth-claims support from foundational axioms such as F = ma that are self-evident truths.
4. Schrödinger (1944) coined "negentropy" to refer to energy importation. Prigogine's basic theory is that the tension imposed by the 2nd Law of Thermodynamics fosters the emergence of structures that, then, conform to the 1st Law.
5. Agents may be defined as biomolecules, cells, organisms, people, groups, firms, societies, etc.
6. GARCH "… allows the variance to evolve over time" (p. 242). ARCH/GARCH assumes that model errors appear in clusters and that the "… forecast error depends on the size of the previous disturbance" (p. 238) – it treats variance as a "… moving average of squared returns" (Engle, 1982).
7. Eq. (9) gives the GEV for maxima; for minima it is

   H(x) = 1 - \exp\{-[1 - \xi (x - \mu)/\sigma]^{-1/\xi}\}

   with 1 - ξ(x - μ)/σ > 0 and ξ ≠ 0.
8. These data are available from the IAAF at: http://www.iaaf.org/statistics/toplists/inout=O/ageGroup=N/season=0/index.html.
9. These data are available at: http://mlb.mlb.com/NASApp/mlb/stats/historical/entry.jsp.
10. Note, however, that women's times have taken a virtual right-angle turn downward since the peak in 1999. Does this suggest a fairly systematic retreat from steroid use by women sprinters since then?
11. These data are available from the MEPC at: http://www.mepc.org/public_asp/resources/mrates.asp. The MEPC statistics do not include Palestinian suicide bombers (or other attackers), nor do they include Palestinians targeted for assassination (although bystanders killed during these assassinations are counted). Israeli Defense Forces soldiers killed during incursions into Palestinian lands are counted.
12. These data are available from Yahoo at: http://www.finance.yahoo.com/indices?u.
13. These data are available from Yahoo at: http://www.finance.yahoo.com/q/hp?s=IMCL and http://www.phx.corporate-ir.net/phoenix.zhtml?c=97689&p=irolnews.
REFERENCES

Albert, R., & Barabási, A. L. (2002). Statistical mechanics of complex networks. Reviews of Modern Physics, 74, 47–97.
Anderson, N., Herriot, P., & Hodgkinson, G. P. (2001). The practitioner–researcher divide in industrial work and organizational (IWO) psychology: Where are we now, and where do we go from here? Journal of Occupational and Organizational Psychology, 74, 391–411.
Andriani, P., & McKelvey, B. (2005). Beyond averages: Extending organization science to extreme events and power laws. 21st EGOS colloquium, 1 July, Berlin.
Andriani, P., & McKelvey, B. (2006). Power laws and scale-free theory: Implications for organization science. Working paper, Durham Business School, University of Durham, Durham, UK.
Ashby, W. R. (1956). Introduction to cybernetics. New York: Wiley.
Bak, P. (1996). How nature works: The science of self-organized criticality. New York: Copernicus.
Bak, P., Tang, C., & Wiesenfeld, K. (1987). Self-organized criticality: An explanation of 1/f noise. Physical Review Letters, 59, 381–384.
Barabási, A. L. (2002). Linked: The new science of networks. Cambridge, MA: Perseus.
Barabási, A. L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286, 509–512.
Baum, J. A. C., Rowley, T. J., & Shipilov, A. V. (2004). The small world of Canadian capital markets: Statistical mechanics of investment bank syndicate networks, 1952–1989. Canadian Journal of Administrative Sciences, 21, 307–325.
Baum, J. A. C., Shipilov, A. V., & Rowley, T. J. (2003). Where do small worlds come from? Industrial and Corporate Change, 12, 697–725.
Beer, M. (2001). Why management research findings are unimplementable: An action science perspective. Reflections, 2, 58–65.
Benbya, H., & McKelvey, B. (forthcoming). Using coevolutionary principles to improve IS alignment. Information Technology & People.
Bennis, W. G., & O'Toole, J. (2005). How business schools lost their way. Harvard Business Review, 83(May), 96–104.
Beyer, J. M., & Trice, H. M. (1982). The utilization process: A conceptual framework and synthesis of empirical findings. Administrative Science Quarterly, 27, 591–622.
Bigger, J. T., Jr., Steinman, R. C., Rolnitzky, L. M., Fleiss, J. L., Albrecht, P., & Cohen, R. J. (1996). Power law behavior of RR-interval variability in healthy middle-aged persons, patients with recent acute myocardial infarction, and patients with heart transplants. Circulation, 15, 2142–2151.
Bollerslev, T. (1986). Generalized autoregressive conditional heteroscedasticity. Journal of Econometrics, 31, 307–327.
Brief, A. P., & Dukerich, J. M. (1991). Theory in organizational behavior. Research in Organizational Behavior, 13, 327–352.
Brock, W. A. (2000). Some Santa Fe scenery. In: D. Colander (Ed.), The complexity vision and the teaching of economics (pp. 29–49). Cheltenham, UK: Edward Elgar.
Bykoski, V. (2003). Was: Zipf law – now: Life is about hierarchies. E-mail message sent October 12. (http://www.necsi.net:8100/Lists/).
Cancho, R. F., & Solé, R. V. (2003). Least effort and the origins of scaling in human language. Proceedings of the National Academy of Sciences, 100, 788–791.
Carneiro, R. L. (1987). The evolution of complexity in human societies and its mathematical expression. International Journal of Comparative Sociology, 28, 111–128.
Casti, J. L. (1994). Complexification: Explaining a paradoxical world through the science of surprise. London: Abacus.
Castillo, E., Hadi, A. S., Balakrishnan, N., & Sarabia, J. M. (2005). Extreme value and related models with applications in engineering and science. Hoboken, NJ: Wiley.
Coles, S. (2001). An introduction to statistical modeling of extreme values. New York: Springer.
Collins, J. (2001). Good to great: Why some companies make the leap … and others don't. New York: Harper Business.
Collins, J. C., & Porras, J. I. (1994). Built to last: Successful habits of visionary companies. New York: Harper Business.
Davis, G. F., Yoo, M., & Baker, W. E. (2003). The small world of the American corporate elite. Strategic Organization, 1, 301–326.
Davison, A. C., & Smith, R. L. (1990). Models for exceedances over high thresholds. Journal of the Royal Statistical Society, B52, 393–442.
Diatlov, V. (2005). Information technology in the practice of organising: Meeting fragmentation and interdependence by the incremental political delivery of information systems in financial services. Unpublished PhD dissertation. School of Management, University of Southampton, Southampton, UK.
Drossel, B., & Schwabl, F. (1992). Self-organized critical forest-fire model. Physical Review Letters, 69, 1629–1632.
Dumont, L. (1966). Homo hiérarchicus: Essai sur le système des castes. Paris: Bibliothèque des Sciences Humaines, Editions Gallimard.
Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of UK inflation. Econometrica, 50, 987–1008.
Ferro, C. A. T., & Segers, J. J. (2003). Inference for clusters of extreme values. Journal of the Royal Statistical Society B, 65, 546–556.
Fisher, R. A. (1930). The genetical theory of natural selection. Oxford, UK: Clarendon.
Fisher, R. A., & Tippett, L. H. C. (1928). Limiting forms of the frequency distribution of the largest and smallest members of a sample. Proceedings of the Cambridge Philosophical Society, 24, 180–190.
Fréchet, M. (1927). Sur la loi de probabilité de l'écart maximum. Annales de la Société Polonaise de Mathématique, 6, 93–116.
Ghoshal, S. (2005). Bad management theories are destroying good management practices. Academy of Management Learning and Education, 4, 75–91.
Gilleland, E., & Katz, R. W. (2005). Tutorial for the extremes toolkit (extRemes): Weather and climate applications of extreme value statistics. National Center for Atmospheric Research, Research Applications Laboratory (http://www.assessment.ucar.edu/toolkit).
Gnedenko, B. (1943). Sur la distribution limite du terme maximum d'une série aléatoire. Annals of Mathematics, 44, 423–453. (English translation in Kotz, S., & Johnson, N. L. (Eds.). (1992). Breakthroughs in statistics I: Foundations and basic theory (pp. 185–194). NY: Springer.)
Greene, W. H. (2002). Econometric analysis (5th ed.). Englewood Cliffs, NJ: Prentice-Hall.
Gumbel, E. J. (1941). Probability interpretation of the observed return periods of floods. Transactions of the American Geophysics Union, 21, 836–850.
Gumbel, E. J. (1958). Statistics of extremes. New York: Columbia University Press. [Paperback reprint in 2004 by Dover, New York.]
Halloy, S. R. P. (1998). A theoretical framework for abundance distributions in complex systems. Complexity International, 6, 1–12.
Hamilton, B. H., & Nickerson, J. A. (2003). Correcting for endogeneity in strategic management research. Strategic Organization, 1, 51–78.
Jones, G. R. (2000). Organizational theory: Text and cases. Upper Saddle River, NJ: Prentice-Hall.
Kauffman, S. A. (1993). The origins of order: Self-organization and selection in evolution. Oxford, UK: Oxford University Press.
Kaye, B. (1993). Chaos and complexity. New York: VCH.
Kogut, B., & Walker, G. (2001). The small world of Germany and the durability of national networks. American Sociological Review, 66, 317–335.
Kolmogorov, A. N. (1941). Über das logarithmisch normale Verteilungsgesetz der Dimensionen der Teilchen bei Zerstückelung. Comptes Rendus (Doklady) de l'Académie des Sciences de l'URSS, 31, 99–101.
Lawler, E. E., Mohrman, A. M., Jr., Mohrman, S. A., Ledford, G. E., Jr., & Cummings, T. G. (1985). Doing research that is useful for theory and practice. New York: Lexington Books.
Lindblom, C. E. (1959). The science of 'muddling through'. Public Administration Review, 19, 79–88.
MacArthur, R. H. (1960). On the relative abundance of species. American Naturalist, 94, 25–36.
Mandelbrot, B. B. (1982). The fractal geometry of nature. New York: Freeman.
Mandelbrot, B. B., & Hudson, R. L. (2004). The (mis)behavior of markets: A fractal view of risk, ruin and reward. London: Profile.
Maruyama, M. (1963). The second cybernetics: Deviation-amplifying mutual causal processes. American Scientist, 51, 164–179. (Reprinted in: W. Buckley (Ed.), Modern systems research for the behavioral scientist (pp. 304–313). Chicago, IL: Aldine, 1968.)
McKelvey, B. (2003). From fields to science. In: R. Westwood & S. Clegg (Eds), Debating organization: Point-counterpoint in organization studies (pp. 47–73). Oxford, UK: Blackwell.
McKelvey, B. (2004). Toward a 0th Law of Thermodynamics: Order-creation complexity dynamics from physics and biology to bioeconomics. Journal of Bioeconomics, 6, 65–96.
McKelvey, B., & Andriani, P. (2005). Why Gaussian statistics are mostly wrong for strategic organization. Strategic Organization, 3, 219–228.
McKelvey, B., & Boisot, M. (2006). Speeding up strategic foresight in a dangerous, complex world: A complexity approach. In: G. Suder (Ed.), Corporate strategies under international terrorism and adversity. Cheltenham, UK: Edward Elgar.
Miller, D., & Friesen, P. (1984). Organizations: A quantum view. Englewood Cliffs, NJ: Prentice-Hall.
Moss, S. (2002). Policy analysis from first principles. Proceedings of the National Academy of Sciences, 99(Suppl. 3), 7267–7274.
Newman, M. E. J. (2005). Power laws, Pareto distributions and Zipf's Law. (http://www.arxiv.org/abs/cond-mat/0412004).
Pareto, V. (1897). Cours d'économie politique. Paris: Rouge.
Peters, T. J., & Waterman, R. H. (1982). In search of excellence. New York: Harper & Row.
Pfeffer, J. (1993). Barriers to the advancement of organizational science: Paradigm development as a dependent variable. Academy of Management Review, 18, 599–620.
Pickands, J. I. (1975). Statistical inference using extreme order statistics. Annals of Statistics, 3, 119–131.
Powell, W. W., White, D. R., Koput, K. W., & Owen-Smith, J. (2005). Network dynamics and field evolution: The growth of interorganizational collaboration in the life sciences. American Journal of Sociology, 110, 1132–1205.
Preston, F. W. (1950). Gas laws and wealth laws. Scientific Monthly, 71, 309–311.
Preston, F. W. (1981). Pseudo-lognormal distributions. Ecology, 62, 355–364.
Prigogine, I. (1955). An introduction to thermodynamics of irreversible processes. Springfield, IL: Thomas.
Raup, D. M. (1999). The nemesis affair: A story of the death of dinosaurs and the ways of science. New York: Norton.
Reiss, R. D., & Thomas, M. (2001). Statistical analysis of extreme values: From insurance, finance, hydrology and other fields (2nd ed.). Boston, MA: Birkhäuser.
Ribeiro, A. L. P., Lombardi, F., Sousa, M. R., Barros, M. V. L., Porta, A., Barros, V. C. V., Gomes, M. E. D., Machado, F. S., & Rocha, M. O. C. (2002). Power-law behavior of heart rate variability in Chagas' disease. American Journal of Cardiology, 89, 414–418.
Roethlisberger, F. J., & Dickson, W. J. (1939). Management and the worker. Cambridge, MA: Harvard University Press.
Romme, G. L. (1999). Domination, self-determination and circular organizing. Organization Studies, 20, 801–832.
Rynes, S. L., Bartunek, J. M., & Daft, R. L. (2001). Across the great divide: Knowledge creation and transfer between practitioners and academics. Academy of Management Journal, 44, 340–355.
Sanchez, R., & Mahoney, J. T. (1996). Modularity, flexibility, and knowledge management in product and organization design. Strategic Management Journal, 17, 63–76.
Schilling, M. A. (2000). Toward a general modular systems theory and its application to interfirm product modularity. Academy of Management Review, 25, 312–334.
Schrödinger, E. (1944). What is life? The physical aspect of the living cell. Cambridge, UK: Cambridge University Press.
Simon, H. A. (1962). The architecture of complexity. Proceedings of the American Philosophical Society, 106, 467–482.
Sornette, D. (2003). Why stock markets crash: Critical events in complex financial systems. Princeton, NJ: Princeton University Press.
Sornette, D. (2004). Critical phenomena in natural sciences: Chaos, fractals, self-organization and disorder: Concepts and tools (2nd ed.). Heidelberg, Germany: Springer.
Sornette, D., Deschâtres, F., Gilbert, T., & Ageon, Y. (2004). Endogenous versus exogenous shocks in complex networks: An empirical test using book sale rankings. Physical Review Letters, 93, 22870-1–22871-4.
Stanley, M. H. R., Amaral, L. A. N., Buldyrev, S. V., Havlin, S., Leschhorn, H., Maass, P., Salinger, M. A., & Stanley, H. E. (1996). Scaling behavior in the growth of companies. Nature, 379, 804–806.
Stauffer, D. (1985). Introduction to percolation theory. London: Taylor & Francis.
Thomas, C., Kaminska-Labbé, R., & McKelvey, B. (2005). Managing the control/autonomy dilemma: From impossible balance to irregular oscillation dynamics. Working paper, University of Nice, Sophia Antipolis, France.
Uzzi, B., Spiros, J., & Delis, D. (2002). The origin and emergence of career networks in the Broadway musical industry, 1877–1995. Working paper, Kellogg Graduate School of Management, Northwestern University, Chicago, IL.
Van de Ven, A. H., & Johnson, P. E. (forthcoming). Knowledge for theory and practice. Academy of Management Review.
Watts, D. J., & Strogatz, S. H. (1999). Collective dynamics of 'small-world' networks. Nature, 393, 440–442.
Weibull, W. (1939). A statistical theory of the strength of materials. Ingenioersvetenskapsakademien Stockholm Handlingar. In: Proceedings of the Royal Swedish Institute for Engineering Research, Stockholm, Sweden (Vol. 151, pp. 1–45).
Weick, K. E. (2001). Gapping the relevance bridge: Fashions meet fundamentals in management research. British Journal of Management Research, 12, S71–S75.
West, B. J., & Deering, B. (1995). The lure of modern science: Fractal thinking. Singapore: World Scientific.
West, G. B., Brown, J. H., & Enquist, B. J. (1997). A general model for the origin of allometric scaling laws in biology. Science, 276, 122–126.
Yule, G. U. (1924). A mathematical theory of evolution based on the conclusions of Dr. J. C. Willis. Philosophical Transactions of the Royal Society of London B, 213, 21–87.
Zipf, G. K. (1949). Human behavior and the principle of least effort. New York: Hafner.
THE ROLE OF FORMATIVE MEASUREMENT MODELS IN STRATEGIC MANAGEMENT RESEARCH: REVIEW, CRITIQUE, AND IMPLICATIONS FOR FUTURE RESEARCH

Nathan P. Podsakoff, Wei Shen and Philip M. Podsakoff

ABSTRACT

Since the publication of Venkatraman and Grant's (1986) article two decades ago, considerably more attention has been directed at establishing the validity of constructs in the strategy literature. However, recent developments in measurement theory indicate that strategy researchers need to pay additional attention to whether their constructs should be modeled as having formative or reflective indicators. Therefore, the purpose of this chapter is to highlight the differences between formative and reflective indicator measurement models, and discuss the potential role of formative measurement models in strategy research. First, we systematically review the literature on construct measurement model specification. Second, we
assess the extent of measurement model misspecification in the recent strategy literature. Our assessment of 257 constructs in the contemporary strategy literature suggests that many important strategy constructs are more appropriately modeled as having formative indicators than as having reflective indicators. Based on this review, we identify some common errors leading to measurement model misspecification in the strategy domain. Finally, we discuss some implications of our analyses for scholars in the strategic management field.
INTRODUCTION

Establishing the validity of the measures of hypothetical constructs is critical to the development of the social sciences (Schwab, 1980). Two decades ago, Venkatraman and Grant (1986) addressed this issue specifically within the domain of strategic management. These authors noted that much of the attention in strategy research up to the time of their article had focused on the substantive relationships between different theoretical constructs, rather than on validating the relationships between the measures and the constructs these measures were purported to assess. They pointed out that construct validation was critical to strategy research for several reasons. First, evaluation of one's measures provides a strong basis for interpreting substantive relationships. Second, since the upper bound of the variance shared between any two variables is a product of their reliabilities, inattention to the measurement properties of constructs may lead to the rejection of relationships that actually exist. Finally, improved understanding of the relationships between constructs can only be attained through better measurement. Venkatraman and Grant (1986) concluded that the quality of strategy research depends on the quality of the measures used and recommended that strategy researchers develop valid, multiple-item scales to measure their constructs.

Although current strategic management research is still criticized for relying on the use of single-indicator measures and a lack of emphasis on construct measurement (Boyd, Gove, & Hitt, 2005), the issues raised by Venkatraman and Grant (1986) appear to have attracted a great deal of attention from strategy researchers. For example, a search of the ISI Web of Knowledge database indicates that this paper has been cited over 140 times since its publication. Furthermore, an examination of the studies published in the Academy of Management Journal (AMJ), Administrative Science Quarterly (ASQ), and Strategic Management Journal (SMJ) suggests
that strategy researchers have increasingly recognized the importance of construct measurement. They have strived to develop multiple-item scales to measure key constructs and have increasingly used structural equation modeling (SEM) techniques in their analyses (e.g., Birkinshaw, Hood, & Jonsson, 1998; Finkelstein & Boyd, 1998; Koka & Prescott, 2002; Shook, Ketchen, Hult, & Kacmar, 2004; Williams, Gavin, & Hartman, 2004). Although the attention given to measurement issues such as reliability and validity have been characterized as largely inadequate (Boyd et al., 2005) or less than ideal (Shook et al., 2004), the increasing use of multiple-item scales in construct measurement is nevertheless a welcome development in strategic management research. However, to further improve the measurement of constructs in strategic management research, scholars must attend to two important developments in this domain. The first development focuses on the nature of the relationship between a construct and its measures (e.g., Bollen, 1984, 1989; Bollen & Lennox, 1991; Diamantopoulos & Winklhofer, 2001; Edwards & Bagozzi, 2000; Jarvis, MacKenzie, & Podsakoff, 2003; MacCallum & Browne, 1993; MacKenzie, Podsakoff, & Jarvis, 2005). In the social sciences, researchers traditionally treat the relationship between a construct and its measures on the basis of classical test theory, which assumes that each measure is a reflection or manifestation of the underlying construct (Schwab, 1980). As a result, these measures are referred to as ‘‘reflective’’ measures of the construct (Fornell & Bookstein, 1982), and the construct is referred to as a ‘‘latent’’ variable (MacCallum & Browne, 1993). However, there is growing recognition that some measures may actually be determinants or causes of a construct, rather than manifestations of it (Bollen & Lennox, 1991; Edwards & Bagozzi, 2000; MacCallum & Browne, 1993). These measures are referred to as ‘‘formative’’ measures of the construct (Fornell & Bookstein, 1982), and the construct is referred to as a ‘‘composite’’ variable (MacCallum & Browne, 1993).1 One frequently cited example of a composite construct is social economic status (SES), which is typically measured by an individual’s income, education level, and occupation. In this case, none of the measures are manifestations of an individual’s SES; instead, they jointly determine or form the SES construct. The second development focuses on the nature of the relationship between a higher-order multidimensional construct and its dimensions (Edwards, 2001; Law & Wong, 1999; Law, Wong, & Mobley, 1998). For example, Law and Wong (Law & Wong, 1999; Law et al., 1998) distinguished between latent and aggregate models. In their latent model, these authors indicate that the higher-order construct exists at a ‘‘deeper’’ level
than its dimensions and each dimension represents a manifestation of the construct. Conceptually and mathematically, this relationship is very similar to that occurring between a construct and its reflective measures. In contrast, an aggregate construct is formed as an algebraic composite of its dimensions, with each dimension representing a distinct facet of the construct. Mathematically, this relationship is very similar to that between a construct and its formative measures. The specification of the relationship between a construct and its measures or dimensions is not only important at a conceptual level, but also has significant consequences on the substantive relationships under study (Bollen & Lennox, 1991; Jarvis et al., 2003; Law & Wong, 1999; MacCallum & Browne, 1993; MacKenzie et al., 2005). However, with only a few exceptions (e.g., Fornell, Lorange, & Roos, 1990; Johansson & Yip, 1994; Olk & Young, 1997), these developments have not been incorporated into empirical strategy research. Indeed, our review of the articles published in AMJ, ASQ, and SMJ during the last two decades indicates that only a few strategy articles have explicitly discussed the relationships between constructs and their measures or dimensions. Therefore, most studies employing multiple measures to assess a construct implicitly treated them as reflective measures of the construct. There are two possible explanations for the very limited use of formative measures in strategy research. The first explanation is that constructs having formative indicators represent only a small proportion of the constructs used in the strategy field (Slater & Atuahene-Gima, 2004, p. 238). This would suggest that the majority of constructs in this domain are conceptualized as unidimensional, and are measured using indicators that are reflections of the underlying construct. In contrast, the second explanation, provided by Hitt, Gimeno, and Hoskisson (1998), is that many strategy researchers are not aware of or do not understand the differences between formative and reflective measurement models. In this case, strategy researchers may be modeling constructs having formative indicators as having reflective indicators out of a lack of knowledge. On the basis of their review of 18 strategic management studies published from 2000 to 2003, Williams et al. (2004) concur with this second explanation, arguing that researchers may default to the use of reflective measurement models without thinking through the nature of the relationship between their construct and its measures. Given the above observations, the purpose of this chapter is to highlight the differences between formative and reflective indicator measurement models, and discuss the potential role of formative measurement models in
strategy research. First, we systematically review the recent developments in the literature on construct measurement model specification. As part of this review, we discuss the criteria that can be used to differentiate between constructs that should be modeled as having formative indicators from those that should be modeled as having reflective indicators. Next, in order to assess the extent of measurement model misspecification in the current strategic management literature, we conduct a comprehensive review of strategy studies published in AMJ, ASQ, and SMJ from 1994 to 2003. Based on this review, we identify four common errors leading to construct measurement model misspecification. Finally, we discuss some implications of our analysis for scholars in the strategy field. Before proceeding, we want to point out that our review is not intended as a criticism of the researchers whose studies we assess or cite as examples. On the contrary, we have great respect for these scholars and their efforts toward improving the quality of strategy research. Neither do we wish to disparage the quality of the research published in AMJ, ASQ, or SMJ. Studies published in these leading journals have greatly influenced the field of management (cf. Podsakoff, MacKenzie, Bachrach, & Podsakoff, 2005; Tahai & Meyer, 1999). In our opinion, the lack of attention to measurement model specification is largely due to two factors. The first is the dominance of econometric analysis in strategic management research, and the second is the relatively recent application of formative indicator measurement models to many different domains in the social sciences in general. Indeed, the issues surrounding formative measurement models are also new to researchers in other disciplines in the field of management as well (Law & Wong, 1999; Law et al., 1998; MacKenzie et al., 2005). In this regard, we believe that the issues we address in this article will have implications for researchers in strategy and other disciplines in the social sciences.
FORMATIVE AND REFLECTIVE MEASUREMENT MODELS

For the purpose of clarity, it is important for us to define the key terms used in this chapter. We define a construct as a conceptual term used to describe a phenomenon of theoretical interest (Schwab, 1980). We use the terms measure or indicator interchangeably to describe an observed variable gathered from a survey, interview, archival record, or any other empirical source designed to capture a construct (Edwards & Bagozzi, 2000). We define a
dimension as a conceptual term used to describe a distinct facet of a construct that is conceptualized as having heterogeneous facets (Bollen & Lennox, 1991). Therefore, by definition, each dimension captures a unique aspect of a multidimensional construct. In addition, unlike measures, dimensions are constructs themselves and also have their own measures or indicators. In SEM language, a multidimensional construct is a second-order construct, its dimensions are first-order constructs, and measures of the dimensions are observable variables. We use the term formative measurement model for situations in which a construct is modeled as a composite of its dimensions or measures, and reflective measurement model for situations in which a construct is modeled as a latent variable of its dimensions or measures.
The Conceptual and Mathematical Distinctions between Formative and Reflective Measurement Models

As indicated above, recent developments in construct measurement have addressed two related issues: (a) the relationship between a construct and its measures (Bollen & Lennox, 1991; Diamantopoulos & Winklhofer, 2001; Edwards & Bagozzi, 2000; MacCallum & Browne, 1993), and (b) the relationship between a multidimensional construct and its dimensions (Edwards, 2001; Law et al., 1998; Jarvis et al., 2003; MacKenzie et al., 2005). The focus of much of this discussion has been on whether a construct should be modeled as a composite of its measures/dimensions (i.e., using a formative measurement model), or as a latent variable of its measures/dimensions (i.e., using a reflective measurement model). In this section, we first review and integrate these two streams of literature, with particular attention to the conceptual and mathematical differences between formative and reflective measurement models. We then discuss the consequences of measurement model misspecification and the criteria that can be used to help researchers choose between formative and reflective measurement models. To minimize the potential for confusion, we begin our review with the relationships between first-order constructs and their measures, and then we discuss the relationships between higher-order multidimensional constructs, their dimensions, and their measures.

Relationships between First-Order Constructs and Measures

As noted earlier, reflective measurement models are built on classical test theory and treat each reflective measure as a manifestation of the construct
being assessed. According to classical test theory, the variance in each reflective measure of a first-order unidimensional construct has two components: variance reflecting the construct and variance due to error (Schwab, 1980). To ensure high reliability and validity, it is important to assess such a construct with multiple reflective measures. Mathematically, each reflective measure is defined as a linear function of the construct plus measurement error:

y_i = \lambda_i \eta + \varepsilon_i, \quad i = 1, \ldots, n   (1)

where y_i is the ith reflective measure, η is the construct, λ_i is the effect (i.e., factor loading) of the construct η on y_i, ε_i is measurement error specific to y_i, and n is the number of reflective measures used to assess the construct (Bollen & Lennox, 1991). An example of a construct modeled as having reflective indicators is provided in Panel A of Fig. 1. This figure illustrates several of the distinguishing characteristics of constructs with reflective indicators. First, the direction of the arrows in this model goes from the latent construct (η) to its indicators. This signifies that the indicators are reflections or 'effects' of the underlying construct that they represent, and that it is the underlying construct that accounts for the covariation among the indicators. Second, the direction of the arrows reflects the fact that the indicators are the dependent variables in this measurement model and that, conceptually, the factor loadings represent a simple regression of each of the indicators on the underlying construct of interest. Finally, the error terms attached to the indicators are included in order to acknowledge that each of the indicators is measured with error.
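A small simulation can make Eq. (1) concrete. The sketch below is not from the chapter: the loadings are hypothetical, and Cronbach's alpha is computed directly from its definition simply to show that indicators driven by one latent variable intercorrelate and hang together.

```python
import numpy as np

rng = np.random.default_rng(4)
n_firms = 500
loadings = np.array([0.9, 0.8, 0.7])          # hypothetical lambda_i values

# Eq. (1): y_i = lambda_i * eta + epsilon_i, scaled so each indicator has unit variance.
eta = rng.normal(size=n_firms)                              # latent construct scores
eps = rng.normal(size=(n_firms, 3)) * np.sqrt(1 - loadings**2)
Y = eta[:, None] * loadings + eps

# Indicators covary because they share the latent cause; alpha summarizes that.
k = Y.shape[1]
alpha = k / (k - 1) * (1 - Y.var(axis=0, ddof=1).sum() / Y.sum(axis=1).var(ddof=1))
print(np.corrcoef(Y, rowvar=False).round(2))
print("Cronbach's alpha:", round(alpha, 2))
```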
Fig. 1. Illustrations of (A) Reflective Indicator Model and (B) Formative Indicator Model.
When estimating a construct with reflective measures, the purpose is to 'account for the observed variances and co-variances' between the measures (Fornell & Larcker, 1981, p. 441). More importantly, only the covariance that each measure shares with all the other measures of the construct (i.e., common variance) is used for analysis. Variance that is unique to each measure is regarded as measurement error and is not taken into account. As a result, reflective measurement models emphasize internal consistency reliability and convergent validity when assessing the 'goodness' of the indicators. To derive an estimate of the construct η, factor analytic techniques are employed on the reflective indicators y_1, …, y_n (Edwards & Bagozzi, 2000).

An example of a construct for which this type of measurement model is appropriate is the local competition construct developed by Birkinshaw et al. (1998) in their study of subsidiary initiatives in multinational corporations. Birkinshaw and his colleagues defined local competition as the level of competitiveness in the local market as perceived by subsidiary management. To measure this construct, they asked subsidiary CEOs to indicate the extent to which the following two items describe their business environment: (1) competition in this country is extremely intense; and (2) domestic competition is intense. These two items assess the overall level of perceived local (domestic) competition in very similar ways, and subsidiary CEOs' responses to each of these items reflect their perceptions of the level of local competition faced by their firm, plus measurement error. By analyzing the variance and covariance of these two reflective measures, the authors were able to assess the reliability and convergent validity of the measures.

In formative measurement models of first-order constructs, the measures are not viewed as reflections or manifestations of the construct to be assessed. Rather, they are viewed as determinants of the construct. In other words, formative indicators of a first-order construct are exogenous variables that actually determine the variance of the construct, rather than vice versa as in a reflective measurement model. Mathematically, a first-order construct assessed with formative measures is generally operationalized as a weighted linear combination of the formative measures, plus a disturbance term:

\eta = \sum_{i=1}^{n} \gamma_i X_i + \zeta   (2)

where η is the construct, n is the number of formative measures used to assess the construct, γ_i is the parameter estimate or weight reflecting the contribution of X_i to the construct η, X_i is the ith formative measure, and ζ is the
disturbance term (Bollen & Lennox, 1991; Edwards & Bagozzi, 2000; MacCallum & Browne, 1993). This measurement model is illustrated in Panel B of Fig. 1. In contrast to measurement models for constructs having reflective indicators, the direction of the arrows in measurement models for constructs with formative indicators goes from the indicators to the construct to illustrate the fact that the indicators are the determinants of the construct. Conceptually, this means that the construct is the dependent variable in this measurement model and that the indicators are the independent variables. Unlike the reflective indicator measurement model, measurement errors in individual formative indicators X_1, …, X_n are not a concern. However, a disturbance (error) term (ζ) is included at the construct level to capture any invalidity in the set of indicators that are used to measure the construct (MacKenzie et al., 2005). Sometimes, researchers may ignore the disturbance term such that the construct is treated as a perfect weighted linear combination of the selected formative measures (Edwards & Bagozzi, 2000). In this case, measurement error in the construct is effectively ignored.

When estimating a formative measurement model, the purpose is not to account for the observed covariance shared by the measures. Instead, the purpose is to replicate the variance observed in the indicators in order to minimize the residuals in the structural relationships of interest between the construct and other variables and to increase explanatory power (Fornell & Bookstein, 1982). Further, all the variance observed in each indicator, including both the variance shared with other measures and the variance unique to each measure, is included in the analysis. Formative measurement models underlie principal component analysis, canonical correlation analysis, and partial least squares analysis, each of which uses observed measures to assess constructs through weighted linear combinations (Edwards & Bagozzi, 2000). Sometimes, researchers may simply derive estimates of the construct by using either the sum (e.g., Nachum, 2003) or the mean of the indicators (e.g., Mitchell, Holtom, Lee, Sablynski, & Erez, 2001). In such cases, the researchers have assumed that the measures contribute equally to form the construct: γ_i = 1 when the sum is used and γ_i = 1/n when the mean is used.

An example of a construct for which this type of measurement model might be appropriate is the intensity of the rivalry among existing competitors construct developed by Michael Porter (1980). According to Porter, this construct may be measured using indicators such as industry concentration, sales growth, and excess capacity (Hitt et al., 1998). Because the intensity of rivalry is determined by these measures, they should be modeled as formative indicators, not reflective indicators (manifestations) of rivalry.
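To contrast this with the reflective simulation above, the sketch below builds a formative composite in the spirit of Eq. (2). Everything in it is assumed for illustration – the three "rivalry" indicators, the reverse-scoring of sales growth, and the weights – and, unlike reflective indicators, the indicators here are not required to intercorrelate, so internal-consistency statistics are not the relevant check.

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical standardized indicators per firm: concentration, sales growth, excess capacity.
X = rng.normal(size=(100, 3))
X[:, 1] = -X[:, 1]                      # illustrative reverse-scoring: slower growth -> more rivalry

gamma = np.array([0.5, 0.3, 0.2])       # assumed weights; there is no "true" loading to recover
eta_weighted = X @ gamma                # Eq. (2) composite, disturbance term omitted
eta_mean = X.mean(axis=1)               # gamma_i = 1/n, the common mean-of-indicators choice

print(np.corrcoef(X, rowvar=False).round(2))        # indicators need not correlate
print(eta_weighted[:5].round(2), eta_mean[:5].round(2))
```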
Relationships between Higher-Order Multidimensional Constructs, Dimensions, and Measures Higher-order multidimensional constructs are conceptualized as having multiple dimensions, with each dimension representing an important aspect of the construct (Bollen & Lennox, 1991). Law and Wong (1999; Law et al., 1998) proposed three models to specify the nature of the relationships between a higher-order multidimensional construct and its dimensions: a latent model, an aggregate model, and a profile model. According to Law and his colleagues, a latent construct is defined by the commonality of the dimensions, and the dimensions are defined as manifestations of the construct with different degrees of accuracy. This relationship is very similar to that between a first-order unidimensional construct and its reflective indicators, except that dimensions are unobservable constructs themselves, whereas reflective indicators are observable variables (Edwards, 2001; Law et al., 1998). In Law and Wong's aggregate model, the construct is defined as an algebraic composite of its dimensions, which are distinct components of the construct itself (Law et al., 1998). This relationship is very similar to that between a first-order construct and its formative indicators, except that dimensions are conceptual components of a higher-order multidimensional construct, whereas formative indicators are observable indicators of a first-order construct (Edwards, 2001). Finally, in their profile model, Law and his colleagues indicate that the construct cannot be defined as an algebraic function of the dimensions; instead, it represents various configurations of different levels of the dimensions (Law et al., 1998). Because we are interested in higher-order multidimensional constructs whose relationships with their dimensions can be specified mathematically, we limit our discussion here to the latent and the aggregate models. The research conducted by Law and his colleagues (Law et al., 1998) makes an important contribution to our understanding of the relationships between higher-order multidimensional constructs and their dimensions. However, we have reservations about the appropriateness of the latent variable model discussed by Law et al. in the measurement of higher-order multidimensional constructs in strategy research. As noted by Law et al., the dimensions in a latent model do not represent distinct facets of the construct. Rather, they represent the same construct with different levels of accuracy. More importantly, Law et al. argue that when a higher-order multidimensional construct is defined as the commonality of its dimensions (i.e., the covariance shared by all of its dimensions), the variances specific to each dimension and those shared only by some of the dimensions are not considered to be part of the true variance of the construct. Rather, they are
considered error variance. In other words, the unique variances of the dimensions are excluded from being part of the construct, even though these dimensions are conceptualized as part of the construct’s domain. Therefore, we believe that it is more appropriate for strategy researchers to use the aggregate model because it treats each dimension as an important component of the construct. This treatment of a higher-order multidimensional construct is consistent with our conception, following Bollen and Lennox (1991), that dimensions represent distinct facets of the construct. However, it should be noted that our recommendation is based on the assumption that an algebraic relationship can be specified between the constructs and their dimensions. In other words, the constructs are not just assessed as configurations of their dimensions through a profile model. An example of a strategy construct that we believe is consistent with the aggregate model is the interest similarity construct developed by Doz, Olk, and Ring (2000). These authors surveyed R&D consortium partners regarding the similarity of their interests in nine different areas: industry, organizational characteristics, existing ties, new development in products, new development in technology process, motive to improve performance, motive to develop technical standards, motive to share resources, and motive to develop resources. According to Doz et al. (2000), interest similarities in these nine areas complement one another and collectively represent the interest similarity between R&D consortium partners. Therefore, they argued that interest similarity should be modeled as a construct having formative indicators. We agree with Doz et al.’s (2000) treatment of the interest similarity construct since this construct is clearly a result of a firm’s perception of the similarity in interest it has with another firm in these nine different areas, rather than the cause of them.
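For readers who prefer to see the two specifications side by side, the following pair of equations gives one conventional rendering of the latent and aggregate models; the notation (λ, γ, ε, ζ) is generic structural-equation notation rather than a reproduction of Law et al.'s own symbols.

η_d = λ_d η + ε_d,   d = 1, …, D        (latent model: each dimension reflects the higher-order construct)

η = Σ_{d=1}^{D} γ_d η_d + ζ             (aggregate model: the higher-order construct is a composite of its dimensions)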
CONSEQUENCES OF MEASUREMENT MODEL MISSPECIFICATION Measurement model misspecification refers to situations in which constructs having formative measures are incorrectly specified as having reflective measures, or vice versa. Because a reflective measurement model only uses the covariance shared between all of the measures, whereas a formative measurement model uses all the variance observed in the measures to derive an estimate of the construct, measurement model misspecification can have severe consequences for conclusions regarding the substantive relationships
between the constructs under study (Jarvis et al., 2003; Law & Wong, 1999; MacKenzie et al., 2005). Moreover, because the proportion of the unique variance increases as the covariance among the measures decreases, the impact of measurement model misspecification is expected to be more severe when the measures of a construct do not covary at a high level. Law and Wong (1999) were perhaps the first researchers to examine the empirical consequences of measurement model misspecification in the management domain. Using a sample of 204 business school graduates, these authors modeled relationships between job perceptions (task significance, task identity, task variety, autonomy, and feedback), leader-member exchange (LMX), liking toward the supervisor, job satisfaction, and turnover intentions with two different measurement models. In the first model, Law and Wong modeled task significance, task identity, task variety, autonomy, and feedback as reflective indicators of job perceptions, whereas in the second model these facets were modeled as formative indicators of the job perceptions construct. Their analysis indicated that the parameter estimate for the relationship between job perceptions and job satisfaction was inflated when the job perceptions construct was mismodeled as having reflective indicators, as compared to when it was correctly modeled as having formative indicators. Therefore, the findings reported by these authors provided initial evidence indicating that measurement model misspecification may influence the substantive relationships between constructs in the management literature. However, the generalizability of these findings was somewhat limited by several methodological factors (cf. MacKenzie et al., 2005), including the fact that this research was conducted in a single, relatively small sample. In an attempt to address these limitations, Jarvis et al. (2003) employed Monte Carlo simulation techniques to examine the consequences of measurement model misspecification in different contexts. Specifically, these authors manipulated the strength of the correlations between the construct measures (i.e., weak, r = 0.10; moderate, r = 0.40; or strong, r = 0.70) and the position of the focal construct within the structural model (i.e., exogenous or endogenous). Their results suggest that measurement model misspecification has substantial consequences for structural parameter estimates. For example, when the focal construct was positioned as an exogenous variable in a structural equation model, and its formative indicators were misspecified as reflective, the unstandardized structural parameter estimates emanating from the focal construct to the endogenous constructs were inflated by an average of 490% when the correlations for its formative measures were weak (r = 0.10), 385% when the correlations were moderate
(r = 0.40), and 336% when the correlations were strong (r = 0.70). Furthermore, the unstandardized parameter estimates emanating from the focal construct were significantly different from the same paths in the correctly specified formative measurement model (p < 0.001). In addition, Jarvis et al. also reported significant differences in the parameter estimates (although of lesser magnitude and of opposite sign) when the focal construct that was misspecified as having reflective measures was positioned as an endogenous variable. Another important finding of the study by Jarvis et al. (2003) relates to the goodness-of-fit indices. These authors reported that many of the most commonly recommended goodness-of-fit indices (CFI, SRMR, and RMSEA) generally indicated appropriate model fit for structural equation models in which formative measures were incorrectly specified as reflective measures. Thus, they concluded that, "when the measurement model is misspecified, researchers may have difficulty detecting it based on the overall goodness-of-fit indices" (Jarvis et al., 2003, p. 212). Taken together, the findings of the study by Jarvis et al. indicate that measurement model misspecification not only has serious consequences for parameter estimates, but also that these consequences are generally not detectable through the goodness-of-fit indices recommended in the structural equation modeling literature. A more recent simulation conducted by MacKenzie et al. (2005) provides additional support for these findings.
CHOOSING BETWEEN FORMATIVE AND REFLECTIVE MEASUREMENT MODELS Because of the potentially severe consequences of measurement model misspecification and the difficulty in detecting it through the most commonly used goodness-of-fit indices, it is important for researchers to correctly specify the measurement models in their analysis. Although there has been some discussion about the differences between formative and reflective measures (Bollen & Lennox, 1991; Edwards & Bagozzi, 2000; Jarvis et al., 2003), choosing the most appropriate measurement model can be difficult. Based on the existing literature (Bollen & Lennox, 1991; Jarvis et al., 2003; MacKenzie et al., 2005), we review four criteria that can help strategy researchers make this choice. The first criterion focuses on the conceptual relationship between the constructs and the measures or dimensions, whereas the other three focus on the interrelationships among the measures
or the dimensions. Again, to minimize the potential for confusion, we first discuss these criteria as they apply to the measurement of first-order constructs, and then generalize them to the measurement of higher-order multidimensional constructs.
Measurement of First-Order Constructs The first criterion focuses on the conceptual relationship between the construct and its measures, and has been referred to as the direction of causality (e.g., Bollen & Lennox, 1991). As noted by Bollen and Lennox (1991), the term "causality" entails no special meaning here other than whether the construct is better conceptualized as being determined by its measures or the measures are viewed as reflections of the underlying construct. The defining characteristic of formative measurement models is that the construct is determined by its measures, and changes in the measures are expected to produce changes in the construct. This suggests that the construct is the endogenous variable, and the measures are the exogenous variables that are theoretical determinants of the construct. In contrast, the defining characteristic of reflective measurement models is that the construct underlies the measures, and changes in the construct are expected to cause changes in the measures. This suggests that the measures are simply observable manifestations or reflections of the construct. The second criterion relates to the interchangeability of the measures at the conceptual level. Because formative measures are exogenous determinants of the construct and may capture unique aspects of its domain, they are not necessarily expected to be interchangeable. As a result, removing any of the formative measures of a construct may alter the conceptual domain of the construct and subsequently result in construct measurement deficiency (Bollen & Lennox, 1991). In contrast, because reflective measures all share the same common theme and are viewed as equivalent manifestations of the same construct, they are expected to be interchangeable and, assuming that they are equally reliable, removing any of them should not have a significant impact on the conceptual domain of the construct. The third criterion relates to the covariance of the measures (Bollen, 1984; Bollen & Lennox, 1991; Jarvis et al., 2003). Since each formative measure is an exogenous variable that may capture a unique aspect of the construct's domain, these measures need not be expected to covary at a high level with one another. Indeed, as noted by Bollen (1984), measures of a composite construct may be positively related, negatively related, or not related to each
other at all. In contrast, since all of the measures of a reflective construct are expected to be interchangeable manifestations of the same construct, reflective measures are expected to covary at a high level. This third criterion can be viewed as empirical evidence for the second criterion, the conceptual interchangeability of the measures. The fourth criterion relates to the similarity of the nomological networks of the measures (Jarvis et al., 2003; MacKenzie et al., 2005). As manifestations of the same construct, reflective measures are expected to have similar antecedents and consequences. As exogenous determinants of a construct, formative measures are not necessarily expected to have similar antecedents or consequences. Empirically, similarity of nomological networks can be assessed by the "structural equivalence" of the measures. For example, assuming a construct has two measures i and j, and the correlation matrix also includes n measures km (m = 1, …, n) of other constructs, the similarity of nomological networks implies a similarity between the correlations r(i, km) and r(j, km) (m = 1, …, n), and can be assessed by computing the correlation between these two profiles, Corr(r(i, km), r(j, km)), across m = 1, …, n. However, because this item-level information is typically not made available by authors, and because this test may be biased by factors such as single source ratings, we recommend that the similarity of the nomological networks criterion be assessed at a conceptual level, and then supported with empirical evidence, if it is available. In summary, a first-order construct should be assessed with a formative measurement model if the indicators are conceptualized as exogenous determinants of the construct, are not necessarily interchangeable, do not necessarily covary with each other at a high level, and are not expected to have similar nomological networks. In contrast, if the indicators are conceptualized as manifestations of the construct, are expected to be interchangeable and covary at a high level, and have similar nomological networks, then a reflective measurement model is appropriate.
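As a rough illustration of the structural-equivalence idea, the following Python sketch (using invented data and hypothetical column indices) correlates the correlation profiles of two items across a set of measures of other constructs. In practice this check would be applied to real item-level data and interpreted alongside the conceptual criteria above.

```python
import numpy as np

def nomological_similarity(data, i, j, others):
    """Correlate the correlation profiles of items i and j across a set of
    'other' measures, as a crude index of structural equivalence.
    data: (cases x variables) array; i, j, others: column indices."""
    R = np.corrcoef(data, rowvar=False)   # full correlation matrix among all variables
    profile_i = R[i, others]              # correlations of item i with the other measures
    profile_j = R[j, others]              # correlations of item j with the other measures
    return np.corrcoef(profile_i, profile_j)[0, 1]

# Toy example with invented data: columns 0 and 1 measure the focal construct,
# columns 2-5 are measures of other constructs.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 6))
print(nomological_similarity(data, 0, 1, others=[2, 3, 4, 5]))
```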
Measurement of Higher-Order Multidimensional Constructs When a construct is conceptualized as having multiple facets (i.e., dimensions), the first question is whether the relationship between the construct and the dimensions can be expressed as an algebraic function. In a reflective measurement model, the dimensions are expressed as an algebraic function of the construct. In a formative measurement model, the construct is expressed as an algebraic function of the dimensions. When no algebraic function can be specified between the construct and the dimensions, the
construct may more appropriately be assessed as a configuration of the dimensions through a profile model (Law et al., 1998). To choose between a formative and a reflective measurement model, the key question to ask is whether each dimension makes a unique contribution to the understanding of the construct. The answer depends on both the relationship between the dimensions and the relationship between the construct and the dimensions. For each dimension to make a unique contribution to the understanding of the construct, a necessary condition is that they must be distinguishable from each other. Specifically, they must not be interchangeable conceptually, must not covary at a high level empirically, and must not have similar nomological networks. When these conditions are not met, it suggests that the dimensions are not distinct from each other, but rather are just equivalent manifestations of the construct. In this situation, a reflective measurement model should be used. When the above conditions are met, it suggests that the dimensions are distinct, and the unique variances captured by the dimensions make important contributions to the understanding of the construct (i.e., each dimension is a distinct component of the construct), and a formative measurement model should be used. As noted earlier, because we believe that many higher-order multidimensional constructs in strategic management meet these conditions (i.e., have dimensions that capture unique aspects of the construct that should be included in the conceptual domain of the construct) they probably should be modeled as having formative indicators, rather than as having reflective indicators.
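The decision logic described in the last two subsections can be summarized as a simple checklist. The sketch below is only a schematic restatement of the criteria: the inputs are the researcher's conceptual judgments about the measures or dimensions, not statistical tests, and the function is not a substitute for the fuller discussion above.

```python
def suggest_measurement_model(caused_by_components, interchangeable,
                              expected_to_covary, similar_nomological_nets):
    """Schematic checklist for choosing between formative and reflective models.
    All arguments are booleans representing conceptual judgments about the
    measures (first-order constructs) or dimensions (higher-order constructs)."""
    formative_signals = [
        caused_by_components,          # criterion 1: components determine the construct
        not interchangeable,           # criterion 2: components are not interchangeable
        not expected_to_covary,        # criterion 3: components need not covary highly
        not similar_nomological_nets,  # criterion 4: components have distinct nets
    ]
    if all(formative_signals):
        return "formative"
    if not any(formative_signals):
        return "reflective"
    return "mixed signals: re-examine the construct's conceptual domain"

# Example: a composite such as managerial discretion, whose components determine it
print(suggest_measurement_model(True, False, False, False))   # -> "formative"
```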
MEASUREMENT MODEL MISSPECIFICATION IN STRATEGY RESEARCH In this section, we systematically review construct measurement model misspecification in contemporary strategy research. Our purpose is twofold. The first goal is to assess the extent to which measurement model misspecification exists in the strategy domain. The second goal of this review is to examine the relevance of formative measurement models to research and theory development in strategy. To accomplish these goals, we examined construct measurement models in a sample of recently published strategy studies. In our assessment, we focus on: (1) how the original authors modeled their constructs, and (2) what the conceptual and empirical evidence provided by the original authors suggests about the relationship between the constructs and their measures or dimensions.
Sample of Studies Included For the purposes of our review, we selected a sample of strategic management studies published in AMJ, ASQ, and SMJ over a recent ten-year period (1994–2003). These three journals were selected because they are the most prominent journals publishing empirical strategy research, and have been shown to be among the most influential journals in the field (Podsakoff et al., 2005; Tahai & Meyer, 1999). The period from 1994 to 2003 was chosen because we believe that articles published during this time frame would be representative of contemporary strategy research and recent developments in the field. Each issue of the focal journals was manually searched to identify all of the studies that measured constructs related to the strategic management domain. A study needed to meet three criteria to be included in the sample. First, the study had to focus on a strategy-related topic. Second, the study had to report the measures used to assess the construct(s). Finally, the article had to report the results of a confirmatory factor analysis for at least one of these constructs. The reason why we limited our analyses to studies that reported confirmatory factor analyses is that this technique or procedure explicitly requires the researchers to specify the nature of the relationship between constructs and their indicators. Our search led to the identification of 45 articles, which included a total of 257 constructs assessed by multiple measures. A list of the articles in the final sample is reported in the reference section at the end of this chapter.
Assessment of Measurement Model Misspecification To examine the extent of measurement model misspecification in the strategic management domain, we took a multistep approach. First, we recorded each construct, its measures, and the authors’ conceptual specification of the nature of the relationships between each construct and its measures (i.e., as formative or reflective). Following this, we assessed the relationships between the constructs and their measures using the four criteria described earlier. Generally speaking, constructs were categorized as reflective when the measures conceptually represented reflections or manifestations of the construct, the measures were expected to covary with each other, the measures were viewed as interchangeable, the omission of one of the measures was not expected to alter the conceptual domain of the construct, and the measures were expected to have the same antecedents and
consequences. In contrast, we categorized constructs as formative when the measures were viewed as determinants of the construct, the measures were not necessarily expected to covary with each other, the measures were not necessarily viewed as interchangeable, the omission of one of the measures could alter the conceptual domain of the construct, and the measures were not necessarily expected to have the same antecedents and consequences. Although the conceptual criteria discussed above served as the primary basis for our decision as to whether a construct should have been modeled as having formative or reflective indicators, once the conceptual nature of the construct was examined we also examined the data reported by the authors to determine whether there was any empirical evidence to suggest that the measures should have been modeled as formative or reflective indicators. We recorded measures as having low covariation if they exhibited any of the following conditions: (1) one or more of the inter-measure correlations were below 0.30, indicating that the measures shared less than 10% of their variance with each other; (2) one or more of the standardized factor loadings was below 0.70, indicating the amount of variance accounted for in one or more measures by its intended factor was less than 0.50; or (3) coefficient α was below 0.70. Researchers in the construct validation literature have argued that measures sharing less than 10% of their variance probably do not capture the same construct (Kline, 1998); that constructs should account for at least 50% of the variance in their reflective measures (Fornell & Larcker, 1981); and that reflective scales should exhibit a minimum internal consistency reliability of 0.70 to indicate that the measures covary at an adequate level (Nunnally & Bernstein, 1996). However, as indicated in our earlier discussion, our assessment was primarily based on the conceptual nature of the relationships between the measures and the constructs, and among the measures themselves. Empirical evidence of low covariation between the measures was largely used to support our conceptual assessment. Further, to be conservative in our assessment, we gave the measurement model specification presented by the original authors the "benefit of the doubt" when we were not sure about the conceptual relationship between the measures and the construct they were purported to measure.
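Two of the empirical screening rules just described (the inter-item correlation and coefficient α cut-offs) are easy to automate. The Python sketch below, written for hypothetical item data, flags a set of measures as exhibiting low covariation under those two rules; the factor-loading rule is omitted because it requires fitting a confirmatory factor model.

```python
import numpy as np

def low_covariation_flag(items, r_cut=0.30, alpha_cut=0.70):
    """Flag an (N x k) matrix of item scores as showing low covariation if any
    inter-item correlation falls below r_cut or Cronbach's alpha falls below alpha_cut."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    R = np.corrcoef(items, rowvar=False)
    upper = R[np.triu_indices(k, 1)]                      # inter-item correlations
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)  # Cronbach's alpha
    return {"min_r": upper.min(), "alpha": alpha,
            "low_covariation": bool(upper.min() < r_cut or alpha < alpha_cut)}

# Invented example: three essentially unrelated indicators
rng = np.random.default_rng(1)
items = rng.normal(size=(100, 3))
print(low_covariation_flag(items))
```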
The first two authors independently applied the above criteria to all the constructs in the sample, reaching agreement on 83% of the constructs after the initial assessment. A close examination of the constructs on which we disagreed indicated that almost all of the differences occurred as a result of how rigorously the raters applied the "benefit of the doubt" criterion in specific studies. For example, in some cases where only one of the four indicators of a construct was clearly different from the rest, one rater may have given the original authors the "benefit of the doubt" in his rating, whereas the other did not. After the raters exchanged opinions and debated the merits of both perspectives, they reached agreement on all the constructs. To check for the potential of unintended bias in our assessment, we asked two colleagues, one in strategy and one in OB, to perform independent assessments using the criteria. Their assessments agreed with our final assessment on 84% of the constructs, with the strategy colleague identifying 13 more constructs that he felt should have been modeled as having formative measures than our initial assessment did. This evidence suggests that our assessment was not unintentionally biased toward coding more constructs as having formative measures. We summarize our final assessment of measurement model (mis)specification in Table 1. Among the 257 constructs examined, a total of 233 (90.7%) were modeled as having reflective measures by their original authors, and only 24 constructs (9.3%) were modeled as having formative measures. According to our assessment, only 37.7% (28.4% plus 9.3%) of the constructs were modeled correctly, whereas the remaining 62.3% were modeled incorrectly. Of those constructs modeled correctly, 75% (73 of 97) were reflective constructs modeled correctly as having reflective measures, whereas the remaining 25% (24 of 97) were formative constructs modeled as having formative measures. However, all of the 160 mismodeled constructs (62.3% of the total sample of constructs) were inappropriately modeled as having reflective indicators when they should have been modeled as having formative indicators. A Pearson chi-squared test indicated that our assessment of the relationships between the constructs and their measures was significantly different from the specifications provided by the original authors (χ²(1) = 10.50, p < 0.01).
Table 1. Summary of Measurement Model (Mis)Specification in Strategy Research Published in AMJ, ASQ, and SMJ (1994-2003).

                                          Measurement Model by Our Assessment
Measurement Model by Original Authors     Reflective      Formative       Total
Reflective                                73 (28.4%)      160 (62.3%)     233 (90.7%)
Formative                                 0 (0.0%)        24 (9.3%)       24 (9.3%)
Total                                     73 (28.4%)      184 (71.6%)     257 (100%)

Note: Pearson χ²(1) = 10.50, p < 0.01.
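The chi-squared statistic reported in the note to Table 1 appears to correspond to the uncorrected Pearson statistic computed from the cell counts, and can be reproduced (up to rounding) with a few lines of Python. The snippet assumes SciPy is available and disables the Yates continuity correction, which SciPy otherwise applies to 2 x 2 tables by default.

```python
from scipy.stats import chi2_contingency

# Cross-classification from Table 1: rows = original authors' specification,
# columns = our assessment (reflective, formative).
table = [[73, 160],   # modeled as reflective by the original authors
         [0,   24]]   # modeled as formative by the original authors

chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(round(chi2, 2), dof, p)   # roughly 10.5 with 1 degree of freedom, p < 0.01
```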
Examples of Mismodeled Constructs in Strategy Research Our assessment of measurement models in the recent strategy literature suggests that almost three-quarters (71.6%) of the strategy constructs in our sample should have been modeled as having formative indicators. These findings indicate that the proportion of constructs with formative measures in the strategy literature is much higher than in other fields in which similar analyses have been conducted. For example, Jarvis et al. (2003) found that 31% of the marketing constructs they examined should have been modeled as having formative measures, and in their review of contemporary leadership research, Podsakoff, MacKenzie, Podsakoff, and Lee (2003) reported that 47% of the constructs in their sample should have been modeled as having formative measures. Therefore, contrary to the assertion of Slater and Atuahene-Gima (2004), these results suggest that constructs with formative measures are not only pervasive in strategic management research, but even more so than in some other fields of social science. Although it is not completely clear why constructs with formative measures are so prevalent in strategic management research, we believe that there are at least three possible reasons for this phenomenon. First, strategy constructs often capture abstract, complex, multidimensional concepts occurring at the organization, industry, or cultural level (e.g., firm performance, environmental uncertainty, cultural competencies, etc.). This is somewhat different from micro-organizational research that often focuses on fairly narrowly defined constructs at the individual level (e.g., employee intentions to stay/leave and/or specific attitudes about one's job or organization). Second, strategy researchers often rely on measures obtained from secondary sources, such as COMPUSTAT, the CRSP tapes, and/or company annual reports, to assess their focal constructs. Because many of these indicators are initially developed to describe distinct organizational characteristics (e.g., firm size, company history, business profile, management and governance structure, firm performance, etc.), they are often quite different and not easily interchangeable with each other. Third, because econometrics serves as one of the basic foundations of strategic management research, and because multicollinearity is one problem that econometric analysis tries to avoid, many strategy researchers may have been trained to create maximally different measures of their hypothetical constructs even when they model these constructs as having reflective indicators. To further draw the attention of strategy researchers to constructs having formative measures, Table 2 presents a sample of constructs from our review that we believe are representative of those that should have been modeled as
having formative indicators. The first two columns in the table contain the name of the construct and the name(s) of the author(s) of the study in which the construct was examined, and the third column reports the indicators used to measure the construct. The next four columns contain our assessment of whether: (a) the likely direction of causality is from the measures to the construct (formative) or from the construct to the measures (reflective), (b) the items appear to be conceptually interchangeable or not, (c) the indicators are expected to have the same nomological networks or not, and (d) the indicators are conceptually expected to covary with each other at a high level or not. The final column in the table (e) reports whether the empirical data provided by the researchers suggested that the indicators possessed low intercorrelations or not. Although a comprehensive discussion of all of the constructs included in Table 2 goes beyond the scope of this review, an examination of this table suggests that constructs having formative indicators are prevalent in some of the key areas of strategic management research. For example, this distinction may be particularly important for those interested in the resource-based view (RBV) of the firm, which has recently gained prominence in the field (Hoopes, Madsen, & Walker, 2003). According to the RBV, organizations can earn a sustainable competitive advantage if and only if they have superior resources and these resources are protected by some form of mechanism that prevents their diffusion throughout the industry. As a result, constructs such as firm assets (Spanos & Lioukas, 2001), specialized resources (Birkinshaw et al., 1998), organizational slack (Palmer & Wiseman, 1999), and social capital (Koka & Prescott, 2002) are of central importance in this approach. However, given the complexity involved in measuring these constructs, researchers rarely conceptualize them as having a single dimension. Instead, they conceptualize and operationalize these constructs, either explicitly or implicitly, as having multiple facets or dimensions. For example, Spanos and Lioukas (2001) assessed firm assets using three measures: organizational/managerial assets, marketing assets, and technical assets. In our view, these different types of assets are not interchangeable with each other, and omitting one or more of them would alter the conceptual domain of the construct. In addition, they would not necessarily covary with each other, since some firms that possess good technical assets will not possess good organizational/managerial assets. Finally, we believe that these different asset categories are likely to have different antecedents and consequences. Therefore, we believe that the firm asset construct should be modeled as a composite of the measures designed to assess the individual asset dimensions using a formative measurement model.
Table 2. Examples of Constructs with Formative Indicators Mismodeled as Constructs with Reflective Indicators in Strategy Research.

For each construct, the table reports the study in which it appeared, the indicators used to measure it, our coding of the four conceptual criteria (direction of causality, item interchangeability, similarity of nomological networks, and whether the items are expected to covary), and whether the empirical data reported by the original authors exhibited low item covariation. Every construct listed was coded "F" on direction of causality (i.e., more accurately conceptualized as being caused or determined by its indicators) and "No" on item interchangeability, similarity of nomological networks, and expected covariation; the low-item-covariation column varied by construct.

Constructs and studies included: Acquirer asset divestiture (Capron, Mitchell, & Swaminathan, 2001); Acquisition performance (Capron, 1999); Business network connection (Holm, Eriksson, & Johanson, 1999); CEO compensation (Geletkanycz, Boyd, & Finkelstein, 2001); Collaborative know-how (Simonin, 1999); Competitive rivalry (Spanos & Lioukas, 2001); Cross-functional integration (Song & Montoya-Weiss, 2001); Cultural competitiveness (Hult, Ketchen, & Nichols, 2002); Differentiation (Kotha & Vadlamani, 1995); Differentiation strategy (Homburg, Krohmer, & Workman, 1999); Environmental uncertainty (Li & Atuahene-Gima, 2002); External learning (Schroeder, Bates, & Junttila, 2002); External ties (Geletkanycz et al., 2001); Firm assets (Spanos & Lioukas, 2001); Firm performance (Tippins & Sohi, 2003); Formal control (Lin & Germain, 2003); IJV learning (Steensma & Lyles, 2000); Industry rivalry (Palmer & Wiseman, 1999); Innovative differentiation (Spanos & Lioukas, 2001); Institutional support (Li & Atuahene-Gima, 2002); Joint decision making (Subramani & Venkatraman, 2003); Learning intent (Tsang, 2002); Low-cost strategy (Homburg et al., 1999); Managerial discretion (Finkelstein & Boyd, 1998); Organizational learning (Tippins & Sohi, 2003); Partner fit: complementarity and compatibility (Kale, Singh, & Perlmutter, 2000); Perceived rate of change (May, Stewart, & Sweo, 2000); Positional advantage (Hult & Ketchen, 2001); Product market relatedness (Stimpert & Duhaime, 1997); Resource asymmetry (Capron et al., 2001); Slack (Palmer & Wiseman, 1999); Social capital (Koka & Prescott, 2002); Specialized resources (Birkinshaw, Hood, & Jonsson, 1998); Specificity (Simonin, 1999); Strategic importance (Tsang, 2002); Strategic planning (Boyd & Reuning-Elliott, 1998); Strategic similarity (Capron et al., 2001); Strategy (Spanos & Lioukas, 2001); Subsidiary leadership (Birkinshaw et al., 1998); Target asset divestiture (Capron et al., 2001); Uncertainty (Subramani & Venkatraman, 2003); Upper echelons (Palmer & Wiseman, 1999).

Notes: (a) Direction of causality was coded "F" to indicate that the construct is more accurately conceptualized as being caused or determined by the indicators. (b) Low item covariation was coded "Yes" if the indicators exhibited any one or more of the following conditions: (1) one or more of the inter-item correlations were below 0.30; (2) the amount of variance accounted for in some of the items by their intended factor was less than 0.50; or (3) α for the construct was below 0.70; and "No" otherwise.
Table 2 suggests that formative measurement models may also be important in the measurement of strategic planning processes (Boyd & Reuning-Elliott, 1998), as well as the specific types of strategies that firms use in their attempts to gain a competitive advantage (Homburg, Krohmer, & Workman, 1999; Kotha & Vadlamani, 1995). The low-cost strategy construct measured by Homburg et al. (1999) is a good example of this type of construct. According to these authors, low-cost strategy can be measured by the extent to which a business unit pursues: (a) operating efficiencies, (b) cost advantages in raw materials procurement, and (c) economies of scale. These measures were modeled as reflective indicators of the low-cost strategy construct. However, we do not believe that these measures are interchangeable, and removing one of them would probably alter the conceptual domain of the construct. Moreover, although Homburg et al. (1999) did not report low levels of covariation between these indicators, these indicators need not covary with each other because some firms may try to reduce costs through their raw materials procurement strategies, rather than through operating efficiency. Finally, the antecedents and consequences of these strategies would not necessarily be expected to be the same. Therefore, it would appear that these strategies are more appropriately viewed as formative indicators of a composite "low-cost strategy" construct. Other constructs in the strategy domain that we believe are more appropriately modeled as having formative indicators include several constructs in the strategic alliance and mergers and acquisitions areas. Steensma and Lyles' (2000) measure of international joint venture learning is a good example of this. These authors assessed this construct with five indicators: (a) new technology expertise, (b) new marketing expertise, (c) product development, (d) managerial development, and (e) manufacturing/production processes. Clearly, international joint venture learning is the result of these things, not the cause of them. In addition, these measures are not interchangeable, and eliminating one or more of them would alter the domain of the construct. Finally, it should be obvious that some of the things that firms would do to improve their technological expertise are dramatically different from the things that they would do to improve their marketing expertise. Other constructs in the strategic alliance and mergers and acquisitions domains that would appear to be formative in nature are Capron, Mitchell, and Swaminathan's (2001) acquirer asset divestiture construct, Capron's (1999) acquisition performance construct, Spanos and Lioukas's (2001) competitive rivalry construct, Subramani and Venkatraman's (2003) joint decision-making construct, Kale, Singh, and Perlmutter's (2000) partner fit construct, Tsang's (2002) strategic importance
construct, and Capron et al.'s (2001) strategic similarity and target asset divestiture constructs. Finally, many of the constructs in the areas of strategic leadership and corporate governance would also appear to be more appropriately modeled as having formative indicators, including CEO compensation (Geletkanycz, Boyd, & Finkelstein, 2001), external ties (Geletkanycz et al., 2001), managerial discretion (Finkelstein & Boyd, 1998), and subsidiary leadership (Birkinshaw et al., 1998). As an example of this, Finkelstein and Boyd (1998) defined managerial discretion as "the latitude of options top managers have in making strategic choices" (Finkelstein & Boyd, 1998, p. 179), and operationalized this construct using indicators of: (1) market growth, (2) R&D intensity, (3) advertising intensity, (4) demand instability, (5) capital intensity, (6) concentration (industry structure), and (7) regulation. Although these authors reported that, with the exception of demand instability, all of these measures loaded significantly on a single underlying factor, we think that there are several reasons why it makes more sense to view these measures as causes of discretion rather than results of it. First, we do not believe that many of these measures are interchangeable, and, consequently, omitting one or more of them would alter the conceptual domain of the construct. Second, we could imagine situations where managers feel that they have a considerable amount of discretion with respect to their decisions regarding R&D intensity, but have considerably less discretion in terms of industry structure (concentration). Indeed, Finkelstein and Boyd (p. 186) indicated that they were unsure whether industry structure would be positively or negatively related to discretion in their study. Third, none of the absolute values of the inter-item correlations between the six measures that Finkelstein and Boyd retained in their study to measure managerial discretion was higher than 0.30, and seven of the fifteen were 0.10 or below (Finkelstein & Boyd, 1998, p. 188). This suggests that many of the measures of managerial discretion shared less than 1% of their variance with each other. Finally, our belief that the managerial discretion construct is better viewed as being a result of the measures included in this study rather than a cause of them seems consistent with Finkelstein and Boyd's own view of this construct. In a footnote on page 179 of their article, these authors noted that: It is important to establish that the model of managerial discretion we developed and tested is carefully defined on the basis of its determinants. That is, it is the determinants of managerial discretion emanating from the task environment that are the focus of this study. Although it is conceivable to study discretion in terms of specific actions of managers, such an approach is problematic because it requires clear knowledge of
whether a specific strategic move is managerially driven or externally motivated. Without in-depth, clinical examination of managerial actions and the context surrounding those actions, such an approach to the study of discretion is quite problematic. By focusing on the determinants of discretion, it is possible to assess in a more objective manner the potential latitude of action facing a manager, an approach that also enables larger-sample studies of discretion and its consequences and that has been the norm in research on discretion to date …
Given that the ‘‘determinants’’ that Finkelstein and Boyd (1998) appear to be referring to here are the seven indicators they used to assess the managerial discretion construct, we believe that modeling this construct as having formative indicators is more consistent with their conceptual definition than modeling it as having reflective indicators.
COMMON ERRORS IN MEASUREMENT MODEL MISSPECIFICATION Despite the apparent prevalence of formative measurement models in strategy research, of the 45 articles we examined in this review, only five (Doz et al., 2000; Johansson & Yip, 1994; Olk & Young, 1997; Robins, Tallman, & Fladmoe-Lindquist, 2002; Tsang, 2002) clearly discussed the difference between formative and reflective measurement models, and modeled one or more of their constructs as having formative indicators. This finding led us to believe that relatively few strategy researchers are aware of the recent developments regarding formative indicator measurement models or understand the implications of these types of measurement models for their research. During our examination of the strategy literature, we identified four types of common errors leading to measurement model misspecification. We believe that identifying these errors can help strategy researchers better understand formative measurement models and employ them correctly. Therefore, a summary of the errors and their potential consequences is presented in Fig. 2.
Fig. 2. Illustrations of Four Common Errors Leading to Measurement Model Misspecification.
As indicated in Panel A of Fig. 2, the first type of error occurs when a researcher does not explicitly specify the dimensionality of his or her focal construct, but inappropriately models the indicators as reflections of an underlying latent variable. This error is represented in Fig. 2A by the nebulous "structure" between the theoretical construct and its indicators. The ambiguity in this situation results from the fact that the dimensionality of the construct is never clearly articulated by the researcher, and therefore the relationship between the construct and its measures is unclear. On the one hand, the construct appears to be composed of different facets or dimensions, although the researcher never clearly specifies the number and/or the distinguishing characteristics of these dimensions. On the other hand, the researcher treats the measures as capturing the same construct, and therefore inappropriately models them as reflective indicators of the focal construct. An example of this error occurred in the measurement of specialized resources by Birkinshaw et al. (1998). These authors were interested in how subsidiary companies are able to contribute to the firm-specific advantage of a multinational corporation (MNC) and defined specialized resources as "resources that offer the potential for the MNC's firm-specific advantage … [that are] … superior to those elsewhere in the organization, … [and receive] … recognition by corporate management" (Birkinshaw et al., 1998, pp. 224–225). This construct was measured by asking managers to rate their subsidiary's capabilities relative to other subsidiaries in the corporation in five different areas: innovation and entrepreneurship, managing international activities, manufacturing, marketing, and R&D (Birkinshaw et al., 1998). Although Birkinshaw and his colleagues recognized that these activities were different, they still modeled them as reflective measures of a subsidiary's specialized resources. Our reading of this study suggests that the five categories identified by Birkinshaw et al. (innovation and entrepreneurship, managing international activities, manufacturing, marketing, and R&D) are distinguishable from each other and collectively determine a firm's specialized resources. It is also clear that these researchers did not believe that the five areas were interchangeable. In support of this notion, results presented by these researchers suggested that the level of covariation between some of these measures was very low. For example, the correlation between a subsidiary's capability in manufacturing and its capability in marketing was only 0.06, and the correlation between capability in manufacturing and capability in managing innovation and entrepreneurship was only 0.09. Generally, this indicates that these pairs of "reflective" measures share less than 1% of their variance with each other. Therefore, both conceptual considerations and empirical evidence suggest that a formative measurement model would be more appropriate than the reflective measurement model employed by the authors to measure the specialized resources construct. The second type of error occurs when a researcher explicitly conceptualizes a construct as having multiple distinguishable dimensions that are not interchangeable with each other, develops specific measures to assess each of these
dimensions, but then models these dimensions as reflective measures of a second-order construct. This type of error is illustrated in Fig. 2B by the fact that the distinguishable facets identified by a researcher are then modeled as first-order reflective dimensions of the higher-order construct. Because the construct is conceptualized as a composite of its multiple dimensions, the dimensions should be modeled as formative indicators of the higher-order construct. In cases in which they are not modeled in this manner, the higher-order construct will not capture the total variance in its dimensions, but rather only the variance that is common to all of the dimensions. This will essentially lead to a deficiency in the measurement of the higher-order construct. An example of this type of error occurred in the measurement of social capital by Koka and Prescott (2002). Recognizing that social capital is central to the study of inter-firm relationships, Koka and Prescott carefully specified the construct's domain. Defining it in terms of the information benefits available to a firm due to its strategic alliances, these authors convincingly argued that social capital (as a second-order construct) comprises three distinct first-order dimensions: information volume, information diversity, and information richness. This conceptualization of social capital as a multidimensional construct was supported by their results, which indicated that the three dimensions had different effects on firm performance. Although these authors pointed out the need to treat these dimensions as "separate but integral components of the social capital construct" (Koka & Prescott, 2002, p. 798), their structural equation model specified information volume, information diversity, and information richness as three reflective measures of social capital. However, given the theoretical and empirical support the authors provided for three separate and distinct dimensions of social capital, we believe that this construct should have been modeled as a composite construct having three first-order formative dimensions. The third type of error is slightly more complicated. In this case, a researcher conceptualizes the focal construct as multidimensional in nature, develops items to assess each of these dimensions, but then models all the items as reflective measures of the overall construct rather than of its dimensions. This type of error is illustrated in Fig. 2C by the fact that the distinguishable facets of the higher-order construct identified by the researcher are essentially ignored when the construct is inappropriately modeled as having first-order reflective indicators. An example of this error occurred in the measurement of partner fit by Kale, Singh, and Perlmutter (2000). Kale and his colleagues conceptualized partner fit as determined by both complementarity in capabilities and compatibility in organizational
culture between two partner firms. They measured this construct with four items: (1) high complementarity between the resources/capabilities of the partners; (2) high similarity/overlap between the core capabilities of each partner; (3) the compatibility of the organizational cultures of the partners; and (4) the compatibility of the management and operating styles. These four measures were modeled as reflective indicators of partner fit. In our assessment, the first two items are reflective measures of complementarity, the last two items are reflective measures of compatibility, and complementarity and compatibility are distinct components of partner fit. In this situation, one solution would be to retain a single measure for complementarity and compatibility respectively, and then model partner fit as a composite of these two remaining measures. However, we believe that a more appropriate solution is to model partner fit as a second-order construct with first-order formative measures represented by its two dimensions (complementarity and compatibility), which would each be modeled as having reflective first-order indicators. In essence, this would result in a second-order construct that would be formed by first-order dimensions modeled as having reflective indicators. The first three types of error discussed above generally occur during the conceptualization stage of the development of a theoretical construct. However, the last type of error we identified (Fig. 2D) occurs when a researcher uses traditional construct validation techniques based on classical test theory to assess measures that should be modeled as formative in nature, but instead are modeled as reflective in nature. In this case, the researcher is probably not aware of the existence of formative measurement models and the effect that these traditional construct validation procedures may have on the measurement of his or her construct. As a result, (s)he follows standard guidelines for validating reflective indicator measurement models and assumes that the indicators of a construct should be highly correlated with each other, load on the same factor, and (when taken together as a scale) should possess high internal consistency reliability. Therefore, when some of his or her measures possess low factor loadings and/or low intercorrelations with other measures of the ‘‘same’’ construct, he or she concludes that these measures are not ‘‘valid’’ and should be eliminated from the scale. This is illustrated in Fig. 2D by the dotted lines that represent the removal of Items 1 and 3 because of their low covariation with Item 2. In this case, the researcher only retains Item 2 because he or she believes that it is the best representation of the focal construct. However, as we mentioned earlier, when the relationship between the measures and the construct is formative in nature, factor loadings and internal consistency reliabilities are
not appropriate criteria for validation. Therefore, removing measures based on these metrics will likely lead to a deficiency in the measurement of the focal construct. An example of this type of error occurred in the measurement of subsidiary leadership by Birkinshaw et al. (1998). In their study, Birkinshaw and his colleagues developed a measure of subsidiary leadership using three items: ‘‘(the) subsidiary’s history of strong, internationally respected leaders; the credibility of the leadership with head office managers; and the leadership’s effort at developing middle management’’ (1998, p. 231). They report the correlations between these items to be ‘‘moderate,’’ with one as low as 0.19. In response, Birkinshaw and his colleagues decided to drop the last two items, leaving the construct of subsidiary leadership to be measured by only one item. In our assessment, the three items developed by Birkinshaw et al. capture different aspects of subsidiary leadership effectiveness, are not interchangeable, and are likely to have different antecedents and consequences. As a result, the strength of a subsidiary’s leadership probably is better measured as a composite of these three items, and dropping the last two of them will likely create construct deficiency. In addition, the identification of this error may provide an explanation of the repeated use of single-item measures in the strategy field (Boyd et al., 2005). In some cases, researchers or reviewers may eliminate measures that do not meet traditional construct validation standards when, in fact, these items are representative of distinct dimensions of a construct that should be modeled as having formative indicators.
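To make the shared-variance arithmetic and the validation issue concrete, the following sketch uses hypothetical indicator ratings (the variable names and data are invented purely for illustration, not taken from the studies discussed above). Under a reflective model, high inter-item correlations and high internal consistency are expected; under a formative model, these same statistics are not appropriate validity criteria, so low values are not grounds for dropping items.

```python
import numpy as np
import pandas as pd

# Hypothetical ratings on three indicators of a construct (1-7 scale); invented data.
rng = np.random.default_rng(0)
items = pd.DataFrame({
    "item1": rng.integers(1, 8, size=200),
    "item2": rng.integers(1, 8, size=200),
    "item3": rng.integers(1, 8, size=200),
}).astype(float)

# Inter-item correlations and shared variance (r squared):
# e.g., r = 0.06 implies r^2 = 0.0036, i.e., less than 1% shared variance.
corr = items.corr()
print(corr.round(2))
print((corr ** 2).round(3))

def cronbach_alpha(df: pd.DataFrame) -> float:
    """Internal consistency of a summed scale (meaningful only for reflective indicators)."""
    k = df.shape[1]
    item_vars = df.var(axis=0, ddof=1)
    total_var = df.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

print(f"Cronbach's alpha = {cronbach_alpha(items):.2f}")

# Interpretation: low correlations and a low alpha argue against treating the items as
# interchangeable reflective measures; if the items are formative indicators, these
# statistics are not validity criteria and items should not be dropped because of them.
```

The specific numbers are beside the point; the logic is that these diagnostics presuppose a reflective measurement model and are therefore misleading when applied to formative indicators.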
FIRM PERFORMANCE AS A CONSTRUCT HAVING FORMATIVE INDICATORS In our discussion up to this point, we have identified several constructs in the field of strategy that we believe should be modeled as having formative indicators. As noted earlier, many of these constructs (e.g., firm assets, specialized resources, social capital, strategic planning, organizational learning, managerial discretion, etc.) play an important role in the current strategy literature. However, we believe that perhaps one of the best examples of a strategy construct that should be modeled as having formative indicators is firm performance. Firm performance is arguably the most important construct in the field of strategic management (cf. Barney, 2002; Schendel & Hofer, 1979;
Venkatraman & Ramanujam, 1986). Indeed, Venkatraman and Ramanujam (1986) state that firm performance is at the very heart of strategic management research, because ‘‘most strategic management theories either implicitly or explicitly underscore performance implications, since performance is the time test of any strategy’’ (p. 802). However, there is little consensus regarding how best to measure this construct. As a result, many measures have been developed and used in the previous literature (Barney, 2002). A comprehensive review of these measures, as well as their relationships with each other, is beyond the scope of our paper. Nor do we believe that we could develop a ‘‘perfect’’ measure of firm performance that would be applicable to all strategy researchers interested in examining this construct. Rather, our objective in this section is to use firm performance as an example to illustrate how strategy researchers should choose between formative and reflective measurement models. As in the case of the development of any construct valid measure, the first question a researcher must ask himself or herself is: How do we define the construct? Unfortunately, there is little consensus regarding the definition of firm performance. For strategy researchers who have a strong background in economics and/or finance, firm performance often refers to how efficiently or effectively a firm utilizes its assets in generating profits. However, for strategy researchers who take a stakeholder perspective, or those who are trained in sociology, psychology, and/or political science, this is a narrow definition because it only focuses on financial performance and ignores the social performance of the firm, such as its impact on the well-being of employees, customers, suppliers, and the environment (Harrison & Freeman, 1999). In our definition, we take a stakeholder perspective and treat firm performance as including dimensions of both financial and social performance. Fig. 3 provides a very general conceptual depiction of this perspective. Once the conceptual definition of firm performance has been established, the next step is to identify the relationship between the focal construct (firm performance) and its dimensions (financial and social performance). Essentially, the question to be addressed at this stage in the process is whether financial and social performance are interchangeable manifestations or distinct dimensions of firm performance. Although some researchers have reported a positive relationship between social and financial performance (e.g., Waddock & Graves, 1997), we do not believe that these dimensions are conceptually interchangeable with each other or that they necessarily should covary at a high level. Indeed, this conceptual perspective is consistent with the recent meta-analysis conducted by Orlitzky, Schmidt, and Rynes (2003), who reported that the corrected correlations between social and financial
performance ranged from 0.15 to 0.36, depending on how these dimensions were measured.
Fig. 3. General Conceptual Model of Firm Performance Dimensions and Measures. [Figure: Firm Performance is formed by Financial Performance and Social Performance. Financial Performance comprises Accounting Performance (reflective measures: ROA, ROI, ROS) and Stock-Market Performance (reflective measures: Jensen's Alpha, Sharpe's Measure, Treynor's Measure). Social Performance comprises Customer Performance (Customer Satisfaction, Customer Retention), Employee Performance, Environmental Performance, and Philanthropic Performance.]
In addition, we believe that the antecedents and consequences of the financial and social performance dimensions are likely to be different. For example, it is not hard to imagine that some of the factors that may increase the financial performance of a firm (e.g., decreasing costs through product standardization or outsourcing customer service to a foreign country) may actually decrease customer satisfaction (e.g., because they reduce product attributes and/or customers' perceptions of service quality), which is a component of social performance. Similarly, it is also not hard to imagine that factors that increase employee satisfaction and job security (e.g., compensation, benefits, lifetime employment, etc.), and subsequently enhance the social performance of the firm, may in some cases actually decrease the financial performance of the firm. Therefore, we conceptualize financial performance and social performance as representing different facets of firm performance that are not interchangeable with each other. To capture the entire variance of a firm's performance, a formative measurement model needs to be applied. Of course, the financial and social performance dimensions are themselves constructs, and therefore it is necessary to determine whether they should be modeled as having formative or reflective indicators. In strategy research, most measures of financial performance fall into one of the following two groups: accounting measures of profitability such as return-on-assets (ROA), return-on-investment (ROI), and return-on-sales (ROS);2 and measures of stock market performance such as shareholder return and Jensen's alpha (Hoskisson, Hitt, Wan, & Yiu, 1999). Although some researchers have reported high correlations between different accounting measures of profitability (e.g., Delios & Beamish, 1999) or between different stock market-based measures (e.g., Hoskisson, Johnson, & Moesel, 1994), there is no convincing evidence that these two groups of measures are generally highly correlated. For example, in a recent study, Gentry and Shen (2005) examined the convergence between measures of accounting performance and stock market performance using the entire COMPUSTAT database as their sample. These authors found that none of the correlation coefficients between shareholder return and accounting measures of profitability exceeded 0.25. They further examined the convergence of these measures at the 2-digit and the 4-digit SIC levels, and found that the average industry-wide correlation coefficients at these two industry levels were all below 0.30. Therefore, given the evidence in the literature, we believe that accounting profitability and stock market performance are not interchangeable manifestations of financial performance. Instead, they represent distinct
aspects of financial performance, and should be modeled as such. We provide a general model depicting this notion in Fig. 3, which we will discuss in greater depth below. To assess accounting performance, we elected to use three widely employed measures, ROA, ROI, and ROS. Because these measures are often viewed as fairly equivalent manifestations of profitability and are highly correlated, we treat them as reflective measures of accounting performance. Similarly, because measures such as Jensen’s alpha and the Sharpe and Treynor measures are commonly used to assess firm performance in the stock market and are also highly correlated (Hoskisson et al., 1994), we modeled these measures as reflective indicators of stock market performance. Therefore, in our assessment, accounting and stock market performance each have reflective measures, and are themselves formative dimensions of a firm’s financial performance. When accounting profitability and stock market performance are each assessed by a single measure (e.g., ROA and shareholder return, respectively), the measures are assumed to be perfect proxies of the underlying constructs. In this situation, financial performance is simply assessed through a formative measurement model using the measures of ROA and shareholder return in place of accounting profitability and stock market performance. However, as we have noted before, we do not endorse the use of single-item measures when multiple items of a construct are available. Because social performance is related to the well-being of multiple stakeholders, we conceptualize it as a composite multidimensional construct, with the well-being of each stakeholder group serving as a formative dimension. For illustration, Fig. 3 identifies four dimensions of social performance: customer, employee, environmental, and philanthropy. According to this conceptualization, we need to first measure each of the above four dimensions and then create a composite measure of social performance using a formative measurement model. Certainly, each of the four dimensions should also be assessed through appropriate reflective or formative measures, as in the measurement of the accounting profitability and stock market performance dimensions. For example, customer performance might be assessed by multiple formative dimensions including customer satisfaction and customer retention, which would each have their own reflective measures. After both financial performance and social performance are assessed through appropriate measurement models, a formative measurement model should be applied to create the composite, higher-order construct, called firm performance.
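One way to see the hierarchy sketched in Fig. 3 is as a sequence of scoring steps. The sketch below is a minimal illustration under stated assumptions: the column names are hypothetical, the reflective blocks are scored as simple standardized averages (a stand-in for factor scores), and the formative combinations use equal weights purely for illustration; in practice the formative weights would be estimated (e.g., via MIMIC or PLS models) rather than fixed.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 100

# Hypothetical firm-level measures; names follow Fig. 3 but the data are invented.
df = pd.DataFrame({
    "roa": rng.normal(0.05, 0.03, n),
    "roi": rng.normal(0.07, 0.04, n),
    "ros": rng.normal(0.06, 0.05, n),
    "jensens_alpha": rng.normal(0.0, 0.02, n),
    "sharpe": rng.normal(0.5, 0.3, n),
    "treynor": rng.normal(0.4, 0.3, n),
    "cust_satisfaction": rng.normal(4.0, 0.8, n),
    "cust_retention": rng.normal(0.8, 0.1, n),
    "employee_perf": rng.normal(3.5, 0.7, n),
    "environ_perf": rng.normal(3.0, 0.9, n),
    "philanthropy": rng.normal(2.5, 1.0, n),
})
z = (df - df.mean()) / df.std(ddof=0)  # standardize all measures

# Reflective blocks: indicators treated as interchangeable, so a simple average is used.
accounting = z[["roa", "roi", "ros"]].mean(axis=1)
stock_market = z[["jensens_alpha", "sharpe", "treynor"]].mean(axis=1)

# Formative combinations (equal weights here only for illustration).
customer = (z["cust_satisfaction"] + z["cust_retention"]) / 2
financial = 0.5 * accounting + 0.5 * stock_market
social = (customer + z["employee_perf"] + z["environ_perf"] + z["philanthropy"]) / 4
firm_performance = 0.5 * financial + 0.5 * social

print(firm_performance.describe())
```

The design choice to hold the weights fixed is what makes this a sketch rather than an estimated measurement model; the point is only to show how reflective scoring and formative aggregation occur at different layers of the hierarchy.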
As pointed out earlier, the measurement of firm performance described above is just an example of how to develop constructs having formative measures.3 We are well aware that other researchers may have different definitions of firm performance or different conceptions of its dimensionality. Regardless of these differences, the key point is to have a clear definition of the construct, to carefully conceptualize its dimensionality, and then to develop measures and choose appropriate measurement models to assess the construct. In our review of the literature, we found that several authors used multiple items to measure firm performance (e.g., acquisition performance in Capron, 1999; firm performance in Delios & Beamish, 1999 and Tippins & Sohi, 2003; accounting and stock market performance in Hoskisson et al., 1994). When the measures are conceptualized as interchangeable, are expected to covary at a high level, are expected to have similar antecedents and consequences, and omitting one of them would not be expected to alter the conceptual domain of the construct, a reflective measurement model is appropriate. However, when the indicators appear to be measuring different aspects of firm performance, are not expected to be highly correlated with each other, and removing one or more of them would be expected to change the conceptual domain of the construct, a formative measurement model should be applied.
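The decision criteria in the preceding paragraph can be condensed into a simple checklist. The sketch below is only a mnemonic device, not a substitute for the conceptual work; the function name and the yes/no framing of the questions are our own paraphrase of the criteria discussed above.

```python
def suggest_measurement_model(interchangeable: bool,
                              high_expected_covariation: bool,
                              similar_antecedents_consequences: bool,
                              dropping_changes_domain: bool) -> str:
    """Heuristic summary of the reflective-vs-formative criteria discussed in the text."""
    reflective_signals = [
        interchangeable,
        high_expected_covariation,
        similar_antecedents_consequences,
        not dropping_changes_domain,
    ]
    if all(reflective_signals):
        return "reflective measurement model"
    if not any(reflective_signals):
        return "formative measurement model"
    return "mixed signals: revisit the construct's conceptual domain and dimensionality"

# Example: indicators that tap distinct facets whose omission would narrow the construct.
print(suggest_measurement_model(
    interchangeable=False,
    high_expected_covariation=False,
    similar_antecedents_consequences=False,
    dropping_changes_domain=True,
))  # -> formative measurement model
```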
CONCLUSIONS AND IMPLICATIONS FOR THE FIELD OF STRATEGY Over two decades have passed since Venkatraman and Grant (1986), building on the work of Schwab (1980), noted that it is important for strategy researchers to establish the construct validity of their measures before they try to establish the substantive validity of their constructs. Although much progress has been made in this regard during the past twenty years, recent developments in measurement theory (Bollen & Lennox, 1991; Diamantopoulos & Winklhofer, 2001; Edwards & Bagozzi, 2000; Jarvis et al., 2003; Law et al., 1998; Law & Wong, 1999; MacCallum & Browne, 1993; MacKenzie et al., 2005) indicate that strategy researchers need to pay particular attention to whether their constructs should be conceptualized as having formative or reflective indicators. Our review of the strategy domain suggests that constructs having formative indicators are prevalent in the contemporary literature, and represent some of the most important constructs in the field. However, we also found that many strategy researchers
are not aware of formative measurement models, and consequently misspecify their constructs. We believe that our findings have three important implications for the field. First, strategy researchers need to pay more attention to the manner in which they conceptualize their constructs. Generally speaking, if they view their construct as having distinguishable facets that are not interchangeable and do not covary at a high level with each other, and if removing one of these facets would change the conceptual domain of the construct, the construct should be conceptualized as formative in nature. In contrast, if researchers view the measures of their construct as interchangeable, expect them to covary with each other at a high level, and believe that removing one or more of these measures would not change the conceptual domain of the construct, then they should conceptualize their construct as being reflective in nature. This may require additional work on the part of strategy researchers to make sure that they clarify the conceptual domain of their constructs. Second, it is important for strategy researchers to recognize that establishing the validity of constructs having formative indicators will require techniques that are different from those required for establishing the validity of constructs having reflective indicators. It is apparent from our review that many strategy researchers are familiar with techniques for establishing the validity of constructs assumed to have reflective indicators. Indeed, we found that many researchers regularly use internal consistency reliability estimates, confirmatory factor models, and tests of convergent and discriminant validity in order to examine the construct validity of their measures. However, it is also apparent from our review that fewer strategy researchers are familiar with the techniques and procedures required to validate constructs having formative indicators. Although a comprehensive discussion of these techniques goes beyond the scope of our paper, we recommend that those who are interested in developing and validating measures for constructs having formative indicators refer to the articles by Diamantopoulos and Winklhofer (2001) and MacKenzie et al. (2005). Our final recommendation is that reviewers familiarize themselves with the conceptual and technical issues surrounding formative measurement models. As gatekeepers of the field, reviewers greatly influence the quality of the research published in the literature. By learning about the distinctions between formative and reflective measurement models, they can more effectively critique the (improper) modeling of constructs in empirical research. Further, they can more effectively explain the problems inherent in measurement model misspecification and offer constructive recommendations to authors regarding how to improve the measurement and modeling of their constructs.
These efforts by reviewers, as well as by authors, will greatly enhance research quality and facilitate theory development in the strategy domain. In conclusion, we hope that the issues discussed in this article can help researchers better choose between formative and reflective measurement models, increase the validity of the constructs they employ, and finally improve the quality of theory development and empirical research in the field of strategy.
NOTES 1. Formative measures have also been referred to as causal measures, and reflective measures as effect measures (e.g., Bollen & Lennox, 1991; MacCallum & Browne, 1993). Following Edwards and Bagozzi (2000), we prefer the terms formative and reflective because these terms can also be used to describe the relationship between a multidimensional construct and its dimensions, as we point out later. 2. Some people may argue that ROA, ROI, and ROS represent different facets of accounting profitability, and therefore, should be modeled as formative indicators of this construct. In the strictest sense, this may be accurate. However, for our purposes, we believe that each one of these indicators represents a measure of the profitability of the firm in accounting terms, and therefore we have treated them as reflective measures of accounting performance in our example. 3. It is important to note that the model shown in Fig. 3 is not identified, as it is illustrated. However, our point in this section is to provide an illustrative example of how firm performance might be modeled as a construct having formative indicators, not to discuss the technical requirements necessary to identify the measurement model. Those interested in such a discussion are advised to refer to MacCallum and Browne (1993), Jarvis et al. (2003), or MacKenzie et al. (2005).
ACKNOWLEDGMENTS We are grateful to Bert Cannella, Jason Colquitt, Richard Gentry, Javier Gimeno, and Tim Judge for their comments on earlier drafts of this manuscript. Certainly, we are responsible for all of the arguments and any errors that remain.
REFERENCES Barney, J. B. (2002). What is performance? Gaining and sustaining competitive advantage (2nd ed.). Upper Saddle River: Prentice-Hall. Birkinshaw, J., Hood, N., & Jonsson, S. (1998). Building firm-specific advantages in multinational corporations: The role of subsidiary initiative. Strategic Management Journal, 19, 221–241.
Bollen, K. A. (1984). Multiple indicators: Internal consistency or no necessary relationship? Quality and Quantity, 18, 377–385. Bollen, K. A. (1989). Structural equations with latent variables. New York, NY: Wiley. Bollen, K. A., & Lennox, R. (1991). Conventional wisdom on measurement: A structural equation perspective. Psychological Bulletin, 110, 305–314. Boyd, B. K., Gove, S., & Hitt, M. A. (2005). Construct measurement in strategic management research: Illusion or reality? Strategic Management Journal, 26, 239–257. Boyd, B. K., & Reuning-Elliot, E. (1998). A measurement model of strategic planning. Strategic Management Journal, 19, 181–192. Capron, L. (1999). The long-term performance of horizontal acquisitions. Strategic Management Journal, 20, 987–1018. Capron, L., Mitchell, W., & Swaminathan, A. (2001). Asset divestiture following horizontal acquisitions: A dynamic view. Strategic Management Journal, 22, 817–844. Delios, A., & Beamish, P. W. (1999). Geographic scope, product diversification, and the corporate performance of Japanese firms. Strategic Management Journal, 20, 711–727. Diamantopoulos, A., & Winklhofer, H. M. (2001). Index construction with formative indicators: An alternative to scale development. Journal of Marketing Research, 38, 269–277. Doz, Y. L., Olk, P. M., & Ring, P. S. (2000). Formation processes of R&D consortia: Which path to take? Where does it lead? Strategic Management Journal, 21, 239–266. Edwards, J. R. (2001). Multidimensional constructs in organizational behavior research: An integrative analytical framework. Organizational Research Methods, 4, 144–192. Edwards, J. R., & Bagozzi, R. P. (2000). On the nature and direction of relationships between constructs and measures. Psychological Methods, 5, 155–174. Finkelstein, S., & Boyd, B. K. (1998). How much does the CEO matter? The role of managerial discretion in the setting of CEO compensation. Academy of Management Journal, 41, 179–199. Fornell, C., & Bookstein, F. L. (1982). Two structural equation models: LISREL and PLS applied to consumer exit – voice theory. Journal of Marketing Research, 19, 440–452. Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research, 18, 39–50. Fornell, C., Lorange, P., & Roos, J. (1990). The cooperative venture formation process: A latent variable structural modeling approach. Management Science, 36, 1246–1255. Geletkanycz, M. A., Boyd, B. K., & Finkelstein, S. (2001). The strategic value of CEO external directorate networks: Implications for CEO compensation. Strategic Management Journal, 22, 889–898. Gentry, R. J., & Shen, W. (2005). The relationship between accounting and market performance measures: In search of convergence. Working paper. University of Florida. Harrison, J. S., & Freeman, R. E. (1999). Stakeholders, social responsibility, and performance: Empirical evidence and theoretical perspectives. Academy of Management Journal, 42, 479–485. Hitt, M. A., Gimeno, J., & Hoskisson, R. E. (1998). Current and future research methods in strategic management. Organizational Research Methods, 1, 6–44. Holm, D. B., Eriksson, K., & Johanson, J. (1999). Creating value through mutual commitment to business network relationships. Strategic Management Journal, 20, 467–486. Homburg, C., Krohmer, H., & Workman, J. P. (1999). Strategic consensus and performance: The role of strategy type and market-related dynamism. Strategic Management Journal, 20, 339–357.
Hoopes, D. G., Madsen, T. L., & Walker, G. (2003). Why is there a resource-based view? Toward a theory of competitive heterogeneity. Strategic Management Journal, 24, 903–927. Hoskisson, R. E., Hitt, M. A., Wan, W. P., & Yiu, D. (1999). Theory and research in strategic management: Swings of a pendulum. Journal of Management, 25(3), 417–456. Hoskisson, R. E., Johnson, R. A., & Moesel, D. D. (1994). Corporate divestiture intensity in restructuring firms: Effects of governance, strategy, and performance. Academy of Management Journal, 37, 1207–1251. Hult, G. T. M., & Ketchen, D. J., Jr. (2001). Does market orientation matter? A test of the relationship between positional advantage and performance. Strategic Management Journal, 22, 899–906. Hult, G. T. M., Ketchen, D. J., Jr., & Nichols, E. L., Jr. (2002). An examination of cultural competitiveness and order fulfillment cycle time within supply chains. Academy of Management Journal, 45, 577–586. Jarvis, C. B., MacKenzie, S. B., & Podsakoff, P. M. (2003). A critical review of construct indicators and measurement model misspecification in marketing and consumer research. Journal of Consumer Research, 30, 199–218. Johansson, J. K., & Yip, G. S. (1994). Exploiting globalization potential: U.S. and Japanese strategies. Strategic Management Journal, 15, 579–600. Kale, P., Singh, H., & Perlmutter, H. (2000). Learning and protection of proprietary assets in strategic alliances: Building relational capital. Strategic Management Journal, 21, 217–237. Kline, R. B. (1998). Principles and practice of structural equation modeling. New York: The Guilford Press. Koka, B. R., & Prescott, J. E. (2002). Strategic alliances as social capital: A multidimensional view. Strategic Management Journal, 23, 795–816. Kotha, S., & Vadlamani, B. L. (1995). Assessing generic strategies: An empirical investigation of two competing typologies in discrete manufacturing industries. Strategic Management Journal, 16, 75–83. Law, K., & Wong, C. S. (1999). Multidimensional constructs in structural equation analysis: An illustration using the job perception and job satisfaction constructs. Journal of Management, 25, 143–160. Law, K., Wong, C. S., & Mobley, W. H. (1998). Toward a taxonomy of multidimensional constructs. Academy of Management Review, 23, 741–755. Li, H., & Atuahene-Gima, K. (2002). The adoption of agency business activity, product innovation, and performance in Chinese technology ventures. Strategic Management Journal, 23, 469–490. Lin, X., & Germain, R. (2003). Organizational structure, context, customer orientation, and performance: Lessons from Chinese state-owned enterprises. Strategic Management Journal, 24, 1131–1151. MacCallum, R. C., & Browne, M. W. (1993). The use of causal indicators in covariance structure models: Some practical issues. Psychological Bulletin, 114, 533–541. MacKenzie, S. B., Podsakoff, P. M., & Jarvis, C. B. (2005). The problem of measurement model misspecification in behavioral and organizational research and some recommended solutions. Journal of Applied Psychology, 90, 710–730. May, R. C., Stewart, W. H., & Sweo, R. (2000). Environmental scanning behavior in a transitional economy: Evidence from Russia. Academy of Management Journal, 43, 403–427.
Mitchell, T. R., Holtom, B. C., Lee, T. W., Sablynski, C. J., & Erez, M. (2001). Why people stay: Using job embeddedness to predict voluntary turnover. Academy of Management Journal, 44, 1102–1121. Nachum, L. (2003). Liability of foreignness in global competition? Financial service affiliates in the city of London. Strategic Management Journal, 24, 1187–1208. Nunnally, J. C., & Bernstein, I. M. (1996). Psychometric theory (3rd ed.). New York, NY: McGraw-Hill. Olk, P., & Young, C. (1997). Why members stay in or leave an R&D consortium: Performance and conditions of membership as determinants of continuity. Strategic Management Journal, 18, 855–877. Orlitzky, M., Schmidt, F. L., & Rynes, S. L. (2003). Corporate social and financial performance: A meta-analysis. Organization Studies, 24, 403–442. Palmer, T. B., & Wiseman, R. M. (1999). Decoupling risk taking from income stream uncertainty: A holistic model of risk. Strategic Management Journal, 20, 1037–1062. Podsakoff, P. M., MacKenzie, S. M., Bachrach, D. G., & Podsakoff, N. P. (2005). The influence of management journals in the 1980s and 1990s. Strategic Management Journal, 26, 473–488. Podsakoff, P. M., MacKenzie, S. M., Podsakoff, N. P., & Lee, J. Y. (2003). The mismeasure of man(agement) and its implications for leadership research. Leadership Quarterly, 14, 615–656. Porter, M. E. (1980). Competitive strategy. New York: Free Press. Robins, J. A., Tallman, S., & Fladmoe-Lindquist, K. (2002). Autonomy and dependence of international cooperative ventures: An exploration of the strategic performance of US ventures in Mexico. Strategic Management Journal, 23, 881–901. Schendel, D. E., & Hofer, C. W. (1979). Strategic management: A new view of business policy and planning. Boston, MA: Little, Brown. Schroeder, R. G., Bates, K. A., & Junttila, M. A. (2002). A resource-based view of manufacturing strategy and the relationship to manufacturing performance. Strategic Management Journal, 23, 105–117. Schwab, D. P. (1980). Construct validity in organizational behavior. In: L. L. Cummings & B. Staw (Eds), Research in organizational behavior (Vol. 2, pp. 3–42). Greenwich, CT: JAI Press. Shook, C. L., Ketchen, D. J., Hult, T. M., & Kacmar, K. M. (2004). An assessment of the use of structural equation modeling in strategic management research. Strategic Management Research, 25, 397–404. Simonin, B. L. (1999). Ambiguity and the process of knowledge transfer in strategic alliances. Strategic Management Journal, 20, 595–623. Slater, S. F., & Atuahene-Gima, K. (2004). Conducting survey research in strategic management. In: D. J. Ketchen & D. D. Bergh (Eds), Research methodology in strategy and management (Vol. 1, pp. 227–249). Boston, MA: Elsevier. Song, M., & Montoya-Weiss, M. M. (2001). The effect of perceived technological uncertainty on Japanese new product development. Academy of Management Journal, 44, 61–80. Spanos, Y. E., & Lioukas, S. (2001). An examination into the causal logic of rent generation: Contrasting Porter’s competitive strategy framework and the resource-based perspective. Strategic Management Journal, 22, 907–934. Steensma, H. K., & Lyles, M. A. (2000). Explaining IJV survival in a transitional economy through social exchange and knowledge-based perspectives. Strategic Management Journal, 21, 831–851.
Stimpert, J. L., & Duhaime, I. M. (1997). In the eyes of the beholder: Conceptualizations of relatedness held by managers of large diversified firms. Strategic Management Journal, 18, 111–125. Subramani, M. R., & Venkatraman, N. (2003). Safeguarding investments in asymmetric interorganizational relationships: Theory and evidence. Academy of Management Journal, 46, 46–62. Tahai, A., & Meyer, M. J. (1999). A revealed preference study of management journals' direct influences. Strategic Management Journal, 20, 279–296. Tippins, M. J., & Sohi, R. S. (2003). IT competency and firm performance: Is organizational learning a missing link? Strategic Management Journal, 24, 745–761. Tsang, E. W. K. (2002). Acquiring knowledge by foreign partners from international joint ventures in a transition economy: Learning-by-doing and learning myopia. Strategic Management Journal, 23, 835–854. Venkatraman, N., & Grant, J. H. (1986). Construct measurement in organizational strategy research: A critique and proposal. Academy of Management Review, 11, 71–87. Venkatraman, N., & Ramanujam, V. (1986). Measurement of business performance in strategy research: A comparison of approaches. Academy of Management Review, 11, 801–814. Waddock, S. A., & Graves, S. B. (1997). The corporate social performance-financial performance link. Strategic Management Journal, 18, 303–320. Williams, L. J., Gavin, M. B., & Hartman, N. S. (2004). Structural equation modeling methods in strategy research: Applications and issues. In: D. J. Ketchen & D. D. Bergh (Eds), Research methodology in strategy and management (Vol. 1, pp. 303–346). Boston, MA: Elsevier.
FURTHER READING Andersson, U., Forsgren, M., & Holm, U. (2002). The strategic impact of external networks: Subsidiary performance and competence development in the multinational corporation. Strategic Management Journal, 23, 979–996. Atuahene-Gima, K. (2003). The effects of centrifugal and centripetal forces on product development speed and quality: How does problem solving matter? Academy of Management Journal, 46, 359–373. Baum, J. R., & Wally, S. (2003). Strategic performance decision speed and firm performance. Strategic Management Journal, 24, 1107–1129. Busenitz, L. W., Gomez, C., & Spencer, J. W. (2000). Country institutional profiles: Unlocking entrepreneurial phenomena. Academy of Management Journal, 43, 994–1003. Hambrick, D. C., & Finkelstein, S. (1987). Managerial discretion: A bridge between polar views of organizational outcomes. Research in Organizational Behavior, 9, 369–406. Hopkins, W. E., & Hopkins, S. A. (1997). Strategic planning-financial performance relationships in banks: A casual examination. Strategic Management Journal, 18, 635–652. Kostova, T., & Roth, K. (2002). Adoption of an organizational practice by subsidiaries of multinational corporations: Institutional and relational effects. Academy of Management Journal, 45, 215–233. Locke, E.A. (2003). Good definitions: The epistemological foundation of scientific progress. In: J. Greenberg (Ed.), Organizational behavior: The state of the science (2nd ed., pp. 415–444). Mahwah, NJ: Lawrence Erlbaum.
McEvily, B., & Zaheer, A. (1999). Bridging ties: A source of firm heterogeneity in competitive capabilities. Strategic Management Journal, 20, 1133–1156. Murtha, T. P., Lenway, S. A., & Bagozzi, R. P. (1998). Global mind-sets and cognitive shift in a complex multinational corporation. Strategic Management Journal, 19, 97–114. Schilling, M. (2002). Technology success and failure in winner-take-all markets: The impact of learning orientation, timing, and network externalities. Academy of Management Journal, 45, 387–398. Steensma, H. K., & Corley, K. G. (2001). Organizational context as a moderator of theories on firm boundaries for technological sourcing. Academy of Management Journal, 44, 271–291. Tsai, W., & Ghoshal, S. (1998). Social capital and value creation: The role of interfirm networks. Academy of Management Journal, 41, 464–476. Wally, S., & Baum, R. J. (1994). Personal and structural determinants of the pace of strategic decision making. Academy of Management Journal, 37, 932–956. Yli-Renko, H., Autio, E., & Sapienza, H. J. (2001). Social capital, knowledge acquisition, and knowledge exploitation in young technology-based firms. Strategic Management Journal, 22, 587–613.
INDIVIDUALS AND ORGANIZATIONS: THOUGHTS ON A MICRO-FOUNDATIONS PROJECT FOR STRATEGIC MANAGEMENT AND ORGANIZATIONAL ANALYSIS Teppo Felin and Nicolai Foss ABSTRACT Making links between micro and macro levels has been problematic in the social sciences, and the literature in strategic management and organization theory is no exception. The purpose of this chapter is to raise theoretical issues in developing micro-foundations for strategic management and organizational analysis. We discuss more general problems with collectivism in the social sciences by focusing on specific problems in extant organizational analysis. We introduce micro-foundations to the literature by explicating the underlying theoretical foundations of the origins of individual action and interaction. We highlight opportunities for future research, specifically emphasizing the need for a rational choice programme in management research.
Research Methodology in Strategy and Management, Volume 3, 253–288 Copyright © 2006 by Elsevier Ltd. All rights of reproduction in any form reserved ISSN: 1479-8387/doi:10.1016/S1479-8387(06)03009-8
INTRODUCTION Traditionally, some of the most troublesome issues in the social sciences have been those that relate to analytical levels and units of analysis (Machlup, 1967; Klein, Dansereau, & Hall, 1994). Notably, for more than 100 years, economics (e.g., Menger, 1883; Hayek, 1952; Arrow, 1951; Dosi, 1995), sociology (e.g., Durkheim, 1962; Lazarsfeld & Menzel, 1970; Coleman, 1990), and the philosophy of science (Popper, 1957; Satz & Ferejohn, 1994) have witnessed a debate as to whether individuals ("micro") or social collectives ("macro") have explanatory primacy. This debate has raged under the label of "methodological individualism" versus "methodological collectivism." The issue and debate carry very substantial theoretical and explanatory implications; for example, what are the relations between micro and macro levels? Do we always need to invoke micro-level explanatory mechanisms when trying to explain some macro-level phenomenon? Is it legitimate to rely on aggregate constructs as part of the explanans, or are these only present in the explanandum of an explanatory structure? Although it is surely possible to conceptually separate the methodological domain ("How should theories be constructed and evaluated?") from the ontological ("What exists in the world?"), the issue and the debate also carry substantial philosophical implications, and furthermore, very different philosophical positions may be, and have been, invoked to defend the respective positions. Further related questions include: what is the ontological status of aggregate social entities (e.g., organizations)? In what sense can organizations be said to exist independently of individuals? Is it meaningful to ascribe intention and actions to organizations? Clearly, these issues are also of immense relevance for theory-building in organizational theory and strategic management. However, while considerable recent attention is being paid to "levels issues," "multiple level analysis" and the like in management research (e.g., Klein et al., 1994; Dansereau, Yammarino, & Kohles, 1999), strategic management has seen few, if any, efforts to reconcile micro- and macro-levels, or more generally, efforts to build micro-foundations (see Coff, 1999; Lippman & Rumelt, 2003a, 2003b; Foss & Foss, 2005).1 There are three possible reasons for this. First, one may speculate that the lack of micro-foundations is perhaps based on an implicit agreement that such discussions are best left at the level of the base disciplines (e.g., psychology at the individual level). That is, it can be argued that strategic management and organization theory is by definition a collective or firm-level discipline, and thus the key questions of interest
should be pursued at this level (without consideration for other levels). Or, second and conversely, the inherently pluralistic nature of management studies may preclude building specific micro-foundations, given the potential equivalency assigned to all levels. That is, no particular level deserves emphasis, and each level (for example, individual, organization, network, or industry) offers a different (though, as we will discuss, contradictory) and complementary approach to thinking about organizations and their performance. Or, third and finally, the empirically driven character of strategic management perhaps crowds out this sort of methodological, theoretical, and philosophical inquiry. Whatever the reason for the lack of a "micro-foundations project" in strategic management, and organizational analysis more generally, this paper is written with the belief that the time has come for opening a discussion that cuts across strategy, organization theory and management, and which centres on many of the issues that have been at stake in the classical discussions in social science and philosophy concerning levels and units of analysis (cf. Felin & Hesterly, 2007). Specifically, to broadly foreshadow both our underlying premise and conclusion, this essay begins with the following increasingly forgotten assumption (Felin & Foss, 2005): organizations are made up of individuals, and there is no organization without individuals. There certainly seems to be nothing quite as elementary; yet this elementary truth has been lost in the increasing focus on structure, routines, capabilities, culture, institutions, and various other collective conceptualizations, with rather negative consequences for theory-building, empirical work, and managerial practice. Our contribution, then, is to highlight the need to explain the individual-level origins, or micro-foundations, of collective structures as they arise from individual action and interaction, while extant work seems to take organization, and structure more generally, for granted. Furthermore, we delineate extant work in a succinct framework and highlight promising future directions that take individual-level considerations, or micro-foundations, seriously. More generally, we outline three pillars of a rational choice programme for management, by highlighting theoretical work in sociology and economics, in hopes of providing promising methodological and theoretical foundations for the next generation of management theory. We should note that our efforts here are generally complementary to quite recent calls for "micro-foundations," and we specifically build further on this research project (e.g., Felin & Foss, 2005; Foss & Foss, 2005; Felin & Hesterly, 2007).2 However, at the risk of repeating what has
been published and said elsewhere, we focus on new issues and extensions that need further consideration and clarification. The unique contribution of this chapter is to offer a systematic and thorough exposition, building on Coleman’s (1990) classic meta-theoretical discussion and his overall framework, specifically in applying the framework to the notion of organizational capabilities and highlighting extensions by way of calling for a rational choice programme for strategic management and organization theory. This chapter is structured as follows. We begin by first briefly highlighting problems in the present work on organizational capabilities by providing a general conceptual figure illustrating critical questions, which remain unanswered, specifically the lack of micro-foundations and causal mechanisms (Organizational Capabilities: A Lack of Micro-foundations). As the general problems highlighted in this first section have recently, quite extensively, been discussed elsewhere (specifically see Felin & Hesterly, 2007; Felin & Foss, 2005), the section will remain purposefully succinct to avoid further repetition. Next, we position the above discussion and associated problems (lack of micro-foundations and causal mechanisms) in the context of a meta-theoretical discussion utilizing the general model of social explanation developed by the sociologist James Coleman (1990). We consistently apply this framework throughout this paper (Metatheory: Mechanisms, Explanation and Analytical Levels). Our first use of the Coleman model is to show in which sense the recent emphasis on structure, capabilities, and the more general taken-for-grantedness of ‘‘organization,’’ in management studies is a manifestation of methodological collectivism and why this is problematic (Organizational Capabilities and Methodological Collectivism). The second use of the Coleman model is to develop implications for research into the building of explicit micro-foundations for routines and capabilities. We explicate important research issues in the building of micro-foundations for organization-level phenomena and outline desiderata for such theory-building. We furthermore argue that theory-building should be founded on rational choice theory rather than on theories of individual behaviour, which is reactive, routinized, etc. We specifically highlight three key pillars of a rational choice research programme, namely (1) rationality, (2) choice, and (3) causal mechanisms; thus providing the rough and preliminary foundations for a much-needed rational choice research programme for management research. Some concluding observations are made on building micro-foundations (Building Foundations: Implications for Research) and on revisiting strategy’s disappearing mandate (Conclusion).
ORGANIZATIONAL CAPABILITIES: A LACK OF MICRO-FOUNDATIONS Strategy scholars have converged on organizational capabilities as the key construct for the understanding of firm-level heterogeneity and performance (Eisenhardt & Martin, 2000; Winter, 2003).3 As suggested by Fig. 1, extant work has focused on making the link between organizational or firm-level capabilities and collective-level outcomes (see dotted box in Fig. 1). A central argument of capabilities-based work is that organizational routines or capabilities are the fundamental units of analysis, and that the organization should be conceptualized as the central repository of routines and capabilities (e.g., Nelson & Winter, 1982). The extant organizational capabilities approach is explicit about the need to focus on the collective as the key level of analysis, which for us highlights the need for micro-foundations, providing us with an important point of departure from extant work and the unique contribution of this chapter. As a brief example of the present emphasis on the collective, Zollo and Winter (2002) explicitly define capabilities as a "learned and stable pattern of collective activity" (Zollo & Winter, 2002, p. 340, emphasis added). The emphasis on collective constructs has led to a corresponding neglect of the levels of individual action and interaction (Felin & Hesterly, 2007; Felin & Foss, 2005).
Fig. 1. Extant Work on Organizational Capabilities. [Figure: individual actions, firm-level capabilities, and firm-level outcomes; extant work (the dotted box) links firm-level capabilities to firm-level outcomes, leaving open both the mechanisms aggregating individual actions to capabilities and the mechanisms linking capabilities to firm-level outcomes.]
As Argote and Ingram (2000, p. 156; emphasis added) lamented, to the extent that there has been progress in studying knowledge as the foundation of competitive advantage, "... it has been at the level of identifying consistencies in organizations' knowledge development paths and almost never at the level of human interactions that are the primary source of knowledge and knowledge transfer." However, neglecting this "primary source" means that we (strategic management and organizational scholars, as well as managers who act on the prescriptions of the capabilities view) miss potentially important individual-level information. Put differently, while links between organizational capabilities and outcomes have been made in extant work (see Fig. 1), critical questions about the underlying individual-level foundations and origins of these capabilities remain unanswered. Two specific problems are worth highlighting in light of Fig. 1 (matters presently remaining outside of the dotted box). First, while extant work has tended to, in effect, round out individuals from the analysis (Felin & Hesterly, 2007), can we theorize individual-level origins for collective constructs, in this case capabilities? Second, an absolutely fundamental problem has arisen in the form of various theoretical "black boxes" in our theories. Namely, how exactly do individual actions, abilities, and choices aggregate up to the collective level? What are the underlying causal (and social) mechanisms? These and other questions provide the general premise for our subsequent discussion. While our efforts here require a meta-theoretical (and perhaps metaphysical) detour of sorts into some key philosophical and ontological matters, we hope the reader will tolerate this, as these underlying questions are quite fundamental to strategic management and organization theory, and in large part have been neglected.
METATHEORY: MECHANISMS, EXPLANATION AND ANALYTICAL LEVELS Mechanisms and the Individual Level Any theoretical (and consequently empirical) effort to explain an organizational phenomenon (the explanandum) has to make a choice (though the choice is often made only implicitly) that concerns the level at which explanation takes place, that is, the analytical level at which the important components of the explanans are located. A classic distinction in social science research is made between the collective and the individual level. This question, of course, has divided disciplines, and even led to some of social
science’s most infamous in-fighting (Hayek, 1943; Popper, 1957) in the form of the opposition between methodological individualism and methodological collectivism. Nonetheless the questions and concerns remain equally valid and applicable to not only social science in general, but strategic management and organizational studies, specifically. We here take the position that the explanation of collective phenomena must ultimately be grounded in explanatory mechanisms that involve individual action and interaction, that is, methodological individualism (cf. Hayek, 1952; Ullman-Margalitt, 1978; Elster, 1989; Elster, 1998; Boudon, 1998a).4 Methodological individualism has been defended in numerous ways. For example, it is sometimes defended by invoking a deeper argument of ontological individualism, according to which only individuals, and not collectives, are acting entities. While we have sympathy with this argument, our argument is rather epistemological. We take it to be the ultimate aim of scientific endeavours in the social science domain to identify and theorize the causal social mechanisms – the ‘‘cogs and wheels’’ (Elster, 1989, p. 3) – that generate and explain observed associations between events (Bhaskar, 1978; Hedstrom & Swedberg, 1998). We associate this view with scientific and methodological realism (Harre´, 1970; Bhaskar, 1978; Foss, 1994). It differs from the traditional covering-law model of explanation of Carl Hempel and others, because the covering-law model does not imply an insistence on identifying genuine causality. In contrast, causality is central in a mechanisms approach. Thus, to the mechanism-oriented social scientist, the discovery of how human action and interaction causally produce collective level phenomena is what science is all about (e.g., Cowan & Rizzo, 1996). We also take it to be implicit in a mechanism-oriented scientific inquiry that explanatory black boxes be eschewed in principle. The modifier is necessary, because one reason for allowing some black boxes to enter explanation is explanatory parsimony (Hedstrom & Swedberg, 1998, p. 12; also see Coleman, 1990, p. 16).5 Economists and strategic management scholars perform somewhat related explanatory operations when they construct firmlevel arguments. This type of shorthand can be permitted, as long as it is understood that it is nothing more than pragmatic shorthand. The problem in contemporary strategic management is that it is much too often forgotten that, for example, talk of firm-level capabilities is precisely that – explanatory shorthand for underlying individual-level action and interaction. Or, even more worrying, capabilities and similar constructs imply a break with the methodological individualism that we see as fundamentally aligned with a mechanism-based approach to explanation and an endorsement of
methodological collectivism that is at variance with a focus on causal and social mechanisms in explanation.
A GENERAL MODEL OF SOCIAL SCIENCE EXPLANATION To clarify and put notions such as methodological individualism and collectivism or ‘‘individual-level’’ and ‘‘collective level explanation,’’ into a broader and consistent perspective, consider Fig. 2 which builds on the insightful conceptual model of the sociologist James Coleman (1990); also see Coleman (1986). Coleman’s conceptual model to our knowledge is the best and most concise effort to capture meta-theoretical matters and problems relating to micro-macro relations, and levels more generally, and thus provides an apt unifying tool for our purposes (cf. Abell, 2003). The framework provides the general overall organizing focal point for much of our discussion, and will be briefly explicated. The figure begins from a distinction between the macro-level and the micro-level. For example, the macro-level may be the organizational level, while the micro-level is the level of individuals. (Higher levels, such as networks, could perhaps also be discussed in conjunction with this framework, though for our purposes we focus on the individual as the micro-level and the organization as the macro-level). As shown, there are links between macro-macro (arrow 4) and macro-micro (arrow 1 and 5), micro-micro (arrow 2), and micro-macro (arrow 3). The figure also makes a distinction, between what is to be explained (i.e., the explanandum, or, dependent variable) and that with which explanation takes place (the explanans, or, independent variables), as indicated by the direction of the arrows. Ultimately, the aim is to be able to explain some macro-level phenomenon Social fact Macro
Fig. 2. Coleman Diagram. (Macro level: social fact → social fact; micro level: individuals; arrows 1–5 link the levels.)
(placed at the top-right corner). In order to do so, the analyst must make use of the theoretical mechanisms implied by the arrows. For example, he can use explanans that involve arrows 1, 2, and 3, which would be the choice of the rational choice social scientist. Or, he can make use of arrow 4 or 5, which would be the choice of the structural-functionalist sociologist.6 Coleman's model implicitly builds on the mechanism-oriented approach to social science explanation that we have very briefly discussed (and will further explicate below). The arrows in Fig. 2 are, as it were, boxes that may be filled out with the theoretical mechanisms that the analyst chooses. To put some brief explanatory meat on the bare bones of this framework, consider using the Coleman framework to analyze the effects of an increase in the price of a certain resource on the use of that resource. This type of analysis could begin with arrow 1, that is, from the impact that the economy-wide price of the relevant resource has on the decision situations of individual agents. The rise in price affects the economic conditions that decision makers face, and this gives rise to various adaptations; notably, less of the now more expensive resource will be used by the individual (arrow 2), as well as at the (collective) level of the economy (arrow 3). However, whether the decision-makers react in a maximizing or a boundedly rational manner is not something the framework is explicit about. And whether the adjustments that result in less of the input take place through smooth rational processes of adaptation or through selection processes (i.e., those firms that do not use less of the now more expensive resource become less profitable and eventually are weeded out) is not something the framework says anything about either. Thus, the framework in itself imposes few restrictions on the particular mechanisms that the analyst can choose. However, we argue that one crucial restriction can immediately be placed on the framework. Specifically, while the framework does indeed feature an arrow 4 – that is, a mechanism or set of mechanisms located solely at the collective, macro level that makes no contact whatsoever with the level of individual action – this arrow is included purely for illustrative purposes (i.e., for illustrating the explanatory approach of collectivist streams in social science). While we have no problems with placing the explanandum on the collective level, we reject the attempt to place all of the explanans on that level. The reason for this is that there are no conceivable mechanisms in the social domain that operate solely on the collective level. There simply are no mysterious macro-level entities directly producing macro-level outcomes. While this is how we may reason for purposes of shorthand (e.g., "aggregate investments drove much of last quarter's growth in GNP," "industry conditions determined the strategy of this strategic group," etc.), macro
conditions (i.e., the top-left corner) only impact macro outcomes (the top-right corner) indirectly, namely through mechanisms 1, 2, and 3 (we suppress arrow 5 for the moment). The implication is that if an explanandum in the social science context is placed on the macro-level (top-right corner), explaining it means choosing explanans that make use of one of these three sets of arrows (and their underlying mechanisms): [3], [2,3], or [1,2,3]. This implication is, however, not reflected in much sociological work, nor, unfortunately, in much work in management.
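To make the arrow 1–2–3 chain concrete, the following minimal sketch (in Python) simulates the resource-price example above: a macro-level price change enters each agent's decision situation (arrow 1), each agent adapts individually (arrow 2), and aggregating the individual adaptations yields the macro-level outcome (arrow 3). The decision rule, parameter names, and numbers are purely illustrative assumptions on our part, not part of Coleman's framework.

```python
# Minimal illustrative sketch (not from the chapter): the arrows 1-2-3 chain
# of the Coleman diagram, using the resource-price example in the text.
# All parameter names and numbers are assumptions made for illustration.

import random

random.seed(42)

def individual_demand(price, willingness_to_pay):
    """Arrow 2: an individual's adaptation to the price it faces.
    The agent uses less of the resource as the price approaches what
    it is willing to pay (a deliberately simple decision rule)."""
    return max(0.0, 1.0 - price / willingness_to_pay)

def aggregate_use(price, agents):
    """Arrows 1 and 3: the macro-level price enters each agent's decision
    situation (arrow 1); summing the resulting individual choices yields
    the macro-level outcome (arrow 3)."""
    return sum(individual_demand(price, w) for w in agents)

# Heterogeneous individuals: each has a different willingness to pay.
agents = [random.uniform(1.0, 5.0) for _ in range(1000)]

use_before = aggregate_use(price=1.0, agents=agents)
use_after = aggregate_use(price=2.0, agents=agents)  # macro-level price rise

# The macro association "higher price -> lower aggregate use" is produced
# entirely by individual action and interaction; no arrow-4 shortcut is used.
print(f"aggregate use at p=1.0: {use_before:.1f}")
print(f"aggregate use at p=2.0: {use_after:.1f}")
```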
Examples: Collectivism in Management Research

Much social science research, including research in strategic management and organizational analysis, has, however, rather unfortunately, operated on this level (that is, at the macro-macro, or arrow 4, level), with previous "social facts" determining future social facts. These social facts have taken on various labels – culture, structure, environment, institutions, and so forth – and are generally highlighted as the key independent variables strongly determining overall collective (and individual) outcomes (e.g., behaviour and learning). Much of sociology indeed explicitly builds on models which operate only at the social fact and collective level (cf. Boudon, 2004).7 For example, institutional theory heavily emphasizes the role of exogenous, taken-for-granted institutions in determining organizational structure (Meyer & Rowan, 1977). The causal modelling explicitly focuses only on arrow 4-type explanations; thus, collective myths (in the form of, for example, pre-existing structures) are adopted given the need to establish legitimacy; in short, social facts determine social facts. Or, to put it differently, overwhelming environmental uncertainty leads organizations to imitate the structures and strategies of others and to take on the form of their immediate environment (DiMaggio & Powell, 1983). Another example of macro-macro type explanations is much of the capabilities approach in strategic management (Kogut & Zander, 1992; Teece, Pisano, & Shuen, 1997), and more specifically the work originating in organizational routines (Nelson & Winter, 1982; cf. Felin & Foss, 2006). This literature asserts that performance differences between firms are driven by efficiency differences that may somehow be ascribed to collective constructs, such as routines, capabilities, competencies, and the like (also see Henderson & Cockburn, 1994). In all of this, virtually no reference is made to individual action and interaction (as noted by Felin & Hesterly, 2007, individuals in fact essentially are assumed to be homogeneous and randomly distributed). In
fact, while Nelson and Winter (1982, p. 72) make a metaphorical link between individual skills and organizational routines, this analysis explicitly remains at the level of a metaphor, and more generally dispenses with individual-level explanation. Nelson and Winter of course recognize that they are making strong abstractions in emphasizing the collective level (over the individual), and even highlight that they are not completely happy with the choice (1982, p. 83), but nonetheless, in their present form, capabilities remain defined as collective-level constructs, independent of individuals (Zollo & Winter, 2002). More generally, there are two sets of theoretical mechanisms linking individuals, capabilities, and firm-level outcomes that are unaccounted for in the capabilities approach as presently conceptualized (see Fig. 3). As indicated by the figure, neither the mechanisms linking individual actions to capabilities, nor the mechanisms linking capabilities to organizational-level outcomes, have been satisfactorily identified. The former mechanism concerns how capabilities originate from individual actions and interaction (cf. Felin & Foss, 2005). We are not aware of a single contribution to strategic management or organizational theory that convincingly shows how organization-level capabilities may emerge from individual actions and interaction. A similar problem arises in connection with the link from capabilities to organization-level outcomes, such as competitive advantage. Superior routines or capabilities or dynamic capabilities are somehow postulated to be the direct cause of such outcomes. Such notions are labels for sequences of individual actions that are postulated to fit into patterns that are organization-specific and repetitive (Dosi, Marengo, & Bassanini, 1999). However, all the emphasis is on the postulated pattern rather than on why individual actions taken together should produce patterns (and continue to produce and reproduce such patterns). To understand this would require that analysis begin from considering the individual, for
Fig. 3. Extant Capabilities Work Fitted into the Coleman Diagram. (Nodes: organizational routine, organizational capability, individual; arrows 1–5; annotation: present lack of micro-foundations.)
example, looking at what it is that incentivizes or motivates individuals to take those actions that are consistent with routinized behaviour. However, this issue is deliberately suppressed by Nelson and Winter (they simply assume that routines represent truces – 1982, p. 107), and by subsequent writers in the capabilities view. Effectively, individual action is black boxed in these approaches. If pressed on the issue of the missing individual, proponents of these kinds of approaches will often formally recognize that individuals of course are present. However, the response effectively takes place by introducing a fifth explanatory mechanism, pictured as arrow 5 in Fig. 2. This goes directly from collective structures, institutions, etc. to individual behaviour.8 Thus, individuals are formally recognized, but they are portrayed as mere "actors" that embody the role that is dictated by the collective-level environment (as in structural functionalism). From a methodological individualist perspective, this does not amount to building satisfactory micro-foundations, as any individual-level considerations are superseded by such factors as culture and environment, routines, capabilities, etc. That is, while individuals and managers certainly do get mentioned in the theoretical development, the assumption is nevertheless that heterogeneity in collective context, environment, and situation drives organizational-level as well as individual-level outcomes (if indeed the latter mechanism is considered at all). In short, in collectivist approaches, micro-foundations are missing, as questions such as where structure originates in the first place (arrow 3, Figs. 2–3), why certain collective structures are adopted, and what the underlying individual-level considerations are (arrows 2 and 3) remain unanswered in theorizing that remains solely at the collective level. Institutional theories are an apt example of arrow 1, 4, and perhaps 5-type explanations, and they have also increasingly been utilized in strategic management research (e.g., Lounsbury & Glynn, 2001; Oliver, 1997); they are nonetheless only one example of this type of explanation. While we are admittedly painting with a quite broad brush, behavioural models more generally also place a similarly strong causal emphasis on various higher-level facts (generally, "the environment") determining lower-level individual behaviour (for quite explicit links between sociological and behavioural theories see e.g., Homans, 1974 or Hummell & Opp, 1968). The key independent variables and causal factors of behavioural models include "the environment," various external stimuli, or more generally cultural arguments. For example, in Cyert and March's classic Behavioral Theory of the Firm, significant emphasis is given to the environment as the determining factor of organizational (and implicitly individual) behaviour (also see
Bromiley, 2005), and further behavioural antecedents are found in Simon's work, which argues that most individual and organizational behaviour can be traced to some kind of "environmental stimulus" (March & Simon, 1958, p. 139). March and Simon (1958, pp. 141–142) further place emphasis not only on environmental stimuli but also on situations and associated programmatic and routine behaviour in organizations. However, the older behaviouralist school (i.e., March and Simon, Cyert and March) emphasizes the starting point of the individual (thus giving consideration to arrows 1 and 2, though perhaps not 3), even if this was an individual strongly influenced by environmental stimuli. We are concerned that this emphasis has vanished almost entirely in more recent work, as behavioural approaches seem to have moved much more towards a strong environmental (and related cultural) argument (Greve, 2003). More generally, our arguments in terms of the Coleman diagram and its applications, both to strategic management specifically and to organizational analysis more generally, are captured in Table 1. Table 1 is simply a summary of our arguments, and we will use it as a further organizing tool for this section, though we will not refer to it subsequently. To further digest and summarize some of the above discussion: theories of organization that give strong emphasis to arrows 1, 4, and 5 generally do not give causal consideration to lower levels – in fact, they may even completely disregard them – as lower levels are taken to be inherently determined by higher-level causes. An apt way to summarize the theoretical emphasis in arrow 1-type explanation is the notion of "organizations as strong situations" (Davis-Blake & Pfeffer, 1989; also see Pfeffer, 1997). Thus, contextual and environmental factors receive significant emphasis as the predominant theoretical engines. The general questions of interest focus on determining what these contextual factors are and how they cause individual behaviour. Cultural arguments most obviously are emphasized in this type of analysis. Arrow 1, 4, and 5 type explanations are so pervasive in both strategic management and organizational analysis that any questioning of their validity may seem suspect to most scholars, but we will nonetheless argue that the emphasis in these areas is misplaced, as micro-level explanations inherently are more theoretically sound. Or, to make the preceding sentence less controversial, it seems that arrow 3-type, micro-foundational explanation has been neglected relative to macro-foundational work. While we have emphasized research in organizational analysis, which gives emphasis to arrow 1, 4, and 5 type explanations, strategic management similarly makes strong assumptions about the need to place emphasis on explanation at these higher levels. For example, organizational culture is one explanatory variable often cited
Table 1. Key Dimensions and Considerations for Individual-Organization Links.

Arrow 1 (macro-micro):
  Key terms: Culture, environment, organization/collective, situation, institutions
  Causality: Macro-micro
  Theoretical mechanisms: Socialization
  Key question(s): What are key contextual and environmental factors determining behavior? What is the role of culture in determining outcomes?
  Behavioural vs. Rational emphasis: Behavioural
  Representative article: Henderson and Cockburn (1994)

Arrow 4 (macro-macro):
  Key terms: Evolution and environmental adaptation
  Causality: Macro-macro
  Theoretical mechanisms: Environmental selection
  Key question(s): How does the organization respond to environmental pressures? What are the routines and programs of the organization?
  Behavioural vs. Rational emphasis: Behavioural
  Representative article: Cyert and March (1963)

Arrow 2 (micro-micro):
  Key terms: Interaction and exchange
  Causality: Micro-micro
  Theoretical mechanisms: Behavior
  Key question(s): What are the underlying origins of interaction – why, with whom, etc.? Is there an underlying, behavioural interaction-factor ("socialization" and resulting "emergence"), or is it simply an artefact of underlying characteristics of individuals?
  Behavioural vs. Rational emphasis: Behavioural
  Representative article: March (1991)

Arrow 3 (micro-macro):
  Key terms: Choice, latent characteristics, ability/skills, self-selection
  Causality: Micro-macro
  Theoretical mechanisms: Rational design, invisible hand mechanisms, exchange
  Key question(s): How do aggregate structures, institutions, etc., emerge from individual action and interaction? What is the process of sorting or self-selection into organizations?
  Behavioural vs. Rational emphasis: Rational choice
  Representative article: Future research
in the resource-based view as a key source of competitive advantage (e.g., Barney, 1986). More generally, in Henderson and Cockburn (1994), an article which has been highlighted as one of the key (both theoretical and empirical) works in the resource-based view (Barney, 2001; for further discussion see Felin & Hesterly, 2007), explicit emphasis is placed on various organizational factors as the determinants of individual-level outcomes (that is, arrow 1-type explanation). Much of the literature on organizational capabilities also places heavy emphasis on higher levels such as networks and alliances as the key determinants of capabilities and outcomes (Dyer & Singh, 1998). Thus, in strategic management there is similarly a strong emphasis on arrow 4 and 5 type explanations. Finally, knowledge, capabilities, and associated resource-based arguments have also placed some emphasis on arrow 2 type explanations. Arrow 2 explanations focus on individual interaction and associated outcomes. For example, March (1991), in his classic simulation of organizational learning, focuses on interaction and socialization as the key determinants of individual and, consequently, organizational learning. While interaction and socialization certainly seem like plausible theoretical mechanisms, how and why exactly interaction or socialization causes learning remains unaddressed.
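To make the kind of arrow 2 mechanism at issue more tangible, the following is a deliberately simplified sketch in the spirit of March's (1991) mutual learning simulation. The update rules, parameter names, and values are our simplifications for illustration and should not be read as March's exact specification; the point is merely to show socialization and interaction operating as explicit, inspectable mechanisms rather than as labels.

```python
# A deliberately simplified sketch in the spirit of March (1991): individuals
# learn from an organizational code (socialization) while the code learns from
# the individuals who currently know more than it does. Parameter names,
# values, and update rules are our simplifications, not March's specification.

import random

random.seed(1)

M, N = 30, 50          # belief dimensions, number of individuals
P_SOCIALIZE = 0.3      # rate at which individuals adopt the code's beliefs
P_CODE_LEARN = 0.5     # rate at which the code adopts the majority view

reality = [random.choice([-1, 1]) for _ in range(M)]
code = [0] * M                                        # organization starts ignorant
people = [[random.choice([-1, 0, 1]) for _ in range(M)] for _ in range(N)]

def knowledge(beliefs):
    """Fraction of dimensions on which beliefs match reality."""
    return sum(b == r for b, r in zip(beliefs, reality)) / M

for period in range(80):
    # Socialization: individuals move towards the code where it has a view.
    for person in people:
        for d in range(M):
            if code[d] != 0 and random.random() < P_SOCIALIZE:
                person[d] = code[d]
    # Code learning: the code moves towards the majority belief of the
    # individuals who currently outperform it.
    superior = [p for p in people if knowledge(p) > knowledge(code)]
    for d in range(M):
        if superior and random.random() < P_CODE_LEARN:
            votes = sum(p[d] for p in superior)
            if votes:
                code[d] = 1 if votes > 0 else -1

print(f"code knowledge:       {knowledge(code):.2f}")
print(f"avg member knowledge: {sum(knowledge(p) for p in people) / N:.2f}")
```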
SOME REASONS WHY MICRO-FOUNDATIONS ARE CRITICAL

The above discussion and present arguments in strategic management and organization theory raise more general methodological questions about where analysis should start. Our discussion thus far certainly has tipped our hand in terms of the type of explanation that we advocate, though a more formal discussion is, we believe, instructive. Coleman (1990, pp. 1–23), in similar fashion, in a meta-theoretical discussion asks whether social science explanation should start with collective and external, or individual-level and internal, explanations. Insights may perhaps be drawn from both perspectives, but given the need for theoretical parsimony, and as a first approximation, the question is which approach is more persuasive and theoretically (and methodologically) sound. Coleman gives three reasons why individual-level, internal explanation is theoretically more robust, and we further extend these insights into strategic management and organization theory. That is, using the Coleman framework as a tool, we highlight why micro-level explanation improves on collective-level explanation. We discuss three reasons and explicate them further below. First, observations, and quite
importantly, interventions, are more naturally made at lower, individual levels (rather than at higher levels, if that is even possible). This argument implicitly appeals to the kind of mechanism-oriented explanation that we alluded to earlier (and will discuss further below) and more generally provides for a more practical and humanistic approach to theorizing, given the more ready prescriptions (cf. Hedstrom & Swedberg, 1998). Second, collective-level explanations leave theories open to alternative explanations. That is, individual-level factors may precede the collective facts that theorists subsequently presuppose, which inherently mis-specifies collective models. Third and finally, individual-level explanations are more stable, as organizations inherently are the resultant of the actions of their component parts, that is, individuals. We will briefly discuss each of the above reasons in turn below.

Observation and Interventions Made at Lower Levels

The question of intervention provides an interesting challenge to collective-level theories: how, specifically, would an organization, for example, develop a capability (or, more generally, change)? Scholars have recently in fact posed this question, and again the theory inevitably relies on individual-level factors, but in so doing creates a data-theory disconnect, as the theory remains at the collective level while the underlying observations and potential interventions lie at the individual level. For example, Song, Almeida, and Wu (2003) argue and show that capabilities can be developed by hiring particular individuals from other organizational settings, thus questioning the "collective status" of capabilities in the first place. More loosely, if we were to observe any organization, we inherently would be observing specific individuals, rather than the whole of the organization in any meaningful way. This of course gets lost in the broad general use of aggregate and organizational-level language and empirical data showing various financial and strategic actions of firms over time, but explicating the underlying abilities and motivations of the individuals seems critical to truly explaining collective-level outcomes. To use Coleman's (1990, p. 3) language:

An explanation based on internal analysis [individual and micro-foundations] of system [organization] behaviour in terms of action and orientations of lower-level units is likely to be more stable and general than explanation which remains at the system level. Since the system's behaviour is in fact resultant of the actions of its component parts, knowledge of how the actions of these parts combine to produce systematic behaviour can be expected to give greater predictability than will statistical relations of surface characteristics of the system.
Obviously, this is not to say that aggregate data cannot be used, but when making strong theoretical statements based on "reduced form" correlations at these high levels of aggregation, strategic management scholars should be aware of the nested individual action and interaction that generally gets lost in the process. The available data in effect begins to drive the theoretical development, rather than the other way around, and, more importantly, as argued by Coleman, more robust theories of collectives are developed by starting at the lower level. More generally, the matter of intervention provides a rather stringent, further litmus test for matters such as organizational change. Hypothetically, if we wanted to radically (to use an extreme case to highlight the point) intervene in an organizational setting to change it (whether strategy or structure, and associated performance outcomes), it would seem logical that the most radical forms of change would involve changes in the underlying individuals who make up the organization. Management practice of course seems to verify the intuition behind the need to focus on individuals in organizational intervention, for example, in cases of top management team turnover and more generally in matters such as hiring and retention. Interventions thus, while perhaps generally applicable to the organization as a whole, in effect are only as good, and as effective, as the underlying response at the lower, individual level. This "other side" of capabilities, perhaps even the 'soft underbelly' of capabilities (given the lack of work in this domain), seems largely unexplored, as emphasis remains on organization-level practices (cf. Henderson & Cockburn, 1994), rather than on their individual-level origins and the underlying characteristics and choices of the individuals responding to potential higher-level organizational practices.
Alternative Explanations

In offering only a collective-level approach to organizational capabilities, the literature leaves many alternative explanations from lower levels quite readily apparent. The capabilities literature in fact has recently begun to note this problem, as scholars point to the capabilities themselves being rooted in specific individuals, rather than the organization (e.g., Lacetera, Cockburn, & Henderson, 2004; Song et al., 2003). That is, another apt litmus test in fact is whether the mobility of particular individuals (Felin & Hesterly, 2007) leads to corresponding knowledge decay or building. Early contributions to the organizational capabilities literature in passing even note that individual-level alternative explanations are quite readily apparent (Henderson
& Cockburn, 1994, pp. 79–80), though subsequent theorizing and empirical work has taken a quite strong collective stance. The one specific mechanism that these ideas suggest is that "more capable" individuals may create, and self-select into, particular organizational settings (cf. Stern, 2004), something that collective-level work generally does not (or is not able to) control for in terms of individual-level a priori factors. Specifically, the underlying assumption is that individuals are randomly distributed into organizational settings (and thus can essentially be treated as homogeneous), thus warranting the emphasis on heterogeneous collective practices. Thus, in terms of our figures, emphasis remains on arrow 1, 4, and 5 type mechanisms, though the underlying micro-foundations of the origins of practices remain unexplored. Put in light of Fig. 2, focusing on arrow 2 and 3 type explanations may provide underlying micro-foundations for explaining the origin and development of organizational capabilities, for example, by addressing such important questions as: who is attracted to a particular organizational setting and why, and what characteristics and abilities do they bring to the setting?9
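The force of this alternative explanation is easy to demonstrate. The following minimal sketch (our own illustration; the sorting rule and all numbers are assumptions) shows how self-selection of heterogeneous individuals alone can generate a persistent firm-level performance gap that looks like a collective "capability," even though nothing beyond pooled individual ability enters the model.

```python
# Minimal illustration (our own; all numbers are assumptions): when able
# individuals self-select into a particular firm, a firm-level "capability"
# difference appears even though performance is nothing but the pooled
# output of heterogeneous individuals.

import math
import random
import statistics

random.seed(7)

# A population of individuals with heterogeneous ability.
population = [random.gauss(0.0, 1.0) for _ in range(2000)]

# Self-selection: higher-ability individuals are more likely to join firm A
# (e.g., because of stronger incentives or more attractive colleagues).
firm_a, firm_b = [], []
for ability in population:
    p_join_a = 1.0 / (1.0 + math.exp(-ability))   # logistic sorting rule
    (firm_a if random.random() < p_join_a else firm_b).append(ability)

def firm_performance(members):
    """Average member output (ability plus noise); no collective-level
    construct enters the calculation at any point."""
    return sum(a + random.gauss(0.0, 0.5) for a in members) / len(members)

perf_a = [firm_performance(firm_a) for _ in range(100)]
perf_b = [firm_performance(firm_b) for _ in range(100)]

# Firm A looks persistently "more capable", yet the gap is fully explained by
# who sorted into it - an arrow 3-type, individual-level explanation.
print(f"firm A mean performance: {statistics.mean(perf_a):.2f}")
print(f"firm B mean performance: {statistics.mean(perf_b):.2f}")
```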
Organizations and Strategies Resultant of Individuals

A final point, closely related to the above, supporting the need for individual-level theories and micro-foundations is the fact that any collective structure is fundamentally resultant of individual-level actions and composition. That is, we logically cannot conceive of "organization" without the individuals that make it up. This upward relationship thus suggests that some emphasis should be placed on factors related to processes of "organizing," that is, on which individuals organize and why (cf. Olson, 1971). Despite well-known warnings of the problems associated with anthropomorphizing, there is still quite a bit of action and behaviour attributed to the organization itself, rather than to the individuals who comprise the collective. Many behavioural models of organization theory and strategic management ascribe overwhelming importance to the environment, particularly in more recent formulations (e.g., Bromiley, 2005; Greve, 2003). While we do not want to rule out the kind of mechanisms indicated by arrow 1 in Fig. 2, we stress that strategic management inherently is interested in purposeful action on the part of managers, and we cannot therefore, from this perspective, let managers be mere puppets at the mercy of aggregate "environmental" or collective forces. Similarly, in capabilities work on the theory of the firm that has taken its inspiration from Nelson and Winter (1982), there
is a strong underlying argument that routines and capabilities very strongly influence individual actions (arrow 5 in Fig. 2). Unfortunately, this leaves it unclear how routines and capabilities originate and change as a result of individual action and interaction (the left hand arrow in Fig. 1) (cf. Felin & Foss, 2005). Besides, while routine behaviour may perhaps dominate the organizational landscape (Nelson & Winter, 1982), it nevertheless does not necessarily follow that this routine behaviour is what deserves the most emphasis in theoretical work, nor does it follow that routines automatically are the key source of what differentiates firms.
More General Arguments

It is quite arguable that the capabilities approach and other collectivist currents in contemporary management thought are founded on an implicit assumption of strong individual homogeneity and malleability by heterogeneous context, situation, and surroundings (for this argument, see Felin & Hesterly, 2007). If such malleability indeed is the case (linked with the associated assumption of the random distribution of individuals), then individuals can in effect be rounded out of the explanation, as they simply passively "mediate" the causal mechanisms going from collective constructs such as capabilities to collective outcomes. Individuals are not – strictly speaking – left out of the explanation; they are just not very important, and primary emphasis can be placed on the collective level. Little of the explanation would be lost if it were simply to take place in terms of the mechanisms represented by arrows 4 or 5 in Fig. 2 (rather than in terms of arrows 1, 2, and 3). However, arguing that individuals a priori are homogeneous and malleable directly conflicts with established theoretical and empirical arguments from the cognitive sciences emphasizing the role of a priori knowledge (Spelke, Breinlinger, Macomber, & Jacobson, 1992). In short, individuals clearly bring abilities and propensities with them into organizational settings (arrow 3 in Fig. 2) that strongly influence which outcomes can be realized. Such underlying factors are marginalized by the preoccupation with the overwhelming determinacy of pre-existing structure. On the other hand, arguing that individuals are heterogeneous does not imply that the collective level is non-existent or unimportant. Rather, it suggests the importance of explicitly linking the individual and the collective levels, that is, of unveiling the mechanisms represented by at least arrow 3 in Fig. 1, and possibly also arrows 1 and 2. As indicated already, overall surprisingly little serious analytical attention has been dedicated to this task,
although a few scholars have strived valiantly to establish the relevant links (e.g., Argote, 1999). In the following, we outline research implications of a micro-foundations programme in strategic management and organizational analysis.
BUILDING MICRO-FOUNDATIONS: IMPLICATIONS FOR RESEARCH

Building Bridges Between Levels

Unfortunately, the general methodological individualist admonition that aggregate phenomena ultimately be accounted for in terms of individual action and interaction leaves open exactly what "action" and "interaction" mean. The purpose of the following is to identify and discuss in some detail the set of mechanisms at various levels (including behavioural assumptions) that is relevant for building foundations for routines and capabilities. Furthermore, the discussion will show how the choice of such foundations is influenced by how capabilities are conceptualized, by whether the starting point is a homo oeconomicus or a homo sociologicus model of behaviour, etc. Thus, we shall seek to identify possible research strategies, and identify the available conceptual toolbox. Underlying psychological assumptions and models of human nature are absolutely fundamental to theorizing for strategic management and organizational scholars, as these assumptions provide the foundation for much of what can subsequently be said and theoretically developed (March & Simon, 1958, pp. 9–11; also see Simon, 1985, p. 303). Unfortunately, these underlying assumptions often receive little attention, perhaps given the immediate need to get to the business of organization-level theorizing. However, equally important is how to build from the individual (micro) level to the organizational (macro) level, that is, to theorize the mechanisms implied by arrow 3 in Figs. 2 and 3. This has traditionally been a troublesome part of the social sciences (cf. Hayek, 1937; Arrow, 1951; Coleman, 1990). As already mentioned, the "bridging" becomes simple if all individuals are assumed to be essentially homogeneous. Strikingly, this explanatory trick is characteristic of both (formal) economics and (functional-structuralist) sociology, though otherwise the two disciplines or theories are quite obviously completely different. To the sociologist, the assumption of homogeneity is a license to let all behaviour be influenced and formed by structures. To the economist, the assumption of homogeneity is useful
because it allows him to take one agent as representative of the whole economy and treat the allocational decisions of this single agent as a representation of the aggregate allocation problem. Both procedures strongly reduce the complexity of the analytical problem of showing how individual action and interaction produces aggregate phenomena. However, both approaches, while often utilized in organizational analysis, fundamentally trivialize the problem of understanding how organization-level phenomena, such as capabilities, originate from the actions and interactions of heterogeneous individuals (or how these capabilities are changed or maintained). While the assumption of individual homogeneity may assist the formal modelling of the economist, it is too strong an affront to realism for most purposes of management theory to take the behaviour of organization X to be fully represented by the actions of a "representative employee" NN. Abstracting from assumptions about individuals and how they interact gets us dangerously close to the untenable kind of collectivist explanation represented by arrow 4 in Fig. 2. This reasoning results in our first implication for research:

Research Implication 1: Building micro-foundations for organizational phenomena such as organizational capabilities (the explanandum) implies that explicit assumptions be made both about individuals and about how their interaction (the explanans) produces the explanandum. At a minimum, the micro-foundations therefore involve identifying mechanisms that are represented by arrow 3 in Fig. 2.

Thus, the micro-foundations project would usually seem to involve two demanding undertakings: first, making assumptions about individuals (e.g., about human nature; see Simon, 1985, p. 303), and second, given these assumptions, exploring the action and interaction of individuals to see if the explanandum can conceivably be produced as a result of this interaction. For example, the "no fat" modelling methodology of game theory (Rasmussen, 1989) prescribes making the minimal (but precise) assumptions about rationality, information, and interaction that make it possible for the analyst to derive the explanandum as an endogenous outcome of rational actions in equilibrium. Theorizing arrow 3 may be highly complicated. Or it may be conceptually simpler, particularly when the explanandum is an outcome of willed, rational design. For example, the existence of an organizational incentive system may be explainable in terms of a CEO or top management team designing and implementing the system. However, even the best-designed system is almost bound to produce unforeseen effects at the organizational level. Whether these are effective for the organization also depends on the information, rationality, and motivation of the CEO, and of course, the
underlying abilities and motivation (perhaps intrinsic) of the employees themselves. At any rate, individual and organizational levels remain connected. A critical matter in building bridges between levels (the above discussion) is the mode of theoretical explanation utilized, which we discuss next.

Mode of Explanation

As a general matter, theory formation, and the formulation of strategies for theory formation, is strongly influenced by the mode of explanation that is dominant in a given scientific field (Elster, 1983, pp. 16–17). A classic distinction is between intentional explanation (i.e., explaining a phenomenon solely in terms of the intentional actions of individuals, i.e., design); functional explanation (explaining a phenomenon in terms of the benefits that the phenomenon (e.g., a social institution) brings to the group where the phenomenon is present); and causal explanation (i.e., explaining in terms of cause and effect) (Elster, 1983). Some types of explanation in social science mix these three; for example, an invisible hand explanation (Ullman-Margalitt, 1978) mixes intentional and causal explanation, and an evolutionary explanation may include these two as well, in addition to elements of functional explanation.10 Which mode of explanation does the organizational capabilities view apply (or imply)? To the extent that explanation by means of this approach proceeds solely on the macro level (arrow 4), some kind of causal explanation would seem to be the mode that is applied (i.e., capabilities directly cause, e.g., superior firm-level performance). However, it is possible to give a somewhat different account of explanation within the capabilities approach, one that harmonizes with the heavy sociology content in this approach. In this account, the capabilities approach is heavily functionalist. Thus, capabilities and routines exist within specific firms because of their beneficial effects for these firms; however, the specific linkages between those capabilities/routines and their effects are not understood, or at least not well understood, by agents. As a huge literature on scientific explanation has clarified, functional explanation is highly questionable, because the feedback links between the effects of an aggregate entity (e.g., a routine) and the entity itself are seldom if ever specified. Such links, to the extent that they can be identified, must exist on the micro-level, that is, involve arrows 1, 2, and 3 in Fig. 3 – which leads back to the issue of individual action and interaction. This reasoning suggests the point – inherent in our emphasis on mechanisms – that there is a strong causal dimension to the micro-foundations project. Building from individual actions to organizational outcomes involves causal links that should be specified and theorized.
Research Implication 2: Building micro-foundations for organizational capabilities must generally avoid functionalist explanation, but (in line with the emphasis on mechanisms) must make use of causal explanation.
Fundamental Choices in Theory-Building: Making Assumptions About Human Agency

As a general matter, micro-foundations in management can take many different forms. We agree that it is possible to build a kind of micro-foundations from a sociological and/or behavioural starting point (e.g., Gavetti, 2005), just as a rational choice approach, more akin to economics, is also possible as a micro-foundation (and we shall advocate this later). However, there are wide-ranging implications of theoretical choices made on this level (cf. Simon, 1985). They influence how organization-level phenomena can be conceptualized and explained. On a very basic level, there is a crucial choice to be made between a homo oeconomicus and a homo sociologicus model of behaviour.11 The choice that is made at this level conditions to a large extent what is further assumed about the rationality, information, and foresight of the agent or actors. Thus, if the choice is the homo oeconomicus model (and standard economic modelling methodology), assumptions are typically made that agents possess very considerable foresight (sometimes even perfect foresight) and can make perfectly rational decisions, no matter the informational circumstances. All versions of the paradigmatic expected utility model make such assumptions. Homo sociologicus is, in contrast, stuck with roles and norms that directly shape behaviour (i.e., arrow 5 in Fig. 1). As many have observed, the demands placed on this actor are much smaller than those placed on its related, though extremely distant, theoretical cousin, homo oeconomicus. The chosen model of behaviour is also closely connected to the applied mode of explanation. That is, if agents are equipped with a high degree of foresight and information asymmetries are insignificant, it is hard to make room for unintended consequences; this effectively eliminates functional explanation, invisible hand explanation, and also evolutionary explanations. This connects to how phenomena on the organizational level are conceptualized. Strongly "rational" behavioural models (e.g., the maximization model of the economist) often suggest that organization-level phenomena are consciously and rationally designed (e.g., Milgrom & Roberts, 1992). Thus, the mechanism is pure intentional design. Behavioural work, in contrast, will typically emphasize the "emergent" or partly "hidden" nature of
organization-level phenomena, such as routines or capabilities (Nelson & Winter, 1982; Dosi et al., 1999). The mode of explanation employed by such work will typically be functionalist; that is, the existence of routines, capabilities, culture, etc. in an organization is "explained" in terms of its beneficial consequences for the organization, but there is no attempt to identify the mechanisms that show how these routines, etc. are produced and reproduced by individuals. However, it is possible to assert that individuals act in a rational manner, yet organization-level phenomena are not necessarily designed, but "spontaneously emerge" (Hayek, 1973; Schotter, 1981; Sugden, 1986). They are "the results of human action, but not of human design" (Hayek, 1973) and their explanation takes place in terms of "invisible hand explanation" (Ullman-Margalitt, 1978; for a list of examples of invisible hand explanations, see Nozick, 1974, pp. 20–21). Invisible hand explanations may take a number of forms, but common to them is that they belong to the category of what Hempel (1965, p. 447) calls "genetic explanation," which "… present[s] the phenomenon under study as a final stage of a developmental sequence, and accordingly account for the phenomenon by describing the successive stages of that sequence." More specifically, such explanation explains the explanandum as the unintended (and perhaps even unforeseen, cf. Ullman-Margalitt, 1978) outcome of the interaction of many intentionally acting agents. This is a mode of explanation that has seldom been explicitly applied in management. Yet, we conjecture that this mode of explanation is one that should apply fairly broadly: complex organizations manifest both highly intentional and rational behaviour and unanticipated consequences thereof.

Research Implication 3: From the perspective of building micro-foundations for organizational phenomena, invisible hand explanations are attractive because they combine the intentional (the level of the individual) with the causal (interaction among individuals) mechanisms, and seem to capture essential characteristics of complex organizations.

However, invisible hand explanations are consistent with a fairly broad set of assumptions with respect to behaviour. The paradigmatic use of these explanations is, of course, in economics: overall patterns of allocation and distribution in markets emerge "as if by an invisible hand." The behavioural assumption that usually drives economics is, of course, maximizing rationality, but invisible hand explanations are fully consistent with various behavioural approaches or with approaches that make only minimal assumptions on intention and rationality (e.g., Alchian, 1950). In the following section, however, we argue that the most satisfactory and fruitful behavioural assumptions can be found in rational choice theory.
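To illustrate what an invisible hand explanation looks like when its mechanisms are made explicit, the following compact sketch implements a Schelling (1978)-style sorting model: individually intentional relocation choices aggregate into a macro-level pattern of segregation that no agent intends. The grid size, satisfaction threshold, and summary measure are our assumptions for illustration only.

```python
# A compact sketch of a Schelling (1978)-style sorting model: a macro-level
# pattern (segregation) emerges from individually intentional relocation
# choices that no agent intends to produce. Grid size, threshold, and the
# segregation measure are our assumptions for illustration.

import random

random.seed(3)

SIZE, THRESHOLD = 20, 0.4   # grid side length; desired share of like neighbours
cells = [None] * 40 + ["A"] * 180 + ["B"] * 180
random.shuffle(cells)
grid = {(x, y): cells[x * SIZE + y] for x in range(SIZE) for y in range(SIZE)}

def neighbours(x, y):
    return [grid[(x + dx) % SIZE, (y + dy) % SIZE]
            for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]

def unhappy(x, y):
    kind = grid[(x, y)]
    occupied = [n for n in neighbours(x, y) if n is not None]
    return bool(occupied) and sum(n == kind for n in occupied) / len(occupied) < THRESHOLD

def like_share():
    """Average share of like-typed neighbours: a macro-level pattern."""
    shares = []
    for (x, y), kind in grid.items():
        if kind is None:
            continue
        occupied = [n for n in neighbours(x, y) if n is not None]
        if occupied:
            shares.append(sum(n == kind for n in occupied) / len(occupied))
    return sum(shares) / len(shares)

print(f"initial like-neighbour share: {like_share():.2f}")
for _ in range(50):
    # Micro level: each unhappy agent intentionally moves to an empty cell.
    movers = [pos for pos, kind in grid.items() if kind is not None and unhappy(*pos)]
    empties = [pos for pos, kind in grid.items() if kind is None]
    for pos in movers:
        target = random.choice(empties)
        grid[target], grid[pos] = grid[pos], None
        empties.remove(target)
        empties.append(pos)
# Macro level: strong sorting emerges although no agent aimed at it.
print(f"final like-neighbour share:   {like_share():.2f}")
```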
A RATIONAL CHOICE PROGRAMME FOR MANAGEMENT? THREE PILLARS

By way of linking our efforts (and early prescriptions for the future) to existing streams of research, there are quite promising strains of sociological (and economic) research, which, while not pervasive in management, explicitly argue for the need for micro-foundations and the need to make the micro-macro link (e.g., Boudon, 1998a; Hedstrom & Swedberg, 1998; Olson, 1971; Schelling, 1978). This rational choice programme, while in part marginalized in sociology, nonetheless of late has received significant interest among a growing group of scholars. As the very name of the "rational choice" research programme suggests, the two key pillars of this approach emphasize, first, rationality on the part of individuals and, second, choice. A third fundamental pillar is that of "causal mechanisms," which we also discuss (building on earlier discussions). We highlight the implications of each of these three pillars, specifically emphasizing some very promising theoretical work which we believe can provide the foundations for further theory development by scholars in strategic management and organization theory. We explicate these three key pillars of the rational choice programme because, to our knowledge, this research programme has not (at least in any meaningful or direct way) entered our conversations in management, despite its promise to solve many of the problems we have highlighted in this article.

Rationality

The emphasis in extant behavioural approaches again is on the determining effect of various external factors, and of the past, on learning and decision-making (or, arrow 1 and 4 type explanations – e.g., Levitt & March, 1988), while the rationality assumption rather focuses on individuals as actors capable of reason, predictions, and common sense. While there has been a remarkably programmatic effort in behavioural and sociological approaches to denigrate any forethought or reason on the part of actors (see Rosenberg, 1995 for an excellent overview), going as far back as Comte, more recent research suggests that individuals in fact are quite "smart" and rational in how they approach and learn in unpredictable environments (Gigerenzer et al., 1999). That is, the present theoretical emphasis on poor judgements on the part of individuals (e.g., biases), which in fact has coloured much of our research even in strategic management, has been challenged by scholars who find that individuals are rational given the
information that is available in various decision-making situations. In fact, a stunning conclusion might be that biases and errors are simply made by scholars (in their theorizing), and not by the subjects they study (see Stanovich & West, 2000, for an excellent discussion). This new-found rationality has important implications for organizational research, as our models still strongly emphasize "boundedness," rather than rationality. Extant organizational work emphasizes legitimacy and imitation, while more rational approaches suggest that individuals are quite deliberate and purposeful in their actions. While notions of deliberateness have recently been alluded to in the literature on organizational capabilities (Zollo & Winter, 2002), significantly more work is needed to truly anchor these arguments in robust theoretical work in disciplines such as psychology and cognitive science. More generally, we believe that much of the collectivist (and perhaps behavioural) baggage is best discarded, as present efforts to continuously amend the overall theoretical approaches are doomed to fail and to result in an incoherent patchwork of ad hoc explanations.
Choice

The second key pillar of a rational choice research programme is choice. There is very little choosing on the part of managers in extant theories, given the programmatic and routine behaviour emphasized presently. Choice suggests an overall logic of intentionality, which is lacking in behavioural arguments. Furthermore, choice-based models and associated invisible-hand explanations have provided quite powerful micro-macro links in some quarters of social science (e.g., Schelling, 1978). The question of choice in fact suggests that we in effect begin our analysis with the "lone" individual, and from there build a theoretical model by first specifying our assumptions about individuals by highlighting their interests, capabilities, and goals (rather than ascribing them to the organization), and then further explicating the underlying mechanisms of the organizing process (Olson, 1971).12 In fact, the process of self-selection, rather than the presently assumed random distribution of individuals (Henderson & Cockburn, 1994), provides, we think, a very powerful micro-macro link or social mechanism, which deserves significantly more emphasis. Scott Stern, in fact, while not theoretically anchoring his arguments in the language of "micro-foundations," has recently pointed to the powerful role of individuals deciding, choosing, and creating environments (also see Zenger, 1994). By way of further examples of promising theoretical work for management scholars to build on,
the work of Olson (1971) and Brennan and Buchanan (1985) begins with the very approach suggested here, though at different levels of analysis, and nonetheless provides broad guidance for the type of approach advocated in this chapter. The choice aspect of the research programme we are advocating is furthermore grounded in a "humanistically congenial image of man" (Coleman, 1990, p. 4). That is, rather than impute causation to external and collective factors such as the environment, situation, or organization, the emphasis is placed on individual-level factors and choices. This emphasis on choice stands in stark contrast to the many who are calling for further institutional (e.g., Oliver, 1997) and behavioural (Bromiley, 2005) foundations for strategic management. That is, we call for a radically different rational choice approach, which emphasizes internal, individual-level factors, or micro-foundations. This more internal approach, we believe, is also more congenial to the general mandate of strategic management to focus on managerial action and choices, even prescriptions, rather than highlighting the inevitable forces which may impinge on a manager or firm. While various external "inevitabilities" may, and certainly do, exist, it would seem that the key differentiating factors lie in internal, individual-level factors. The focus from a choice-type perspective thus highlights rational reasons, rather than "the devil made me do it," external-type explanations.

Causal Mechanisms

Various definitions of causal or social mechanisms have been offered in the social sciences (see Hedstrom & Swedberg, 1998; Mahoney, 2003 for an overview), though the central gist of this work has been to call for explicit links between micro and macro levels. While there are some fine-grained differences between definitions, they all call for explanatory and causal understanding (rather than the citing of correlations) regarding how specifically individual actions, choices, and interactions aggregate to the collective level (Elster, 1989), and, as a secondary exercise, how the macro-levels subsequently relate back to micro-levels (though without deteriorating into 'Giddensian' notions of mutual instantiation). While social science in general, management included, has done an admirable job with the latter, relating macro-levels to the micro, the work calling for causal mechanisms emphasizes the critical need to first explain the origin of macro-structures as the result of individual actions, rather than take them for granted. In short, the hard theoretical work is left undone, as management theories simply assume the existence of the very structures they seek to explain.
Overall, based on our brief outline of the three key pillars of a rational choice programme, we suggest the following research implication.

Research Implication 4: A rational choice research programme focusing on rationality, choice, and causal mechanisms provides the foundations for superior theoretical explanation in management theory.

Overall, a rational choice programme is quite attractive as "rational action is its own explanation" (Hollis, 1977), while collective and behavioural explanations remain unsatisfactory as they point to black boxes, which themselves require immediate further explanation. That is, behavioural and collective approaches leave one quite unsatisfied as to the underlying causes of actions (Boudon, 1998a, 1998b). The broad strokes and the methodological and theoretical outlines we have highlighted above will, we believe, provide management scholars with the early basics for building similar models for management theory.
PRACTICAL ADVICE FOR SCHOLARS – FOUR IMPORTANT QUESTIONS IN LINKING MICRO AND MACRO

Before concluding, we delineate practical advice and prescriptions for scholars by way of highlighting four important questions that we hope will help scholars think about how they link micro and macro in their own research. While the questions in part summarize some of the arguments delineated above, we hope their brief explication here will provide a practical and succinct tool for scholars to use in their own work. These questions are meant as a practical litmus test that helps scholars think about their underlying assumptions and, more generally, aids them in their theory-building and pushes them towards a rational choice programme for management.
What Are Your Assumptions About Human Nature?

An important question that rarely gets attention concerns the underlying assumptions made by scholars regarding human nature. As noted by Simon, "nothing is more fundamental in setting our research agenda and informing our research methods than our view of the nature of the human beings whose behaviour we are studying" (1985, p. 303). Answering this question
gets at the core of thinking about links between micro and macro, as theory which begins at the macro level simply assumes that individuals in effect are passive receptacles of the broader cultural environment, without underlying consideration for matters of choice and rationality on the part of individuals. Furthermore, questions of the disposition of individuals, for example, salient characteristics such as personality, also should receive consideration. For organization theorists, and strategists for that matter, this requires a familiarity (even expertise) with individual-level theories. Given the implied division of labour among management scholars, rarely are scholars proficient in both individual-level and collective-level theories, yet the question of human nature cuts to the very relationship between micro and macro, and should therefore be carefully thought about in any research project.

How Does Something Become a Collective Property?

The bulk of collective-level theorizing happens at the level of metaphor or analogy. For example, Nelson and Winter's notion of organizational routine is developed via metaphor from individual skill. However, the question of truly collective-level properties rarely (if ever) gets answered beyond discussing analogies between individual-level characteristics (e.g., identity, memory, learning) and the collective level, and over time a rather subtle process creeps in where collective constructs originally established via metaphor become taken-for-granted organizational-level facts (e.g., routines). While making links with metaphor, or citing emergence, undoubtedly serves a temporary purpose, we believe that truly digging into the constructs, beyond analogy, is required. This should take the form of explaining at what point, and why, we can truly talk about collective capabilities sans individuals. The question of something becoming a collective-level property, even in the best levels-literature, still remains a mystery (Dansereau et al., 1999, p. 349), as 'emergence' or 'synergy' are still frequently cited without more careful delineation of how and why something might emerge from lower levels. Furthermore, it is rarely recognized that it is equally likely that something less, rather than something more (the latter is generally always assumed), might emerge from individuals interacting.

What Are the Origins of Collective-Level Variables?

A closely related, though still separate, question is the origin of collective-level variables. To use the example of capabilities, presently there is a strong
focus on collective-level origins, specifically such variables as the past, interaction, and experience serving as the key independent variables (cf. Levitt & March, 1988). But, as we have argued throughout this paper, all collective-level constructs must have individual-level origins. Digging into the past here may provide one way to understand origins; for example, Arrow (1974) highlights how the "organizational code" may be a function of the founders of the organization. The matter of origins then requires more careful consideration of what individuals a priori, that is, before joining the organization, bring to the table. Matters such as individual self-selection would seem important as providing individual-level origins for (perhaps invisible hand) collective outcomes.

How Does Change Occur?

The question of change in organizations provides a final question which we believe will help scholars to link micro and macro. While "change" clearly may not be the explicit focus of a research project, nevertheless most papers implicitly weigh in on how individuals and organizations may or may not be able to change in a given environment. While there is a rather heavy emphasis on the external or the environment in inducing change (cf. Greve, 2003), it would seem that the key differentiating factors lie at lower levels. Put differently, while an external shock or discontinuity certainly may lead to organizations changing, how and what their responses are cannot meaningfully be ascribed to the change-inducing stimulus itself; rather, the origins of change must lie in the choices and efforts that individuals and organizations make. Thus, asking the question of how change occurs crystallizes the underlying assumptions and, we believe, will help scholars to more rigorously think about their research. The purpose of the above questions is to provide broad guidelines for scholars to think about the underlying assumptions that their theories make, specifically with an eye towards making links between micro and macro levels. The above questions certainly do not exhaust the full spectrum of the types of considerations that are made in any research project, but they at least begin the process of pushing scholars towards the directions articulated in this paper.
CONCLUSION
With the broad outlines we have provided, we hope to counteract what we suggest is partly a disappearing mandate for strategy and organizational
analysis. What we mean by this is that the possibility of strategic action (and thus management) is either suppressed by a heavy emphasis on environmental determinism – in other words, various external factors come to play the role of the key independent variables determining individual and organizational behaviour – or lost in an emphasis on firm-level constructs, notably capabilities, that effectively obscure individual action. We argue that a rational choice programme may counteract this trend. Specifically, it would seem critical for management scholars to understand, and to be able to impute, actions to individuals rather than to collective variables. Using Coleman's model, we have more generally outlined the present emphasis on macro-level concepts in organizational analysis, but suggest that individual-level micro-foundations are needed in order to explain their emergence, existence, persistence, and change.
NOTES
1. In contrast, for more than three decades much fundamental theoretical inquiry in economics has been devoted to establishing micro-foundations for macroeconomics. The "micro-foundations project" implies the building of rigorous mathematical models that demonstrate how individual action and interaction may produce economy-wide consequences (Leijonhufvud, 1968), that is, mimic the real aggregate movements of the economy as captured in time series of central variables (Lucas, 1977).
2. Gavetti (2005) has also called for "micro-foundations" for management theory in a recent article, though highlighting somewhat different, though arguably complementary, theoretical elements.
3. Our arguments also apply to work that makes use of notions of routines, dynamic capabilities, competencies, core competences, and other of the numerous aliases of, or concepts related to, the concept of capabilities.
4. In its most extreme form, methodological individualism asserts that in explanations of social phenomena, reference is allowed only to individuals, their properties, and their (inter)actions. Thus, at no point in the explanation can reference be made to supra-individual entities as in any way causal agents. No "shortcuts" by making reference to aggregates are allowed anywhere in the explanation. On this programme, explaining, for example, the current strategy of Shell must always ultimately involve making reference to the mental states of all relevant organizational stakeholders and how these mental states produced particular actions that over time combined to produce current Shell strategy. If practicable, all explanation in such a scheme essentially becomes painstaking historical explanation. However, this is obviously a highly problematic research strategy if taken to its logical conclusion. First, it runs into the problem of historical regression, because it is not clear where the explanatory chain should stop. Second, it runs into a complexity problem associated with accounting for the mental states and actions of all relevant organizational
stakeholders. As Hayek (1945) reminded us, these kinds of "particular circumstances of time and place" usually cannot be "concentrated in a single mind." At any rate, we do not subscribe to an extreme methodological individualism, but admit a role for collective concepts in explanation, as in Agassi (1960).
5. For example, under competitive conditions, decision-makers in firms only have a very limited feasible behavioural repertoire. If they do not choose an element of this set, they will not survive. Thus, a structure (i.e., competitive conditions) can substitute in an explanatory sense for a much more complicated explanation involving individual action and interaction (for a related approach, see Satz & Ferejohn, 1994).
6. We should note that arrow 5 is not in the Coleman (1990) formulation of the diagram. It was suggested to us by Peter Abell. Arrow 5 allows for the direct influence of social structures on actions postulated in more extreme streams of sociology, such as structural functionalism. In these streams choice is not seen as an outcome of deliberation relative to constraints (i.e., arrow 2). Thus, arrows 1 and 2 cannot fully capture such approaches to social theory.
7. Even much of mainstream economics is not always immune from methodological collectivism. As Austrian school economists have argued for years (e.g., Lachmann, 1973), much of macroeconomics was, and to some extent still is, methodologically collectivist.
8. While the methodological individualist does not deny that institutions (e.g., the law, incentive systems, etc.) influence human actions, he will balk at the way in which the mechanism implied by arrow 5 portrays actors as puppets that are somehow being directly manipulated by overall societal forces, and will insist on the higher degree of voluntarism implied by the use of the mechanisms contained in arrows 1 and 2 (i.e., institutions matter to individual action because they influence various individual conditions, but they do not directly cause action).
9. This brings self-selection into the picture (cf. Felin & Hesterly, 2007). An explanation of organizational heterogeneity in terms of self-selection may abstract completely from interaction among individuals and the possibly "emergent" effects thereof. In this explanation organizational capability is a genuine epiphenomenon of individual actions.
10. Taken as a whole, management studies have traditionally made use of all three modes of explanation. However, typically those parts of management studies that are heavily influenced by economics (e.g., strategic management, corporate governance, organizational design) have made heavy use of intentional explanation, while those parts more heavily influenced by sociology and anthropology (e.g., work on organizational culture, population ecology, the new institutionalism in organizational analysis) make more use of functional explanation.
11. Admittedly, this is a stark contrast, which neglects that there are intermediate positions (e.g., Lindenberg, 1988; Boudon, 1998b). The discussion is also somewhat hampered by the fact that there is no homo sociologicus model on the same level of elaboration and sophistication as the homo oeconomicus model.
12. Coleman relatedly makes the following point (noting the "errors" and trade-offs that are made in theorizing): "In this paper, I will proceed in precisely the opposite fashion to that taken by the advocates of homo sociologicus. I will make an opposite error, but one which may prove more fruitful.
I want to begin the
development of a theory of collective decisions, and in so doing I will start with an image of man as wholly free: unsocialized, entirely self-interested, not constrained by norms of a system, but only rationally calculating to further his own interest. This is much the image of man held by economists, and with it the economists have answered one part of Hobbes's question: how is it that although the men who make it up are wholly self-interested, the economic system can operate with one man's actions benefiting others. It was the genius of Adam Smith to pose an answer to this part of Hobbes's question." (Coleman, 1964, p. 167)
REFERENCES Abell, P. (2003). On the prospects for a unified social science: Economics and sociology. SocioEconomic Review, 1, 1–26. Agassi, J. (1960). Methodological individualism. British Journal of Sociology, 11, 244–270. Alchian, A. A. (1950). Uncertainty, evolution, and economic theory, In: Idem. 1977. Economic forces at work. Indianapolis: Liberty Press. Arrow, K. J. (1951). Social choice and individual values. New York: Wiley. Arrow, K. J. (1974). The limits of organization. New York: W.W. Norton. Barney, J. B. (1986). Organizational culture: Can it be a source of sustained competitive advantage? Academy of Management Review, 11, 656–665. Barney, J. B. (2001). Is the resource-based ‘view’ a useful perspective for strategic management research? Yes. Academy of Management Review, 26, 41–57. Bhaskar, R. (1978). A realistic theory of science. Sussex: Harvester Press. Boudon, R. (1998a). Social mechanisms without black boxes. In: P. Hedstrom & R. Swedberg (Eds), Social mechanisms: An analytical approach to social theory (pp. 172–203). Cambridge: Cambridge University Press. Boudon, R. (1998b). Limitations of rational choice theory. American Journal of Sociology, 104, 817–828. Boudon, R. (2004). The poverty of relativism. Oxford: Bardwell Press. Brennan, G., & Buchanan, J. M. (1985). The reason of rules: Constitutional political economy. Cambridge: Cambridge University Press. Bromiley, P. (2005). The behavioural foundations of strategic management. Malden, MA: Blackwell Publishing. Coff, R. W. (1999). When competitive advantage doesn’t lead to performance: Resource-based theory and stakeholder bargaining power. Organization Science, 10, 119–133. Coleman, J. S. (1964). Collective decisions. Sociological Inquiry, 34, 166–181. Coleman, J. S. (1986). Social theory, social research, and a theory of action. American Journal of Sociology, 91, 1309–1335. Coleman, J. S. (1990). Foundations of social theory. Cambridge (Mass.)/London: The Belknap Press of Harvard University Press. Cowan, R., & Rizzo, M. (1996). The genetic-causal tradition and modern economic theory. Kyklos, 49, 273–317. Cyert, R., & March, J. G. (1963). A behavioral theory of the firm. Englewood Cliffs, NJ: Prentice-Hall.
Dansereau, F., Yammarino, F. J., & Kohles, J. C. (1999). Multiple levels of analysis from a longitudinal perspective: Some implications for theory building. Academy of Management Review, 24, 346–357. Davis-Blake, A., & Pfeffer, J. (1989). Just a mirage: The search for dispositional effects in organizational research. Academy of Management Review, 14, 385–400. Dosi, G. (1995). Hierarchies, markets and power: Some foundational issues on the nature of contemporary economic organizations. Industrial and Corporate Change, 4, 1–19. Dosi, G., Marengo, L., Bassanini, A., & Valent, M. (1999). Norms as emergent properties of adaptive learning: The case of economic routines. Journal of Evolutionary Economics, 9, 5–26. DiMaggio, P. J., & Powell, W. (1983). The iron cage revisited: Institutional isomorphism and collective rationality in organizational fields. American Sociological Review, 48, 147–160. Durkheim, E. (1962). The rules of the sociological method. Glencoe, IL: Free Press. Dyer, J., & Singh, H. (1998). The relational view: Cooperative strategy and sources of interorganizational competitive advantage. Academy of Management Review, 23, 660–679. Eisenhardt, K., & Martin, J. (2000). Dynamic capabilities: What are they? Strategic Management Journal, 21, 1105–1121. Elster, J. (1983). Explaining technical change. Cambridge: Cambridge University Press. Elster, J. (1989). Nuts and bolts for the social sciences. Cambridge: Cambridge University Press. Elster, J. (1998). A plea for mechanisms. In: P. Hedstrøm & R. Swedberg (Eds), Social mechanisms: An analytical approach to social theory. Cambridge: Cambridge University Press. Felin, T., & Foss, N. J. (2005). Strategic organization: A field in search of microfoundations. Strategic Organization, 3, 441–455. Felin, T., & Foss, N. J. (2006). Organizational routines: A sceptical look. In: M. Becker (Ed.), The handbook of organizational routines. Edward Elgar. Felin, T., & Hesterly, W.S. (2007). The knowledge-based view, heterogeneity, and new value creation: Philosophical considerations on the locus of knowledge. Academy of Management Review, forthcoming. Foss, K., & Foss, N. J. (2005). Value and transaction costs: How property rights economics furthers the resource-based view. Strategic Management Journal, 26, 541–553. Foss, N. J. (1994). Realism and evolutionary economics. Journal of Social and Biological Systems, 17, 21–40. Gavetti, G. (2005). Cognition and hierarchy: Rethinking the microfoundations of capabilities’ development. Organization Science, 16, 599–617. Gigerenzer, G., Todd, P. M., & The ABC Research Group. (1999). Simple heuristics that make us smart. Oxford University Press. Greve, H. R. (2003). Organizational learning from performance feedback: A behavioral perspective on innovation and change. Cambridge: Cambridge University Press. Harre´, R. (1970). The principles of scientific thinking. London: Macmillan. Hayek, F. A. (1937). Economics and knowledge. In: F. A. Hayek (1948) (Ed.), Individualism and economic order. Chicago: University of Chicago Press. Hayek, F. A. (1943). Scientism and the study of society. Economica, 10, 34–63. Hayek, F. A. (1945). The use of knowledge in society. In: F. A. Hayek (1948) (Ed.), Individualism and Economic Order. Chicago: University of Chicago Press. Hayek, F. A. (1952). The counter revolution of science. Chicago: University of Chicago Press. Hayek, F. A. (1973). Law, legislation, and liberty, vol. 1: Rules and order. Chicago: University of Chicago Press.
Hedstrom, P., & Swedberg, R. (1998). Social mechanisms: An introductory essay. In: P. Hedstrom & R. Swedberg (Eds), Social mechanisms: An analytical approach to social theory (pp. 1–31). Cambridge: Cambridge University Press. Hempel, C. G. (1965). Aspects of scientific explanation. New York: The Free Press. Henderson, R., & Cockburn, I. M. (1994). Measuring competence? Exploring firm effects in pharmaceutical research. Strategic Management Journal, 15, 63–84. Hollis, M. (1977). Models of man: Philosophical thoughts on social action. Cambridge: Cambridge University Press. Homans, G. C. (1974). Elementary forms of social behavior (2nd ed.). New York: Harcourt Brace Jovanovich. Hummell, H. J., & Opp, K.-D. (1968). Sociology without sociology – the reduction of sociology to psychology: A program, a test and the theoretical relevance. Inquiry, 11, 205–226. Klein, K. J., Dansereau, F., & Hall, R. J. (1994). Levels issues in theory development, data collection, & analysis. Academy of Management Review, 19, 195–229. Kogut, B., & Zander, U. (1992). Knowledge of the firm, combinative capabilities, and the replication of technology. Organization Science, 3, 383–397. Lacetera, N., Cockburn, I., & Henderson, R. (2004). Do firms change capabilities by hiring new people? A study of the adoption of science-based drug discovery. In: J. A. Baum & A. M. McGahan (Eds), Business strategy over the industry life cycle – Advances in strategic management 21. Oxford: Elsevier/JAI Press. Lachmann, L. M. (1973). Methodological individualism and the market economy. London: Institute for Economic Affairs. Lazarsfeld, P. F., & Menzel, H. (1970). On the relation between individual and collective properties. In: A. Etzioni (Ed.), A sociological reader on complex organizations. London: Holt, Rinehart and Winston. Leijonhufvud, A. (1968). On Keynesian economics and the economics of Keynes. Oxford: Oxford University Press. Levitt, B., & March, J. (1988). Organizational learning. Annual Review of Sociology, 14, 319–340. Lindenberg, S. W. (1988). Contractual relations and weak solidarity: The behavioural basis of restraints on gain-maximization. Journal of Institutional and Theoretical Economics, 144, 39–58. Lippman, S. A., & Rumelt, R. P. (2003a). A bargaining perspective on resource advantage. Strategic Management Journal, 24, 1069–1086. Lippman, S. A., & Rumelt, R. P. (2003b). The payments perspective: Micro-foundations of resource analysis. Strategic Management Journal, 24, 903–927. Lounsbury, M., & Glynn, M. A. (2001). Cultural entrepreneurship: Stories, legitimacy and the acquisition of resources. Strategic Management Journal, 22, 545–564. Lucas, R. E. (1977). Understanding business cycles, in idem. 1980. Studies in business cycle theory. Cambridge, MA: MIT Press. Machlup, F. (1967). Theories of the firm: Marginalist, behavioral and managerial. American Economic Review, 57, 1–33. Mahoney, J. (2003). Tentative answers to questions about causal mechanisms. Paper and presentation from 2003 American political science association meetings. March, J. G. (1991). Exploration and exploitation in organizational learning. Organization Science, 2, 71–87. March, J. G., & Simon, H. (1958). Organizations. New York: Wiley.
Menger, C. (1883). Untersuchungen uber die methode der sozialwissenschaften, und der politischen oekonomie insbesondere. Leibzig: Duncker & Humblot. Meyer, J. W., & Rowan, B. (1977). Institutional organizations: Formal structure as myth and ceremony. American Journal of Sociology, 83, 340–363. Milgrom, P., & Roberts, J. (1992). Economics, organization and management. Englewood Cliffs, NJ: Prentice-Hall. Nelson, R. R., & Winter, S. (1982). An evolutionary theory of economic change. Cambridge, MA: Harvard University Press. Nozick, R. (1974). Anarchy, state, and utopia. New York: Basic Books. Oliver, C. (1997). Sustainable competitive advantage: Combining institutional and resource based views. Strategic Management Journal, 18, 697–713. Olson, M. (1971). The logic of collective action. Cambridge, MA: Harvard University Press. Pfeffer, J. (1997). New directions for organization theory: Problems and prospects. Oxford University Press. Popper, K. R. (1957). The poverty of historicism. London: Routledge and Kegan Paul. Rasmussen, E. (1989). Games and information: An introduction to game theory. Oxford: Basil Blackwell. Rosenberg, A. (1995). The philosophy of social science. Boulder, CO: Westview Press. Satz, D., & Ferejohn, J. (1994). Rational choice and social theory. The Journal of Philosophy, 91, 71–87. Schelling, T. (1978). Micromotives and macrobehavior. New York: W.W. Norton & Co. Schotter, A. (1981). The economic theory of social institutions. Cambridge: Cambridge University Press. Simon, H. A. (1985). Human nature in politics. American Political Science Review, 79, 293–304. Song, J., Almeida, P., & Wu, G. (2003). Learning-by-hiring: When is mobility more likely to facilitate interfirm knowledge transfer? Management Science, 49, 351–365. Spelke, E. S., Breinlinger, K., Macomber, J., & Jacobson, K. (1992). Origins of knowledge. Psychological Review, 99, 605–632. Stanovich, K. E., & West, R. F. (2000). Individual differences in reasoning: Implications for the rationality debate. Behavioral and Brain Sciences, 23, 645–726. Stern, S. (2004). Do scientists pay to be scientists? Management Science, 50, 835–853. Sugden, R. (1986). The economics of rights, cooperation and welfare. Oxford: Blackwell. Teece, D. J., Pisano, G. P., & Shuen, A. (1997). Dynamic capabilities and strategic management. Strategic Management Journal, 18, 509–534. Winter, S. G. (2003). Understanding dynamic capabilities. Strategic Management Journal, 24, 991–995. Zenger, T. R. (1994). Explaining organizational diseconomies of scale in R&D: The allocation of engineering talent, ideas and effort by firm size. Management Science, 40, 708–729. Zollo, M., & Winter, S. G. (2002). Deliberate learning and the evolution of dynamic capabilities. Organization Science, 13, 339–352.
RIGOR AND RELEVANCE USING REPERTORY GRID TECHNIQUE IN STRATEGY RESEARCH
Robert P. Wright
ABSTRACT
The psychological analysis of strategic management issues has gained a great deal of momentum in recent years. Much can be learned by entering the black box of strategic thinking of senior executives, bringing new insights on how they see, make sense of, and interpret their everyday strategic experiences. This chapter will focus on a powerful cognitive mapping tool called the Repertory Grid Technique and demonstrate how it has been used in the strategy literature, along with how a new and more refined application of the technique can enhance the elicitation of complex strategic cognitions for strategy and Board of Directors research.
"And those who were seen dancing were thought to be insane by those who could not hear the music" – Friedrich Nietzsche
The purpose of this chapter is to inform strategy researchers about a powerful, rigorous and systematic cognitive mapping method called the Repertory Grid Technique (RGT), and how it can be used in the strategy field to advance our understanding of strategists' interpretations of their lived experiences. Reactions by past users of the technique in both the management and strategy fields have been mixed, and generally not many have used
this tool because of the (mostly unwarranted) criticisms of the approach as being too clumsy, cumbersome and time-consuming. We believe these past views are the result of poor grid preparation (both in the research grid design stage and in the mental preparation of the grid participant), the lack of pilot testing of the intended grid, and an unsystematic following of grid protocol when the grid is administered, inter alia. In this respect, we hope to put these views to rest by advocating the simplicity and do-ability of the grid technique in strategy research. We hope that our detailed yet succinct explanations of the theory, practices and procedures, along with hints, tips and demonstration results (based on actual application of the method), will inspire other strategy researchers to enrich their current methodological and epistemological orientations and to contemplate using this interviewing technique to complement their current tools of data collection and analysis in developing better theories (Barr, 2004; Hitt, Boyd, & Li, 2004; Smith & Hitt, 2005). We begin with a brief overview of the importance of cognitions research in the strategy field, along with the importance of mapping strategic cognitions (Huff, 1990; Huff & Jenkins, 2002). We then provide an overview of the major past works that have used the RGT in the strategy field and the contributions that have been made using this technique. Next, we explain the theoretical foundations on which this methodology is grounded: the Theory of Personal Constructs (originating in clinical psychology). Having established these roots, we explain exactly what a "Repertory Grid" is by identifying and describing its key characteristics, including issues of reliability and validity. A brief description of how to administer the technique follows, together with sample results from actual application of the technique in the author's current program of research using the grid at the board level. The reader will be shown how to generate the results and how to interpret the findings. The final part of the chapter addresses useful hints and tips for administering the grid technique and offers some concluding remarks on repertory grids in strategy research.
IMPORTANCE OF COGNITIONS RESEARCH IN THE STRATEGY FIELD
With the growing importance of more cognitive research in the strategy field (Gnyawali & Tyler, 2005; Hodgkinson, 1997, 2001a, 2001b, 2005; Hodgkinson & Sparrow, 2002; Tenbrunsel, Galvin, Neale, & Bazerman, 1996) and the
recent build-up of momentum for a more "micro" emphasis on the understanding and practice of strategy (Jarzabkowski, 2003, 2004, 2005; Johnson, Melin, & Whittington, 2003; Whittington, 1996, 2003), the time is ripe to advocate research that is more fine-grained and goes deep into the mental psyche of how senior executives see, interpret and make sense of their strategizing experiences. Axelrod's (1976) seminal work on the belief systems of political elites suggested that explicit descriptions of knowledge structures in everyday organizational settings were not only possible, but very informative with regard to how organizational members construed their worlds (Huff, 1997; Meindl, Stubbart, & Porac, 1996; Prasad & Prasad, 2002; Smircich & Stubbart, 1985; Stubbart, 1989; Walsh, 1995). The emphasis on psychological analysis of strategic management issues has also generated a great deal of interest over the past two decades, with growing attention to strategic processes, "mapping" strategic thought (Hodgkinson, 1997, 2001a, 2001b, 2005; Huff, 1990; Lyles, 1990) and linking managerial cognitions to strategic choice, behavioral outcomes and even firm performance (Priem, 1994). In this line of research, we advocate that investigating the knowledge structures of strategists will open up new insights into better explaining, and hence predicting, strategic behavior. If we can show how executives "think" about their strategizing experience (albeit well conceived/ill conceived), we can learn a great deal about what they look for in crafting winning strategies. We believe such a program of research can move the field closer to how a strategist actually thinks. Even in the board and governance literatures, there are calls for new directions for future research along the lines of a more psychological analysis of board issues. Dalton, Daily, Johnson, and Ellstrand (1999) and later Daily, Dalton, and Cannella (2003) advocated the need to "disconnect" from the over-reliance on the mainstream theories of agency and resource dependence and to turn to alternative theories that provide better explanations and insights into how boards function and dysfunction (see also Roberts, McNulty, & Stiles, 2005). More fine-grained research and venturesome methods (Barr, 2004; Huse, 2000, 2005; Ketchen & Bergh, 2004, 2005) that open up the black box of board work (Conger, 1988; Dalton, Daily, Ellstrand, & Johnson, 1998; Dalton et al., 1999; Pettigrew, 1992; Walsh, 1995) and better capture how boards of directors perceive their directorial roles, based on their social interactions in a socially constructed world, would really advance our understanding of boards, boards of directors and board work as it actually happens (see Johnson, Daily, & Ellstrand, 1996; Lorsch & MacIver, 1989; Mace, 1971, 1979; Stiles & Taylor,
2001; Wright, 2004b; Wright, Butler, & Priem, 2003; Zahra & Pearce, 1989). Pettigrew's (1992) call for more focus on studying actual directors' behaviors is also in line with this direction for future research on boards (McNulty & Pettigrew, 1999; Pettigrew & McNulty, 1995; Stiles, 2001; Stiles & Taylor, 2001; Zahra & Pearce, 1989). Forbes and Milliken (1999) and Rindova (1999) further advocated the need to investigate the social-psychological processes of these top managers to gain new insights into old and vexing problems of how these executives make sense of their strategic worlds. In the midst of this cognitive revolution and the psychological analysis of strategic issues (Hodgkinson & Sparrow, 2002), cognitive mapping is receiving a great deal of attention in advancing cognition research (Easterby-Smith, Thorpe, & Holman, 1996; Eden, 1992; Eden & Ackermann, 1998; Huff & Jenkins, 2002; Pye & Pettigrew, 2005). Fiol and Huff (1992) discussed the shifting nature of cognitive maps and the increasing role they are taking as decision-making tools for organizational members. Cognitive maps are analogous to a geographer's map that can tell us where we are and where we can go. One such mapping approach, originating out of clinical psychology and grounded in Personal Construct Theory, is the RGT (Kelly, 1955). Past grid applications (Easterby-Smith et al., 1996; Jankowicz, 1990; Stewart & Stewart, 1981), and in particular those in strategic management (Calori, Johnson, & Sarnin, 1994; Dutton, Walton, & Abrahamson, 1989; Ginsberg, 1988, 1989; Huff, 1990; Huff & Jenkins, 2002; Reger & Huff, 1993; Reger & Palmer, 1996), have been noteworthy in bringing new insights deep into the strategist's mindset, and it is to their work that we now turn.
Overview of Past Repertory Grid Applications in the Strategy Literature and their Key Contributions to the Field
Dunn, Cahill, Dukes, and Ginsberg (1986) applied the repertory grid method to examine how 17 criminal justice practitioners and non-practitioners interpreted criminal justice information in a large urban municipality. They supplied six descriptions of policy functions as elements (e.g. crime prevention, crime detection and apprehension, social order, traffic control, emergency services and community relations) for construct elicitation. They found considerable potential in the grid technique as a method of eliciting the diverse systems of interpretations surrounding public policy issues. Dunn and Ginsberg (1986) further advocated the grid technique by calling it a socio-cognitive network methodology to investigate organizational
dynamics. They used different organizational interventions as elements, such as a management inventory system, strategic planning system, office automation system, decision support system, quality circles and collateral organization. The elicited constructs were then ranked (rather than rated). They concluded that, when compared to other methods, the grid technique was flexible, efficient, systematic and easily reproducible in understanding frames of reference. Ginsberg (1989, 1990, 1994) took the grid work further by applying it to the construal of business portfolios (through strategic business units) to better understand how managers made sense of diversification. In doing so he made reference to Prahalad and Bettis's (1986) seminal paper in the Strategic Management Journal calling for the use of more creative methods to elicit top managers' cognitive maps. In looking at a firm's six strategic business units and applying Personal Construct Theory (as the basis of a socio-cognitive approach), Ginsberg was able to successfully demonstrate the powers of the grid technique in studying strategic diversification issues. He also raised several important questions about building a model of diversification based on top managers' belief systems, and noted that past approaches were biased by researcher intrusion because they imposed the researchers' own constructs on those studied and hence were not able to capture the mental models of how real corporate managers lived their experiences. In his Academy of Management Review paper, he advocated that, through a socio-cognitive approach such as the grid technique, researchers could begin to answer questions about the belief systems of top managers in organizational life (Ginsberg, 1990; see also Ginsberg's (1994) SMJ article; Calori et al. (1994) also made reference to the appropriateness of the repertory grid technique for examining CEOs' complex cognitive maps in the face of environmental uncertainty). Walton (1986) looked at how top managers in the financial industry categorize successful and unsuccessful firms. As part of a phenomenological and multi-method design, he employed the grid technique to elicit how top managers described and categorized 10 firms (respondents each listed 5 successful firms and 5 less successful firms, including their own firm) as elements. His findings suggest that uncovering managers' implicit frames may provide them with some control over those frames, and that managers appear to structure descriptions of firms around central features, forming coherent impressions. Dutton et al. (1989) later elicited the constructs/dimensions used to sort strategic issues in a Port Authority. Their approach compared the 26 dimensions identified in three literatures to their empirical work using the grid technique to see the degree of overlap in the dimensions strategic decision makers used when understanding strategic issues. They
advocated that these dimensions influenced decision makers' attention. Their findings suggest that describing the meaning decision makers apply to issues may be equally important for understanding the links between cognition and individual and organizational actions. Reger's (1987, 1990) work on strategic groups has also been exemplary in applying the (minimum context form of the) grid technique. She advocated that cognitive frames were worth studying because strategists can only reformulate strategies based on their perceptions of competitive positioning, filtered through existing frameworks. Her longitudinal exploratory study of the Chicago banking market (1982–1985) investigated 18 of the largest bank holding companies as elements for construct elicitation. She cleverly focused her grid questions so that her 24 respondents described to her how different banks were strategically similar in the way they operated and carried out their business in the market. Her findings revealed differences in how competitive position was defined by key players. Building on these findings, Reger and Huff (1993) asked an extended research question about whether perceptions of competitive groupings were widely shared or more idiosyncratic. Using the same data set, they found that industry participants actually developed shared perceptions about strategic commonalities among firms. Using Personal Construct Theory to explain their findings (see also Reger, Gustafson, Demarie, & Mullane, 1994), they found that executive perceptions were similar among subgroups of firms in an industry. In an extended study comparing the findings from Reger's (1987) Chicago study (classified as an industry undergoing medium change) with their 1989 grid study of 25 upper-echelon strategic managers in the Arizona financial intermediary industry (characterized as a highly changing environment due to the deregulating industry at the time), Reger and Palmer (1996) also compared these field works with Walton's (1986) study of the New York finance industry (classified as low environmental change) in studying executive cognitions over time when environments change. Their findings showed that managers tended to rely on old/obsolete maps to navigate new environments. They deduced that, to function effectively, managers needed to update their cognitive maps of the world to keep up with changing market forces (Hodgkinson's (1997, 2005) longitudinal study of the UK residential estate agents industry, using a modified grid approach, also found the existence of cognitive inertia). Reger and Palmer's study also found that managers in competing firms did in fact view competition differently in turbulent environments. They also advocated that the qualitative analysis of the content of dimensions generated from the grid technique was especially suited for comparing
mental models across individuals (see also Porac & Thomas, 1990; Thomas & Venkatraman, 1988). Daniels, Johnson, and de Chernatony (1994) investigated differences in managerial cognitions of competition in the UK off-shore pump industry using a visual card sorting mapping technique and a variation of the RGT to increase the robustness of their results. Their findings showed diversity in cognitions of competition (see also Hodgkinson and Johnson (1994), who, though not using the grid technique in this instance, found considerable variations in cognitive competitive structures overall and advised that it was questionable to aggregate perceptual data, though a high degree of intra-organizational agreement was found). Simpson and Wilson's (1999) study of two organizations operating in objectively similar environments in New Zealand found, based on the 24 grid interviews conducted in these two companies, that commonality of cognitive structures and individuality of cognitive maps existed simultaneously. Spencer, Peyrefitte, and Churchman (2003), looking at the entire local healthcare hospital administrator market and using a modified version of the full context form of the grid technique, also found both consensus and divergence in the belief structures of their 20 respondents. In a later study, Daniels, de Chernatony, and Johnson (1995) compared both these mapping methods (the RGT they administered being a modified version) in the study of competitive industry structures. They concluded that, though the visual card sorting method was not superior to the grid technique and both techniques elicited non-trivial constructs, the card sorting method was more practical and easier to administer, relegating the grid technique to being potentially too cumbersome. (Brown (1992) similarly compared different mapping methods and found respondents were quite negative about the grid technique – perhaps unjustly so, because respondents were exhausted after undergoing two different mapping methods prior to commencing the grid technique.) Daniels, Johnson, and de Chernatony (2002) continued to use both the visual card sorting method and a modification of the grid technique to investigate, this time, the influences of task and institutional forces on managers' mental models (see Daniels & Johnson (2002) and Hodgkinson (2002) for further discussion of the issues raised by the Daniels et al. (2002) approach). O'Higgins' (2002) work on what constitutes effective non-executive directors on Irish boards, using six role titles of effective and ineffective directors as elements, allows us to understand the mental models of non-executive directors. The purpose of her study was purely construct elicitation, to see what dimensions are used to select such directors on Irish boards. Her findings showed, through partial use of the grid technique, that the
cognitive process structures of the 26 prominent non-executive directors and chairmen interviewed showed remarkable homogeneity in what they looked for in candidates to sit on their boards (indirectly revealing the lack of diversity on Irish boards). De Leon and Guild (2003) applied the full RGT (which is quite rare to see, as most applications are either modifications of the method and/or partial applications) using six different business plans (two that were successful, two that were marginal and two that were rejected) to elicit what venture capitalists and entrepreneurs look for in "intangibles". The findings further demonstrated the significant contributions made by the grid technique in identifying intangibles assessed by investors and communicated by entrepreneurs. They praised the technique because it avoided contaminating the elicited constructs experts use to evaluate business proposals. Finally, the work of Wright et al. (2003) and Wright (2004a, 2004b, 2004c, 2005a, 2005b) provides a slightly more advanced application of the original full grid technique in eliciting more complex managerial cognitions in the strategy field. In the first case, they used a unique set of verb elements to represent the entire strategy-making process in order to elicit how top managers make sense of their strategy-making experience. Their findings were able to clearly identify, for the first time, the core perceptual dimensions (the labelling of the x- and y-axes on a two-dimensional cognitive map) and the language executives actually used to describe their experience with the different strategy-making activities. Wright (2004b) later compared strategic cognitions of the strategy-making process between high- and low-performing firms, defined in terms of return on equity (ROE), and found marked differences in the way executives in these two types of firms strategically think about their strategizing. In a later work, using the same improved technique of verb elements for construct elicitation, this time on board work, new insights were established into what board members look for in effective boards. Given the richness of these new insights generated from the use of the RGT in advancing our knowledge about strategy and strategy-making, we hope this chapter will bring you closer to appreciating the powers of the grid technique. In particular, building on the noteworthy works of past grid users, we hope the more advanced application of the grid technique (as advocated by Wright et al., 2003) will help take the utility of the grid technique to a new level, both in terms of the use of unique grid elements for construct elicitation (which allows us to elicit more complex strategic cognitions) and in the extension of idiographic maps into collective cognitions to detect emerging knowledge structures when groups of similar
people exposed to similar experiences are brought together. It is on these more recent works (built on the solid foundation of past noteworthy grid users in the strategy field) that we focus our discussion of the RGT. Before doing so, we feel it is important to have an understanding of the theoretical foundations of the technique, and it is to this underpinning that we now turn.
OVERVIEW OF THEORETICAL FOUNDATIONS: PERSONAL CONSTRUCT THEORY
Kelly (1955), a clinical psychologist, pioneered his theory of personality in the 1930s during the days of the Great Depression. In trying to understand how people see the world, he perceived three major concerns of the time. First, in order to establish any strong predictions of human behavior, one needed to study the masses in order to produce any meaningful results. This numbers game was not practical for Kelly, as he was more concerned with making rigorous predictions at the individual and group level. Second, all diagnosticians suffer from an underlying concern that their own analysis contributes to what is being diagnosed, resulting in serious observer bias in understanding another person's point of view. Lastly, he found that the role of the expert was overrated: people depended on the word of the expert to tell them what the problem was. "If you don't know what's wrong with a client, ask him; he may tell you!" (Kelly, 1955, p. 201). Hence, Kelly invented the RGT as a way of helping people become their own experts. The psychology of the day also worried Kelly because it advocated that people were influenced by the environment (stimulus–response). He strongly opposed this notion, theorizing that a person dictates his/her own actions and that it was not the environment per se that had a controlling effect. An individual is not a victim of the world (not a passive responder), but rather an active construer of his/her own destiny (this is very similar to the term "sensemaking" coined by Weick, 2001). In this respect, Kelly formulated his Fundamental Postulate, which forms the backbone of his entire Theory of Personal Constructs: "A person's processes are psychologically channeled by the ways in which he/she anticipates events" (see also Easterby-Smith et al., 1996). He viewed the person as a scientist in a constant state of flux, always interpreting and reinterpreting, construing and re-construing, a world that is heavily influenced by experience (Blowers & O'Connor, 1996; Burr & Butt, 1992; Ginsberg, 1989; Huff, 1997; Kelly, 1955). Kelly called
this network of interpretations a system of constructs, and introduced the notion of Constructive Alternativism to indicate the many varied possible constructions available to the individual in construing/making sense of the world. In a constructed world, a person's view is merely one of many (Kelly, 1955; Meindl et al., 1996; Stewart & Stewart, 1981). Kelly invented the grid technique as a way to get people to exhibit their construct systems and to serve as a psychological mirror for reflection. In essence, the fundamental focus of the Theory of Personal Constructs, when applied in the strategy area, is the analysis of the system of constructs a strategist uses to analyze, understand, structure, make sense of and change his/her environment. This can best be further elaborated through Kelly's (1955) 11 corollaries, as described by Scheer and Catina (1996). Constructs are the way we construe, interpret and make sense of the world around us. They are always bi-polar in nature, simply because Kelly believed that we think in terms of contrasts (dichotomy corollary); the word "good" does not mean much by itself until it is stacked against the word "evil". It is through this dichotomous dimension that we make sense of the world. As such, an individual's constructs have direct meaning because they form the foundation of our construction of reality (construction corollary). Constructs are organized into hierarchical systems of structure according to their importance to the individual (organization corollary). These different systems of constructs may compete and contradict each other when viewed at the same time (fragmentation corollary) and are very much based on experience (experience corollary). The person will always choose the side of the bi-polar construct that best reflects his/her motivation or preference, which in effect mirrors his/her true inclination (choice corollary). Constructs are meaningful to the individual; this is why Kelly termed his theory a Theory of 'Personal' Constructs (individual corollary). In spite of this 'personal' nature of construing, it is possible for an individual to construe the world in ways similar to others (commonality corollary); and even if one's construing differs from others', the person needs to be able to relate to and understand, to a reasonable degree, other people's construing in order to be a part of a socially constructed world (sociality corollary). It is also important to note that an individual's constructs have a limited usage or range of convenience (range corollary), and a construct's ability to welcome new events and life experiences will vary from construct to construct (modulation corollary). Now that we have an idea of the theory behind the grid technique, let us turn to the actual method and see how it is used to help explain the psychology of personal constructs.
What is a Repertory Grid?
A repertory grid is a cognitive mapping tool used to elicit and analyze the cognitive thought processes of individuals through a structured interview technique. The method engages the interviewee in an intense conversation (Centre for Person Computer Studies, 1993) to investigate the way people construe and make sense of their world. The actual grid itself is, in fact, an empty matrix/grid form that is used to record the interview conversation. In its empty state, the grid is deceptively structured with its matrix-like appearance. But the powers of the grid lie in its true flexibility and versatility once the purpose of the interview is established and the domain of investigation is incorporated into the design of the grid interview. In the strategy field, the domain of interest can range from how top managers make sense of competition, to defining strategic groups, to what constitutes effective directorship on boards, to even how the entire strategic management process is perceived and interpreted. It has been said that the grid can be as creative as the imagination (Fransella & Bannister, 1977; Stewart & Stewart, 1981). In other words, what you put into the grid will determine what you get out of it. The repertory grid form is a two-way classification of data in which issues for investigation are integrated with the individual's views of the world (based on lived experience). Fig. 1 shows a sample of an individual Deputy Chairman's completed repertory grid about his experience with nine critical board activities, generated from the author's current program of research. Four essential features of a grid must be considered and pilot tested before any administration of the technique is carried out: elements, constructs, rating the grid and forced choice (readers can refer to Easterby-Smith, 1980; Easterby-Smith et al., 1996; Fransella, 2003; Jankowicz, 1990, 2003; and Stewart & Stewart, 1981, for excellent accounts of wider applications and a deeper appreciation of alternative designs of the grid technique). This chapter will demonstrate how a particular approach is used to elicit more complex managerial cognitions in the strategy field (as pioneered by Wright et al., 2003).
Elements (Listed Vertically in the Middle of the Grid)
Elements are a key feature of the grid that determine the type and quality of construal elicited from interviewees. They define the research area to be investigated (Bell, Vince, & Costigan, 2002). In Fig. 1, our elements are labelled E1, E2, E3 … E9. All grid elements must demonstrate the following characteristics:
[Fig. 1. Sample of a completed repertory grid for a Deputy Chairman (grid focus: the board's CURRENT experience with nine critical board roles, E1–E9, not what should be). Each elicited bi-polar construct is rated for every element on a scale from 1 (similar to the emergent/construct pole, e.g. "carried out very well") to 5 (different, i.e. closer to the implicit/contrast pole, e.g. "not carried out very well"); a rating of 3 can also mean "does not apply". Constructs were elicited from systematic random triads of the nine board elements, using Kelly's (1955) Repertory Grid Technique.]
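To make the grid's structure concrete, the sketch below (not part of the original chapter) shows one plausible way to record a completed grid like Fig. 1 for later analysis. The nine element labels follow Fig. 1 and two of the construct labels echo phrases visible in the figure, but all ratings shown are hypothetical placeholders, and the element-distance function is only one simple illustration of how such data might be analyzed.

```python
# Minimal, illustrative sketch of a repertory grid as a data structure.
# Elements follow Fig. 1; the constructs' ratings below are hypothetical.

from math import sqrt

elements = [
    "Securing critical resources for company success",            # E1
    "Analyzing organizational resources and environment (SWOT)",  # E2
    "Monitoring ethical and legal compliance of company",         # E3
    "Overseeing strategy implementation initiative",              # E4
    "Developing and evaluating CEO and senior management team",   # E5
    "Carrying out the board roles the way I prefer",              # E6
    "Building external relationships with key stakeholders",      # E7
    "Evaluating firm performance against plans and budgets",      # E8
    "Reviewing mission, long-term objectives and strategies",     # E9
]

# Each construct is bipolar: an emergent pole and a contrast pole, plus one
# rating per element on a 1-5 scale (1 = like the emergent pole, 5 = like the
# contrast pole; 3 may also mean "does not apply").
constructs = [
    {"emergent": "commitment in thought and action",
     "contrast": "window-dressing",
     "ratings": [1, 3, 3, 1, 2, 1, 5, 2, 1]},      # hypothetical ratings
    {"emergent": "well thought-out (systems in place)",
     "contrast": "too busy to reflect",
     "ratings": [2, 1, 3, 2, 4, 3, 3, 1, 2]},      # hypothetical ratings
]

def element_distance(grid, i, j):
    """Euclidean distance between elements i and j across all construct
    ratings - one simple way to see which roles are construed as similar."""
    return sqrt(sum((c["ratings"][i] - c["ratings"][j]) ** 2 for c in grid))

# Example: how similarly does this respondent construe E1 and E9?
print(round(element_distance(constructs, 0, 8), 2))
```

Keeping both poles of each construct explicit in the record matters when interpreting the grid, since a low or high rating only has meaning relative to the bipolar contrast the respondent supplied.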
Homogeneous. Elements must either be all objects, all people, all events or all situations, and not a combination of these groups, because a mixture can become problematic when administering the grid for constructs (though Wright and Lam (2002) successfully demonstrated that it is possible to use different categories of elements in one grid by converting them all into "doing words" so that they all appear the same). The present chapter's demonstration adopted Wright and Lam's approach in order to capture more complex cognitions: strategy-making is (sometimes) made up of people, objects, activities and events, and as such it is more realistic to elicit the language senior executives use to make sense of this complex process of strategizing. Strategizing and strategic management is, and can be seen as, a complex array of activities. More conventional elements used in strategy research can be names of competitors, types of strategic decisions, or even CEOs or directors. In each of these examples, the list of elements is of the same kind. However, if you begin to venture into areas of strategy-making that involve a combination of people, events, situations, etc., and want to see how top managers make sense of them, then you would need to convert them into "doing words" (or verbs) in order to make them appear the same – in this case, all verbs. Hence, an example could be the entire ambit of what boards of directors do. In such a case, as in Fig. 1, the critical board roles/activities can be summed up as:
E1: Securing critical resources for company success
E2: Analyzing organizational resources and environment (SWOT)
E3: Monitoring ethical and legal compliance of company
E4: Overseeing strategy implementation initiative
E5: Developing and evaluating CEO and senior management team
E6: Carrying out the board roles the way I prefer
E7: Building external relationships with key stakeholders
E8: Evaluating firm performance against plans and budgets
E9: Reviewing mission, long-term objectives and strategies
Or, if you are interested in looking at the entire strategy-making process, the elements could be:
E1: Developing mission/vision
E2: Implementing strategy
E3: Analyzing organizational resources/capabilities (strengths/weaknesses)
E4: Developing core values/corporate culture
E5: Generating management practices and systems
E6: Carrying out the strategic management process the way you prefer
E7: Formulating long-term objectives
E8: Evaluating firm performance
E9: Enforcing corporate governance
E10: Liaising with the top management team
E11: Analyzing industry and environment (opportunities/threats)
E12: Formulating new strategies
By looking at grid elements this way, you will begin to open up opportunities for capturing more complex strategy cognitions.
Discrete and Representative. Elements should provide reasonable coverage of most aspects of whatever is being investigated, covering the four corners of the subject matter. Elements also need to be discrete so that a wider range of construction can be elicited from respondents. Elements that are sub-sets of another must not be included, as this will make differentiation between them problematic. Also note that if your chosen elements can have opposites, then it is more likely that you have "constructs" as elements. This will make the administration of the grid technique very clumsy and cumbersome (probably one of the key reasons people so often give up on the grid technique: they have misapplied the method, which in turn confuses the grid respondent). Hence, for first-time grid users it is always wiser to start with simple and very discrete elements, making them all people, all objects, all events or all situations (and not a combination of them). For more experienced grid users, using verbs as elements may sometimes make it appear that opposites can be generated, yet this can easily be avoided if you word the verbs in a manner that does not do so.
As Short as Possible. Elements must be specific and easily understood by the respondent. In this respect, while the number of elements can range from 8 to 24 (even more in clinical settings), about nine elements is an adequate number for most managerial applications when following the full grid protocol as described in this chapter (Easterby-Smith et al., 1996; Stewart & Stewart, 1981). This is because any larger number will significantly extend the time taken to administer the grid for construct elicitation. As a matter of fact, proper grid protocol suggests that nine representative elements of any domain of research interest should be sufficient to comb all four corners of one's area of research – if one's elements are indeed representative. One must also be mindful of the type of respondents the grid will be administered to. In the strategy field, these will most likely be senior executives at the upper echelon of their organizations, holding roles in the
top management teams and/or the board of directors. Given this sample's demands on time and strategic deadlines, we need to ensure a quick and easy entry and exit when interviewing these executives. When administered properly using nine elements, the grid can be comfortably accomplished within a solid hour of engaging conversation and elicitation; pilot testing with fellow colleagues, and even on oneself, before going out into the field is wise advice to take here.
Previously Experienced. Strategy elements must be well known to the person to whom the grid is being administered. A general rule is for the respondent to have had actual experience (preferably current or recent experience) with each of the elements so that the personal constructs generated from the grid interview are relevant and meaningful. That said, the grid technique can also be used for scenario situations to gauge how top managers would respond to certain given situations/crises. For such grid applications, we would be more interested in what their knowledge structures look like when they are put in situations unfamiliar to them.
Some Good Elements/Some not so Good. Kelly (1955) firmly believed that when eliciting personal constructs from people, it is always useful to have among the list, for comparison, a variety of good and not-so-good elements. This is because, when the typical grid is actually administered, the respondent will be better able to distinguish which elements are more similar than others in making distinctions about the way they see the world; and this is important in the generation of the elicited bi-polar constructs (language) senior executives use. So, for example, if you are interested in researching what makes for effective directorship on listed company boards, then you will have a list of some directors on listed boards who are very good at what they do and some who do not do a good job; sometimes, if you are not able to decide this, a list of all directors (assuming the focus is on one board) will suffice in discriminating the good from the bad based on the interviewed directors' own personal experience with each board member.
Supply the Elements or Elicit the Elements. Before conducting a repertory grid interview, you need to decide on the focus of your research (often called your domain of interest, in grid terms). Once this is decided, you then need to articulate the purpose of doing the grid interview: What is it that you want to achieve by interviewing senior executives? Once these two core and critical issues are decided, you will need to decide what your elements for construct elicitation are going to be. Remember that the elements you
Remember that the elements you decide on in your strategy research essentially define what you are trying to find out about top managers' mental models. You therefore need to decide whether you will supply the elements or elicit them from respondents. If you want to elicit elements of your domain of interest, there are several ways to do so; two approaches common in grid work follow. The first is simply to ask the interviewee to list them (for example, the names of their key competitors; or four recent strategic decisions that led to company success and four others that led to problems for the company). From such lists the interviewer gets a feel for what the strategist perceives to be competitors, or good and poor decisions. Once these elements have been elicited, the interviewer can randomly select three at a time to see how they are perceived in the language of the executive. The second approach is to provide a series of "role-title" questions to which the respondent supplies answers; these answers then form the elements used in the construct elicitation stage. A useful technique suggested by Stewart and Stewart (1981) is to ask the questions in pairs. Taking the example of studying effective directorship on boards, we could ask the respondent to name:
E1: A highly effective Director on the Board (example answer: William)
E2: A less effective Director on the Board (example answer: Philip)
E3: Yourself now on the Board
E4: An average Director on the Board (example answer: Wendy)
E5: Another less effective Director on the Board (example answer: John)
E6: The "Ideal" Director
E7: Another highly effective Director on the Board (example answer: Andy)
E8: Another average Director on the Board (example answer: Christina)
E9: Yourself as you would like to be when you have really mastered being a director
(Naturally, the idea is not to have the same name come up twice; if it does, the respondent is asked to think of another person. Sometimes executives do not want to provide real names for fear of disclosure, in which case it is fine to ask them to use initials or first names only; the researcher does not really need to know exactly who these people are, so long as the person being interviewed can recall who is who.) Fig. 2 provides an example of a grid completed by an Executive Director about her dealings with different types of directors on her board.

If the broader objective of your research is to compare the language and knowledge structures of groups of strategists or top managers, then it becomes important to ensure that all the elements (or supplied constructs) are the same in every grid interview administered to every respondent. Asking the same role-title questions across your sample therefore provides consistency of element types when you come to compare the elicited constructs at a later stage of your data analysis. Another approach is to supply the elements to each respondent, again ensuring that every interviewee is exposed to the same elements so that they act as a common denominator for cross-sample comparison later. By supplying the elements for construct elicitation we do lose a little of the "observer-bias-free" character that is the original hallmark of the grid. Nevertheless, this small degree of researcher intrusion into the research design is compensated by the wealth of rich data that can be elicited and compared between sample groups for emerging mental models of the way strategists see, interpret and make sense of their worlds.

In the case of our own research on effective boards (see Fig. 1), we wanted to see how boards of directors make sense of their board experiences. To do this, we had to find out what boards really do, so that we could then administer a grid interview eliciting how board members see, interpret and make sense of their board experiences. In narrowing down our nine critical board roles (the elements in Fig. 1), we first examined the extant literature on boards and what boards do. Once a preliminary list of board activities had been short-listed, the next step was to go into the field and ask actual board members (Chairmen, CEOs, Executive Directors, Independent/Non-Executive Directors) whether our list from the literature was consistent with what they actually do on boards. Once nine key and core critical board activities had been short-listed (merged and confirmed from the literature and the field), we conducted another pilot test, this time using the short-listed board activities as elements and actually administering the grid technique to two to five board members from within our intended sample population.
Fig. 2. Sample of a completed Repertory Grid for an Executive Director (grid focus: Effective Directorship). The grid records her current experience with the nine director elements E1–E9 (William, Philip, herself, Wendy, John, the "Ideal" Director, Andy, Christina, and herself as a fully mastered director), rated on each elicited bi-polar construct from 1 (similar) to 5 (different), where a rating of 3 can also mean "does not apply". The elicited constructs included "high EQ/can detach from emotion (based on facts)", "able to evaluate & look at whole picture" (vs. "more focused on things that are small"), "ability to communicate eloquently & logically", "have relationships with network", and "ability to bring in business (make it profit)" (vs. "doesn't have personality to bring in business"), together with the supplied construct "effective vs. ineffective".
This is to ensure that our board elements/activities (though not covering everything boards do) do in fact represent the main activities of board work. The pre-pilot test also checks that our wording of the elements is understandable, that personal board constructs are smoothly elicited, and how respondents react to the technique, inter alia.

One interesting observation from the sample of elements mentioned above and in Figs. 1 and 2 is that we have included a wild-card in the deck: element E6. We have termed this the "ideal" or, more appropriately, the "preferred" element (see Blowers & O'Connor, 1996). We find the inclusion of this type of element in most of our grid applications in the strategy field meaningful because it allows us to see, at the data analysis stage, how the actual strategy elements are interpreted and experienced relative to how grid respondents would prefer them to be. Such an inclusion can provide interesting insights, for both interviewee and researcher, into how things really are compared with how the strategist prefers them to be.

Constructs (Elicited from the Respondent and Listed Along the Rows from Left to Right)

These are the bi-polar dimensions along which the individual strategist makes sense of his or her world. The individual's own constructs help describe and differentiate the strategy elements as he sees them, based on his lived experience with them, and they are recorded on the repertory grid in his own words (unedited). Constructs are always elicited by presenting respondents with a triad of elements at a time and asking the Kellyian question: "In what way are any two of these similar, but different from the third, in terms of ...?" Kelly (1955) deliberately designed the elicitation in this manner because he believed that differentiating between similarity and difference brings out a stronger sense of how people see, interpret and make sense of their life experiences. Again using our earlier example, the word "good" does not mean much unless contrasted against the word "evil"; when "good" is placed against "bad", the word conjures up a very different meaning. According to Kelly, it is the meaning people invest in their experiences that is the potent ingredient (Burr & Butt, 1992). Questioning of this type therefore sheds new light on how people think, and this can prove a very powerful medium of understanding. The second part of the question, "... in terms of ...", is equally powerful in that it gives the interview a more focused design. It can take a range of forms depending on the purpose of your grid interview.
In the case of our current work on effective boards and effective directorships, for example, we have used "in terms of how well or how not well they are carried out" (for Fig. 1) and "in terms of what the board member does that makes them an effective or ineffective director on the board" (for Fig. 2); you will notice that we supplied the first construct for cross-checking purposes. In eliciting personal constructs by presenting three elements at a time, a good rule is to ensure that each element is triadically compared at least twice in any grid interview, so that all elements are well represented in the discussion. A standard procedure for eliciting constructs from nine elements is to follow the systematically random pattern 1, 2, 3; then 4, 5, 6; then 7, 8, 9; and then 1, 4, 7; 2, 5, 8; and finally 3, 6, 9 (quite simply triading the three rows and then the three columns of a 3 x 3 layout). Note that, before going into the field, the researcher should decide which strategy elements in particular are to be compared with which; strategically ordering the elements into positions E1, E2, E3 ... E9 can therefore be an important issue to address beforehand (for our current research on boards and strategy-making, we placed our "preferred" element in position E6). Two final points on eliciting constructs. First, before asking the Kellyian question about which two are similar but different from the third, it is helpful to ask the respondent to imagine and visualize actually experiencing each of the elements as they are discussed; such a prompt makes it much easier for the interviewee to generate personal constructs based on actual experience (and not on what should be). Second, the researcher must use some degree of probing with open and closed questions. If you get a construct such as "male vs. female", you should not simply stop and record it on the grid form; rather, probe further or, in Hinkle's (1965) words, ladder up or ladder down. Laddering up (asking "why" questions) elicits more super-ordinate constructs closer to the respondent's core values; laddering down (asking "what" or "how" questions) yields subordinate constructs. The idea is to elicit the highest super-ordinate construct you can before recording it on the grid form. If you administer the grid correctly, you can elicit about 6–10 core personal constructs of a person's construct system in the space of one to one and a half hours. Practice in laddering (probing) leads to better construct elicitation and a richer data set of strategists' mental models.
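For readers who want to script the elicitation schedule in advance, the following short sketch (our own illustration in Python; it is not part of the original grid protocol, and the function name is ours) generates the systematically random triads just described, 1-2-3, 4-5-6, 7-8-9, then 1-4-7, 2-5-8, 3-6-9, and checks that every element is compared at least twice.

```python
# Minimal sketch (illustrative only): build the "systematically random"
# triad schedule for nine elements and verify that every element is
# presented at least twice.
from collections import Counter

def triad_schedule(n: int = 9, width: int = 3):
    """Triads taken along the rows (1-2-3, 4-5-6, 7-8-9) and then down
    the columns (1-4-7, 2-5-8, 3-6-9) of a 3 x 3 layout of elements."""
    elements = list(range(1, n + 1))
    rows = [elements[i:i + width] for i in range(0, n, width)]
    cols = [list(col) for col in zip(*rows)]
    return rows + cols

schedule = triad_schedule()
print(schedule)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9], [1, 4, 7], [2, 5, 8], [3, 6, 9]]
counts = Counter(e for triad in schedule for e in triad)
assert all(c >= 2 for c in counts.values())  # each element compared at least twice
```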
Rating the Grid (The Numbers in the Middle of the Grid)

Once the individual's constructs have been elicited, the grid respondent is asked to rate the strategy elements (on a 5-point rating scale) along the dimensions (constructs) he uses to make judgments about his strategy experience. This process of linking constructs to elements shows how the constructs are used in relation to the elements and articulates the meaning of each side of the construct poles; Kelly (1955) called it "putting numbers to words". Longer scales can be used (for example, 7-point or even 10-point scales) but, in our experience, are not really advisable, as they extend the time taken to complete a full grid (see the section on administering the grid for how the grid is actually rated).

Forced Choice of Elicited Strategic Construct Poles

Once the repertory of constructs has been elicited and each element rated on each construct, the grid subject is asked to choose the preferred side of each construct pole by marking it with a tick. This checkmark is important because, when the completed grid is entered into a grid program and cognitive maps are produced, we can easily see which side of each bi-polar construct was preferred. The question normally asked at this final stage re-emphasizes the purpose of the interview. For our effective boards example we would ask, "Which side of each of these bi-polar constructs do you prefer in what makes a highly effective board?"; for the directorship example, "Which side of each of these bi-polar constructs would you choose in what makes a highly effective director?" Some respondents do not want to choose between sides and prefer both poles; in such cases you need to encourage them to say which side, when push comes to shove, they would really choose. The choice the individual makes reflects an extension of his inner construct system (the Choice Corollary) and of the way he construes the world, and this can be very revealing to both interviewee and researcher.

A Meaningful Approach to the Grid Technique's Reliability and Validity

A prominent theme in discussions of future methodological directions in the strategic management literature points towards more cognitive orientations employing innovative tools that can generate finer-grained results, are multi-method in approach, longitudinal, and combine qualitative and quantitative methods (Hitt, Gimeno, & Hoskisson, 1998; Snow & Thomas, 1994; Sparrow, 1994). We believe the RGT satisfies these requirements thanks to its flexibility and its ability to rigorously and systematically elicit the cognitive perceptions of the way real people construe their real-life experiences (Beail & Fisher, 1988; Jankowicz, 1990; Kelly, 1955; Slater, 1977).
The technique is largely free from observer bias, eliciting constructs "in the language of the natives" (in effect, respondents are asked to tell it as it is, in their own words, without influence from predetermined researcher questions), and it shows a high degree of reliability in both its qualitative and quantitative characteristics in generating strategic knowledge structures (Adams-Webber, 1989; Slater, 1977; Smith, 2000; Wright et al., 2003). The resulting maps have been found to be both powerful reflective psychological mirrors and effective tools for improving managerial decision-making.

The conventional test of reliability is whether an instrument of measurement produces identical results when repeated. Kelly (1955), however, argued that this conventional test is inappropriate when applied to the RGT, contending that "reliability is that characteristic of a test that makes it insensitive to change"; yet the very essence of the grid method is to capture change as and when it occurs. This goes to the heart of Kelly's Fundamental Postulate, in which he conceptualized man as a scientist who is always testing and re-testing hypotheses about events on the basis of everyday experience. Through this systematic self-diagnosis of experiences, man anticipates the replication of events, and in this very process of constructive alternativism some constructs within his construct system are confirmed, some challenged, some rejected, and new ones formulated. Man, according to Kelly, is in a constant state of flux and motion, never really at a status quo. Given this essence of construal in man's internal system, it becomes clear why Kelly argued against conventional notions of replicability and reliability. Mair (1964), Bannister and Mair (1968), Fransella and Bannister (1977), Adams-Webber (1989) and Bell (1990) all advocated that we should not talk about "the" reliability of "the" grid, as there are many forms of grid (as demonstrated in Figs. 1 and 2), each with its specific purpose and domain of interest, involving different numbers of constructs and elements and an almost infinite array of individual grid scores. It is therefore more appropriate to refer to the reliability of a specific grid than to grids in general. Citing the analogy of a failed thermometer producing consistent readings each time, Fransella and Bannister (1977, p. 82) proposed that "the overall aim is surely not to produce stable measures – stability or instability exists in what is measured, not in the measure (itself). Our concern is, as Mair (1964) put it, to assess predictable stability and predictable change" (italic emphasis added).
Indeed, we should study the grid from the point of view of when it does show changes in constructions and of what those changes signify for a person's own perception of the world over time (p. 83). Bannister and Mair (1968, p. 171) also argued that change is an integral characteristic of construct systems and that some constructs will inevitably show degrees of stability and instability over time: "Super-ordinate constructs may be more stable than subordinate, core role more stable than peripheral, and tight more stable than loose constructs". In this respect, some grids may show lower reliability if the individual completing the grid is undergoing a period of transition in his or her life, producing constructions different from those originally elicited (Sperlinger, 1976). So much of life and of our experience is about change, and grids are highly sensitive to such changes; conventional tests of reliability are therefore of little use in evaluating the psychometric properties of repertory grid data. Viney (1988, p. 200) adds that such traditional reliability tests ignore the constructivist assumptions of personal construct psychology, in that "reliability may be better conceptualized as dealing with inconsistency than with consistency, that is as an index of sensitivity to change" (see also Blowers and O'Connor, 1996, p. 18). These views notwithstanding, Table 1 shows some of the more promising findings in the literature on the reliability of the grid. Though not all reliability studies have found high test–retest scores (see, for example, Bavelas, Chan, & Guthrie, 1976), the findings documented in Table 1 show very high percentages of repeat constructs, signifying, in more traditional terms, that the repertory grid can produce highly reliable data when the technique is administered again at time T2 (Slater, 1977, p. 127; see also Bell, 1990, p. 42).

As for validity, grid data can be validated in several ways. First, is the grid measuring or eliciting what it claims to? The technique is designed to elicit the cognitive thought processes through which people see, interpret and make sense of their worlds. The elicited constructs can be validated by feeding the results back to participants and asking them to do the interpreting, to check whether the constructs are valid from the point of view of the person who provided them. Another method is to show the constructs to a sample of the population within which one is undertaking the strategy research and ask whether they make sense. This can be further complemented by the researcher's own analysis of the data set, checking whether the elicited constructs and their connections with the elements do in fact make sense. Ultimately, of course, the real test lies with the person from whom the constructs were originally elicited.
Table 1. Some Notable Grid Reliability Studies Showing High Test–Retest Scores.

Reliability study | Major findings | Test–retest period
Hunt (1951) | Stability of elicited constructs. Gave 41 role titles/administered triads. Found 70% repeat constructs. | 1 week interval
Pedersen (1958) | Stability of elicited elements. Found 77% reproduction of original elements. | 1 week
Fjeld and Landfield (1961) | Repeated Hunt's (1951) study. Found 80% correlation between first and second elicitation. | 2 week interval
Fjeld and Landfield (1961) | Repeated Pedersen's (1958) study and found 72% reproduction of originally elicited elements. | 1 week
Fransella and Adams (1966) | Found high correlation of "like me" constructs with other constructs at retest. | 1 month
Bannister and Mair (1968) | Element variance: rank-ordered photographs on supplied constructs. Found a correlation of 86%. | 6 week intervals
Sperlinger (1976) | Distance between "Self" and 11 other grid elements (rank-order correlation on two occasions = 0.95); rank-order correlation of identical repeat constructs of 0.85. Perceptions of one's similarity to others remained substantially unchanged over a long period. | 7.7 months
Feixas, Lopez Moliner, Navarro Montes, Tudela Mari, and Neimeyer (1992) | Showed significant test–retest reliability for most structural measures, with a modal reliability coefficient of 0.85. | 1 hr/1 week/1 month
Neimeyer (1994) | Reliability of a grid in a limited domain: test–retest correlations of 0.73 (n = 12) and 0.82 (n = 32); split-half reliabilities of 0.92 (n = 13) and 0.80 (n = 32). | 3 weeks
Smith (2000) | Found impressively high test–retest stability for most structural measures of a grid: an Intensity score of 0.87 and a Percentage of Variance Accounted for by the First Factor (PVAFF) of 0.73 over a 12-month period. | 12 months

Note: The majority of reliability studies focus on repeat construals with short test–retest periods; the studies with longer test–retest periods are the exceptions.
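To make the test–retest idea behind Table 1 concrete, here is a small hedged sketch (our own illustration; the studies above used a variety of measures and designs, and the ratings below are invented) that simply correlates the ratings from two administrations of the same grid.

```python
# Illustrative sketch only: a crude test-retest indicator for one grid,
# correlating the flattened constructs-by-elements rating matrices taken
# at time T1 and time T2. All rating values are invented.
import numpy as np

ratings_t1 = np.array([[1, 2, 5, 3, 4, 1, 2, 3, 2],
                       [2, 1, 4, 3, 5, 1, 1, 2, 3],
                       [5, 4, 1, 2, 1, 5, 4, 3, 5]])
ratings_t2 = np.array([[1, 2, 4, 3, 4, 1, 2, 3, 2],
                       [2, 2, 4, 3, 5, 1, 1, 2, 3],
                       [5, 4, 2, 2, 1, 5, 4, 3, 4]])

r = np.corrcoef(ratings_t1.ravel(), ratings_t2.ravel())[0, 1]
print(f"test-retest correlation of ratings: {r:.2f}")
```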
In sum, the grid technique is a useful and valid cognitive mapping tool for the following reasons:
(1) It is largely free from observer bias.
(2) It reveals the mental psyche in a highly rigorous, systematic and analytical way.
(3) Constructs are elicited in the language of the natives (respondents), not of the researcher.
(4) The repertory grid elicits constructs of how people see their worlds based on experience.
(5) It offers flexibility of application within a solid (deceptively structured) framework.
(6) It is concerned with cognitions, perceptions, mental thought processes and frames of reference: how people construe, construct, interpret and make sense of reality.
(7) It can produce cognitive maps understandable by the intended audience.
(8) It is solidly grounded in the Theory of Personal Constructs.

How to Administer the Repertory Grid Technique in Strategy Research

In this section we show how a single grid is administered and how we aggregated several grids to look at a different level of analysis: what cognitions look like at the collective level. One should note that the original spirit and intent of Kelly's (1955) work lay in the clinical context, where he was trying to establish a deeper understanding of his client patients; his theory of personal constructs and the repertory grid were originally designed as an idiographic method. In venturing into uncharted waters with the technique in the strategy field, however, not only did we vary the types of elements within the same elicitation interview to elicit more complex cognitions (as explained above), but we also wanted to see what happens when top managers doing similar work, exposed to similar issues, make sense of their strategizing worlds. In essence, the approach we share here seeks to uncover emerging strategic thinking when individual cognitions are brought together to form collective cognitions (Prahalad & Bettis, 1986; Spender, 1989; Tenbrunsel et al., 1996; Walsh, 1995). Our current work in this direction has brought new insights by applying this modified approach, and it is on this collective level that we focus most of our administration and data analysis. (Apart from the aggregation approach employed in this section, idiographic cognitive maps, principal component analyses and cluster diagrams can still be generated from Figs. 1 and 2.) To reiterate: because we want to compare strategic cognitions, we need to ensure that our elements are kept the same across all one-on-one grid interviews.
As discussed earlier, whether you supply or elicit them, the elements need to be consistent, confirmed from both the literature and the field, and pilot tested before the actual fieldwork starts. Once the elements are decided (as in our Fig. 1 example of the nine critical board roles), each is printed onto a 4.5 by 3 inch card that is laminated for repeated use. Once board members have agreed to be interviewed, good practice in grid work is to prepare the respondent mentally before the interview, and again in the interview itself before the elicitation stage. What has worked for us is to email the executive a brief one-page outline of the intended interview at least one week before the meeting. This outline sets out the framework of the hour we will spend together: Part I (about 5 min) is basic data collection (tailored to whatever demographic and other information you wish to seek); Part II (about 10 min) introduces the elements that will be the focus of the discussion on the day and asks the executive to rate them for initial reflection before the interview; Part III (about 45 min) is where the real grid work begins and, when administered correctly, can be comfortably (though intensively) achieved within a solid 45 minutes. On the day, the same outline is repeated, with Parts I and II discussed to prepare the respondent for the actual grid elicitation. In our experience, Part II is especially useful when we briefly get the respondent to talk about each of the board activities, now presented on the laminated cards with the qualifying question "... in terms of ..." printed at the footer of every card to keep the discussion focused. For each board element discussed (E1, E2, E3 ... E9), and remembering that this section should take about 10 minutes at most, so the researcher needs to control the time, the interviewee is asked: "On a scale of 1–10, where 1 is very poorly done and 10 is very well done, how would you rate this board activity as it is actually happening on your board (and why)?" Once Part II is over, explain that you will now give a quick demonstration of how questions will be asked in Part III, and that it is important they understand the technique because all remaining questions will be asked this way. Present three words on a laminated A4 page (landscape):
CAR HORSE TRAIN

Explain that here are three things, and that the way you will ask questions is: "In what way are any two of these similar but different from the third one; what would you say?"
The researcher provides the first example: "For example, the CAR and the HORSE are similar because they both have four legs, whereas the TRAIN is different because it has many legs. So 'four legs vs. many legs' – I am going to call this a construct, because this is the way I construe these three things. (Pause.) Now can you give me another example in which any two are the same but different from the third?" Once the interviewee gets the idea, one can comfortably move on to Part III, where exactly the same questioning technique is used; you will soon find that both interviewer and interviewee become even more interested and engaged in the interview from this point on. (A point to note when eliciting constructs is that the two poles of a bi-polar construct do not necessarily have to be opposites; it depends on how the person sees the differentiation. Some people with more complex construct systems (Adams-Webber, 1989) may exhibit constructs whose poles are not direct opposites; for example, the opposite of "black" may not be "white" to some, but rather "happiness".)

Present the first triad of element cards, E1, E2 and E3, to the interviewee and ask: "In terms of how well or how not well they are done on your board, in what way are any two of these critical board roles similar but different from the third?" The board member's replies will take the form of bi-polar constructs. For example, when presented with the following triadic comparison:

E1: Acquiring/securing critical resources for company success
E2: Analyzing organizational resources and environment (SWOT)
E3: Monitoring the ethical and legal compliance of the company

the response from "Herbert" (a Deputy Chairman) in Fig. 1 was that critical board roles 2 and 3 were similar on his board and well done "because there is commitment in thought and action", whereas element 1 (acquiring/securing critical resources for company success) was different and not as well done because "when we do it, it is in competition with profitability". Hence the elicited bi-polar construct: commitment in thought and action vs. in competition with profitability. (For cross-reference, the researcher writes "123" on the left of the grid and simultaneously fills in the grid with the elicited construct.) Repeat this procedure, with each of the nine critical board roles shown to the Deputy Chairman at least twice, so that all four corners of the board member's cognitions about the subject are tapped. An important point at this construct elicitation stage is to ensure that the respondent has ample time to think about and respond to the triads of strategy elements put before them.
This is because there will inevitably be moments of long pauses and contemplation as the respondent thinks about how two items are similar but different from the third. At these moments the interviewer is strongly advised not to provide examples or to lead the respondent into any frame of thinking; to do so would impose the researcher's constructs on the respondent's own construing. Once as many constructs as possible have been elicited from the supplied board roles, the respondent is asked to rate each board role on his own elicited bi-polar constructs using a 5-point scale (a rating of "1" describes board activities closest to the left-hand pole of the elicited construct, a rating of "5" those towards the right-hand pole). In the first example (123), recall that elements 2 and 3 were seen as similar based on Herbert's experience of them. Before Herbert rates the whole row, we can either simply ask him to rate each element against the two ends of the scale (so each cell can be 1, 2, 3, 4 or 5) or, better and more systematically, as advocated by Stewart and Stewart (1981), use the following approach: "Now, for the first construct I elicited from you, you said that elements 2 and 3 were similar, so let me give them both a '1'; you said they were similar because there is 'commitment in thought and action', and that was why they were well done. For E1, 'Acquiring and securing critical resources for company success', you said it was different, so let me give it a '5' (fill in these numbers in their respective cells as you speak). Now, on a scale of 1 all the way to 5 (at this moment, point at both poles of the construct and the headers that say 'similar 1' and 'different 5' on the grid form), on a scale of 'commitment in thought & action' to 'in competition with profitability', how would you rate E4, overseeing strategy implementation initiatives, as it is actually happening in your board now? E5? E6? ... E9?"
The Deputy Chairman then rates the remaining elements E4 to E9 on that one dimension. The same procedure is followed for the next bi-polar construct row, until all elements have been rated using the respondent's own constructs and his own ratings of them, based on what is actually happening right now on his board (and not on what should be). At the end of this rating exercise we ask him to choose, for each bi-polar construct, the side that in his view makes a highly effective corporate board; these choices are marked with a tick. After this paper-and-pencil recording of the interview, the completed grid is input into the RepGrid program to generate cluster analyses, cognitive maps and other statistical information. Figs. 3 and 4 provide an example of the Deputy Chairman's individual cognitive map generated from Principal Component Analysis and presented on two core dimensions (the x- and y-axes) in the executive's psychological space, with and without the construct lines, respectively.
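Before such a completed grid goes into a grid program, it can be thought of as a simple constructs-by-elements record. The sketch below is our own hypothetical representation (the class name and field names are ours, and all ratings other than those for E1–E3 in Herbert's first construct are invented), showing one way the paper-and-pencil grid could be captured.

```python
# Hedged sketch of recording a completed grid (data are illustrative only).
# Each construct row stores the triad that elicited it, its two poles, the
# 1-5 ratings of the nine elements, and the pole ticked as preferred.
from dataclasses import dataclass

@dataclass
class ConstructRow:
    triad: str                   # e.g. "123": the elements that elicited it
    emergent_pole: str           # a rating of 1 means closest to this pole
    implicit_pole: str           # a rating of 5 means closest to this pole
    ratings: list                # one 1-5 rating per element E1..E9
    preferred: str = "emergent"  # pole ticked as making an effective board

elements = ["E1 Securing critical resources", "E2 Analyzing resources/environment (SWOT)",
            "E3 Monitoring ethical & legal compliance", "E4 Overseeing strategy implementation",
            "E5 Developing/evaluating CEO & senior team", "E6 Board roles the way I prefer",
            "E7 Building external relations", "E8 Evaluating firm performance",
            "E9 Reviewing mission & long-term objectives"]

first_construct = ConstructRow(
    triad="123",
    emergent_pole="commitment in thought & action",
    implicit_pole="in competition with profitability",
    ratings=[5, 1, 1, 2, 4, 1, 2, 3, 3],   # E1-E3 from the example; the rest invented
)
```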
Fig. 3. Individual Cognitive Map of Herbert, Deputy Chairman, based on his perceptions of the 9 Critical Board Roles (PrinCom map: 9 elements, 7 constructs, ratings 1–5, with construct lines). The x-axis (67.70% of variance) runs from "commitment in thought & action" to "not self-reflective enough"; the y-axis (15.52%) from "well thought-out (systems in place)" to "need to devote more time".

Fig. 4. Individual Cognitive Map of Herbert, Deputy Chairman (without construct lines).
How to Read a Repertory Grid Cognitive Map

The two axes are labelled on the basis of construct loadings and form the backbone against which the board member makes judgments about his board experience (we explain later, using another example, how this is done). This in itself is a discovery: it is the first time such maps have been produced in board research, and it represents the beginning of things to come from this clinical psychological methodology (though Jackson and Dutton (1988) and Ketchen and Shook (1996) have applied cluster analysis and multi-dimensional scaling (MDS) in their analyses of strategy issues). Six simple steps for reading the cognitive maps:

(1) The nine elements representing the critical board roles the Deputy Chairman has experience with are shown on the individual maps, marked E1, E2, E3 ... E9.
(2) The lines in Fig. 3 are the bi-polar constructs elicited during the interview; they represent the language the Deputy Chairman uses to define, judge and make sense of how his board carries out the nine critical board roles (some construct/vector lines are longer than others, simply signifying that they load most heavily on a particular principal component).
(3) Each bi-polar construct line has an asterisk "*" on one side: this is the pole the respondent preferred when forced to choose which pole characterizes a highly effective board.
(4) Using these construct lines, we can describe how the Deputy Chairman sees each of the nine critical board roles: go to any element and use the construct lines near it to describe how the board member perceives it. For example, critical board role E5 (developing and evaluating the CEO and senior management team) was construed as "not carried out well", "too busy to reflect" and "not self-reflective enough". Clearly this is one critical board role that needs urgent attention if it is to add any value to the board and the overall performance of the company.
On another note, E6, carrying out the critical board roles the way Herbert prefers (encircled in the top left-hand quadrant of the map), is described as well thought out (systems in place), deploying assets right, consciously not window-dressing (strong executive inertia), and carried out well. The point of presenting this type of cognitive map (and the subsequent data analysis from the grid program) is to consider the interpretations in terms of the relationships between constructs and other elicited constructs, between key elements and other key elements, and between constructs and elements, in psychological space (see Blowers & O'Connor, 1996).
(5) In Fig. 4 (without construct lines), the picture becomes even more revealing of the way the Deputy Chairman prefers the critical board roles to be carried out (i.e. the location of E6, encircled) and of where the other elements are positioned. The map now takes on a more decision-making mode: if other elements, such as E5 mentioned above, are to move towards what the Deputy Chairman prefers, they each need to move in the direction of E6. Element E9 (reviewing mission, long-term objectives and strategies), for instance, needs to move up the y-axis (be more well thought out, with systems in place) and, at the same time, towards the left of the x-axis (more commitment in thought and action when we do this). Clearly, Fig. 4 provides interesting insights, looking deep into the cognitive mindset of a Deputy Chairman and showing where each of the other critical board role activities is located relative to how he prefers they should be carried out (the location of E6).
(6) The percentages labelled on the x- and y-axes represent the percentage of variance. In each of these maps we are seeing 83.22% of the total variance, capturing most of the picture. One begins to get a feel for this executive's board cognitions in a way never before documented in the strategy literature. When these maps were fed back to the respondent to validate our findings, they not only showed his strategic thinking in simple visual images but also acted as a reflective psychological mirror of where matters stand and what actions need to be taken for improvement. The maps provided relevance (and rigor) to the respondent (Anderson, Herriot, & Hodgkinson, 2001; Fiol & Huff, 1992; Hodgkinson & Sparrow, 2002) because they were elicited from him in his own words and from his own ratings of the critical board roles on his own generated bi-polar constructs, based on his actual experiences at board level. The results are intimate and, as such, present meaningful information on which action can be taken for improvement.
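As a rough illustration of where such percentage-of-variance figures come from (we do not have access to Herbert's raw ratings, so the matrix below is invented, and the RepGrid program's exact computations may differ), a principal component analysis of the constructs-by-elements rating matrix can be sketched as follows.

```python
# Illustrative sketch: principal components of a constructs x elements
# rating matrix via SVD, reporting the variance captured by the first two
# components (the x- and y-axes of a cognitive map). Data are invented.
import numpy as np

ratings = np.array([[1, 2, 5, 3, 4, 1, 2, 3, 3],   # rows = constructs,
                    [2, 1, 4, 3, 5, 1, 1, 2, 3],   # columns = elements E1..E9
                    [1, 3, 4, 2, 5, 1, 2, 2, 4],
                    [2, 2, 5, 1, 4, 1, 3, 2, 3],
                    [1, 1, 4, 2, 5, 2, 2, 3, 4]], dtype=float)

centred = ratings - ratings.mean(axis=1, keepdims=True)   # centre each construct row
_, s, vt = np.linalg.svd(centred, full_matrices=False)
explained = s**2 / np.sum(s**2)
element_coords = vt[:2].T * s[:2]   # element positions on the first two axes

print(f"PC1 {explained[0]:.1%}, PC2 {explained[1]:.1%}, "
      f"first two together {explained[:2].sum():.1%}")
```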
The very fact that, using this cognitive mapping methodology, we were able to systematically and rigorously capture the great bulk of this Deputy Chairman's knowledge structure opens up a multitude of opportunities for further research using this venturesome technique. As a point of interest, Fig. 5 shows the cognitive map for "Rosa", the Executive Director in our effective directorship study (based on Fig. 2).

From Individual Level to Collective Cognitions: Taking Grid Work to a Whole New Level

Once several individual grids have been administered within a particular sample group (in grid work, anywhere between 15 and 25 interviews is a reasonable number to aggregate into collective cognitions for further comparative analysis), they can be brought together. Previous work concerned with bringing cognitions together has looked at shared cognitions (Borman, 1987; Cannon-Bowers & Salas, 2001; Gibson, 2001; Gioia, Donnellon, & Sims, 1989; Langfield-Smith, 1992; Langfield-Smith & Wirth, 1992; Porac, Thomas, & Baden-Fuller, 1989; Prahalad & Bettis, 1986; Rentsch & Klimoski, 2001; Salas & Cannon-Bowers, 2001; Spender, 1989), collective cognitions (Carley, 1997; Jolly, Reynolds, & Slocum, 1988; Schneider & Angelmar, 1993), and multiple grids (Bell, 1997, 2000; Dalton & Dunnet, 1999; Easterby-Smith, 1980; Feixas, Geldschlager, & Neimeyer, 2002; Hill, 1995; Kalekin-Fishman & Walker, 1996; Locatelli & West, 1996; Senior, 1996; Wright & Lam, 2002; Wright, 2004a, 2004b). In taking repertory grid analysis to a new level, we build on these past works and incorporate variations of the data reduction/aggregation methods used by Cammock, Nilakant, and Dakin (1995), Hill (1995) and Feixas et al. (2002) to generate collective cognitions. For demonstration purposes, and for variety, we draw on our past work eliciting how top managers make sense of their strategy-making process; the elements used are therefore the 12 elements mentioned earlier in the section on supplied elements. The steps taken are outlined here:

(1) All individually elicited constructs were number-coded (e.g. D1.1, D1.2 ... to signify the first director interviewed and his or her first, second ... constructs; D2.1, D2.2 ... for the second director, and so on). These coded constructs, along with their original ratings, are then typed into an Excel worksheet, with the different sample groups kept in separate worksheets of the same Excel file.
Fig. 5. Individual Cognitive Map of Rosa, Executive Director (without construct lines; grid focus: board evaluations of the nine director elements E1–E9). The principal axes are "more focused on things that are small – able to evaluate & look at whole picture" (57.99%) and "don't have personality to bring in business – ability to bring in business (make it profit)" (21.47%).
(In our study of cognitions of the strategy-making process, we compared 18 board members who were also HR directors with 18 board members who were non-HR directors, namely Chairmen, CEOs, Executive Directors and Independent Non-Executive Directors; the example here focuses on the non-HR directors.)

(2) Two "expert" judges were assigned these Excel worksheets and asked, first, to separately and individually sort all the bi-polar constructs into groups of similarly worded constructs. For each grouping, a bi-polar label is assigned that best reflects the nature of the emerging construct grouping (see Table 2 for an example of how individual constructs are aggregated using this method).
(3) Upon completion, the two judges exchange their Excel worksheets and check whether they agree with the other expert's grouping of the raw elicited constructs.
(4) After several days, the two experts meet in a room with a large table. Each expert's original construct groupings are printed on A4 paper (in a large font) and cut into strips, which are laid out on the table. Both experts review each other's groupings and the labels assigned to them; any disagreements are discussed until consensus is reached.
(5) Once all groupings are confirmed, the investigator takes each grouping (as in Table 2, for example), goes back to the original ratings of the number-coded constructs provided by each grid respondent, and calculates the average rating for each collective construct grouping.
(6) The resulting collective grid, now made up of collective constructs (derived as above) with their averaged ratings and the supplied strategy elements, is then input into a repertory grid program, RepGrid, as if it were a single person's grid, to generate a principal component analysis and collective cognitive maps for discussion (Centre for Person Computer Studies, 1993).

How to Analyze the Resultant Collective Cognitive Maps, Focus Tree Diagrams (Cluster Analysis), and Construct Correlations
The method used to analyze collective grids is the same as for individual grids. In generating results from the aggregated grid data, we can first work out the labels of the key principal components (the x- and y-axes) of the cognitive map; this is done through the principal component analysis routine of the grid program.
Table 2. Example of How a Collective Construct is Determined Using Individually Elicited Constructs (18 Boards of Directors).

Collective construct: "Very focused – Neglected and not as well focused"

No. | Coded construct | Preferred pole (*) | Contrast pole
1 | 3 | Focus on returns | Volatile/variability
2 | 6 | Very focused | Not too focused
3 | 82 | Long-term focus | Rules in place - short-term focus
4 | 87 | Very focused and do not deviate | Focus is too narrow
5 | 91 | Focused on value drivers | Only look after own turf
6 | R97 | Follow-through/stuck to our guns | Blockage of communication
7 | 98 | Stay on track | Do not want to take responsibility (no risk takers)
8 | 153 | Immediately impact on operations | Do not have immediate result on operation
9 | 177 | Focus in getting it done | Neglected and not as well focused
10 | R187 | Clear/focused | Resources misallocated
11 | 207 | Necessary for bottom line | Ever-changing
12 | 209 | Very precise | Short-term focus
13 | R228 | Common focus | Too narrowly focused
14 | R299 | Good at developing internally focused staff | Do not exactly know how the company interacts with customers

Note: Four constructs were reversed so that all preferred poles (marked with an asterisk "*") appear in the same column. Of the 150 constructs elicited from the 18 board members, two "expert" judges sorted similar constructs into groups and then labelled them using the subjects' own words; this table shows the 14 constructs that formed the emergent collective construct "very focused – neglected and not as well focused". Inter-rater reliability between the "experts" was 92%.
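A minimal sketch of the averaging in step (5) might look like the following; the construct codes, the ratings and the single grouping shown are invented examples rather than the study's actual data.

```python
# Hedged sketch: build one row of the collective grid by averaging, element
# by element, the ratings of all coded constructs sorted into one grouping.
import numpy as np

coded_ratings = {                     # 1-5 ratings of the 12 strategy-making elements
    "D1.3":  [1, 2, 4, 3, 2, 1, 2, 5, 3, 4, 2, 2],
    "D2.6":  [2, 1, 5, 3, 2, 1, 1, 4, 3, 4, 2, 1],
    "D5.82": [1, 2, 4, 2, 3, 1, 2, 5, 2, 3, 1, 2],
}
grouping = {"very focused - neglected and not as well focused": ["D1.3", "D2.6", "D5.82"]}

collective_grid = {label: np.mean([coded_ratings[c] for c in codes], axis=0)
                   for label, codes in grouping.items()}
for label, row in collective_grid.items():
    print(label, np.round(row, 2))
```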
Table 3 provides the construct loadings for each of the key principal components. As can be seen, 16 collective constructs were generated from the 150 constructs elicited from the 18 boards of directors. The first two components accounted for 61% of the variance, and the first three for 77% of the total variance. Both the x- and y-axes are labelled according to the highest loadings on each principal component. Once the labels for both axes are determined, we can produce the cognitive map of all 18 board members' collective cognitions of the way they see and interpret their strategy-making experiences; Figs. 6 and 7 show these maps in psychological space. Reading the maps is the same as described earlier for the individual maps on effective boards (the Deputy Chairman, Herbert) and on effective directorship (the Executive Director, Rosa). Moreover, Fig. 7 offers some meaningful reflection on what corrective action needs to be taken if strategy making is to be improved according to these 18 board members' view of things. Interestingly, E11 (analyzing industry and environment, i.e. opportunities and threats) was the only strategy-making element perceived to be in the same quadrant as their most preferred way of carrying out the strategy-making process (E6).

Fig. 8 presents the same dataset in a different way, this time as a Focus Tree Diagram, commonly known as a cluster diagram. Here we see, based on the ratings given by respondents, how tightly the constructs and elements that were rated in very similar ways cluster together; the smaller the cluster on the right of the diagram, the more significant the match between the ratings. What to look for, then, are constructs that cluster tightly together, and Fig. 8 shows three notable groupings. Cluster A tells us that, when board members undertake strategy making, they perceive that when it is clear what must be done there is buy-in from management, which makes things more focused and in turn leads to better allocation of scarce resources. Cluster B groups the following constructs: when there is understanding of the business and the environment, this provides direction and results in everything being aligned. The third noticeable grouping is cluster C (though not as significantly matched as clusters A and B): when there is no result-capturing system in place, there is not enough experience and knowledge, which makes strategy making very subjective and hard to measure, and reflects the lack of a robust process in place.
Table 3. Construct Loadings on Three Principal Components – All 18 Boards of Directors.

The 16 collective bi-polar constructs (preferred pole marked "*") were:
C1: *Very focused – Neglected and not as well focused
C2: *Very clear what must be done – Lack of direction
C3: *Better control/check and balance – Do not have control over it
C4: *Have measurement system – Do not have result capturing system
C5: *Understanding the business and environment – Does not relate to market demands
C6: *Have experienced people and knowledge – Do not have much experience and knowledge
C7: *Better allocation of scarce resources – Not spend enough time and resources
C8: *More communication and transparency – Not enough communication and transparency
C9: *Based on facts/objectives – Very subjective, not easy to measure
C10: *Provides direction – Do not have benchmark
C11: *Buy-in from management – There is no buy-in
C12: *Well established process – Not having robust process
C13: *Buy-in towards common vision – Fall short of objectives
C14: *Subject to change – Too rigid
C15: *All aligned – Poorly aligned
C16: *Move quickly – Do not move at all

Note: Each axis is labelled with the construct loading most heavily on that component. Principal Component 1 (x-axis) is labelled "better allocation of resources – not spend enough time on resources" and accounts for 41.56% of the variance; Principal Component 2 (y-axis) is labelled "all aligned – poorly aligned" (19.20%); Principal Component 3 is labelled "subject to change – too rigid" (15.86%). The first two components accounted for 60.76% of the total variance, and all three together for 76.62%.
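The axis-labelling rule used above (take the construct that loads most heavily on each component) can be sketched roughly as follows; the loadings here are invented and are not those of Table 3.

```python
# Hedged sketch: label each principal component with the bipolar construct
# having the largest absolute loading on it. Loadings are invented.
import numpy as np

constructs = ["very focused - neglected and not as well focused",
              "better allocation of resources - not spend enough time on resources",
              "all aligned - poorly aligned",
              "subject to change - too rigid"]
loadings = np.array([[ 1.3,  0.4,  0.1],   # rows = constructs,
                     [ 2.4,  0.5,  0.4],   # columns = PrinCom 1..3
                     [-1.6,  2.2, -1.6],
                     [ 0.4, -2.1,  2.8]])

for comp in range(loadings.shape[1]):
    best = int(np.argmax(np.abs(loadings[:, comp])))
    print(f"PrinCom {comp + 1} label: {constructs[best]}")
```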
Fig. 6. 18 Boards of Directors' (CEO, MD, GM, VP) Cognitions of the Strategy-Making Process (collective cognitive map with construct lines; 12 elements, 16 collective constructs, ratings 1–5). The x-axis runs from "better allocation of resources" to "not spend enough time on resources" and the y-axis from "all aligned" to "poorly aligned".
Fig. 7. 18 Boards of Directors' Cognitive Map (without construct lines).
Fig. 8. Cluster Analysis (Focus Tree Diagram) of 18 Boards of Directors' Strategic Constructs of the Strategy-Making Process, with the three tight construct groupings discussed in the text marked as clusters A, B and C.
process in place. If you look at the same three clusters and refer to the opposite poles of the bi-polar constructs, you will also gain deeper insight into how these boards of directors make connections between their strategic cognitions of the strategy-making process. Table 4 provides construct correlations for the elicited collective cognitions of the strategy-making process (the full versions of these constructs can be found in Table 3, above), offering further insight into what boards of directors look for, deep in their mental psyche, when making strategy. For grid work using small samples, compared with the larger samples typical of more conventional questionnaire research, the correlations are usually higher; those in the present table are also in the direction expected (Blowers & O'Connor, 1996).
Table 4. Construct Correlations Using the RepGrid Programme – All 18 Boards of Directors.

[Correlation matrix among the 16 abbreviated constructs: C1 *Very focused (Foc); C2 *Clear on job (Cle); C3 *Better control (Ctr); C4 *Measurement system (Mea); C5 *Understand business (Und); C6 *Experienced people (Exp); C7 *Better allocation (Bet); C8 *Transparency (Tra); C9 *Based on facts (Fac); C10 *Provides direction (Dir); C11 *Management buy-in (Buy); C12 *Established process (Pro); C13 *Common vision (Vis); C14 *Flexible (Fle); C15 *All aligned (Ali); C16 *Move quickly (Qui). The highest coefficients are discussed in the text below.]

Note: Constructs with the highest correlations are indicated in bold (|r| ≥ .75). This |r| value was chosen, somewhat arbitrarily, to focus on the most important correlations in the present study. Asterisks (*) indicate boards of directors' preferred constructs when evaluating the strategy-making process. Please refer to Table 3 for full descriptions of these constructs. The RepGrid programme calculated the construct correlations from respondents' ratings of the strategy-process elements on their own collective super constructs.
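The correlations reported in Table 4 are produced by the RepGrid programme from the boards' ratings of the twelve elements on the sixteen collective constructs. As a rough illustration of the computation only (the ratings below are invented, and the labels are simply C1-C16 as abbreviated in Table 4), construct-construct correlations of this kind can be obtained by correlating the columns of the ratings matrix and flagging pairs at the |r| >= .75 threshold used in the table:

```python
import numpy as np

# Invented ratings standing in for the collective grid: 12 strategy-making
# elements (rows) rated on 16 constructs (columns) on a 1-to-5 scale.
rng = np.random.default_rng(1)
ratings = rng.integers(1, 6, size=(12, 16)).astype(float)
labels = [f"C{i}" for i in range(1, 17)]   # C1 "Foc" ... C16 "Qui"

# Pearson correlations between constructs, i.e. between columns of the grid.
corr = np.corrcoef(ratings, rowvar=False)

# Flag construct pairs meeting the |r| >= .75 threshold used in Table 4.
threshold = 0.75
for i in range(len(labels)):
    for j in range(i + 1, len(labels)):
        if abs(corr[i, j]) >= threshold:
            print(f"{labels[i]} - {labels[j]}: r = {corr[i, j]:.2f}")
```

With invented ratings, few (if any) pairs will clear the threshold; with the boards' actual ratings, the pairs discussed in the text below (for example, C10 with C8) would be flagged.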
The highest correlations are highlighted in bold for easy reference. For example, with a correlation of 0.82, directors felt that when strategy making provides direction (C10) there is more communication and transparency (C8). When there was direction (C10), all things were aligned (C15, at 0.81), and such alignment provides transparency (C8, at 0.77). When everyone was clear on what must be done (C2), there was clear buy-in from management (C11, at 0.82). When there was understanding of the business (C5), there was clear direction (C10, at 0.79), which allows the firm to move quickly (C16, at 0.75). When there were experienced people (C6), evaluations tended to be more factual and objective (C9, at 0.78), with better measurement systems in place (C4, at 0.76). And finally, when the process was more focused (C1), there was better control (C7, at 0.76).

From a teaching perspective, as can be seen above, much can be (and has been) learned from examining and discussing the cognitive maps with students, from senior undergraduates to MBAs to doctoral candidates. We have found that using these maps in conjunction with traditional modes of teaching, including case studies, really does get students engaged in trying to understand how others (especially at the senior executive level) see, interpret and make sense of the world. Each map tells its own story, based on the board members' experience, whether at the individual or the collective level. One useful exercise we set students is to try to understand how successful board members view their experiences compared with those who are less successful: what differences can you see in the way they think about strategic issues? What can be learned from this? These reflective questions encourage students to think more deeply about board issues and to take away successful ways of thinking while simultaneously learning how not to think about certain issues. All this provides a more refined and thought-provoking appreciation of what goes on in the mindset of the boardroom executive in carrying out the strategy-making process (or other critical board roles, depending on the focus of your grid research).

Class discussions become livelier when we start to compare cognitive maps of board members at different levels and modes of analysis. Examples we have used include comparing Executive Directors with Independent Non-Executive Directors; board cognitions in successful versus less successful firms; family versus non-family board members; male versus female directors; and extending this line of work longitudinally and, indeed, cross-culturally. In this respect, we leave you with a table of advice (Table 5) to help in achieving these endeavours.
Table 5. Key Issues to Consider Using Repertory Grid Technique in Strategy Research.

Key Issue: Purpose and domain of interest.
Points to Consider: Once this is clear, designing the grid will be much easier.

Key Issue: Choosing elements.
Points to Consider: For first-time users of the grid, it is best to work with either all people elements, objects, issues, or situations. Ensure your elements adhere to the basic characteristics needed for element types. When you have become more experienced in grid applications, you can choose to use verbs as elements to elicit more complex managerial and organizational cognitions.

Key Issue: Supplied or elicited elements?
Points to Consider: Supplying will introduce a degree of researcher intrusion. However, if one of your purposes is to compare mental models of strategists, then it is advisable to supply them as a common denominator.

Key Issue: Must have experience with elements.
Points to Consider: Strategy elements are only meaningful if top managers have had experience with them; otherwise it is problematic to explain what has not been experienced. If, however, your purpose is more about scenarios and how people would respond to situations not yet experienced, then it is acceptable to provide a list, or elicit a set, of elements that have not been experienced.

Key Issue: Pilot test.
Points to Consider: This includes ensuring the elements you plan to use in your strategy research are indeed representative and meaningful to potential grid respondents. Pilot testing also includes undertaking several trial runs of actually administering the technique on yourself and with close friends before going out into the field. Only when you feel comfortable should you go out into the field, this time to pilot test one last time with a real respondent from your intended sample population. This is good practice.

Key Issue: Mentally prepare your grid respondent.
Points to Consider: At least one week prior to the intended interview, send an e-mail to the senior executive with a one-page outline of the rundown of the meeting, highlighting (a suggested) three parts: Part I is basically demographic data collection and possibly some questions related to the research focus; Part II will briefly discuss each of the elements in turn based on the focus of the interview; and Part III will be the main focus of the discussion, using a special interviewing technique to find out more about how senior executives think about their strategizing experiences. On the day, in the meeting, follow this three-part structure. Before beginning the grid elicitation in Part III, it is advisable to provide the CAR, HORSE, TRAIN demonstration.

Key Issue: Pledge of Confidentiality.
Points to Consider: As the grid technique is qualitative in nature and interview intensive, respondents may be concerned about issues of confidentiality. To address this concern, it is advisable to attach a signed letter when soliciting the senior manager's participation in the research interview, pledging your confidentiality to the information provided during the interview and that no part will be revealed to any third party connecting the respondent's organization with what the respondent has said. This letter and its contents should be presented again at the commencement of the interview meeting on the day. Such a pledge, provided in good faith, usually puts respondents more at ease.

Key Issue: Eliciting Constructs.
Points to Consider: Being systematic in the process provides more rigor. A useful approach is to apply the 123, 456, 789, 147, 258 and 369 combinations, ensuring each element is used for construct elicitation twice and so giving all elements an equal opportunity to be construed (see the sketch following this table). Remember not to ask any leading questions or to provide any examples based on your study (if you can help it). The whole idea is to elicit constructs in the top manager's own language, not language imposed by the researcher.

Key Issue: Qualifying question: "... in terms of ...".
Points to Consider: It is very important that you get this qualifier right, as it will help keep the grid elicitation process focused and in line with the main purpose and domain of why you want to interview top managers. It is always useful to have this qualifying question printed at the bottom of your laminated element cards. This way, the executive sees it while generating constructs.

Key Issue: Laddering up and down.
Points to Consider: The key to eliciting very insightful bi-polar constructs is to ladder (commonly known as probing). In grid terms, laddering up a person's construct system means finding out more about what makes this person tick and why he/she interprets the world in a certain way. "Why" questions will provide you with more super-ordinate constructs that are closer to the person's core values. But be careful not to take it too far up, as you don't want to catch people with their constructs down – some issues may be too personal to the individual, so use some discretion in how deep into someone's personal world you should go. Asking "how" and "what" questions is still fine, as they will help you elicit the subordinate constructs. Remember, the key to grid work is not to construe other people's construing!

Key Issue: Feed back the grid to validate constructs.
Points to Consider: When the grid has been analyzed, always feed the results back to the respondent and ask whether the maps and clusters make sense. This is the ultimate validation of the grid data and opens up further opportunity for clarification and discussion.
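The triad schedule recommended under "Eliciting Constructs" can be checked mechanically. The short sketch below is an illustration added here, not part of the original chapter: it confirms that the 123, 456, 789, 147, 258 and 369 combinations present each of the nine elements they cover exactly twice (with a twelve-element grid such as the one reported above, additional triads would presumably be added on the same principle).

```python
from collections import Counter

# The triad schedule suggested under "Eliciting Constructs" in Table 5.
triads = [(1, 2, 3), (4, 5, 6), (7, 8, 9), (1, 4, 7), (2, 5, 8), (3, 6, 9)]

# Count how often each element is presented across the six triads.
appearances = Counter(element for triad in triads for element in triad)

for element, count in sorted(appearances.items()):
    print(f"Element {element}: presented {count} times")

# Every element covered by the schedule is offered for construing exactly twice.
assert all(count == 2 for count in appearances.values())
```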
CONCLUDING COMMENTS

In this chapter, we have presented the power of the RGT, building on the noteworthy work of past strategy grid users. Their pioneering contributions have inspired us to develop new and more refined ways of applying the grid technique to elicit more complex cognitions of strategizing, both at the individual and at the collective level of analysis. We believe these breakthroughs in using the grid technique are in line with the need for more fine-grained research employing venturesome methods and alternative theoretical explanations, given the new research directions and questions that arise as the landscape of the field continues to change. Mapping executive cognitions of the way they see and interpret their strategizing worlds allows us to open the black box of strategic thinking in ways not seen before in the extant strategy literature. The graphic depictions of strategists' mindsets open up new insights into what was not visible before. The RGT as a mapping technique also fosters better explanation and understanding, and serves as a powerful decision-making tool that is reflective, informative and engaging for the strategist, the researcher and the practical teacher. We believe that the application of more psychological theories, such as the constructivist perspective outlined here, together with the application of cognitive mapping, has great potential to bring new insights to both the informed practitioner and the empowered teacher. In the final analysis, we believe that research of the character advocated in this chapter, where rigor meets relevance (Anderson et al., 2001; Barr, 2004; Shrivastava, 1987), is the way forward in meeting the needs of theory, practice and the classroom.
ACKNOWLEDGMENT

I wish to acknowledge my sincere appreciation to Professor David J. Ketchen for his belief and encouragement in the value of my work. And to Professors Richard Bell, Geoff Blowers, Brian Gains, Devi Jankowicz and Ed Snape for their invaluable insights and unending support in helping me take the application of the grid technique to new heights. Funding to pursue this program of research using the grid technique in the strategy field has been generously supported by the Department of Management and Marketing's Departmental General Research Fund (Project #G-T694), The Faculty of Business Research Fund (Project
#AP-F44) and the Internal Central Fund of the Research Office (Project # A-PA3J) of the Hong Kong Polytechnic University.
NOTE

1. The use of the masculine term also signifies the feminine.
REFERENCES Adams-Webber, J. R. (1989). Some reflections on the meaning of repertory grid responses. International Journal of Personal Construct Psychology, 2, 77–92. Anderson, N., Herriot, P., & Hodgkinson, G. P. (2001). The practitioner–researcher divide in industrial, work and organizational (IWO) psychology: Where are we now, and where do we go from here. Journal of Occupational and Organizational Psychology, 74, 391–411. Axelrod, R. (1976). Structure of decisions. Princeton, NJ: Princeton University Press. Bannister, D., & Mair, J. M. M. (1968). The evaluation of personal constructs. London: Academic Press. Barr, P. S. (2004). Current and potential importance of qualitative methods in strategy research. In: D. J. Ketchen & D. D. Bergh (Eds), Research methodology in strategy and management, (Vol. 1, pp. 165–188). Oxford: Elsevier, JAI. Bavelas, J. B., Chan, A. S., & Guthrie, J. A. (1976). Reliability and validity of traits measured by Kelly’s repertory grid. Canadian Journal of Behavioral Science, 8, 23–38. Beail, N., & Fisher, K. (1988). Two-component solutions of repertory grid data: A comparison of three methods. International Journal of Personal Construct Psychology, 1, 369–374. Bell, R. C. (1990). Analytic issues in the use of repertory grid technique. Advances in Personal Construct Psychology (Vol. 1, pp. 25–48). New York: JAI Press. Bell, R. C. (1997). Using SPSS to analyze repertory grid data. Unpublished manuscript. Bell, R. C. (2000). On testing the commonality of constructs in supplied grids. Journal of Constructivist Psychology, 13, 303–311. Bell, R. C., Vince, J., & Costigan, J. (2002). Which vary more in repertory grid data: Construct or elements? Journal of Constructivist Psychology, 15, 305–315. Blowers, G., & O’Connor, K. P. (1996). Personal construct psychology in the clinical context. Canada: University of Ottawa Press. Borman, W. C. (1987). Personal constructs, performance schemata, and ‘‘folk theories’’ of subordinate effectiveness: Explorations in an army officer sample. Organizational Behavior and Human Decision Processes, 40(3), 307–322. Brown, S. M. (1992). Cognitive mapping and repertory grids for qualitative survey research: Some comparative observations. Journal of Management Studies, 29(3), 287–307. Burr, V., & Butt, T. (1992). Invitation to personal construct psychology. England: Whurr Publishers Ltd. Calori, R., Johnson, G., & Sarnin, P. (1994). CEO’s cognitive maps and the scope of the organization. Strategic Management Journal, 15, 437–457. Cammock, P., Nilakant, V., & Dakin, S. (1995). Developing a lay model of managerial effectiveness: A social constructivist perspective. Journal of Management Studies, 32(4), 443–474. Cannon-Bowers, J. A., & Salas, E. (2001). Reflections on shared cognition. Journal of Organizational Behavior, 22, 195–202.
Carley, K. M. (1997). Extracting team mental models through textual analysis. Journal of Organizational Behavior, 18, 533–558. Centre for Person Computer Studies. (1993). RepGrid 2 manual. University of Calgary: Centre for Person Computer Studies. Conger, J. A. (1988). Qualitative research as the cornerstone methodology for understanding leadership. Leadership Quarterly, 9(1), 107–122. Daily, C. M., Dalton, D. R., & Cannella, A. A., Jr. (2003). Corporate governance: Decades of dialogue and data (Special issue). Academy of Management Review, 28(3), 371–382. Dalton, D. R., Daily, C. M., Ellstrand, A. E., & Johnson, J. L. (1998). Meta-analytic reviews of board composition, leadership structure, and financial performance. Strategic Management Journal, 19, 269–290. Dalton, D. R., Daily, C. M., Johnson, J. L., & Ellstrand, A. E. (1999). Number of directors and financial performance: A meta-analysis. Academy of Management Journal, 42(6), 674–686. Dalton, P., & Dunnett, G. (1999). A psychology for living: Personal construct theory for professionals and clients. Great Britain: European Personal Construct Association (EPCA). Daniels, K., de Chernatony, L., & Johnson, G. (1995). Validating a method for mapping managers’ mental models of competitive industry structures. Human Relations, 48(9), 975–991. Daniels, K., & Johnson, G. (2002). On trees and triviality traps: Locating the debate on the contribution of cognitive mapping to organizational research. Organization Studies, 23(1), 73–81. Daniels, K., Johnson, G., & de Chernatony, L. (1994). Differences in managerial cognitions of competition (Special issue). British Journal of Management, 5, S21–S29. Daniels, K., Johnson, G., & de Chernatony, L. (2002). Task and institutional influences on managers’ mental models of competition. Organization Studies, 23(1), 31–62. De Leon, E. D., & Guild, P. D. (2003). Using repertory grid to identify intangibles in business plans. Venture Captial, 5(2), 135–160. Dunn, W. N., Cahill, A. G., Dukes, M. J., & Ginsberg, A. (1986). The policy grid: A cognitive methodology for assessing policy dynamics. In: W. N. Dunn (Ed.), Policy analysis: Perspectives, concepts, and methods (pp. 355–375). Greenwich, CT: JAI Press. Dunn, W. N., & Ginsberg, A. (1986). A sociocognitive network approach to organizational approach. Human Relations, 39(11), 955–976. Dutton, J. E., Walton, J. E., & Abrahamson, E. (1989). Important dimensions of strategic issues: Separating the wheat from the chaff. Journal of Management Studies, 26(4), 379–397. Easterby-Smith, M. (1980). The design, analysis and interpretation of repertory grids. International Journal of Man-Machine Studies, 13, 3–24. Easterby-Smith, M., Thorpe, R., & Holman, D. (1996). Using repertory grids in management. Journal of European Industrial Training, 20(3), 3–30. Eden, C. (1992). On the nature of cognitive maps. Journal of Management Studies, 29(3), 261–265. Eden, C., & Ackermann, F. (1998). Making strategy: The journey of strategic management. London: Sage. Feixas, G., Geldschlager, H., & Neimeyer, R. A. (2002). Content analysis of personal constructs. Journal of Constructivist Psychology, 15, 1–19. Fexias, G., Lopez Moliner, J., Navarro Montes, J., Tudela Mari, M., & Neimeyer, R. A. (1992). The stability of structural measures derived from repertory grids. International Journal of Personal Construct Psychology, 5, 25–39.
Fiol, C. M., & Huff, A. S. (1992). Maps for managers: Where are we? Where do we go from here? Journal of Management Studies, 29, 267–285. Fjeld, S. P., & Landfield, A. W. (1961). Personal construct consistency. Psychological Reports, 8, 127–129. Forbes, D. P., & Milliken, F. J. (1999). Cognition and corporate governance: Understanding boards of directors as strategic decision-making groups. Academy of Management Review, 24(3), 489–505. Fransella, F. (Ed.) (2003). International handbook of personal construct psychology. England: Wiley. Fransella, F., & Adams, B. (1966). An illustration of the use of repertory grid technique in a clinical setting. British Journal of Social Clinical Psychology, 5(1), 51–62. Fransella, F., & Bannister, D. (1977). A manual of repertory grid technique. United Kingdom: Academic Press. Gibson, C. B. (2001). From knowledge accumulation to accommodation: Cycles of collective cognition in work groups. Journal of Organizational Behavior, 22(2), 121–134. Ginsberg, A. (1988). Measuring and modeling changes in strategy: Theoretical foundations and empirical directions. Strategic Management Journal, 9, 559–575. Ginsberg, A. (1989). Construing the business portfolio: A cognitive model of diversification. Journal of Management Studies, 26, 417–438. Ginsberg, A. (1990). Connecting diversification to performance: A sociocognitive approach. Academy of Management Review, 15(3), 514–535. Ginsberg, A. (1994). Minding the competition: From mapping to mastery (Special issue: Competitive Organizational Behavior). Strategic Management Journal, 15, 153–174. Gioia, D. A., Donnellon, A., & Sims, H. P. (1989). Communication and cognition in appraisal: A tale of two paradigms. Organization Studies, 10, 503–529. Gnyawali, D. R., & Tyler, B. B. (2005). Cause mapping in strategic management research: Processes, issues, and observations. In: D. J. Ketchen & D. D. Bergh (Eds), Research methodology in strategy and management, (Vol. 2, pp. 225–257). Oxford: Elsevier, JAI. Hill, R. A. (1995). Content analysis for creating and depicting aggregated personal construct derived cognitive maps. Advances in Personal Construct Psychology, 3, 101–132. Hinkle, D. N. (1965). The change of personal constructs from the viewpoint of a theory of implications. Unpublished doctoral dissertation, Ohio State University. Hitt, M. A., Boyd, B. K., & Li, D. (2004). The strategic management research and a vision of the future. In: D. J. Ketchen & D. D. Bergh (Eds), Research methodology in strategy and management, (Vol. 1, pp. 1–31). Oxford: Elsevier, JAI. Hitt, M. A., Gimeno, J., & Hoskisson, R. E. (1998). Current and future research methods in strategic management. Organizational Research Methods, 1(1), 6–44. Hodgkinson, G. P. (1997). Cognitive inertia in a turbulent market: A comparative causal mapping study. Journal of Management Studies, 34(6), 921–946. Hodgkinson, G. P. (2001a). Cognitive processes in strategic management: Some emerging trends and future directions. In: N. Anderson, D. S. Ones, H. K. Sinangil & C. Viswesvaran (Eds), Handbook of industrial, work and organizational psychology, (Vol. 2, pp. 416–460). London: Sage. Hodgkinson, G. P. (2001b). The psychology of strategic management: Diversity and cognition revisited. In: C. L. Cooper & I. T. Robertson (Eds), International review of industrial and organizational psychology, (Vol. 16, pp. 65–119). Chichester: Wiley. Hodgkinson, G. P. (2002). Comparing managers’ mental models of competition: Why selfreport measures of belief similarity won’t do. 
Organization Studies, 23(1), 63–72.
Hodgkinson, G. P. (2005). Images of competitive space: A study of managerial and organizational strategic cognition. Basingstoke, UK: Palgrave Macmillan. Hodgkinson, G. P., & Johnson, G. (1994). Exploring the mental models of competitive strategists: The case for processual approach. Journal of Management Studies, 31(4), 525–551. Hodgkinson, G. P., & Sparrow, P. R. (2002). The competent organization: A psychological analysis of the strategic management process. Buckingham: Open University Press. Huff, A. S. (Ed.) (1990). Mapping strategic thought. Chichester: Wiley. Huff, A. S. (1997). A current and future agenda for cognition research in organizations. Journal of Management Studies, 34(6), 947–952. Huff, A. S., & Jenkins, M. (Eds). (2002). Mapping strategic knowledge. London: Sage. Hunt, D. E. (1951). Studies in role concept repertory: Conceptual consistency. Unpublished master thesis, Ohio State University. Huse, M. (2000). Board of directors in SMEs: A review and research agenda. Entrepreneurship & Regional Development, 12(4), 271–290. Huse, M. (2005). Accountability and creating accountability: A framework for exploring behavioral perspectives of corporate governance (Special issue). British Journal of Management, 16, S65–S79. Jackson, S. E., & Dutton, J. E. (1988). Discerning threats and opportunities. Administrative Science Quarterly, 33, 370–387. Jankowicz, A. D. (1990). Applications of personal construct psychology in business practice. Advances in Personal Construct Psychology, 1, 257–287. Jankowicz, A. D. (2003). The easy guide to repertory grids. UK: Wiley. Jarzabkowski, P. (2003). Strategic practice: An activity theory perspective on continuity and change. Journal of Management Studies, 40(1), 23–55. Jarzabkowski, P. (2004). Strategy as practice: Recurisiveness, adaptation, and practices-in-use. Organization Studies, 25(4), 529–560. Jarzabkowski, P. (2005). Strategy as practice: An activity-based approach. London: Sage. Johnson, G., Melin, L., & Whittington, R. (Eds). (2003). Guest editors’ introduction: Micro strategy and strategizing: Towards an activity-based view. Journal of Management Studies, 40(1), 3–22. Johnson, J. L., Daily, C. M., & Ellstrand, A. E. (1996). Boards of directors: A review and research agenda. Journal of Management, 22(3), 409–438. Jolly, J. P., Reynolds, T. J., & Slocum, J. W. (1988). Application of the means-end theoretic for understanding the cognitive bases of performance appraisal. Organizational Behavior and Human Decision Processes, 41, 153–179. Kalekin-Fishman, D., & Walker, B. M. (1996). The construction of group realities: Culture, society, and personal construct theory. Florida: Krieger Publishing Company. Kelly, G. (1955, 1991). The psychology of personal constructs, (Vol. 1). Norton. Reprinted by Routledge, London. Ketchen, D. J., & Bergh, D. D. (Eds). (2004). Research methodology in strategy and management (Vol. 1). Oxford: Elsevier, JAI. Ketchen, D. J., & Bergh, D. D. (Eds). (2005). Research methodology in strategy and management (Vol. 2). Oxford: Elsevier, JAI. Ketchen, D. J., & Shook, C. L. (1996). The application of cluster analysis in strategic management research: An analysis and critique. Strategic Management Journal, 17, 441–458. Langfield-Smith, K. (1992). Exploring the need for a shared cognitive map. Journal of Management Studies, 29(3), 349–367.
Langfield-Smith, K., & Wirth, A. (1992). Measuring differences between cognitive maps. Journal of Operational Research Society, 43(12), 1135–1150. Locatelli, V., & West, M. A. (1996). On elephants and blind researchers: Methods for accessing culture in organizations. Leadership and Organization Development Journal, 17(7), 12–21. Lorsch, J. W., & MacIver, E. (1989). Pawns or potentates: The reality of America’s corporate board. Boston, MA: Harvard Business School Press. Lyles, M. (1990). A research agenda for strategic management in the 1990s. Journal of Management Studies, 27(7), 363–375. Mace, M. L. (1971). Directors: Myth and reality. Boston, MA: Harvard Business School Press. Mace, M. L. (1979). Directors: Myth and reality – ten years later. Rutgers Law Review, 32, 293–307. Mair, J. M. M. (1964). The derivation, reliability and validity of grid measures: Some problems and suggestions. British Psychological Society Bulletin, 17(55), 7A. McNulty, T., & Pettigrew, A. (1999). Strategists on the board. Organization Studies, 20(1), 47–74. Meindl, J. R., Stubbart, C., & Porac, J. F. (1996). Cognitions within and between organizations. Thousand Oaks, CA: Sage. Neimeyer, R. A. (1994). The threat index and related methods. In: R. A. Neimeyer (Ed.), Death anxiety handbook: Research, instrumentation and application (pp. 61–101). Washington, D.C.: Taylor & Francis. O’Higgins, E. (2002). Non-executive directors on boards in Ireland: Co-option, characteristics and contributions. Corporate Governance, 10(1), 19–28. Pedersen, F. A. (1958). A consistency study of the R.C.R.T. Unpublished doctoral dissertation, Ohio State University. Pettigrew, A. M. (1992). On studying managerial elites. Strategic Management Journal, 13, 163–182. Pettigrew, A., & McNulty, T. (1995). Power and influence in and around the boardroom. Human Relations, 48(8), 845–872. Porac, J. F., & Thomas, H. (1990). Taxonomic mental models in competitor definition. Academy of Management Review, 15(2), 224–240. Porac, J. F., Thomas, H., & Baden-Fuller, C. (1989). Competitive groups as cognitive communities: The case of Scottish knitwear manufacturers. Journal of Management Studies, 26(4), 397–416. Prahalad, C. K., & Bettis, R. A. (1986). The dominant logic: A new link between diversity and performance. Strategic Management Journal, 7, 485–501. Prasad, A., & Prasad, P. (2002). The coming of age of interpretative organizational research. Organizational Research Methods, 5(1), 4–11. Priem, R. L. (1994). Executive judgement, organizational congruence, and firm performance. Organization Science, 5(3), 421–437. Pye, A., & Pettigrew, A. (2005). Studying board context, process and dynamics: Some challenges for the future (Special issue). British Journal of Management, 16, S27–S38. Reger, R. K. (1987). Competitive positioning in the Chicago banking market: Mapping the mind of the strategist. Unpublished doctoral dissertation, University of Illinois at UrbanaChampaign, USA. Reger, R. K. (1990). Managerial thought structures and competitive positioning. In: A. S. Huff (Ed.), Mapping strategic thought (pp. 71–88). Chichester: Wiley. Reger, R. K., Gustafson, L. T., Demarie, S. M., & Mullane, J. V. (1994). Reframing the organization: Why implementing total quality is easier said than done. Academy of Management Review, 19(3), 565–584.
Reger, R. K., & Huff, A. S. (1993). Strategic groups: A cognitive perspective. Strategic Management Journal, 14, 103–123. Reger, R. K., & Palmer, T. B. (1996). Managerial categorization of competitors: Using old maps to navigate new environments. Organization Science, 7(1), 22–39. Rentsch, J. R., & Klimoski, R. J. (2001). Why do ‘great minds’ think alike? Antecedents of team member schema agreement. Journal of Organizational Behavior, 22(2), 107–120. Rindova, V. P. (1999). What corporate boards have to do with strategy: A cognitive perspective. Journal of Management Studies, 36(7), 953–975. Roberts, J., McNulty, T., & Stiles, P. (2005). Beyond agency conceptions of the work of the non-executive director: Creating accountability in the boardroom (Special issue). British Journal of Management, 16, S5–S26. Salas, E., & Cannon-Bowers, J. A. (2001). Shared cognition (Special issue). Journal of Organizational Behavior, 22(2), 87–88. Scheer, J. W., & Catina, A. (Eds.). (1996). Empirical constructivism in Europe: The personal construct approach (pp. 13—17), Giessen: Psychosozial Verlag. Retrieved 6 December, 2005, from http://www.pcp-net.de/papers/introduc.htm Schneider, S. C., & Angelmar, R. (1993). Contrib Title:Cognition in organizational analysis: Who’s minding the store? Organization Studies, 14(3), 347–374. Senior, B. (1996). Team performance: Using repertory grid technique to gain a view from the inside. Journal of Managerial Psychology, 11(3), 26–32. Shrivastava, P. (1987). Rigor and practical usefulness of research in strategic management. Strategic Management Journal, 8, 77–92. Simpson, B., & Wilson, M. (1999). Shared cognition: Mapping commonality and individuality. In: J. A. Wagner III (Ed.), Advances in qualitative organizational research, (Vol. 2, pp. 73–96). Stamford, CT: JAI Press. Smircich, L., & Stubbart, C. (1985). Strategic management in an enacted world. Academy of Management Review, 10(4), 724–736. Slater, P. (1977). Dimensions of intra-personal space: The measurement of intra-personal space by grid technique. New York: Wiley (pp. 127–138). Smith, H. J. (2000). The reliability and validity of structural measures derived from repertory grids. Journal of Constructivist Psychology, 13, 221–230. Smith, K. G., & Hitt, M. A. (2005). Great minds in management: The process of theory development. New York: Oxford University Press. Snow, C. C., & Thomas, J. B. (1994). Field research methods in strategic management: Contributions on theory building and testing. Journal of Management Studies, 31(4), 457–480. Sparrow, P. R. (1994). The psychology of strategic management: Emerging themes of diversity and cognition. In: C. L. Cooper & I. T. Robertson (Eds), International review of industrial and organizational psychology (Vol. 9, pp. 147–181). Chichester: Wiley. Spencer, B., Peyrefitte, J., & Churchman, R. (2003). Consensus and divergence in perceptions of cognitive strategic groups: Evidence from the health care industry. Strategic Organization, 1(2), 203–230. Spender, J. C. (1989). Industry recipes: The nature and sources of managerial judgement. Oxford: Basil Blackwell. Sperlinger, D. (1976). Aspects of stability in the repertory grid. British Journal of Medical Psychology, 49, 341–347. Stewart, V., & Stewart, A. (1981). Business applications of repertory grid. England: McGrawHill.
Stiles, P. (2001). The impact of the board on strategy: An empirical examination. Journal of Management Studies, 38(5), 627–650. Stiles, P., & Taylor, B. (2001). Boards at work: How directors view their roles and responsibilities. New York: Oxford University Press. Stubbart, C. I. (1989). Managerial cognition: A missing link in strategic management research. Journal of Management Studies, 26(4), 325–347. Tenbrunsel, A. E., Galvin, T. L., Neale, M. A., & Bazerman, M. H. (1996). Cognitions in organizations. In: S. R. Clegg, C. Handy & W. R. Nord (Eds), Handbook of organization studies (pp. 313–337). London: Sage. Thomas, H., & Venkatraman, N. (1988). Research on strategic groups: Progress and prognosis. Journal of Management Studies, 25(6), 537–555. Viney, L. L. (1988). Which data-collection methods are appropriate for a constructivist psychology. International Journal of Personal Construct Psychology, 1, 191–203. Walsh, J. P. (1995). Managerial and organizational cognition: Notes from a trip down memory lane. Organization Science, 6(3), 280–321. Walton, E. J. (1986). Managers’ prototypes of financial firms. Journal of Management Studies, 23, 679–698. Weick, K. E. (2001). Making sense of the organization. Oxford: Blackwell. Whittington, R. (1996). Strategy as practice. Long Range Planning, 29(5), 731–735. Whittington, R. (2003). The work of strategizing and organizing: For a practice perspective. Strategic Organization, 1(1), 117–125. Wright, R. P. (2004a). Mapping cognitions to better understand attitudinal and behavioral responses in appraisal research. Journal of Organizational Behavior, 25, 339–374. Wright, R. P. (2004b). Top managers’ strategic cognitions of the strategy making process: Differences between high and low performing firms. Journal of General Management, 30(1), 61–78. Wright, R. P. (2004c). Element selection. In: J. Scheer (Ed.), The Internet Encyclopaedia of Personal Construct Psychology. Retrieved from C: Wright, R. P. (2005a). As strategists thinketh: How HR directors and board of directors construe strategizing. Paper presented at the meeting of the American Strategic Management Society, 25th international annual conference, Orlando, FL, USA. Wright, R. P. (2005b). Rigor and relevance using the repertory grid at the board level. Unpublished manuscript. Wright, R. P., Butler, J. E., & Priem, R. (2003). Asian cognitions of the strategic management process. Paper presented at the American Strategic Management Society mini-conference, Hong Kong. Wright, R. P., & Lam, S. K. K. (2002). Comparing apples with apples: The importance of element wording in grid applications. Journal of Constructivist Psychology, 15, 109–119. Zahra, S. A., & Pearce, J. A. (1989). Boards of directors and corporate financial performance: A review and integrative model. Journal of Management, 15(2), 291–334.
STUDYING THE DYNAMICS OF REPUTATION: A FRAMEWORK FOR RESEARCH ON THE REPUTATIONAL CONSEQUENCES OF CORPORATE ACTIONS

Matthew S. Kraatz and E. Geoffrey Love

ABSTRACT

Strategic management researchers have devoted increasing attention to the study of corporate reputation over the past two decades. Reputation has been conceptualized as a valuable intangible asset, and numerous studies have sought to identify its antecedents and foundations. This chapter recommends a dynamic approach toward reputation research. We argue that studies should examine the processes through which reputational assets are accumulated and depleted over time (i.e. that they should attend to reputational "flows" in addition to reputational "stocks"). We specifically suggest that research focus upon particular corporate actions, examining how (and if) corporate reputations change in their wake. We provide pragmatic and theoretical rationales for this approach toward reputation research. We construct a framework for conducting dynamic, action-focused studies of reputational change. We provide general guidelines for designing such studies, and also provide some specific (i.e. "nuts and bolts") advice about executing them. We provide one in-depth example of research conducted
within this framework. We also identify a number of other corporate actions that could be readily examined using the same methodological and theoretical approach. In recent years, the topic of organizational reputation has received growing attention both from organizational scholars and the public at large. Corporations, universities, hospitals, and other organizations are now routinely evaluated and ranked by magazines and other third party observers. These reputational assessments appear to be of substantial concern to organizations and their constituencies, and they have also been the focus of much academic research. Scholars studying corporate reputation have conceptualized it as an overall evaluation of a firm’s appeal, esteem or quality, relative to some peer group (see Deephouse & Carter, 2005; Fombrun, 1996). It is typically thought of as a global, omnibus assessment of the corporate ‘‘whole,’’ which audiences reach by integrating a wide range of available information about the firm (Schultz, Mouritsen, & Gabrielsen, 2001). While reputation scholars generally recognize that observers vary substantially in their evaluations of particular firms, research has shown that external assessments tend toward convergence and also exhibit substantial stability over time (Roberts & Dowling, 2002; Schultz et al., 2001). Thus, while reputation is granted by external audiences, it is often thought of as being possessed by the firm itself. Explaining corporate reputation is an important task for strategic management research because firms with superior reputations are believed to enjoy a number of significant advantages over their less admired peers. These include the ability to charge premium prices, greater attractiveness to potential employees, lower resource costs, increased access to alliance partners, and sustained high performance, among others (Deephouse, 2000; Fombrun & Van Riel, 2004; Roberts & Dowling, 2002; Weigelt & Camerer, 1988). Importantly, these benefits, like reputation itself, are thought to be relatively enduring (Roberts & Dowling, 2002). Reputation, in other words, may be an inimitable, intangible, and robust asset capable of providing the firm with a sustainable competitive advantage over time. Given these presumed benefits, scholars have devoted substantial attention to the task of identifying the antecedents and ‘‘roots’’ of corporate reputation. This chapter is similarly concerned with explaining reputation, but it advocates a new approach toward this end. Specifically, we emphasize the general need to study reputation dynamically, and the specific need to examine how it is affected by various corporate actions.1 We make a conceptual case for this approach toward studying reputation and provide some specific
guidelines and desiderata for the design of future studies examining the dynamics of reputation. We draw from our own ongoing research to offer one extended example of the general approach we recommend. We also present a number of other examples of corporate actions whose effects on reputation could be dynamically studied using the same general approach. In addition to our conceptual arguments and examples, we also provide some specific (i.e. ‘‘nuts and bolts’’) advice about conducting dynamic, action-focused reputation studies. Specifically, we offer advice about analytical and measurement issues, and also discuss appropriate data sources at some length.
THE NEED FOR RESEARCH ON THE DYNAMICS OF REPUTATION AND THE REPUTATIONAL CONSEQUENCES OF FIRM ACTIONS Our chapter’s main argument is that reputation research needs to examine how firm reputations change over time, and to focus specifically upon how reputation is affected by various corporate actions. This overarching argument is built up from two more basic assertions. First, we believe that examining the reputational consequences of corporate actions should be a problem of self-evident importance for strategic management research. If we accept that ‘‘reputation matters,’’ it naturally follows that we should be concerned with identifying corporate actions that ‘‘matter for reputation.’’ It is important to recognize, as many scholars have, that reputation is ‘‘sticky,’’ intangible, and somewhat persistent over time (Roberts & Dowling, 2002; Schultz et al., 2001). It is similarly important that we have theories which speculate on reputations’ ultimate foundations and attempt to define its essential nature. But, as a practical matter, we believe it is at least equally important that we develop knowledge about how (and if) ascribed reputations actually change in response to particular things that firms do. Research examining the reputational repercussions of various corporate actions can provide insight into the ways that firms can improve their reputations, even if such changes are only incremental. It can also help strategy scholars understand how reputations are damaged and depleted, thus enabling them to provide better guidance about the types of actions that firms should avoid. Reputation has been usefully categorized as a firm asset – a ‘‘stock’’ variable. However, reputation is accumulated and depleted through temporal flow sequences, just as other assets are. For practical reasons, strategic management research should examine these dynamic sequences and seek to identify the firm actions which affect reputational flows over time.
Our chapter is also premised on the argument that dynamic, action-focused studies of reputation can generate important theoretical insights, in addition to their clear pragmatic benefits. Specifically, we suggest that such studies may allow us to better evaluate the merits of the different theoretical perspectives on corporate reputation that currently exist in the literature. Most reputation theory converges on the idea that reputation is produced (and reproduced) as audiences interpret the signals that the firm itself sends, and as they interpret signals that other actors (e.g. the media) send about the firm (Fombrun & Shanley, 1990; Rindova, Williamson, Petkova, & Sever, 2005; Schultz et al., 2001; Weigelt & Camerer, 1988). However, as we will discuss just below, existing theories propose that audiences respond to different types of signals, and value these signals for somewhat different reasons. Dynamic studies, which focus on firm actions provide an opportunity to isolate specific signals and to observe their individual effects (and non-effects) on reputation. They can enable researchers to see whether the different signals (‘‘stimuli’’) which various theories portray as consequential actually exert the predicted effects (‘‘responses’’) on firm reputation. Further, to the extent that a particular firm action conveys complex and contradictory signals, such studies may have an additional advantage. Namely, they make it possible to see which signals are relatively more consequential to audiences as they assign reputation. Thus, dynamic, action-focused studies may also help to distinguish between different theoretical accounts of reputation, and help scholars better understand the nature of the evaluative logic which audiences use in ascribing it to firms. This important advantage will become clearer as we review some alternative perspectives on the foundations of reputation, and as we discuss the particular signals which are the foci of these varied perspectives.
THEORETICAL PERSPECTIVES ON THE FOUNDATIONS OF REPUTATION AND THE CONSEQUENCES OF CORPORATE ACTIONS In order for us to effectively develop our case for dynamic, action-focused studies of corporate reputation, it is necessary to first briefly review some prominent theoretical arguments in the diverse literature on corporate reputation. This review will serve four functions. First, it will help the reader better understand the nature (or natures) of corporate reputation itself. Second, it will allow us to see why various firm actions should affect reputation. Third, it will allow us to see what general types of actions might be consequential to reputation. Finally, it will reveal some significant tensions in the existing
theoretical literature and allow us to see how dynamic studies of the sort we recommend may help us alleviate and learn from these tensions. Most of the literature on reputation is concerned with understanding the same fundamental question: ‘‘What makes a firm appealing or ‘‘admirable’’ in the eyes of its constituencies?’’ However, varied perspectives within the literature provide somewhat divergent (and occasionally contradictory) answers to this basic question. Scholars are united in the belief that audiences admire firms that send valued signals. But, they diverge appreciably in their insights about the particular signals (i.e. organizational attributes, outcomes, and actions) that audiences value, and in their insights about why audiences actually value and thus respond to these signals.
Reputation as a Function of Organizational Character Traits

One central theme in the reputation literature – perhaps the most prominent one – is that audiences bestow admiration and reputation upon firms that appear to possess desirable character traits (Davies, 2002; Dowling, 2001; Fombrun, 1996; Fombrun & Van Riel, 2004). This line of reasoning suggests that audiences tend to anthropomorphize organizations and to attribute human traits to them based upon their observed history of actions and transactions (Dowling, 2001). It predicts that constituencies are likely to hold a firm in high esteem when its history of observable actions indicates that the firm (as a whole) possesses universally valued character traits such as trustworthiness, credibility, responsibility, fairness, and integrity (Fombrun, 1996). From this perspective, firms are rewarded for conveying a clear and distinctive identity (a coherent "corporate personality") and for acting in an identity-consistent fashion over time (Fombrun & Van Riel, 2004). Organizations that make clear, character-defining commitments and appear to honor those commitments over time are expected to garner admiration, esteem, and reciprocal commitment from their constituencies. In contrast, firms whose actions appear arbitrary, opportunistic, or otherwise incoherent and unreliable are expected to be less well-reputed by observers, according to this view. This general line of reasoning draws support from early institutional scholarship which similarly anthropomorphized organizations and also stressed the pragmatic importance of distinctiveness, integrity, character, and commitment (Selznick, 1957). It similarly parallels some recent work in stakeholder theory which likewise emphasizes how trustworthy behavior helps the firm secure support from its various constituencies (Jones, 1995).2 It also shares an affinity with much work in organizational ecology which
has stressed that related organizational traits such as reliability and accountability are foundational to organizational success and stakeholder support (Hannan & Freeman, 1984). Ecologists have specifically argued that identity-inconsistent actions signal the absence of these valued traits, and may thus greatly undermine external support for the organization (Baron, 2004; Baron, Hannan, & Burton, 2001). These macro-level arguments are also supported by recent micro-level research on the psychology of legitimacy. This work has shown that peoples’ beliefs about the legitimacy of particular social institutions are primarily the product of fairness judgments (Tyler, 1990, 1999). Importantly, it has further shown that peoples’ beliefs about procedural fairness are generally more determinative of their legitimacy assessments than are their evaluations of distributive (i.e. outcome) fairness. This oft-replicated finding supports the basic idea that people grant approval to firms based upon their beliefs about organizational character traits (in that organizational processes and traits can be seen as closely analogous to one another). This perspective has clear implications for understanding the reputational consequences of corporate actions. In general, it indicates that actions which send clear signals about organizational character will be consequential to reputational assessments. It similarly proposes that audiences will tend to interpret firm actions as if they were indicative of organizational character traits, whether or not this is actually the case. More specifically, this perspective predicts that reputation will tend to suffer as a result of seemingly opportunistic actions and as a result of identity-inconsistent ones. In contrast, however, reputation should be improved by firm actions which appear to reveal organizational integrity, commitment, credibility, or fairness, and by actions which appear to confirm the organization’s past identity claims.
Reputation as a Function of Symbolic Conformity and Cultural Prominence

A second, somewhat contrasting, approach toward explaining audience admiration is taken by scholars who have employed neo-institutionalism as a tool for understanding reputation (cf. Deephouse & Carter, 2005; Fombrun & Shanley, 1990; Rao, 1994; Rindova et al., 2005; Staw & Epstein, 2000). The core insight of neo-institutional theory is that organizations are situated within broader institutional environments (or "fields") (cf. DiMaggio & Powell, 1983; Scott, 2001). Assuming this perspective, scholars are compelled to view organizations against the backdrop imposed by these fields. Institutionalists argue that cultural, normative, and regulative processes
which operate within institutional fields have profound consequences for organizational actions and outcomes (DiMaggio & Powell, 1983; Scott, 2001). They further emphasize that organizations must often conform to the mandates of field-level forces if they are to secure the support and approval of external audiences (DiMaggio & Powell, 1983; Meyer & Rowan, 1977). One way in which organizations can conform to these demands is through the adoption of symbolically appropriate practices and structures (Meyer & Rowan, 1977). Much institutional research implies that external audiences grant approval to organizations that adopt structures and practices which symbolize their rationality and propriety, and otherwise convey their agreement with prevailing cultural beliefs, norms, and expectations (Ruef & Scott, 1998; Tolbert & Zucker, 1983; Westphal, Gulati, & Shortell, 1997). Studies have suggested that such approval may accrue to organizations even when they adopt practices that are wholly symbolic and largely "decoupled" from actual organizational functioning (Meyer & Rowan, 1977). The technical efficacy of such practices in itself is of secondary concern to evaluating audiences from this perspective. Applied as a partial explanation for corporate reputation, this perspective has at least two clear implications. First, it implies that firm reputation will depend, at least in part, upon the extent to which the firm employs symbolically appropriate structures or practices (Staw & Epstein, 2000). From this perspective, organizations are not evaluated primarily with respect to their own history and identity, as in the prior perspective. Rather, they are evaluated with respect to external, socially constructed standards and categories which exist at the field level and which are often enforced by field level organizations (e.g. professional groups and the state) (Deephouse, 2000; Greenwood, Suddaby, & Hinings, 2002; Ruef & Scott, 1998; Westphal et al., 1997). Likewise, firms are less likely to be evaluated as "whole" actors, and more likely to be seen as collections of culturally appropriate (or inappropriate) structural parts. Thus, organizations that conform to institutional mandates by incorporating symbolically appropriate practices and structures are likely to be more highly admired. In contrast, firms that fail to display such symbols should be less highly esteemed. A second implication of this perspective is that reputation depends not just on symbolic conformity to field level demands, but also upon an organization's prominence within its particular field (Fombrun & Van Riel, 2004; Rindova et al., 2005). Neo-institutionalism holds that legitimacy results partially from taken for grantedness, which is a sort of preconscious approval. Organizational practices and structures become taken for granted as they become increasingly prevalent within a field (Tolbert & Zucker, 1983, 1997). In like manner, an
organization may acquire acceptance and approval as a partial function of the mere exposure and attention that it receives. From this angle, being well thought of may be of secondary importance to merely being thought of in the first place. The task for organizations seeking reputation is, at least in part, one of merely getting on the ‘‘radar screen’’ and staking out a prominent place in the collective consciousness (Rindova et al., 2005). This perspective also has clear implications for understanding the reputational consequences of corporate actions, and it similarly draws our attention to particular types of actions. Most obviously, it implies that firms should gain approval as they adopt symbolically appropriate structures and practices which conform to field level, cultural and normative expectations. It similarly suggests that reputations may suffer when firms engage in actions that are institutionally proscribed and thus illegitimate. An additional implication is that firm actions which increase exposure and generate attention and publicity should have positive implications for reputation, all else equal.
Reputation as a Function of Technical Efficacy

A third distinct perspective in the literature suggests that audience admiration is primarily a function of technical efficacy, rather than organizational character or symbolic structural conformity (Fryxell & Wang, 1994; Washington & Zajac, 2005). This perspective indicates that audiences are more attentive to the signals conveyed by organizational outputs than to signals relating to the entities which produce them. According to this line of reasoning, organizations are likely to be highly reputed when they simply produce superior products and services, deliver superior financial results, or otherwise better meet the immediate functional needs of evaluating audiences. This perspective is supported by evidence suggesting that product quality strongly affects corporate reputation (Fombrun & Van Riel, 2004). It is also supported by studies which have found that financial performance is a very strong predictor of firms' standing in the Fortune "Most Admired Companies" survey, which has been the outcome measure of choice in many prior studies of corporate reputation (Brown & Perry, 1994; Fombrun & Shanley, 1990). Indeed, some scholars have argued this effect is so strong that corporate reputation (at least as measured by the Fortune survey) may effectively reduce to financial performance (Fryxell & Wang, 1994). This perspective also has important implications for understanding the reputational consequences of corporate actions, though these are perhaps less direct than with the prior two. To begin, it predicts that corporate actions which improve technical efficacy, or are at least credibly presented as such,
should increase reputational standing. Likewise, it predicts that actions should have negative reputational consequences when they impair financial performance, reduce product or service quality, or otherwise disrupt the flow of valued outputs to evaluating audiences. Importantly, this perspective also implies that firm performance announcements and changes in firm performance might also be productively viewed as corporate ‘‘actions’’ themselves. Specifically, we might expect reputational stocks to be augmented or depleted as a result of temporal trends in the quality of organizational outputs.
Reputation as a Function of Relational Status
A final perspective in the literature argues that audience admiration stems in large part from the quality of a firm's external associations. This argument derives from research and theory on status and inter-organizational networks (Podolny, 1993, 1994; Stuart, Hoang, & Hybels, 1999). Scholars have noted that audiences often have difficulty judging the ''true'' quality of a firm, especially in industries where product quality is hard to precisely gauge and where organizations have uncertain technologies (i.e. poorly understood procedures for transforming inputs into outputs) (Shrum & Wuthnow, 1988). As a result, they often tend to rely on a firm's relationships with other organizations to draw inferences about the quality of the firm itself. In particular, reputation is likely to be higher to the extent that a firm is affiliated with prestigious partners or clients (Podolny, 1993, 1994; Shrum & Wuthnow, 1988). The willingness of high status others to associate with the focal firm is seen as a critical indicator of its underlying quality (Stuart et al., 1999). A clear implication of this perspective is that reputation should be responsive to corporate actions which alter the firm's network of relationships. Specifically, we might expect reputation to be positively affected when a firm establishes public relationships with prestigious partners or clients. Likewise, we would expect it to be negatively affected when the firm dissolves relationships with such partners and clients, or when it establishes relationships with other firms that are of lower status than the focal organization.
Table 1 summarizes these four perspectives and their implications. As can be seen in the table, each perspective offers distinct insights about the basis of audience admiration, and about the nature of the evaluative logic that audiences use in ascribing reputation. Importantly, each of the four perspectives also predicts that reputations should change in the wake of particular corporate actions. However, each perspective draws attention to different types of actions and proposes that they will affect reputation for somewhat different reasons.
Table 1. Summary of Different Perspectives on Reputation.

Reputation as a Function of Organizational Character Traits
Basis of audience admiration: Clear and distinctive identity; demonstrated character; perceived trustworthiness and credibility
Audiences' evaluative logic: Firm evaluated as a coherent actor or ''whole;'' character traits attributed based on past actions; firm anthropomorphized by audience
Attributes of reputation enhancing actions: Reveal and affirm character and identity; signal trustworthiness, credibility, commitment
Attributes of reputation damaging actions: Suggest opportunism; inconsistent with claimed identity and past commitments
Theoretical roots and relationships: ''Old'' institutionalism, stakeholder theory, psychology of legitimacy

Reputation as a Function of Symbolic Conformity and Cultural Prominence
Basis of audience admiration: Conformity to cultural norms and categories; prominence within organizational field
Audiences' evaluative logic: Firm evaluated as collection of structures and practices (''parts''); cultural appropriateness and prominence central; loose coupling
Attributes of reputation enhancing actions: Symbolize conformity with normative expectations and cultural beliefs; increase ''coverage'' of firm
Attributes of reputation damaging actions: Deviate from cultural prescriptions, violate norms, decrease attention and coverage
Theoretical roots and relationships: Neo-institutionalism

Reputation as a Function of Technical Efficacy
Basis of audience admiration: Ability to deliver valued outputs (products, services, technologies, financial performance)
Audiences' evaluative logic: Firm evaluated based on outputs produced
Attributes of reputation enhancing actions: Convey technical efficacy, efficiency, and/or excellence along multiple dimensions
Attributes of reputation damaging actions: Convey lack of efficacy or excellence along same dimensions
Theoretical roots and relationships: Multiple theoretical connections, no dominant root perspective

Reputation as a Function of Relational Status
Basis of audience admiration: Ties to high status actors; centrality
Audiences' evaluative logic: Firm evaluated based on its relations (or nonrelations) to other firms
Attributes of reputation enhancing actions: Result in closer association with high status others
Attributes of reputation damaging actions: Weaken association with high status others; create association with lower status others
Theoretical roots and relationships: Organizational status and stratification
SOME GUIDELINES FOR CONDUCTING DYNAMIC, ACTION-FOCUSED STUDIES OF CORPORATE REPUTATION
Having made a conceptual case for dynamic, action-focused studies of reputation and having reviewed the relevant theoretical literature, it is now possible to develop some more specific guidelines for the design and execution of such studies. We develop these guidelines as answers to a series of rhetorical questions, each of which is likely to arise in the process of conceptualizing, designing, and conducting such studies.
What Types of Actions Should Be Studied?
Actions that are Interesting and Seemingly Consequential for Firms Themselves
It is possible to make theoretically informed – or even theoretically determined – choices about which actions to study. Indeed, the remainder of this section provides insights about how different reputation theories can be used to identify appropriate research topics. However, it is also possible – and arguably more productive – to identify research-worthy corporate actions merely by engaging the empirical world with open eyes. Reading the popular business press, for instance, one finds no shortage of corporate actions which appear to be ''reputation relevant.'' Corporations engage in all sorts of actions that may affect their individual and collective reputations, either positively or negatively. The media, other industry observers, and corporate managers themselves endlessly speculate about how different strategic and tactical actions will (or will not) affect external audiences' approval for a firm. We believe that scholars can identify excellent research topics merely by monitoring (and questioning) this ongoing public discourse. A distinct advantage of the research approach we propose is its ability to subject popular understandings and speculative claims to critical, scientific scrutiny. Dynamic, action-focused studies of corporate reputation may allow scholars to discern what actually ''works'' – and what does not work – as far as reputation-building efforts are concerned. Further, by focusing upon actions that are of manifest concern to industry participants, researchers can increase the chances of producing knowledge that has practical implications.
Actions that Appear to Send Signals about Organizational Character (or the Lack Thereof)
A variety of corporate actions fall into this broad category. Most obviously, studies could examine how reputation is affected by instances of corporate
malfeasance, including corporate fraud, environmental violations, or earnings restatements. To the extent that reputation is a function of attributed character, such actions should have strongly negative reputational consequences. Studies could also examine how reputation is affected by corporate actions which are undertaken to increase performance and efficiency, but which also carry a taint of greed or opportunism. It would be interesting to know, for example, whether reputations have been negatively affected by firms’ recent efforts to escape pension obligations, by their increasing use of outsourcing, by their efforts to win wage concessions from unions, or by other like actions. Research exploring the ‘‘reputation as character’’ perspective should also try to take organizations’ particular histories and identities into explicit account. Scholars might examine how reputations are affected by major strategic changes which represent stark departures from the firm’s own past, and thus signal deep changes in organizational commitments and corporate identity. If reputation is a partial function of historical continuity, such changes may negatively affect reputation – even if they are technically efficacious. Studies in this vein could also productively examine the reputational effects of various governance changes. Integrity, fairness, responsibility, and identity are the very stuff of corporate governance. To the extent that these same issues are also central in the ascription of reputation, reputational evaluations may be highly responsive to certain governance changes and to executive compensation policies, as well. The directions of these particular effects, if observed, might be especially revealing. Specifically, they might tell us much about what actually constitutes ‘‘good governance’’ in the eyes of evaluating audiences. Actions that Signal Organizational Conformity (or Nonconformity) with Field Level Demands Research could also productively study the reputational consequences of organizational actions that are undertaken (or appear to be undertaken) in direct response to field-level institutional processes. Many previous studies have shown how various managerial and organizational innovations have ‘‘diffused’’ among large American corporations (e.g. Baron, Dobbin, & Jennings, 1986; Edelman, 1992; Tolbert & Zucker, 1983). Much of this research has explained this observed diffusion as the reciprocal product of overarching institutional forces, and firms’ efforts to win approval from these forces. However, relatively few studies have examined whether the adoption of symbolically appropriate organizational ‘‘parts’’ actually translates into greater approval for corporate wholes that adopt them. Future studies need to examine whether (and when) firms’ self-conscious efforts to win external
approval actually result in reputational gains. Staw and Epstein (2000) found some evidence that such gains may accrue. Specifically, they showed that corporations achieved reputational benefits as a result of their use of popular management practices such as Total Quality Management, Empowerment, and Teams. They found that this effect occurred despite these innovations’ questionable performance benefits. More studies which are similarly conceived are needed, however. The same basic Action-Reputation framework could be applied in order to evaluate the reputational effects of any number of other (apparently) legitimate organizational structures and practices. Such studies might importantly inform neo-institutionalism as well as reputation theory. Actions that Increase Exposure and Attention for the Firm Accepting the idea that reputation flows partially from mere exposure and attention, studies could also examine the effects of actions that generate significant media attention for the firm. Advertising campaigns, mergers, and major product launches come readily to mind as examples. Dynamic studies of reputation could also examine whether it is affected simply by changes in advertising expenditures over time. Studies could also examine how media attention moderates the main effects of corporate actions themselves. Actions that Directly Affect or Credibly Claim to Affect Performance Performance concerns play a central role in most significant corporate actions. Firms do very little that has nothing to do with performance, and most corporate actions are at least partially justified with respect to economic needs and benefits. Further, it is known that technical and financial performance play a critical role in reputational assessments. Given this, studies should examine how reputation is affected by various corporate actions which aim to improve performance (or which manifestly harm it). Research should also examine how the actual, realized performance consequences of particular actions (e.g. market reactions and accounting performance) condition those actions’ ultimate effects on reputation. Actions that are Themselves Performance Signals Our dynamic approach toward reputation also suggests that we can view performance announcements and discontinuous changes in performance as types of corporate actions themselves. Research could productively examine whether reputations change in response to earnings announcements, for example. Earnings announcements which exceed or fail to meet expectations may be of particular interest. Earnings announcements might also be examined
with respect to a firms’ past history of announcements based on the notion that history, precedent, and credibility are consequential for reputation. Studies could also examine how other types of performance events (i.e. non-financial events) affect overall corporate reputations. For example, it would be interesting to know whether overall firm reputations change as a result of year to year changes in product quality ratings that are issued by news outlets (e.g. Consumer Reports) and product research firms (e.g. J.D. Power and Associates). Actions that Alter the Relational Status of the Firm To the extent that reputation is a partial function of relational status, studies should examine whether and how reputation is affected by the formation and dissolution of inter-organizational relationships. Obvious examples here are joint ventures, strategic alliances, and participation in inter-firm consortia. Studies of mergers’ effects on reputation might also be usefully informed by arguments about relational status – as mergers also represent the ‘‘joining’’ of organizations. Studies examining actions that alter relationships should give special consideration to the status of partner firms. Developing ties to higher status firms may have a different effect than establishing ties to lower status ones. The centrality of firms within an overall network of inter-firm ties – and changes in this centrality – may be similarly consequential for reputation. Actions that Appear to Send Complex and Conflicting Signals Reading between the lines of the above discussion, one can begin to see that many important firm actions are not merely of one type. Many, and perhaps most, significant corporate actions fall into more than one category and may thus send multiple (and possibly conflicting) signals. Most actions, for instance, are justified with respect to performance concerns, and they may also exert observable effects on actual performance outcomes. But, the same actions may also signal something important about firm character, convey symbolic conformity (or non-conformity), generate coverage for the firm, and/or alter the relational status of the firm. We believe that actions which send conflicting signals – actions which should improve reputation from one perspective but damage it according to another perspective – may be particularly useful and revealing to study. By studying such actions’ actual reputational consequences, we may gain insight into the relative power and usefulness of divergent accounts of corporate reputation. We may also gain important insights about complementarities between alternative theories of reputation through such studies.
What Should Scholars Aspire to Learn from Dynamic, Action-Focused Studies of Reputation?
As our introduction indicated, there are at least two distinct answers to this question. Pragmatically speaking, researchers can gain insight into the reputational consequences of particular corporate actions and, perhaps, into the consequences of general categories of actions, as well. From a pragmatic perspective, the different theories of reputation we have discussed can be seen as conceptual tools for understanding specific relationships which exist in the empirical world. These theoretical tools help us categorize corporate actions, and suggest possible reasons why they may be consequential to reputation. They also provide partial insights into the nature of reputation itself. But, the empirical relationships between particular actions and ascribed reputation are themselves of primary importance. From a pragmatic perspective, theories are appropriately viewed as means, not as ends. Much organizational and strategy research has taken this sort of ''problem-centered'' approach, either implicitly or explicitly, and we believe it is a legitimate and useful one.
Theoretically speaking, dynamic, action-focused studies can also provide evidence about the individual merits of particular perspectives on reputation. They provide an opportunity to observe whether audiences actually respond to the particular signals emphasized by these different theories. Important support for each of the four individual perspectives on reputation which we reviewed could be inferred if studies were to show that reputation is actually affected by the types of actions which these perspectives respectively draw attention to. Further, dynamic studies can usefully extend each of these individual perspectives by showing how reputational flows (rather than stocks) are affected by particular firm actions/signals. That is, they can provide insight into the process through which reputation is built up and depleted over time. More importantly, perhaps, dynamic studies may also provide critical insights into potential complementarities between these perspectives, and into their relative merits. We have noted that many corporate actions send multiple signals and thus fall into more than one categorical type. Studies examining the reputational consequences of such actions provide a chance to explore complementarities between theoretical perspectives. Further, to the extent that particular actions send conflicting signals, examining their reputational consequences can help us better establish the relative power of competing theories. For instance, if an action is performance-enhancing but also sends negative signals about firm character, and if reputation is found
to be positively affected by that action, stronger support for the ‘‘reputation as technical efficacy’’ perspective could be inferred, relative to the ‘‘reputation as character traits’’ view.
How Should Dynamic, Action-Focused Studies of Reputation Be Theoretically Framed? As we have noted, one legitimate option is to frame research in a pragmatic, problem-centered fashion. Scholars can attempt to maintain a position of theoretical agnosticism, and to make actual empirical relationships the central focus of the research. While appealing in some ways, this sort of ‘‘radical empiricism’’ has practical limits. To begin, many corporate actions are likely to capture scholarly attention precisely because of their apparent correspondence with a particular theoretical perspective (e.g. because they appear to imply a deficit of character, appear to be a response to field-level institutional dynamics, or appear to be performance-enhancing, etc.). Further, it is natural that researchers will bring certain theoretical predispositions into their empirical studies of actions and their reputational consequences. Reputation is a very general concept, and one which occupies a prominent place in very different theoretical schools (e.g. game theory and neo-institutionalism). It is thus quite likely that particular studies will be primarily grounded within one or the other of these schools. Scholars may legitimately embark on research in order to further a particular body of knowledge about reputation, rather than to further general knowledge about reputation, per se. As acknowledged, we believe it is critically important for empirical studies to be meaningfully informed by multiple theoretical perspectives even if they are primarily grounded in a particular one. A corporate action may attract research attention because it appears to send a particular signal which a researcher’s preferred theory of reputation holds to be consequential. However, this may not be the only signal which the action sends. Further, the action’s observed consequences on reputation may not be wholly – or even correctly – understood through the particular theoretical lens which primarily focuses the study in question. We strongly believe that scholars are likely to get a more accurate understanding of the reputational consequences of particular corporate actions if they take a multi-theoretic approach toward framing their research. Fig. 1 provides a schematic view that encapsulates this argument. We further believe that this approach, taken over time, can result in the construction of better individual theories of reputation. It is
[Fig. 1. Understanding the Reputational Consequences of Corporate Actions. The figure depicts four categories of signals feeding into the reputational consequences of an action: projected traits and consistency with identity and prior actions; implications for symbolic conformity and effects on prominence within the field; impact on technical efficacy; and impact on relational status.]
perhaps unlikely that such an approach will ever result in the emergence of a grand, integrated theory of corporate reputation. And, we do not necessarily believe that developing such a theory is a realistic or desirable goal. However, the research approach we recommend can facilitate continuing conversation between scholars in different theoretical camps by focusing their shared attention on empirical phenomena of mutual interest.
What Needs to Be Taken into Account Empirically?
If studies are to be meaningfully informed by multiple theories, they must empirically account for the factors which these respective theories hold to be causally important. Specifically, it is necessary to include measures which capture the range of different signals which may emanate from or accompany a particular action. Building such indicators into the research design will typically require considerable knowledge about the particular empirical context under study. It will also require substantial knowledge of different theoretical perspectives on reputation, and an understanding of the methods and measures typically used within these different perspectives. This section provides some specific guidance on these empirical design questions.
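As a purely illustrative sketch of what such a multi-signal research design might look like in practice, the fragment below defines a hypothetical firm-year record whose fields correspond to the considerations discussed in the remainder of this section. All field names, scales, and the example values are our own invented placeholders under stated assumptions, not measures drawn from any particular study.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FirmYearSignals:
    """Hypothetical firm-year record for a dynamic, action-focused reputation study.

    Each field corresponds to one of the empirical considerations discussed
    below; the names and scales are illustrative assumptions only.
    """
    firm_id: str
    year: int
    action_taken: bool                 # did the focal action occur in this firm-year?
    identity_inconsistency: float      # departure from the firm's past actions/commitments (0-1)
    field_prevalence: float            # share of industry peers that have already adopted the action
    adoption_lag: Optional[int]        # years since the action first began diffusing in the field
    media_volume: int                  # number of articles linking the firm to the action
    media_tone: float                  # mean coded tone of that coverage (-1 to +1)
    announcement_car: Optional[float]  # cumulative abnormal return around the announcement
    partner_status_change: float       # change in the mean status of exchange partners
    audience: str                      # audience whose reputational ratings are modeled

# One invented example record:
example = FirmYearSignals(
    firm_id="F001", year=1992, action_taken=True,
    identity_inconsistency=0.4, field_prevalence=0.35, adoption_lag=6,
    media_volume=27, media_tone=-0.2, announcement_car=0.013,
    partner_status_change=0.0, audience="security_analysts",
)
```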
Consider How the Action Relates to the Firm's Past Actions, Espoused Commitments, and/or Claimed Identity
If reputational evaluations depend upon audiences' beliefs about organizational character traits, it is necessary for research to situate actions within the context of a particular firm's history. From this perspective, the reputational impact of particular actions should be partially determined by their consistency (or lack thereof) with a firm's prior actions, statements, and commitments (Fombrun, 1996). It is thus necessary that we know something about the particular firm's past. Research should attend to the claimed and revealed identity of the firm in question. It should examine how a particular action fits (or fails to fit) with other actions that the firm has undertaken in the past. Actions that are reputation enhancing for one firm may be deeply problematic for another, depending upon each firm's particular history and espoused commitments. We believe it is important that research take firm histories into account even if the particular action under study is being considered for another reason. For instance, while a particular action may capture research attention because of its apparent effects on performance, that action's consistency or inconsistency with the firm's past actions may moderate (or even primarily explain) its ultimate effect on reputation.
Consider How the Action Relates to Field Level Institutional Demands and Broader Institutional Dynamics
If firm reputations depend critically upon field level institutional processes, it is important that research attend to these processes. In other words, studies should view particular actions as they are situated against the backdrop of a broader organizational field, in addition to seeing them as situated in a firm's unique history. Situating actions within a field entails at least four things. First, it requires understanding the boundaries of a field. Who, in other words, are the focal organization's peers? Second, it entails considering the prevalence of the action in question within the organizational field. Neo-institutional research indicates that prevalence is a critically important indicator of legitimacy, and that actions which are more prevalent among peer firms are more likely to confer approval on a focal firm (Tolbert & Zucker, 1997). Third, it entails explicitly considering time as a factor. From an institutional perspective, the timing of an organization's action has everything to do with that action's ultimate effects on social approval. Specifically, actions which are adopted later in the diffusion cycle can be expected to produce greater approval (Tolbert & Zucker, 1983; Westphal et al., 1997). Finally, it entails attending to field level actors (e.g. professional groups, the media, the state, and industry groups) (Deephouse, 2000; Fombrun & Shanley, 1990; Mezias,
1990; Ruef & Scott, 1998; Westphal et al., 1997). These actors grant approval to some corporate actions and explicitly proscribe or frown upon other ones. They propagate cultural logics that influence organizations and enforce field level norms and regulations. Further, they often pass public judgments on organizations’ actions, thereby affecting how these actions are evaluated by other audiences. In other words, they act as ‘‘information intermediaries’’ who play a key role in the construction of reputation (Deephouse, 2000; Pollock & Rindova, 2003). Consider How the Action Affects the Organization’s Prominence and Exposure within the Field The neo-institutional perspective also implies that a firm’s prominence is an important determinant of reputation (Rindova et al., 2005). Accepting this, it is important to consider how particular actions may affect the attention and exposure a firm receives. The amount of media coverage an action generates for the firm may exert an independent effect on subsequent reputational change. Taking this idea seriously, research should consider the volume of media coverage attending to particular corporate actions, in addition to examining the content (i.e. positive vs. negative) of that coverage (see Deephouse, 2000; Fombrun & Shanley, 1990; Staw & Epstein, 2000). Consider How the Action Affects Technical Performance Given that performance is a critical factor in most corporate decisions and given that it has been found to be consistently and strongly linked to reputational assessments, it is particularly important to consider an action’s performance implications if we are to draw correct causal conclusions. One way to effectively do this is to examine the immediate performance implications of particular corporate actions. Specifically, studies can examine the excess returns associated with corporate actions and decisions (Brown & Warner, 1985; Love & Kraatz, 2006). They can also consider how external evaluators’ (e.g. stock analysts and newspaper reporters) immediate reactions to particular actions may condition their ultimate effects on reputation. Studies could also examine the explanations which firms themselves provide for their actions and assess whether the performance justifications they offer are credible and persuasive. Another approach is to assess the performance results that have previously been obtained by firms engaging in similar actions (see Haunschild & Miner, 1997; Kraatz, 1998). It is possible that audiences learn from these shared past experiences and bring them to bear in passing judgment on a focal firm’s actions. Finally, it is possible to observe an action’s long-term effects on operational and accounting performance, and to consider these as partial
determinants of the action’s reputational effects. However, this is very difficult as much time may pass before these effects manifest themselves, and as many other events which may also affect reputation are likely to occur in the intervening period. It is perhaps worth noting that the (initially great) conceptual distance between the ‘‘reputation as technical efficacy’’ and the neo-institutional perspective on reputation appears to shrink considerably as the difficulties of gauging an action’s performance consequences are considered in depth. Specifically, the experiences of other organizations, organizational justifications for their actions, and the evaluations of field level actors and information intermediaries emerge as potentially critical factors according to both perspectives. This would seem to underscore the value of the multi-theoretic approach we have advocated. Consider How the Action Affects the Firm’s Relational Status If reputation partially derives from a firm’s relationships with other firms, it is necessary for research to consider how an action affects those relationships. It should be particularly concerned with the relative status of partner firms with whom relationships are established or severed (Podolny, 1993, 1994). Studies should obviously consider these relational status concerns when examining corporate actions that are primarily relational in nature (e.g. forming alliances and joint ventures). However, they should also consider the possible relational implications of actions undertaken for different purposes. Many firm actions may have unintended consequences for relational status. Consider, as one particular example, Levi-Strauss’ recent decision to sell its jeans through Wal-Mart. This action was undertaken with the clear objective of increasing sales. However, it established an important linkage between the firms which may have substantial, and not necessarily positive, implications for Levi’s reputation. In particular, its reputation for product quality and social responsibility may both suffer as a result of this new association with Wal-Mart. Consider the Audience(s) While we have focused mostly upon the need to empirically situate corporate actions, it is also necessary for studies to properly situate the audiences who actually ascribe reputation. Different audiences may have different views of what is admirable in firms – which character traits are most appealing, what constitutes symbolic conformity, what aspects of technical performance are valued, and so on (Fombrun, 1996; Friedland & Alford, 1991; Meyer, 2002). For these reasons, studies should account for the characteristics of the particular audiences whose reputational judgments they
actually examine. Moreover, research should attempt to empirically separate different audiences and to compare their respective reactions to particular corporate actions, to the extent that this is possible. Studies examining potential differences between audiences may be particularly valuable from a theory-building perspective. Specifically, such studies can help establish whether reputations actually rest upon universalistic criteria. Or, they might instead support the idea that ''where you sit determines what you see'' as you evaluate a firm and its actions.
Any one of the above-listed empirical ''considerations'' (i.e. character, conformity, performance, exposure, relationships, audience) may be the primary one at work in determining a particular action's ultimate effects on reputation. However, as we have emphasized, it is important for research to consider each of these factors empirically, to the extent that this is possible within a given research context. Studies which fail to consider all of these factors run the risk of misinterpreting observed Action-Reputation relationships, and may thus draw inappropriate theoretical conclusions. Even if a particular study is primarily concerned with one of these empirical considerations (e.g. performance or relationships), it should consider other factors as important control variables. Further, studies should also consider other factors as potential moderators of the relationship which is of primary interest. A particular action may affect reputation primarily because it signals symbolic conformity, for instance. But, this effect may be importantly conditioned by the observable performance effects associated with the action and/or by the media coverage surrounding it. In some cases, properly accounting for moderating effects may largely suppress the initially observed main effect, which would indicate that the moderator may itself be of primary causal importance.
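One of the considerations just listed, an action's immediate performance signal gauged through excess returns (Brown & Warner, 1985), lends itself to a simple operationalization. The sketch below is only a minimal market-model illustration on synthetic data; the window lengths, variable names, and estimation choices are our own assumptions and not the procedure used in any of the studies cited above.

```python
import numpy as np

def cumulative_abnormal_return(firm_returns, market_returns,
                               est_window=slice(0, 200),
                               event_window=slice(200, 206)):
    """Market-model cumulative abnormal return (CAR) around an action announcement.

    A minimal event-study sketch: alpha and beta are estimated by OLS over an
    estimation window, and the CAR is the sum of prediction errors over the
    event window. Window positions and lengths here are arbitrary illustrations.
    """
    r_est, m_est = firm_returns[est_window], market_returns[est_window]
    beta, alpha = np.polyfit(m_est, r_est, 1)            # slope, then intercept
    expected = alpha + beta * market_returns[event_window]
    abnormal = firm_returns[event_window] - expected
    return abnormal.sum()

# Illustrative use with purely synthetic daily returns.
rng = np.random.default_rng(0)
market = rng.normal(0.0005, 0.01, 210)
firm = 0.0002 + 1.1 * market + rng.normal(0.0, 0.015, 210)
print(round(cumulative_abnormal_return(firm, market), 4))
```

In an actual study, a measure of this kind would enter the reputation model as a main effect or as a moderator of the action indicator, alongside the other controls discussed above.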
How Should Reputation be Measured?
While we have thus far focused primarily upon the ''action'' side of the Action-Reputation relationship, it is also necessary to elaborate upon the ''reputation'' element of this equation. We offer the following general guidelines for measuring reputation in dynamic, action-focused studies.
Reputation Should be Measured as a Global, Omnibus Evaluation of the Firm as a Whole
At a basic level, this advice merely reiterates a commonly used definition of corporate reputation itself. But, it is nonetheless important to reemphasize the need to employ reputation measures that cohere with this definition. The
sole point of convergence between the alternative theories of reputation we have reviewed is their shared concern with overall impressions of firm quality. Though these perspectives emphasize that reputation flows from different sources, they are united in their desire to explain what makes a firm appealing or admirable in a general sense. In order to conjointly apply these perspectives and conduct dynamic studies of the type we advocate, an empirical focus on overall evaluations of firm quality must thus be maintained. While more narrow definitions and measures of reputation (e.g. reputation for product quality, for financial performance, for treatment of employees, for social responsibility, etc.) clearly provide opportunities for research, they fail to provide the global point of convergence needed here. They also fail to cohere with the basic definition of corporate reputation which we – and many others – have employed. Reputation Should be Measured Using Perceptual Evaluations The need to employ perceptual measures is also implicit in the very definition of the concept. Rankings based upon objective data which are collected from or about organizations are inappropriate for the type of reputational research we advocate here. Formula-based ranking schemes (e.g. U.S. News and World Report’s rankings of colleges and universities) are certainly consequential and worthy of research attention, but they are not appropriately studied within the theoretical and empirical framework we have constructed. Such instruments do not measure audience admiration or esteem in a general sense – though they are likely one important source of such esteem. Studies Should Use Measures of Corporate Reputation which are Taken at Multiple Intervals Over Extended Periods of Time Studying reputation in a dynamic sense obviously requires time series data. Without time-varying measures of reputation, it is not possible to assess the reputational consequences of particular corporate actions. It is also not possible to adequately control for the antecedents of corporate actions. Having multiple years of time series data is particularly critical if studies are to meaningfully integrate institutional arguments about field-level dynamics. Time is a paramount concern from an institutional perspective and a key component of most neo-institutional research. Studies Should Use Established Measures of Reputation which Capture Public Attention and which Firms themselves Care About We believe this is good advice for two reasons. First, firms’ public or ‘‘broadcast’’ reputations are the ones which matter most from a pragmatic
perspective. It is important to understand the determinants of such reputations even if they are not the ones that we, as reputation scholars, might design ourselves. In a pragmatic sense, ''reputation is as reputation does.'' Publicly ascribed and scrutinized reputations are ''what reputation does.'' Second, public reputations are the ones most likely to inform managerial decision-making. If managers care about reputation, in general, they should care about their public, ascribed reputations, in particular. It is plausible to believe that firms act, in part, with reference to these published rankings.
Studies Should Use Measures of Reputation that are Ascribed by Audiences Whose Opinions Matter to the Firm and Should Also Evaluate Multiple Audiences, Where Possible
The preceding arguments also imply that studies should use measures of reputation which are ascribed by particular audiences that firms care about. They further imply that research should focus on audiences that transact with the firm and who actually know something about it. Such evaluations ''matter'' for firms. It is further important that reputation measures evaluate reputation as it is ascribed by different audiences (e.g. customers, peers, employees, investors), when possible. We have noted above that separate audiences may respond differently to the same actions. We have also stressed that these observed differences (or similarities) may be highly revealing from a theoretical standpoint.
Two Specific Measures of Reputation Worth Considering
While the criteria above are broadly applicable, two specific reputational rankings seem most worthy of further discussion in light of these desiderata: Fortune's long-running ''Most Admired Companies'' survey and the Fombrun/Harris ''Reputation Quotient'' survey. Of these, the Fortune survey is the most prominent and oft-used in reputational research. The measure has many appealing features, relating to the criteria above. First, it is an omnibus, global evaluation of the appeal, or ''admiration,'' of the firm as a whole. Respondents evaluate firms on eight disparate dimensions that Fortune combines into a single published firm-level score.3 Second, the scores directly reflect audiences' overall perceptual evaluations, as respondents are asked to rate firms on simple Likert scales (1–10). In addition, firms are presented only in comparison with their industry peers. This lends a distinctly relational element to the evaluation, which is consistent with scholarly emphasis on reputation's inherently relational nature. Third, annual data is available for over 20 years (since 1982), on firms in a large and expanding number of industries. This enables the type of long-term longitudinal studies that we have advocated. Fourth, the survey is quite salient and apparently consequential to firms themselves. Fortune has a very wide and diverse
readership and is monitored by participants in many different industries and fields. The simple fact that over ten thousand busy executives and securities analysts take the time to respond to the survey each year provides some compelling evidence of the survey's significance and its consequential nature. Fifth, the groups that respond to the survey (peer firm executives and security analysts) are audiences whose opinions clearly matter to firms themselves. These audiences are also particularly knowledgeable about the firms they rate. Finally, while Fortune only publishes an overall ranking (combining analysts' and peer firm executives' ratings), it is also possible to obtain the data on these groups' individual rankings of firms.4 This is particularly important for reasons described just above.
Despite these highly desirable features, the Fortune survey has been the subject of appreciable academic criticism, and it is important for scholars using the survey to acknowledge and address these criticisms in their work (see Szwajkowski & Figlewicz, 1997 for a review). Researchers should also strive to use the survey in ways that minimize the liabilities inherent in its design. One of the main criticisms of the Fortune rankings is that they are unrepresentative of the firm's ''whole'' reputation, given that only two distinct groups (analysts and peer firm executives) are surveyed, while other important audiences (notably customers and employees) are not. The exclusion of these other audiences clearly renders the survey weak for some research purposes. It obviously does not capture the whole of public opinion about the firm. As acknowledged, we believe that scholars can effectively blunt the ''non-representative'' criticism in several ways. To begin, they can present the survey for what it is, and not for something else. It is a powerful, perceptual, omnibus measure of a firm's overall reputation as assessed by a remarkably large group of knowledgeable evaluators whose opinions are highly consequential to firm actions and outcomes. It is a measure which is very widely publicized and closely followed by firms themselves. It is not, however, a measure which captures everything that is known about a firm by all of its relevant audiences. It is thus critically important that researchers emphasize the situated nature of the Fortune survey measure. They should make the particular audiences who respond to the survey an integral part of the research ''story.'' They should, likewise, avoid telling stories that are rendered improbable by the known characteristics and limitations of the survey. This advice, while perhaps intuitive, has not always been followed. We believe that much of the prior debate surrounding the Fortune measure has occurred because it has been used to assess research questions for which it is not particularly well-suited, such as questions about the relationship between a firm's social responsibility and its financial performance (e.g. McGuire, Sundgren, & Schneeweiss, 1988).
The Fortune survey has also been criticized for other reasons. First, scholars have found that it appears to contain a strong performance ‘‘halo’’ (Brown & Perry, 1994; Fombrun & Shanley, 1990). Research has shown that overall firm rankings are very strongly affected by financial performance, and has also found that financial performance is predictive of individual survey items (e.g. community and social responsibility) which bear no obvious, intuitive relationship to profitability and stock price. Second, scholars have found that the individual items are strongly inter-correlated and tend to load on a single factor when subjected to factor analysis (Fombrun & Shanley, 1990; Fryxell & Wang, 1994). While acknowledging these important empirical facts, we diverge appreciably from previous critics of the survey in our interpretation of their meaning. We believe that these criticisms are far from damning, and that they can also be effectively addressed simply by properly situating the survey and its respondents. To wit, there are good reasons to expect that financial performance should exert a great effect on reputation, as it is ascribed by the particular corporate audiences surveyed by Fortune. Given their roles and responsibilities, executives and analysts should give great weight to financial performance as they adjudge overall firm quality. But, the observation that financial performance is a strong influence on reputation in the eyes of these audiences by no means justifies the conclusion that the two are one and the same. As we have strongly emphasized above, research needs to effectively control for performance considerations in assessing the reputational consequences of various corporate actions. But, to the extent that these effects remain after adequately accounting for performance, the ‘‘reputation equals performance’’ argument becomes considerably less persuasive. Indeed, a key advantage of the general approach we recommend is that it makes the issue of whether reputation equals performance into a tractable empirical question, rather than a philosophical debate. Dynamic, action-focused studies of reputation allow scholars to separate performance signals from other signals that firms send and make it possible to evaluate their respective effects on ascribed reputation. They hold substantial promise for this reason (among others). We also believe that the previously observed inter-correlations between survey items are to be expected given what is known (or at least widely believed) about the nature of reputation itself. If reputation is, in fact, an overall, omnibus evaluation of firm quality, this clustering is exactly what one would expect to observe (Fombrun & Shanley, 1990). One would only expect such inter-correlations to be absent if we assume, ex ante, that reputation constitutes nothing more than ‘‘the sum of the parts.’’ Again, a dynamic, action-focused approach toward the study of reputation has the
key advantage of transforming this (otherwise philosophical) question about the relation between parts and wholes into an empirical one.5
A second useful reputation measure is the Reputation Quotient, which was more recently developed by Fombrun and colleagues in conjunction with Harris Interactive (see Fombrun, Gardberg, & Sever, 2000; Fombrun & Van Riel, 2004). These researchers have conducted annual surveys of public regard for the 60 most visible companies in the U.S. since 2001, with more limited data available back to 1999. This instrument is also appealing in light of the above-listed criteria. It is explicitly designed as an overall, perceptual measure of respondents' regard for firms. It is not as well-known or closely followed as Fortune's survey, but summary results and interpretations are published in the Wall Street Journal each year. A clear advantage of the survey is that it polls a representative sample of thousands of Americans, rather than the narrower corporate audiences polled by Fortune. Some disadvantages include the relatively shorter number of available years, the relatively smaller number of firms rated, and the focus on general public opinion (vs. the opinions of specific corporate constituencies). Given these specific features, it is also important that research employing the Reputation Quotient properly situates the survey and its respondents.
How Should the Reputational Consequences of Corporate Actions be Analyzed?
The general model for analyzing the reputational consequences of actions is straightforward. First, if dynamic studies are the goal, an appropriate time-series analysis of reputational changes for a relevant sample of firms is needed. The dependent variable can be constructed as a measure of change in reputation, or as reputation in the current period, controlling for reputation in the prior period. The primary independent variable should, generally speaking, be an indicator variable that specifies whether a given action took place in a given firm-year. Potential moderating effects, such as those described above, can be incorporated by constructing multiplicative interaction terms. Appropriate controls (discussed further below) must also be included in order to mitigate problems of unobserved heterogeneity. Beyond this broad and flexible outline, there are three particular analytical issues which merit more in-depth discussion.
Rankings and Ratings of Reputation
Often researchers will have a choice of analyzing changes in reputational ratings, or changes in reputational rankings. Ratings are the ''raw scores'' – they
constitute evaluators’ actual responses to the Likert-style survey items. Rankings, in contrast, represent the firm’s relative standing within its peer group. They are derived from the raw scores. Each measure has some distinct advantages. From a strictly analytic viewpoint, ratings are more desirable because collapsing them into rankings eliminates some information from consideration. Conceptually, however, there is a strong case for using rankings which are an overtly relational dependent measure. With ranking data, we no longer see a firm in isolation, but rather as it stands in comparison to its peers. This fits very nicely with reputation theory’s emphasis on the relative nature of social approval. Using rankings also effectively controls for inter-industry differences, in a manner closely analogous to (industry) fixed effects models. Our experience suggests that the two approaches tend to yield very similar results. But, this experience may not be wholly typical, and researchers are encouraged to experiment with both approaches. Analytic Technique Changes in ratings can be analyzed through standard cross-sectional time series regression techniques. Fixed effects models are a particularly appropriate technique, given that dynamic, action-focused studies are centrally concerned with explaining ‘‘within firm’’ variance over time. Industry fixed effects, as noted, can also be included. To analyze changes in rankings, we suggest that researchers consider rank-ordered logistic regression (also known as exploded logit regression) (see Allison & Christakis, 1994; Beggs, Cardell, & Hausman, 1981). This technique is a generalization of the conditional logit model (McFadden, 1974) that is particularly suited to analysis of reputational rankings. It has been previously used, for example, to explore the factors that affect human resource manager’s relative rankings of fictional job candidates (vanBeek, Koopmans, & vanPraag, 1997), and to explain consumers’ relative rankings of competing products (e.g. Hausman & Ruud, 1987; Hausman & Taylor, 1981). These previous uses of the technique clearly suggest its applicability for firm ranking data. How to Control for Firm Performance Studies need to include extensive performance controls for two reasons. First, as noted, it is well-known that reputation is strongly affected by financial performance. Second, firm performance likely plays a role in motivating firms to perform many actions and is thus a source of unobserved heterogeneity which must be accounted for if accurate interpretations of Action-Reputation relationships are to be reached. Studies should account for current firm performance, historical firm performance, and for changes in performance which
may accompany or precede the action under study.6 It is also essential to control for multiple dimensions of financial performance (e.g. overall profitability, revenue growth, stock price, etc.). Finally, because some audiences (notably analysts) are likely to be especially concerned with firms' future performance prospects, studies should also include forward-looking measures of performance expectations. Two good examples of performance expectation measures are recent changes in the firm's market valuation and changes in analysts' estimates of future earnings (using the IBES database). Both reflect audiences' expectations about future firm performance well.
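To make the rank-ordered (exploded) logit approach described above somewhat more concrete, the sketch below writes out its log-likelihood directly and maximizes it numerically on synthetic data. This is only a minimal illustration under our own assumptions: the covariates, group structure, and starting values are invented, and actual analyses would typically rely on established implementations (and, for ratings data, on the standard fixed-effects panel models noted above) rather than hand-rolled code of this kind.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

def exploded_logit_negloglik(beta, groups):
    """Negative log-likelihood of the rank-ordered (exploded) logit model.

    `groups` is a list of (X, rank) pairs, one per industry-year: X is an
    (n_firms x n_covariates) array and `rank` gives each firm's observed
    position (1 = most admired). The probability of an observed ranking is
    the product, over positions, of conditional-logit probabilities of
    picking the next-best firm from those not yet ranked.
    """
    ll = 0.0
    for X, rank in groups:
        order = np.argsort(rank)            # firms from best to worst
        u = X[order] @ beta                 # utilities in ranked order
        for k in range(len(u) - 1):         # the last choice is deterministic
            ll += u[k] - logsumexp(u[k:])
    return -ll

# --- Purely synthetic demonstration data ------------------------------------
rng = np.random.default_rng(1)
true_beta = np.array([1.0, -0.5])           # e.g. a performance control (+) and an action dummy (-)
groups = []
for _ in range(50):                          # 50 hypothetical industry-year groups of 8 firms
    X = np.column_stack([rng.normal(size=8), rng.integers(0, 2, 8)])
    u = X @ true_beta + rng.gumbel(size=8)   # Gumbel noise yields logit-consistent rankings
    rank = (-u).argsort().argsort() + 1      # 1 = highest latent utility
    groups.append((X, rank))

fit = minimize(exploded_logit_negloglik, x0=np.zeros(2), args=(groups,))
print("estimated coefficients:", fit.x.round(2))
```

The attraction of this specification, as noted above, is that each firm is evaluated only against its industry peers, so inter-industry differences are absorbed by construction.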
AN IN-DEPTH EXAMPLE: DOWNSIZING AND CORPORATE REPUTATION
We believe that the theoretical and empirical framework for dynamic, action-focused reputation research that we have elaborated is a potentially powerful, broadly applicable, and highly flexible tool. However, we have developed this framework largely as a result of our efforts to understand (both empirically and theoretically) one particular Action-Reputation relationship. This section summarizes that particular study (Love & Kraatz, 2006), describes some of the challenges we faced in conducting it, and explains the choices we made in response to these challenges. We believe this ''story'' is worth recounting here for several reasons. First, it provides one specific example of the general approach we have advocated. Second, it may allow the reader to see how we actually reached some of the conclusions which we have thus far presented. Third, we hope it will alert readers to some of the challenges they may face in conducting similar research and usefully inform their responses to these challenges. Finally, we hope that this study's particular findings, which we will also summarize, will provide some indication of the benefits which may be realized by future studies working within our general framework.
Our study, which is as yet unpublished, focuses on corporate downsizing. It examines how this corporate action affected ascribed reputations over the period from 1985 to 1994, as downsizing became increasingly prevalent. Downsizing initially captured our attention for primarily pragmatic reasons. It was, simply, one of the most obvious and talked-about changes occurring in American business during the 1990s. Two obvious, but seemingly contradictory, facts about downsizing made it particularly interesting to us. First, lots of people did not like it much. It was widely criticized (even by presidential candidates!), and many critics portrayed it as an opportunistic action – a betrayal of trust and an act of bad faith on the part of corporate actors
(Gordon, 1996; Noer, 1993). Second, almost all large American firms engaged in the practice anyway (Baumol, Blinder, & Wolff, 2003; McKinley, Mone, & Barker, 1998). Further, lots of other people, particularly members of the financial community (see Useem, 1993, 1996) and some vocal and prominent executives (e.g. Jack Welch), thought it was quite a good idea and strongly emphasized its benefits. The question of how this particular action would affect observers' approval of the firms actually engaging in it seemed like an important one, if only from a ''lay-theoretic'' standpoint. Our curiosity about this particular question was perhaps accentuated by the fact that the Fortune ''Most Admired Companies'' survey also emerged as a prominent feature of the corporate landscape during roughly the same time period.
Though we were by no means innocent of theoretical predispositions as we approached this empirical problem, the question of how downsizing would affect reputation became progressively more ''interesting'' (and confusing) as we sought to engage it on more self-consciously theoretical terms. Much existing academic literature on downsizing (e.g. Lamertz & Baum, 1998; Love, 2000; McKinley, Sanchez, & Schick, 1995) had viewed the practice through an institutional lens, and this lens sharply focused our initial thinking. Earlier research had emphasized that downsizing had ''diffused,'' and become ''legitimate'' (Lamertz & Baum, 1998; Love, 2000). It proposed that firms had engaged in the action in order to gain social approval from the financial community and from their downsizing peers (McKinley, Zhao, & Rust, 2000; Useem, 1993). It had similarly emphasized that these legitimacy concerns outweighed technical or efficiency considerations in motivating firms to downsize (given downsizing's ambiguous effects on financial and operational performance) (McKinley et al., 1998). These insights implied that we should expect downsizing to yield reputational gains, especially given that the Fortune rankings are based on the evaluations of stock analysts and peer firm executives (two groups who were themselves deeply implicated in the downsizing phenomenon, either as participants or advocates). Neo-institutional insights further led us to expect that these reputational benefits would increase over time, as the practice became more prevalent and thus legitimate.
Looking deeper into the reputation literature, however, we were led to some contradictory expectations. Much reputation theory, as we have noted, suggests that reputation is the product of anthropomorphization processes, wherein audiences attribute human traits to firms (Dowling, 2001). This literature similarly holds that audiences evaluate firms as ''wholes,'' rather than as collections of institutionally appropriate ''parts.'' It indicates that audiences place a very high value on traits such as trustworthiness, commitment,
credibility, and identity-consistency (Fombrun, 1996; Fombrun & Van Riel, 2004). From this perspective, downsizing’s effects on corporate reputation seemed equally obvious – but obviously different. The act strongly appeared to convey a lack of commitment, responsibility, and trustworthiness – even outright opportunism. As noted, many critics described the practice in exactly these terms. We also encountered literature suggesting that reputation (at least as measured by Fortune) is a primary function of technical efficacy and, more specifically, of financial performance (Fryxell & Wang, 1994). Assuming this perspective, concerns about institutional conformity and firm character both faded into the background. What became salient, instead, was downsizing’s ultimate effects (or non-effects) on the bottom line. From this perspective, it was not clear that downsizing would even be consequential for reputations, once accompanying performance signals were appropriately factored out. Confronting these disparate perspectives and trying to take each of them seriously caused us substantial intellectual pain and considerably slowed our progress. We found it particularly vexing that each perspective lent great clarity to parts of the downsizing phenomenon but, at the same time, rendered other parts of the whole empirical picture completely opaque (or invisible). Worse yet, as we assumed these individual perspectives in turn, the very nature of the things we were observing often appeared to morph. It was not just that the perspectives provided different insights into the empirical relationship at hand. Rather, downsizing and reputation themselves seemed to become different things, depending upon our ‘‘theory in use’’ at the time. Ultimately, we came to believe that it was possible (if not easy) to deeply and seriously engage each of these individual perspectives without allowing our analysis to become wholly subsumed within any one of them. We began to see that maintaining a sort of ‘‘inside/outside’’ theoretical posture made it possible to see the empirical relationship at hand through multiple lenses simultaneously. (See again Fig. 1 above). Importantly, assuming this posture allowed – indeed required – us to incorporate corresponding variables into our empirical analyses. (See above section on ‘‘What needs to be taken into account empirically’’). The end result, we believe, is that we were able to craft a ‘‘whole’’ explanation of downsizing’s reputational consequences which is something considerably more than that which we could have obtained if we had relied exclusively on any one theoretical ‘‘part’’ in grounding the study. Further, we also believe that the study yielded some important theoretical insights which would have been otherwise unobtainable precisely because of the inside/outside, multi-theoretical posture that we attempted to
maintain throughout. The study yields no definitive conclusions about the nature of reputation, or about the essential factors that determine reputational flows. Scholars who are strong adherents to any particular theoretical perspective may find it unsatisfying for this reason (and perhaps others). But, the study does provide important insights about the individual and relative merits of different perspectives, at least within our particular study context. It also yields important insights about the apparent interrelationships and complementarities between these perspectives. Importantly, it was only after we had come to terms with these theoretical and empirical issues in our particular research context that we were able to see the potential broad applicability of our methodological and theoretical ‘‘template.’’ Indeed, the very idea of advocating a dynamic, action-focused approach toward studying reputation was the outcome of this particular study, rather than its animus. Perhaps unsurprisingly, the actual design and execution of the study closely parallels the theoretical and empirical design advice we have provided above. To begin, we framed the study in terms of general questions that transcend particular theoretical perspectives on reputation. Namely, ‘‘What makes a firm admirable in the eyes of its external audiences?’’ and ‘‘How do corporate actions affect this admiration?’’ These general questions create a shared focal point for otherwise disparate theoretical perspectives. We then tried to deeply engage alternative perspectives on these questions, taking each on its own terms and being careful not to portray them as necessarily oppositional to one another. We subsequently brought these individual perspectives’ insights to bear in order to develop a series of hypotheses about how downsizing, in particular, should affect corporate reputation. We situated these hypotheses with respect to the known facts of the context, as well as with respect to the various theories. Some of the hypotheses we developed are oppositional (we specified competing hypotheses about downsizing’s main effect on reputation, for example). But, many of them were complementary. We hypothesized, for instance, that downsizing’s effect on reputation would be positively moderated both by stock market reactions (consistent with the technical efficacy perspective) and by the prevalence of downsizing within the field (consistent with the neoinstitutional perspective). We also included extensive controls in our study, particularly for financial performance. This was essential for reasons we have repeatedly discussed. Our study used the Fortune survey data as its outcome measure, but we were particularly careful to situate this survey, consistent with the advice provided above. We framed our arguments and hypotheses in order to make
them consistent with the known characteristics of the Fortune survey, and in order to effectively anticipate criticism of this measure. Indeed, the Fortune surveys’ exclusive focus on peer firm executives and analysts became a strength, rather than a weakness, of our study. Specifically, these groups’ deep involvement with – and evident approval of – downsizing lent plausibility to the argument that downsizing would positively affect their reputational judgments. Consistent with the advice offered above, we also examined differences between executive and analyst groups and developed a hypothesis about these differences. We analyzed downsizing’s effects on reputation using the rank-ordered logistic regression technique described above. Our results showed that downsizing had a strong negative effect on reputation. The average downsizing firm lost more than two-thirds of a position in the intra-industry rankings during the year it downsized, net of all other controls. This finding is noteworthy for several reasons. First, the mere existence of a significant relationship, regardless of its direction, provides some important support for the basic action-focused approach we have advocated. Respondents fill out the Fortune survey only once annually and they process countless pieces of information about the firms they rate each year. Downsizing’s observed influence on ascribed reputation indicates that additional research might productively examine the influence of other important corporate actions using a similar methodology. Second, the negative effect we observed provides some support for the argument that reputational assessments (and changes therein) hinge importantly on organizational character concerns. Third, the fact that we observed this negative relationship despite downsizing’s prevalence and cultural/political standing provides support for the character trait perspective relative to the institutional perspective with its emphasis on symbolic conformity and field level forces. While much research implies that firms engaged in downsizing in a symbolic attempt to gain social approval from external audiences, our findings suggest that the practice was not typically received as such. Several other findings are also notable. First, contrary to our predictions, we found that analysts were actually slightly more negative than peer executives in their evaluations of downsizing firms. This finding lends further credibility to the idea that reputation rests, at least in part, on attributed character traits. It also fails to support arguments which hold that reputational evaluations merely reflect audiences’ self-interested evaluations and need projections. With downsizing, firms seemed to give analysts exactly what they valued and, indeed, asked for. Yet, their reputations suffered as a result of this apparent capitulation. Second, and somewhat alternatively, our study did find that initial market reactions to downsizing (i.e. ‘‘excess returns’’) positively moderated its
reputational consequences. Specifically, audiences tended to judge downsizing firms less negatively to the extent that excess returns were positive (and vice versa). This finding supports the idea that performance is influential in shaping reputational evaluations, if not determinative of them. It also points to the importance of ‘‘information intermediaries’’ in conditioning the reputational effects of firm actions. However, support for these arguments is only partial, as the main effect of downsizing remained strong and consistent despite the inclusion of this (and other) moderating variables. Third, we also found that downsizing’s reputational effect was moderated by prior performance trends at the firm. Specifically, downsizing had a less negative effect on reputation to the extent that firm performance was declining, and a more negative effect when it was increasing. These findings are also consistent with an attributed character trait explanation. Specifically, it appears that firms were partially exempted from their prior commitments when financial troubles gave them an excuse for their actions. Alternatively, they were more likely to be viewed as opportunistic to the extent they downsized without obvious cause. Finally, and perhaps most importantly, we found that the initially strong and negative effect of downsizing largely dissipated over time as the practice became increasingly prevalent and thus, apparently, legitimate. While the practice initially seemed to convey opportunism, it apparently lost much of this negative connotation as more and more firms engaged in the practice. This finding provides strong evidence that field level institutional processes importantly condition the reputational consequences of firm actions, even though they did not appear to determine these consequences in our particular study.
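The ‘‘excess returns’’ measure referred to above is conventionally obtained from a market-model event study of the kind described by Brown and Warner (1985). The following Python sketch is a purely illustrative, hypothetical version of that general calculation – the window lengths, variable names, and simulated data are assumptions for demonstration, not the exact procedure used in our study.

```python
# Illustrative market-model event study for computing "excess" (abnormal) returns
# around a downsizing announcement; all data and window choices here are hypothetical.
import numpy as np

def excess_returns(firm_ret, market_ret, event_start, event_end, est_window=250):
    """Estimate alpha/beta over a pre-event window, then cumulate abnormal returns
    over the event window [event_start, event_end] (indices into the return series)."""
    est_slice = slice(event_start - est_window, event_start)
    beta, alpha = np.polyfit(market_ret[est_slice], firm_ret[est_slice], 1)
    expected = alpha + beta * market_ret[event_start:event_end + 1]
    abnormal = firm_ret[event_start:event_end + 1] - expected
    return abnormal.sum()              # cumulative abnormal ("excess") return

# toy example with simulated daily returns
rng = np.random.default_rng(42)
market = rng.normal(0.0004, 0.01, 300)
firm = 0.0002 + 1.2 * market + rng.normal(0, 0.015, 300)
print(excess_returns(firm, market, event_start=260, event_end=262))
```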
CONCLUSION: POTENTIAL DIRECTIONS FOR DYNAMIC, ACTION-FOCUSED STUDIES OF REPUTATION We believe that the dynamic, action-focused approach that we employed in our study of downsizing could be readily applied to examine the reputational consequences of any number of other corporate actions. The framework we have developed here could be especially useful when applied to other corporate actions that have become widely prevalent or have captured great public and scholarly attention over the past two decades. Below, we draw attention to some particular actions that appear to be appropriate foci for future research, and discuss how these actions might be studied within our framework.
Studies Examining the Reputational Consequences of Governance Changes One avenue for future research would be to examine how corporate reputations were affected by various corporate governance changes that have been widely adopted in recent years. These include the changes in the composition of governing boards, changes in executive compensation plans (Zajac & Westphal, 1995), and firms’ public espousal of the shareholder value model itself (Fiss & Zajac, 2004). These are substantively important changes which have also been the subject of much academic research. Research examining the reputational effects of such changes would appear merited for both pragmatic and theoretical reasons. Firms have adopted many of these changes under pressure from external groups and with the evident intention of gaining these groups’ approval. Research conducted using our framework may help determine whether they yield the intended consequences. Such studies might also be theoretically revealing for reasons similar to our downsizing study. Specifically, some governance changes appear to convey complex and conflicting signals to audiences. By altering governance arrangements in order to privilege shareholder interests, firms signal their responsiveness to that particular constituency and may also signal their conformity with culturally powerful conceptions of firm governance. However, these same actions may lead audiences to make negative attributions about the fairness and responsibility of the corporate whole.
Studies Examining the Reputational Consequences of Changing Firm Boundaries Dynamic studies might also productively examine the reputational consequences of major changes in firm boundaries, such as acquisitions and divestitures. These structural changes have also been seemingly ubiquitous occurrences over the past two decades and are interesting from both a substantive and theoretical standpoint. Acquisitions and sell-offs are almost universally justified with respect to their anticipated performance benefits. However, research has also shown that institutional processes figure prominently in their occurrence (as with downsizing). Specifically, firms have been argued to alter their boundaries in partial response to institutional expectations and demands (Davis, Diekmann, & Tinsley, 1994; Zuckerman, 2000). Further, major contractions or expansions in the boundaries of a firm would appear to send strong signals about organizational character,
identity, and historical continuity. Finally, such actions may also affect the relational status of a firm (particularly in corporate acquisitions). Given these complexities, our theoretical understanding may be much augmented by studies examining how acquisitions and divestitures affect reputations.
Studies Examining the Reputational Consequences of Changing Inter-Organizational Relationships Studies examining reputational changes resulting from the creation of new inter-organizational relationships, as in joint ventures and strategic alliances, could also be conducted within our framework. These inter-organizational relationships have similarly become increasingly common in recent years. To the extent that reputation is a partial function of relational status, we would expect the formation and dissolution of such relationships to affect it. However, as with the other actions we have discussed, joint venture announcements may send multiple and possibly conflicting signals. In particular, these actions may also affect performance and lead audiences to make attributions about organizational identity and character. Further, to the extent that joint ventures have become more prevalent due to institutional processes, we might also view them as signals of cultural conformity. Studies using our framework to examine the reputational consequences of joint ventures and alliances could help sort out these effects.
Studies Examining the Reputational Consequences of Performance Changes, Patterns, or Events We also believe that studies examining the reputational impact of various performance signals can usefully be conducted within the confines of the broad framework we have crafted (even though they do not appear to be ‘‘corporate actions’’ as such). The relationship between performance outcomes and reputation has been of intense interest to reputational researchers (Brown & Perry, 1994; Fombrun & Shanley, 1990). However, if we take seriously the idea that reputational assessments are situated in multiple contexts, additional and unexplored questions about the performance–reputation relationship emerge. For example, one could draw on neo-institutional arguments to explore whether corporate reputation has been driven by different performance variables in different historical eras. As a case in point, it is possible that stock prices have become more determinative
Table 2. Summary of Recommendations.

Study reputational changes over time
  - Reputational ‘‘flows’’ vs. reputational ‘‘stocks’’
  - Examine processes through which reputation is augmented/depleted

Study reputational consequences of particular corporate actions
  - Identify how reputational flows are affected by things firms do
  - Pragmatic and theoretical benefits

Bring multiple theoretical perspectives on reputation to bear
  - Perspectives draw attention to different actions
  - Perspectives provide different predictions/explanations about effects of same action
  - Studies grounded in one perspective should be informed by multiple perspectives
  - Particular perspectives to consider:
      - Reputation as a function of organizational character traits
      - Reputation as a function of symbolic conformity and cultural prominence
      - Reputation as a function of technical efficacy
      - Reputation as a function of relational status

Focus particularly on following types of actions
  - Actions that are interesting and seemingly consequential for firms themselves
  - Actions that appear to send signals about organizational character, convey symbolic conformity, generate attention for the firm, affect technical efficacy, or alter the relational status of the firm
  - Actions that appear to send complex and conflicting signals may be particularly appropriate

Empirically ‘‘situate’’ corporate actions
  - Consider how the action relates to the firm’s past actions, espoused commitments, and/or claimed identity
  - Consider how the action relates to field level institutional demands and broader institutional dynamics
  - Consider how the action affects the organization’s prominence and exposure within the field
  - Consider how the action affects technical performance
  - Consider how the action affects the firm’s relational status
  - Consider how the action is likely to be construed by particular audiences ascribing reputation
  - Empirically examine these factors as moderators and/or controls

How to measure reputation
  - Use global, omnibus evaluations of the firm as a whole
  - Use perceptual evaluations
  - Use measures taken at multiple intervals over extended periods of time
  - Use measures that are established, that capture public attention, and which firms care about
  - Use measures that are ascribed by audiences whose opinions matter to the firm
  - Use measures that separate multiple audiences, where possible

How to analyze reputational consequences of firm actions
  - Use fixed effects regression to examine changes in reputational ratings
  - Use rank-ordered logistic regression to examine changes in reputational rankings
  - Incorporate extensive controls, particularly for firm performance

Some ideas for future dynamic, action-focused studies of reputation
  - Study the reputational consequences of governance changes
  - Study the reputational consequences of changing firm boundaries
  - Study the reputational consequences of changing inter-organizational relationships
  - Study the reputational consequences of performance changes, patterns, or events
of reputational standing as the shareholder value model gained increasing cultural influence over time. Research might also examine how within-firm trends in performance affect reputational assessments, and examine how reputation is affected by discontinuities in these trends. A unifying theme discernible in all of these suggested studies is that actions which initially appear consequential for one reason may be as consequential (or more so) for other reasons. Analogously, actions which capture research attention because they appear interesting through one theoretical lens may be equally interesting (if less intuitive) when seen through an alternate one. For these reasons, and others which we have elaborated, we believe that a multi-theoretical approach is highly desirable for the conduct of dynamic, action-focused studies. Table 2 summarizes the various recommendations that we have provided for research design. We hope that future studies will examine the specific questions and phenomena that we have identified, as well as other, similar questions and phenomena. Such studies may provide important insights into the basic questions we set out to address. Specifically, they might tell us much about the dynamics of reputation and the reputational consequences of firm actions.
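To make the analytic recommendation in Table 2 more concrete, the sketch below illustrates the likelihood underlying rank-ordered (‘‘exploded’’) logistic regression, the technique recommended above for modeling changes in intra-industry reputational rankings. It is a simplified, hypothetical illustration in Python using simulated data, not the code from our study; the variable names and the toy covariates are assumptions.

```python
# Minimal sketch of the rank-ordered ("exploded") logit likelihood; data are simulated.
import numpy as np
from scipy.optimize import minimize

def ranking_loglik(beta, X, ranking):
    """Log-likelihood of one observed ranking (best to worst) of the firms in X."""
    v = X @ beta                                           # latent "admiration" scores
    ll, remaining = 0.0, list(ranking)
    while len(remaining) > 1:
        top = remaining[0]
        ll += v[top] - np.logaddexp.reduce(v[remaining])   # P(top is best of those left)
        remaining = remaining[1:]                          # "explode": drop the chosen firm
    return ll

def fit_rank_ordered_logit(groups):
    """groups: list of (X, ranking) pairs, e.g., one per industry-year."""
    p = groups[0][0].shape[1]
    neg_ll = lambda b: -sum(ranking_loglik(b, X, r) for X, r in groups)
    return minimize(neg_ll, np.zeros(p), method="BFGS")

# simulate 50 industry-years of 8 firms with two covariates (e.g., downsizing, performance)
rng = np.random.default_rng(0)
true_beta = np.array([-1.0, 0.5])
groups = []
for _ in range(50):
    X = rng.normal(size=(8, 2))
    utility = X @ true_beta + rng.gumbel(size=8)           # Gumbel noise implies the logit form
    groups.append((X, np.argsort(-utility)))
print(fit_rank_ordered_logit(groups).x)                    # estimates close to true_beta
```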
NOTES 1. There is a relatively small number of extant studies that examine the relationship between corporate actions and reputation (e.g. Flanagan & O’Shaughnessy, 2005; Staw & Epstein, 2000; Zyglidopoulos, 2001). These studies, to varying degrees, contain elements of the approach we advocate here. We present the integrated approach in this chapter as a way to situate studies of corporate actions in a broader and more robust conceptual framework than has been used to date.
2. Game theorists have also emphasized that trustworthy behavior is important in the construction of reputation. However, they have taken a somewhat more narrow view of the concept of reputation than have most organizational scholars (Weigelt & Camerer, 1988). 3. The dimensions are: Management quality, product quality, innovativeness, value as a long-term investment, financial soundness, ability to attract, develop and retain personnel, community and environmental responsibility, and use of corporate assets. 4. Recently, Fortune has moved to selling these disaggregated data for a hefty fee. 5. Szwajkowski and Figlewicz (1997) consider the issue of high inter-item correlations on the Fortune survey in great depth. They similarly conclude that these correlations are unproblematic. 6. Here we focus specifically on a firm’s financial performance. Other types of performance and performance change should be similarly controlled for to the extent possible.
REFERENCES Allison, P. D., & Christakis, N. A. (1994). Logit models for sets of ranked items. Sociological Methodology, 24, 199–228. Baron, J. N. (2004). Employing identities in organizational ecology. Industrial and Corporate Change, 13(1), 3–32. Baron, J. N., Hannan, M. T., & Burton, M. D. (2001). Labor pains: Change in organizational models and employee turnover in young, high-tech firms. American Journal of Sociology, 106(4), 960–1012. Baron, J. P., Dobbin, F., & Jennings, P. D. (1986). War and peace: The evolution of modern personnel administration in U.S. industry. American Journal of Sociology, 92, 250–283. Baumol, W. J., Blinder, A. S., & Wolff, E. N. (2003). Downsizing in America: Reality, causes, and consequences. New York: Russell Sage. Beggs, S., Cardell, S., & Hausman, J. (1981). Assessing the potential demand for electric cars. Journal of Econometrics, 17(1), 1–19. Brown, B., & Perry, S. (1994). Removing the financial performance halo from fortunes most admired companies. Academy of Management Journal, 37(5), 1347–1359. Brown, S. J., & Warner, J. B. (1985). Using daily stock returns: The case of event studies. Journal of Financial Economics, 14, 3–31. Davies, G. (2002). Corporate reputation and competitiveness. New York: Routledge. Davis, G. F., Diekmann, K. A., & Tinsley, C. H. (1994). The decline and fall of the conglomerate firm in the 1980s – the deinstitutionalization of an organizational form. American Sociological Review, 59(4), 547–570. Deephouse, D. L. (2000). Media reputation as a strategic resource: An integration of mass communication and resource-based theories. Journal of Management, 26(6), 1091–1112. Deephouse, D. L., & Carter, S. (2005). An examination of the differences between organizational legitimacy and organizational reputation. Journal of Management Studies, 42(2), 329–360. DiMaggio, P. J., & Powell, W. W. (1983). The iron cage revisited: Institutional isomorphism and collective rationality in organizational fields. American Sociological Review, 48(April), 147–160.
Dowling, G. R. (2001). Creating corporate reputations: Identity, image, and performance. New York: Oxford University Press. Edelman, L. B. (1992). Legal ambiguity and symbolic structures: Organizational mediation of civil rights law. American Journal of Sociology, 97, 1531–1576. Fiss, P. C., & Zajac, E. J. (2004). The diffusion of ideas over contested terrain: The (non)adoption of a shareholder value orientation among German firms. Administrative Science Quarterly, 49(4), 501–534. Flanagan, D. J., & O’Shaughnessy, K. C. (2005). The effect of layoffs on firm reputation. Journal of Management, 31(3), 445–463. Fombrun, C., Gardberg, N. A., & Sever, J. M. (2000). The reputation quotient: A multistakeholder measure of corporate reputation. Journal of Brand Management, 1(3), 241–255. Fombrun, C., & Shanley, M. (1990). Whats in a name – reputation building and corporatestrategy. Academy of Management Journal, 33(2), 233–258. Fombrun, C. J. (1996). Reputation: Realizing value from the corporate image. Boston, Mass.: Harvard Business School Press. Fombrun, C. J., & VanRiel, C. B. M. (2004). Fame & fortune: How successful companies build winning reputations. Upper Saddle River, NJ: Pearson Education. Friedland, R., & Alford, R. R. (1991). Bringing society back in: Symbols, practices, and institutional contradictions. In: W. W. Powell & P. DiMaggio (Eds), The new institutionalism in organizational analysis (pp. 232–263). Chicago: University of Chicago Press. Fryxell, G. E., & Wang, J. (1994). The fortune corporate reputation index – Reputation for what. Journal of Management, 20(1), 1–14. Gordon, D. M. (1996). Fat and mean: The corporate squeeze of working Americans and the myth of managerial downsizing. New York: Martin Kessler Books. Greenwood, R., Suddaby, R., & Hinings, C. R. (2002). Theorizing change: The role of professional associations in the transformation of institutionalized fields. Academy of Management Journal, 45(1), 58–80. Hannan, M. T., & Freeman, J. (1984). Structural inertia and organizational change. American Sociological Review, 49, 149–164. Haunschild, P., & Miner, A. S. (1997). Modes of interorganizational imitation: The effects of outcome salience and uncertainty. Administrative Science Quarterly, 42, 472–500. Hausman, J. A., & Ruud, P. A. (1987). Specifying and testing econometric-models for rankordered data. Journal of Econometrics, 34(1–2), 83–104. Hausman, J. A., & Taylor, W. E. (1981). Panel data and unobservable individual effects. Econometrica, 49(6), 1377–1398. Jones, T. M. (1995). Instrumental stakeholder theory: A synthesis of ethics and economics. Academy of Management Review, 20(2), 404–437. Kraatz, M. S. (1998). Learning by association? Interorganizational networks and adaptation to environmental change. Academy of Management Journal, 41(6), 621–643. Lamertz, K., & Baum, J. A. C. (1998). The legitimacy of organizational downsizing in Canada: An analysis of explanatory media accounts. Canadian Journal of Administrative SciencesRevue Canadienne Des Sciences De L Administration, 15(1), 93–107. Love, E. G. (2000). Changing technical and institutional influences on adoption of an administrative practice: Downsizing at large U.S. firms, 1977–1995. Academy of Management Best Papers Proceedings.
Love, E. G., & Kraatz, M. S. (2006). The dynamics of reputation: How downsizing affected corporate reputation at large U.S. Firms, 1985–1994. Working Paper, University of Illinois at Urbana-Champaign. McFadden, D. (1974). Conditional logit analysis of qualitative choice behavior. In: P. Zarembka (Ed.), Frontiers in econometrics (pp. 105–142). New York: Academic Press. McGuire, J., Sundgren, A., & Schneeweiss, T. (1988). Corporate social responsibility and firm financial performance. Academy of Management Journal, 31, 854–872. McKinley, W., Mone, M. A., & Barker, V. L. (1998). Some ideological foundations of organizational downsizing. Journal of Management Inquiry, 7(3), 198–212. McKinley, W., Sanchez, C. M., & Schick, A. G. (1995). Organizational downsizing: Constraining, cloning, learning. Academy of Management Executive, 9(3), 32–44. McKinley, W., Zhao, J., & Rust, K. G. (2000). A sociocognitive interpretation of organizational downsizing. Academy of Management Review, 25(1), 227–243. Meyer, J. W., & Rowan, B. (1977). Institutional organizations: Formal structure as myth and ceremony. American Journal of Sociology, 83(2), 440–463. Meyer, M. W. (2002). Rethinking performance measurement: Beyond the balanced scorecard. Cambridge: Cambridge University Press. Mezias, S. J. (1990). An institutional model of organizational practice – financial-reporting at the fortune 200. Administrative Science Quarterly, 35(3), 431–457. Noer, D. M. (1993). Healing the wounds: Overcoming the trauma of layoffs and revitalizing downsized organizations (1st ed.). San Francisco: Jossey-Bass. Podolny, J. M. (1993). A status-based model of market competition. American Journal of Sociology, 98(4), 829–872. Podolny, J. M. (1994). Market uncertainty and the social character of economic exchange. Administrative Science Quarterly, 39(3), 458–483. Pollock, T. G., & Rindova, V. P. (2003). Media legitimation effects in the market for initial public offerings. Academy of Management Journal, 46(5), 631–642. Rao, H. (1994). The social construction of reputation – certification contests, legitimation, and the survival of organizations in the American automobile-industry – 1895–1912. Strategic Management Journal, 15, 29–44. Rindova, V. P., Williamson, I. O., Petkova, A. P., & Sever, J. M. (2005). Being good or being known: An empirical examination of the dimensions, antecedents, and consequences of organizational reputation. Academy of Management Journal, 48(6), 1033–1049. Roberts, P. W., & Dowling, G. R. (2002). Corporate reputation and sustained superior financial performance. Strategic Management Journal, 23(12), 1077–1093. Ruef, M., & Scott, W. R. (1998). A multidimensional model of organizational legitimacy: Hospital survival in changing institutional environments. Administrative Science Quarterly, 43(4), 877–904. Schultz, M., Mouritsen, J., & Gabrielsen, G. (2001). Sticky reputation: Analyzing a ranking system. Corporate Reputation Review, 4(1), 21–41. Scott, W. R. (2001). Institutions and organizations (2nd ed.). Thousand Oaks CA: Sage. Selznick, P. (1957). Leadership in administration. Evanston, II: Row, Peterson. Shrum, W., & Wuthnow, R. (1988). Reputational status of organizations in technical systems. American Journal of Sociology, 93(4), 882–912. Staw, B. M., & Epstein, L. D. (2000). What bandwagons bring: Effects of popular management techniques on corporate performance, reputation, and CEO pay. Administrative Science Quarterly, 45(3), 523–556.
Stuart, T. E., Hoang, H., & Hybels, R. C. (1999). Interorganizational endorsements and the performance of entrepreneurial ventures. Administrative Science Quarterly, 44(2), 315–349. Szwajkowski, E., & Figlewicz, R. E. (1997). Of babies and bathwater: An extension of the business & society research forum on the fortune reputation database. Business and Society, 36(4), 362–386. Tolbert, P. S., & Zucker, L. G. (1983). Institutional sources of change in formal structure of organizations: The diffusion of civil service reform. Administrative Science Quarterly, 28, 22–39. Tolbert, P. S., & Zucker, L. G. (1997). The institutionalization of institutional theory. In: S. Clegg, C. Hardy & o. other (Eds), Handbook of organization studies (pp. 174–190). Newbury Park: Sage. Tyler, T. R. (1990). Why people obey the law. New Haven, CT: Yale University Press. Tyler, T. R. (1999). The psychology of legitimacy. Personality and social psychology review, 1, 323–344. Useem, M. (1993). Executive defense: Shareholder power and corporate reorganization. Cambridge, MA: Harvard University Press. Useem, M. (1996). Investor capitalism: How money managers are changing the face of corporate America (1st ed.). New York: Basic Books. vanBeek, K. W. H., Koopmans, C. C., & vanPraag, B. M. S. (1997). Shopping at the labour market: A real tale of fiction. European Economic Review, 41(2), 295–317. Washington, M., & Zajac, E. J. (2005). Status evolution and competition: Theory and evidence. Academy of Management Journal, 48(2), 282–296. Weigelt, K., & Camerer, C. (1988). Reputation and corporate-strategy – a review of recent theory and applications. Strategic Management Journal, 9(5), 443–454. Westphal, J. D., Gulati, R., & Shortell, S. M. (1997). Customization or conformity? An institutional and network perspective on the content and consequences of TQM adoption. Administrative Science Quarterly, 42(2), 366–394. Zajac, E. J., & Westphal, J. D. (1995). Accounting for the explanations of ceo compensation – substance and symbolism. Administrative Science Quarterly, 40(2), 283–308. Zuckerman, E. W. (2000). Focusing the corporate product: Securities analysts and de-diversification. Administrative Science Quarterly, 45(3), 591–619. Zyglidopoulos, S. C. (2001). The impact of accidents on firms’ reputation for social performance. Business and Society, 40, 416–441.
AN ASSESSMENT OF THE USE OF STRUCTURAL EQUATION MODELING IN INTERNATIONAL BUSINESS RESEARCH$ G. Tomas M. Hult, David J. Ketchen Jr., Anna Shaojie Cui, Andrea M. Prud’homme, Steven H. Seggie, Michael A. Stanko, Alex Shichun Xu and S. Tamer Cavusgil ABSTRACT Structural equation modeling (SEM) is a powerful multivariate statistical technique that requires careful application. The use of SEM in international business research has substantially increased recently, necessitating a critical evaluation of its use in the field. Through an analysis of 148 articles in the international business (IB) literature, we detail the state of current use of SEM in IB research and compare its use to the established best practices. In many instances, SEM’s use in IB has been
$This research was supported by the Center for International Business Education and Research at Michigan State University (MSU-CIBER), and is the product of a doctoral seminar paper from a seminar led by the first author.
Research Methodology in Strategy and Management, Volume 3, 385–415 Copyright © 2006 by Elsevier Ltd. All rights of reproduction in any form reserved ISSN: 1479-8387/doi:10.1016/S1479-8387(06)03012-8
faulty, suggesting that authors may have drawn incorrect conclusions. To expand the IB field’s knowledge base, methodological accuracy is essential. Based on our review of the technique’s use in IB research coupled with the established practices in the social science literature, we provide practical suggestions for better applying SEM in the IB literature.
INTRODUCTION Since its emergence as an area of business research in the mid-1950s (e.g., Wright, 1970), international business (IB) has evolved in many ways, including the choice of research methodologies. This evolution includes the reliance on sophisticated data analysis procedures such as structural equation modeling (SEM). Broadly, SEM is a multivariate statistical technique that tests relationships between observed and/or latent variables while incorporating potential measurement errors; it has a unique ability to simultaneously examine complex webs of linkages among variables (Jöreskog, Sörbom, du Toit, & du Toit, 1999). As such, SEM allows for a more powerful assessment of interactions, measurement error, and multiple independent and dependent relationships than traditional multivariate techniques (Hair, Anderson, Tatham, & Black, 2005). The expanded capability to examine data in a nomological net also allows researchers to focus on the validity of the overall conceptual model rather than a more limited focus on individual coefficients. At the same time, SEM is a complex technique, and with that complexity come challenges for users. Indeed, reviews of SEM’s use in fields such as marketing (Steenkamp & van Trijp, 1991), organizational behavior (Brannick, 1995), management information systems (Chin, 1998), logistics (Garver & Mentzer, 1999), and strategic management (Shook, Ketchen, Hult, & Kacmar, 2004) have uncovered flaws in their respective literatures. As with each of these other literatures, the use of SEM within IB research is distinctive. As such, it is appropriate and critical to examine SEM’s use in IB research at this stage of the field’s development and of its use of the technique. If similar problems exist in the IB literature, significant consequences are likely. For example, any mistake in the use of SEM weakens the validity of resultant findings. From a broader perspective, such issues inhibit IB researchers’ abilities to build knowledge and, to the extent that errors are embedded in published studies, the guidance offered for subsequent research will be faulty. Over time, this undermines the ability to draw scholarly conclusions and to provide accurate suggestions for managers.
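As a purely illustrative example of what such a model looks like in practice, the sketch below specifies a small measurement and structural model in Python. It assumes the third-party semopy package and its lavaan-style model syntax; the constructs, indicator names, and data file are hypothetical and are not drawn from any study reviewed here.

```python
# Hypothetical SEM specification: two latent constructs, each with three reflective
# indicators, plus one structural path between them (all names are illustrative).
import pandas as pd
import semopy

model_desc = """
MarketOrientation =~ mo1 + mo2 + mo3
ExportPerformance =~ ep1 + ep2 + ep3
ExportPerformance ~ MarketOrientation
"""

data = pd.read_csv("survey_responses.csv")   # hypothetical indicator-level data file
model = semopy.Model(model_desc)
model.fit(data)                              # estimates the model (ML-type objective is the usual default)
print(model.inspect())                       # loadings, path coefficients, standard errors
print(semopy.calc_stats(model))              # chi-square, CFI, RMSEA, and other fit indices
```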
The peculiarities of studying IB phenomena introduce threats to reliability and validity not present in domestically based studies. These peculiarities include translation issues, cross-cultural differences, and variations in accounting systems across nations (e.g., Craig & Douglas, 2000). The potential of these threats to distort findings makes it imperative that researchers carefully control other threats such as by implementing SEM appropriately. In addition, IB is a relatively young discipline that is striving to gain a level of ‘status’ that parallels those of the more established functional disciplines such as marketing and management. The continued development of the IB field, relative to its sister disciplines in business and the social sciences, depends on the extent that statistical techniques (especially sophisticated techniques such as SEM) are implemented appropriately. The issues surrounding SEM’s use in the IB context are of increasing importance because of the technique’s growing prevalence in the field. While the first article to use SEM in IB research appeared two decades ago (i.e., Graham, 1985), SEM’s use among IB researchers has expanded rapidly in recent years. For example, JIBS published 26 SEM-based studies between 1999 and 2004, but it featured only nine such articles prior to 1999. Given the challenges inherent in using SEM, the recent flurry of SEM-based IB research, and the likely consequences of the misuse of SEM in the IB field, an appraisal of SEM’s use in IB research is both valuable and warranted. Thus, the purposes of our paper are to (1) assess the quality of past applications of SEM in IB research and (2) suggest remedies for any shortcomings discovered. As such, our paper offers what Hitt, Boyd, and Li (2004, p. 23) call ‘research kaizen’ – an effort to identify methodological predicaments and propose solutions. As Hitt et al. (2004) suggest, fields such as IB need research kaizen in order to ensure theoretical progress and knowledge generation.
KEY CONCERNS WHEN USING SEM We examined a total of eight topics central to IB’s use of SEM. Six topics have been identified in previous SEM reviews as being critical (cf. Shook et al., 2004): data characteristics, reporting, model respecification, reliability and validity, equivalent models, and evaluation of model fit. In addition, we included two other SEM-based issues of particular importance to IB research. The first deals with the assessment of measurement equivalence, a particularly useful feature of SEM for IB because IB research often involves cross-country studies. SEM is well suited to multiple-sample inquiry because it allows for the testing of equivalence of measurement models across groups
(Bollen, 1989). Second, we examined how IB researchers coped with common method bias. A large portion of IB research is survey based, which means that common-method bias is inherently a significant concern. SEM allows researchers to perform ‘single-factor’ (e.g., McFarlin & Sweeney, 1992) and ‘same-source’ (e.g., Netemeyer, Boles, McKee, & McMurrian, 1997) tests to assess the potential for common method variance to inhibit results’ reliability and validity. Table 1 lists the 148 studies we examined relative to our eight topics. Of the ten journals we searched, four were selected owing to their standings as the premier academic journals focusing on IB, while the other six are prominent journals in functional areas of business that feature rigorous IB research. An article had to meet two criteria to be included in the analysis: the article had to be international in focus (i.e., it examined multinational enterprises, international joint ventures, cross-cultural issues, and/or included more than one country in the study – Werner & Brouthers, 2002) and the authors had to use SEM for investigation of a measurement model or measurement constructs and/or a structural model. We did not include studies that used partial least squares (PLS), as this is not a covariancebased technique. Two independent researchers coded each of the 148 articles. The inter-rater reliability was 96.65%, which compares favorably with similar studies (e.g., 93%, Shook et al., 2004). In instances of disagreement, the two coders discussed the issue until an agreement was reached. Table 1 suggests that SEM’s use has grown significantly in the last few years. We found only 52 relevant articles from 1985 to 1999 (35% of the total), while we found 96 articles from 2000 to 2004 (65%). Of the 148 articles, 140 (95%) disclosed the number of countries sampled, usually by listing the specific countries. Of these 140 articles, 66 (47%) sampled more than one country. The mean number of countries sampled per article was 3.9, with a maximum of 77. The most frequently studied country was the United States (74 studies), followed by Japan (33), China (22), and the United Kingdom (19). Eighty articles (54%) used SEM for both the measurement and structural models; 47 articles (32%) contained a measurement model or tested constructs only (no structural model or structural relationships were examined), and 21 (14%) used SEM for the structural model only. Data Characteristics Although SEM is often referred to as ‘causal modeling,’ it can only provide evidence of causality, not establish causality. Also, the strength of the evidence SEM provides depends on the design used. Cross-sectional studies
Table 1. Articles Included in the Analysis.
Academy of Management Journal (n ¼ 6) Peterson et al. (1995) Janssens et al. (1995) Tsai and Ghoshal (1998) Isobe et al. (2000) Montoya-Weiss et al. (2001) Kostova and Roth (2002) Administrative Science Quarterly (n ¼ 2) Farh et al. (1997) Luo (2001) International Business Review (n ¼ 23) Nonaka et al. (1994) Tan and Vertinsky (1995) Jun et al. (1997) Usunier (1998) Andersen and Kheam (1998) Pedersen and Petersen (1998) Peng and Peterson (1998) Forsgren et al. (1999) Ebrahimi (2000) Holm and Eriksson (2000) Birkinshaw et al. (2000) Andersson et al. (2001) Lindquist et al. (2001) Chetty and Eriksson (2002) Moon and Jain (2002) Baldauf et al. (2002) Yli-Renko et al. (2002) Eriksson and Chetty (2003) Balabanis and Katsikea (2003) Aurifeille and Quester (2003) Zeybek et al. (2003) Knight et al. (2003) Lee and Habte-Giorgis (2004) Journal of International Business Studies (n ¼ 35) Johnson et al. (1990) Lee and Green (1991) Roth and Romeo (1992) Cullen et al. (1995) Rao and Hashimoto (1996) Holm et al. (1996) Dyer and Song (1997) Eriksson et al. (1997)
Lin and Germain (1998)
Simonin (1999)
Money and Graham (1999) Shaffer et al. (1999) Cadogan et al. (1999) Pillai et al. (1999) Griffith et al. (2000) Steensma et al. (2000) Balabanis (2001) Jun et al. (2001) Laroche et al. (2001) Klein (2002) Olsen and Olsson (2002) Makhija and Stewart (2002) Pothukuchi et al. (2002) Skarmeas et al. (2002) Cadogan et al. (2002) Arino (2003) Steenkamp, et al. (2003) Buck et al. (2003) Zhang et al. (2003) Simonin (2004) Shay and Baack (2004) Knight and Cavusgil (2004) Jensen and Szulanski (2004) Hui et al. (2004) Dhanaraj et al. (2004) Journal of International Marketing (n ¼ 42) Golden et al. (1995) Keillor et al. (1996) Kumcu (1997) LaBahn and Harich (1997) Powpaka (1998) Styles (1998) McGowan and Sternquist (1998) Zou et al. (1998) Li (1999) Shoham (1999) Li and Atuahene-Gima (1999) Lin and Germain (1999) Zou and Ozsomer (1999) Knight (2000) Yeoh (2000) McNaughton and Bell (2000) Ozsomer and Prussia (2000)
Table 1. (Continued ) Song and Xie (2000)
Calantone and Zhao (2000) Hult et al. (2000) Yip et al. (2000) Myers et al. (2000) Dow (2001) Myers and Harvey (2001) Granzin and Painter (2001) Armstrong and Yee (2001) Griffith et al. (2001) Hsieh (2002) Kim and Oh (2002) Homburg et al. (2002) Alashban et al. (2002) Lin and Germain (2003) Zou et al. (2003) Thelen and Honeycutt (2004) Lages and Lages (2004) Toften and Olsen (2004) Laroche et al. (2004) Piercy et al. (2004) Townsend et al. (2004) Keillor, et al. (2004) Luo et al. (2004) Cavusgil et al. (2004) Journal of Marketing (n ¼ 12) Johnson et al. (1993) Bello and Gilliland (1997) Klein et al. (1998) Rose (1999) Song et al. (2000) Cannon and Homburg (2001) Grewal and Tansuhaj (2001) De Wulf et al. (2001) Zou and Cavusgil (2002) Atuahene-Gima and Li (2002)
Coviello et al. (2002)
Morgan et al. (2004)
Management Science (n ¼ 1) Gatignon et al. (2002) Management International Review (n ¼ 8) Lim et al. (1991) Hoang (1998) Furu (2000) Eriksson et al. (2000) Goll et al. (2001) Agarwal, Malhotra, and Wu (2002) West and Graham (2004) Fiegenbaum et al. (2004) Marketing Science (n ¼ 2) Graham (1985) Calantone et al. (1996) Strategic Management Journal (17 articles) Murtha et al. (1998) Capron (1999) Simonin (1999) Yeoh and Roth (1999) Holm et al. (1999) Steensma and Lyles (2000) Hult and Ketchen (2001) Capron et al. (2001) Li and Atuahene-Gima (2002) Koka and Prescott (2002) Worren et al. (2002) Kotabe et al. (2002) Andersson et al. (2002) Schroeder, et al. (2002) Goerzen and Beamish (2003) Barr and Glynn (2004) Hoskisson et al. (2004)
Using the guidelines of MacCallum et al. (1996), the 52 studies marked with an asterisk had
adequate power; 70 other studies did not, and we were unable to assess power for 26 studies.
provide weak evidence of causality because, if two factors are related, it is not clear empirically which construct shapes the other. Longitudinal designs provide stronger causal evidence because independent variables are measured before the dependent variables, minimizing the
chance of a reverse causality problem. As shown in Table 2, we found that 144 out of 148 studies (97%) relied on cross-sectional designs. All four of the studies using longitudinal designs appeared between 2000 and 2004, perhaps suggesting a burgeoning awareness of longitudinal designs’ value. However, the 3% use of longitudinal designs does not compare favorably to the 25% use of such designs found in the SEM-based strategic management literature (Shook et al., 2004). In sum, IB research relying on SEM has not provided evidence about causal relationships that is as strong as it could have been. Normality is a second concern involving the data used for SEM analysis. Most estimation techniques in SEM, including the most popular ones (maximum likelihood estimation and generalized least squares), require that the indicator variables display multivariate normality in order to obtain reliable estimates. When the normality assumption is violated, using these techniques frequently results in distorted goodness-of-fit measures and underestimated standard errors (MacCallum, Roznowski, & Necowitz, 1992). When standard errors are underestimated, significant coefficients may be derived simply because of the reduction in standard errors. One likely result is inaccurate findings and possibly erroneous conclusions. Therefore, authors need to assess and discuss the multivariate normality of their data and, if needed, use remedies to account for non-normality. As shown in Table 2, 134 of the 148 studies included in our analysis (91%) did not discuss the distribution characteristics of the data. Of the 14 studies that did discuss data normality issues, nine (64%) noted the normality of their data while five (36%) noted that their data were non-normal. Table 2 also presents results comparing usage since 2000 to earlier usage (1985–1999). On the plus side, the mention of data distribution properties has increased slightly over time. Six percent of the articles published before 2000 mentioned data normality, while 11% of the studies published in 2000 or later explicitly mentioned data properties. Future scholars can continue this positive trend by considering the two general approaches to dealing with non-normality problems in SEM-based research. One solution is to use data transformation before SEM estimation to make the data better approximate multivariate normality. However, we would caution against this as it often violates the theoretical logic underpinning the original dataset. An alternative is to use nonconventional estimation techniques that have less stringent requirements of data distribution properties, or to use estimation techniques that adjust the model fit statistics and standard errors of each individual parameter estimate for non-normal data (Jöreskog et al., 1999). As an example, Browne
Table 2. Summary of Decisions Made by IB Researchers When Using SEM. (For each decision, the rows below report the number of studies and the corresponding percentage for three periods: 1985–1999 (52 studies), 2000–2004 (96 studies), and 1985–2004 (148 studies).)
Data characteristics Sampling design Cross sectional Longitudinal
52 0
100 0
92 4
96 4
144 4
97 3
Sample distribution Normal Not normal Not mentioned
3 0 49
6 0 94
6 5 85
6 5 89
9 5 134
9 5 134
Estimation technique MLE ERLS GLS GWLS Not noted
18 0 2 0 32
35 0 4 0 62
24 6 2 1 63
25 6 2 1 66
42 6 4 1 95
28 4 3 1 64
Reporting Input matrix used Correlation Covariance Raw data Not specified
5 17 1 29
10 33 2 56
2 19 0 75
2 20 0 78
7 36 1 104
5 24 1 70
Software package used LISREL EQS
34 8
65 15
49 16
51 17
83 24
56 16
G. TOMAS M. HULT ET AL.
Number of studies
Software package version reported Yes No
1 1 1 7
2 2 2 13
8 1 2 20
8 1 2 21
9 2 3 27
6 1 2 18
22 30
42 58
38 58
40 60
60 88
41 59
27
29
30
43
29
Model Respecification Specification search conducted Yes 14 Exploratory nature noted Yes 1 No 13 Changes cross validated Yes 1 No 13 Theory cited as justification for changes All 1 Some 4 None 9 No specification search 38 Reliability and validity Reliability (Note: some articles used multiple methods) Discussed 36 Coefficient a 25 Composite reliability 10 Other 3 Not discussed 16
73
69
31
2 27
3 40
0 29
1 42
3 3 23 67
4 7 32 105
75 50 30 4 21
70
78
22
111 75 40 7 37
71
An Assessment of the Use of Structural Equation Modeling
AMOS CALIS Other Not noted
75
25
393
394
Table 2. (Continued ) 1985–1999 (52 Studies) Number of studies
Percentage
Equivalent models Models tested Tested competing models No test of competing model Mentioned competing models Not applicable Evaluation of model fit Number of measures of fit used One Two
1985–2004 (148 Studies)
Number of studies
Percentage
Number of studies
Percentage
58 39 13 7 17 38
60
81 56 17 11 23 67
55
62 29 19 17 13 34
40 65
35
86 40 20 21 23 62
45 58
42
10 26 14 17
28 72
19 46 20 30
29 71
29 72 34 47
29 73
3 4
6 8
6 3
6 3
9 7
6 5
G. TOMAS M. HULT ET AL.
Convergent validity (Note: some articles used multiple methods) Discussed 23 44 Factor loadings 17 Average variance extracted 4 R2 4 Other 6 Not discussed 29 56 Discriminant validity (Note: some articles used multiple methods) Discussed 24 46 Pairwise tests 11 Average variance Extracted/ 1 shared variance Correlation/covariance 4 Other 10 Not discussed 28 54
2000–2004 (96 Studies)
Common method bias Assessment method Used single respondents Performed single-factor test Used multiple respondents Used secondary data
10 18 6 7 2 0 1 0 1
19 35 12 13 4 0 2 0 2
10 23 26 13 5 4 1 1 4
10 24 27 14 5 4 1 1 4
20 41 32 20 7 4 2 1 5
14 28 22 14 5 3 1 1 3
0 52
0 100
0 96
0 100
0 148
0 100
44 3 4 4
84
84 8 6 6
88
86
6 6
128 11 10 10
17
18
26
18
27 14 3 12
28
45 24 5 14
30
8 8
Measurement equivalence Assessment method (Note: some articles used multiple methods) Not applicable (i.e., single9 17 country study) Mentioned 18 35 Translation equivalence 10 Factor equivalence 2 Metric/measurement 2 Equivalence/invariance Other 8 Not mentioned 25 48
9 52
54
17 77
7 7
An Assessment of the Use of Structural Equation Modeling
Three Four Five Six Seven Eight Nine Ten Zero Statistical power noted Yes No
52
395
396
G. TOMAS M. HULT ET AL.
(1984) provides adjustments to the test statistics and standard errors so that modified significance tests from maximum likelihood (ML) or generalized least squares (GLS) estimates may be asymptotically correct. Estimation techniques that have less stringent requirements of data distribution properties include, but are not limited to, weighted least squares (WLS), general weighted least squares (GWLS), and elliptical reweighted least squares (ERLS) (see Bentler, 1995, for discussions of these techniques). Of the five studies we examined that had non-normal datasets, none of them used data transformation as the corrective solution; four of them chose specific estimation techniques to solve the problem, while one study did not take any corrective action at all. Of the four studies that did choose appropriate estimation techniques to solve the problem, three of them used ERLS and one used GWLS. In the latter article, Buck, Filatotchev, Demina, and Wright (2003) explicitly state that owing to the use of binary variables, the assumption of multivariate normality is violated; as such, they use GWLS to estimate their model. Other articles (e.g., Zou & Cavusgil, 2002) took a more preventive approach by choosing an estimation technique (i.e., ERLS) that provides better estimates when the normality assumption is violated. Authors in general should also report their estimation technique when relying on SEM. As can be seen in Table 2, of the 53 studies that reported the specific technique used, most (42) relied on ML. However, the vast majority of the articles (95) did not report which technique was being used, and only a select few articles reported usage of techniques other than ML. Moreover, it may be that most authors of the ‘not noted’ articles in Table 2 opted for the software default option instead of carefully selecting the most appropriate technique (for most SEM software the default technique is ML). As data distribution has serious consequences for the accuracy of research findings, it is strongly advised that researchers check the data distribution properties before proceeding to parameter estimation, and report their findings. Mardia (1970) provided multivariate tests of skewness and kurtosis that help to detect violations of multivariate normality, and statistical software programs such as PRELIS (a preprocessor for LISREL; Jöreskog & Sörbom, 1996), SPSS, and SAS can provide normality testing. If the scholar(s) is satisfied that the theoretical logic underpinning the data is not violated, then non-normal data may be transformed into multivariate normal data, followed by the application of ML or GLS-based SEM techniques. Alternatively, estimation techniques that do not require multivariate normal data as input should be used when the multivariate normality assumption is violated.
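Mardia's coefficients can also be computed directly when such packages are unavailable. The sketch below is an illustrative Python implementation of the multivariate skewness and kurtosis tests just described; the simulated data and variable names are assumptions used only for demonstration.

```python
# Illustrative implementation of Mardia's (1970) multivariate skewness and kurtosis tests.
import numpy as np
from scipy import stats

def mardia_tests(X):
    """X: (n, p) data matrix of indicator scores. Returns statistics and p-values."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X, rowvar=False, bias=True))   # ML covariance matrix
    D = Xc @ S_inv @ Xc.T               # d_ij = (x_i - mean)' S^{-1} (x_j - mean)
    b1 = (D ** 3).sum() / n ** 2        # multivariate skewness
    b2 = (np.diag(D) ** 2).sum() / n    # multivariate kurtosis
    chi2_stat = n * b1 / 6.0
    df = p * (p + 1) * (p + 2) / 6.0
    z_kurt = (b2 - p * (p + 2)) / np.sqrt(8.0 * p * (p + 2) / n)
    return {"skewness": b1, "skew_p": stats.chi2.sf(chi2_stat, df),
            "kurtosis": b2, "kurt_p": 2 * stats.norm.sf(abs(z_kurt))}

# example: simulated multivariate-normal data should not be flagged as non-normal
rng = np.random.default_rng(1)
print(mardia_tests(rng.normal(size=(500, 4))))
```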
Reporting Clear reporting of researchers’ decisions is necessary to allow for possible replication of studies and to facilitate meta-analyses. At a minimum, scholars need to report what input matrix and software they used (Shook et al., 2004). For example, Cadogan, Diamantopoulos, and Mortanges (1999) reported that they used the covariance matrix as the input matrix and estimated the model using LISREL 8. This information is vital because different input matrices and software may produce slightly different findings owing to computational variation (such as revised computation methods and/or revised default parameters in the software). As shown in Table 2, 104 of the 148 studies (70%) did not mention the input matrix. Thirty-six (24%) reported the use of the covariance matrix as input, seven (5%) reported the use of the correlation matrix, and one study reported using raw data as input. The overall trend over time has actually been negative. Through 1999, 56% of studies did not report the input matrix, while 78% failed to do so in the post-1999 period. On the positive side, given the common practice of using the covariance matrix, an assumption may be that researchers in the recent time period (2000–2004) used the covariance matrix as input without specifying the matrix explicitly in the paper. On the negative side, IB authors must do better with the basic issue of reporting the input matrix and journal gatekeepers such as reviewers and editors should insist that they do so. The software package used was reported in 121 studies (82%). The most frequently reported software package was LISREL in 83 studies (56%), followed by EQS (16%), AMOS (6%), and CALIS (1.4%). The specific version of the software package used was only noted in 60 studies (41%). This level was consistent across the two time periods. Also, journals have varied in their attention to this issue. Of the ten journals, JIBS has the highest percentage (91%) of articles that report the software package used, suggesting that other journals should follow JIBS’s lead regarding this issue. Overall, the results are troublesome. The fact that most studies do not report the choice of input matrix and the software version used restricts the ability of scholars to replicate past studies and build upon existing research for enhanced theory development. Looking forward, IB researchers need to be mindful that clear reporting is an essential step in the knowledge accumulation process. Without such disclosures, other scholars will be restricted in their ability to build upon previous work and to interpret findings in light of the overall IB research dialog.
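As a simple aid to such reporting, the covariance matrix that serves as SEM input can be computed and archived directly from the raw indicator data. The brief Python sketch below (with a hypothetical data file and output name) illustrates one way to do so, making it easy to disclose the matrix or share it for replication.

```python
# Compute and save the indicator covariance matrix used as SEM input (file names are hypothetical).
import pandas as pd

indicators = pd.read_csv("survey_responses.csv")        # raw item-level data
cov_matrix = indicators.cov()                           # sample covariance matrix
cov_matrix.to_csv("sem_input_covariance_matrix.csv")    # archive alongside the manuscript
print(cov_matrix.round(3))
```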
Model Respecification If the originally proposed model is found to offer a poor fit to the data, researchers often add or remove paths to achieve better fit (e.g., Anderson & Gerbing, 1988). Given the relative youth of the IB field, model respecifications may sometimes be valuable, especially for exploratory studies. Typically, model respecifications should not be data driven, but instead should be based on theoretical justifications. For example, when Eriksson et al. (1997, p. 350) modified their model to better fit the data, they explicitly justified the changes with a theoretical discussion. Beyond ensuring that studies remain theoretically focused, this also avoids capitalizing on idiosyncrasies in a dataset (Anderson & Gerbing, 1988). Researchers normally change their models based on some modification index; however, MacCallum et al. (1992) demonstrate that modification indices fluctuate widely owing to sampling errors. MacCallum (1986) states that researchers are very unlikely to arrive at the true model through respecification and Chin (1998) suggests that a respecified model should be cross validated through an additional sample(s) before it is accepted. Our analysis revealed that 43 of the 148 studies included model respecifications (29%). Of these 43 studies, four provided theoretical justifications for all changed paths, seven justified a subset of the paths, and 32 did not provide any theoretical justification. Only three studies noted the exploratory nature of their respecification and only one study cross validated the changes with a new sample. In the latter study, Keillor, Hult, Erffmeyer, and Babakus (1996) first included 70 items in their survey instrument in order to develop a national identity scale (i.e., the NATID scale). These items were tested using a U.S. sample. The initial results suggested that a parsimonious model could be constructed consisting of only 17 items. This respecified model was cross validated in the same article using Japanese and Swedish samples (Keillor et al., 1996) and was also later further cross validated using samples from Hong Kong and Mexico (Keillor & Hult, 1999). The cross-validation procedure, particularly in the same article, assures readers that the respecified model is not merely a reflection of the U.S. sample, but applies to other contexts as well. As a young discipline, IB research is, at times, viewed by some functional specialists as utilizing less rigorous empirical methods than more mature disciplines within business. Given that context, it is interesting that the frequency of model respecifications in IB research is actually lower than that in the strategic management field, where theory is heavily emphasized (29% vs. 47% in Shook et al., 2004). To build on this relative rigor, we suggest that future model respecifications within IB research should be carried out
with caution, as every change to the original model toward a better model fit is prone to the criticism of capitalization on chance (Brannick, 1995). Even though empirical procedures such as the Lagrange Multiplier (LM) test (Bentler, 1995) provide some shortcuts for model respecification, these procedures should be combined with a theoretical base (e.g., Anderson & Gerbing, 1988). Next, results from a respecified model should be cross-validated with a new sample to establish their robustness. Furthermore, it is appropriate for researchers to state the limitation, as applicable, that the probability levels and estimates of the respecified model should be viewed with caution, and to discuss the limitations that the respecified model may exhibit in their research context (Bollen, 1989).

Reliability and Validity

Establishing construct reliability and validity is a prerequisite for achieving results that can be viewed with confidence. Scale reliability refers to the internal consistency of a set of items in measuring a latent construct that is composed of a set of reflective indicators. For a scale to be reliable, we expect its items to be reflective of the same underlying latent construct and to vary together (i.e., to be significantly correlated). Traditionally, Cronbach's coefficient α has been the most commonly used measure of scale reliability (Cronbach, 1951); it is a measure of the squared correlation between the observed scores and the true scores. However, α suffers from a number of limitations. First, the accuracy of the reliability estimate is influenced by the number of items included in the test (Garver & Mentzer, 1999). Second, Cronbach's α mistakenly operates as if all items play equal roles in reliability calculations (Bollen, 1989). Accordingly, Cronbach recently acknowledged that using the α coefficient alone to determine reliability might be a mistake (Cronbach & Shavelson, 2004).

In response to α's limitations, several SEM-based reliability measures have been proposed. SEM differentiates between item reliability measures and construct reliability measures. The R² value associated with each construct-to-item equation is a measure of the reliability of an individual item. For construct reliability measures, composite reliability (at times also called construct reliability) draws on the standardized loadings and measurement errors for each item (Fornell & Larcker, 1981). Similar to Cronbach's α, an acceptable threshold for composite reliability is 0.70, with each item's reliability recommended to be ≥ 0.50, which corresponds to a factor loading of 0.707 for each item (Fornell & Larcker, 1981), although Peter (1979) suggests that a minimum composite reliability of 0.80 is desired within confirmatory factor analysis settings.
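For illustration, the following minimal sketch (Python; the loadings are hypothetical) computes coefficient α from raw item scores and composite reliability and item reliabilities from standardized loadings, following the formulas in Fornell and Larcker (1981):

```python
import numpy as np

def cronbach_alpha(items):
    """Coefficient alpha for an n x k matrix of raw item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

def composite_reliability(loadings):
    """Composite (construct) reliability from standardized loadings
    (Fornell & Larcker, 1981)."""
    lam = np.asarray(loadings, dtype=float)
    theta = 1.0 - lam ** 2                    # error variances, standardized solution
    return lam.sum() ** 2 / (lam.sum() ** 2 + theta.sum())

# Hypothetical standardized loadings for a four-item reflective scale.
lam = np.array([0.82, 0.76, 0.74, 0.71])
print("item reliabilities (R^2):", np.round(lam ** 2, 2))                     # recommended >= 0.50
print("composite reliability:", round(float(composite_reliability(lam)), 2))  # threshold 0.70
```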
Of the 148 articles examined, 111 (75%) explicitly discussed reliability, with several studies reporting multiple measures. Cronbach's coefficient α was used in 75 studies (51%); composite reliability was reported in 40 studies (27%); and 8 studies (5%) used other reliability measures such as R² and item-to-total correlation. Consideration of reliability increased slightly over time (69% of studies reported reliability measures between 1985 and 1999 vs. 78% between 2000 and 2004). Encouragingly, the use of composite reliability has been much more frequent in recent years than in the earlier years.

In terms of validity, within the empirical assessment of the data, SEM researchers are primarily concerned with convergent validity and discriminant validity. Convergent validity is a measure of how well the items in a scale converge, or 'load together,' on a single latent construct. This is normally assessed by examining the overall fit of the measurement model and the magnitude, direction, and statistical significance of the estimated parameters between the latent constructs and their reflective indicators. The average variance extracted for a particular construct is also often used to assess convergent validity (Fornell & Larcker, 1981). Among our studies, convergent validity was reported in 81 (55%), with several studies reporting multiple measures. In 56 studies (38%), authors examined the factor loadings, 17 studies (12%) reported average variances extracted, 11 (7%) reported R², and 23 (16%) used other measures such as multitrait–multimethod analysis, shared variances, correlations, and construct reliability. A positive trend was found over time: 44% of the 1985–1999 studies assessed convergent validity, and 60% of the 2000–2004 studies did so.

Discriminant validity ensures that scales used to measure different constructs are indeed measuring different constructs. Within the CFA setting, discriminant validity is typically assessed using two different methods. The first method involves calculating the shared variance between constructs and verifying that it is lower than the variances extracted for the involved constructs (Fornell & Larcker, 1981). As such, a relatively low correlation between a pair of latent variables is the first indication of the presence of discriminant validity. The absolute maximum should typically be no greater than r = 0.707, which corresponds to a shared variance no greater than Fornell and Larcker's (1981) recommended minimum of 50% average variance extracted for a particular construct. A second method for assessing discriminant validity involves examining all possible pairs of constructs in a series of two-factor CFA models (Bagozzi & Phillips, 1982). Specifically, each pairwise CFA model is run twice – first, constraining the φ coefficient to unity and second, allowing
φ to vary freely. Based on the results of a χ²-difference test (Gerbing & Anderson, 1992), discriminant validity between measures is supported when the 'unconstrained model' performs better than the associated 'constrained model' in which φ = 1 (i.e., Δχ²(1) must exceed 3.84). Both checks are illustrated numerically in the sketch that follows the next subsection. Discriminant validity was discussed in 86 of the 148 studies (58%), again with multiple measures reported in several studies. Forty studies (27%) included pair-wise tests, correlation or covariance matrix examination was reported in 21 (14%) studies, 20 (14%) employed the average variances extracted approach, and 23 (16%) used other measures to check for discriminant validity. Similar to the assessment of convergent validity, the IB literature has made progress over time: examination of discriminant validity was more frequent in recent studies (65%) than in the older ones (46%).

One observation from the reliability and validity analysis is that some researchers assess only reliability without a corresponding test of validity. As in traditional studies using, for example, SPSS or SAS, it is important to check both the reliability and the validity of the measures. The same rule applies in SEM-based studies: having reliable measures does not ensure validity of the measures (Bollen, 1989). While the importance of establishing reliability and validity may seem obvious, it is important to note how they should be examined and tested. Specifically, traditional measures such as Cronbach's α, if used (e.g., for comparison with established research), should be accompanied by an SEM-based measure, because one of the benefits of SEM is that it provides researchers with a more powerful tool for obtaining accurate estimates of reliability and validity (Steenkamp & Van Trijp, 1991).

A good example in this regard is Atuahene-Gima and Li (2002), who first used in-depth interviews to ensure the face validity of their measures. Confirmatory factor analysis was then conducted to establish composite reliability, convergent validity, and discriminant validity. The authors established composite reliability, as outlined above, and assessed convergent validity by examining the average variances extracted. Next, they used the two different methods to ensure discriminant validity. The implication is that scholars drawing on the conclusions of Atuahene-Gima and Li (2002) can do so with relative confidence, while those designing studies can draw on their reliability and validity procedures for guidance.

Equivalent Models

One of SEM's major advantages over traditional statistical techniques is the ability to easily compare rival models, allowing researchers to identify
the model with the best fit, rather than simply selecting a model with an acceptable fit. However, for any given theoretically sound model, there may be other 'competing' models that demonstrate equivalent goodness-of-fit statistics while incorporating different (alternative) relationships between latent variables. Because possible alternative models may be very different from the theoretical model under examination, the conclusions drawn when only one model is considered may be called into question. The possibility of a plausible alternative model is especially important in IB research because of its typical examination of multiple samples (e.g., firms or customer samples from different countries). It is possible that findings from different groups will be best explained by slightly (or perhaps significantly) different model structures, which can only be known through the testing of multiple models. Therefore, authors should either examine the possibility of theory-driven equivalent models or, at a minimum, acknowledge the possibility of unexamined equivalent models as a limitation of their results (MacCallum, Wagner, Uchino, & Fabrigar, 1993). An exemplar in this regard is a study by Steenkamp, Batra, and Alden (2003), in which two theory-based competing models were tested before moving on to testing and discussing the hypothesized model.

SEM was used to examine only a measurement model in 47 (32%) of the 148 articles in the dataset. When SEM is used only for measurement validation, there is no examination of equivalent models (MacCallum et al., 1993). Of the remaining 101 articles, 34 explicitly mentioned the possible existence of equivalent models, and 29 of those studies tested and reported results for possible equivalent models. In the early time period (1985–1999), 10 of the 14 articles that mentioned the possibility of equivalent models also examined such models, while 19 of the 20 corresponding articles in 2000–2004 tested equivalent models. Because SEM only provides information on the goodness of fit for the model being tested (in addition to the null model), and does not provide information about possible alternative models, it is critical for researchers to examine alternative models to ensure that the model being studied is in fact the best-fitting model from a range of theoretically sound models.

Viewed at the field level, IB has done well with the issue of 'competing' models. In the most recent published review of SEM use in a functional discipline, Shook et al. (2004) found that only 2% of SEM-based strategic management studies acknowledged the possible existence of equivalent models. While IB often draws theory from functional disciplines, the treatment of equivalent models may be an area where knowledge transfers in the opposite direction can be valuable.
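Returning to the validity checks described in the Reliability and Validity subsection, the following sketch works through both discriminant validity procedures for a single pair of constructs. The standardized loadings, latent correlation, and χ² values are hypothetical, and numpy and scipy are assumed available:

```python
import numpy as np
from scipy.stats import chi2

def ave(loadings):
    """Average variance extracted from standardized loadings."""
    lam = np.asarray(loadings, dtype=float)
    return float(np.mean(lam ** 2))

# Hypothetical standardized loadings for two constructs and their latent correlation.
lam_a = np.array([0.82, 0.76, 0.71])
lam_b = np.array([0.79, 0.74, 0.70])
phi = 0.55                                   # estimated latent correlation

# Check 1 (Fornell & Larcker): shared variance must be below each construct's AVE.
shared = phi ** 2
print("shared variance =", round(shared, 2),
      "| AVEs =", round(ave(lam_a), 2), round(ave(lam_b), 2),
      "| supported:", shared < min(ave(lam_a), ave(lam_b)))

# Check 2 (Bagozzi & Phillips): chi-square difference between the model with
# phi fixed to 1 and the model with phi free; 1 df, critical value 3.84.
chisq_constrained, chisq_free = 61.3, 44.8   # hypothetical fit statistics
delta = chisq_constrained - chisq_free
print("delta chi-square =", round(delta, 1),
      "| exceeds 3.84:", delta > chi2.ppf(0.95, df=1))
```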
Evaluation of Model Fit

Confidence in the overall fit of a structural model is a prerequisite for confidence in testing at the individual path level. Thus, before evaluating the significance of individual paths, overall model fit should be evaluated (Bollen, 1989). Based on the covariance modeling method, most fit indices measure the discrepancy between the sample covariance matrix and the covariance matrix implied by the hypothesized model. The χ² test is a direct comparison of these two matrices and provides a significance test (with non-significance as an indication of good fit). However, χ² is known for its sensitivity to sample size, and is only recommended with moderate sample sizes, ideally with a sample size (n) in the range 100 ≤ n ≤ 200 (Tabachnick & Fidell, 1996). When the sample size is small (< 100), the assumption of a χ² distribution is not necessarily valid because of the possibility of non-normality of the data; when the sample size is large (> 200), the likelihood of a spuriously significant χ² and rejection of the model is high. Also, χ² is sensitive to variation in degrees of freedom and favors complex models and over-fitting, i.e., a saturated model (Bollen, 1989).

Given the concerns with χ², other fit indices have been developed. These indices do not provide significance tests; rather, they provide a cutoff value to be used as the evaluation criterion. The goodness-of-fit index (GFI) and adjusted goodness-of-fit index (AGFI) evaluate the relative amount of variance and covariance accounted for by the model. GFI and AGFI are also sensitive to sample size because the distribution of the indices is a function of sample size. Instead of directly comparing the covariance matrices, incremental fit indices compare the χ² of the target model with that of a selected baseline model. Such indices include the comparative fit index (CFI; Bentler, 1990), relative non-centrality index (RNI), Tucker–Lewis index (TLI; Tucker & Lewis, 1973), DELTA2 (Bollen, 1989), and others. Among these indices, RNI, CFI, and DELTA2 are the most stable and robust according to Gerbing and Anderson (1992).

The examination of residuals is also an important means of evaluating overall model fit (Bollen, 1989). Fit indices based on residuals include the root mean squared residual (RMR), standardized root mean squared residual (SRMR), and root mean square error of approximation (RMSEA; Steiger & Lind, 1980). The advantages of RMSEA are that it has a known distribution, is not sensitive to sample size, and compensates for the effect of model complexity (Hu & Bentler, 1999). Direct examination of standardized residuals can also help researchers to assess model fit (Bollen,
1989). Considering the advantages and disadvantages of each individual index, multiple fit indices are typically needed to evaluate overall model fit (Breckler, 1990). The fit indices selected should be those best able to assess the fit of the model, rather than simply those that support the model under study.

Fit indices for both measurement models (127) and structural models (101) were included in our dataset (80 articles used SEM to examine both measurement and structural models). Table 3 presents these results. There were 127 studies reporting at least one measurement model, with an average of 3.4 fit measures reported per model(s) and a range between zero and ten fit measures used. Forty-six different fit indices were reported, showing a large variation in the selection and use of fit indices. The most frequently used measure was χ², which appeared in 137 of the 148 studies in our sample (93%), followed by CFI (94, 64%), GFI (78, 53%), RMSEA (59, 40%), and NFI (45, 30%). In total, 93 studies reported multiple measurement-model fit indices, five reported only one fit index, and, alarmingly, 29 reported no fit indices for the measurement model(s). There were 101 studies reporting at least one structural model, with an average of 4.3 fit measures reported per model(s) and a range between zero and nine fit measures used. In total, 88 (87%) of these studies reported multiple indices, ten (10%) reported only one fit index, and three (3%) reported zero fit indices.

In general, reporting of multiple measures is a common practice, although the finding that 29 studies (23%) that examined a measurement model did not report any fit indices indicates that researchers are less diligent about reporting fit indices for the measurement model than for the structural model. Also, fewer fit measures are reported for measurement models (a mean of 3.4 measures used) than for structural models (a mean of 4.3). This might be related to a misunderstanding of the necessity of evaluating measurement model fit vis-à-vis the factor loadings, reliabilities, and variances extracted in a measurement model. Good measurement model fit is a necessary and important requirement prior to establishing convergent and discriminant validity at the construct level in confirmatory factor analysis.

There are no universally accepted guidelines for the selection of multiple fit measures, i.e., no specific fit index is a 'must' for diligent reporting. Highly recommended fit measures (such as CFI and RMSEA) are used more often than others, but some fit indices that have been found to have substantial disadvantages are still widely used. For example, despite their sensitivity to sample size, GFI and AGFI are among the most frequently reported measures. RMR (34 studies, 23%) is used more often than SRMR (10 studies, 7%), though standardized residuals are preferred for the purpose of comparison (Bollen, 1989).
Table 3. Usage of Model Fit Indices.

Fit Index     Measurement Models(a)    Structural Models(b)    Overall(c)
              (n = 127)                (n = 101)               (n = 148)
              Studies    %             Studies    %            Studies    %
χ²                 88    69                 95    94               137    93
CFI                72    57                 63    62                94    64
GFI                47    37                 50    50                78    53
RMSEA              49    39                 36    36                59    40
NFI                27    21                 35    35                45    30
NNFI               34    27                 26    26                43    29
AGFI               20    16                 25    25                36    24
RMR                15    12                 23    23                31    21
Normed χ²          12     9                 12    12                19    13
IFI                12     9                 10    10                16    11
TLI                 9     7                 11    11                14     9
SRMR                7     6                  7     7                12     8
DELTA2              5     4                  5     5                 7     5
RNI                 5     4                  3     3                 5     3
AOSR                3     2                  4     4                 4     3
PNFI                3     2                  1     1                 3     2
RFI                 3     2                  1     1                 3     2
Others             11     9                 18    18                21    14

Note: CFI, comparative fit index; GFI, goodness-of-fit index; RMSEA, root mean square error of approximation; NFI, normed fit index; NNFI, non-normed fit index; AGFI, adjusted goodness-of-fit index; RMR, root mean square residual (in some studies referred to as RMSR); IFI, incremental fit index; TLI, Tucker–Lewis index; SRMR, standardized root mean square residual (in some studies referred to as SRMSR); RNI, relative non-centrality index; AOSR, average off-diagonal standardized residual; PNFI, parsimony normed fit index; RFI, relative fit index.
(a) Percentages calculated using the 127 studies reporting a measurement model.
(b) Percentages calculated using the 101 studies reporting a structural model.
(c) Percentages calculated using the 148 articles reporting a measurement model, a structural model, or both.
Among the three robust incremental fit indices suggested by Gerbing and Anderson (1992), CFI is widely used, while RNI and DELTA2 do not seem to draw the attention of most researchers. We found only four articles that used these three fit measures: one from JIBS (Knight & Cavusgil, 2004), one from SMJ (Hult & Ketchen, 2001), and two from JIM (Hult, Nichols, Giunipero, & Hurley, 2000; Keillor, Hult, & Kandemir, 2004).
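As a point of reference, the recommended incremental indices (and RMSEA) can be computed directly from the target- and baseline-model χ² statistics using their standard formulas; the values in the sketch below are hypothetical:

```python
import math

def fit_indices(chisq_t, df_t, chisq_b, df_b, n):
    """CFI, RNI, DELTA2 (Bollen's incremental index), and RMSEA computed from
    the target model and the baseline (independence) model chi-square values."""
    d_t = chisq_t - df_t                      # target-model noncentrality estimate
    d_b = chisq_b - df_b                      # baseline-model noncentrality estimate
    cfi = 1.0 - max(d_t, 0.0) / max(d_t, d_b, 0.0)
    rni = 1.0 - d_t / d_b                     # like CFI but not bounded at 0 and 1
    delta2 = (chisq_b - chisq_t) / (chisq_b - df_t)
    rmsea = math.sqrt(max(d_t, 0.0) / (df_t * (n - 1)))
    return {"CFI": cfi, "RNI": rni, "DELTA2": delta2, "RMSEA": rmsea}

# Hypothetical values: target model chi-square 165.2 on 84 df; baseline model
# chi-square 1480.6 on 105 df; sample size 250.
print({k: round(v, 3) for k, v in fit_indices(165.2, 84, 1480.6, 105, 250).items()})
```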
Evaluating the overall model fit is a critical step in assessing the rigor of SEM research. Reporting of fit measures is necessary for other researchers in the community to evaluate and compare published studies, which in the end contributes to the development of the field by improving the rigor of research. Specifically, establishing a common standard of reporting, adopting a combination of fit measures, and selecting robust fit measures rather than convenient ones would enhance the application of SEM in IB research. To this end, we suggest that researchers report the indices recommended by Gerbing and Anderson (1992) (i.e., CFI, RNI, and DELTA2) and any additional indices that are well suited to the nature of a particular study.

Beyond fit indices, an essential element of assessing model fit is achieving an appropriate sample size to provide an adequate level of power for hypothesis testing. As noted above, the vast majority of studies use a χ² test, wherein a lack of significant differences between the covariances implied by the model and those of the sample data is suggestive of support for the model. This contrasts with traditional techniques, such as regression, where a lack of differences suggests a lack of model support. Within SEM, a lack of statistical difference could result from an inability to detect model misspecification owing to a lack of power stemming from a small sample size (MacCallum, Browne, & Sugawara, 1996). Thus, while in traditional regression analysis the risk of a Type II error (falsely rejecting the research hypothesis when a true effect exists) is a function of power, in SEM it is the risk of a Type I error (falsely failing to reject, and thus accepting, a misspecified model) that is a function of power (Bollen, 1989; Saris & Satorra, 1993). Because a Type I error is generally considered to be the more serious mistake, the need for adequate sample size when using SEM is perhaps more critical than when using other statistical techniques. This issue is magnified in IB research, which often involves multiple samples (e.g., countries), each of which must have an adequate size to allow for meaningful, reliable, and valid between-sample comparisons.

Several authors have provided procedures for computing the power and sample sizes necessary in SEM (MacCallum et al., 1996; Saris & Satorra, 1993; Satorra & Saris, 1983; Satorra, Saris, & de Pijper, 1991). Unfortunately, none of the 148 articles we examined described whether power was adequate or not. In response, we estimated the adequacy of the studies' power based upon the guidelines of MacCallum et al. (1996), who provide a list of the minimum sample sizes needed to achieve a given power level (i.e., 0.80) at various degrees of freedom. Of the 148 studies, 52 appear to have achieved an acceptable power level (0.80 and above) while 70 studies, based on our estimation, did not achieve adequate power. We could not estimate power for 26 studies because they lacked
critical information on sample size and/or degrees of freedom. As an indication of adequate power in a particular study, we placed an asterisk next to those studies in Table 1 that had adequate power. The results indicate that studies often obtain significant findings despite apparently insufficient sample sizes. It seems likely that at least some of these studies committed Type I errors (i.e., accepting findings that were not true as true). Significance tests and confidence intervals have received much attention from researchers using traditional methods because they serve as safeguards against Type I errors. Within SEM, it is the lack of adequate power that creates Type I errors and thereby potentially devastates the process of knowledge development. Thus, our finding that studies lacking adequate power outnumber those possessing it is quite disconcerting. Researchers who wish to ensure adequate power can use the tabular guidelines or the recommended program listings from MacCallum et al. (1996). Another alternative, although more computationally complex, is to use the calculations outlined by Satorra and Saris (1985).
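For readers who wish to check power directly rather than rely on tabled values, the following sketch approximates the MacCallum et al. (1996) test of close fit using the noncentral χ² distribution; scipy is assumed available, and the degrees of freedom and sample sizes shown are hypothetical:

```python
from scipy.stats import ncx2

def sem_power(n, df, rmsea0=0.05, rmsea_a=0.08, alpha=0.05):
    """Approximate power of the test of close fit (MacCallum et al., 1996):
    H0: RMSEA <= rmsea0 against the alternative RMSEA = rmsea_a."""
    nc0 = (n - 1) * df * rmsea0 ** 2          # noncentrality under the null
    nc_a = (n - 1) * df * rmsea_a ** 2        # noncentrality under the alternative
    crit = ncx2.ppf(1 - alpha, df, nc0)       # critical chi-square value
    return float(ncx2.sf(crit, df, nc_a))     # probability of rejecting H0

# Hypothetical example: a model with 60 degrees of freedom at several sample sizes.
for n in (100, 200, 400):
    print(n, round(sem_power(n, df=60), 2))
```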
Measurement Equivalence

An important, long-standing goal of IB research is discovering whether, and how, theories, behavior, and phenomena examined in one country, region, or sample group can be applied to other countries, regions, or sample groups. Because IB research often crosses multiple boundaries such as language, culture, politics, and economics, there is an increased need for measurement equivalence that allows for valid and accurate comparison of multiple groups. Indeed, a significant prerequisite to generalizing IB theories is having measurement instruments that appropriately capture data across boundaries without introducing 'international' bias. If there is a lack of evidence of measurement invariance, any differences found cannot be assumed to be actual differences in the phenomena under study; they could simply be a function of measurement bias (Steenkamp & Baumgartner, 1998). The key strengths of SEM in this context are that it can more easily assess differences between groups, more thoroughly examine construct validity, better account for error rates, support stronger claims of causality within tested models, and assess measurement equivalence (Myers, Calantone, Page, & Taylor, 2000).

Specific issues of measurement equivalence relate to the development of data-gathering instruments, instrument scaling, transferability of concepts and ideas across cultures and languages, response equivalence, sampling design, data collection timing, and others (Mintu, Calantone, & Gassenheimer, 1994; Myers et al., 2000; Sekaran, 1983; Singh, 1995; Steenkamp & Baumgartner, 1998). These issues require the researcher to try to eliminate possible sources of invariance ex ante. Researchers can also test their data ex post facto to determine if invariance exists. One of the most powerful and versatile methods for testing cross-boundary measurement invariance is multi-group confirmatory factor analysis (Jöreskog, 1971). The elimination (or at least reduction) of measurement inequivalence in IB research that uses SEM can provide meaningful insights into the magnitude of differences within multi-group comparisons. An exemplar of the examination of measurement equivalence can be found in Agarwal, Malhotra, and Wu (2002), where a series of nested models were tested across groups.

Although these issues should be addressed when conducting research on sample units from different groups, many studies either neglect to examine them or fail to report how they were handled in the research design, data collection, and data analysis phases. Of the 148 articles we examined, 66 (45%) drew parallel samples from two or more countries. Only 45 of the 66 (68%) mentioned the importance of measurement equivalence. Several of the authors who mentioned measurement equivalence did not provide clear information on how the issue was addressed, controlled, or tested within their research. The most common ex ante issue addressed by authors related to language and instrument translation (i.e., 24 of the 45 articles – 53%), often with a discussion of solutions such as having a survey translated from one language into another by a native speaker or the use of the commonly accepted practice of translation–back translation (Craig & Douglas, 2000). An ex post facto examination of factor equivalence (such as 'factor loadings' or 'factorial similarity') was mentioned in five (11%) of the articles. In 14 (31%) of the articles, the author(s) mentioned 'measurement equivalence,' 'measurement invariance,' 'metric equivalence,' or 'metric invariance,' but failed to mention specifically how it was achieved (i.e., how they controlled or tested for it).

Given that the ability to generalize theory and to understand differences across boundaries depends on it, it is worrisome that a greater level of reporting on measurement equivalence within relevant SEM studies has not been achieved. Although there are several published works to help guide the IB researcher (Mullen, 1995; Myers et al., 2000; Sekaran, 1983; Singh, 1995; Steenkamp & Baumgartner, 1998), a lack of understanding of measurement equivalence is too prevalent in IB research. Also, a universally accepted standard on how to manage measurement equivalence (and perhaps the broader concept of data equivalence) and how to report findings on this form of equivalence appears to be lacking in the IB literature.
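The logic of the nested-model comparisons used in multi-group confirmatory factor analysis can be sketched as follows. The model names and fit statistics are hypothetical, and only the χ²-difference arithmetic is shown, not the estimation itself:

```python
from scipy.stats import chi2

# Hypothetical fit statistics for a sequence of nested multi-group CFA models
# (configural -> metric -> scalar), each adding equality constraints across groups.
models = [
    ("configural (no equality constraints)", 210.4, 118),
    ("metric (equal loadings)",              225.9, 130),
    ("scalar (equal intercepts as well)",    257.8, 142),
]

# Each added level of invariance is retained only if the increase in chi-square
# is non-significant relative to the added degrees of freedom.
for (name0, c0, d0), (name1, c1, d1) in zip(models, models[1:]):
    d_chi, d_df = c1 - c0, d1 - d0
    p = chi2.sf(d_chi, d_df)
    print(f"{name1}: delta chi-square = {d_chi:.1f} on {d_df} df, p = {p:.3f}")
```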
Common Method Bias

Common method bias arises from common method variance, which is the variance attributable to the measurement method used rather than to the constructs themselves (Podsakoff, MacKenzie, Lee, & Podsakoff, 2003). For example, if a survey respondent is predisposed to provide strongly positive answers to questions, he or she is likely to inflate the relationships among variables measured through the survey. Thus, when the relationship between two variables is examined, common method variance poses a confounding influence and might lead to misleading conclusions (Campbell & Fiske, 1959), such as that the link between the constructs is stronger than it actually is.

Procedural remedies for controlling common method bias aim to minimize potential bias during the data collection process. For example, some scholars use multiple respondents, wherein one informant provides the measure of the independent variables and another provides the measure of the dependent variables. Others opt to use survey data for the antecedents and objective data (from secondary sources) for the outcomes. If these options are not possible, the potential for common method bias within a single-informant design can be assessed through statistical tests, of which the most commonly used is Harman's single-factor test (McFarlin & Sweeney, 1992). If common method bias poses a serious problem within a dataset, a single latent factor will account for a large portion of the variance across the observations (Podsakoff & Organ, 1986). Therefore, a substantially worse fit for the one-factor model suggests that common method bias does not pose a serious threat (Sanchez, Korbin, & Viscarra, 1995).

In SEM, the influence of common method bias on hypothesis testing can also be assessed by adding a same-source factor to the indicators of all model constructs (MacKenzie, Podsakoff, & Fetter, 1993; Netemeyer et al., 1997; Williams & Anderson, 1994) and then comparing two models: a model where the same-source factor loadings are constrained to zero (the constrained model) and a model where the same-source factor loadings are estimated freely (the unconstrained model). If the unconstrained model provides a significantly better fit than the constrained model, this suggests that a same-source factor exists (i.e., common method bias poses a problem in the analysis). The significance of paths in the structural model can also be compared to examine the influence of common method bias: if a path is significant in the constrained model but not in the unconstrained model, the path is not significant once the effect of common method bias is considered.
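As a simple illustration, Harman's test is often operationalized by checking how much of the pooled items' variance the first unrotated factor (or principal component) captures. The sketch below uses a hypothetical, independently generated item matrix:

```python
import numpy as np

def first_factor_share(X):
    """Share of total variance captured by the first unrotated component of
    all survey items pooled together."""
    R = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    eigvals = np.linalg.eigvalsh(R)[::-1]     # eigenvalues in descending order
    return float(eigvals[0] / eigvals.sum())

# Hypothetical item data: 200 respondents by 12 items drawn from several scales.
# (Items are generated independently here, so no method factor is present.)
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 12))
print(f"first factor explains {first_factor_share(X):.0%} of total variance")
# A single dominant factor accounting for the majority of the variance would
# signal a potential common method bias problem.
```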
Among the 148 studies in our data set, 10 used secondary data, which alleviates common method bias concerns (at least between the survey-based variables and the objective data from secondary sources). Another 10 used multiple respondents, which also reduces the threat of common method bias. For example, Schroeder, Bates, and Junttila (2002) surveyed different respondents, representing different functional areas, in each plant. Among the remaining 128 studies, 11 (9%) applied Harman's single-factor test, while the majority did not mention or conduct any test of common method bias (119 studies, 91%). No study used the 'same-source' factor approach, a relatively new technique in business research (see Netemeyer et al., 1997 for an explanation of the same-source technique).

Given the wide use of self-reported survey data in IB research, common method bias is a major threat to the validity of studies. There has been an increase in the use of a single-factor test over the past 20 years – only three studies used the test between 1985 and 1999, while eight did so between 2000 and 2004. However, our data show that this issue receives less attention than it should. Ideally, all survey-based cross-sectional studies would incorporate common method bias testing at the measurement level (Harman's one-factor test) as well as at the hypothesis testing level (the 'same-source' test). Consideration of common method bias in study designs, improvement of data collection processes, and application of statistical tests to examine the influence of common method bias are needed to improve the rigor of IB research. The use of secondary data sources to supplement primary research can also help reduce the threat of common method bias. The same-source factor approach, coupled with the one-factor test, shows the advantage of SEM in not only identifying common method bias but, more importantly, in assessing its influence in the process of hypothesis testing.
DISCUSSION

With its growing popularity among IB researchers, SEM is contributing substantially to the field. Yet, our analysis of the current state of SEM application shows that the knowledge-generation capabilities of SEM have not been fully exploited within IB research. SEM is a powerful technique, especially for multi-group studies, but its benefits may be inhibited by improper application of the technique.
Table 4. Checklist of Issues to Report in SEM-Based IB Studies.

1. Data characteristics
   (1) A general description of the data
   (2) Discussion of data normality and techniques applied if non-normality exists
   (3) Sample size
   (4) Statistical power
   (5) Potential for common method bias and remedies to control the bias

2. Reproducibility issues
   (1) Input matrix
   (2) Name and version of software used

3. Validity and reliability
   (1) Measures of reliability
   (2) Measures of convergent validity
   (3) Measures of discriminant validity

4. Evaluation of model fit
   (1) Multiple fit indices need to be reported
   (2) Preferred fit indices to use: χ², CFI, RMSEA, RNI, and DELTA2

5. Equivalent models
   (1) The potential existence of equivalent ('competing') models should be acknowledged and preferably tested

6. Model respecification
   (1) Theoretical reasons need to be given for model respecification
   (2) Respecification needs to be cross-validated

7. Measurement equivalence (in cross-country, cross-cultural studies)
   (1) Describe procedure(s) taken to ensure measurement equivalence
   (2) Test for equivalence before pooling the samples or comparing structural models across groups
From the perspective of knowledge accumulation in the field, information about the statistical procedures used to carry out research needs to be made available to other researchers. Being able to evaluate, compare, or reproduce published studies will help to improve the rigor of research and the development of the field. To fully realize SEM's potential, a common standard of reporting in SEM is needed, and further improvements are necessary in the reporting of the many issues related to the use of SEM techniques in IB research. Based on our analysis, we recommend that several essential items be reported within SEM studies, as shown in Table 4. We also recommend that journal reviewers and editors hold submitted papers to these standards so that the published body of knowledge begins to offer full disclosure in the interest of achieving the greatest research value from each SEM-based study.
Looking to the future, it is clear that significant opportunities remain for SEM to contribute to IB research. With its capacity for modeling latent variables in a structural model, SEM is uniquely suited for IB research focused on unobservable constructs such as culture, technology transfer, and intangible capabilities (cf. Hult & Ketchen, 2001). Using SEM in conjunction with meta-analysis is another potentially powerful path to developing knowledge. Meta-analysis combines findings from multiple studies to identify the nature and strength of relationships between variables (Hunter & Schmidt, 1990). These relationships can be used as the basis for structural models that capture the state of knowledge about a given topic. However, these ambitions can only be realized if there is a concurrent trend toward improved practice within the IB research community. Such improvements would help IB continue to establish itself as a rigorous and distinct discipline and would improve the validity of IB studies.
REFERENCES

Agarwal, J., Malhotra, N. K., & Wu, T. (2002). Does NAFTA influence Mexico's product image? A theoretical framework and an empirical investigation in two countries. Management International Review, 42(4), 441–471.
Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological Bulletin, 103(3), 411.
Atuahene-Gima, K., & Li, H. (2002). When does trust matter? Antecedents and contingent effects of supervisee trust on performance in selling new products in China and the United States. Journal of Marketing, 66(3), 61–81.
Bagozzi, R. P., & Phillips, L. W. (1982). Representing and testing organizational theories: A holistic construal. Administrative Science Quarterly, 27(September), 459–489.
Bentler, P. M. (1990). Comparative fit indexes in structural equation modeling. Psychological Bulletin, 107(2), 238–246.
Bentler, P. M. (1995). EQS – Structural equations program manual. Encino, CA: Multivariate Software, Inc.
Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.
Brannick, M. T. (1995). Critical comments on applying covariance structure modeling. Journal of Organizational Behavior, 16(3), 201–213.
Breckler, S. J. (1990). Applications of covariance structure modeling in psychology: Cause for concern? Psychological Bulletin, 107(2), 260–273.
Browne, M. W. (1984). Asymptotic distribution free methods in analysis of covariance structures. British Journal of Mathematical and Statistical Psychology, 37, 62–83.
Buck, T., Filatotchev, I., Demina, N., & Wright, M. (2003). Insider ownership, human resource strategies and performance in a transition economy. Journal of International Business Studies, 34(6), 530–549.
Cadogan, J. W., Diamantopoulos, A., & Mortanges, C. P. (1999). A measure of export market orientation: Scale development and cross-cultural validation. Journal of International Business Studies, 30(4), 689.
Campbell, D. T., & Fiske, D. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105.
Chin, W. (1998). Issues and opinions on structural equation modeling. MIS Quarterly, 22(1), 7–16.
Craig, C. S., & Douglas, S. P. (2000). International Marketing Research (2nd ed.). Chichester, England: Wiley.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334.
Cronbach, L. J., & Shavelson, R. J. (2004). My current thoughts on coefficient alpha and successor procedures. Educational and Psychological Measurement, 64(3), 391–418.
Eriksson, K., Johanson, J., Majkgård, A., & Sharma, D. D. (1997). Experiential knowledge and costs in the internationalization process. Journal of International Business Studies, 28(2), 337–360.
Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research, 18(1), 39–50.
Garver, M. S., & Mentzer, J. T. (1999). Logistics research methods: Employing structural equation modeling to test for construct validity. Journal of Business Logistics, 20(1), 33–58.
Gerbing, D. A., & Anderson, J. C. (1992). Monte Carlo evaluations of goodness of fit indices for structural equation models. Sociological Methods and Research, 20(2), 132–160.
Graham, J. L. (1985). Cross-cultural marketing negotiations: A laboratory experiment. Marketing Science, 4(2), 130–146.
Hair, J. F., Anderson, R. E., Tatham, R. L., & Black, W. C. (2005). Multivariate data analysis. Upper Saddle River, NJ: Prentice-Hall.
Hitt, M., Boyd, B. K., & Li, D. (2004). The state of strategic management research and a vision of the future. In: D. J. Ketchen, Jr. & D. D. Bergh (Eds), Research methodology in strategy and management (Vol. 1, pp. 1–32).
Hu, Li-T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55.
Hult, G. T. M., & Ketchen, D. J., Jr. (2001). Does market orientation matter? A test of the relationship between positional advantage and performance. Strategic Management Journal, 22(9), 899–906.
Hult, G. T. M., Nichols, E. L., Jr., Giunipero, L. C., & Hurley, R. F. (2000). Global organizational learning in the supply chain: A low versus high learning study. Journal of International Marketing, 8(3), 61–83.
Hunter, J. E., & Schmidt, F. (1990). Methods of meta-analysis: Correcting error and bias in research findings. Newbury Park, CA: Sage Publications.
Jöreskog, K. G. (1971). Simultaneous factor analysis in several populations. Psychometrika, 36(December), 409–426.
Jo¨reskog, K. G., & So¨rbom, D. (1996). PRELIS 2 – User’s reference guide. Chicago, IL: Scientific Software International. Jo¨reskog, K. G., So¨rbom, D., Du Toit, S., & Du Toit, M. (1999). LISREL 8: Users’ reference guide. Chicago, IL: Scientific Software International. Keillor, B. D., & Hult, G. T. M. (1999). A five country study of national identity: Implications for international marketing research and practice. International Marketing Review, 16(1), 65–82. Keillor, B. D., Hult, G. T. M., Erffmeyer, R. C., & Babakus, E. (1996). NATID: The development and application of a national identity measure for use in international marketing. Journal of International Marketing, 4(2), 57–73. Keillor, B. D., Hult, G. T. M., & Kandemir, D. (2004). A study of the service encounter in eight diverse countries. Journal of International Marketing, 12(1), 9–35. Knight, G. A., & Cavusgil, S. T. (2004). Innovation, organizational capabilities, and the bornglobal firm. Journal of International Business Studies, 35(4), 124–141. MacCallum, R. (1986). Specification searches in covariance structure modeling. Psychological Bulletin, 100(1), 107–120. MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power analysis and determination of sample size for covariance structure modeling. Psychological Methods, 1(2), 130–149. MacCallum, R. C., Roznowski, M., & Necowitz, L. B. (1992). Model modifications in covariance structure analysis: The problem of capitalization on chance. Psychological Bulletin, 111(3), 490–504. MacCallum, R. C., Wagner, D. T., Uchino, B. N., & Farbigar, F. R. (1993). The problem of equivalent models in applications of covariance structure analysis. Psychological Bulletin, 114(1), 185–199. MacKenzie, S. B., Podsakoff, P. M., & Fetter, R. (1993). The impact of organizational citizenship behavior on evaluations of salesperson performance. Journal of Marketing, 57(1), 70. Mardia, K. V. (1970). Measures of multivariate skewness and kurtosis with applications. Biometrika, 57, 519–530. McFarlin, D. B., & Sweeney, P. D. (1992). Distributive and procedural justice as predictors of satisfaction with personal and organizational outcomes. Academy of Management Journal, 35(3), 626–637. Mintu, A. T., Calantone, R. J., & Gassenheimer, J. B. (1994). Towards improving cross-cultural research: Extending Churchill’s research paradigm. Journal of International Consumer Marketing, 7(2), 5–23. Mullen, M. R. (1995). Diagnosing measurement equivalence in cross-national research. Journal of International Business Studies, 26(3), 573–596. Myers, M. B., Calantone, R. J., Page, T. J., Jr., & Taylor, C. R. (2000). An application of multiple-group causal models in assessing cross-national measurement equivalence. Journal of International Research, 8(4), 108–201. Netemeyer, R. G., Boles, J. S., McKee, D. O., & McMurrian, R. (1997). An investigation into the antecedents of organizational citizenship behaviors in a personal selling context. Journal of Marketing, 61(3), 85–98. Peter, J. P. (1979). Reliability: A review of psychometric basics and recent marketing practices. Journal of Marketing Research, 16(1), 6–17. Podasakoff, P. M., MacKenzie, S. B., Lee, J.-Y., & Podsakoff, N. P. (2003). Common method biases in behavioral research: A critical review of the literature and recommended remedies. Journal of Applied Psychology, 88(5), 879–903.
Podsakoff, P. M., & Organ, D. W. (1986). Self-reports in organizational research: Problems and prospects. Journal of Management, 12(4), 531. Sanchez, J. I., Korbin, W. P., & Viscarra, D. M. (1995). Corporate support in the aftermath of a natural disaster: Effect on employee strains. Academy of Management Journal, 38(2), 504–521. Saris, W. E., & Satorra, A. (1993). Power evaluations in structural equation models. In: K. A. Bollen (Ed.), Testing structural equation models (pp. 181–204). Newberry Park, CA: Sage Publications. Satorra, A., & Saris, W. E. (1983). The accuracy of a procedure for calculation of the power of the likelihood ratio test as used within the LISREL framework. In: C. O. Middendorp (Ed.), Sociometric research 1982 (pp. 129–190). Amsterdam, The Netherlands: Sociometric Research Foundation. Satorra, A., & Saris, W. E. (1985). Power of the maximum likelihood ratio test in covariance structure analysis. Psychometrika, 50(1), 83–90. Satorra, A., Saris, W. E., & De Pijper, W. M. (1991). A comparison of several approximations to the power function of the likelihood ratio test in covariance structure analysis. Statistica Neerlandica, 45, 173–185. Schroeder, R. G., Bates, K. A., & Junttila, M. A. (2002). A resource-based view of manufacturing strategy and the relationship to manufacturing performance. Strategic Management Journal, 23(2), 105–125. Sekaran, U. (1983). Methodological and theoretical issues and advancements in cross-cultural research. Journal of International Business Studies, 14(2), 61–73. Shook, C. L., Ketchen, D. J., Hult, G. T. M., & Kacmar, K. M. (2004). An assessment of the use of structural equation modeling in strategic management research. Strategic Management Journal, 25(4), 397–404. Singh, J. (1995). Measurement issues in cross-national research. Journal of International Business Studies, 26(Third Quarter), 597–620. Steenkamp, J.-B. E. M., Batra, R., & Alden, D. L. (2003). How perceived brand globalness creates brand value. Journal of International Business Studies, 34(1), 53–65. Steenkamp, J.-B. E. M., & Baumgartner, H. (1998). Assessing measurement invariance in crossnational consumer research. Journal of Consumer Research, 25(June), 78–90. Steenkamp, J.-B. E. M., & Van Trijp, H. C. M. (1991). The use of LISREL in validating marketing constructs. International Journal of Research in Marketing, 8(4), 283–299. Steiger, J.H., & Lind, J.C. (1980). Statistically based tests for the number of common factors. Paper presented at the annual meeting of the Psychometric Society, Iowa City, IA. Tabachnick, B. G., & Fidell, L. S. (1996). Using multivariate statistics. New York: Harper Collins. Tucker, L. R., & Lewis, C. (1973). A reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38, 1–10. Werner, S., & Brouthers, L. E. (2002). How international is management? Journal of International Business Studies, 33(3), 583–591. Williams, L. J., & Anderson, S. E. (1994). An alternative approach to method effects by using latent-variable models: Applications in organizational behavior research. Journal of Applied Psychology, 79(3), 323–331. Wright, R. W. (1970). Trends in international business research. Journal of International Business Studies, 1(01), 109–124. Zou, S., & Cavusgil, S. T. (2002). The GMS: A broad conceptualization of global marketing strategy and its effect on firm performance. Journal of Marketing, 66(4), 40–56.