APPLICATIONS OF THE EXPANSION METHOD
The social sciences are currently engaged in a critical selfscrutiny regarding th...
162 downloads
1292 Views
5MB Size
Report
This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
Report copyright / DMCA form
APPLICATIONS OF THE EXPANSION METHOD
The social sciences are currently engaged in a critical selfscrutiny regarding the universality of their theories and models. While there is increasing recognition that the social counterparts of invariant natural laws such as the law of gravity will not be discovered, many social scientists nevertheless construct and estimate models under implicit assumptions of invariance and universality. The central tenets of the expansion method paradigm are that models are likely to hold differently across environments and that the model-context nexus should be theorized and investigated. The expansion method provides a means for introducing the complexities of real world contexts into the decontextualized models, conceptual frameworks, and theories of the social sciences. As a research paradigm, the expansion method provides a systematic methodology appropriate for the investigation of contextual variability in virtually any empirical research setting. This is the first book to bring together researchers with interest in the expansion method. The authors examine the theoretical implications of the paradigm, contribute methodological advances, and offer variety of applications in substantive areas, including population, urban systems, social policy analysis, economic development, and remote sensing. The book will be of interest to those whose substantive research interests involve modelling, whether in geography or in any other social science. John Paul Jones III is Associate Professor of Geography at the University of Kentucky. He is the author of a number of articles and co-edited (with Janet E.Kodras) Geographic Dimensions of US Social Policy (Edward Arnold, 1990). Emilio Casetti is Professor of Geography at Ohio State University. He is editor of Geographical Analysis, and is the author of more than 100 articles.
APPLICATIONS OF THE EXPANSION METHOD Edited by John Paul Jones, III and Emilio Casetti
London and New York
First published 1992 by Routledge 11 New Fetter Lane, London EC4P 4EE Simultaneously published in the USA and Canada by Routledge a division of Routledge, Chapman and Hall, Inc. 29 West 35th Street, New York, NY 10001 Routledge is an imprint of the Taylor & Francis Group This edition published in the Taylor & Francis e-Library, 2005. “To purchase your own copy of this or any of Taylor & Francis or Routledge’s collection of thousands of eBooks please go to www.eBookstore.tandf.co.uk.” © 1992 J.P.Jones III and E.Casetti All rights reserved. No part of this book may be reprinted or reproduced or utilized in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers. British Library Cataloguing in Publication Data Applications of the Expansion Method 1. Human geography I. Jones, John Paul 1955– II. Casetti, Emilio 1928– 304.20724 ISBN 0-203-40538-2 Master e-book ISBN
ISBN 0-203-71362-1 (Adobe eReader Format) ISBN 0-415-03494-9 (Print Edition) Library of Congress Cataloging in Publication Data Applications of the Expansion Method/Edited by John Paul Jones III and Emilio Casetti p. cm. Includes bibliographical references and index. ISBN 0-415-03494-9 1. Social sciences—Mathematical models 2. Population geography— Mathematical models 3. Economic geography—Mathematical models I. Jones, John Paul, 1955– II. Casetti, Emilio, 1928– III. Title: Expansion method H61.25.A63 1992 300′.01′5118–dc20
CONTENTS
List of figures
vi
List of tables
ix
Contributors
xi
Acknowledgments
xiii
1
AN INTRODUCTION TO THE EXPANSION METHOD AND TO ITS APPLICATIONS Emilio Casetti and John Paul Jones, III
1
2
THE DUAL EXPANSION METHOD: AN APPLICATION FOR EVALUATING THE EFFECTS OF POPULATION GROWTH ON DEVELOPMENT Emilio Casetti
8
3
PARADIGMATIC DIMENSIONS OF THE EXPANSION METHOD John Paul Jones, III
32
4
A CONTEXTUAL EXPANSION OF THE WELFARE MODEL Janet E.Kodras
47
5
A COMPARISON OF DRIFT ANALYSES AND THE EXPANSION METHOD: THE EVALUATION OF FEDERAL POLICIES ON THE SUPPLY OF PHYSICIANS Stuart A.Foster, Wilpen L.Gorr and Francis C. Wimberly
71
6
PERSONAL CHARACTERISTICS IN MODELS OF MIGRATION DECISIONS: AN ANALYSIS OF DESTINATION CHOICE IN ECUADOR Mark EllisJohn Odland
88
7
ALTERNATIVE APPROACHES TO THE STUDY OF METROPOLITAN DECENTRALIZATION Shaul Krakover
101
8
LONG-WAVE SPATIAL AND ECONOMIC RELATIONSHIPS IN URBAN DEVELOPMENT
125
v
Shaul Krakover and Richard L.Morrill 9
AN INVESTIGATION INTO THE DYNAMICS OF DEVELOPMENT INEQUALITIES VIA EXPANDED RANKSIZE FUNCTIONS C.Cindy Fan
144
10
IDENTIFYING HIERARCHICAL DEVELOPMENT TRENDS IN THE HUNGARIAN URBAN SYSTEM USING THE EXPANSION METHOD Darrick R.Danta
166
11
AN EXPLORATION OF THE RELATIONSHIP BETWEEN SECTORAL LABOR SHARES AND ECONOMIC DEVELOPMENT Kavita Pandit
179
12
PRODUCTION FUNCTION ESTIMATION AND THE SPATIAL STRUCTURE OF AGRICULTURE Sent Visser
197
13
INCORPORATING THE EXPANSION METHOD INTO REMOTE SENSING-BASED WATER QUALITY ANALYSES Martin Miles, Douglas A.Stow and John Paul Jones, III
219
14
INNOVATION DIFFUSION THEORY AND THE EXPANSION METHOD Michael Sonis
234
15
SPATIAL DEPENDENCE AND SPATIAL HETEROGENEITY: MODEL SPECIFICATION ISSUES IN THE SPATIAL EXPANSION PARADIGM Luc Anselin
264
16
GENERATING VARYING PARAMETER MODELS USING CUBIC SPLINE FUNCTIONS Robert Q.Hanham
280
Index
287
FIGURES
2.1 Effect of population growth on the rate of development 23 2.2 Phase diagram of the relationship between rates and levels of economic 23 development (corresponding to population growth rates of 1, 2, and 3 percent per year) 4.1 Interstate variations in the work-disincentive effect, 1979 59 5.1 Parameter paths for LPRIMR in the GPRIM model 82 5.2 Parameter paths for GPOP in the GPRIM model 83 5.3 Parameter paths for LSPECR in the GSPEC model 83 5.4 Parameter paths for GPOP in the GSPEC model 83 7.1 Urban settlements in the urban region of Tel Aviv 108 7.2 Population development in the Tel Aviv urban region, 1961–80 109 7.3 Shares of population, central city versus suburbs, Tel Aviv urban region,110 1961–80 7.4 Index of population growth, central city versus suburbs, Tel Aviv urban 110 region, 1961–80 7.5 Growth profiles obtained via two different distance bands delineations, 113 Tel Aviv urban region, 1961–80 7.6 Distance-temporal structure of population growth in the Tel Aviv urban 114 region, 1961–80 7.7 Spatio-temporal structure of population growth in the urban region of 118 Tel Aviv, 1970 and 1980 7.8 Distribution of population growth in the urban region of Tel Aviv, raw 118 data, 1961–80 7.9 Southward cross-section of the spatio-temporal population growth 119 structure, urban region of Tel Aviv, 1961–80 7.10 Analysis of southward cross-section from Tel Aviv to Qiryat Eqron, 120 1965–80 8.1 Selected study areas 131 8.2 Estimated spatio-temporal growth structure for the urban region of 135 Philadelphia 8.3 Estimated spatio-temporal growth structure for the urban region of 137 Chicago 8.4 Estimated spatio-temporal growth structure for the urban region of 139 Atlanta 9.1 Zipf’s ideal rank-size distribution 146
vii
9.2 (a) Perfect equality rank-size distribution; (b) perfect inequality rank- 148 size distribution 9.3 Nonlinear rank-size curves 151 9.4 (a) Scatter diagram of ln y and ln r, 1913; (b) scatter diagram of ln y and 156 ln r, 1929; (c) scatter diagram of ln y and ln r, 1950; (d) scatter diagram of ln y and ln r, 1960; (e) scatter diagram of ln y and ln r, 1970; (f) scatter diagram of ln y and ln r, 1980 10.1 The urban turnaround model 167 10.2 Polarized growth 168 10.3 Hungarian rank-size distributions 169 10.4 Estimates of q by rank 174 10.5 Timing of switch of q from increasing to decreasing 174 11.1 Agricultural and manufacturing labor allocation during development 186 11.2 Effect of population size on sectoral labor allocation relations 192 11.3 Effect of resource flows on sectoral labor allocation relations 192 11.4 Temporal variation in sectoral labor allocation relations 193 11.5 Sectoral labor allocation relations for (a) Latin America, (b) Africa, (c) 193 Asia, and (d) more developed countries 12.1 Observations of the relationship between inputs and outputs relative to 198 the actual production function defined for optimal combinations of inputs 12.2 Observations of the relationship between inputs and outputs for farmers 199 responding to variation in real factor and output prices 12.3 Hypothesized shape of agricultural production functions with increasing 204 marginal returns to intensity at low levels of intensity 12.4 Production function shapes and the location allocation of output types 205 12.5 Empirical observations of yield and intensity and the underlying 206 production functions 12.6 Unique linear-log production functions for individual agricultural types 214 generate a production function envelope that is estimated as a CobbDouglas function 12.7 The effect of varying annual prices of output on estimation of 215 production functions measured in terms of value of yield 13.1 Spatial distribution of sampling sites in Neuse Estuary 223 13.2 Distribution of b parameter from turbidity model 226 13.3 Distribution of b parameter from salinity model 227 13.4 Distribution of c parameter from salinity model 227 13.5 Distribution of b parameter from total suspended solids model 229 13.6 Distribution of c parameter from total suspended solids model 229 13.7 Distribution of b parameter from chlorophyll-a model 230 14.1 The operational stages of the expansion method 237 14.2 Cumulative temporal S-shaped growth of the relative portion of adopters239 of an innovation: (a) innovation diffusion within an indifferent environment; (b) innovation diffusion within an active environment
viii
14.3 Scheme of the redistribution of an innovation between adopters and nonadopters caused by the intervention of an active environment 14.4 Construction of the level curves for general spatio-temporal innovation spread 14.5 Qualitative description of innovation diffusion dynamics with asymptotically stable initial and final equilibria 14.6 Interconnections between diffusion of competitive innovations and individual utility choice within an active environment 16.1 Moving window regression time plot of β for Pittsburgh 16.2 Quadratic and cubic spline time plot of β for Pittsburgh
241 245 249 255 283 285
TABLES
2.1 A compilation of correlations between rates of growth of population and 15 product per capita 2.2 Tabulation of PRC1(P'), PRC2(P'), and g[y*(P')] evaluated at a range of 24 values of P' 4.1 Major income maintenance programs, 1979 50 4.2 Varimax rotated factor matrix 55 4.3 Results of the initial model 57 5.1 Descriptive statistics: 1963–83 annual data for the contiguous forty76 eight states 5.2 Ordinary least squares estimates of the terminal model: quadratic 76 expansions in time 5.3 Annual regressions for GPRIM, model (5.5): estimated coefficients and 77 p values 5.4 Annual regression estimates for GSPEC, model (5.6): estimated 78 coefficients and p values 5.5 Three-year moving-average window regressions for GPRIM, model (5. 79 5): estimated coefficients and p values (n=144) 5.6 Three-year moving-average window regressions for GSPEC, model (5. 80 6): estimated coefficients and p values (n=144) 6.1 Coefficients of the terminal model for age categories of males 93 6.2 (a) Parameter estimates for origins in the Costa region; (b) parameter 96 estimates for origins in the Sierra region; (c) parameter estimates for origins in the Oriente region 7.1 Components of decentralization as treated by the four methods 111 7.2 Population and population growth in the Tel Aviv urban region by rings 111 7.3 Location and shifts of the peak point of growth: Tel Aviv, 1961–80 115 7.4 Cross-section to the south from Tel Aviv (128, 36) to Qiryat Eqron 119 (133, 60) 8.1 Summary of regression results 132 8.2 Conformity of results with hypotheses 140 9.1 (a) Estimates for linear rank-size functions; (b) estimates for linear rank-154 size function expanded in time 9.2 Estimates for terminal model: In y'=a+ [b=f(r)] In r 158 9.3 Estimates for terminal models (9.20) and (9.21) 161 9.4 Values of db/dt for selected ranks 162 10.1 Results of rank and time expansion analysis: Hungary’s urban system 172
x
11.1 Structural changes during economic development 11.2 Regression results for the Chenery and Syrquin model 11.3 Regression results for the exponential model 11.4 Regression results—expansion by population 11.5 Regression results—expansion by resource flow 11.6 Regression results—temporal expansion 11.7 Regression results—spatial expansion 12.1 Key to regression variables 12.2 Cobb-Douglas production function estimates 12.3 Linear-log model production function estimates 13.1 Range of observed water quality values 13.2 Range of mean Landsat multispectral scanner digital numbers 13.3 Summary of results
179 184 185 188 189 190 190 210 211 212 223 224 225
CONTRIBUTORS
Luc Anselin is Associate Director, National Center for Geographic Information and Analysis and Professor, Departments of Geography and Economics, University of California, Santa Barbara, California. Emilio Casetti is Professor, Department of Geography, The Ohio State University, Columbus, Ohio. Darrick R.Danta is Associate Professor, Department of Geography, California State University, Northridge, California. Mark Ellis is Assistant Professor, Department of Geography, Florida State University, Tallahassee, Florida. C.Cindy Fan is Assistant Professor, Department of Geography, University of California, Los Angeles, California. Stuart A.Foster is Assistant Professor, Department of Geography and Geology, Western Kentucky University, Bowling Green, Kentucky. Wilpen L.Gorr is Professor, School of Urban and Public Affairs, Carnegie Mellon University, Pittsburgh, Pennsylvania. Robert Q.Hanham is Associate Professor, Department of Geology and Geography, West Virginia University, Morgantown, West Virginia. John Paul Jones, III is Associate Professor, Department of Geography, University of Kentucky, Lexington, Kentucky. Janet E.Kodras is Associate Professor, Department of Geography, Florida State University, Tallahassee, Florida. Shaul Krakover is Senior Lecturer, Department of Geography, Ben Gurion University of the Negev, Beer Sheva, Israel. Martin Miles is a Ph.D. candidate, Department of Geography, University of Colorado, Boulder, Colorado. Richard L.Morrill is Professor, Department of Geography, University of Washington, Seattle, Washington. John Odland is Professor, Department of Geography, Indiana University, Bloomington, Indiana.
xii
Kavita Pandit is Assistant Professor, Department of Geography, University of Georgia, Athens, Georgia. Michael Sonis is Associate Professor, Department of Geography, Bar-Han University, Ramat-Gan, Israel, and Adjunct Professor, Department of Geography, University of Illinois, Urbana-Champaign, Illinois. Douglas A.Stow is Professor, Department of Geography, San Diego State University, San Diego, California. Sent Visser is Associate Professor, Department of Geography and Planning, Southwest Texas State University, San Marcos, Texas. Francis C.Wimberly is Senior Computer Scientist, Pittsburgh Supercomputing Center, Carnegie Mellon University, Pittsburgh, Pennsylvania.
ACKNOWLEDGMENTS
We would like to express thanks to the editors of the IEEE Transactions on Systems, Man, and Cybernetics, the Annals, Association of American Geographers, and Modeling and Simulation, for permission to reprint chapters 2, 4, and 16, respectively. In addition, we would like to thank Steve Grant, Department of Geography, University of Kentucky, for assistance in preparing some of the graphics appearing in this book.
1 AN INTRODUCTION TO THE EXPANSION METHOD AND TO ITS APPLICATIONS Emilio Casetti and John Paul Jones, III
In this introduction we present an overview of the applications of the expansion methodology appearing in this book. First, however, it is useful to outline what the expansion method is, and why you, the reader, might, or should be, interested in it. Often, the processes of scientific inquiry identify critical variables and ‘important relationships’ among them. These relationships are likely to reflect and incorporate theoretical presuppositions, and are eventually formalized into mathematical models and estimated. Production functions, demand functions, the rank-size rule, and spatial interaction models are examples of such ‘important’ or ‘special status’ relationships. Relationships such as these play a central role in the contemporary social sciences. Disciplines such as economics, psychology, or political science grew by carving from a common matrix certain ‘proprietary’ clusters of important relationships. The standing of individual scientific disciplines tends to be related to a major degree to their success in identifying, theorizing, modeling, and estimating such distinctive relationships. There is no question that the abstraction of simple important relations from complex contexts can provide very significant additions to knowledge. Nevertheless, many limitations and shortcomings of the social sciences and their models can be traced to the same processes of abstraction that are also responsible for these advances. Simple and elegant models can yield important insights into naturam rerum (the nature of things), but they are in all likelihood inadequate for understanding complex realities and for intervening to change them. There is a need to reintroduce the complexities of the real world into simple theoretically grounded mathematical models without destroying these models in the process. In fact, the simple models and the simplified important relationships prevalent in the contemporary social sciences should be regarded as early steps in the growth of knowledge, rather than the end point and culmination of it. The expansion methodology combines a technique and a research philosophy that is especially well suited to bring together simple models and complex realities.
2 E.CASETTI AND J.P.JONES, III
The expansion method is both a technique for creating or modifying mathematical models and a research paradigm. As a technique, it consists of the following well-defined operational steps: (a) an ‘initial model’ is specified; the model is made of variables and/or random variables and at least some of its parameters are in letter form; (b) at least some of the letter parameters in the initial model are redefined by ‘expansion equations’ into functions of variables and/or random variables; in many cases these are substantively significant indices representing a context; (c) the expanded parameters are replaced into the initial model to create a ‘terminal model’; and (d) the expansions can be iterated, since the terminal model produced by one expansion can become the initial model of a subsequent one. Suppose that we take as initial model an important relationship with strong theoretical grounding, and that the expansion equations model the contextual variation of this relationship. Then the terminal model obtained from the two will encompass in the same entity both the model and its contextual drift. Thus, the identity of the initial model is preserved, but at the same time the initial model is rendered capable of addressing complex contextual realities that were previously not part of it. The expansion methodology is also a research philosophy which carries within itself the suggestion that important theoretically grounded relationships should be regarded as building blocks of more complex theoretical structures encompassing both them and their contexts or environments. Specifically, these higher structures should reflect both the theory behind the initial model and the theory about the nexus between the initial model and its contexts. Clearly, the expansion paradigm has major implications as regards estimation. The theoretically grounded relationships from individual disciplines tend to be investigated and estimated under the implicit presupposition that they possess some form of quasi universal validity (i.e. invariance). Certainly, in most cases, they are presumed to be invariant over the data sets from which they are estimated. In contrast, the expansion methodology suggests that presuppositions of invariance are almost always unwarranted. Instead, the variation of relationships across contexts should be presumed, investigated, tested for, and theorized. The ‘invariance’ or ‘universal validity’ of a relation should be a conclusion arising from an extensive, protracted, and unsuccessful search for contextual variation, rather than a presupposition. Furthermore, the contextual variation of relationships should not be regarded as a nuisance or an aberration, as is currently the case. On the contrary, the theoretical and empirical investigation of the variation of important relations across contexts should be regarded as the obvious second phase of any scientific effort that has brought these relationships into focus. In this next phase, potentially relevant contexts and environments should be focused upon to determine whether a relationship drifts across them, and to theorize why we should expect such drift to occur.
AN INTRODUCTION TO THE EXPANSION METHOD 3
The word paradigm has diverse meanings. However, it is often used to denote an intertwined cluster of research questions and operational approaches/ techniques to obtain answers to these questions. In this sense, the expansion methodology is a paradigm, since it suggests that researchers ask questions about the contextual variation of relations while at the same time it provides the operational routines to model this variation and to test for its occurrence. Research involving mathematical models involves diverse activities and is carried out within diverse schools of thought. To exemplify, let us consider some cases. One class of model-oriented research aims at determining the optimum states or optimum time paths of systems by techniques such as mathematical programming, optimum control, and others. Other activities are concerned with extracting the implications of models. Examples include research on systems of equations (as in input-output studies), the solving of differential equations, the execution of simulations (as in the System Dynamics tradition), and the investigation of the qualitative properties of dynamic systems. Other types of model-oriented research are concerned with the estimation of a model’s parameters using empirical data. Estimation work is carried out by practitioners and theoreticians such as engineers, econometricians, statisticians, geographers, and physicists, to name a few, all of whom are very different in their objectives, concerns, and preferences. Their approaches may differ in the extent to which a researcher is committed to a specific model or, alternatively, is willing to consider variants or alternatives to it; in the emphasis on substantive modeling vis-à-vis the specification of error terms; in the manner and extent to which prior information is brought to bear upon the estimation process; and so on. The vast diversity of mathematical modeling is placed into focus here in order to make the point that the expansion methodology can be applied to modeloriented research of any kind and within an open-ended spectrum of research approaches. For instance, it can be used to construct and modify very abstract models within a frame of reference encompassing the qualitative study of differential equations in which no estimation is contemplated, or to construct or modify models within, say, an econometric perspective. Indeed, the expansion method has a far greater potential, both methodologically and substantively, than is represented by the diversity of papers in this volume—most of which have been written by scholars with primarily substantive interests. These papers, introduced in the paragraphs that follow, reflect their authors’ perceptions of the expansion method and correspond to their diverse substantive and methodological preferences. Casetti’s paper is a reprint of a 1986 statement on the expansion method that appeared in the IEEE Transactions on Systems, Man, and Cybernetics. The paper provides a guide to diverse applications of the expansion method. It also introduces ‘dual expansions’, a methodology which enables the researcher to investigate the duality between model and context using the expansion method. Casetti shows that when a model is expanded with respect to contextual
4 E.CASETTI AND J.P.JONES, III
variables, an implicit second model becomes defined in which the primal context becomes the dual model and the primal model becomes the dual context. The paper illustrates the model-context duality in an empirical study of economic development and population growth. The next contribution, by Jones, discusses the paradigmatic aspects of the expansion methodology. He explores the implications of the expansion method for ‘open’ research, for altering research trajectories, for testing alternative theoretical frameworks, and for micro and macro level analyses. Jones then uses the expansion method to undercut the distinction between regional and systematic geography. The paper ends by drawing some parallels between the expansion method and analyses employing scientific realism. Kodras’ paper focuses upon the spatial variation of the relationship between participation in welfare programs and welfare benefits. Traditionally, debates over welfare have been dominated by opposing theories reflecting liberal and conservative perspectives. The liberal theory views welfare provision as a policy response to social needs, with the corollary that greater welfare participation is the counterpart of greater social need. The conservatives’ work-disincentive theory formalizes the notion that high welfare benefits discourage participation in the labor force and encourage participation in welfare programs. The research in this area has been largely aimed at determining the validity of these theories, or at most their comparative ability to explain reality. Kodras, instead, investigates spatial variation in the explanatory power of these theories. Her paper is concerned with participation in the Aid to Families with Dependent Children program. In a capsule, Kodras expands an initial model relating program participation and benefits into varimax rotated factors extracted from a number of relevant contextual variables. Her conclusions are that ‘each position in the welfare debate is more valid in some places than in others because the programs have different impacts in different contexts’, and that the spatial pattern of welfare provision is characterized by a mismatch between welfare services and welfare needs. Foster, Gorr, and Wimberly address the comparative merits and the complementarities of drift analyses versus expansion approaches in the study of parametric variation. Drift analyses involve multiple estimations of an initial model, for instance at different points/regions in geographic space, or for different points/intervals in time. In this paper several specifications of moving window regressions are estimated and their results are contrasted with those produced by expansions. The initial model is a functional relation between the growth rates of physicians, the dependent variable, versus density of physicians and population growth, the independent variables. State level data from the American Medical Association master files are employed. The empirical analyses in the Foster et al. paper are suggested by the literature on the locational behavior of physicians, and are designed to estimate the effects of federal programs intended to end physician shortages and maldistribution.
AN INTRODUCTION TO THE EXPANSION METHOD 5
Using random utility theory, Ellis and Odland specify a model of destination choice functionally similar to an originspecific gravity model. This model is expanded first to allow for the distinction between urban and rural destinations and then in order to examine the effects of age and gender on destination choices. All the variables in the model are categorical or categorized. The ages of migrants are categorized into age classes, and the migration distances are categorized into distance bands. The Ellis and Odland formulation yields a 240cell contingency table; is characterized by a binomial sampling error; involves heteroskedasticity; and requires generalized least squares for its estimation. The sampling zeros in the contingency table are removed using pseudo-Bayesian estimates. The empirical analysis presented is based on an Ecuador data set with about 78,000 observations. This paper brings together concepts and formalisms from ANOVA, categorical data analysis, and the expansion method. Krakover discusses four approaches to the investigation of metropolitan decentralization, of which two are applications of the expansion method. The latter approaches involve expanding into time polynomials, respectively, the parameters of a polynomial in distance from the CBD and the parameters of a trend surface. Population growth is the dependent variable in both formulations. The methods discussed in the paper are demonstrated and contrasted by a case study for the urban region of Tel Aviv. The paper by Krakover and Morrill investigates a cluster of hypotheses concerning the dynamics of urban centralization and decentralization. Namely, they hypothesize (a) that the third Kondratieff cycle (1896–1933) coincided with metropolitan centralization; (b) that the fourth Kondratieff cycle (1933–72) coincided with metropolitan deconcentration; and that during both cycles, (c) periods of prosperity were characterized by the growth of more central counties of metropolitan areas and by decline in less central counties, while (d) recessionary years tended to exhibit inner county decline and outer county growth. These hypotheses are tested using county data for the metropolitan areas of Philadelphia, Chicago, and Atlanta. The analyses are based on a model obtained by expanding into time the parameters of an initial formulation relating population growth to distance. Both Danta’s and Fan’s contributions employ expanded rank-size models. Danta analyzes the temporal and structural changes in the Hungarian urban hierarchy between 1870 and 1986. He starts from the classical formulation relating the sizes of urban centers to their rank, and argues that the parameter associated with the rank variable is a measure of hierarchical concentration. In the conventional unexpanded formulation this measure refers to one point in time, and to the entire system. By expanding the parameters of the rank-size model with respect to time and rank, Danta generates a terminal model that can portray agglomerative and deglomerative tendencies over time, and at various levels of an urban hierarchy. His empirical analyses estimate the temporal shifts of agglomerative and deglomerative trends of the Hungarian urban system, and test the effectiveness of that country’s policies aimed at reducing urban primacy.
6 E.CASETTI AND J.P.JONES, III
Fan extends the potential use of the rank-size functions to the study of inequalities. She argues that the slope parameter of a log transformed rank-size relationship is a systemic measure of inequality in any system. Fan proposes using expanded rank-size relationships to investigate the change of inequality of a system across any context, as well as within any system. This approach is applied to investigate the dynamics of development inequalities for thirty-eight countries between 1913 and 1980. A crucial suggestion arising from the expansion method paradigm is the stability of social science ‘laws’. Theoretically grounded empirical regularities with a law or quasi-law status are usually estimated under an implicit presupposition of invariance. As Pandit’s paper demonstrates, the drift of such laws across contexts is likely. She investigates the contextual drift of the law-like country level relationship between labor shares in agriculture, manufacturing, and services on one hand, and gross national product per capita on the other. Pandit’s starting point is a classical study by Chenery and Syrquin in which these relationships are estimated under an implicit assumption of invariance. Using a virtually identical data set, she is able to show that the relationships display a statistically significant drift over time and across space, meaning that they are not invariant and are thus not laws. Visser shows that ‘expansions’ are required to estimate agricultural production functions from areal data in a manner that is consistent with location theory. His argument runs as follows. Location theory tells us that under competitive market equilibrium, agricultural types and intensities are distributed in space so as to maximize rents. For a single type of agriculture the spatial pattern thus produced is one of intensities decreasing with distance from markets. For multiple agricultural types, at each point in space, only one agricultural type will have an optimum intensity that maximizes rent. However, decreasing intensities with distance from market will still prevail for each agricultural type. These propositions imply that when the parameters of an aggregate agricultural production function are estimated from areal data, they should be expanded into indices of the strength of various agricultural types in order to come to grips with the fact that each areal aggregate includes a mix of agricultural types. A successful empirical analysis concludes Visser’s paper. The remote sensing application of the expansion methodology presented by Miles, Stow, and Jones is an effort that opens up a wide vista of similar applications. In remote sensing, measurements of phenomena taken at a distance, for instance from satellites, need to be functionally related to measurements taken at the surface of the earth. These relationships provide the basis for securing high resolution, inexpensive, and reliable information on earth surface phenomena. Miles et al. argue that initial models relating satellite measurements to surface measurements can be usefully expanded in terms of substantively relevant variables for the purpose of improving our ability to make accurate inferences about earth phenomena from space. In their application, trend surface expansions of a model relating satellites and surface measurements of
AN INTRODUCTION TO THE EXPANSION METHOD 7
several estuary water properties were tested, with results ranging from encouraging to very good. The book ends with three papers centering upon theoretical and methodological themes. Sonis employs a generalization of the expansion methodology to link geographic diffusion theory, economic utility theory, and ecological competition theory. His is an example of expansion method applications that are not directly oriented toward estimation. Anselin, on the other hand, addresses estimation themes from a spatial econometrics point of view. His paper focuses upon the issues that arise when the error terms are spatially autocorrelated and/or heteroskedastic (possibly because of stochastic expansion equations). The classes of spatial estimation issues including the ones addressed by Anselin are attracting a growing interest in geography and regional science. In the final paper Hanham discusses the expansions into ‘splines’, thus integrating the expansion methodology with a class of techniques that has been diffusing from engineering into the social sciences. His paper includes an application of spline expansions to regional unemployment response functions. In closing this introduction, we would like to express the hope that the readers of this book will experiment with expansion method techniques and themes in their own research.
2 THE DUAL EXPANSION METHOD: AN APPLICATION FOR EVALUATING THE EFFECTS OF POPULATION GROWTH ON DEVELOPMENT Emilio Casetti The progress of scientific research involves a recursive process in which two logically distinct phases can be identified. One phase is concerned with formulating or revising the disciplinary conceptual frameworks that define (a) issues and phenomena to be studied, (b) what constitutes pertinent information and data, (c) the procedures through which information and data are obtained, and finally (d) the ‘important’ relations among phenomena that are singled out for investigation. The second phase of the process is concerned with modeling. It involves constructing alternative mathematical models, assessing their consistency with empirical data, estimating their parameters, and extracting their quantitative and qualitative implications. The conceptual and modeling phases of this process interact with one another. In fact, research involves recursive iterations between the two. The results of the modeling phase, in the form of implications of models or tests of hypotheses, modifies the previous conceptual framework, which in turn generates new tasks to be carried out during a subsequent modeling phase, and so on. Most of the attention of the scholars working with models tends to focus on parameter estimation, on extracting the implications of propositions formalized into specific mathematical models, or on producing the models required by specific research leads. The logical processes by which models are arrived at tend to remain in the background and generally are not viewed as a distinct object of enquiry. And yet, an awareness of the mental and logical routines by which models are generated has the potential to render modeling simpler, more efficient, and more likely to produce better results. THE EXPANSION METHOD Casetti (1972, 1982) outlined a routine for creating or modifying models made of a sequence of clearly identified logical steps, and called it the expansion method. The expansion method is a technique for generating models. It involves the following: (a) an ‘Initial’ model in which some or all of the parameters are in non-numerical form is selected; (b) some or all of the letter parameters are ‘expanded’ by expansions equations that redefine them as functions of variables
E.CASETTI 9
and/or of random variables, that may or may not appear in the initial model; (c) a ‘terminal’ model is generated by replacing the expanded parameters into the initial model. Let us illustrate the operation of the method at the formal level and in a purely deterministic context. Denote by Y a dependent variable and by X and Z two sets of variables X1, X2,…, Xp and Z1, Z2,…, Zq. For simplicity in the discussion that follows p=q=2. Assume an initial model Y=f(X) represented by a linear relation between a dependent variable Y and the X variables, (2.1) and expansion equations defining the parameters of this initial model into linear functions of the Z variables: (2.2) (2.3) (2.4) By substituting the right-hand side of (2.2), (2.3), and (2.4) for the corresponding coefficients in (2.1) the following terminal model is obtained: (2.5) The first and second subscripts of the c denote respectively which X and Z variables are represented in the expression to which the coefficient is attached. For instance, c12 identifies the presence of variables X1 and Z2. A zero subscript denotes the absence of the corresponding variable. For instance, c01 is associated with an expression containing the Z variable Z1 but no X variable. The usefulness of this notation will become apparent later. Most mathematical models can be related to simpler ones by viewing them as the result of an expansion of the simpler models’ parameters. For instance, quadratic polynomials can be regarded as expansions of linear polynomials. To show this, take as initial model the polynomial and expand n into a linear function of t by the expansion equation Then, the terminal model is a quadratic. As a technique to relate models to one another, the expansion method is quite general, since any model can be expanded, and in turn just about any model can be thought of as the result of the expansion of a simpler model. The primitive within this frame of reference is a model consisting of a variable set equal to a constant value, because no expansion can generate it. On the other hand, any primitive Y=a can be converted by repeated expansions into a wide variety of different functional relations. However, the same terminal model may be arrived at from different initial models. In order to place into focus the generality of the expansion method let us consider the variety of initial models and expansion equations it can encompass. Mathematical models can be purely deterministic, purely stochastic, or they may include both deterministic and stochastic components. A functional relationship between a dependent variable and one or more independent variables exemplifies
10 THE DUAL EXPANSION METHOD
the first case. A mathematical structure involving jointly distributed random variables constitutes an instance of the second type of model. The usual econometric model of the type in which W and e are random variables and the Vs are variables typifies the third situation. Letter parameters may appear in these models within a functional relation involving variables and/or random variables, or as parameters of stochastic processes or of probability distributions. Models from any one of the classes indicated above can be used as initial models, and any of the letter parameters appearing in a model may be redefined by expansion equations as functions of variables or of random variables, or of both. For example, the parameters of a functional relation can be expanded into functions of other variables and/or random variables. Or a probability distribution’s parameters can be expanded into functions of variables, or of random variables, or both. Specialized literatures, often centered on estimation issues, have developed constructs that can easily be conceptualized in terms of the expansion method. Here are a few examples. The random coefficients models (Wald 1947; Swamy 1971; Judge et al. 1980: 352ff.) can be thought of as expansions of the parameters of functional relations or of the deterministic portion of linear models into random variables. Autoregressive or lagged dependent variables models (Johnston 1972: 292ff.; Pindyck and Rubinfeld 1976:485ff.) are easily produced by expanding the parameters of a nonautoregressive initial model into functions of the dependent variable lagged in time. The empirical Bayes inference (Robbins 1955, 1964, 1980; Morris 1983; DuMouchel and Harris 1983) postulates that observed data are produced by a stochastic process the parameters of which result from a second stochastic process. Related to these are the hierarchical statistical models (Lindley and Smith 1972; Good 1980). They involve nested probability distributions in which the parameters of the highest order distribution are random variables with parameters that are also random variables. Within the expansion method’s frame of reference these constructs involve stochastic initial models with parameters expanded into random variables. The expansion method capability to routinize the creation of mathematical models renders it potentially useful along several dimensions. Specifically, it can be used 1 to test hypotheses concerning the drift and/or stability of a model’s parameters, and to obtain functional portraits of this drift; 2 to create complex models from simpler ones for specific research purposes; 3 as an organizing scheme within the context of which mathematical models can be classified and related to one another; 4 to interpret complex models in terms of simpler initial model(s) and related expansion equations; and 5 as an artificial intelligence technique allowing model creation by computer programs.
E.CASETTI 11
These five classes of applications of the expansion method are interrelated and cannot be entirely separated from one another. Samples of the expansion method literature are reviewed in the paragraphs that follow. A number of applications of the expansion method test hypotheses of spatial parameter variation or estimate the structural parameters of models expanded to incorporate spatial dimensions. In one of the earliest applications (Casetti and Demko 1969, 1973; Demko and Casetti 1970) fertility and mortality decline was modeled using a generalized logistic. Then, the model’s parameters specifying the rate and timing of the decline were expanded into the distances from the urban centers where these declines had originated. The terminal model’s parameters were estimated using areally disaggregated data for the USSR for the years 1940, 1950, 1960, and 1965. Strongly significant spatial temporal patterns were observed. Other demographic applications of the expansion method exist in the literature (Casetti 1973; Hanham 1974; Bretschneider et al. 1981; Ying 1982; Zdorkowski and Hanham 1983). Another early application of the expansion method (Casetti and Semple 1969) involved testing hypotheses on the diffusion of tractors in the USA. In order to do this, the parameters of a logistic relating percentage adopters and time were expanded into linear functions of distance from the adoption leader (North Dakota). The terminal model obtained was estimated using data on the percentage of farms using tractors for twenty-five states in the central portion of the USA and for nine time periods spanning the 1920–64 time horizon. The investigation revealed a diffusion lag proportional to distance from the adoption leader. Instead, it did not support the hypothesis that the diffusion rates would differ with distance from the adoption leader. A number of other papers employed the expansion method to investigate diffusion phenomena (Casetti et al. 1972b; Hanham and Brown 1976; Casetti and Gauthier 1977; Zdorkowski and Hanham 1983). Some spatial applications of the expansion method have been concerned with testing for the occurrence of polarized growth (Casetti et al. 1970; Odland et al. 1973; Gaile 1977; Uyanga 1977; Krakover 1983). Recently trend surface expansions have been experimented with. They involve expanding an initial model’s parameter into spatial coordinate polynomials of the type used in trend surface regressions. Trend surface expansions are well suited to produce geographical maps of parameter variation. Also they appear to be one possible tool for removing spatial autocorrelation in regression residuals (Jones 1983). Trend surface expansions have been discussed by Jones (1984), Casetti and Jones (1983), Brown and Jones (1985), and Selwood (1984). In the Brown and Jones application trend surface expansions of a conventional migration model were employed to pinpoint the relations between migration modalities and economic development in Costa Rica. Some applications of the expansion method focused on the temporal drift of the parameters of the initial models. For example, the expansion method was used by Malecki (1975, 1980) to test hypotheses concerning the change over
12 THE DUAL EXPANSION METHOD
time of rank-size functions, by Bretschneider and Gorr (1983) to develop models suited for forecasting and policy evaluation, and by Bretschneider et al. (1981) to evaluate natural gas conservation policies in Ohio. In some cases the expansion method has been used to construct models not necessarily or immediately employed in the context of data analyses. For instance, Casetti (1972) showed that expanding some coefficients of a class of mathematical programs in terms of their dual variables generates the spatial equilibrium formulations ordinarily predicated upon the maximization of a social pay-off function (Samuelson 1952; Takayama and Judge 1964). Sonis (1983) used the expansion method to construct very general Volterra-Lotka type vectorial differential equations of diffusion processes. Visser (1980a, b, 1981, 1982) started from optimum relations between agricultural intensity and distance to market and used the expansion method to incorporate technological and other changes into them. Other expansion method applications are disconnected from the clusters of themes referred to above. Here are some examples. Briggs (1974) expanded differential equations relating aggregate CBD sales to urban population. Thrall (1979) used the expansion method to investigate spatial inequities in tax assessments. Thrall and Tsitanidis (1983) employed it to study the changes in the locational patterns of physicians, and Kodras (1984) used it to evaluate the regional variation of the determinants of food stamps program participation. In this paper the expansion method and its dual expansions extension will be applied to the analysis of parameter drift and parameter stability within the framework of a linear model. The population models assumed throughout this paper involve on the left-hand side of the equality sign a dependent random variable and on the right-hand side a linear combination of variables plus a stochastic error term. The expansions dealt with concern the parameters of the independent variables. Let us outline how the expansion method can be used for investigating parameter stability. Suppose we wish to test the hypothesis that the parameters of the initial model (2.1) drift ‘linearly’ with respect to the Z variables. The expansion equations (2.2), (2.3), and (2.4) are a formalization of this linear drift. ‘Population’ models are arrived at by adding random disturbances to (2.1) and (2. 5). If these disturbances satisfy independence and homoskedasticity assumptions the parameters of the terminal model can be estimated by an ordinary regression. If none of the coefficients associated with the terms in which the Z variables appear is statistically significant the initial model is ‘stable’, or at least does not have the type of instability that can be detected by linear expansions. It could still have, though, an instability that could be revealed by some other expansion. If instead one or more of the cs associated with Z variables are significant, this indicates that (2.1) changes with the Z variables in the manner that can be made explicit by replacing estimated cs into (2.2), (2.3), and (2.4). The estimated expansion equations associate each and every point in Z space with a ‘realization’ of the initial model. Since values of Z are given for each
E.CASETTI 13
observation, they can associate every observation with a numerical realization of the initial model. The estimated expansion equations also provide the basis for a sensitivity analysis. To show this, suppose we define a neighborhood in Z space and investigate the realizations of the initial model associated with the points in this neighborhood. This investigation constitutes indeed a sensitivity analysis of the initial model. DUAL EXPANSION METHOD If a terminal model has been generated from a linear initial model by linear expansion equations, there is a second linear initial model and associated linear expansion equation(s) that will yield the same terminal model. If a terminal model is given, as soon as the linear initial model and linear expansion equations capable of producing it are defined, then a second linear initial model and associated linear expansion equations which will produce the same terminal model become defined. Let us call the two initial models and associated expansion equations respectively primal and dual. Which expansion is primal is arbitrary, but when an expansion is defined as primal, the second one becomes the dual of the first one. In the next few paragraphs the dual expansions will be demonstrated using the example introduced earlier. A more formal presentation will follow. The intrinsic duality of the linear expansions is illustrated by the fact that the same terminal model (2.5) can be arrived at from an initial model relating the Y and Z variables by expanding its coefficients in terms of the X variables. To show this assume the initial model Y=g(Z), (2.6) and the expansion equations (2.7) (2.8) (2.9) that will indeed produce again the terminal model (2.5). In fact, if the parameters of (2.5) are estimated using empirical data, the resulting equation can be interpreted either in terms of the initial model (2.1) and of the expansion equations (2.2), (2.3), and (2.4), or in terms of the initial model (2.6) taken in conjunction with the expansion (2.7), (2.8), and (2.9), The estimated terminal model tests simultaneously the hypothesis that the parameters of (2.1) drift with respect to the Z variables, and the hypothesis that the parameters of (2.6) drift with respect to the X variables. Also, upon estimation the terminal model will yield two sets of expansion equations with numerical parameters that specify the nature of the drift of each initial model’s parameters in the dual space represented by the ‘other’ set of variables. This in turn implies that every point in
14 THE DUAL EXPANSION METHOD
Z space is associated with a realization of the initial model f(X), and every point in X space is associated with a realization of the dual initial model g(Z). Consequently, every observation can be associated with a realization of the f(X) and g(Z) models. It was noted that as soon as a model is conceptualized and interpreted in terms of a linear initial model and associated expansion equations, a unique linear dual initial model and associated expansion equations are implicitly defined. This situation is reminiscent of the primal dual relationships in mathematical programming. The dual expansion tableau scheme that follows (a) shows that a linear initial model and expansion equations are associated with a unique dual linear initial model and associated expansion equations; and (b) provides a simple routine for extricating primal and dual initial models and expansion equations from cumbersome terminal models. This dual expansion tableau is reminiscent of the Tucker tableau used to relate primal and dual linear programming formulations. The expansions in the previous example as specified by (2.1) – (2.9) are summarized, condensed, and portrayed in the following tableau: The tableau consists of a central region filled by the c coefficients of the expanded model, and surrounded by stubs. The coefficients of the two initial models appear on the left stub and on the top stub. The X and Z variables are located on the stubs opposite to these. The dependent variable Y is in the top left corner. By relating the tableau to (2.1)–(2.9) it will become apparent that any of these equations can be readily arrived at by dropping the X variables sidewise and the Z variables upward onto the appropriate coefficients, and by adding the required plus and equality signs. The tableau clearly shows that a unique linear dual expansion becomes defined as soon as a primal linear expansion is specified. As soon as the as of the primal initial model, the cs of its expansions, and the X and Z variables are given and placed in the appropriate locations in the tableau, the dual initial model and expansions are defined by the vertical entries in it. The tableau above has been compacted and generalized in the vector and matrix formulation below: The entries in this second tableau are the vector and matrix equivalents of the entries in the previous one. For instance, and so on. The relations between initial models, expansion equations, and terminal model are easily represented using a, b, C, and Y. Specifically, is the primal initial model, represents the expansion equations related to it, and the bilinear form is the terminal model. The dual
E.CASETTI 15
initial model is The expansion equations related to it are which upon substitution yield again the terminal model The intrinsic duality of the linear expansions has practical research implications. Let us bring into focus three of them. 1 An awareness of the expansions’ duality carries within itself the suggestion to identify and then to interpret dual initial models and expansion equations. This is reminiscent of the practice of interpreting dual variables in mathematical programming. As in mathematical programming, the attempt to interpret duals can produce interesting results in some cases, and uninteresting ones in others. 2 If the X and Z variables are distinct and competing potential explanators of Y that correspond to alternative theoretical frames of references, the dual expansions formalism allows investigation of their comparative efficacy and whether they influence each other’s efficacy. 3 Finally, the expansions’ duality can be put to use by taking a complex model as a starting point and then interpreting it in terms of alternative primal/dual initial models and associated expansion equations. These three approaches to using the intrinsic duality of the expansion method will be referred to as the dual expansions method. The practical usefulness of the dual expansions method can best be demonstrated by an example. AN EXAMPLE In the sections that follow the dual expansions method will be employed to investigate the effects of population growth on development. The issue of whether population growth has a positive or a negative influence on econom ic growth has been the object of heated ideological debates that often Table 2.1 A compilation of correlations between rates of growth of population and product per capita Source
Number of countries
Type of countries
Time span
R
R2
1
16
LDC
1952–58
−0.680
0.462
63
ALL
1950–64
−0.309
0.095
2
Stockwell 1962 Kuznets 1967
16 THE DUAL EXPANSION METHOD
Source
Number of countries
Type of countries
Time span
R
R2
3 4 5 6 7
21 40 35 20 26
MDC LDC LDC MDC LDC
1950–64 1950–64 1959–69 1959–69 1963–68
–0.434 0.111 –0.120 –0.300 –0.370
0.188 0.012 0.014 0.090 0.137
49
ALL
1950–66
–0.125
0.016
32
LDC
1950–66
–0.032
0.001
17
MDC
1950–66
–0.447
0.200
16
MDC
1960–70
–0.010
0.000
76
LDC
1960–70
–0.040
0.002
52
LDC
8 9 10 11 12 13
Kuznets 1967 Kuznets 1967 Sauvy 1972 Sauvy 1972 Stockwell 1972 Thirlwall 1972 Thirlwall 1972 Thirlwail 1972 Chesnais and Sauvy 1973 Chesnais and Sauvy 1973 Chesnais and Sauvy 1973 Simon 1977
1959–61 to 0.110 0.012 1969–71 14 11 MDC 90 to 115 –0.359 0.129 years 15 Simon 1977 10 MDC 40 to 70 years −0.122 0.015 Note: LDC, less developed countries; MDC, more developed countries; ALL, both types of countries.
emphasized the mechanisms through which these influences are exercised. Extensive reviews of this literature are available (United Nations 1953, 1973; Easterlin 1967; Kuznets 1967; Simon 1977, 1981; Cassen 1976; McNicoll 1984). Positive effects of population growth have been credited to scale economies; external economies; increased division of labor and specialization (Chenery 1960; Robinson, 1960; Maizels 1963; Glover and Simon 1975; Simon 1977:275); induced innovation effects (Hirschman 1958; Boserup 1965, 1981; Clark 1967; Binswanger 1978); the easier adjustments to change by younger populations; the greater pool of talent in larger populations; and finally the climate of buoyancy and optimism associated with growth (Kuznets 1965; Simon 1981:197ff.). On the negative side it has been maintained that high rates of population growth produce pressures on limited nonrenewable resources (Fisher and Potter 1971), bring about negative externalities of various kinds as well as external and scale diseconomies, reduce private and public capital formation, and channel investments to maintaining rather than increasing capital intensity (Coale and Hoover 1958; Kelly and Williamson 1974; Cassen 1978; Bilsborrow 1979).
E.CASETTI 17
No consensus has been reached as to which effect prevails. The empirical investigations regarding the relations between population growth and rate of development yielded results that are generally regarded as inconclusive. The compilation given in Table 2.1 shows that the correlations between the two rates are generally low, and are positive in some cases while negative in others. Inconclusive results could be produced by instability or drift of the relation between population growth and rate of development. The expansion method renders easy the testing of hypotheses concerning parameter drift. An awareness of the method renders the researcher wary of assuming parameter stability, and heightens instead his/her propensity to ask questions as to which variables, mechanisms, and theoretical propositions might be related to parameter drift. In the case in point, this line of enquiry suggests level of development as the single most likely cause of drift here. First, the contemporary countries differ to a very major extent in development levels. Second, several considerations suggest that the relation between population growth and rate of development may be different depending on a country’s level of development. The mechanisms through which population growth would exercise positive and negative effects on development can be presumed to operate with different intensity at different development levels. Specifically: 1 High rates of population growth are said to reduce capital formation and to divert savings toward less productive demographic investments. However, this effect will be stronger at the initial stages of the development process that tend to be characterized by a greater shortage of capital. 2 High rates of population growth compound the pressure that diseconomies of scale and negative externalities such as congestion and pollution place on economic growth. However, diseconomies of scale and negative externalities are more likely to be problems confronting mature economies located at the upper end of the development continuum. 3 A number of authors suggested that population growth renders change, and adjustment to change, easier. The bulk of the dislocations associated with the transfer of population from the countryside into evolving hierarchies of urban centers materializes at intermediate stages of the development process, which in turn suggests that the positive influence of population growth in facilitating these transformations will be felt more strongly by countries in an intermediate position on the development scale. These considerations suggest that the effects of population growth on rate of development should be more negative at low and high levels of development. Let us proceed to test this hypothesis. Denote population by POP, product per capita by PRC, and their natural logarithms by respectively P and y. Both the product per capita variable PRC and its logarithm y are indices of level of development. Assume that POP and PRC are continuous and differentiable
18 THE DUAL EXPANSION METHOD
functions of time t. Then the time derivatives of P and y, which are identified by primes, denote the instantaneous percentage rates of change of POP and PRC over time: (2.10)
(2.11) A linear relationship between y′ and P′ provides a convenient starting point for investigating the effect of population growth on rate of development: (2.12) The parameters a0 and a1 of this initial model are expanded in terms of development levels, i.e. they are redefined as functions of development levels. Which functions should be selected is dictated by considerations of simplicity and by the need to use a function that for appropriate values of its parameters can produce the type of change of the initial models’ parameters that theoretical presuppositions lead us to expect. In this case, we hypothesize that a1 will be lower at low and high development levels, which suggests expanding a1 into a quadratic of a development index. Considerations of symmetry suggest to expand also a0 in a similar fashion. Both expansions into PRC and y were tried out. The one based on y was eventually selected because it performed better in the empirical analyses described later. The expansion equations employed are: (2.13) (2.14) By replacing the right-hand side of the expansion equations (2.13) and (2.14) into the initial model (2.12) the following terminal model is obtained: (2.15) The dual tableau for this expansion is as follows: We can easily extract from this tableau the dual initial model (2.16) and the dual expansion equations (2.17)
E.CASETTI 19
(2.18) (2.19) Clearly, the dual initial model and expansion equations produce the same terminal model (2.15) generated by the primal formulations. The dual initial model links up to a literature distinct from the one related to the primal expansions. This literature is concerned with the tendency for contemporary countries in an intermediate position on the development scale to experience rates of economic growth higher than those of the countries closer to the end points of this scale. The occurrence of this tendency was suggested by Russett (1964:309ff.) who reported a parabolic relation between annual rates of growth of gross national product (GNP) per capita and levels of GNP per capita. The validity of the finding was later questioned by Hagen and Havrylyshyn (1969:88). The occurrence of the trend suggested by Russett was reaffirmed by Horvat (1972) on the basis of spline regressions relating growth rates to levels of GNP per capita, and then by Kristensen (1974). Kristensen’s conclusions were based on tabular data on average annual growth rates and levels of GNP per capita for seven groups of countries at consecutive development levels. Casetti (1979) fitted statistically significant quadratic and cubic polynomials to data on rates and levels of development for ninety-two countries. These data spanned the time interval 1950–73. The polynomials estimated did yield a maximum rate of development at intermediate development levels. Several interesting theories can be related to the tendency for countries at an intermediate development level to grow faster. The literature on the acceleration in the rates of economic growth associated with the onset of the transition from premodern stagnation to modern exponential growth (Rostow 1960; Kahn et al. 1976) is relevant to explain the increase in rates of growth of GNP per capita associated with the transition from low to intermediate development levels. The theories suggesting that countries modernizing later develop faster also contribute to justifying why the highest rates of development tend to occur at intermediate development levels. The thesis that latecomers to development have potential advantages with respect to the countries that modernize earlier—and consequently at comparative levels of development are likely to progress faster than the countries that preceded them—was proposed by Gershenkron (1962:5–30). A proposition related but not identical to Gershenkron’s has been articulated by Kahn (1976: 34ff.) who contended that the gap between the developed and developing nations, often deplored as the source and cause of underdevelopment, may instead very well be the reason why many nations can develop much more rapidly today than western countries did. A similar statement has also been put forward by Kristensen (1974: 28ff.). Both Kahn and Kristensen suggest that in today’s world the higher rates of growth of developing countries are due to the very existence of the higher level of development of the most developed countries.
20 THE DUAL EXPANSION METHOD
Finally, the theoretical statements suggesting that mature economies will tend to experience a retardation in economic growth (Matthews 1982; Olson 1982) are consistent with the occurrence of lower growth rates of GNP per capita in countries characterized by higher levels of GNP per capita. The explanations as to why middle-income modernizing countries experience the highest rates of development have been articulated within three frames of reference, which pertain respectively to (a) the rates of application of the existing stock of scientific and technological knowledge; (b) the dynamics of capital formation and investments; and (c) changes in scale economies. A large stock of scientific and technological knowledge has been generated during the modern economic growth of the more developed countries. The leastdeveloped societies, because of their socio-economic structure and their lack of capital and trained manpower, have difficulty in applying it. As they develop, their ability to use the existing technology grows, while at the same time the stock of unused applicable technologies dwindles and slows down their growth. In the limit, their economic growth becomes dependent upon the creation of new scientific and technological knowledge, as is the case for the developed countries (Kristensen 1974:16). Capital formation tends to be inadequate in pre-modern societies. It has been maintained that self-sustained modern economic growth only begins when the net capital formation exceeds a threshold of about 5 to 6 percent of GNP. As societies develop, their ability to save increases. However, the capital-to-output ratio tends to be low for less developed countries and increases with development. As a country develops, its increasingly large ability to generate savings and its decreasing capital-to-output ratio concur to produce a phase of accelerating economic growth. Then, in the more developed countries, capital formation may tend to be reduced by a growing preference for leisure, and also savings may be increasingly diverted toward consumption and welfare expenditures. These tendencies would of course be associated with a deceleration of economic growth. Finally, the expansion of markets that goes together with development brings about at first increasing returns to scale and external economies and then decreasing returns to scale and external diseconomies. Each of these mechanisms is capable by itself of accounting for higher growth rates at intermediate development levels. However, these mechanisms are also linked to each other and are mutually self-reinforcing. The dual initial model is a differential equation relating the rate of change of development over time to levels of development. Cross-sectional estimates of this equation’s parameters can identify tendencies prevailing over the time interval given. However, these estimates should not be used for predicting development paths over time unless one is prepared to assume that the dynamics of these countries is time invariant (‘stationary’), and that the same differential equation with the same parameters is applicable to all.
E.CASETTI 21
Nevertheless, the estimate of differential equation (2.16) provides insightful information about the tendencies of countries at various levels of development to grow more or less rapidly, or not at all, over a given time interval. Stable or unstable equilibria of the equation can be easily interpreted within this perspective. Equation (2.16) is capable of topologically distinct types of behavior depending upon the values of its parameters. The dual expansion equations (2. 17) – (2.19) express these parameters as functions of population growth rates. Consequently, the estimated expansion equations allow investigation of the qualitative and quantitative changes in behavior of (2.16) that are associated with different values of Pβ. It should be noted that these dual expansions provide an illustration of the use of the expansion method for the type of investigation falling within the scope of ‘nonlinear dynamics’ and ‘catastrophe theory’. However, they also provide an alternative approach to investigating the effects of population growth on development, since if the parameters of yβ(y) drift in the dual Pβ space, this drift represents an effect of population growth on the rate of development. This discussion has demonstrated how the dual expansion method operationalizes questions regarding the effects of population growth on rates of development into tests of hypotheses regarding the parameter drift of complementary models that correspond to distinct theoretical propositions. Let us now consider briefly how the estimated parameters of the terminal equation (2. 15) can shed light on the effects of population growth on the rate of development. Conclusions regarding the effects of population growth on the rates of development that are supported by the data will depend on which nonzero coefficients appear in this final regression equation. As an illustration let us consider a few possibilities. If only c00 and c10 are different from zero the data support the proposition that P′ affects y′ in a manner that is unaffected by development levels. If all the coefficients of the terms in which P′ appear are nonsignificant we would have to conclude that P′ does not affect y′. If, however, c11 or c12 are different from zero, then P′ affects y′ in different manners at different development levels. In this case a portrait of the drift of the two initial models’ parameters can be obtained by replacing the estimated c coefficients into the expansion equations. Let us postpone further consideration of these points until after the empirical analysis described in the next section. AN EMPIRICAL ANALYSIS This section presents an empirical test of the hypothesis that the effect of population growth on development is a function of development levels. The y', P β, and y variables were operationalized as (a) annual percentage rates of change of population and product per capita over the time interval 1950–73 and (b) the logarithm of product per capita at the midpoint of the interval. Variables (a) and (b) were computed using country level data on population and product per capita
22 THE DUAL EXPANSION METHOD
for 1973 and on the growth rates of these over the 1950–73 time horizon, published by the World Bank (1976). The sample used (92 data points) included all the countries with market economies for which source data were available and which had a 1973 population greater than 2 million people. The parameters of (2.15) were estimated using a backward selection procedure: first a regression including all the variables was computed; then, at each step of the procedure, the variable with the lowest t value was removed until all the coefficients of the variables still in the equation were significant at the 5 percent level or better. The procedure produced the following regression equation:
where the t values are in parentheses under their respective coefficients. The coefficients of the estimated terminal model were placed in the appropriate locations in the dual expansion tableau. The primal initial model , the dual initial model , and the estimated expansion equations can be read directly from the tableau. They are as follows: (2.20) (2.21) (2.22) (2.23) (2.24) (2.25) (2.26) The interpretation of the primal results is straightforward. The parameter a1 of the initial model specifies the change in rate of development associated with a unit increase in the rate of population growth. Consequently a1 represents the ‘effect’ of population growth on the rate of development. The estimated expansion equation (2.22) shows that the a1(y) function is a parabola with a maximum at y=ln (801), and with negative values throughout the range of y spanned by the data. This suggests that the effect of population growth on the rate of development is
E.CASETTI 23
Figure 2.1 Effect of population growth on the rate of development: a1(y) indicates by how much the rate of development is reduced by a population growth rate of 1 percent per year; y=ln(PRC), and PRC is product per capita in 1973 US dollars
Figure 2.2 Phase diagram of the relationship between rates and levels of economic development (corresponding to population growth rates of 1, 2, and 3 percent per year)
always negative, and that it is more so at very low and very high development levels, as hypothesized. A sketch of a1(y) is shown in Figure 2.1. The rate of development a0 in the absence of population growth is 4.473 percent per year. The a1(y) function indicates by how much this rate of development is reduced for each 1 percent of population growth, at specified levels of GNP per capita. Figure 2.1 shows that the least negative impact of P′ on
24 THE DUAL EXPANSION METHOD
y′ occurs at intermediate development levels, and the highest at very low levels of y. Let us consider the dual expansions next. Recall that the dual initial model is a differential equation relating rates and levels of product per capita, and that the dual expansion equations specify the manner in which its coefficients change as a function of population growth. For any given value of P′ the estimated terminal model yields a realization of . Figure 2.2 shows the phase diagram of three such realizations, that correspond to P' values of 1, 2, and 3 percent. The figure shows that each of these realizations of g(y) attains a maximum at y*(P') and intersects the y axis in two points that are the roots of . The values y1(P β) and y2(Pβ) of y corresponding to these roots are equilibrium values of (2.23). Specifically, y1 corresponds to an unstable equilibrium and y2 to a stable equilibrium, since dg/dy is respectively greater than zero at y1 and smaller than zero at y2. The P′ values appearing in the data set are greater than zero and smaller than 4. Over this range of P′, g(y) is topologically invariant in the sense that it is characterized by a finite maximum at y* and by unstable/stable equilibria y1 and y2 where y1< y*< y2. The effects of P′ on the relationship between rates and levels of product per capita are best understood by expressing y*, g(y*), y1, and y2 as functions of P′. After a few manipulations we obtain the following: (2.27) (2.28) (2.29) (2.30) The value of y at which y′ attains a maximum y* is a ‘constant’ function of P′. Instead, the maximum rate of growth of product per capita g(y*) declines at larger values of P′. The equilibrium values y1 and y2 tend respectively to minus and plus infinity as P′ tends to zero. If P′ is increased past realistic values toward a threshold located approximately at P′=11.6, y1, y*, and y2 collapse into a single value, and g(y*) tends to zero. This means that the topology of g(y) changes at P′ =0, at approximately P′=11.6, and is invariant between these. Let . A tabulation of PRC1, PRC2, and g(y*) for realistic values of P′ is given in Table 2.2. The table shows that g(y) is remarkably sensitive to the changes in P′ that can be found in the contemporary world. The rates of development tend Table 2.2 Tabulation of PRC1(Pβ), PRC2(Pβ), and g[y* (Pβ)] evaluated at a range of values of Pβ Pβ
PRC1
PRC2
g(y*)
0.5 1.0 1.5
1 7 19
721,569 88,200 33,973
4.28 4.09 3.90
E.CASETTI 25
Pβ
PRC1
PRC2
g(y*)
2.0 34 18,962 3.70 2.5 51 12,603 3.51 3.0 69 9,244 3.31 3.5 89 7,213 3.13 Notes: P′, percentage growth rate of population; PRC1, exp(y1) where y1 is the smaller root of g(y)=0; PRC2, exp (y2) where y2 is the larger root of g(y)=0; g(y*), maximum value of y′, y′, rate of development. PRC values are in 1973 US dollars.
to be highest at intermediate levels of development. However, for large rates of growth of population the maximum rates of development y* are lower, and the rates of development associated with any level of development are smaller. Also, the stable equilibrium level of product per capita PRC2 occurs at lower development levels when P′ is higher. Summing up, the dual expansions results reveal perverse effects of higher rates of population growth within a theoretical frame of reference different from the one associated with the primal expansions. DISCUSSION AND CONCLUSIONS At this point, it seems appropriate to place the expansion method and the dual expansion method in perspective. The expansion method is a procedure made of a clearly identified sequence of operational steps for generating or modifying mathematical models. Two aspects of the expansion method will be highlighted here. First, the expansion method is concerned with the formal aspects of the process by which mathematical models are created. By abstracting the manipulation of models from the rationale for and the contexts of such manipulations, a class of useful abstractions is arrived at much in the same way as, say, numbers became useful abstractions when they acquired a separate existence from objects counted. Second, the expansion method has a generality of scope that can be traced to the very fact that it addresses formal aspects of model manipulation. Because of this generality, it is applicable to any model, and any variables. In the literature there have been instances of what can be regarded as special purpose expansions. The technique discussed by Gujarati (1970a, b) can be thought of as involving expansions in terms of dummy variables for the purpose of determining whether regressions are significantly different over data subsets. Some phases of the time-varying parameters literature (Bennett 1979:323ff.) can also be conceptualized as involving expansions in terms of time and/or timelagged variables and variates. Also, some substantively oriented contributions suggest themselves as the product of ‘expansions’. For instance, the introduction of neutral technological
26 THE DUAL EXPANSION METHOD
progress into a Cobb-Douglas production function involves what in the expansion method terminology would be called the expansion of a coefficient into a linear function of time. Possibly, researchers may go through the sequence of steps of the expansion method to solve specific substantive research problems without recognizing in these mental processes the application of a general methodology. It can be argued that an awareness of the expansion method as a methodology general in scope can save the effort of redeveloping aspects of this methodology as specific solutions to specific research problems, and consequently can contribute to render research easier, which is a major role of methodological contributions. Also, such awareness can contribute to using the expansion method in situations in which it should be applied but is not. This is an important point that deserves to be developed at some length. Let us concentrate on the issues of parameter drift and stability, in which it is easy to show that an awareness of the expansion method can make a substantial difference. Social scientists assume more often than not that parameters they estimated are stable. Such stability tends to be an untested assumption rather than a conclusion accepted only after the efforts to reveal parameter instability or drift have proven unsuccessful. Indeed, instances of research focusing on the possible occurrence of parameter instability do exist. However, the bias is toward presupposing parameter stability, while instead the opposite should be the rule. Whenever functional relations are estimated, the default presupposition should be that their parameters are likely to vary over space, over time, and over different environments and circumstances. The expansion method provides a simple and easily applicable technique for testing hypotheses of parameter drift. Also, because of its very nature it can contribute to making researchers self-conscious concerning the possible occurrence of parameter drift and the theoretical reasons suggesting it. In a capsule, the expansion method can routinize the asking and answering questions concerning parameter drift and parameter stability. The social scientists’ propensity to assume parameter stability can be traced to a quest for ‘laws’. The mathematical formulation and estimation of invariant ‘laws’ has had a pivotal role in the physical sciences. The importation of mathematical modeling and techniques from the hard sciences into the social sciences tended perhaps to be associated with importing also an aspiration to discover laws. However, in the social sciences, functional relations are likely to represent subsystems that will perform differently in different environments and circumstances rather than invariant laws. The duality principles discussed in this paper state that, whenever a linear initial model and linear expansion equations are defined, also a dual linear initial model and associated expansion equations are implicitly defined. Each of the two initial models relates the dependent variable Y to a set of variables that are potential causes of drift of the ‘other’ initial model. The dual expansion method
E.CASETTI 27
involves investigating whether one set of alternative potential explanators of Y produces effects on Y to an extent determined by the other set of explanators. Both the expansion method and the dual expansion method can perform a most useful function. They can prompt the asking and facilitate the answering of questions, such as the ones concerning parameter drift, that need to be addressed to a much greater extent than has been the case so far. NOTE Reprinted with permission from IEEE Transactions on Systems, Man, and Cybernetics SMC–16 (1), January-February 1986, pp. 29–39. REFERENCES Bennett, R.J. (1979) Spatial Time Series: Analysis, Forecasting and Control, London: Pion. Bilsborrow, R.E. (1979) ‘Age distribution and saving rates in less developed countries’, Economic Development and Cultural Change 28:23–45. Binswanger, H.P. (1978) ‘Induced technical change: evolution of thought’, in H.P.Binswanger and V.W.Ruttan (eds) Induced Innovation; Technology, Institutions, and Development, Baltimore, MD: Johns Hopkins University Press. Boserup, E. (1965) The Conditions of Agricultural Growth, London: Allen & Unwin. ——(1981) Population and Technological Change: A Study of Long-term Trends, Chicago, IL: University of Chicago Press. Bretschneider, S.I. and Gorr, W.L. (1983) ‘Ad hoc model building using time-varying parameter models’, Decision Science 14:221– 39. Bretschneider, S.I., Gorr, W.L. and Roblee, P.R. (1981) An Evaluation of Natural Gas Conservation in the Residential Sector of Ohio using Time-varying Parameter Models, Columbus, OH: National Regulatory Research Institute. Briggs, R. (1974) ‘A model to relate the size of the central business district to the population of a city’, Geographical Analysis 6:265–79. Brown, L.A. and Jones, J.P. (1985) ‘Spatial variation in migration processes and development: a Costa Rican example of conventional modeling augmented by the expansion method’, Demography 22:327–52. Casetti, E. (1972) ‘Generating models by the expansion method: applications to geographic research’, Geographical Analysis 4: 81–91. ——(1973) ‘Testing for spatial temporal trends: an application to urban population density trends using the expansion method’, Canadian Geographer 17:127–37. ——(1979) ‘A class of differential equations relating rates and levels of economic development’, Modeling and Simulation 10: 1419–23. ——(1982) ‘Mathematical modeling and the expansion method’, in R.B.Mandal (ed.) Statistics for Geographers and Social Scientists, pp. 81–95, New Delhi: Concept Publishing. Casetti, E. and Demko, G.J. (1969) ‘A diffusion model of fertility decline: an application to selected Soviet data: 1940–1965’, Discussion Paper 5, Department of Geography, Ohio State University.
28 THE DUAL EXPANSION METHOD
——and——(1973) ‘A diffusion model of fertility decline: an application to selected Soviet data: 1940–1965’, Acta Geographica 12:53–67. Casetti, E. and Gauthier, H.L. (1977) ‘A formalization and test of the “Hollow Frontier” hypothesis’, Economic Geography 53:70–8. Casetti, E. and Jones, J.P. (1983) ‘Regional shifts in the manufacturing productivity response to output growth: sunbelt versus snowbelt’, Urban Geography 4:285–301. Casetti, E., and Semple, R.K. (1969) ‘Concerning the testing of spatial diffusion hypothesis’, Geographical Analysis 1:254–9. Casetti, E., King, L.J. and Odland, J. (1970) ‘On the formal identification of growth poles in a spatial temporal context’, Proceedings of the Association of Canadian Geographers 1:39–43. ——,——and——(1972a) ‘The formalization and testing of concepts of growth poles in spatial context’, Environment and Planning 3:377–82. Casetti, E., King, L.J. and Williams, F. (1972b) ‘Concerning the spatial spread of economic development’, in W.P.Adams and F.M.Helleiner (eds) International Geography, pp. 897–9, Toronto: University of Toronto Press. Cassen, R.H. (1976) ‘Population and development: a survey’, World Development 4: 785–830. ——(1978) India: Population, Economy, Society, New York: Holmes and Meier. Chenery, H.B. (1960) ‘Patterns of industrial growth’, American Economic Review 50: 624–54. Chesnais, J.C. and Sauvy, A. (1973) ‘Progres economique et accroissement de la population: une experience commentee’, Population 28:843–57. Clark, C. (1967) Population Growth and Land Use, New York: St Martin’s Press. Coale, A.J. and Hoover, E.M. (1958) Population Growth and Economic Development in Low-income Countries, Princeton, NJ: Princeton University Press. Demko, G.D. and Casetti, E. (1970) ‘A diffusion model for selected demographic variables: an application to Soviet data’, Annals, Association of American Geographers 60:533–9. DuMouchel, W.H. and Harris, J.E. (1983) ‘Bayes methods for combining the results of cancer studies in human and other species’, Journal of the American Statistical Association 78:293– 308. Easterlin, R.A. (1967) ‘Effects of population growth on the economic development of developing countries’, Annals of the Academy of Political and Social Sciences 369: 98–108. Fisher, J.L. and Potter, N. (1971) ‘The effects of population growth on resource adequacy and quality’, in Rapid Population Growth, National Academy of Science, vol. 2, Baltimore, MD: Johns Hopkins University Press. Gaile, G.L. (1977) ‘Toward a strategy of growth paths’, Environment and Planning A 9: 675–9. Gershenkron, A. (1962) Economic Backwardness in Historical Perspective, New York, Washington, and London: Praeger. Glover, D. and Simon, J.L. (1975) ‘The effects of population density upon infra-structure: the case of road building’, Economic Development and Cultural Change 23:453–68. Good, I.J. (1980) ‘Some history of the hierarchical Bayesian methodology’, in J.M.Bernardo et al. (eds) Bayesian Statistics, Valencia University Press. Gujarati, D. (1970a) ‘Use of dummy variables in testing for equality between sets of coefficients in two linear regressions: a note’, American Statistician 24:50–2.
E.CASETTI 29
——(1970b) ‘Use of dummy variables in testing for equality between sets of coefficients in linear regressions: a generalization’, American Statistician 24:18–22. Hagen, E.E. and Havrylyshyn, 0. (1969) ‘Analysis of world income and growth 1955– 1965’, Economic Development and Cultural Change 18:1–96. Hanham, R.Q. (1974) ‘The diffusion of birth control and space-time trends in the decline of fertility’, Proceedings of the Association of the American Geographers 8:80–3. Hanham, R.Q. and Brown, L.A. (1976) ‘Diffusion waves within the context of regional economic development’, Journal of Regional Science 16:65–71. Hirschman, A.O. (1958) The Strategy of Economic Development, New Haven, CT: Yale University Press. Horvat, B. (1972) ‘Relation between the rate of growth and the level of development’, Working Paper 13, International Development Research Center, Indiana University, Bloomington, IN. Johnston, J. (1972) Econometric Methods, New York: McGraw-Hill. Jones, J.P. (1983) ‘Parameter variation via the expansion method with tests for autocorrelation’, Modeling and Simulation 17:853– 7. ——(1984) ‘A spatially varying parameter model of AFDC participation: empirical analysis using the expansion method’, Professional Geographer 36:455–61. Judge, G.G., Griffiths, W.E., Hill, R.C. and Lee, T.C. (1980) The Theory and Practice of Econometrics, New York: Wiley. Kahn, H., Brown, W. and Martel, L. (1976) The Next 200 Years: A Scenario for America and the World, New York: William Morrow. Kelly, A.C. and Williamson, J.G. (1974) Lessons from Japanese Development: An Analytical Economic History, Chicago, IL: University of Chicago Press. Kodras, J.E. (1984) ‘Regional variation in the determinants of food stamp program participation’, Environment and Planning C: Government and Policy 2:67–78. Krakover, S. (1983) ‘Identification of spatio temporal paths of spread and backwash,’ Geographical Analysis 15:318–29. Kristensen, T. (1974) Development in Rich and Poor Countries, New York, Washington, and London: Praeger. Kuznets, S. (1965) Economic Growth and Structure: Selected Essays, New York: Norton. ——(1967) ‘Population and economic growth’, Proceedings of the American Philosophical Society 111:170–93. Lindley, D.W. and Smith, A.F.M. (1972) ‘Bayes estimates for the linear model (with discussion)’, Journal of the Royal Statistical Society, Ser. B 34:1–41. McNicoll, G. (1984) ‘Consequences of rapid population growth: an overview and assessment’, Population and Development Review 10:177–240. Maizels, A. (1963) Industrial Growth and World Trade, Cambridge: Cambridge University Press. Malecki, E.J. (1975) ‘Examining change in rank-size systems of cities’, Professional Geographer 27:43–7. ——(1980) ‘Growth and change in the analysis of rank-size distribution: empirical findings’, Environment and Planning A 12:41–52. Matthews, R.C.O. (ed.) (1982) Slower Growth in the Western World, London: Heinemann. Morris, C.N. (1983) ‘Parametric empirical Bayes inference: theory and applications’, Journal of the American Statistical Association 78:47–55.
30 THE DUAL EXPANSION METHOD
Odland, J., Casetti, E. and King, L.J. (1973) ‘Testing hypotheses of polarized growth within a central place hierarchy’, Economic Geography 49:74–9. Olson, M. (1982) The Rise and Decline of Nations: Economic Growth, Stagflation and Social Rigidities, New Haven, CT, and London: Yale University Press. Pindyck, R.S. and Rubinfeld, D.L. (1976) Econometric Models and Economic Forecasts, New York: McGraw-Hill. Robbins, H. (1955) ‘An empirical Bayes approach to statistics’, Proceedings of the Third Berkeley Symposium on Mathematics, Statistics and Probability 1:157–64. ——(1964) ‘The empirical Bayes approach to statistical decision problems’, Annals of Mathematical Statistics 35:1–20. ——(1980) ‘An empirical Bayes estimation problem’, Proceedings of the National Academy of Science, USA 77:6988–9. Robinson, E.A.G. (ed.) (1960) Economic Consequences of the Size of Nations, New York: St Martin’s Press. Rostow, W.W. (1960) The Stages of Economic Growth: A Non Communist Manifesto, Cambridge: Cambridge University Press. Russett, B., Alker Jr, H.R., Deutsch, K.W. and Lasswell, H.D. (1964) World Handbook of Political and Social Indicators, New Haven, CT: Yale University Press. Samuelson, P.A. (1952) ‘Spatial price equilibrium and linear programming’, American Economic Review 42:283–303. Sauvy, A. (1972) ‘Les charges économiques et les avantages de la croissance de la population’, Population 27:9–26. Selwood, D. (1984) ‘Office employment in the U.S. urban system, 1910–1970’, Modeling and Simulation 15:277–82. Simon, J.L. (1977) The Economics of Population Growth, Princeton, NJ: Princeton University Press. ——(1981) The Ultimate Resource, Princeton, NJ: Princeton University Press. Sonis, M. (1983) ‘Spatio-temporal spread of competitive innovations: an ecological approach’, Papers of the Regional Science Association 52:159–74. Stockwell, E.G. (1962) ‘The relationship between population growth and economic development’, American Sociological Review 27:250–2. ——(1972) ‘Some observations on the relationship between population growth and economic development during the 1960s’, Rural Sociology 37:628–32. Swamy, P.A.V.B. (1971) Statistical Inference In Random Coefficients Regression Models, Heidelberg: Springer Verlag. Takayama, T. and Judge, G.G. (1964) ‘Spatial equilibrium and quadratic programming’, Journal of Farm Economics 46:67–93. Thirlwall, A.P. (1972) ‘A cross section study of population growth and the growth of output and per capita income in a production function framework’, Manchester School of Economic and Social Studies 40:339–56. Thrall, G.I. (1979) ‘Spatial inequities in tax assessment: a case study of Hamilton, Ontario’, Economic Geography 55:123–34. Thrall, G.I. and Tsitanidis, J.G. (1983) ‘A model of the change, attributable to government health insurance plans, in location patterns of physicians, with supporting evidence from Ontario, Canada’, Environment and Planning C: Government and Policy I: 45–55. United Nations (1953) The Determinants and Consequences of Population Trends, New York.
E.CASETTI 31
——(1973) The Determinants and Consequences of Population Trends, New York. Uyanga, J. (1977) ‘Testing formal polarization of growth in spatial context’, Nigerian Geographical Journal 20:145–51. Visser, S. (1980a) ‘Technological change and the spatial structure of agriculture’, Economic Geography 56:311–19. ——(1980b) ‘Modeling the spatial structure of agricultural intensity: illustration of the Casetti expansion method’, Modeling and Simulation 11:1393–8. ——(1981) ‘Estimation of agricultural production functions’, Modeling and Simulation 12:833–9. ——(1982) ‘On agricultural location theory’, Geographical Analysis 14:167–76. Wald, A. (1947) ‘A note on regression analysis’, Annals of Mathematical Statistics 18: 586–9. World Bank (1976) World Tables 1976, Baltimore, MD: Johns Hopkins University Press. Ying, K. (1982) ‘A method for projecting urban population by census tracts using the analysis of spatial-temporal trends of urban population density’, Modeling and Simulation 13:1163–7. Zdorkowski, R.T. and Hanham, R.Q. (1983) ‘Two views of the city as a source of spacetime trends and the decline of human fertility’, Urban Geography 4:54–62.
3 PARADIGMATIC DIMENSIONS OF THE EXPANSION METHOD John Paul Jones, III
The introduction of new analytical methodologies in the social sciences tends to follow a common historical pattern. They are typically transported from more analytically oriented disciplines by a small cadre of researchers. If the new approaches have merit in providing solutions to the types of questions addressed by the discipline, they may gain a larger following and through their use achieve a certain degree of acceptance. In the end, however, many newly introduced techniques ultimately become passé—which is to say, they wear out their welcome in the discipline’s premier journal pages, only to be superseded by newer methodologies or ones that are better suited to solving emerging disciplinary questions. The above description perhaps applies best to factor analysis, but it characterizes the life span of many other methodologies as well. One reason for the eventual abandonment of analytical methodologies is that they seldom contain the seeds for rethinking larger disciplinary issues. This is especially true when techniques fail to shed light on historic disciplinary problematics, dualisms, and debates. If the techniques fail to say anything about what constitutes a discipline’s identity search, i.e. for its objects of inquiry, its fundamental questions, and, broadly speaking, its research agendas, then they remain only techniques. Such methodologies will also usually fail to create a paradigmatic space large enough to engage emerging perspectives and critiques. Caught between saying little to resolve historically significant debates, and being superseded by new problems and frameworks, they can become analytical blips in the time path of a discipline. In contrast to the above sketch, the expansion method has for nearly twenty years bypassed the stage of abandonment in methodological history. Indeed, as the papers in this volume attest, it continues to attract a diverse following, in terms of both research style and substantive application area. In this paper I consider some of the reasons behind the expansion method’s continued development, maturation, and use. I direct my comments specifically to the issues raised above, namely, that the method’s paradigm is useful not only for rethinking larger issues of central concern to a discipline, in this case geography, but that it is also capable of intersecting with themes found in newly emerging conceptual frameworks. While these comments are directed specifically to the
J.P.JONES, III 33
question of the expansion method’s role in geography, it will become clear that the methodology has much to contribute to the larger social science effort in general. The remainder of the paper is organized into four sections. In the first I describe the fundamental characteristics of the expansion method paradigm, working from specific characteristics of the methodology to more general aspects and implications of its paradigmatic message. The section describes the consequences of the paradigm for social science laws, the academic division of labor, the evaluation of theories, and the relationship between micro arid macro level research. In the subsequent section I employ the paradigm as a lens to shed new light on the long debated and seemingly intractable distinctions between regional/idiographic and systematic/nomothetic geography. I also discuss shared themes existing between localities research and the expansion method. The penultimate section examines the relationship between the expansion method and realist methodology, a perspective which has been highly critical of the post– 1950s scientific endeavor in geography, while a final section offers conclusions. THE EXPANSION METHOD PARADIGM In a restricted sense, the expansion method (Casetti 1972, 1982, 1986) suggests that we pose questions concerning the manner in which functional relations perform in wider contexts. (It is also of course a methodology for answering such questions.) Functional relations are significant in that the social sciences typically address questions of variation. The variations of interest obviously differ across disciplines, from the behavior of persons in psychology, to the growth of local economies in regional science, to policy outputs among governmental bodies in political science. Nevertheless, the theme of variation across relevant units of interest is prevalent in all social sciences, and the empirical estimation of functional relations is one accepted way of understanding both the hows and the whys of such variation. The expansion method most commonly intervenes at the point where a wellestablished, theoretically grounded, empirical regularity has been identified through the estimation of functional relations. Questions pertaining to the contextual variation of these established relations can then lead to myriad avenues of inquiry. For example, context might include gender, race, cultural attributes, position in the global economy, type of economic or political system, religion, density, environment, hierarchy, demographic structure, level of development, etc. (Examples of such effects in this volume can be found in Chapters 2, 4, 6, 9, 10, 11, and 12. Determining how established functional relations vary with respect to these attributes may both provide clues to the original variation of interest and suggest new inquiries concerning the theoretical and conceptual premises upon which the relationships are predicated. In addition to the above listing, the expansion method’s contexts of inquiry include time and/or space. In the former, we are enjoined to ask questions
34 PARADIGMATIC DIMENSIONS OF THE EXPANSION METHOD
concerning the variability of relationships from one time period to another, in either discrete or continuous form (e.g. see Chapters 5 and 16 in this volume). In the case of the latter we are concerned with the manner in which established functional relations, and by implication the conceptual frameworks which underlie them, vary from place to place. As in the case of time, space may be treated as discrete (e.g. regions or specific places; see Chapter 11) or continuous (e.g. distance from a point of significance or two-dimensional Cartesian space; see Chapter 13). Finally, one aspect of the methodology’s flexibility is that it enables researchers to reflect upon and apprehend the simultaneous variation of relationships in space-time (see Chapters 7 and 8). The above comments provide us with one explanation of the expansion method’s popularity in geography. The discipline, like other social sciences, is concerned with variation. Yet because geography analyzes spatial variations (patterns) and their causes (processes), it is particularly sensitive to contextual variability. For geography, the expansion method contributes an additional question to the fundamental pattern-process problematic, namely, is there a pattern to the process? In other words, is the pattern-process nexus itself contextually mutable? This question—which represents a second tier of inquiry— encourages researchers to question the geography of pattern-process relationships. In other words, it implores us not to be satisfied by the estimation of functional relations, but instead to ascertain how, where, when, and why these relations vary from context to context. Such issues have been raised in other disciplines,1 yet they are especially meaningful in geography, which is arguably the most context-dependent social science. Seldom are analytical approaches so closely matched to an established disciplinary agenda. The paradigmatic aspects of the expansion method are not restricted to the redefinition of the parameters of functional relations by contextual variables, however. It also carries a broader message, i.e. that research should be ‘opened’, or continually reassessed for truth value. An open research agenda actively questions the answers that are obtained at every stage of the research process. It never leaves closed questions regarding the applicability of hypotheses, the universality of models and relationships, or the generality of conclusions. It suggests an interrogation of conceptual premises vis-à-vis empirical settings and thus views conclusions as contextually dependent instead of final. By insisting upon open research procedures, the paradigm disavows the notion of research as a truth finding mission. The quest for laws or universal models is not consistent with open research. Instead, the expansion method poses research as an interactive process in which answers to questions lead to new questions, and new answers to still newer questions. The expansion method can be conceptualized as an umbrella of whats, whens, wheres, and whys that query every round of the question-answer nexus. The result is a paradigm in which explanation, to the extent that it can be achieved, depends upon context. As a consequence of the above, the expansion method is incompatible with a conception of the world as an abstract, nontextured, ahistoric, and aspatial
J.P.JONES, III 35
isotropic plane. Rather, the world, when viewed from the vantage point of the expansion method, is a complex setting which is temporally and spatially diverse and laden with contextual layers. From an analytical standpoint this diversity may be problematic since contextual information tends to be redundant, overdetermined, or analytically inseparable, but this does not inhibit the asking of questions. Identifying complexity is not an end in itself, however. It does not suggest that we revert to a Passargian search which makes sense of contexts by mapping them one at a time. Instead, context is always fundamentally keyed to theory, to process, and to substantively important questions. This has rather dire consequences for research undertaken in some quarters of the social sciences. It means, for example, that we can no longer treat social science models as expressions of universals, but instead must treat them as mathematical or descriptive portraits of empirical regularities or of subsystems which, if the last twenty years of its use are an indication (see Chapter 2), vary quite frequently across contexts. It also implies limitations on the extent to which the social sciences can or should mirror the procedures and goals of the physical sciences. Instead of seeing social science laws as long-term goals which await the further development of theory, the expansion method views laws themselves as unexamined assumptions about how the world is ordered. An example may clarify this claim. A great deal of empirical research has been concerned with estimating distance decay parameters (for a review see Sheppard 1984). Many studies have revealed systematic spatial variation in the parameter. This has led some (e.g. Fotheringham 1984) to suggest that the models employed to estimate distance decay parameters have been misspecified. If correctly specified, the reasoning follows, distance decay would attain a constant value across space. The expansion method intersects with this line of research at two levels. At one level, it suggests that parameter stability must be proven rather than assumed; thus it is sympathetic to research which questions the spatial stability of the parameter in particular research contexts. At another, more conceptual level, however, the paradigm challenges the very presupposition of order that underpins the search for a constant parameter. It suggests that spatial context cannot be ‘equated’ or ‘controlled for’ through ceteris paribus conditionality. The search for a constant parameter becomes futile as a result (Eldridge and Jones 1991). To summarize, the examination of the contextual variation of relationships in an open and interactive research program is consistent with a complex reality. Those whose goal it is to search for immutable processes might argue that such complexity can be overlooked or couched within distinctions between ‘general’ and ‘particular’. In contrast, the expansion method makes the relationship between explanation and complexity the central question. This does not imply that research merely celebrates the unique, nor does it suggest that tendencies do not exist or that we must reject the modeling enterprise entirely. It requires instead that we open our questioning to the varied contextual settings in which
36 PARADIGMATIC DIMENSIONS OF THE EXPANSION METHOD
research is undertaken. In the research process we cannot afford to stop at ‘initial models’—whether they be mathematical, empirically derived, or qualitative. These assertions have several important consequences. First, the method affirms a strong role for geography in the academic division of labor. Economists are often criticized for developing models ‘on the head of a pin’. Much the same criticism can be leveled against other social sciences that do not share geography’s concern for context. By arguing the significance of a textured ‘real world’ upon the research process, and by providing a methodology for the identification of parameter variation, the expansion method is uniquely positioned to contextualize the theories and models of its sister disciplines. This opens considerable room for cross-disciplinary interaction as the general statements of one discipline are molded to particular settings, which are of course the primary concern of geography. The aspatial and de-contextualized models of other social sciences can be improved and modified as a result, while at the same time geography’s traditional concern for the effects of context is placed within a strong theoretical frame of reference. Substantive areas in which this claim is demonstrated include Verdoorn’s law (Casetti and Jones 1983), the sectoral allocation of labor in development (this volume, Chapter 11), the estimation of production functions (Chapter 12), and human capital theory (Chapter 6). Second, application of the paradigm has consequences for the development of theory in the social sciences. It has already proven its potential for influencing the trajectories of research agendas through its ability to open up new lines of inquiry where tired and worn questions and answers now prevail. Questions of the how and why variability of hypotheses, theories, and relationships, a fundamental part of the research process under the expansion method, is absent in a surprising number of literatures. Many such areas have been invigorated and enhanced through its application, as new angles are suggested and fresh connections are made with previously unrelated conceptual frameworks. The current volume provides many such examples, ranging from agricultural location theory (Chapter 12), migration (Chapter 6), development (Chapters 2 and 11), and social policy (Chapters 4 and 5), to urban systems theory (Chapters 9 and 10) and spatial demography (Chapter 7). A further theoretical implication involves the role of the method in the evaluation of competing theoretical frameworks. Many multivariate analyses are carried out in an effort to assess the ‘correctness’ of alternative frameworks. Variables consistent with theoretical propositions are measured and evaluated vis-à-vis one another in an effort to establish theoretical ascendancy. The expansion method suggests that this type of research may be asking limited, if not incorrect, questions. If conceptual frameworks are found to be context dependent, then the search for ‘correct’ positions becomes futile. An example illustrating this point may be found in the work of Jones and Kodras (1986). They evaluated two perspectives on welfare participation growth, one focusing on the supply of welfare programs (e.g. growth in benefit levels) and one
J.P.JONES, III 37
focusing on the demand for support (e.g. growth in unemployment). In previous work of this type, the issue had been demand versus supply. The contextual expansions carried out in their paper revealed that the demand perspective better characterized the growth of participation in the northeast and north central states, while the supply perspective was more suited to explanation in the south and west. Emerging from their analyses was a concern for the contextual validity of the competing theoretical frameworks, rather than an explicit rejection of either perspective. It should be emphasized that, although the expansion method has primarily been employed to investigate analytical models such as production functions, its paradigmatic message is equally valid for theoretical statements of a purely descriptive nature. Such qualitative models are equally likely to be posed as singular versions of reality without sufficient recognition of contextual dependence. Examples include society versus state-centered theories of the state (Clark and Dear 1984) and the various models of decisionmaking in management science (March 1978). Although sometimes lacking the formal structures that would enable a researcher to test their validity in a quantitative sense, such frameworks are nevertheless sufficiently rigorous to invite empirical investigation. In many cases, however, researchers may be content to assess the validity of such models in a particular research context, without explicit recognition that within that context there may be subcontexts that justifiably call for the type of critical examination undertaken in the expansion method. In assessing society versus state-centered theories of the state, for example, we might well be advised to examine whether evidence in one context points toward one theoretical framework, whereas an opposing perspective is suggested in a different context. In models of decisionmaking the same logic applies: what rational argument can be marshaled to support a research agenda aimed at the ‘discovery’ of a correct theory of decisionmaking? In spite of the fact that such models do not readily lend themselves to functional relations, the paradigm may still be employed as a guide to open research. Finally, the expansion method has implications for questions concerning the appropriate scale of empirical research. An issue of paramount importance in the social sciences is whether research should be directed toward the investigation of the actions of individuals or instead upon the aggregate characteristics, or macro level structures, embodied in societies, institutions, and places (Alexander et al. 1987). In geography this issue is central to the distinction between behavioral research at the individual level, on the one hand, and spatial analyses employing geographic aggregates, on the other hand (Golledge et al. 1972; Bunting and Guelke 1979). In recent years some geographers (e.g. Gregory 1981; Thrift 1983; Pred 1984) have attempted to overcome this dichotomy by adopting a spatialized version of Gidden’s (1984) structuration theory. Giddens situates human agency within larger societal structures which not only enable and constrain agency but are at the same time transformed by it. As a result, the
38 PARADIGMATIC DIMENSIONS OF THE EXPANSION METHOD
distinction between the micro and macro levels is overcome, but in such a way as to preserve the tension between both. The expansion method can be employed in such a way as similarly to forge a linkage between micro and macro level research. We can, for example, contextualize most micro level models by expanding their parameters in terms of the contextual settings in which the behavior under investigation takes place. Operationally, this involves merging micro level data with information pertinent to the context experienced by the individual. It is apparent that this approach is consistent with at least part of Giddens’ structure-agency dialectic, namely, that individuals are both enabled and constrained by structures. To illustrate, assume that we adopt human capital theory as the basis for a model of migration that will be estimated with individual level data. The expansion method suggests that we raise questions concerning the variability of parameters for individual characteristics such as education, age, sex, race, etc. That is, are these effects contingent upon space, time, or other contextual characteristics? Identifying how the parameters of individual level variables vary with respect to such contexts not only informs us about the effect of context upon migration (as Ellis and Odland demonstrate in Chapter 6), it provides at the same time a critique of an overly voluntaristic human capital theory. THE REGIONAL GEOGRAPHY QUESTION The debate between the regional and systematic schools in geography has a long and somewhat tendentious history (James and Martin 1981; Johnston 1987; Entrikin and Brunn 1989). The essential outlines of these two perspectives can be described as follows. Regional geography directs us to examine the interrelationships of various earth features, both physical and man-made, that exist within a geographically defined area. The end product of a regional analysis is a coherent and organized description (Hart 1982) and understanding (Hartshorne 1939, 1959) of the area under investigation. In contrast, systematic geography focuses on the spatial aspects of a small number of phenomena, with a greater concern for understanding their spatial variability. Thus, while regional geography takes as an operating framework the constellation of interrelationships existing within a specific place, systematic geography focuses upon the phenomena per se, without regional geography’s concern for providing accurate descriptions and analyses of a particular part of the earth’s surface. Hartshorne (1959) resolved the tension between regional and systematic geography by placing them on a continuum, in which, he suggested, one could examine a single phenomenon across the entire surface of the earth at one extreme, or examine all interrelated phenomena in a very small area at the other extreme. Along this continuum one could move to include more areas but examine fewer phenomena, or, alternatively, examine more interrelationships among phenomena but with less geographic coverage.
J.P.JONES, III 39
However compelling this compromise may seem, it did not resolve the tensions between the competing schools. The regional perspective grew in disfavor during the late 1950s and 1960s as the discipline underwent a quantitative revolution (Burton 1963) that favored systematic analyses over regional ones (Taaffe 1974; Amedeo and Golledge 1975). One dimension of this disciplinary debate concerned the failure, in the opinion of systematic geographers, of regional geography to provide lasting statements of a causal scientific nature (Shaeffer 1953; Gould 1979). Regional geography was accused of being ‘mere’ description, a term denigrating decades of careful mapping of innumerable phenomena. On the other hand, members of the regional school recounted that systematic geographers were ignorant of the ‘craft’ of the discipline, that they were more concerned with mathematical and statistical models than with real places, and that such research would not lead to the development of geographic laws (e.g. Hart 1982). Interconnected with the debates between regional and systematic geography is the idiographic/nomothetic dualism in human geography. The idiographic views places as fundamentally unique and incapable of being related by general laws and theories, while the nomothetic searches for such laws. Characteristic of research in the former are descriptions of spatial variations that celebrate the unique aspects of the phenomena or region under investigation. Nomothetic research, on the other hand, tends to view the spatial variation of all phenomena as causally related to a set of general processes. In this view, any failure to provide accurate explanations of the real world is related to the current state of theoretical development in the discipline, not to a limitation in the ultimate purpose of general explanation. During the 1960s, regional geographers were accused of being idiographic by systematic geographers who adopted Shaeffer’s (1953) definition of geography—the development of ‘spatial laws’. The expansion method can be used to recast this debate, albeit in a different form from that favored by Hartshorne. It does so by initially aligning itself with those who favor systematic analyses of spatial patterns. At the same time, however, it is consistent with regional geography in that it suggests that researchers would be foolish to ignore the contexts within which the examination of these phenomena takes place. The paradigm does not favor adopting a middle ground; rather, it accepts as a first stage of the research process the empirical assessment of hypotheses emanating from theoretically meaningful propositions. Beyond this, however, it accepts the premises of the regional school, i.e. that the world is composed of diverse layers of phenomena interrelated in such a way as to provide infinite variability and uniqueness. This variation renders impossible the identification of meaningful general statements of a global nature. However, at the same time the expansion method suggests that such variation may have a systematic quality in its effect upon general processes. This is in fact what is accomplished by modifying the parameters of an initial model in an effort to identify regularity in the impact of contexts which are themselves unique. Thus, the expansion method accepts first the identification of causal processes, but, in
40 PARADIGMATIC DIMENSIONS OF THE EXPANSION METHOD
recognition of the limitations of lawseeking enterprises imposed by the diversity of the real world, seeks second to identify the variation in general processes resulting from such contexts, while, third, maintaining a concern for the orderly investigation of the mechanisms governing the variation in the general processes. As a consequence, the issue is no longer the correctness of either side of the idiographic/nomothetic dualism, but rather whether one can identify regularities in the effects that contexts have upon general processes. The paradigm is thus consistent with neither of the perspectives as described above. Instead, it views the debates in fundamentally new ways, with general processes and explanations as starting points, their modifications in certain circumstances an inevitable result of the structure of the real world, and the systematic identification of these modifications possible through the analytical procedures of the method. The above paragraphs have sketched the debate largely in terms of polar oppositions in empirical research. The programmatic statements of some regional geographers do show considerable sympathy with the expansion method paradigm, however. In fact, a close reading of these programmatic statements suggests that regional geographers had the expansion method’s paradigmatic structure in mind as the definition of the discipline from the outset. To justify this claim one need only investigate the works of two prominent regional geographers, Preston James and Richard Hartshorne. First, it would be wise to establish that Hartshorne, the acknowledged leader of the regional school, favored scientific work over idiographic treatises. Following Hettner, he writes: scientific advance in geography depends on the development of generic concepts and the establishment and application of principles of generic relationships. (Hartshorne 1959:160) But concerning the possibility of determining general laws, he states: The application of any general principle to a particular case depends on generic concepts which fit the particular case only approximately. The attainment of the maximum degree of accuracy requires determination of the degree to which the particular conditions depart from the ‘norm’ represented in each generic concept involved, and the consequences, in the process relationship, of those minor differences. (Hartshorne 1959:158) Hartshorne is not content, however, to speak simply of anomalies (i.e. residuals) from a general norm due to place uniqueness. Instead, he recognizes that one reason for deviations from the general case is that relationships in one area may differ from those in another. The investigation of spatial variations in relationships, or ‘areal patterns of covariance’, as Hartshorne framed it, was to be
J.P.JONES, III 41
carried out by ‘comparative regional geography’, a term which never gained much favor in geographic writing. Operationally, comparative regional geography called for controlling variability in process relationships by dividing the world into regions distinguished not on the basis of one or more variables, but on the basis of one or more constant relationships: The purpose in dividing the area is to secure areal sections, or ‘regions,’ such that within each region the elements… under study will demonstrate nearly constant interrelations. (Hartshorne 1959:129, emphasis added) Hartshorne felt a need to construct regions on the basis of constancy in relationships because, in his words: In any part of the world in which man is included, we can be sure before we start…that we will not find the same integration [interrelation] repeated in areas of different culture, or in areas of different climate… (Hartshorne 1959:128, brackets added) Hartshorne’s comparative regional geography was thus an attempt to control for instability in relationships among variables so that investigations would not be contaminated by shifting parameters. The message is that context matters and that geography is not only the study of place variations but also the study of different integrations and interrelations over space. James, writing before the appearance of the Shaeffer paper, expresses the issue even more concisely. He states that geography contributes to an understanding of the operation of processes in particular places. It focuses attention on the modifications in the operation of processes by the other things that are not equal, by noting the actual operation of processes in particular places modified by the presence of the other things unsystematically associated there. (James 1952:222) This sentiment, that processes are modified by other factors, and that these can be clarified by examining them within particular places, is a central notion in the expansion method. A more contemporary perspective on the relationship between geography and the expansion method can be found in Taaffe’s (1974) effort to integrate the regional and systematic schools. He urged scientific geographers to develop general laws and theories that could then be examined for their place-to-place applicability by regional geographers. In other words, he saw regional and systematic inquiry as mutually reinforcing and beneficial. Who better to
42 PARADIGMATIC DIMENSIONS OF THE EXPANSION METHOD
contextualize general models and theories of systematic geographers than regional geographers with an intimate knowledge of places? The issues surrounding debates between regional/idiographic and systematic/ nomothetic geography continue to attract the attention of contemporary writers, though in a form and language somewhat different than past accounts. Specifically, they have arisen within the context of the ‘new regional geography’ (Pudup 1988) and ‘localities research’ (Duncan 1989). Both are ventures which have spatial differentiation as the central theoretical question. They arose in response to an overly functionalist radical social science which treated space more as a setting or stage for social processes than as an integral component of their operation. The theoretical background for rethinking the role of space and society derives from Soja’s work on the socio-spatial dialectic (1980, 1989), Harvey’s (1982) effort to integrate space into an understanding of capitalism, and Massey’s (1984) work on spatial divisions of labor. The claim that ‘space matters’ in social processes has led geographers and sociologists to undertake both theoretical work (e.g. Gregory and Urry 1985) and empirical work (e.g. Murgatroyd et al. 1985) which addresses how local social circumstances (e.g. of class, politics, gender, and race) influence the operation of larger social processes (e.g. the international division of labor). In addition, claims have been made on behalf of a constellation of purely local causal processes that define unique localities (Duncan 1989). These developments raise questions of the general and the particular, the idiographic and the nomothetic, and the status of place in theoretical geography. Smith (1987), for example, has sounded a cautionary note against the ‘empirical turn’ of localities research, a trend that would return geography to descriptive, atheoretical accounts of local variations and responses. One consensus, however, is that localities research should seek to investigate how large-scale social processes are affected by the constellation of circumstances in particular places. Duncan (1989) argues that identifying such ‘spatial contingencies’ is an important dimension of localities research. Clearly these issues intersect with themes that are central to the expansion method. In particular, there is a correspondence between the paradigm’s concern for the ways in which processes are molded to particular circumstances in local areas (Jones 1984), and locality research’s interest in the spatial specificity of social processes (Duncan 1989). Why these connections have not been previously illuminated, however, is another question. I believe this results from a disjuncture in methodologies rather than conceptual frameworks. Localities researchers have tended, for various reasons, to adopt a realist methodology in their empirical investigations, while expansion method applications are most often associated with modeling endeavors. In the next section I explore the interconnections between expansion and realist methods with an eye toward synthesizing some of the communalities and differences between these two approaches.
J.P.JONES, III 43
THE EXPANSION METHOD AND THE REALIST CRITIQUE OF SCIENCE In recent years spatial science has come under increasing attack from those who have adopted realist methodologies (Bhaskar 1978; Sayer 1984). Realism shares with traditional social scientific methods a concern for explanation, but it diverges considerably from them in terms of how that explanation is derived. First, realism offers a critique of extensive research methods that rely upon the identification of common properties and general patterns to establish causal relations. It argues that extensive methodologies fail to tell us what underlying processes have produced the co-extensive patterns under investigation. It maintains that these processes can only be ascertained by intensive research which places in focus a concern for how causal processes work out in a particular case or in a limited number of cases (Sayer 1984, 1985). This requires that the researcher engage in a series of abstractions which ultimately lead to the identification of various necessary and contingent relations that are specific to the case or cases under investigation. Necessary relations are associated with a domain of real structures whose causal powers account for, but are separate from, the empirical events they produce. Interceding between these generative mechanisms and the empirical level of events are a host of contingent relations that give form to concrete events. The purpose of realism is to identify the ordering and operation of necessary and contingent relations. Sayer (1985) offers an example of realist methodology in the area of housing. A necessary relation significant to understanding the rental market is the existence of private property relations. In the absence of this necessary relation the landlord-tenant distinction would not be possible. The degree of conflict between landlord and tenant, however, may be affected by a host of contingent circumstances that can only be ascertained for a set of concrete circumstances. Examples of such relations might include the role of the local state in protecting renters or the race/gender/class position of the landlord and tenant. Intensive research seeks to identify how such relations impinge upon the particular circumstances under investigation. Although the generality of such research is necessarily limited by the case study approach, realists argue that such methods are necessary to make any theoretical claims, since the world is seen by realists to be an open system by virtue of contingent relations which may or may not have effects in particular circumstances. Sayer (1984) argues that closed systems are only encountered in laboratory situations, and that it is only in such contexts that regularities (or laws) can be identified, since the mechanisms under investigation are invariant and the relationships between the mechanisms and the conditions in which they occur are constant. In social systems, which are open, this is not the case, so actual events must be explained by the interplay of contingently related conditions. Herein lies the basis of the realist claim that modelling regularities tell us nothing about causation.
44 PARADIGMATIC DIMENSIONS OF THE EXPANSION METHOD
Irrespective of the criticisms that realists offer for traditional scientific methods, there are clearly strong similarities between the world view of the expansion method and realist methodology. In the language of the realist, the expansion method prompts us to engage in investigations of the varying mechanisms of fundamentally open systems. It shares realism’s view of the world as differentiated and stratified. And like realism, the expansion method is concerned with the investigation of contingent effects. Realist analyses of contingencies are presently carried out in detailed case studies, but some empirical work does bear a strong resemblance to the variation in outcomes based on context which is central to expansion method. These messages conjoin in an article by Mark-Lawson et al. (1985). The authors examine the effect of variations in Labour Party strength upon local social welfare provision in the UK. They uncover a contingency which determines whether the Labour Party’s state apparatus gets translated into increased sovial welfare—specifically, they find that gender equality in the workplace results in a stronger political effect. This article specifies hypotheses in nearly the same terminology that one might use in writing an expansion method paper, though it is firmly rooted in the realist tradition. CONCLUSION Science, Whitehead wrote, ‘is to see what is general in what is particular’ (quoted in Gould 1979). The expansion method adds to Whitehead’s homily the following: science should equally not ignore ‘what is particular in what is general’. Thus, the paradigm turns our attention to the contextual specificity of general social scientific statements. It not only leads us to question such statements, but it also provides a means by which such questions may be answered. It is within this spirit that the expansion method was originally proposed. A far more radical interpretation of the paradigm remains to be written, however. In particular, the expansion method is not inconsistent with a world view which argues that there are no ‘general processes’. Social scientific abstraction, however useful it may be as a heuristic device, is still abstraction. In the ‘real world’, which is, after all, what social science aims to understand, processes operate in space-time contexts. While such processes may be generalizable, they need not be conceptualized as general, at least not in the sense of operating in a state of suspension over space-time contexts. Accepting the embeddedness of processes in context does not, however, require that we abandon the aims of the social disciplines qua science. It will require, however, that these fields re-conceptualize the nature of context in the construction of their models and in their procedures of abstraction. For these endeavors, the expansion method will remain an indispensable component of social scientific methodology.
J.P.JONES, III 45
NOTE 1 Consider, for example, the following warning by political scientists Forbes and Tufte concerning the perils of cross-sectional research: different units can have different causal processes’ (1968:1261).
REFERENCES Alexander, J., Giesen, B., Munch, R. and Smelser, N. (1987) The Micro-Macro Link, Berkeley, CA: University of California Press. Amedeo, D. and Golledge, R.G. (1975) An Introduction to Scientific Reasoning in Geography, New York: Wiley. Bhaskar, R. (1978) A Realist Theory of Science, Brighton: Harvester. Bunting, T.E. and Guelke, E. (1979) ‘Behavioral and perception geography: a critical appraisal’, Annals, Association of American Geographers 69:448–62. Burton, I. (1963) ‘The quantitative revolution and theoretical geography’, Canadian Geographer 7:151–62. Casetti, E. (1972) ‘Generating models by the expansion method: applications to geographical research’, Geographical Analysis 4: 81–91. ——(1982) ‘Mathematical modeling and the expansion method’, in R.B.Mandel (ed.) Statistics for Geographers and Social Scientists, pp. 81–95, New Delhi: Concept Publishing. ——(1986) ‘The dual expansion method: an application for evaluating the effects of population growth on development’, IEEE Transactions on Systems, Man, and Cybernetics SMC–16:29– 39. Casetti, E. and Jones, J.P. (1983) ‘Regional shifts in the manufacturing productivity response to output growth: sunbelt vs. snowbelt’, Urban Geography 4:285–301. Clark, G. and Dear, M. (1984) State Apparatus: Structures and Language of Legitimacy, Boston, MA: Allen & Unwin. Duncan, S. (1989) ‘What is a locality?’, in R.Peet and N.Thrift (eds) New Models in Geography, vol. 2, pp. 221–52, London: Unwin Hyman. Eldridge, J.D. and Jones, J.P. (1991) ‘Warped space: toward a geography of distancedecay’, Professional Geographer 41 (in press). Entrikin, J.N. and Brunn, S. (eds) (1989) Reflections on Richard Hartshorne’s The Nature of Geography, Washington, DC: Association of American Geographers. Forbes, H. and Tufte, E. (1968) ‘A note of caution in causal modeling’, American Political Science Review 62:1258–64. Fotheringham, S. (1984) ‘Spatial flows and spatial patterns’, Environment and Planning A 16:529–43. Giddens, A. (1984) The Constitution of Society, Oxford: Polity Press. Golledge, R., Brown, L.A. and Williamson, F. (1972) ‘Behavioral approaches in geography: an overview’, Australian Geographer 12:59–79. Gould, P. (1979) ‘Geography 1957–77: the Augean period’, Annals, Association of American Geographers 69:139–51. Gregory, D. (1981) ‘Human agency and human geography’, Transactions, Institute of British Geographers NS6:1–18.
46 PARADIGMATIC DIMENSIONS OF THE EXPANSION METHOD
Gregory, D. and Urry, J. (eds) (1985) Social Relations and Spatial Structures, New York: St Martin’s Press. Hart, J.F. (1982) ‘The highest form of a geographer’s art’, Annals, Association of American Geographers 72:1–29. Hartshorne, R. (1939) The Nature of Geography: A Critical Survey of Current Thought in Light of the Past, Lancaster, PA: Association of American Geographers. ——(1959) Perspectives on the Nature of Geography, Washington, DC: Association of American Geographers. Harvey, D.W. (1982) Limits to Capital, Chicago, IL: University of Chicago Press. James, P.E. (1952) ‘Towards a fuller understanding of the regional concept’, Annals, Association of American Geographers 42:195– 222. James, P.E. and Martin, G.J. (1981) All Possible Worlds: A History of Geographic Ideas, 2nd edn, New York: Wiley. Johnston, R.J. (1987) Geography and Geographers, 3rd edn. London: Edward Arnold. Jones, J.P. (1984) ‘A spatially-varying parameter model of AFDC participation: empirical analyses using the expansion method’, Professional Geographer 36:455–61. Jones, J.P. and Kodras, J.E. (1986) ‘The policy context of the welfare debate’, Environment and Planning A 18:63–72. March, J.G. (1978) ‘Bounded rationality, ambiguity, and the engineering of choice’, Bell Journal of Economics 9:587–608. Mark-Lawson, J., Savage, M. and Warde, A. (1985) ‘Gender and local politics: struggles over welfare policies, 1918–1939’, in L. Murgatroyd et al. (eds) Localities, Class, and Gender, pp. 195– 215, London: Pion. Massey, D. (1984) Spatial Divisions of Labor: Social Structures and the Geography of Production, London: Methuen. Murgatroyd, L. et al. (eds) (1985) Localities, Class, and Gender, London: Pion. Pred, A.R. (1984) ‘Place as historically contingent process: structuration and the timegeography of becoming places’, Annals, Association of American Geographers 74: 279–97. Pudup, M.B. (1988) ‘Arguments within regional geography’, Progress in Human Geography 12:369–90. Sayer, A. (1984) Method in Social Science: A Realist Approach, London: Hutchinson. ——(1985) ‘Realism and geography’, in R.J.Johnston (ed.) The Future of Geography, pp. 159–73, London: Methuen. Shaeffer, F.K. (1953) ‘Exceptionalism in geography: a methodological examination’, Annals, Association of American Geographers 43:226–49. Sheppard, E. (1984) ‘The distance-decay gravity model debate’, in G.L.Gaile and C.J.Wilmott (eds) Spatial Statistics and Models, Boston, MA: Reidel. Smith, N. (1987) ‘Dangers of the empirical turn: some comments on the CURS initiative’, Antipode 19:59–68. Soja, E. (1980) ‘The socio-spatial dialectic’, Annals, Association of American Geographers 70:207–25. ——(1989) Postmodern Geographies, London: Verso. Taaffe, E.J. (1974) ‘The spatial view in context’, Annals, Association of American Geographers 64:1–16. Thrift, N.J. (1983) ‘On the determination of social action in space and time’, Environment and Planning D: Society and Space 1:23– 57.
4 A CONTEXTUAL EXPANSION OF THE WELFARE MODEL Janet E.Kodras
Nomothetic and idiographic approaches to social science research have traditionally been viewed as antagonistic. The former seeks to develop generalizations about human conduct, culling out idiosyncrasies of behavior and thought, and abstracting to the level of theoretical understanding; the latter rejects the representation of human action in terms of theory and seeks instead to probe deeply into the single place and event. In response to this tension and the acknowledged strengths and weaknesses of each approach, a number of researchers have begun the search for a middle ground. Working from the nomothetic toward the idiographic, they seek ‘more flexible, extroverted, and combinatorial theorizations… theorizations that do not dogmatically project themselves onto the empirical world but instead are informed and open to diversity, uniqueness, and conjuncture—the distinctiveness of time and place, event and locality’ (Soja 1987:293). If this middle ground can be found, the ‘loosening’ of theories, such that their terms can vary across specific contexts, will represent a maturing of social science thought (Taaffe and Casetti 1990). This new formulation thus holds promise for researchers across the spectrum, and a fervent debate has arisen as to how we might best proceed. Most problematical is the issue of how empirical research can be conducted, given that the methodologies used in nomothetic and idiographic research differ fundamentally. The purpose of this paper is to demonstrate that the expansion method can contribute to this new research agenda, since it is a means for empirically testing how relationships posited by theory can vary according to the contingencies of place. This point is made by example. The first section reviews labor-leisure theory, the fundamental model underlying the design and implementation of US welfare policy, as well as popular attitudes toward it. Briefly, welfare assistance, by offering an alternative to employment, is seen to reduce the motivation to work, which decreases the labor supply and lowers output. Based on this argument, public assistance programs are designed such that they minimize competition with the private sector. The second section describes a set of geographically-specific conditions, ignored by labor-leisure theory, which are argued to have an effect upon the decision between work and welfare.
48 J.E.KODRAS
Specifically, in places with insufficient labor market opportunities, poverty households are effectively prevented from choosing employment no matter how strong their desire to work. Where welfare practices are particularly restrictive, or the cultural stigma attached to the use of welfare is great, the decision to choose public assistance is inhibited. It is argued that the decision between work and welfare is not free-choice, nor is it aspatial, as assumed by theory. In the third section, the expansion method is used to test for variations in the work-disincentive effect of welfare across labor market and policy settings. Such variations are found, and in the following section they are interpreted in the context of regional political cultures, private sector influences, and state fiscal conditions. A concluding section makes the point that labor-leisure theory is overaggregated and underspecified, ignoring the complexities of the locale in which the decision between work and welfare is made. As a result, policy based upon it is misdirected and popular attitudes toward welfare are distorted. The flexible testing procedure of the expansion method allows these complexities to be brought to light. THE THEORY Economic theory represents the decision between work and welfare as a special case of the labor-leisure tradeoff (see Brehm and Saving (1964) for a formal graphical presentation of the labor-leisure model applied to welfare choice). Assuming that the marginal utility of labor is negative and that of leisure is positive, some utility-maximizing individuals will opt for leisure over labor, so long as income support programs guarantee a minimum income level. In the classic Economics of Welfare (1952:728), Pigou describes this process: this type of transference is involved in all Poor Law systems that fix a state of minimum fortune below which they will not allow any citizen to fall. For, in so far as they raise to this level the real income of all citizens whose provision for themselves falls below it, they implicitly promise that any reduction in private provision shall be made good by an equivalent addition to state provision. It is plain that the expectation of these differential transferences will greatly weaken the motive of many poor persons to make provision for themselves. In particular, two financial aspects of welfare programs influence work effort: the guarantee and the marginal tax rate (Masters and Garfinkel 1977; Danziger et al. 1980). The guarantee, which varies by family size, is the payment to a family with no other income. The marginal tax rate is the percentage by which welfare payments decline as earnings increase. For example, a tax rate of 60 percent means that benefits are reduced by 60¢ for each additional dollar earned. Guarantees and tax rates are positive for most welfare programs, including Aid to
A CONTEXTUAL EXPANSION OF THE WELFARE MODEL 49
Families with Dependent Children (AFDC), the program examined here. Thus, benefits are highest for families with no income and fall as income rises. Positive guarantees and tax rates both reduce work effort. The guarantee offers an alternative to employment and, the higher the guarantee is, the more attractive is the prospect of not working. The tax rate effectively reduces the wage rate by which the worker is rewarded (Danziger et al. 1980). Financial analyst Louis Rukeyser demonstrates the impact of the tax rate on the decision to work: ‘A typical family of four living in Los Angeles, with $4, 800 annual earnings will garner $810, 49 in net monthly spendable income if they use all available welfare assistance to which they are entitled. For each additional dollar earned, the family loses so many benefits that real income increases only 5.2 cents, an effective marginal tax rate of 94.8 percent. If family earnings double to $9, 600 a year, net spendable income would decline from $810.49 per month to $773.82’ (quoted in Weil 1978:48–9). Several researchers have sought to quantify the amount of work effort which is lost due to social assistance nationwide. Lampman (1978) estimates that the expansion of all social spending (including housing, education, manpower, and welfare programs) between 1950 and 1976 reduced hours worked by 7 percent more than if the system had not expanded. Danziger et al. (1980) find a work reduction of 3 percent attributable to increases in income support programs only. In addition, a number of studies, explicitly or implicitly derived from laborleisure tradeoff theory, support the contention that benefit levels influence the decision between work and welfare, which then affects the magnitude of public assistance rolls and the labor supply (Brehm and Saving 1964; Cain and Watts 1973; Spall and McGoughran 1974; Hamermesh 1977; Keeley et al. 1978a, 1978b; Bieker 1981; Menefee et al. 1981; Jones 1987). THE CONSTRAINTS Welfare payments which are sufficiently high to compete with earned income do indeed have the potential to reduce work motivation. But poverty is not simply the result of individual failure, the ‘inadequacy of human nature’ (Reagan 1968: 122), and welfare dependence is not merely the result of a free choice decision by the indolent in favor of leisure over labor. The choice between work and welfare is constrained by barriers to employment in the labor market and obstacles to assistance in the welfare system. First, the structure of the labor market makes work and self-sufficiency unattainable for many low-income Americans. That segment of the labor market available to the poor is often typified by high unemployment, seasonal jobs, discriminatory practices, and wage levels so low that even fulltime employment does not allow one to rise above the poverty level (Morrill and Wohlenberg 1971; Levitan and Johnson 1984). As the nation shifts from its traditional manufacturing base to a service-oriented economy, the nature of labor demand (its location, skill requirements, and wage structure) is rearranged. Whereas
50 J.E.KODRAS
manufacturing typically paid middle-income wages to blue collar labor, the expanding service sector is highly polarized, offering high compensation to a few but very low wages to the growing legions of cashiers, food service personnel, computer terminal operators, etc. (Rumberger 1981; Kuttner 1983). These structural shifts leave technologically displaced labor, the unskilled, and many new entrants to the labor force little option but to accept service positions with low wages, sporadic employment, and little opportunity for self-advancement (Howes and Markusen 1981; Seninger and Smeeding 1981; Kuttner 1983). Furthermore, these labor market constraints are spatially variable. Different areas of the country produce distinctive sets of economic opportunities, earnings potentials, and types of poverty as a result of their position in the core/periphery of the national space economy, their intersectoral patterns of employment concentration, and the attendant wage structures. Those areas with long agrarian histories and recent industrialization, resistance against unionization, and traditions of racial or gender discrimination, as well as the regions leading the current restructuring toward a service economy, may provide many jobs but at low compensation. In addition to these labor market factors which impede the decision to work, there exist many obstacles to choosing welfare. The popular image is that welfare is a nationwide public dole, available to all who choose not to work. Yet the great majority of all income maintenance programs in the USA are confined to the elderly and the disabled, who are not expected to work. In 1979, fully 70 percent of major maintenance program funds were allocated to OASDI (Old Age, Survivors, and Disability Insurance, commonly called Social Security), Medicare, Supplementary Security Income, and Veterans’ Compensation (Table 4.1). Programs available to the able-bodied, working age population are primarily limited to Unemployment Insurance (UI) and Workers’ Compensation, AFDC, General Assistance, and a set of basic need packages, such as food stamps, Medicaid, and housing assistance. UI provides temporary income substitution for the involuntarily unemployed. Table 4.1 Major income maintenance programs, 1979 Program
Expenditure (in $ billion)
Total Social insurance Cash benefits Old Age, Survivors, and Disability Insurance (OASDI) Unemployment Insurance Workers’ Compensation Veterans’ Pensions and Compensation In-kind benefits
252.7 194.2 131.7 11.3 11.5 10.6
A CONTEXTUAL EXPANSION OF THE WELFARE MODEL 51
Program
Expenditure (in $ billion)
Medicare Welfare Cash benefits Aid to Families with Dependent Children (AFDC) Supplementary Security Income (SSI) General Assistance In-kind benefits Medicaid Food stamps Housing assistance Source: US Bureau of the Census 1984: Tables 605 and 608
29.1 58.5 11.7 7.5 2.2 21.8 7.3 8.0
Because it requires previous attachment to the formal labor market, UI is unavailable to those with poor work histories or employment in the informal sector. As a result, many low-income Americans are excluded. General Assistance provides emergency cash to individuals in need. Funded entirely by the states, the program accounts for less than 1 percent of all income maintenance expenditures and, as such, cannot be regarded as a viable alternative to employment nationwide. Of greater importance are the Food Stamp Program, Medicaid, and housing assistance, which provide the basic needs of food, medical attention, and shelter to poverty households. Before the establishment of the Food Stamp Plan in 1964, no national program existed to assist low-income families headed by able-bodied, employable males (Masters and Garfinkel 1977: 7). Finally, AFDC is a very large and expensive program, which provides financial assistance to low-income families with children under 18 years. Approximately 98.5 percent of all recipient households are headed by females (US Social Security Administration 1982). When first established in 1935, the program was designed to assist women who had lost their husbands and needed to care for children in the home. As societal norms have shifted and program costs have risen, these women are increasingly expected to work. By providing an alternative to employment, AFDC is now regarded as a female work disincentive. In summary, the welfare options available to the working age population are far more limited than generally perceived. They are also spatially variable. The federalized public assistance system, which grants control of program operation largely to the states, ensures that variations in welfare provision will conform primarily to state boundaries. For example, AFDC is decentralized, and the states are responsible for setting eligibility criteria, benefit levels, and the location and operation of welfare offices. Because each state bears part of the financial responsibility, its legislature must devise a welfare budget in accordance with the
52 J.E.KODRAS
political priorities of public assistance, vis-à-vis other competing concerns, in that state. The state welfare bureaucracy is then responsible for program administration, within the budget constraints sent down from the legislature, and in accordance with local attitudes concerning the viability of public assistance. Thus, where there exists substantial local resistance against welfare, as, for instance, where the private sector requires a low-wage, dependent labor force, public assistance can be made most unattractive to potential recipients by understaffing welfare offices, creating administrative confusion, or explicit harassment (Elman 1966; Albin and Stein 1968; Piven and Cloward 1971, 1981; Masters and Garfinkel 1977). Federalism gives spatial form to the US public assistance system, allowing the states to modify welfare provision to their needs and political priorities. Considering the spatially varying barriers in the labor market and the welfare system together, it is evident that the ‘decision’ between work and welfare is strongly conditioned by the context in which one lives. As Albin and Stein (1968: 301) note: ‘The choice available to the actual or potential recipient is rarely more than a choice between a single, narrowly constrained relief option and the available market opportunities for wage income.’ Accordingly, a number of empirical studies have examined the effect of labor market and/or welfare constraints on the use of public assistance (Kasper 1968; Albin and Stein 1971; Winegarden 1973; Thrall 1981; Kodras 1982, 1984; Jones 1984a; Jones and Kodras 1984; Kodras 1984; Johnston 1990). AN EMPIRICAL TEST Based upon the two positions identified above, this study works from the perspective that welfare assistance can, and does, influence work motivation, but that the extent of this effect is conditioned by spatially varying opportunities in the labor market and in the public assistance system. Having discussed the general mechanisms by which welfare acts as a work disincentive, the study now turns to an empirical investigation, specific to AFDC. As noted above, AFDC is currently regarded as a work disincentive. Established by the Social Security Act of 1935, it has evolved into one of the largest, most costly, and most controversial forms of public assistance. It provides federal grants to states to partially defray the costs of providing financial assistance to needy children. Assistance is granted to households whose children are under the age of 18 and deprived of support because of the death, absence, or incapacity of a parent (Platky 1977). In addition, roughly one-half of the states elect to provide payments to households with an unemployed parent under AFDC-UP, an optional program passed by Congress in 1961. Although the federal government does provide guidelines, states are given considerable discretion in AFDC operation. Decentralized control has resulted in substantial interstate disparities in administrative restrictiveness (Wohlenberg 1976a), eligibility criteria (Chief 1979), benefit levels (Wohlenberg 1976b),
A CONTEXTUAL EXPANSION OF THE WELFARE MODEL 53
optional program riders (Hosek 1982), and the overall effectiveness of state programs (Wohlenberg 1976c). The analysis consists of a two-step procedure. An initial model applies laborleisure theory to AFDC. It specifies the response, or elasticity, of state level AFDC recipient rates to corresponding work-disincentive levels. The latter are measured as ratios of welfare payment to earned income which estimate the incentive to substitute welfare for work. This model represents the free-choice situation, as no constraints are introduced as controlling variables. In the second step, the initial model is expanded to allow the relationship to vary across different labor market and public assistance contexts. The importance of these constraints upon the work-disincentive effect is judged by parametric shifts in the relationships. The study does not specifically address the individual decisionmaking process between work and welfare. Rather, we examine the aggregate effect of such decisions upon welfare participation. It is the aggregate effect, the magnitude of program participation, which is the concern of welfare officials, private entrepreneurs, and the tax-paying public. State level aggregation is appropriate, since the states exert most control over program administration and regulations. Full definitions and sources of all data are presented in Appendix 4.1. Initial model The magnitude of state AFDC participation is measured here as the proportion of poverty families who are program recipients:1 PR is the number of AFDC recipient families divided by the number of poverty families. The workdisincentive effect, WD, is measured as the mean AFDC income of recipient families divided by the mean earned income of poverty families. Large values of WD indicate high welfare payments relative to that which is gained from employment, and therefore a strong economic motivation to substitute welfare for work.2 Large work disincentives, according to theory, should translate into greater welfare participation. Thus, we anticipate (4.1) The parameters of a double log specification were estimated by ordinary least squares (OLS) regression, using state level data for 1979. The results are as follows: (4.2) where t values indicating the strength of relationships appear in parentheses. The elasticity of the relationship is positive, as expected. As the welfare-to-work income ratio of poverty families increases by 10 percent, their participation in AFDC increases by 6.5 percent. Thus, we do see some evidence of a work
54 J.E.KODRAS
disincentive. Where states set AFDC benefits high relative to the earned income of the poor, the use of welfare is also high. The constraints Next, a set of labor market and welfare variables, hypothesized to have an effect upon the initial relationship, is examined. Labor market barriers, which are relevant to the predominantly female AFDC population, include state data on the following: 1 FUNEMP—the female unemployment rate; 2 NOJOB—the proportion of state employment which is in occupations other than retail trade, nonprofessional services, or nondurable manufacturing, jobs most open to the AFDC population (Bieker 1981); 3 WAGEDIFF—the ratio of male to female mean annual earnings;3 4 PFWORK—the proportion of female-headed households whose head is employed full time (35 + hours per week) and full year (50 + weeks per year) yet remain below the poverty level; 5 SEVERITY—the median dollar amount by which poverty families fall below the poverty level; 6 FEMPOV—the proportion of female-headed households below the poverty level. These variables are selected as labor market barriers because the decision to work is constrained by high unemployment, the unavailability of jobs for which one is qualified, wage discrimination, meager earnings, and the severity of poverty (in general and specific to the AFDC target group), respectively. Welfare constraints affecting AFDC participation include state data on the following: 1 NEEDS—the state welfare administration’s determination of the amount necessary to meet basic needs; 2 ADMIN—an index of state welfare administration leniency (Jones 1984b); 3 SERVICE—the ratio of state and local welfare employees to poverty families; 4 TRANSFER—the proportion of persons receiving public assistance who are classified poor excluding public assistance and nonpoor including public assistance. These variables represent program regulation (as evidenced by the needs standard and administrative practices), the ability of the welfare bureaucracy to serve potential demand, and the effectiveness of welfare in raising groups above the poverty level, respectively.
A CONTEXTUAL EXPANSION OF THE WELFARE MODEL 55
Factor analysis was used to group these labor market and welfare variables into contextual factors. There exist both methodological and conceptual rationales for this course of analysis. First, using a small number of factors in the terminal model presented below, rather than the nine variables separately, avoids problems of multicollinearity and ensures a lesser reduction in degrees of freedom. Second, factor analysis allows us to identify spatial patterns in the combinations of constraints rather than considering them as independent forces. As noted above, labor market opportunities vary throughout the country, because of different mixtures in economic base, historical patterns of labor compensation, and regional economic cycling. Given the flexibility of a federalized public assistance system, welfare may be spatially adjusted to these labor market conditions. Thus, different groupings of variables are possible. If liberal welfare provision is positively associated with substantial poverty, for example, welfare services are addressing the need for assistance. The analysis, using state data for 1979, resulted in three principal components attaining eigenvalues greater than 1.0; together they account for 65 percent of the total variation. Table 4.2 Varimax rotated factor matrix Variable
Factor 1
Factor 2
Factor 3
Proportion of variation accounted for by the factors
FUMEMP –0.030 0.716a –0.014 0.514 NOJOB 0.214 –0.118 0.813a 0.721 a WAGEDIFF –0.142 0.034 0.862 0.764 PFWORK –0.653a 0.545a –0.021 0.724 SEVERITY –0.755a 0.008 0.017 0.570 FEMPOV –0.412 0.752a 0.066 0.740 a NEEDS 0.830 –0.075 0.137 0.713 ADMIN –0.200 –0.616a 0.093 0.428 SERVICE 0.546a –0.439 0.151 0.514 TRANSFER 0.880a 0.118 –0.100 0.798 a Note: Factor loadings greater than 0.50 indicate that more than one-quarter of the variation in the variable is accounted for by the factor.
Upon varimax rotation they yielded the factors shown in Table 4.2. Note, from the rightmost column, that the majority of variables are well represented by the grouping procedure. Briefly, factor 1 represents conditions of little poverty and effective welfare provision; factor 2 reflects high female unemployment and poverty and restrictive public assistance; and factor 3 represents labor market constraints, such as the unavailability of jobs open to the AFDC population and female wage discrimination. Taken together, the three factors tend to group
56 J.E.KODRAS
favorable labor market conditions with liberal welfare provision and substantial poverty with restrictive public assistance. Thus, it does not appear that interstate variations in welfare provision are due to a process of matching services with needs. Considering each in detail, factor 1 exhibits strong negative loadings on the severity of poverty (SEVERITY) and the percentage of female-headed households who work full time yet remain below the poverty level (PFWORK). High positive loadings are shown for the effectiveness of transfer payments in raising families above the poverty level (TRANSFER), liberal need standards set by the state (NEEDS), and welfare bureaucracies which are sufficiently large to serve potential demand (SERVICE). Thus, the factor separates states with little poverty and effective welfare provision from those where poverty is substantial but public assistance restrictive. In general, New England states, the upper Midwest, and the west coast have high positive factor scores while the southeast exhibits strong negative scores (factor scores are listed in Appendix 4.2). Factor 2 shows high positive loadings for the percentage of female-headed households below the poverty level (FEMPOV), the female unemployment rate (FUNEMP), and the percentage of female-headed households who work full time yet remain below the poverty level (PFWORK) and a strong negative loading for leniency in AFDC administration (ADMIN). Thus, this factor represents conditions specific to female poverty and its association with welfare constraints. Factor scores are strongly positive throughout the south but negative in the upper Midwest and interior western states. Finally, factor 3 exhibits strong positive loadings on male-female wage differentials (WAGEDIFF) and the unavailability of jobs for which the majority of the AFDC population are qualified (NOJOB). These labor market constraints are reflected in positive factor scores in the industrial core and Midwest. Negative factor scores are found primarily in the deep south, where wages for males and females are both low, given four years of education, and where lowearning occupations are a greater proportion of all employment than is the case elsewhere. The terminal model To this point, the analysis has demonstrated that the states vary in AFDC use by the poverty population, in accordance with differences in the relative financial advantage of participating. Additionally, the states exhibit variable combinations of labor market and welfare barriers, which may affect the extent to which work disincentives are translated into program use. To test this proposition, the constraints are introduced into the model via the expansion method. Beginning with the initial model, (4.3)
A CONTEXTUAL EXPANSION OF THE WELFARE MODEL 57
the b coefficient represents the national average in the response of AFDC participation to work disincentives. To incorporate spatially varying conditions in the labor market and welfare system, b can be expanded from a constant to a function of the factor scores: (4.4) The factor scores measure the extent to which conditions identified by the factors exist in each state. Substituting (4.4) into (4.3) yields the terminal model: (4.5) Equivalently, (4.6) The magnitude and direction of the parameters have conceptual meaning. First, b0 represents the independent effect of work disincentives on program participation. Parameters b1 through b3 measure the interactive effect of work disincentives, under conditions imposed by the labor market and welfare system. Where substantial labor market constraints are in effect, we expect greater use of welfare, since employment is difficult to obtain or nonremunerative. Thus, for example, the parameter b3 should be positive, since factor score 3, which represents labor market constraints, should raise the effect of a given WD upon PR. Alternatively, where substantial welfare constraints are in effect, we expect less welfare participation response to a given work-disincentive level. The parameters of equation (4.6) were estimated using OLS stepwise regression. The stepwise procedure was used to build a model which retains only those variables accounting for significant variation. If b1 through b3 are not significant, and b0 retains its significance, the work-disincentive effect is unaffected by labor market and welfare conditions in the states (see Stonecash and Hayes (1981) for a discussion of the stepwise procedure and decision rules regarding significance for interactive models of this kind). Table 4.3 Results of the initial model ln PR=4.144+1.049 ln WD+0.290FS1 ln WD–0.203FS2 ln WD (85.921)
(8.310)
(2.582)
(–3.004)
+0.177FS3 ln WD (2.288) Step
Variable
R2 change
t-to-enter
1 ln WD 0.4697 6.520 2 FS1 ln WD 0.0661 2.587 3 FS2 ln WD 0.0513 –2.390 4 FS3 ln WD 0.0430 2.288 Note: t values appear in parentheses; R2 = 0.63; adj. R2=0.60.
Significance 0.000 0.013 0.021 0.027
58 J.E.KODRAS
Results are reported in Table 4.3. All variables contribute significantly to the model; the change in R2 is greater than 4 percent at each step and the t-to-enter is significant at the 0.03 level or better. Multicollinearity does not appear to be a problem, since coefficients remain stable as the model builds and zero-order correlations among independent variables are on the order of 0.3 or 0.4. Substituting the estimated parameters into equation (4.5) yields (4.7) The parameter b0 (1.049) represents the overall relationship between WD and PR. As was the case for the initial model, the parameter is positive. In this case, however, the work-disincentive effect on welfare use is modified by the labor market and welfare contexts, represented by the factor scores and their corresponding parameters. The interactive effect with factor score 1 is positive (0.290). Recall that states with positive factor scores on factor 1 have low poverty and generally effective welfare provision. In this context, the relationship between the work-to-welfare income ratio and AFDC use is accentuated, as represented by the positive parameter. For a given ratio of welfare-to-work income, AFDC use is relatively greater in states with little poverty and effective welfare. In other words, where welfare is sufficiently provided and the economic incentive to participate is great, the response is high welfare use. On the other hand, the relationship between AFDC use and the work-to-welfare ratio is diminished in states with substantial poverty and welfare constraints, as represented by their negative factor scores. The interaction with factor 2 is negative (–0.203). States with positive factor scores have substantial female poverty and unemployment, yet restrictive welfare. In this context, then, the work-disincentive effect on AFDC is diminished. For a given ratio of welfare to work income, there is relatively less welfare use, where there exist constraints imposed in the welfare system or substantial female poverty and unemployment. The interaction with factor 3 is positive (0.177). States with positive factor scores exhibit labor market constraints, such as wage discrimination and a small proportion of all jobs which are available to the majority of AFDC heads. In this context, the response of AFDC participation to work disincentives is accentuated. Where jobs are difficult to obtain or non-remunerative, welfare use is relatively high, for a given welfare-to-work income ratio. Individual state parameters, representing the sensitivity of AFDC use to work disincentives in the context of their labor market and welfare constraints, are calculated by substituting parameter estimates and state factor scores into equation (4.4): For example, Minnesota’s parameter is calculated as
A CONTEXTUAL EXPANSION OF THE WELFARE MODEL 59
Figure 4.1 Interstate variations in the work-disincentive effect, 1979. States in striped pattern have the strongest relationship between work disincentives and welfare participation. Southeastern states, in white, show the lowest response in state welfare use to work disincentives
(4.8)
Compare the magnitude of Minnesota’s work-disincentive effect with that of Mississippi: (4.9)
The state parameters are mapped in Figure 4.1 and listed in Appendix 4.2. The work-disincentive effect is greatest on the west coast, in the upper Midwest, and in lower New England. A given welfare-to-work income ratio results in greater AFDC use in these states, which tend to have negative factor scores on factor 2 and positive on factors 1 and 3. Therefore, welfare use response is high where there is relatively little poverty and welfare is sufficiently provided but labor market constraints are in effect. The group with the highest work-disincentive effect is not entirely as expected. To the extent that welfare critics take a spatial
60 J.E.KODRAS
perspective, the accusing finger points to the industrial northeast, particularly New York, New Jersey, Ohio, and Illinois, as the area where potential recipients are enticed into welfare dependence, and out of the labor market, with liberal program benefits. These are not the states, however, which show the greatest work-disincentive effects. The response of AFDC participation to the disincentive is lowest in the southeast, which has negative factor scores on factors 1 and 3 but positive scores on factor 2. Thus, in the south, where there is substantial poverty, restrictive welfare, and a greater proportion of jobs available to the AFDC population, welfare participation is low for a given welfare-to-work income ratio. Finally, it is informative to compare situations with and without constraints. Massachusetts represents the case relatively free of labor market and welfare barriers (a positive score for factor 1 and negative scores on factors 2 and 3). In this context, the work-disincentive effect is strong, as predicted by labor-leisure theory. Only three states fulfill the assumption of no constraints made by laborleisure theory. Where these barriers are substantial, the work-disincentive effect is much diminished. In Louisiana, for example, the existence of all constraints (a negative score for factor 1 and positive scores on factors 2 and 3) largely negates the economic motivation to substitute welfare for work. For the great majority of states, one or more barriers limit the work-disincentive effect of welfare.4 INTERPRETATION OF THE SPATIALLY VARYING PARAMETERS The state parameters mapped in Figure 4.1 illustrate that work disincentives are differentially translated into welfare participation across the nation. As estimated in equation (4.7), these variations in the work-disincentive effect derive from different sets of labor market and welfare conditions existing in the states. However, such conditions are but proximate causes, the quantified surrogates, of underlying political and economic forces which affect operation of the welfare system within the larger economy. By extension, the state parameters, representing the severity of the work-disincentive effect in each jurisdiction, are calculated estimates of the variable role played by these forces. For example, the Mississippi state parameter, calculated in equation (4.9), is the lowest in the nation. The motivation to substitute welfare for work is minimal in the state since there exist substantial barriers to the use of welfare (low basic needs standards, restrictive administration, small bureaucracies to serve potential demand, etc.) but relatively inconsequential labor market constraints (e.g. low-skill job availability). These proximate variables of the principal components analysis reflect fundamental political and economic processes, rooted in the historical traditions of the state. First, Mississippi’s punitive welfare system is a function of its small tax base and a traditionalist political culture, which assigns a low priority to public assistance. Second, the state’s labor structure is oriented toward low-skill, small-compensation jobs due
A CONTEXTUAL EXPANSION OF THE WELFARE MODEL 61
to its long agrarian history and recent industrialization, the role it has played in the periphery of the nation’s economy, and its racial traditions of servitude and slavery. The work-disincentive effect varies by states because they are the physical manifestations of a federalized public assistance system, which operates within the larger, spatially variable economy. First, federalism directly affects the political process of welfare provision. In fact, several of the welfare variables in the principal components analysis would not be variable if it were not for state control. Additionally, federalism allows the states to modify welfare in accordance with their particular economic circumstances. Hence, different combinations of welfare system and labor market conditions are possible. The rationale for decentralized control is that state and local governments are best able to identify and address the economic problems in their jurisdictions, since ‘no nationally uniform system can do justice to the infinite variety of types of need, individual problems, and potentials’ (Freeman 1981:27). Thus, in matching services to needs, ‘the closer the level of government is to the people, the more efficient and effective our social welfare programs are apt to be’ (Anderson 1978: 166). This study indicates that services do not match need, however. Recall from the principal components analysis that states with the most severe poverty, in general and specific to female-headed households, are associated with punitive welfare policies, as evidenced by low basic needs standards, restrictive administration, and small bureaucracies relative to potential demand. Three alternative perspectives may be advanced as to why interstate variations in welfare provision do not accord with a map of need for assistance. First, states with substantial poverty may lack the fiscal ability to support welfare since a small tax base must be allocated among many competing concerns. Thus, states with the greatest need for assistance will have the most punitive welfare programs. Second, state welfare systems may be used to address the needs of capital rather than the needs of the poor. For instance, economies based on low-wage structures cannot be sustained without a restrictive welfare system. The poor states may provide only minimal assistance to prevent welfare benefits from competing with wages in the private sector. Finally, states may modify welfare packages in accordance with local political attitudes toward public assistance. Elazar (1972) has examined the geography of American political ideologies and its effect upon the variable provision of government programs: Many differences in state responses within the federal system appear to be stimulated by differences in political culture among the states…the particular pattern of orientation to political action in which each political system is imbedded. Political culture, like all culture, is rooted in the cumulative historical experiences of particular groups of people…[Thus] every state has certain dominant traditions about what constitutes proper
62 J.E.KODRAS
government action and every state is generally predisposed toward the federal programs it can accept as consistent with those traditions. (Elazar 1972:88–9) Interstate variations in welfare provision are associated with political cultures dominant in different regions. For example, Elazar (1972) defines three major cultures and their linkages to public policy: the moralist tradition, which prevails in the upper Midwest, views government intervention as a positive force which is necessary to insure the common good of all citizens; the individualist tradition, strongest in the industrial core, views government as an allocation mechanism between individuals bargaining for rewards; and in the traditionalist culture, dominant in the deep south, government is oriented toward the interests of a governing elite. With regard to welfare, the moralist tradition is most supportive, since it addresses the issue of social well-being, while the traditionalist culture is most resistant, since it is a redistributive mechanism between income groups, which threatens the position of elites. The results of this study support these associations. As shown in Appendix 4.2, states with liberal welfare programs (positive scores on factor 1 and negative on factor 2) and the resultant large work-disincentive effects (high parameter values) are predominantly moralist. On the other hand, those with restrictive welfare and minimal work-disincentive effects are primarily traditionalist states in the southeast. None of the three perspectives cited above is alone a sufficient explanation for why welfare provision does not match needs. In fact, state fiscal abilities, private sector influences, and political ideologies are not separable. Moralist political cultures tend to exist in wealthier states, which have the financial ability to support welfare and a private sector not resistant to liberal public assistance. Traditionalist cultures are primarily found in poorer states, which lack the tax base necessary to fund the programs and must be sensitive to the needs of private sectors with low wage structures. Thus, it is not possible to identify first causes among these conditions. The point is that the states are able to adjust welfare to their fiscal abilities, labor market conditions, and dominant political ideologies, given the flexibility of decentralized control. Federalism gives geographic expression to the US public assistance system and is an underlying mechanism which allows spatial variation in the political and economic forces influencing the work-disincentive effect of welfare. THE EXPANSION METHOD AND SPATIALLY VARIABLE SOCIAL POLICY If this study had used a purely nomothetic approach, the empirical test of laborleisure theory would have been confined to the initial model, which found general support for the thesis that welfare engenders a work disincentive. The expansion method was used to extend beyond this initial finding, incorporating
A CONTEXTUAL EXPANSION OF THE WELFARE MODEL 63
into the terminal model a test for the differential operation of work disincentives, according to the context in which the decision between work and welfare is made. Thus, the two-step procedure can be seen as a movement from a purely nomothetic study in the direction of the idiographic. The validity of the labor-leisure model was found to vary spatially, as the work-disincentive effect of welfare varied across labor market and policy contexts. This finding accords with conclusions drawn from the study of public assistance in particular locales. The point is perhaps best made by reference to the statements of leading critics and advocates of the welfare system. For example, when President Reagan wished to stress a point on welfare fraud, he often used anecdotes from large Northeastern cities, where welfare bureaucracies are often so overloaded that some of the undeserving no doubt receive benefits. An example is the account of Chicago’s Welfare Queen, a woman accused of misrepresenting herself as the widow of several deceased Navy men and collecting welfare and widow’s grants, which Reagan often included in presidential campaign speeches (Hannaford 1983:90–1). When Piven and Cloward (1971) portrayed welfare as a tool manipulated in the interests of capital, they drew most of their examples from the rural south, where, during the 1950s and 1960s, welfare was quite stringently controlled to prevent it from drawing workers away from southern agriculture. With regard to the most recent and famous welfare anecdote (Murray 1984), in which Phyllis and Harold decide between work and marriage versus welfare and separation, Murray and Greenstein (1985) and Greenstein (1985) have discussed whether Phyllis and Harold’s decision was affected by the fact that they lived in the liberal welfare state of Pennsylvania rather than a more conservative state such as Mississippi. It is evident in all these examples that the advocates of different positions use regions that support their cases most strongly. Each position in the welfare debate is more valid in some places than in others because the programs have different impacts in different contexts. Therefore, national welfare programs whose design is based solely on the idiographic study of particular places or on the nomothetic analysis of abstract theory will be misguided. Further, societal perceptions of the welfare system which are derived from only one of these approaches will be distorted. The expansion method is a flexible procedure which tests theoretical notions according to the contingencies of place. As such, it is an appropriate method for researchers seeking a middle ground between the abstractions of nomothetic approaches and the parochialism of idiographic ones. NOTES This paper is a modified version of an article appearing in the Annals, Association of American Geographers 26 (2), 1986, pp. 228– 46. Permission to incorporate it here was granted by the Annals editor.
64 J.E.KODRAS
1 The number of poverty families is used as a surrogate of the number eligible. Due to program regulations, not all poverty families are in fact eligible, but this measure was judged to be more representative than the number of all families, households, or persons, which are commonly used in studies of AFDC participation (Brehm and Saving 1964; Albin and Stein 1971; Winegarden 1973; Bieker 1981; Jones 1984b). These studies incorporate variations in need and eligibility as independent variables. The present study incorporates eligibility variations into the dependent variable. 2 The work-disincentive variable is composed of actual levels in AFDC benefits and earnings rather than potential levels, such as the ratio of the AFDC guarantee to wage rates. As such, it measures the average AFDC payment relative to average earnings of poverty families, rather than the potential which might be acquired from the welfare system or through employment. The ratio of actual levels is the more appropriate measure for several reasons. First, studies have shown that the informal calculus of the work-welfare decision tends to be based on the experiences of acquaintances in similar circumstances and the amounts they actually receive for work or on welfare. Only rarely are AFDC recipients fully aware of the potential amounts they might receive, given the myriad regulations influencing welfare payment levels and the complexity of the employment market (Opton 1971; Chrissinger 1980). Second, potential welfare benefits would be very difficult to estimate for the numerator of the ratio. The states vary not only in the AFDC guarantee (the maximum benefit to a family with no income), but also in the marginal tax rate, by which the benefit decreases as earnings increase. The complexities of these state sliding scales prevent the marginal tax rate from being incorporated with the guarantee as a measure of potential welfare benefits. 3 This measure of wage discrimination is calculated for males and females older than 18 years with four years of high school education, and with full-time (35 + hours per week) and full-year (40 + weeks per year) employment. It does not explicitly control for age differences but state ratios for various age groups were, in most cases, quite similar to the overall adult ratios. Four years of high school education is appropriate for the AFDC population. The measure does not control for job tenure, however, since such data are unavailable (see Rytina (1982) for a discussion of job tenure effects on male-female earnings differentials). 4 Previous research has identified a number of economic, political, and demographic conditions which are associated with aggregate welfare participation (see Kodras (1982) and Jones (1984b) for reviews of these studies). To test whether the workdisincentive effect remains stable after controlling for these variables, and thus represents a truly significant correlate with AFDC use, a number of models were run, with sets of independent variables added to equation (4.6). Demographic variables included percent urban, percent Black, percent Hispanic, and the divorce rate. The economic and political variables were those used in the factor analysis, entered independently into the equation. Because of problems with multicollinearity, all variables could not be incorporated into the same equation. In each model the magnitudes of parameters b0 through b3 were altered, but the workdisincentive effect remained significant and state parameter maps showed similar regional patterns. In fact, zero-order correlations of state parameter values between the various models were in all cases greater than 0.90. These subsidiary models
A CONTEXTUAL EXPANSION OF THE WELFARE MODEL 65
lend credence to the results reported in the text, demonstrating the importance of spatially varying work-disincentive effects on welfare participation.
REFERENCES Albin, P. and Stein, B. (1968) ‘The constrained demand for public assistance’, Journal of Human Resources 3:300–11. ——and——(1971) ‘Determinants of relief policy at the subfederal level’, Southern Economic Journal 37:445–57. Anderson, M. (1978) Welfare, Stanford, CA: Hoover Institution Press. Bieker, R. (1981) ‘Work and welfare: an analysis of AFDC participation rates in Delaware’, Social Science Quarterly 62:169–76. Brehm, C. and Saving, T. (1964) ‘The demand for general assistance payments’, American Economic Review 54:1002–18. Cain, G. and Watts, H. (1973) Income Maintenance and Labor Supply, Chicago, IL: Markham. Chief, E. (1979) ‘Need determination in the AFDC program’, Social Security Bulletin 42: 11–21. Chrissinger, M. (1980) ‘Factors affecting employment of welfare mothers’, Social Work 25:52–6. Danziger, S., Haveman, R. and Plotnick, R. (1980) ‘Retrenchment or reorientation: options for income support policy’, Public Policy 28:473–90. Elazar, D.J. (1972) American Federalism: A View from the States, New York: Thomas Crowell. Elman, R. (1966) The Poorhouse State: The American Way of Life on Public Assistance, New York: Random House. Freeman, R.A. (1981) The Wayward Welfare State, Stanford, CA: Hoover Institution Press. FSU Census Access System, 1980 Census of Population and Housing. Greenstein, R. (1985) ‘Losing face in “Losing Ground”’, New Republic 192:12–19. Hamermesh, D. (1977) Jobless Pay and the Economy, Baltimore, MD: Johns Hopkins University Press. Hannaford, P. (1983) The Reagans: A Political Portrait, New York: Coward-McCann. Hosek, J. (1982) ‘The AFDC-unemployed fathers program: determinants of participation and implications for welfare reform’, in P. Sommers (ed.) Welfare Reform in America, Boston, MA: Kluwer Nijhoff. Howes, C. and Markusen, A. (1981) ‘Poverty: a regional political economy perspective’, in A.Hawley and S.Mazie (eds) Non-metropolitan America in Transition, ch. 11, Chapel Hill, NC: University of North Carolina Press. Johnston, R.J. (1990) ‘Economic and social policy implementation and outputs: an exploration of two contrasting geographies’, in J.Kodras and J.P.Jones III (eds) Geographic Dimensions of United States Social Policy, pp. 37–58, London: Edward Arnold. Jones, J.P. III (1984a) ‘A spatially-varying parameter model of AFDC participation: empirical analysis using the expansion method’, Professional Geographer 36: 455–61.
66 J.E.KODRAS
——(1984b) ‘Spatial parameter variation in models of AFDC participation: analyses using the expansion method’, Ph.D. dissertation, Ohio State University. ——(1987) ‘Work, welfare, and poverty among black female-headed families’, Economic Geography 63:20–34. Jones, J.P. III and Kodras, J. (1984) ‘AFDC participation dynamics and policies’, Modeling and Simulation 15:41–5. Kasper, H. (1968) ‘Welfare payments and work incentive: some determinants of the rate of general assistance payments’, Journal of Human Resources 3:86–110. Keeley, M., Robins, P., Spiegelman, R. and West, R. (1978a) ‘The labor-supply effects and costs of alternative negative income tax programs’, Journal of Human Resources 13:3–36. ——, ——, —— and ——(1978b) ‘The estimation of labor supply models using experimental data’, American Economic Review 68:873–87. Kodras, J.E. (1982) ‘The geographic perspective in social policy evaluation: a conceptual approach with application to the U.S. Food Stamp Program’, Ph.D. thesis, Ohio State University. ——(1984) ‘Regional variation in the determinants of food stamp program participation’, Environment and Planning C: Government and Policy 2:67–78. Kuttner, B. (1983) ‘The declining middle’, The Atlantic Monthly July: 60–72. Lampman, R. (1978) ‘Labor supply and social welfare benefits in the United States’, Institute for Research on Poverty Special Report 22, Madison, WI. Levitan, S. and Johnson, C. (1984) Beyond the Safety Net: Reviving the Promise of Opportunity in America, Cambridge, MA: Ballinger. Masters, S. and Garfinkel, 1. (1977) Estimating the Labor Supply Effects of Income Maintenance Alternatives, New York: Academic. Menefee, J., Edwards, B. and Schieber, S. (1981) ‘Analysis of non participation in the SS1 program’, Social Security Bulletin 44:3– 21. Morrill, R. and Wohlenberg, E. (1971) The Geography of Poverty in the United States, New York: McGraw-Hill. Murray, C. (1984) Losing Ground: American Social Policy 1950–1980, New York: Basic Books. Murray, C. and Greenstein, R. (1985) ‘The greed society: an exchange’, New Republic 192:21–3. Opton, E. (1971) Factors Associated with Employment among Welfare Mothers, Berkeley, CA: Wright Institute. Pigou, A. (1952) The Economics of Welfare, London: Macmillan. Piven, F.F. and Cloward, R. (1971) Regulating the Poor: The Functions of Public Welfare, New York: Vintage. —— and ——(1981) ‘Keeping labor lean and hungry’, The Nation 233:466–7. Platky, L. (1977) ‘Aid to families with dependent children: an overview’, Social Security Bulletin 40:17–22. Reagan, R. (1968) The Creative Society, New York: Devin-Adair. Rumberger, R. (1981) ‘The changing skill requirements of jobs in the US economy’, Industrial and Labor Relations Review 34:578–90. Rytina, N. (1982) ‘Tenure as a factor in the male-female earnings gap’, Monthly Labor Review 105:32–4.
A CONTEXTUAL EXPANSION OF THE WELFARE MODEL 67
Seninger, S. and Smeeding, T. (1981) ‘Poverty: a human resource— income maintenance perspective’, in A.Hawley and S.Mazie (eds) Nonmetropolitan America in Transition, ch. 10, Chapel Hill, NC: University of North Carolina Press. Soja, E.W. (1987) ‘The postmodernization of geography: a review’, Annals, Association of American Geographers 77 (2):289–94. Spall, H. and McGoughran, E. (1974) ‘AFDC in Michigan during the twentieth century’, Review of Social Economy 32:70–85. Stonecash, J. and Hayes, S. (1981) ‘The sources of public policy: welfare policy in the American states’, Policy Studies Journal 9: 681–98. Taaffe, E.J. and Casetti, E. (1990) ‘The model context problem and the expansion method’, Paper presented at the annual meeting of the Association of American Geographers, Toronto, Canada. Thrall, G.I. (1981) ‘Regional dynamics of local government welfare expenditures’, Urban Geography 2:255–68. US Bureau of the Census (1982) State and Metropolitan Area Data Book, Washington, DC: USGPO. ——(1983) Census of Population, 1980, Washington, DC: USGPO. ——(1984) Statistical Abstract of the United States, Washington, DC: USGPO. US Social Security Administration (1980) Annual Statistical Supplement to the Social Security Bulletin, Washington, DC: USGPO. ——(1981) Characteristics of State Plans for AFDC, Washington, DC: USGPO. ——(1982) 1979 Recipient Characteristics Study, Washington, DC: USGPO. Weil, G. (1978) The Welfare Debate of 1978, White Plains, NY: Institute for Socioeconomic Studies. Winegarden, C. (1973) ‘The welfare explosion: determinants of the size and recent growth of the AFDC population’, American Journal of Economics and Sociology 32: 245–56. Wohlenberg, E. (1976a) ‘An index of eligibility standards for welfare benefits’, Professional Geographer 28:381–4. ——(1976b) ‘Interstate variations in AFDC programs’, Economic Geography 52:254–66. ——(1976c) ‘Public assistance effectiveness by states’, Annals, Association of American Geographers 66:440–50.
APPENDIX 4.1: VARIABLE DEFINITIONS AND SOURCES 1 PR—AFDC participation rate:
Source: US Social Security Administration 1980 and US Bureau of the Census 1982: Table C–1002 2 WD—work disincentive:
Source: US Bureau of the Census 1983: Table 248
68 J.E.KODRAS
3 FUNEMP—female unemployment rate:
Source: US Bureau of the Census 1982: Table C–855 4 NOJOB—proportion of all jobs not in retail trade, nondurable manufacturing, or nonprofessional services:
(RT, retail trade; NDM, nondurable manufacturing; NFS, nonprofessional services). Source: FSU Census Access System: Table 65 5 WAGED1FF—ratio of male to female earnings:
calculated for males and females older than 18 years, with four years of high school education, who are employed full time (35+hours per week) and full year (40+weeks per year). Source: US Bureau of the Census 1983: Table 237 6 PFWORK—the proportion of female-headed households whose head works full time (35+hours per week) and full year (50+weeks per year) who remain below the poverty level:
Source: US Bureau of the Census 1983: Table 246 7 SEVERITY—the median dollar amount by which poverty families fall below the poverty level. Source: US Bureau of the Census 1983: Table 251 8 FEMPOV—the proportion of female-headed households below the poverty level:
Source: US Bureau of the Census 1982: Table C–1007 9 NEEDS—state welfare administration’s determination of the monthly amount necessary to meet basic needs. Source: US Social Security Administration 1981: Table C 10 ADMIN—index of state administrative leniency in welfare provision. Source: Jones 1984b:74 11 SERVICE—ratio of state and local welfare employees to the poverty population:
A CONTEXTUAL EXPANSION OF THE WELFARE MODEL 69
Source: US Bureau of the Census 1982: Tables C–1184 and C–1012 12 TRANSFER—the proportion of persons receiving public assistance who are classified poor excluding public assistance and nonpoor including public assistance:
Source: US Bureau of the Census 1983: Table 249 APPENDIX 4.2: STATES RANKED BY WORK-DISINCENTIVE EFFECT, WITH FACTOR SCORES AND POLITICAL CULTURE CATEGORY
MN WI NH VT IA WY NB CA CT ND MA WA UT MI OR OK RI PA KS NY MD OH
Parameter value (Figure 4.1)
Factor scores
1 1.735 1.689 1.600 1.589 1.501 1.496 1.474 1.456 1.434 1.426 1.417 1.400 1.390 1.372 1.359 1.322 1.267 1.254 1.185 1.167 1.144 1.128
2 1.250 2.210 0.589 2.243 0.535 –0.262 –0.333 1.830 0.146 0.152 1.328 1.043 0.391 1.816 1.077 0.649 0.392 0.649 –0.481 0.847 –0.678 –0.236
3 –1.253 0.327 –1.866 0.219 –0.953 –0.753 –2.084 0.119 –1.319 –0.777 –0.357 0.228 –0.109 1.675 0.634 –0.037 –0.512 0.024 –1.066 0.356 –1.682 –0.039
Political culturea
0.392 0.365 0.009 –0.380 0.583 2.089 0.553 –0.567 0.422 0.986 –0.508 0.518 1.154 0.764 0.709 0.436 –0.001 0.118 0.333 –0.320 –0.280 0.786
M M MI M MI IM IM MI IM M IM MI M M M TI IM I MI IM I I
70 J.E.KODRAS
Parameter value (Figure 4.1) 1 NJ 1.109 CO 1.107 WV 1.101 IL 1.092 AK 1.089 MO 1.086 MT 1.077 IN 1.013 HA 0.990 ID 0.968 ME 0.961 SD 0.949 AZ 0.931 VA 0.845 KY 0.752 TX 0.734 NV 0.654 NM LA DE FL AR TN NC AL GA SC MS
2 0.259 –0.096 –0.120 –1.239 0.853 –0.247 –0.516 –0.427 1.557 0.165 0.928 –0.941 –1.028 –0.770 –0.841 –1.748 –0.703 0.631 0.627 0.617 0.615 0.605 0.527 0.464 0.412 0.268 0.221 0.209
Political culturea
Factor scores
–0.986 –1.430 –0.522 –0.971 –0.480 –0.903 –0.611 –0.878 –1.121 –1.254 –1.085
3 –0.069 –0.438 0.768 –1.315 0.680 –0.456 0.149 0.454 0.348 0.950 0.687 –0.445 –0.766 –0.948 0.515 –0.696 –1.350
1.130 1.365 0.156 0.047 1.375 0.777 0.178 1.836 1.105 0.302 2.885
–0.166 –0.016 1.366 0.768 –0.394 0.089 1.171 1.016 –2.483 0.361 –1.232 0.469 0.142 –0.975 0.291 0.291 –2.618 0.552 1.523 –1.401 –0.804 –0.146 –0.576 –2.098 –0.051 –1.305 –2.269 0.336
I M TI I I IT MI I IT MI M MI TM T TI TI I TI T I TI T T TM T T T T
Note: aM, moralist; I, individualist; T, traditionalist. Two-letter code indicates a combination of cultures, with the first culture dominant. Source: Elazar 1972:117
5 A COMPARISON OF DRIFT ANALYSES AND THE EXPANSION METHOD: THE EVALUATION OF FEDERAL POLICIES ON THE SUPPLY OF PHYSICIANS Stuart A.Foster, Wilpen L.Gorr, and Francis C.Wimberly In most cases, the values of social and economic variables vary as a function of time and location. However, it is likely that the relationships among such variables are also context dependent—i.e. vary with time and/or location. For instance, the relationship between the market value of a residential property and the area of the lot is likely to be different for San Francisco than for Oklahoma City and, for a given location, is likely to be different now than it was ten years ago. Contextual variation in functional relationships can be investigated by drift analyses and by the expansion method. The purpose of this paper is to contrast the two approaches and to illustrate the comparative advantages of each in a case study involving the locational behavior of physicians. In the section below we compare the expansion method and drift analyses and discuss their respective merits. Next, we provide background information on the locational behavior of physicians from the 1960s through the 1980s and justify the specification of an initial model which serves as a basis for developing both polynomial and drift analysis models of physician location behavior. We then describe the data to be used in comparing the two approaches, present the results of the comparison, and offer an interpretation of our findings. CONTRASTING METHODOLOGIES The expansion method (Casetti 1972, 1973, 1982a, 1986) is designed to develop models with parameters that vary over contextual domains. To this effect, ‘a “terminal” model is generated from an “initial” one by making some of the parameters of the latter a function of some variables. The expansion method can be used for constructing models meeting requirements that an initial model does not satisfy or for removing inadequacies of an initial model, in such a fashion that whatever validity or usefulness the initial model possesses is not disregarded but rather built upon’ (Casetti 1972:82). For an example, assume the following initial model: (5.1)
72 DRIFT ANALYSIS AND THE EXPANSION METHOD
in which each observation i is also associated with a context vector . This vector may comprise substantive variables, spatial coordinates, time coordinates, or some combination of the three. In the expansion method, parameters are permitted to vary over the contextual domain. For example, expanding β 1 from (5.1) into Z, we obtain (5.2) which, when replaced into initial model (5.1), yields the terminal model (5.3) An example where the function is so expressed is provided by polynomial expansions. For example, the z variables may be powers of time from degree 1 to k. Functional forms other than polynomials have been used; for instance, logistic functions were used in Casetti (1972). Polynomial expansions allow direct testing of whether a parameter is varying over a context. Individual t tests on interaction terms (the approach adopted here) or F tests on the interaction terms of the full polynomial with a causal variable (except for the intercept of the polynomial) determine the significance of the parameter path. Varying parameters can also be investigated by ‘drift analysis’. This approach uses estimators such as DARP (Casetti 1982b) or adaptive filters which estimate the β s directly without the explicit use of pre-specified functional forms. Drift methods for estimation of time-varying parameters include the adaptive estimation procedure (Carbone and Longini 1977) and generalized adaptive filtering (Makridakis and Wheelwright 1977). For spatially varying parameters, two methods have been proposed: DARP (Casetti 1982b) and spatial adaptive filtering (Foster and Gorr 1986). All such methods provide smoothed parameter estimates for each observation point. They are able to accommodate unusual patterns of parameter variation such as step jumps and other discontinuous functions, splines, multi-modalities, etc. Drift analysis methods can be used in an exploratory fashion, such as when plots of estimated parameters are used to suggest functional forms (such plots are called ‘paths’ when the contextual domain is time and ‘maps’ in spatial problems). An example of this is found in Bretschneider and Gorr (1983). In addition, they can be employed as diagnostics for the adequacy of functional expansions through the construction of overlay plots of parameter paths or maps which compare parameter estimates produced by expansion analyses and by drift analyses. We include such a comparison in our empirical analyses. One class of drift analysis is represented by moving window regressions. A limited form of moving windows, moving averages, has long been used in time series analysis and forecasting. For example, classical decomposition and the Census Bureau’s X-ll methods are based on moving averages. For seasonal time series data, a moving average is calculated with the window length equal to the length of a season—for instance, four data points for quarterly data. The first four data points in the time series are averaged with the resulting average
S.A.FOSTER, W.L.GORR, AND F.C.WIMBERLY 73
associated with the center of the window. Then the fifth data point is added while the first data point is dropped as the window is moved forward one step. A new average is then calculated and the process continues in this way until the last four data points have been averaged. The resulting moving averages are capable of smoothing and tracking any pattern in the data. Moving window analyses employ regressions along with the moving average approach of adding and deleting observations to arrive at time paths or maps of parameters. The models evaluated may be multivariate and, unlike polynomial expansions, the regressions provide for significance tests along the path defined by the included observations. In the analyses reported here, each window includes data over a spatial region—the forty-eight contiguous states of the USA. The longer the time window, the smoother the resulting parameter paths. Of course, moving windows collapse to annual regressions (for annual data) as the length of the window is decreased. Tradeoffs related to the choice of the window length are discussed in the context of the results of our case study. SUPPLY OF PHYSICIANS AND FEDERAL POLICIES To explore the comparative advantages of the methods discussed above we have selected an application in which varying parameter models have considerable merit in evaluating policy questions: the supply and geographic distribution of physicians. Physicians are pivotal in the health care industry in the United States since they serve as the entry point and primary providers of health care services. Thus, their supply and geographic distribution are important determinants of the public’s access to health care. Growth in demand for health care services after the Second World War, which resulted largely from rapid population growth and increasing prosperity, led to the widespread perception of a physician shortage. Consequently, the federal government passed several pieces of legislation aimed at increasing physician supply and reducing spatial inequalities in access to physicians—i.e. increasing the relative density of physicians in rural areas. The result of the federal legislation was rapid growth in the number of physicians, commencing in the middle 1960s and continuing throughout the 1970s and early 1980s. By 1981, however, the pendulum had swung from a perception of shortage to one of oversupply of physicians. Subsequently, there has been a greatly reduced federal role in physician manpower planning. Besides the federal initiatives on increasing physician supplies, other important trends and events during this period contributed to major changes in medical practice. Technological innovations within the health care field led to increased specialization and diversity in the availability and provision of health care services. The amendments to the Social Security Act creating the Medicare and Medicaid programs for the elderly and poor restructured the sources and increased the level of demand for health care services. Furthermore, the
74 DRIFT ANALYSIS AND THE EXPANSION METHOD
corporate movement in the health care industry has radically altered the nature and variety of practice alternatives for physicians. In this study we examine the aggregate locational behavior of physicians during the transitional period of the 1960s through the 1980s. Several federal programs were designed to deal directly with the spatial maldistribution of physicians. An example is the National Health Service Corps, which provided incentives for physicians to locate in physician-poor areas. Moreover, federal policy-makers had intended to increase the supply of physicians to the point of saturating physician-rich areas under the assumption that the resulting oversupply would cause a flow of physicians into the poorly supplied areas. Here we compare methods which can be used to evaluate these policies by seeking evidence of the intended spatial diffusion of physicians. A limited number of previous studies have sought evidence of this diffusion process. Schwartz et al. (1980) observed diffusion of board-certified specialist physicians into increasingly smaller communities over the period from 1960 through 1977. Fruen and Cantwell (1982) examined time trends in physician per capita ratios for different sizes of urban and rural areas. They noted evidence of diffusion into smaller communities, with the exception of the most rural areas where comparatively little improvement in the ratio was observed. This study extends this small body of research. Of the previous studies in this area, only Foster (1988) makes use of the expansion method. PHYSICIAN LOCATION FACTORS AND INITIAL MODEL SPECIFICATION The attributes of a given site contribute to three broad factors in physicians’ location decisions: the professional climate, social amenities, and market factors. Professional concerns generally fall into the categories of sufficient access to hospital and other support facilities and interaction and support of colleagues, including, for example, the availability of continuing education programs. Hence, the attractiveness of the professional environment is a positive function of the supply of physicians in an area. Attitudinal surveys of physicians and residents indicate the relative importance attributed to professional concerns in the choice of a practice location (e.g. Parker and Tuxill 1967; Champion and Olson 1971; Cooper et al. 1975; Steinwald and Steinwald 1975; Diseker and Chappell 1976; Parker and Sorensen 1978, 1979). Social amenities include the social and cultural aspects of a community, such as educational facilities, entertainment and recreational opportunities, and shopping facilities— attributes which are a function of community size. Indeed, the primary focus of research into the effects of social amenities on location choice has been on community-size preferences. Numerous studies from a wide variety of geographic contexts have shown that physicians exercise preferences for community size in choosing a practice location (e.g. Parker and Tuxill 1967; Cooper et al. 1975; Steinwald and Steinwald 1975; Diseker and Chappell 1976;
S.A.FOSTER, W.L.GORR, AND F.C.WIMBERLY 75
Parker and Sorensen 1978; Coombs et al. 1985), and these preferences are largely determined by a physician’s previous life experiences. Hence, physicians from large cities tend to prefer practice locations in large cities, and physicians from small communities are more likely to practice in small communities. Nevertheless, population is by far the greatest determinant of physician distribution, and physicians are quite responsive to variations in population growth. Another simple indicator for market characteristics is income potential. The role of income potential in location choice, however, does not appear to be important (e.g. Cooper et al. 1975; Diseker and Chappell 1976; Parker and Sorensen 1978; Foster 1988). Lave et al. (1975) argue that physicians’ incomes are so high that variations in income potential among places are not perceived as important. Given the factors just elucidated, it is evident that our initial model needs to contain measures of population growth and physician density within a geographic area. Furthermore, the federal programs mentioned above, leading from a physician shortage to an oversupply, indicate that physician density should have a varying impact on physicians’ locational decisions. An appropriate initial model is thus: (5.4) where t is time, GPHYSPOP is the percentage growth in the physician population, LDENSITY is the physician-to-population ratio (per 100,000 general population) at the start of the period, GPOP is the percentage growth in general population, and β is a classical disturbance term. The parameter estimate for LDENSITY at any given time is the result of the two opposing locational tendencies. To the extent that LDENSITY reflects the professional climate and social amenities, it should enter into the model with a positive sign, indicating an overall agglomerative trend. In contrast, as far as it is related to the economic competition for practice, its parameter should be negative, consistent with a general deglomerative trend. The estimate of this parameter at any given time thus reflects these opposing tendencies. The parameter estimate associated with GPOP provides an indication of the responsiveness of physician supply to changing market conditions. A parameter estimate of 1.0 indicates that population growth has a proportional effect on physician supply; an estimate exceeding 1.0 indicates a more than proportional response (so that population growth contributes to an increase in physician density); and an estimate of less than 1.0 indicates the opposite. SCOPE AND DATA The scope of our analysis is defined along three dimensions: space, time, and physician specialty. From a spatial perspective the analysis focuses on the distribution of physicians at the national level, with the forty-eight contiguous states
76 DRIFT ANALYSIS AND THE EXPANSION METHOD
Table 5.1 Descriptive statistics: 1963–83 annual data for the contiguous forty-eight states Variable
N
Mean
Standard deviation
Minimum
Maximum
GPRIM GSPEC LPRIMR LSPECR GPOP
959 959 960 960 960
2.87 4.61 56.99 60.69 1.21
3.38 3.96 12.42 20.25 1.25
–7.65 –6.53 37.26 20.65 –3.25
19.26 18.81 108.03 138.35 8.67
comprising the set of spatial observational units. The temporal frame of the analysis involves annual data from 1963 through 1983. Separate analyses are conducted for primary care and specialty care physicians. Primary care physicians include all those whose primary specialization is in general and family practice, internal medicine, obstetrics and gynecology, or pediatrics. All other physicians are classified as specialists. The physician data are from the American Medical Association (AMA) master file, as made available in the AMA’s series of annual publications regarding the geographic and specialty distributions of physicians. Table 5.1 lists the set of variables used in the analyses and includes descriptive statistics. GPRIM and GSPEC represent the annual percentage growth in the supply of primary care and specialty care physicians, respectively, in a given state and time period. LPRIMR and LSPECR identify the number of physicians per 100,000 persons for primary and specialty care, respectively, calculated at the beginning of the year in which growth in physician supply is measured. THE MODELS This section first presents the functional expansions and drift analyses used to evaluate physician manpower policies and then compares them through overlay plots of parameter time paths. Table 5.2 Ordinary least squares estimates of the terminal model: quadratic expansions in time GPRIM (0.735) (0.158) (0.001) (0.511) (0.049) –0.000812t2LPRIMR+0.578GPOP+0. 052tGPOP (0.002) (0.019) (0.333) –0.00236t2GPOP (0.354) adj. R2=0.332 F=59.9, p=0.0001 n=959 GSPEC
=0.588–0.497t+0.0510t2–0.0216LPRIMR +0.0121tLPRIMR
=4.17+0.153t–0.00188t2–0. 00797LSPECR–0.00585tLSPECR
S.A.FOSTER, W.L.GORR, AND F.C.WIMBERLY 77
(0.002) (0.613) (0.896) (0.779) (0.267) +0.000258t2LSPECR+0.507GPOP+0. 109tGPOP (0.248) (0.160) (0.150) –0.00537t2GPOP (0.129) adj. R2=0.097 F=13.7, p=0.0001 n=959 Note: Standard errors are shown in parentheses below the estimates.
Polynomial expansions We investigated functional expansions of initial model (5.4) using polynomials in time interacted with all components of the model. The expansions were computed using GPRIM and GSPEC as the dependent variables for initial model (5.4). Polynomials up through the full fourth order were attempted. Table 5.2, providing ordinary least squares (OLS) estimates of a full quadratic expansion, is included here as an example; standard errors are listed below each co-efficient. The table illustrates a feature of polynomial-based functional expansions which is somewhat problematic. Although both models are highly significant overall (p < 0.0001), many of the individual coefficients are not significant or are marginally so. Indeed, in the specialty physician model none of the coefficients appears to be significant. This results from collinearity among the terms in the polynomial expansions, for instance t, t2, tLPRIMR, t2LPRIMR, etc. The explanatory efficacy is ‘spread’ over these redundant terms with the result that it is lowered for each of them. Table 5.3 Annual regressions for GPRIM, model (5.5): estimated coefficients and p values Year
R2
F
Intercept
LPRIMR
GPOP
1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975
0.45 0.17 0.14 0.07 0.27 0.06 0.32 0.13 0.17 0.32 0.05 –0.02
20.2 4 5.9 2 4.7 1 2.8 0 9.7 3 2.5 0 12.3 4 4.6 1 5.8 2 12.1 4 2.3 0 0.4 0
2.26 0 –0.74 0 –0.59 0 –0.15 0 0.91 0 –3.57 0 0.63 0 –3.34 0 –2.79 0 –0.35 0 2.45 0 4.24 0
–0.0341 0 0.0129 0 –0.0046 0 0.0100 0 –0.0169 0 0.0456 0 0.0127 0 0.0998 1 0.0639 0 –0.0009 0 –0.0050 0 –0.0256 0
0.78 4 0.74 2 0.63 2 0.40 0 1.47 4 0.69 0 1.65 4 0.80 1 0.79 2 1.45 4 0.38 0 0.17 0
78 DRIFT ANALYSIS AND THE EXPANSION METHOD
Year
R2
F
Intercept
LPRIMR
GPOP
1976 –0.03 0.2 0 5.46 0 –0.0102 0 0.24 0 1977 0.32 11.9 4 12.67 4 –0.0126 2 0.96 1 1978 0.07 2.9 0 6.57 2 –0.0685 1 –0.11 0 1979 0.24 8.5 3 6.58 2 –0.0159 0 0.96 3 1980 0.40 16.6 4 7.16 2 –0.0510 0 1.46 4 1981 0.28 10.3 3 5.80 2 –0.0686 1 0.94 2 1982 0.02 1.4 0 4.38 2 0.0182 0 0.33 0 1983 0.32 11.8 4 6.95 4 –0.0640 2 0.74 1 Notes: (1) Shown is adjusted R2. (2) Statistical significance (shown after each coefficient estimate): 0, not significant; 1, 0. 05 level; 2, 0.01 level; 3, 0.001 level; 4, 0.0001 level.
Moving window regressions As a basis of comparison, Tables 5.3 and 5.4 contain annual estimates of the GPRIM and GSPEC models. Note that while they provide time-varying estimates, the results obtained are ‘noisy’, as can be readily discerned in the figures discussed below. They nevertheless serve as a basis for comparing estimates of other, more sophisticated methods, i.e. polynomial expansions and moving window regression parameter paths. Moving window regressions are analogous to the moving averages described above, except that multivariate models are estimated and each window includes a time series of cross-sections for the forty-eight contiguous states. The longer the time window is, the smoother are the resulting Table 5.4 Annual regression estimates for GSPEC, model (5.6): estimated coefficients and p values Year
R2
F
Intercept
LSPECR
GPOP
1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975
–0.01 0.17 0.00 0.13 0.12 0.16 0.22 0.29 0.50 0.04 0.53 0.03
0.7 0 5.9 2 0.9 0 4.5 1 4.1 1 5.6 2 7.5 2 10.7 3 24.8 4 2.1 0 27.3 4 1.8 0
3.49 1 4.59 3 3.56 3 5.28 3 3.86 2 5.87 3 4.17 3 5.57 3 2.84 1 2.02 0 2.60 1 1.46 0
0.0228 0 –0.0401 0 –0.0092 0 –0.0321 0 0.0020 0 0.0005 0 –0.0490 1 –0.0538 1 –0.0305 0 –0.0047 0 –0.0258 0 0.0162 0
0.14 0 1.23 2 0.31 0 0.93 2 0.96 1 1.25 2 1.65 3 1.02 2 1.67 3 0.58 1 1.36 3 1.52 1
S.A.FOSTER, W.L.GORR, AND F.C.WIMBERLY 79
R2
Year
F
Intercept
LSPECR
GPOP
1976 0.13 4.6 1 17.71 2 –0.0223 2 –0.40 0 1977 0.50 24.9 4 14.30 3 –0.0995 3 1.85 3 1978 0.01 1.2 0 2.76 1 0.0226 0 –0.02 0 1979 0.42 17.9 4 3.20 1 –0.0156 0 1.12 3 1980 0.33 12.4 4 9.68 3 –0.0420 1 0.91 3 1981 –0.03 0.4 0 4.44 2 –0.0092 0 0.21 0 1982 0.18 6.0 2 4.05 3 0.0027 0 0.45 2 1983 0.13 4.5 1 2.99 1 0.0038 0 0.90 2 Notes: (1) Shown is adjusted R2. (2) Statistical significance (shown after each coefficient estimate): 0, not significant; 1, 0. 05 level; 2, 0.01 level; 3, 0.001 level; 4, 0.0001 level.
parameter paths. Of course, moving windows collapse to annual regressions (as in Tables 5.3 and 5.4) as the length of the window is decreased. The annual regressions, besides being too noisy, also lack power in individual parameter t tests because of fewer degrees of freedom. While longer windows are therefore desirable to increase smoothing and power, at the other extreme a single regression for all cross-sections pooled together no longer permits any time parameter variation. Here we chose to use three-year moving averages to balance smoothing and responsiveness. These directly estimate the more general timevarying parameter specification of model (5.4), e.g. for primary care and specialty physicians (5.5)
Table 5.5 Three-year moving-average window regressions for GPRIM, model (5.5): estimated coefficients and p values (n=144) Year
R2
F
Intercep Time t
Time2
LPRIM R
GPOP
RHO
1965
0.28
14.8 4
10
00
–0.2 0
0.65 4
0.161 0
1966
0.20
9.9 4
72
–7 3
1.0 3
0.59 4
0.118 0
1967
0.26
13.4 4
–6 0
30
–0.0 0
0.80 4
0.108 0
1968
0.25
13.0 4
–30 3
13 3
–1.4 4
0.79 4
0.116 0
1969
0.47
32.3 4
103 4
–36 4
3.1 4
1.19 4
1970
0.45
29.7 4
–123 4
33 4
–2.3 4
–0. 0047 0 0.0040 0 –0. 0069 0 –0. 0023 0 0.0018 0 0.0597 1
–0.124 0 –0.090 0
0.81 4
80 DRIFT ANALYSIS AND THE EXPANSION METHOD
Year
R2
F
Intercep Time t
Time2
LPRIM R
GPOP
RHO
1971
0.22
10.8 4
–26 0
70
–0.5 0
0.74 4
0.162 0
1972
0.21
10.5 4
61 0
–13 0
0.7 0
0.95 4
1973
0.16
8.0 4
55 0
–12 0
0.6 0
1974
0.15
7.4 4
–55 0
10 0
–0.4 0
1975
0.15
7.5 4
93 0
–16 0
0.7 0
–0.001 0 –0.133 0 –0.007 0 0.021 0
1976
0.27
13.9 4
–14 0
10
0.0 0
1977
0.37
22.1 4
–591 4
87 4
–3.2 4
1978
0.46
31.6 4
–138 4
4.6 4
1979
0.47
39.8 4
1, 038 4 –766 4
95 4
–2.9 4
1980
0.52
30.3 4
–309 1
39 1
–1.2 2
1981
0.45
30.3 4
0.0752 3 0.0477 1 0.0166 0 –0. 0050 0 –0. 0095 0 –0. 0533 1 –0. 0624 2 –0. 0658 2 –0. 0372 1 –0. 0447 1 –0. 0322 1 –0. 0381 2
0.78 4 0.74 4 0.36 1 0.57 1 0.45 1 0.65 3 0.88 4 1.05 4
–0.117 0 –0.124 0 –0.094 0 –0.084 0 –0.069 0 –0.057 0 0.037 0
1, 200 –133 4 3.7 4 0.87 4 4 1982 0.48 39.9 4 –1, 203 127 4 –3.3 4 0.43 2 4 Notes: (1) Shown is adjusted R2. (2) Statistical significance (shown after each coefficient estimate): 0, not significant; 1, 0. 05 level; 2, 0.01 level; 3, 0.001 level; 4, 0.0001 level. (3) RHO is an estimate of serial autocorrelation. Table 5.6 Three-year moving-average window regressions for GSPEC, model (5.6): estimated coefficients and p values (n=144) Year
R2
F
Intercep Time t
Time2
LSPEC R
GPOP
RHO
1965
0.07
3.6 2
52
–1 0
0.08 0
0.44 2
1966
0.15
7.3 4
10 2
–4 0
0.75 1
1967
0.16
7.2 4
–4 0
40
–0.37 0
0.73 4
–0.044 0 –0.053 0 0.146 0
1968
0.31
16.8 4
23 1
–9 1
0.99 1
1.01 4
0.055 0
1969
0.37
22.4 4
–99 4
36 4
–3.07 4
–0. 0080 0 –0. 0291 1 –0. 0130 0 –0. 0092 0 –0. 0170 0
1.24 4
0.053 0
0.87 4
S.A.FOSTER, W.L.GORR, AND F.C.WIMBERLY 81
Year
R2
F
Intercep Time t
Time2
LSPEC R
GPOP
RHO
1970
0.41
25.6 4
116 4
–30 4
2.03 4
1.26 4
0.083 0
1971
0.37
21.7 4
–8 0
30
–0.20 0
1.41 4
0.092 0
1972
0.31
16.9 4
–13 0
40
–0.28 0
1.09 4
1973
0.36
20.8 4
67 0
–13 0
0.62 0
1974
0.14
6.7 4
69 0
–13 0
0.64 0
1975
0.08
4.0 2
–190 0
33 0
–1.35 0
1976
0.26
13.6 4
666 3
–104 3
4.14 3
1977
0.68
75.1 4
206 4
–7.32 4
1978
0.62
55.6 4
–1, 439 4 685 4
–87 4
2.80 4
1979
0.49
35.0 4
595 4
–76 4
2.43 4
1980
0.51
38.2 4
135 4
–3.95 4
1981
0.46
31.7 4
–1, 139 4 765 4
–83 4
2.27 4
1982
0.08
4.0 2
–239 1
26 1
–0.67 1
–0. 0328 2 –0. 0432 3 –0. 0303 1 –0. 0213 0 –0. 0078 0 –0. 0791 1 –0. 1027 2 –0. 0498 3 –0. 0314 1 –0. 0125 0 –0. 0218 1 –0. 0149 0 –0. 0005 0
–0.184 0 –0.246 2 –0.204 0 –1.245 4 –0.418 4 –0.128 0 –0.197 0 –0.299 2 –0.218 1 –0.018 0 –0.072 0
1.21 4 1.11 4 0.94 1 1.08 1 0.79 4 0.95 4 0.71 4 0.84 4 0.59 4 0.46 2
Notes: (1) Shown is adjusted R2. (2) Statistical significance (shown after each coefficient estimate): 0, not significant; 1, 0. 05 level; 2, 0.01 level; 3, 0.001 level; 4, 0.0001 level. (3) RHO is an estimate of serial autocorrelation.
(5.6) Table 5.5 provides time-specific window regression estimates and statistical significance levels for model (5.5). Table 5.6 supplies the same results for model (5.6). Both tables indicate that population growth, GPOP, is significant in nearly every window, while the lagged physician density levels frequently are. Serial correlation in moving window regressions Time series model estimates often have serial correlation in their residuals. To evaluate serial correlation for the pooled cross-sectional time series model of this
82 DRIFT ANALYSIS AND THE EXPANSION METHOD
Figure 5.1 Parameter paths for LPRIMR in the GPRIM model
paper (i.e. time series for each state with resulting data pooled), we estimated the autoregressive AR(1) model (5.7) from the residuals of each window regression. These analyses reveal that thirteen of the thirty-six regressions had significant negative serial correlation (modal p value 0.0001), with seven in the primary care regressions and six in the specialty regressions. While the resulting parameter estimates for models (5.5) and (5.6) are unbiased for windows with serial correlation, they are inefficient. Thus we decided to remove the serial correlation. We applied the Cochrane-Orcutt and Durbin procedures without success; there were practically no changes in the results for models (5.5) and (5.6) after transformation. Closer analysis of the negative correlation revealed it to be an artifact of the three-year window regressions and nationwide deviations in time trends. All cases of negative serial correlation involved windows in which the second year deviated relatively widely from the ‘time trend’ defined by the first and third years, either higher or lower than the trend. Thus constant parameters estimated by regression analysis for such a window generally had alternating signs in residuals over time. To provide a naive accounting for missing variables and model structure, we applied a functional expansion, but In the small’, with polynomials in time for each window. This expansion required either a local maximum or minimum possible at year two of a window, so we employed a second-order polynomial in time to expand the intercept of each model and window. This works well for primary care physicians, as seen in Table 5.5, as all seven cases of serial correlation are eliminated. For specialty physicians, as seen in Table 5.6, one case of serial correlation is eliminated, three are reduced, and two cases stubbornly remain. (Prior to the expansions, four cases of serial correlation
S.A.FOSTER, W.L.GORR, AND F.C.WIMBERLY 83
Figure 5.2 Parameter paths for GPOP in the GPRIM model
Figure 5.3 Parameter paths for LSPECR in the GSPEC model
were significant at the 0.0001 level, one was significant at the 0.0007 level, and one was significant at the 0.007 level for specialty physicians.) Comparison of expansion and drift analysis parameter paths Figures 5.1–5.4 are overlays of parameter paths from the reference annual regressions, a fourth-order polynomial expansion, and the three-year moving window regressions. Some of the polynomial estimates of parameter paths did
84 DRIFT ANALYSIS AND THE EXPANSION METHOD
Figure 5.4 Parameter paths for GPOP in the GSPEC model
not fit the annual regression estimates of parameters very well. For example, Figure 5.1 contains the estimates of LPRIMR’s parameter from model (5.5). The polynomial, while qualitatively correct in shape, is unable to attain the clear maximum in 1971. The window regression estimates track the annual regression estimates more closely than the polynomial expansion while the noise is considerably reduced relative to the annual regressions. This is a good example of the benefits of a drift analysis estimator. Figure 5.3 depicts an interesting result in that the drift and polynomial expansion models both differ markedly from the annual regressions but agree with each other, more or less, in the vicinity of 1975. This is because they both represent smoothed data in which the high value in 1975 is combined with the low value in 1977. INTERPRETATION OF RESULTS Both the drift analyses and the polynomial expansions combine to yield insights into the locational dynamics of physicians. Let us first examine the estimated parameters and parameter time paths in model (5.5) for primary care physicians. Table 5.3 provides time-specific parameter estimates and statistical significance levels for the intercept as well as for the coefficients of LPRIMR and GPOP. Time paths for the latter two parameters are found in Figures 5.1 and 5.2 respectively. Focusing on the path of LPRIMR as determined by the window regressions, we see that the parameter is initially stable around zero. In 1970 it becomes strongly positive with a peak in 1971 and it remains positive through 1973, after which it is negative.
S.A.FOSTER, W.L.GORR, AND F.C.WIMBERLY 85
On the basis of federal health manpower policy, we did not expect the positive maximum in the early 1970s. However, this period does coincide with two potentially important factors. The Medicare and Medicaid programs began in 1966 and federal expenditures for these programs increased rapidly in the years following, resulting in a dramatic increase in demand for physician services. Also, the relative shortage of primary care physicians peaked at about this time. These two factors acted together to increase the practice opportunities for physicians everywhere. One might thus expect physicians to concentrate in those areas attractive to physicians, i.e. those that already had a high density of physicians. After 1971, the physician density parameter estimates follow a decreasing trend, suggesting an increase in the relative importance of deglomerative market forces. This trend persists through 1976, with the sign becoming negative in 1974, which is consistent with expectations associated with an approaching saturation of physicians in physician-rich areas. The negative sign suggests a tendency for the proportional distribution of physicians to become more equitable. During the remainder of the time frame, the estimates are stable at a negative level, indicative of a new equilibrium behavior of primary care physicians in response to LPRIMR. Turning to model (5.6), which describes the behavior of specialty care physicians, and in particular to the time path of the parameter of LSPECR as seen in Figure 5.3, we see that the significant result is the stability of the estimates with respect to time. There is no evidence of a systematic shift in the locational behavior of specialists relative to physician density and, consequently, no evidence that tighter markets for specialists are affecting locational behavior. On the other hand, the systematic variation in the parameter for population growth, GPOP, is quite clear in Figure 5.4 (and Table 5.6). While the parameter is initially quite low, the responsiveness of specialists to GPOP increases dramatically. The GPOP parameter exceeds 1.0 in 1968 and continues to climb through 1971, after which it begins a decreasing trend until it drops below 1.0 in 1977. The GPOP parameter time path in model (5.6) is consistent with a lagged response of specialists to practice opportunities materializing before specialist production permitted them to be filled. Hence, as supply began to grow, specialists responded to current shifts in population as well as to population growth from, say, a decade earlier. As the supply continued to increase the response to population returned toward an equilibrium value. CONCLUSION In this paper we have compared drift and expansion models as policy evaluation tools. We began with an initial model which incorporated factors which, we argued, influence physicians’ choice of practice location by relating percentage growth in primary and specialty care physicians to physician density and growth
86 DRIFT ANALYSIS AND THE EXPANSION METHOD
in the general population. We then explored parameter drift in these models using a variety of approaches. Each approach has comparative advantages which make it useful for evaluating such policies. In particular, the polynomial-based expansion method supports extensive hypothesis testing and is applicable when interpolation or extrapolation is indicated. On the other hand, moving window regressions provide an effective and simple means for detecting parameter drift, especially for pooled cross-sectional time series data such as the state level physician data used in this paper. In terms of substantive results, we found clear impacts of federal policies on the supply and demand relationship for physicians. The results presented for primary care physicians are initially consistent with locational behavior dominated by professional and social amenity factors, but as the supply of primary physicians increases, market forces become dominant. Meanwhile, the results for specialty care physicians are consistent with the domination of economic concerns. ACKNOWLEDGMENTS This research was funded by NSF Grant SES–8700910 and draws on work by Foster (1988). REFERENCES Bretschneider, S.I. and Gorr, W.L. (1983) ‘Ad hoc model building using time-varying parameter models’, Decision Sciences 14:221– 39. Carbone, R. and Longini, R. (1977) ‘A feedback approach for automated real estate assessment’, Management Science 24:241–8. Casetti, E. (1972) ‘Generating models by the expansion method: applications to geographic research’ , Geographic Analysis 4:81– 91. ——(1973) ‘Testing for spatial-temporal trends: an application of urban population density trends using the expansion method’, Canadian Geographer 17:127–36. ——(1982a) ‘Mathematical modeling and the expansion method’, in R.B.Mandal (ed.) Statistics for Geographers and Social Scientists, pp. 81–95, New Delhi: Concept Publishing. ——(1982b) ‘Drift analysis of regression parameters: an application to the investigation of fertility development relations’, Modeling and Simulation 13:961–6. ——(1986) ‘The dual expansion method: an application for evaluating the effects of population growth on development’, IEEE Transactions on Systems, Man, and Cybernetics SMC–16:29– 39. Champion, D.J. and Olson, D.B. (1971) ‘Physician behavior in southern Appalachia: some recruitment factors’, Journal of Health and Social Behavior 12:245–52. Coombs, D.W., Miller, H.L. and Roberts, R.W. (1985) ‘Practice location preferences of Alabama medical students’, Journal of Medical Education 60:697–706.
S.A.FOSTER, W.L.GORR, AND F.C.WIMBERLY 87
Cooper, J.K., Heald, K., Samuels, M. and Coleman, S. (1975) ‘Rural or urban practice: factors influencing the location decision of primary care physicians’, Inquiry 12: 18–25. Diseker, R.A. and Chappell, J.A. (1976) ‘Relative importance of variables in determination of practice location: a pilot study’, Social Science and Medicine 10: 559–63. Foster, S.A. (1988) ‘Analyses of the changing geographic distribution of physicians in the United States from 1950 through 1985’, Ph.D. dissertation, Ohio State University. Foster, S.A. and Gorr, W.L. (1986) ‘An adaptive filter for estimating spatially-varying parameters: application to modeling police hours spent in response to calls for service’, Management Science 32:878–89. Fruen, M.A. and Cantwell, J.R. (1982) ‘Geographic distribution of physicians: past trends and future influences’, Inquiry 19:44–50. Lave, J.R., Lave, L.B. and Leinhardt, S. (1975) ‘Medical manpower models: need, demand, and supply’, Inquiry 12:97–125. Makridakis, S. and Wheelwright, S.C. (1977) ‘Adaptive filtering: an integrated autoregressive/moving average filter for time series forecasting’, Operations Research Quarterly 28:425–37. Parker, R.C. and Sorensen, A.A. (1978) ‘The tide of rural physicians: the ebb and flow, or why physicians move out of and into small communities’, Medical Care 16:152–66. —— and ——(1979) ‘Physician attitudes toward rural practice: answers and questions that were not asked’, Forum on Medicine 2:411–16. Parker, R.C. and Tuxill, T.G. (1967) ‘The attitudes of physicians toward small-community practice’, Journal of Medical Education 42:327–44. Schwartz, W.B., Newhouse, J.P., Bennett, B.W. and Williams, A.P. (1980) ‘The changing geographic distribution of board certified physicians’, New England Journal of Medicine 303:1032–8. Steinwald, B. and Steinwald, C. (1975) ‘The effect of preceptorship and rural training programs on physicians’ practice location decisions’, Medical Care 13:219.
6 PERSONAL CHARACTERISTICS IN MODELS OF MIGRATION DECISIONS: AN ANALYSIS OF DESTINATION CHOICE IN ECUADOR Mark Ellis and John Odland The destination choice component of migration behavior has generally been analyzed by means of aggregate models, most often gravity models in which the volumes of place-to-place migration flows depend on the population sizes of origins and destinations and on the distances separating them. The human capital perspective on migration (Sjaastad 1962; Molho 1986) indicates, however, that heterogeneity in the migrant population may be associated with heterogeneity in destination choices. Human capital models treat migration behavior as a form of investment undertaken by individuals in order to improve their long-term returns to participation in localized labor markets, or their returns from accessibility to other localized opportunities. Conditions for this kind of investment behavior are likely to vary across members of a heterogeneous population, because of differences in liquidity, because place-to-place variations in returns to labor market participation may differ across individuals with different qualifications, and because interregional differences in returns are calculated over different time horizons for persons of different ages. Most analyses within the human capital framework have concentrated on the decision to leave an origin (Nakosteen and Zimmer 1982; Schaeffer 1985) rather than the choice among an array of destinations. Analyses of the effects of individual characteristics on migration decisions in developing country contexts have also emphasized the effects on outmigration decisions rather than the choice of destination (Brown and Goetz 1987). We analyze the effects of some personal characteristics on destination choice in this paper, by fitting a series of origin-specific destination choice models using disaggregate data for migration flows among the cantones of Ecuador during the 1971–4 period. Our model of destination choice is constructed by applying the expansion method (Casetti 1972, 1982) to an initial model of destination choice which has the same functional form as the origin-specific gravity model. This model, however, is derived on the basis of random utility theory and is one component of a general model of migration decisions which includes the decision to move as well as the choice of destination (Moss 1979; Odland and Ellis 1987). The functional equivalence of the destination choice component of the model with the gravity model has been demonstrated by Anas (1983).
ANALYSIS OF DESTINATION CHOICE IN ECUADOR 89
We expand this model in two stages in order to examine individual differentiation in the destination choices of Ecuadorian migrants. A distinction between the attractions of urban and rural destinations has been central in models of migration in developing countries including the Harris-Todaro model (Harris and Todaro 1970; Todaro 1976) and the first stage of the expansion incorporates the distinction between rural and urban destinations. This expansion leads to particular problems in the case of Ecuador because urbanization and population size are strongly collinear over the upper part of the range of regional population sizes, thus making it impossible to estimate separate parameters for the effects of urbanization and population size. Complications of this kind are not infrequent in analyses of data from restricted geographic contexts and we adapt the expansion method in a way which makes it possible to estimate distinct effects for the population sizes of urban and rural destinations for the lower part of the range of population sizes. The upper part of the range, which only contains destinations that are urbanized, is represented by a single categorical variable. Finally, the coefficients of this model are expanded in order to examine the effects of age and gender on destination choice. A DISAGGREGATE MODEL OF MIGRATION BEHAVIOR A general model of disaggregate migration behavior can be established within the random utility formalism by specifying the utilities for alternative destination regions as random variables. Residents of an origin region assign utilities to the members of a choice set Ri, which includes the set of possible destination regions as well as the origin. This assignment of utilities may depend on the objective characteristics of the regions, such as their distance from the origin, or characteristics of the decision-makers, such as their ages. The assignment of utilities also includes stochastic components which leave some uncertainty about the utility assigned to each region. The utility of destination j, for a resident of origin i, may be written as (6.1) where the vectors zj and zij contain observed characteristics of the regions and the decision-makers, and the utility uij of region j depends on functions of those characteristics, v(zj) and v(zij), and also on the corresponding error terms ej and eij. The term v(zj) is a component of the utility of destination j which is independent of the origin of the decision-maker (and ej is a corresponding error term), while v(zij) is an origin-specific component of the utility of the destination. The arguments of this component, zij, are likely to include the distance between the origin and destination j. The utility of the origin, as one of the possible locational choices, reduces to
90 M.ELLIS AND J.ODLAND
Residents of origin i presumably make a locational decision by selecting the region in Ri where utility is maximized, but that choice is uncertain because the error terms make the uij a set of random variables. Consequently the choice of destinations is analyzed in terms of the probability, or the odds, of choosing alternative j from the choice set Ri. This is merely the probability that the value of the random variable uij exceeds the values of all other random variables corresponding to the utilities of regions in Ri. That probability depends on values in zi and zij as well as an associated set of parameters which measures the importance of those variables in the assignment of utilities; but it also depends on the distributions of the error terms ei and eij. Alternative functional forms for models of the probabilities of selecting alternatives can be derived from alternative assumptions about the distributions of these error terms (McFadden 1981). It is useful to notice that the probability of migrating from origin i and selecting destination j can be written as the product of two probabilities, for leaving the origin and for choosing destination j: where the probability p(m|i) of outmigration may depend on the set of utilities for the origin and all destinations, and the probabilities p(j|m, i) for destination choice given outmigration from origin i depend on the utilities of the set of destinations (excluding the origin). A tractable functional form, the nested logit model, can be derived by assuming that ej and eij are independent; that the eij are independently and identically Gumbel-distributed; that the variance of the ej is zero; and that the maximum of ui and uij is also Gumbel-distributed (Ben-Akiva and Lerman 1985:286–7). These assumptions lead to (6.2) for the probability of outmigration, where the summations are over destinations and ø is a parameter whose range of values is 0′ ø<1. The probability of choosing destination j, given outmigration from region i, is (6.3) where the summation is over the set of destinations, excluding the origin. We are primarily interested in destination choice in this analysis and the equation for p(j|m, i) can be written as an initial model which has the same functional form as the familiar gravity model by specifying population size nj as a characteristic of destinations and distance dij as a characteristic of origindestination pairs. That is
where β 1 and β 2 are parameters. The model for destination choice then becomes
ANALYSIS OF DESTINATION CHOICE IN ECUADOR 91
(6.4) which corresponds to an origin-specific gravity model. It is necessary to fit the model to observed odds ratios rather than probabilities and the model can be rearranged to form a model for the odds against choosing destination j: (6.5) be the number of Let Fj be the number of migrants moving from i to j and let migrants moving from i to alternative destinations. Then an observed value of is an estimate of the underlying probability odds . Substituting and taking logarithms yields a linear form where ej is an error which includes the errors associated with using the observed as estimates of the underlying probability odds. The term is a constant for any origin so the model may be rewritten as (6.6) This form of the model can be fitted by generalized least squares (Wrigley 1985). The constant term b0 has a particular interpretation in this model, however, because it also appears in the equation for the decision to move, where it is interpreted as the expected maximum utility of the set of destinations, for a potential migrant from origin i. Hence the expected utility available from the set of destinations may condition the decision to leave the origin, to a degree that is controlled by the parameter ø. The value ø=1 corresponds to the special case of a nonnested logit model (Moss 1979). EXPANDING PARAMETERS WITH A RESTRICTED CHOICE SET The Ecuadorian data report movements among the 113 mainland cantones of Ecuador in the 1971–4 period for a sample of 760, 764 individuals (we exclude movements to and from the Galapagos). Migration is defined as movement from one cantone to another and variables such as population and distance are calculated for the cantones. The set of origin-specific destination choice models is defined by using the nineteen mainland provinces as origins, however. Each province contains several cantones and so distances from an origin province to a particular destination may assume several values, depending on the cantone of origin, and the analysis includes flows within the origin provinces. The expansion of the parameters of the initial model takes place in two stages. The first step takes account of a qualitative distinction between rural and urban destinations, while the second enlarges the parameters to accommodate
92 M.ELLIS AND J.ODLAND
differences in migration behavior associated with the personal characteristics of migrants. The distinction between urban and rural destinations has assumed considerable importance in research on migration in developing countries, where migration is an important component of rapid urbanization. The Harris-Todaro model explains rapid urbanization as a consequence of persistent differentials in wages between urban and rural labor markets. Where these differentials exist they may be more attractive to younger people who expect to benefit from the enhanced urban wages over a longer working life. Research in Latin America has also indicated that migration streams to urban areas contain disproportionate numbers of women, possibly because of the availability of service employment in cities (Fields 1982). The destination choice model could be expanded to accommodate a distinction between rural and urban destinations by classifying destinations as urban or rural and defining each of the parameters as a function of a binary variable which indicates urbanization. In terms of the disaggregate choice framework this amounts to enlarging the choice sets of the decision-makers from a set in which alternatives are defined by their distance and population to one in which alternatives are defined by distance, population, and urbanization. The choice set for migrants within Ecuador is restricted, however, to a subset of the possibilities implied by this expansion. Ecuador includes two large cities of approximately 1 million population, Quito and Quayaquil, and no other destinations of more than 250,000, although the populations of both urban and rural cantones range up to the latter figure. Consequently choices of destinations with more than 250,000 population are restricted to alternatives which are also urban places. This restriction on the choice set also restricts the set of parameters that can be estimated on the basis of the behavior of Ecuadorian migrants. Since the only destinations of more than 250,000 are also urban destinations the two variables are strongly collinear over the full range of population sizes and separate coefficients for population size and urbanization cannot be estimated on the basis of an expanded model where the only distinctions are between rural and urban places. Distinct coefficients can be obtained for population size and urbanization over the range of population sizes up to 250,000, however, by representing urbanization in terms of three categories and expanding the model such that it contains no distinct population effect for the largest cities. Quito and Quayaquil are classified as ‘metropolitan areas’ by means of a variable which is coded Mj =1 for these destinations and Mj=0 for others. The remaining cantones are classified as urban if more than 50 percent of their population resides in urban places and this distinction is represented by a dummy variable coded as Uj=1 for urban cantones and Uj =0 for rural and metropolitan cantones. Similarly, a third dummy variable is coded Rj=1 for rural cantones and Rj=0 otherwise. Three dummy variables are needed to represent the three categories because the model is expanded in terms of the sums of these variables.
ANALYSIS OF DESTINATION CHOICE IN ECUADOR 93
The parameters of the destination choice model are then expanded in a way which excludes the unmeasurable effect of population size for the two large cities:
These coefficients include a coefficient for the attraction o the two metropolitan areas (β 01); a coefficient for population size (β 10) which applies to nonmetropolitan destinations; and a coefficient (β 11) for urbanization which applies to urban places other than the two metropolitan areas. This kind of expansion divides the choice set of destinations into three subsets: a subset of metropolitan areas where no separate parameter for population size can be estimated, and rural and urban subsets where this parameter can be estimated. Since only two metropolitan destinations are defined there is not enough variation in the distances from any origin to estimate an interaction effect between distance and the metropolitan category in origin-specific models. A single parameter for interactions between distance and nonrural destinations is obtained by expanding the distance parameter as Collecting these terms gives a model which is expanded to take account of urbanization in a restricted choice set, but not expanded to account for variation in the individual characteristics of decision-makers:
Each of the model’s variables, distance, population, and urbanization, may have different effects on the utilities of alternative destinations for different subpopulations of decision-makers and the coefficients can be expanded further to identify interactions with individual characteristics. Individuals are crossclassified by gender and by their age in 1971, with four age categories, 0–15 years, 15–20 years, 20– 25 years, and over 25 years. Each of the parameters above then assumes the form Table 6.1 Coefficients of the terminal model for age categories of males Effects in the initial model
Age categories
0–15
15–20
20–25
Over 25
′ 0000
′ 0000+′ 0001
′ 0000+′ 0002
′ 0000+′ 0003
′ 0000+′ 0002 +′ 0100+′ 0102
′ 0000+′ 0003 +′ 0100+′ 0103
Constant Urban and rural destinations Metropolitan destinations Population
′ 0000+′ 0100 ′ 0000+′ 0001 +′ 0100+′ 0101
94 M.ELLIS AND J.ODLAND
Effects in the initial model
Age categories
0–15
15–20
20–25
Over 25
Rural destinations Urban destinations Distance Rural destinations Urban and metropolitan destinations
′ l000
′ l000+′ l00l
′ 1000+′ 1002
′ 1000+′ 1003
′ 1000+′ 1100 ′ l000+′ l001+′ 1100 +′ ll0l
′ l000+′ l002+′ 1100 +′ ll02
′ 1000+′ 1003 +′ 1100+′ 1103
′ 2000
′ 2000+′ 2002
′ 2000+′ 2003
′ 2000+′ 2002 +′ 2100+′ 2102
′ 2000+′ 2003 +′ 2100+′ 2103
′ 2000+′ 2001
′ 2000+′ 2100 ′ 2000+′ 2001 +′ 2100+′ 2101
where F, A1, A2, and A3 are binary variables: F=1 indicates that an individual is female; A1=1 indicates the age category 15–20 years, A2=1 indicates 20–25 years, and A3=1 indicates more than 25 years. The terminal model has forty-eight parameters and one such model is fitted for each of the nineteen origin provinces. Since the initial model, equation (6.6), is expanded solely in terms of categorical variables, however, it can be presented as a set of eight models, one for each category of individual. The model for each category of individual has six parameters, with each parameter taking on one of two distinct values depending on the destination type. A listing of the expanded parameters for males is arranged to correspond with the variables in the initial model in Table 6.1. The expanded parameters for females include additional terms for gender (β ij10), and for interactions between gender and age (β ij11, β ij12, or β ij13). ESTIMATING THE PARAMETERS Parameter estimates are obtained for each origin-specific model on the basis of a multiway table which represents the choice set for each origin. Each entry in the table is a subset of alternatives characterized by similar distances, populations, and degrees of urbanization. Distance is categorized in a way which reflects the varying location of the origins with respect to the set of destinations. Distances from each origin are categorized into ten distance bands, each containing the same number of destinations but with differing radii depending on the location of the origin. Consequently, distance bands are narrower for origins in the central part of the country than for origins near the periphery. Destination populations are divided into four categories containing equal numbers of destinations. Choices
ANALYSIS OF DESTINATION CHOICE IN ECUADOR 95
among pairs of these alternatives are characterized by a binomial sampling error and the parameters are fitted by generalized least squares in order to account for the heteroskedasticity introduced by this sampling situation. Expansion of the table to account for personal variables yields, for each origin, a table with 240 cells (corresponding to three levels of urbanization, four categories of population, ten distance levels to rural destinations, and ten distance levels to urban or metropolitan destinations) which contains forty structural zeros corresponding to intersections of population size and the metropolitan category. The size of the Ecuadorian sample, which contains 78, 641 inter-cantone movers, is sufficient to provide estimates for the frequencies for most categories for most origins, although the tables do contain a small number of sampling zeros. These sampling zeros occur where the sample for a particular origin does not contain any migrants who chose a destination corresponding to a particular distance band, population size, and level of urbanization. These sampling zeros are removed by substituting pseudo-Bayesian estimates for each of the destination choice frequencies (Bishop et al. 1975:401– 33). Each pseudo-Bayesian estimate is
where x is the observed frequency, y is a Bayesian prior probability for the same frequency, N is the total for the entire table, and K is calculated as
with the summation over the entire set of categories. The Bayesian priors, β , are obtained as the national level frequencies for each destination type. This procedure eliminates the sampling zeros and produces very minor changes in the estimates of the other frequencies. DESTINATION CHOICE IN ECUADOR, 1971–4 Parameter estimates for each origin are shown in Tables 6.2 (a), 6.2 (b), and 6.2 (c), where each table contains estimates for provinces within the three major regions of Ecuador, the Costa, Sierra, and Oriente. The origin-specific model includes a total of forty-eight parameters but, for any particular origin, some of these parameters assume values which are not significantly different from zero. The rows of the tables correspond to only those parameters which assume nonzero values for at least one origin within the region. The main effects, which correspond to the initial model, indicate the usual distance decay effect, although there is considerable variation in the parameter across origins. (Positive parameter estimates correspond to a distance decay effect because the model is calculated for the odds against choosing a destination.) The parameter for population size is significant but positive for most origins, indicating that the odds against choosing a destination are greater for more
96 M.ELLIS AND J.ODLAND
populous destinations. For some origins, this effect is reversed for urban destinations, where the parameter has the expected negative sign with respect to population. The unexpected direction of the population effect for some origins may result from the classification of the most populous destinations into a separate category of metropolitan places. The parameter for metropolitan destinations assumes a significant value in most cases and its value indicates that substantial portions of migrants from most origins choose one of the two metropolitan destinations. The large values for this parameter for two of the provinces in the Oriente indicate that migrants from these origins are less likely to choose metropolitan destinations than are migrants from other origins. The effects of distance, population, and urbanization are differentiated over subpopulations of differing ages and genders although this differentiation is not uniform across origins. Some such differentiation of the main effects over different categories of age and gender occurs in fourteen of the nineteen origins and, for a few provinces, age and gender interactions indicate that females and younger migrants are more likely than others to choose metropolitan destinations. Other interactions indicate that distance decay effects for some origins are less pronounced in some age or gender categories, especially with respect to distance to urban or metropolitan destinations; and there is some age-gender differentiation in the responses to population size. Table 6.2(a) Parameter estimates for origins in the Costa region Effect: Parameter
Constant Main: ′ 0000 Age: ′ 0001 Age: ′ 0002 Age and gender: ′ 0011 Metropolitan destinations Main: ′ 0100 Age: ′ 0101 Age: ′ 0102 Population Main: ′ 1000 Age: ′ 1002 Urban and population Main: ′ 1100 Distance Main: ′ 2000
Provinces Esmaraldas El Oro
Guayas
Los Rios Manabi
2.7166
2.6122 0.3840 0.2519
3.8544
3.3168
2.7668 0.2292
–1.2994
–0.8466 –0.5725 –0.7477
–1.3641
–1.8790
–1.9825 –0.6796
0.0125 0.0056
0.0127
–0.0062
0.0060
–0.0035
–0.0235
–0.2025
0.2024
0.2389
0.2284
0.7078
0.5229
0.5004
ANALYSIS OF DESTINATION CHOICE IN ECUADOR 97
Effect: Parameter
Provinces Esmaraldas El Oro
Urban/metropolitan destination and distance Main: ′ 2100 Age: ′ 2101 Age and gender: ′ 2112
–1.2075
Guayas
Los Rios Manabi
0.3850 –0.1717
–0.1880
Table 6.2(b) Parameter estimates for origins in the Sierra region Effect: Parameter Constant Main: ′ 0000 Metropolitan destinations Main: ′ 0100 Population Main: ′ 1000 Gender: ′ 1010 Age: ′ 1001 Age: ′ 1002 Age and gender: ′ 1011 Urban and population Main: ′ 1100 Gender: ′ 1110 Age: ′ 1102 Distance Main: ′ 2000 Age: ′ 2001 Age and gender: ′ 2011 Urban/metropolitan destination and distance Main: ′ 2100 Gender: ′ 2110 Age: ′ 2101
Constant Main: ′ 0000
Provinces Azuay
Bolivar
Canar
Carchi
Chimborazo
1.6295
3.3579
2.9395
2.8837
3.3300
0.3554
–1.9982
–1.9376
–1.5450
–1.7306
0.0205
–0.0055
–0.0159
0 0034
0.0045 0.0130 0.0096 –0.0099 –0.0068
0.5719
–0.0058 0.0167 –0.0125 –0.0087 –0.1490 0.2409
0.1558
1.1176
0.9150
–0.5549
–0.3690
0.5052 –0.1895 0.3275
–0.0068
0.1900
0.4240
–0.1407
Cotopaxi Imbabura
Loja
Pichincha Tungurahua
3.0405
1.4723
3.0825
3.0710
2.3528
98 M.ELLIS AND J.ODLAND
Gender: ′ 0010 Metropolitan destinations Main: ′ 0100 Gender: ′ 0110 Age: ′ 0101 Age and gender: ′ 0103 Population Main: ′ 1000 Gender: ′ 1010 Urban and population Main: ′ 1100 Gender: ′ 1110 Age: ′ 1101 Age: ′ 1102 Age and gender: ′ 1111 Distance Main: ′ 2000 Gender: ′ 2021 Urban/metropolitan destination and distance Main: ′ 2100 Gender: ′ 2110
–0.2642
–2.6921 –0.5579
–1.4362
–1.0450 –0.3853 –0.5371 0.6384
–0.0108
0.0115
0.0098
–0.0080 –0.0166 0.0140
0.0083 0.0188 0.0111 –0.0154 –0.0073 0.0151
0.4189 0.3240
0.6197
1.0778
–0.5431 0.0986
0.4485
0.2615
Table 6.2(c) Parameter estimates for origins in the Oriente region Effect: Parameter
Constant Main: ′ 0000 Metropolitan destinations Main: ′ 0100 Population Main: ′ 1000 Urban and population Main: ′ 1000 Age: ′ 1101 Age and gender: ′ 1111 Distance
Provinces Morona Santiago Napo
Pastaza
Zamora Chinchipe
–0.4194
0.4922
1.1578
1.3971
2.8764
2.8927
1.6071
0.0583
0.0385
0.0183
–0.0199
0.0179
0.0087 –0.0076 –0.0246 0.0348
ANALYSIS OF DESTINATION CHOICE IN ECUADOR 99
Effect: Parameter
Main: ′ 2000 Urban/metropolitan destination and distance Main: ′ 2100
Provinces Morona Santiago Napo
Pastaza
Zamora Chinchipe
1.0804
0.5856
0.1354
0.4922
0.4795
CONCLUSIONS The interactions between personal and place-related variables in migration flows from most of the origins provide some support for the hypothesis that the destination choices of Ecuadorian migrants are conditioned by individual characteristics along the lines suggested by human capital models. The placerelated variables incorporated in the initial model affect destination choices from each of the origins but their effects are modified for most of the origins through interactions with the personal variables, age and gender. These migration decisions take place with respect to a set of alternatives which is restricted because the set of destinations within Ecuador does not provide a full range of the possible combinations of population size, urbanization, and distance; but the expansion method has been adapted to deal with choice behavior within this restricted framework. The patterns of interaction between personal and place characteristics are not uniform across origins, however, but vary substantially from place to place within Ecuador. Research by Brown and others (Brown and Jones 1985; Brown and Goetz 1987) has established that patterns of migration behavior are likely to differ considerably from place to place within a single country, in response to spatial variation in the nature of development processes. These interregional variations in the contexts of migration decisions apparently extend to the associations between personal variables and migration decisionmaking. ACKNOWLEDGMENTS This research was supported by the National Science Foundation, Grant SES 84– 19923. We are grateful to Lawrence A.Brown for providing the data, which he obtained with the support of National Science Foundation Grant SES 80–24565, and to Victoria A.Lawson for assistance with the data. REFERENCES Anas, A. (1983) ‘Discrete choice theory, information theory and multinomial logit and gravity models’, Transportation Research B17:13–23.
100 M.ELLIS AND J.ODLAND
Ben-Akiva, M. and Lerman, S. (1985) Discrete Choice Analysis: Theory and Application to Travel Demand, Cambridge, MA: MIT Press. Bishop, Y.Y., Fienberg, S.E. and Holland, P.W. (1975) Discrete Multivariate Analysis: Theory and Practice, Cambridge, MA: MIT Press. Brown, L.A. and Goetz, A.R. (1987) ‘Development-related contextual effects and individual attributes in Third World migration processes: a Venezuelan example’, Demography 24:497–516. Brown, L.A. and Jones, J.P. (1985) ‘Spatial variation in migration processes and development: a Costa Rican example of conventional modeling augmented by the expansion method’, Demography 22:327–52. Casetti, E. (1972) ‘Generating models by the expansion method: applications to geographic research’, Geographical Analysis 4:81–91. ——(1982) ‘Mathematical modeling and the expansion method’, in R.B.Mandel (ed.) Statistics for Geographers and Social Scientists, pp. 81–95, New Delhi: Concept Publishing. Fields, G.S. (1982) ‘Place-to-place migration in Colombia’, Economic Development and Cultural Change 30:539–58. Harris, J.R. and Todaro, M.P. (1970) ‘Migration, unemployment and development: a twosector analysis’, American Economic Review 60:126–42. McFadden, D. (1981) ‘Econometric models of probabilistic choice’, in C.Manski and D.McFadden (eds) Structural Analysis of Discrete Data, pp. 198–272, Cambridge, MA: MIT Press. Molho, I. (1986) ‘Theories of migration: a review’, Scottish Journal of Political Economy 33:396–419. Moss, W.G. (1979) ‘A note on individual choice models of migration’, Regional Science and Urban Economics 9:333–43. Nakosteen, R.A. and Zimmer, M. (1982) ‘The effects on earnings of interregional and interindustry migration’, Journal of Regional Science 22:325–41. Odland, J. and Ellis, M. (1987) ‘Disaggregate migration behavior and the volume of interregional migration’, Geographical Analysis 19:111–24. Schaeffer, P. (1985) ‘Human capital accumulation and job mobility’, Journal of Regional Science 25:103–14. Sjaastad, L.A. (1962) ‘The costs and returns of human migration’, Journal of Political Economy 70:80–93. Todaro, M.P. (1976) Internal Migration in Developing Countries: A Review of Theory, Evidence, Methodology and Research Priorities, Geneva: International Labor Organization. Wrigley, N. (1985) Categorical Data Analysis for Geographers and Environmental Scientists, New York: Longman.
7 ALTERNATIVE APPROACHES TO THE STUDY OF METROPOLITAN DECENTRALIZATION Shaul Krakover
Urban decentralization is one of the most powerful trends shaping our built environments. Yet traditional methods used for tracing this trend have concentrated on the relative weakening of the central city, instead of the resultant restructuring of the entire urban space. In this paper a methodology designed to encompass the structural changes occurring throughout the urban area, however defined, is developed and demonstrated. The suggested methodology is compared with the performance of previously used methods, utilizing data for the Tel Aviv, Israel, metropolitan region. The definition of the term decentralization is surrounded by a great deal of confusion. Some use this term to denote declines in the density of population, employment, or establishments in central cities (Mills 1970), while others restrict its use to the actual movement of population or establishments away from the central cities to their respective suburbs (Moses and Williamson 1967; Muller 1981). Another confusion involves the differentiation between the terms decentralization and deconcentration. Berry and Kasarda (1977) suggest letting the latter term denote decreases in central city densities while devoting the former to specify faster growth rates occurring in outer urban units relative to the growth that prevails in the central city. Although, etymologically, decentralization means negation of the center or away from the center, geographically it seems inconceivable to consider the changes in the center without examining consequent changes in the rest of the urban area. This study adopts Berry and Kasarda’s (1977) definition for decentralization. According to this definition it is not necessary that each and every individual or establishment actually move from the central urban unit to its outer parts; decentralization is produced even by those who move to the suburbs from other urban areas or from rural ones. It occurs through locational decisions that are directed toward outer units rather than toward the central part of the urban area. Thus, taken in its wider meaning, decentralization reflects locational preferences rather than actual central city-to-suburb migration. This interpretation is not incongruent with the definition suggested by Johnston et al. (1981) in The Dictionary of Human Geography.
102 S.KRAKOVER
Within the context set by the selected definition, four methods of tracing and measuring decentralization are identified. Two of these methods have been in use for a long time: (a) urban-suburban dichotomies, and (b) distance bands. The other two methods are relatively recent; they involve an application of the expansion method (Casetti 1972). The two more recent methods are (c) distance expansions and (d) trend surface expansions (Krakover 1986). The objective of this study is to compare the performances of these four methods. Each method is presented and applied to population growth in the urban region of Tel Aviv, Israel. The performances of the four methods are evaluated against a set of criteria derived from the adopted definition. Major emphasis is given to the applicability of the results obtained by each method to urban planning and population projection. CRITICAL REVIEW OF DECENTRALIZATION LITERATURE The dynamics of urban decentralization have been observed as early as the end of the nineteenth century (Weber 1899). Since then, hundreds of scientific studies have been devoted to the study of the decentralization phenomenon. The great majority of these studies apply the urban-suburban dichotomous method of comparing growth and change in the central city unit to respective measurements applied to the rest of the urban area (e.g. Harris 1943; Kain 1968; Berry and Cohen 1973; Phillips and Vidal 1983). Because of the poor spatial resolution of this approach, most of these studies couched their conclusions in terms of general trends which were revealed by examining large samples of metropolitan areas. Using formal notation, this method assumes that decentralization trends are prevalent if (7.1) where G indicates growth, cc is central city, sub is suburbs, and t is the time span. Such studies of decentralization, even those using sophisticated research methodologies (e.g. Steiness 1982), contribute little to our understanding of the spatial reorganization trends taking place within any specific metropolitan area. They do highlight the pervasiveness of decentralization trends and are an entry into the search for identifying causal relationships between propelling factors and decentralization; herein lies their utility to urban planning. The use of the primitive dichotomous definition could have been justified during the period when suburban sprawl was directed toward homogeneous, mainly residential subdivisions; today, however, the massive geographical expansion and growing spatial and structural heterogeneity of the suburbs (Erickson 1983) have left continued application of this approach inadequate, unless investigation is limited to the fate of central cities only. Studies using the second method, identified as the distance bands approach, do explicitly consider the suburbs’ spatial dimensions, though usually in a simplistic
THE STUDY OF METROPOLITAN DECENTRALIZATION 103
manner. In this method, the whole metropolitan space is subdivided into several rings (rl, r2,…, rn), for which growth levels are calculated and compared. A result like that described by (7.2) will lead to the conclusion that decentralization reached out to a distance as far as the second ring outside the central city. This method was applied by Hawley (1956) wherein he subdivided metropolitan areas into concentric rings of five miles each, to a distance of thirtyfive miles from the city center. Most other studies adopted this approach less meticulously, subdividing the urban space into three or four wide sections: central city, inner ring of counties or settlement units, and one or more outer rings (e.g. Hall and Hay 1980; Morrill 1985). According to this method aggregate values of growth for entire rings are compared in order to determine the geographical extent of decentralization. The distance bands method has several shortcomings: (a) it averages out within-ring variations—azimuthal variations are ignored and the effect of extreme observations is reduced—and (b) because it employs a discrete, stepped-distance function, the results obtained may vary according to the width of the distance bands. Like the previous method, distance bands may have some general planning implications. Metropolitan level planners may take into consideration the overall trends of growth and decentralization, while local planning agencies may find it rewarding to compare their local records with the averages obtained via the distance bands analysis. Some of the shortcomings of the distance bands method are remedied by the third method—the distance expansion model. This kind of model has been suggested conceptually by Blumenfeld (1954) and Boyce (1966), and applied empirically by Newling (1969) and Casetti (1973) in the realm of deconcentration and by Casetti et al. (1971), Lamb (1975), Krakover (1983, 1984, 1985), and Kellerman and Krakover (1986) in the realm of decentralization. This method, unlike the previous one, does not aggregate geographic observations. Instead, it takes into account each subunit and follows it through to the end of the analysis. In this way, each locale that enters the analysis as an input is independently identifiable at the output. The distance expansion model requires that all geographic observations be arranged according to their distances to the central city. Then, a theoretically compatible, distance-related initial model is selected. One possible form of such an initial model examines growth G at any distance d from the central city; a polynomial power series of distance D is raised to a theoretically or empirically justifiable power i as in (7.3) This initial model can be independently fitted to data pertaining to any specific point in time. The mathematical or graphical results for several time periods can be compared and analyzed. When this approach is selected, the temporal trends are
104 S.KRAKOVER
judged exogenous to the model, and cross-model statistical testing is not carried out. Alternatively, using the expansion method to expand the initial model’s parameters bi as functions of time T yields a set of expansion equations for the bi (i=0, 1,…, m) parameters: (7.4) In the expansion equations the temporal trend T is given the flexibility of being raised to any required power j. When substituted into the initial model, equation (7.3), they produce the following: (7.5) When this terminal model is applied to an appropriate data matrix (e.g. population growth in a geographically subdivided metropolitan region), it yields parameters aij which enable the portrayal of a sequence of varying-distance fitted curves, one curve for each time period entered into the analysis. The estimated parameters, along with the graphic presentation of the structure of growth as portrayed by such curves, provide a wealth of information with regard to the organization and reorganization of growth in urban areas (e.g. Kellerman and Krakover 1986). The distance expansion model has several advantages over the previous distance bands method: (a) it provides continuous, rather than stepped, distance functions; thus, for planning purposes, peaks of growth, troughs of growth, or any generalized level of growth may be identified at a point along the crosssection rather than as a band; (b) each observation is entered independently as an input; (c) the estimated level of growth is directly obtainable for each geographical subunit; and (d) the expansion of the distance function parameters by time introduces a dynamic factor of spatial-temporal interconnectedness which allows for endogenous evaluations of changes through time. A major setback which cannot be overcome by the distance expansion model pertains to its inability to capture azimuthal variations. This problem can be solved by the temporal expansion of trend surfaces (TETS). The advantages and weaknesses of this model are discussed in detail in later sections. At this point it is acknowledged that this method is based on an initial trend surface model expanded by the temporal dimension to create a four-variable spatio-temporal terminal model. A model of four-dimensional trend surfaces has been proposed by Harbaugh (1964) in the field of geology. Geographical applications of spatiotemporal trend surface models are rare. Noted among these, though not in a TETS framework, are the works by Tobler (1970) and Bennett (1975). A TEMPORALLY EXPANDED TREND SURFACE MODEL OF DECENTRALIZATION The above discussion has suggested four possible methodologies for the investigation of the decentralization process. Before presenting the specific
THE STUDY OF METROPOLITAN DECENTRALIZATION 105
spatio-temporal model used in this study, it is worthwhile reviewing the definition suggested by Berry and Kasarda (1977). They define decentralization as present when outer units of the metropolis are growing faster than the central city. It was shown that, when this definition is applied via the urban-suburban dichotomy or the distance bands methods, it provided planners and urban analysts with only a minimal amount of information. In order to supply planners with more information in what might be perceived as a wave-like, expanding, multicentric metropolitan system, Berry and Kasarda’s basic definition must be modified explicitly to include several additional components associated with decentralization in such metropolitan spaces (Krakover 1986; Krakover and Kellerman 1990). Metropolitan regions are typically subdivided into smaller geographical units, each with its own centroid location. Since units are located in different azimuths relative to the central city, and since each unit may reach different levels of growth at different times, the modified definition should enable the identification of each unit in its x, y location and the intensity of its growth at any desired time period. If a model is designed in such a manner that these properties are identified throughout the analysis, it will be possible to derive several other useful measurements of decentralization, such as the location of growing subcenters, changes in their location, changes in the distances between points of certain attributes, paces of growth in time at specified critical locations, paces of spatial shifts, and even directional shifts from spread to backwash and vice versa. Hence, the definition should be modified to state that decentralization is identified to one or more directions around a central city if uniquely identified outer units at that direction are growing faster than uniquely identified units located closer to the central city, inclusive. To achieve the goals set by this definition, a temporally expanded trend surface model is proposed. The model is constructed through the following procedure. First, an initial trend surface model of growth G is specified in an x, y planar coordinate system: (7.6) As is the case with trend surfaces, this initial model partitions the actual observations of growth into two components: one represents the general spatial trend of growth across the metropolitan space; the second indicates nonconformance with the general trends. Comparison of individual trend surfaces pertaining to the same space for different periods in time is a tedious task. However, if parameters bik of the initial model (7.6) are expanded in time T, a continuously developing trend surface is obtained and its temporal changes can be derived for any specific location. The expansion equations, specified for i =0, 1,…, m and k=0, 1,…, l, are (7.7)
106 S.KRAKOVER
Insertion of these equations into the initial model (7.6) yields a temporally expanded trend surface (TETS) function as the terminal model: (7.8) This TETS is a spatio-temporal polynomial power series model that can be viewed as a modification of the previous distance expansion model (7.3) in the sense that the single distance vector D is now separated into its two components, X and Y coordinates. This modification utilizes an initial trend surface function, defined by its x, y planar coordinates, expanded in the time dimension to create a terminal model which allows estimating growth in the X, Y, T dimensions. If this model is applied to an appropriate set of data, the structure of the estimated growth levels G can reveal a wealth of information regarding the pattern of decentralization in any selected metropolitan region. An expanded polynomial power series is not, of course, the only model that can be fitted to a set of three-dimensionally arranged data. Furthermore, other models may produce different structural results. A comparative analysis of results obtained via different models or model specifications is clearly a theme for another discussion. However, sufficient guidelines toward the selection of an appropriate model are usually provided by a priori theoretical compatibility and a posteriori success in accounting for a high proportion of the embedded variability. Alternatively, the degree of the polynomials used in the estimation can be determined by tests of statistical hypotheses. In this case a degree is selected when the parameters of higher degree polynomial terms are not significantly different from zero and when the null hypothesis of no spatial autocorrelation cannot be rejected (Norcliffe 1969; Cliff and Ord 1973). In the rest of this paper the four aforementioned methods for tracing decentralization will be applied to a spatio-temporal data set for the metropolitan region of Tel Aviv. The purpose of this application is to examine and compare the effectiveness of these tools in describing decentralization trends for planning purposes. APPLICATION OF DECENTRALIZATION TRACING METHODS The Tel Aviv urban region is the largest population concentration in Israel. The early stages of decentralization in this region can be traced back to the turn of this century when people started to move out of the old port town of Jaffa. This process gathered momentum with the establishment of the Jewish Ahuzat Bayit suburb in 1909. This neighborhood served as a nucleus for the attraction of many more settlers who created the core city of Tel Aviv. During the same period more than twenty other Jewish settlements were established in a radius of about twenty miles around Tel Aviv, setting the stage for the would-be suburban satellites of the metropolis.
THE STUDY OF METROPOLITAN DECENTRALIZATION 107
Following the establishment of the State of Israel in 1948, the town of Jaffa was annexed to Tel Aviv to form the central city of the developing metropolitan area. Since then, the metropolis of Tel Aviv has grown in population, employment, and areal extent. The growth, however, has never been evenly distributed. Studies by Shachar (1975) and Krakover (1985) have found clear evidence for a wave-like expansion of population concentration and population growth, respectively. This expansion is accompanied by a relative strengthening of suburban subcenters as reflected by the suburban concentration of population and economic activities (Dehan 1984). This pattern of development suggests a process equivalent to that found in many Western metropolitan areas (Muller 1981; van den Berg et al. 1982; Erickson 1983). This process is motivated economically by shifts from agglomeration economies to diseconomies and is expressed spatially by transformation from a uni-centric to a multicentric metropolitan structure (Richardson 1978; Krakover 1984). For analytic purposes, such changes make necessary the spatial subdivision of the large metropolitan area into many smaller geographical units. These changes also provide justification for the use of a spatio-temporal data matrix and the application of TETS equations. Such equations, if raised to the appropriate powers, can allow the tracing of wave-like patterns of metropolitan growth expansion. The urban region of Tel Aviv as defined for this study is depicted in Figure 7.1. This urban region has been delineated so as to include a number of additional towns on the outskirts of the statistically defined metropolitan area in order to avoid underestimation of the metropolitan expansion trend. The region includes thirty-one urban settlements ranging in population size from 2,000 to above 300,000 for the central city of Tel Aviv. Total population in the region reached about 1.5 million inhabitants in 1980. Population data for these settlements are available annually beginning with the census of 1961 (Central Bureau of Statistics, Israel, various years). These data are used to compare the effectiveness of the four methods outlined above. Population data are judged suitable for this purpose since they provide the most continuous coverage of space and thus closely resemble the continuity assumption implicit in the use of polynomial functions. Prior to comparing the performance of the four methods, the criteria for their evaluation are put forward. As indicated above, the extended definition for decentralization requires that its measurements will provide the following information for as many geographical points as possible: (a) intensity of growth; (b) timing of growth and decline; (c) direction and directional changes in terms of spread and backwash; (d) distance from the central city wherein different indicators of growth or decline are observed; (e) pace or speed of spatial shifts of the growth indicators in terms of miles per unit of time; and (f) the location at which different indicators are observed specified as accurately as possible (Krakover 1986; Krakover and Kellerman 1990). These requirements will be
108 S.KRAKOVER
Figure 7.1 Urban settlements in the urban region of Tel Aviv
further clarified as we proceed with the comparison of the various methods as applied to the case of the Tel Aviv urban region.
THE STUDY OF METROPOLITAN DECENTRALIZATION 109
Figure 7.2 Population development in the Tel Aviv urban region, 1961–80
Urban-suburban dichotomy Figures 7.2–7.4 present population statistics for the urban region of Tel Aviv along with two traditional ways of measuring decentralization within the framework of the urban-suburban dichotomous method. Figure 7.2 shows that the population in the entire urban region grew from a total of 966, 300 to over 1. 5 million over the period 1961–80. This growth, however, was concentrated in the suburbs, as can be seen in Figure 7.3 which shows the changing urbansuburban shares of population. The share held by the central city of Tel Aviv dropped from 40 percent in 1961 to about 22 percent in 1980. The share statistics, however, are too crude to trace the dramatic efforts made by central city officials to maintain its absolute size. These efforts are reflected by the more sensitive measure of growth indices, presented in Figure 7.4. This graph reveals that the central city reached its highest level of population in 1963 and that there were two short periods of slight growth resurgences in 1970 and 1973. It is only since 1971 that the central city started to decline more rapidly. On the other hand, the rest of the urban area enjoyed constant growth ranging between 4 and 6 index points (with the exception of 1973, when the index jumped by 11 points). The overall results for the twenty-year period show that the central city lost about 15 percent of its population while the suburbs more than doubled. In evaluating the results obtained via this method it is clear that data for the suburbs are highly aggregated and that this vast and heterogeneous area is reduced to a single location-less point, represented by a single average figure for every year. It is thus impossible to uncover the azimuthal variation of the population spread, with regard either to measurements of the pace of population spread or to distances at which new developments are taking place. Nevertheless,
110 S.KRAKOVER
Figure 7.3 Shares of population, central city versus suburbs, Tel Aviv urban region, 1961– 80
Figure 7.4 Index of population growth, central city versus suburbs, Tel Aviv urban region, 1961–80
as summarized in Table 7.1, this simple tool is capable of providing initial knowledge regarding the intensity and timing of differential growth in the central city as
THE STUDY OF METROPOLITAN DECENTRALIZATION 111
Table 7.1 Components of decentralization as treated by the four methods Urbansuburban dichotomy
Distance bands Distance methods expansion model
Trend surface expansion model
1 Intensity (+) (+) (+) (+) 2 Time (+) (+) (+) (+) 3 Directional (*) (+) (+) (+) change 4 Distance (–) (*) (+) (+) 5 Pace (–) (–) (+) (+) 6 Azimuthal (–) (–) (–) (+) variations Note: (+), indication possible; (–), indication impossible; (*), indication possible but unlikely or possibly inaccurate.
opposed to the entire suburbs, as well as detecting directional changes in the rare cases where central cities regain faster growth than their adjacent suburbs. The distance bands method The distance bands method requires that geographical observations be grouped into several distance bands, with the central city constituting a group unto itself. In the Tel Aviv urban region a division into four bands, each 10 km wide, shows that growth levels were the highest for the first ring (Rl) until the year 1975; thereafter the second ring (R2) took the lead (Table 7.2). Also, growth in the third ring (R3) surpassed the growth in the first ring after the year 1980. Historically, the figures for Tel Aviv indicate that a crucial turning point occurred around the years 1971–3.1 Along with the drastic decline of the central city which started in 1972, several other changes are observable in the suburbs. First, growth in Rl rose for the first and only time—with a jump of 10.4 index points—from 1972 to 1973. Before this year the average annual growth (in terms of index points) stood at 6.0 points, while after the big jump of 1973 it dropped to an annual average of 3.2 points only. Second, an opposite trend occurred in the second and third rings. In R2, the Table 7.2 Population and population growth in the Tel Aviv urban region by rings: cc, central city; Rl, 1–10 km; R2, 11–20 km; R3, 21–32 km Total population (1,000)
Growth
Year
cc
R1
R2
R3
cc
R1
R2
R3
1961 1962 1963
390 387 394
282 296 312
209 220 230
86 90 94
100.0 99.3 101.2
100.0 105.0 110.6
100.0 105.5 110.2
100.0 105.2 110.0
112 S.KRAKOVER
Total population (1,000)
Growth
Year
cc
R1
R2
R3
cc
R1
R2
R3
1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980
394 392 390 388 385 383 384 383 364 365 360 354 349 343 340 336 335
330 347 363 374 391 412 430 451 468 497 508 517 526 534 542 553 560
241 252 259 267 276 286 296 311 328 352 367 382 399 417 433 449 466
99 102 104 107 109 112 116 120 122 131 137 143 150 156 163 168 173
101.0 100.6 99.9 99.5 98.7 98.2 98.5 98.3 93.3 93.6 92.4 90.7 89.4 88.0 87.2 86.3 85.9
117.1 123.3 128.8 132.6 138.9 146.4 152.8 159.9 166.0 176.4 180.4 183.4 186.8 189.5 192.3 196.2 198.7
115.5 120.5 124.3 127.8 132.0 136.9 142.0 148.8 157.2 168.7 175.8 183.0 191.2 199.6 207.4 215.2 223.0
114.9 118.5 121.0 124.1 127.2 130.8 134.7 139.4 141.7 153.1 160.1 166.8 175.3 182.4 189.6 196.0 202.2
years 1972 and 1973 were marked by an exceptionally high change, 8.4 and 11.5 points respectively. Again, these years mark a clear turning point; in the period preceding 1972 the average annual growth of index points was 4.9, while thereafter it stabilized at around 7.8 points. The same trend is observed in R3: a high growth level of 11.4 index points was reached in 1973, while prior averages stood at 3.8 points and later indices at 7.0 points. A summary of the information which may be achieved by the distance bands method is provided in Table 7.1. Clearly, the results obtained by this method allow for the analysis of the intensity of decentralization and its timing. In addition, if an internal band that first grew slower than an outward one was subsequently to grow faster, it would be possible to detect a directional change from decentralization to centralization. Furthermore, a crude measure of the distance at which decentralization is most intensive may be obtained. However, measurements of the pace at which decentralization spreads outward would be too crude to rely upon and no conclusion regarding azimuthal variations is possible. The major problem of the distance bands method, however, lies with its sensitivity to changes in the width of the bands. A change which affects the distribution of individual observations among the bands will affect the measurements of intensity, timing, distance, and directional change, if applicable. For instance, using bands 12 km wide instead of 10 km in the case of the Tel
THE STUDY OF METROPOLITAN DECENTRALIZATION 113
Figure 7.5 Growth profiles obtained via two different distance bands delineations, Tel Aviv urban region, 1961–80
Aviv urban region results in much lower growth levels in the third ring. A further subdivision of the urban region to eight distance bands, each 4 km wide, provides a remarkably different picture than that presented in Figure 7.5. Because of the high sensitivity of results to the width of the bands, the use of this method cannot be recom mended unless a nonarbitrary subdivision into bands is available. The distance expansion model The results obtained via the distance expansion model provide a solution to each of the aforementioned problems except azimuthal variability. In order to allow for the possibility of two peaks of growth (two growing subcenters) in the suburbs, terminal model (7.5) was applied using a fourth-degree polynomial in the distance dimension D. The temporal dimension T was assigned a second degree, thus allowing for possible turning points from growth to decline or vice versa. The results of the terminal model are depicted in Figure 7.6 for each geographical observation and for each annual reading. The spatial units are presented as a column of dots located with respect to road distance from the city center. Each dot represents a different year, with time usually progressing from the bottom to the top. (The accompanying numerical output does allow for identification of those cases exhibiting temporal turnarounds from growth to
114 S.KRAKOVER
Figure 7.6 Distance-temporal structure of population growth in the Tel Aviv urban region, 1961–80 (reprinted by permission from Urban Studies, 1985)
decline and vice versa.) The curves drawn in Figure 7.6 depict the estimated terminal model by distance for selected years. An evaluation of this method according to the criteria summarized in Table 7.1 yields the following observations.
THE STUDY OF METROPOLITAN DECENTRALIZATION 115
1 The intensity of growth and its timing are independently identifiable for each geographic unit in the analysis. The growth index, however, is not presented in its raw values as calculated from the population data, but rather in the form of estimated growth indices generalized by an expanded ordinary least squares (OLS) analysis. These values have the advantage of showing each observation’s ‘expected’ level of growth with respect to its neighboring peers and their relative distance from the city center. In taking into account the values of neighboring observations, the distance expansion model resembles the distance bands method. However, this model, unlike the previous method, does not use a stepped distance function relying on averages but rather a continuous distance function relying on individual observations. 2 The distance expansion model allows for the extraction of several distance measurements. For instance, the location of the peak and trough points can be calculated for each year (see Figure 7.6). Changes in the location of these points provide measurements of the distance shifts of peak (DSP) and trough (DST) points through time. Tracing the movement of these points over time may lead to the identification of paths of spread and backwash (Krakover 1983). Applying this model helps immediately to identify settlements located behind or in front of the moving crest of peak growth. Other possible measurements are the range of distance between the peak and trough points and the distance at which the curve is above a certain threshold of growth (national, regional, or suburban averages). 3 The distance expansion model is capable of detecting directional changes as well. For instance, between 1961 and 1965 the peak of growth in the Tel Aviv urban region moved inward, while in 1965 it underwent a directional change and started to move outward (Figure 7.6 and Table 7.3). 4 Since the distance expansion method keeps track of timing and distances, it is possible to derive pace measurements. For example, the peak of growth in the Tel Aviv urban region moved from its position at 7.4 km in 1965 to 12.9 km in 1980, an outward spread of 5.5 km in a period of fifteen years. This amounts to an average pace of about 0.366 km per annum. The figures in Table 7.3 reveal that the pace of Table 7.3 Location and shifts of the peak point of growth: Tel Aviv, 1961–80
Loca tion of peak point of
1961
1965
1970
1975
1980
Total 1965 –80
16.4
7.4
8.6
11.3
12.9
–
116 S.KRAKOVER
grow th (km from Tel Aviv ) Dist ance shift of peak (km) Aver age shift per annu m (km)
1961
1965
1970
1975
1980
Total 1965 –80
–
–9.0
1.2
2.7
1.6
5.5
–
–2. 230
0. 240
0. 540
0. 320
0. 366
spread was not constant. The fastest pace occurred from 1970 to 1975, with an annual average rate of 0.540 km, compared with 0.240 km and 0.320 km for the periods 1965–70 and 1975–80, respectively. It is interesting to note that the rapid pace of spread coincides with the period of highest growth discussed in the previous subsection. The capability of the distance expansion model to supply such information makes it a much more effective tool for urban planning purposes than the two previous methods. In addition to the previously described properties, this method benefits from the accompanying statistical information. For instance, the results obtained for the Tel Aviv urban region have an R2 of 0.41. The analysis of residuals reveals that a single town, Bet Dagan, is responsible for a great deal of the unaccounted variability. This town is located only 10 km away from the metropolitan center, yet its development is hampered because it is situated adjacent to Ben-Gurion International Airport. Another portion of the unaccounted for variation is due to the failure of the distance expansion model to cope with azimuthal variations. Although in practice this method performs much better than the distance bands method, both are underlain by an assumption of concentricity, so that the findings for a representative cross-section are equally applicable in all directions around the central city. This is certainly a strong assumption, and one that is mitigated by
THE STUDY OF METROPOLITAN DECENTRALIZATION 117
the use of the trend surface expansion model (for another option see Krakover and Casetti 1988). Trend surface expansion This model uses the x, y locational coordinates of each geographical observation rather than its adirectional distance. This minor modification in the input produces much more complex results as output. At the input stage, terminal model (7.8) is applied to the same data matrix wherein the index of growth of each settlement in the urban region of Tel Aviv is associated with x, y coordinates in space (Tel Aviv is used as the 0, 0 reference point) and a t coordinate in time is inserted via expansion equation (7.7). Both spatial coordinates were raised to the fourth degree. A function of this degree can generate three turning points, thus allowing for the replication of the theoretically assumed multicentric urban structure. This structure is typified by a central city that represents a trough of growth while points of peak growth are permitted in all directions around the central city. Since a period of twenty years is too short to be generalized by more than one population growth turning point, the time dimension in the expansion equation was represented by a quadratic. When terminal model (7.8) is raised to these powers, it results in a terminal equation containing forty-four parameters representing combinations of X, Y, and T plus a constant. This equation was estimated using an ordinary multiple regression program that produced a thirty-five parameter model accounting for 76.6 percent of the variance (nine of the original forty-four polynomial terms were removed from the equation because of tolerance violations). Graphically, the estimated terminal model is capable of producing a multilayer trend surface structure, one surface for every year included in the analysis. Two such surfaces are presented in Figure 7.7. The bottom surface represents the growth trends as reached in 1970 (base year 1961); the upper one depicts the situation in 1980. The advantages obtained by utilizing this model are immediately apparent. 1 Each geographical observation entering the analysis is identifiable at its x, y, t coordinate location. The estimated values for any other location can also be obtained by means of interpolation. 2 The trend surfaces clearly indicate the presence of wide azimuthal variations; while high growth peaks occurred toward the northwest and the southeast sections, low growth troughs are present in the southern and eastern directions. The close conformity between these results and the original data is validated by a simple contour map (Figure 7.8) performed on the raw data growth levels between 1961 and 1980. Both the raw data and the estimated results obtained using TETS suggest that the concentricity assumption implicit in the distance bands and the distance expansion methods may be unwarranted.
118 S.KRAKOVER
Figure 7.7 Spatio-temporal structure of population growth in the urban region of Tel Aviv, 1970 and 1980
3 In addition to azimuthal variations, this model is capable of identifying all other decentralization components as indicated in Table 7.1. Another unique feature associated with the trend surface expansion model lies in its versatility; for example, it can produce cross-sectional analysis between any two locations. For instance, the cross-section presented
THE STUDY OF METROPOLITAN DECENTRALIZATION 119
Figure 7.8 Distribution of population growth in the urban region of Tel Aviv, raw data, 1961–80 Table 7.4 Cross-section to the south from Tel Aviv (128, 36) to Qiryat Eqron (133, 60) Ye ar
x, y coordinates at peak point
Lo ca tio n of pe ak (k m)
Gr o wt h in de x
x, y coordinates at trough point
Lo ca tio n of tr ou gh (k m)
Gr o wt h in de x
1
1
6.
1
1
2
1
4
5
120 S.KRAKOVER
Figure 7.9 Southward cross-section of the spatio-temporal population growth structure, urban region of Tel Aviv, 1961–80 Ye x, y Lo Gr x, y Lo ar coordinates ca o coordinates ca at peak tio wt at trough tio point n h point n of in of pe de tr ak x ou (k gh m) (k m) 9 2 1. 0 3 3 6. 1. 6 9. 9 5 2. 2. 8 2 5 2 2 0 4 3 0 5 3 4 3 9 0 4, 5, 1 1 4 6. 1 1 5 2 9 2 2. 1 7 3 7. 1. 7 9. 0 5 1. 2. 1 6 0 2 2 0 7 4 5 0 5 2 0 2 0 5, 7, 1 1 4 7. 2 1 5 2 9 2 2. 0 1 3 7. 1. 7 9. 8 0 5. 2. 4 9 5 4 5 0 6 4 9 5 2 5 7 5 0 8, 8,
Gr o wt h in de x
0 1. 7
1 0 6. 5
1 1 2. 0
THE STUDY OF METROPOLITAN DECENTRALIZATION 121
Figure 7.10 Analysis of southward cross-section from Tel Aviv to Qiryat Eqron, 1965–80 Ye ar
1 9 8 0
x, y coordinates at peak point
1 2 9. 5 6 1, Difference s:
4 3. 5 8 9
Lo ca tio n of pe ak (k m)
Gr o wt h in de x
x, y coordinates at trough point
Lo ca tio n of tr ou gh (k m)
Gr o wt h in de x
7. 5 5 0
2 6 6. 1
1 3 2. 5 4 9,
2 2. 3 0 0
1 1 7. 6
1. 5 0 0
1 3 3. 7
1. 0 5 0
1 5. 9
5 7. 8 3 7
in Figure 7.9 represents a trajectory surface connecting the central city of Tel Aviv with the southern town of Qiryat Eqron (an aerial distance of about 24 km). This trajectory is associated with twenty curves representing the annually changing distance function between the two selected localities. A distance-temporal analysis of this trajectory was employed to produce the information presented in Table 7.4 and depicted in Figure 7.10 for four selected years. This analysis reproduces all observations obtained via the previous distance expansion model except that this time it is locationally specific. Instead of assuming concentricity and utilizing one, supposedly
122 S.KRAKOVER
representative, cross-section for the entire urban region, the trend surface expansion model enables us to identify specific trajectories in the analysis of spatio-temporal trends between virtually any specified pair of x, y coordinates across the urban region. SUMMARY AND CONCLUSIONS The use of four different methods available for the analysis of decentralization was discussed and demonstrated for the urban region of Tel Aviv. It was shown that progressively more information can be obtained for planning purposes as we proceed from the urban-suburban dichotomous method, to the distance bands method, to the distance expansion model, to the trend surface expansion model. The use of the latter is judged superior for urban planning purposes since, along with a broad overview of the prevailing decentralization trends, it is also capable of producing a wealth of information unmatched by its predecessors and the results of the analysis can be more accurately located in the urban space. The weakness of the trend surface expansion model lies in its complexity, on the one hand, and its reliance on a specific mathematical formulation, on the other. The distance expansion model, however, suffers not only from its specific mathematical formulation but also from its heavy reliance on the assumption of concentricity and ignorance of possible azimuthal variations. This drawback also plagues the distance bands method, an approach that in addition suffers from sensitivity to the selected width of the distance bands. Finally, the least cumbersome method, that of the urban-suburban dichotomy, though not plagued by any of the above shortcomings, supplies very little information regarding the spread of decentralization in the suburban section of the urban region. In a cost-benefit type of framework it seems reasonable to conclude that the application of the most simple urban-suburban dichotomous method and the most complex spatio-temporal models are preferable, the former in view of its low cost and the latter on the merits of its ample benefits. NOTE 1 A part of this change may be attributed to the availability of census data for the year 1972.
REFERENCES Bennett, R.J. (1975) ‘The representation and identification of spatio-temporal systems: an example of population diffusion in north-west England’, Transactions, IBG 66: 73–94. van den Berg, L., Drewett, R., Klaassen, L.H., Rosi, A. and Vijverberg, C.H.T. (1982) A Study of Growth and Decline: Urban Europe, Oxford: Pergamon.
THE STUDY OF METROPOLITAN DECENTRALIZATION 123
Berry, B.J.L. and Cohen Y.S. (1973) ‘Decentralization of commerce and industry: the restructuring of metropolitan America’, in L.H. Massotti and J.K.Haden (eds) The Suburbanization of the Suburbs, pp. 431–56, Urban Affairs and Annual Review 7. Berry, B.J.L. and Kasarda, J.D. (1977) Contemporary Urban Ecology, New York: Macmillan. Blumenfeld, H. (1954) ‘The tidal wave of metropolitan expansion’, Journal of the Association of American Planners 20:3–14. Boyce, R.R. (1966) ‘The edge of the metropolis: the wave theory analog approach’, British Columbia Geographical Series 7:30–40. Casetti, E. (1972) ‘Generating models by the expansion method: application to geographical research’, Geographical Analysis 4: 81–91. ——(1973) ‘Testing for spatial-temporal trends: an application to urban population density trends using the expansion method’, Canadian Geographer 17:127–37. Casetti, E., King, L.J. and Odland, J. (1971) ‘The formalization and testing of concepts of growth poles in a spatial context’, Environment and Planning A 3:377–82. Central Bureau of Statistics, Israel (various years) Local Municipalities in Israel, Jerusalem. Cliff, A.D. and Ord, J.K. (1973) Spatial Autocorrelation, London: Pion. Dehan, S. (1984) ‘Growing economic independence of southern suburbs of Tel Aviv’, Paper presented at a seminar at Ben-Gurion University of the Negev, Beer Sheva, Department of Geography (in Hebrew). Erickson, R.A. (1983) ‘The evolution of the suburban space economy’, Urban Geography 4:95–121. Hall, P. and Hay, D. (1980) Growth Centers in the European System, Berkeley, CA: University of California Press. Harbaugh, J.W. (1964) ‘A computer method for four-variable trend analysis—illustrated by a study of oil gravity variations in southern Kansas’, Bulletin 171, Lawrence, KS: State Geological Survey, University of Kansas. Harris, C.D. (1943) ‘Suburbs’, American Journal of Sociology 49:1– 13. Hawley, A.H. (1956) The Changing Shape of Metropolitan America: Deconcentration Since 1920, Glencoe, IL: The Free Press. Johnston, R.J., Gregory, D., Haggett, P., Smith, D. and Stoddart, D.R. (eds) (1981) The Dictionary of Human Geography, Oxford: Basil Blackwell. Kain, J.F. (1968) ‘The distribution and movement of jobs and industry’, in J.Q.Wilson (ed.) The Metropolitan Enigma: Inquiries into the Nature and Dimensions of America’s Urban Crisis, pp. 1– 43, Cambridge, MA: Harvard University Press. Kellerman, A. and Krakover, S. (1986) ‘Multi-sectoral urban growth in space and time: an empirical approach’, Regional Studies 20: 117–29. Krakover, S. (1983) ‘Identification of spatio-temporal paths of spread and backwash’, Geographical Analysis 15:318–29. ——(1984) ‘Spread of growth in urban fields: eastern United States, 1962–1978’, Environment and Planning A 16:1361–73. ——(1985) ‘Spatio-temporal structure of population growth in urban regions: the cases of Tel Aviv and Haifa, Israel’, Urban Studies 22:317–28. ——(1986) ‘Progress in the study of decentralization’, Geographical Analysis 18:260–3. Krakover, S. and Casetti, E. (1988) ‘Directionally biased metropolitan growth: a model and a case study’, Economic Geography 64:17–28.
124 S.KRAKOVER
Krakover, S. and Kellerman, A. (1990) ‘Urban decentralization: a redefinition applied to the urban field of Chicago’, Geography Research Forum 10:51–67. Lamb, R. (1975) Metropolitan Impacts on Rural America, Research Paper 162, Department of Geography, Chicago, IL: University of Chicago Press. Mills, E.S. (1970) ‘Urban density functions’, Urban Studies 7:5–20. Morrill, R. (1985) ‘Identification of spatio-temporal paths of spread and backwash: a comment’, Geographical Analysis 17:247–50. Moses, L. and Williamson, H.F., Jr (1967) ‘The location of economic activities in cities’, American Economic Review 57:211–22. Muller, P.O. (1981) Contemporary Suburban America, Englewood Cliffs, NJ: PrenticeHall. Newling, B.E. (1969) ‘The spatial variation of urban population densities’, Geographical Review 59:242–52. Norcliffe, G.B. (1969) ‘On the use and limitations of trend surface models’, Canadian Geographer 13:338–48. Phillips, R.S. and Vidal, A.C. (1983) ‘The growth and restructuring of metropolitan economies’, Journal of American Planners Association 49:291–306. Richardson, H.W. (1978) Urban Economics, Hillsdale, IL: Dryden Press. Shachar, A. (1975) ‘Patterns of population densities in the Tel Aviv metropolitan area’, Environment and Planning A 7:279–91. Steiness, D.N. (1982) ‘Suburbanization and the “mailing” of America: a time series approach’, Urban Affairs Quarterly 17: 401–18. Tobler, W.R. (1970) ‘A computer movie simulating urban growth in the Detroit region’, Economic Geography 46 (Supplement): 234– 40. Weber, A.F. (1899) The Growth of Cities in the Nineteenth Century, 2nd printing, 1965, Ithaca, NY: Cornell University Press.
8 LONG-WAVE SPATIAL AND ECONOMIC RELATIONSHIPS IN URBAN DEVELOPMENT Shaul Krakover and Richard L.Morrill
The study of business cycles is a well-established research area in economics. The awareness of cyclical rhythms in urban development, however, is more recent. This awareness has risen in two ways. Research during the 1960s and 1970s focused upon the sensitivity of urban economic structure to economic fluctuations (e.g. Cutler and Hansz 1971). More recently, authors have investigated spatial cyclical trends (e.g. van den Berg et al. 1986; Berry 1988; Mera 1988). Some studies in the latter group have analyzed the concurrent appearance of spatial and economic cycles. This study is concerned with the interrelationships between economic and spatial trends for selected regional centers over the last hundred years. This period is approximately coincident with the last two Kondratieff (1935) long-wave economic cycles. Kondratieff’s long-wave cycle is one of four types of economic cyclical rhythms that have been identified at the national level (van Duijn 1983). The shortest is known as the Kitchin (1923) cycle of three to five years’ duration; it is assumed to be initiated by trends in inventory management. An intermediate cycle of seven to eleven years’ duration is named after Juglar (1862), and is said to be driven by investment decisions. A third and longer cycle is the Kuznets (1930) building cycle, fifteen to twenty-five years long. The fourth and longest is the Kondratieff forty-five-to sixty-year cycle; this is the cycle of concern in this research. Its dynamics are determined, according to Rostow (1980), by changing ratios of relative prices of raw materials to processed goods, and, according to van Duijn (1983), by innovation life cycles and the amortization of major capital goods and infrastructural investments. The well-documented long-term trends of population centralization and decentralization into and from central cities seem to coincide, respectively, with the third and fourth Kondratieff long-wave cycles. The third Kondratieff cycle is said to have occurred between 1896 and 1933, while the fourth cycle lasted from 1933 to 1972 (Rostow 1980). Van Duijn (1983) modifies Rostow’s chronology by dating the third Kondratieff between 1892 and 1948 and the fourth between 1948 and 1980. In either case, it seems that the third Kondratieff coincided approximately with the period of metropolitan centralization and intensification while the fourth was coincident with an era of metropolitan decentralization. It
126 S.KRAKOVER AND R.L.MORR1LL
has been suggested that the last one hundred years of urban centralization and decentralization trends constitute one long cycle (van den Berg et al. 1982). However, this should not necessarily be inferred from our association of urban and economic trends. While economic fluctuations tend to be cyclical around a usually growing secular trend, the relationships between cyclical and secular trends in urban development still await theoretical elaboration and empirical substantiation. This study focuses upon aggregate population responses to long-wave cyclical change. The following hypotheses are examined: (a) that relative metropolitan population centralization coincided with the third Kondratieff while relative metropolitan decentralization coincided with the fourth; and (b) that, during both Kondratieff cycles, periods of greater prosperity were characterized by inner county growth and outer county decline, while recessionary years tended to exhibit inner county decline and outer county growth. The rationale behind these hypotheses is that the spatial pattern of population distribution in a metropolitan region is determined primarily by structural attributes and by shifting economic conditions. The structural attributes are, inter alia, technological level, size and density of the central city, demographic characteristics, wealth accumulation, planning policy, and societal aspirations. The economic trends are determined by changing economic conditions, the major dimensions of which are employment and income, though consumption opportunities should not be overlooked. The structural factors tend to pull population in the same direction regardless of changes in the short-term economic situation, while the economically motivated spatial trends are highly affected by periodic economic conditions. Thus, periods of prosperity tend to reinforce the structural trends, resulting in centralization during the third Kondratieff and decentralization during the fourth Kondratieff. On the other hand, in recessionary times many people are released from their jobs in cities (Bernstein 1970; Geruson and McGrath 1977:82, 116), and a migration stream toward rural areas may result. Thus, during the slowdown of both the third and the fourth Kondratieff cycles, outer counties are expected to show higher growth rates than in the preceding boom period. These hypotheses have a solid grounding in the literature. The review which follows is developed in three sections: (a) decentralization and suburbanization; (b) sensitivity of cities to economic fluctuations; and (c) relationships between economic fluctuations and urban spatial development. DECENTRALIZATION AND SUBURBANIZATION Although decentralization and suburbanization may suggest cause and effect relationships, they have rarely been treated together. Decentralization usually refers to slower growth or a decreasing population share for the central city relative to the rest of the metropolitan area (i.e. Schnore 1957; Berry and Kasarda 1977; Steiness 1982). Suburbanization, on the other hand, concentrates
LONG-WAVE RELATIONSHIPS IN URBAN DEVELOPMENT 127
on forms and patterns of physical features and life styles characteristic of the suburbs (i.e. Masotti and Hadden 1973; Muller 1981). This separate treatment may possibly be responsible for the confusion surrounding the historical timing of the two processes. As regards suburbanization, its starting point is debatable (Singleton 1973) and its stages are difficult to define. Many researchers have adopted the wellestablished historical division suggested by Adams (1970), which relates the evolving metropolitan form to changing transportation technology. He identified four eras: (a) walking-horsecar era (pre–1850 to late 1880s); (b) electric street car era (late 1880s to 1920); (c) recreational automobile era (1920–45); and (d) freeway era (1945 to present) (Yeates and Garner 1980; Muller 1981). This division splits the third Kondratieff into two eras distinguished by the appearance of the freemoving automobile. Although Adams’s subdivision represents a genuine attempt to divide the continuous sequence of suburban development, it does not pinpoint the timing of any transformation from centralization to decentralization. Did this transformation occur at the end of the second era (1920)? At the end of the third era (1945)? Or somewhere in between? These questions have been debated in the past (Schnore 1957), but they deserve a renewed treatment in the context of longwave cycles, with a longer historical perspective and more powerful analytic techniques. Undoubtedly, centralization in the USA continued long after suburbanization had been in effect, largely because of the influx of migrants from rural areas and abroad. The central city started to decline first in relative and then in absolute terms only at a later time. In this paper we propose to associate the turning point from centralization to decentralization approximately with the turn from the third to the fourth Kondratieff, or around the year 1945. The significance of this date is that it separates the waning of the pre-war infrastructure and innovations (wartime industries, roads) and the onset of a different infrastructural and technological era (van Duijn 1983; Barras 1987). The post-Second World War era has been recognized by many as a turning point from slow to rapid suburbanization. For instance, Yeates and Garner (1980) indicate that, although suburbanization is not a new phenomenon, it ‘has been most apparent in the last 30 years…. The greatest volume of suburbanization occurred following the Second World War, when the supply of new housing was low… and the demand for housing was high…. [The baby boom] added an unusual urgency to the need for housing’ (pp. 61– 2). In addition, the postwar period coincided with increased dissatisfaction with the political and social conditions of the central city, a marked resistance to further annexations, and a tendency toward racial and class polarization, with the less affluent and minorities increasingly confined to the central cities. The suburban surge was further fueled by federal mortgage policies that favored newer suburban housing.
128 S.KRAKOVER AND R.L.MORR1LL
The impact of suburbanization may be expected to be reflected in consistent and high rates of growth in the inner, but not central, counties throughout our entire study period, but especially in the 1940–80 phase. The effect of decentralization, or the active preference of people for zones beyond the central city, should be reflected in peak growth for inner, but not central, county zones, except perhaps in the 1970s when the growth surge moved even further from the central city. THE SENSITIVITY OF CITIES TO ECONOMIC FLUCTUATIONS The early discussion of business cycles in cities was summarized by Thompson (1965), who resolved the debate over the preeminence of local versus national cycles in favor of the latter. The timing, duration, and severity of national or regional business cycles upon the local community depends very much on the city’s particular industry mix. Thompson hypothesized that larger urban centers will more closely replicate national trends because their economies resemble the national economic structure. The effect of industrial mix has been validated by Casetti et al. (1971), Cutler and Hansz (1971), and Cho and McDougall (1978). Geographers dealing with urban growth fluctuations have tried to extract some wider regional generalizations concerning similarities among cities located in the same region (Jeffrey et al. 1969; Casetti et al. 1971; Borchert 1983). A step further was taken by Jones (1983), who proposed several theoretically based models to examine mechanisms for the transmission of interregional economic fluctuations. Despite the consideration given to identifying concurrent fluctuations in the wider regional geographical context, this and the related economic literature (e.g. Gottlieb 1976) do not elaborate on the spatial effects that business cycles may have on urban spatial structure. ECONOMIC FLUCTUATIONS AND LOCAL URBAN SPATIAL DEVELOPMENT Few studies have dealt with the interrelationships between economic business cycles and urban spatial development. An early example is a study by Whitehand (1972), who investigated the conversion of rural to urban land during economic boom and slump periods. In a similar vein, Manson et al. (1984) examined the effect of business cycles on metropolitan suburbanization from 1969 to 1980. They explored the relationship between changes in net nonresidential fixed investment and central county shares of Standard Metropolitan Statistical Area (SMSA) income and population. They found that the rate of suburbanization accelerated during expansions in the national economy and declined during economic downturns. Although Whitehand’s (1972) study refers to twenty-year cycles and the study of Manson et al. (1984) refers to four-to seven-year cycles,
LONG-WAVE RELATIONSHIPS IN URBAN DEVELOPMENT 129
the reasoning of both is embedded in the reluctance of investors to take the risk of investing on the outskirts of the built-up area during recessionary periods. A comprehensive study on this theme is Barras’s (1987) examination of building cycles in the UK. He relates urban development history to Kondratieff’s long-wave cycles. He associates the early phase of industrialization and urbanization between 1780 and 1845 with the first Kondratieff. The second Kondratieff corresponds to the second phase of industrial revolution from 1845 to 1895, wherein the physical fabric of what constitutes today’s inner city areas was created. Barras associates the third Kondratieff of 1895 to 1945, which is a part of our study, to suburbanization, noting that during this era there was a shift of emphasis within established towns and cities from urbanization to suburbanization. Finally, he notes that during the postwar period, which corresponds to the fourth Kondratieff long wave, urban areas witnessed a change from suburbanization to decentralization—or rather deurbanization. Following van Duijn (1983), Barras recognizes that ‘each long wave is assumed to start with the emergence of a related cluster of new technologies which acts as the driving force for widespread innovations in new products and the establishment of wholly new branches of industry…. Each long wave in turn generates one or more long swings of building activity, combining to create a new wave of urban development’ (Barras 1987:7). Of importance to this paper are Barras’s observations regarding the technological and societal advances which led to the switch from suburbanization during the third Kondratieff to decentralization during the fourth. In his view ‘the most important factors underlying this deurbanization trend have been the further increase in household mobility due to mass ownership of motorcars and the construction of the motorway network, and the locational mobility of the new consumer goods industries of the post-war economic boom, which were based on technologies such as electronics, synthetic materials, and pharmaceuticals’ (p. 10). He notes that the locational mobility of new economic activity has been reinforced with the shift towards service industries. In addition, ‘the search for good housing and social facilities such as schools, plus a pleasant living and working environment, is now pulling urban development away from the industrial cities’ (p. 10). It seems clear that these observations apply equally well to the post-Second World War circumstances in the USA. There, as in the UK, the evolution of mass car ownership, the construction of a modern highway network, the rise of footloose industries, and the growth of the service sector, all helped to facilitate the movement of households and business to suburban, and even exurban, locations (Morrill 1979). In this way the gradual pre-Second World War suburbanization accelerated into a massive decentralization or deurbanization. This acceleration is demonstrated by statistics compiled by Muller (1981). His data show that, while in 1940 there were only two SMSAs out of the fifteen largest urban centers in which the suburban percentage of population exceeded
130 S.KRAKOVER AND R.L.MORR1LL
50 percent, in 1960 there were nine such cases and three others had passed the 45 percent mark. This literature and evidence lead us to hypothesize that despite the long lasting sequence of suburbanization, the historical switch from trends of centralization to trends of decentralization occurred approximately at the turn from the third to the fourth Kondratieff long-wave cycles, or around the year 1945. This hypothesis, along with the others relating spatial trends to prosperity and recession, are examined in the following sections for three large urban regions in the USA. CASE STUDIES, DATA, AND METHODOLOGY Hypotheses concerning the interrelationship between economic and spatial trends are tested in this paper for the urban regions of Philadelphia, Chicago, and Atlanta. Although these cities do not constitute a sample, they do represent three different types of early urban development experience: one from the Atlantic seaboard, a second from the intermediate period in the Midwest, and a third from the relatively new urban growth in the south. A fourth case of an urban center on the west coast could have provided for a better geographic coverage; however, it had to be dropped because of incompatibility in the size of the geographic units— counties—which in the west are quite large. The analysis is performed at the large urban-regional scale (Figure 8.1). This scale aims to capture the geographic extent of the historical rural-to-urban migration flows during the era of urban centralization, and the opposite flow during the era of decentralization. While the hypotheses relate to central cities versus suburbs, our data are for counties. Counties are especially useful for the purpose of analyzing change throughout the period from 1890 to 1980, since their boundaries are relatively stable. In accordance with the discussion in the previous section, this century covers the last two Kondratieff long waves. Total population growth is adopted as the dependent variable. This variable was selected not only because it is available by decade from the US Censuses, but also because it reflects the migratory responses of households to changing structural and economic conditions. For each decade, annual average population growth rates were calculated for each county in the study areas. If spatiotemporal patterns of growth and decline emerge consistent with the hypotheses, they will provide some evidence of the interrelationships we seek between spatial and economic trends. The model selected for hypothesis testing is a polynomial power series which has proven effective in studies of de centralization for shorter periods (Krakover 1983). It was adapted to this longer historical research following suggestions made by Morrill (1985). The model relies on the expansion method (Casetti 1972), whereby the parameters of an initial model are expanded as functions of time to create a spatio-temporal model.
LONG-WAVE RELATIONSHIPS IN URBAN DEVELOPMENT 131
Figure 8.1 Selected study areas
The initial spatial function is designed to capture complex patterns of the distribution of population growth as a function of distance from central cities. The
132 S.KRAKOVER AND R.L.MORR1LL
following function was selected: (8.1) where PGd is population growth at any distance D and bi are parameters. This fourth-degree polynomial can resolve complex patterns of growth such as those characterized by two peak or two trough points. The temporal component is introduced by defining the bi parameters as functions of time T, as shown in the following expansion equations: (8.2) Second-degree polynomial expansions allow the detection of temporal growth or decline. To detect such trends at any distance from the central city, the right-hand sides of the expansion equations are inserted into the initial model (8.1) to form the following terminal spatio-temporal model:
(8.3)
This terminal model is estimated from annual average population growth data for each urban region and for two time horizons: one to test the overall spatiotemporal trends during the third Kondratieff from 1890 to 1940; and another to test the trends during the fourth Kondratieff from 1940 to 1980. The model was estimated in two steps. The first step was designed to determine the appropriate degree of D, and Table 8.1 Summary of regression results Philadelohia 1890–1940 b00 0.0089 b01 0.0011 b02 –2. 3256×10–5 b10 –
Chicago 1940–80
1890–1940
1940–80
1890–1940
1940–80
0.0124 0.0017 –6. 9490×10–5 –
0.0136 0.0031 –1. 0573×10–4 –
0.0200 _ –
–
0.0288 0.0025 –7. 4125×10–5 –9. 8071×10–4 –
1. 0334X10–6 – –4. 6163×10–7
b11
–
–
0.0490 _ –1. 9218×10–5 –5. 7621×10–4 –
b12
–
7. 1447×10–7 – –1. 6904×10–7
5. 9168×10–7 – –2. 5632×10–7
b20 – b21 –1. 0236×10–7
Atlanta
– –2. 5311×10–5 – – –
2. 0873X10–6 – –1. 0534×10–6
LONG-WAVE RELATIONSHIPS IN URBAN DEVELOPMENT 133
Philadelohia
b22
Chicago
Atlanta
1890–1940
1940–80
1890–1940
1940–80
1890–1940
1940–80
–
–
–
–
–
– – – –
– – – 4. 2205×10–
– – – –
7. 0589×10–9 – – – –
b30 – b31 – b32 1.93×10–11 b40 –
– – – 2.4470×10– 9
10
b41 – – – – b42 – – – 4.14×10–13 N 125 100 105 84 2 R 0.19 0.17 0.40 0.45 Note: All estimates significant at the 0.05 level.
– – 185 0.23
– – 148 0.60
involved the sequential addition and significance testing of blocks of polynomial terms of increasingly higher degrees of D. In each analysis, the degree of D immediately lower to that of the first nonsignificant block was selected. In the second step, a backward selection procedure was applied on the polynomial chosen so as to obtain an estimated equation with all the coefficients significant at the 5 percent level or better. According to our first hypothesis, trends of urban centralization are expected during the early period while trends of decentralization will show up during the latter period. Possibly the process will begin earliest in the oldest metropolitan region, Philadelphia, occur next in Chicago, and later in the newest metropolitan region, Atlanta. This would indicate that regional structural forces are able to modify the stronger national forces. The second hypothesis concerning spatial trends during upswings and downswings of the economic cycle will be investigated by examining the estimated growth-distance trends by region and by decade. Specifically, we hypothesize that upswings are associated with growth in central counties and decline in peripheral counties, while downswings portray opposite trends. RESULTS The regression analyses obtained for the three urban regions are reported in Table 8.1 and their results are presented graphically in Figures 8.2–8.4. In these figures the horizontal axes measure distance in miles from the city center and the vertical axes are the associated annual population growth rates. The graphs are portraits of the estimated models for the time intervals in the analyses. Thus, each curve in the figures indicates the growth-distance relation for a particular
134 S.KRAKOVER AND R.L.MORR1LL
decade in the urban region. Comparison of the curves for different time intervals reveals the changing structure of growth throughout the entire period. This structure, for each urban region, is analyzed in two graphs: one for the third Kondratieff between 1890 and 1940 and the other for the fourth from 1940 to 1980. In the analysis of the graphically portrayed results, atten tion will be paid to four groups of counties: central county observations, suburban counties, intermediate zone counties, and remote counties. Alternatively, counties in the first two groups will be referred to as central counties while the other two are referred to as peripheral, rural, or outer counties. Philadelphia, 1890–1940 The last decade of the nineteenth century shows a relatively low spatial variation of population growth between central and peripheral counties, though the slightly higher growth rates in the center are indicative of centralization. This trend accelerated during the upswing of the Kondratieff economic cycle that took place until the 1920s. The first two decades of the twentieth century are marked by high population growth in central counties and slower growth in the peripheral areas. Between 1910 and 1920, rates of growth started to decline in most of the outer counties while continuing to grow in the center and its vicinity. The situation changed during the downturn of the economy in the 1920s. Rates of population growth declined everywhere throughout the Philadelphia urban region except for the most remote county, one that at this time was little affected by Philadelphia. This situation was exacerbated during the depression of the 1930s, except that the decline in growth rates was less in the rural counties and, again, the most remote county was the only place where growth rates increased. The overall result supports the hypothesis that the third Kondratieff, during both upswing and downswing, is characterized by centralization. The early economic boom period was associated with a rise in the magnitude of total concentration, and the consequent downswing period was marked by relative concentration which continued throughout the depression. The second hypothesis, regarding falling growth rates during prosperity and increasing growth rates during recession in outer counties, gains but little support in the early case of Philadelphia. It is only the most remote county that exhibited this pattern. The rest of the rural area behaved very much like the central counties, though the decline started a decade earlier, in the 1910s, a time of peak economic prosperity in the city. Philadelphia, 1940–1980 The fourth Kondratieff starts in the urban region of Philadelphia with a continued centralization. This trend has transformed since the 1950s to a trend of rapid decentralization. Although growth rates in the central county continued to be positive from the 1940s to the 1970s, growth in the adjacent suburban counties was
LONG-WAVE RELATIONSHIPS IN URBAN DEVELOPMENT 135
Figure 8.2 Estimated spatio-temporal growth structure for the urban region of Philadelphia
even higher. The downturn of the economy during the 1960s and 1970s witnessed falling growth rates which turned from relative loss in the early decade
136 S.KRAKOVER AND R.L.MORR1LL
to an absolute decline in the latter decade (compare with van den Berg et al. 1982). The temporal variation of growth rates in the mid-section counties is less distinct, though it follows about the same pattern with a slightly different chronology. The pattern is reversed, however, in the most remote section (from a distance of about eighty-five miles and beyond). Here the upswing of the 1950s witnessed falling growth rates while the downturn from the 1960s onwards is characterized by an increasingly growing trend. The overall result seems to fit the hypotheses: decentralization in central counties is observed since the second decade of the fourth Kondratieff; growth and decline of population in the central city are in congruence with the cyclical economic trend; suburban counties follow about the same pattern; and remote counties portray just the opposite—they decline during the economic upswing between 1940 and 1960, and grow during the downswing to 1980. Chicago, 1890–1940 Figure 8.3 portrays the generalized growth pattern for the urban region of Chicago. Unlike the case of Philadelphia, the peak surge of centralization was achieved as early as 1890. This surge gradually declined and gave way to decentralization during the last decade of the third Kondratieff—the depression era. Despite the declining growth rates, the gap between the growth of central and outer counties tended to widen. During the decades from 1910 to 1930, the more remote counties reached a low trough in their population growth. The slowdown and the depression decades of 1920–40 show decreasing growth rates in the central counties and increasing growth rates in the rural areas from a distance of about forty miles outwards. The widespread revival of growth in the rural section is especially impressive during the years of the depression between 1930 and 1940. In summary, the third Kondratieff in Chicago displays some discrepancy with our hypotheses. First, centralization is highest at the beginning and gradually decreases rather than increasing and decreasing with the turn of economic conditions. Second, decentralization starts a decade earlier than expected. Third, while rural counties behave as expected— growth rates declining during upswing and increasing during the downturn—the timing is off and the turnaround is delayed by one decade. The Chicago third Kondratieff thus seems to exhibit spatial trends at variance with our expectations based on the national economic cycle. This may reflect the greater sensitivity of the Chicago economy to the farm economy recession, which preceded the general economic downturn. Chicago, 1940–1980 The fourth Kondratieff opens in Chicago with a short revival of centralization during the 1940s. Thereafter, trends of decentralization increasingly dominate
LONG-WAVE RELATIONSHIPS IN URBAN DEVELOPMENT 137
Figure 8.3 Estimated spatio-temporal growth structure for the urban region of Chicago
the urban scene. Nevertheless the upswing of the 1950s is characterized by increased growth rates in central counties along with decreasing rates in the outer areas. The post–1960 downturn was met with sharply decreasing growth rates in central counties but a lesser decline in the intermediate zone. Consistent with our hypotheses, growth rates of the most remote counties have recovered during the recessionary years of the 1970s.
138 S.KRAKOVER AND R.L.MORR1LL
The regression results for Chicago in this period partially support our hypotheses. Except for the revived centralization in the 1940s and 1950s, the general trend in the central county is one of decentralization, and the timing of the turning point between growth and decline matches the economic cycle. (Note that if we had used the City of Chicago instead of Cook County, the relative decentralization would have occurred earlier.) Counties in the interim zone seem to behave similarly. Counties in the remote edge show declining growth rates during prosperity and growing rates during the recession. Atlanta, 1890–1940 The third Kondratieff opens in the Atlanta urban region with relatively equal growth between central and peripheral counties, although slightly higher growth, indicative of centralization, prevails in the central counties. During the following decades differences in growth rates sharpen. While rates in the central county remained about the same throughout the entire period, they fell sharply in the rural section, especially so during the prosperity of the early twentieth century. In the last two decades of recession and depression, the decline in growth in the rural sector not only slowed down but even reversed in the more remote section. The pattern supports our hypotheses to a large extent. First, the central city is the point of high, if not highest, growth, which reflects the trend toward centralization; second, rural county growth rates declined sharply during the period of economic prosperity; and, third, growth resumed in the most remote rural section during the economic slump of 1920–40. Atlanta, 1940–1980 The spatial distribution of growth in the first decade of the fourth Kondratieff is a clear continuation of the pattern arrived at in the last decade of the previous cycle, though with a much wider variation between peak and trough points. Also, centralization continued during the 1940s and the 1950s. The latter decade, which is a part of the economic upswing, is characterized by continued vigorous centralization on the one hand, together with lower growth rates in the remote part of the rural section, on the other. Decentralization in this southern case study appeared in the 1960s and intensified during the recession of the 1970s. Nevertheless, rates continued to grow throughout most of the rural section. Furthermore, a reversal from falling growth rates to increasing growth rates is observed in the remote counties during the decade of the 1960s. This renewed growth intensified during the recessionary period of the 1970s. The fourth Kondratieff in Atlanta provides partial support for our hypotheses. First, decentralization is a latecomer to this region. Second, most of the rural areas demonstrated constantly increasing population growth rates. On the other hand, the central county’s growth rates rose and declined in accord with the
LONG-WAVE RELATIONSHIPS IN URBAN DEVELOPMENT 139
Figure 8.4 Estimated spatio-temporal growth structure for the urban region of Atlanta
economic cycle, while the most remote areas portrayed just the opposite: declining growth rates during prosperity and growing rates during economic downturn. The result supports our sub-hypothesis that Atlanta, as a younger and newer metropolis, might experience the long-wave trends at some time lag.
140 S.KRAKOVER AND R.L.MORR1LL
SUMMARY OF RESULTS AND DISCUSSION Table 8.2 provides a summary evaluation of the results in light of the hypotheses. The urban region of Philadelphia seems to best fit the expected patterns of spatio-temporal growth. The urban region of Chicago fits partially, while the case of Atlanta seems to exhibit the most unconformity. The differential quality of fit relative to expectations may perhaps be attributed to interregional variations in economic cyclical patterns. Contrary to Thompson’s (1965) assumption, it seems that even large cities do not necessarily reflect national economic trends. On the contrary, leads and lags between national and regional, or local, economic cycles may be the rule rather than the exception. Among the four geographic segments of the urban regions, the most remote section seems to conform best to our hypothesis. Here, changes in population growth rates run contrary to economic trends, with declining population growth rates during prosperity and increasing growth rates during recession. The generality of this result seems to indicate an interesting migratory behavior in the remote rural zone—a massive outmigration during prosperity and a revival of growth via in-migration during difficult economic times. This finding may partially account for the migration turnaround trend of the 1970s (see Champion 1988; Frey 1988). Concerning the central cities, centralization is the rule during the third Kondratieff and decentralization during the fourth. Nevertheless, generalizing about the timing of the turnaround from centralization to decentralization is questionable. These results seem to reflect the possibility that cities in different regions or of different sizes have their Table 8.2 Conformity of results with hypotheses
Central county centralizati on during third Kondratief f Central county decentraliz ation during fourth Kondratief f
Philadelphia
Chicago
Third Fourth
Third
Yes
Yes, except last decade
Yes, except first decade
Atlanta Fourth
Third
Fourth
Yes
Yes, except first decade
No, started in late 1960s
LONG-WAVE RELATIONSHIPS IN URBAN DEVELOPMENT 141
Growth and decline of central county accord with economic trends Growth and decline of intermediat e zone counties accord with economic trends Growth and decline of remote counties contrary to economic trends
Philadelphia
Chicago
Atlanta
Third Fourth
Third
Fourth
Third
Fourth
Yes
Yes
No, only decline
Yes
No, only decline
Yes
Yes
Yes
No, counter trend
Yes
No, only decline
No, only growth
Yes
Yes
Yes
Yes
Yes
Yes
own turning points, probably depending upon local historical, physical, and economic conditions. The compatibility between the growth and decline of population and upswings and downswings of the economic cycle was confirmed in four out of the six cases. The results seem to contradict the observation by Manson et al. (1984) that suburbanization accelerates during economic expansion. It should be recalled, however, that their findings refer to short-term economic fluctuations while this study examined long-wave cycles. The most inconsistent results in this study relate to the intermediate zone of nonmetropolitan counties. Without indulging in a detailed analysis, it seems that the large variability of the counties belonging to this zone prohibits reaching a more solid generalization. In order to strengthen the conclusions reached in this study one may proceed either by drawing a much larger sample of urban regions or by conducting a detailed analysis on a county-by-county level. This study adopted a middle course and showed that, despite regional and local variations, some
142 S.KRAKOVER AND R.L.MORR1LL
generalizations can be made regarding the relationships between economic cycles and urban spatial development trends. REFERENCES Adams, J.S. (1970) ‘Residential structure of midwestern cities’, Annals, Association of American Geographers 60:37–60. Barras, R. (1987) ‘Technical changes and the urban development cycle’, Urban Studies 24:5–30. van den Berg, L, Drewett, R., Klaassen, L.H., Rossi A. and Vijverberg, C.H.T. (1982) Urban Europe: A Study of Growth and Decline, Oxford: Pergamon. van den Berg, L., Burns, L.S. and Klaassen, L.H. (1986) Spatial Cycles, Aldershot: Gower. Bernstein, I. (1970) ‘The city in the great depression’, in R.A.Mobl and N.Betten (eds) Urban America in Historical Perspective, pp. 303–14, New York: Weybright and Talley. Berry, B.J.L. (1988) ‘Migration reversal in perspective: the long-wave evidence’, International Regional Science Review 11:245– 52. Berry, B.J.L. and Kasarda, J.D. (1977) Contemporary Urban Ecology, New York: Macmillan. Borchert, J.R. (1983) ‘Instability in American metropolitan growth’, Geographical Review 73:127–49. Casetti, E. (1972) ‘Generating models by the expansion method: application to geographical research’, Geographical Analysis 4: 81–91. Casetti, E., King, L. and Jeffrey, D. (1971) ‘Structural imbalance in the U.S. urbaneconomic system, 1960–1965’, Geographical Analysis 3:239–55. Champion, A.G. (1988) ‘The reversal of the migration turnaround: resumption of traditional trends’, International Regional Science Review 11:253–60. Cho, D.W. and McDougall, G.S. (1978) ‘Regional cyclical patterns and structure, 1954– 1975’, Economic Geography 54:66–74. Cutler, A.T. and Hansz, J.E. (1971) ‘Sensitivity of cities to economic fluctuations’, Growth and Change 2:23–8. van Duijn, J.J. (1983) The Long Wave in Economic Life, London: George Allen & Unwin. Frey, W.H. (1988) ‘The re-emergence of core region growth: a return to the metropolis?’, International Regional Science Review 11:261–8. Geruson, R.T. and McGrath, D. (1977) Cities and Urbanization, New York: Praeger. Gottlieb, M. (1976) Long Swings in Urban Development, New York: National Bureau of Economic Research. Jeffrey, D., Casetti, E. and King, L. (1969) ‘Economic fluctuations in a multiregional setting: a bi-factor analytic approach’, Journal of Regional Science 9:397–404. Jones, D.W. (1983) ‘Mechanisms for geographical transmission of economic fluctuations’ , Annals, Association of American Geographers 73:35–50. Juglar, C. (1862) Des crises commercials et leur retour périodique en France, en Angleterre et aux Etats Unis, Paris: Librairie Guillaumin. Kitchin, J. (1923) ‘Cycles and trends in economic factors’, Review of Economic Statistics 5:10–16.
LONG-WAVE RELATIONSHIPS IN URBAN DEVELOPMENT 143
Kondratieff, N. (1925) ‘Long economic cycles’, Voprosy konyunktury, vol. 1, translated by G.Daniels, 1984, as The Long Wave Cycle, Richardson and Snyder. Krakover, S. (1983) ‘Identification of spatio-temporal paths of spread and backwash’, Geographical Analysis 15:318–29. Kuznets, S. (1930) Secular Movements in Production and Prices, Boston, MA: Houghton Mifflin. Manson, D.M., Howland, M. and Peterson, G.E. (1984) ‘The effect of business cycles on metropolitan suburbanization’, Economic Geography 60:71–80. Masotti, L.H. and Madden, J.K. (eds) (1973) Urbanization of the Suburbs. Urban Affairs Annual Review , vol. 7, Beverly Hills, CA: Sage. Mera, K. (1988) ‘The emergence of migration cycles?’, International Regional Science Review 11:269–76. Morrill, R.L. (1979) ‘Stages in patterns of population concentration and dispersion’, Professional Geographer 31:55–65. ——(1985) ‘Identification of spatio-temporal paths of spread and backwash: a comment’, Geographical Analysis 17:247–50. Muller, P.O. (1981) Contemporary Suburban America, Englewood Cliffs, NJ: PrenticeHall. Rostow, W.W. (1980) Why the Poor Get Richer and the Rich Slow Down, New York: Macmillan. Schnore, L.F. (1957) ‘Metropolitan growth and decentralization’, American Journal of Sociology 63:171–80. Singleton, S.H. (1973) ‘The genesis of suburbia: a complex of historical trends’, in L.H.Masotti and J.K.Madden (eds) Urbanization of the Suburbs. Urban Affairs Annual Review, vol. 7, pp. 29– 50, Beverly Hills, CA: Sage. Steiness, D.N. (1982) ‘Suburbanization and the “mailing” of America: a time series approach’, Urban Affairs Quarterly 17: 401–18. Thompson, W.R. (1965) A Preface to Urban Economics, Baltimore, MD: Johns Hopkins University Press. Whitehand, J.W.R. (1972) ‘Building cycles and the spatial pattern of urban growth’, Transactions, Institute of British Geographers 56: 39–55. Yeates, M. and Garner, B. (1980) The North American City, 3rd edn, San Francisco, CA: Harper & Row.
9 AN INVESTIGATION INTO THE DYNAMICS OF DEVELOPMENT INEQUALITIES VIA EXPANDED RANKSIZE FUNCTIONS C.Cindy Fan Most inequality measures are designed to evaluate the inequality within a system, or at most within well-defined partitions of a system. While the continuous variation of inequality appears to be of great relevance in the social sciences and elsewhere, it has not yet constituted a significant object of inquiry. This paper presents an approach to the measurement of systemic inequality that is conducive to investigating the continuous variation of inequality across contexts. This approach makes it possible to investigate how systemic inequality varies over time, across space, within the system itself, and in relation to whatever contextual dimensions appear relevant. The approach suggested is based on the expansion methodology. It involves the following. A parameter of a mathematical relationship between size and rank that is capable of estimation by regression is identified and interpreted as a measure of inequality. The parameters of the relationship, including the one that is the measure of systemic inequality, are redefined into functions of contextual variables by expansion equations. These expansion equations allow the modeling of contextual variation of systemic inequality in response to any systemic variation that might be of interest by replacing the expansion equations into the initial relationship. The terminal model thus derived may be used to test for and to portray a broad range of contextual variation of systemic inequality measures. The paper is structured as follows. First, the major issues associated with the use of conventional inequality measures are discussed. Then the inequality measures arising from rank-size formulations are placed in focus, and the application of the expansion method to these rank-size functions is presented. An application of expanded rank-size functions to the testing and measurement of temporal and systemic variation in development inequalities is then described. A section of conclusions completes the paper. CONVENTIONAL MEASURES OF INEQUALITY The early and simplest measures of inequality originate from the investigation of the scatter of data about measures of central tendency. Prominent among them are the variance, standard deviation, and coefficient of variation. However, good
THE DYNAMICS OF DEVELOPMENT INEQUALITIES 145
measures of inequality should be invariant to a change in the scale of measurement, a property not held by the variance and standard deviation. The coefficient of variation, while not scale dependent, may be unduly influenced by the presence of outliers. Another popular measure, the Gini coefficient, is cumbersome to implement (Gaile 1984) and can only be applied to a system as a whole. The coefficient is inevitably influenced by values at the upper end of the Lorenz curve, and major difficulties arise in comparative studies if Lorenz curves intersect. Further, there is no clear relation between the Gini coefficient calculated from a system and those calculated from partitions of a system. The Gini coefficient does not allow for decomposition or disaggregation of the spatial or social entities involved, and consequently cannot show how inequality varies within a system. In response to these limitations, the use of information theory as the basis for generating inequality measures has increasingly gained favor in the literature. Information statistics alleviate many problems confronting the approaches discussed above. They are not scale dependent and are not unduly affected by extreme values. They can be decomposed into additive terms corresponding to different levels of classification, for example different levels of spatial aggregation/disaggregation. Information statistics used to measure inequality are primarily based on the Shannon entropy approach (Theil 1967; Semple and Gauthier 1972; Perin 1975; Perin and Semple 1976; Semple 1977; Walsh and O’Kelly 1979). In spite of their ability to calculate within-group and betweengroup inequalities, the utility of information theoretic measures is limited by the need for a priori grouping and classification of spatial or social entities. Unless exhaustive combinations and disaggregation of data units has been explored, the identification of continuous variation in inequality within a system remains problematic. The inequality measures referred to above were primarily intended as devices to determine overall systemic inequality. At most they could cope with the relation between overall systemic inequality and the systemic inequality of welldefined and pre-specified partitions of a system. The issue of continuous variation within the system is not and cannot be addressed by the use of these measures. It can, instead, be addressed by the use of the expanded rank-size functions discussed in the next sections of this paper. Specifically, the use of rank-size functions to measure inequality will be discussed and an application using the expansion method will be presented. THE RANK-SIZE APPROACH TO THE MEASUREMENT OF INEQUALITY The rank-size function has been widely employed to investigate the relationship between city size and rank. Rank-size models were developed to formalize the observation that, when all cities of a region are ranked in decreasing order of population size, the size of a city of a given rank is related to the size of the largest
146 C.C.FAN
Figure 9.1 Zipf’s ideal rank-size distribution
city in that region (Garner 1967). Zipf (1949) used rank-size relationships to examine a variety of issues, and concluded that the distribution of human activities, ranging from language, art, economic power, and social status, was subject to natural laws which he then adopted as a standard for defining an ideal distribution. In his work both frequency and size were used to represent the magnitude of a measurable quantity.1 In this paper ‘size’ is used as a general term for magnitude. The simplest representation of the rank-size relationship is: (9.1) where y is size, r is rank (largest size: rank 1), and a and b are parameters. In logarithmic form, equation (9.1) becomes (9.2) where a is the intercept and b is the slope of the rank-size curve. In the case of city size distribution, y is the population of a particular city and r is its rank according to population size. When the coefficient b equals –1, it is referred to by Zipf as an ideal distribution which reflects the orderliness of human activities, namely the rank-size rule. The coefficient b is the derivative of the logarithmic function (9.2). It evaluates the percentage rate of change in size associated with the percentage rate of change in rank. The rank-size rule suggested by Zipf prescribes that, if the rank of a unit increases (or decreases) by a certain percentage, its size also increases (or decreases) by the same percentage. On doubly logarithmic paper, the rank-size rule will appear as a straight line
THE DYNAMICS OF DEVELOPMENT INEQUALITIES 147
descending from left to right at an angle of 45°, indicating a slope of –1 (Figure 9.1). Zipf examined briefly cases where the slope deviates from –1. This issue has been further investigated by geographers, especially in studies of city size distribution. Malecki (1975, 1980) explored the change in the slope over time as an indication of the relative rates of growth of higher ranking and lower ranking cities. For instance, if the coefficient b in equation (9.2) becomes more negative over time, larger cities have increased in size more rapidly than smaller cities, resulting in a steeper rank-size curve, which in turn indicates a trend toward population concentration. Conversely, if b becomes less negative in time, the slope decreases in steepness, which then implies a trend toward population deconcentration. Danta (1985, 1987) further suggested that the value of –1 provides a benchmark for estimating the population distribution within an urban system, since values of b more negative than –1 indicate a greater percentage of population living in the higher ranking cities than predicted by the rank-size rule. The coefficient b can therefore be interpreted as a measure of population concentration. More specifically, Danta clarified the meaning of ‘concentration’ as ‘relative proportion’, which evaluates the relative distribution of population contained within an urban system. The relation between inequality and rank-size functions can be more generally described by recognizing that, when certain quantities of a particular attribute are shared among social units, the level of concentration also reflects the degree of inequality (and equality). For instance, if a cake is evenly divided so that each person gets an equal share, this is a situation of perfect equality, where the level of concentration is at a minimum. Using the rank-size approach the coefficient b will approach zero as the cake’s distribution approaches perfect equality, and the rank-size curve will look very flat, as shown graphically in Figure 9.2(a). If one person gets close to getting the whole cake, then the maximum level of concentration is approached and b will approach negative infinity, which corresponds to perfect inequality. The rank-size curve will also become very steep (Figure 9.2(b)). Hence, the theoretical range of the coefficient b is between zero and negative infinity. In comparative terms, a more negative value implies a greater level of inequality, and a less negative value a lower level of inequality. The coefficient b in the above framework can be interpreted as a systemic measure of inequality. It is capable of estimation by regression, which facilitates the generation of models that can portray the continuous variation of inequalities within a system and with respect to relevant contexts using the expansion method. This approach will be presented by applying it to the investigation of the variation of development inequalities over time and by development levels in the sections below. An empirical analysis based on the approach suggested will follow.
148 C.C.FAN
Figure 9.2 (a) Perfect equality rank-size distribution; (b) perfect inequality rank-size distribution
A RANK-SIZE APPROACH TO DEVELOPMENT INEQUALITY The study of development inequalities among countries has attracted
THE DYNAMICS OF DEVELOPMENT INEQUALITIES 149
considerable attention in the social sciences since the 1950s. The emphasis of these studies has often been upon whether the so-called ‘development gap’ is getting bigger, i.e. whether the present level of development inequality is greater than in the past (Benoit 1972). Much research has provided evidence of a trend toward increasing levels of inequality between countries. Using a traditional approach, Zimmerman (1962) measured inequality by the Gini coefficient and found an increase in intercountry inequality from 1860 to 1959. The study by Summers et al. (1984), with the aid of a variety of indices, showed that inequality has gone up between 1960 and 1980. Along this same line, Lamers (1967) argued that a considerable acceleration of growth rates on the part of the developing countries was required for the development gap to show signs of closing. On the other hand, Russett’s (1965) study, which covered the time span from 1950 to 1975, found a slight decrease in overall inequality. However, his results should be interpreted with care. First, portions of his data came from projections instead of collections, and second, the improvement in development was found to be concentrated in middle-income countries, while the gap between poor countries and the rest was found to be widening. It is on this basis that Russett predicted a persistence of serious inequalities. Most of these studies rely on systemic measures which obscure the variation of development inequalities between countries. Russett’s study, for example, illustrates how a systemic measure may be inadequate for capturing the undercurrents within a system. These undercurrents are only visible when the continuous variation of development inequalities between countries is placed in focus. The temporal dynamics of these variations is another important theme that needs to be addressed. As Sicherl (1973) has argued, social and economic development is by nature a long-run phenomenon and needs to be analyzed in a temporal dimension. When the focus of the research is on the dynamic nature of inequality, the rank-size function proves to be a useful and flexible measure that can be redefined into functions capable of portraying the dynamics of inequality in a wide variety of contexts, a feature not shared by the conventional measures outlined previously. For example, inequality between countries changes over time and this temporal dimension can be easily incorporated into the rank-size function via routines specified by the expansion method (Casetti 1972, 1986). In the Initial model’, equation (9.2), y becomes the indicator of development and r stands for the rank of a country according to y. Both the parameters a and b in this initial model may change with time and therefore can be defined as functions of time (t): (9.3) (9.4) Substituting these ‘expansion equations’ into the initial model (9.2) yields the following ‘terminal model’: (9.5)
150 C.C.FAN
which is capable of capturing the temporal shift of the inequality measure b. Namely, if the coefficient b1 is positive and significant, inequality has decreased over time, and conversely, if it is negative and significant, inequality has increased over time (Malecki 1975, 1980; Danta 1985, 1987). A similarity between the coefficient b and the conventional inequality measures discussed earlier is that they are single measures of systemic inequality. They give a very rough picture of the magnitude of overall inequality, but are not capable of revealing detailed relationships between entities in different portions of the system. Both mask a situation in which the inequality between some countries is greater than that between others. Moreover, if the continuous variation within the system changes over time, these changes will not be reflected by observing the temporal shift of single systemic measures. Yet the advantage of using the rank-size approach is that the function can be easily expanded to incorporate variation in inequality within the system as well as its dynamics when the system evolves. The continuous variation of inequality is most obvious when the rank-size function exhibits a nonlinear relationship. Figure 9.3 illustrates four possible cases. In the first case, there is a greater inequality between high ranking countries than between low ranking countries so that the slope of the rank-size curve becomes less steep with increasing rank. The second curve (Figure 9.3(b)) is concave to the origin, implying that inequality is greater between low ranking countries than between high ranking countries, and so the slope becomes steeper with increasing rank. The step-like curve in the third case (Figure 9.3(c)) indicates a low level of inequality between high ranking countries and also between low ranking countries, but there is a large gap between these two groups of countries. This step-like feature may repeat itself, as illustrated in Figure 9.3(d), where natural groupings are generated due to the obvious gaps in the levels of development between them. In the above four cases, a linear rank-size function portrayed by equation (9.2) will yield a poor fit, rendering the interpretation of the inequality measure b ambiguous. Instead, the continuous variation of inequality can be captured by examining the change of b with rank, which represents the variation of inequality with levels of development. Again, this can be done within the framework of the expansion method, which greatly enhances the interpretability of the coefficient. The coefficient b in the initial model (9.2) should then be defined as a function of rank. Various specifications of the function are possible. Specifically, b may be defined as a linear function of rank (r), of the square of rank (r2), of the logarithm of rank (In r), or of the inverse of rank (r−1). These would yield the following expansion equations respectively: (9.6) (9.7) (9.8) (9.9)
THE DYNAMICS OF DEVELOPMENT INEQUALITIES 151
Figure 9.3 Nonlinear rank-size curves
Expansion equation (9.6) specifies that inequality is a linear function of rank. Expansion equation (9.7) indicates that inequality changes with rank exponentially, which describes cases where inequality increases or decreases
152 C.C.FAN
more rapidly at lower ranks. In contrast, if inequality changes more rapidly at higher ranks but less rapidly at lower ranks, it can be expressed as a function of In r (equation (9.8)). Similarly, expansion equation (9.9) can be used when inequality increases or decreases at a decreasing rate with rank. Substituting each
THE DYNAMICS OF DEVELOPMENT INEQUALITIES 153
of these expansion equations into the initial model (9.2) generates the following terminal models respectively: (9.10) (9.11) (9.12) (9.13) The choice of an appropriate specification should depend, firstly, on a priori theoretical explanations of the relationship between inequality and development levels, and secondly, on the goodness of fit of the model. The parameters in these terminal models can be estimated via regression. The coefficients of determination (R2) from the regression analyses may be used to indicate which of these specifications gives the best fit with respect to the relationship between inequality and rank. Statistical tests on the coefficient b1 will indicate whether the hypothesized relationship is significant. The continuous variation of development inequalities between countries is also likely to change as the system evolves in time. Namely, the relationship between inequality and development levels may not be temporally stable. This can again be easily portrayed and interpreted using the expansion method, with the parameters shown in equations (9.10)– (9.13) further expanded as functions of time. To minimize complexities in the final terminal model, the indicator of development can be converted into an index: (9.14) where yi is the indicator of development of the ith country, ya is the indicator of development of the highest ranking country, and k is a constant (100). Since each of the ys is expressed as a fraction of the y of the highest ranking country, the expansion of the parameter a in equations (9.10)–(9.13) (which is the estimate of the indicator of development of the highest ranking country) becomes unwarranted. Instead, only the parameters b0 and b1 in (9.10)–(9.13) are redefined as functions of time. If a linear function of time t is deemed appropriate, the following expansion equations are selected: (9.15) (9.16) Substituting equations (9.15) and (9.16) into, for example, equation (9.10) generates the following terminal model, which captures the dynamics of inequality with rank and in time: (9.17) Upon estimation, the interpretation of parameters in (9.17) should be focused on the original parameter b, which is the basic measure of inequality: (9.18)
154 C.C.FAN
The derivative of b with respect to t evaluates the change of inequality over time. Namely, if db/dt<0, inequality increases over time, and if db/dt>0, inequality decreases over time. With respect to equation (9.18), (9.19) which can be easily interpreted to indicate that inequality increases more, or less, over time at particular ranks by substituting selected values of r. The above formulation illustrates that by using rank-size functions the inequality measure becomes a regression coefficient b. By expanding b, this approach makes it possible to investigate how inequality changes within a system as well as how it behaves over relevant contexts such as time. The empirical analysis in the next section will demonstrate this approach. EMPIRICAL ANALYSIS In this empirical analysis per capita gross domestic product (GDP) is chosen as an indicator and summary measure of the level of development of a country. This is consistent with most of the literature on development inequalities and therefore facilitates comparison. The dynamics of development inequalities is most visible in a long time span and the constraint of data availability renders per capita GDP the only acceptable indicator of development. This is not to suggest that it is superior to other available development indicators. In fact, development inequalities can also be studied from a multivariate approach. Cole (1981), for instance, examined different aspects of production and consumption to show changes in development inequalities. When particular aspects of development are involved, a multivariate approach may be more appropriate. In this paper the focus is placed upon the inequality of the overall level of development between countries, for which purpose per capita GDP is considered a convenient and adequate basic country-level indicator of development. The data set consists of the per capita GDPs of thirty-eight countries2 for selected years covering the time span from 1913 to 1980. The number of observations reflects a compromise between representativeness and time span. As one goes further back in history, the list of countries with reliable data becomes shorter. Although these thirty-eight countries are only a subset of all the countries in the world, they do represent a good cross-section. Both developed and developing countries as well as communist and nonTable 9.1(a) Estimates for linear rank-size functions Year
a
b
R2
1913 1929 1950
8.7282 9.0016 9.5514
–0.8288 –0.8622 –0.9240
0.8522 0.8440 0.7478
THE DYNAMICS OF DEVELOPMENT INEQUALITIES 155
Year
a
b
R2
1960 1970 1980
9.8690 10.4165 11.2070
–0.9515 –1.0219 –1.0833
0.7706 0.6699 0.6390
communist countries are represented in the data set.3 The total population of the countries included amounts to more than 65 percent of the world’s 1980 population. Data for the years 1950, 1960, 1970, and 1980 come from the World Bank (1981, 1983). The data for 1913 and 1929 are based on Zimmerman’s (1962) study. For each of these six points in time the countries are ranked in decreasing order according to their per capita GDPs (in 1980 US dollars), which generates the variables r and y respectively for the various rank-size functions used in the following analyses. Figures 9.4(a)–9.4(f) show six scatter diagrams, each for one point in time, relating the logarithm of per capita GDP to the logarithm of rank. An obvious feature is that, if rank-size curves were drawn to link up the data points, they would not tend to be linear. To illustrate this observation more clearly, a linear rank-size function (equation (9.2)) was separately estimated for each year. The results, shown in Table 9.1(a), indicate only moderate R2 values. Table 9.1(b) Estimates for linear rank-size function expanded in time Parameter
Estimate
a0 a1 b0 b1 Note: R2 = 0.7819.
8.4965 0.0348 –0.8081 –0.0037
To assess temporal relationships existing in the data, I pooled the six years of cross-sections and expanded parameters a and b as suggested in equations (9.3) and (9.4). The value of t equals zero for the initial year 1913 and 67 for the final year 1980. The estimates for terminal model (9.5) are reported in Table 9.1(b). The negative sign of b1 implies that inequality has been increasing everywhere in the array of countries over time. However, the again moderate R2 value suggests that this model may yet be improved. In addition, the scatter diagrams (Figures 9.4(a)– 9.4(f)) clearly show that increasing inequality is apparent only at the lower end of the array, i.e. between low ranking countries, while b1 addresses the entire array. This exercise shows that erroneous conclusions can be drawn if a single measure of systemic inequality is used instead of an investigation of the continuous variation of inequality within a system. Like other systemic measures,
156 C.C.FAN
Figure 9.4 (a) Scatter diagram of ln y and ln r, 1913; (b) scatter diagram of ln y and ln r, 1929; (c) scatter diagram of ln y and ln r, 1950; (d) scatter diagram of ln y and ln r, 1960; (e) scatter diagram of ln y and ln r, 1970; (f) scatter diagram of ln y and ln r, 1980
only the overall trend is measured, albeit inaccurately, whereas differentiations and important dynamics within the system have been overlooked. It is clear from the scatter diagrams shown in Figures 9.4(a)–9.4(f) that development inequality changes with rank. Consequently, the inequality measure b was redefined into alternative functions of rank, following equations (9.6)–(9.9). Estimates of the respective terminal models for 1913, 1950, and 1980 are reported in Table 9.2. It should be noted that per capita GDP has been converted into an index, as demonstrated in equation (9.14), so that it would not be necessary to account for the temporal shift in the parameter a in subsequent analyses. The models shown as (a) and (b), where b is expressed as a linear
THE DYNAMICS OF DEVELOPMENT INEQUALITIES 157
function of r and as a linear function of r2 respectively, are superior to the other two, as demonstrated by the higher R2 values. Hence inequality appears to change linearly or at an increasing rate with rank. Substituting the respective estimates for b0 and b1, into equations (9.6) and (9.7) provides us with an interpretation of the relationship between inequality and rank. The estimates for b1 are negative and significantly different from zero in both models, suggesting that development inequality becomes greater among countries at the lower end of the array. Thus, development inequality is smaller among developed countries
158 C.C.FAN
Table 9.2 Estimates for terminal model: in y' =a+ [b=f(r)]ln r Year 1913 1950 1980
1913
a
b0
(a) b=f(r)=b0+b1r 4.4712 –0.2171 (67.61) (–5.79) 4.3030 –0.0733 (24.38) (–0.73) 4.2446 0.3765 (31.23) (4.89) (b) b=f(r)=b0+b1r2 4.6462 –0.4340
b1
R2
–0.0141 (–18.51) –0.0196 (–9.65) –0.0336 (–21.51)
0.9863
–2.7×10−4
0.9311 0.9746
0.9803
THE DYNAMICS OF DEVELOPMENT INEQUALITIES 159
Year 1950 1980
1913 1950
a
b0
(63.68) (–12.87) 4.4879 –0.3375 (31.67) (–5.16) 4.5926 –0.0966 (60.35) (–2.75) (c) b=f(r)=b0+b1 In r 4.2545 0.3847 (34.88) (3.34) 4.0988 0.6465
b1 (–15.10) –4.1×10−4 (–11.56) –6.8×10−4 (–36.20) –0.2822 (–10.87) –0.3653
R2 0.9476 0.9906
0.9662 0.8826
160 C.C.FAN
Year 1980
1913 1950 1980
a
b0
(15.14) (2.53) 3.8499 1.6649 (11.89) (5.45) (d) b=f(r)=b0+b1r−1 4.6185 –0.7148 (16.72) (–10.69) 4.6566 –0.7917 (10.34) (–7.26) 4.7126 –0.8318
b1
R2
(–6.34) –0.6392 (–9.27)
0.8955
1.9327 (2.78) 2.2405 (1.98) 4.2606
0.8789 0.7731 0.6959
THE DYNAMICS OF DEVELOPMENT INEQUALITIES 161
Year
a
b0
(7.13) (–5.20) Note: t values are in parentheses.
R2
b1 (2.56)
Table 9.3 Estimates for terminal models (9.20) and (9.21) Parameter
Estimate
t value
Model (9.20) a b00 b01
4.3673 –0.2333 0.0065
75.51 –5.95 11.36
162 C.C.FAN
Parameter b10 b11 Model (9.21) a b00 b01 b10 b11
Estimate
t value
–0.0126 –2.5×10–4 F=1, 234.2611
–11.90 –11.54 R2=0.9568
4.6025 –0.4644 0.0043 –2.3×10–4 –5.8×10–6 F = 1, 639.9185
99.10 –18.70 12.89 –11.82 –14.04 R2=0.9671
Note: All p ′ 0.0001.
and greater among developing countries. This type of continuous variation should become an important consideration when theoretical partitions of the system need to be generated. Finally, the temporal dynamics of these relationships can be investigated by first pooling the data and then expanding b0 and b1 as linear functions of time t, as illustrated in equations (9.15) and (9.16). A linear function of t is preferred because the derivative of b with respect to t can then be easily interpreted. In this instance we retain only models employing the expansion in terms of r and r2. Thus, the two terminal models capturing the drift of inequality with rank and with time are and (9.20) and (9.21) respectively. The resultant estimates of these models, obtained by stepwise regression, are reported in Table 9.3. Let us restrict the interpretation to the b coefficient, the original inequality measure. By substituting the significant estimates of Table 9.4 Values of db/dt for selected ranks Rank
db/dt=b01+b11r
db/dt=b01+b11r2
5 15 25 35
0.005280 0.002745 0.000210 –0.002326
0.004169 0.003009 0.000690 –0.002789
equations (9.20) and (9.21) into (9.22)
THE DYNAMICS OF DEVELOPMENT INEQUALITIES 163
(9.23) respectively, and then taking the derivative of b with respect to t, we obtain the following: (9.24) (9.25) Since the derivatives evaluate the change of inequality over time, substituting selected values of r into equations (9.24) and (9.25) will illustrate whether inequality increases or decreases over time at certain ranks. For example, Table 9.4 lists the values of the derivatives for ranks 5, 15, 25, and 35. The results suggest that inequality tends to decrease more over time at higher ranks, less at middle ranks, and actually increases at lower ranks. In other words, inequality between the developed countries at the upper end of the array has decreased, while that between the developing countries at the lower end of the array has increased over the period 1913–80. The net effect of this is a widening development gap between developed and developing countries. CONCLUSIONS This paper articulates an approach to the measurement of inequality that is superior to conventional systemic measures. The rank-size function provides a convenient measure of systemic inequality which, through the application of the expansion method, can lead to further understanding of inequality with respect to whatever contextual dimensions appear relevant. The measurement of continuous inequality within a system and its shift in the context of time has been demonstrated through an empirical analysis of development inequalities between countries. The empirical analysis reveals the essential realities of development inequalities that would have been obscured by using conventional methodologies for the investigation of inequality. By allowing the parameters in the rank-size function to drift over relevant contexts—level of development (rank) and time— the expansion methodology makes it possible to investigate the continuous variation of development inequality within a system and allows an exploration of its temporal dynamics. The findings confirm that development inequality is not stable across development levels; instead, inequality is smaller among developed countries and is greater among developing countries. It was also found that these tendencies have intensified over time. The investigation of inequality within the above frame of reference can be used as a building block for generating natural partitions among the social units involved, as opposed to relying on a priori categories as is required in conventional approaches. The approach outlined here is also very flexible, as it is capable of addressing any aspect of systemic inequality that might be of interest
164 C.C.FAN
in a wide variety of situations. While the empirical example in this paper has demonstrated variations in the context of development level and time, the methodology is ultimately capable of investigating how inequality varies across an open-ended variety of variables. NOTES 1 Zipf examined the relationship between frequency and rank in issues such as the economy of words in James Joyce’s Ulysses and the number of intervals in Mozart’s Bassoon Concerto in Bb Major. When investigating metropolitan districts in the USA, he substituted population size for frequency in the relationship. 2 The high-income oil-exporting countries are not included since the tremendous increase in their per capita GDPs in the 1970s was very much a result of the extraction and export of petroleum that may not throw light on the improvement in their levels of ‘development’, which should instead involve certain components in the economies that could enhance sustaining economic growth in the future. Countries which have gained their independence after the Second World War, many of them African countries, are also not included because the time duration appears to be too short to be capable of revealing major relationships associated with development inequalities. 3 See World Bank (1983) for the schemes of classification of countries.
REFERENCES Benoit, E. (1972) ‘Closing the international income gap: is it urgent?’, Columbia Journal of World Business 7:27–33. Casetti, E. (1972) ‘Generating models by the expansion method: application to geographical research’, Geographical Analysis 4: 81–91. ——(1986) ‘The dual expansion method: an application for evaluating the effects of population growth on development’, IEEE Transactions on Systems, Man, and Cybernetics SMC–16:29– 39. Cole, J.P. (1981) The Development Gap. A Spatial Analysis of World Poverty and Inequality, New York: Wiley. Danta, D.R. (1985) ‘Identifying agglomerative-deglomerative trends in the Hungarian urban system, 1870–1980’, Ph.D. dissertation, Ohio State University. ——(1987) ‘Identifying urban turnaround in Hungary’, Urban Geography 8:1–13. Gaile, G.L. (1984) ‘Measures of spatial equality’, in G.L.Gaile and C.J.Willmott (eds) Spatial Statistics and Models, pp. 223–33, Boston, MA: Reidel. Garner, B.J. (1967) ‘Models of urban geography and settlement location’, in R.J.Chorley and P.Haggett (eds) Socio-Economic Models of Geography, pp. 303–60, London: Methuen. Lamers, E. (1967) ‘How fast will the gap close?’, International Development Review 9: 30–3. Malecki, E.J. (1975) ‘Examining change in rank-size systems of cities’, Professional Geographer 27:43–7.
THE DYNAMICS OF DEVELOPMENT INEQUALITIES 165
——(1980) ‘Growth and change in the analysis of rank-size distributions: empirical findings’, Environment and Planning A 12: 41–52. Perin, D.E. (1975) ‘Spatial income inequalities in the United States, 1953–1972’, Ph.D. dissertation, Ohio State University. Perin, D.E. and Semple, R.K. (1976) ‘Regional trends in regional income inequalities in the U.S.’, Regional Science Perspectives 6: 65–85. Russett, B.M. (1965) Trends in World Politics, New York: Macmillan. Semple, R.K. (1977) ‘Regional development theory and sectoral income inequalities’, in J.Odland and R.N.Taaffe (eds) Geographical Horizons, pp. 45–67, Dubuque, IA: Kendall/Hunt. Semple, R.K. and Gauthier, H.L. (1972) ‘Spatial-temporal trends in income inequalities in Brazil’, Geographical Analysis 4:169–80. Sicherl, P. (1973) ‘Time-distance as a dynamic measure of disparities in social and economic development’, Kyklos 26:559–75. Summers, R., Kravis, I.B. and Heston, A. (1984) ‘Changes in the world income distribution’, Journal of Policy Modeling 6:237–69. Theil, H. (1967) Economics and Information Theory, Chicago, IL: Rand McNally. Walsh, J.A. and O’Kelly, M.E. (1979) ‘An information theoretic approach to measurement of spatial inequality’, Economic and Social Review 10:267–86. World Bank (1981) World Development Report, New York: Oxford University Press. ——(1983) World Tables, vol. I, Economic Data, Baltimore, MD: Johns Hopkins University Press. Zimmerman, L.J. (1962) ‘The distribution of world income 1860– 1960’, in E.de Vries (ed.) Essays on Unbalanced Growth, pp. 28– 55, S-Gravenhage, The Netherlands: Mouton. Zipf, G.K. (1949) Human Behavior and the Principle of Least Effort, Cambridge, MA: Addison-Wesley.
10 IDENTIFYING HIERARCHICAL DEVELOPMENT TRENDS IN THE HUNGARIAN URBAN SYSTEM USING THE EXPANSION METHOD Darrick R.Danta The expansion method is a procedure that involves the construction of complex terminal models from simple initial models. It is a valuable research tool from at least three perspectives. First, it forces the researcher to question the usually implicit assumption of parameter stability inherent in most models and empirical analyses. Second, the expansion method formalizes the exploration of various types of parameter change, or drift, along such relevant dimensions as time or distance. Discovery of drift in these cases may lead to refinement of existing theory or to the formation of new constructs to account for the identified patterns. Third, the method allows for the derivation of specific models that can be used in rigorous hypothesis-testing frameworks aimed at determining the validity of competing theoretical propositions. In this paper, the expansion method is used in an exploratory fashion to identify the development dynamics of the Hungarian urban system over the period 1870–1986. The accent is on application: on showing how the expansion method can be used as a base for asking relevant questions concerning the development tendencies of an urban system; on formulating an appropriate model to identify the presence and particulars of parameter drift; and finally on demonstrating how the results of the analysis can be used to derive new theoretical constructs and research avenues. The paper is organized as follows. In the next section, some aspects of the theory of urban systems development are reviewed and extended. Next, the data for the analysis and background material for Hungary are presented, followed by the construction of the expansion model. The results of the regression analysis and their interpretation are then given. The paper ends with a summary and concluding remarks. THEORY One goal of urban geographers has been to identify, explain, and predict patterns associated with urban systems development: the progressive restructuring of population within settlement networks through differential city growth rates brought on by, and themselves influencing, various geographic, historic, economic, demographic, and political forces. The current state of understanding
DEVELOPMENT TRENDS IN THE HUNGARIAN URBAN SYSTEM 167
Figure 10.1 The urban turnaround model
can be summarized by what is referred to here as the urban turnaround model. In this framework, urban systems are seen to progress through an essentially twostep pattern involving a phase of increasing levels of spatial-hierarchical concentration, or agglomeration, followed by a phase of falling levels, or deglomeration (Alonso 1968; Casetti 1968, 1984; El-Shakhs 1972; Bourne 1980; Gaile 1980). In terms of quantitative relationships, if a rank-size distribution is taken as the norm, then agglomeration is registered as an increase in overall slope; deglomeration as a decrease in slope; and urban turnaround as the transition from one growth dynamic to the other (Figure 10.1). This framework in general, and various second-phase growth patterns in particular, have received considerable attention recently (Richardson 1980; Smith et al. 1983; Townroe and Keen 1984; Ogden 1985; Vining 1986; Kontuly and Vogelsang 1988) and has been shown to apply to the developing Hungarian system (Danta 1987b). But although the urban turnaround model rests on sound theoretical underpinnings and empirical verification, it is simple and may mask more complicated, interesting, and significant patterns. Specifically, the model rests on the assumption that agglomerativedeglomerative trends occur in unison across the entire urban hierarchy; in other words, that cities at each rank position simultaneously experience the dominant prevailing growth dynamic so that the rank-size curve remains a straight line throughout the course of development. Placing this within an expansion method context, the rank-size slope parameter is assumed to be stable with respect to rank. The assumption of parameter stability inherent in the urban turnaround model, however, may not be warranted. For example, development usually does not occur evenly across the urban hierarchy; rather, one or at most a few cities are often propelled into a state of primacy, while at the same time the smallest places may be drained of their populations. This type of development pattern—termed polarization here—results in departures from a linear rank-size distribution in the form of concavity in the upper reaches of the curve and convexity in the lower portions. At some point in time, though, this trend may reverse and thus usher in trickle-down effects which diffuse the momentum of growth to the rest of the system, thereby balancing the curve (Figure 10.2). But how exactly do polarization and trickle-down effects manifest themselves in a developing urban
168 D.R.DANTA
Figure 10.2 Polarized growth
system? Does the system experience simple increasing and then decreasing concavity/convexity? Or is the pattern more complex, involving different phases at each rank position? How do the patterns change through time? Do trends at the low end of the distribution mirror those at the top, or are they independent? Or is the urban turnaround model sufficient to account for development trends? Trying to answer these questions using conventional analyti cal techniques would be difficult; however, the task can be greatly facilitated by placing the problem within an expansion method context, as will be demonstrated in a later section. First, though, a discussion of the data set and background to Hungary is in order. DATA Data for the analysis consist of decennial census population totals for 96 Hungarian cities spanning the interval 1870– 1986 (1986 data are derived from registry). The cities are located within the present territory of Hungary and the data are enumerated to constant city boundaries. However, only cities with population greater than 10,000 were used in the analysis; consequently, the number of cities in any given time period ranged from 45 in 1870 to 93 in 1986 for a total of 908 observations. The remarkable feature of the Hungarian urban system is its degree of primacy. Budapest, currently at a little over 2 million population, has long dominated the cultural, economic, and political life of the country. The next tier of the urban hierarchy consists of the five regional centers, which ring the capital in classic central place fashion and contain around 200,000 population. Below this level are centers with generally less than 100,000 population that perform secondary administrative, manufacturing, or service functions, while smaller cities mainly perform local service functions. Places less than 10,000 generally do not exhibit demographic dynamism (Enyedi 1976:235) and consequently were excluded from the data set. The development of the Hungarian urban system has been affected by both external influences and internal policies. Early development was checked by first Turkish and then Austrian domination, thereby postponing largescale economic growth until about 1870. After this date, industrialization occurred rapidly in and around Budapest, soon launching that city into greater primacy. During the latter part of the nineteenth and early twentieth centuries, growth remained centered in
DEVELOPMENT TRENDS IN THE HUNGARIAN URBAN SYSTEM 169
Figure 10.3 Hungarian rank-size distributions
the capital, although some mining, processing, and market towns flourished. Damage suffered during the First World War, especially through the dismemberment of the Austro-Hungarian Empire, followed by further
170 D.R.DANTA
devastation in the Second World War shattered the economy and wrought considerable physical damage. After the war, the new government quickly initiated Soviet-style heavy industrialization based mainly in Budapest. However, the identified goals of early and subsequent regional development policies have been to reduce the ‘Budapest problem’ and to balance the urban hierarchy and regional economic structure of the country through what amounts to a growth pole strategy focused on the five regional centers (see Danta 1987a). A crude indication of the development tendency of the Hungarian urban system is gained through examination of the rank-size distributions for the ninety-six cities for selected time periods. As seen in Figure 10.3, Budapest in 1870 had already eclipsed the next largest city’s growth to attain a pronounced level of primacy. The cities in ranks 2 to about 50, however, show a more or less rank-size distribution, while the curve for cities less than 10,000 falls off rapidly. Growth over the next two periods (to 1930) is confined largely to Budapest and the highest ranking centers, with little increase being registered by the lowest ranking places. The Hungarian urban system then experienced allometric growth over the period 1930–60, as indicated by parallel shifts of the rank-size curve at most positions. Afterwards (to 1986) the middle and lower portions of the curve show the greatest increase. In summary, the Hungarian urban system appears to have experienced a period of polarization to about the Second World War followed by trickle-down and a balancing of the system achieved through the more rapid growth of the intermediate-and small-sized cities. This pattern corresponds quite well to the ideas outlined above, although the dynamics of the urban system need to be more rigorously evaluated before precise conclusions can be reached. THE MODEL The analysis begins by adopting as the initial model the rank-size formula, expressed as: (10.1) where In p and In r are the natural logarithms of city population and rank; c is the estimate of the logarithm of population for the rank 1 center; and q is the rank-size slope coefficient. The parameter q is particularly useful in studies of urban systems development since it assesses the elasticity of population and as such is a measure of hierarchical concentration within an urban system. Furthermore, an increase in the magnitude of q denotes agglomeration; while a decrease in q portends deglomeration. Such growth tendencies can be determined by solving (10.1) for time series population and rank data for each available period, using the expansion method to redefine q as a linear function of time to identify simple trends (Malecki 1975) or, alternatively, as a quadratic of time to capture a switch in growth indicative of urban turnaround (Danta 1987b). These
DEVELOPMENT TRENDS IN THE HUNGARIAN URBAN SYSTEM 171
types of analyses, however, treat the system as a unit; no insight can be gained into tendencies at specific rank positions. The suggestion was made earlier that participation in the development process may occur unevenly with respect to the various levels of the urban system; i.e. all rank positions may not necessarily possess the same degree of hierarchical concentration at a particular point in time. To capture this type of nonlinear drift, q may be expanded as a quadratic function of rank, which is sufficient to identify patterns of polarization and trickle-down occurring at both the high and low ends of the urban system since the terminal model would be a cubic with potentially two inflection points. The expansion of q is thus: (10.2) 2 where In r is the logarithm of rank and (In r) is the square of the logarithm of rank. The parameters of (10.2) represent the drift of q within an urban system. If both q1 and q2 are zero, then the urban system is best characterized as a straight line and hence levels of concentration are equal at each rank position. For q1 and/ or q2 positive/negative, levels of relative concentration decrease/increase down the urban hierarchy. Expansion equation (10.2) could be replaced into (10.1) to obtain a terminal model capable of assessing the drift of q across the urban hierarchy for a particular time period, or to determine the average condition from a time series of data. However, the expansion method can be applied again to construct an equation that simultaneously captures the drift of q across rank and through time. This model is specified by expanding the parameters q0, q1 and q2 of equation (10.2) as quadratics of time, reflecting the desire to capture the changing temporal behavior of the drift of q with respect to rank: (10.3) (10.4) (10.5) where t equals time in years from the initial period (1870=0; 1880=10, etc.). In addition to the expansion of q, the parameter c in equation (10.1) is also expanded as a quadratic function of time to allow for nonlinear growth of the rank 1 city: (10.6) Substituting the right-hand sides of expansion equations (10.3)–(10.6) into (10.1) yields the terminal model: (10.7)
172 D.R.DANTA
Equation (10.7) is a cubic and thus can accommodate nonlinear departures from a straight line rank-size curve identified with polarized growth. In this instance, the terminal model is applied to the data set to see which, if any, of the expansion parameters are significant. Because the range of potential outcomes is wide, no specific a priori outcomes are anticipated and so interpretations must await the results. The investigation is thus couched within an exploratory mode of inquiry rather than a strict hypothesis-testing context since precisely defined competing frameworks do not exist to guide such an analysis. RESULTS Equation (10.7) was applied to the Hungarian data for cities greater than 10,000 population using stepwise multiple regression with entry and exit criteria set at 0. 05 and 0.1. The results are presented in Table 10.1. They indicate that all the parameters in equation (10.7) are significant at very high significance levels; furthermore, the signs are consistent and interpretable. The actual patterns of population agglomeration and deglomeration occurring within the Hungarian urban system over the study period are indicated by the Table 10.1 Results of rank and time expansion analysis: Hungary’s urban system Term
Estimate
c0 12.51431958 c1 0.03370827 c2 –0.00016598 q00 –2.44678404 q01 –0.02727577 q02 0.00025043 q10 0.92235195 q11 0.00924623 q12 –0.00010402 q20 –0.13149126 q21 –0.00102780 q22 0.00001316 Note: DF=11, 897; R2=0.9823.
Significance 0.0000 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0004 0.0001
estimated expansion equation for q, obtained by replacing the letter parameters of equations (10.3)–(10.5) by their numerical estimates:
DEVELOPMENT TRENDS IN THE HUNGARIAN URBAN SYSTEM 173
(10.8)
The behavior of q, and hence the characteristics and growth dynamics of the Hungarian urban system, can be determined by solving equation (10.8) for values of time and rank. For example, Figure 10.4 plots estimates of the values of q for cities greater than 10,000 by rank for 1870, 1900, 1930, 1960, and 1986. In this graph, the higher a point is on the ordinate, the greater is the magnitude of q and hence the greater is the level of hierarchical concentration existing at the particular rank position indicated on the abscissa. Likewise, the steeper the slope of the curve, the greater is the rate of change of q with respect to rank and hence the greater is the degree of nonlinearity present in the original rank-size distribution. Since a straight, horizontal line on the graph would indicate equal values of q at each rank position, a tendency for the curves in Figure 10.4 to steepen or flatten through time indicates increasing or decreasing polarization. The curves in Figure 10.4 can be interpreted as follows. The steepness of the curve for 1870 indicates that the Hungarian urban system was unbalanced even at the onset of the study period, prior to the first wave of industrialization. Between 1870 and 1900, the intercept, which is the estimate of q for Budapest, rises dramatically, while only modest gains are registered by the small centers. This pattern is indicative of rapid polarization within the Hungarian urban system during the last thirty years of the nineteenth century. From 1900 to 1930, however, the curve shifts outward almost equally at each rank position, denoting proportional growth. The minimum point, though, continues to be displaced to the right, indicating further polarization effects in the small cities. After 1930, the inter cept value drops sharply, denoting the beginning of trickle-down effects in the larger centers, while the values of q continue to increase for the segment of the curve corresponding to the lower ranking centers. The intercept value continues its decline from 1960 to 1986 to below 1870 levels, while the smaller places remain near their 1960 levels of concentration. These results clearly document the phase of polarization up to around 1930, followed by trickle-down effects and a general leveling of relative population concentration within the Hungarian urban system, especially during the latter periods. These findings are commensurate with those for the simple rank-size curves discussed earlier, though these more precise specifications allow greater insight into the development dynamics operating in Hungary over the time horizon under investigation. Specifically, the results demonstrate conclusively that the system is more balanced now than at the beginning of the study period and that deglomerative tendencies set in some thirty years earlier than identified before. As for theory, these findings lend support to the notion of nonlinear drift
174 D.R.DANTA
Figure 10.4 Estimates of q by rank
in rank-size distributions occurring during polarized development; namely, the increasing and then decreasing concavity/convexity in the rank-size curves. Finally, trends in the largest and smallest cities appear to parallel one another. A further test, however, is needed to specify the growth dynamics more precisely, in particular the timing of the switch from agglomeration to deglomeration for the various sized places. These dates can be determined from equation (10.8) and are shown in Figure 10.5. This graph indicates the estimated time (year) when a particular rank level (as shown on the ordinate) experienced the switch from increasing to decreasing magnitudes of q; in other words, from agglomeration to deglomeration. The graph shows clearly that the timing of the switch was not uniform across the urban hierarchy: the switching phenomena began in Budapest in 1924 and then proceeded to successively smaller centers through time, ending in 1949 at rank position In 3.4 (30). This finding documents the manner in which
DEVELOPMENT TRENDS IN THE HUNGARIAN URBAN SYSTEM 175
Figure 10.5 Timing of switch of q from increasing to decreasing
growth was diffused to successively smaller centers during economic maturation and diversification. Although not as pronounced as for the large centers, the graph also indicates that the switch in growth dynamics occurred earliest in the smallest cities and then proceeded up the hierarchy through time to larger ones. This behavior can be explained as follows. During the early phases of economic development, the very small cities, which possessed limited economic base to begin with, became even less viable due to the pull of the larger centers. The result was rapid outmigration that reduced the population of the small cities, thereby skewing the rank-size curve downward. This decline, however, still resulted in an increase in the local slope of the rank-size curve and hence in the magnitude of q. Likewise,
176 D.R.DANTA
trickle-down implies a reversal of polarization, which occurs after the magnet effect of the largest places diminishes and the cities engage in sustained growth, even though such growth results in decreasing values of q. For small cities, then, polarization and trickle-down effects are registered by increasing and decreasing magnitudes of q just as for the large ones; however, these trends are produced through opposite processes. In the case of Hungary, the graph indicates that deglomerative trends commenced in the smallest city in the data set (rank position In 4.33 (76), which had a 1941 population of 10,008) in 1940 and progressed up the hierarchy to meet the cycle coming down from above. These results are quite interesting, indicating as they do the tendency for the urban system to develop through a series of agglomerative and then deglomerative phases that begin in the largest/smallest sized cities and then progress through time down/up the urban hierarchy. This pattern, which heretofore has not been described in the literature, is introduced here as the cascading cycles model of urban systems development. This framework still needs to be placed on firmer theoretical ground and verified empirically for more cases; however, it appears promising and deserves further scrutiny. Another interesting aspect of Figure 10.5 is the identification of the reflection or ‘hinge’ rank level. The theoretical significance of the hinge level is not completely known at this time; however, it appears to signify the lower limit of the ‘active’ cities in an urban system and thus holds meaning for policy formation and implementation, though more in-depth analyses must be undertaken to reach firm conclusions. In the case of Hungary, the rank of In 3.4 (30) is identified as the hinge level, which had a 1949 population of 25,100. Further investigation of this city size, particularly with an eye to policy formation, is warranted. CONCLUSIONS This study has used the expansion method to study the growth dynamics exhibited by the developing Hungarian urban system over the period 1870–1986. In particular, an initial rank-size model was expanded first as a quadratic function of rank and then as quadratic functions of time to assess changing levels of hierarchical concentration and hence agglomerative and deglomerative trends operating at specific rank levels. The results indicate that the Hungarian urban system experienced a general phase of agglomeration to about the 1930s, followed by deglomeration and a balancing of the system. Furthermore, the agglomerative/deglomerative phases proceeded down the urban hierarchy from the top and up from the bottom in cascading cycles fashion. Last, a city population of 25,000 was identified as an important size in Hungary and 1949 was identified as a watershed year in the development of the system. These findings are significant, since they show that deglomerative trends commenced immediately after the initial phase of rapid industrialization and that all agglomerative trends in Hungary ceased prior to the socialist period; gains made
DEVELOPMENT TRENDS IN THE HUNGARIAN URBAN SYSTEM 177
in balancing the system after 1949 thus appear to be ‘spontaneous’ rather than planned. The results also point to the possibility of a new framework of urban systems development: the cascading cycles model. Further investigation of this framework, in particular its theoretical underpinnings and empirical verification using other case studies, is both warranted and welcome. Finally, this paper has demonstrated the usefulness of the expansion method in social science research. The expansion method paradigm helped to focus attention on and call into question the assumption of parameter stability inherent in the standard model of urban systems development and to identify potential classes of parameter instability, or drift. The expansion method was then used to redefine an initial rank-size model to produce a terminal model capable of establishing the specific mode of drift present in the data set. Once this was accomplished, the results were analyzed to provide (a) much deeper insight into the operation of the Hungarian urban system than could have been achieved through more conventional forms of analyses, and (b) the basis for developing a new framework of urban systems development. REFERENCES Alonso, W. (1968) ‘Urban and regional imbalances in economic development’, Economic Development and Cultural Change 17:1– 14. Bourne, L.S. (1980) ‘Alternative perspectives on urban decline and population deconcentration’, Urban Geography 1:39–52. Casetti, E. (1968) ‘Optimal interregional investment transfers’, Journal of Regional Science 8:101–7. ——(1984) ‘Peripheral growth in mature economies’, Economic Geography 60:122–31. Danta, D.R. (1987a) ‘Hungarian urbanization and socialist ideology’, Urban Geography 8: 391–404. ——(1987b) ‘Identifying urban turnaround in Hungary’, Urban Geography 8:1–13. El-Shakhs, S. (1972) ‘Development, primacy, and systems of cities’, Journal of Developing Areas 7:11–36. Enyedi, G. (1976) Hungary: An Economic Geography, Boulder, CO: Westview. Gaile, G.L. (1980) ‘The spread-backwash concept’, Regional Studies 14:15–25. Kontuly, T. and Vogelsang, R. (1988) ‘Explanations for the intensification of counterurbanization in the Federal Republic of Germany’, Professional Geographer 40:42–54. Malecki, E.J. (1975) ‘Examining change in rank-size systems of cities’, Professional Geographer 24:43–7. Ogden, P.E. (1985) ‘Counterurbanization in France: the results of the 1982 population census’, Geography 70:24–35. Richardson, H. (1980) ‘Polarization reversal in developing countries’, Papers, Regional Science Association 45:67–85. Smith, W.R., Huh, W. and Demko, G.J. (1983) ‘Population concentration in an urban system: Korea, 1949–1980’, Urban Geography 4:63–79.
178 D.R.DANTA
Townroe, P.M. and Keen, D. (1984) ‘Polarization reversal in the state of Sao Paulo, Brazil’, Regional Studies 18:45–54. Vining, D.R., Jr (1986) ‘Population redistribution towards core areas of developed countries, 1950–1980’, International Regional Science Review 10:1–45.
11 AN EXPLORATION OF THE RELATIONSHIP BETWEEN SECTORAL LABOR SHARES AND ECONOMIC DEVELOPMENT Kavita Pandit One of the strengths of the expansion method lies in its ability to incorporate real-world complexities into simple models with strong theoretical foundations. The systematic procedure provided by the method enables these complexities to be introduced while maintaining the theoretical integrity of the relationship being investigated. The methodology has special significance in the study of relationships in the social sciences that have attained a quasi-law status due to their compelling theoretical rationale, but that are not necessarily invariant over time, space, or context. One set of such social science ‘laws’ is found in the field of development economics. Numerous authors (Clark 1957; Chenery 1960; Kuznets 1966a) have highlighted the systematic variation in a country’s economic and social structure that is associated with rising income. These structural changes or ‘patterns of development’ encompass accumulation processes, resource allocation processes, and demographic and distributional processes (Table 11.1). The theoretical justification for these extensively documented trends is based upon theories such as, among others, the dual economy theory of Lewis (1954), Engel’s law of household consumption, and the theories of balanced and unbalanced growth of Rosenstein-Rodan (1943) and Nurkse (1959). This paper demonstrates the usefulness of the expansion method in examining one particular structural change—the change in the sectoral allocation of the labor force during Table 11.1 Structural changes during economic development Accumulation ′ Resource allocation ′ Demographic/distributional ′
Investment Government revenue Education Domestic demand Production Trade Labor allocation Urbanization Demographic transition Income distribution Source: Chenery and Syrquin 1975
economic development. Theoretical arguments advanced by Fisher (1939) and Clark (1940, 1957) posit that, as a country’s per capita gross national product
180 K.PANDIT
(GNP) grows, the labor force allocation in agriculture declines, the allocation in manufacturing first increases and then declines, and that in services increases steadily. While this hypothesis has been empirically validated by numerous studies, evidence has also suggested that the sectoral shifts relation may not be constant over time, space, or certain contexts. The expansion method provides a systematic procedure to investigate the manner in which the relationship holds differently in various domains. The analysis presented here is structured around a critical review of the traditional approach to the examination of development patterns, as exemplified by Chenery and Syrquin’s study, Patterns of Development 1950–1970 (1975). This approach, with its emphasis on the selection of the right independent variables for inclusion in the model, and its implicit assumption of contextual stability in the model, provides a sharp contrast to the investigative approach represented by the expansion method. Although the comparison is restricted to only one structural change, i.e. labor allocation patterns, it has implications for the manner in which development patterns in general should be researched. The rest of the paper is organized as follows. The first section presents a theoretical background related to the sectoral shifts of labor during development. Next is a discussion of the approach adopted by Chenery and Syrquin. The third section presents an alternative model based upon the expansion method. This is followed by a discussion of the results and conclusions. DETERMINANTS OF SECTORAL SHIFTS OF LABOR The theoretical basis of the relationship between the sectoral allocation of labor and economic development was first established by Fisher and Clark, who attributed it to two factors: the changing structure of consumer demand and the differential growth of sectoral labor productivities. In the first case, they argued that sectoral employment is determined by sectoral output, which, in turn, is dependent upon the demand for sectoral goods. This demand is influenced by household consumption patterns, which according to Engel’s law, are determined by income. Thus, it is maintained that the observed shifts from the primary to the secondary and the tertiary sector with rising income parallel shifts in domestic demand: while the relative demand for agricultural products falls with time, that for manufactured goods first rises and then falls in favor of services. The link between changes in household demand and the sectoral allocation of the labor force has been investigated by numerous authors (Maizels 1963; Fuchs 1968; Ramos 1970; Thirwall 1978). Characteristics of production also influence labor allocation patterns through the critical linkage between sectoral output and sectoral employment. Sectors which have high labor productivities will employ fewer additional laborers per unit growth in output than sectors with lower labor productivities. Further, the sector with the fastest growth rate of labor productivity will, all else being equal, account for a declining relative share of the labor force. Clark observed that as
RELATIONSHIP BETWEEN LABOR SHARES AND DEVELOPMENT 181
per capita national income rises, ‘real product per man-hour in manufacture… nearly always advances at a greater rate than the real product per man-hour in other sectors of the economy’ (Clark 1957:493). Kuznets (1966b), in an analysis of the industrial structure of thirteen countries between 1801 and 1958, found systematic differences in sectoral productivity growth. He noted that while labor productivities in agriculture and manufacturing grew at rates equal to and greater than the national rate, respectively, service sector productivity grew at a slower rate than that of the economy as a whole. Although the theoretical determinants of sectoral shifts are based upon relatively uniform features of development, i.e. household demand structure and labor productivity growth, actual studies of the changes in sectoral employment with development have indicated that these patterns have been far from uniform from context to context. Evidence from the presently developing countries, for example, indicates that the patterns emerging there differ from those observed historically in the now industrialized countries. Studies by Bairoch and Limbor (1968) and Squire (1979), among others, show that the agricultural sector of the less developed countries (LDCs) accounts for a larger percentage of the labor force than that observed at similar levels of development in the past for the developed countries. Within the nonagricultural sectors of the LDCs, employment in industry has consistently lagged that in services. The sluggish rate of labor absorption in manufacturing has been documented in the contexts of Latin America (Jaffee 1959; Jones 1968; Ramos 1970; Kirsch 1973) and Asia (Mehta 1961; McGee 1967; Oshima 1971). Several factors cause a variation in the pattern of sectoral shifts with per capita GNP. International trade, for example, can distort the hypothesized relationship between domestic demand for sectoral products and production (and, therefore, employment). Imports satisfy domestic demand without stimulating production, while exports can generate employment in response to demands that are not local. Population size can also affect a number of development processes both directly and indirectly through its effect on economies of scale and transport costs. Trade, for example, with its related effects on sectoral employment, tends to be of greater importance in the economies of small countries, which are often geared towards primary specialization or export processing, than in the economies of large countries that have a substantial domestic demand. THE CHENERY-SYRQUIN APPROACH In their book, Chenery and Syrquin (hereafter referred to as CS) examine ten basic structural changes, encompassing accumulation processes, resource allocation processes, and demographic and distributional processes (see Table 11.1). The major goal of their study, which is an extension of earlier works by Chenery (1960) and Chenery and Taylor (1968), is to separate the effects of factors affecting all countries from particular characteristics of individual
182 K.PANDIT
countries. The analytical tool is regression analysis, with income level, population size, and trade pattern as the independent variables. These variables were used in the examination of all structural changes in order to facilitate a uniform analysis. The methodology used to examine the sectoral shifts of labor, therefore, parallels that used for the investigation of other development patterns. The shift of employment from agriculture to manufacturing and services in a particular country is divided into three components: 1 the effect of universal factors systematically related to income level (e.g. household demand); 2 the effect of other general factors that governments have little or no control over (e.g. market size, natural resource endowment); and 3 the effect of a country’s individual history, political and social objectives, and government policy. CS seek to identify the effects of factors 1 and 2 since they affect all countries and account for the bulk of the observed variation. They therefore use the following three explanatory variables in their analysis: 1 GNP per capita (Y)—used as an overall index of development, as well as a measure of output/demand. 2 population (N)—introduced as a proxy for market size and scale economies. 3 net resource flow (F)—defined as imports minus exports as a percentage of GNP. The authors note that the treatment of F as an independent variable poses difficulties because, as a country develops, the economic processes tend to equilibrate, resulting in an adjustment of capital inflow. For this reason, they carry out regressions with and without the F variable. The following model equations are used in their analysis: (11.1) (11.2) where X is the percentage of labor in agriculture (A), manufacturing (M), or services (S), and ln Y and ln N denote the natural logarithms of Y and N. The data used by CS were compiled by the World Bank’s Social and Economic Data Division in June 1972. Although the basic statistics cover 101 countries over a maximum time span ranging from 1950 to 1970, the authors omit time trend analyses in their investigation of sectoral shifts. Instead, they pool together the available data for a total of 165 observations. The sample pertains mainly to the 1960s. CS’s approach, whereby a uniform method of analysis is applied to the entire range of structural changes, has considerable merit. Most importantly, by adopting a generalized formulation, it facilitates the comparison of various development patterns, and enables meaningful conclusions to be drawn. For
RELATIONSHIP BETWEEN LABOR SHARES AND DEVELOPMENT 183
example, the comparison of labor force shifts with production shifts allows inferences to be drawn about sectoral labor productivities. At the same time, the methodology adopted by CS in studying the sectoral shifts patterns has some underlying shortcomings. Their approach first establishes a relationship between sectoral labor shares and per capita GNP and then attempts to ‘improve’ its explanation by adding on other exogenous variables. The basic assumption is that better explanations of a given relationship are achieved by including additional variables specific to it. The problem with this procedure is twofold. First, with each addition in independent variables, the theoretical integrity of the basic relation is reduced and the focus moves away from the relationship of interest—that between sectoral labor shares and per capita GNP. The second weakness of the CS approach stems from its implicit assumption of model stability, since the end product of the strategy is a single regression model,1 with a string of independent variables, that is believed to best describe the pattern of structural changes in all countries, time periods, and contexts. Thus, by estimating the models represented by equations (11.1) and (11.2) with their invariant parameters a through f, there is no explicit recognition that these parameters may drift. The evidence, however, suggests that there is significant contextual drift in sectoral shifts patterns: i.e. in the relations between sectoral sizes and levels of development, and between rates of change of sectoral labor and rates of change of per capita GNP. In light of these criticisms, it is suggested here that the focus should be not on the estimation of a model of sectoral shifts, but upon the manner in which this model varies over time, space, and other domains. An alternative approach to the examination of the sectoral shifts relation and its contextual drift is suggested by the expansion method (Casetti 1972). The technique and its application to the study of the structural changes in labor is described in the following section. THE EXPANSION METHOD APPROACH The expansion method approach entails the definition of an ‘initial model’, based upon strong theoretical underpinnings and ‘expansion equations’ which detail the manner in which the parameters of the initial model vary over different domains. A ‘terminal model’ is arrived at through the substitution of the expansion equations into the initial model. The initial model In the study of sectoral shifts, the initial model consists of the relationship between the share of labor in agriculture, manufacturing, and services (A, M, S, respectively) and GNP per capita (Y) as the general index of economic development. The initial model implicit in the CS model described by equations (11.1) and (11.2) is
184 K.PANDIT
(11.3) The natural logarithm of Y is used to reduce the range of its values. A quadratic term (ln Y)2 is introduced as an independent variable to account for the possibility of non-linearities in the development patterns. By expressing the dependent variables as shares, and maintaining a common formulation for all three sectors, CS are able to achieve a desirable adding-up property of the aggregate. This, however, is true only as long as all estimates refer to exactly the same sample. Despite its merits, the formulation raises two issues. First, by definition, the variable X should range between 0 and 100 percent, and cannot take on negative values or values greater than 100. Yet, the formulations used by CS do not eliminate this possibility, i.e. it is possible that certain values of Y could cause the predicted values of the dependent variables to be outside the 0–100 percent range. The second problem with the formulation is the theoretical inconsistency of the quadratic term in the case of agriculture and services. In the first case, the inclusion of the term implies that the decline in the agricultural labor allocation is going to ‘bottom out’, an assumption that does not have theoretical foundations. It could be argued that the ‘bottoming-out’ effect is observed at sufficiently high (or even unattainable) levels of GNP per capita that the model is relevant within a practical range of income values. However, the present evidence indicates that countries in advanced stages of development have already attained values of A as low as 0.02 (2 percent). Given this reality, a model such as that proposed by CS has a real danger of overestimating the agricultural labor allocations at high levels of economic development. Similarly, in the case of services, the inclusion of the (ln Y)2 term suggests a pattern such as that observed in manufacturing, i.e. a ‘peaking’ of the sectoral labor. This assumption, like the earlier one, is not theoretically or empirically supported. It is suggested here that the initial model be defined by the following formulations: (11.4)
Table 11.2 Regression results for the Chenery and Syrquin model Independent variables
Regression coefficients
Agriculture
Manufacturing
Services
Intercept 217.790** –35.408 ln Y –32.575** 8.797 (ln Y)2 1.000 –0.039 R2 0.825 0.743 F 144.065 88.027 Note: *, significant at the 0.05 level; **, significant at the 0.01 level.
–82.382* 23.778* –0.962 0.705 72.936
RELATIONSHIP BETWEEN LABOR SHARES AND DEVELOPMENT 185
(11.5) Since S=(100–A–M), a separate regression for S is not necessary. The use of an exponential formulation eliminates the occurrence of nonpositive values. Further, the quadratic term is dropped in equation (11.4), implying that at high levels of GNP per capita A will asymptotically approach the X-axis rather than crossing it. The equations are easily linearized as follows: (11.6) (11.7) Equations (11.6) and (11.7) constitute the initial model. The value of Y at which the labor share in the manufacturing sector is at its peak (Yβ), can easily be calculated by computing the derivative dM/dY and setting it equal to zero. We obtain (11.8) The peak value M′ of M is given by (11.9) In order to compare the results obtained by the model given by equations (11.6) and (11.7) with the model used by CS as defined by equation (11.3), regressions were carried out using a data set similar to that utilized by CS. Since the original data were not given in their entirety in their publication, in this study I used cross-sectional Table 11.3 Regression results for the exponential model Independent variables
Regression coefficients
Agriculture
Manufacturing
Intercept 3.985** 2.410** Y –0.000276** 0.000473** 2 Y n.a. –3.73×10–8** 2 R 0.721 0.566 F 160.016 39.736 Note: *, significant at the 0.05 level; **, significant at the 0.01 level.
figures for 1977 from sixty-four countries published by the World Bank (1980). Table 11.2 presents the regression results for the CS model (equation (11.3)), while Table 11.3 presents the results for the alternative exponential model defined by equations (11.6) and (11.7). As expected, the coefficients for In Y and Y are negative for the agricultural sector and positive for the manufacturing and, in the case of the CS model, the services sector. In Table 11.2, the parameters for (In Y)2 were positive for the agricultural sector but negative in the case of the nonagricultural sectors, indicating a ‘leveling off of the decline/increase in sectoral labor shares. Similarly, the coefficient for Y2 in Table 11.3 is negative, indicating the eventual decline in the manufacturing labor share with
186 K.PANDIT
Figure 11.1 Agricultural and manufacturing labor allocation during development
development. Figure 11.1 presents a graphical representation of the estimated initial model given by equations (11.6) and (11.7). The improvement in results obtained by using the exponential formulation rather than the CS model is evident through examination of the significance levels of parameters. Table 11.2 indicates that none of the (ln Y)2 terms was significant at the 0.05 level of significance, while the ln Y coefficients were significant in only two of the three sectors. The coefficients of the exponential model, in contrast, are all significant at the 0.01 level. The reduction in the R2 values in Table 11.3 possibly reflects the reduced variance in the dependent variables when the natural logarithms of sectoral shares are used. The expansion equations The parameters of the initial model are then expanded as linear functions of selected variables. The variables selected for the expansions were (a) population size, (b) an index of resource flow, (c) time, and (d) regional dummy variables. The population size and resource flow variables, used by CS in their analyses, have been considered influential in determining the sectoral labor allocation patterns. Consequently, the first two expansions are given by, respectively, (11.10) (11.11)
RELATIONSHIP BETWEEN LABOR SHARES AND DEVELOPMENT 187
where here and hereafter x denotes the parameters a, b, p, q, and r in equations (11.6) and (11.7), N indicates population size, and F is resource flow, defined as imports minus exports as a percentage of GNP. The third and fourth expansions explore the occurrence of temporal and spatial drift in the relation. CS make an unwarranted assumption of temporal stability by pooling together time series and cross-sectional data in their analysis. To investigate the drift of the relation with time, the following expansion is undertaken: (11.12) To examine spatial drift in the sectoral shifts relation, three dummy variables D1, D2, and D3 were created. Each variable takes on a value of either 0 or 1 depending upon the region in which a country is located. The values are defined as follows: Region
D1
D2
D3
Latin America Africa Asia (excluding Japan) Others (more developed countries (MDCs))
1 0 0 0
0 1 0 0
0 0 1 0
The expansion equations in this case are represented as follows: (11.13) It is interesting to note that, in the light of the expansion method, the CS model given by equations (11.1) and (11.2) can be interpreted as an intercept expansion of the initial model given by equation (11.3). However, because they involve only an intercept expansion, they can only determine whether there is an ‘upward’ or a ‘downward’ shift in the sectoral shift curve (11.3). They cannot show if the curve actually ‘changes shape’ across differing contexts. Thus, the model is unable to determine whether contextual variables F and N affect the timing of the ‘peak’ of manufacturing labor. The terminal models By substituting the right-hand sides of the expansion equations into the initial model, four terminal models were obtained. They are specified as follows. 1 Population size expansion—substitution of expansion equation (11.10) into the initial model: (11.14) (11.15)
188 K.PANDIT
2 Resource flow expansion—substitution of expansion equation (11.11) into the initial model: (11.16) (11.17) 3 Time expansion—substitution of expansion equation (11.12) into the initial model: (11.18) (11.19) 4 Space expansion—substitution of expansion equation (11.13) into the initial model: (11.20)
(11.21)
To estimate equations (11.14) – (11.21), I compiled first a sixty-four-country cross-sectional data set for 1977, which included figures on imports, exports, and population in addition to the basic statistics on sectoral labor shares and GNP per capita, and, second, a 105–country time series data set, with statistics for sectoral labor shares and GNP per capita for the years 1960, 1965, 1970, 1977, and 1981. The first data set was used to estimate equations (11.14) – (11.17), i.e. drift with respect to population and resource flow. The second data set was employed for the temporal expansions represented by equations (11.18) – (11.19). The 1981 observations of the second data set were used to estimate equations (11.20) and (11.21). Table 11.4 Regression results—expansion by population Regression coefficients Independent variables
Agriculture
Manufacturing
Full
Stepwise
Full
Stepwise
Intercept ln N Y (ln N)Y Y2 (ln N)Y2 R2
1.643 0.143* 0.000209 –0.000029 n.a. n.a. 0.748
2.116** 0.115** _ –0.000017** n.a. n.a. 0.745
1.412 0.060 0.00101 –0.000032 –6.567X10–8 1.661×10–9 0.577
2.409** _ 0.000474** _ – –2.257×10–9 0.567
RELATIONSHIP BETWEEN LABOR SHARES AND DEVELOPMENT 189
Regression coefficients Independent variables
Agriculture
Full
Stepwise
Manufacturing Full
Stepwise
F 59.333 89.306 15.828 Note: *, significant at the 0.05 level; **, significant at the 0.01 level.
39.949
Table 11.5 Regression results—expansion by resource flow Regression coefficients Independent variables Agriculture Full
Stepwise
Manufacturing Full
Stepwise
Intercept 4.079** 4.070** 2.345** F –1.971* –1.575* 2.066 Y –0.000285** –0.000280** 0.000483** FY 0.000240 – –0.000816 Y2 n.a. n.a. –3.713×10–8 FY2 n.a. n.a. 5.629×10–9 2 R 0.748 0.747 0.599 F 59.454 90.034 17.335 Note: *, significant at the 0.05 level; **, significant at the 0.01 level.
2.410** – 0.000473** – –3.713×10–8 ** – 0.566 39.736
The estimation employed ordinary least squares regression. Two types of regressions were carried out for each equation: (a) a full regression with all independent variables, and (b) a stepwise regression entering only those variables significant at the 0.05 level or better. The results of the analyses are discussed in the following section.
RESULTS Tables 11.4–11.7 present the regression statistics for the expanded equations (11. 14)–(11.21). A cursory examination of the tables reveals that the expansion variables were significant, by themselves or as cross-products, in nearly all the regressions. There is therefore evidence of drift in the relationship between labor force allocation by sector and economic development over the four contexts investigated. A series of graphs in Figures 11.2–11.5 portray the results of the stepwise analyses. Figure 11.2 illustrates the manner in which the relationship between sectoral labor allocation and development varies with population size, by contrasting the sectoral shifts curves for a country with 200 million population with those for a country with 1 million population. The figure shows that size has
190 K.PANDIT
an important effect on the pace of labor force shifts out of agriculture and into manufacturing. It also affects the size and timing of the peak labor force absorption into the manufacturing sector. This finding challenges the implicit assumption of parameter stability underlying the CS models. The size and direction of a country’s resource flow caused a drift in the agricultural labor allocation patterns as indicated in Figure 11.3. Countries with a net resource outflow showed a higher share of agricultural labor at a given income level than Table 11.6 Regression results—temporal expansion Regression coefficients Independent variables
Agriculture
Full
Stepwise
Manufacturing Full
Stepwise
Intercept 4.273** 4.242** 1.834** t –0.000830* –0.00567* 0.0229** Y –0.000234** –0.000218** 0.000590** tY 0.000001 _ –0.000013** Y2 n.a. n.a. –3.886×10–8** tY2 n.a. n.a. 1.080×10–9** 2 R 0.698 0.697 0.519 F 408.180 90.034 114.149 Note: *, significant at the 0.05 level; **, significant at the 0.01 level.
1.834** 0.0229** 0.000590** –0.000013** –3.886×10–8** 1.080×10–9** 0.519 114.149
Table 11.7 Regression results—spatial expansion Regression coefficients Independent variables
Agriculture
Manufacturing
Full
Stepwise
Full
Stepwise
Intercept D1 D2 D3 Y D1Y D2Y D3Y Y2 D1Y2 D2Y2 D3Y2
3.473** 0.612** 0.852** 1.083** –0.0001** 0.0002** –0.00007 –0.0005** n.a. n.a. n.a. n.a.
3.561** 0.524** 0.716** 0.995** –0.0002** –0.0002** – –0.0005** n.a. n.a. n.a. n.a.
3.582** 0.0229 –0.819** –1.693** –0.000007 0.0002 0.0008** 0.0007* 8.85×10–10 –1.450×10–8 –7.806 ×10–8** –6.522×10–8
2.760** 0.229* – –0.539* 0.0002** – – 0.0002* –1.086×10–8** – – –
RELATIONSHIP BETWEEN LABOR SHARES AND DEVELOPMENT 191
Regression coefficients Independent variables
Agriculture
Full
Stepwise
R2
Manufacturing Full
Stepwise
0.877 0.874 0.641 F 101.510 116.800 15.546 Note: *, significant at the 0.05 level; **, significant at the 0.01 level.
0.576 27.670
countries with net resource inflow. This finding suggests that countries specializing in exports (i.e. those with net resource outflows) may be promoting more capital-intensive production strategies than countries catering to the domestic market. Consequently, in these nations, the share of agriculture may be fairly high even at relatively advanced income levels. The effect of the resource flow variable on the manufacturing relation was not statistically significant, which suggests parameter stability in this context. The implications of this for service sector labor shares during development thus run exactly opposite to the pattern observed for agriculture. Countries with net resource outflows will have lower service labor shares at all levels of per capita GNP than countries with net resource inflows. This may be due to the lower level of ‘forward linkages’ available to export-promoting countries compared with import-substituting ones, causing a lower impact on the nonagricultural employment. The temporal drift in labor force allocation patterns is presented in Figure 11.4. The figure indicates that the decline in the agricultural labor share for a unit increase in per capita GNP has been increasing with time, i.e. in more recent years there has been a more rapid shift from agricultural to nonagricultural sectors. This finding may reflect the adoption of capital-intensive agricultural techniques in developing countries. This influence is also observed in the manufacturing sector, in that the labor absorption in the industrial sector has been declining over time. Thus, countries beginning their industrialization programs now will have an unlikely chance of achieving the high levels of blue collar workers in the labor force observed in the past development of the now industrialized countries. Once again, this reflects the ‘latecomer’ status of the developing countries, whereby they are able immediately to adopt the capitalintensive and labor-saving manufacturing techniques developed in the West over a longer period of time. The temporally shifting patterns in the agricultural and manufacturing sectors have definite implications for the service sector. If the recent pattern is that of more rapid declines in the agricultural sector and slower absorption in the manufacturing sector, the tertiary labor share must have a proportionately higher rate of growth than that observed in the past. This finding supports the extensive literature on the ‘hypertrophy in services’ in the Third World. The regional drift in the sectoral shifts patterns is illustrated in Figures 11.5(a)–11.5(d), which present the labor share-GNP per capita relation for Latin
192 K.PANDIT
Figure 11.2 Effect of population size on sectoral labor allocation relations
America, Africa, Asia, and the MDCs respectively. Note that the range of income per capita values for each graph corresponds, generally, to the income range of the countries in that region. Of the developing regions, Latin America had the lowest overall shares of labor in agriculture, and Africa the highest. The most rapid decline in the primary sector labor share, however, was observed in Asia. The MDCs had the lowest labor shares in the agricultural sector, as well as the slowest decline in these shares. In the case of the manufacturing sector, only the MDCs have achieved incomes high enough to observe the peaking of the manufacturing labor share. The lowest levels of industrial labor are found in Africa, a region that is heavily oriented towards primary exports. Latin America showed the highest manufacturing shares of the LDCs. However, the most rapid (or least sluggish) growth of manufacturing labor shares is found in Asia, indicating the rapid transfer of labor from agricultural to manufacturing sectors. The results show that assumption of spatial stability in the sectoral shifts relation cannot be assumed. CONCLUSIONS The changes in a country’s economic and social structure that accompany economic development constitute a set of simple, theoretically justified ‘laws’ that often are not exactly replicated over time, region, or context. The expansion method provides an effective way to test for the temporal, spatial, and contextual
RELATIONSHIP BETWEEN LABOR SHARES AND DEVELOPMENT 193
Figure 11.3 Effect of resource flows on labor allocation in agriculture relations
drift in these structural relationships while maintaining the theoretical rationale of the underlying phenomena. This paper examines one particular structural change— the relationship between the share of labor in agriculture, manufacturing, and services and GNP per capita. It contends that the conventional approach employed by Chenery and Syrquin, which focuses upon theoretical relationships rather than upon their contextual drift, is suboptimal. The expansion method analyses demonstrate that the sectoral shifts relation varies over the contexts investigated, and consequently that Chenery and Syrquin’s implicit stability assumptions are unwarranted. The environment in which today’s developing countries are embarking upon their industrialization differs significantly from the conditions that existed at the time of the industrial revolution. Furthermore, there are sharp regional contrasts based upon historical and cultural conditions. Consequently, it is inadvisable to build development models that are expected to be universally applicable, no matter how persuasive their theoretical rationale may be. The expansion method provides a compelling alternative whereby the contextual stability of theoretically grounded models rather than the formulation of universal models becomes the prime focus of research. NOTES 1 In the study of some structural changes other than labor force shifts, CS have attempted to estimate more than one model of the structural change by subdividing
194 K.PANDIT
Figure 11.4 Temporal variation in sectoral labor allocation relations the sampled countries into categories based upon population size and trade orientation. However, such stratification reduces the degrees of freedom of the analysis considerably, and is discontinuous. In addition, this methodology is not conducive to the systematic evaluation of model stability in varying contexts.
REFERENCES Bairoch, P. and Limbor, J.M. (1968) ‘Changes in the industrial distribution of the world labor force, by region, 1880–1960’, International Labour Review 98:311–36. Casetti, E. (1972) ‘Generating models by the expansion method: applications to geographical research’, Geographical Analysis 4: 81–91. Chenery, H.B. (1960) ‘Patterns of industrial growth’, American Economic Review 50: 624–54. Chenery, H.B. and Syrquin, M. (1975) Patterns of Development 1950–1970, London: Oxford University Press. Chenery, H.B. and Taylor, L. (1968) ‘Development patterns: among countries and over time’, Review of Economics and Statistics 50: 391–416.
RELATIONSHIP BETWEEN LABOR SHARES AND DEVELOPMENT 195
Figure 11.5 Sectoral labor allocation relations for (a) Latin America, (b) Africa, (c) Asia, and (d) more developed countries Clark, C. (1940) The Conditions of Economic Progress, London: Macmillan. ——(1957) The Conditions of Economic Progress, 3rd edn, London: Macmillan. Fisher, A.G.B. (1939) ‘Production: primary, secondary and tertiary’, Economic Record 15: 24–38. Fuchs, V. (1968) The Service Economy, New York: Columbia University Press. Jaffee, A. (1959) People, Jobs, and Economic Development, Glencoe, IL: Free Press. Jones, G.W. (1968) ‘Underutilization of manpower and demographic trends in Latin America’, International Labour Review 98:451–69. Kirsch, H. (1973) ‘Employment and the utilization of human resources in Latin America’, Economic Bulletin for Latin America 18:46–94. Kuznets, S. (1966a) Economic Growth and Structure, London: Heinemann. ——(1966b) Modern Economic Growth: Rate, Structure and Spread, New Haven, CN: Yale University Press. Lewis, W.A. (1954) ‘Economic development with unlimited supplies of labor’, The Manchester School 22:139–91. McGee, T.G. (1967) The Southeast Asian City, London: Bill.
196 K.PANDIT
Maizels, A. (1963) Industrial Growth and World Trade, Cambridge: Cambridge University Press. Mehta, S.K. (1961) ‘A comparative analysis of the industrial structure of the urban labor force of Burma and the United States’, Economic Development and Cultural Change 9:164–79. Nurkse, R. (1959) Patterns of Trade and Development, The Wiksell Lectures, Stockholm: Almquist & Wiksell. Oshima, H.T. (1971) ‘Labor-force “explosion” and the labor-intensive sector in Asian growth’, Economic Development and Cultural Change 19:161–83. Ramos, J.R. (1970) Labor and Development in Latin America, New York: Columbia University Press. Rosenstein-Rodan, P.N. (1943) ‘Problems of industrialization of eastern and southeastern Europe’, Economic Journal 53:202–11. Squire, L. (1979) ‘Labor force, employment and labor markets in the course of economic development’, World Bank Staff Working Paper 336, Washington, DC: World Bank. Thirlwall, A.P. (1978) Growth and Development with Special Reference to Developing Economies, 2nd edn, London: Macmillan. World Bank (1980) World Tables, Washington, DC: World Bank.
12 PRODUCTION FUNCTION ESTIMATION AND THE SPATIAL STRUCTURE OF AGRICULTURE Sent Visser
A production function is a graphical or mathematical statement of the relationship between the amount of output and the quantity of inputs. These functions may describe the output response to variations in input for industries, firms, or areas. Inputs may be measured in terms of quantity of labor, area of land, value of capital equipment, or the quantity or value of any other input for which the response to output is needed. Output may be measured as the physical quantity of a single output, or as the total value of a combination of outputs. When inputs such as labor and capital equipment are combined, they require a common unit of measurement such as cost. Theoretically, the most desirable production function describes the physical output response to variations in the physical quantities of inputs under a given technology. However, since the parameters of such functions are very difficult to estimate, production functions are generally estimated in terms of the value of the output response to changes in the value of the inputs. The parameter estimates are therefore influenced by the differential pricing of the outputs and inputs. In this paper aggregate production functions are estimated for agricultural areas, and the parameters of the functions are expanded with respect to the degree of agricultural type occurring in an area. That is, counties include a variety of agricultural types, and so the production function of an agricultural area constitutes an aggregate of several production functions, one for each type of agriculture. To specify the parameters of the function for a particular type of agricultural area, the parameters of the aggregate production function can be expanded with respect to the percentages of the county area devoted to each type of agriculture. The validity of this approach is provided by agricultural location theory, and the methodology overcomes many of the problems confronting the estimation of other kinds of production functions. Economists generally have estimated the parameters of production functions describing output responses to input variations for industries or firms. The estimations reported here are for areas; they describe yield (per unit area) responses to variations in intensity (inputs per unit area) as opposed to output per farm. Such functions are relevant to applying agricultural location theory
198 THE SPATIAL STRUCTURE OF AGRICULTURE
Figure 12.1 Observations of the relationship between inputs and outputs relative to the actual production function defined for optimal combinations of inputs
empirically, while agricultural location theory in turn provides the proof that parameter estimation of such functions is possible. PROBLEMS IN THE ESTIMATION OF PRODUCTION FUNCTIONS Agricultural economists generally estimate farm level production functions. Their concern is often with the elasticity of substitution between capital and labor. These functions are normally estimated using time series data. The response of yield to variations in individual inputs has been addressed experimentally as well, but the estimation of yield in response to changes in input combinations is difficult in experimental situations because of the variety of input mixes possible. I have previously tried estimating yield functions and have reviewed some of the major problems in the estimation of production functions of all types (Visser 1981; see Brown 1957; Griliches 1962, 1963, 1964; Mundlak and Hoch 1965; Zellner et al. 1966; Brown and Beattie 1975). The major difficulty is an identification problem similar to that in the estimation of market supply and demand schedules. In equilibrium, if all farmers are faced with the same factor prices and output prices, and if they all maximize profits, they should all be located at the same point on the production function. If faced with uncertainty about prices, farmers may use varying levels of intensity according to their individual expectations, but this will only trace out the production function if farmers use the optimal combination of resources to achieve their output. Since this is improbable, a purely random variation of yield responses around or below the optimal combination of resources should result, as
S.VISSER 199
Figure 12.2 Observations of the relationship between inputs and outputs for farmers responding to variation in real factor and output prices
illustrated in Figure 12.1. Curve A is the production function when inputs are employed under optimal conditions. Cluster B in the figure represents farm observations where, on average, farmers are not using optimal combinations of inputs to maximize income. Even if part of the production function were traced out, therefore, the estimate of its parameters would be biased. Cluster C represents possible observations when adverse weather conditions occur. Yields are a lagged response to input levels, and farmers cannot adjust input levels to unanticipated climatic events. Using any single year to estimate the production functions may therefore result in incorrect parameter estimates. If output and input prices vary in some way, then the identification problem is resolved, and the farmers’ optimal responses to these price variations should trace out the function shown in Figure 12.2. Most estimates of the func tions have therefore relied on time series data because the temporal variation in prices should enable the delineation of the production function. This approach is conceptually very weak. A production function describes the output response to a change in the quantity of the inputs (and in the mix, but that is subsumed) under a fixed technology and a homogeneous environment. The usefulness of such a function lies in its prediction of agricultural responses to changes in output prices and factor prices. Time series data, however, incorporate technological change; thus the data may describe yield responses which are the result of a switch to new production functions1 (Brown 1957). Calculating aggregate production functions from large-scale cross-sectional data produces a similar problem (Griliches 1963). These regressions take advantage of regional differences in factor and output prices, but they represent a composite of many production functions for different agricultural types under
200 THE SPATIAL STRUCTURE OF AGRICULTURE
extremely variable environmental conditions. For example, low intensity agriculture under semi-arid conditions has exceptionally low yields, while high intensity agriculture under excellent environmental conditions has very high yields. When both cases are included in the data, the response of output to variations in inputs is not really being assessed, neither for agriculture in the aggregate nor for agricultural types under particular environmental circumstances. The estimate of the parameters describing the relationship between intensity and yield is instead being biased by the spurious correlation with environment. Thus a lower yield in a semi-arid environment is only partly the result of lower intensity. The lower yield is also independently caused by the environment. The solution to this problem is to estimate the relationship between intensity and yield while controlling for environmental variation. Real agricultural output prices should vary spatially (as should factor prices) because of transport costs, and so optimal intensity and yields should vary spatially. Therefore, the individual production functions for particular agricultural types under homogeneous environments should be traced out. The aggregate agricultural production function need merely be specified fully with respect to agricultural type and environmental conditions. There are other provisos in estimating these functions. Mundlak and Hoch (1965) pointed out that, if stochastic terms trace out a production function, then the errors are being transmitted to the inputs, thus making least squares estimation inappropriate. The variation in input quantities associated with distance to market, however, is optimal behavior in response to price variations, which alleviates the problem. Multicollinearity of inputs causes difficulties in estimating the output elasticities of labor and capital. This is less of a problem in the estimation of aggregate production functions because capital-to-labor ratios can be expected to vary substantially between different agricultural types. In the case of a single agricultural type, capital-to-labor ratios may vary with distance to market if their factor prices vary differently with distance to market. There are plausible arguments for why this should be so, but it appears unlikely to cause significant variations in the capital-to-labor mix. As a result it is advisable to combine capital and labor into a single composite input. There is one final problem associated with estimation of production functions using Census of Agriculture data as is done in this paper. Census data on inputs and output is collected for the calendar year. In many instances, therefore, inputs recorded for a year are not generating the output recorded for that year. This may be especially significant in winter wheat production regions. It must be assumed that input volumes do not vary significantly from year to year. Adverse weather in some regions during the data collection year may also bias the parameter estimates relative to the long-term norm.
S.VISSER 201
THE THEORY Agricultural location theory demonstrates that the intensity of a single agricultural type will vary spatially (Garrison and Marble 1957; Beckmann 1972; Webber 1973; Katzman 1974; Bannister 1977; Visser 1980a, 1982; Randall and Castle 1985). Under conditions of spatial equilibrium in a perfectly competitive market, the intensity of input per unit area should decline as distance to market increases, in response to lower output prices received by farmers. The theory also indicates that different agricultural types prevail, at equilibrium, at different distances from the market, and so different production functions will be found over space since different agricultural types may have different yieldintensity relationships. The classical formulation of agricultural location theory by Dunn (1954) and von Thunen (Hall 1966) allocates given land-use types to locations relative to a central market. The allocation is determined by the ability of land uses to bid rent for the land at a given location. Although von Thunen also showed that with lower intensity methods a single output type can locate farther from the market, neither von Thunen nor Dunn addressed the question of how intensity affects the ability of a land use to bid rent at a given location. For any economic activity the location question is: how much is to be produced and where? The scale of production at any location must be determined before the most profitable location can be determined. Thus for agriculture the variation of intensity over space must first be determined before land-use types can be allocated to locations. To develop the rent model, first assume an isotropic plain with a single central market. Assume also that prices of agricultural outputs are given, and that a spatial equilibrium of production and consumption exists. Casetti (1972a) showed that under such conditions maximization of profit by farmers will result in a maximization of land rent which is the residual flow of earnings accruing to land after all costs other than land have been deducted from revenues. Given an isotropic plain, land rent will vary with distance from the market solely because of transport costs. If transport costs are included as part of the production function for agriculture (Isard 1956), then a unique production function is defined for each location. This would severely inhibit the application and potential extensions of the agricultural location model. Furthermore, inclusion of transport costs in the production function as a separate input makes the function much more mathematically intractable. Common production functions also generally assume constant elasticity of substitution between inputs. There is no reason to presume that this would be the case for transport vis-à-vis other production factors. Transport costs are therefore deducted from the market price received for agricultural produce by farmers and are not included in the production function. The rent equation is therefore (12.1)
202 THE SPATIAL STRUCTURE OF AGRICULTURE
where c is cost per unit input, d is distance to market, p is market price per unit output, R is rent per unit of land, t(d) are transport costs per unit output, x is units of input per unit of land,2 and y(x) are output units per unit land (yield), which is described by a production function3 which applies to all agriculture in general or to one particular agricultural type. Under perfect competition all farmers will earn normal profits (i.e. returns on nonland investment are included with cost per unit input). To maximize rent with respect to intensity we obtain the partial derivative of R with respect to X and set it equal to zero: (12.2) That is, the marginal value of output equals the marginal cost of inputs. For a maximum the second derivative must be negative: (12.3) This means that there must be diminishing marginal returns to intensity. From equation (12.2), it is clear that, as transport costs t(d) increase, p—t(d) must decrease, and thus ′ y/′ x must increase. That is, the marginal product of nonland inputs intensity must increase. Because of diminishing marginal returns to intensity, this can only occur if intensity is decreased as distance increases. If intensity decreases, yield must also.4 To allocate agricultural types to their respective optimal locations, we must determine the optimal intensities and yields for each agricultural type at all distances. An agricultural type is defined as a type according to whether or not it has a unique production function. These will not only differ by output type, but also with variations in the physical environment and technology. Thus the raising of crops in a humid environment will have a different production function from that for raising crops in an arid environment under irrigation. Another example is that of raising beef. Farmers who graze livestock on natural pasture should have different production functions from those using feedlots. Given the optimal intensities and yields as a function of distance for each agricultural type, these can then be inserted into the rent equation (12.1) and the optimal rent for each agricultural type can be defined as a function of distance. At any location, the type that earns the highest rent (has the highest bid rent curve) is allocated to that location in the manner formulated by Dunn (1954). To do this, however, the form and parameters of the production function of each agricultural type must be known. Otherwise the optimal intensities and yields cannot be defined and the bid rent equations cannot be specified. Given the assumption that the production functions are continuous, the composite rent function (of distance) derived from the inclusion of several agricultural types should also be continuous. However, the intensity and yield distance decay functions need not be continuous. They may be step functions.
S.VISSER 203
Where agricultural types change, there may be sudden changes in intensity and yield. As distance to market increases it is possible for a more intensive type of agriculture to replace a less intensive one, although on average intensity should decrease with distance to market. The fact that the composite intensity and yield functions may be step functions will make empirical identification of unique production functions for each agricultural type much easier. This can be summarized as follows. 1 For a given agricultural type, intensity decreases as distance to market increases. The spatial variation of optimal intensity thus enables identification of the production function for that type of agriculture. 2 Different agricultural types occur at different distances from the market, and thus the intensity of agriculture need not be a continuous function of distance. This enables the empirical identification of unique production functions for each agricultural type in a study area. 3 Because of environmental variations, and the fact that land-use allocation is a stochastic process, abrupt changes in land use cannot be expected. Within each area a mixture of agricultural types can be expected. That is, there is a drift in agricultural types or technologies. Thus to estimate the individual production functions we cannot choose areas with a single agricultural type. Instead, over a large study region, we must expand the parameters of a production function for agriculture (y(x)) into variables which indicate the extent to which a given agriculture and technology exists within a certain statistical area (i.e. county). 4 To apply empirically agricultural location theory the form and parameter values of the individual production functions must be known. THE FORM OF THE PRODUCTION FUNCTION Two production functions are estimated in this paper. The first is the CobbDouglas function which is the one most commonly estimated because it is readily made linear and has been found to describe output responses to variations in inputs very accurately. This production function has diminishing marginal returns to intensity throughout its range, as shown in Figure 12.2. However, it appears probable that agricultural production functions have increasing marginal returns to intensity at low levels of intensity when production is viewed in the long term (Figure 12.3). That is, there are high fixed costs in farming which must be incurred before any production can occur. The presence of increasing marginal returns to intensity has major economic implications. For instance, if there are decreasing marginal returns to intensity throughout the production function, the intensity of agriculture can approach zero at the margins of cultivation (at either the locational or the environmental margin). In fact, this is rarely the case. Even in the case of extensive pastoralism under semi-arid conditions, a substantial initial investment is required for livestock
204 THE SPATIAL STRUCTURE OF AGRICULTURE
Figure 12.3 Hypothesized shape of agricultural production functions with increasing marginal returns to intensity at low levels of intensity
inventory and fences. Second, if prices for output fall, farmers can reduce intensity from whatever operating level, provided there are decreasing marginal returns to intensity, in order to adjust to the lower prices. If there are increasing marginal returns, however, they cannot lower their intensity below a certain level because of fixed costs. In this instance, farmers must sell their investment at a loss (i.e. go bankrupt), and new farmers can purchase the capital plant at a lower cost which enables them to be profitable at the new output prices. The present rash of US farmers in economic difficulty is a reflection of the fact that lowering intensity is not a viable response to lower real output prices and higher real input prices because of the increasing marginal returns of the production functions they are on. The production function form is also critical to the crop allocation model. For example, assume corn and wheat have Cobb-Douglas production function forms. As transport costs increase with distance from the market, both output types must lower their intensity and therefore their yields. Unless the corn yields drop much more rapidly than wheat yields as intensity is reduced, wheat cannot earn a higher rent than corn anywhere. In fact, the production functions, when yield is measured in dollar units, must cross. This concept is illustrated in Figure 12.4. Although the production functions can cross without either having increasing marginal returns to intensity (panel A), it is more likely if one or both do. Thus, in panel B, if transport costs dictate that corn must be grown at intensity levels less than level X, corn becomes uncompetitive with wheat production. ln turn, wheat cannot be profitable at intensity levels below Z and so would be displaced by another output type. The two types of production functions (i.e. the Cobb-Douglas and the step function) will be hard to distinguish from one another empirically because, first,
S.VISSER 205
Figure 12.4 Production function shapes and the location allocation of output types. See text for explanation
most of the costs reported in the Census are variable costs. The only fixed capital costs reported are interest expenses which cannot be utilized because farmers do not have equal degrees of indebtedness. That is, for some the capital investment is equity capital. Others have borrowed it. So interest expenses give little clue to fixed (or variable) capital costs. Second, farmers should all be operating in that part of the production function in which there are diminishing returns to intensity. A Cobb-Douglas function may be traced out by farmers operating at all levels of
206 THE SPATIAL STRUCTURE OF AGRICULTURE
Figure 12.5 Empirical observations of yield and intensity and the underlying production functions. Curve A is the Cobb-Douglas function. Curve B is the production function with increasing marginal returns to intensity at low levels of intensity. Which one is applicable to farm production cannot be identified given the above scatter of farms
intensity, but in the case of increasing marginal returns to intensity there are no farmers operating at low intensity levels. The part of the function with increasing marginal returns to intensity cannot be traced out. However, where farmers are all operating in the high intensity portion of the production function, there is still no guarantee that the portion of the function traced out is not part of a CobbDouglas function (Figure 12.5). The Cobb-Douglas function to be estimated is (12.4) and the function which indicates increasing marginal returns to intensity is (12.5) where a, b, c, and h are parameters to be estimated for each output type, and y and x have been previously defined. The linear transformation of equation (12.4) for regression estimation is (12.6) Equation (12.5) is already linear and will be referred to as the linear-log model henceforth. Although this function does not exactly describe the illustrated function with increasing marginal returns, in that there is no vertical section, it does approximate it. THE DATA Because the production function is believed to be different under varying environmental conditions, an area with a relatively homogeneous physical
S.VISSER 207
environment was needed to estimate the production functions. This area also needed to be large to permit significant spatial variation in farm output prices and factor prices, and thus in intensity. The northern Great Plains was therefore chosen, including the states of North and South Dakota, Nebraska, Kansas, and Iowa, the western portion of Minnesota, and the eastern portions of Montana, Wyoming, and Colorado. Portions of three major agricultural zones occur within this area. They are the corn belt, the wheat zone, and the livestock ranching region. Two hundred counties were randomly selected from this area. Standard Metropolitan Statistical Areas (SMSAs) were excluded because of the urban impact on local agriculture. Data were collected only for what the 1982 Census of Agriculture defines as commercial farms (sales of more than $10,000 per year) because it is probable that small, parttime farms are not on the same production function as commercial farms. Commercial farms account for more than 90 percent of the agricultural production of the area. County aggregate data for these farms were then collected. The variables included the area in farms, the area in each type of crop, and the area irrigated; the value of sales for each type of commodity; the number of operators, the number of operators over 65 years of age, and days per year worked off-farm; and farm expenditures on each type of expense. Given the nature of these cross-sectional data, the production function of each agricultural type can be estimated by expanding (Casetti 1972b) the parameters of equations (12.5) and (12.6). Rewriting the linear form of the Cobb-Douglas function (equation (12.6)), we have (12.7) where The parameters are then expanded with respect to crop type and type of technology by the following equations: (12.8) (12.9) where Ci are variables measuring output type and Tj are variables measuring type of technology. For instance, C1 may be the percentage of the county’s farmland area in wheat or the percentage of the county’s farm sales earned by wheat. T1 may be the percentage of the county’s farm area that is irrigated, because irrigated farming should have a different production function from that of dry farming. The parameters could also be expanded with respect to environmental variables. Substituting the expansion equations back into equation (12.7) generates the terminal model (12.10) The parameters h and a of the Cobb-Douglas function therefore depend on the type of production and the type of technology employed in farming. This
208 THE SPATIAL STRUCTURE OF AGRICULTURE
formulation does not, however, permit the decomposition of output or yield into physical quantities of each commodity type as formulated in location theory. The expanded regression formulation is dictated by the nature of the data, and the single dependent variable of yield must have a common measure for all output types. Therefore it is measured in terms of average value of yield per acre in each county. It therefore follows that the independent variables measuring output proportions in each county should be measured in terms of the percentage of total farm sales contributed by each commodity type instead of the percentage area devoted to each type. In addition, the former variable has fewer missing observations. The parameters of the linear-log model production function (equation (12.5)) are expanded in exactly the same way, generating a regression model identical to equation (12.10) except that the dependent variable is y instead of In y. In both models, if the production function for wheat (as an example) is different from that of other agricultural types, then the hi and/or ai parameters for the wheat variable will be significantly different from zero. Intensity was also decomposed into capital and labor intensity, and the production function parameters of each type of input were expanded with respect to output type and type of technology in order to see whether there were significant differences in the capital-to-labor ratios for the different kinds of farming. Multicollinearity of the two inputs did not seem to be a major problem. The simple linear regression of labor intensity against capital intensity had an R2 of 0.46 (n=160). MEASUREMENT OF VARIABLES The dependent variable in all instances is the county’s average dollar value of agricultural sales per acre in 1982. The natural logarithm of this variable was used to estimate the Cobb-Douglas function. To measure nonland capital inputs per acre, only variable expenditures were available. Although the Census provides interest expenditures, these would include interest on farm mortgages (i.e. in many cases they are measuring land rental and not the expense of nonland capital), and if the degree of indebtedness varies systematically over space, then the use of interest payments in the capital measure would bias the estimation of the production functions. Capital expenditures, then, were the sum of expenditures on livestock, feed, seeds, fertilizer, energy, and machine rentals. The data had a large number of missing values because of disclosure problems, and so expenditures on agricultural chemicals other than fertilizer were omitted from the summation. This does not appear to have biased the capital variable. The linear regression of capital intensity with expenditures on other agricultural chemicals against capital intensity without other agricultural chemicals had an R2 of 0.99 (n=181). The measurement of labor inputs to agriculture using Census data poses two problems. First, there is doubt about the actual quantity and value of labor
S.VISSER 209
employed (Hottel and Gardner 1983). To calculate the quantity of operator labor a variant of a formula proposed by Griliches (1964) was used: where D is days of operator labor, assumed equal to 250, minus the calculated number of days worked off-farm, N is the number of operators, and A is the number of operators aged 65 and over. No data are available on the quantity of family labor. To impute a value for this operator labor ‘the most common procedure is to use the hired farm wage rate as the shadow price of the labor component of operator and family services and to add a fixed percentage (often 5 percent) of gross receipts as a management charge’ (Hottel and Gardner 1983). This method appears to overstate the operator’s remuneration in certain kinds of agriculture, such as feedlot operations where there is a very high sales volume. In this research, I chose simply to assess an operator’s labor and management as equivalent to middle management in an urban occupation, and so assigned a full-time operator’s return on labor and management as being $30,000 per year. This expense was then added to expenditures on hired labor to generate an average labor cost per acre for the counties. If the value of the operator’s labor is grossly overstated or understated, this variable may be biased in that the labor input in regions where little hired labor is used may be overstated or understated. However, it is the spatial variation in labor costs that matters in the estimation of the production functions, and the same value for labor has been imputed in all locations. Expenditures on contract labor, though available, were not included in the labor measure, because of a large number of missing values. The relative quantities spent on contract labor did not appear to vary spatially. The simple linear regression of labor intensity including contract labor with labor intensity excluding contract labor yielded an R2 of 0.99 (n=174). Labor intensity was added to capital intensity to derive a composite measure of intensity. Six variables were used to measure output type. They were the percentage of total farm sales earned from wheat, corn for grain, soybeans, sorghum, cattle and calves, and crops (all agricultural products less livestock products; when crops are considered in the models I exclude the other output types). Output types such as sheep, poultry, dairying, and oats were not considered because they were not significant sources of revenue in most counties, and there were problems of missing data as a result of disclosure rules. Technology variables include the percentage of farm land area irrigated and feed expenditures as a percentage of total capital expenditures. The latter measure was intended to identify feedlot agriculture. Thus eight agricultural variables are being considered: irrigated farming in general, feedlot farming, wheat farming, corn farming (for cash), soybeans, sorghum, cattle ranching, and crops in general. It is recognized that these are not necessarily independent farming types, but if, for instance, soybeans are generally grown in association with corn, multicollinearity of the variables will cause one to be excluded from the multiple regression. Natural logarithms of those output and technology type variables were tried in the regressions, but the linear measures satisfied the assumptions of regression
210 THE SPATIAL STRUCTURE OF AGRICULTURE
analysis better, and resulted in higher levels of explanation in the production function estimates. The variables are summarized in Table 12.1. Table 12.1 Key to regression variables Intensity LNINTENS Natural logarithm of intensity per acre ($) LNCAP Natural logarithm of capital intensity per acre ($) LNLAB Natural logarithm of labor intensity per acre ($) Technology FEED Feed expenditures as a percentage of capital expenditures IRRIG Percentage of farm land area in irrigation Agricultural types CATTLE Percentage of agricultural sales earned by cattle CORN Percentage of argicultural sales earned by corn CROPS Percentage of agricultural sales earned by crops SORGH Percentage of agricultural sales earned by sorghum SOYB Percentage of agricultural sales earned by soybeans WHEAT Percentage of agricultural sales earned by wheat
PRODUCTION FUNCTION ESTIMATION RESULTS All combinations of the seven output and technology type variables were used in the estimation of the two production functions. Stepwise regression was employed. The results shown in Tables 12.2 and 12.3 only provide coefficients for those variables significant at p<0.05, unless an independent variable, while not having a significant independent effect, increased the adjusted R2 of the multiple regression by at least 0.01. At first glance the production function estimates appear most satisfactory. The Cobb-Douglas functional form is clearly superior to the linear-log model, as is evident from the R2 values. Furthermore, in addition to the extraordinarily high explanation provided by the Cobb-Douglas function, the elasticity of output of intensity is always less than unity, and the elasticities of labor intensity and capital intensity also sum to less than unity.5 Thus the functions exhibit diminishing marginal returns to intensity as necessitated by economic theory. The estimations also suggest a significant difference in the production functions for different agricultural and technology types. For instance, FEED appears in many of the regressions, suggesting that in the case of feedlot operations the response of yield to intensity is different from that for other agricultural types. However, the coefficient for FEED in all regressions is negative. The simple correlation coefficient of FEED with yield is positive, and FEED is positively related to intensity. Thus higher FEED values are associated
S.VISSER 211
with higher intensity which causes higher yields, but the marginal productivity of intensity at higher levels is lower, causing the negative coefficient for FEED. The influence of irrigation on yield, as measured by IRRIG, is more difficult to unravel. IRRIG has a weak, positive, simple correlation with yield and intensity, but in most of the regressions with yield as the dependent variable, IRRIG did not enter after controlling for intensity and other variables. When it did enter, its coefficient was negative, and the significance was weak. Counties with a high percentage of irrigation tend to have a higher intensity which explains the higher yield. When intensity is controlled Table 12.2 Cobb-Douglas production function estimates Independent variable Regression coefficient Adj. R2 N Model 1 Intensity only Constant Model 2 Capital and labor intensity only LNLAB Constant Model 3 Intensity; expansions of crop sales, feed, and irrigation CROPS * LNINTENS Constant Model 4 Intensity; expansions of output types and feed FEED CATTLE CORN * LNINTENS Constant Model 5 Capital and labor intensity; expansions of crop sales, feed, and irrigation LNLAB CROPS * LNLAB Constant Model 6
LNINTENS 0.44
0.98
0.981
176
LNCAP
0.66
0.981
176
0.81
0.988
137
0.96
0.990
104
0.73
0.988
137
0.32 1.09 LNINTENS
0.00096 0.44 LNINTENS –0.0065 –0.0024 0.0009 0.71 LNCAP
0.18 0.0015 1.10
212 THE SPATIAL STRUCTURE OF AGRICULTURE
Independent variable Regression coefficient Adj. R2 N Capital and labor LNCAP 0.83 0.992 104 intensity; expansions of output types and feed LNLAB 0.11 FEED –0.0077 CATTLE –0.0061 WHEAT 0.0060 WHEAT * LNCAP –0.0025 Constant 1.56 Note: Dependent variable, natural logarithm of yield ($ per acre), 1982; p<0.01 for all estimates. Table 12.3 Linear-log model production function estimates Independent variable Regression coefficient Adj. R2 N Model 1 Intensity only Constant Model 2 Capital and labor intensity only LNLAB Constant Model 3 Intensity; expansions of crop sales, feed, and irrigation Constant Model 4 Intensity; expansions of output types and feed WHEAT SORGH CATTLE * LNINTENS CORN * LNINTENS SOYB * LNINTENS FEED * LNINTENS Constant Model 5 Capital and labor intensity; expansions of
LNINTENS –330.9
112.38
0.763
176
LNCAP
102.55
0.763
176
118.73
0.763
137
78.18
0.797
104
55.79
0.773
137
– –242.53 LNINTENS
–356.2 LNINTENS 1.81 –4.23 0.37 0.47 0.45* 0.57 –358.30 LNCAP
S.VISSER 213
Independent variable Regression coefficient Adj. R2 N crop sales, feed, and irrigation FEED –6.79* FEED * LNCAP 2.10 Constant –94.60 Model 6 Capital and labor LNCAP 148.70 0.852 104 intensity; expansions of output types and feed LNLAB –37.08 WHEAT 6.17 SORGH –2.68 WHEAT * LNCAP –2.05 SOYB * LNCAP 0.32** Constant –298.80 Note: Dependent variable, dollar yield per acre, 1982; *, p<0.05; **, p<0.10; otherwise, p<0.01.
in the multiple regression, however, the independent and negative impact of irrigation probably reflects a lower marginal productivity of intensity at higher intensities, and also reflects the fact that irrigation occurs in drier counties such that the yield on nonirrigated land within those counties is below average for the study area. Because of this problem in the measurement of irrigation, and the fact that it did not contribute significantly to explanation of variation in yield, IRRIG was excluded from the regressions. The exclusion of IRRIG also increased the number of observations because the irrigated acreage was not reported for many of the counties. The significance of the output type variables in the regressions suggests that the production functions for some of these output types can be described. For instance, in model 4 of Table 12.2 (the Cobb-Douglas function estimates), if it is assumed that corn provides 100 percent of agricultural sales, then the variables CATTLE and FEED must both have a value of zero, and the production function for corn is therefore (12.11) where X is intensity (0.71=ln 2.03). This production function describes increasing marginal returns to intensity at all levels of intensity. It is possible that farmers are operating on that portion of a production function in which there are increasing marginal returns to intensity, but it is not probable. Diminishing marginal returns must occur eventually, and it is that portion of the production function in which they occur which is relevant to economic theory. The function shown in equation (12.11) is partly the result of extrapolating beyond the range of
214 THE SPATIAL STRUCTURE OF AGRICULTURE
Figure 12.6 Unique linear-log production functions for individual agricultural types generate a production function envelope that is estimated as a Cobb-Douglas function
the data. That is, CORN had an average value of 14 percent and never exceeded 50 percent. The inescapable conclusion, therefore, is that equation (12.11) does not describe a production function for corn. It may, however, provide a satisfactory estimation of the production function for areas. That is, when the independent variables describe a mix of land uses characteristic of the region, the models provide a very accurate estimate of the average value of yield in a county. It is doubtful, then, that the estimation procedure followed in this paper has actually estimated the production function for a particular agricultural type. It also follows that, despite the R2 values, the Cobb-Douglas functional form is not necessarily the best to describe an agricultural production function. The rationale is illustrated in Figure 12.6. In this figure, the solid lines A, B, and C represent the true production functions for three types of agriculture. The scatter of points represent the county observations. The broken line W is the linear-log production function that can be estimated as the aggregate production function for all agriculture in the region. The broken line YZ is the Cobb-Douglas function that can be estimated from the data. It is clear that the Cobb-Douglas function describes the data better than the single linear-log model, and that the individual production functions would probably not be identified by the regression because farmers are not operating on those parts of the individual production functions which are needed to identify them. Figure 12.6 also illustrates how the production function estimates result in a Cobb-Douglas function with an output elasticity that is very close to unity. The scatter of counties traces out a relationship between intensity and yield that is very close to linear. The regression results in Tables 12.2 and 12.3 show elasticities of output that are very close to unity when the average values of output type for the counties are
S.VISSER 215
Figure 12.7 The effect of varying annual prices of output on estimation of production functions measured in terms of value of yield. The solid lines represent the actual production functions in terms of long-term equilibrium prices. The broken lines represent the production functions estimated given the prices for produce in any single year. In (b) different production functions are estimated when in fact both types of production are on the same production function in the long term
substituted into the expanded equations. This therefore lends support to the argument that the equations are describing the phenomenon shown in Figure 12.6.
216 THE SPATIAL STRUCTURE OF AGRICULTURE
Using values of yield instead of physical quantities to estimate production functions also impedes identification of the individual production functions. Figure 12.7(a) illustrates the production functions for three agricultural types based on long-term equilibrium prices for output. If observation values traced out these production functions, the regression would not be able separately to identify the three functions because of the high degree of multicollinearity between intensity and agricultural type (measured as a percentage of sales earned by a commodity). However, in any one year prices for output may vary considerably from the long-term average such that the relationship between value of yield and intensity for each agricultural type is as shown by the broken lines in Figure 12.7(b). In that case, the regression will identify separate production functions for the different output types, although actually it is merely identifying a price effect. As formulated in the theory portion of this paper, production functions should describe the physical output effect of a change in the quantities of the inputs. Ideally, labor inputs should be measured in terms of physical quantities (e.g. days), but the mix of capital inputs would still have to be measured in terms of value. Factor prices should not vary differentially over time as much as output prices do. The dependent variable should be physical quantities of output of a particular commodity. Thus there should be separate regressions for each output type, and the independent variables must measure inputs specific to growing each commodity. The data necessary for such estimations would have to be gathered from individual farm surveys or via experimentation, and the expansion method, as used here to identify commodity components of an aggregate production function, would not be applicable. The expansion method has provided, however, an excellent estimate of the form and parameters of areal production functions given a certain mix of land-use types. ACKNOWLEDGMENT The author is grateful to Charles Aston and Deborah Stewart for the collection of the data. NOTES 1 Note that, with increasing capital productivity as a result of technological change, time series data would tend to mask decreasing marginal returns to scale or intensity, and tend to indicate proportional or even increasing marginal returns. Given long time periods, the R2 of these regressions is also very high, with large increases in inputs over time producing large increases in output. 2 Because of the long-term equilibrium assumption, there is no need to distinguish between variable and fixed input costs.
S.VISSER 217
3 In previous work I have expanded this production function in terms of capital and labor (Visser 1980a, b, 1982). This model also approximates that of Katzman (1974). 4 The spatial behavior of rent can be determined by obtaining the first and second partial derivative of rent (12.1) with respect to distance. This shows that rent decreases at a decreasing rate as distance to market increases, regardless of whether transport rates (per mile costs) are a proportional or decreasing function of distance. 5 The coefficients of the inputs in the Cobb-Douglas model are the output elasticities of those inputs.
REFERENCES Bannister, G. (1977) ‘Land use theory and factor intensities’, Geographical Analysis 9: 319–31. Beckmann, M.J. (1972) ‘Von Thünen’s model revisited: a neoclassical land use model’, Swedish Journal of Economics 74:1–7. Brown, E.H.P. (1957) ‘The meaning of the fitted Cobb-Douglas function’, Quarterly Journal of Economics 71:546–60. Brown, W.G. and Beattie, B.K. (1975) ‘Improving estimates of economic parameters by use of ridge regression with production function applications’, American Journal of Agricultural Economics 57:21–32. Casetti, E. (1972a) ‘Spatial equilibrium distribution of agricultural production and land values’, Economic Geography 48:193–8. ——(1972b) ‘Generating models by the expansion method: applications to geographical research’, Geographical Analysis 4:81– 91. Dunn, E.S. (1954) The Location of Agricultural Production, Gainesville, FL: University of Florida Press. Garrison, W.L. and Marble, D.F. (1957) ‘The spatial structure of agricultural activities’, Annals of the Association of American Geographers 47:137–44. Griliches, Z. (1962) ‘Review of Agricultural Production Functions, by E.O.Heady and J.L.Dillon, 1961’, American Economic Review 52: 282–5. ——(1963) ‘Estimates of the aggregate agricultural production function from crosssectional data’, Journal of Farm Economics 45:419–28. ——(1964) ‘Research expenditures, education and the aggregate production function’, American Economic Review 54:961–74. Hall, P. (ed.) (1966) Von Thünen’s Isolated State, trans. by C.M. Wartenburg, Oxford: Pergamon. Hottel, J.B. and Gardner, B.L. (1983) ‘The rate of return to investment in agriculture and measuring net farm income’, American Journal of Agricultural Economics 65: 553–7. Isard, W. (1956) Location and Space-Economy, Cambridge, MA: MIT Press. Katzman, M.T. (1974) ‘The Von Thünen paradigm, the industrial-urban hypothesis, and the spatial structure of agriculture’, American Journal of Agricultural Economics 56: 683–96. Mundlak, Y. and Hoch, I. (1965) ‘Consequences of alternative specifications in estimation of Cobb-Douglas production functions’, Econometrica 33:814–28.
218 THE SPATIAL STRUCTURE OF AGRICULTURE
Randall, A. and Castle, E.N. (1985) ‘Land resources and land markets’, in A.V.Kneese and J.L.Sweeney (eds) Handbook of Natural Resource and Energy Economics, vol. II, pp. 571–620, Amsterdam: Elsevier. Visser, S. (1980a) ‘Technological change and the spatial structure of agriculture’, Economic Geography 56:311–19. ——(1980b) ‘Modeling the spatial structure of agricultural intensity: illustration of the Casetti expansion method’, Modeling and Simulation 11:1393–8. ——(1981) ‘Estimation of agricultural production functions’, Modeling and Simulation 12:833–9. ——(1982) ‘On agricultural location theory’, Geographical Analysis 14:167–76. Webber, M.J. (1973) ‘Equilibrium of location in an isolated state’, Environment and Planning 5:751–9. Zellner, A., Kmenta, J. and Dreze, J. (1966) ‘Specification and estimation of CobbDouglas production function models’ , Econometrica 34:784–95.
13 INCORPORATING THE EXPANSION METHOD INTO REMOTE SENSINGBASED WATER QUALITY ANALYSES Martin Miles, Douglas A.Stow, and John Paul Jones, III
Remote sensing involves collecting and analyzing information about the characteristics or properties of objects or phenomena without being in physical contact with them. One aim of remote sensing of the environment is to understand the spatial order of the Earth’s surface properties from measurements of electromagnetic energy obtained from a distance above the surface. Much research in remote sensing is oriented toward understanding the Earth’s biophysical environment, with particular emphasis on its resources. Imaging remote sensors, such as the Landsat multispectral scanner (MSS), are devices which measure electromagnetic energy with varying wavelength sensitivity. They synoptically record a two-dimensional picture of the electromagnetic energy. The recorded energy is that which is reflected or emitted from the surface as a function of the magnitude of surface properties or type of materials. Remotely sensed data from aircraft and satellites have been shown to be useful for the monitoring and mapping of many environmental properties, including water quality in estuaries, bays, and coastal oceans (Klemas et al. 1973; Kritikos et al. 1974; Khorram 1979, 1981; Khorram and Cheshire 1985). Spatial variation of near-surface water properties such as suspended sediment, chlorophyll, turbidity, temperature, and salinity are observable and usually quantifiable from remotely sensed data. Whether explicit or not, inference in remote sensing implies the development of a model. The transformation of remotely sensed spectral radiances into biophysical quantities such as water properties may be achieved by one of two modeling approaches, deterministic or statistical/empirical. In practice, these terms describe end points of a continuum of methodology; i.e. a method can be fundamentally deterministic even if formulated using some empirical components, and vice versa (Strahler et al. 1986). Deterministic methods involve a functional transformation based on physical principles of electromagnetic energy and matter interactions, and on auxiliary data to account for factors that determine the radiation that reaches the sensor. The amount of radiation reaching the sensor is a function of the characteristics of the ground scene (the spatial distribution of surface characteristics), the intervening atmosphere (the spatial distribution of gases and particulates which
220 M.M.MILES, D.A.STOW, AND J.P.JONES, III
exist between the ground and the sensor), and the sensor itself. Typically, deterministic remote sensing models account for the emissivity, reflectivity, scattering, and/or absorption of the scene, atmosphere, and sensor (Strahler et al. 1986). While deterministic methods are advantageous in that they can lead to a more direct understanding of the scene characteristics, such methods are often limited by a lack of data concerning radiation properties and/or an incomplete knowledge of radiation transfer physics. Statistical or empirical remote sensing methods rely on multiple regression relationships between a set of surface measurements and the spectral radiances observed remotely from the surface sample locations. One limitation of empirical modeling is the ‘black box’ nature of the approach, since the statistical relationships identified may be poorly understood in light of the physical bases of radiation transfer physics. In spite of not being physically deterministic, empirical methods can produce scene-specific models with quite accurate scene inference when properly developed for a certain region and conditions (Strahler et al. 1986). Such approaches may lead to the further development of deterministic models, while at the same time providing cost-effective means through which environmental monitoring and assessment can take place. A heretofore overlooked limitation of statistical models in remote sensing is the assumption of stability in regression model parameters. Such models do not allow for spatial variation in the functional relationships under examination (Jensen et al. 1980). The objective of this paper is to report on the development and analysis of spatially expanded regression models for the extraction of biophysical information from remotely sensed data. The results of specific applications to analyzing surface water quality properties will be presented. In this research we address (a) whether the parameters of variables describing water quality are spatially unstable, (b) whether there is an increase in the accuracy of predicting water quality through spatially expanded regression models, and (c) whether the spatial variation in radiative transfer due to atmospheric and oceanic factors can be more thoroughly accounted for and possibly understood through such a procedure. The research represents an example of the use of the expansion method for improving model performance within the domain of physical geography. THE EXPANSION METHOD AND TREND SURFACE EXPANSIONS Casetti (1972) suggested redefining an initial model’s parameters into functions of contextual variables. When replaced into the initial model, the expanded parameters yield a terminal model encompassing the initial model and the contextual variation of its parameters. Through the latter model’s estimation one can derive a mathematical portrait of parameter variation. For example, the parameters a and b in
REMOTE-SENSING-BASED WATER QUALITY ANALYSES 221
(13.1) may be redefined as functions of temporal, spatial, or substantive variables. By expanding b as a linear function of a variable d (e.g. distance to some fixed reference point), we obtain the expansion equation (13.2) which, when replaced into (13.1), yields the following: (13.3) Least squares estimates of b0 and b1 can be substituted for the letter parameters in (13.2) to derive an estimated expansion equation which specifies the drift of the parameter with respect to the variable d. The expansion method represents a second level of inquiry in regression modeling. By investigating how a parameter varies over a context, insight may be gained into relationships and processes. In addition, typically accompanying the methodology is an increased coefficient of determination (R2); i.e. a greater amount of the variation in Y may be accounted for with the varying parameter model (Casetti 1982). For remote sensing applications of the type described in the previous section, this approach seems quite appropriate, since the expansion method may ultimately lead to refinement of statistical/empirical models. However, in remote sensing, terminal models such as that shown in equation (13.3) may not be readily applicable as a two-dimensional mapping function for transforming image radiances into biophysical distributions. In order to use a varying parameter model as a mapping function with remote sensing data, each pixel in the scene must be assigned a value for each expansion variable. It can be rather cumbersome to assign each pixel a value on the distance to arbitrary geographic features (e.g. distance to shore). A far preferable approach would be to address parameter variation in two spatial dimensions so that the results can be readily associated with pixels in the digital array. A solution to this problem can be found in the use of trend surface expansions (TSEs), an approach illustrated by Jones (1983, 1984). This method involves redefining the initial model’s parameters as two-dimensional polynomials of spatial coordinates. These polynomials are the same as those used in standard trend surface analysis. TSEs combine the trend surface approach with the expansion method strategy. To illustrate, assume the same initial model as (13.1) (13.4) with Y and X measured in Cartesian units. The two-dimensional spatial stability of b can be examined by redefining the parameter in terms of a trend surface in the areal coordinates of the observations, as in, for example, the following second-order expansion equation: (13.5)
222 M.M.MILES, D.A.STOW, AND J.P.JONES, III
where u and v are the areal coordinates of the observations. The terminal model is obtained by replacing the expansion equation (13.5) into the initial model (13. 4), yielding (13.6) Least squares estimates of b0 through b5 can be substituted into (13.5) to identify the spatial form of the effect of X upon Y. This estimated expansion equation can be used to generate a trend surface map showing the spatial variation of the expanded parameter b. Alternatively, maps of the dependent variable can be produced by evaluating (13.6) for pixels with reference data for X, u, and v. Thus, by referencing the u, v coordinates of the TSE to the rows and columns of the digital image, both the spatially varying parameter and the predicted values from the model can be readily mapped using a digital image data array. This procedure may provide clues to the use of new independent variables in remote sensing models, and may ultimately increase our understanding of resource-energy transfer relationships. In the sections that follow we describe an application of TSEs for the analysis of four water quality measures. DATA AND MODELING PROCEDURES In this research, TSEs are applied to in situ and remotely sensed data to analyze water quality distributions for the Neuse River estuary in North Carolina. These data include Landsat-3 MSS digital data and near-simultaneous surface sample measurements from September 24, 1982 (see Khorram and Cheshire 1985). Water quality samples were collected at seventy-five surface sample sites; their
REMOTE-SENSING-BASED WATER QUALITY ANALYSES 223
Figure 13.1 Spatial distribution of sampling sites (marked by dots) in Neuse Estuary
distribution is shown in Figure 13.1. The seven westernmost sites were not covered by the satellite overpass, and were not used in any of the modeling procedures; this yields a total of sixty-eight observations for each of the models estimated. The precise location of the sample sites was indicated on nautical charts used in coordinating the sampling procedure (Khorram and Cheshire 1985). Four water quality properties were measured: (a) turbidity (an indicator of water clarity, governed by the amount of inorganic and organic matter); (b) total suspended solids (a measure of the concentration of suspended particulates in the water); (c) chlorophyll-a (an indicator of phytoplankton and biological productivity); and (d) salinity (the concentration of dissolved salts in the water). Table 13.1 reports descriptive measures of these dependent variables for the estuary. To ensure that remotely sensed radiance values obtained during the satellite overpass were properly referenced to the corresponding surface samples of the water surface, we employed the mean digital numbers derived from a nine-pixel (ground resolution cell) block surrounding each sample site. The nine-pixel averages were computed so as to compensate for imprecise collocation of sample sites with Table 13.1 Range of observed water quality values
Minimum Maximum Average
TURB
SAL
TSS
CHL-A
1 5 3.3
0 18 9.4
1 24 7.5
6 69 24.4
224 M.M.MILES, D.A.STOW, AND J.P.JONES, III
TURB
SAL
TSS
CHL-A
Note: Turbidity (TURB) is in nephelometric turbidity units; total suspended solids (TSS) is in milligrams per liter, chlorophyll (CHL-A) is in micrograms per liter, and salinity (SAL) is in parts per thousand.
the corresponding satellite data and to allow for temporal irregularities between remote and surface sampling. MSS wavebands 4, 5, 6, and 7 (representing 0.5–0. 6 μ m, 0.6– 0.7 μ m, 0.7–0.8 μ m, and 0.8–1.1 μ m, respectively) were used to develop the statistical models. The maximum, minimum, and mean digital number values in the four MSS bands are shown in Table 13.2. Various mathematical functions and combinations of mean digital numbers for each waveband were employed to assess variation in the water quality measures. The development of these new variables followed functional transformations which have been shown to be effective in previously published research in remote sensing of estuarine waters and in published results of in situ spectroscopic analysis of ocean color and chlorophyll content. In addition, a number of experimental transformations were derived. These functions included logarithmic and exponential transformations of the bands, ratio and differences, and complex combi Table 13.2 Range of mean Landsat multispectral scanner digital numbers BAND4
BAND5
BAND6
BAND7
Minimum 12.4 8.1 4.0 1.0 Maximum 16.7 11.3 7.1 4.8 Average 13.6 9.1 5.4 2.9 Note: Obtained from a nine-pixel block surrounding each of the sixty-eight sample sites. Original data had integer precision.
nations of such transformations. In this manner, the four MSS bands were transformed into over fifty new independent variables for input into the regression analysis. Because of the large number of transformed independent variables, the four initial models were estimated using a stepwise selection procedure, which was terminated when variables not in the model failed to achieve the 95 percent confidence level. The significant variables retained from these models were then tested for parameter stability using TSEs. The TSE variables were produced by establishing a u, v coordinate grid centered over the study area. In this manner, each water sampling site was assigned a u and v value for use in evaluating the stability of the independent variables’ parameters. First-, second-, and third-order trend surface variables were employed for each of the water quality models. Presented here are the results of the third-order TSEs, which were also derived through a stepwise selection procedure employing the same 95 percent
REMOTE-SENSING-BASED WATER QUALITY ANALYSES 225
confidence level criteria. The estimated TSE equations were then used to produce maps of the spatially unstable parameters for the estuary. RESULTS A summary of the results of the initial and terminal models is presented in Table 13.3. The table indicates that there is a high degree of variability in the success of the initial modeling procedures, as well as in the improvements brought about by the TSEs. The improvement in R2 ranges Table 13.3 Summary of results Model
R2, initial
R2, TSE
No. X, initial
No. X, TSE
TURB 0.42 0.51 2 1 SAL 0.83 0.99 2 2 TSS 0.34 0.39 2 2 CHL-A 0.13 0.45 1 1 2 Note: Shown for each model are the R values of the initial and TSE models, the number of predictor variables in the initial models (No. X, initial), and number of those variables found to be spatially unstable (No. X, TSE).
from a low of 5 percent (for the total suspended solids (TSS) model) to a high of 32 percent (for the chlorophyll-a (CHL-A) model). Furthermore, all the initial models evaluated exhibited some form of parameter instability. In one case, that of turbidity; one of the independent variables was found to be spatially stable with respect to the third-order TSE. The results of each of the four modeling procedures are discussed separately below. Turbidity models The initial model for turbidity resulted in the incorporation of two predictor variables: where TURB is turbidity, in nephelometric turbidity units,
The corresponding TSE model, obtained by expanding both the X1 parameter (b) and the X2 parameter (c) as a third-order trend surface in u and v, resulted in the incorporation of two expansion terms (u and v3) for the X1 parameter, while X2 was found to be spatially stable. The estimated terminal model is
226 M.M.MILES, D.A.STOW, AND J.P.JONES, III
Figure 13.2 Distribution of b parameter from turbidity model
As shown in Table 13.3, this model improved the R2 from 0.42 to 0.51. The estimated expansion equation for the X1 variable is defined as Figure 13.2 shows the two-dimensional spatial variability of b as specified by the estimated TSE equation. The figure reveals a change in the sensitivity of the parameter around the west-central portion of the estuary. This region is associated with particularly high chlorophyll-a levels (as evidenced by the results of the surface sampling). Varying concentrations of chlorophyll affect the absorption and reflectance of electromagnetic energy, particularly in the wavelengths that the multispectral scanner measures. The physical reason for improvement of the model appears to be the ability of the TSE technique to take into account the spatial heterogeneity of chlorophyll in the estuary, which varies as a function of the distance from river inflow. Salinity models The initial salinity model resulted in the incorporation of two predictor variables. The initial model is where SAL is the salinity, in parts per thousand,
REMOTE-SENSING-BASED WATER QUALITY ANALYSES 227
Figure 13.3 Distribution of b parameter from salinity model
Figure 13.4 Distribution of c parameter from salinity model
The TSE model resulted in the incorporation of five expansion variables (u, v, uv, v2, and v2u) for the X1 parameter (b) and one expansion variable (v2) for the X2 parameter (c). The estimated terminal model is
228 M.M.MILES, D.A.STOW, AND J.P.JONES, III
This model shows a substantially improved R2 (Table 13.3), with an unusually high value of 0.99 as a result of the incorporation of the expanded parameters. The corresponding TSE equations for X1 and X2 are
Figures 13.3 and 13.4 show the two-dimensional spatial variation of b and c as specified by these equations. The map of the b parameter indicates a gradual, unidirectional change in sensitivity along the length of the estuary, which is associated with a gradual increase in salinity and a decrease in turbidity. Interpretation of Figure 13.4 is more problematic, as the parameter does not exhibit a clear pattern with respect to the physical properties of the estuary. The primary reason for the observed model improvement appears to be the ability of the techniques to take into account the spatial distribution of a surface surrogate (perhaps turbidity), associated with salinity in the estuary (Khorram and Cheshire 1985). Another possibility is the effect of spatial variation in the concentration of airborne salt particles, which are generally higher near the ocean. However, since the physical basis for optical remote sensing of salinity has yet to be established, these conjectures remain necessarily speculative. Total suspended solids models The estimated initial model for total suspended solids is where TSS is total suspended solids, in milligrams per liter,
The TSE modeling resulted in the incorporation of one expansion variable (u) for the X1 parameter and one expansion variable (v2) for the X2 parameter. The estimated terminal model obtained is The estimated terminal model shows only a slightly increased R2, from 0.34 to 0. 39, as the result of the incorporation of the two expanded variables. The estimated expansion equations are
Figures 13.5 and 13.6 show the two-dimensional spatial variation of b and c. The spatial patterns of the parameters for the TSS model are similar to those of the salinity model (compare Figure 13.3 with Figure 13.5 and Figure 13.4 with Figure 13.6). Figure 13.5 suggests spatial effects in model estimation with respect to distance from the estuary’s mouth, while in Figure 13.6 we have again
REMOTE-SENSING-BASED WATER QUALITY ANALYSES 229
Figure 13.5 Distribution of b parameter from total suspended solids model
Figure 13.6 Distribution of c parameter from total suspended solids model
identified a wave-like pattern that operates throughout the estuary. The generalized along-estuary variation of parameter b (Figure 13.5) indicates that regression modeling of TSS may be influenced by variables which vary linearly with respect to distance from the estuary’s mouth.
230 M.M.MILES, D.A.STOW, AND J.P.JONES, III
Figure 13.7 Distribution of b parameter from chlorophyll-a model
Chlorophyll-a models The initial model for chlorophyll-a resulted in the incorporation of one predictor variable; the estimated model is where CHL-A is the chlorophyll-a concentration, in milligrams per liter, The TSE modeling resulted in the incorporation of three expansion variables (u2, v2, and v2u) for the X1 parameter. The estimated terminal model is
The R2 of this terminal model substantially improved, from 0.13 to 0.45. While the final value is relatively low, the improvement is greater than that seen in the other modeling results. The estimated TSE equation for b is This parameter is mapped in Figure 13.7. The map of parameter b indicates greatest sensitivity in the westernmost part of the estuary, with more gradual changes elsewhere. The western part of the estuary is characterized by high turbidity, low salinity, and high (though variable) biological productivity. This combination of factors is probably responsible for the high degree of parameter instability in this area. Also evident is a sensitivity to south shore versus north shore location throughout most of the estuary. The results of the surface sampling show higher salinity values near the north shore, compared with the south, where there is greater freshwater influx from rivers. This factor appears to
REMOTE-SENSING-BASED WATER QUALITY ANALYSES 231
affect the chlorophyll-a distribution and, consequently, the spectral radiance measured by the Landsat MSS.
CONCLUSION The analyses reported here indicate that the parameters of statistical models relating water quality properties to remotely sensed data are often spatially unstable, and that TSEs are capable of improving the reliability of such models. The TSE procedures have provided some insight into the physical reasons for the model improvement. For turbidity, the physical reasons for the observed spatial parameter variation are clearly evident. However, for other modeling results (e.g. salinity and chlorophyll-a), an assessment of the physical basis for the parameter variation remains rather speculative. The modeling of total suspended solids revealed parameter instability but little insight into the factors responsible. At this stage it is difficult to differentiate which physical factors (atmospheric or oceanic) are responsible for the parameter variation. Nevertheless, the analyses do point toward further refinement of statistical modeling strategies, which preferably would be improved by an interaction between deterministic and statistical methods for the modeling of estuarine characteristics using remote sensing data. Where statistical methods are used, optimization may be achieved in the manner illustrated in this paper. While the results summarized are from one case study, these methods should also produce positive results for remote sensing studies of estuarine water properties in other geographic regions. That is, there is no reason to believe that the existence of spatial parameter instability is region specific. Similar analysis should be performed using data from other estuaries to confirm that the conclusions derived here are in fact general. The application of the expansion method to the development of spatially varying parameter models as illustrated in this research suggests that the method would have much utility in other remote sensing applications. In particular, the method appears to be most useful in fine tuning statistically based calibrations of remotely sensed image data by spatially distributed point-sampled data at the surface or in the atmosphere. Such calibrations are often required in correcting radiometric influences on the image data, such as from atmospheric interactions, or for image-based mapping of biophysical surface features. The goal in radiometric processing of multispectral image data is to normalize the remotely sensed radiance values so that their variation is strictly due to the surface property of interest. Other factors can also affect the spatial variation in imaged radiance values, for instance those that influence the illumination of the surface (e.g. atmospheric particulates), the directional aspects of reflected radiation (e.g. topography), the variability in the surface absorption-reflection relationship (e.g. surface moisture), or the transmission characteristics of the atmosphere (e.g. atmospheric water vapor). Corrections for these radiometric
232 M.M.MILES, D.A.STOW, AND J.P.JONES, III
effects can be based on point sample measurements of the property to be corrected for at the time of a remote sensing overpass. The expansion method should prove useful for performing the corrections when a statistical approach is chosen and a number of spatially distributed samples have been taken. In summary, the inference of surface characteristics from remotely sensed data involves the estimation of functional relationships between the energy received at the sensor and the properties being sensed. In spite of the fact that physical properties are modeled, there is no a priori justification for assuming that the relationships identified are stable across the set of observations for which estimates have been made. The results presented here suggest that remote sensing-based water quality analyses can be improved through the expansion method. The same might be true whenever remotely sensed data are used for the quantification of the Earth’s surface characteristics. ACKNOWLEDGMENT We are grateful to Dr Siamak Khorram and Ms Heather Cheshire of the Computer Graphics Laboratory at North Carolina State University for supplying the data employed in this research. REFERENCES Casetti, E. (1972) ‘Generating models by the expansion method’, Geographical Analysis 4:81–91. ——(1982) ‘Mathematical modeling and the expansion method’, in R.B.Mandal (ed.) Statistics for Geographers and Social Scientists, pp. 81–95, New Delhi: Concept Publishing. Jensen, J.R., Estes J.E. and Simonett, D.S. (1980) ‘The impact of remote sensing in U.S. geography’, Remote Sensing of Environment 10:43–50. Jones, J.P. (1983) ‘Parameter variation via the expansion method with tests for autocorrelation’, Modeling and Simulation 14:853– 7. ——(1984) ‘A spatially-varying parameter model of AFDC participation: empirical analysis using the expansion method’, Professional Geographer 36:455–61. Khorram, S. (1979) ‘Remote sensing analysis of water quality in the San Francisco baydelta’, Proceedings of the 13th International Symposium on Remote Sensing of Environment, pp. 1591–601, Ann Arbor, MI: Environmental Research Institute. ——(1981) ‘Coastal water quality mapping from Landsat digital data’, International Journal of Remote Sensing 2:145–53. Khorram, S. and Cheshire, H.M. (1985) ‘Remote sensing of water quality in the Neuse River estuary’, Photogrammetric Engineering and Remote Sensing 51:53–61. Klemas, V., Bourchardt, J.F. and Treasure, W.M. (1973) ‘Suspended sediments observations from ERTS–1’, Remote Sensing of Environment 2:205. Kritikos, H., Yorinks, L. and Smith, H. (1974) ‘Suspended solids analysis using ERTS-A data’, Remote Sensing of Environment 3: 69.
REMOTE-SENSING-BASED WATER QUALITY ANALYSES 233
Strahler, A.N., Woodcock, C.E. and Smith, J.A. (1986) ‘On the nature of models in remote sensing’, Remote Sensing of Environment 20:121–39.
14 INNOVATION DIFFUSION THEORY AND THE EXPANSION METHOD Michael Sonis
Casetti defined the expansion method as follows: The expansion method is a procedure whereby a terminal model is generated from an initial one by making the parameters of the latter functions of some variables’ (1972:82). This definition includes not only the expansion of the parameters of an initial model by expansion equations, but potentially also the extension, modification, and metamorphoses of the conceptual framework of the initial model. The expansion method can be enriched by the application of two simple procedures for transforming an initial model into a terminal one. The first of these is the direct generalization of the initial model, which consists of the introduction of new variables and parameters analytically identical with those of the initial model. Usually direct generalization involves an increase in dimensionality. In the case of innovation diffusion, direct generalization is exemplified by the switch from one innovation to a set of competitive innovations. The conceptual rationale of this switch is the consideration of the innovation as an alternative choice, and the consideration of innovation spread as a result of competition between socio-economic systems supporting different innovation alternatives. Second, the expansion method can accommodate analytical metamorphoses of the initial and/or terminal model. These consist of changes in analytical form, analytical structure, or notational schemes corresponding to extensions of the conceptual framework. In this paper the analytical metamorphoses of models are, for example, the switch from the logistic S-shape growth curve to the logistic differential equation, and, consequently, to the matrix formulation of competition between adoption and nonadoption of an innovation. Other examples discussed are the transition from the temporal derivative of the rate of adoption of an innovation to directional derivatives in space-time, and the transition from the compressed logistic curve to the matrix formulation of the effect of an active environment through ‘redistribution’ processes. In this paper we extend the scope of the expansion method by including in it the procedures of direct generalization and analytical metamorphoses of models as well as the concomitant extensions of the conceptual frameworks of models. The paper presents the application of these procedures to innovation diffusion
INNOVATION DIFFUSION THEORY 235
theory. This enables the unification of geographical innovation diffusion theory, economic utility choice theory, and ecological competition theory (Sonis 1984, 1986a). THE EXPANSION METHOD AS A GLOBAL METATHEORETICAL APPROACH: AN APPLICATION TO THE INNOVATION DIFFUSION PROCESS A meta-theory provides a universal arena upon which the rise, elaboration, and establishment of new disciplinary theories takes place. A meta-theory is the coherent totality of general propositions and principles which can be used for the elaboration of some part of a specific discipline irrespective of substantive content. A global meta-theory is applicable to any specific disciplinary theory; examples are mathematics, logic, statistics, etc. A partial meta-theory concentrates on the specific properties and features common to a set of substantively different phenomena; examples include structural Q-analysis, graph-theoretical analysis, and factor analysis. Meta-theories are operational instruments of meta-theoretical approaches. Meta-theoretical approaches encompass the totality of heuristic, operational, or philosophical viewpoints, methods, or principles for the investigation of specific phenomena, irrespective of content. Analogous to the definition of global and partial meta-theories, one can define the notions of global and partial metatheoretical approaches. Sonis (1985, 1988) identified three global meta-theoretical approaches: interdisciplinary, integrative, and unifying. Only the first two of these will be considered here. They can be described as follows. The interdisciplinary approach is the mechanistic transfer of principles, methods, and models from well-developed disciplines to developing ones. The interdisciplinary approach moves ‘from model to reality’. This means that only the end point of a modeling process is interpreted, rather than its intermediate stages or the conceptual framework of the model being transferred. The integrative approach moves ‘from reality to model’ and involves also the interpretation of the conceptual framework of all intermediate stages of the modeling process. This requires the conversion of a specific principle into a meta-theoretical principle acting within at least two different conceptual frameworks. Thus, the positive impact of the integrative approach is connected to the modification and extension of the content of specific principles. From this point of view we can now evaluate the methodological and conceptual rationale of Casetti’s expansion method. The expansion method is a global meta-theoretical approach based on the stage-by-stage transformation of the interdisciplinary approach into the integrative approach. This transformation includes two operational phases within an iterative loop: (a) the expansion of an initial analytical model into a terminal model, and (b) the extension of the conceptual rationale of the initial model to the conceptual framework of the
236 M.SONIS
terminal model. Since the terminal model in a loop can become the initial model in a subsequent loop, the two operational phases can be iterated, if necessary (see Appendices). The formal scheme of the application of the expansion method is given in Figure 14.1. This scheme reflects the dialectic unity of the two operational phases that can be implemented only in close interaction with one another (Casetti 1986). Moreover, one can argue that the conceptual meta-theoretical rationale of the expansion method rests on a duality principle. A broad formal description of duality can be articulated as follows. Consider two systems, each including objects and their properties. One system is called the dual of the other if there exists a correspondence between the objects of both systems which defines (implies or generates) a correspondence between properties of objects. Apparently, the first clear expression of the duality principle appeared in the seventeenth century in projective geometry in the form of the Desargues duality principle: each true statement about points, straight lines, planes, and their incidence properties generates a true dual statement by exchanging the notion ‘a point’ by the notion ‘a plane’, and vice versa. The duality principle took the form of duality theories in almost all branches of mathematics, and spread from mathematics to other sciences. For example, in quantum mechanics the most important duality is ‘a particle’-‘a wave’. In economic linear programming models the duality interfaces two sides of the production process: physical production itself and the monetary evaluation of the production process by ‘shadow prices’, which in location/allocation optimization models play the role of differential rents. In the theory of antagonistic zero-sum games the dual structures represent the behavior of antagonistic players. In the spatial analysis applications of graph theory the duality between ‘links’ and ‘vertices’ translates into the duality between differentiation of regions and transportation networks; thus, the hexagonal covering of a plane in central place theory is dual with respect to a triangular transportation network. Analogously, the push-pull principle in migration theory describes the duality between origins and destinations (Sonis 1980). The introduction of new variables into the initial model by expansions implies the introduction of a complementary dual structure. Casetti’s (1986) dual expansions are formally reminiscent of the primal and dual formulations in linear programming. Indeed, we should not forget that early applications of the expansion method (e.g. Casetti and Semple 1969; Casetti et al. 1972; Hanham and Brown 1976) include also a latent duality between innovation spread characteristics and the features of adopters of innovations. In this paper the latter duality is presented explicitly in the form of a threefold correspondence between the objects of innovation (competitive innovations, the adopters choosing the innovations as choice alternatives, and the systems generating and supporting the spread of innovations) and the properties of such objects (the tempos of spread of different innovations, the adopters’ individual expectations for a gain in the future, and the marginal temporal utilities of supporting systems).
INNOVATION DIFFUSION THEORY 237
Figure 14.1 The operational stages of the expansion method
The recognition of this threefold duality extends the conceptual framework of innovation diffusion theory. The primary changes in this conceptual framework are connected with the consideration of the innovation as a choice alternative. Such a view converts innovation diffusion theory into a theory of dynamic choice processes (Sonis 1986a). The sociological theory of innovation spread emphasizes imitation and learning, and implies a rejection of the concept of homo economicus and a consideration of the innovation adopter as homo socialis. Homo economicus is a rational and omniscient creature exercising free choice between alternative innovations on the basis of utility maximization. Homo
238 M.SONIS
socialis’s behavior is collective and is based on imitation, learning, and the interactions among adopters of different innovations within an active and uncertain environment. The choice behavior of homo social is based on a subjective evaluation of expected future gains. This evaluation is heavily influenced by information flows through mass media which present ‘ready’ opinions and evaluations, thus making difficult the rational evaluation of innovations and their utilities for a potential adopter. Therefore, for homo socialis micro level utility maximization is replaced by a macro level balance between cumulative socio-spatio-temporal interaction among adopters of different innovations and a cumulative equalization of choice alternatives. This balance condition governs innovation spread and constitutes the dynamic macro level counterpart of the individual utility maximization principle (Sonis 1986a). Sociological innovation diffusion theory suggests that the choice behavior of homo socialis, which is the source and engine of innovation spread, is a collective macro level choice behavior such that the relative changes in choice frequencies depend upon the distribution of innovation alternatives between adopters (Fischer et al. 1990). This translates into Volterra-Lotka type partial differential equations that constitute the dynamic counterpart of the well-known static multinomial logit and dogit choice models. This paper is concerned with continuous-time innovation diffusion models and not with their discrete-time equivalents (see Sonis 1986b). The models of discrete-time innovation diffusion/choice processes can be extended into universal discrete-time multi-stock/multi-location relative socio-spatial dynamic models (see Dendrinos and Sonis 1990). TEMPORAL INNOVATION DIFFUSION WITHIN AN INDIFFERENT ENVIRONMENT AS COMPETITION BETWEEN ADOPTION AND NONADOPTION: EXPANSION TO THE CASE OF AN ACTIVE ENVIRONMENT The fundamental empirical regularity of innovation diffusion processes is the Sshaped curve relating the cumulative proportion of adopters of an innovation to time (Figures 14.2(a) and 14.2(b)). This S-shaped curve includes, in latent form, the essential features of innovation spread: the competition between adoption and nonadoption, the intervention of an active environment into the innovation diffusion process, and the process of the individual’s choice of an innovation within an indifferent or active environment. We will use two iterated expansions/ extensions, starting from the initial model which presents the cumulative Sshaped growth of innovation diffusion. The formal application of the expansion/ extension approach is articulated in Appendix 14.1. In the case of an indifferent environment, the S-shaped growth of the relative portion of adopters of an innovation can be presented by the following initial model:
INNOVATION DIFFUSION THEORY 239
Figure 14.2 Cumulative temporal S-shaped growth of the relative portion of adopters of an innovation: (a) innovation diffusion within an indifferent environment; (b) innovation diffusion within an active environment
(14.1) where is the ratio of adopters to non-adopters at time zero and a represents the tempo of innovation spread. Equation (14.1) is a solution for the Verhulst differential equation (14.2)
240 M.SONIS
which represents a contagious-type diffusion process: potential users of an innovation become adopters as a result of direct personal contacts with adopters. Here y(1−y) is proportional to the maximal amount of such contacts, and the parameter a is included to measure contacts leading to actual innovation adoption. Therefore, it is possible to interpret the parameter a as a measure of the effectiveness of the transfer of information about the innovation between the population of adopters and nonadopters. To represent innovation spread as competition between adoption and nonadoption, let us consider the relative portions of adopters and nonadopters Then the Verhulst equation (14.2) will have the following coordinate form: (14.3)
or in vectorial form (14.4) where The vector
describes the distribution of an innovation between adopters and nonadopters, and the matrix
represents the influence of adopters on nonadopters and vice versa. Equation (14. 4) can be written in the following symbolic form: (14.5) Equation (14.5) represents an analytical metamorphosis of model (14.1). Further, write then (14.6) One can interpret the co-influence parameter a as a value of an antagonistic zerosum game between adoption and non-adoption (Sonis 1983a, b). This interpretation implies the competitive exclusion principle: if the adoption is a ‘winner’ in the antagonistic zero-sum game against nonadoption (i.e. if a > 0), then in the long run all adopters will obtain the innovation. This principle is one of the most important concepts of the behavior of conservative multispecies
INNOVATION DIFFUSION THEORY 241
Figure 14.3 Scheme of the redistribution of an innovation between adopters and nonadopters caused by the intervention of an active environment
ecological systems, which forbids the stable coexistence of two (or more) species with identical habits within an ecological niche characterized by a limited food supply (Hardin 1960). In an active environment this principle must be modified. Under an active environment the portion y1(t) of adopters within an indifferent environment will be divided into two parts (Figure 14.3): the portion s+y1 of adopters that will remain adopters within an active environment, and the portion which will become nonadopters. Analogously, the portion y2(t) of nonadopters will be divided within an active environment into adopters s_y2 and nonadopters . Finally, the portion of adopters in an active environment will be (14.7a) and the portion of nonadopters in an active environment will be (14.7b) Thus, the distribution of adoption and nonadoption of an innovation in an active environment
is connected to the distribution
in an indifferent environment by the equation
242 M.SONIS
(14.8) where the matrix M is a Markovian matrix with the transposed stochastic matrix (14.9) such that ST = M. Moreover, equations (14.7a) and (14.1) give (14.10) which represents an S-shaped curve with a lower asymptote (an initial level) s_ and an upper asymptote (a saturation level) s+ (Figure 14.2 (b)). This figure shows that the population can be divided into three portions: the first with a share s_ represents the pool of adopters who have always been and will remain adopters; the second portion with a share of 1–s+ represents the environmental niche of nonadopters who persist in refusing to obtain the innovation; and the third portion with the share s+–s− represents the active population of potential and actual users who obtain or will obtain the innovation through innovation spread. Thus, the initial level s− and the saturation level s+ are defined by the introduction of an active environment. The competitive exclusion principle in the active environment influences only the active population in such a way that in the long run the entire active population will become adopters of the innovation. The compressed logistic curve (14.10) is the solution for the Pearl-Reed differential equation (14.11) Here w–s− and s+–w are the portions of adopters and potential users of the innovation from the active population; the product is proportional to the maximal number of contacts within the active population, and a/ is the relative rate of effectiveness of these contacts. Thus, the compressed logistic curve (14.10) and the Pearl-Reed differential equation (14.11) can be considered expansions of the logistic curve (14.1) and of the Verhulst logistic differential equation (14.2). Furthermore, the transformation (14.8) can be presented in the form (14.12) where
INNOVATION DIFFUSION THEORY 243
Therefore, the vectorial form of the Verhulst equation gives the following vectorial form of the Pearl-Reed equation: (14.13) presenting the innovation diffusion process within an active environment with the help of an antisymmetric competition matrix A and the Markovian matrix M. It is important to mention that the vector columns
of the Markovian matrix M represent the initial and final distributions of innovation within an active environment. SPATIO-TEMPORAL INNOVATION DIFFUSION: CASETTI’S EXPANSION AND DIRECTIONAL DERIVATIVES Casetti and Semple (1969:225) introduced the expanded model (14.14) starting from the initial model (14.1). Here y(t, s) is a portion of the adopters of an innovation at time t and at distance s from the supposed center of diffusion. This model is a particular case of a general terminal model (14.15) where the function u(t, s) (u(0, 0)=0) is an arbitrary space-time-dependent function. Models (14.1), (14.14), and (14.15) present three iterative expansions/ extensions leading to the introduction of a spatial dimension into the innovation diffusion process. The formal description of these stages is provided in Appendix 14.2. Function (14.15) satisfies the following system of partial differential equations, which constitute an analytical metamorphosis of model (14.15): (14.16)
The spatial distortions of innovation spread in each direction r can be represented by the partial differential equation equivalent to system (14.16):
244 M.SONIS
(14.17) where ′ y/′ r is a directional derivative in the arbitrary direction r in the domain of parameters t, s. Conceptually, this means that the Casetti-type expansion (14. 15) enables us, with the help of the directional derivative, to capture the spatial features of innovation spread without including special spatial diffusion terms into the model of innovation spread, as had been the usual case for ecological diffusion models since Hotelling’s (1929) seminal model. The vectorial form of equation (14.17) is (14.18) where
is the distribution of adopters and nonadopters at time t and at distance s from the diffusion center, and the matrix
is an antisymmetric competition matrix. This means that the function u(t, s) can be interpreted as a cumulative gain of adoption in the antagonistic zero-sum game against non-adoption. It is important to emphasize that there is no need to assume the existence of only one diffusion center: the distance parameter s can be exchanged by the geographical point N with coordinates x, y. Therefore, equation (14.18) can be rewritten as (14.19) which is a direct generalization of model (14.18). The choice of the analytical form of the co-interaction function u(t, x, y) renders useful cartographic representations of the innovation spread possible. The estimation of u(t, x, y) can be carried out using the following transformation of equation (14.15): Figure 14.4 shows the simplest way of deriving maps of innovation spread, which can be more readily obtained by computer. The influence of an active environment can be introduced into the differential equation of innovation spread (14.19) by a space-time-dependent Markovian matrix M(t, x, y) representing the spatio-temporal variations of the intervention
INNOVATION DIFFUSION THEORY 245
Figure 14.4 Construction of the level curves for general spatio-temporal innovation spread
of an active environment at each space-time point. As a result we have the equation (14.20) with solution
which represents the expanded form of equation (14.10). The vectorial partial differential equation (14.20) can be further expanded to the case of competitive innovations.
246 M.SONIS
MULTINOMIAL EXPANSION OF THE VERHULST LOGISTIC EQUATION: SPREAD OF COMPETITIVE INNOVATIONS WITHIN AN INDIFFERENT ENVIRONMENT The multinomial expansion of the innovation diffusion model is based on a conceptual framework which views innovation diffusion in terms of choices among alternatives (see Appendix 14.3). In the case of one innovation we must consider two exchangeable and exclusive alternatives: adoption and nonadoption. The extension of this frame of reference to the case of n innovations leads to the concept of competitive innovations. A set of n innovations is called a set of competitive innovations if these innovations are (a) mutually exchangeable, (b) mutually exclusive, and (c) exhaustive. Such a definition allows for the quantitative measurement of the relative portions of adopters of each innovation. The expansion of the Verhulst differential equation in the multinomial case produces the following system of log-linear differential equations: (14.21) The rationale here is that the products yiyj are proportional to the maximal number of direct and indirect contacts between adopters of innovation i and j, and the aij are measures of the effectiveness of these contacts. The vectorial form of (14.21) is (14.22) where
(14.23)
The condition (14.24) implies (Sonis 1983a) that (14.25) (14.25) has a game-theoretical interpretation: each i, j pair of innovations participates in an antagonistic zero-sum game with the interaction coefficient aij being the expected gain for innovation i. The distribution vectors
INNOVATION DIFFUSION THEORY 247
(14.26)
are equilibrium states of the log-linear dynamics (14.21). Moreover, the gametheoretical interpretation of the coefficients aij as pay-offs in antagonistic games among innovations leads to the following expansion of the competitive exclusion principle: (14.27) i.e. if the innovation i is a winner in all games against other innovations, then the state eβ i is a stable attractor, and state eβ i is a final distribution of adopters. Namely (14.28) so that in the long run all adopters accept only invitation i If and (14.29) then the state eβ
k
will be the initial distribution of adopters: (14.30)
The above conditions allow the qualitative analysis of the log-linear dynamics (Sonis 1986b). Let (14.31) and let
Then the qualitative conditions of asymptotic stability for of competitive exclusion states provide the following qualitative picture:
248 M.SONIS
(14.32)
This expansion of the competitive exclusion principle brings about new qualitative phenomena even in the case of three competitive innovations. For three innovations there are eight different sign combinations for sgn A. Six of these are characterized by asymptotically stable (for t ′ ± β ) initial eβ − and final eβ + states: (14.33) In each of the six cases (14.33) the portion of one of the innovations increases in an S-shaped fashion from 0 to 1; the other innovation decreases in an inverse Sshaped form from 1 to 0; the remaining innovation undergoes an increase from 0 to some maximal value, and after that decreases to 0 (Figure 14.5). The remaining sign combinations (14.34) identify a new qualitative phenomenon (Sonis 1986b) characterized by the absence of asymptotically stable
competitive exclusion equilibria and by the existence of a nesting set of periodic cycles surrounding an unstable equilibrium (14.35) where
INNOVATION DIFFUSION THEORY 249
Figure 14.5 Qualitative description of innovation diffusion dynamics with asymptotically stable initial and final equilibria
For n>3, the qualitative picture of the log-linear innovation diffusion dynamics is more complicated (see Sonis 19865). Thus, as we move from one innovation to n competitive innovations the topologically distinct types of innovation diffusion increase, despite the stability of the signs of the interaction coefficients aij. MULTINOMIAL EXPANSION OF S-SHAPED GROWTH: SPREAD OF TOTALLY ANTAGONISTIC INNOVATIONS WITHIN AN INDIFFERENT ENVIRONMENT The expansion of the equation to the multinomial case with n competitive innovations leads to the generalized logistic (14.36) with interaction potentials aj, j=1,…, n.
250 M.SONIS
The generalized logistics are the solution of the log-linear differential equations (14.21) in the special case when (14.37) The existence of interaction potentials aj obeying (14.37) implies that, for each closed chain of innovations i, j, k,…, m, i, the total interaction equals zero: (14.38) This means that each subset of m innovations participates in an uncooperative antagonistic zero-sum game, i.e. competition among innovations is totally antagonistic. The case of totally antagonistic competition is one in which the explicit solution of the system of log-linear equations (14.21) exists; in general the explicit solution of system (14.21) does not exist. Moreover, totally antagonistic competitions among subsets of innovations provide the ‘building blocks’ for the description of competition in general: the complete analysis of general competition, corresponding to the different cooperative zero-sum games, is based on a breakdown of competition into totally antagonistic competitions among smaller subsets of innovations (Sonis 1986b). Further, in the vicinity of each competitive exclusion equilibrium the general innovation diffusion process is approximated by totally antagonistic diffusion, which, in turn, is approximated linearly by the continuous Markov chain process (Sonis 1983a, b). THE ACTION OF AN ACTIVE ENVIRONMENT The active environment changes the accessibility of an innovation. The external intervention of an active environment brings about a redistribution of adopters among innovations. These interventions can be presented with the help of the stochastic matrix and its transposed Markovian matrix The coefficients sij are the frequencies in which the adopters of innovation i reject the innovation and instead adopt innovation j under pressure from the active environment. Therefore, innovation diffusion dynamics within an indifferent environment is converted to the dynamics within an active environment by the action of the transformation (14.39) In the case of totally antagonistic innovation diffusion, i.e. in the case of the existence of the interaction potentials aj the redistribution process transforms the generalized logistic growth (14.36) into a balanced dynamics of the type (14.40) In the case of the general innovation diffusion dynamics (14.22), transformation (14.39) provides the general vectorial equation of the multinomial innovation diffusion process within an active environment:
INNOVATION DIFFUSION THEORY 251
(14.41) The transformation (14.39) converts the competitive exclusion equilibria eβ the dynamics into the equilibrium states
i
of
(14.42)
which are the vector columns of the redistributional Markov matrix M. Moreover, transformation (14.39) preserves the stability of the corresponding equilibria, i.e. (14.43) Therefore, the qualitative picture of asymptotic stability of the equilibria (14.43) is given by the following scheme:
(14.44)
An important example of the multinomial innovation diffusion process is that of innovation diffusion dynamics with the following external redistribution (Sonis 1986b):
(14.45)
where The innovation diffusion dynamics associated with it is given by the system of differential equations (14.46) which is a direct expansion of the Pearl-Reed logistic equation (14.11). The external intervention (14.45) presents the ‘trapping’ properties of the environmental niche associated with each innovation i which is measured by the portion si of adopters belonging to this niche; the difference wi–si is the additive
252 M.SONIS
portion of adopters selecting a new innovation due to the interaction with adopters from the ‘active’ population outside the environmental niche (Sonis 1986b). GENERALIZATION OF THE MULTINOMIAL INNOVATION DIFFUSION PROCESS AND THE EXPANSION TO DYNAMIC INDIVIDUAL CHOICE THEORY To add socio-economic and cultural explanatory variables to innovation diffusion models we need a multidimensional space P spanned by space-time dimensions and by dimensions corresponding to the socio-economic characteristics of adopters and their active environment. Eventually, this leads to the following expansion of the log-linear diffusion dynamics: (14.47) The relative distribution of n competitive innovations at each point p of the parameter space P requires the distribution vector
(14.48)
and the functions which are nonlinear functions depending on the innovation i, the direction r in space P, and the distribution It can be shown (Sonis 1984, 1986b) that for each i there is a scalar potential Vi (p) such that (14.49) and that the system (14.47) is equivalent to the system (14.50) which is the direct expansion of the system of log-linear equations (14.21). The expansion (14.50) is also characterized by an antisymmetric cumulative interaction matrix (14.51) and by the existence of competitive exclusion equilibria eβ k. The various types of competition between innovations it involves can also be interpreted in terms of antagonistic games. Moreover, the following fundamental formulas are equivalent to the system of differential equations (14.47):
INNOVATION DIFFUSION THEORY 253
(14.52) These formulas are the expansion of multinomial generalized logistic growth (14. 36). They are identical with logistic growth if the scalar potentials Vi are independent of the distributions of adopters between innovations. Now we will present the expansion of the conceptual framework of the multinomial innovation diffusion process based on systems (14.50) and (14.52). The most important property of the fundamental formulas (14.52) is the fact that the portion yi(p) of the adopters of innovation i can be considered as the frequency of an individual’s selection of innovation i as an alternative. From this viewpoint the interaction potentials Vi(p) can be interpreted as the individual’s choice utilities for the alternative innovation i. Moreover, the interpersonal interactions are the utilities of switching from innovation i to innovation j, and ′ Vi/′ r are the dynamic marginal utilities representing the expectation of future gain from a change in location or in socio-economic environment. Equation (14.52) analytically resembles the static multinomial random utility choice logit model (Domencich and McFadden 1975). Therefore the dynamics (14.50) and (14.52) can be considered as deterministic dynamic expansions of the logit model. The conceptual basis for the derivation of the static logit model is the principle of an individual’s utility maximization. Our deterministic dynamic counterpart of the logit model necessitates exchanging the principle of an individual’s utility maximization for a more appropriate choice principle, because an individual compares alternative i with all other alternatives j, not only by comparing the cumulative utilities Vi and Vj, but also by comparing the dynamic marginal utilities ′ Vi/′ r and ′ Vj/′ r; moreover, the utilities Vi are dependent on the distribution of adopters of different innovations, and the expression
gives the transitional expected growth in utility and the degree of influence of adopters of alternative innovations j upon the decision to change from alternative i to j. In the new frame of reference an individual is not a homo economicus choosing among alternative innovations on the basis of utility maximization, but rather a social creature, whose imitative learning behavior reflects the influence of, and interaction with, other individuals. EXPANSION OF THE PRINCIPLE OF UTILITY MAXIMIZATION TO THE HAMILTONIAN VARIATIONAL PRINCIPLE OF STATIONARITY FOR DYNAMIC CHOICE PROCESSES The conceptual framework for the expanded form of the principle of an individual’s utility choice connects the dynamics of innovation spread with the
254 M.SONIS
ecological competitive dynamics of multispecies conservative ecological associations. The simplest case of the log-linear dynamics (14.21)
is similar to the Volterra conservative ecological dynamics with zero self-growth (Scudo and Ziegler 1978). It describes the redistribution of relative populations yi of different species caused by multispecies interaction with a zero total interaction. Volterra derived the conservative ecological dynamics from the Hamiltonian-type variational principle, which means that the ecological system evolves preserving a constant value of a quantity Volterra called ‘Vital action’. Let us expand Volterra’s ideas to the case of log-linear dynamics (14.21) (see Dendrinos and Sonis 1986). The analogues of Volterra’s ‘quantities of life’ are cumulative portions of adopters of the innovation i: (14.53) For such cumulative portions of adopters, the variational integral is constructed thus: (14.54) The integral (14.54) plays the role of a welfare function arising from the interaction between adopters of different innovations, and governs the dynamics of an individual’s choice of alternative innovations (14.21). If such a welfare function is optimized its first variation vanishes, giving rise to the system of Euler differential equations, which in our case represent log-linear relative dynamics (14.21) (see Dendrinos and Sonis 1986). The rationale of this variational principle lies in the fact that two processes are giving rise to the individual’s choice dynamics: (a) the process of interaction between adopters of different innovations, which is presented by the cumulative interaction
and (b) the process of the accumulation of disparities in the choice of different innovations, which is presented by the cumulative dynamic entropy
It is possible to show (see Dendrinos and Sonis 1986) that for actual individual choice dynamics (14.21), the cumulative interaction and the cumulative dynamic entropy balance each other:
INNOVATION DIFFUSION THEORY 255
Figure 14.6 Interconnections between diffusion of competitive innovations and individual utility choice within an active environment
This balance principle represents the dynamic deterministic expansion of an individual’s utility maximization principle. It is important to note that the balance principle can also be extended to the case of temporarily stable external interventions of an active environment. CONCLUSION The purpose of this paper is twofold. First, it is argued that Casetti’s expansion method is a global meta-theoretical approach based on the stage-by-stage transfer from the interdisciplinary approach to the integrative one. The expansion method consists of recursive loops, each of which includes two dual operational phases: the expansion of an initial analytical model into a terminal model; and the extension of the conceptual rationale of the initial model to the conceptual framework of the terminal model. Second, the expansion method is demonstrated by an application to innovation diffusion theory. Successive extensions of innovation diffusion theory by analytical expansions of models of innovation spread and by the concurrent expansion of their conceptual frameworks are portrayed in Figure 14.6. These result in dual interconnections between the diffusion of competitive innovations and an individual’s utility choice of alternative innovations within an active environment (Sonis 1984, 1986a). This scheme thus represents the unification of geographical theory of innovation spread, economic utility choice theory, and
256 M.SONIS
ecological competition theory—each of which had previously developed in isolation from the others.
APPENDIX 14.1
This appendix shows the formal scheme of the dual expansion of the S-shaped innovation growth model to the model of competition between adoption and nonadoption within an active environment. Iteration 1 Initial model
or
Interpretation and conceptual framework y(t) is a portion of adopters of the innovation at time t. Potential users of an innovation become adopters as a result of direct personal contacts with adopters. The co-influence parameter a is the relative proportion of contacts generating an actual innovation adoption and, simultaneously, is a measure of effectiveness of the transfer of information between adopters and nonadopters. Direct generalization and analytical metamorphosis of initial model or
INNOVATION DIFFUSION THEORY 257
Terminal model
or
Reinterpretation and extension of the conceptual framework y1 and y2 are the portions of adopters and nonadopters of the innovation at time t. Matrix A represents the influence of adopters on nonadopters and vice versa. The co-influence parameter a is the value of an antagonistic zero-sum game describing the competition between adoption and non-adoption. The competitive exclusion principle holds: if the adoption is a winner in the zero-sum game against non-adoption, then in the long run all adopters will obtain the innovation. Iteration 2 Initial model
or
Expansion and analytical metamorphosis of initial model
or
or
258 M.SONIS
Terminal model
where
Reinterpretation and extension of conceptual framework The additional redistribution of an innovation between adopters and nonadopters takes place under the intervention of an active environment. The portion y1 is divided into two parts: s+ y1 which will continue to adopt, and (1–s+)y1 which will reject the adoption. Analogously, the portion y2 of nonadopters is divided into adopters s−y2 and nonadopters. As a result, the adoption niche is revealed and expanded from the initial level s− to the saturation level s+, and the niche of nonadopters is established with a size 1–s+.
APPENDIX 14.2
This appendix shows the formal scheme of the expansion from the timedependent change to the space-time directional derivative. Iteration 1 Initial model
INNOVATION DIFFUSION THEORY 259
Terminal model
Reinterpretation y(t, s) is a portion of adopters of an innovation at time t and at distance s from a center of innovation.
Iteration 2 Initial model
Expansion
Terminal model
Iteration 3 Initial model
260 M.SONIS
Reinterpretation u(t, s) is a cumulative gain of adoption in the antagonistic zero-sum game against nonadoption. Analytical metamorphosis Introduction of space-time directional derivatives. Terminal model
or
where
Iteration 4 Initial model
or
Expansion Transfer from one diffusion center to the points in geographical space. Terminal model
INNOVATION DIFFUSION THEORY 261
Iteration 5 Initial model
Expansion Introduction of the action of an active environment with the help of the spacetime-dependent Markovian matrix M(t, x, y). Terminal model
APPENDIX 14.3
This appendix presents the formal scheme of expansion of one innovation model to the model of diffusion of a set of competitive innovations. Initial model
262 M.SONIS
Direct generalization
Terminal model
Reinterpretation Innovations are considered as choice alternatives and the innovation’s interactions aij as expectations of the gain in the antagonistic zero-sum games between the alternatives. REFERENCES Casetti, E. (1972) ‘Generating models by the expansion method: applications to geographical research’, Geographical Analysis 4: 81–91. ——(1986) ‘The dual expansion method: an application to evaluating the effects of population growth on development’, IEEE Transactions on Systems, Man, and Cybernetics SMC–16:29–39. Casetti, E. and Semple, R.K. (1969) ‘Concerning the testing of spatial diffusion hypotheses’, Geographical Analysis 1:254–9. Casetti, E., King, L. and Williams, F. (1972) ‘Concerning the spatial spread of economic development’, in W.P.Adams and F.M. Helleiner (eds) International Geography, pp. 897–9, Toronto: University of Toronto Press. Dendrinos, D.S. and Sonis, M. (1986) ‘Variational principles and conservation conditions in Volterra’s ecology and in urban relative dynamics’, Journal of Regional Science 26:359–77. —— and ——(1990) Chaos and Socio-Spatial Dynamics, New York: Springer Verlag. Domencich, T. and McFadden, D. (1975) Urban Travel Demand: A Behavioral Analysis, Amsterdam: North-Holland. Fischer, M.M., Haag, G., Sonis, M. and Weidlich, W. (1990) ‘Account of different views in dynamic choice processes’, in M.M.Fischer, P.Nijkamp and Y.Y.Papageorgiou (eds) Spatial Choices and Processes, North Holland, 17–47. Hanham, R.Q. and Brown, L.A. (1976) ‘Diffusion waves within the context of regional economic development’, Journal of Regional Science 16:65–71. Hardin, G. (1960) ‘Competitive exclusion principle’, Science 131: 1292–8. Hotelling, H. (1929) ‘Stability in competition’, Economic Journal 39: 41–57.
INNOVATION DIFFUSION THEORY 263
Scudo, F. and Ziegler, J. (eds) (1978) The Golden Age of Theoretical Ecology 1923–1940, New York: Springer Verlag. Sonis, M. (1980) ‘Locational push-pull analysis of migration streams’, Geographical Analysis 12:80–97. ——(1983a) ‘Competition and environment—a theory of temporal innovation diffusion’, in G.A.Griffith and A.Lea (eds) Evolving Geographical Structures, pp. 99–129, The Hague: Martinus Nijhoff. ——(1983b) ‘Spatio-temporal spread of competitive innovations— an ecological approach’, Papers of the Regional Science Association 52:159–74. ——(1984) ‘Dynamic choice of alternatives, innovation diffusion and ecological dynamics of the Volterra-Lotka model’, London Papers In Regional Science 14: 29–43. ——(1985) ‘Unifying principles in spatial analysis’, Paper presented at the Conference on Scientific Geography, Athens, Georgia. ——(1986a) ‘A unified theory of innovation diffusion, dynamic choice of alternatives, ecological dynamics and urban/regional growth and decline’, Ricerche Economiche 15:696–723. ——(1986b) ‘Qualitative asymptotic stability of equilibria for relative spatial dynamics’, Modeling and Simulation 17:209–13. ——(1988) ‘Relationships between spatial and economic analysis: a methodological discussion’, Horizons, Studies in Geography, University of Haifa 23–4:115–22.
15 SPATIAL DEPENDENCE AND SPATIAL HETEROGENEITY: MODEL SPECIFICATION ISSUES IN THE SPATIAL EXPANSION PARADIGM Luc Anselin The spatial expansion paradigm developed by Casetti (1972, 1986) consists of a flexible modeling strategy that takes into account contextual variation of parameters and functional forms. In its operational implementation, such a strategy results in a specification search which directs the evolution from the initial model to a terminal model. To the extent that this search is guided by statistical and econometric techniques, its validity is conditional upon a proper application of this methodological framework. In empirical regional science and geography, the application of standard tests and estimation methods is often hampered by the spatial characteristics of the data. This has led to a specialized methodology of spatial statistics and spatial econometrics. Two particular aspects of the data that merit special consideration are spatial dependence and spatial heterogeneity. The former is well known and has resulted in a number of tests for spatial autocorrelation, and led to various estimation procedures for spatial process models (see, for example, Cliff and Ord 1973, 1981; Upton and Fingleton 1985). Spatial heterogeneity is less familiar, and can be considered as a special case of heteroskedasticity or can be incorporated into estimation methods that include random and varying coefficients, switching regressions, and other forms of structural instability (e.g. Anselin 1988a, b). In this paper, I focus on the importance of spatial dependence and spatial heterogeneity in the specification search that is associated with the move from the initial model to the terminal model in the spatial expansion paradigm. I will pay particular attention to some methodological issues that should be taken into account when carrying out significance tests for the coefficients in expanded models. In addition, I consider issues associated with checking for various forms of model misspecification, such as heteroskedasticity and spatial error autocorrelation. The paper is organized around a simple taxonomy of spatial expansions that distinguishes between different degrees of knowledge about the precise expression for the expansion. This leads to various forms of heteroskedasticity and has implications for the validity of tests for spatial error autocorrelation. The
L.ANSELIN 265
main objective of the paper is to illustrate the problems associated with standard methods of statistical inference and to outline appropriate alternative approaches. The remainder of the paper consists of six sections. First, I briefly present the main concepts and introduce the distinction between spatial dependence, spatial heterogeneity, and spatial expansion. I then outline a simple taxonomy for spatial expansion specifications, which is followed by a discussion of some general model specification issues. The subsequent section focuses on the role of heteroskedasticity. Alternative ways in which significance tests can be carried out are presented, and some tests for the presence of heteroskedasticity in the expansion model are outlined. In the following section, tests for spatial error autocorrelation are considered more closely. A distinction is made between different forms of the expansion model, and a new test is introduced, based on the Lagrange multiplier principle. The paper closes with practical recommendations for model specification in the spatial expansion paradigm. SPATIAL DEPENDENCE, SPATIAL HETEROGENEITY, AND THE SPATIAL EXPANSION APPROACH In order to facilitate the discussion in the remainder of the paper, I will first briefly introduce the concepts of spatial dependence, spatial heterogeneity, and spatial expansion in formal terms. For simplicity of exposition, I will limit the scope to a linear regression context. Spatial dependence is the situation where a phenomenon observed at one point in space is co-determined by its realization in other locations. As in Tobler’s first law of geography, everything is related to everything else, but closer things are more so (Tobler 1979). Formally, this dependence can be expressed in a variety of spatial processes, e.g. as where y is a dependent variable of interest, observed at location i, yJ is a vector of realizations of y at other locations J in the system, x is a vector of explanatory variables, β is a stochastic error term, and f is a functional relation, often taken to be in a linear form (e.g. a spatial autoregression or a spatial moving average). In a narrower view of spatial dependence, it is limited to the stochastic error term in a regression model. In that case it leads to model misspecification, often called spatial error autocorrelation. In formal terms, the lack of independence between error terms at locations i and j implies that and thus results in a special case of a non-spherical disturbance term. As is well known, this may lead to misleading inference about the significance of the regression coefficients. In addition, spatial error autocorrelation also negatively affects the validity of a wide range of standard model diagnostics in applied regression analysis, as illustrated by Anselin and Griffith (1988).
266 SPECIFICATION ISSUES IN THE EXPANSION PARADIGM
Spatial heterogeneity pertains to the situation where a phenomenon is not homogeneous over space, owing to the particular characteristics of each location or as a result of other forms of contextual variation. Formally, it is expressed as structural instability in functional forms, in model parameters, and in the stochastic error term. If heterogeneity in the error term implies a nonconstant variance, it is called heteroskedasticity, which is a common reason for the existence of a nonspherical error term. This is the main form of spatial heterogeneity considered in this paper. A complicating factor for model specification in empirical regional science and geography is the joint occurrence of spatial error autocorrelation and heteroskedasticity (e.g. Anselin 1988a, c; Anselin and Griffith 1988). A spatial expansion produces a varying coefficient specification that can be formally expressed without loss of generality in the context of a linear regression equation with one explanatory variable and a constant term. The initial model (15.1), expansion equation (15.2), and terminal model (15.3) are then, for each observation i, (15.1) (15.2) (15.3) where zi is a vector of variables that determine the expansion, β is a vector of corresponding parameters, and f is a functional relationship that expresses the form of the contextual variation in the β 1 coefficient. In a typical application, the f is taken to be a linear form, so that equations (15.2) and (15.3) simplify to (15.4) (15.5) with, in this example, z1 and z2 as expansion variables. In some applications of the expansion method, the z variables typically consisted of trend surface terms in the coordinates of the locations for the observations (Jones 1983). Since this can easily lead to problems with multicollinearity, recent approaches have included more complex forms, such as principal components in the orthogonal expansion method of Casetti and Jones (1987a, b). A TAXONOMY OF SPATIAL EXPANSION SPECIFICATIONS From a model specification standpoint, three different situations can be distinguished in a spatially expanded regression model, depending on the extent to which the correct expansion variables have been included. In the first case, which I will call the standard expansion (SE), the exact form of the expansion (15.4) is known a priori. In other words, spatial theory provides
L.ANSELIN 267
a strong basis for the choice of the variables z1 and z2, and dictates the exact form of the expansion equation (in vector notation): (15.6) In the second case, this assumption is no longer tenable, and an error term is introduced. I will call this a random expansion (RE), in view of the presence of a stochastic error term in the expansion equation. One situation where this may occur on theoretical grounds is in the application of orthogonal expansions, where the original expansion variables are replaced by a smaller number of principal components. Even if the originally included expansion variables conformed to the SE case, the use of principal components introduces a random error. Formally, the expansion equation becomes (15.7) where μ is a stochastic error term, with zero mean and fixed variance. The third case is a generalization of the random expansion in that the stochastic error term in the expansion no longer needs to have zero mean. This would happen in a situation where the expansion equation is misspecified, e.g. when relevant expansion variables have been ignored. I will therefore call this case a misspecified expansion (ME). Formally, it can be expressed as: (15.8) (15.9) where β is a stochastic error term, with E[β ]=0, and v is the overall specification error, which does not have zero mean unless E[z2]=0. Moreover, z2 or any other ignored expansion variables are no longer necessarily known. GENERAL SPECIFICATION ISSUES IN THE SPATIAL EXPANSION METHOD Spatial heterogeneity in the form of a spatial expansion of the regression coefficients has a number of consequences for estimation, significance testing, and model specification. Most importantly, if the terminal model is indeed the correct specification, the estimates for the coefficients in the initial and any intermediate model will be biased. As a consequence, inference based on these estimates will be suspect. This is a special case of the familiar omitted variable problem in regression analysis. Indeed, as illustrated by Anselin (1988b), if the terminal model is correct, the expected value of the ordinary least squares (OLS) estimates in the misspecified initial model becomes (15.10) where β is a vector of the β k coefficients in (15.6). The elements of the matrix Z are products of the elements of X with the expansion variables (the z1 and z2 in
268 SPECIFICATION ISSUES IN THE EXPANSION PARADIGM
(15.6)). Therefore, X and Z will not be orthogonal, and the second term in (15.10) is the extent of the bias. The omitted variables also affect the indications given by standard misspecification tests. As is well known in regression analysis, a significant value for a test against serial correlation or against heteroskedasticity is not necessarily solely an indication of a nonspherical error term. As pointed out by, for example, Thursby (1981, 1982), Knottnerus (1985), and Godfrey (1987), these tests also have power against functional misspecification, such as omitted variables. In Anselin (1988b), this is shown to hold also in a spatial context. For example, an indication of spatial error autocorrelation may be misleading, in that not spatial dependence but spatial heterogeneity (in the form of a spatial expansion) is the source of the misspecification. This issue is further considered in a subsequent section. An additional complication associated with the specification search in empirical applications of the expansion method is the potential for data mining when no strong a priori theoretical model exists (e.g. Learner 1978; Lovell 1983; Anselin 1988c). In particular, when the choice of the terminal model is based on multiple comparisons of alternative specifications, the formal probabilistic framework for inference may no longer be valid. In applied work, this should be avoided as much as possible by implementing rigorous model validation procedures and by adjusting significance levels for multiple comparisons. The latter can be achieved by using Bonferroni bounds or other approximations (e.g. Savin 1980). SPATIAL EXPANSION AND HETEROSKEDASTICITY In an applied context, it is unlikely that the rigorous requirement of perfect knowledge needed in the standard expansion model would be satisfied. Therefore, the random expansion and misspecified expansion may be more realistic perspectives. In both these cases, the presence of an error term in the expansion equation will lead to a heteroskedastic error in the terminal model. However, the form of the resulting heteroskedasticity is different in each case. This has implications for the way in which significance tests for the model parameters can be properly carried out. It also determines the type of test for the presence of heteroskedasticity that is appropriate. Heteroskedasticity in the random expansion model In the random expansion model, a terminal model results that is similar to the Hildreth-Houck form of a random coefficient specification (Hildreth and Houck 1968; Amemiya 1985). Formally, substitution of (15.7) in the initial model yields
L.ANSELIN 269
(15.11) or (15.12) (15.13) where β is a heteroskedastic error term, with E[β ]=0 and, for (15.14) The error variance (15.14) can easily be extended to the case where many explanatory variables are included in the regression. In general, it consists of a constant variance and a sum of squares of the expanded explanatory variables, weighted by the error variance in the associated expansion equation. It should be noted that an expansion of the constant term would result in a variance component that cannot be identified separately from the error variance similar to the result in a random coefficient model. Although OLS will still yield unbiased estimates for the parameters in the terminal model (15.12), significance tests based on the standard OLS variance estimate will be misleading. A familiar alternative approach is to implement an estimated generalized least squares (EGLS) procedure and to base the indications of parameter significance on the variance matrix (15.15) where Ω –1 is replaced by a consistent estimate derived from the error variance components. The latter can be estimated by means of a number of different iterative approaches, most of which also yield maximum likelihood results (for overviews see, for example, Raj and Ullah 1981; Amemiya 1985). All these approaches assume perfect knowledge of the heteroskedastic error components, which is satisfied in the RE model. It is important to note that the standard t and F tests for the significance of the coefficients are no longer valid. In the context of a spatial expansion specification search this will affect the indication of contextual variation reflected by the estimates of the expanded variables. Correct inference should be based on the EGLS results and is asymptotic in nature, in contrast with the exact F test. As is well known and discussed in more detail by Anselin (1988c), this asymptotic approach may be misleading when only a small number of observations are available, and finite sample corrections may be useful (as suggested by Rothenberg 1984). A simple asymptotic test for the extent of heteroskedasticity that may be present in an RE model can be based on the Breusch-Pagan approach (Breusch and Pagan 1979; Anselin 1988a). This test is based on OLS estimation results and is of the form (15.16) where f is a vector with elements ei is the OLS residual associated with observation i, β 2 is the error variance based on OLS residuals (= e'e/N), and Z is
270 SPECIFICATION ISSUES IN THE EXPANSION PARADIGM
an N by p+1 matrix which consists of a constant term and the squared explanatory variables that have expanded coefficients. In the RE model, this asymptotically distributed statistic will have degrees of freedom equal to one less than the number of explanatory variables in the initial model. Heteroskedasticity in the misspecified expansion model In the misspecified expansion model the heteroskedasticity is of a more complex form. Indeed, the error term v in (15.8) also contains the ignored expansion variables and no longer necessarily leads to a standard random coefficient form. Given the ignorance about the correct specification, a more appropriate estimation and testing strategy may be based on the robust procedures outlined in Anselin (1990). These approaches are based on OLS estimation, but use a heteroskedasticity-robust form of the coefficient variance matrix to derive significance tests. Three procedures seem particularly suitable in the context of the ME model. A first is based on the results of White (1980, 1984), which provide a means to obtain a consistent estimate for the OLS variance that is robust to heteroskedasticity of unknown form, as well as to other sources of misspecification. Such an estimate that can be used in asymptotic tests of significance (Wald and Lagrange multiplier tests) is presented in MacKinnon and White (1985) as (15.17) where S is a matrix of squared OLS residuals, adjusted by a term with kii as the ith diagonal element in the idempotent matrix A second approach consists of the application of the heteroskedasticity-robust tests in regression directions, developed by Davidson and MacKinnon (1985). These procedures consist of testing a null hypothesis β =0 in (15.18) where Z is an N by R matrix, β is an R by 1 column of parameters, and β is an independent but heteroskedastic error term with bounded for all i. In the context of the ME model, the Z would contain all included expanded variables. The test statistic suggested by Davidson and MacKinnon is of the form (15.19) where M is the familiar projection matrix and Ω (e) is a diagonal matrix with the squared OLS residuals (under the restricted model, i.e. under H0). The test statistic can also be obtained from a simple auxiliary regression, as discussed in more detail by Anselin (1990). This approach can be applied to a wide range of situations and can easily be constructed for many alternative expansions, based only on the results of an OLS estimation of the initial model. Of course, the critical significance levels in such
L.ANSELIN 271
a specification search would need to be corrected for the multiple comparisons, e.g. by using the appropriate Bonferroni bounds. Also, it should be kept in mind that the properties of the test are asymptotic. A third strategy is to take a nonparameteric approach. Pseudo-significance tests for the coefficients in the expanded variables can be based on the variance estimates obtained from a jack-knife procedure. This resampling technique consists of repeated estimation (for a total of N times) on a data set from which observations are dropped one at a time. As shown in Efron (1982), an estimate for the cc variance of the OLS estimate can be obtained from (15.20) where b(i) is the estimate on the data set without observation i. Alternatively, as pointed out in MacKinnon and White (1985), this can be expressed as (15.21) where Ω * is a diagonal matrix with as elements the adjusted squared residuals is the same as for the White approach above, and e* is a vector with the square root of the diagonal elements of Ω *. In order to assess the extent of the problem before embarking on these more complex approaches, a test for the presence of heteroskedasticity of unspecified form could be carried out. The well-known White (1980) test consists of computing N times the R2 measure of fit (not corrected for the mean) in an auxiliary regression of the squared OLS residuals on all explanatory variables, their squares, and cross-products. The statistic is asymptotically distributed as with degrees of freedom equal to the number of nonredundant explanatory variables in the auxiliary regression (not counting the constant). Although this test is designed primarily to detect heteroskedasticity, it also has power against other forms of misspecification, such as omitted variables, which are present in the ME model. SPATIAL EXPANSION AND SPATIAL AUTOCORRELATION Although the spatial expansion method deals with spatial heterogeneity, it is often implemented in contexts where spatial dependence may be present as well. The most common situation is that where a spatial autoregressive or other spatial process underlies the regression disturbance. Two issues need to be distinguished in this context. The first relates to the extent to which standard tests for autocorrelation also have power against the misspecification present in the initial model in the expansion paradigm. In other words, it is possible that a commonly carried out procedure, such as a Moran test on the regression residuals, may point to spatial dependence, when in fact a
272 SPECIFICATION ISSUES IN THE EXPANSION PARADIGM
spatial expansion of the model is the main culprit of the misspecification. This addresses an issue raised by Jones (1983) and Casetti and Jones (1987b), who pointed out in an empirical example how a spatial expansion of regression coefficients resulted in a Moran test that changed from significant to insignificant. Although in their case spatial expansion eliminated spatial autocorrelation, this is not always necessarily the case. In Anselin (1988b) and Anselin and Griffith (1988), it is shown in detail how the expected value of the Moran coefficient in the initial model, when in fact the terminal model is the correct specification, is biased towards rejection of the null hypothesis. Formally, it can be shown that the expected value of Moran’s I for the regression residuals e, i.e. e' We/e'e, becomes (15.22) where is the number of explanatory variables included in X, tr is the trace operator, and the other variables are as in (15.10). Since the first term in this formulation is positive, E[I] will exceed the result for a properly specified model, which is the second term in the expression (e.g. Cliff and Ord 1981; Upton and Fingleton 1985). Intuitively, since too small a value is subtracted from the I measure to obtain the standardized z coefficient, the Moran test in the misspecified model may be more likely to reject the null hypothesis of no spatial autocorrelation. The extent to which this is the case depends directly on the degree of coefficient instability (the Zβ ). In other words, the relation between spatial expansion and spatial autocorrelation has to be stated carefully. The issue is not whether spatial expansion in the parameters eliminates spatial autocorrelation in the error terms. Rather, it turns out that the Moran test may have power against misspecifications of the form implied by spatial coefficient instability. This is similar to results obtained in a time series context for the Durbin Watson test (e.g. Thursby 1981). A second important issue pertains to the extent to which tests for the presence of spatial error autocorrelation are affected by the heteroskedasticity in the spatial expansion model. As demonstrated in Anselin (1988a, b), special test statistics need to be developed for situations where multiple sources of misspecification are present in the regression model. This issue affects the random and misspecified expansion models differently, and therefore needs to be discussed separately for each. Testing for spatial error autocorrelation in the random expansion model In the random expansion model, the form of the heteroskedasticity is known, as expressed by the x2 variables in (15.14). At first sight, it would seem that the Lagrange multiplier test for spatial error autocorrelation in the presence of heteroskedasticity, given in Anselin (1988a), would apply to this situation.
L.ANSELIN 273
However, the particular form for the nonspherical error in the random expansion model with spatial error autocorrelation does not correspond to the general specification used in the previously presented Lagrange multiplier test. Therefore a separate derivation is needed. With the same notation as before, a spatial autoregressive error term can be introduced in (15.11) as or (15.23) where W is the usual spatial weight matrix and β is an independent and homoskedastic random error. Substitution of (15.23) into the error β in (15.13) yields (15.24) i.e. a disturbance term that incorporates both heteroskedasticity (μ x) and spatial dependence. If the errors β and μ are assumed to be independent, the overall disturbance variance matrix becomes (15.25) where for notational simplicity, ′ 2 is the variance of β , and H is a diagonal matrix with as elements expressions such as (15.14). A more detailed derivation of the test is presented in the Appendix. The resulting statistic is (15.26) where V is the estimated error variance matrix from an EGLS procedure, and e is the associated residual The expression in (15.26) shows a definite similarity to a traditional Moran test, since V−1/2e would be the ‘corrected’ residual that is typically associated with iterative EGLS estimators for heteroskedastic models. Therefore, the numerator in (15.26) is similar to a cross-product of an (adjusted) residual vector with its spatial lag, inversely weighted by the variance. The denominator is an expression in matrix traces needed to scale this pseudo-Moran coefficient in order to achieve the proper asymptotic distribution. It is similar to trace factors found in other Lagrange multiplier tests for spatial autocorrelation (Burridge 1980; Anselin 1988a, b). This statistic can be computed from the output of a standard regression package, and does not necessitate a designated nonlinear estimation strategy. It is particularly appropriate to test for spatial autocorrelation in the orthogonal expansion model, where it provides a rigorous alternative to the ad hoc Moran statistic.
274 SPECIFICATION ISSUES IN THE EXPANSION PARADIGM
Testing for spatial autocorrelation in the misspecified expansion model As pointed out in the previous section, the heteroskedasticity likely to be present in the misspecified expansion model is of unknown form. Therefore, the Lagrange multiplier test introduced above cannot be applied. An alternative consists of a heteroskedasticity-robust test for spatial error autocorrelation, presented in Anselin (1990). This approach follows from an application of the Davidson-MacKinnon tests in regression directions. It exploits the equivalence of spatial error autocorrelation with a nonlinear spatial autoregressive model, the so-called common factor approach or spatial Durbin model (e.g. Burridge 1981; Blommestein 1983; Bivand 1984; Anselin 1988b, c): (15.27) The omitted variables in the terminology of (15.18) are Wy and WX, i.e. an N by K+1 matrix of spatially lagged variables. As shown in more detail in Anselin (1988d), a straightforward extension of the Davidson-MacKinnon results to this case yields the test statistic as
where y–f are the OLS residuals, Ω (e) is a diagonal matrix of squared OLS residuals, a projection matrix, with Q as a matrix of instruments, and F(β ) and F (β ) are partial derivatives, evaluated under the null hypothesis of β =0. For the spatial model, these partial derivatives simplify to
with b as the OLS estimate. The expression is none other than a vector of spatially lagged OLS residuals, or We. Even though instrumental variables are needed to construct the test, no actual instrumental variables estimation is carried out. The test is asymptotic and its finite sample properties have not been fully explored. CONCLUSION The spatial expansion method is a flexible procedure to account for spatial heterogeneity in regression analysis. However, a careful specification search that proceeds from the initial model to the terminal model needs to take into account a variety of complicating factors. Foremost among these are the effects that are caused by heteroskedasticity and spatial autocorrelation. It should be clear from the foregoing discussion that ignoring these effects may result in misleading inference and erroneous conclusions. The various tests and inference strategies outlined in this paper provide a means to deal with these specification problems in realistic contexts. Given the
L.ANSELIN 275
increasing popularity of the spatial expansion method for empirical research in geography and regional science, it is important that this realism be taken into account. To avoid some common pitfalls in applied work, the following general guidelines should be kept in mind. 1 A specification search is more reliable if it proceeds from an overspecified model to a more parsimonious model. Going from a large to a small model may result in inefficient estimates, but avoids the bias inherent in the estimation of an underspecified model. The search should be based on rigorous model specification tests and model validation techniques. The proper adjustment of significance levels should be implemented whenever multiple comparisons are carried out. 2 The standard expansion specification is likely to be overly optimistic about the prior knowledge of the analyst. The two alternative forms introduced in this paper—the random expansion and the misspecified expansion— reflect a more realistic view of specification problems. The random expansion model is particularly appropriate for orthogonal expansions. In all other circumstances, the misspecified expansion model should be taken as a safer point of departure, and robust inference will tend to be more reliable. 3 The interpretation of the significance of the model coefficients should be carried out with care. Since heteroskedasticity and/or spatial autocorrelation are likely, tests for the presence of these forms of misspecification should be carried out. The Lagrange multiplier and robust statistics outlined in this paper provide a practical means to implement this. In this respect, it is important to keep in mind that spatial expansion of regression coefficients does not eliminate spatial error autocorrelation. 4 Inference after estimation by EGLS (e.g. to take into account known heteroskedasticity or spatial autocorrelation) is asymptotic in nature, and may not be reliable for a small number of observations. Whenever possible, finite sample corrections should be considered, or the robust jack-knife approach implemented as an alternative. As a general rule, robust inference will be more reliable whenever there exists substantial doubt about the correct specification. On the other hand, when there are strong theoretical grounds to motivate the choice of a particular specification, the robust approach will result in a loss of efficiency and power.
276 SPECIFICATION ISSUES IN THE EXPANSION PARADIGM
APPENDIX: A LAGRANGE MULTIPLIER TEST FOR SPATIAL ERROR AUTOCORRELATION IN THE RANDOM SPATIAL EXPANSION MODEL
As outlined in more detail in Anselin (1988a, b), a Lagrange multiplier test is based on maximum likelihood estimation of a model under the null hypothesis. A particular test can be derived by partitioning the parameter vector β into a group of parameters that are involved in the null hypothesis and a group with all other parameters: β =[β 1|β 2]. The relevant expressions for the test are computed from the score and a partitioned information matrix, based on the first and second partial derivatives of the likelihood function, and evaluated under the null hypothesis. The statistic itself is where d is the score vector and I11 is the partitioned inverse of the information matrix that corresponds to the coefficients involved in the null hypothesis, both evaluated under the null. The likelihood for the random spatial expansion model with spatial error autocorrelation is a special case of a regression model with a nonspherical error variance. The corresponding log likelihood is (ignoring constants) with β as the error and Ω as its variance. As shown in (15.25)
The Lagrange multiplier statistic is derived by obtaining the score for β , by finding the relevant partitioning of the information matrix for β , and by evaluating these expressions for β =0 (i.e. the null hypothesis). From a tedious but straightforward application of matrix calculus the score can be found as
Under the null hypothesis, the matrix Ω is a diagonal matrix with the standard heteroskedastic variance components as elements, say V. Also, for β =0, it follows that B=I, and thus the score simplifies to
L.ANSELIN 277
where e is the corresponding residual y–Xb. Moreover, since V−1 is diagonal and W has zero diagonal elements by convention, tr V−1W=0, and thus In models with spatial error dependence, the information matrix is typically not block diagonal between the parameter β and the other coefficients of the error variance (i.e. the variances of the heteroskedastic model). This would considerably complicate the derivation, were it not that under the null hypothesis of β =0, block diagonality results. Therefore, the relevant partitioned inverse reduces to a scalar, and is the inverse of the element of the information matrix that corresponds to β . This can be found as Under the null hypothesis, with B=I, and Ω =V this becomes Combining the results for the score and the information matrix yields the Lagrange multiplier test as
or
ACKNOWLEDGMENT The research behind this paper was supported in part by Grants SES–8600465 and SES–8721875 from the National Science Foundation. REFERENCES Amemiya, T. (1985) Advanced Econometrics, Cambridge, MA: Harvard University Press. Anselin, L. (1988a) ‘Lagrange multiplier test diagnostics for spatial dependence and spatial heterogeneity’, Geographical Analysis 20: 1–17. ——(1988b) Spatial Econometrics: Methods and Models, Dordrecht: Kluwer Academic. ——(1988c) ‘Model validation in spatial econometrics: a review and evaluation of alternative approaches’, International Regional Science Review 12:279–316. ——(1990) ‘Some robust approaches to testing and estimation in spatial econometrics’, Regional Science and Urban Economics 20: 141–63. Anselin, L. and Griffith, D. (1988) ‘Do spatial effects really matter in regression analysis?’, Papers, Regional Science Association 65: 11–34. Bivand, R. (1984) ‘Regression modeling with spatial dependence: an application of some class selection and estimation methods’, Geographical Analysis 16:25–37.
278 SPECIFICATION ISSUES IN THE EXPANSION PARADIGM
Blommestein, H. (1983) ‘Specification and estimation of spatial econometric models: a discussion of alternative strategies for spatial economic modeling’, Regional Science and Urban Economics 13:251–70. Breusch, T. and Pagan, A. (1979) ‘A simple test for heteroskedasticity and random coefficient variation’, Econometrica 47: 1287–94. Burridge, P. (1980) ‘On the Cliff-Ord test for spatial correlation’, Journal of the Royal Statistical Society B 42:107–8. ——(1981) ‘Testing for a common factor in a spatial autoregressive model’, Environment and Planning A 13:795–800. Casetti, E. (1972) ‘Generating models by the expansion method: applications to geographical research’, Geographical Analysis 4: 81–91. ——(1986) ‘The dual expansion method: an application for evaluating the effects of population growth on development’, IEEE Transactions on Systems, Man, and Cybernetics SMC–16:29– 39. Casetti, E. and Jones, J.P. (1987a) ‘Spatial aspects of the productivity slowdown: an analysis of U.S. manufacturing data’, Annals, Association of American Geographers 77: 76–88. —— and ——(1987b) ‘Spatial parameter variation by orthogonal trend surface expansions: an application to the analysis of welfare program participation rates’, Social Science Research 16: 285–300. Cliff, A. and Ord, J. (1973) Spatial Autocorrelation, London: Pion. —— and ——(1981) Spatial Processes: Models and Applications, London: Pion. Davidson, R. and MacKinnon, J. (1985) ‘Heteroskedasticity-robust tests in regression directions’, Annales De L’INSEE 59/60:183– 217. Efron, B. (1982) The Jackknife, the Bootstrap and Other Resampling Plans, Philadelphia, PA: Society for Industrial and Applied Mathematics. Godfrey, L. (1987) ‘Discriminating between autocorrelation and misspecification in regression analysis: an alternative test strategy’, Review of Economics and Statistics 69:128–34. Hildreth, C. and Houck, J. (1968) ‘Some estimators for a linear model with random coefficients’, Journal of the American Statistical Association 63:584–95. Jones, J.P. (1983) ‘Parameter variation via the expansion method with tests for autocorrelation’, Modeling and Simulation 14:853– 7. Knottnerus, P. (1985) ‘A test strategy for discriminating between autocorrelation and misspecification in regression analysis: a critical note’, Review of Economics and Statistics 67:175–8. Learner, E. (1978) Specification Searches: Ad Hoc Inference with Nonexperimental Data, New York: Wiley. Lovell, M. (1983) ‘Data mining’, Review of Economics and Statistics 65:1–12. MacKinnon, J. and White, H. (1985) ‘Some heteroskedasticityconsistent covariance matrix estimators with improved finite sample properties’, Journal of Econometrics 29:305–25. Raj, B. and Ullah, A. (1981) Econometrics: A Varying Coefficients Approach, New York: St Martin’s Press. Rothenberg, T. (1984) ‘Hypothesis testing in linear models when the error covariance matrix is nonscalar’, Econometrica 52:827– 42. Savin, N. (1980) ‘The Bonferroni and Scheffe multiple comparison procedures’, Review of Economic Studies 47:255–74.
L.ANSELIN 279
Thursby, J. (1981) ‘A test strategy for discriminating between autocorrelation and misspecification in regression analysis’, Review of Economics and Statistics 63: 117–23. ——(1982) ‘Misspecification, heteroskedasticity, and the Chow and Goldfeld-Quandt tests’, Review of Economics and Statistics 64:314–21. Tobler, W. (1979) ‘Cellular geography’, in S.Gale and G.Olsson (eds) Philosophy in Geography, pp. 379–86, Dordrecht: Reidel. Upton, G. and Fingleton, B. (1985) Spatial Data Analysis by Example, New York: Wiley. White, H. (1980) ‘A heteroskedastic-consistent covariance matrix estimator and a direct test for heteroskedasticity’, Econometrica 48:817–38. ——(1984) Asymptotic Theory for Econometricians, New York: Academic Press.
16 GENERATING VARYING PARAMETER MODELS USING CUBIC SPLINE FUNCTIONS Robert Q.Hanham
A standard assumption in spatial models such as (16.1) and in time series models such as (16.2) is that β and β are invariant over space and time, i.e. over i and t. This is a very restrictive assumption which may be quite unwarranted. There is now a great deal of evidence to show that the process (es) embodied in equations (16.1) and (16. 2) do indeed vary over space and time. A number of methods exist for discerning the presence of variable parameters and for building this fact into models such as those above. In this paper the method of cubic splines is proposed to generate models in which the parameters follow complex spatial or temporal paths. Details of the method are given in a later section. An illustration of the approach is provided using a time series model of regional unemployment. REGIONAL UNEMPLOYMENT RESPONSE MODEL One of the most common models of regional unemployment stems from the work of Brechling (1967). A typical formulation of the model is as follows: (16.3) where Ut is the unemployment rate in a given region at time t, Nt is the unemployment rate in the national economy at time t, and β and β are parameters which reflect the level of structural unemployment (value of Ut when Nt equals zero) and the sensitivity of regional unemployment to changes in national unemployment, respectively. This model is demand driven and the focus of attention is on the response of a regional economy to changes in aggregate demand. The parameters β and β are assumed to be constant through time, but are expected to, and evidence shows that they do, vary spatially. Equation (16.3) has been estimated in a variety of countries and at a variety of scales. A review of this research may be found in Clark (1980). Research has also indicated that the parameter β may vary with time. Campbell (1975) and Owen and Gillespie (1982) found this to be the case in
GENERATING VARYING PARAMETER MODELS 281
northern England by estimating an equation similar to (16.3) for various subperiods. Using a variety of tests, Dunn (1982) also found instability in the β estimate for each of a number of regions in southwest England and South Wales. Dunn was also able to track the time path of the β estimates for these regions by means of a moving window regression, essentially akin to a moving average procedure but applied in a regression context in which ordinary least squares estimates are successively obtained by adding and discarding time observations at the beginning and end of a fixed length of time m. The set of T-m estimates was plotted and indicated a fairly complex time path for β , suggesting that regional economies are not consistently responsive to changes in the behavior of the national economy through time. Hanham (1982), using a similar method, found a great deal of variation in β time paths for a set of metropolitan economies in the Sunbelt from 1965 to 1981. Not only were the time paths complex functions of time, they differed substantially between regions. VARYING PARAMETER MODELS AND THE EXPANSION METHOD Casetti’s (1972) expansion method provides a mechanism for constructing varying parameter models. It has been applied in both spatial and temporal contexts. With respect to a time series model such as equation (16.3), we begin by assuming that β and β are time dependent; hence (16.4) Changes in a and β may be discrete or continuous. If discrete, then β and β can be expressed as a function of one or more dummy variables, representing different time periods. Continuous change in β and β can also be built into equation (16.4) by expressing them as continuous functions of time. One of the simplest would be a quadratic function. This has in fact been used to model changes in a and β in several studies (e.g. Casetti et al. 1971). In that case it was assumed that (16.5) It would also be possible, of course, to assume that β too is a quadratic function of time, which would allow for the possibility that a region’s sensitivity to changing national unemployment may increase or decrease at an increasing or decreasing rate. Hence (16.6) which, in combination with equation (16.5), gives the following time-varying parameter model: (16.7)
282 R.Q.HANHAM
CUBIC SPLINE FUNCTIONS Although equation (16.7) will capture the time dependence of a and β , it is still a crude model given our knowledge of the complex time paths which have been found for both these parameters in previous research. As one means of building this complexity into the original regional unemployment response model, I wish to propose the method of cubic spline functions. Such functions may be used to approximate the shape of curvilinear functions without the necessity of prespecifying the mathematical form of the function (Suits et al. 1978). As such, the method is clearly very flexible and, furthermore, is capable of representing quite complex forms. Originally developed in engineering, the method has been used sparingly in economics and in the social sciences (e.g. Suits et al. 1978; Anderson 1982). Moreover, it has only been used to estimate functions of the form Y=Y(X) and not, to the best of my knowledge, functions such as β =β (T). Spline functions are related to piecewise linear regression in the sense that a series of regressions is fitted to each of a number of segments marked off on the axis of the independent variable. Any number of segments can be used, although three appear to work well in practice. Spline functions differ from piecewise linear regression in certain important respects. First, the linear approximation for each segment is replaced by a cubic polynomial approximation (other degrees may be used, but the cubic has proven to be adequate). In essence, then, the method of cubic splines involves a system of piecewise cubic polynomials joined together at a number of points. This enables one to approximate fairly complex curvilinear trends. Second, the method ensures that not only is the overall function continuous, but its derivatives are continuous at the junction between the segments. Such junctions are termed knots. In the context of equation (16.4), β =β (T) is approximated by a cubic spline function. The variate T is divided into three (equal) segments by the points T0, T1, T2, and T3. The cubic spline function is as follows: (16.8) where Dj is a dummy variable defined by the jth segment. Equation (16.8) is discontinuous at the knots, but application of the following constraints makes the function and its first and second derivatives continuous (Suits et al. 1978):
GENERATING VARYING PARAMETER MODELS 283
Figure 16.1 Moving window regression time plot of β for Pittsburgh
(16.9)
The constraints on the β j0 equate values of the function to the left and right of the knots, those on the β j1 equate the first derivatives at the knots, and those on the β j2 equate the second derivatives at the knots (Suits et al. 1978). Assuming that the segments of T are equal in length and substituting equations (16.9) into (16.8) we obtain (16.10) where D'1=1 if and only if T′ T1, and otherwise D'1=0; and D'2=1 if and only if T′ T2, and otherwise D'2=0. Equation (16.10) has five composite variables on the righthand side and parameters β 23 and β 33 can be obtained from those parameters which are directly estimated. If we wish to build a time-varying parameter model along the lines of equation (16.4), in which the parameters are approximated by a cubic spline function, then equation (16.10) is simply substituted for β in equation (16.4). If only β is assumed to be time dependent, this results in the following model: (16.1 1) If β is also assumed to be approximated by a similar function, then it is expanded in the same way.
284 R.Q.HANHAM
EMPIRICAL TEST The models outlined in this paper were estimated using quarterly data for the Pittsburgh metropolitan area for a twenty-year period, 1964 (3)–1984 (3). As one might expect from an economy which is heavily reliant on durable goods manufacturing, Pittsburgh has followed the path of the national business cycle quite consistently during this period. During periods of expansion the region’s unemployment rate has been somewhat lower than the nation’s, and during recession it has been somewhat higher. However, the relation between the two does appear to have changed. During the expansion of the late 1960s, Pittsburgh’s unemployment rate was about 75 percent of the nation’s. During the post–1971 recovery it was about 90 percent, during the post–1975 recovery it was about 96 percent, and during the most recent recovery it has been about 150 percent. During the recessions of 1971, 1975, and 1982, Pittsburgh’s unemployment rate was 113, 100, and 165 percent of the nation’s, respectively. The first model to be estimated was the simple time-invariant version outlined in equation (16.3). The parameters were estimated by generalized least squares, assuming that the errors followed a first-order autoregressive scheme. This gave (16.12) The β parameter was significant at the 0.01 level. Over the entire time period, therefore, changes in Pittsburgh’s unemployment rate are shown to have matched changes in the nation’s from quarter to quarter. If, however, we estimate equation (16.3) for a succession of overlapping periods, each five years in length, the plot of β estimates over time, as shown in Figure 16.1, indicates quite clearly that the path of β is variable. From the mid–1960s to the mid–1970s, using the beginning of each five-year period as a reference point, β fluctuates gently around a slight downward trend. From 1976 onward, incorporating the 1980–2 recession and subsequent recovery, the value of β rises dramatically. To capture this trend a quadratic time-varying parameter model, given in equation (16.7), was estimated using data for the entire period. The results are as follows: (16.13)
All parameters with the exception of a0 are significant at the 0.05 level. The plot of β over time is shown in Figure 16.2. Although it captures the initial decline and subsequent rise in β , the quadratic function is clearly only a crude representation of the time path of β . Finally, a full version of the cubic spline varying parameter model was estimated, involving both a=a(T) and β =β(T). The Durbin-Watson statistic for this model was 1.95 and the adjusted R2 was 0.97. The estimated parameters, of
GENERATING VARYING PARAMETER MODELS 285
Figure 16.2 Quadratic and cubic spline time plot of β for Pittsburgh
which there are twelve, are not shown here. Their values are intuitively reasonable and when combined give the β plot shown in Figure 16.2. CONCLUSION This paper has outlined a procedure for generating models whose parameters are not only variable but also follow a complex path. The method of cubic spline functions was illustrated using a time-varying time series model of regional unemployment, essentially a model along the lines of equation (16.2). The procedure should be equally applicable to a spatial model such as the one shown in equation (16.1). In this case, the parameters a and β are assumed to follow some complex spatial path or surface. The method bears some relation to Jones’s (1984) procedure for generating spatially varying parameter models through the use of trend surface expansions of the parameters. Cubic spline functions, however, allow for more complex variability in the parameters to be built into the original model. NOTE This paper has been reprinted with permission from Modeling and Simulation 16 (1), pp. 75–9. REFERENCES Anderson, J.E. (1982) ‘Cubic spline urban density functions’, Journal of Urban Economics 12:155–67.
286 R.Q.HANHAM
Brechling, F.B. (1967) ‘Trends and cycles in British regional unemployment’, Oxford Economic Papers 19:1–21. Campbell, M. (1975) ‘A spatial and typological disaggregation of unemployment as a guide to regional policy—a case study of north-west England 1959–72’, Regional Studies 9:157–68. Casetti, E. (1972) ‘Generating models by the expansion method: applications to geographic research’, Geographical Analysis 4: 81–91. Casetti, E., King, L.J. and Jeffrey, D. (1971) ‘Structural imbalance in the U.S. urbaneconomic system’, Geographical Analysis 3:239– 55. Clark, G. (1980) ‘Critical problems of geographical unemployment models’, Progress in Human Geography 4:157–80. Dunn, R. (1982) ‘Parameter instability in models of local unemployment responses’, Environment and Planning A 14:75–94. Hanham, R.Q. (1982) ‘The pattern of change in the metropolitan labor market system of the Sun Belt, 1965–81’, Modeling and Simulation 13:1033–7. Jones, J.P. (1984) ‘A spatially varying parameter model of AFDC participation: empirical analysis using the expansion method’, Professional Geographer 36:455–61. Owen, D.W. and Gillespie, A.E. (1982) ‘The changing relationship between local and national unemployment rates in northern England, 1971–80’, Environment and Planning A 14:183–94. Suits, D., Mason, A. and Chan, L. (1978) ‘Spline functions fitted by standard regression methods’, Review of Economics and Statistics 60:132–9.
INDEX
active environment, innovative diffusion in 325, 328; spatio-temporal diffusion 312; temporal diffusion 306–9; totally antagonistic competition 318–20 Adams, J.S. 163–4 Africa 246, 248 age, migration and 122–4, 126–30 agglomeration see concentration agricultural location theory 48, 253, 257– 60; see also agricultural production functions agricultural labor force 230, 232, 236, 239, 243–9 agricultural production functions 8, 252– 77; forms of 260–5; Great Plains 265–9, estimation results 270–7; problems in estimation 253–7 Ahuzat Bayit suburb 140–1 Aid to Families with Dependent Children (AFDC) 65, 68–9; as work disincentive 70–80, 85, spatial variation 80–3 Albin, P. 69–70 Alexander, J. 50 Alonso, W. 214 Amedeo, D. 51 Amemiya, T. 340, 341 analytical methodologies, life span of 42 Anas, A. 116 Anderson, J.E. 358 Anderson, M. 81 Anselin, L. 337;
specification issues 339, 340, autocorrelation 336, 345, 346, 347, 348, 350, heterogeneity 334, heteroskedasticity 341, 342, 343 Asia 246, 248 Atlanta 168, 169, 171, 180; 1890–1940 178, 179; 1940–1980 178–80; autocorrelation, spatial error 335, 336, 337, 339, 344–8, 349; misspecified expansion model 347–8; random expansion model 346–7, 350–2 autoregressive models 13 azimuthal variations 137–8, 145, 148, 152, 155, 158 Bairoch, P. 232 Bannister, G. 257 Barras, R. 164, 166–7 Bayesian priors 125 Beattie, B.K. 253 Beckmann, M.J. 257 Ben-Akiva, M, 118 benefits, welfare see welfare model Bennett, R.J. 34, 138 Benoit, E. 191 Berg, L. van den 141, 161, 162, 175 Bernstein, I. 163 Berry, B.J.L. 133–4, 134, 138, 161, 163 Bet Dagan 152 Bhaskar, R. 57 Bieker, R. 66, 72 Bilsborrow, R.E. 22 Binswanger, H.P. 22
287
288 INDEX
Bishop, Y.Y. 125 Bivand, R. 348 Blommestein, H. 348 Blumenfeld, H. 136 Bonferroni bounds 340 Borchert, J.R. 165 Boserup, E. 22 Bourne, L.S. 214 Boyce, R.R. 136 Brechling, F.B. 355 Brehm, C. 64, 66 Bretschneider, S.I. 14, 15, 96 Breusch, T. 342 Briggs, R. 16 Brown, E.H.P. 253, 255 Brown, L.A. 15, 116, 131, 301 Brown, W.G. 253 Brunn, S. 51 Budapest 216, 218, 222, 225 Bunting, T.E. 50 Burridge, P. 347, 348 Burton, I. 51 business cycles 161, 165–8; see also cyclical urban development Cain, G. 66 Campbell, M. 356 Cantwell, J.R. 98 capital: expenditure and agricultural production functions 256, 267, 271, 272; formation: economic growth 27, population growth 23 Carbone, R. 96 cascading cycles urban systems development model 226, 227 Casetti, E. 152, 337; applications of expansion method 15– 16, development rates 25, fertility and mortality decline 14, tractor diffusion 14–15, trend surface expansions 15, Verdoorn’s law 48; distance expansion model 136; drift analysis 96;
expansion method 11, 43, 95, 116, 134, 170, 192, 265, 281–2, 297, 299, 301, 334, 356–7; industrial mix 165; innovation diffusion 309; land rent 258; nomothetic-idiographic middle ground 63; spatial expansion 345; urban turnaround model 214 Cassen, R.H. 22 Castle, E.N. 257 centralization, population 162–3, 164–7, 172, 174–82 passim; see also decentralization; urbanization Champion, A.G. 180 Champion, D.J. 99 Chappell, J.A. 99 Chenery, H.B. 22, 229 Chenery-Syrquin study of development patterns 230, 240, 249–50; approach 233–5; compared with expansion method 236– 8 Cheshire, H.M. 279, 283, 284, 290 Chicago: development trends 168, 169, 171, 180, 1890–1940 175–7, 1940–1980 176, 177; Welfare Queen 84 Chief, E. 70 chlorophyll-a 284, 292–3 Cho, D.W. 165 choice: destination see destination choice; innovation diffusion and 302, 321–5 city size, rank-size functions and 187–91; see also rank-size models Clark, C. 22, 229, 230, 231 Clark, G. 49, 356 Cliff, A. 140, 334, 345 Cloward, R. 69, 84 Coale, A.J. 22 Cobb-Douglas production functions 260–4, 266, 270–6 coefficient of variation 186 Cohen, Y.S. 134
INDEX 289
Cole, J.P. 198 common factor approach 348 comparative regional geography 54–5 competition 9, 325; innovation diffusion and 303–9; totally antagonistic 317–20 competitive exclusion principle 306, 308, 314–15, 327 competitive innovations 312–17, 325, 331– 2 concentration, population: Hungarian urban system 221–2, 223–4, 227, switch to deglomeration 224, 225, 227; rank-size approach 189–91, 219, 220; urban turnaround model 214–15 contextual variation 3, 44–7, 55, 94 contingent relations 57–9 Cooper, J.K. 99 corn belt 265 crop allocation 263 cross-sectional analysis 156–7 cubic splines 355, 357–62 culture, political 82–3, 92–3 Cutler, A.T. 161, 165 cyclical trends in urban development 7, 161–82; cities’ sensitivity to 165; decentralization and suburbanization 163–5; local urban spatial development 166– 82, Atlanta 178–80, Chicago 175–7, Philadelphia 173, 174–5 Danta, D.R. 189, 214, 218, 219 Danziger, S. 65, 66 Davidson-MacKinnon tests 343, 347–8 Dear, M. 49 decentralization 6, 7, 133–58; cyclical urban development 162–3, 166–8, 172, 174–82, suburbanization and 163–5; definition 133–4, 138; review of literature 134–8; Tel Aviv 140–57,
distance bands 146–50, distance expansion model 150–2, trend surface expansion 152–7, urban-suburban dichotomy 143–6; temporally expanded trend surface model 138–40 decision making models 49 deconcentration 133, 136; see also decentralization; deglomeration deglomeration 214–15, 219; Hungarian urban system 221–2, 226, 227, switch to 224, 225, 227 Dehan, S. 141 demand 231 Demko, G. 14 Dendrinos, D.S. 323, 324 dependence, spatial 334–5, 336, 344; see also specification issues destination choice 6, 115, 116, 118–20; Ecuador 125–30, 131 deterministic remote sensing 280 deurbanization see decentralization development, economic 48; inequalities 7, 191–211, temporal dynamics 197–8, 206–9, 210; level of 22–3, and growth 25–8, 31–3; population growth and 20–8, empirical test 29–33; sectoral labor shares and 229–49; structural changes 229–30 development, urban see cyclical trends in urban development; decentralization; Hungarian urban system diffusion 9; innovation see innovation diffusion; physicians 98; tractors in USA 14–15 diminishing marginal returns 259, 261, 263, 270 discrete-time innovation models 303 Diseker, R.A. 99 distance, migration and 122, 124, 126–30 distance bands 134, 135–6, 158; Tel Aviv 146, 146–50
290 INDEX
distance decay 46–7, 126 distance expansion model 134, 136–7, 158; Tel Aviv 146, 149, 150–2 Domencich, T. 322 drift analysis 5–6, 96–7, 112; physician location 103–10; see also parameter drift dual expansion method 4–5, 17–20, 35–6; population growth and development 20–8, empirical analysis 29–33 duality principle 299–302 Duijn, J.J. van 161, 162, 164 DuMouchel, W.H. 13 Duncan, S. 55–6, 56 Dunn, E.S. 257, 259 Dunn, R. 356 Durbin spatial model 348 Durbin Watson test 345 Easterlin, R.A. 22 ecological diffusion models 310; see also diffusion; innovation diffusion ecological systems 306 economic cycles see cyclical urban development economic development see development, economic Ecuador 116, 120–1; migration in 125–30, 131 Efron, B. 343 El-Shakhs, S. 214 Elazar, D.J. 82 Eldridge, J.D. 47 Ellis, M. 116 Elman, R. 69 empirical Bayes inference 13 empirical remote sensing 280–3, 294; see also water quality empirical research 49–50 Entrikin, J.N. 51 environment, active see active environment Enyedi, G. 216 Erickson, R.A. 135, 141 estimation, modeling and 4 expansion equations 2, 11
expansion method 1–3, 11–17; applicability 4, 13–16; compared with drift analysis 95, 102– 10, 112; global meta-theoretical approach 299– 303; paradigmatic aspects see paradigmatic aspects; parameter stability 16–17; parameter variation 356–7, see also parameter drift; popularity 43; special purpose expansions 34; specification issues see specification issues; usefulness 33–6 external effects 23, 27 factor analysis 42, 73–5 federal policy, physician location and 97– 8, 99–100, 111–12 federalism, public assistance and 69, 78– 83, 84 feed, expenditure on 270 Fields, G.S. 120 fertility decline 14 Fingleton, B. 334, 345 Fisher, A.G.B. 230, 231 Fisher, J.L. 22 Fisher, M.M. 302 Food Stamp Program 68 Foster, S.A. 96, 98, 100 Fotheringham, S. 46 Freeman, R.A. 81 frequency, rank-size model and 187, 211 Frey, W.H. 180 Fruen, M.A. 98 Fuchs, V. 231 functional relations 43–5 Gaile, G.L. 15, 214 Gardner, B.L. 268 Garfinkel, I. 65, 68, 69 Garner, B. 164, 187 Garrison, W.L. 257 Gauthier, H.L. 15, 187 gender, migration and 122–4, 126–30
INDEX 291
General Assistance 68 geography 45, 47–8; regional 51–7 Gershenkron, A. 26 Geruson, R.T. 163 Giddens, A. 50 Gillespie, A.E. 356 Gini coefficient 186 global meta-theoretical approaches 299 Glover, D. 22 Godfrey, L. 339 Goetz, A.R. 116, 131 Golledge, R. 50, 51 Good, I.J. 13 Gorr, W.L. 96 Gottlieb, M. 165 Gould, P. 51, 59 graph theory 301 gravity models 115 Great Plains 265–9; production functions 270–7 Greenstein, R. 84 Gregory, D. 56 Griffith, D. 336, 337, 345 Griliches, Z. 253, 255, 268 gross domestic product (GDP), per capita 198–206 gross national product (GNP), per capita: level of development and 25; sectoral labor shares and 8, 230, 233– 49 guarantee, welfare program 65 Guelke, E. 50 Gujarati, D. 34 Hadden, J.K. 163 Hagen, E.E. 25 Hall, P. 136, 257 Hamermesh, D. 66 Hamiltonian variational principle 323–4 Hanham, R.Q. 14, 15, 301, 356 Hannaford, P. 84 Hansz, J.E. 161, 165 Harbaugh, J.W. 138 Hardin, G. 306 Harris, C.D. 134 Harris, J.E. 13
Harris-Todaro model 116, 120 Hart, J.F. 51, 52 Hartshorne, R. 51, 53–5 Harvey, D.W. 56 Havrylyshyn, O. 25 Hawley, A.H. 135 Hay, D. 136 Hayes, S. 76 health care, access to 97–8; see also physicians heterogeneity, spatial 334–5, 336, 339; see also specification issues heteroskedasticity 335, 336–7, 340–4, 348, 349; misspecified expansion model 342–4; random expansion model 340–2 hierarchical urban development see Hungarian urban system hierarchical statistical models 13 Hildreth, C. 340 Hirschman, A.O. 22 Hoch, I. 253, 256 homo economicus 302 homo socialis 302, 322 Hoover, E.M. 22 Horvat, B. 25 Hosek, J. 70 Hotelling, H. 310 Hottel, J.B. 268 Houck, J. 340 housing: assistance 68; contingent and necessary relations 57–8 Howes, C. 67 human capital theory 48, 50; migration 115–16, 131; see also destination choice ; migration Hungarian urban system 7, 213–27; expansion method 219–26; growth dynamics 222–6, switch from concentration to deglomeration 224, 225, 226; rank-size distribution 217, 218–19; urban turnaround model 214–16
292 INDEX
ideology, political 82 idiographic approach 52–6, 63, 84 imaging remote sensors 279 income maintenance programs 67–9; see also welfare model income potential, physicians and 99–100 increasing marginal returns, production functions and 261–5 individualist culture 82, 92–3 industrial mix 165 industrialization: Kondratieff cycles and 166; sectoral labor shares 246, 249 inequalities 7, 185–211; conventional measures 186–7; rank-size approach 187–91; development inequalities 191–211 information theory 186–7 initial model 2, 11 innovation diffusion theory 297–332; choice theory 320–4; competitive innovations 312–17, 331– 2; expansion method 298–303; spatio-temporal 309–12, 328–31; temporal: active environment 306–9, indifferent environment 303–6, 325– 8; totally antagonistic innovations: active environment 318–20, indifferent environment 317–18 integrative approaches 299, 324 intensity of agriculture 255–6; rent and 258–9; spatial variation 257, 259–60; yield and 255–6, production functions 260–5, 267–76 interdisciplinary approaches 299, 324 invariance 3 irrigation, yield and 270–3 Isard, W. 258 iteration 2 Jaffa 141 Jaffee, A. 232 James, P.E. 51, 53, 55
Jeffrey, D. 165 Jensen, J.R. 281 Johnson, C. 66 Johnston, J. 13 Johnston, R.J. 51, 70, 134 Jones, D.W. 165 Jones, G.W. 232 Jones, J.P. 48, 66, 70, 131, 337; parameter stability 47; spatial expansion and autocorrelation 345; theoretical frameworks 48–9; trend surface expansions 15, 282, 337, 362; Judge, G.G. 13, 16 Juglar cycle 161 Kahn, H. 25, 26 Kain, J.F. 134 Kasarda, J.D. 133–4, 138, 163 Kasper, H. 70 Katzman, M.T. 257 Keeley, M. 66 Keen, D. 214 Kellerman, A. 136, 137, 143 Kelly, A.C. 22 Khorram, S. 279, 283, 284, 290 Kirsch, H. 232 Kitchin cycle 161 Klemas, V. 279 Knottnerus, P. 339 Kodras, J.E. 16, 48–9, 70 Kondratieff cycles 7, 161–2; in urban development 163, 172–80 Kontuly, T. 214 Krakover, S. 15, 134, 152; decentralization 138, 170, Tel Aviv 141, 143, 151; distance expansion model 136, 137, 151 Kristensen, T. 25, 26, 27 Kritikos, H. 279 Kuttner, B. 67 Kuznets, S. 22, 229; building cycle 161; sectoral productivity growth 231–2
INDEX 293
labor: agricultural production functions 256, 267, 267–8, 271, 272; sectoral shares see sectoral labor shares labor-leisure theory 64, 64–6; see also welfare model labor market barriers to employment 66–7, 69–70; AFDC 72–80; federalism and 80–4 Labour Party, UK 58–9 lagged dependent variables models 13 Lagrange multiplier test 346–7, 349, 350–2 Lamb, R. 136 Lamers, E. 191 Lampman, R. 66 land rent 257–8 landlord-tenant relations 57–8 Landsat multispectral scanner 279 Latin America 232, 246, 248 Lave, J.R. 100 laws: geography and 52, 54; social science 7–8, 35, 46, 229 Learner, E. 339 Lerman, S. 118 Levitan, S. 66 Lewis, W.A. 229 Limbor, J.M. 232 Lindley, D.W. 13 linear-log model production function 264– 5, 266–7; Great Plains data 270–6 linear regression, piecewise 358 livestock ranching 265 localities research 55–6 location choice, physicians and 98, 99–112 log-linear innovation diffusion dynamics model 313–17; choice theory and 320–1, 323 Longini, R. 96 Lousiana 80 Lovell, M. 339 MacKinnon, J. 342; Davidson-MacKinnon tests 343, 347–8 Maizels, A. 22, 231
Makridakis, S. 96 Malecki, E.J. 189, 219 Manson, D.M. 166, 182 manufacturing sector labor force 230, 232; development and 237, 239, 243–9 Marble, D.F. 257 March, J.G. 49 marginal tax rate 65–6, 85 Mark-Lawson, J. 58–9 Markusen A. 67 Martin, G.J. 51 Masotti, L.H. 163 Massachusetts 80 Massey, D. 56 Masters, S. 65, 68, 69 Matthews, R.C.O. 26 McDougall, G.S. 165 McFadden, D. 118, 322 McGee, T.G. 232 McGrath, D. 163 McGroughran, E. 66 McNicoll, G. 22 Medicaid 68, 98, 111 Medicare 98, 111 Mehta, S.K. 232 Menefee, J. 66 Mera, K. 161 meta-theories 298–9 ‘metropolitan areas’, migration to 121, 122, 126–30 metropolitan decentralization see decentralization migration 6, 48, 115–31, 180; destination choice in Ecuador 124–30; disaggregate model of behavior 117–20; duality and 301; expansion with restricted choices 120– 4 Mills, E.S. 133 Minnesota 79 Mississippi 79, 80–1 misspecified expansion model 338, 349; error autocorrelation 347–8; heteroskedasticity 342–4 modeling, schools of 3–4 Molho, I. 115 moralist culture 82, 83, 92–3 Moran test 344–5
294 INDEX
Morrill, R. 66, 136, 167, 170 Morris, C.N. 13 mortality decline 14 Moses, L. 133 Moss, W.G. 116, 120 moving averages 96–7 moving window regressions 96–7, 112; physician location 103–7, serial correlation 107–10 Muller, P.O. 133, 141, 163, 164, 167 Mundlak, Y. 253, 256 Murgatroyd, L. 56 Murray, C. 84 Nakosteen, R.A. 115 National Health Service Corps 98 necessary relations 57 nested logit models 118 Neuse River, water quality 283–94 ‘new regional geography’ 55–6 Newling, B.E. 136 nomothetic approach 52–6, 63, 83–4 Norcliffe, G.B. 140 Nurkse, R. 229 Odland, J. 15, 116 Ogden, P.E. 214 O’Kelly, M.E. 187 Olson, D.B. 99 Olson, M. 26 open research 45 Ord, J.K. 140, 334, 345 Oshima, H.T. 232 output prices 275, 276 Owen, D.W. 356 Pagan, A. 342 paradigmatic aspects of expansion method 2–3, 5, 43–50, 59; realist critique of science 57–9; regional geography 51–7 parameter drift: expansion method and investigation of 14, 15, 16–17, 34–6, 213, 281–2; Hungarian urban hierarchy 220, 225, 227; need to prove stability 47;
population growth and economic development 22–8, 31–3; see also drift analysis Parker, R.C. 99 Pearl-Reed equation 308–9, 320 Pennsylvania 84 Perin, D.E. 187 Philadelphia 168, 169, 171, 180; 1890–1940 173, 174; 1940–1980 173, 175 Phillips, R.S. 134 physicians 97–112; drift analysis 103–10; expansion model 102, 110; location factors 99–100; supply and federal policies 97–8, 110– 12 piecewise linear regression 358 Pigou, A. 65 Pindyck, R.S. 13 Pittsburgh 360–2 Piven, F.F. 69, 84 Platky, L. 70 polarization 215–16; Hungarian urban system 218–19, 222, 223, 225–6 political cultures 82–3, 92–3 political ideology 82 population growth: cyclical urban development 168–82; development and 20–8, empirical test 29–33; physician location and 99, 100, 107, 111, 111–12; Tel Aviv 143–57 population size: migration and destination choice 116, 121, 122, Ecuador 126–30; rank-size models 187–91; sectoral labor shares and development 233, 241, 242, 243, 245 Potter, N. 22 poverty 74–5, 81–2, 85 Pred, A.R. 50 prices, agricultural output 275, 276 primary care physicians 101–7, 108, 110– 11, 112
INDEX 295
processes, social 56, 57, 59 product per capita (PRC) 23–4; see also gross domestic product; gross national product production functions 48, 252; forms of 260–5; Great Plains data 265–9, estimation results 270–7; problems in estimation of 253–7 productivity, sectoral growth in 231–2 professional environment 99 Pudup, M.B. 55 qualitative models 49 quantum mechanics 301 Quayaquil 121 Quito 121 Raj, B. 341 Ramos, J.R. 231, 232 Randall, A. 257 random coefficients models 13 random expansion model 338, 349; error autocorrelation 346–7, 350–2; heteroskedasticity 340–2 random utility theory model of migration 116, 117–20; see also migration rank-size models 7; inequality measurement 187–91, development 191–211; urban turnaround model 214–16, Hungarian urban system 217, 218, 219– 26 Reagan, R. 66, 84 ‘real world’ 46, 47 realism 57–9 regional geography 51–7 regional unemployment 355–62; cubic spline functions 357–9; expansion method 356–7; Pittsburgh 360–2; response model 355–6 remote-sensing 8–9, 279–81, 282, 294–5; see also water quality rent 257–8, 277 research, expansion paradigm and 45–50
resource flow 233–4, 241, 242, 243–4 245 Richardson, H. 141, 214 Robbins, H. 13 Robinson, E.A.G. 22 robust statistics 343, 347–8, 349, 350 Rosenstein-Rodan, P.N. 229 Rostow, W.W. 25, 161, 162 Rothenberg, T. 341 Rubinfeld, D.L. 13 Rukeyser, L. 65–6 Rumberger, R. 67 rural destinations, migration and 116, 120– 2, 126–30 Russett, B.M. 25, 191–2 salinity 284, 288–90 Samuelson, P.A. 16 satellites see remote-sensing Savin, N. 340 Saving, T. 64, 66 savings 23, 27 Sayer, A. 57–8 scale diseconomies 23, 27 Schaeffer, P. 115 Schnore, 163, 164 Schwartz, W.B. 98 science 59; realist critique of 57–9 Scudo, F. 323 sectoral labor shares 8, 48, 229–49; Chenery-Syrquin approach 233–5; determinants of shifts 231–2; expansion method 235–49, expansion equations 239–40, initial model 235–9, results 243–9, terminal models 240–2; population size 243, 245; regional variation 246–9; resource flows 243–4, 245; temporal drift 244–6, 247 Selwood, D. 15 Semple, R.K. 14–15, 187, 301, 309 Seninger, S. 67 sensitivity analysis 17 serial correlation 107–10
296 INDEX
service sector labor force 230, 236, 244, 246 Shachar, A. 141 shadow prices 301 Shaeffer, F.K. 51, 52 Shannon entropy approach 187 Sheppard, E. 46 Sicherl, P. 192 Simon, J.L. 22 Singleton, S.H. 163 Sjaastad, L.A. 115 Smeeding, T. 67 Smith, A.F.M. 13 Smith, N. 56 Smith, W.R. 214 social amenities 99 social policy 48; welfare model and 80–5 social processes 56, 57, 59 social science 59; expansion method and 47–50; ‘laws’ 7–8, 35, 46, 229 Soja, E. 56, 63 Sonis, M. 303; diffusion 16, active environment 319, 320, choice 302, 321, 323, 324, 325, competition 306, 313, 314, 315, totally antagonistic competition 318; duality principle 301; global meta-theoretical approaches 299 Sorensen, A.A. 99, 100 space 44–5 Spall, H. 66 spatial cycles see cyclical trends spatial econometrics see specification issues ‘spatial laws’ 52 specialist physicians 101–7, 109, 110, 111– 12, 112 specification issues 9, 334–52; error autocorrelation 344–8, 350–2; general issues 339–40; heteroskedasticity 340–4; taxonomy of spatial expansion 337–8 spline functions 9, 355, 357–62 Squire, L. 232 standard deviation 186
state, theories of 49 static multinomial random utility choice logit model 322 statistical remote sensing 280–3, 294; see also water quality Stein, B. 69–70 Steiness, D.N. 135, 163 Steinwald, B. 99 Steinwald, C. 99 stepwise procedure 76 Stonecash, J. 76 Strahler, A.N. 280 structuration theory 50 suburban-urban dichotomy see urbansuburban dichotomy suburbanization: decentralization and 163–5; urban cyclical development 166–7, 182 Suits, D. 357, 358 Summers, R. 191 Swamy, P.A.V.B. 13 Syrquin, M. see Chenery-Syrquin study systematic geography 51–7 systemic inequality 191, 192, 193, 206, 210 Taaffe, E.J. 51, 55, 63 Takayama, T. 16 tax rate, marginal 65–6, 85 Taylor, L. 233 technology: economic development and 26–7; urban development and 166–7 Tel Aviv 140–57; distance bands 146, 146–50; distance expansion 146, 149, 150–2; trend surface expansion 146, 152–7; urban-suburban dichotomy 143–6 temporal expansion of trend surfaces (TETS) 137–40, 152–7; see also trend surface expansions tenant-landlord relations 57–8 terminal model 2, 11, 11–12 Theil, H. 187 theoretical frameworks 48–9 Thirlwell, A.P. 231 Thompson, W.R. 165, 180 Thrall, G.I. 16
INDEX 297
Thrall, I. 70 Thnen, J. von 257 Thursby, J. 339, 345 time 44–5 time series data 254–5, 277 Tobler, W. 138, 336 total suspended solids 284, 287, 290–2, 294 totally antagonistic competition 317–18; active environment 318–20 Townroe, P.M. 214 tractors, diffusion of 14–15 traditionalist culture 82–3, 92–3 transport costs 258 transportation technology 163–4 trend surface expansions (TSEs) 15, 337; decentralization 134, 137–40, 158, Tel Aviv 146, 152–7; remote sensing 282–3, water quality 283–94 trickle-down effects 215–16, 218–19, 223, 225–6 truth value 45 Tsitanidis, J.G. 16 turbidity 284, 287, 287–8, 294 Tuxill, T.G. 99 Ullah, A. 341 unemployment see regional unemployment Unemployment Insurance (UI) 67–8 Union of Soviet Socialist Republics (USSR) 14 United Kingdom (UK) 58–9, 166–7 United Nations (UN) 22 United States of America (USA): cyclical urban development 167, 168– 82; diffusion of tractors 14–15; federalism and public assistance 69, 78– 83, 84; Great Plains production functions 265– 77; physician location and federal policies 97–8, 99–100, 111–12; unemployment in Pittsburgh 360–2; water quality in North Carolina 283–93 universal validity 3 Upton, G. 334, 345
urban development see cyclical trends in urban development; decentralization; Hungarian urban system urban-suburban dichotomies 134, 134–5, 157–8; Tel Aviv 143–6 urban turnaround model 214–16 urbanization 166; migration and 116, 120–2, 126–30; see also centralization; concentration Urry, J. 56 utility choice theory 9, 321–5; see also random utility theory of migration Uyanga, J. 15 variance 186 variation: coefficient of 186; contextual 3, 44–7, 55, 94; expansion method 52–3; social sciences and 44; see also parameter drift variational principle, Hamiltonian 323–4 Verdoorn’s law 48 Verhulst differential equation 303–5, 313– 17 Vidal, A.C. 134 Vining, D.R. 214 Visser, S. 16, 253, 257 Vogelsang, R. 214 Volterra conservative ecological dynamics 323 Wald, A. 13 Walsh, J.A. 187 water quality 279, 281–94; chlorophyll-a 292–3; salinity 288–90; total suspended solids 290–2; turbidity 287–8 Watts, H. 66 Webber, M.J. 257 Weber, A.F. 134 welfare, social and Labour Party 58–9
298 INDEX
welfare model 5, 48–9, 63–86; empirical test 70–80, constraints 72–5, definitions of variables 89–91, initial model 71–2, terminal model 75–80; labor-leisure tradeoff 64–6; spatial variations in social policy 83–5; state variations in work-disincentive effect 77–80, 92–3, reasons for 80–3; welfare as work disincentive 66–70 wheat zone 265 Wheelwright, S.C. 96 White, H. 342, 344 Whitehand, J.W.R. 166 Williamson, H.F. 133 Williamson, J.G. 22 Winegarden, C. 70 Wohlenberg, E. 66, 70 work-disincentive effect of welfare 71–2, 85, 86, 89; interstate variations 78–80, 92–3, reasons for 80–3 work-welfare choice see welfare model world, concepts of 46, 47 Yeates, M. 164 yield, agricultural 255–6, 257; production functions 270–6 Ying, K. 14 Zdorkowski, R.T. 14, 15 Zellner, A. 253 Ziegler, J. 323 Zimmer, M. 115 Zimmerman, L.J. 191, 199 Zipf, G.K. 187–9, 211