Time and Space in Economics


Author: T. Asada | T. Ishikawa



T. Asada, T. Ishikawa (Eds.)

Time and Space in Economics With 48 Figures

Toichiro Asada, Ph.D.
Professor, Faculty of Economics, Chuo University
742-1 Higashinakano, Hachioji, Tokyo 192-0393, Japan

Toshiharu Ishikawa, Ph.D.
Professor, Faculty of Economics, Chuo University
742-1 Higashinakano, Hachioji, Tokyo 192-0393, Japan

ISBN-10 4-431-45977-4 Springer Tokyo Berlin Heidelberg New York
ISBN-13 978-4-431-45977-4 Springer Tokyo Berlin Heidelberg New York
Library of Congress Control Number: 2006935369

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. The use of registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Springer is a part of Springer Science+Business Media
springer.com
© Springer 2007
Printed in Japan
Typesetting: SNP Best-set Typesetter Ltd., Hong Kong
Printing and binding: Shinano Inc., Japan
Printed on acid-free paper

Preface

This book is published in celebration of the 100th anniversary of the Faculty of Economics of Chuo University. The Faculty of Economics of Chuo University in Tokyo, Japan, was founded in 1905. Since the beginning of the twentieth century, it has helped educate many brilliant students who have had an influence throughout the world. This faculty has worked not only as a pathfinder in the field of economics but also as a compass that indicates the directions of various movements in our society.

To mark the auspicious occasion of the 100th anniversary, the academic committee, organized by the faculty and assisted by overseas academic partners as well as staff of Chuo University, launched a small but important international conference to contribute to the development of economics. The conference aimed to enrich the respective disciplines of the economics of time (dynamic economics) and the economics of space (spatial economics) and to expand their applicability in the real world. The conference, the Chuo Meeting on Economics of Time and Space 2005 (Chuo METS 05), was held at Chuo University August 29–30, 2005. Most of the chapters in this book, all of which were refereed by the editors, are based on papers presented at that conference.

It is the hope of the committee that this book will play a role in providing scholars and experts with new ideas with regard to time and space in economics, and that it will help build a foundation for developing a frontier of economics as we stand at the beginning of the twenty-first century.

Part I, “Economics of Time: Keynesian Macrodynamics,” is a collection of five chapters on macroeconomic dynamics in the Keynesian tradition, which allows for the existence of involuntary unemployment. Chapter 1, “A Sophisticatedly Simple Alternative to the New-Keynesian Phillips Curve” (R. Franke), presents an alternative microeconomic foundation to the new-Keynesian Phillips curve by using a model with heterogeneous and boundedly rational firms, and attempts to reconcile the rational expectation
hypothesis theoretically with the adaptive expectation hypothesis. Chapter 2, “Harrodian Dynamics Under Imperfect Competition: A Growth-Cycle Model” (H. Yoshida), reconsiders the Harrodian instability problem by using a model of imperfect competition with putty-clay technology, and shows the existence of endogenous fluctuations by using the three-dimensional Hopf bifurcation theorem. Chapter 3, “Endogenous Technical Change: The Evolution from Process Innovation to Product Innovation” (G. Gong), introduces product innovation and process innovation to the nonlinear discrete time version of the Harrodian dynamic model of economic growth, and investigates the complex dynamic behavior and occurrence of bifurcation in such a system. Chapter 4, “Tobin’s q and Investment in a Model with Multiple Steady States” (M. Kato, W. Semmler, and M. Ofori), considers the investment theory based on Tobin’s q under adjustment costs with size effect, and investigates the implication of the existence of a discontinuous policy function with multiple steady states by means of the Hamilton-Jacobi-Bellman (HJB) method. Chapter 5, “Significance of the Keynesian Legacy from a Theoretical Viewpoint: A High-Dimensional Macrodynamic Approach” (T. Asada), presents a high-dimensional Keynesian macrodynamic model with debt accumulation, which is designed to interpret the performance of the Japanese economy in the 1990s and the 2000s.

Part II, “Economics of Time: Nonlinear Dynamics,” contains four chapters on the classical, neoclassical, and other traditions of nonlinear economic dynamics. The common feature of these chapters is the complexity that is due to the nonlinearity of the system. Chapter 6, “Instability Problems and Policy Issues in Perfectly Open Economies” (P. Flaschel), considers the classical model of a small open economy with perfectly flexible wages and prices to study the dynamic instability of the system and the policy rules that stabilize the system. Chapter 7, “Corridor Stability of the Neoclassical Steady State” (A. Dohtani, T. Inaba, and H. Osaka), introduces the permanent income hypothesis to the Solow–Swan-type neoclassical model of economic growth, and shows the existence of the corridor-stability property, which means that the path inside the corridor converges to the steady state while the path outside the corridor diverges. Chapter 8, “Time-Delayed Dynamic Model of Renewable Resource and Population” (A. Matsumoto, M. Suzuki, and Y. Saito), investigates a system of mixed differential and difference equations (delay differential equations) introducing time delays in production to the Ricardo–Malthus dynamic model of population and renewable resources, and provides a comparative dynamic analysis with respect to delay in production. Chapter 9, “A Determinantal Criterion of Hopf Bifurcations and Its Application to Economic Dynamics” (J. Minagawa), provides a new mathematical criterion for the occurrence of Hopf bifurcations in the general n-dimensional system of differential equations, and presents an example of the
application of this criterion to the three-dimensional dynamic macroeconomic model.

Part III, “Economics of Space: Empirical Analysis,” selects four fields that have direct geographical relationships with economic activity in different ways. Chapter 10, “Public- and Private-School Competition: The Spatial Education Production Function” (D. Brasington), using spatial statistics instead of non-spatial ones, reveals the very interesting fact that poverty appears to depress reading and writing pass rates in non-spatial regressions, but this effect disappears in the spatial models. Chapter 11, “Innovation, R&D Cooperation, and the Geography of Regional Labor Acquisition” (J. Simonen and P. McCann), investigates the role played by geography in the promotion of innovation. One of the important suggestions in this chapter is that local, face-to-face contact is an essential feature of the innovation process. Chapter 12, “Taxation of Car Commuters’ Employer-Subsidized Parking” (R. Wall), deals with economic policy instruments to mitigate traffic jams in dense urban areas. It proposes a feasible instrument, the taxation of car commuters’ free or partly employer-subsidized parking, and estimates its effectiveness in reducing rush-hour vehicle traffic into the inner city. Chapter 13, “Forest Protection System and Optimal Land-Use Management Policy in Japan” (M. Yabuta and Y. Yamanishi), considers Japanese forest policy and the protection forest system, which is based on the public benefit of forests, and proves that the environment policy of each prefecture in Japan has a positive effect on the choice of protection forests.

Part IV, “Economics of Space: Theoretical Analysis,” analyzes the effects of spatial competition between economic organizations or agents on economic performance in a region. Chapter 14, “Railway Competition in a Park-and-Ride System” (T. Kuroda and K. Miyazawa), examines the scale effect of city size and the cost advantage of railways over automobiles in a park-and-ride commuter system. It presents reasons why the railway sector should be subsidized to balance the budget for improved social configuration. Chapter 15, “An Analysis of the Relationship Between Manufacturer’s Profit and Spatial Economic Structure in the Retail Market” (T. Ishikawa), describes the circumference model and shows the influences of spatial competition among retailers on manufacturer’s profit and the retail market structure. Chapter 16, “Redistribution Through Local Competition” (F. Guy), develops a model with different kinds of shops and consumers, and suggests that to the extent that local shops serve a poorer clientele, a rise in prices at out-of-town shops has a progressive distributive effect.

Toichiro Asada
Toshiharu Ishikawa
Editors

Acknowledgements

We are grateful to Chuo University for its financial support with the publication of this book. The title of this volume derives from the title of the book Economics of Space and Time (Springer-Verlag, Berlin, 1997) by T. Puu, a pioneer of spatial-dynamic economics.

Contents

Preface

I. Economics of Time: Keynesian Macrodynamics

1. A Sophisticatedly Simple Alternative to the New-Keynesian Phillips Curve (Reiner Franke)
2. Harrodian Dynamics Under Imperfect Competition: A Growth-Cycle Model (Hiroyuki Yoshida)
3. Endogenous Technical Change: The Evolution from Process Innovation to Product Innovation (Gang Gong)
4. Tobin’s q and Investment in a Model with Multiple Steady States (Mika Kato, Willi Semmler, and Marvin Ofori)
5. Significance of the Keynesian Legacy from a Theoretical Viewpoint: A High-Dimensional Macrodynamic Approach (Toichiro Asada)

II. Economics of Time: Nonlinear Dynamics

6. Instability Problems and Policy Issues in Perfectly Open Economies (Peter Flaschel)
7. Corridor Stability of the Neoclassical Steady State (Akitaka Dohtani, Toshio Inaba, and Hiroshi Osaka)
8. Time-Delayed Dynamic Model of Renewable Resource and Population (Akio Matsumoto, Mami Suzuki, and Yasuhisa Saito)
9. A Determinantal Criterion of Hopf Bifurcations and Its Application to Economic Dynamics (Junichi Minagawa)

III. Economics of Space: Empirical Analysis

10. Public- and Private-School Competition: The Spatial Education Production Function (David M. Brasington)
11. Innovation, R&D Cooperation, and the Geography of Regional Labor Acquisition (Jaakko Simonen and Philip McCann)
12. Taxation of Car Commuters’ Employer-Subsidized Parking (Rickard E. Wall)
13. Forest Protection System and Optimal Land-Use Management Policy in Japan (Masahiro Yabuta and Yasuhito Yamanishi)

IV. Economics of Space: Theoretical Analysis

14. Railway Competition in a Park-and-Ride System (Tatsuaki Kuroda and Kazutoshi Miyazawa)
15. An Analysis of the Relationship Between Manufacturer’s Profit and Spatial Economic Structure in the Retail Market (Toshiharu Ishikawa)
16. Redistribution Through Local Competition (Frederick Guy)

Subject Index

Part I Economics of Time: Keynesian Macrodynamics

1. A Sophisticatedly Simple Alternative to the New-Keynesian Phillips Curve

Reiner Franke
Institute for Monetary Economics, Technical University, Vienna, Austria

Summary. This chapter reconsiders the standard microfoundations of the new-Keynesian Phillips curve. It argues that with heterogeneous and boundedly rational firms, the expectational variable in the Phillips curve refers to a longer time horizon than the next period, and is updated in a gradual manner. However, firms are less naive than adaptive expectations would imply; they also take output data into account, as well as the credibility of the central bank. At the macroeconomic level, the ideas about such an adjustment process are specified in the concept of an “adaptive inflation climate.” Exploratory estimations of the process, where the inflation climate is proxied by a survey measure of expectations, yield reasonable results that are supportive of this approach.

Key words. New-Keynesian Phillips curve, Bounded rationality, Adaptive inflation climate, Survey of Professional Forecasters

1 Introduction

As a theory of nominal rigidity, the new-Keynesian Phillips curve explicitly incorporates forward-looking elements in economic behavior, and accounts for the importance of expectations. Doing this in an elegant way, it is widely used in teaching macroeconomics and in policy analysis, specifically in evaluating the properties of alternative policy rules. However, there is a growing awareness that the model is hard to square with the facts. In particular, the canonical sticky-price model allows for the possibility of disinflationary booms, whereas shocks to monetary policy are well known to have delayed and gradual effects on inflation, such that in the first stages a reduction of inflation is typically accompanied by output losses. Mankiw (2001, p. C59) even concludes that the new-Keynesian Phillips curve “is appealing from a theoretical standpoint, but it is ultimately a failure.”

To give a larger role to inflation inertia, present versions of the new-Keynesian Phillips curve have often built in some backward-looking behavior. The rate of inflation expected for the next period that shows up on the right-hand side of the Phillips curve relationship is here generalized to a weighted average of expected inflation and inflation in the previous period(s). It also appeared that these hybrid expectations could be successfully estimated, and that the forward-looking component receives a significant weight; Galí and Gertler (1999) is only one example among many.¹

¹ In addition, we will indicate below that although this extension of the Phillips curve is straightforward, its microfoundations are presently less rigorous than in the baseline case with purely forward-looking expectations.

In a profound econometric study, these positive results (i.e., their new-Keynesian interpretation) have recently been severely challenged by Rudd and Whelan (2001). The paper first points out that, under conditions that are highly likely to be met in practice, the weight on the forward-looking expectations is upward-biased. It can even be significant when the true model of the inflation dynamics is purely backward-looking.² This holds when, in the estimation equation, the realized rates of inflation are substituted for the expected values, as well as when expected inflation is proxied by a survey measure (cf. Rudd and Whelan 2001, p. 9, fn 12). While this kind of criticism might not apply to full-information estimation techniques, these estimates run the great risk of being inconsistent if any part of the model is misspecified, even if it does not belong to the theoretical core.

² The basic supposition involved is that a dynamic variable that belongs in the true model is erroneously omitted from the test specification.

The tests just mentioned are based on only the most elementary property of the model’s rational expectations hypothesis, according to which the period $t+1$ expectational error should be unforecastable by variables dated $t$ or earlier. Noting this, the authors subsequently turn to the stronger property that rational expectations should be model-consistent, that is, consistent with the process for inflation as described by the model. Given the forcing variable $x_t$ in the Phillips curve, a measure of economic activity or real marginal cost, this additional prediction yields testable implications for the influence that the infinite discounted sum of the expected future values of $x_t$ exerts on the current rate of inflation. In the corresponding reduced-form estimation, the coefficient on this present-value term, if it is at all significant, is extremely small (0.012 at the most) and absolutely dominated by the coefficients on lagged inflation (0.795 as their minimum sum). The authors also check the possibility that lags of inflation are given this large
weight because they already contain so much information about the future of the forcing variable. However, they find no statistically significant effect of this kind, which means the significant role played by lags of inflation in reduced-form Phillips curves cannot be assigned to their proxying for rational expectations of future output gaps or real marginal costs. In sum, the new-Keynesian Phillips curve, its baseline version as well as its extension with hybrid expectations, “cannot serve as an adequate approximation to the empirical inflation process” (Rudd and Whelan 2001, p. 18).³

³ On p. 17, the authors also mention that their results are perfectly consistent with the work of Fuhrer (1997), who arrived at a very similar conclusion by using a different methodology.

Apparently, some modification of the new-Keynesian sticky-price model is needed to explain the role played by lagged dependent variables in inflation regressions. The alternative usually put forward at this stage of the discussion is “adaptive expectations.”⁴ Mankiw (2001, p. C59), for example, notes that this assumption is far from satisfying, and that it would be odd to assert that expectations about inflation are formed without incorporating the detailed news from the permanent media coverage of monetary policy. “Yet,” he continues, “the assumption of adaptive expectations is, in essence, what the data are crying out for.”

⁴ While originally adaptive expectations was a dynamic process of partial adjustments, the expression is now often reduced to merely pointing out the influence of lagged dependent variables.

This chapter proposes an alternative formulation of a Phillips curve relationship whose expectations may be viewed as some compromise between rational expectations and adaptive expectations.⁵ Our price-setting firms are not assumed to be rational in the sense of economic theory, but rational in the sense of using their limited capabilities consciously and intelligently. In contrast to rational expectations, the expectation formation process of these firms is modelled in an explicit way.

⁵ Mankiw and Reis (2002) take a different route to obtain a dynamic response to monetary policy that resembles Phillips curves with lagged inflation rates. While their firms are still supposed to form expectations rationally, they do not do so in every period, but only infrequently. The model is thus a model of sticky information.

The stylized assumptions on the reasoning of human beings facing a complex decision problem seek to follow the famous KISS principle coined by Zellner (1992, 2002), “keep it sophisticatedly simple.” In forming expectations, the firms are at least more sophisticated than the plain adaptive expectations of the inflation rate and take additional variables into account. Since we want to remain within the bounds of the output–inflation nexus, this set of variables is limited to the central bank’s target rate of inflation, and the level and change of the output gap. The approach
is in the spirit of vector autoregressions, although for reasons of parsimony our specifications will be much simpler. We also add that by considering target inflation, there can be a role for credibility of monetary policy, a feature that is lacking in the usual, as they are called, backward-looking models.

An important point is the time horizon of expected rates of inflation. Like most of the new-Keynesian microfoundations, we employ the Calvo framework of time-contingent price setting where the single firm, which does not know when it will next be “allowed” to reset its price, minimizes the expected losses from committing itself to a fixed price over that time interval. These losses do not only depend on next period’s inflation, but on the entire path of future inflation, which the firm cannot possibly forecast and does not even try.⁶ Instead, in period $t$ a firm $f$ seeks to characterize the future time path by a single number $\pi_t^f$. This is the rate of inflation that the firm expects to prevail on average over the whole future, where the averaging procedure is based on Calvo’s probability weights for discounting. More accurately, the firm is aware that its estimation is more or less imprecise, but efforts to reduce the degree of imprecision appear too costly and too unreliable. So the firm contents itself with informed guesses about average future inflation. The guesses are predetermined in a given period, and updated between periods as new information arrives.

⁶ Without rational expectations, it is also of no use to the firm to reformulate the solution of its optimization problem as a one-period ahead forward-looking equation for the reset price.

Moreover, the expectations of firms are not homogeneous, so that the prices that are reset in a given period are dispersed. It is clear that the changes in the aggregate price level depend on the firm-specific rates $\pi_t^f$. In fact, the other components of the firms’ decision problem and, in particular, their expectations regarding future output are treated in such a way that in the end the determination of the economy’s rate of aggregate inflation looks very similar to the new-Keynesian Phillips curve. However, the expected inflation rate for the next period in the latter relationship is replaced with the aggregate of the firms’ guesses $\pi_t^f$. To accentuate this, we will from then on avoid the expression “expectations” and call this aggregate the general inflation climate in the economy.

Accepting the modelling philosophy, everything else hinges on the updating of the inflation climate, which will again be discussed at the macroeconomic level. Although the inflation climate is an unobservable variable, it may be proxied by survey measures of inflation expectations. From studies of their main characteristics, we infer the following guidelines for designing an adjustment process. (1) Firms do not incorporate innovations in inflation
and other variables instantaneously and to their full extent, but the inflation climate responds only partially to this news. (2) There are systematic over- and underpredictions in the surveys, which suggest a (partial) role for an adaptive expectations component. (3) On the other hand, the survey data exhibit mean-reverting tendencies; this motivates us to employ the central bank’s target rate of inflation as a second benchmark toward which the inflation climate gradually seeks to adjust (in addition to the current inflation benchmark). (4) Inflation surveys also account for information other than inflation, which as already indicated leads us to presume reactions of the inflation climate to the output gap as well as to its changes.

We understand these adjustments to be carried out by adaptive firms in an uncertain and constantly changing environment.⁷ The concept of the inflation climate in combination with its updating process is therefore called the adaptive inflation climate, also referred to by the acronym AIC. This module is the theoretical core of this chapter.

⁷ To emphasize the expression “adaptive” beyond the serious scientific paths, we may mention a one-line advertising campaign of hp-invent in Germany in autumn 2004, which was directly formulated in English and read, “Solutions for the adaptive enterprise.” In contrast to ruling theory, the profane business world was apparently less attracted by a slogan like, “The solution for the rational firm.”

As a first check on whether the AIC concept can make economic sense, we proxy the inflation climate by a survey measure of inflation expectations. In this way, the adjustment process can easily be estimated. Although the coefficients are not perfectly robust, the results are reasonable. They are, in any case, sufficiently encouraging to try more elaborate methods in future research.

The remainder of the chapter is organized as follows. Section 2 reconsiders the optimization problem of price-setting firms with heterogeneous expectations, and presents our reinterpretation of the expectational variable in the Phillips curve. Section 3 develops the concept of the adaptive inflation climate. In particular, it makes the parameter explicit that reflects the credibility of monetary policy. The adjustment process is subsequently justified by the exploratory estimation just mentioned. Section 5 concludes.

2 The New-Keynesian Phillips Curve Revisited

2.1 Homogeneous and Heterogeneous Expectations

We begin with a recapitulation of the standard derivation of the new-Keynesian Phillips curve, as based on the Calvo model with its time-contingent price adjustment rules for firms in monopolistic competition. Each firm must precommit to a price until it receives a random signal that
it can change the price. Accordingly, let $(1-\theta)$ be the random fraction of firms that are able to reset their price in any particular time period $t$. Denoting the reset price of a representative firm in that period by $z_t$ and the overall price level by $p_t$ (both expressed in logs), these staggered price changes give rise to

$$p_t = \theta p_{t-1} + (1-\theta) z_t \qquad (1)$$

Designate the frictionless price of the firm as $p_t^*$, which is the price the firm would set in $t$ if there were no price rigidities. It deviates from the prices set by the firm’s competitors according to the current state of demand. Under an upward-sloping supply curve with elasticity $\eta$, the frictionless price may be given by

$$p_t^* = p_t + \eta y_t \qquad (2)$$

where $y_t$ is the output gap, i.e., the percentage deviation of actual output from potential output. In an inflationary environment, however, the reset price should exceed $p_t^*$ since the firm will take into account that its price may be fixed for multiple periods. The firm therefore chooses $z_t$ such that it minimizes an expected “loss function,” a convenient specification of which is the quadratic expression $L(z_t) = \sum_{k=0}^{\infty} \theta^k E_t (z_t - p_{t+k}^*)^2$. Here, $E_t (z_t - p_{t+k}^*)^2$ is the firm’s expected loss in period $t+k$ if it had no opportunity to set a frictionless optimal price either then or previously, and this loss is weighted by the probability $\theta^k$ that its price will remain unchanged until this period. Then the optimality condition from differentiating the loss function with respect to $z_t$, $dL(z_t)/dz_t = 0$, and some elementary rearrangements determine the reset price as

$$z_t = (1-\theta) \sum_{k=0}^{\infty} \theta^k E_t p_{t+k}^* \qquad (3)$$

Even without rigorously solving the optimization problem, it is immediately recognized that $z_t$ is the average desired price until the next price adjustment. Furthermore, Eq. 3 is equivalent to

$$z_t = \theta E_t z_{t+1} + (1-\theta) p_t^* \qquad (4)$$
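The equivalence of Eqs. 3 and 4 can be checked numerically. The following is only a minimal sketch under assumptions that are not part of the chapter: perfect foresight (so the operator $E_t$ drops out), a hypothetical frictionless-price path that rises linearly at rate `g`, an illustrative value of `theta`, and a truncation of the infinite sum at `K` terms.

```python
# Sketch: check that the reset price of Eq. 3 satisfies the recursion of Eq. 4.
# Assumptions (illustrative only): perfect foresight, a linear path for the
# log frictionless price, and truncation of the infinite sum at K terms.

def reset_price(p_star, t, theta, K=2000):
    """Eq. 3 truncated: z_t = (1 - theta) * sum_{k<K} theta**k * p*_{t+k}."""
    return (1.0 - theta) * sum(theta**k * p_star(t + k) for k in range(K))

theta = 0.75   # probability that a price stays fixed (illustrative value)
g = 0.005      # hypothetical per-period rise of the log frictionless price

def p_star(t):
    """Hypothetical frictionless-price path p*_t = 1 + g * t."""
    return 1.0 + g * t

z_t = reset_price(p_star, 0, theta)
z_next = reset_price(p_star, 1, theta)

# Right-hand side of Eq. 4: theta * z_{t+1} + (1 - theta) * p*_t
rhs = theta * z_next + (1.0 - theta) * p_star(0)

# For a linear path the sum in Eq. 3 has the closed form
# z_t = p*_t + g * theta / (1 - theta).
closed_form = p_star(0) + g * theta / (1.0 - theta)

print(z_t, rhs, closed_form)
```

With a rising price path, the reset price lies above the current frictionless price by $g\theta/(1-\theta)$, which illustrates the remark that in an inflationary environment the reset price should exceed $p_t^*$.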

All relationships entering the derivation of the new-Keynesian Phillips curve are made explicit if we lastly apply the expectation operator to the price level Eq. 1 for the next period $t+1$ and solve it for $E_t z_{t+1}$, from which one gets

$$E_t z_{t+1} = \frac{1}{1-\theta} (E_t p_{t+1} - \theta p_t) \qquad (5)$$

Then, substitute Eq. 2 into Eq. 4, solve Eq. 1 for $z_t$, and equate the two resulting expressions for $z_t$, which gives

$$\frac{1}{1-\theta} (p_t - \theta p_{t-1}) = \theta E_t z_{t+1} + (1-\theta)(p_t + \eta y_t) \qquad (6)$$

Substituting Eq. 5 for $E_t z_{t+1}$ leads to

$$\frac{1}{1-\theta} (p_t - \theta p_{t-1}) = \frac{\theta}{1-\theta} (E_t p_{t+1} - \theta p_t) + (1-\theta)(p_t + \eta y_t) \qquad (7)$$

It remains to denote the rate of inflation by $\pi_t = \Delta p_t = p_t - p_{t-1}$, and after a few algebraic manipulations of Eq. 7 one arrives at

$$\pi_t = E_t \pi_{t+1} + \frac{(1-\theta)^2 \eta}{\theta} y_t \qquad (8)$$
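The “few algebraic manipulations” leading from Eq. 7 to Eq. 8 can be spelled out as follows. This is one possible route, written here as a worked sketch rather than as the author’s own derivation; it uses only the symbols already defined and the fact that $p_t$ is known in period $t$, so that $E_t \pi_{t+1} = E_t p_{t+1} - p_t$.

```latex
\begin{align*}
p_t - \theta p_{t-1}
  &= \theta\,(E_t p_{t+1} - \theta p_t) + (1-\theta)^2 (p_t + \eta y_t)
  && \text{(Eq. 7 multiplied by } 1-\theta\text{)} \\
2\theta p_t - \theta p_{t-1}
  &= \theta E_t p_{t+1} + (1-\theta)^2 \eta y_t
  && \text{(collect } p_t\text{, using } 1 + \theta^2 - (1-\theta)^2 = 2\theta\text{)} \\
\theta\,(p_t - p_{t-1})
  &= \theta\,(E_t p_{t+1} - p_t) + (1-\theta)^2 \eta y_t
  && \text{(subtract } \theta p_t \text{ from both sides)} \\
\pi_t
  &= E_t \pi_{t+1} + \frac{(1-\theta)^2 \eta}{\theta}\, y_t
  && \text{(divide by } \theta\text{)}
\end{align*}
```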

It should be noted that the expectations in Eq. 8 need not necessarily be rational. The key assumption in deriving the relationship was to treat all firms alike, so that we have just spoken of “the” firm as a representative of all price-changing firms.

As hinted at in the introduction, when employing rational expectations in Eq. 8, this forward-looking Phillips curve is well known for several undesirable features such as lack of inertia or disinflationary booms. A straightforward extension that seeks to overcome these deficiencies is to allow for some form of lagged dependence in inflation. With one lag, for simplicity, the usual specification is

$$\pi_t = \mu E_t \pi_{t+1} + (1-\mu) \pi_{t-1} + \beta_y y_t \qquad (9)$$

where $\beta_y > 0$ and $0 \le \mu \le 1$. This relationship, perhaps with some lags of the rate of inflation added, represents the contemporary consensus on a theoretical new-Keynesian model of inflation. While the primary justification for Eq. 9 is empirical, its theoretical background has largely been left to the fringes of the discussion. One proposal to derive Eq. 9 in the present Calvo framework, although with a more elaborated firm sector, has been advanced by Christiano et al. (1999). They maintain the hypothesis of homogeneous firms with rational expectations, but relax the assumption that the firms that are not allowed to reoptimize commit themselves to their old price; the latter rather increase their price automatically at the most recent rate of inflation.⁸ A similar device is Woodford’s (2003, sect. 3.2) backward-looking indexation in the firms’ setting of optimal prices.

⁸ With a discount factor of unity, $\mu = 1/2$ then results.

Often, however, equations like Eq. 9 are introduced as reflecting forward-looking and backward-looking “behavior.” This characterization suggests that the economy is populated by two (or more) types of firm, which form their expectations about inflation in different ways. Generally, a continuum of firms that are distributed over the unit interval may be assumed, with density function $s = s(f)$. A single firm is identified as $f \in [0, 1]$, and firm-specific variables are labelled by a superscript $f$. In particular, let $E_t^f$ be the operator that incorporates the rules according to which this firm forms its expectations. Then the following reformulation of Eq. 8 suggests itself:

$$\pi_t = \int_0^1 E_t^f \pi_{t+1} \, ds(f) + \frac{(1-\theta)^2 \eta}{\theta} y_t \qquad (10)$$
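The integral in Eq. 10 is simply a population average of the firms’ individual forecasts. A minimal numerical sketch, under assumptions chosen only for illustration (firms spread uniformly over $[0, 1]$, so that $ds(f) = df$, and just two hypothetical forecasting rules with made-up numbers), shows that the average reduces to a share-weighted mean of the two forecasts:

```python
# Sketch of the aggregation integral in Eq. 10 under illustrative assumptions:
# firms are uniformly distributed over [0, 1]; firms with f < share_fwd use one
# forecast of next period's inflation, all other firms use a second forecast.

def aggregate_expectation(share_fwd, forecast_fwd, forecast_bwd, n=100_000):
    """Midpoint-rule approximation of the integral of E_t^f pi_{t+1} over f."""
    total = 0.0
    for i in range(n):
        f = (i + 0.5) / n                     # midpoint of the i-th subinterval
        total += forecast_fwd if f < share_fwd else forecast_bwd
    return total / n

share_fwd = 0.4       # hypothetical share of firms using the first rule
forecast_fwd = 0.03   # hypothetical forecast of the first group
forecast_bwd = 0.02   # hypothetical forecast of the second group

integral = aggregate_expectation(share_fwd, forecast_fwd, forecast_bwd)
weighted_mean = share_fwd * forecast_fwd + (1.0 - share_fwd) * forecast_bwd

print(integral, weighted_mean)
```

The two-group split and the uniform density are simplifications; any other density $s(f)$ would only change the weights in the average.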

Specifically, a fraction $\mu$ of the population may be assumed to have rational expectations, which may here be expressed as $E_t^f \pi_{t+1} = E_t \pi_{t+1}$ for a firm $f$, if the symbol $E_t$ is reserved for model-consistent expectations. The rest of the firms, labelled $f'$, are supposed to entertain backward-looking expectations in the simple form $E_t^{f'} \pi_{t+1} = \pi_{t-1}$. Thus,

$$\int_0^1 E_t^f \pi_{t+1} \, ds(f) = \mu E_t \pi_{t+1} + (1-\mu) \pi_{t-1} \qquad (11)$$

and combining Eqs. 10 and 11 obviously gives rise to the hybrid Phillips curve in Eq. 9. But precisely how, then, is Eq. 10 supposed to come about? It might be expected that Eq. 9 or Eq. 10 can be derived in the same way as the baseline case, Eq. 8.⁹ For a careful investigation of this conjecture, we have to go back to the current and future reset price and make the aggregation procedure explicit. Let $z_t^f$ be the reset price of firm $f$ in period $t$, and $I_t \subset [0, 1]$ the set of firms (with mass $1-\theta$) changing their price in that period. Suitably scaled, the aggregate reset price is therefore $(1-\theta)^{-1}\int_{I_t} z_t^f\, ds(f)$. Assume furthermore that the distribution of the heterogeneous expectations in the set $I_t$ does not differ from their distribution in the total population of firms.¹⁰ This makes the aggregate reset price independent of the outcome of the random process selecting the firms that can charge a new price, and we have $z_t = \int_0^1 z_t^f\, ds(f)$. Denoting the frictionless price of firm $f$ by $p_t^f$, the firm-specific version of Eq. 4 is $z_t^f = \theta E_t^f z_{t+1}^f + (1-\theta)p_t^f$, and the aggregate version reads

$$z_t = \int_0^1 z_t^f\, ds(f) = \theta \int_0^1 E_t^f z_{t+1}^f\, ds(f) + (1-\theta)\int_0^1 p_t^f\, ds(f)$$

On the other hand, with $\int_0^1 y_t^f\, ds(f) = y_t$, the counterpart of Eq. 6 is given by

$$\frac{1}{1-\theta}\,(p_t - \theta p_{t-1}) = \theta\int_0^1 E_t^f z_{t+1}^f\, ds(f) + (1-\theta)(p_t + \eta y_t) \tag{12}$$

⁹ Galí and Gertler (1999, pp. 210f) offer an alternative proposal to those of Christiano et al. (1999) and Woodford (2003) for deriving Eq. 9, one that has explicit recourse to forward- and backward-looking firms. Here the backward-looking firms’ reset price is specified as the average reset price of the two types of firm in the previous period plus a correction for the rate of inflation it has brought about. Interestingly, the authors themselves call this rule “admittedly ad hoc.” The resulting coefficient $\beta_y$, by the way, is similar, but not identical, to $(1-\theta)^2\eta/\theta$ in Eq. 8.

¹⁰ The assumption is acceptable if $1-\theta$ is not too small, a feature that is corroborated by Blinder et al. (1998). They report that the median US firm adjusts its price once per year, which for a quarterly model gives us $1-\theta = 1/4$.

If we seek to proceed with this equation analogously to Eq. 6, a fundamental difference from the economy with homogeneous expectations becomes apparent. Recall that at this stage of the argument, $E_t z_{t+1}$ in Eq. 6 was replaced using Eq. 5. In the homogeneous case, the latter equation could be thought of as follows. The single firm not only knows that the price level is updated every period according to Eq. 1, but also that it shares its expectations with the other firms. Hence Eq. 5 equally holds for its own reset price, and the firm can use this relationship for a determination of $E_t z_{t+1}$ that directly refers to expectations about aggregate prices. In contrast, in the present framework with heterogeneous expectations, Eq. 5 for the future aggregate reset price becomes (after an obvious rearrangement)

$$\int_0^1 E_t^f p_{t+1}\, ds(f) = \theta p_t + (1-\theta)\int_0^1 E_t^f z_{t+1}^f\, ds(f) \tag{13}$$

If, by analogy with Eq. 5, this equation were solved for the integral on the right-hand side and this term substituted in Eq. 12, we could indeed continue as before.¹¹ But what does Eq. 13 mean? Of course, it would be true if each firm $f$ believed that its future reset price and the aggregate price level are related by

$$E_t^f p_{t+1} = \theta p_t + (1-\theta)E_t^f z_{t+1}^f \tag{14}$$

However, the individual firm is negligibly small, and if it reckons on the other firms having different expectations about their future reset prices, it will see no reason for this relationship to hold. More precisely, while the single firm may well have expectations about prices in the next period, it will not conceive the general price level $p_{t+1}$ as having anything essential to do with its own reset price in $t+1$. Equation 14, or any modification of it that maintains the term $E_t^f z_{t+1}^f$, will simply be meaningless for an individual firm. As a consequence, there is no conceptual basis for the aggregate version, Eq. 13, to come into being either.

¹¹ In finer detail, this procedure would involve an additional assumption about the firms’ expectations of other firms’ expectations, for which, in short, no predictable movements are expected. This feature is spelled out in Adam and Padula (2002). On the other hand, the authors use an equivalent formulation of Eq. 13 without any further comment (see Eq. 10 in their Appendix).


We conclude that when heterogeneous expectations of firms are allowed for, the extended Phillips curve relationship in Eq. 10 cannot be established along the lines of the standard argument. In particular, substituting $\mu E_t\pi_{t+1} + (1-\mu)\pi_{t-1}$ for $E_t\pi_{t+1}$ in Eq. 8 has so far no rigorous microeconomic underpinnings that are comparable to those of the baseline case, where $\mu = 1$. In this sense, hybrid expectations appear to contain some “ad hoc” element. This observation motivates us to take one more step and reconsider the kind of reasoning by which single firms determine their reset price. In this way, we will, in particular, face the question of the time horizon of expectations in an alternative interpretation of the Phillips curve.

2.2 Reconsidering the Reset Price and Expectations About Inflation

To re-examine the firm’s decision problem about its reset price, we return to the solution formula of Eq. 3 that minimizes the expected losses for the periods in which the firm will not be capable of changing the price. Assuming that the reset probability $(1-\theta)$ is common knowledge, and taking account of the desired frictionless price (Eq. 2), the reset price $z_t^f$ of firm $f$ is given by

$$z_t^f = (1-\theta)\sum_{k=0}^{\infty}\theta^k E_t^f\left(p_{t+k} + \eta\, y_{t+k}^f\right) \tag{15}$$

To be clear, we henceforth abjure rational expectations for any firm. As has been pointed out in the previous section, because of its negligible size it is not meaningful for a single firm to relate its future reset price to the expected aggregate price level. Each firm thus has to cope with Eq. 15 in another, more direct, way. It would be an enormous task for a firm to predict an entire time profile of future price levels $p_{t+k}$ and the demand directed to itself, from which its output gaps $y_{t+k}^f$ derive. This would ask too much of the major economic research institutes of a country, and firm $f$ is not supposed to be more able in this respect. Therefore, the firm decomposes Eq. 15 into a few elements, which it then treats in an approximate manner. The central idea is that firm $f$ captures future inflation inherent in the price series $p_{t+k}$ by a single number $\pi_t^f$. As discussed below, $\pi_t^f$ will change over time; hence the time index $t$. In a first description, $\pi_t^f$ could be conceived as the firm’s expected average rate of inflation for the whole future from $t$ on, where the terms regarding the period $t+k$ are to be discounted by the probability weight $\theta$. In detail, consider the expected price level for $t+k$ and, invoking the rates of inflation until that time, write it as $E_t^f p_{t+k} = p_t + \sum_{j=1}^{k} E_t^f \pi_{t+j}$. When the firm seeks to grasp future inflation by its summarizing rate $\pi_t^f$, we have

$$E_t^f p_{t+k} = p_t + \sum_{j=1}^{k}\left[\pi_t^f + E_t^f\pi_{t+j} - \pi_t^f\right] = p_t + k\pi_t^f + \sum_{j=1}^{k}\left(E_t^f\pi_{t+j} - \pi_t^f\right)$$


Denoting the accumulated residuals for period $t+k$ by $e_{t+k}^f = e_{t+k}^f(\pi_t^f)$, the expected price level is split up into $E_t^f p_{t+k} = p_t + k\pi_t^f + e_{t+k}^f(\pi_t^f)$, where

$$e_{t+k}^f(\pi_t^f) = \sum_{j=1}^{k}\left(E_t^f\pi_{t+j} - \pi_t^f\right) \tag{16}$$

What the number $\pi_t^f$ should accomplish is that these residuals average out on the whole. Realistically, however, the firm sees itself in an uncertain inflationary environment. It has no reliable basis on which to build up a probability distribution of future inflation rates and on which it could really expect the residuals to cancel out by a suitable choice of $\pi_t^f$. On the other hand, the firm has to settle on a definite reset price in the end. So it just proceeds as if $\pi_t^f$ were able to satisfy this condition. The firm is aware that this will not be exactly the case, but for lack of a better (and not more costly) procedure it is willing to accept the possible errors. Accordingly, in using Eq. 16 to determine its reset price $z_t^f$, the firm sets the discounted errors equal to zero:

$$\sum_{k=1}^{\infty}\theta^k e_{t+k}^f(\pi_t^f) = 0 \tag{17}$$
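To see what the condition in Eq. 17 would imply if a firm did have a definite path of expected inflation rates, the following minimal sketch solves a finite-horizon version of Eq. 17 for the summarizing rate. The truncation at a finite horizon, the function names, and the numerical paths are assumptions made purely for illustration; the text itself stresses that firms have no reliable basis for such a calculation and only proceed as if the condition held.

```python
def summary_inflation_rate(expected_inflation, theta):
    """Rate pi_f that sets the theta-discounted accumulated residuals
    (Eq. 16 inserted in Eq. 17) to zero over a finite horizon.

    expected_inflation: hypothetical list [E pi_{t+1}, ..., E pi_{t+K}]
    theta: probability that a firm cannot reset its price in a period
    """
    numerator = 0.0    # sum_k theta^k * (E pi_{t+1} + ... + E pi_{t+k})
    denominator = 0.0  # sum_k theta^k * k
    cumulative = 0.0
    for k, rate in enumerate(expected_inflation, start=1):
        cumulative += rate
        numerator += theta**k * cumulative
        denominator += theta**k * k
    return numerator / denominator


def discounted_residuals(expected_inflation, theta, pi_f):
    """Left-hand side of Eq. 17, truncated at the same horizon."""
    total = 0.0
    accumulated = 0.0
    for k, rate in enumerate(expected_inflation, start=1):
        accumulated += rate - pi_f      # e_{t+k} as defined in Eq. 16
        total += theta**k * accumulated
    return total


if __name__ == "__main__":
    # Hypothetical path: inflation expected to fall from 5% to 2%.
    path = [0.05, 0.04, 0.03] + [0.02] * 37
    pi_f = summary_inflation_rate(path, theta=0.75)
    print(pi_f, discounted_residuals(path, 0.75, pi_f))
```

With a constant expected path the function simply returns that constant; with a declining path the early periods carry most of the weight, which is the sense in which the rate is a discounted average.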

This rule completes the treatment of future prices in Eq. 15. Turning to the second component in this equation, the firm still has to form expectations about its output:

$$\sum_{k=0}^{\infty}\theta^k E_t^f y_{t+k}^f = y_t^f + \sum_{k=1}^{\infty}\theta^k E_t^f y_{t+k}^f \tag{18}$$

Since the output gap fluctuates around zero in the long run, in this equation no secular growth rate has to be assessed. The values $y_{t+k}^f$ for the next few periods will nevertheless be relevant. In a similar spirit as before, the firm could work here with a single growth rate $g_t^f$ that it expects, or rather presumes, to prevail on average over the near future. We abstain from this approach, however, since we want to stay close to the conventional Phillips curve, where output expectations have no explicit role to play. Instead, let the firm assume that, in the absence of foreseen shocks, output in the near future (or demand, for that matter) is still essentially determined by its present level and growth rate. Thus, firm $f$ approximates the infinite sum on the right-hand side in Eq. 18 by

$$\sum_{k=1}^{\infty}\theta^k E_t^f y_{t+k}^f = \xi_1 y_t^f + \xi_2 \Delta y_t^f \tag{19}$$


Generally, the coefficients $\xi_1, \xi_2 \ge 0$ may vary across firms, but to simplify the notation we treat them directly as homogeneous. $\xi_1$ and $\xi_2$ might also be supposed to be zero, by arguing that the firm has already sought to eliminate Eq. 19 by its choice of $\pi_t^f$ (again, by and large at least). This case would not be too unreasonable since, when updating their current value of $\pi_t^f$, firms will take account of the output gap and its change (this is explained in greater detail below). Combining the rules and approximations in Eqs. 15–19, the reset price $z_t^f$ on which firm $f$ decides is given by

$$z_t^f = (1-\theta)\sum_{k=0}^{\infty}\theta^k p_t + \pi_t^f\sum_{k=0}^{\infty}(1-\theta)\theta^k k + (1-\theta)\eta\left[(1+\xi_1)y_t^f + \xi_2\Delta y_t^f\right] \tag{20}$$

The remaining two infinite series in Eq. 20 can easily be resolved. Since the $\theta^k$ sum to $1/(1-\theta)$, the first term on the right-hand side is just $p_t$. The second infinite series can be written as $\theta\sum_{k=1}^{\infty}(1-\theta)\theta^{k-1}k$ or, putting $q = 1-\theta$, as $\theta\sum_{k=1}^{\infty}(1-q)^{k-1}q\,k$. The latter series is precisely the mean value of the geometric distribution, which can be looked up in a formulary as $1/q$ (in addition, this is also the mean waiting time for a firm until it can change its price). Hence, the second term on the right-hand side of Eq. 20 equals $\pi_t^f\theta/q = \theta\pi_t^f/(1-\theta)$. To sum up, we arrive at the following determination of the reset price of firm $f$:

$$z_t^f = p_t + \frac{\theta}{1-\theta}\,\pi_t^f + (1-\theta)\eta\left[(1+\xi_1)y_t^f + \xi_2\Delta y_t^f\right] \tag{21}$$
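The two series used in passing from Eq. 20 to Eq. 21 can be checked numerically. The sketch below truncates both sums at a large number of terms and then evaluates Eq. 21; the function names, the truncation length, and the parameter values in the usage lines are illustrative assumptions, not values taken from the chapter.

```python
def geometric_sums(theta, n_terms=2000):
    """Truncated versions of the two infinite series in Eq. 20.

    Returns (sum_k (1-theta)*theta^k, sum_k (1-theta)*theta^k*k),
    which should approach 1 and theta/(1-theta), respectively.
    """
    weight_sum = sum((1 - theta) * theta**k for k in range(n_terms))
    mean_lag = sum((1 - theta) * theta**k * k for k in range(n_terms))
    return weight_sum, mean_lag


def reset_price(p, pi_f, y_f, dy_f, theta, eta, xi1=0.0, xi2=0.0):
    """Reset price of firm f according to Eq. 21 (prices in logs)."""
    output_term = (1 - theta) * eta * ((1 + xi1) * y_f + xi2 * dy_f)
    return p + theta / (1 - theta) * pi_f + output_term


if __name__ == "__main__":
    theta = 0.75  # quarterly value implied by 1 - theta = 1/4
    print(geometric_sums(theta))
    print(reset_price(p=1.0, pi_f=0.01, y_f=0.0, dy_f=0.0, theta=theta, eta=0.5))
```

For $\theta = 3/4$ the mean waiting time $1/(1-\theta)$ is four quarters, so the firm marks up the current price level by three quarters' worth of its summarizing inflation rate.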

2.3 A Reinterpretation of Expectations in the Phillips Curve

Having available the firms’ reset prices in Eq. 21, it is now straightforward to derive a Phillips curve. To this end, for the aggregation of the firm-specific rates of inflation $\pi_t^f$, we define

$$\pi_t^c = \int_0^1 \pi_t^f\, ds(f) \tag{22}$$

The aggregate reset price is already known to be given by $z_t = \int_0^1 z_t^f\, ds(f)$. Thus, Eq. 21 gives rise to the aggregate equation $z_t = p_t + \theta\pi_t^c/(1-\theta) + A$, with $A := (1-\theta)\eta[(1+\xi_1)y_t + \xi_2\Delta y_t]$. Substituting this expression for $z_t$ in Eq. 1 for the staggered price changes yields $p_t = (1-\theta)p_t + (1-\theta)\theta\pi_t^c/(1-\theta) + (1-\theta)A + \theta p_{t-1}$, or

$$\theta\pi_t = \theta(p_t - p_{t-1}) = \theta\pi_t^c + (1-\theta)A$$


The Phillips curve relationship we thereby obtain reads

$$\pi_t = \pi_t^c + \frac{(1-\theta)^2\eta}{\theta}\left[(1+\xi_1)y_t + \xi_2\Delta y_t\right] \tag{23}$$
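The step from the aggregate reset price to Eq. 23 can be verified numerically. The sketch below assumes that Eq. 1 has the staggered-pricing form $p_t = \theta p_{t-1} + (1-\theta)z_t$, as read off from the substitution step above; the function names and the parameter values are hypothetical.

```python
def phillips_curve(pi_c, y, dy, theta, eta, xi1=0.0, xi2=0.0):
    """Inflation rate implied by Eq. 23."""
    slope = (1 - theta) ** 2 * eta / theta
    return pi_c + slope * ((1 + xi1) * y + xi2 * dy)


def aggregate_reset_price(p, pi_c, y, dy, theta, eta, xi1=0.0, xi2=0.0):
    """Aggregate counterpart of Eq. 21: z_t = p_t + theta*pi_c/(1-theta) + A."""
    a_term = (1 - theta) * eta * ((1 + xi1) * y + xi2 * dy)
    return p + theta * pi_c / (1 - theta) + a_term


if __name__ == "__main__":
    theta, eta = 0.75, 0.5               # illustrative values only
    p_prev = 1.0                         # last period's (log) price level
    pi = phillips_curve(0.02, 0.01, 0.004, theta, eta, xi1=0.3, xi2=0.2)
    p = p_prev + pi                      # price level implied by Eq. 23
    z = aggregate_reset_price(p, 0.02, 0.01, 0.004, theta, eta, xi1=0.3, xi2=0.2)
    # Assumed form of Eq. 1: the two sides should agree up to rounding.
    print(p, theta * p_prev + (1 - theta) * z)
```

The price level generated by Eq. 23 reproduces itself when inserted into the assumed price-updating rule, which is what the derivation requires.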

While the analogy to Eq. 8 is obvious, two important distinctions should be drawn. The first one is immediate: in Eq. 23 not only the output gap but also its rate of change enters (which is the difference between the growth rates of actual total output and potential output). This could perhaps be an attractive feature for empirical estimations. Conceptually more significant, however, is the second distinction, that $E_t\pi_{t+1}$ in Eq. 8 is here replaced with $\pi_t^c$. From what has been said above, it is clear that this variable is no longer meant to capture expectations about next period’s inflation. For each single firm, the $\pi_t^f$ that constitute $\pi_t^c$ are rather the firms’ time averages of all future inflation rates, where the averaging procedure uses the probability $\theta$ for discounting. Moreover, the firms’ choice of $\pi_t^f$ cannot be based on consistent mathematical expectations such that the expected deviations of $\pi_t^f$ from the future $\pi_{t+k}$ offset each other in the long run, in the sense of Eq. 17; hence, this choice has to be made in more informal ways, or it includes more intuitive elements. For this reason, we completely avoid speaking of expectations in connection with Eq. 23. As an alternative, we prefer to call the aggregate variable $\pi_t^c$ a general inflation climate. Of course, to breathe life into this notion it has to be made clear how firms come to set up their rates $\pi_t^f$, or how the general inflation climate is supposed to change. These dynamic adjustments, studied directly in the aggregate, are the subject of the next section.

3 Dynamic Adjustments of the Inflation Climate

3.1 Motivational Background

As the notion of the inflation climate $\pi_t^c$ has been introduced, there is no direct empirical counterpart to which this variable could be related. However, survey measures of expected inflation are data that come relatively close to it. These types of data appear attractive insofar as they are obtained from real-world agents who, like our firms, are not blessed with rational expectations, but derive their forecasts from a mix of formal methods, sophisticated reasoning, and intuition.¹² Hence, although we have emphasized that $\pi_t^c$ is not a rate of inflation that firms expect to prevail in the next period (or four quarters ahead, for that matter), survey data may be stimulating in designing or evaluating the mechanisms that determine the movements of the inflation climate. Since $\pi_t^c$ is an average rate across the population of firms, our discussion of the survey expectations will only be concerned with the mean of the forecasted values.¹³

The first question about survey expectations is, of course, their rationality; that is, whether these subjects process the information available to them as if, in the mean, they were perfectly rational. Several studies have examined the statistical properties of such inflation measures, where departures from rationality are usually identified as rejections of the hypotheses of unbiasedness and efficiency. While there are a number of interesting details concerning results over subperiods and for single measures, it seems fair to say that on balance the evidence points to rejections of these null hypotheses (see Thomas 1999; Andolfatto et al. 2002, sect. 2; Mehra 2002). On the other hand, the literature on surveys of inflation expectations has also revealed that they are not as unsophisticated as simple autoregressive models (“adaptive expectations”) suggest. Mullineaux (1980) and Gramlich (1983), for example, had already found that additional information like the money supply helps explain movements in the survey data, even when lagged inflation is controlled for (see also Croushore 1998, for more recent evidence). In view of this evidence, Roberts (1997, 1998) has developed the idea that survey expectations are an average of rational expectations and adaptive expectations, as captured by lagged inflation. As he stresses himself, however, this approach represents only an empirical regularity, and does not correspond to structural models derived from underlying economic behavior (Roberts 1998, p. 9).

¹² However, intuition can later be rationalized by referring to more acknowledged arguments while ignoring others.
Furthermore, even if this work could make a case for rational expectations as one component of a structural model, this does not necessarily rule out that alternative concepts also have comparable explanatory power.¹⁴

¹³ Branch (2004) has recently advanced, and successfully estimated, an ambitious model that seeks to explain the dynamics of the total population of forecasts. It assumes that each forecaster employs one of three available predictors (corresponding to static, adaptive, and vector autoregressive expectations), which he may change from period to period according to their costs and current performance. As a side remark regarding the population share of fully rational forecasters that Roberts (1998) estimates in his simpler framework (as we mention in the next footnote), it might be interesting to note that Branch (2004, pp. 617–619), when he replaces the vector autoregression (VAR) expectations with rational expectations, obtains an estimate of 11% for this proportion, which is much smaller than that of Roberts.


Another approach in the literature explicitly denies rational expectations to the agents, but views them as smarter than simply sticking to adaptive expectations. As a fairly modest deviation from rational expectations, the agents are treated as econometricians; they have estimated a dynamic reduced-form structure of the economy in the form of a vector autoregression (VAR), and use this model to predict inflation in the next, or next few, periods. This type of behavior has been called “near-rational,” since it produces forecast errors that are only a little worse than those of fully rational expectations. When the VAR module is incorporated into a complete model economy, the regression coefficients may be exogenous and invariant over the simulated sample runs, as, for example, in Ball (2000) and Huh and Lansing (2000), or they may even be endogenous, where in a learning process agents update them in each period, as in Orphanides and Williams (2002, 2003) or (for an AR inflation process) in Tetlow and von zur Muehlen (2003). In our opinion, this approach to modelling expectations is a more appealing compromise between rational expectations and adaptive expectations than the one proposed by Roberts (1997, 1998). The main reason why we do not adopt such a VAR to determine our inflation climate is, again, the argument that $\pi_t^c$ is not the inflation to be expected for a particular period in the future. Also, we strive for a simpler specification than a complete VAR. Nevertheless, the adjustment mechanism that we will put forward for $\pi_t^c$ can be viewed as being in the same spirit; it could even be interpreted as a particularly simple (or degenerate) VAR, which, especially regarding the number of lags, has been subjected to a great number of a priori restrictions. Whether, or in what sense, our simplified specification is compatible with the data has to be investigated at a later stage.
When looking at time series of surveys of expectations, three features emerge that a model determining the inflation climate $\pi_t^c$ should reflect. First, at least at a quarterly frequency, the expectation series are typically smoother than the inflation rates to be predicted (cf. Figs. 1–3 in Thomas 1999). This observation suggests that new information is not immediately, or not to its full extent, taken into account; instead, expected inflation appears to adjust to new information in a partial way, or with some delay. Incidentally, for imperfect agents in an uncertain world this behavior is by no means “irrational,” but can very well be shown to be optimal (Heiner 1988). Expressing our inflation climate $\pi_t^c$ directly as a function of the other endogenous variables in a model would imply that $\pi_t^c$ exhibits the same volatility. We therefore take the partial adjustments explanation of the observed smoothness in the surveys as an inspiration to treat $\pi_t^c$ as a predetermined variable, which in the light of incoming information is then updated for the next period in a gradual manner.

A second feature of the survey expectations is that they tend to underestimate inflation during expansions and to overestimate it during contractions (Thomas 1999, p. 134). This lagging behind the actual series is typical of an adaptive expectations rule. A more detailed examination of this phenomenon reveals a third feature. Underestimations of inflation also occurred during periods in which inflation exceeded its average. Conversely, overestimations occurred during periods of below-average inflation. The interpretation of this finding is that the forecasts exhibit a propensity to expect inflation to revert toward its longer-run average value (Thomas 1999, p. 134). The latter two features lead us to incorporate an adaptive expectation and a mean-reverting component into the adjustment mechanism for $\pi_t^c$. In addition, as a reflection of the VAR method, the inflation climate should respond to movements of variables other than inflation. Since we want to remain in the framework of elementary price Phillips curves, the only variable that we will consider here is the output gap.

¹⁴ Actually, Roberts’s estimations seem to be fraught with econometric subtleties. While in Roberts (1997, p. 190) the results “indicate that the departure of survey expectations from rationality is not adequately captured by the simple weighted-average model,” and they suggest “that some other model of the deviation of the surveys from rationality should be preferred” (Roberts 1997, p. 191), the estimations presented in Roberts (1998) are more favorable, and here the relevant parameters are significant at a meaningful order of magnitude. The latter results lead him to the conclusion “of an ‘intermediate’ degree of rationality, with perhaps 20 to 40 percent of the population using simple, backward-looking expectations” (Roberts 1998, p. 15).

3.2 Specification of the Adaptive Inflation Climate (AIC)

Having sketched this motivational background, we now turn to a detailed description of the dynamic adjustments of the inflation climate. To begin with, the inflation climate $\pi_t^c$ is predetermined in a given period $t$, and modified by the firms at the beginning of the next period as new information arrives. The updating procedure is based on the notion of a general benchmark rate of inflation, toward which the current value of $\pi_t^c$ is adjusted in a gradual manner. The benchmark itself is a combination of four single benchmark components. Needless to say, the composite benchmark is not fixed, but generally varies over time.

A first requirement of a benchmark concept is, of course, that the inflation climate should not be persistently above or below the current rate of inflation $\pi_t$. Hence, $\pi_t$ is one of the four single benchmarks toward which $\pi_t^c$ is to adjust. Formally, this amounts to the above-mentioned adaptive expectations. In this respect, however, we emphasize the following three points. (i) $\pi_t^c$ is not the expected inflation in the next quarter $t+1$, but is held to be some sort of average over a longer period of time. (ii) Firms are aware that the benchmark $\pi_t$ is a moving target, which is also quite volatile in the presence of random shocks. In particular, since they do not have perfect knowledge of the probability laws of the shocks, the firms are reasonably cautious and do not adjust $\pi_t^c$ too strongly in the direction of $\pi_t$; this is basically Heiner’s (1988) argument. (iii) Current inflation is only one guideline among several others for updating the inflation climate. For all these reasons we will avoid the term “adaptive expectations” altogether.

Nevertheless, on many occasions $\pi_t$ would not be very useful as an overall benchmark. For example, this becomes clear in a boom phase of the business cycle. When firms expect an additional upward pressure on prices in this situation, $\pi_t$ as a benchmark may be considered to be corrected upward in proportion to the degree of capacity utilization, which in the present setting is represented by the output gap $y_t$. On second thoughts, the idea of rising inflation in a stage of overutilization may not be entirely convincing if the economy has reached a turning point and a contraction is already underway. Recognizing such a pattern is often difficult without the data from the next several quarters, so in this respect firms may just have a look at the recent change in capacity utilization. This means that, in a third concept, firms correct the benchmark $\pi_t$ for a growth component. Accordingly, we suppose that they correct $\pi_t$ for this purpose in proportion to the change in the output gap, $\Delta y_t$. Whereas the latter two concepts primarily seek to extrapolate the current trend in the motion of the rate of inflation, a longer-run time perspective is also relevant.
In accordance with the above-mentioned mean-reverting component in the survey expectations, we think of firms imagining that in a not too distant future the central bank will succeed in bringing inflation back to normal, where an obvious measure of “normal” inflation is given by the central bank’s target rate of inflation. If monetary policy is assumed to be sufficiently transparent, we can work directly with a definite target rate of inflation $\pi^*$ that is known to all firms. This constitutes the fourth benchmark toward which the inflation climate may partially adjust. The four concepts of sensible benchmark rates of inflation are summarized in Table 1. This table also contains our notation for the proportionality factors ($\zeta$) by which the benchmark $\pi_t$ is corrected in the second and third concept; for the speeds of adjustment ($\delta$) at which $\pi_t^c$ is, respectively, updated in the direction of the four single benchmarks; and for the weights ($w$) that each of these concepts carries when they are combined into a closed formula.

Table 1. The four updating concepts for the inflation climate $\pi_t^c$

| Concept | Provisional benchmark for $\pi_t^c$ to adjust to | Adjustment speed | Weight of concept |
|---|---|---|---|
| Current inflation | $\pi_t$ | $\delta_\pi$ | $w_\pi$ |
| Output-adjusted inflation | $\pi_t + \zeta_y y_t$ | $\delta_y$ | $w_y$ |
| Growth-adjusted inflation | $\pi_t + \zeta_g \Delta y_t$ | $\delta_g$ | $w_g$ |
| Target inflation | $\pi^*$ | $\delta_s$ | $w_s$ |

As well as $\zeta_y, \zeta_g > 0$, the adjustment speeds and weights of the four concepts satisfy $0 \le \delta_a, w_a \le 1$ ($a = \pi, y, g, s$) and $\sum_a w_a = 1$.

Thus, the changes in the general inflation climate are described as

$$\pi_{t+1}^c = \pi_t^c + w_\pi\Delta\pi^{c,\pi} + w_y\Delta\pi^{c,y} + w_g\Delta\pi^{c,g} + w_s\Delta\pi^{c,s} \tag{24}$$

where

$$\begin{aligned}
\Delta\pi^{c,\pi} &:= \delta_\pi(\pi_t - \pi_t^c), & \Delta\pi^{c,y} &:= \delta_y(\pi_t + \zeta_y y_t - \pi_t^c), \\
\Delta\pi^{c,g} &:= \delta_g(\pi_t + \zeta_g\Delta y_t - \pi_t^c), & \Delta\pi^{c,s} &:= \delta_s(\pi^* - \pi_t^c)
\end{aligned}$$

(the index “g” alludes to the growth component that is captured by $\Delta y_t$, and “s” alludes to the star symbol in the target rate of inflation). It goes without saying that with $\pi^c = \pi^*$, Eq. 24 supports the steady state $\pi = \pi^*$, $y = 0$ of the economy.

The structural representation of the dynamics of the inflation climate contains ten behavioral coefficients. While they are useful in order to distinguish the four single benchmark concepts from each other, they are not all needed in the further analysis. We reduce them to the following four parameters:

$$\begin{aligned}
\alpha_c &:= w_\pi\delta_\pi + w_y\delta_y + w_g\delta_g + w_s\delta_s, & \gamma &:= w_s\delta_s/\alpha_c, \\
\alpha_y &:= w_y\delta_y\zeta_y/(w_\pi\delta_\pi + w_y\delta_y + w_g\delta_g), & \alpha_g &:= w_g\delta_g\zeta_g/(w_\pi\delta_\pi + w_y\delta_y + w_g\delta_g)
\end{aligned}$$

and write Eq. 24 equivalently as

$$\pi_{t+1}^c = \pi_t^c + \alpha_c\left[\gamma\pi^* + (1-\gamma)(\pi_t + \alpha_y y_t + \alpha_g\Delta y_t) - \pi_t^c\right] \tag{25}$$
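The equivalence of the structural form in Eq. 24 and the reduced form in Eq. 25 can be illustrated with a short sketch. The dictionary keys, function names, and all numerical values below are hypothetical; the only requirement taken from the text is that the weights sum to one and that the speeds and weights lie in the unit interval (here with at least one of the first three products positive, so that the divisions are defined).

```python
def reduce_parameters(delta, w, zeta_y, zeta_g):
    """Map the structural coefficients of Eq. 24 to (alpha_c, gamma, alpha_y, alpha_g).

    delta, w: dicts with keys "pi", "y", "g", "s" (adjustment speeds, weights).
    """
    s = w["pi"] * delta["pi"] + w["y"] * delta["y"] + w["g"] * delta["g"]
    alpha_c = s + w["s"] * delta["s"]
    gamma = w["s"] * delta["s"] / alpha_c
    alpha_y = w["y"] * delta["y"] * zeta_y / s
    alpha_g = w["g"] * delta["g"] * zeta_g / s
    return alpha_c, gamma, alpha_y, alpha_g


def update_structural(pi_c, pi, y, dy, pi_star, delta, w, zeta_y, zeta_g):
    """One step of Eq. 24: weighted partial adjustments to four benchmarks."""
    d_pi = delta["pi"] * (pi - pi_c)
    d_y = delta["y"] * (pi + zeta_y * y - pi_c)
    d_g = delta["g"] * (pi + zeta_g * dy - pi_c)
    d_s = delta["s"] * (pi_star - pi_c)
    return pi_c + w["pi"] * d_pi + w["y"] * d_y + w["g"] * d_g + w["s"] * d_s


def update_reduced(pi_c, pi, y, dy, pi_star, alpha_c, gamma, alpha_y, alpha_g):
    """One step of Eq. 25 (the AIC module)."""
    benchmark = gamma * pi_star + (1 - gamma) * (pi + alpha_y * y + alpha_g * dy)
    return pi_c + alpha_c * (benchmark - pi_c)


if __name__ == "__main__":
    delta = {"pi": 0.5, "y": 0.4, "g": 0.3, "s": 0.2}   # illustrative speeds
    w = {"pi": 0.4, "y": 0.3, "g": 0.2, "s": 0.1}       # weights summing to one
    params = reduce_parameters(delta, w, zeta_y=0.6, zeta_g=0.8)
    print(update_structural(0.03, 0.05, 0.01, -0.002, 0.02, delta, w, 0.6, 0.8))
    print(update_reduced(0.03, 0.05, 0.01, -0.002, 0.02, *params))
```

Both update functions should return the same next-period climate for any admissible coefficients, which is why only the four reduced parameters matter for the dynamics.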

Once the four parameters in Eq. 25 are given, the values of the original ten structural parameters in Eq. 24 can be constructed from them. Clearly, this determination is not unique; there exists a whole continuum of such parameter combinations. Especially regarding the coefficient $\alpha_c$, however, one may no longer wish to decompose it any further since, as the terms are collected in Eq. 25, it is seen to represent the average adjustment speed in the four updating procedures ($\alpha_c \le 1$). Equation 25 summarizes how the single firms’ views about future inflation cause the general inflation climate to change in an adaptive way, where, as usual in the learning literature on heterogeneous agents, the expression “adaptive” is used in a broader sense than just an “adaptive expectations” rule. The equation can thus be said to describe the concept of an adaptive inflation climate. In short, the updating rule in Eq. 25 may be referred to as the AIC module. If it is introduced into our Phillips curve Eq. 23 from Sect. 2.3, we may simply speak of the PC + AIC module.

3.3 Central Bank Credibility and Inflation Persistence in the Phillips Curve

In this short subsection, we point out the economic significance of the parameter $\gamma$ in Eq. 25 and how it relates to more familiar concepts in the literature. We start out from an elementary specification of expectations in the Phillips curve, which can reflect the faith firms have in the conduct of monetary policy. Following Freedman (1996, pp. 253f), such a Phillips curve reads

$$\pi_t = \mu\pi^* + (1-\mu)A(L)\pi_{t-1} + \beta y_t \tag{26}$$

where $A(L)$ is a polynomial lag function indicating that expected inflation is tied to current and past rates of inflation, whose coefficients add up to unity. The parameter we are interested in is the weight $\mu$, which expresses the degree to which inflation expectations are anchored on the target rate of inflation. In this sense the coefficient can be viewed as measuring the credibility of the central bank.¹⁵ While $\mu$ will depend on the success of monetary policy in the past, it is intuitively clear that increases in credibility are also favorable for maintaining price stability, although this will generally require the central bank to appropriately adjust (the coefficients in) its reaction function.¹⁶ Hence the coefficient $\mu$ in equations like Eq. 26 deserves particular attention. Of course, this amounts to the same as other discussions of the Phillips curve that, in terms of Eq. 26, concentrate on the coefficient $1-\mu$ as a measure of inflation persistence (as mentioned in the previous footnote, the role of $\pi^*$ here might not be made explicit). In the formulation of Eq. 26, central bank credibility ($\mu$) and inflation persistence ($1-\mu$) are obviously complements.

¹⁵ Note that this concept is inherent in Phillips curve estimations with demeaned or detrended inflation rates, where the sum of the coefficients on lagged inflation is significantly less than one. This is easily seen by adding target inflation on both sides of such a regression equation, when it is assumed that $\pi^*$ approximately equals the trend inflation in the data.

¹⁶ This is an important outcome in Amano et al. (1999), which they obtain in an elaborate macroeconometric model of the Canadian economy (see especially pp. 11–13, 24f).


Now, in combination with Eq. 25, our Phillips curve Eq. 23 from Sect. 2.3 also makes reference to the central bank’s target rate of inflation $\pi^*$, the strength of its influence being governed by the coefficient $\gamma$. Note, however, that $\pi^*$ in Eq. 26 directly impacts on period-$t$ inflation, while in Eqs. 23 and 25 its effect on inflation is delayed. If for the moment the expectational term $\mu\pi^* + (1-\mu)A(L)\pi_{t-1}$ in Eq. 26 is identified with our general inflation climate, we can say that in Eq. 26 the target rate determines the level of the climate, and in Eq. 25 it determines the change of the climate. The PC + AIC module in Eqs. 23 and 25 can nevertheless be made comparable to Eq. 26 by an elementary rearrangement of terms. Dating Eq. 25 one period backward and substituting it in Eq. 23, we get $\pi^*$, $\pi_{t-1}$, and $\pi_{t-1}^c$ on the right-hand side (as well as the output terms, of course). Then dating Eq. 25 two periods backward and substituting it for $\pi_{t-1}^c$, an additional term with $\pi^*$ appears as well as $\pi_{t-2}$ and $\pi_{t-2}^c$. If we proceed in this way and simplify the rest by neglecting $\Delta y_t$ in Eq. 23 and denoting the composed coefficient on $y_t$ as $\beta_y$, our Phillips curve Eq. 23 becomes

$$\pi_t = \alpha_c\sum_{k=0}^{\infty}(1-\alpha_c)^k\left[\gamma\pi^* + (1-\gamma)(\pi_{t-k-1} + \alpha_y y_{t-k-1} + \alpha_g\Delta y_{t-k-1})\right] + \beta_y y_t$$

Since the infinite series $\sum_{k=0}^{\infty}(1-\alpha_c)^k$ has limit $1/\alpha_c$, the equation can be rewritten as

$$\pi_t = \gamma\pi^* + (1-\gamma)\sum_{k=0}^{\infty}\alpha_c(1-\alpha_c)^k\pi_{t-k-1} + \beta_y y_t + \alpha_c(1-\gamma)\sum_{k=0}^{\infty}(1-\alpha_c)^k(\alpha_y y_{t-k-1} + \alpha_g\Delta y_{t-k-1}) \tag{27}$$

which includes the inflation terms in the same form as Eq. 26. As inflation persistence in a Phillips curve context is usually specified as the sum of the coefficients on the lagged rates of inflation, the role of the parameter γ in the PC + AIC module can be summarized as follows:

γ: credibility of the central bank
1 − γ: inflation persistence in the Phillips curve    (28)

In addition to highlighting the role of the parameter γ, Eq. 27 shows the main difference of the PC + AIC module from the usual backward-looking Phillips curves, even if they include a target inflation rate: as well as inflation, our approach also implicitly invokes the entire history of output evolution. This observation is useful for comparisons with the literature. The inflation dynamics themselves, however, are better studied by keeping track of the explicit adjustments of the inflation climate in Eq. 25.
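The step from the geometric lag structure to the persistence measure 1 − γ can be checked numerically. The following is only a minimal sketch: the function names are hypothetical, and the values of α_c and γ are borrowed from the first row of Table 2 purely for illustration. It computes the coefficients that Eq. 27 places on lagged inflation and sums them.

```python
def lag_weights(alpha_c, gamma, n_lags):
    """Coefficients on pi_{t-k-1}, k = 0..n_lags-1, as they appear in Eq. 27."""
    return [(1.0 - gamma) * alpha_c * (1.0 - alpha_c) ** k for k in range(n_lags)]


def persistence(alpha_c, gamma, n_lags=2000):
    """Truncated sum of the lag coefficients; tends to 1 - gamma as n_lags grows."""
    return sum(lag_weights(alpha_c, gamma, n_lags))


# Illustrative values only (first row of Table 2).
alpha_c, gamma = 0.21, 0.70
p = persistence(alpha_c, gamma)
print(round(p, 6))
```

The sum does not depend on α_c: the adjustment speed only shapes how the weight 1 − γ is spread over the lags, which is why γ alone can be read as the credibility parameter.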


3.4 Econometric Exploration with the Survey of Professional Forecasters (SPF)
It has been pointed out above that while the concept of the adaptive inflation climate has no direct empirical counterpart, survey expectations should be relatively close to it. This suggests a first econometric test of the AIC module, which substitutes such a measure for π^c in Eq. 25 and tries to estimate the coefficients in a regression equation. If they come out with the correct sign and in a reasonable order of magnitude at some decent level of significance, and also if the fit to the data is not too bad, we can take this result as support that it is indeed worthwhile to introduce the AIC module into the Phillips curve, and inquire into the inflation dynamics that are thus generated, although on the basis of additional criteria, the numerical parameter values from the regression might still be modified. Three survey measures for the US are commonly used in the Phillips curve literature (Thomas 1999): the Michigan Survey of Households, the Livingston Survey, and the Survey of Professional Forecasters (SPF). For the present purpose, it seems appropriate that the participants are businessmen or professional forecasters, which is the case for the latter two surveys. Since the Livingston Survey is only semiannual, we concentrate on the SPF forecasts. Regarding inflation, quarterly predictions are here provided for the GDP implicit price deflator and the Consumer Price Index (CPI). The first measure is available from 1968 on, but has the disadvantage that the underlying deflator has been periodically redefined. For this reason we consider the CPI inflation forecasts which, however, were not initiated until the third quarter of 1981. As for the forecast horizon, we choose the mean forecasts of the (annualized) quarterly rate of change four quarters ahead.
To reduce the influence of the last high inflation rates at the beginning of the 1980s, we let the sample period begin with 1981:4 and end with quarter 2000:4. The other variables entering the regression are the (annualized) quarterly CPI inflation rates, the output gap, and target inflation. The output gap y_t is conveniently represented by the percentage deviations of real GDP from its Hodrick–Prescott trend line.17 Because inflation rates have also declined to some extent in the first half of the 1990s, it may be argued that target inflation was subject to (mild) variations, so that it may better be proxied by trend inflation. We try both prespecified fixed values π* = const and a Hodrick–Prescott trend π*_t of CPI inflation.18 Denoting the SPF survey expectations by SPF_t, the regression equation thus reads

$$SPF_{t+1} = (1-\alpha_c)\,SPF_t + \alpha_c\left[\gamma\pi^*_t + (1-\gamma)\left(\pi_t + \alpha_y y_t + \alpha_g \Delta y_t\right)\right] + \varepsilon_t \qquad (29)$$

17 Using the standard smoothing parameter for quarterly data, λ = 1600. The trend was computed over a longer time span than the sample period, so there are no problems with end-of-period distortions.
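To make the updating rule behind Eq. 29 concrete, here is a minimal sketch of one deterministic updating step (the error term is left out). The function name and argument names are hypothetical; the parameter values in the example are taken from the third row of Table 2 only for illustration, and the output gap terms are set to zero to keep the arithmetic transparent.

```python
def aic_update(climate, pi, pi_target, y, dy, alpha_c, gamma,
               alpha_y=0.0, alpha_g=0.0):
    """One step of the adaptive inflation climate rule (Eq. 29 without the error term).

    climate   : current climate (proxied by SPF_t in the regression)
    pi        : current inflation rate
    pi_target : target (or trend) inflation
    y, dy     : level and change of the output gap
    """
    # Benchmark toward which the climate is adjusted.
    benchmark = gamma * pi_target + (1.0 - gamma) * (pi + alpha_y * y + alpha_g * dy)
    # Partial adjustment with speed alpha_c.
    return (1.0 - alpha_c) * climate + alpha_c * benchmark


# Illustrative step: climate at the 3% target, inflation jumps to 4%.
next_climate = aic_update(climate=3.0, pi=4.0, pi_target=3.0, y=0.0, dy=0.0,
                          alpha_c=0.15, gamma=0.47)
print(round(next_climate, 4))
```

With these values the climate moves only a small fraction of the way toward current inflation, which is the gradual updating the text describes. If climate, inflation, and target coincide and the output gap terms vanish, the rule leaves the climate unchanged.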

With respect to a fixed value of target inflation, the three rates π* = 2.50%, 3.00%, and 3.50% are considered. The fit is then only slightly worse than in the case in which trend inflation is employed for π*. In all four cases, the coefficient α_y on the level of the output gap turns out to be negative, so we re-estimate Eq. 29 by putting α_y = 0. The results of these regressions are collected in Table 2. All coefficients reported in the table are significant at the conventional levels. In all four estimations there are no serious signs of serial correlation in the residuals either. Corresponding to the R² values are standard errors of regression of around 0.30%. To get a visible impression of the goodness of fit, Fig. 1 juxtaposes the actual SPF_t series (solid line) and the series resulting from the first estimation in Table 2 (bold line). A main factor explaining the good fit is the relatively high degree of smoothness in the forecast series: it looks even more pronounced if the forecasts are contrasted with the actual rates of inflation. Plotting the survey forecast SPF_t together with the current rates of inflation π_t also shows that the forecasts do not deviate very much from what one would draw free-hand as a trend line of π_t (perhaps with some upward bias). This feature suggests that already the component π*_t in Eq. 29 explains a good deal of the change in SPF_t, so that the coefficient γ might confuse two effects: expectations of the forecasters that inflation will return to trend inflation, and, because of the low variation in the forecasts themselves, the simple fact that SPF_t remains quite close to its own trend, especially in the 1990s. In this respect the high value γ = 0.70 in the first row of Table 2 could be a deceptive measure of central bank credibility. Actually, the suspicion seems to be confirmed by the estimations with the constant values of π*_t, which lead to a sizeable reduction of γ.

Table 2. Estimations of regression, Eq. 29

     π*_t       α_c    γ      α_y   α_g    R²
1    HP-trend   0.21   0.70   —     2.92   0.908
2    2.50       0.13   0.37   —     2.71   0.898
3    3.00       0.15   0.47   —     2.67   0.900
4    3.50       0.17   0.54   —     2.43   0.903

Fig. 1. Actual and estimated survey forecasts SPF_t (solid and bold lines, respectively)

18 To correct for the possibly spurious effects at the end of the sample period, after inspection of the time series graph we have frozen π*_t at a value of 2.60% from t = 1999:3 until t = 2001:2.

Besides the good fit of Eq. 29, there are other properties of the regression that appear less delightful. If the sample period is shortened and the subperiods are varied, the estimates of α_g easily become insignificant, while α_c and γ change in nonnegligible ways. This lack of robustness is unsatisfactory from an econometric point of view. In the light of the smoothness of the dependent variable, it may indicate that there are too many explanatory variables on the right-hand side of Eq. 29 (although they are not collinear). This means there would be a whole continuum of combinations of the parameters α_c, γ, α_g, and possibly also α_y that yield a similar goodness of fit. In this intuitive sense the approach may be said to be over-identified. On the other hand, this observation is not necessarily an argument against Eq. 29 as a theoretical construct. First, there do exist meaningful values of the coefficients that give rise to a satisfactory fit. Second, if the module is combined with other model building blocks there may be additional criteria beyond the regression Eq. 29 that offer some guidance to selecting among the many parameter combinations. Here, we take the results in Table 2 as evidence that our updating procedure of the inflation climate has passed the basic econometric test. The AIC module Eq. 25 makes economic sense, not only as a theoretical device, but also when it comes to concrete numerical values.


4 Conclusion
The baseline case of the new-Keynesian Phillips curve, as well as its hybrid variants with forward-looking expectations, has come under severe econometric criticism, the crucial argument being that this approach cannot explain the role played by lagged dependent variables in inflation regressions. It is thus time to consider alternative, backward-looking versions which, however, should not fall back on simple adaptive expectations as in the traditional interpretation. Abjuring rational expectations and taking heterogeneous expectations of firms seriously, this chapter has proposed a reinterpretation of the expectational variable in the Phillips curve as a general inflation climate. Rather than refer to inflation in the next period, this concept seeks to summarize in a single number the expectations about suitably discounted inflation over the entire future. Updating of the climate in response to new information proceeds in a gradual manner. The adjustments are not only oriented toward current inflation, but also take the level and change of the output gap into account as well as the central bank's target rate of inflation. These are in our view the basic ingredients of any reasonable expectation formation process about inflation. Our aim was to model such a process in a sophisticatedly simple way. The four-parameter specification at which we arrived was then called the adaptive inflation climate (AIC). Obtaining specific numerical coefficients for the adjustment process is a separate matter. Here, we have limited ourselves to a preliminary test on whether beginning such an endeavor would be at all worthwhile. For this purpose the inflation climate was proxied by a survey measure of four-quarter inflation expectations. Although this does not exactly correspond to our idea of the climate, it is good enough for a first exploratory estimation. While the coefficients, especially the output gap coefficients, should not be taken too literally, the estimation results are sufficiently encouraging to approach a numerical analysis of AIC in greater depth, which is left to subsequent research. The natural field of application for our inflation module is small models to study monetary policy. While the use of the new-Keynesian Phillips curve in these models has an understandable theoretical appeal, its implications may not be innocuous. The fact that here current inflation summarizes the entire sequence of expected future output gaps for the economy is a strong prediction that may well have a bearing on the kind of optimal policy. To quote Rudd and Whelan (2001, p. 20), “given that this prediction is soundly rejected by the data, the use of these models for policy analysis strikes us as questionable at best.” Even if one does not fully share this harsh assessment, substituting our alternative model with the adaptive inflation climate for the new-Keynesian Phillips curve should be worth investigating.


References
Adam K, Padula M (2002) Inflation dynamics and subjective expectations in the United States. European Central Bank, Working Paper Series 222
Amano R, Coletti D, Macklem T (1999) Monetary rules when economic behaviour changes. Bank of Canada, Working Paper 1999–8
Andolfatto D, Hendry S, Moran K (2002) Inflation expectations and learning about monetary policy. Bank of Canada, Working Paper 2002–30
Ball L (2000) Near-rationality and inflation in two monetary regimes. NBER Working Paper 7988
Blinder AS (1998) Asking about prices: a new approach to understanding price stickiness. Russell Sage Foundation, New York
Branch WA (2004) The theory of rationally heterogeneous expectations: evidence from survey data on inflation expectations. Econ J 114:592–621
Christiano LJ, Eichenbaum M, Evans C (1999) Nominal rigidities and the dynamic effects of a shock to monetary policy. NBER Working Paper 8403
Croushore D (1998) Inflation forecasts: how good are they? Federal Reserve Bank of Philadelphia, Working Paper 98-14
Freedman C (1996) What operating procedures should be adopted to maintain price stability? Practical issues. In: The Federal Reserve Bank of Kansas City, Achieving price stability: a symposium, pp 241–285 (www.kc.frb.org/publica/sympos/1996/pdf/s96freed.pdf)
Fuhrer J (1997) The (un)importance of forward-looking behavior in price specifications. J Money Credit Banking 29:338–350
Galí J, Gertler M (1999) Inflation dynamics: a structural econometric analysis. J Monetary Econ 44:195–222
Gramlich EM (1983) Models of inflation expectations formation. J Money Credit Banking 15:155–173
Heiner RA (1988) The necessity of delaying economic adjustment. J Econ Behav Organ 10:255–286
Huh CG, Lansing KJ (2000) Expectations, credibility, and disinflation in a small macroeconomic model. J Econ Bus 51:51–86
Mankiw G (2001) The inexorable and mysterious tradeoff between inflation and unemployment. Econ J 111:C45–C61
Mankiw NG, Reis R (2002) Sticky information versus sticky prices: a proposal to replace the new-Keynesian–Phillips curve. Q J Econ 117:1295–1328
Mehra YP (2002) Survey measures of expected inflation: revisiting the issues of predictive content and rationality. Federal Reserve Bank of Richmond, Econ Q 88:3, 16–36
Mullineaux DJ (1980) Inflation expectations and money growth in the United States. Am Econ Rev 70:149–161
Orphanides A, Williams JG (2002) Imperfect knowledge, inflation expectations, and monetary policy. In: Bernanke B, Woodford M (eds) Inflation targeting. Chicago University Press, Chicago, in press
Orphanides A, Williams JG (2003) The decline of activist stabilization policy: natural rate misperceptions, learning, and expectations. Federal Reserve Bank of San Francisco, FRBSF Working Paper 2003–24
Roberts JM (1997) Is inflation sticky? J Monetary Econ 39:173–196
Roberts JM (1998) Inflation expectations and the transmission of monetary policy. Board of Governors of the Federal Reserve System, mimeo
Rudd J, Whelan K (2001) New tests of the new-Keynesian–Phillips curve. Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series 2001–30; published in J Monetary Econ 52(2005):1167–1181
Tetlow R, von zur Muehlen P (2003) Avoiding Nash inflation: Bayesian and robust responses to model uncertainty. Federal Reserve Board, Washington DC, mimeo
Thomas LB Jr (1999) Survey measures of expected US inflation. J Econ Perspect 13:4, 125–144
Woodford M (2003) Interest and prices: foundations of a theory of monetary policy. Princeton University Press, Princeton, Oxford
Zellner A (1992) Statistics, science and public policy. J Am Stat Assoc 87:1–6
Zellner A (2002) My experience with nonlinear dynamics models in economics. Stud Nonlinear Dynamics Econometrics 6:2, 1–18

2. Harrodian Dynamics Under Imperfect Competition: A Growth-Cycle Model
Hiroyuki Yoshida

Summary. This chapter considers Harrod's knife-edge instability, which implies severe and extreme business cycles restricted by a full employment ceiling and a zero-gross-investment floor. We construct a dynamic model with imperfect competition in the output market by using the subjective demand curve approach. In addition, we consider technical choice by taking account of the putty–clay technology. The chapter shows the occurrence of endogenous and moderate fluctuations through the Hopf bifurcation.
Key words. Harrod's knife-edge, Growth cycle, Imperfect competition, Putty–clay technology, Hopf bifurcation

1 Introduction
Harrod (1939) set up the foundation of “dynamic theory” and analyzed the relation between the actual and the warranted rates of growth. His conclusion was remarkable: the actual growth path is “highly unstable,” and thus any departure from the warranted growth path stimulates a further divergence. Later, Harrod's theory was developed by Hicks (1950), who presented a constrained business cycle model with a full employment ceiling and a zero-gross-investment floor. Although Hicks' theory is truly essential and complete from the theoretical point of view, the fluctuation observed in his model is so severe that we rarely experience it. The purpose of this chapter is to investigate the possibility of perpetual growth cycles in the Harrodian framework.1 Here we emphasize the microeconomic behavior of firms. In particular, we introduce imperfect competition in the output market. Furthermore we consider technical choice by taking account of the putty–clay technology: firms choose the optimal level of technology for new capital equipment by solving a unit-cost minimization problem. These elements will reveal the mechanism of the cyclical growth process. The rest of the chapter is organized as follows. Section 2 explains the distinction between the short-run and long-run production functions. Section 3 sets up the formal model. In Section 4, we investigate the dynamic properties of the Harrodian dynamic model. The final section summarizes our results.

College of Economics, Nihon University, Misaki 1-3-2, Chiyoda-ku, Tokyo 101-8360, Japan
1 There is a body of literature that examines Harrod's growth theory by using the mathematical methods of nonlinear dynamics. See, for example, Nikaido (1990), Flaschel (1994), Yoshida (1999), and Sportelli (2000).

2 Long-Run and Short-Run Production Functions
We start this section by explaining the long-run and short-run production functions. The distinction between the two functions is due to the presumption of putty–clay technology, which implies ex ante substitution possibilities between capital and labor, but no such possibilities ex post. While the long-run production function represents the best suited equipment and manufacturing technique, the short-run production function prescribes current production decisions with the existing capital stock of the firm. Let k^[v](t) stand for the amount of equipment of vintage v surviving at time t, and let n_e^[v](t) be the labor required to operate that equipment fully. Since we assume that the capital stock depreciates at the same constant rate δ, k^[v](t) = e^{−δ(t−v)} k^[v](v). Thus we obtain the total stock of capital at time t:

$$K(t) = \int_{-\infty}^{t} k^{[v]}(t)\,dv = \int_{-\infty}^{t} e^{-\delta(t-v)} I(v)\,dv \qquad (1)$$

where we define k^[t](t) = I(t), which represents the rate of gross investment at time t. By differentiating Eq. 1 with respect to time, we obtain

$$\dot{K}(t) = I(t) - \delta K(t) \qquad (2)$$

We now introduce x_e^[v](t) as the labor intensity required to operate equipment of vintage v at full capacity, or the normal labor intensity for newly installed equipment at time v. Note that for v < t, x_e^[v](t) is fixed, and for t = v it is variable because of the putty–clay technology. Moreover, let us denote by n_e^[v](t) the amount of labor required to operate equipment of vintage v at full capacity. Then we have n_e^[v](t) = x_e^[v](t) k^[v](t) = x_e^[v](t) e^{−δ(t−v)} I(v). In total we write N_n(t) = ∫_{−∞}^{t} n_e^[v](t) dv = ∫_{−∞}^{t} x_e^[v](t) e^{−δ(t−v)} I(v) dv, where N_n(t) is the necessary amount of employment when all the existing capital stock is fully utilized, or the normal level of employment to produce capacity output.
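The passage from Eq. 1 to Eq. 2 is an application of the Leibniz rule for differentiating an integral whose upper limit and integrand both depend on t. A short worked version, assuming the integral converges and differentiation under the integral sign is permissible, reads:

```latex
\begin{aligned}
\dot{K}(t)
  &= \frac{d}{dt}\int_{-\infty}^{t} e^{-\delta(t-v)} I(v)\,dv \\
  &= e^{-\delta(t-t)} I(t)
     + \int_{-\infty}^{t} \frac{\partial}{\partial t}\, e^{-\delta(t-v)} I(v)\,dv \\
  &= I(t) - \delta \int_{-\infty}^{t} e^{-\delta(t-v)} I(v)\,dv \\
  &= I(t) - \delta K(t).
\end{aligned}
```

The first term is the newest vintage entering the stock; the second collects the depreciation of all surviving vintages.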


The long-run production function is assumed to be represented by

$$Y_n(t) = F(N_n(t), K(t)) \qquad (3)$$

where Y_n(t) is the level of capacity output at time t. What is more, we assume that the long-run production function is homogeneous of degree one, and then we can write

$$Y_n(t)/K(t) = f(x_n(t)), \qquad x_n(t) = N_n(t)/K(t) \qquad (4)$$

which satisfies f(0) = 0, f(∞) = ∞, f′(·) > 0, and f″(·) < 0. In reality, the firm controls the degree of capital utilization by adjusting the level of employment according to the state of the economy; consequently, the capital stock is not always fully utilized. To incorporate the above relationship between the rate of capital utilization u and the actual level of employment N, we introduce the following function:2

$$u = u(N/N_n), \qquad u(0) = 0, \quad u(1) = 1, \quad \lim_{N \to +\infty} u(N/N_n) = \bar{u} > 1 \qquad (5)$$

where ū represents a physical maximum rate of capacity utilization. The above function is referred to as a “utilization rate function.” Then, using Eq. 5 we can now express the short-run production function as

$$\frac{Y}{K} = \frac{Y}{Y_n}\,\frac{Y_n}{K} = u(x/x_n)\, f(x_n), \qquad x = N/K \qquad (6)$$

where Y is the actual level of output. The difference between the two functions can be explained by a geometric example. We depict the long-run and short-run production functions in Fig. 1. The figure shows two short-run production functions corresponding to two different techniques, x_n^1 and x_n^2, together with the long-run production function (LPF). For example, consider the curve SPF1, which represents the short-run production function corresponding to a particular technique x_n^1. Obviously, we can observe that SPF1 lies below LPF for all x, and SPF1 equals LPF at the point x_n^1. Accordingly, SPF1 and LPF must be tangential to each other at x_n^1. What is more, it is clear that the same properties are realized for all x_n. We can therefore conclude that the long-run production function is the upper envelope of the family of the short-run production functions. Such a relationship between two production functions is essentially identical to that between the long-run and short-run cost functions that is explained in microeconomic theory. The envelope property of the long-run production function brings us to one further point that we must not ignore. The following condition holds:

$$f'(x_n) = u'(1)\,\frac{f(x_n)}{x_n} \quad \text{for } \forall x_n \qquad (7)$$

Fig. 1. Long-run and short-run production functions (curves LPF, SPF1, and SPF2 plotted against x, with the techniques x_n^1 and x_n^2 marked on the horizontal axis)

2 Here and henceforth we omit the time index so long as no confusion arises.

This condition means that the long-run and short-run production functions are tangential to each other at the normal employment–capital ratio x_n. By replacing u′(1) with θ and integrating Eq. 7 with respect to x_n, we obtain

$$f(x_n) = A\,(x_n)^{\theta}, \qquad A > 0, \quad 0 < \theta < 1 \qquad (8)$$

which is the Cobb–Douglas production function. We take the inverse function of Eq. 5 to explicitly consider the utilization rate, which is a key variable in our model:

$$N/N_n = G(u) \qquad (9)$$

with G(0) = 0, G(1) = 1, lim_{u→ū} G(u) = +∞, and G′(1) = f(x_n)/(x_n f′(x_n)) = 1/θ. With the above discussion as background, we now turn to the derivation of the dynamic process of technology. Recalling the definition of x_n, we obtain

$$x_n(t) = \frac{N_n(t)}{K(t)} = \frac{\int_{-\infty}^{t} x_e^{[v]}(t)\, e^{-\delta(t-v)} I(v)\,dv}{\int_{-\infty}^{t} e^{-\delta(t-v)} I(v)\,dv} \qquad (10)$$


Taking the logarithmic derivative of Eq. 10 with respect to time gives

$$\dot{x}_n = g\left(x_e^{[t]}(t) - x_n(t)\right), \qquad g = I/K \qquad (11)$$

where g represents the rate of capital accumulation. This equation represents the evolution of labor intensity with a speed g.
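The envelope relationship between the long-run and short-run production functions described in this section can be illustrated numerically. The sketch below is an assumption-laden example, not the chapter's own specification: it takes the Cobb–Douglas form f(x) = A x^θ and pairs it with one hypothetical utilization rate function, u(z) = z / (θ + (1 − θ)z), chosen because it satisfies u(0) = 0, u(1) = 1, u′(1) = θ, and has the finite upper bound ū = 1/(1 − θ). All function names and parameter values are illustrative.

```python
def long_run(x, A=1.0, theta=0.6):
    """Long-run production function f(x) = A * x**theta (Cobb-Douglas form)."""
    return A * x ** theta


def utilization(z, theta=0.6):
    """Hypothetical utilization rate function with u(0)=0, u(1)=1, u'(1)=theta."""
    return z / (theta + (1.0 - theta) * z)


def short_run(x, x_n, A=1.0, theta=0.6):
    """Short-run production function u(x/x_n) * f(x_n) for a given technique x_n."""
    return utilization(x / x_n, theta) * long_run(x_n, A, theta)


# Compare each short-run curve with the long-run curve on a grid of x values.
grid = [0.05 * i for i in range(1, 201)]
techniques = [0.5, 1.0, 2.0, 4.0]
max_excess = max(short_run(x, x_n) - long_run(x)
                 for x_n in techniques for x in grid)
print(max_excess <= 1e-12)
```

For this particular choice of u, the inequality u(z) ≤ z^θ is the weighted arithmetic-geometric mean inequality, so every short-run curve lies on or below the long-run curve and touches it only at x = x_n. Other utilization functions with u′(1) = θ would share the tangency but need not have the same global shape.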

3 The Model
We consider imperfect competition in the output market; firms are assumed to be price makers for their output. To establish a connection between the behavior of firms and the condition of the economy as a whole, it is suitable to adopt the fiction of the representative firm. Furthermore, we will use a subjective demand function approach. Let us consider the following constant elasticity specification for the subjective demand function:

$$Y = B p^{-1/\varepsilon}, \qquad 0 < \varepsilon < 1 \qquad (12)$$

where Y is the level of output, p is the price level, and B is a scale parameter that reflects the expected level of demand. Note that the parameter B is a function of time. This is a point to which we shall return later. The firm produces output by operating its own capital equipment. The firm's production decision problem concerns the optimal utilization of the existing capital stock. Strictly speaking, the firm controls u to maximize the profit, π, subject to the short-run production function (Eq. 6) and the subjective demand function (Eq. 12). Indeed, by using Eqs. 6, 9, and 12, we find that

$$\pi = pY - WN = \left[(u f(x_n) k)^{-\varepsilon}\, u f(x_n) - W G(u)\, x_n\right] K \qquad (13)$$

where k = K/B and W is the nominal wage rate. The first-order condition for a maximum is

$$(1-\varepsilon)\left[u f(x_n) k\right]^{-\varepsilon} f(x_n) = W G'(u)\, x_n \qquad (14)$$

which states that marginal revenue equals marginal cost. We should note that the first-order condition may lead to a minimum rather than a maximum. Thus, we must examine the second-order condition for profit maximization:

$$\frac{d^2\pi}{du^2} = -\frac{W G' x_n}{u}\left(\varepsilon + \frac{u G''}{G'}\right) \qquad (15)$$

so that we impose the restriction given below.


Assumption 1:

$$\varepsilon + u G''/G' > 0 \qquad (16)$$

From the implicit function theorem, Eq. 14 can be rewritten as

$$u = u(k, x_n) \qquad (17)$$

This equation means that the optimal choice of u depends on k and x_n. The functional relationships above can be summarized as follows:

Lemma 1: The partial derivatives of the function u = u(k, x_n) are given by

$$u_k = \frac{\partial u}{\partial k} = -\frac{u}{k}\,\frac{\varepsilon}{\varepsilon + u G''(u)/G'(u)} < 0 \qquad (18)$$

$$u_x = \frac{\partial u}{\partial x_n} = -\frac{u}{x_n}\,\frac{1-\theta(1-\varepsilon)}{\varepsilon + u G''(u)/G'(u)} < 0 \qquad (19)$$

Proof of Lemma 1: Substituting Eq. 8 into Eq. 14 gives

$$(1-\varepsilon)\, u^{-\varepsilon} A^{1-\varepsilon} (x_n)^{\theta(1-\varepsilon)} k^{-\varepsilon} = W G'(u)\, x_n \qquad (20)$$

Then taking the logarithmic differential of the above equation, we obtain

$$\left(\varepsilon + u G''(u)/G'(u)\right)\frac{du}{u} = -\varepsilon\,\frac{dk}{k} - \left[1-\theta(1-\varepsilon)\right]\frac{dx_n}{x_n} \qquad (21)$$

It is therefore clear that we can obtain the derivatives in Eqs. 18 and 19. Furthermore, the signs of u_k and u_x are determined by Assumption 1 unambiguously. This completes the proof. ∎

As explained in the previous section, new production techniques are introduced mainly at the time of machinery installation. Here we assume the hypothesis of cost minimization: the firm chooses x_e to minimize the unit cost of production with new capital equipment at the level of u,

$$\frac{wN + rI}{u F(N, I)} = \frac{w G(u)\, x_e + r}{u f(x_e)} \qquad (22)$$

where w is the real wage rate and r is the real rate of interest. Then the first-order condition is given by

$$\frac{f(x_e) - x_e f'(x_e)}{f'(x_e)} = \frac{r}{w G(u)} \qquad (23)$$

Substituting Eqs. 8 and 14 into Eq. 23, we can obtain

$$x_e = \frac{\theta}{1-\theta}\,\frac{r}{1-\varepsilon}\,\frac{G'(u)}{G(u)}\,\frac{(x_n)^{1-\theta}}{A} = \varphi(x_n, u) \qquad (24)$$
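The comparative statics of Lemma 1 can be cross-checked numerically by solving the first-order condition Eq. 14 for u and differentiating the solution by finite differences. The sketch below rests on assumptions that are not in the chapter: the Cobb–Douglas form of Eq. 8, a hypothetical function G(u) = θu / (1 − (1 − θ)u), which satisfies G(0) = 0, G(1) = 1, G′(1) = 1/θ and diverges as u approaches 1/(1 − θ), and arbitrary illustrative parameter values. The nominal wage W is set so that u = 1 solves Eq. 14 at k = x_n = 1.

```python
THETA = 0.6                      # exponent in f(x) = A * x**THETA (Eq. 8)
EPS = 0.5                        # epsilon of the subjective demand curve (Eq. 12)
A = 1.0
W = (1.0 - EPS) * THETA          # illustrative nominal wage; gives u = 1 at k = x_n = 1
U_BAR = 1.0 / (1.0 - THETA)      # upper bound of utilization under the assumed G


def f(x):
    return A * x ** THETA


def G_prime(u):
    """First derivative of the assumed G(u) = THETA*u / (1 - (1-THETA)*u)."""
    return THETA / (1.0 - (1.0 - THETA) * u) ** 2


def G_second(u):
    """Second derivative of the assumed G(u)."""
    return 2.0 * THETA * (1.0 - THETA) / (1.0 - (1.0 - THETA) * u) ** 3


def foc_gap(u, k, x_n):
    """Marginal revenue minus marginal cost from Eq. 14; decreasing in u."""
    return (1.0 - EPS) * (u * f(x_n) * k) ** (-EPS) * f(x_n) - W * G_prime(u) * x_n


def optimal_u(k, x_n):
    """Solve Eq. 14 for u by bisection on (0, U_BAR)."""
    lo, hi = 1e-12, U_BAR - 1e-12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if foc_gap(mid, k, x_n) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lemma1_derivatives(k, x_n):
    """Closed-form u_k and u_x from Eqs. 18 and 19."""
    u = optimal_u(k, x_n)
    denom = EPS + u * G_second(u) / G_prime(u)
    return -(u / k) * EPS / denom, -(u / x_n) * (1.0 - THETA * (1.0 - EPS)) / denom


k0, x0, h = 1.0, 1.0, 1e-5
u_k_fd = (optimal_u(k0 + h, x0) - optimal_u(k0 - h, x0)) / (2.0 * h)
u_x_fd = (optimal_u(k0, x0 + h) - optimal_u(k0, x0 - h)) / (2.0 * h)
u_k, u_x = lemma1_derivatives(k0, x0)
print(abs(u_k - u_k_fd) < 1e-6, abs(u_x - u_x_fd) < 1e-6)
```

Because G″ > 0 for the assumed G, Assumption 1 holds automatically here, so the bisection finds the unique profit-maximizing u. A different G would change the numbers but not the logic of the check.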


In the following lemma we find the partial derivatives of Eq. 24.

Lemma 2: The partial derivatives of the function x_e = φ(x_n, u) are computed as

$$\frac{\partial x_e}{\partial x_n} = \varphi_x = (1-\theta)\,\frac{x_e}{x_n} > 0 \qquad (25)$$

$$\frac{\partial x_e}{\partial u} = \varphi_u = \left(\frac{u G''(u)}{G'(u)} - \frac{u G'(u)}{G(u)}\right)\frac{x_e}{u} = \Phi(u)\,\frac{x_e}{u} \qquad (26)$$

Proof of Lemma 2: Using logarithmic differentiation, we obtain

$$\frac{dx_e}{x_e} = (1-\theta)\,\frac{dx_n}{x_n} + \left(\frac{u G''(u)}{G'(u)} - \frac{u G'(u)}{G(u)}\right)\frac{du}{u} \qquad (27)$$

Clearly, this result leads to Eqs. 25 and 26. The proof is complete. ∎

It must be noted that Φ(u) is the elasticity of labor intensity for new capital equipment with respect to capital utilization. The value of Φ(u) plays an important role in determining the stability properties of our dynamic system. In the next section we shall return to this point. In the Harrod model, the investment decision is independent of the saving decision. Although Harrod did not formulate the investment function, it is plausible to suppose that the amount of investment per unit of capital g = I/K depends on the degree of capital utilization in the following way:

$$g_{t+\Delta t} = g_t + (\Delta t)\,\beta\,(u - 1), \qquad \beta > 0 \qquad (28)$$

The accumulation rate in the next period is determined by current rates of accumulation and capital utilization. Taking the limit of the above equation (Δt → 0) leads to

$$\dot{g} = \beta\,(u - 1), \qquad \beta > 0 \qquad (29)$$

which states that the firm increases the rate of capital accumulation if the existing capital stock is over-utilized and vice versa. Such an investment function has been formulated by Alexander (1950), Nikaido (1975, 1980), Okishio (1964), Silverberg et al. (1988), and Flaschel et al. (1997, Sect. 4.2). The output market is not always in equilibrium. According to the amount of excess demand in the output market, the firm modifies the growth rate of B. The adjustment rule will be specified by

$$\frac{\dot{B}}{B} = \omega \cdot \frac{I - S}{K} + \mu, \qquad \omega > 0 \qquad (30)$$

where the additional variable μ is the expected growth rate of aggregate demand. We posit the following expectation formulation with respect to μ:


$$\mu = (\dot{Y}/Y)^* \qquad (31)$$

where (Ẏ/Y)* is the steady-state growth rate of output. Before closing this section, we must draw attention to the following two points. First, the community's propensity to save s is assumed to be constant. Thus, saving S per unit of capital can be expressed as

$$\frac{S}{K} = s\,u f(x_n), \qquad 0 < s < 1 \qquad (32)$$

Second, we assume that there is excess supply in the labor market at all times. In other words, the Keynesian unemployment equilibrium is always realized.

4 Stability Analysis
We are now in a position to set up the dynamic system. The Harrodian dynamic model described in the preceding section can be reduced to a three-dimensional model in terms of g, k, and x_n:

$$\dot{g} = \beta\left(u(k, x_n) - 1\right) \qquad (33)$$

$$\dot{k} = \left[g - \delta - \mu - \omega\left(g - s\,u(k, x_n) f(x_n)\right)\right] k \qquad (34)$$

$$\dot{x}_n = g\left(\varphi(u(k, x_n), x_n) - x_n\right) \qquad (35)$$

First, substituting Eq. 17 into Eq. 29, we obtain Eq. 33. Then, Eq. 34 is obtained from k̇/k = K̇/K − Ḃ/B by making use of Eqs. 2, 30, and 32. Finally, Eq. 35 follows from Eqs. 11, 17, and 24. A stationary point of system Eqs. 33–35 constitutes a steady state of the economy, which is denoted by (g*, k*, x*) such that ġ = 0, k̇ = 0, and ẋ_n = 0. It immediately follows from ġ = 0 and ẋ_n = 0 that the capacity is fully utilized, and the technique for new capital equipment is identical to that for the total capital stock at the steady state; u* = 1 and x_e* = x_n* = x*. Note that the steady-state value x* is a unique value satisfying φ(1, x*) = x*. This implies that the steady-state output–capital ratio, (Y/K)* = f(x*), is constant, and hence that (Ẏ/Y)* = (K̇/K)* = g* − δ. From Eq. 31 and k̇ = 0, we can therefore recognize that g* = sf(x*) at the steady state. This equality is Harrod's “fundamental equation” itself (see Harrod 1939, p. 17). This determines the warranted rate of growth which allows for steady advance with productive capacity being fully utilized. Moreover, this condition means that the output market clears at the steady state. Finally, k* is obtained by solving the profit maximization condition Eq. 14, (1 − ε)[u* f(x_n*) k]^{−ε} f(x_n*) = W G′(u*) x_n*, with u* = 1 and x_n* = x*.


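The linearization of the system Eqs. 33–35 around the steady-state point leads to the following Jacobian matrix:3

$$\begin{bmatrix} 0 & \beta u_k & \beta u_x \\ (1-\omega)k & \omega s\, u_k f k & \omega s\,(u_x f + u f')\,k \\ 0 & g\,\varphi_u u_k & g\,(\varphi_u u_x + \varphi_x - 1) \end{bmatrix} \qquad (36)$$

Let us denote the corresponding characteristic equation by

$$\lambda^3 + b_1\lambda^2 + b_2\lambda + b_3 = 0 \qquad (37)$$

where

$$b_1 = -\omega s\, u_k f k + g\left(\Phi(u)\,\frac{1-\theta(1-\varepsilon)}{\varepsilon + G''/G'} + \theta\right) \qquad (38)$$

$$b_2 = -u_k (1-\omega) k \beta - \omega s g k\, u_k \left(1 + \Phi(u)\right)\theta f \qquad (39)$$

$$b_3 = -(1-\omega)\, k\, u_k\, g\, \theta\, \beta \qquad (40)$$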

We can check the stability of the steady-state point by means of the Routh–Hurwitz conditions, which are necessary and sufficient for the local asymptotic stability of the steady-state point:

$$b_1 > 0, \qquad b_3 > 0, \qquad b_1 b_2 - b_3 > 0 \qquad (41)$$
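For a cubic characteristic equation the Routh–Hurwitz test in Eq. 41 is easy to mechanize. The following minimal sketch uses hypothetical function names and purely illustrative coefficients, not values from the model. It also shows what happens on the boundary b1·b2 = b3, which is the situation exploited in the Hopf bifurcation argument of Proposition 2: the polynomial factorizes as (λ + b1)(λ² + b2) and has a pair of purely imaginary roots.

```python
import cmath


def routh_hurwitz_cubic(b1, b2, b3):
    """Eq. 41: True if all roots of l^3 + b1*l^2 + b2*l + b3 have negative real parts."""
    return b1 > 0 and b3 > 0 and b1 * b2 - b3 > 0


def char_poly(lam, b1, b2, b3):
    """Evaluate the characteristic polynomial of Eq. 37 at a (complex) number."""
    return lam ** 3 + b1 * lam ** 2 + b2 * lam + b3


# (l + 1)(l + 2)(l + 3) = l^3 + 6 l^2 + 11 l + 6: all roots are negative.
stable = routh_hurwitz_cubic(6.0, 11.0, 6.0)

# Boundary case b1*b2 = b3: lambda = i*sqrt(b2) is a root.
b1, b2 = 2.0, 3.0
b3 = b1 * b2
imaginary_root = cmath.sqrt(-b2)
residual = abs(char_poly(imaginary_root, b1, b2, b3))
print(stable, routh_hurwitz_cubic(b1, b2, b3), residual < 1e-12)
```

On the boundary the third condition fails only weakly, so the test reports instability even though the remaining real root, −b1, is still negative; the loss of stability is carried entirely by the imaginary pair.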

For high values of the adjustment speed of B (ω > 1), the instability of the steady state can be verified immediately on the grounds that b3 < 0. For the remainder of this section, we assume that 0 < ω < 1. Now we can prove the following two important propositions. Proposition 1 deals with the case where Φ(u) > 0, while Proposition 2 investigates the system for the intermediate negative values of Φ(u).

Proposition 1: If the adjustment speed of B is sufficiently small (0 < ω < 1) and the elasticity of labor intensity for new capital equipment with respect to capital utilization Φ(u) is positive, then the steady state is locally stable.

Proof of Proposition 1: It is easily checked that b1 > 0 and b3 > 0. Moreover, we must verify the sign of b1b2 − b3. Using Eqs. 38–40, we can obtain

$$C := b_1 b_2 - b_3 = \frac{g}{\varepsilon + G''/G'}\left\{\left[1-\theta(1-\varepsilon)\right]\Phi(u) + \varepsilon\omega\right\}\left[-u_k(1-\omega)k\right]\beta - b_1\,\omega s g k\, u_k\left(1+\Phi(u)\right)\theta f \qquad (42)$$

3 The asterisk is suppressed without confusion, since all evaluations in this section refer to the steady-state point.


Clearly the sign of $C$ is positive. This completes the proof. ∎

Proposition 1 assures us of the stability of the warranted growth path: local stability is ensured when $0 < \omega < 1$ and $\Phi(u) > 0$. What happens in the case where $\phi_u < 0$? We can obtain the following:

Proposition 2: Suppose that $0 < \omega < 1$ and that $\max\{-[1 - \theta(1 - \varepsilon)],\ -\theta(\varepsilon + G''/G')\} < [1 - \theta(1 - \varepsilon)]\Phi(u) < -\omega\varepsilon$. Then there exists a positive critical value $\beta_H$. The steady state is locally stable if $\beta < \beta_H$, while it is unstable if $\beta > \beta_H$. Furthermore, the system undergoes a Hopf bifurcation at $\beta = \beta_H$. In the case of a supercritical bifurcation, the economy experiences perpetual and self-sustained fluctuations, whereas in the case of a subcritical bifurcation the economy exhibits corridor stability, as proposed by Leijonhufvud (1973).

Proof of Proposition 2: Again, it can be checked that $b_1 > 0$ and $b_3 > 0$ for all $\beta > 0$. Since $0 < \omega < 1$ and $[1 - \theta(1 - \varepsilon)]\Phi(u) < -\omega\varepsilon$, we can see that the function $C(\beta)$ is monotonically decreasing with respect to $\beta$. Furthermore, the function $C(\beta)$ has a positive vertical intercept since $b_1 > 0$ and $-1 < \Phi(u)$. We can therefore recognize that there exists a critical value $\beta_C$ such that $C(\beta_C) = 0$, with $C(\beta) > 0$ for $\beta < \beta_C$ and $C(\beta) < 0$ for $\beta > \beta_C$. It follows that the Routh–Hurwitz conditions are satisfied for $\beta < \beta_C$. Let us now turn to the issue of Hopf bifurcation. For our purpose it is sufficient to examine the following conditions:

(i) $b_1 > 0$, $b_3 > 0$, $C(\beta_H) = 0$

(ii) $dC(\beta)/d\beta\,|_{\beta = \beta_H} \neq 0$

where $\beta_H$ is a bifurcation value of $\beta$. These conditions enable us to apply the Hopf bifurcation theorem.⁴ From the above discussion, it is clear that the Hopf bifurcation criterion is satisfied at $\beta = \beta_C$, so that $\beta_H = \beta_C$. This proves the proposition. ∎
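The logic of the two proofs can be illustrated numerically. The following is a minimal sketch, assuming the characteristic equation of the three-dimensional system is written as x³ + b₁x² + b₂x + b₃ = 0; the function names and the toy coefficients are hypothetical and are not taken from the chapter (the actual coefficients come from Eqs. 38–40).

```python
from typing import Callable, Tuple

Coeffs = Tuple[float, float, float]


def routh_hurwitz_stable(b1: float, b2: float, b3: float) -> bool:
    """Routh-Hurwitz test for the cubic x^3 + b1*x^2 + b2*x + b3 = 0."""
    return b1 > 0 and b3 > 0 and b1 * b2 - b3 > 0


def find_hopf_value(coeffs: Callable[[float], Coeffs],
                    lo: float, hi: float, tol: float = 1e-10) -> float:
    """Bisect on C(beta) = b1*b2 - b3 to locate the beta at which C vanishes."""
    def C(beta: float) -> float:
        b1, b2, b3 = coeffs(beta)
        return b1 * b2 - b3

    if C(lo) * C(hi) > 0:
        raise ValueError("C(beta) does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if C(lo) * C(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def toy_coeffs(beta: float) -> Coeffs:
    """Hypothetical coefficients: b1, b3 > 0 and C(beta) = 5 - 2*beta is decreasing."""
    return 2.0, 3.0 - beta, 1.0


beta_H = find_hopf_value(toy_coeffs, 0.0, 5.0)
print(beta_H, routh_hurwitz_stable(*toy_coeffs(2.0)), routh_hurwitz_stable(*toy_coeffs(3.0)))
```

With these toy coefficients the steady state passes the Routh–Hurwitz test below the critical value and fails it above, which mirrors the stability switch described in Proposition 2. The sketch does not check the transversality condition (ii) or the direction (supercritical or subcritical) of the bifurcation.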

⁴ On this subject, see Asada and Semmler (1995, pp. 634–635).

5 Conclusion

This chapter has developed and analyzed a Harrodian growth model with imperfect competition. In the steady-state position, the economy grows along the warranted growth path where all the existing capital stock is fully utilized. We showed in Proposition 1 that the local stability of the growth path can be guaranteed by the slow reaction of expected demand ($0 < \omega < 1$) and a positive elasticity of labor intensity for new capital equipment with respect to the utilization rate ($\Phi(u) > 0$).

An outline of the stabilizing mechanism can be explained as follows. Suppose that the economy is in a prosperous state with high levels of capacity utilization. At that stage the firm chooses a labor-using direction of technical change ($\phi_u > 0$), which induces an increase in xn. This causes the firm to reduce the rate of capital utilization ($u_x < 0$), and hence to cut down the rate of capital accumulation through the Harrod-type investment function. As can be seen in the above process, the warranted growth path is surrounded by centripetal forces. The stabilizing process can be described by the following feedback chain:

u ⇑ → [φ_u > 0] → xe ⇑ → xn ⇑ → [u_x < 0] → u ⇓ → g ⇓

Judging from the above discussion, we conjecture that the warranted growth path is unstable if the firm chooses a labor-saving direction of technical change ($\phi_u < 0$) in prosperity. In fact, this conjecture is true, as stated in Proposition 2. In addition, the application of the Hopf bifurcation theorem established the existence of periodic fluctuations around the warranted growth path if the destabilizing force of technical change is weak, that is, if the value of $\Phi(u)$ lies in a certain range of negative values. This final point is the most important part of our argument.

References

Adachi H (1991) A growth model with imperfect competition (Hukanzen kyosou no seichoumoderu) (in Japanese). J Polit Econ Commer Sc (Kobe University) 164:53–77
Alexander SS (1950) Mr. Harrod’s dynamic model. Econ J 60:724–739
Asada T, Semmler W (1995) Growth and finance: an intertemporal model. J Macroecon 17:623–649
Flaschel P (1994) A Harrodian knife-edge theorem for the wage–price sector. Metroeconomica 45:266–278
Flaschel P, Franke R, Semmler W (1997) Dynamic macroeconomics: instability, fluctuations, and growth in monetary economics. MIT Press, Cambridge
Harrod RF (1939) An essay in dynamic theory. Econ J 49:14–33
Harrod RF (1973) Economic dynamics. Macmillan, London
Hicks JR (1950) A contribution to the theory of the trade cycle. Clarendon Press, Oxford
Leijonhufvud A (1973) Effective demand failures. Swedish Econ J 75:27–48
Nikaido H (1975) Factor substitution and Harrod’s knife-edge. Z Nationalökonomie 35:149–154
Nikaido H (1980) Harrodian pathology of neoclassical growth. Z Nationalökonomie 40:111–134
Nikaido H (1990) A model of cyclical growth. J Tokyo Int Univ Dept Econ 3:1–9
Okishio N (1964) Instability of Harrod–Domar’s steady growth. Kobe Univ Econ Rev 10:19–27
Silverberg G, Dosi G, Orsenigo L (1988) Innovation, diversity and diffusion: a self-organization model. Econ J 98:1032–1054
Sportelli MC (2000) Dynamic complexity in a Keynesian growth-cycle model involving Harrod’s instability. J Econ 71:167–198
Yoshida H (1999) Harrod’s “knife-edge” reconsidered: an application of the Hopf bifurcation theorem and numerical simulations. J Macroecon 21:537–562

3. Endogenous Technical Change: The Evolution from Process Innovation to Product Innovation

Gang Gong

Summary. A nonlinear growth model is presented that introduces technical innovations within a Harrodian growth framework. We first introduce product innovation into a standard Harrodian model. This resolves the knife-edge problem, whereas irregular growth cycles could occur with excess capacity. We then introduce process innovation. Bifurcation analysis shows that the model permits different structurally stable dynamics when the parameters go through some critical levels. Such a structural change in dynamics reflects an essential feature of endogenous technical change in the long run: the evolution from process innovation to product innovation.

Key words. Process innovation, Product innovation, Knife-edge and bifurcation

1 Introduction

The last 15 years have seen the development of new growth theory inspired by the seminal papers of Romer (1986, 1990) and Lucas (1988). The central argument of new growth theory, which distinguishes it from the old growth theory,¹ is that technical innovation is endogenous. Therefore, these models are often referred to as endogenous growth models.² A special contribution to new growth theory was made by Romer (1990), in which technical innovation is explicitly formulated as product innovation. Product innovation refers to the type of innovation that introduces new (including upgraded) products. This is different from the other type of innovation, process innovation, by which new and improved production methods (often through specialization and mechanization) are introduced. The economic impacts of these two innovations are also different. Process innovation increases factor productivity, whereas the major effect of product innovation is the emergence of new markets and sector reorganization.

Although the idea of such a distinction can be found as early as Schumpeter (1934),³ it was only with Romer (1990) that product innovation was formally introduced into a model. However, Romer’s 1990 formulation of product innovation cannot be considered satisfactory: it simply gives a new interpretation to the Solow residual, i.e., an expansion of the product horizon instead of total factor productivity as expressed in other models. Therefore, we essentially cannot see the economic impact brought about by product innovation as compared to process innovation. Also, the new growth theory cannot account for the stylized fact of the productivity growth slowdown, as discussed intensively in the mid-1990s.

One of the difficulties arising from both old and new growth theories is that they do not distinguish between innovation and invention, two apparently different concepts that had been emphasized by Schumpeter almost 70 years earlier. Inventions are new knowledge, new patents, and new findings. They provide the technological resource for innovations, which are supposed to be carried out by an entrepreneur. However, “as long as they are not carried into practice, inventions are economically irrelevant” (Schumpeter 1934, p. 88). Yet in the current literature, all inventions are assumed to be automatically carried into innovations.

This chapter tries to contribute to the research in technical innovation by distinguishing innovation from invention. We shall still follow the idea of a “cumulative cause of knowledge,” as proposed by new growth theorists, so that the technological resources for innovation are not scarce. This will allow us to focus on the incentive for an entrepreneur to carry out innovation. We assume that the incentives differ depending on the current stage of the business cycle, in order to introduce the two different types of innovation (product vs. process). In particular, when the economy is in excess demand, the entrepreneur has an incentive to carry out process innovation to speed up production, but not product innovation, because the existing product is marketed well. On the other hand, the reverse will hold when the economy is in recession (or excess supply).

School of Economics and Management, Tsinghua University, Beijing 100084, PR China.
¹ See Harrod (1939), Solow (1956), and Kaldor (1957).
² For the most recent extensive review of endogenous growth literature, see Greiner et al. (2005).
³ According to Schumpeter (1934, p. 66), technical innovation can be organized into five categories: the new product, the new method of production, the opening up of a new market, the utilization of new raw materials, and the sector reorganization of the economy. Modern economists have grouped these into two types: process innovation and product innovation (see, e.g., Scherer 1984; Pavitt 1984; Nelson and Winter 1982).

This consideration about innovation incentives means that the model we construct here cannot be an equilibrium model. Therefore, the model will be built on a Harrodian framework. One of the interesting ﬁndings of this chapter is that the knife-edge problem that is supposed to occur in a Harrodian type of model can be resolved when we introduce technical innovation. The rest of this chapter is organized as follows: Section 2 formulates a prototype model where technical innovations are not introduced, and hence the knife-edge problem can be revealed. Section 3 introduces product innovation into the prototype model. Section 4 deals with process innovation by a bifurcation analysis of the model. Finally, Sect. 5 concludes the argument. The mathematical proof of the propositions in the text is provided in the appendix.

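2 A Prototype Model

The prototype model consists of the following structural equations:

$$I_t = g(u_{t-1})K_{t-1}, \qquad g' > 0 \tag{1}$$

$$u_t = Y_t / Y_t^p \tag{2}$$

$$Y_t^p = K_t / v, \qquad v > 0 \tag{3}$$

$$Y_t = I_t / s, \qquad s > 0 \tag{4}$$

$$K_t = (1 - \delta)K_{t-1} + I_{t-1}, \qquad 1 > \delta > 0 \tag{5}$$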

Here, $I_t$ is investment, $K_t$ is capital stock, $u_t$ is capacity utilization, $Y_t$ is actual output, $Y_t^p$ is potential output, $s$ is the marginal and average propensity to save, $v$ is the capital–output ratio, and $\delta$ is the depreciation rate. The meanings of these equations are straightforward. Equation 1 is an investment function, Eqs. 2 and 3 are the definitions of capacity utilization and potential output, respectively, Eq. 4 explains the determination of output by a multiplier, and Eq. 5 describes the dynamics of capital stock. For the investment function, we assume $g(\cdot)$ to be nonlinear, which may take the form shown in Fig. 1. This nonlinear form resembles the investment functions in Kaldor (1940) and Goodwin (1951). If we express $u_{t-1}$ in terms of Eq. 2, and further $Y_{t-1}^p$ in terms of Eq. 3, Eq. 1 can also be written as $I_t = G(K_{t-1}, Y_{t-1})$. This is Kaldor’s investment function as defined in Chang and Smyth (1971). Let $k_t$ denote the growth rate of capital stock, so that $K_t = (1 + k_t)K_{t-1}$. Substituting this into Eq. 3, we obtain $Y_t^p = (1 + k_t)K_{t-1}/v$. Meanwhile, expressing $Y_t$ in terms of Eq. 4, we obtain from Eq. 2

$$u_t = \frac{I_t / s}{(1 + k_t)K_{t-1} / v} \tag{6}$$

From Eq. 5,

$$k_t = -\delta + I_{t-1}/K_{t-1} = -\delta + (s/v)\,u_{t-1} \tag{7}$$

The warranted growth rate in this model is equal to $-\delta + s/v$, obtained by setting $u_{t-1} = 1$.⁴ Now expressing $I_t/K_{t-1}$ in terms of Eq. 1 and $k_t$ in terms of Eq. 7, Eq. 6 can be rewritten as

$$u_t = F(u_{t-1}) \tag{8}$$

with

$$F(u) = \frac{g(u)}{(s/v)(1 - \delta) + (s/v)^2 u} \tag{9}$$

Fig. 1. Standard investment function (axis label: $I_t/K_{t-1}$)
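The step from Eqs. 6 and 7 to Eq. 9 is short but skips the algebra. A worked version, using only Eqs. 1, 6, and 7 and writing δ for the depreciation rate, is:

```latex
\begin{align*}
u_t &= \frac{v}{s}\cdot\frac{I_t/K_{t-1}}{1 + k_t}
      && \text{(Eq. 6 rearranged)} \\
    &= \frac{v}{s}\cdot\frac{g(u_{t-1})}{1 - \delta + (s/v)\,u_{t-1}}
      && \text{(Eq. 1 and Eq. 7)} \\
    &= \frac{g(u_{t-1})}{(s/v)(1 - \delta) + (s/v)^2\,u_{t-1}}
      && \text{(multiply numerator and denominator by } s/v\text{)}
\end{align*}
```

The last line is $F(u_{t-1})$ as given in Eq. 9.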

By the restriction of nonnegative investment, Eqs. 8 and 9 are subject to $g(u_{t-1}) > 0$. Otherwise, $I_t$ will be 0 and so will $u_t$. The trajectory of $u_t$ can thus be derived by iterating the map $F$. The feasible interval of this trajectory is $[0, +\infty)$. Once $u_t$ is determined, the trajectory of $k_t$ is also determined through Eq. 7. Therefore, understanding $u_t$ is sufficient to understand the dynamics of $k_t$.

⁴ We remark that Harrod did not consider depreciation, and therefore his warranted growth rate is $s/v$.
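Iterating the map can be sketched in a few lines. The example below is only an illustration: the S-shaped form of g, the values s/v = 0.1 and δ = 0.05, and the function names are assumptions made here, not the chapter's specification. The parameters are chosen so that u = 1 is a steady state.

```python
import math

S_OVER_V = 0.1   # assumed value of s/v
DELTA = 0.05     # assumed depreciation rate


def g(u: float) -> float:
    """Assumed S-shaped investment function (hypothetical functional form)."""
    return 0.105 + 0.12 * math.tanh(3.0 * (u - 1.0))


def F(u: float) -> float:
    """Map of Eq. 9; investment is set to zero whenever g(u) is not positive."""
    return max(g(u), 0.0) / (S_OVER_V * (1.0 - DELTA) + S_OVER_V ** 2 * u)


def orbit(u0: float, periods: int) -> list:
    """Trajectory u_0, u_1, ..., u_periods obtained by iterating F (Eq. 8)."""
    path = [u0]
    for _ in range(periods):
        path.append(F(path[-1]))
    return path


def growth_rate(u_prev: float) -> float:
    """Capital growth rate k_t implied by last period's utilization (Eq. 7)."""
    return -DELTA + S_OVER_V * u_prev


high = orbit(1.1, 200)   # starts slightly above u = 1
low = orbit(0.9, 200)    # starts slightly below u = 1
print(high[-1], low[-1], growth_rate(1.0))
```

Under these assumptions, a trajectory starting just above 1 settles at a higher utilization rate, while one starting just below 1 falls to the lower bound 0, so small deviations from u = 1 are amplified in either direction.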

The fixed points of $F$ can be found as follows. Let $u_t = u_{t-1} = \bar u$, where $\bar u$ is a steady state. Then, from Eqs. 8 and 9, we obtain $\psi(\bar u) = g(\bar u)$, where

$$\psi(u) = u(1 - \delta)(s/v) + u^2 (s/v)^2 \tag{10}$$

Figure 2 provides a graphical analysis of the possible solutions for $\bar u$. The shape of the curve $\psi(u)$ follows from $u \ge 0$, $\psi(u) \ge 0$, $\psi'(u) > 0$, and $\psi''(u) > 0$. One finds that there are two possible equilibria (or steady states), $\bar u_1$ and $\bar u_2$, with $\bar u_1 > \bar u_2$. Proposition 1 concerns the stability of these two equilibria. Figure 3 provides a graphical analysis of the trajectory of $u_t$.

Proposition 1: Let $F$ be defined as in Eq. 9. Then,

$$F'(\bar u_2) > 1, \qquad 1 > F'(\bar u_1) > -1$$

Clearly, Harrod’s unstable equilibrium corresponds to $\bar u_2$ in this model. This steady state is close to, and possibly equal to, 1, indicating demand–supply equilibrium. If $u_t$ starts in $(\bar u_2, +\infty)$, it will move asymptotically to $\bar u_1$. If it starts in $[0, \bar u_2)$, the trajectory will move toward its lower bound 0. However, if the initial $u_t$ happens to be at the knife-edge $\bar u_2$, the trajectory, as Harrod expected, will stay on $\bar u_2$.
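The two steady states and the slope conditions of Proposition 1 can be checked numerically for a concrete case. This is a sketch under assumed ingredients: the tanh form of g, the values s/v = 0.1 and δ = 0.05, and the helper names are all hypothetical choices made for illustration.

```python
import math

S_OVER_V = 0.1   # assumed value of s/v
DELTA = 0.05     # assumed depreciation rate


def g(u: float) -> float:
    """Assumed S-shaped investment function (hypothetical functional form)."""
    return 0.105 + 0.12 * math.tanh(3.0 * (u - 1.0))


def psi(u: float) -> float:
    """psi(u) of Eq. 10."""
    return u * (1.0 - DELTA) * S_OVER_V + (u * S_OVER_V) ** 2


def F(u: float) -> float:
    """Map of Eq. 9, with investment truncated at zero."""
    return max(g(u), 0.0) / (S_OVER_V * (1.0 - DELTA) + S_OVER_V ** 2 * u)


def bisect(f, lo: float, hi: float, tol: float = 1e-12) -> float:
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def slope(f, x: float, eps: float = 1e-6) -> float:
    """Central-difference approximation of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


def gap(u: float) -> float:
    """Steady states solve psi(u) - g(u) = 0."""
    return psi(u) - g(u)


u_bar_2 = bisect(gap, 0.8, 1.2)   # lower, knife-edge steady state
u_bar_1 = bisect(gap, 1.5, 2.5)   # upper steady state
print(u_bar_2, slope(F, u_bar_2), u_bar_1, slope(F, u_bar_1))
```

For this parameterization the slope of F exceeds 1 at the lower steady state and lies strictly between −1 and 1 at the upper one, in line with the proposition. The bracketing intervals passed to `bisect` were picked by hand for this particular g.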

3 Introducing Product Innovation

We have presented our prototype model, in which investment follows the conventional principle and is considered to fill only the expected capacity shortage. Such a conventional principle indicates that investment must be positively related to capacity utilization. Apparently it is this principle that causes the knife-edge problem. However, investment creates not only the capacity of existing products, but also the capacity of new products. Next, we shall introduce a model of product innovation. The model is built on the following considerations. First, in each period, some technological resources for new products are always available to firms. This might be due to the idea of a “cumulative cause of knowledge,” as proposed by new growth theorists. Second, although the technological resources for product innovation are always available, in a given period the resources are limited, despite the fact that they accumulate over time. Third, for each firm in the economy, there exists a threshold (in terms of capacity utilization) below which it will consider investing in a new product. When demand for an existing product is strong, or capacity utilization is still high, there is not much incentive for the entrepreneur to promote the new product. However, when the entrepreneur finds that his existing capacity is excessive, that is, utilization is below some threshold, he then considers introducing the new product in order to maintain his leadership in the economy.

Fig. 2. The steady states of the prototype model

Fig. 3. The orbits of $u_t$ in the prototype model

We now begin presenting our model. Assume that there exist $n$ firms in the economy, among which $m$ firms, $m \le n$, have a technological resource that can either upgrade an existing product or produce a completely new product. To simplify our analysis, we shall further assume that each of these $m$ firms has one, and only one, such technology, which requires a certain amount of investment. However, the firms may have different considerations about when they should invest in their new products. For instance, firm A may consider investing when its capacity utilization is less than 87%, while firm B will invest when its capacity utilization is less than 85%. Let $\theta_i$ be the amount of investment required for firm $i$ to carry out its product innovation. Then its investment in the new product, denoted as $I_i^n$, may be written as

$$I_i^n = \begin{cases} \theta_i, & \mu_i < \mu_i^* \\ 0, & \text{otherwise} \end{cases} \tag{11}$$

Here, $\mu_i$ is the capacity utilization of firm $i$, and $\mu_i^*$ is the threshold below which firm $i$ will invest in its new product. Given Eq. 11, the aggregate investment in new products, denoted as $I^n$, can be written as

$$I^n = \sum_{i=1}^{m} I_i^n \tag{12}$$

Equations 11 and 12 form a map from $\mathbb{R}^n$ to $\mathbb{R}$, which derives $I^n \in \mathbb{R}$ from $\mu \in \mathbb{R}^n$, where $\mu$ is the vector whose $i$-th coordinate is $\mu_i$. We shall define this map as $Q: \mathbb{R}^n \to \mathbb{R}$. However, our objective is to derive a function $H: \mathbb{R} \to \mathbb{R}$ that relates the aggregate capacity utilization $u \in \mathbb{R}$ to the aggregate investment in new products $I^n \in \mathbb{R}$. It is reasonable to assume that a measure of aggregate capacity utilization is a function (most likely a weighted average) of the $n$ individual firms’ capacity utilization. We denote this function by $U: \mathbb{R}^n \to \mathbb{R}$. It has the property $\partial U/\partial \mu_i \ge 0$. We shall now make a somewhat strong assumption. We shall assume that

$$\mu_i = \mu_j \quad \text{for all } i \ne j, \qquad i, j = 1, 2, \ldots, n \tag{13}$$

In other words, the firms in the economy have the same level of capacity utilization. The economic meaning of this assumption can be roughly expressed as follows: the firms in the economy share the impact of a shock (either positive or negative) to approximately the same extent. Although this is a strong assumption, it allows us to exclude some intractable situations. Suppose there are two firms A and B in the economy. Without our assumption, it is quite possible that an aggregate capacity utilization of 85% corresponds to a combination of either 80% and 90% or 87% and 83% capacity utilization for A and B, respectively. Since each combination, or point in $\mathbb{R}^2$, may indicate a different amount of investment via the map $Q$ formed by Eqs. 11 and 12, our aggregate investment function could be multivalued at $u = 85\%$. The assumption therefore helps us to exclude this type of intractable situation. Finally, we remark that this assumption is not a necessary condition for the proposition below to hold. However, it will make our derivation of the proposition much simpler.

Proposition 2: Given $U$ and $Q$, there exists an interval $[u_2, u_1]$ such that $H$ has the following properties:
- $I^n = 0$ when $u > u_1$;
- $\partial I^n/\partial u \le 0$ and $I^n \le \sum_{i=1}^{m} \theta_i$ when $u_2 \le u \le u_1$;
- $I^n \le \sum_{i=1}^{m} \theta_i$ when $u < u_2$.

It should be noted that an investment function $H$ derived in this way is not smooth and continuous. However, if $n$ and $m$ are large, we could treat $H$ as smooth and continuous within the interval $[u_2, u_1]$.
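The step shape of the aggregate function can be made concrete with a small sketch. It applies Eqs. 11 and 12 under the equal-utilization assumption of Eq. 13 and additionally assumes that the aggregate measure equals the common firm-level rate (as it would for a weighted average). The thresholds and investment amounts are hypothetical numbers chosen for illustration.

```python
from typing import Sequence


def aggregate_new_product_investment(u: float,
                                     thresholds: Sequence[float],
                                     theta: Sequence[float]) -> float:
    """I^n = sum of theta_i over firms whose threshold mu_i* lies above u.

    Every firm is assumed to have utilization mu_i = u (Eq. 13), so firm i
    invests theta_i exactly when u < mu_i* (Eq. 11).
    """
    return sum(t for mu_star, t in zip(thresholds, theta) if u < mu_star)


# Hypothetical data for m = 3 innovating firms.
thresholds = [0.87, 0.85, 0.80]   # mu_i*
theta = [1.0, 2.0, 0.5]           # required investment theta_i

u1 = max(thresholds)   # above u1 no firm invests
u2 = min(thresholds)   # below u2 every firm invests

for u in (0.90, 0.86, 0.82, 0.50):
    print(u, aggregate_new_product_investment(u, thresholds, theta))
```

In this example the function is zero above 0.87, steps up as utilization falls through each threshold, and is capped at the sum of the θ values (3.5) below 0.80, which is the non-smooth, non-increasing shape described in Proposition 2.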

Given our discussion of $H$, we now introduce product innovation into our conventional investment function. The new investment function (Fig. 4) can be written as

$$I_t = [g(u_{t-1}) + h(u_{t-1})]K_{t-1}, \qquad h' \le 0 \tag{14}$$

Above, we let $h(u_{t-1}) \equiv H(u_{t-1})/K_{t-1}$. The nonlinear form of $h(\cdot)$ shown in Fig. 4 reflects the properties of the product innovation function $H$ that we have just discussed. Given $h(\cdot)$, $F$ can be rewritten as

$$F(u) = \frac{g(u) + h(u)}{(1 - \delta)(s/v) + u(s/v)^2} \tag{15}$$

Figure 5 provides a graphical analysis of the fixed points of this new $F$. As shown in Fig. 5, a new fixed point $\bar u_3$ emerges with the introduction of $h$. Moreover, the map $F$ (Fig. 6) has a new hump between $\bar u_2$ and $\bar u_3$. As a result, orbits starting below $\bar u_2$ no longer move toward the lower bound 0. The properties of $\bar u_1$ and $\bar u_2$ that were proved in Sect. 2 should still be valid. Thus we only need to consider $\bar u_3$. Elsewhere (Gong 2001), I have proved that for some probable functional forms of $g(\cdot)$ and $h(\cdot)$, irregular cycles can occur in the neighborhood of $\bar u_3$.

Fig. 4. The new investment function

Fig. 5. The steady states of the extended model

Fig. 6. The orbits of $u_t$ in the extended model

4 Bifurcation Analysis

In this section, we shall assume that $\delta$ and the parameters in $g(\cdot)$ and $h(\cdot)$ are all fixed, and therefore we concern ourselves only with the variation in $s/v$, denoted as $\lambda$. The bifurcation analysis in this section will provide a mathematical preparation for the major argument developed in the next section. Given the parameter $\delta$, Eq. 10 indicates that $\partial\psi/\partial\lambda > 0$. Thus, with the variation of $\lambda$, the motion of $\psi(e)$ and hence of $F(e)$ can be illustrated as in Figs. 7 and 8, respectively. When $\lambda$ is sufficiently small, $F$ has only one fixed point $\bar e_1$. With the increase in $\lambda$, new fixed points are born, starting with $\bar e_4$, followed by $\bar e_2$, $\bar e_3$, and then $\bar e_5$. When $\lambda$ is sufficiently large, only $\bar e_3$ remains.

Fig. 7. The motion of $\psi(e)$ with the increase in $\lambda$

Fig. 8. The motion of $F$ with the increase in $\lambda$

Apparently, $\bar e_4$ and $\bar e_5$ are not structurally stable because they will vanish under an arbitrarily small perturbation in $\lambda$. Denoting $\lambda_1$ and $\lambda_2$, with $\lambda_1 < \lambda_2$, as the values of $\lambda$ corresponding to the occurrence of $\bar e_4$ and $\bar e_5$, respectively, bifurcation occurs at $\lambda_1$ and $\lambda_2$.⁵ When $\lambda < \lambda_1$, all orbits are asymptotic to $\bar e_1$, since $\bar e_1$ is attracting and is the only steady state. When $\lambda > \lambda_2$, all orbits will fluctuate around $\bar e_3$ (if $\bar e_3$ is repelling) or be asymptotic to $\bar e_3$ (if $\bar e_3$ is attracting). When $\lambda_2 > \lambda > \lambda_1$, $F$ has three fixed points, $\bar e_1$, $\bar e_2$, and $\bar e_3$. Whether an orbit moves towards $\bar e_1$ or $\bar e_3$ depends on the initial $e$.⁶ The analysis above allows us to describe the locus of equilibrium points, as in Fig. 9. As $\lambda$ increases from very small values (less than $\lambda_1$), the growth path around which the economy fluctuates gradually moves along the equilibrium locus from the upper leaf and jumps downward at $\lambda_2$.⁷

Fig. 9. The equilibrium locus

⁵ Since $F'(\bar e_5) = 1$, as indicated by Fig. 9, $\lambda_2$ is a saddle-node bifurcation, as defined in Devaney (1989) and Guckenheimer and Holmes (1983), or fold bifurcation, as defined in Lorenz (1989). $\lambda_1$, however, is formally not a saddle-node bifurcation because $F$ may not be smooth at $\bar e_4$, and hence $F'(\bar e_4)$ is not necessarily equal to 1. Nevertheless, it plays the same role as a saddle-node bifurcation.
⁶ More bifurcations are permitted when $\lambda > \lambda_1$. One possible bifurcation is the period-doubling bifurcation, which concerns the attractability of $\bar e_3$. Another possibility, which plays an important role in the system of chaotic behavior, is the homoclinic bifurcation (see Devaney, 1989, and Guckenheimer and Holmes, 1983, for the definitions of these two bifurcations). These two bifurcations, especially the homoclinic bifurcation, are not easily detectable. Their existence depends on $\delta$ and the parameters in $g(\cdot)$ and $h(\cdot)$. However, the major argument developed in the following section does not necessarily rely on the detection of these two bifurcations. Thus, for conservation of space, we do not provide a formal detection of their existence.
⁷ This somewhat resembles a fold catastrophe in a continuous time-differential system. See the definition of catastrophe for a continuous time system in Varian (1979) and Lorenz (1989).

5 Conclusion: The Evolution from Process Innovation to Product Innovation

The previous section has shown that our extended model permits different structurally stable dynamics when $\lambda$ goes through different bifurcations. Consider a situation in which the economy persistently remains in excess demand. This is possible when $\lambda$ is small (i.e., less than $\lambda_1$). In this case, there is an incentive for the economy to change $\lambda$ in the following directions. First, the saving ratio $s$ will increase because entrepreneurs want to accumulate more capital to create more capacity. Second, $v$ will decline, since the persistent capacity shortage provides an incentive for entrepreneurs to carry out technical change in the form of process innovation. As expected, this will increase productivity and hence reduce $v$ (recall that $v$ is an inverse measure of capital productivity). Both variations imply an increase in $\lambda$ and thus drive the economy downwards along the equilibrium locus, as in Fig. 9.

On the basis of the previous bifurcation analysis, we suggest the following evolution of a capitalist economy in the long run. Historically, the capitalist economy experienced a period of excess demand. This was the early stage of the capitalist economy, where the production process was in the form of craft production (Nell 1992). Nevertheless, this period produced a strong incentive for the capitalist to carry out technical innovation in the form of process innovation, and therefore increased the productivities of both labor and capital. The increase in capital productivity caused a decline in the capital–output ratio $v$ and hence an increase in $\lambda$. With the increase in $\lambda$, the economy gradually moved away from the state of excess demand into one characterized by excess capacity. In between, there is a period in which the direction of movement is determined by the initial condition. This implies that exogenous shocks could be important in determining the state of the economy. However, this period cannot last, since as long as the economy remains in persistent excess demand, the capital–output ratio will decline. When the capital–output ratio has fallen to the level where $\lambda > \lambda_2$, the economy will finally evolve to the state where persistent excess capacity is dominant. Given such a state of excess capacity, the form of technical innovation is also transformed. The emergence of new products and new markets becomes the main form of technical innovation. Accordingly, the direction of movement of productivity becomes unclear due to the properties of the new products that the economy introduces. This also explains the slowdown of productivity growth that occurred in the US and other Western countries.

Mathematical Appendix

Proof of Proposition 1

From Fig. 2, both $\bar u_1$ and $\bar u_2$ belong to the interval satisfying $g(u) > 0$. Therefore, in the neighborhood of $\bar u_1$ and $\bar u_2$,

$$F'(\bar u) = \frac{g'(\bar u) - \bar u (s/v)^2}{(1 - \delta)(s/v) + \bar u (s/v)^2} \tag{A1}$$

where $\bar u$ stands for either $\bar u_1$ or $\bar u_2$. Let

$$\Delta \equiv g'(\bar u) - \psi'(\bar u) \tag{A2}$$

where $\psi(u)$ is given by Eq. 10 and hence

$$\psi'(\bar u) = (s/v)(1 - \delta) + 2\bar u (s/v)^2 \tag{A3}$$

Substituting Eq. A2 into Eq. A1 and then using Eq. A3 to express $\psi'(\bar u)$, we obtain

$$F'(\bar u) = \frac{\Delta}{(1 - \delta)(s/v) + \bar u (s/v)^2} + 1 \tag{A4}$$

From Fig. 2, $\psi'(\bar u_2) < g'(\bar u_2)$ and hence $\Delta > 0$ at $\bar u_2$. We thus prove that $F'(\bar u_2) > 1$. Also from Fig. 2, $\psi'(\bar u_1) > g'(\bar u_1)$, and thus $\Delta < 0$ at $\bar u_1$. It follows that $F'(\bar u_1) < 1$. Next we prove $F'(\bar u_1) > -1$. Consider $\bar u_1 \to +\infty$. In this case, $g'(\bar u_1) \to 0$ and hence $\Delta \to -\psi'(\bar u_1)$. From Eq. A4, this further indicates that $F'(\bar u_1)$ tends to its lower bound. We thus obtain

$$\inf F'(\bar u_1) = \lim_{\bar u_1 \to +\infty}\left[\frac{-\psi'(\bar u_1)}{(1 - \delta)(s/v) + \bar u_1 (s/v)^2} + 1\right] = \lim_{\bar u_1 \to +\infty}\left[-\frac{\bar u_1 (s/v)^2}{(1 - \delta)(s/v) + \bar u_1 (s/v)^2}\right] = -1$$

∎
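The substitution that leads to Eq. A4 can be written out explicitly. In the worked steps below, ψ denotes the function in Eq. 10, δ the depreciation rate, and $D(\bar u)$ is a shorthand introduced here (not in the original) for the common denominator:

```latex
\begin{align*}
D(\bar u) &\equiv (1 - \delta)(s/v) + \bar u\,(s/v)^2, \\
g'(\bar u) &= \Delta + \psi'(\bar u)
            = \Delta + (1 - \delta)(s/v) + 2\bar u\,(s/v)^2
            && \text{(Eqs. A2 and A3)} \\
g'(\bar u) - \bar u\,(s/v)^2 &= \Delta + D(\bar u), \\
F'(\bar u) &= \frac{\Delta + D(\bar u)}{D(\bar u)}
            = \frac{\Delta}{D(\bar u)} + 1
            && \text{(Eq. A1)}
\end{align*}
```

Since $D(\bar u) > 0$, the sign of $F'(\bar u) - 1$ is the sign of Δ, which is what the comparison of slopes in Fig. 2 exploits.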

Proof of Proposition 2

Define $u_1$ to be the aggregate capacity utilization that is equal to the maximum threshold $\max_i(\mu_i^*)$. Then when $u > u_1$, no firm will invest in new products, indicating that $I^n = 0$. Define $u_2$ to be the aggregate capacity utilization that is equal to the minimum threshold $\min_i(\mu_i^*)$. Then when $u < u_2$, all $m$ firms will invest. However, due to the limitation of technological resources, there is a maximum amount of investment in the economy, which is equal to $\sum_{i=1}^{m} \theta_i$. When $u_2 \le u \le u_1$, a small reduction in $u$ may not induce a firm to invest, indicating that $\partial I^n/\partial u = 0$. However, when the reduction is large, passing some threshold, some firms will invest, indicating that $\partial I^n/\partial u < 0$. Again the total amount of investment will not exceed $\sum_{i=1}^{m} \theta_i$. ∎

Acknowledgments. The author would like to thank Robert Solow, Edward Nell, Peter Flaschel, and Peter Scott for their comments on various earlier versions of the manuscript. Their critique and suggestions brought many important improvements. Of course, the usual disclaimer applies.

References

Chang WW, Smyth DJ (1971) The existence and persistence of cycles in a non-linear model: Kaldor’s 1940 model re-examined. Rev Econ Stud 38:37–46
Devaney RL (1989) An introduction to chaotic dynamical systems, 2nd edn. Addison-Wesley, Redwood City
Gong G (2001) Product innovation and irregular growth cycles with excess capacity. Metroeconomica 52:428–448
Goodwin RM (1951) The nonlinear accelerator and the persistence of business cycles. Econometrica 19:1–17
Greiner A, Semmler W, Gong G (2005) The forces of economic growth: theory and time series evidence. Princeton University Press, Princeton
Guckenheimer J, Holmes P (1983) Nonlinear oscillations, dynamical systems, and bifurcations of vector fields. Springer, New York
Harrod RF (1939) An essay in dynamic theory. Econ J 49:14–33
Kaldor N (1940) A model of the trade cycle. Econ J 50:78–92
Kaldor N (1957) A model of economic growth. Econ J 67:591–624
Lorenz HW (1989) Nonlinear dynamical economics and chaotic motion. Springer, New York
Lucas RE (1988) On the mechanics of economic development. J Monetary Econ 22:3–42
Nell E (1992) Transformational growth: from Say’s law to the multiplier. In: Nell E (ed) Transformational growth and effective demand. New York University Press, New York
Nelson RR, Winter SG (1982) An evolutionary theory of economic change. Harvard University Press, Cambridge
Pavitt K (1984) Sectoral patterns of technical change: towards a taxonomy and a theory. Res Policy 13:343–373
Romer P (1986) Increasing returns and long-run growth. J Polit Econ 94:1002–1037
Romer P (1990) Endogenous technological change. J Polit Econ 98:71–105
Scherer FM (1984) Innovation and growth: Schumpeterian perspectives. MIT Press, Cambridge
Schumpeter JA (1934) The theory of economic development: an inquiry into profits, capital, credit, interest, and the business cycle. Harvard University Press, Cambridge
Solow RM (1956) A contribution to the theory of economic growth. Q J Econ 70:65–94
Varian HR (1979) Catastrophe theory and the business cycle. Econ Inquiry 17:14–28

4. Tobin’s q and Investment in a Model with Multiple Steady States

Mika Kato¹, Willi Semmler², and Marvin Ofori³

Summary. This chapter considers a simple dynamic investment decision problem of a firm where adjustment costs have capital size effects. This type of setting possibly results in multiple steady states, thresholds, and a discontinuous policy function. We study the global dynamic properties of the model by employing the Hamilton–Jacobi–Bellman method and dynamic programming, which help us in the numerical detection of multiple steady states and thresholds. We also explore the model’s implications concerning the effects of aggregate demand, interest rates, and tax rates. Finally, an empirical study on the firm size distribution is provided using US firm-size data. We utilize two different approaches, kernel density estimation and a Markov chain transition matrix, to study an ergodic distribution. Our results suggest a twin-peak distribution of firm size in the long run, which empirically supports the theoretical conjecture of the existence of multiple steady states.

Key words. Adjustment costs, Multiple steady states, Global dynamics

¹ Department of Economics, Howard University, ASB-B-319, Washington, DC 20059, USA
² Center for Empirical Macroeconomics, Bielefeld and New School for Social Research, New York, 65 Fifth Ave, New York, NY 10003, USA
³ University of Bielefeld, 33502 Bielefeld, Germany

1 Introduction

In the theory of investment, a firm maximizes the discounted value of its future net cash flows. What the firm wants to know is the optimal investment schedule over time. Generally speaking, by solving a firm’s optimization problem, we want to obtain a policy function which tells us the optimal investment policy corresponding to each level of capital stock. This policy function has been perceived to be continuous. Here, however, we demonstrate the possibility of a discontinuous policy function. The key factor is to introduce adjustment costs with capital size effects, whereby the growth rate of capital is the source of adjustment costs. Interestingly, a recent work by Feichtinger et al. (2000) provides a very simple investment model of this type. The purpose of our study is to solve a firm’s dynamic investment decision problem where multiple steady states and a discontinuous investment strategy may arise. We study the global dynamic properties of the model, and pursue a numerical detection of the threshold.

It should be noted that economists have long been interested in studying the implications of adjustment costs for the dynamics of investment. An earlier version of such an investment study was presented by Jorgenson (1963). In his study, the optimal path of the capital stock for an exogenously given output is derived. The optimal path of the investment rate is determined only when assuming a distributed lag function. The need for such ad hoc assumptions was recognized by Lucas (1967), Gould (1968), Uzawa (1968, 1969), and Treadway (1969). Their solution was to introduce adjustment costs. Generally speaking, there are two types of adjustment cost: those with and without capital size effects. Lucas (1967), Gould (1968), and Treadway (1969) developed models with adjustment costs without size effects, whereas models with adjustment costs with size effects have been studied by Uzawa (1968, 1969) and Hayashi (1982). Their work was stimulated by the earlier work on adjustment costs by Eisner and Strotz (1963).

Our model focuses on the role of adjustment costs with capital size effects. As it turns out, the capital size effects are central to the generation of multiple steady states and the associated interesting dynamic properties. In addition to history-dependence, both continuous and discontinuous policy functions can arise. Recently, Feichtinger et al. (2000), using a simple framework with adjustment costs with size effects, discussed the case of multiple steady states. Our model here is crucially based on their model, but focuses more on the study of global dynamics using the Hamilton–Jacobi–Bellman (HJB) method, stresses the analysis of comparative dynamics and the policy implications, and explores the empirical support for the model’s prediction on the firm size distribution.

The most important economic implication of multiple steady states is history-dependence. History-dependence arises if solution paths converge toward distinct attractors depending on the initial conditions. The existence of two or more stable steady states implies the existence of thresholds. At a threshold, the firm will not be able to decide between investment strategies which converge toward one or the other stable steady state.

Model with Multiple Steady States


Feichtinger et al. (2000) solve their model by using Pontryagin's maximum principle and studying the local properties of each steady state. Instead, we focus on comparing the values of the candidate paths and derive the global solution. We use the HJB method to derive a global value function and pave the way to detecting thresholds numerically.

The remainder of the chapter is organized as follows. Following the brief review of the literature on adjustment costs in investment theory given above, Sect. 2 presents a traditional model with adjustment costs and no capital size effects, where we obtain a unique and continuous policy function. In other words, the optimal investment rate is continuous in Tobin's q. In Sect. 3, we present the model with capital size effects. Using the HJB method, the global value function and the threshold are derived. The policy function can be either continuous or discontinuous. Numerical simulations are attached at the end of Sect. 3. Section 4 explores the model's implications concerning the effects of aggregate output (booms and recessions), interest rates, and tax rates. Sections 5 and 6 provide an empirical study of the firm size distribution using US firm size data. We utilize two different approaches, kernel density estimation and a Markov chain transition matrix, to obtain an ergodic distribution. Our results suggest a twin-peak distribution of firm size in the long run, which can be viewed as empirical support for our theoretical model.

2 The Benchmark Model Without Size Effects

We first present a traditional model without size effects. The model used here has been developed by Abel (1982), Hayashi (1982), and Summers (1981), and is known as the q theory model of investment. Generally speaking, this type of model has a unique equilibrium and a continuous policy function. The model is represented in Eqs. 1–3.

$$\max_{I} \int_0^\infty e^{-rt}\,\{R(K) - I - A(I)\}\,dt \tag{1}$$

$$\text{s.t.}\quad \dot K = I - \delta K; \qquad K_0 = \text{given} \tag{2}$$

where

$$R(0) = 0,\quad R''(K) < 0,\quad A(0) = 0,\quad A'(I) > 0 \text{ for } I > 0,\quad A'(I) < 0 \text{ for } I < 0,\quad A''(I) > 0 \tag{3}$$

R(K) is a representative firm's revenue function and K(t) is a representative firm's capital stock. I(t) is the firm's investment. We also assume that the purchase price of a unit of investment goods is 1, and thus the cost of purchasing investment goods is I. A(I) represents adjustment costs, which depend only on the firm's investment I. The depreciation rate of the capital stock is δ.


For more specific results, we use the following specific functions, which satisfy Eq. 3:

$$R(K) = aK - bK^2 \tag{4}$$

$$A(I) = cI^2 \tag{5}$$

We employ the HJB method to study the analytical solution of this problem. The HJB equation for the present model has the form

$$rV(K) = \max_{I}\,\left[R(K) - I - A(I) + V'(K)(I - \delta K)\right] \tag{6}$$

Solving ∂[•]/∂I = 0 gives the first-order condition

$$-1 - A'(I) + V'(K) = 0 \tag{7}$$

Note also that V′(K) is equivalent to Tobin's q (Tobin 1969). Therefore, we have the following rule for optimal investment:

$$I(t) = (A')^{-1}(q - 1); \qquad \begin{cases} I > 0 & \text{for } q(t) > 1 \\ I = 0 & \text{for } q(t) = 1 \\ I < 0 & \text{for } q(t) < 1 \end{cases} \tag{8}$$

We can interpret q as the market value of a unit of capital. Since we assume that the purchase price of a unit of capital is 1, the economic interpretation of Eq. 8 is that a firm invests if the market value of capital exceeds the purchase price of capital, and disinvests if the market value of capital is less than the purchase price of capital. With the specific functions, the optimal investment policy reads

$$I = \frac{1}{2c}\,\left(V'(K) - 1\right) \tag{9}$$

If e is a steady-state level of the capital stock, then from Eq. 2

$$I - \delta e = 0 \tag{10}$$

and, at the steady state, we obtain

$$rV(e) = R(e) - \delta e - A(\delta e) \tag{11}$$

$$V'(e) = \frac{R'(e)}{r + \delta} \tag{12}$$

Thus, from the first-order condition Eq. 7 together with Eqs. 10 and 12, the steady state e satisfies

$$-1 - A'(\delta e) + \frac{R'(e)}{r + \delta} = 0 \tag{13}$$


Using the specific functions, Eq. 13 becomes

$$-1 - 2c\delta e + \frac{a - 2be}{r + \delta} = 0 \tag{14}$$

Therefore, we have a unique steady state

$$e = \frac{a - (r + \delta)}{2c\delta(r + \delta) + 2b} \tag{15}$$

To establish a policy function, we construct a phase diagram in {I, K} space. There exists a unique global stable manifold associated with a unique steady state. The global stable manifold in {I, K} space is equivalent to the policy function, and thus the policy function is continuous, as Fig. 1 shows.
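The steady-state algebra of the benchmark model can be checked numerically. The following is a minimal sketch, assuming the quadratic forms of Eqs. 4 and 5; the parameter values and the function names are illustrative assumptions and are not taken from the chapter.

```python
def benchmark_steady_state(a, b, c, r, delta):
    """Unique steady state of the benchmark model (Eq. 15)."""
    return (a - (r + delta)) / (2.0 * c * delta * (r + delta) + 2.0 * b)


def steady_state_residual(e, a, b, c, r, delta):
    """Left-hand side of Eq. 14; it should vanish at the steady state."""
    return -1.0 - 2.0 * c * delta * e + (a - 2.0 * b * e) / (r + delta)


# Illustrative parameter values (an assumption made for this sketch only).
params = dict(a=1.0, b=0.6, c=0.3, r=0.2, delta=0.1)

e_star = benchmark_steady_state(**params)
i_star = params["delta"] * e_star            # Eq. 10: I = delta * e
q_star = 1.0 + 2.0 * params["c"] * i_star    # Eq. 7 with A'(I) = 2cI

print("steady-state capital:", e_star)
print("steady-state investment:", i_star)
print("steady-state Tobin's q:", q_star)
print("residual of Eq. 14:", steady_state_residual(e_star, **params))
```

At the steady state, q computed from the first-order condition should coincide with R′(e)/(r + δ) from Eq. 12, which gives a second consistency check on the derivation.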

3 The Model with Adjustment Cost and Size Effects

Here we follow the model developed by Feichtinger et al. (2000). The model is a simple dynamic investment model with relative adjustment costs with capital size effects. The simplest example is an adjustment cost function of the investment to capital ratio, as in Feichtinger et al. (2000). Multiple steady states and a discontinuous policy function may arise depending on parameters. We employ the HJB method again to solve the problem with the aim of obtaining a global value function. We want to detect the threshold numerically when we construct a policy function. The policy function can be either continuous or discontinuous depending on the parameters.

Fig. 1. Continuous policy function (phase diagram in {I, K} space showing the $\dot I = 0$ and $\dot K = 0$ loci and the steady state e)

Consider a firm acting to maximize the present value of the sum of future net cash flows.

$$\max_{I} \int_0^\infty e^{-rt}\,\left\{R(K) - I - A\!\left(\frac{I}{K}\right)\right\}\,dt \tag{16}$$

$$\text{s.t.}\quad \dot K = I - \delta K; \qquad K_0 = \text{given} \tag{17}$$

where R(K) is a revenue function, K is the capital stock, I is investment, A(I/K) is the adjustment cost with size effects (i.e., the size of the capital stock K affects adjustment costs), δ is the depreciation rate, and r is the discount rate. Replacing the expression I/K by u, our new control variable is u:

$$\max_{u} \int_0^\infty e^{-rt}\,\{R(K) - uK - A(u)\}\,dt \tag{18}$$

$$\text{s.t.}\quad \dot K = uK - \delta K; \qquad K_0 = \text{given} \tag{19}$$

We assume that R(K) and A(u) are twice continuously differentiable, and

$$R(0) = 0,\quad R'(0) > 0,\quad R''(K) < 0 \text{ for all } K \tag{20}$$

$$A(0) = 0,\quad A'(0) = 0,\quad A'(u) > 0 \text{ for } u > 0,\quad A'(u) < 0 \text{ for } u < 0,\quad A''(u) > 0 \text{ for all } u \tag{21}$$

We also assume that

$$R'(0) > r + \delta \tag{22}$$

so that the trivial solution u = 0 for all t is excluded. For more specific results, let us assume that the revenue and adjustment cost functions are quadratic:

$$R(K) = aK - bK^2 \tag{23}$$

$$A(u) = cu^2 \tag{24}$$

where a, b, c > 0. The HJB equation for the present model has the form

$$rV(K) = \max_{u}\,\left[R(K) - uK - A(u) + V'(K)(uK - \delta K)\right] \tag{25}$$

Step 1. Compute the steady states for the stationary HJB equation. Solving ∂[•]/∂u = 0 gives the first-order condition

$$-K - A'(u) + V'(K)K = 0 \tag{26}$$

Again, it is important to notice that V′(K) is equivalent to Tobin's q. The following rule of optimal investment is derived:

$$u(t) = (A')^{-1}\left[(q - 1)K\right]; \qquad \begin{cases} u > 0 & \text{for } q(t) > 1 \\ u = 0 & \text{for } q(t) = 1 \\ u < 0 & \text{for } q(t) < 1 \end{cases} \tag{27}$$

Comparing Eq. 27, the investment rule under the relative adjustment cost with size effects, with Eq. 8, the rule under the adjustment cost with no size effects, the only difference is that the capital stock K enters the rule for u multiplicatively in Eq. 27. This implies that a firm with a larger capital stock has a higher incentive to invest, so that increasing returns to scale exist locally. This can be understood as the source of multiple steady states. With the specific functions Eqs. 23 and 24, Eq. 27 reads

$$u = \frac{1}{2c}\,\left(V'(K) - 1\right)K \tag{28}$$

If e is a steady state, then from Eq. 19

$$ue - \delta e = 0 \tag{29}$$

so that u = δ at any positive steady state e > 0, and

$$rV(e) = R(e) - \delta e - A(\delta) \tag{30}$$

$$V'(e) = \frac{1}{r}\,\left[R'(e) - \delta\right] \tag{31}$$

Thus, the first-order condition Eq. 26 at the positive steady state e > 0 becomes

$$-e - A'(\delta) + \frac{1}{r}\,\left[R'(e) - \delta\right]e = 0 \tag{32}$$

From the specific functions

$$R'(K) = a - 2bK \tag{33}$$

$$A'(u) = 2cu \tag{34}$$

Therefore, we obtain the positive steady states from the condition Eq. 32:

$$-e - 2c\delta + \frac{1}{r}\,\left[a - 2be - \delta\right]e = 0 \tag{35}$$

Thus,

$$\begin{cases} u = 0 & \text{for } e = 0 \\ u = \delta & \text{for } e > 0 \end{cases} \tag{36}$$

and we have three steady states

$$e = \begin{cases} 0 \\[4pt] \dfrac{a - r - \delta \pm \sqrt{(a - r - \delta)^2 - 16bc\delta r}}{4b} \end{cases} \tag{37}$$

Note that two positive steady states exist if $a - r - \delta > 4\sqrt{bc\delta r}$.

Step 2. Solve the dynamic HJB equation starting from the equilibrium candidates. From the optimal investment rule Eq. 27, the HJB equation to be satisfied is

$$rV(K) = R(K) - (A')^{-1}\big((V'(K) - 1)K\big)K - A\Big((A')^{-1}\big((V'(K) - 1)K\big)\Big) + V'(K)\Big((A')^{-1}\big((V'(K) - 1)K\big) - \delta\Big)K \tag{38}$$

For specific results, we substitute the optimal investment rule Eq. 28, and the HJB equation to be satisfied becomes

$$V'(K)^2 - \frac{2K + 4\delta c}{K}\,V'(K) - \frac{4crV(K)}{K^2} + \frac{4ac}{K} - 4bc + 1 = 0 \tag{39}$$

Then we obtain an ordinary differential equation in V:

$$V'(K) = \frac{K + 2\delta c}{K} - \sqrt{\frac{(K + 2\delta c)^2}{K^2} + \frac{4crV(K)}{K^2} - \frac{4ac}{K} + 4bc - 1} \quad \text{for } K \ge e \tag{40}$$

$$V'(K) = \frac{K + 2\delta c}{K} + \sqrt{\frac{(K + 2\delta c)^2}{K^2} + \frac{4crV(K)}{K^2} - \frac{4ac}{K} + 4bc - 1} \quad \text{for } K < e \tag{41}$$

with

$$V(e) = \frac{1}{r}\,\left[ae - be^2 - \delta e - c\delta^2\right] \quad \text{for each } e > 0 \tag{42}$$

as initial conditions.¹

Step 3. Solve the global value function. We can compute the global value function for the original problem by

$$V(K) = \max_i V_i \tag{43}$$

The local value functions V_i are generally computed numerically. The global value function is shown in Fig. 2 corresponding to Fig. 3, the phase diagram in {u, K} space. This example shows the case when the middle unstable steady state has a focus property. The point where the local value function changes is called the threshold or Skiba point. The phase diagram in {u, K} space can also be expressed in {I, K} space.

¹ As we see in Eqs. 36 and 39, u = 0 for e = 0 and u = δ for e > 0. Thus, the value of the value function at each steady state is expressed as V(e) = (1/r)[ae − be² − δe − cδ²] for e > 0 and V(0) = 0 for e = 0.

Fig. 2. Global value function (V(K) plotted against K, with the threshold and the steady states e2 and e3 marked)

Fig. 3. Discontinuous policy function in {u, K} space (the $\dot u = 0$ and $\dot K = 0$ loci, with the level u = δ, the threshold, and the steady states e2 and e3 marked)

Using some numerical examples presented by Feichtinger et al. (2000), we carry out some simulations.

Case 1: Continuous policy function. The Skiba point coincides with the middle equilibrium candidate e2. Example: r = 0.3, δ = 0.1, b = 0.6, c = 0.3, a = 1, e2 = 0.07047, e3 = 0.2129.

Case 2: Discontinuous policy function. The Skiba point does not coincide with the middle equilibrium candidate e2. Example: r = 0.2, δ = 1.2, b = 0.6, c = 0.3, a = 1, e2 = 0.017679, e3 = 0.5665, Skiba = 0.01.

When we have multiple steady states, there generally exist multiple local stable manifolds satisfying the system and the transversality condition. Our model has three steady states, and there are two local stable manifolds corresponding to the two stable steady states. Moreover, depending on the local dynamic property of the middle unstable steady state, these two stable manifolds overlap around the middle unstable steady state, as Fig. 3 shows. Our strategy for comparing the values of those candidate paths is to use the global value function Eq. 43. Using the global value function shown in Fig. 2, we can uniquely choose the optimal global stable manifold, which generates the maximum value among the multiple candidate stable manifolds. The global stable manifold in {u, K} space (or in {I, K} space) is equivalent to the policy function, and thus the policy function can be discontinuous, as in Fig. 3. Discontinuity is easily obtained depending on the parameters, and we cannot rule out this case a priori. Note that the discontinuity occurs at the threshold(s) where the local value function switches. For our parameter set above, the threshold is found² to be at 0.01, and thus is to the left of the middle equilibrium candidate e2.
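The steady-state formulas of this section can be evaluated directly. The sketch below computes the positive steady states from Eq. 37 and their values from Eq. 42, and checks them against the first-order condition Eq. 35. The parameter values and the function names are illustrative assumptions chosen for this sketch; they are not presented as the chapter's calibration.

```python
import math


def positive_steady_states(a, b, c, r, delta):
    """Positive steady states (e2, e3) from Eq. 37, or () if none exist."""
    y = a - r - delta
    disc = y * y - 16.0 * b * c * delta * r
    if y <= 0.0 or disc <= 0.0:
        return ()
    root = math.sqrt(disc)
    return (y - root) / (4.0 * b), (y + root) / (4.0 * b)


def steady_state_value(e, a, b, c, r, delta):
    """Value of the program at a positive steady state (Eq. 42)."""
    return (a * e - b * e * e - delta * e - c * delta * delta) / r


def foc_residual(e, a, b, c, r, delta):
    """Left-hand side of Eq. 35; zero at a positive steady state."""
    return -e - 2.0 * c * delta + (a - 2.0 * b * e - delta) * e / r


# Illustrative parameter values (an assumption made for this sketch only).
params = dict(a=1.0, b=0.6, c=0.3, r=0.2, delta=0.1)

e2, e3 = positive_steady_states(**params)
for label, e in (("e2", e2), ("e3", e3)):
    print(label, "=", e,
          " V =", steady_state_value(e, **params),
          " residual =", foc_residual(e, **params))
```

Locating the threshold itself requires integrating Eqs. 40 and 41 away from each steady state and taking the upper envelope of Eq. 43, which this sketch does not attempt.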

² That threshold for the above parameter set is computed in Grüne and Semmler (2004).

4 The Effects of Demand, Interest Rate, and Tax Rate

In this section, we analyze the effects of exogenous changes in demand, interest rate, and taxes on the steady states and the equilibrium path. The exogenous increase in demand raises the output of the firm, and thus the revenue will increase for the same amount of capital, entailing an upward shift of the revenue function R(K). This corresponds to an increase in a in our specific revenue function. From Eq. 35, we have³

$$\frac{\partial e_2}{\partial a} = -\frac{e_2}{(a - r - \delta) - 4be_2} < 0 \tag{44}$$

$$\frac{\partial e_3}{\partial a} = -\frac{e_3}{(a - r - \delta) - 4be_3} > 0 \tag{45}$$

An output increase will make the middle steady state shift down and the upper steady state shift up. In the same way, the effects of an increase in the interest rate can be analyzed as

$$\frac{\partial e_2}{\partial r} = \frac{e_2 + 2c\delta}{(a - r - \delta) - 4be_2} > 0 \tag{46}$$

$$\frac{\partial e_3}{\partial r} = \frac{e_3 + 2c\delta}{(a - r - \delta) - 4be_3} < 0 \tag{47}$$

An increase in the interest rate will shift the middle steady state up and the upper steady state down. Concerning the effects of the tax rate, we introduce an investment tax credit for simplicity, which is a direct rebate of χ percent of the price of capital. This changes the firm's optimal investment rule Eq. 27 as follows:

$$u(t) = (A')^{-1}\left[(q + \chi - 1)K\right]; \qquad \begin{cases} u > 0 & \text{for } q(t) + \chi > 1 \\ u = 0 & \text{for } q(t) + \chi = 1 \\ u < 0 & \text{for } q(t) + \chi < 1 \end{cases} \tag{48}$$

and the three steady states become

$$e = \begin{cases} 0 \\[4pt] \dfrac{a - r - \delta + \chi r \pm \sqrt{(a - r - \delta + \chi r)^2 - 16bc\delta r}}{4b} \end{cases} \tag{49}$$

Therefore, the effects of an increase in the rate of tax credit on the positive steady states are

$$\frac{\partial e_2}{\partial \chi} = -\frac{re_2}{(a - r - \delta + \chi r) - 4be_2} < 0 \tag{50}$$

$$\frac{\partial e_3}{\partial \chi} = -\frac{re_3}{(a - r - \delta + \chi r) - 4be_3} > 0 \tag{51}$$

³ From Eq. 37, $e_2 = (y - x^{1/2})/4b$ and $e_3 = (y + x^{1/2})/4b$, where $y \equiv a - r - \delta > 0$ and $x \equiv (a - r - \delta)^2 - 16bc\delta r > 0$. Therefore, $e_2 < y/4b$ and $e_3 > y/4b$ hold. This implies the signs of Eqs. 44 and 45.

An increase in the tax credit pushes the middle steady state down and the upper steady state up, and thus the domain of attraction associated with the upper steady state is enlarged.

Now our interest is how those changes in exogenous variables affect the firm's investment policy and its equilibrium path. It is convenient to construct a phase diagram for this purpose. So far, we have relied on the HJB method, where the discussion can be reduced to a one-dimensional state space. Here, we employ the maximum principle to see explicitly the equilibrium path in the control and state space, which is the equilibrium relationship between the investment decision and the level of the capital stock. We follow the model in Eqs. 18–22 with the specific functions Eqs. 23 and 24. The current value Hamiltonian⁴ is

$$H = R(K) - uK - A(u) + q\dot K = aK - bK^2 - uK - cu^2 + q(uK - \delta K) \tag{52}$$

The first-order necessary conditions are

$$H_u = -K - 2cu + qK = 0 \tag{53}$$

$$\dot q = (r + \delta)q - (a - 2bK) + (1 - q)u \tag{54}$$

$$\dot K = (u - \delta)K \tag{55}$$

Therefore, the system in {u, K} space is summarized as

$$\dot u = ru + \frac{1}{2c}\,(r + \delta - a)K + \frac{b}{c}\,K^2 \tag{56}$$

$$\dot K = (u - \delta)K \tag{57}$$

Solving $\dot u = \dot K = 0$ gives the steady states, which are the same as in Eq. 37. The phase diagram can be constructed by using the information from Eq. 52. We know from Eqs. 44–47 and 50 and 51 that an increase in demand, a decrease in the interest rate, and an increase of the tax credit shift up the $\dot u = 0$ curve.

⁴ Introducing the tax credit χ modifies the current value Hamiltonian as $H = aK - bK^2 - uK - cu^2 + (q + \chi)(uK - \delta K)$. The first-order conditions are $H_u = -K - 2cu + (q + \chi)K = 0$, $\dot q = (r + \delta)q - (a - 2bK) + (1 - q - \chi)u + \chi\delta$, and $\dot K = (u - \delta)K$. The system is rewritten as $\dot u = ru + (1/2c)(r + \delta - a - \chi r)K + (b/c)K^2$ and $\dot K = (u - \delta)K$. Solving $\dot u = \dot K = 0$ gives the steady states in Eq. 49.

Model with Multiple Steady States

67

The effects are shown in Fig. 4. After the events occur, $\dot u = 0$ moves up and $\dot K = 0$ stays put. This changes the steady-state values of the middle and upper steady states and, moreover, the threshold level. The threshold moves to the left and therefore the domain of attraction changes, i.e., the domain of attraction of the upper steady state is enlarged. One of the interesting phenomena in a system with multiple steady states is that such an exogenous change may move the firm from one trajectory to the other, each of which is associated with a different steady state. For example, as Fig. 4 shows, even though the firm's capital stock is on the shrinking trajectory at the beginning, the firm may switch to the growing trajectory after any of the suggested events (an increase in demand, a decrease in the interest rate, or an increase of the tax credit) because of the leftward movement of the threshold. This is likely to happen when such an event occurs before the capital stock shrinks below the new threshold. Unless the market takes a turn for the better (an increase in demand) or the government takes quick action to recover from a recession (through a decrease in the interest rate or an increase of the tax credit) before things really get worse (before the firm's capital stock falls below the new threshold), there will be little chance for the firms to come back to the growing trend. It is also reasonable to think that firms will ride on the upswing of the market in a boom time, thus raising their investment. On the other hand, a recession may cause firms to change direction, tightening their investment, and their capital stock may even shrink. When firms' investment is shrinking, not only the scale of the policy intervention but also a quick reaction by the government matters. When government action comes too late, a large policy intervention is required to alter the situation, since the government has to bring the threshold down to a very low level. The quicker the policy reaction is, the smaller and less costly is the intervention required to save the situation.

Fig. 4. Comparative dynamic results (phase diagram in {u, K} space showing the $\dot u = 0$ and $\dot K = 0$ loci, the level u = δ, the threshold, the jump, and the steady states e′2, e2, e3, and e′3)
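The comparative statics of the positive steady states can be cross-checked numerically. The sketch below compares closed forms obtained by implicit differentiation of the steady-state condition with central finite differences of the roots in Eq. 49, all evaluated at χ = 0. The parameter values, the function names, and the step size are assumptions made for this illustration.

```python
import math


def positive_steady_states(a, b, c, r, delta, chi=0.0):
    """Positive steady states (e2, e3) from Eq. 49 (Eq. 37 when chi = 0)."""
    y = a - r - delta + chi * r
    root = math.sqrt(y * y - 16.0 * b * c * delta * r)
    return (y - root) / (4.0 * b), (y + root) / (4.0 * b)


def closed_form_derivatives(e, a, b, c, r, delta):
    """Derivatives of a positive steady state e at chi = 0.

    Obtained by implicit differentiation of
    2*b*e**2 - (a - r - delta + chi*r)*e + 2*c*delta*r = 0.
    """
    denom = (a - r - delta) - 4.0 * b * e
    return {
        "a": -e / denom,
        "r": (e + 2.0 * c * delta) / denom,
        "chi": -r * e / denom,
    }


def finite_difference(name, index, base, step=1e-6):
    """Central difference of steady state number `index` w.r.t. parameter `name`."""
    up, down = dict(base), dict(base)
    up[name] += step
    down[name] -= step
    return (positive_steady_states(**up)[index]
            - positive_steady_states(**down)[index]) / (2.0 * step)


# Illustrative parameter values (an assumption made for this sketch only).
base = dict(a=1.0, b=0.6, c=0.3, r=0.2, delta=0.1, chi=0.0)
no_chi = {k: v for k, v in base.items() if k != "chi"}

for index, label in enumerate(("e2", "e3")):
    e = positive_steady_states(**base)[index]
    exact = closed_form_derivatives(e, **no_chi)
    for name in ("a", "r", "chi"):
        print(label, name, exact[name], finite_difference(name, index, base))
```

For these values the middle steady state falls with a and χ and rises with r, while the upper steady state moves in the opposite direction, in line with the discussion of Fig. 4.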

5 Kernel Density and Bootstrap Test

Next we want to present empirical evidence that may confirm the long-run twin-peak distribution of firm size which is implied by the dynamic investment model studied above. To address this question, we concentrate on the firm size distribution in the US manufacturing industry. The data of the following empirical study are taken from the pstar dataset used in Hall and Hall (1993). The data set contains 23 variables which quantify certain characteristics, such as investment and stock price or assets' value, of US firms in the manufacturing industry for the time period from 1960 to 1991. We use the variable netcap, which is defined as the book value of assets, adjusted for the effects of inflation, to represent the capital stock. In the following, net capital, normalized by the average net capital of all firms, will be used as a measure of size:

$$k(i)_t = \frac{\text{Net capital of firm } i \text{ in year } t}{\text{Average net capital in year } t} \tag{58}$$

Thus, $k(i)_t = 1$ indicates that firm i has a net capital that equals the average net capital, and $k(i)_t = 0.5$ means that firm i has half of the average net capital. The pstar data set gives us a set of observed data points or capital stocks which can be interpreted as a sample from an unknown probability density function for several years. To analyze certain characteristics of this density, such as the number of modes (which are defined as local maxima of the density), one has to determine the unknown density. If, for example, the density function has changed from a unimodal to a bimodal one, this can be regarded as evidence in favor of the theoretical ideas stated above. Accordingly, the aim in this section is to estimate the unknown density function from the observed data, especially at the end of the sample period, to see if it is bimodal. To address this aim, a nonparametric approach is used. Whereas the parametric approach makes strong assumptions about the distribution of the data (e.g., normal distribution), the nonparametric approach is characterized by less rigid assumptions on the distribution of the observed data.⁵ The only necessary assumption which will be made is that a probability density function exists, thus "letting the data itself determine the density function". In the literature there are various kinds of nonparametric density estimators, such as kernels, splines, orthogonal series, or histograms.⁶ We will concentrate on one of the most common, namely kernel density estimation, which has recently become a standard method in explorative data analysis.⁷ The idea of kernel density estimators is based on Rosenblatt (1957) and Parzen (1962), and the simplest form is defined as

$$\hat f(x) = \frac{1}{nh} \sum_{i=1}^{n} W\!\left(\frac{x - X_i}{h}\right) \tag{59}$$

where $X_i$ (i = 1, . . . , n) are the observations of the data set, h is the window width (equivalently called the smoothing parameter or bandwidth), and W = W(u) represents the kernel. Let $\hat f(x; h)$ be a kernel density estimate which uses data x and window width h. Then Fig. 5 displays the number of modes in the Gaussian kernel density estimate for 1990 as a step function of the window width h, where the critical window widths are the points of discontinuity. From Fig. 5, it is obvious that inferences which can be drawn from the kernel estimator strongly depend on the choice of the window width h. Unfortunately, there is still no generally accepted approach to determine the "correct" window width. Recent approaches range from subjective choice to rather automatic ones, which try to define optimal criteria.⁸ Terrell (1990) assumes that most density estimations are based on subjective choice; thus the researcher chooses the window width that best fits his aims. On this basis, a lot of effort was undertaken to identify appropriate, data-driven window-width selection methods.⁹ The most popular choice for the window width is the so-called optimal bandwidth introduced by Silverman (1986), which is defined as

$$h = 1.06\,\sigma\,n^{-1/5} \tag{60}$$

⁵ See Silverman (1986), p. 1.
⁶ For a detailed overview, see for example Bean and Tsokos (1980) and Tapia and Thompson (1978).
⁷ See Turlach (1993), p. 1.
⁸ The reader who is interested in a brief summary of these methods is referred to Turlach (1993).
⁹ See, for example, Jones (1991), Silverman (1986), and many others.

Fig. 5. Number of modes in the Gaussian kernel estimation in 1990

Fig. 6. Kernel density estimate in 1990
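The dependence of the number of modes on the window width can be illustrated with a small numerical sketch. It implements the estimator of Eq. 59 with a standard normal kernel and the rule-of-thumb bandwidth of Eq. 60, reading the latter as 1.06 times the sample standard deviation times n to the power −1/5. The data are synthetic and deterministic (they are not the pstar data), and the function names and the grid-based mode count are assumptions made for this illustration.

```python
import numpy as np


def gaussian_kde(x_grid, data, h):
    """Kernel density estimate of Eq. 59 with a standard normal kernel W."""
    z = (x_grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (len(data) * h * np.sqrt(2.0 * np.pi))


def silverman_bandwidth(data):
    """Rule-of-thumb window width (Eq. 60): 1.06 * sigma * n**(-1/5)."""
    return 1.06 * np.std(data, ddof=1) * len(data) ** (-0.2)


def count_modes(data, h, grid_size=2001):
    """Count strict local maxima of the estimate on an evenly spaced grid."""
    grid = np.linspace(data.min() - 3.0 * h, data.max() + 3.0 * h, grid_size)
    f = gaussian_kde(grid, data, h)
    return int(np.sum((f[1:-1] > f[:-2]) & (f[1:-1] > f[2:])))


# Synthetic sample with two well-separated clusters (illustration only).
data = np.concatenate([np.linspace(-0.5, 0.5, 60), np.linspace(4.5, 5.5, 40)])

for h in (0.3, silverman_bandwidth(data), 5.0):
    print("window width", h, "-> modes:", count_modes(data, h))
```

A small window width resolves both clusters, whereas a very large one smooths them into a single mode, which is the step-function behaviour described for Fig. 5.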

The use of Silverman's optimal bandwidth for a kernel estimate for the year 1990 results in an optimal bandwidth h = 0.3183, which implies a kernel density estimate with one mode, therefore indicating unimodality. This is shown in Fig. 6. Another way of addressing the problem of the number of modes is to perform hypothesis testing, and therefore it is appropriate to construct a bootstrap test for multimodality, which tests the hypothesis that a Gaussian density estimate has one, two, or in general m modes. The approach of a bootstrap multimodality test is used because, for example, normal tests or permutation tests of multimodality are not efficient in this case. However, a bootstrap test is an effective way to test for multimodality. In the context of this work, suppose that the data on firms' capital stocks for every year are an i.i.d. (independent and identically distributed) sample from an unknown distribution F with continuous density p(x). To obtain certain properties of the density, like the number of modes, one can apply the bootstrap multimodality test, which requires drawing bootstraps¹⁰ from the empirical distribution $F_n$.¹¹ Accordingly, the sample of firms' capital stocks can be used to represent the empirical distribution. The problem considered in this section is testing the null hypothesis H0 that the density has one single mode, versus the alternative hypothesis H1 that it has two or more modes.

H0: f(x) has 1 mode versus H1: f(x) has more than 1 mode

To carry out this test, it is necessary to specify a test statistic. A reasonable choice is the critical window width $\hat h_1$, which is the smallest window width needed to obtain a kernel density estimate with one mode. A large value of $\hat h_1$ implies that a lot of smoothing has to be done to obtain a kernel estimate with one mode, and thus is evidence against H0. Next, one has to define the estimated distribution under the null hypothesis, which is an estimated null distribution for the test of H0. A reasonable choice may be $\hat f_0(x; \hat h_1)$, which is intuitively the density estimate that uses the smallest amount of smoothing among all estimates with one mode. The problem of using $\hat f_0(\cdot; \hat h_1)$ is that it artificially increases the variance of the bootstrap sample relative to the variance of the actual data set. To avoid this problem, $\hat f_0(\cdot; \hat h_1)$ is rescaled to have variance equal to the sample variance, and the result is denoted the smoothed density estimate $\hat g_0(\cdot; \hat h_1)$. Finally, it remains to assess the significance level of the observed value of $\hat h_1$. The bootstrap multimodality hypothesis test is based on the so-called achieved significance level $\mathrm{ASL}_{boot}$:

$$\mathrm{ASL}_{boot} = \mathrm{Prob}_{\hat g_0(\cdot;\hat h_1)}\{\hat h_1^* > \hat h_1\} \tag{61}$$

¹⁰ Formally, a bootstrap sample is constructed by randomly sampling n times with replacement from a data sample. The data sample, or the distribution from which the bootstraps are drawn, is denoted the empirical distribution $F_n$, and the replacement of F by $F_n$ is called the "plug-in" principle.
¹¹ It can be shown that $F_n$ is a nonparametric maximum likelihood estimator of F, and therefore it is justified to estimate the true unknown distribution by the empirical distribution if no other information on F is available (for instance, that F belongs to a specific parametric class).

where $\hat h_1$ is fixed at its observed value from the data set, and $\hat h_1^*$ is the critical window width producing one mode for the bootstrap sample $\hat x_i^*$ (i = 1, . . . , n). To approximate $\mathrm{ASL}_{boot}$, it is necessary to draw the bootstrap samples from the smoothed density estimate $\hat g_0(\cdot; \hat h_1)$. The smoothed bootstrap data $\hat x_i^*$ (i = 1, . . . , n) are obtained by drawing with replacement a sample $y_i^*$ (i = 1, . . . , n) from the actual data set, and then setting

$$\hat x_i^* = \bar y^* + \left(1 + \hat h_1^2/\hat\sigma^2\right)^{-1/2}\left(y_i^* - \bar y^* + \hat h_1\varepsilon_i\right), \qquad (i = 1, \ldots, n) \tag{62}$$

where $\bar y^*$ denotes the mean of $y_i^*$ (i = 1, 2, . . . , n), $\hat\sigma^2$ denotes the sample variance of the data, and $\varepsilon_i$ are i.i.d. standard normal random variables. Now $\mathrm{ASL}_{boot}$ is approximated by

$$\widehat{\mathrm{ASL}}_{boot} = \frac{\#\{\hat h_1^*(b) \ge \hat h_1\}}{B} \tag{63}$$

where B is the number of bootstrap replicates, # denotes the number of elements of the set, and b = 1, 2, . . . , B. If, for instance, a significance level of 10% is used, then H0 will be rejected if $\widehat{\mathrm{ASL}}_{boot}$ < 10%. Having described the basic ideas of the bootstrap multimodality test, we apply it below.

Applying the bootstrap multimodality test. H0: f(x) has 1 mode vs. H1: f(x) has two or more modes. Computing

$$\widehat{\mathrm{ASL}}_{boot} = \frac{\#\{\hat h_1^*(b) \ge \hat h_1\}}{B}$$

with 500 smoothed bootstraps of the original data on the logarithm of netcap in 1990, a significance level of 10%, and $\hat h_1 = 0.35032$ results in 435 smoothed bootstrap samples that have $\hat h_1^*(b) \ge \hat h_1$, and therefore

$$\widehat{\mathrm{ASL}}_{boot} = \frac{\#\{\hat h_1^*(b) \ge \hat h_1\}}{B} = 87\%$$

Consistent with the kernel estimate above, the test indicates that the null hypothesis cannot be rejected in favor of the hypothesis of a bimodal density. The $\widehat{\mathrm{ASL}}_{boot}$ is about 87%, which supports the hypothesis of just one single peak in the true density. The main problem, however, in using a (nonparametric) density estimator to study the long-run dynamics of firms is that it does not consider the potential dynamics of firm distributions over time. In other words, although the true long-run distribution may indeed be bimodal, a density estimate for the year 1990, or even 2003, need not be bimodal because it may take some time for firms to group around two stable steady states. Therefore, a density estimate that is unimodal for a certain time period does not contradict the hypothesis of a future (long-run) bimodal firm size distribution. The next section will model the dynamics of the firm size distribution and determine long-run distributions that may be stationary (with two steady states). The Markov chain approach will be used to remedy the latter disadvantage.
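The bootstrap multimodality test described in this section can be sketched in a few functions. The sketch below finds the critical window width by bisection, draws smoothed bootstrap samples as in Eq. 62, and approximates the achieved significance level of Eq. 63. Several choices are assumptions made for this illustration rather than the chapter's procedure: the data are synthetic, modes are counted on a finite grid, σ̂² is taken to be the variance of the observed sample, and the event that the bootstrap critical width exceeds ĥ1 is checked by testing whether the bootstrap estimate at window width ĥ1 has more than one mode, which relies on the mode count of a Gaussian kernel estimate not increasing in h.

```python
import numpy as np


def count_modes(data, h, grid_size=513):
    """Strict local maxima of the Gaussian kernel estimate on a grid."""
    grid = np.linspace(data.min() - 3.0 * h, data.max() + 3.0 * h, grid_size)
    z = (grid[:, None] - data[None, :]) / h
    f = np.exp(-0.5 * z ** 2).sum(axis=1)  # the scale factor does not move the modes
    return int(np.sum((f[1:-1] > f[:-2]) & (f[1:-1] > f[2:])))


def critical_bandwidth(data, tol=1e-3):
    """Smallest window width (up to tol) giving at most one mode, by bisection."""
    lo, hi = 0.0, 2.0 * (data.max() - data.min())
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_modes(data, mid) <= 1:
            hi = mid
        else:
            lo = mid
    return hi


def smoothed_bootstrap_sample(data, h1, rng):
    """One smoothed, variance-rescaled bootstrap sample (Eq. 62)."""
    y = rng.choice(data, size=len(data), replace=True)
    eps = rng.standard_normal(len(data))
    sigma2 = np.var(data, ddof=1)  # assumption: variance of the observed sample
    scale = (1.0 + h1 ** 2 / sigma2) ** -0.5
    return y.mean() + scale * (y - y.mean() + h1 * eps)


def asl_boot(data, n_boot=500, seed=0):
    """Critical width h1 and the approximate achieved significance level (Eq. 63)."""
    rng = np.random.default_rng(seed)
    h1 = critical_bandwidth(data)
    exceed = 0
    for _ in range(n_boot):
        x_star = smoothed_bootstrap_sample(data, h1, rng)
        # The bootstrap critical width exceeds h1 when the estimate at h1 is multimodal.
        exceed += count_modes(x_star, h1) > 1
    return h1, exceed / n_boot


# Synthetic samples (illustration only, not the pstar data).
rng = np.random.default_rng(1)
unimodal = rng.normal(0.0, 1.0, 100)
bimodal = np.concatenate([rng.normal(0.0, 1.0, 50), rng.normal(8.0, 1.0, 50)])

for name, sample in (("unimodal", unimodal), ("bimodal", bimodal)):
    h1, asl = asl_boot(sample, n_boot=100)
    print(name, "critical width:", h1, "ASL:", asl)
```

A large critical width combined with a small achieved significance level speaks against a single mode; with continuous data, ties between the bootstrap critical width and ĥ1 have probability zero, so the strict and weak inequalities give the same count.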

6 Markov Chain Approach

Assume that the population of US manufacturing firms is classified into several discrete size classes (states) according to some criteria that represent the capital stock. In this work, the value of a firm's book assets (netcap) in relation to the average value of all firms will be used, although other variables such as investment could also be used, because in general they can be expected to be strongly correlated. The relative netcap is now defined as

$$k_i(n) = \frac{K_i(n)}{\sum_i K_i(n)} \tag{64}$$

where $k_i(n)$ denotes the relative netcap in year n, $K_i(n)$ is the absolute value of netcap in year n, and the state space is considered as follows:

$$0 \le k \le \tfrac{1}{4}, \qquad \tfrac{1}{4} < k \le \tfrac{1}{2}, \qquad \tfrac{1}{2} < k \le 1, \qquad 1 < k \le 2, \qquad k > 2$$

The main problem of applying the Markov chain approach is the determination of the corresponding transition probabilities. It is obvious that the transition probabilities and the probability distribution are not known, and accordingly they have to be estimated from the observed data. To address this issue, one considers the variable "relative netcap," and interprets it as given realizations of a Markov chain. With the help of these realizations, it is possible to determine the maximum likelihood estimator of the true transition probabilities. The procedure for obtaining the maximum likelihood estimates of the true transition probabilities begins with the determination of so-called transition numbers. Transition numbers indicate the number of firms' transitions from state i to state j between two successive time points. The realizations of transition numbers can be summarized in the form of a matrix, called a fluctuation matrix:

$$F(n-1, n) = \left[f_{ij}(n-1, n)\right]_{i,j \in S,\; n \in N}$$

For the purpose of illustration, the fluctuation matrix for the time period 1973–1974 is presented in Table 1.

Table 1. Example of a fluctuation matrix

                  k ≤ 1/4   1/4 < k ≤ 1/2   1/2 < k ≤ 1   1 < k ≤ 2   k > 2
k ≤ 1/4             837          12               0            0         0
1/4 < k ≤ 1/2        12         177              12            0         0
1/2 < k ≤ 1           1           8              99           11         0
1 < k ≤ 2             0           0               5           78         7
k > 2                 0           0               0            2       129

To interpret these fluctuation matrices one might look, for example, at the first row of the fluctuation matrix F(1973, 1974). During the time period 1973 to 1974, 837 firms that had a relative netcap in the interval k ≤ 1/4 in 1973 remained there after 1 year. Twelve firms that had a relative netcap in the interval k ≤ 1/4 in 1973 moved to the interval 1/4 < k ≤ 1/2, and so on. As in Onatski (2003), we used a sample period of 22 years (1973–1990) to allow the sample to be a balanced panel. Now it is possible to assess the maximum likelihood estimator. The maximum likelihood estimator for the true transition probabilities on the basis of s realizations of a stationary Markov chain is given by

$$\hat p_{ij} = \frac{\sum_{n=1}^{N} f_{ij}(n-1, n)}{\sum_{n=1}^{N} f_{i}(n-1, n)} = \frac{f_{ij}}{f_i} \tag{65}$$

It is worth noting that $\hat p_{ij}$ is the empirically derived relative frequency, and is therefore only an approximation of the true transition probability. However, Anderson and Goodman (1957) have shown that this approximation is indeed a maximum likelihood estimate for a first-order Markov chain.12 Applying the maximum likelihood estimator results in the following transition matrix:

$$P = \begin{pmatrix}
0.951 & 0.023 & 0.008 & 0.005 & 0.013 \\
0.228 & 0.692 & 0.067 & 0.005 & 0.007 \\
0.162 & 0.069 & 0.700 & 0.062 & 0.010 \\
0.155 & 0.008 & 0.085 & 0.674 & 0.078 \\
0.158 & 0.006 & 0.0006 & 0.032 & 0.799
\end{pmatrix}$$

The estimated transition matrix P makes it possible to determine the equilibrium distribution or limiting distribution of firms in the US manufacturing industry. To guarantee that an equilibrium exists, the transition matrix has to satisfy some properties, which are presented in the following lemma.

12 See Anderson and Goodman (1957), p. 90.


Lemma: An irreducible and aperiodic Markov chain with a finite state space always has a limiting distribution, which is the unique stationary distribution and is independent of the initial distribution of the Markov chain.

To apply this lemma, it first has to be shown that the chain described by the transition matrix is irreducible and aperiodic.

1. A Markov chain is called irreducible if all states of the chain communicate, i.e., if it is possible to get from every state to every other state in a finite number of steps. Formally, for each pair of states $(i, j) \in S \times S$ there must exist an integer $n = n(i, j) \in \mathbb{N}$ with $p^{(n)}_{ij} > 0$. Since every probability in the transition matrix P is strictly positive ($p^{(1)}_{ij} > 0$), the chain is irreducible.
2. Let the set $U_i = \{n \ge 1 : p^{(n)}_{ii} > 0\}$ contain the numbers of steps after which the chain returns to state $i$ with positive probability. The period of state $i$ is defined as

$$d(i) = \begin{cases} \infty & \text{if } U_i = \emptyset \\ \gcd(U_i) & \text{if } U_i \neq \emptyset \end{cases}$$

If $d(i)$ equals 1 or $\infty$, state $i$ is called aperiodic. Because every diagonal element of P is positive, $1 \in U_i$ and hence $d(i) = 1$ for every state, so the Markov chain is aperiodic.

Since the Markov chain is irreducible and aperiodic, there exists a limiting distribution that is also the unique stationary distribution and that is independent of the initial distribution. To determine the limiting distribution, one has to solve the following equations:

$$p = pP \tag{66}$$

$$p_1 + p_2 + p_3 + p_4 + p_5 = 1 \tag{67}$$
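As a rough numerical cross-check, Eqs. 66 and 67 can be solved by repeatedly applying the transition matrix to an arbitrary starting distribution. The sketch below assumes that the printed matrix is read so that each row sums to approximately one, and it rescales the rows to remove rounding error; the function name and the tolerance are illustrative choices, not part of the original study.

```python
# Estimated transition matrix as printed in the text (entries are rounded).
P = [
    [0.951, 0.023, 0.008, 0.005, 0.013],
    [0.228, 0.692, 0.067, 0.005, 0.007],
    [0.162, 0.069, 0.700, 0.062, 0.010],
    [0.155, 0.008, 0.085, 0.674, 0.078],
    [0.158, 0.006, 0.0006, 0.032, 0.799],
]


def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Iterate p <- pP from the uniform distribution until it stops changing."""
    # Rescale each row so that it sums exactly to one.
    P = [[p / sum(row) for p in row] for row in P]
    k = len(P)
    dist = [1.0 / k] * k
    for _ in range(max_iter):
        new = [sum(dist[i] * P[i][j] for i in range(k)) for j in range(k)]
        if max(abs(a - b) for a, b in zip(new, dist)) < tol:
            return new
        dist = new
    return dist


p_star = stationary_distribution(P)
print([round(x, 3) for x in p_star])
```

Because the chain is irreducible and aperiodic, the starting distribution does not matter; any other probability vector would converge to the same limit.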

The solution of the equations results in a unique stationary distribution given by

$$p_1 = 0.787, \quad p_2 = 0.071, \quad p_3 = 0.046, \quad p_4 = 0.030, \quad p_5 = 0.066$$

It is obvious that this equilibrium distribution is bimodal or, in the words of Quah,13 “twin-peaked,” since the probabilities of being in the extreme size groups in the equilibrium state distribution are larger than that of being in the middle state (which is sometimes referred to as thinning in the middle).14 In contrast, the results of the kernel density estimate indicated a unimodal distribution of US manufacturing firms. However, the application of the

13 See, e.g., Quah (1997), p. 28.
14 See Quah (1993), p. 430.


Markov chain approach shows that in the long run a multiple steady state exists, with a large group at the low tail and a smaller group at the high tail. Thus, the empirical study shows that in the long run the unimodal distribution ceteris paribus tends to change to a bimodal one. This implies an increase in the degree of concentration, because there is a relatively small group of firms (state 5: 7%) with a capital stock that is at least twice as large as the average capital stock in the US manufacturing industry, and a relatively large group of firms (state 1: 79%) with a capital stock of less than a quarter of the average capital stock in the industry.

With reference to the discrete Markov chain approach, researchers sometimes argue that the choice of the class-size intervals has an important influence on the corresponding ergodic distribution. For instance, if one group is accidentally divided into two pieces by using different definitions or numbers of class-size intervals, then the bimodal ergodic distribution may turn into a trimodal distribution, thus leading to a totally different interpretation of the equilibrium state. A second problem associated with discrete Markov chains is that discretization can have great distorting effects on dynamics if the underlying variables (which in our case are firm asset sizes) are continuous.15

However, an approach called the stochastic kernel approach has recently been developed to overcome the shortcomings of discrete and arbitrarily defined class-size intervals. Informally, the idea of the stochastic kernel is to avoid the division into a discrete, countable number of states, and to allow the number of states to tend to infinity and then to a continuum. Thus, the resulting transition matrix becomes a matrix with a continuum of rows and columns, or a stochastic kernel. The dynamics of firms’ asset sizes are assumed to be governed by this stochastic kernel in a similar way to a first-order autoregression process.

To get an adequate understanding of the stochastic kernel, consider the following description. Let $\mathbb{R}$ denote the real line and $\mathcal{B}$ the Borel $\sigma$-algebra on it. Furthermore, let $\mu$ and $\nu$ be probability measures on $\mathcal{B}$. Then a stochastic kernel relating $\mu$ and $\nu$ is a mapping $M_{(\mu,\nu)}: \mathbb{R} \times \mathcal{B} \to [0,1]$ satisfying the following three conditions.

1. For all $x \in \mathbb{R}$, the restriction $M_{(\mu,\nu)}(x, \cdot)$ is a probability measure.
2. For any $A \in \mathcal{B}$, the restriction $M_{(\mu,\nu)}(\cdot, A)$ is $\mathcal{B}$-measurable.
3. For any $A \in \mathcal{B}$, $\mu(A) = \int M_{(\mu,\nu)}(x, A)\, d\nu(x)$.

15 See, e.g., Chung (1974).


The stochastic kernel is thus the uncountable-state generalization of a matrix or, in our case, of the transition matrix, and it is completely sufficient to describe the transitions from state $x$ to any other portion of the underlying state space.16 Next, we will show how the stochastic kernel can be used to model the dynamics of firms’ asset sizes. Defining the distribution of observations of firms’ assets at time $t$ by $\lambda_t$, the stochastic kernel describes the law of motion, as stated above, through an operator $M$ that maps the Cartesian product of firm asset sizes and Borel-measurable sets to the interval [0,1]. The simplest form of the law of motion is:17

$$\lambda_{t+1}(A) = \int M_t(x, A)\, d\lambda_t(x) \tag{68}$$

Accordingly, if one interprets $d\nu(x)$ as the density of $\lambda_t$ and $d\mu(y)$ as the density of $\lambda_{t+s}$, then $M_{t+s}(x, dy)\, d\lambda_t(x)$ has to be regarded as the conditional density of $\lambda_{t+s}$ given $\lambda_t$. On this basis, the stochastic kernel at time $t + s$ can be regarded as the transition matrix at $t + s$ with an uncountably infinite number of rows and columns. Thus, each firm’s asset size value represents its own class-size group. The problem of arbitrary class-size intervals is solved because it is no longer necessary to construct artificial and subjective class-size intervals.

Again, the stochastic kernel is estimated using the balanced panel of the pstar data set for the period 1973–1990. The stochastic kernel is estimated nonparametrically by applying a Gaussian kernel and the optimal window width, as suggested in Silverman (1986). To estimate the stochastic kernel, the joint distribution of the logarithm of the relative netcap is first derived. Subsequently, the implied marginal distribution in 1973 is determined by numerically integrating under this joint distribution. Finally, the stochastic kernel is obtained by dividing the joint distribution by the marginal distribution. The resulting stochastic kernel, which relates the distribution of log firm assets in 1973 to the distribution of log firm assets in 1990, is shown in Fig. 7.

The stochastic kernel estimate confirms the results of the transition matrix and ergodic distribution obtained previously. Figure 7 clearly indicates that the probability mass has two peaks at the two ends of the mass. As stated above, this result gives empirical support to divergence and polarization, with a tendency of the diverging states to cluster at a high firm asset level or at a low firm asset level, respectively.
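The three estimation steps (joint density, marginal density, ratio of the two) can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic, the bandwidth is the normal-reference rule of thumb (one of the choices discussed by Silverman), the function names are hypothetical, and the marginal is obtained in closed form, which for a product Gaussian kernel coincides with integrating the joint estimate over the second argument.

```python
import math
import random
import statistics


def silverman_bandwidth(values):
    """Normal-reference rule of thumb: 1.06 * sd * n^(-1/5)."""
    return 1.06 * statistics.stdev(values) * len(values) ** (-0.2)


def gauss(u, h):
    """Gaussian kernel with bandwidth h evaluated at distance u."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2.0 * math.pi))


def stochastic_kernel(x0, y_grid, xs, ys):
    """Estimated density of y given x = x0: joint estimate over marginal."""
    hx, hy = silverman_bandwidth(xs), silverman_bandwidth(ys)
    weights = [gauss(x0 - xi, hx) for xi in xs]
    marginal = sum(weights) / len(xs)
    kernel = []
    for y in y_grid:
        joint = sum(w * gauss(y - yi, hy) for w, yi in zip(weights, ys)) / len(xs)
        kernel.append(joint / marginal)
    return kernel


# Synthetic stand-in for (initial log size, later log size) pairs.
random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(200)]
ys = [0.9 * x + random.gauss(0.0, 0.3) for x in xs]

hy = silverman_bandwidth(ys)
lo, hi = min(ys) - 6 * hy, max(ys) + 6 * hy
steps = 2000
y_grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]

density = stochastic_kernel(0.0, y_grid, xs, ys)
dy = (hi - lo) / steps
# Trapezoidal rule: each conditional density should integrate to about one.
mass = sum(0.5 * (a + b) * dy for a, b in zip(density, density[1:]))
print(round(mass, 4))
```

Evaluating `stochastic_kernel` on a grid of initial sizes `x0` gives the surface that a figure such as Fig. 7 displays; the check on `mass` mirrors condition 1 of the definition, namely that each slice is a probability measure.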

16 See Quah (1997), p. 46.
17 See Quah (1997), p. 47.


Fig. 7. Stochastic kernel of log ﬁrm assets

7 Conclusion

We have studied a simple dynamic investment decision problem for a firm where adjustment costs have capital size effects. We have shown that with this type of model, one can easily obtain multiple steady states and thresholds as well as a discontinuous policy function, giving rise to discontinuous investment behavior. We studied the global dynamic properties of such a model by employing the Hamilton–Jacobi–Bellman method and dynamic programming, which helped us in the numerical detection of multiple equilibria, the thresholds, and the jump in investment. We also explored the model’s implications for the effects of aggregate demand, interest rates, and tax rates. Finally, an empirical study of the firm size distribution was provided using US firm-size data. We utilized two different approaches, kernel density estimation and a Markov chain transition matrix, to study an ergodic distribution. Our results suggest a twin-peak distribution of firm size in the long run, which can be viewed as empirical support for the existence of multiple steady states, as predicted in the analytical part of the chapter.


References

Abel AB (1982) Dynamic effects of permanent and temporary tax policies in a q model of investment. J Monetary Econ 9:353–373
Anderson TW, Goodman LA (1957) Statistical inference about Markov chains. Ann Math Stat 28:89–110
Bean SJ, Tsokos CP (1980) Developments in nonparametric density estimation. Int Stat Rev 48:267–287
Chung KL (1974) Elementary probability theory with stochastic processes. Springer-Verlag, New York
Eisner R, Stroz R (1963) Determinants of business investment. Impacts on monetary policy. Prentice Hall, Englewood Cliffs
Feichtinger G, Hartl FH, Kort P, Wirl F (2000) The dynamics of a simple relative adjustment–cost framework. Ger Econ Rev 2:255–268
Gould JP (1968) Adjustment costs in the theory of investment of the firm. Rev Econ Stud 35:47–55
Grüne L, Semmler W (2004) Using dynamic programming with adaptive grid scheme for optimal control problems in economics. J Econ Dyn Control 30:2427–2456
Hall BW, Hall RE (1993) The value and performance of US corporations. Brookings Pap Econ Activity 1:1–49
Hayashi F (1982) Tobin’s marginal q and average q: a neoclassical interpretation. Econometrica 50:213–224
Jones MC (1991) The roles of ISE and MISE in density estimation. Stat Prob Lett 12:51–56
Jorgenson DW (1963) Capital theory and investment behavior. Am Econ Rev 53:247–259
Lucas RE (1967) Adjustment costs and the theory of supply. J Polit Econ 75:321–334
Onatski A (2003) Is the firm size distribution twin-peaked? Economics Department, Columbia University, Mimeo
Parzen E (1962) On estimation of a probability density and mode. Ann Math Stat 33:1065–1076
Quah DT (1993) Empirical cross-section dynamics in economic growth. Eur Econ Rev 37:426–434
Quah DT (1997) Empirics for growth and distribution: stratification, polarization, and convergence clubs. J Econ Growth 2:27–59
Rosenblatt M (1956) Remarks on some nonparametric estimates of a density function. Ann Math Stat 27:823–837
Silverman BW (1986) Density estimation for statistics and data analysis. Chapman and Hall, London
Summers LH (1981) Taxation and corporate investment: a q-theory approach. Brookings Pap Econ Activity 1:67–127
Tapia RA, Thompson JR (1978) Nonparametric density estimation. Johns Hopkins University Press, Baltimore
Terrell GR (1990) The maximal smoothing principle in density estimation. J Am Stat Assoc 85:470–477
Tobin J (1969) A general equilibrium approach to monetary theory. J Money Credit Banking 1:15–29


Treadway AB (1969) On rational entrepreneurial behavior and the demand for investment. Rev Econ Stud 36:227–239
Turlach BA (1993) Bandwidth selection in kernel density estimation: a review. Working Papers in Statistic and Oekonometrie, Humboldt University, Berlin
Uzawa H (1968) The Penrose effect and optimum growth. Econ Stud Q XIX:1–14
Uzawa H (1969) Time preference and the Penrose effect in a two-class model of economic growth. J Polit Econ 77:628–652

5. Significance of the Keynesian Legacy from a Theoretical Viewpoint: A High-Dimensional Macrodynamic Approach

Toichiro Asada

Summary. In this chapter we consider the significance of the “Keynesian Legacy” in modern macroeconomics from a theoretical point of view. We present the outline of a variant of the “high-dimensional dynamic Keynesian model” with debt accumulation without committing to the mathematical details, and investigate the economic implications of the model. Our model is designed to shed some light on the understanding of the Japanese economy in the 1990s and 2000s, and to provide some hints on the policy prescriptions.

Key words. Keynesian legacy, High-dimensional dynamic Keynesian model, Debt accumulation, Expectation formation, Stabilization policy

The principal view, which drives me most, is that the working of the real world capitalist economy can better be elucidated along the lines suggested by Keynes (1936) in his criticism of the Classical school rather than along the lines currently pursued by the rational expectationists and new Classicals as descendants of the old Classicals. It is especially his grand vision that the economy is unstable and can be ill-behaved that seems most important. (Hukukane Nikaido)1

1 Alternative Legacies of Modern Macroeconomics

It is well known that there are two alternative legacies in modern macroeconomics. One of them is the legacy of the “classical school” in a broad sense, which is now considered to be mainstream in the so-called standard textbooks of “advanced” macroeconomics.2 The other is the “Keynesian” legacy in a broad sense, which is often considered to be out of fashion.

In the classical legacy, it is taken for granted that the full employment equilibrium will automatically be attained if wages and prices are flexible. According to this view, the cause of persistent unemployment must be wage or price rigidities, and a sufficient reduction of nominal wages and prices will contribute to increasing employment (cf. Asada 2004). In the classical tradition of thinking, investment is automatically adjusted to full employment saving, so that “effective demand” does not become a constraint on production and employment as long as wages and prices are flexible, which is nothing but a version of “Say’s law.” Needless to say, this vision of the working of the market economy is shared by the old classical economists such as M. Friedman, the new classical economists such as Lucas, Sargent, and Barro, and the real business cycle theorists such as Kydland and Prescott.3 Ironically enough, however, the same vision is shared by the textbook version of the Keynesian IS–LM model and the new Keynesian economists such as Mankiw and D. Romer, who concentrate on the microeconomic interpretation of wage and price rigidities.4

On the other hand, there is the less fashionable Keynesian legacy in modern economic thinking. Without doubt, the origin of this tradition is Keynes (1936). Some of the important works in this tradition are Harrod (1948), Steindl (1952), Robinson (1956), Kaldor (1960), Kalecki (1971), Tobin (1980), Minsky (1986), Okishio (1993), and Nikaido (1996).5 According to this tradition of thinking, the modern capitalist economy is characterized as follows.

1. Production and employment are determined by the level of “effective demand,” and the most important determinant of effective demand is the investment expenditure of the firms.
2. Firms’ investment activities are independent of households’ saving activities.

Faculty of Economics, Chuo University, 742-1 Higashinakano, Hachioji, Tokyo 192-0393, Japan
1 Nikaido (1996), Introduction, p. xiv.
2 See Romer (1996) as a typical example of such textbooks.
3 See Blanchard and Fischer (1989) and Romer (1996) for a detailed exposition of such theories.
4 For the new Keynesian theories, see Mankiw and Romer (eds) (1991), Blanchard and Fischer (1989), and Romer (1996).
5 Here we included the important contributions to Keynesian macrodynamics by two distinguished Japanese mathematical economists, Nobuo Okishio and Hukukane Nikaido. Okishio, a pioneering mathematical Marxian economist, reformulated the Harrodian instability principle rigorously in the 1960s (for the collection of his papers, see Okishio 1993). On the other hand, after publishing pioneering mathematical works on Walrasian general equilibrium theory in the 1950s, Nikaido shifted his attention to the general equilibrium models of imperfect competition and the Keynesian, Harrodian, and Marxian disequilibrium macrodynamics in the 1970s and the 1980s. His original contributions in this area of research are collected in Nikaido (1996).


3. Fluctuations of investment cause fluctuations of effective demand, which feed back to the investment fluctuations through the investment function and the money market. These interactions between investment and effective demand cause the endogenous business cycle.
4. At almost every phase of the business cycle, labor is under-employed (there exists involuntary unemployment).
5. Flexibility of wages and prices may destabilize rather than stabilize the economy, so that wage and price flexibilities are not sufficient conditions for the attainment of full employment equilibrium.6
6. Monetary factors such as money supply, finance, and debt are very important aspects which affect the characteristics of economic performance in the long run as well as in the short run. This means that money is not neutral even in the long run.
7. Subjective factors such as the expectation formation by the public and the credibility of monetary/fiscal policies are often crucial for the effectiveness of economic policy.

We assert that this Keynesian tradition of thinking is important even if “it is not the fashion of the day” (Flaschel et al. 2003), because it still provides us with some important insights for realistic modeling of the working of the modern capitalist economy, such as the Japanese economy in the 1990s and 2000s, and it also provides us with some important hints on policy prescriptions.

In this chapter, we present the outline of a theoretical model in this Keynesian legacy, and consider the significance of such an approach to macroeconomic theorizing. The model presented here is based on the model in Asada (2006a), which is a variant of the “high-dimensional dynamic Keynesian model” developed by Chiarella and Flaschel (2000), Chiarella et al. (2001), Asada et al. (2003), Asada (2006b), etc. This approach is based on mathematical modeling by means of a system of nonlinear differential equations with many variables, in order to capture the several important destabilizing positive feedback mechanisms as well as the stabilizing negative feedback mechanisms which are embedded in the economy.7 In particular, our formulation stresses the influence of financial factors such as debt accumulation and expectation formation on the performance of the macroeconomic system and on the effectiveness of the monetary/fiscal stabilization policies of the government and the central bank. We hope that this model can shed some light on the understanding of the recent Japanese and other economies.

6 Keynes (1936) wrote as follows. “The depressing influence on entrepreneurs of their greater burden of debt may partly offset any cheerful reactions from the reduction of wages. Indeed if the fall of wages and prices goes far, the embarrassment of those entrepreneurs who are heavily indebted may soon reach the point of insolvency,—with severely adverse effect on investment.” (Keynes 1936, p. 264) “It would be much better that wages should be rigidly fixed and deemed incapable of material changes, than that depressions should be accompanied by a gradual downward tendency of money wages.” (Keynes 1936, p. 265) “There is, therefore, no ground for the belief that a flexible wage policy is capable of maintaining a state of continuous full employment; . . . The economic system cannot be made self-adjusting along these lines.” (Keynes 1936, p. 267)

2 A High-Dimensional Dynamic Keynesian Model with Debt Accumulation

The model formulated by Asada (2006a) is given by the following system of equations, which consists of seven subsystems.8

1. Subsystem of capital and debt-accumulation dynamics

$$\dot d = \phi(g) - s_f(r - id) - (g + \pi)d \tag{1}$$

$$i = \rho + \xi(d); \quad \xi \ge 0, \quad \xi'(d) > 0 \text{ for } d > 0, \quad \xi'(d) < 0 \text{ for } d < 0 \tag{2}$$

$$g = g(r, \rho - \pi^e, d); \quad g_r = \partial g/\partial r > 0, \quad g_{\rho-\pi} = \partial g/\partial(\rho - \pi^e) < 0, \quad g_d = \partial g/\partial d < 0 \tag{3}$$

2. Subsystem of effective demand and goods-market dynamics

$$\dot y = \alpha(c + \phi(g) + v - y); \quad \alpha > 0; \quad c = C/K = (C_w + C_r)/K, \quad \phi(g) = I/K, \quad v = G/K \tag{4}$$

$$C_w = W - T_w = Y - \Pi - T_w \tag{5}$$

$$C_r = (1 - s_r)\{(1 - s_f)\Pi + \rho(B/p) + i(D/p) - T_r\}; \quad 0 < s_r \le 1, \quad 0 < s_f \le 1 \tag{6}$$

3. Subsystem of labor market dynamics

$$\dot e/e = \dot y/y + g - (n_1 + n_2) = \dot y/y + g - n \tag{7}$$

$$\dot w/w = \varepsilon(e - \bar e) + n_2 + \pi^e; \quad \varepsilon > 0, \quad 0 < \bar e < 1 \tag{8}$$

7 The importance of the destabilizing positive feedback mechanism is well reported in recent literature which treats complex economic dynamics. See, for example, Agliardi (1998).
8 This model is called the “Keynes–Goodwin model” because it can be considered to be an extension of the growth cycle model of Goodwin (1967), which introduces Keynesian elements such as money and finance. See also Franke and Asada (1994), Flaschel et al. (2003), and Asada et al. (2003).


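4. Subsystem of pricing and income distribution

$$p = z(wN/Y) = zw/a; \quad z > 1 \tag{9}$$

$$r = \Pi/K = \beta Y/K = \beta y; \quad 0 < \beta = 1 - (1/z) < 1 \tag{10}$$

5. Subsystem of monetary sector dynamics

$$\rho = \rho(y, m) = \begin{cases} \rho_0 + (h_1 y - m)/h_2 & \text{if } h_1 y - m \ge 0 \\ \rho_0 & \text{if } h_1 y - m < 0 \end{cases} \tag{11}$$

$$\dot m/m = \mu - \pi - g \tag{12}$$

6. Subsystem of monetary and fiscal policies

$$b = B/pK = \text{constant} \tag{13}$$

$$t_r = T_r/K = \text{constant} \tag{14}$$

$$v = t_w + t_r + \mu m + (g + \pi - \rho)b \tag{15}$$

$$\mu = \mu_0 + \delta(\mu_0 - n - \pi); \quad \mu_0 > 0, \quad \delta \ge 0 \tag{16}$$

7. Subsystem of price expectation formation dynamics

$$\dot\pi^e = \gamma\{\theta(\mu_0 - n - \pi^e) + (1 - \theta)(\pi - \pi^e)\}; \quad \gamma > 0, \quad 0 \le \theta \le 1 \tag{17}$$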

The meanings of the symbols are as follows. $Y$ = real output (real national income). $K$ = real private capital stock. $y = Y/K$ = output–capital ratio, which is supposed to be proportional to the “rate of capacity utilization.” $D$ = nominal stock of firms’ private debt. $B$ = nominal stock of public debt (public bond). $p$ = price level. $w$ = nominal wage rate. $\pi = \dot p/p$ = rate of price inflation. $\pi^e$ = expected rate of price inflation. $d = D/pK$ = private debt–capital ratio. $b = B/pK$ = public debt–capital ratio. $M$ = nominal money supply. $\mu = \dot M/M$ = growth rate of nominal money supply. $m = M/pK$ = money–capital ratio. $g = \dot K/K$ = rate of capital accumulation. $I = \phi(g)K$ = real investment expenditure, including adjustment cost. $\phi(g)$ = adjustment cost function of investment, introduced by Uzawa (1969), with the properties $\phi'(g) \ge 1$, $\phi''(g) \ge 0$. $G$ = real government expenditure. $C_w$ = workers’ real consumption expenditure. $C_r$ = capitalists’ real consumption expenditure. $\rho$ = nominal rate of interest of public bond. $i$ = nominal rate of interest that is applied to firms’ private debt. $W$ = pretax real-wage income. $\Pi$ = pretax real profit. $r = \Pi/K$ = pretax rate of profit. $T_w$ = real income tax on workers. $T_r$ = real income tax on capitalists (we neglect corporate tax for simplicity). $t_w = T_w/K$. $t_r = T_r/K$. $N$ = labor employment. $N^s$ = labor supply. $n_1$ = growth rate of labor supply = constant > 0. $a$ = average labor productivity. $n_2 = \dot a/a$ = growth rate of average labor productivity (rate of technical progress) = constant > 0.9 $n = n_1 + n_2$ = natural rate of growth or “potential” rate of growth, which is assumed to be a positive constant.10

9 This means that we are assuming “Harrodian neutral” exogenous technical progress.

Next, we shall briefly explain how the above equations are derived. We assume that there are no issues of new shares, and we neglect the repayment of the principal of the private debt for simplicity. In this case, the budget constraint of the private firms becomes

$$\dot D = \phi(g)pK - s_f(rpK - iD) \tag{18}$$

where $s_f \in (0,1]$ is the rate of internal retention of firms, which is assumed to be a constant parameter for simplicity. Next, differentiating the definitional equation $d = D/pK$ with respect to time, we have

$$\dot d/d = \dot D/D - \dot p/p - \dot K/K = \dot D/D - \pi - g \tag{19}$$

Substituting Eq. 18 into Eq. 19, we obtain Eq. 1. Equation 2 captures the fact that private debt and public debt are imperfect substitutes, and the interest rate differential between these assets reflects the difference in their degrees of risk. Equation 3 is the investment function with a “Fisher debt effect.” We can derive this type of investment function from the optimizing behavior of firms by assuming both Uzawa’s (1969) hypothesis of increasing cost (the Penrose effect) and Kalecki’s (1937) hypothesis of increasing risk of investment (cf. Asada 2001).

Equation 4 is the Keynesian/Kaldorian quantity adjustment process of the goods market disequilibrium. In this formulation, we neglect international trade for simplicity. Equations 5 and 6 are Kaleckian consumption functions of workers and capitalists, respectively (cf. Kalecki 1971). In these formulations, it is assumed that workers spend all their income, and capitalists save part of their income, where $s_r \in (0,1]$ is the capitalists’ average propensity to save, which is assumed to be a constant. Capitalists buy public and private bonds out of their savings. By definition, we have

$$e = N/N^s = yK/aN^s \tag{20}$$

10 It is likely that the natural rate of growth is an increasing function of the rate of employment, as the experience of the Japanese and other economies suggests. In this case, we have a Keynesian endogenous growth model with variable rate of employment, in contrast to the neoclassical full employment endogenous growth models represented by Barro and Sala-i-Martin (1995). We can easily see, however, that the qualitative conclusion is not much affected as long as the sensitivity of the natural rate of growth with respect to the changes of the rate of employment is not extremely large, so that we do not introduce this kind of complication in this chapter. For another formulation of the Keynesian endogenous growth theory, see Pally (1996).


Differentiating this equation with respect to time, we obtain Eq. 7. Equation 8 is a standard expectation-augmented wage Phillips curve. Equation 9 is the Kaleckian mark-up pricing rule for an imperfectly competitive economy, where $z$ is the mark-up that reflects the “degree of monopoly” of the economy (cf. Kalecki 1971). In this case, we can derive Eq. 10, which implies that the rate of profit is proportional to the rate of capacity utilization. Furthermore, we can derive the following expectation-augmented price Phillips curve from Eqs. 8 and 9.

$$\pi = \varepsilon(e - \bar e) + \pi^e \tag{21}$$
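The step from Eqs. 8 and 9 to Eq. 21 can be written out as follows. This is a worked sketch that uses only the definitions given in the text and assumes, as the mark-up rule implies, that $z$ is constant over time.

```latex
% Logarithmic differentiation of the mark-up rule p = zw/a (Eq. 9), z constant:
\pi = \frac{\dot p}{p} = \frac{\dot w}{w} - \frac{\dot a}{a} = \frac{\dot w}{w} - n_2
% Substituting the wage Phillips curve (Eq. 8):
\pi = \varepsilon(e - \bar e) + n_2 + \pi^e - n_2 = \varepsilon(e - \bar e) + \pi^e
```

The productivity growth term $n_2$ cancels, so price inflation deviates from expected inflation only when the rate of employment deviates from $\bar e$.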

Equation 11 is the standard “LM equation” describing the equilibrium condition for the money market, which can be derived as follows. Following Asada et al. (2003), we specify the equilibrium condition for the money market as

$$M = h_1 pY + (\rho_0 - \rho)h_2 pK; \quad h_1 > 0, \quad h_2 > 0, \quad \rho_0 \ge 0 \tag{22}$$

The left-hand side of this equation is the nominal money supply, and the right-hand side is a type of Keynesian nominal money demand function, where $\rho_0$ is the lower bound of the nominal interest rate of the government bond. Dividing this equation by $pK$ and solving it with respect to $\rho$, we obtain Eq. 11. Formally, the case of $\rho = \rho_0$ corresponds to the case of $h_2 \to +\infty$, which is nothing but the case of the so-called “liquidity trap.” By differentiating the definitional equation $m = M/pK$ with respect to time, we obtain Eq. 12.

Equations 13 and 14 are simplifying assumptions about the government’s attitude on fiscal policy.11 We can derive Eq. 15 as follows. The budget constraint of the “consolidated government,” including the central bank, can be written as

$$\dot M + \dot B = pG + \rho B - pT = pG + \rho B - p(T_w + T_r) \tag{23}$$

which means that the government deficit must be financed through the issue of new money or new bonds. From this equation, we have

$$\mu m + \mu_B b = v + \rho b - (t_w + t_r) \tag{24}$$

where $\mu_B = \dot B/B$. From Eq. 13, we have

$$\dot b/b = \dot B/B - \dot p/p - \dot K/K = \mu_B - \pi - g = 0 \tag{25}$$

Substituting Eq. 25 into Eq. 24, we obtain Eq. 15. Equation 16 formalizes the monetary policy rule of the central bank. If $\delta > 0$, we can consider this monetary policy rule as a type of “inflation targeting rule” (cf. Krugman 1998; Asada 2006b). The parameter $\mu_0$ is the long-run target growth rate of nominal money supply, and $\mu_0 - n$ is the long-run target rate of price inflation. It is supposed that the central bank announces this target rate to the public, and adjusts the growth rate of nominal money supply towards the realization of this target. It is worth noting that in our formulation, the monetary policy of the central bank and the fiscal policy of the government are closely related to each other through Eq. 15, which is derived from the budget constraint of the “consolidated government,” so that it is impossible to separate monetary and fiscal policies clearly. In this case, it is meaningless to argue which policy is more effective in stabilizing the economy.

Equation 17 formalizes the price expectation formation hypothesis, which is a mixture of a sort of “forward-looking” and “backward-looking” (adaptive) expectation formation (cf. Asada et al. 2003; Asada 2006b). The case where $\theta = 0$ corresponds to the case of purely adaptive or backward-looking expectation. In the case where $\theta = 1$, the expected rate of inflation is adjusted toward the target rate of inflation that is announced by the central bank. This means that the public believes that the monetary policy commitment by the central bank is highly credible. It may be said that in this case the price expectation formation by the public is forward-looking in a sense.12 The specification of price expectation formation closes our model.

We can reduce the system of Eqs. 1–17 to the following five-dimensional system of nonlinear differential equations, which we call the “fundamental dynamical equations” of our system.

$$\text{(i)} \quad \dot d = \phi(g(\beta y, \rho(y,m) - \pi^e, d)) - s_f\{\beta y - i(\rho(y,m), d)d\} - \{g(\beta y, \rho(y,m) - \pi^e, d) + \varepsilon(e - \bar e) + \pi^e\}d = F_1(d, y, e, \pi^e, m)$$

$$\text{(ii)} \quad \dot y = \alpha[\phi(g(\beta y, \rho(y,m) - \pi^e, d)) + \{\mu_0 + \delta(\mu_0 - n - \varepsilon(e - \bar e) - \pi^e)\}m + \{g(\beta y, \rho(y,m) - \pi^e, d) + \varepsilon(e - \bar e) + \pi^e - \rho(y,m)\}b + (1 - s_r)\{\rho(y,m)b + i(\rho(y,m), d)d\} - \{s_f + (1 - s_f)s_r\}\beta y + s_r t_r] = F_2(d, y, e, \pi^e, m; \mu_0, \delta)$$

$$\text{(iii)} \quad \dot e = e[F_2(d, y, e, \pi^e, m; \mu_0, \delta)/y + g(\beta y, \rho(y,m) - \pi^e, d) - n] = F_3(d, y, e, \pi^e, m; \mu_0, \delta)$$

$$\text{(iv)} \quad \dot\pi^e = \gamma\{\theta(\mu_0 - n - \pi^e) + (1 - \theta)\varepsilon(e - \bar e)\} = F_4(e, \pi^e; \mu_0, \theta, \gamma)$$

$$\text{(v)} \quad \dot m = m[\mu_0 + \delta\{\mu_0 - n - \varepsilon(e - \bar e) - \pi^e\} - \varepsilon(e - \bar e) - \pi^e - g(\beta y, \rho(y,m) - \pi^e, d)] = F_5(d, y, e, \pi^e, m; \mu_0, \delta) \tag{26}$$

11 A more complicated, but realistic, assumption on the taxation will be $t_r = \tau_r\{(1 - s_r)\beta y + \rho b + id\}$, where $0 < \tau_r < 1$, which implies that $t_r$ becomes an increasing function of the capitalists’ income per capital stock. We can show that this reformulation has a destabilizing rather than a stabilizing effect on the system.
12 However, the use of the term “forward-looking” in this context may be somewhat misleading, because in our model perfect foresight is not necessarily postulated even if $\theta = 1$, while the term forward-looking is usually used in the context of perfect foresight or rational expectation models. Reiner Franke pointed out this fact to the author.
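To see how the credibility parameter in Eq. 26 (iv) works, the expectation equation can be integrated on its own while the employment gap is held fixed. The sketch below is a partial illustration only, not a simulation of the full five-dimensional system: the function name, the Euler step size, and all parameter values are hypothetical, and holding the employment gap constant is an assumption made purely for exposition.

```python
def simulate_expected_inflation(pi_e0, theta, gamma, mu0, n, eps, e_gap,
                                dt=0.01, steps=5000):
    """Euler integration of Eq. 26 (iv) with a fixed employment gap e - e_bar."""
    pi_e = pi_e0
    for _ in range(steps):
        d_pi_e = gamma * (theta * (mu0 - n - pi_e) + (1.0 - theta) * eps * e_gap)
        pi_e += dt * d_pi_e
    return pi_e


# Hypothetical values: target money growth 5%, natural growth 3%,
# so the announced inflation target mu0 - n is 2%.
common = dict(pi_e0=-0.01, gamma=1.0, mu0=0.05, n=0.03, eps=0.5, e_gap=0.0)

credible = simulate_expected_inflation(theta=1.0, **common)   # fully credible target
adaptive = simulate_expected_inflation(theta=0.0, **common)   # purely adaptive

print(round(credible, 4), round(adaptive, 4))
```

With a fully credible target the expected rate of inflation is pulled to the announced target regardless of its starting value, whereas under purely adaptive expectations and a zero employment gap it stays where it started, because nothing in actual inflation moves it.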


3 Summary of the Analysis

In this section, we summarize the results of the mathematical analysis by Asada (2006a) without committing to the mathematical details. The “long-run equilibrium” solution $(d^*, y^*, e^*, \pi^{e*}, m^*)$ of Eq. 26 that satisfies the relationship

$$\dot d = \dot y = \dot e = \dot\pi^e = \dot m = 0 \tag{27}$$

is determined by the following system of equations corresponding to the given parameter value of monetary policy $\mu_0$:

$$\text{(i)} \quad \phi(n) - s_f[\beta y - \{\rho(y,m) + \xi(d)\}d] - \mu_0 d = 0$$

$$\text{(ii)} \quad \phi(n) + \mu_0 m + \{\mu_0 - \rho(y,m)\}b + (1 - s_r)[\rho(y,m)b + \{\rho(y,m) + \xi(d)\}d] - \{s_f + (1 - s_f)s_r\}\beta y + s_r t_r = 0$$

$$\text{(iii)} \quad g(\beta y, \rho(y,m) - \mu_0 + n, d) = n$$

$$\text{(iv)} \quad e = \bar e$$

$$\text{(v)} \quad \pi = \pi^e = \mu_0 - n \tag{28}$$

It follows from Eq. 28 iii and iv that at the long-run equilibrium point, the rate of capital accumulation is equal to the “natural rate of growth” (n), and the rate of employment is equal to the “natural rate of employment” (ē), both of which are assumed to be exogenously given. Equation 28 v means that the equilibrium rate of price inflation is equal to the difference between the target growth rate of the nominal money supply and the natural rate of growth. Therefore, it seems at first glance that the classical postulate of the “neutrality of money” applies to the long-run equilibrium of our model. However, we can argue that this first impression is wrong for the following reasons. First, the equilibrium values (d*, y*, m*) depend on the value of the monetary policy parameter μ0. Second, the choice of μ0 is not irrelevant to the existence of the long-run equilibrium solution. The equilibrium real rate of interest (ρ − π^e)* must satisfy the following inequality, since ρ cannot be lower than its lower bound ρ0.

\[
(\rho - \pi^e)^* = \rho(y^*, m^*) + n - \mu_0 \geq \rho_0 + n - \mu_0 \tag{29}
\]

It is quite likely that in the long-run equilibrium a relatively low real rate of interest is required to support the realization of the natural rate of growth, because the economically feasible range of the variables y and d is restricted. This means that the long-run equilibrium will not exist if the target growth rate of the nominal money supply μ0 is too small to support the natural rate of growth. Therefore, money is not neutral even in the long run in our model.

In our model, the parameter values δ, θ, γ, α, and ε do not affect the long-run equilibrium values of the main endogenous variables. However, these parameter values do affect the dynamic stability or instability of the long-run equilibrium point. In fact, Asada (2006a) proved the following proposition mathematically under some reasonable assumptions in the case of the “liquidity trap” ρ(y, m) = ρ0 (h2 → +∞).13

Proposition (i): Suppose that the parameter value δ is relatively small, which means that the monetary/fiscal policy is relatively inactive. Then the high speed of the quantity adjustment in the goods market (quantity flexibility reflected by a large value of α) and the high speed of the wage adjustment in the labor market (wage/price flexibility reflected by a large value of ε) tend to destabilize the equilibrium point of Eq. 26.

Proposition (ii): Suppose that the monetary/fiscal policy is relatively inactive (the parameter value δ is relatively small) and the price expectation formation by the public is strongly backward-looking (the parameter value θ is sufficiently close to zero). Then, the high speed of expectation adjustment (expectation adjustment flexibility reflected by a large value of γ) tends to destabilize the equilibrium point of Eq. 26.

Proposition (iii): Suppose that the monetary policy commitment by the central bank is very credible (the parameter value θ is sufficiently close to 1). Then, the equilibrium point of Eq. 26 is unstable for all sufficiently small values of the monetary policy parameter δ ≥ 0, and it is locally stable for all sufficiently large values of δ > 0.

Proposition (iv): Suppose that the parameter value θ is sufficiently close to 1. Then at the intermediate range of the monetary policy parameter δ > 0, endogenous cyclical fluctuations occur.

Proposition (iii) means that the central bank can stabilize the intrinsically unstable economy by adopting a sufficiently active monetary policy rule if the inflation targeting by the central bank is highly “credible.” Proposition (iv) means that at some range of the parameter values, an endogenous business cycle occurs.

13 We shall try to present the economic interpretation of this proposition in Sect. 5.


4 A Numerical Example

Next, we shall present a numerical example to illustrate the analytical results of the previous section. We assume the following parameter values and functional forms.14

\[
\begin{gathered}
s_f = s_r = 1,\quad \alpha = \beta = \gamma = b = t_r = 0.2,\quad \varepsilon = 0.1,\quad i = \rho + d^2,\quad \rho = \rho_0 = 0.01,\quad \varphi(g) = g, \\
\bar e = 0.97,\quad n_1 = 0.03,\quad n_2 = 0.02,\quad n = n_1 + n_2 = 0.05,\quad \mu_0 = 0.09,\quad \mu_0 - n = 0.04
\end{gathered}
\tag{30}
\]

\[
g = 0.1\{1.8y^5 - (\rho - \pi^e) - 0.9d - 0.19\} + n = 0.18y^5 + 0.1\pi^e - 0.09d + 0.03 \tag{31}
\]
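The two forms of the investment function in Eq. 31 can be checked against each other numerically. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, the power of y is taken at face value from the printed equation, and the sample points are arbitrary. The check only confirms that substituting ρ = 0.01 and n = 0.05 into the bracketed form yields the constant 0.03.

```python
import math

RHO0 = 0.01  # nominal interest rate at its lower bound (Eq. 30)
N = 0.05     # natural rate of growth n = n1 + n2 (Eq. 30)


def g_bracketed(y, pi_e, d, rho=RHO0, n=N):
    """Investment function in the bracketed form of Eq. 31."""
    return 0.1 * (1.8 * y**5 - (rho - pi_e) - 0.9 * d - 0.19) + n


def g_simplified(y, pi_e, d):
    """Right-hand form of Eq. 31 after substituting rho = 0.01 and n = 0.05."""
    return 0.18 * y**5 + 0.1 * pi_e - 0.09 * d + 0.03


# Arbitrary sample points (illustrative only, not taken from the chapter).
SAMPLES = [(0.10, 0.00, 0.10), (0.20, 0.02, 0.25), (0.32, -0.05, 0.05)]

for y, pi_e, d in SAMPLES:
    assert math.isclose(g_bracketed(y, pi_e, d), g_simplified(y, pi_e, d), abs_tol=1e-12)
```

The constant term follows from 0.1 × (−0.01) + 0.1 × (−0.19) + 0.05 = 0.03, which holds regardless of the exponent on y.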

We interpret 100ρ, 100π^e, 100n, 100μ0, and 100(μ0 − n) as the annual percentages of the nominal interest rate on government bonds, the expected rate of price inflation, the natural rate of growth, the target growth rate of the nominal money supply, and the target rate (equilibrium rate) of price inflation, respectively. We also assume the following initial conditions of the main variables:

\[
d(0) = 0.21,\quad y(0) = 0.18,\quad e(0) = 0.92,\quad \pi^e(0) = 0.01,\quad m(0) = 0.23 \tag{32}
\]

In addition, we modify the model by introducing the following natural nonlinearities, which ensure that the capacity utilization and the rate of employment cannot exceed their exogenously given upper bounds y_max = 0.32 and e_max = 1.

\[
\dot{y} =
\begin{cases}
F_2(d, y, e, \pi^e, m; \mu_0, \delta) & \text{if } 0 < y < y_{\max} = 0.32 \\
\min[F_2(d, y, e, \pi^e, m; \mu_0, \delta),\ 0] & \text{if } y = y_{\max} = 0.32
\end{cases}
\tag{33}
\]

\[
\dot{e} =
\begin{cases}
F_3(d, y, e, \pi^e, m; \mu_0, \delta) & \text{if } 0 < e < e_{\max} = 1 \\
\min[F_3(d, y, e, \pi^e, m; \mu_0, \delta),\ 0] & \text{if } e = e_{\max} = 1
\end{cases}
\tag{34}
\]
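The ceiling rule of Eqs. 33 and 34 can be written as a small helper. This is a sketch under stated assumptions: the function name `bounded_rate` is hypothetical, and the test `level >= upper` (rather than strict equality) is an assumption made for numerical work, where a discrete step may land marginally above the bound.

```python
Y_MAX = 0.32  # upper bound of capacity utilization (Eq. 33)
E_MAX = 1.0   # upper bound of the rate of employment (Eq. 34)


def bounded_rate(raw_rate, level, upper):
    """Time derivative of a variable subject to an upper bound.

    Below the bound the unconstrained rate (F2 or F3) applies; at the
    bound only non-positive rates are allowed, i.e. min[rate, 0].
    """
    if level >= upper:
        return min(raw_rate, 0.0)
    return raw_rate
```

For example, `bounded_rate(0.05, Y_MAX, Y_MAX)` returns 0.0, so capacity utilization stays at its ceiling, while a negative rate at the ceiling passes through unchanged.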

We then obtain Figs. 1 and 2, which illustrate some alternative time paths of the variables e, d, and π. In these figures, we consider the following three alternative cases, where t denotes the “time period,” and a unit of time is supposed to be a year.15

Case A: θ = δ = 0 for all t ≥ 0
Case B: θ = 0.8, δ = 0 for all t ≥ 0
Case C: θ = 0.8, δ = 0 for 0 ≤ t < 5, and θ = 0.8, δ = 0.3 for t ≥ 5

14 This numerical example is based on Asada (2006a). The purpose of our numerical simulation is not to calibrate the model, but to illustrate the qualitative conclusion which can be obtained analytically.

15 We adopted Euler’s algorithm with the time interval Δt = 0.1 (years) for computer simulations.
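The case definitions and the Euler scheme with Δt = 0.1 can be sketched in code. The example below is a partial illustration only, not a reproduction of Figs. 1 and 2: it integrates the expectation equation (Eq. 26 iv) alone and, as a loudly stated assumption, holds the employment rate fixed at ē instead of solving the full five-dimensional system. The function names are hypothetical. The parameter δ is returned by the schedule but is not used here, because it enters only the equations for y and m.

```python
GAMMA = 0.2      # speed of expectation adjustment (Eq. 30)
EPSILON = 0.1    # speed of wage/price adjustment (Eq. 30)
E_BAR = 0.97     # natural rate of employment (Eq. 30)
MU0 = 0.09       # target growth rate of nominal money supply (Eq. 30)
N = 0.05         # natural rate of growth (Eq. 30)
DT = 0.1         # Euler step in years (footnote 15)


def case_params(case, t):
    """Return (theta, delta) for Cases A, B and C at time t."""
    if case == "A":
        return 0.0, 0.0
    if case == "B":
        return 0.8, 0.0
    if case == "C":
        return 0.8, (0.3 if t >= 5 else 0.0)
    raise ValueError(f"unknown case: {case}")


def f4(e, pi_e, theta):
    """Right-hand side of Eq. 26 (iv): mixed expectation formation."""
    return GAMMA * (theta * (MU0 - N - pi_e) + (1.0 - theta) * EPSILON * (e - E_BAR))


def simulate_pi_e(case, pi_e0=0.01, e_fixed=E_BAR, years=20):
    """Euler path of pi^e with e held fixed (illustrative assumption)."""
    steps = round(years / DT)
    pi_e = pi_e0
    path = [pi_e]
    for k in range(steps):
        theta, _delta = case_params(case, k * DT)  # delta unused in Eq. 26 (iv)
        pi_e += DT * f4(e_fixed, pi_e, theta)
        path.append(pi_e)
    return path
```

Under this simplification, expectations in Case A do not move at all when e = ē, whereas in Case B they rise monotonically toward the announced target μ0 − n = 0.04. This isolates the role of θ in the “inflation targeting” channel; the paths in the figures also reflect the feedback from e, y, d, and m.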

[Fig. 1. Alternative time paths of e. Horizontal axis: time (0 to 20); vertical axis: e (0.75 to 1); curves for Cases A, B, and C.]

[Fig. 2. Alternative time paths of d and π. Horizontal axis: time (0 to 20); vertical axis: d, π (−0.15 to 0.25); curves A (d), A (π), B (d), B (π), C (d), and C (π).]

In Case A, the price expectation formation is purely backward-looking and the monetary/fiscal policy is inactive. In this case, the initial high debt–capital ratio causes serious debt deflation. Even in this case, the economy begins to recover endogenously as the real burden of debt decreases because of the sharp decline of debt-financed investment expenditure. However, the lower turning point comes too late because the serious deflation prevents the real debt burden from falling far enough to induce a sufficient increase of investment expenditure. In addition, the capacity utilization reaches its upper bound y_max = 0.32 at the time period t = 19, and the economic recovery fails at that time.

In Case B, the price expectation formation is considerably forward-looking in a sense, but the monetary/fiscal policy is inactive. Even though the monetary/fiscal stabilization policy is inactive, in this case the speed of economic recovery is much higher than that in Case A because of the movement of price inflation toward the target rate, which helps to reduce the real debt burden sufficiently to induce a sufficient increase of investment expenditure. In this case, at the time period t = 21, the capacity utilization reaches its upper bound and then an economic recession is triggered.

In Case C, the price expectation formation is the same as in Case B, and the monetary/fiscal policy becomes very active from the time period t = 5. In this case, the economic recovery is further accelerated through the active monetary/fiscal stabilization policy that is combined with the considerable credibility of inflation targeting by the central bank. At the time period t = 19, the full employment of labor is attained, and at t = 19.5 the capacity utilization reaches its upper bound. Unlike the previous two cases, the full employment of labor and excess demand in the goods market continue in this case.

In these examples, it is assumed that the nominal rate of interest is fixed at its lower bound ρ0 = 0.01 all the time. In the real economy, however, the nominal rate of interest will begin to rise at the later stage of economic recovery. This will somewhat slow down the economic recovery, but the qualitative nature of the recovery will not seriously change even if we consider the effect of the rising nominal rate of interest at the late stage of economic recovery.

It is also worth noting that in the three cases above, the structure of the economy is the same except for the parameter θ, which represents the credibility of the policy commitment of the central bank, and the monetary/fiscal policy parameter δ.

5 Economic Implications of the Analysis

Finally, we shall try to interpret the economic implications of our analysis. In the absence of an active counter-cyclical monetary/fiscal policy, the following destabilizing positive feedback mechanism, which is called the “Fisher debt effect,” may dominate the economy (cf. Fisher 1933).

\[
(e \downarrow) \Rightarrow \pi \downarrow \Rightarrow d = (D/pK) \uparrow \Rightarrow g \downarrow \Rightarrow y \downarrow \Rightarrow (e \downarrow) \tag{FDE}
\]

The increase of the adjustment speed in the goods market (quantity flexibility) and the increase of the adjustment speed in the labor market (wage/price flexibility) will reinforce this destabilizing mechanism. If the monetary/fiscal policy is inactive and the price expectation formation by the public is strongly backward-looking (adaptive), the following destabilizing positive feedback mechanism, which is called the “Mundell effect,” will also work through the effect on the expected real interest rate.

\[
(e \downarrow) \Rightarrow \pi \downarrow \Rightarrow \pi^e \downarrow \Rightarrow (\rho - \pi^e) \uparrow \Rightarrow g \downarrow \Rightarrow y \downarrow \Rightarrow (e \downarrow) \tag{ME}
\]

The increase of the speed of expectation adaptation will reinforce this destabilizing effect by strengthening the link π ↓ ⇒ π^e ↓. The increase of the quantity flexibility and the wage/price flexibility will reinforce the Mundell effect as well as the Fisher debt effect. On the other hand, the following stabilizing negative feedback mechanism, which works through changes in the nominal rate of interest and is called the “Keynes effect,” will operate if the counteracting Mundell effect is relatively weak.

\[
(e \downarrow) \Rightarrow \pi \downarrow \Rightarrow m = (M/pK) \uparrow \Rightarrow \rho \downarrow \Rightarrow (\rho - \pi^e) \downarrow \Rightarrow g \uparrow \Rightarrow y \uparrow \Rightarrow (e \uparrow) \tag{KE}
\]

However, this stabilizing Keynes effect is almost negligible in the case where the nominal rate of interest has already fallen to nearly its lower bound, as in the Japanese economy in the late 1990s and 2000s. Needless to say, this case corresponds to the case of a “liquidity trap.” Even in the case of a liquidity trap, however, a sufficiently active counter-cyclical monetary/fiscal policy, combined with sufficient credibility of the monetary policy commitment by the central bank, can stabilize the intrinsically unstable economy. A large value of the monetary/fiscal policy parameter δ has the following direct stabilizing effect, which may be called the “money-financed fiscal policy effect.”

\[
(e \downarrow) \Rightarrow \pi \downarrow \Rightarrow v \uparrow \Rightarrow y \uparrow \Rightarrow (e \uparrow) \tag{MFE}
\]

Even if monetary/fiscal policy is not strongly active, a sufficiently high credibility of the inflation targeting commitment by the central bank will have a stabilizing effect through the following mechanism, which may be called the “inflation targeting effect.”

\[
\begin{gathered}
\mu_0 - n > \pi^e \Rightarrow \pi^e \uparrow \Rightarrow (\rho - \pi^e) \downarrow \Rightarrow g \uparrow \Rightarrow y \uparrow \Rightarrow (e \uparrow) \\
\Downarrow \\
\pi \uparrow \Rightarrow d = (D/pK) \downarrow \Rightarrow (e \uparrow)
\end{gathered}
\tag{ITE}
\]

This result is consistent with the Keynesian view which stresses the importance of subtle subjective factors such as the expectation of the public and the credibility of the attitude of the policy maker.


Acknowledgments. This is a revised version of the paper which was presented at Chuo METS 05 (Chuo Meeting on Economics of Time and Space 2005, Chuo University, Tokyo, August 30, 2005) and the Sophia Symposium on Keynesian Legacy and Modern Economics (Sophia University, Tokyo, September 24, 2005). Thanks are due to the helpful comments by the participants of the conferences. Needless to say, however, only the author is responsible for possible remaining errors. This research was ﬁnancially supported by the Japan Ministry of Education, Culture, Sports, Science and Technology (Grant-in-Aid for Scientiﬁc Research (B) 15330037) and Chuo University Joint Research Grant 0382.

References

Agliardi E (1998) Positive feedback economics. Macmillan, London
Asada T (2001) Nonlinear dynamics of debt and capital: a post-Keynesian analysis. In: Aruka Y, Japan Association for Evolutionary Economics (eds) Evolutionary controversies in economics. Springer, Tokyo, p 73–87
Asada T (2004) Price flexibility and instability in a macrodynamic model with a debt effect. J Int Econ Stud 18:41–60
Asada T (2006a) Stabilization policy in a Keynes–Goodwin model with debt accumulation. Structural Change and Economic Dynamics, forthcoming
Asada T (2006b) Inflation targeting policy in a dynamic Keynesian model with debt accumulation: a Japanese perspective. In: Chiarella C, Flaschel P, Franke R, Semmler W (eds) Quantitative and empirical analysis of nonlinear dynamic macromodels. Elsevier, Amsterdam
Asada T, Chiarella C, Flaschel P, Franke R (2003) Open economy macrodynamics: an integrated disequilibrium approach. Springer, Berlin
Barro RJ, Sala-i-Martin X (1995) Economic growth. McGraw-Hill, New York
Blanchard O, Fischer S (1989) Lectures on macroeconomics. MIT Press, Cambridge
Chiarella C, Flaschel P (2000) The dynamics of Keynesian monetary growth. Cambridge University Press, Cambridge
Chiarella C, Flaschel P, Semmler W (2001) Price flexibility and debt dynamics in high order AS–AD model. Central Eur J Oper Res 9:119–145
Fisher I (1933) The debt–deflation theory of great depressions. Econometrica 1:337–357
Flaschel P, Krüger M, Asada T (2003) From growth and employment cycles to KMG model building. In: Fuhmann N, Schmoley E, Sud RSS (eds) Gegen den strich: Ökonomische theorie und politische regulierung. Reiner Hampp, Munich, p 75–100
Franke R, Asada T (1994) A Keynes–Goodwin model of the business cycle. J Econ Behav Organ 24:273–295
Goodwin R (1967) A growth cycle. In: Feinstein CH (ed) Socialism, capitalism and economic growth. Cambridge University Press, Cambridge, p 54–58
Harrod RF (1948) Towards a dynamic economics. Macmillan, London
Kaldor N (1960) Essays on economic stability and growth. Duckworth, London
Kalecki M (1937) The principle of increasing risk. Economica 4:440–447
Kalecki M (1971) Selected essays on the dynamics of the capitalist economy. Cambridge University Press, Cambridge
Keynes JM (1936) The general theory of employment, interest and money. Macmillan, London
Krugman P (1998) It’s baaack: Japan’s slump and the return of the liquidity trap. Brookings Pap Econ Act 2:137–205
Mankiw G, Romer D (eds) (1991) New Keynesian economics. 2 vols. MIT Press, Cambridge
Minsky H (1986) Stabilizing an unstable economy. Yale University Press, New Haven
Nikaido H (1996) Prices, cycles, and growth. MIT Press, Cambridge
Okishio N (1993) Essays on political economy. Flaschel P, Krüger M (eds). Peter Lang, Frankfurt am Main
Palley TI (1996) Growth theory in a Keynesian mode: some Keynesian foundations for new endogenous growth theory. J Post-Keynesian Econ 19:113–135
Robinson J (1956) The accumulation of capital. Macmillan, London
Romer D (1996) Advanced macroeconomics. McGraw-Hill, New York
Steindl J (1952) Maturity and stagnation in American capitalism. Oxford University Press, Oxford
Tobin J (1980) Asset accumulation and economic activity. Basil Blackwell, Oxford
Uzawa H (1969) Time preference and the Penrose effect in a two-class model of economic growth. J Polit Econ 77:628–652

Part II Economics of Time: Nonlinear Dynamics

6. Instability Problems and Policy Issues in Perfectly Open Economies Peter Flaschel

Summary. In this chapter, we consider an advanced classical model of a small, perfectly open economy with, in particular, perfectly ﬂexible wages and prices, and the purchasing power parity (PPP) and uncovered interest parity (UIP) conditions holding at each moment in time. From the budget equations of the model we derive the law of motion for the real value of foreign bonds held domestically, and for the evolution of aggregate real government debt. Moreover, the PPP and the UIP conditions taken together imply a law of motion for the price level in such an economy. Taking these three laws of motion together, instability of the steady state of the dynamics is a likely result if all policy parameters are kept ﬁxed. Speciﬁc policy rules with speciﬁc anchors for the private part of the economy are therefore proposed and used to make the overall dynamics stable, at least from the global point of view. Key words. Perfectly open economies, Government debt, Foreign debt, Instability, Monetary and ﬁscal policy

Department of Economics and Business Administration, Bielefeld University, PO Box 10 01 31, 33501 Bielefeld, Germany

1 Introduction

In this chapter we extend the classical analysis of the specie-flow mechanism in the one-good world, considered in Chap. 2 of Asada et al. (2003), toward a modern representation of such a classical approach, now also including international trade in bonds alongside commodity trade, and the uncovered interest parity (UIP) condition as a representation of perfect international capital mobility. The small open economy1 considered is also a perfect one in all other respects (from a neoclassical perspective), since the law of one price or the absolute form of the purchasing power parity condition (PPP) is assumed to hold, since flexible money wages are assumed to clear the labor market, since domestic prices are also perfectly flexible, and since all excess supply or demand in the domestic economy is always absorbed by, or served by, the world goods market. We are thus considering an ideal small open economy with no demand constraints, perfectly flexible wages and prices, full employment output, and moreover with the PPP as well as the UIP conditions holding on the goods and the financial markets at each moment in time.

1 The two-country extension of the model considered is briefly described in Appendix A.

We provide budget equations for the three sectors of the model, and on this basis flow consistency conditions that imply by themselves that the balance of payments is always balanced without any need for central bank intervention, and in fact also independently of the monetary regimes that are assumed to hold on the domestic money market and the market for foreign exchange.

Depending on the fiscal policy regime that characterizes such an economy, we have, however, two stock-flow relationships in this economy which can create stability problems even for such a perfectly open economy. On the one hand, there is the evolution of the capital account that is driven by the surpluses or deficits in the current account. The resulting dynamic of the stock of foreign bonds (in real terms) held by domestic residents (if positive) can be isolated from the rest of the dynamics under certain simplifying conditions, and can be shown to be an unstable one if wealth effects in the assumed consumption behavior of the households of the economy are sufficiently weak. On the other hand, there is the dynamic of the government budget restriction (GBR), which is necessarily explosive or subject to centrifugal forces if all fiscal policy instruments of the government are given magnitudes.
The twin deficits (or surpluses) of such an economy, the government deficit as well as the current account deficit, may thus both be subject to cumulative forces under certain assumptions, and may thus represent a problem for such an economy in the long run. The primary objective of this chapter is to consider the evolution of these deficits and the (in-)stability conditions that characterize this evolution.

Moreover, the perfectly open economy considered in this chapter is also subject to explosive price level adjustments, caused by the joint working of the UIP and the PPP conditions. The private sector of this economy is therefore plagued by various types of instability, which can be overcome only in a very limited way by the jump-variable technique (JVT) of the rational expectations school, here applicable solely to the price-level dynamics, and indeed of the type discussed in the original proposal for such a technique in Sargent and Wallace (1973). We do not follow the JVT here, however, but design appropriate, not always conventionally plausible, policy rules which may tame the explosive nature of the dynamics of the private sector under certain conditions. These policies concern an anti-monetarist type of money supply adjustment rule, a tax policy that adjusts debt to a certain debt-to-GDP ratio, and a government expenditure policy that attempts to counteract the cumulative forces in the dynamic of the current account or its mirror image, the capital account. In this way the perfectly open economy may be made a stable one, though the dynamics that result from all these additions are difficult to analyze from an integrated point of view in place of the partial ones adopted in the final section of this chapter.

Section 2 presents our modeling of the perfectly open economy, including the law of motion for the external debt or creditor position of this economy. Section 3 derives and investigates the dynamics of the GBR in isolation from the rest of the dynamics. In Section 4, we consider the joint evolution of internal and external debt (or surpluses) and also the unstable dynamic of the price level as it is implied by the UIP and the PPP conditions. Section 5 finally considers monetary and fiscal policy rules that may overcome the various cumulative forces that the perfectly open economy considered is subject to. The appendices provide a two-country extension of the model of this chapter and the notation employed.

2 Perfectly Open Economies: An Advanced Formulation

In this section, we provide an advanced formulation of the classical model type envisaged, which integrates all perfectness assumptions that can exist in such a framework from a modern perspective, and which therefore may be considered the modern counterpart of the classical approach to the specie-flow mechanism. Here we follow Rødseth (2000, ch. 5) and present a brief summary of this ideal model type for a small open economy. The model therefore collects in an ideal fashion the assumptions of a perfect working of a small open economy in a classical environment, including internationally traded interest-bearing financial assets.

This model type, of an extremely or perfectly open classical economy, will be reviewed only briefly here with respect to its structural equations and their fundamental implications. In particular, we will find that the introduction of interest-bearing assets and their international trading gives rise to the possibility that the specie-flow mechanism no longer works in the stable fashion of a classical economy without international trade in bonds. In addition, such destabilizing forces not only work through cumulative effects or accelerating capital account adjustments, but are also present in the evolution of government debt and driven by government budget constraints. The general model to be considered below thus allows for the possibility that twin deficits (or surpluses or mixed combinations) can show explosive or implosive tendencies to be counteracted by active policy intervention.

In the following, we consider a one-commodity world augmented by the trade in domestic as well as foreign bonds, B, F. We furthermore assume that domestic bonds are perfect substitutes for foreign bonds, but do not consider the inflow and outflow of bonds explicitly, since the model of this chapter does not consider stock constraints for asset demand functions explicitly. Instead the (balanced) exchange of domestic against foreign bonds is hidden in our presentation of the balance of payments (operating in the background of it), and thus only explicitly present in the uncovered interest rate parity condition to be considered below.

Let us first consider the flow budget equations of the three sectors of Rødseth’s (2000, ch. 5) extremely open economy: households, the government, and the central bank. (Firms only produce output Y by means of labor N; they do not invest, and they transfer all profit income to the household sector. They need not be considered from the perspective of budget equations.) These equations, read in the order of the named sectors, are

\[
p(Y - T) + rB_p + e\bar r^* F_p \equiv pC + \dot M + \dot B_p + e\dot F_p \tag{1}
\]

\[
pT + e\bar r^* F_c + \dot B \equiv pG + rB_p \tag{2}
\]

\[
\dot M \equiv \dot B_c \tag{3}
\]

Equation 1 states that wage, profit, and interest incomes (after taxes) of private households are spent on consumption and additions to money, and domestic and foreign fixprice bond holdings (with prices set equal to one in the respective currency). We here denote by r and r̄* the domestic and the foreign rate of interest, and by Bp and Fp the current stocks of domestic and foreign bonds held by the private sector. In Eq. 2 we state that the government finances its expenditures and its interest payments to households through taxes, new issues of bonds, and also the domestic and foreign interest proceeds of the central bank (which are transferred to the government sector). The central bank uses open market operations concerning domestic bonds2 in order to change continuously the quantity of money in the form of time derivatives (as shown in Eq. 3). We furthermore allow for discontinuous policy changes in the financial stocks held in the domestic economy, through open market or foreign exchange market transactions of the Central Bank (or briefly CB) of the following form.

2 Of course, the central bank could also buy foreign bonds from domestic residents.


2.1 Central Bank Stock Policy Constraint

\[
\text{CB-PC:}\qquad -M + B_c + eF_c:\qquad -dM + dB_c + e\,dF_c = 0
\]

We therefore distinguish between rule-determined continuous changes in the money supply, and isolated policy interventions in the financial markets of the economy. In the above flow identities, we assume consistency (equality) between the flow supplies Ṁ and Ḃ − Ḃc provided by the government and the central bank, and the flow demands Ṁ and Ḃp of households. Furthermore, the central bank has some holdings of government bonds, Bc, due to past open market operations (which concern both the stock and the flow constraints). However, since it transfers (by assumption) interest on these bonds back into the government sector, we need not consider these bond holdings explicitly in the budget Eqs. 1–3. Open market operations thus can also change stocks instantaneously, and thus have to be considered as wealth reallocations of households with respect to their wealth constraint later on. Note that the resulting budget equation of the central bank is formulated in a way that allows for consistent steady state calculations in the case of steady state inflation.

The consistency assumption made with respect to Eqs. 1–3 implies for the aggregate of these three equations that

\[
p(Y - C - G) + e\bar r^*(F_p + F_c) \equiv e\dot F_p \tag{4}
\]

that is, the balance of payments is always balanced in the assumed situation without the need for central bank intervention and independent of the exchange rate and money supply regime the economy is subject to. We assume with respect to this identity that the economy considered produces a single commodity that it can either export (if Y − C − G > 0 holds) or import (Y − C − G < 0), and that there are no restrictions on the world market for doing so. This is some sort of Say’s Law, since there is then no demand constraint (or supply constraint) operating on the domestic goods market. Interpreting Eq. 4 in this way implies that the balance of payments of the domestic economy is always balanced, since the left-hand side represents the current account and the right-hand side (the negative of) the capital account of this economy. This also holds if the capital flows underlying the assumed perfect asset substitutability are made explicit. In fact, in this chapter we go one step further by assuming that only infinitesimal capital flows are needed to ensure the UIP condition of the model considered below, so that the explicitly treated law of motion for the evolution of the foreign bond holdings of domestic residents is solely determined by the current account. This is a significant disadvantage of a model type that does not treat financial asset demands explicitly (as in the Mundell–Fleming–Tobin model of the literature; see Rødseth (2000, ch. 6), for example).
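The aggregation that leads to Eq. 4 can be traced term by term. The following worked derivation is a sketch that uses only the symbols of Eqs. 1–3 and the consistency assumption stated in the text, namely that total new bond issues are absorbed by households and the central bank, so that \dot B = \dot B_p + \dot B_c.

```latex
\begin{align*}
&\text{Add the left-hand and right-hand sides of (1) and (2):}\\
&p(Y-T) + rB_p + e\bar r^* F_p + pT + e\bar r^* F_c + \dot B
  \equiv pC + \dot M + \dot B_p + e\dot F_p + pG + rB_p \\[4pt]
&\text{Cancel } pT \text{ and } rB_p,\ \text{substitute } \dot B = \dot B_p + \dot B_c
  \text{ and, from (3), } \dot M \equiv \dot B_c: \\
&pY + e\bar r^*(F_p + F_c) + \dot B_p + \dot B_c
  \equiv pC + pG + \dot B_c + \dot B_p + e\dot F_p \\[4pt]
&\Longrightarrow\quad p(Y - C - G) + e\bar r^*(F_p + F_c) \equiv e\dot F_p
\end{align*}
```

All domestic financial flows cancel in the aggregate, which is why the result holds independently of the monetary regime: only the trade balance, the foreign interest income, and the change in privately held foreign bonds remain.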


We define the real wealth of the economy by the following aggregate:

\[
W = W_p + W_g + W_c = \frac{M + B_p + eF_p}{p} + \frac{-M - B_p}{p} + \frac{eF_c}{p} \tag{5}
\]

i.e.,

\[
W = \frac{e(F_p + F_c)}{p} = \frac{eF}{p} \tag{6}
\]

which shows that the real wealth of all sectors in the economy (private households, government, and the central bank) adds up exactly to the real value of the foreign bonds held in the domestic economy (the amount of government bonds Bc held by the central bank has already been canceled in these equations3). In terms of growth rates, Eq. 6 implies that

\[
\hat W = \hat e + \hat F - \hat p \quad\text{or}\quad \dot W = \hat e W + e\dot F/p - \hat p W
\]

Using Eq. 4, this yields

\[
\dot W = (\hat e - \hat p)W + [p(Y - C - G) + e\bar r^* F]/p = (\bar r^* + \hat e - \hat p)W + Y - C - G \tag{7}
\]

due to what was shown for the balance of payments in nominal (domestic) terms. The change in real domestic wealth, given by the real value of foreign bonds held in the domestic economy, therefore reflects the capital account in real terms if the real rate of return ρ̄* = r̄* − π̄* = r̄* + ê − p̂ (see the assumptions of the model presented below) is applied in the interest income balance to calculate real interest flows (as shown above). Equation 7 thus reflects the balance of payments (Eq. 4) in real (domestic) terms. In fact it provides the first equation, indeed the core equation, of this model of an extremely open economy as formulated in Rødseth (2000, ch. 5), if a certain taxation policy is implemented (see footnote 4). The remaining equations of the model are (with N̄ the full employment level on the labor market)4

3 Since government bonds held by the central bank are irrelevant for the trajectories followed by the economy, we completely ignore them from now on and thus now use the letter B in the place of Bp for notational simplicity. Note also that we always have Ḟ = Ḟp by assumption.

4 The assumption that W̄_g^a = constant holds is justified by a corresponding rule (one that ensures this) for the collection of government taxes (see Rødseth 2000, p. 117). We also assume that government expenditures G are of a given magnitude in the following equations.

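\begin{align}
& M/p = m^d(Y, r), \qquad m^d_Y > 0, \quad m^d_r < 0 \tag{8} \\
& p = ep^*, \qquad \hat p^* = \bar\pi^* = \text{constant} \tag{9} \\
& r = \bar r^* + \hat e \tag{10} \\
& w = pf'(\bar N) \tag{11} \\
& Y = f(\bar N) \tag{12} \\
& C = C(Y_p, W_p, \bar\rho^*), \qquad 1 > C_1 > 0, \quad C_2 > 0, \quad C_3 < 0 \tag{13} \\
& Y_p = Y + \bar\rho^* W - G, \qquad W_p = W - \bar W_g^a \tag{14} \\
& W = eF/p \tag{15} \\
& \bar W_g^a = W_g + W_c = -(M + B)/p + eF_c/p = \text{constant} \tag{16} \\
& \bar\rho^* = \bar r^* - \bar\pi^* = \bar r^* + \hat e - \hat p = r - \hat p = \bar\rho = \text{constant} \tag{17}
\end{align}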

We here consider an economy that is small compared to the world market, and that can satisfy all excess demand $C + G - Y$ via imports and can export all excess supply $Y - C - G$ in the opposite case. The world market nominal rate of interest $\bar r^*$ is assumed to have a given magnitude, as is the world inflation rate $\bar\pi^*$. In the above model, we assume a standard LM curve of the Keynesian variety, the absolute form of the PPP condition, the UIP condition as a characterization of international capital mobility, labor market equilibrium, and on this basis the usual description of full employment output, and we supplement this model type by a Keynesian consumption function, the Hicksian definition of disposable income, and the definitions of private and government wealth. A list of the notation used in these equations is provided in Appendix B. Note that the second part of Eq. 17 follows directly from Eq. 9 when transformed to rates of growth. Furthermore, the assumption in Eq. 16 is made for the time being only (and later on discarded in Rødseth 2000, ch. 5, as well as in this chapter). It is based on the implicit assumption of a certain taxation policy, as we shall show below, and it allows us to substitute $W_p$ in Eq. 13 by $W - \bar W_g^a$ and thus reduce this term to the one that governs the dynamics of the capital account in the balance of payments (minus a constant). Equation 14 for (Hicksian) disposable income of households is also finally a consequence of assuming that $W_g^a = (eF_c - M - B)/p$ is a constant (see below). Making use of the government budget restriction and the budget restriction of the central bank ($W_c = eF_c/p$), we first get

$$\dot W_g^a = \frac{\dot e F_c + e\dot F_c - \dot M - \dot B}{p} - \hat p\,\frac{eF_c - M - B}{p} = \hat e W_c + r^* W_c + T - G - \frac{rB}{p} - \hat p W_c + \hat p\,\frac{M + B}{p}$$


P. Flaschel

This expression can be fixed to zero if the tax variable $T$ is adjusted accordingly. On the basis of this assumption, we then get

$$(r - \hat p)\frac{B}{p} - \hat p\,\frac{M}{p} = T + (r^* + \hat e - \hat p) W_c - G$$

The Hicksian concept of household disposable income, i.e., the income which when consumed would keep a household's real wealth on a constant level, is defined by

$$Y_p = Y - T + (r^* + \hat e - \hat p)\frac{eF_p}{p} + (r - \hat p)\frac{B}{p} - \hat p\,\frac{M}{p}$$

This implies the expression

$$Y_p = Y - T + \rho^*\frac{eF_p}{p} + T + \rho^* W_c - G = Y - G + \rho^*\frac{eF}{p} = Y - G + \rho^* W$$

The disposable income $Y_p$, like $W_p$, therefore depends only on $W = W_p + \bar W_g^a = eF/p$ as far as wealth effects are concerned (with a given real rate of return $\bar\rho^*$ in addition). Let us now investigate the model in Eqs. 8–17 for an exogenously given money supply (with $\bar\pi^* = 0$, $p^* = \text{constant}$ in addition) and (as was assumed above) with endogenous taxation that keeps the value of $W_g^a$ constant. Let us also ignore wealth effects in the consumption function for the moment. On the basis of the rational expectations solution for price level and exchange rate adjustments, and in the case of an unanticipated shock in the money supply ($dM = -dB$), we must then have $\hat p = \hat e = 0$, and thus know that $r$ is given by $\bar r^*$. The LM curve in Eq. 8 then determines the perfectly flexible after-shock price level $p$, whilst the PPP assumption in Eq. 9 determines the after-shock level of the exchange rate $e$. If perfectly flexible nominal variables adjust in this way, as has been assumed since Sargent and Wallace's (1973) introduction of the jump variable technique, we get in the case of an unanticipated monetary shock a strict neutrality result for the economy considered as far as the price level, the nominal exchange rate, and the nominal rate of interest are concerned. In addition, the value of $W = eF/p$ remains unchanged in such a situation, i.e., the economy just remains in its steady state position in such an event. In the case of $C_2 > 0$, however, we get a jump in the wealth variables $W_p$, $W_g^a$, and thus a change on the right-hand side of Eq. 18, i.e., we will have real effects of unanticipated jumps in money supply in general.

Perfectly Open Economies


In the case of real shocks,5 for example an unanticipated government expenditure shock, the stock magnitudes $F$, $W$, i.e., the nominal and real holdings of foreign bonds in the economy considered, start moving continuously in time after the shock, which therefore sets in motion the dynamic we have derived above for the real variable $W$.

$$\dot W = \bar\rho^* W + Y - C(Y + \bar\rho^* W - \bar G,\; W - \bar W_g^a,\; \bar\rho^*) - \bar G \tag{18}$$

This dynamic is described by a single autonomous differential equation for the evolution of the domestic holdings of foreign bonds (or domestic indebtedness to foreigners) $W = eF/p$, since $\bar\rho^* = r$, $Y = Y(\bar N)$, and now also $\bar G$, are given magnitudes. The whole model in Eqs. 8–17 can therefore be reduced to, and analyzed by, a single law of motion for the real value of foreign bonds held in the domestic economy, as is implied by the balance of payments of the economy. Yet the perfect world assumed by Eqs. 7–17, with all prices flexible, the permanent fulfillment of the PPP and the UIP, and with myopic perfect foresight in the working of the latter, then faces one significant problem, namely the possibility of an explosive evolution of the real value of foreign assets held domestically if and only if

$$(1 - C_1)\bar\rho^* - C_2 = (1 - C_{Y_p}(\cdot))\bar\rho^* - C_{W_p}(\cdot) > 0$$

holds true, i.e., when wealth effects in consumption are sufficiently weak. The steady state level of $W$ behind this stability condition is given by6

$$\bar\rho^* W_0 = C(Y - \bar G + \bar\rho^* W_0,\; W_0 - \bar W_g^a,\; \bar\rho^*) + \bar G - Y$$

and is then surrounded by centrifugal forces which drive $W$ towards $+\infty$ for $W(0) > W_0$, or to $-\infty$ for $W(0) < W_0$. The specie-flow mechanism of the classical open economy model (Asada et al. 2003, ch. 2) thus can now be unstable in a global way if interest rate effects are present in the model considered, here due to a Keynesian consumption function. However, this occurrence of an unstable balance of payments adjustment process with respect to the accumulation of foreign bonds or dollar-denominated debt in the domestic economy undermines the application of the jump variable technique, since then there does not exist a bounded trajectory onto which the jump variables $p$, $e$ can jump, and that brings the economy back to its steady state when thrown out of this position by an appropriate real shock.

5 Where $W(0)$ remains constant due to the PPP condition.
6 We again assume that there is a single (positive or negative) solution $W_0$ to this equation (describing a creditor or debtor position for the private sector of the economy considered).
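The instability condition for Eq. 18 can be illustrated with a minimal numerical sketch. The linear consumption function and all parameter values below are assumptions made purely for illustration; the chapter itself leaves $C(\cdot)$ general.

```python
# Illustrative parameter values; none of them are taken from the chapter.
DEFAULTS = dict(rho=0.05, Y=1.0, G=0.2, Wg=0.0, c0=0.1, C1=0.8, C2=0.0)

def w_dot(W, p):
    """Right-hand side of Eq. 18 with a hypothetical linear consumption function."""
    Yp = p["Y"] + p["rho"] * W - p["G"]          # disposable income (Eq. 14)
    Wp = W - p["Wg"]                             # private wealth
    C = p["c0"] + p["C1"] * Yp + p["C2"] * Wp    # assumed linear form of C(.)
    return p["rho"] * W + p["Y"] - C - p["G"]

def slope(p):
    """(1 - C1) * rho - C2, the derivative of w_dot with respect to W."""
    return (1.0 - p["C1"]) * p["rho"] - p["C2"]

def steady_state(p):
    """W0 with w_dot(W0) = 0 (w_dot is linear in W in this sketch)."""
    return -w_dot(0.0, p) / slope(p)

def simulate(W_init, p, horizon=50.0, dt=0.01):
    """Explicit Euler integration of dW/dt = w_dot(W)."""
    W = W_init
    for _ in range(int(horizon / dt)):
        W += dt * w_dot(W, p)
    return W

weak = dict(DEFAULTS)              # C2 = 0: positive slope, centrifugal case
strong = dict(DEFAULTS, C2=0.05)   # negative slope, centripetal case
for name, p in (("weak", weak), ("strong", strong)):
    W0 = steady_state(p)
    print(name, "slope:", slope(p), "deviation after shock:", simulate(W0 + 0.1, p) - W0)
```

With a weak wealth effect the deviation from $W_0$ grows over time, while a sufficiently strong wealth effect makes it shrink.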


Therefore, this case is excluded or blocked out from consideration in Rødseth (2000, p. 121). However, it is possible to investigate such an unstable adjustment process further within the balance of payments if it is assumed that wealth effects become dominant far off the steady state, to the extent that not only the slope of the right-hand side of Eq. 18 with respect to $W$ becomes negative, but also that two further equilibria of the $\dot W$ dynamics are created, as shown in Fig. 1. This shows two stable and, in between, one unstable stationary point for balance of payments adjustment. Flukes apart, the economy will therefore converge over time to either $W_1$ or $W_3$. Let us assume the latter situation, i.e., the economy then approaches a high creditor position with respect to the rest of the world (due to its consumption behavior along this path). Assume now that government increases its expenditure in such a situation as long as there is a surplus in the current account of the economy. The $\dot W$ curve shown in Fig. 1 is thereby shifted downward, accompanied by decreases in the actual as well as the stationary level of real foreign wealth held domestically. This process may reach a point where the upper equilibrium $W_3$ gets lost (immediately after the situation where $W_2 = W_3$ is established). There is then only one stable equilibrium left, $W_1$, to which the economy will converge over time. Along this path, the creditor position of the domestic economy will sooner or later change into a debtor position (in the situation shown in Fig. 1). This process is accompanied by current account deficits until the new stationary point is reached where the current account has become balanced again. Appropriate nonlinearities in consumption behavior may therefore be used to explain long-lasting regime switches away from a strong economy towards a regime of increasing the foreign debt of the domestic economy (and vice versa).
Fig. 1. Multiple stationary points for the balance of payments adjustment process (vertical axis $\dot W$, horizontal axis $W$, stationary points $W_1$, $W_2$, $W_3$)

The basic result of these considerations is that the steady state of this extreme monetary approach to open economies that are small relative to the world market may be unstable in the larger world (if wealth effects in consumption are so small that they do not turn exports into imports as wealth is accumulated in terms of foreign bonds). This result has been shown here in a perfect flexprice environment with a Keynesian LM curve and Keynesian consumption behavior, but no IS-restriction on the economy ($I \equiv 0$ still). Rødseth (2000, ch. 5) goes beyond this analysis by also considering imperfect capital mobility with fixed or flexible exchange rates, a government deficit and changing $\bar W_g^a$, and nominal (real) wage rigidity by way of an expectation-augmented money-wage Phillips curve (with myopic perfect foresight), again with fixed or flexible exchange rates. In this chapter we will now dispense with the assumption made on the tax rate policy and assume instead that $T = \bar T$ (in addition to given government expenditure $\bar G$) is determined exogenously.7 This implies that aggregate government wealth $W_g^a$ is changing, and will lead us to a situation where this magnitude is interacting dynamically with total wealth $W$, and in fact now also with the price level $p$.
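The regime switch described around Fig. 1 can be mimicked with a small sketch. The cubic law of motion `W - W**3 - shift` is a hypothetical stand-in for a $\dot W$ curve with three stationary points, and `shift` plays the role of a downward shift caused by higher government expenditure; neither is taken from the chapter.

```python
def w_dot(W, shift=0.0):
    """Hypothetical S-shaped law of motion, lowered by `shift`."""
    return W - W**3 - shift

def stationary_points(shift, lo=-3.1, hi=3.1, n=6200):
    """Locate zeros of w_dot on [lo, hi] by a sign scan plus bisection."""
    roots = []
    step = (hi - lo) / n
    a, fa = lo, w_dot(lo, shift)
    for i in range(1, n + 1):
        b = lo + i * step
        fb = w_dot(b, shift)
        if fa == 0.0:
            roots.append(a)                      # grid point is itself a zero
        elif fa * fb < 0.0:
            left, right = a, b
            for _ in range(60):                  # bisection inside the bracket
                mid = 0.5 * (left + right)
                if w_dot(left, shift) * w_dot(mid, shift) <= 0.0:
                    right = mid
                else:
                    left = mid
            roots.append(0.5 * (left + right))
        a, fa = b, fb
    return roots

def is_stable(W, shift, h=1e-6):
    """A stationary point is locally stable if w_dot slopes downward there."""
    return (w_dot(W + h, shift) - w_dot(W - h, shift)) / (2.0 * h) < 0.0

for shift in (0.0, 0.5):
    pts = stationary_points(shift)
    print(shift, [(round(w, 4), is_stable(w, shift)) for w in pts])
```

Without a shift the sketch finds a stable, an unstable, and a stable point; once the curve has been lowered far enough, only the lower stable point survives.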

3 Integrating the Dynamics of the Government Budget Constraint

We start again from the budget equations of the three relevant sectors, households, the government, and central bank. With respect to firms, we again assume that all their income is transferred to the household sector. They do not invest in the present framework, and thus are only organizing the production of the full employment output $Y$ by means of the given labor supply $\bar N$. Note also that the CB holds both types of government bonds and may change these holdings by way of instantaneous open market or foreign exchange market policies,8 but that this does not influence the shown form of the budget equations, since all interest income from these bond holdings is transferred to the government sector, which therefore only has to pay interest on the bonds $B$ held by the private sector. The relevant flow budget restrictions of the small open economy are thus (when government and the Central Bank are considered in integrated form)

$$p(Y - \bar T) + rB + e\bar r^* F_p = pC + \dot M + \dot B + e\dot F_p$$

$$p\bar T + e\bar r^* F_c + \dot B + \dot M = p\bar G + rB$$

7 Note that this rigidity in government behavior is one root for the instability found to exist below.
8 To be represented by $dM$, $dB$, and $dF_c$ in a flexible exchange rate regime (where $F_c = \text{constant}$ otherwise).


and their implications for the evolution of domestic and foreign bonds held by the domestic economy are (describing, if positive, the public debt of the government and of the foreign economy)

$$\dot B = rB + p(\bar G - \bar T) - e\bar r^* F_c - \dot M$$

$$e\dot F = p(Y - C - \bar G) + e\bar r^* F, \quad F = F_p + F_c$$

Considering the same situation from the viewpoint of savings, we can write

$$pS_p = p(Y - \bar T) + rB + e\bar r^* F_p - pC = \dot M + \dot B + e\dot F_p$$

$$pS_g = p\bar T + e\bar r^* F_c - p\bar G - rB = -\dot B - \dot M$$

which, as expected, gives for total savings $pS$

$$pS = pY - pC - p\bar G + e\bar r^*(F_p + F_c) = e\dot F_p = e\dot F$$

This is simply a reformulation of the fact that the balance of payments must be balanced in the assumed situation without any further adjustment processes (solely based on the assumption that the issue of new money $\dot M$ and new government bonds $\dot B$ is accepted by the household sector, as is implicitly assumed in the above formulation of the two budget equations of our economy). In analogy to the Hicksian definition of private disposable income, we now define and rearrange this concept for the aggregate government sector (including the foreign interest income of the central bank), and show on this basis in particular that the aggregate real wealth of this sector $W_g^a$ is, in its time-rate of change (as in the case of the private sector), determined by deducting from its real disposable income the consumption of this sector. Then this also provides us with a law of motion for the real aggregate wealth of the government sector, as well as the one we have already determined for the total wealth of the economy. These two laws describe the evolution of surpluses or deficits in the government sector and the evolution of current account surpluses or deficits, and thus in particular allow the joint treatment of the issue of twin deficits in an open economy with a government sector (yet extended towards capital accumulation and economic growth).
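$$\begin{aligned}
Y_g^a &= T - \frac{rB}{p} + \frac{r^* eF_c}{p} + \hat p\,\frac{M + B}{p} + (\hat e - \hat p)\frac{eF_c}{p} \\
&= T - (r - \hat p)\frac{M + B}{p} + r\frac{M}{p} + (r^* + \hat e - \hat p)\frac{eF_c}{p} \\
&= T + r\frac{M}{p} - (r - r^* - \hat e)\frac{M + B}{p} + (r^* + \hat e - \hat p)\frac{-(M + B) + eF_c}{p} \\
&= T + r\frac{M}{p} + \xi\,\frac{M + B}{p} + \rho^*\frac{-(M + B) + eF_c}{p}, \quad \xi = r^* + \hat e - r
\end{aligned}$$

with

$$W_g^a = \frac{-(M + B) + eF_c}{p} = -\frac{M + B}{p} + \frac{eF_c}{p} = W_g + W_c$$

i.e.,

$$\begin{aligned}
\dot W_g^a &= \frac{-(\dot M + \dot B) + e\dot F_c + \dot e F_c}{p} - \hat p\,\frac{-(M + B) + eF_c}{p} \\
&= \frac{-(pG + rB - pT - r^* eF_c) + \hat e\, eF_c}{p} - \hat p\,\frac{-(M + B) + eF_c}{p} \\
&= T - G - r\frac{B}{p} + \hat p\,\frac{M + B}{p} + (r^* + \hat e - \hat p)\frac{eF_c}{p}
\end{aligned}$$

which finally gives

$$\dot W_g^a = Y_g^a - G = \rho^* W_g^a + \xi\,\frac{M + B}{p} + r\frac{M}{p} + T - G$$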

The above calculations concern the sources of income, and consider as disposable in this regard that part of accounting income that, when consumed, just preserves the current level of real wealth of the sector considered, which here is the aggregated government sector. On the basis of this income concept and the definition of the aggregate wealth of the government sector (where central bank wealth is included), we then derived the law of motion for aggregate government wealth, and showed in particular that the time-rate of change of this wealth magnitude is just the difference between disposable real government income and real government expenditure. In the perfectly open economy (where $\xi = 0$ and $\hat e = \hat p - \bar\pi^*$ holds), this then gives for a Cagan-type real-money demand function $m^d(Y, r) = kY\exp(\alpha(\bar r^* - r))$, i.e., for the reduced form of the LM curve, $r = \bar r^* + (\ln p + \ln Y + \ln k - \ln M)/\alpha$, as the law of motion for aggregate government wealth (or better debt)

$$\dot W_g^a = \bar\rho^* W_g^a + r\,m^d(Y, r) + \bar T - \bar G$$

i.e.,

$$\dot W_g^a = \rho^* W_g^a + \left(r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)\frac{M}{p} + T - G$$
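As a quick check of the reduced-form LM curve, the following sketch inverts the Cagan-type money demand function numerically. The function names and the parameter values are illustrative assumptions only.

```python
import math

def money_demand(Y, r, k, alpha, r_star):
    """Cagan-type real money demand m^d(Y, r) = k * Y * exp(alpha * (r_star - r))."""
    return k * Y * math.exp(alpha * (r_star - r))

def lm_rate(p, M, Y, k, alpha, r_star):
    """Interest rate that clears the money market, M / p = m^d(Y, r)."""
    return r_star + (math.log(p) + math.log(Y) + math.log(k) - math.log(M)) / alpha

# Illustrative values (assumptions, not calibrated to anything in the text).
p, M, Y, k, alpha, r_star = 1.2, 1.0, 1.0, 1.0, 2.0, 0.03
r = lm_rate(p, M, Y, k, alpha, r_star)
print("r =", r)
print("money market gap =", money_demand(Y, r, k, alpha, r_star) - M / p)
```

At the price level $p = M/(kY)$ the implied rate coincides with the world rate, which is the steady-state case used later in the chapter.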


This shows that the evolution of this debt is governed (in a destabilizing way) by its level, but is also dependent on the price level and its evolution in addition to the budget surplus (or deficit). Since this debt position of the government is now no longer constant, we next repeat the equations for the evolution of total wealth $W = eF/p$ of the economy for this case, and then again consider private disposable income $Y_p$ and private wealth $W_p$ in its interaction with the evolution of aggregate government debt.

$$W = W_p + W_g + W_c = \frac{eF}{p}, \quad F = F_p + F_c$$

$$\hat W = \hat e + \hat F - \hat p$$

$$\begin{aligned}
\dot W &= \hat e W + \frac{e\dot F}{p} - \hat p W \\
&= \hat e W + \frac{p(Y - C - G) + e r^* F}{p} - \hat p W \\
&= (\hat e - \hat p) W + r^*\frac{eF}{p} + Y - C - G \\
&= (r^* + \hat e - \hat p) W + Y - C(Y_p, W_p, \rho^*) - G
\end{aligned}$$

$$\dot W = \bar\rho^* W + Y - C(Y_p, W_p, \bar\rho^*) - \bar G$$

where we have for the definition of private wealth and disposable income:

$$W_p = \frac{M + B + eF_p}{p} = W - W_g^a$$

$$\begin{aligned}
Y_p &= Y - T + (r^* + \hat e - \hat p)\frac{eF_p}{p} + (r - \hat p)\frac{B}{p} - \hat p\,\frac{M}{p} \\
&= Y - T + (r^* + \hat e - \hat p) W_p - (r^* + \hat e - \hat p)\frac{M + B}{p} + (r - \hat p)\frac{B}{p} - \hat p\,\frac{M}{p} \\
&= Y - T + \rho^* W_p - \xi\,\frac{M + B}{p} - r\frac{M}{p} \\
&= Y - T + \rho^*(W - W_g^a) - \xi\,\frac{M + B}{p} - r\frac{M}{p}
\end{aligned}$$

with $\xi = 0$ in the case of a perfectly open economy. From the results on the disposable income of households and the government, we finally also get

$$Y_p = Y - Y_g^a + \bar\rho^* W \quad\text{or}\quad Y_p + Y_g^a = Y + \bar\rho^* W$$

as the relationship between total disposable income, domestic product, and real interest on domestically held foreign bonds. In the case where the price level $p$, money supply $M$, government expenditure $\bar G$, and taxes $\bar T$ are all held constant, we then get that the implied evolution of government debt $W_g^a$ is always divergent at both sides of its steady state level

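$$W_{go}^a = \frac{G - T - \left(r^* + \dfrac{\ln p_o + \ln Y + \ln k - \ln M}{\alpha}\right)\dfrac{M}{p_o}}{\rho^*}$$

(obtained by setting $\dot W_g^a = 0$ in the law of motion for aggregate government wealth at the given price level $p_o$).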

There is thus a need for an active fiscal or monetary policy (not necessarily supported by price-level adjustments according to a standard expectations-augmented Phillips curve) in order to stabilize the government budget. Without such a policy, both the budget of the government and the behavior of the capital account will be unstable, since the government deficit feeds back into the capital account via the consumption behavior of households (but is, in the present situation, completely independent from the rest of the economy). Using the steady-state value of $W_g^a$ in the law of motion for $W$ then implies the same stability characterization for the adjustment of domestic foreign bond holdings as before, which therefore is now also driven by the unstable behavior of government debt, but otherwise behaves as was analyzed in the case of a constant government debt. It follows that we have to integrate active fiscal policy rules, price-level adjustments, and an active monetary policy rule here in order to get less one-sided results on the evolution of the twin deficits considered, the stability of the evolution of government debt, and the balance of payments (BOP) adjustment processes involved.
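The divergence of government debt under constant $p$, $M$, $G$, $T$ follows from the linear form of its law of motion, $\dot W_g^a = \rho^* W_g^a + r_o M/p_o + T - G$, where $r_o$ denotes the interest rate implied by the LM curve at $p_o$. A minimal sketch with assumed parameter values (not taken from the chapter) uses the closed-form solution of this linear equation:

```python
import math

def debt_rhs(Wg, rho, r_o, M, p_o, T, G):
    """Law of motion for aggregate government wealth at constant p, M, T, G."""
    return rho * Wg + r_o * M / p_o + T - G

def steady_state(rho, r_o, M, p_o, T, G):
    """Level of Wg at which debt_rhs vanishes."""
    return (G - T - r_o * M / p_o) / rho

def path(Wg_init, t, rho, r_o, M, p_o, T, G):
    """Closed-form solution: deviations from the steady state grow at rate rho."""
    Ws = steady_state(rho, r_o, M, p_o, T, G)
    return Ws + (Wg_init - Ws) * math.exp(rho * t)

# Illustrative values only.
params = dict(rho=0.03, r_o=0.03, M=1.0, p_o=1.0, T=0.2, G=0.25)
Ws = steady_state(**params)
for start in (Ws + 0.1, Ws - 0.1):
    print(start - Ws, "->", path(start, 10.0, **params) - Ws)
```

Because $\rho^* > 0$, any initial deviation from the steady state widens on either side, which is the divergence stated in the text.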

4 Twin Deficits and PPP/UIP-Driven Price-level Dynamics

At the end of the preceding section, we assumed that the price level is a given constant in order to study the evolution of deficits or surpluses in the GBR and in the BOP in isolation. Yet prices (and the exchange rate) are moving if the price level is shocked out of its steady-state position $p_0$, where the nominal rate of interest implied by the LM curve is no longer equal to the given world interest rate. In such a case, the resulting interest rate differential determines, via the UIP condition, the growth rate of the nominal exchange rate, and consequently, via the PPP, also the growth rate of domestic prices. Changes in the domestic price level feed back into the laws of motion for $W$ as well as $W_g^a$ (but not vice versa), and thus make the dynamics of these two stock variables more complicated. In this subsection, we study these integrated dynamics of the private sector in their still fairly recursive format before we come, in the next section, to the question of what fiscal and monetary policy can do in view of such extreme formulations of the dynamics of the private sector.

4.1 Price-Level Dynamics in the Perfectly Open Economy

The full set of three laws of motion of the private sector of the perfectly open economy considered in this chapter, with foreign inflation set equal to zero ($\bar\pi^* = 0$) and with fixed policy parameters ($\bar M$, $\bar T$, $\bar G$), reads, on the basis of the Cagan money demand function of this section, as

$$\dot W = \rho^* W + Y - C\left(Y + \rho^* W - \rho^* W_g^a - r\frac{M}{p} - T,\; W - W_g^a,\; \rho^*\right) - G \tag{19}$$

$$\dot W_g^a = \rho^* W_g^a + r\frac{M}{p} + T - G \tag{20}$$

now coupled with the law of motion for the price level $p$

$$\hat p = [\hat e =]\; \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha} \tag{21}$$

This is based on the relationships

$$W = \frac{eF}{p}, \quad r = r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}, \quad W_g^a = \frac{-(M + B) + eF_c}{p}$$

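The steady state of this dynamic system is determined by

$$p_0 = \frac{M}{Y} \quad [\ln p_0 = \ln M - \ln Y]$$

$$\left(r\frac{M}{p}\right)_0 = r^*\frac{M}{p_0} = r^* Y$$

$$(W_g^a)_0 = -\frac{r^* Y \exp(-\alpha r^*) + T - G}{\rho^*}$$

$$(Y_p)_0 = Y - G + \rho^* W_0, \quad (W_p)_0 = W_0 - (W_g^a)_0$$

$$W_0 = -\frac{Y - C((Y_p)_0, (W_p)_0, \rho^*) - G}{\rho^*}$$

where the signs of $(W_g^a)_0$ and $W_0$ follow from setting $\dot W_g^a = 0$ in Eq. 20 and $\dot W = 0$ in Eq. 19.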

Note that the equations for private disposable income and wealth have to be inserted into the last equation in order to get a single equation in the only unknown W0 , that must then be solved for this term.


The three laws of motion shown above, which describe the dynamics of the private sector of our perfectly open economy as a result of all the assumptions made on its ideal performance, can be structured hierarchically in the following way.

$$\hat p = \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha} = \hat p(p) \tag{22}$$

$$\dot W_g^a = \rho^* W_g^a + \left(r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)\frac{M}{p} + T - G = \dot W_g^a(p, W_g^a) \tag{23}$$

$$\dot W = \rho^* W + Y - C\left(Y - \rho^* W_g^a - \left(r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)\frac{M}{p} - T + \rho^* W,\; W - W_g^a,\; \rho^*\right) - G = \dot W(p, W_g^a, W) \tag{24}$$

Inserting the steady-state values $p_0$, $(W_g^a)_0$ into the law of motion for $W$ would lead us back to the situation considered in the preceding section, since neither $p$ nor $W_g^a$ depends on the evolution of total domestic wealth, due to the hierarchy in the laws of motion of the 3D dynamics that characterize the private sector of this perfectly open economy. Now, however, the price level $p$ and the representation of aggregate government debt $W_g^a$ are moving in general, and their stability properties must be investigated. The law of motion for the price level is of the type already considered in Sargent and Wallace (1973) in their formulation of the jump-variable technique of the rational expectations approach. It could, in principle, be treated as in this chapter, but would then be confronted with the problem of how the real magnitudes $W_g^a$, $W$ have to be treated in view of such jumps in the price level. In view of the many deficiencies of the JVT, see for example Asada et al. (2003), we do not follow such an approach any more, but now investigate the 3D dynamics from the perspective that all its three variables are predetermined, and that an implied instability of the system must be overcome by conscious policy actions and not by the imposition of appropriate jumps on a set of variables that are assumed to be nonpredetermined (an assumption that is hard to believe applies to the general level of prices of an applicable macromodel).

4.2 Stability Analysis

The Jacobian of the given 3D dynamics, considered at the steady state, is characterized by

$$J = \begin{pmatrix} + & 0 & 0 \\ \mp & + & 0 \\ \mp & + & \mp \end{pmatrix}$$

where the sign of $J_{21}$, $J_{31}$ is given by the slope of the function

$$f(p) = \left(r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)\cdot\frac{M}{p}$$

with respect to $p$, and the sign of $J_{33}$ is determined as we have discussed in the original one-dimensional setting of this chapter. Reformulating $f(p)$ as a function in the rate of interest $r$, we get $\tilde f(r) = f(p(r)) = r\,m^d(Y, r)$ and thus

$$\tilde f'(r) = r\,m_r^d + m^d = m^d\left(\frac{r\,m_r^d}{m^d} + 1\right)$$

i.e., the sign of this function is determined by the interest-rate elasticity of the money demand function

$$\tilde f'(r) \gtreqless 0 \;\Leftrightarrow\; f'(p) \gtreqless 0 \;\Leftrightarrow\; \frac{dm^d}{dr}\,\frac{r}{m^d} \gtreqless -1$$

The coefficients of the Routh–Hurwitz polynomial corresponding to the matrix $J$ fulfill in the given situation (where there are three zero entries in this matrix $J$)

$$a_1 = -\operatorname{trace} J = -J_{11} - J_{22} - J_{33}$$

$$a_2 = J_{11}J_{22} + J_{11}J_{33} + J_{22}J_{33}$$

$$a_3 = -\det J = -J_{11}J_{22}J_{33}$$

The stability of the dynamics considered thus only depends on the terms in the diagonal of the matrix $J$, and thus, in particular, not on the considered interest-rate elasticity condition for the money demand function. Furthermore, if $a_3 > 0$ holds (as a necessary stability condition), we basically need to consider only whether the signs of $a_1$ or $a_2$ can hurt stability, since the remaining stability condition $a_1 a_2 - a_3 > 0$ is not a questionable one due to the fact that the terms in $a_1 a_2$ completely dominate the single one in $a_3$. Ignoring this, the remaining necessary conditions for local asymptotic stability are then

$$J_{33} < 0, \quad J_{11} + J_{22} < -J_{33}$$

and if this holds,

$$-(J_{11}J_{33} + J_{22}J_{33}) < J_{11}J_{22}$$

This implies that

$$-J_{33}(J_{11} + J_{22}) < J_{11} + J_{22} < -J_{33} \quad\text{or}\quad -J_{33} < 1 < -\frac{J_{33}}{J_{11} + J_{22}}$$

As a result of the preceding section, the assumption −J33 < 1 is not a crucial one, while the other inequality states that J11 + J22 must be sufﬁciently small (relative to the size of J33) in order to get local asymptotic stability for the 3D dynamics considered. We thus see that the only potentially stabilizing term J33 in the 3D dynamics considered bears a big burden should it be able to make the overall dynamics of the private sector (with given policy parameters of the government and the Central Bank) an asymptotically stable one. It seems likely that active policy is needed for the proper working of such an economy. We conclude that given ﬁscal and monetary policy parameters represent too narrow a situation in order to safely exclude accelerating processes from the dynamics implied by the model. This is also fairly obvious from the nominal representation of these dynamics as provided in the digression that follows.
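The coefficient formulas for the triangular Jacobian can be checked with a short sketch. Because $J$ is lower triangular, its characteristic polynomial is $(\lambda - J_{11})(\lambda - J_{22})(\lambda - J_{33})$, so each diagonal entry must be a root of $\lambda^3 + a_1\lambda^2 + a_2\lambda + a_3$. The numerical values for the diagonal are arbitrary illustrations, not estimates.

```python
def routh_hurwitz_coefficients(J11, J22, J33):
    """Coefficients a1, a2, a3 for a 3x3 triangular Jacobian with the given diagonal."""
    a1 = -(J11 + J22 + J33)
    a2 = J11 * J22 + J11 * J33 + J22 * J33
    a3 = -J11 * J22 * J33
    return a1, a2, a3

def char_poly(lam, a1, a2, a3):
    """Characteristic polynomial lam^3 + a1*lam^2 + a2*lam + a3."""
    return lam**3 + a1 * lam**2 + a2 * lam + a3

# Illustrative diagonal: J11 and J22 positive, J33 negative (assumed values).
diagonal = (0.5, 0.05, -0.04)
a1, a2, a3 = routh_hurwitz_coefficients(*diagonal)
print("a1, a2, a3 =", a1, a2, a3)
for entry in diagonal:
    print(entry, "->", char_poly(entry, a1, a2, a3))
```

Each diagonal entry evaluates the polynomial to (numerically) zero, confirming that only the diagonal of $J$ enters the coefficients.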

4.3 Digression: The Dynamics in Nominal Terms

In a world where the steady state is inflation-free ($M = \bar M$, $\bar\pi^* = 0$), we can also discuss the dynamics of Eqs. 22–24 in nominal terms and get in this case for its three laws of motion

$$\hat p = r - r^* = \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha} = \hat p(p)$$

$$\dot B = rB + p(G - T) - r^* p W_c = \left(r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)B + p(G - T) - r^* p W_c = \dot B(p, B)$$

$$\dot W = r^* W + Y - G - C\left(Y + r^*(W - W_c) + r^*\frac{M + B}{p} - \left(r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)\frac{M}{p} - T,\; W - W_c + \frac{M + B}{p},\; r^*\right)$$

or indeed translated back into nominal terms

$$\dot F = r^* F + p^*(Y - G) - p^* C\left(Y + r^*\left(\frac{F}{p^*} - W_c\right) + r^*\frac{M + B}{p} - \left(r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)\frac{M}{p} - T,\; \frac{F}{p^*} - W_c + \frac{M + B}{p},\; r^*\right)$$

where we have again made use of $W = eF/p = F/\bar p^*$, $\bar W_c = eF_c/p = F_c/\bar p^* = \text{constant}$. The stability analysis for the system in Eqs. 22–24 also applies to this special case of course, now with respect to the dynamic variables $p$, $B$, $F$. For an application of the jump variable technique, we then find that only $p$ can be a candidate for its application, i.e., we must construct a case where there is exactly one unstable root of the Jacobian $J$ of the dynamics at the steady state. This can be the case if and only if the determinant of $J$ (the product of the three eigenvalues of the matrix $J$) is positive, in which case, however, these three eigenvalues must be real and positive in fact, i.e., there is no case within these dynamics where the jump variable technique can find application. From the perspective of current new Keynesian theory, one may call such a situation an indeterminate one, which should be blocked from consideration. From this perspective, the task would then be to find policy rules that bring determinacy to the system, i.e., that establish the existence of one unstable root for the extended dynamics. This would bring jump-variable stability to the system by the very existence of these rules, so that, for example, an unanticipated jump in the money supply would cause the price level to jump in the same proportion without any reaction in the other variables of the system (with the exception of the nominal exchange rate that also has to jump by the percentage change in the money supply). In this situation, therefore, latent rules cause the economy to return to its real steady state immediately by purely nominal adjustments, and thus provide stability without any need for policy intervention.
However, our approach to such a situation is to assume that if the system is unstable, it behaves in a predetermined way according to the three laws of motion above, and thus, in the case of instability (which is necessarily monotonic, as we have just seen), with increasing or decreasing price levels and increasing imbalances in the budget of the government and in the current and the capital account. We could then, for example, assume policy reactions (ﬁscal or monetary) when these imbalances pass certain thresholds, which attempt to reduce single or twin deﬁcits in the budget and the current account by tailored policy reactions. To consider such behavior in more detail is the objective of the following section.


5 Fiscal and Monetary Policy in the Perfectly Open Economy

We have seen in the preceding section that constant fiscal and monetary expressions (passive policies) are not a good scenario for obtaining stability of the stationary solution of the dynamic system considered. In particular, prices and the evolution of government debt (but also the evolution of the foreign position) will then be subject to centrifugal forces. In this section, we formulate some appropriate (but not necessarily plausible) monetary and fiscal policy rules that, at least from a partial perspective, are capable of taming the cumulative forces in the price level, the budget deficit, and the current account deficit, but due to space limitations only separately. Later we combine these policy rules in a general setup of the model in order to allow, at least numerically (in future research), an analysis of their potential for stabilizing the perfectly open economy considered either locally around the steady state or, if the policy is still inactive, at least globally if the economy has departed significantly from its steady-state position. This section leads to the conclusion that policy anchors in the form of given policy targets and policy rules that attempt to realize these targets are needed for a proper working of the ideal type of economy considered here.

5.1 Stabilizing Monetary Policy

In view of the special type of price dynamics characterizing our perfectly open economy (where the law of motion of the price level is a direct consequence of the UIP and the PPP conditions), we now assume as the monetary policy the rule shown below, which due to its formulation implies an independent 2D dynamics for the state variables $p$, $M$.

$$\hat p = \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}$$

$$\hat M = \beta_{m_1}(M_0/M - 1) + \beta_{m_2}(p/p_0 - 1)$$

with steady-state $p_0 = 1 = M_0$ by way of an appropriate choice of the measurement of output $Y$. Due to the isolated nature of price-level changes, this dynamic system is an autonomous one that can be studied in isolation from the remaining laws of motion of the economy considered. The Jacobian of these dynamics at the steady state is given by

$$J = \begin{pmatrix} 1/\alpha & -1/\alpha \\ \beta_{m_2} & -\beta_{m_1} \end{pmatrix}$$

The steady state is therefore locally asymptotically stable iff

$$1/\alpha < \beta_{m_1} < \beta_{m_2}$$

holds true. Commenting on this result, it is a bit strange to see that the ideal monetarist world of this chapter needs, in its advanced form, money supply increases in order to fight inflation. This is due to the destabilizing feedback channel that leads from rising prices to rising interest rates, from there to rising growth rates of the exchange rate (due to the UIP assumption), and then via the PPP to rising inflation rates in the domestic economy. Increasing the money supply in such a feedback chain then means that the increase in interest rates is counteracted, so that under the above conditions, the positive feedback mechanism between the price level and its rate of change is neutralized, such that the economy can indeed converge back to the steady state after the occurrence of isolated shocks. We thus have a stabilizing antimonetarist monetary policy rule in a model which is extremely monetarist in nature.
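The stability condition $1/\alpha < \beta_{m_1} < \beta_{m_2}$ can be illustrated by integrating the two laws of motion directly. The sketch below assumes $Y = k = 1$ (so that $p_0 = M_0 = 1$), uses a simple Euler scheme, and picks parameter values for illustration only.

```python
import math

def condition_holds(alpha, b1, b2):
    """Local stability condition 1/alpha < beta_m1 < beta_m2."""
    return 1.0 / alpha < b1 < b2

def simulate(p, M, alpha, b1, b2, Y=1.0, k=1.0, horizon=40.0, dt=0.001):
    """Euler integration of the price-level law and the money-supply rule."""
    for _ in range(int(horizon / dt)):
        p_hat = (math.log(p) + math.log(Y) + math.log(k) - math.log(M)) / alpha
        M_hat = b1 * (1.0 / M - 1.0) + b2 * (p - 1.0)   # steady state M0 = p0 = 1
        p, M = p + dt * p * p_hat, M + dt * M * M_hat
    return p, M

alpha, b1, b2 = 2.0, 1.0, 2.0          # assumed values satisfying the condition
print("condition holds:", condition_holds(alpha, b1, b2))
print("state after a 5% price shock:", simulate(1.05, 1.0, alpha, b1, b2))
```

With these values the economy spirals back towards $p = M = 1$; lowering $\beta_{m_1}$ below $1/\alpha$ or $\beta_{m_2}$ below $\beta_{m_1}$ violates the condition.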

5.2 Stabilizing Fiscal Policy I

We next consider a fiscal policy rule that attempts to suppress the cumulative forces in the evolution of government debt by way of a debt-to-GDP target, as in the Maastricht Treaty of the European Community. We again formulate an isolated 2D dynamics that only shows the interaction of this rule with the dynamics of the government budget constraint, and thus ignores all influences of the dynamics of the price level (and of the money supply). For this purpose, we assume that the above steady state values $p_o$, $M_o = 1$ are underlying the discussion of the next policy regime ($e_o = 1$, $\bar p^* = 1$, $r_o = \bar r^*$ in addition).

$$\dot B = r_o B + \bar G - T - \bar r^* \bar W_c$$

$$\hat T = \beta_{t_1}(T_o/T - 1) + \beta_{t_2}(B/Y - \bar d)$$

We have assumed in the second law of motion that government wants to achieve a certain debt to GDP ratio $\bar d$ by adjusting taxes with a certain speed parameter $\beta_{t_2}$ accordingly. In addition, there is lump-sum tax smoothing with a speed parameter $\beta_{t_1}$. The steady-state values of these partial dynamics (where government expenditure is also held constant) are $B_o = \bar d Y$, $T_o$, where the latter is determined such that $\dot B = 0$ is established. For the Jacobian at the steady state, in this case we get

$$J = \begin{pmatrix} r_o & -1 \\ \beta_{t_2} T_o / Y & -\beta_{t_1} \end{pmatrix}$$

The steady state is therefore locally asymptotically stable iff $\beta_{t_1} > r_o$ and $r_o \beta_{t_1} Y < \beta_{t_2} T_o$ holds true. A taxation rule with some tax smoothing and a term that attempts to steer lump-sum taxes such that the debt to GDP ratio approaches a given percentage (the 60% of the Maastricht Treaty, for example) is thus successful in stabilizing the government deficit if government expenditure is held constant during this adjustment process, and if the price-level dynamics is fixed at its steady-state position. Note here that the limiting case $\beta_{t_1} = \infty$ has already been considered earlier as the case where taxation is determined such that government debt $W_g^a$ is kept constant.
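The two inequalities for the tax rule are simply the trace and determinant conditions of the 2D Jacobian with rows $(r_o, -1)$ and $(\beta_{t_2} T_o / Y, -\beta_{t_1})$. A minimal sketch, with assumed parameter values and a hypothetical helper for the steady-state tax level:

```python
def steady_state_taxes(r_o, d_bar, Y, G, r_star, W_c):
    """T_o that sets B_dot = 0 at B_o = d_bar * Y."""
    return r_o * d_bar * Y + G - r_star * W_c

def jacobian(r_o, b_t1, b_t2, T_o, Y):
    """Jacobian of the (B, T) dynamics at the steady state."""
    return ((r_o, -1.0), (b_t2 * T_o / Y, -b_t1))

def is_locally_stable(J):
    """2D criterion: negative trace and positive determinant."""
    trace = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    return trace < 0.0 and det > 0.0

# Illustrative values: 3% interest rate, 60% debt target, no central bank reserves.
r_o, d_bar, Y, G = 0.03, 0.6, 1.0, 0.2
T_o = steady_state_taxes(r_o, d_bar, Y, G, r_star=r_o, W_c=0.0)
for b_t1, b_t2 in ((0.5, 0.2), (0.5, 0.05), (0.01, 0.2)):
    print(b_t1, b_t2, is_locally_stable(jacobian(r_o, b_t1, b_t2, T_o, Y)))
```

Only the first parameter pair satisfies both $\beta_{t_1} > r_o$ and $r_o \beta_{t_1} Y < \beta_{t_2} T_o$; the other two violate one inequality each.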

5.3 Stabilizing Fiscal Policy II

We have just seen that a debt to GDP target of the government may stabilize the originally cumulative forces in the GBR if taxes are varied into the direction of obtaining such a debt ratio, and if this variation is occurring with sufficient strength. Regarding the second, possibly cumulative, instability in the balance of payments adjustment process, we assume now as a further fiscal policy rule that government expenditure varies such that current account surpluses or deficits are reduced intentionally, again accompanied by a second component in this fiscal policy rule that adds some expenditure smoothing. We now assume (for a partial investigation of the implications of such a policy rule) that the steady-state values $p_o$, $M_o$, $B_o$ are underlying the discussion of this policy regime (again plus $e_o = 1$, $\bar p^* = 1$, $r_o = \bar r^*$), and that lump-sum taxes are a fixed magnitude.

\[
\dot W = r^* W + Y - C\left( Y + r^*(W - W_c) + r^* \frac{M_o + B_o}{p_o} - \left( r^* + \frac{\ln p_o + \ln Y + \ln k - \ln M_o}{\alpha} \right) \frac{M_o}{p_o} - T,\; W - W_c + \frac{M_o + B_o}{p_o},\; r^* \right) - G
\]

\[
\dot G = \beta_{g_1} (G_o - G) + \beta_{g_2} \dot W
\]

Improvements in the current account thus lead to increases in government spending from this partial perspective. In the steady state we have $G = G_o$, which can here be given from the outside (within certain limits). We then obtain the value of $W_o$ by setting the first equation equal to zero. The Jacobian of these dynamics at the steady state reads

\[
J = \begin{pmatrix} \dot W_W & \dot W_G \\ \beta_{g_2} \dot W_W & \beta_{g_2} \dot W_G - \beta_{g_1} \end{pmatrix}
\]

This implies local asymptotic stability iff $\dot W_W = r_o(1 - C_1) - C_2 < \beta_{g_1} + \beta_{g_2}$ and $-\dot W_W \beta_{g_1} > 0$ hold true. In the case when $\dot W_W < 0$, i.e., for a stable balance of payments adjustment process, this demands that $\beta_{g_1} > 0$ holds, i.e., indeed the occurrence of government expenditure smoothing. In the opposite case, however, we need $\beta_{g_1} < 0$, i.e., the government should then change its smoothing rule into a rule with a centrifugal in the place of a centripetal adjustment pattern. Stimulating government expenditure above its steady-state level, as well as responding positively to current account improvements, then induces an adjustment in the current account and the level of government expenditure that implies convergence back to the steady state should the economy be shocked out of this steady-state position.
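The two cases discussed above can be checked with a short trace and determinant computation. The sketch below is only illustrative: it assumes $\dot W_G = -1$ (read off from the $-G$ term in the law of motion for $W$), treats $\dot W_W$ as a given number, and uses hypothetical parameter values and a hypothetical function name.

```python
def expenditure_rule_is_stable(W_W, beta_g1, beta_g2, W_G=-1.0):
    """Trace/determinant test for
    J = [[W_W, W_G], [beta_g2*W_W, beta_g2*W_G - beta_g1]].
    W_W and W_G stand for the partial derivatives of dW/dt (assumed given)."""
    trace = W_W + beta_g2 * W_G - beta_g1
    det = W_W * (beta_g2 * W_G - beta_g1) - W_G * beta_g2 * W_W
    return trace < 0 and det > 0

# Hypothetical values (assumptions, not taken from the chapter).
# Stable balance of payments adjustment with expenditure smoothing.
case_a = expenditure_rule_is_stable(W_W=-0.05, beta_g1=0.3, beta_g2=0.2)
# Unstable balance of payments adjustment with the same smoothing rule.
case_b = expenditure_rule_is_stable(W_W=0.05, beta_g1=0.3, beta_g2=0.2)
# Unstable balance of payments adjustment with a "centrifugal" rule.
case_c = expenditure_rule_is_stable(W_W=0.05, beta_g1=-0.1, beta_g2=0.2)

print(case_a, case_b, case_c)
```

The determinant reduces to $-\beta_{g_1} \dot W_W$ whatever the value of $\dot W_G$, which is why the sign of $\beta_{g_1}$ has to be opposite to the sign of $\dot W_W$.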

5.4 Outlook: The Integrated Dynamics

Looking at the system as a whole, we can state that here monetary policy works independently of fiscal policy, while the last two dynamic systems depend on each other, the first through changing $G$ and the second through changing $T$. Of course they also depend on what is going on in the nominal part of the economy. It is therefore necessary to consider the integrated interaction of the monetary policy rule and the two fiscal rules. The resulting 6D dynamics read

\[
\hat p = \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}
\]

\[
\hat M = \beta_{m_1} (M_o / M - 1) + \beta_{m_2} (p / p_o - 1)
\]

\[
\dot B = \left( r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha} \right) B + pG - pT - r^* p W_c - \dot M
\]

\[
\hat T = \beta_{t_1} (T_o / T - 1) + \beta_{t_2} \left( B / (pY) - \bar d \right)
\]

\[
\dot W = r^* W + Y - C\left( Y + r^*(W - W_c) + r^* \frac{M + B}{p} - \left( r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha} \right) \frac{M}{p} - T,\; W - W_c + \frac{M + B}{p},\; r^* \right) - G
\]

\[
\dot G = \beta_{g_1} (G_o - G) + \beta_{g_2} \dot W
\]

where $W = F / \bar p^*$, $W_c = F_c / \bar p^*$. The steady state of these dynamics is characterized by ($G_o = \bar G$ again given exogenously)

\[
p_o = 1,\quad M_o = 1,\quad B_o = \bar d Y,\quad W_{co} = M_o / p_o = 1,\quad (T - G)_o = \bar r^*_o (B_o - M_o),\quad T_o = G_o + (T - G)_o
\]

while $W_o$ can then be determined from the $\dot W = 0$ equation on the basis of the steady-state values for fiscal and monetary policy.⁹ We note that the laws of motion for the price level $p$ and the money supply $M$ are again independent from the rest of the system, while the remaining laws of motion depend on them and are now fully independent of each other. Furthermore, we note that the only link from the capital account block to the block describing the budget dynamics is given by the term $G$ in the government budget constraint, while the laws of motion for $W$, $G$ depend on all the state variables of the model. It is not possible here to provide a proof of local asymptotic stability due to the complexity of the dynamics considered. In fact, one may furthermore assume that the policy rules considered only come into effect outside a certain neighborhood of the steady state. Numerical tools have therefore to be used for the resulting nonlinear system to check whether the proposed fiscal and monetary policy rules bound the dynamics to an economically meaningful domain. The result (yet to be achieved) would then imply that even the ideal world of the perfectly open economy under consideration (due to its centrifugal forces) needs the visible hand of economic policy to limit these forces to an economically feasible domain (not to speak of convergence back to the steady state). Instability due to myopic perfect foresight and the UIP condition, transferred to price-level dynamics by the PPP, and augmented by eventually cumulative processes in the accumulation of deficits or surpluses in the budget and the current account thus needs policy intervention even under the ideal conditions of this advanced classical model of a small open economy.
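The kind of numerical check mentioned above can be sketched for the nominal block alone, since the laws of motion for $p$ and $M$ do not depend on the remaining state variables. The code below integrates $\hat p = (\ln p + \ln Y + \ln k - \ln M)/\alpha$ and $\hat M = \beta_{m_1}(M_o/M - 1) + \beta_{m_2}(p/p_o - 1)$ in logarithms with a classical fourth-order Runge-Kutta scheme. It is a minimal sketch under stated assumptions: $\ln Y + \ln k = 0$ (so that $p_o = M_o = 1$ is a rest point), hypothetical parameter values, and hypothetical function names. For these values the linearization at the rest point has trace $1/\alpha - \beta_{m_1} = -1$ and determinant $(\beta_{m_2} - \beta_{m_1})/\alpha = 2$.

```python
import math

# Hypothetical parameter values (assumptions, not taken from the chapter).
ALPHA = 1.0
BETA_M1 = 2.0
BETA_M2 = 4.0
P_O, M_O = 1.0, 1.0
LN_Y_PLUS_LN_K = 0.0  # assumed so that p = M = 1 is a rest point

def rhs(ln_p, ln_m):
    """Time derivatives of (ln p, ln M), i.e. the growth rates of p and M."""
    d_ln_p = (ln_p + LN_Y_PLUS_LN_K - ln_m) / ALPHA
    d_ln_m = (BETA_M1 * (M_O / math.exp(ln_m) - 1.0)
              + BETA_M2 * (math.exp(ln_p) / P_O - 1.0))
    return d_ln_p, d_ln_m

def rk4_step(ln_p, ln_m, h):
    """One classical Runge-Kutta step of size h."""
    a = rhs(ln_p, ln_m)
    b = rhs(ln_p + 0.5 * h * a[0], ln_m + 0.5 * h * a[1])
    c = rhs(ln_p + 0.5 * h * b[0], ln_m + 0.5 * h * b[1])
    d = rhs(ln_p + h * c[0], ln_m + h * c[1])
    return (ln_p + h * (a[0] + 2.0 * b[0] + 2.0 * c[0] + d[0]) / 6.0,
            ln_m + h * (a[1] + 2.0 * b[1] + 2.0 * c[1] + d[1]) / 6.0)

def simulate(ln_p, ln_m, t_end=40.0, h=0.01):
    """Integrate from the given initial point and return the final state."""
    for _ in range(int(round(t_end / h))):
        ln_p, ln_m = rk4_step(ln_p, ln_m, h)
    return ln_p, ln_m

# Start with the price level 5 percent (in logs) above its steady-state value.
ln_p_end, ln_m_end = simulate(0.05, 0.0)
print(ln_p_end, ln_m_end)
```

A full check of the 6D system would additionally require an explicit consumption function $C$, which the text leaves unspecified, so it is not attempted here.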

6 Conclusions

The primary objective of this chapter has been to consider the evolution of the twin deficits (or surpluses) concerning the government budget and the current account, and the (in)stability conditions that characterize this evolution. Moreover, the perfectly open economy considered here was also subject to explosive price-level adjustments caused by the joint working of the UIP and the PPP conditions. The private sector of this economy was therefore plagued by various types of instability, which can only be overcome in a very limited way by the jump-variable technique of the rational expectations school. We did not follow the JVT here, however, but designed appropriate, not always plausible, policy rules which could tame the explosive nature of the dynamics of the private sector under certain conditions. These policies concerned an antimonetarist type of money supply adjustment rule, a tax policy that was adjusting debt to a certain debt to GDP ratio, and a government expenditure policy that attempted to counteract the cumulative forces in the dynamics of the current account, or its mirror image the capital account. In this way, the perfectly open economy could be made a stable one, although the dynamics that results from all these additions may be difficult to analyze from an integrated point of view in the place of the partial ones adopted in the final section of this chapter. However, such an integration must be left for future research.

⁹ If one uses as expenditure rule the law of motion $\dot G = \beta_{g_1}(G_o - G) + \beta_{g_2}(Y - C - G - X_o)$, which focuses on the trade account instead of the current account and which assumes a given government target $X_o$ for exports $X$, one can determine the steady-state value of $W$ by $W_o = X_o / \bar r^*$, and on this basis a steady value for $G$ from $\dot W = 0$ by making use of the given steady-state value for $G - T$. We then have the anchors $\bar d$, $X_o$, in the place of $\bar d$, $\bar G$, for the long-run position of the economy.

Appendix A The Two-Country One-Commodity Case and Other Extensions of the Model

We return to consideration of the given fiscal and monetary policy parameters, but now consider data from the rest of the world as being endogenously determined in interaction with the dynamics that was so far considered. The situation in country 1, the domestic economy, is then described by

\[
\hat p = r - \rho = r_o + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha} - \rho \tag{A1}
\]

\[
\dot W_g^a = \rho W_g^a + \left( r_o + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha} \right) \frac{M}{p} + T - G \tag{A2}
\]

\[
\dot W = \rho W + Y - C\left( Y + \rho W - \left( r_o + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha} \right) \frac{M}{p} - T - \rho W_g^a,\; W - W_g^a,\; \rho \right) - G \tag{A3}
\]

since $r^* \neq r$ and $\rho = r - \hat p = \rho^* = r^* - \hat p^*$ are now endogenously determined variables. In the same way, it holds for the foreign economy that on the basis of the assumptions for perfectly open economies,

\[
\hat p^* = r^* - \rho = r_o + \frac{\ln p^* + \ln Y^* + \ln k^* - \ln M^*}{\alpha^*} - \rho \tag{A4}
\]

\[
\dot W_g^{a*} = \rho W_g^{a*} + \left( r_o + \frac{\ln p^* + \ln Y^* + \ln k^* - \ln M^*}{\alpha^*} \right) \frac{M^*}{p^*} + T^* - G^* \tag{A5}
\]

\[
-\dot W^* = \dot W = \rho W + Y - C\left( Y + \rho W - \left( r_o + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha} \right) \frac{M}{p} - T - \rho W_g^a,\; W - W_g^a,\; \rho \right) - G \tag{A6}
\]

The uniquely determined real rate of interest, the world real rate of interest $\rho$, is then to be determined as the "natural rate" from the world goods market equilibrium relationship

\[
\begin{aligned}
Y + Y^* ={} & C\left( Y + \rho W - \left( r_o + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha} \right) \frac{M}{p} - T - \rho W_g^a,\; W - W_g^a,\; \rho \right) + G \\
& + C^*\left( Y^* - \rho W - \left( r_o + \frac{\ln p^* + \ln Y^* + \ln k^* - \ln M^*}{\alpha^*} \right) \frac{M^*}{p^*} - T^* - \rho W_g^{a*},\; -W - W_g^{a*},\; \rho \right) + G^*
\end{aligned} \tag{A7}
\]

This has to be inserted into the five laws of motion above for $p$, $p^*$, $W_g^a$, $W_g^{a*}$, and $W$ in order to arrive at an autonomous system of five differential equations describing the dynamic interactions in our two-country world economy. We note that the rate $\rho$ depends, according to the world goods market equilibrium, on all five state variables of the dynamics considered. Finally, the dynamics of the nominal exchange rate is obtained as an appended law of motion, via the PPP, through $\hat e = \hat p - \hat p^*$. Due to the form of this law of motion, we see that it depends linearly on the laws of motion for $p$, $p^*$, which implies that zero-root hysteresis is present in this two-country model as far as the evolution of the nominal exchange rate is concerned. The model considered shows how the real world rate of interest (which was so far considered with given magnitude) is determined in a world of two perfectly open economies, and how, on this basis, the dynamics of goods prices can be obtained together with the laws of motion for aggregate government debts and the debt–creditor relationship between the two economies. The dynamics that results from these interacting two-country macrodynamics is, of course, already fairly involved. In our view, they should be solved on the basis of gradually adjusting price levels and of course gradually adjusting wealth expressions. If they are locally unstable, mechanisms in the private sector or fiscal and monetary policy reactions must again be found that limit the local explosiveness at least away from the steady state such that the world economy remains economically viable.
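The zero-root property of the exchange rate can be made explicit by integrating the appended law of motion. The short derivation below is a sketch that uses nothing beyond $\hat e = \hat p - \hat p^*$ and the definition of a growth rate.

```latex
\hat e = \hat p - \hat p^*
\;\Longrightarrow\;
\frac{d}{dt}\bigl(\ln e - \ln p + \ln p^*\bigr) = 0
\;\Longrightarrow\;
e(t) = e(0)\,\frac{p(t)/p(0)}{p^*(t)/p^*(0)} .
```

The level of $e$ is therefore tied to its initial value and to the whole history of the two price levels: even if $p$ and $p^*$ settle down, there is no force that pulls $e$ back to a particular level, which is what a zero root in the appended equation amounts to.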
Further extensions may concern the assumption of a two-commodity two-country world, the removal of the full employment condition and its replacement by a conventional type of Phillips curve, the existence of Keynesian demand restrictions on the world market for the domestic commodity, i.e., the assumption of an export demand function, the inclusion of investment behavior, imperfect asset substitution, and more.

Appendix B Notation

The following list of symbols contains only domestic variables and parameters. Magnitudes referring to the foreign country are defined analogously and are indicated by an asterisk (*). Superscript d characterizes demand expressions, while the corresponding supply expressions do not have any index (in order to save notation). A "dot" is used to characterize time-derivatives, and a "hat" for corresponding rates of growth. Furthermore, we use an index o to denote steady-state expressions. Finally, we characterize exogenous variables by means of a bar over the variable considered. The statically or dynamically endogenous variables of the model are listed below.

$p$        price level
$w$        wage level
$r$        nominal interest rate
$\rho$     real interest rate
$\xi$      risk premium
$e$        exchange rate
$Y$        output
$T$        lump sum taxes
$G$        government expenditure
$Y_p$      private disposable income
$Y^a_p$    aggregate government disposable income
$f(N)$     production function
$N$        labor supply
$C$        private consumption
$S_p$      private savings
$S$        total savings
$M$        money supply
$B$        domestic bonds
$F$        foreign bonds
$W_p$      real private wealth
$W_g^a$    aggregate real government wealth
$W$        foreign bonds (in real terms)



7. Corridor Stability of the Neoclassical Steady State

Akitaka Dohtani¹, Toshio Inaba², and Hiroshi Osaka¹

¹ Faculty of Economics, University of Toyama, Gofuku 3190, Toyama, Japan
² School of Education, Waseda University, Nishiwaseda 1-6-1, Shinjuku-ku, Tokyo, Japan

Summary. Combining the permanent income hypothesis with the capital accumulation equation, we construct a growth model which gives a modified version of the neoclassical growth model. The consumption at one time is determined by the expected permanent income at that time. The modified growth model possesses a unique steady state which is the same as that of the neoclassical growth model. Unlike the neoclassical growth model, however, the modified growth model yields corridor stability. That is, there exists a corridor of stability around the steady state such that any path inside the corridor converges to the steady state and any path outside the corridor diverges. Thus, the neoclassical view of the economy holds true inside the corridor, but does not hold outside the corridor, and hysteresis emerges near the corridor. Although the modified growth model generates fluctuating dynamics, it does explain the stably observed feature of a nearly proportional relation between consumption and income: any fluctuating path maintains such a relation. We also make clear the effects of the parameters on the aggregate and the stability of the per-capita steady state.

Key words. Solow–Swan growth model, Permanent income, Neoclassical steady state, Corridor stability, Hopf bifurcation

1 Introduction

After the pioneering growth model of the Keynesian type by Harrod (1939) and Domar (1946), Solow (1956) and Swan (1956) independently constructed the neoclassical aggregate model of growth (from now on, the S–S model). For the S–S model, see also Barro and Sala-i-Martin (1995). Harrod and Domar tried to prove the inherent instability of the capitalist economy, namely, the instability of the steady state of growth. Conversely, Solow and Swan proved the global stability of the steady state. The growth analyses of the Harrod–Domar type play little role in today's thinking because the instability is too strong. On the other hand, the S–S model has laid the foundation for the advance of growth theory. Some recent statistical investigations show that the S–S model can describe the actual economy relatively well. See, for example, Mankiw et al. (1992). However, is the capitalist economy globally stable in the manner that the S–S model describes? Leijonhufvud (1973) stated that there exist two types of economic view. The first one is the neoclassical type: the market equilibrium is globally stable. The second one is the Keynesian type: the market equilibrium is completely unstable. In addition to these views, Leijonhufvud offers a new type of economic view: corridor stability. That is, there exists a corridor of stability around the market equilibrium such that any path inside the corridor converges to equilibrium, and any path outside the corridor moves far away from equilibrium. Thus, the emergence of corridor stability implies the coexistence of neoclassical and Keynesian views. Such a coexistence yields hysteresis. From the viewpoint of economic policy, the emergence of corridor stability is important, because in such a case even a small shock may change the economic situation drastically. We can easily extend the notion of corridor stability on market equilibrium to that on the steady state of growth.
Consequently, the emergence of corridor stability implies that there exists a corridor around the steady state such that the neoclassical view holds true for the inside of the corridor, but does not do so for the outside of the corridor. This chapter tries to construct a growth model that generates a corridor of stability around the neoclassical steady state. In the sense that the markets in the growth model are always cleared, the growth model is an equilibrium model. The growth model combines the consumption function of the permanent income hypothesis (from now on, the PI hypothesis) with the capital accumulation equation. The steady state of the growth model is the same as that of the S–S model, and is locally asymptotically stable. Thus, the growth model gives a modified version of the S–S model, which is seemingly a commonplace model. Unlike the S–S model, however, the modified version is not globally stable, and generates a corridor of stability that consists of an unstable limit cycle. Therefore, in the modified version, the validity of the neoclassical steady state is limited to the inside of the corridor. This chapter is organized as follows. Section 2 constructs the growth model. Section 3 analyses the dynamics of the model. Section 4 shows that our model explains a nearly proportional relation between consumption and income. Section 5 considers the effects of parameters on the aggregate and stability of the steady state. Section 6 concludes the chapter. The appendix proves some results.

2 The Model

In this section, we construct a continuous-time growth model that possesses the same steady state as the S–S model. In the same way as the S–S model, the capital accumulation equation is given by

\[
\dot k = f(k) - (n + \delta) k - c
\]

where $k$ is the capital stock per capita, $c$ is per-capita consumption, $n$ is the growth rate of the population, $\delta$ is the depreciation rate, and $f$ is the production function. Throughout this chapter, all functions are assumed to be smooth. The smoothness assumption is necessary in order to use the Hopf bifurcation theorem that proves the emergence of Hopf cycles (namely, periodic paths). We assume a representative household. Unlike the S–S model, the consumption of the representative household is dynamically determined by a differential equation. Such an equation is derived by combining the adjustment equation of the expected per-capita permanent income (from now on, EPCP income) with the consumption function that depends on the EPCP income. Our consideration of expected permanent income is based on Friedman (1957, p. 143, Eq. 5.15). We assume that the EPCP income of the representative household is determined by the following distributed lag of $y$:

\[
y^p(t) = \int_{(-\infty, t]} \beta\, y(\tau) \exp\bigl(\beta(\tau - t)\bigr)\, d\tau \tag{1}
\]

where $y$ is income per capita and $y^p$ is EPCP income per capita. To transform Eq. 1 into a differential equation, we consider the time derivative of Eq. 1.

\[
\begin{aligned}
\dot y^p(t) &= \frac{d}{dt} \left\{ \exp(-\beta t) \int_{(-\infty, t]} \beta\, y(\tau) \exp(\beta \tau)\, d\tau \right\} \\
&= -\beta \exp(-\beta t) \int_{(-\infty, t]} \beta\, y(\tau) \exp(\beta \tau)\, d\tau + \exp(-\beta t)\, \beta\, y(t) \exp(\beta t) \\
&= \beta \{ y(t) - y^p(t) \}
\end{aligned}
\]

We call this equation the adjustment equation of EPCP income.
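The step from the distributed lag in Eq. 1 to the adjustment equation can be checked numerically. The sketch below approximates the integral in Eq. 1 by the trapezoidal rule (truncating the infinite lower limit) and compares a central finite-difference derivative of the result with $\beta\{y(t) - y^p(t)\}$. The income path `y`, the value of `beta`, and the function names are hypothetical choices made only for this illustration.

```python
import math

def y(t):
    """Hypothetical income path (an assumption for this illustration)."""
    return 1.0 + 0.1 * math.sin(t)

def epcp_income(t, beta, horizon=60.0, steps=60000):
    """Trapezoidal approximation of Eq. 1, with the lower limit
    truncated at t - horizon (the neglected tail is of order exp(-beta*horizon))."""
    h = horizon / steps
    total = 0.0
    for i in range(steps + 1):
        tau = t - horizon + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * beta * y(tau) * math.exp(beta * (tau - t))
    return total * h

beta = 0.5   # hypothetical adjustment speed
t = 2.0
eps = 1e-4

# Left-hand side: numerical time derivative of the distributed lag.
lhs = (epcp_income(t + eps, beta) - epcp_income(t - eps, beta)) / (2.0 * eps)
# Right-hand side: the adjustment equation beta * (y - y^p).
rhs = beta * (y(t) - epcp_income(t, beta))

print(lhs, rhs)
```

Both numbers agree up to discretization error, which is the content of the derivation: the exponentially weighted average of past income is the solution of a first-order partial adjustment equation.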


Remark 1: We consider a model economy in which the household expects that per-capita income does not grow. It can be expected that analogous results will be obtained in the case where per-capita income grows. In fact, Friedman (1957, p. 144, Eq. 5.17) treated this case by extending Eq. 1 to

\[
y^p(t) = \int_{(-\infty, t]} \beta\, y(\tau) \exp\bigl((\beta - \gamma)(\tau - t)\bigr)\, d\tau
\]

where $\gamma$ is the estimated growth rate of $y^p$. In the context of this chapter, the growth of per-capita income is caused by exogenous Harrod-neutral or labor-augmenting technological progress. However, for further generalizations along this line, we will need to analyze higher-dimensional and/or nonautonomous differential equations. Such generalizations are therefore left for future research. ■

We next assume the following consumption decision depending on the EPCP income:

\[
c = \alpha y^p
\]

We call this equation the consumption function.

Remark 2: In the PI hypothesis, consumption consists of permanent and transitory components. The permanent component of consumption is closely related to growth dynamics. On the other hand, the transitory component of consumption fluctuates stochastically in the PI hypothesis, and is therefore related to business fluctuations. Thus, through transitory consumption, the PI hypothesis can be linked to the stochastic business fluctuations. Since our concern here is the deterministic features in the long run, the permanent component rather than the transitory component plays an important role. So for simplicity, we will disregard the transitory consumption. ■

In the S–S model, the consumption function relates the consumption at any one time to the income at that time. However, our consumption function relates the EPCP income at the time to the consumption at the time. Since the EPCP income is dynamically determined, the consumption decision is inevitably dynamic, as follows. From the consumption function and the adjustment equation, we obtain

\[
\dot c = \alpha \beta (y - y^p)
\]

We call this equation the consumption equation. Combining the capital accumulation equation with the consumption equation, we have the following system:

\[
\begin{cases}
\dot k = f(k) - (n + \delta) k - c \\
\dot c = \beta \{ \alpha f(k) - c \}
\end{cases} \tag{2}
\]

We call a solution of System 2 a path, and the equilibrium point of System 2 the steady state, which is equivalent to the neoclassical steady state. Although System 2 is seemingly a commonplace model, in the next section we prove that System 2 can generate a novel dynamic phenomenon around the steady state.
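To make System 2 concrete, the following sketch computes its steady state and integrates one path for the power production function $f(k) = s k^m$ that the chapter uses as an example. It is a minimal sketch under stated assumptions: all parameter values, the Runge-Kutta integrator, and the function names are hypothetical choices for illustration, and the initial point is placed very close to the steady state so that the path lies well inside the region where the linearization is a good guide.

```python
import math

# Hypothetical parameter values (assumptions, not taken from the chapter).
S = 1.0             # scale of the production function f(k) = S * k**M
M = 0.5             # exponent of the production function
ALPHA = 0.8         # propensity to consume out of EPCP income
N_PLUS_DELTA = 0.1  # population growth rate plus depreciation rate
BETA = 0.3          # adjustment speed of EPCP income

def f(k):
    return S * k ** M

def system2(k, c):
    """Right-hand side of System 2."""
    dk = f(k) - N_PLUS_DELTA * k - c
    dc = BETA * (ALPHA * f(k) - c)
    return dk, dc

def steady_state():
    """Solve (1 - ALPHA) * f(k) = N_PLUS_DELTA * k and set c = ALPHA * f(k)."""
    k_star = (S * (1.0 - ALPHA) / N_PLUS_DELTA) ** (1.0 / (1.0 - M))
    return k_star, ALPHA * f(k_star)

def rk4_path(k, c, t_end, h=0.01):
    """Classical Runge-Kutta integration; returns the list of visited points."""
    path = [(k, c)]
    for _ in range(int(round(t_end / h))):
        a = system2(k, c)
        b = system2(k + 0.5 * h * a[0], c + 0.5 * h * a[1])
        d = system2(k + 0.5 * h * b[0], c + 0.5 * h * b[1])
        e = system2(k + h * d[0], c + h * d[1])
        k += h * (a[0] + 2.0 * b[0] + 2.0 * d[0] + e[0]) / 6.0
        c += h * (a[1] + 2.0 * b[1] + 2.0 * d[1] + e[1]) / 6.0
        path.append((k, c))
    return path

k_star, c_star = steady_state()
path = rk4_path(k_star + 0.01, c_star, t_end=100.0)
k_end, c_end = path[-1]
print(k_star, c_star)
print(math.hypot(k_end - k_star, c_end - c_star))
```

With these values the steady state is approximately $(k^*, c^*) = (4, 1.6)$ and the path spirals back toward it. Starting points far from the steady state are deliberately not explored here, because their fate depends on the global structure of the system that the next section analyzes.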

3 Growth Dynamics of the Model

To analyze the dynamic behavior of System 2, we employ the following standard assumptions on the production function:

Assumption 1: $\lim_{k \to 0} f'(k) = \infty$, $\lim_{k \to \infty} f'(k) = 0$. ■

Assumption 2: $f'(k) > 0$ and $f''(k) < 0$ for any $k > 0$. ■

We start with the next result on the existence and uniqueness of the steady state.

Theorem 1: Suppose Assumptions 1 and 2 are satisfied. Then System 2 has a unique steady state, say $(k^*, c^*)$, in the positive quadrant. ■

Proof: See Appendix. ■

For simplicity, we denote the elasticity of the production function by

\[
e(k) = k f'(k) / f(k)
\]

Moreover, to use the Hopf bifurcation theorem, we introduce the following condition on the elasticity:

Assumption 3: $1 - \alpha < e(k^*) < 1$. ■

For example, let $f(k) = s k^m$. Then Assumption 3 implies $1 - \alpha < m < 1$. Thus, Assumption 3 is plausible. Before describing the phase diagram of System 2, we make sure of the following lemma.

Lemma 1: Under Assumption 3, the slope of the characteristic curves of $\dot k = 0$ and $\dot c = 0$ is positive in the neighborhood of the steady state. ■

Proof: See Appendix. ■

Noting Lemma 1, the stylized phase diagram of System 2 is indicated in Fig. 1. We will now prove the emergence of a Hopf bifurcation in System 2. As a bifurcation parameter, we choose $\beta$. We start with the next lemma.

Fig. 1. Stylized phase diagram of System 2 in the neighborhood of the steady state

Lemma 2: Suppose Assumptions 1–3 are all satisfied. Define

\[
\beta^{\#} = (n + \delta) \{ e(k^*) + \alpha - 1 \} / (1 - \alpha)
\]

Then, $\beta^{\#} > 0$, and any eigenvalue of the Jacobian matrix evaluated at $\beta = \beta^{\#}$ is purely imaginary. ■

Proof: It follows directly from Assumption 3 that $\beta^{\#} > 0$. For the remainder of the proof, see Appendix. ■

The emergence of a Hopf bifurcation can be obtained immediately from this lemma. For the Hopf bifurcation, see Guckenheimer and Holmes (1983, Sect. 3.4).

Theorem 2: Suppose Assumptions 1–3 are all satisfied. If $\beta > \beta^{\#}$ (or $\beta < \beta^{\#}$), then the steady state of System 2 is stable (or unstable). Moreover, System 2 generates a Hopf bifurcation at the value $\beta = \beta^{\#}$. ■

Proof: See Appendix. ■
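Lemma 2 and Theorem 2 can be illustrated with a small eigenvalue computation. The Jacobian used below is our own linearization of System 2 at the steady state, with rows $(f'(k^*) - (n + \delta), -1)$ and $(\beta \alpha f'(k^*), -\beta)$; it is not quoted from the Appendix. The sketch uses the steady-state relation $f(k^*)/k^* = (n + \delta)/(1 - \alpha)$ to express $f'(k^*)$ through the elasticity $e(k^*)$. Parameter values and function names are hypothetical.

```python
import cmath

def hopf_value(e_star, alpha, n, delta):
    """Bifurcation value beta# = (n + delta) * (e(k*) + alpha - 1) / (1 - alpha)."""
    return (n + delta) * (e_star + alpha - 1.0) / (1.0 - alpha)

def jacobian_eigenvalues(e_star, alpha, n, delta, beta):
    """Eigenvalues of the linearization of System 2 at the steady state."""
    # Steady state: f(k*)/k* = (n + delta)/(1 - alpha), hence
    # f'(k*) = e(k*) * (n + delta) / (1 - alpha).
    f_prime = e_star * (n + delta) / (1.0 - alpha)
    trace = f_prime - (n + delta) - beta
    det = beta * ((n + delta) - (1.0 - alpha) * f_prime)
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0

# Hypothetical values satisfying Assumption 3: 1 - alpha < e(k*) < 1.
E_STAR, ALPHA, N, DELTA = 0.5, 0.8, 0.02, 0.08

beta_sharp = hopf_value(E_STAR, ALPHA, N, DELTA)
at_bifurcation = jacobian_eigenvalues(E_STAR, ALPHA, N, DELTA, beta_sharp)
above = jacobian_eigenvalues(E_STAR, ALPHA, N, DELTA, beta_sharp + 0.05)
below = jacobian_eigenvalues(E_STAR, ALPHA, N, DELTA, beta_sharp - 0.05)

print(beta_sharp)
print(at_bifurcation)
print(above[0].real, below[0].real)
```

At $\beta = \beta^{\#}$ the trace vanishes while the determinant $\beta (n + \delta)(1 - e(k^*))$ stays positive, so the eigenvalues sit on the imaginary axis; the real part is negative above $\beta^{\#}$ and positive below it.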

We next consider the stability of the Hopf cycle of Theorem 2. When a stable cycle emerges, the cycle attracts any path with the initial point near the cycle. On the other hand, when an unstable cycle emerges, any path inside the unstable cycle converges to the steady state, and any path outside the unstable cycle moves away from the cycle and, therefore, from the steady state (Fig. 2). Benhabib and Miyao (1981) stated that such a situation may be related to the notion of corridor stability by Leijonhufvud (1973). (See also Gabisch and Lorenz (1987, Sect. 6.1), Owase (1987, Sect. 20), Rosser (1991, Sect. 6.1), Owase (1991), and Lorenz (1993, Sect. 3.2).) Our concern here is what condition guarantees an unstable Hopf cycle in System 2. One remark should be made here. System 2 satisfies the capital accumulation equation, so that aggregate demand equals aggregate supply. In this sense, the instability outside the corridor of System 2 is not related to the instability of market equilibrium, but to the instability of the steady state. As stated in the Introduction, Leijonhufvud's notion of corridor stability is closely related to market equilibrium, and therefore is different from ours. However, his notion can easily be modified to the notion concerning the steady state. In general, it is not easy to determine the stability of a Hopf cycle. The reason is that the condition guaranteeing the stability of a Hopf cycle depends on up to third-order derivatives of a vector field. In many cases, such a condition is too complicated to allow any economic meaning. We can, however, see that the Hopf cycle in Theorem 2 is unstable under a standard production function and standard assumptions. In fact, we have:

Fig. 2. Corridor stability

Theorem 3: Suppose Assumptions 1–3 are all satisfied. In addition, we assume that $f'''(k^*) > 0$. Then there is a set included in $\{\beta : \beta > \beta^{\#}\}$, say $\Omega$, such that for any $\beta \in \Omega$ System 2 yields an unstable Hopf cycle. ■

Proof: See Appendix. ■

Example 1: As the typical production functions that satisfy $f'''(k^*) > 0$, we have $f(k) = s k^m$ and $f(k) = s \log(k + 1)$ where $0 < s$ and $0 < m < 1$. ■

Theorem 3 is important in the sense that it proves the emergence of corridor stability. In our model, the corridor of stability is the Hopf cycle, so that any path inside the unstable Hopf cycle converges to the steady state, and any path outside the unstable Hopf cycle diverges. Accordingly, a sort of hysteresis emerges near the unstable Hopf cycle, that is, a slight difference in starting points yields a drastic historical difference. Theorem 3 shows that, under a standard production function and standard assumptions, such an interesting phenomenon can emerge in a modified version of the S–S model that is seemingly commonplace.

Remark 3: There exist various notions concerning the idea of corridor stability. The notion of the corridor of stability mathematically implies a sort of basin boundary (see Robinson 1995, Sect. 5.5), and the inside of the corridor is a basin of attraction. Since the corridor of stability in our model consists of an unstable Hopf cycle, the corridor is also called a stability threshold (see Medio and Lines 2001, Sect. 5.3). Moreover, hysteresis yielded by corridor stability is closely related to the notion of the bifurcation of behavior in the biological context (see Smale and Hirsch 1974, Sect. 12.3). Any of these terminologies, from various viewpoints, grasps the important features of dynamics concerning corridor stability. ■

In the case where hysteresis emerges, the policy that intends to cause small economic changes may yield drastic economic changes, that is, may lead us to outcomes that are contrary to the intention of policy makers. Thus, in the case where corridor stability emerges, the policy makers need to be well prepared for unintended cases. The consumption equation constructed above can be extended slightly. Below we discuss it briefly. Consider the smooth function Q(x) that satisfies

Q(0) = 0

(3)

Q′(x) > 0 for any x > 0

(4)

Q(x) is linear in a neighborhood of x = 0

(5)

Moreover, we consider the following adjustment equation of EPCP income:

\[
\dot y^p = Q(y - y^p) \tag{6}
\]

From Eq. 6 and $c = \alpha y^p$, we obtain

\[
\dot c = \alpha Q\bigl(f(k) - c/\alpha\bigr) = \alpha Q\bigl((\alpha f(k) - c)/\alpha\bigr) \equiv F(\alpha f(k) - c) \tag{7}
\]

Equation 7 gives a generalization of the consumption equation above. Moreover, the local linearity condition Eq. 5 shows that the F-function is also linear in a neighborhood of $x = 0$. Then the same argument is true of the more general growth model with the generalized consumption Eq. 7. The local linearity condition Eq. 5 yields $F''(0) = 0$, $F'''(0) = 0$. Moreover we have

\[
F'(0) = Q'(0) = \beta
\]

We can now easily see that the same results are true of the general growth model because, as for the information on the F-function, only the value of $\beta = F'(0)$ is required to calculate the stability index concerning System 2. It should be noted here that if the local linearity condition Eq. 5 is taken away, complicated conditions on derivatives of the Q-function or the F-function will be necessary to determine the sign of the stability index.

Remark 4: In general, it is not always clear what type of distributed lag equation corresponds to Eq. 6, but if the Q-function is piecewise-linear, Eq. 6 with the Q-function follows from a simple switch corresponding to the values of $(y - y^p)$. ■

The above generalization about the consumption equation possesses an important economic implication. We explain it briefly here. Since the bifurcation value is given by $\beta^{\#} = (n + \delta)\{e(k^*) + \alpha - 1\}/(1 - \alpha)$ and $n + \delta$ is very small, the bifurcation value may be too small. Therefore, the argument in such a parameter domain does not seem to be plausible. However, even if the derivative $Q'(x)$ is not small for large values of $|x|$, the value of $\beta = Q'(0)$, which is closely related to the emergence of Hopf bifurcation, can be sufficiently small. The reason is that in the case where the difference between actual income and EPCP income is sufficiently small, the sufficiently small adjustment coefficient can be plausible. Thus, such a problem on the bifurcation value $\beta^{\#}$ can easily be improved by incorporating the nonlinear adjustment function of EPCP above into our growth model. Allowing for this point, we assumed, for simplicity, the linear adjustment function. Finally we make one more remark. We paid attention to the possible emergence of an unstable Hopf cycle. If Condition (5) is not satisfied (that is, the Q-function is not locally linear), then there exists a possibility that the Hopf cycle of Theorem 2 is stable. From an economic viewpoint, however, it will be difficult to derive a significant condition for the emergence of a stable Hopf cycle. Moreover, it seems that the plausible parameter domain in which stable Hopf cycles emerge is extremely narrow. However, we do not discuss that here.
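The point that $\beta = Q'(0)$ can be small while the adjustment is fast for large income gaps can be illustrated with a piecewise-linear adjustment function of the kind mentioned in Remark 4. The sketch below is purely illustrative: the particular shape of `Q`, the threshold `x_bar`, the two slopes, and the function names are hypothetical, and a piecewise-linear function only satisfies the smoothness assumption locally around zero.

```python
def make_Q(beta_small, beta_large, x_bar):
    """Piecewise-linear adjustment function with Q(0) = 0 and Q' > 0:
    slope beta_small on [-x_bar, x_bar], slope beta_large outside."""
    def Q(x):
        if abs(x) <= x_bar:
            return beta_small * x
        sign = 1.0 if x > 0 else -1.0
        return sign * (beta_small * x_bar + beta_large * (abs(x) - x_bar))
    return Q

def make_F(Q, alpha):
    """F(z) = alpha * Q(z / alpha), so that dc/dt = F(alpha * f(k) - c)."""
    return lambda z: alpha * Q(z / alpha)

ALPHA = 0.8  # hypothetical propensity to consume
Q = make_Q(beta_small=0.05, beta_large=1.0, x_bar=0.1)
F = make_F(Q, ALPHA)

h = 1e-6
slope_at_zero = (F(h) - F(-h)) / (2.0 * h)          # approximates F'(0)
slope_far = (F(1.0 + h) - F(1.0 - h)) / (2.0 * h)   # slope for a large gap

print(slope_at_zero, slope_far)
```

Only the slope at zero enters the local stability analysis, so a household that reacts sluggishly to small income gaps and quickly to large ones is compatible with a small value of $\beta$.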

4 Nearly Constant Propensity to Consume

By assuming the completely proportional relation between consumption and income, Solow (1956) and Swan (1956) succeeded in constructing a growth model in which any path monotonically converges to the steady state. Such a proportional relation between consumption and income (not permanent income) is stably observed in long-term time-series. For this point, see Kuznets (1942) and also Romer (2001, Sect. 7.1). On the other hand, we assumed $c = \alpha y^p$, i.e., a proportional relation between consumption and permanent income (not income). Since $c = \alpha y^p \neq \alpha y$ for any path that is not the steady state, the dynamic behavior yielded under such an assumption does not guarantee a completely proportional relation between consumption and income. Moreover, System 2 possesses a corridor that consists of an unstable Hopf cycle. Paths inside the cycle fluctuate around the steady state while converging slowly to the steady state. Can our modified growth model with such fluctuating dynamics explain the important observation by Kuznets? Below, we make clear that a nearly constant propensity to consume, i.e., a nearly proportional relation between consumption and income, is compatible with cyclical but convergent fluctuations around the steady state. That is, any fluctuating path maintains a nearly constant propensity to consume. Although we described Fig. 1 as being an easy-to-see phase diagram, a more exact and typical phase diagram is shown in Fig. 3. As Fig. 3 shows, typical paths that are not the steady state are fluctuating around the steady state. Such typical fluctuating paths cling to the graph $c = \alpha f(k)$. Hence, for any $t > 0$, $c(t)/y(t) \approx \alpha f(k(t))/y(t) = \alpha$ on the fluctuating paths, so that $\alpha \approx c(t)/y(t) = C(t)/Y(t)$, where $Y$ is income and $C$ is consumption (neither per capita). This shows that although typical paths of System 2 fluctuate, a nearly proportional relation between consumption and income is stably observed on these paths. Figure 4 plots $C(t)/Y(t)$. Thus, System 2 explains why a nearly proportional relation is compatible with fluctuations around the steady state.
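The size of the deviation of the propensity to consume from $\alpha$ can be read off directly from the consumption equation of System 2. The following short derivation is a sketch that uses only $\dot c = \beta\{\alpha f(k) - c\}$ and $y = f(k)$.

```latex
\dot c = \beta \{ \alpha f(k) - c \}
\;\Longrightarrow\;
\alpha y - c = \frac{\dot c}{\beta}
\;\Longrightarrow\;
\frac{c(t)}{y(t)} = \alpha - \frac{\dot c(t)}{\beta\, y(t)} .
```

The propensity to consume therefore differs from $\alpha$ only by the term $\dot c / (\beta y)$, which is small whenever consumption changes slowly relative to $\beta y$; this is one way to express the statement that fluctuating paths cling to the graph $c = \alpha f(k)$.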

Fig. 3. Phase diagram of System 2 in the neighborhood of the steady state

Stability of the Neoclassical Steady State


Fig. 4. A nearly constant propensity to consume

5 Effects of Parameters on Aggregate and Stability

In this section, to clarify our argument, we consider a production function of the form $f(k) = s k^m$ ($0 < s$, $0 < m < 1$). As in the S–S model, the growth rate of the steady state in System 2 equals the growth rate of labor. Therefore, the growth rate of the steady state does not depend on any parameter of System 2 other than n, but the aggregate of the steady state does depend on the other parameters. Moreover, we will see that the stability of the steady state depends on the parameters of System 2. First, we consider the effects of the parameters on the aggregate of the steady state. We can easily see that the capital stock of the steady state is given by

$$K = kL = L_0 \exp(nt)\,\{s(1 - a)/(n + d)\}^{1/(1-m)}$$

where K is capital stock (not per capita) and L is population. Thus we obtain the same results as the S–S model, that is, an increase in s or m leads to an increase in the aggregate of the steady state. However, an increase in a or n leads to a decrease in the aggregate of the steady state. Next we consider the effects on the stability of the steady state. Denote the Jacobian matrix of System 2 by J, and the trace of J by TrJ. We here consider the case where the eigenvalues of J are complex conjugates with negative real parts. In this case, any path near the steady state of System 2 approaches the steady state exponentially. For the notion of an exponential approach, see Smale and Hirsch (1974, Sect. 7.1). Under the assumption of the conjugation and the exponential approach, the convergence speed depends on the magnitude of TrJ, which is given by

$$\mathrm{Tr}J = (n + d)(m + a - 1)/(1 - a) - b$$


See the proof of Lemma 2 in the Appendix. Any change in s has no effect on the stability. An increase in b makes the steady state more stable, but an increase in m or n makes it more unstable. Since $d\,\mathrm{Tr}J/da = m(n + d)/(1 - a)^2 > 0$, an increase in a also makes the steady state more unstable.
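The comparative statics above can be checked numerically. The following is a minimal sketch, assuming the production function $f(k) = s k^m$ of this section; the function names and the parameter values are hypothetical and serve only as an illustration, not as the authors' computation.

```python
def trace_jacobian(a, b, m, n, d):
    """Trace of the Jacobian at the steady state for f(k) = s * k**m.

    TrJ = (n + d)(m + a - 1)/(1 - a) - b; note that s does not appear.
    """
    return (n + d) * (m + a - 1.0) / (1.0 - a) - b


def steady_state_capital(s, a, m, n, d):
    """Per capita capital k solving (1 - a) * s * k**m = (n + d) * k."""
    return (s * (1.0 - a) / (n + d)) ** (1.0 / (1.0 - m))


def dtrace_da(a, m, n, d):
    """Closed-form derivative dTrJ/da = m(n + d)/(1 - a)**2."""
    return m * (n + d) / (1.0 - a) ** 2


# Hypothetical parameter values, chosen only for illustration.
a, b, m, n, d = 0.8, 0.05, 0.3, 0.01, 0.05
h = 1e-6
# Central finite difference of TrJ with respect to a.
numeric_slope = (trace_jacobian(a + h, b, m, n, d)
                 - trace_jacobian(a - h, b, m, n, d)) / (2.0 * h)
k_star = steady_state_capital(1.0, a, m, n, d)
print(trace_jacobian(a, b, m, n, d), numeric_slope, dtrace_da(a, m, n, d), k_star)
```

Under these illustrative values the finite-difference slope agrees with the closed form, and raising b lowers the trace one for one, which is the sense in which b stabilizes the steady state.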

6 Conclusions and Final Remarks

In this chapter, combining the permanent income hypothesis with the capital accumulation equation, we constructed a growth model that gives a modified version of the well-known neoclassical growth model (S–S model). The modified version has a unique steady state, which is the same as that of the S–S growth model. Using the Hopf bifurcation theorem, we showed that under standard production functions and standard conditions, the modified version can generate unstable periodic paths around the steady state. In other words, the model can generate corridor stability, that is, there exists a neighborhood of the steady state such that the neoclassical view of the economy collapses outside the neighborhood. This also implies that hysteresis occurs near the boundary of the neighborhood, that is, a slight difference in starting points can yield a drastic historical difference. Thus, we see that in the case where corridor stability emerges, even an economic policy intended to result in small economic changes may cause drastic economic changes. The S–S model assumed a completely proportional relation between consumption and income. Consequently, the S–S model is globally stable. The actual economy, however, yields small fluctuations around the steady state while maintaining a nearly proportional relation. The modified version explains why a nearly proportional relation is compatible with such fluctuations around the steady state. By almost the same argument as in the S–S model, we can easily see that the growth rate of the steady state equals the growth rate of the population, so that the growth rate does not depend on parameters other than the growth rate of the population. On the other hand, for the effects of parameters on the aggregate of the steady state, we obtained the same results as the S–S model. Moreover, for the effects of parameters on the stability of the steady state, we proved that any change in s has no effect on the stability.
An increase in b makes the steady state more stable, but an increase in a, m, or n makes it more unstable. Our results on the emergence of a Hopf cycle were obtained by the so-called local bifurcation analysis. For the local bifurcation analysis, see, for example, Guckenheimer and Holmes (1983, chap. 3). We cannot say anything


about whether or not a similar result on the emergence of corridor stability can be obtained by global analysis or phase analysis. Other mathematical or geometric methods will be required to attack such a problem. Global analysis of our growth model is left for future research.

Acknowledgment. We are grateful to participants in the Rokko Forum (May 9, 2004) at Kobe University and in the Chuo Meeting on Economics of Time and Space 2005 (August 29–30, 2005) at Chuo University for helpful and valuable comments. Particular thanks go to T. Asada, T. Hagiwara, T. Haruyama, T. Nakamura, and T. Nakatani. Remaining errors are ours.

Appendix

The appendix gives the proofs of Lemmas 1 and 2 and Theorems 1–3.

Proof of Theorem 1: From Eq. 1, we see that any steady state of System 2 is given as a solution of the equation c = af(k) = f(k) − (n + d)k. Now define

$$g(k) = (1 - a)f(k) - (n + d)k$$

If System 2 possesses two or more steady states in the positive quadrant, then there are two or more positive points that satisfy g(k) = 0. Since g(0) = 0, it follows from the mean value theorem that there are at least two positive numbers that satisfy g′(k) = 0. Let k1 and k2 be such points. Then

$$g'(k_1) = g'(k_2) = 0 \tag{A1}$$

Without loss of generality, we assume that k1 < k2. Assumption 2 yields

$$g''(k) = (1 - a)f''(k) < 0 \quad \text{for any } k > 0$$

Therefore, the function g′(k) is strictly decreasing, so that g′(k1) > g′(k2). This contradicts Eq. A1. The contradiction proves Theorem 1. ∎

Proof of Lemma 1: From the definition of k and Eq. 7, we have (1 − a)f(k) = (n + d)k, so that for any k > 0,

$$f'(k) = \{k f'(k)/f(k)\}\, f(k)/k = e(k)(n + d)/(1 - a) \tag{A2}$$

Therefore we have

$$f'(k) - (n + d) = e(k)(n + d)/(1 - a) - (n + d) = \{e(k) - (1 - a)\}(n + d)/(1 - a) > 0 \tag{A3}$$

Equation A3 and Assumption 3 show that the slope of the characteristic curve of $\dot{k} = 0$ is positive in the neighborhood of the steady state. On the other hand, the slope of the characteristic curve of $\dot{c} = 0$ is positive in the domain of k > 0. This completes the proof. ∎

Proof of Lemma 2: The former part of Lemma 2 is a direct consequence of Assumption 3. So we now prove the latter part. Denote the determinant of • by Det •, and the Jacobian matrix of System 2 evaluated at the steady state by J(b). Then, Assumption 3 and Eq. A2 yield

$$\mathrm{Tr}J(b) = f'(k^*) - (n + d) - b = (n + d)e(k^*)/(1 - a) - (n + d) - b = (n + d)\{e(k^*) + a - 1\}/(1 - a) - b \tag{A4}$$

$$\mathrm{Det}J(b) = -b\{(1 - a)f'(k^*) - (n + d)\} = -b(n + d)\{e(k^*) - 1\} > 0 \tag{A5}$$

Since Eq. A4 yields $\mathrm{Tr}J(b^{\#}) = 0$, the proof follows directly from Eq. A5. ∎

Proof of Theorem 2: Denote the real part of the eigenvalues of J(b) by R(b). Since in Lemma 2 we showed that the eigenvalues of $J(b^{\#})$ are purely imaginary, the eigenvalues of J(b) are complex conjugates in a neighborhood of $b^{\#}$. Hence, Eq. A4 yields that in the neighborhood of $b^{\#}$

$$R(b) = \tfrac{1}{2}\mathrm{Tr}J(b) = \tfrac{1}{2}\left[(n + d)\{e(k^*) + a - 1\}/(1 - a) - b\right]$$

Therefore, $R(b^{\#}) = 0$ and $R'(b^{\#}) = -1/2 < 0$. The proof follows directly from the Hopf bifurcation theorem. ∎

Proof of Theorem 3: In Theorem 2, we proved the existence of a Hopf cycle. Theorem 3 is now proved by calculating the stability index. For the stability index, see Guckenheimer and Holmes (1983, pp. 151–153). Define

$$F(c, k) = b\{a f(k) - c\}, \qquad G(c, k) = f(k) - (n + d)k - c$$

We see that

$$F_{cc} = F_{ck} = F_{ccc} = F_{cck} = F_{ckk} = G_{ck} = G_{cc} = G_{cck} = G_{ccc} = 0$$

$$F_{kk} = ab f''(k^*), \quad F_{kkk} = ab f'''(k^*), \quad G_{kk} = f''(k^*), \quad G_{kkk} = f'''(k^*)$$

Using these equations, the stability index evaluated at the steady state is calculated as follows:

$$r = (F_{ccc} + F_{ckk} + G_{cck} + G_{kkk})/16 + \{F_{ck}(F_{cc} + F_{kk}) - G_{ck}(G_{cc} + G_{kk}) - F_{cc}G_{cc} + F_{kk}G_{kk}\}/(16x)$$

where x > 0. For the parameter x, see Guckenheimer and Holmes (1983, pp. 151–153). The definition of x is not, however, required for the proof of Theorem 3. The assumption that f‴(k*) > 0 yields

$$r = G_{kkk}/16 + F_{kk}G_{kk}/(16x) = f'''(k^*)/16 + ab\{f''(k^*)\}^2/(16x) > 0$$

Thus, the stability index is positive. This implies that the Hopf cycle in Theorem 2 is unstable. For this point, see Guckenheimer and Holmes (1983, pp. 151–153). ∎


References

Barro RJ, Sala-i-Martin X (1995) Economic growth. McGraw-Hill, New York
Benhabib J, Miyao T (1981) Some new results on the dynamics of the generalized Tobin model. Int Econ Rev 22:589–596
Domar ED (1946) Capital expansion, rate of growth, and employment. Econometrica 14:137–147
Friedman M (1957) A theory of the consumption function. Princeton University Press, Princeton
Gabisch G, Lorenz HW (1987) Business cycle theory: survey of methods and concepts. Springer, Berlin–Heidelberg–New York
Guckenheimer J, Holmes P (1983) Nonlinear oscillations, dynamical systems, and bifurcations of vector fields. Springer, New York–Berlin–Heidelberg
Harrod RF (1939) An essay in dynamic theory. Econ J 49:14–33
Kuznets S (1942) Use of national income in peace and war. Occasional Paper No. 6, NBER, New York
Leijonhufvud A (1973) Effective demand failures. Swed J Econ 75:27–48
Lorenz HW (1993) Nonlinear dynamical economics and chaotic motion. Springer, Berlin–Heidelberg–New York
Mankiw NG, Romer D, Weil DN (1992) A contribution to the empirics of economic growth. Q J Econ 107:407–437
Medio A, Lines M (2001) Nonlinear dynamics: a primer. Cambridge University Press, Cambridge
Owase T (1987) Dynamical system theory in economics. Zeimukeirikyoukai, Tokyo
Owase T (1991) Nonlinear dynamical systems and economic fluctuations: a brief historical survey. IEICE Trans E74:1393–1400
Robinson C (1995) Dynamical systems: stability, symbolic dynamics, and chaos. CRC Press, Boca Raton
Romer D (2001) Advanced macroeconomics, 2nd ed. McGraw-Hill, New York
Rosser JB Jr (1991) From catastrophe to chaos: a general theory of economic discontinuities. Kluwer, Boston/Dordrecht/London
Smale S, Hirsch MW (1974) Differential equations, dynamical systems, and linear algebra. Academic Press, New York
Solow RM (1956) A contribution to the theory of economic growth. Q J Econ 70:65–94
Swan TW (1956) Economic growth and capital accumulation. Econ Rec 32:334–361

8. Time-Delayed Dynamic Model of Renewable Resource and Population

Akio Matsumoto¹, Mami Suzuki², and Yasuhisa Saito³

¹ Department of Economics, Chuo University, Tokyo, Japan
² Department of Management, Aichi Gakusen University, Aichi, Japan
³ Department of Systems Engineering, Shizuoka University, Shizuoka, Japan

Summary. This chapter reconsiders the “Ricardo–Malthus” dynamic model of population and renewable resource developed by Brander and Taylor. To this end, it sheds light on a delay in production, which is a key feature of long-run evolution in a preindustrial economic society. It is clear that the delay has a considerable effect on the long-run dynamics of natural resources and population. This notwithstanding, much previous work relates to a case in which production is instantaneous. As such, the purpose of this chapter is to investigate the effect caused by the delay in production. Our analysis shows that there is a critical value of the delay with which a delayed version of an otherwise stable system becomes unstable. In addition, it shows numerically that the critical value is negatively related to the size of the carrying capacity and the exogenous net birth rate.

Key words. Delayed differential equation, Easter Island, Economic dynamics model, Environmental degradation, Population growth

1 Introduction

This study reconstructs an economic dynamic model for a small island based on the Easter Island study by Brander and Taylor (1998) (BT henceforth). Introducing a delay in production into the BT model, we deal with analytical as well as numerical representations of the evolution of a small island economy, evolution which is characterized by a two-fold process of transformation: environmental degradation and population growth. Its main purpose

is to investigate the effects caused by the delay in production on an adjustment process in population and the stock of natural resources. Archeological studies have suggested that many Paciﬁc islands followed similar evolutionary patterns of natural resources and population dynamics, that is, rapid population growth, resource degradation, economic decline, and then population collapse. BT reconsider the archeological and anthropological evidence from Easter Island as an example from an economic point of view. In particular, BT present a general equilibrium model of renewable resources and population dynamics, and seek to explain the rise and fall of Easter Island for 1400 years between the 4th century and the middle of the 18th century. They bring to light the economic conditions under which a small island economy can survive or perish. Their ﬁndings indicate that an economic model linking resources and population dynamics may explain not only the sources of past historical evolution discovered in these small islands, but also the possibility of sustainable growth for our world economy, in which a rapidly increasing population and a rapidly degrading environment become serious problems. The analysis in the BT model has been extended in various directions. Dalton and Coats (2000) examine the impact of market institutions and different property-rights structures. Reuveny and Decker (2000) consider numerically how technological progress and population management reform affected the long-run dynamics of Easter Island. Matsumoto (2002) reconstructs the BT continuous model in discrete steps, and shows that the modiﬁed model can generate various dynamics ranging from simple dynamics to complex dynamics involving chaos. In the existing literature, however, not much has been revealed with respect to the “history” of Easter Island. 
It has been believed that a small group of Polynesians arrived on the island around AD 400, deforestation occurred around AD 1000, most of the statues were carved during AD 1000–1400, and so forth.¹ Based on this “conventional wisdom,” BT as well as other researchers attempt to reproduce a dynamic pattern of natural resources and population. However, the “wisdom” is still one of several possible hypotheses and is not yet fully confirmed. In particular, according to Intoh (2000), recent reconsiderations of archeological evidence on the island imply indeterminacy in the arrival time of the Polynesians. It can only be an estimate that a small group settled between AD 410 and 1270. This new finding is inconsistent with the traditional wisdom. In other words, we may have a different history (as well as a different evolutionary pattern) of Easter Island, even though the available historical evidence is the same. It is thus imperative to construct a model for small islands that can generate various patterns of dynamics in order to deal with such ambiguous characteristics of archeological evidence.

¹ See Sect. 1 of Brander and Taylor (1998) for more details.


In this study, we extend the BT model to include a delay or lag in production for the following reasons. Agricultural production may play an essential part in economic activity in a preindustrial economic society. It is well known that an important characteristic of such agricultural production is the significant time lag between the time at which producers make their decisions to plant seeds in the fields and the time when they actually gather crops from the fields. It is thus natural to raise the question: how did the delayed production affect the evolutionary pattern of a small island economy? This study is organized as follows. Section 2 constructs a simple economic dynamics model of a small island based on BT’s Easter Island analysis. Section 3 examines, analytically as well as numerically, the effects caused by a delay in production on the evolution of natural resources and the population. Section 4 provides a summary and concluding remarks.

2 Basic Model for Small Islands

Because our analysis is based on BT’s dynamic model of renewable resources and population, we recapitulate the basic part of their model in this section (see BT’s paper for more details). The model describes the dynamics of an economy with two types of goods and three types of economic agent (two producers and one consumer). The harvest of the renewable resource is called the agricultural good, and some other good is called the manufactured good. The model functions as follows. At time t, the stock of natural resources S(t) and the size of the population L(t) are given.² Producers determine their demands for labor and supplies of goods so as to maximize their profits. A manufacturing producer supplies the manufactured good, produced with constant returns to scale using only labor. Since, by choice of units, one unit of the manufactured good can be produced by one unit of labor, the total supply of the manufactured good $M^S$ is determined by the demand for labor $L_M^D$:

$$M^S = L_M^D \tag{1}$$

An agricultural producer supplies the agricultural good according to the Schaefer harvesting production function

$$H^S = \alpha S L_H^D \tag{2}$$

where $H^S$ is the harvest supplied, $\alpha$ is a positive constant indicating the harvesting efficiency, and $L_H^D$ is the labor demand in resource harvesting. A representative consumer is endowed with one unit of labor, and is assumed to have a Cobb–Douglas utility function

² For the time being, t is suppressed for notational simplicity.


$$u(h, m) = h^{\beta} m^{1-\beta} \tag{3}$$

where h and m are individual consumptions of the agricultural good and the manufactured good, respectively, and $\beta \in (0, 1)$ is a constant reflecting the preference for the agricultural good. Each consumer supplies one unit of labor and demands both goods so as to maximize utility subject to the budget constraint $ph + m = w$, where p is the price of the agricultural good, w is the wage rate, and the price of the manufactured good is normalized to 1 as it is treated as the numeraire. The usual utility maximization procedure yields optimal demands for both goods, $h^d = w\beta/p$ and $m^d = (1 - \beta)w$, where the superscript d indicates individual demand. The total population is L, so that the total demands are

$$H^D = L h^d \quad \text{and} \quad M^D = L m^d \tag{4}$$
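The demand functions can be confirmed by brute force. The sketch below maximizes the Cobb–Douglas utility over a grid along the budget line and compares the result with the closed form; the wage and price values are hypothetical, and the helper names are introduced here only for illustration.

```python
def optimal_demands(beta, w, p):
    """Closed-form Cobb-Douglas demands under the budget p*h + m = w."""
    return beta * w / p, (1.0 - beta) * w


def utility(h, m, beta):
    """Cobb-Douglas utility u(h, m) = h**beta * m**(1 - beta)."""
    return h ** beta * m ** (1.0 - beta)


def grid_search_demand(beta, w, p, steps=10000):
    """Brute-force maximizer of u(h, w - p*h) over a grid of h in (0, w/p)."""
    best_h, best_u = 0.0, -1.0
    for i in range(1, steps):
        h = (w / p) * i / steps      # candidate consumption of the agricultural good
        m = w - p * h                # the rest of the budget buys the manufactured good
        u = utility(h, m, beta)
        if u > best_u:
            best_h, best_u = h, u
    return best_h, w - p * best_h


# beta = 0.4 is the preference value reported for BT; w and p are hypothetical.
beta, w, p = 0.4, 1.0, 2.0
h_closed, m_closed = optimal_demands(beta, w, p)
h_grid, m_grid = grid_search_demand(beta, w, p)
print(h_closed, m_closed, h_grid, m_grid)
```

The expenditure share on the agricultural good is $p h^d / w = \beta$ regardless of the price, which is why a fixed fraction of labor ends up in harvesting at the temporary equilibrium.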

Prices are adjusted to establish temporary equilibrium in each of three markets: the agricultural goods market ($H^D = H^S$), the manufactured goods market ($M^D = M^S$), and the labor market ($L_M^D + L_H^D = L$), in which the labor force is assumed to be equal to the population. It can be verified that a fixed proportion of the total population is employed in the agricultural sector, $L_H^D = \beta L$, and thus the resource harvest is $H^S = \alpha\beta S L$ at the temporary equilibrium state. After transactions in each market are finished, new values of natural resources and the size of the population are determined at the next instant of time. With these new values, the process repeats until a stationary state is attained. The dynamics of temporary equilibrium are described as follows. A change in the stock at time t is determined by the natural growth rate G(S) minus the harvest rate:

$$\frac{dS(t)}{dt} = G(S(t)) - H^S(t) \tag{5}$$

For analytical simplicity, the logistic functional form for G is assumed to be $G(S) = r(1 - S/K)S$, where K is the maximum possible size for the resource stock, r is the intrinsic growth rate of the natural resource, and both are positive constants. A change of population depends on a difference between an underlying birth rate and death rate, where b denotes the exogenously determined birth rate, and d the endogenously determined death rate. It is assumed that the net rate, denoted as c = b − d, is negative. Following the formulation of Malthusian population dynamics, it is further assumed that the endogenously determined birth rate depends on economic activities, that is, per capita consumption of the agricultural good increases fertility and/or decreases mortality. Let $\phi H^S/L$ be a fertility function, where $\phi$ is a positive constant. Then the population growth rate is

$$\frac{1}{L(t)}\frac{dL(t)}{dt} = c + \phi\,\frac{H^S(t)}{L(t)} \tag{6}$$

where the first term is the exogenous net birth rate, and the second is the endogenous birth rate. Substituting the logistic function into the natural growth rate and the optimal harvest into the fertility function yields the dynamic process of the natural resources and the population:

$$\begin{cases} dS(t)/dt = \left(r(1 - S(t)/K) - \alpha\beta L(t)\right)S(t) \\ dL(t)/dt = \left(c + \alpha\beta\phi S(t)\right)L(t) \end{cases} \tag{7}$$

This is a two-dimensional dynamic system of differential equations, and is a variant of the Lotka–Volterra predator–prey model in which the human is the predator and the resource stock is the prey. The system has a steady state if dS/dt = 0 and dL/dt = 0 hold. We can solve the last simultaneous system for stocks of the natural resource and the population size.³ The coordinates of an interior solution are

$$S^e = -\frac{c}{\alpha\beta\phi} \quad \text{and} \quad L^e = \frac{r}{\alpha\beta}\left(1 + \frac{c}{\alpha\beta\phi K}\right) \tag{8}$$
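As an illustration, the basic system in Eq. 7 and its interior steady state in Eq. 8 can be explored with a few lines of code. This is a minimal sketch that assumes the parameter values reported for BT's simulations (one period is one decade) and uses a classical fourth-order Runge–Kutta integrator, which is a choice made here and not necessarily the method used for the figures in this chapter.

```python
def steady_state(r, K, alpha, beta, phi, c):
    """Interior steady state (S_e, L_e) of Eq. 7, as given in Eq. 8."""
    ab = alpha * beta
    S_e = -c / (ab * phi)
    L_e = (r / ab) * (1.0 + c / (ab * phi * K))
    return S_e, L_e


def rhs(S, L, r, K, alpha, beta, phi, c):
    """Right-hand side of Eq. 7: (dS/dt, dL/dt)."""
    ab = alpha * beta
    dS = (r * (1.0 - S / K) - ab * L) * S
    dL = (c + ab * phi * S) * L
    return dS, dL


def simulate(S0, L0, periods, steps_per_period, **par):
    """Integrate Eq. 7 with the classical fourth-order Runge-Kutta scheme."""
    h = 1.0 / steps_per_period
    S, L = S0, L0
    path = [(S, L)]
    for _ in range(periods * steps_per_period):
        k1 = rhs(S, L, **par)
        k2 = rhs(S + 0.5 * h * k1[0], L + 0.5 * h * k1[1], **par)
        k3 = rhs(S + 0.5 * h * k2[0], L + 0.5 * h * k2[1], **par)
        k4 = rhs(S + h * k3[0], L + h * k3[1], **par)
        S += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        L += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        path.append((S, L))
    return path


# Parameter values reported for BT's first simulation.
bt = dict(r=0.04, K=12000.0, alpha=0.00001, beta=0.4, phi=4.0, c=-0.1)
S_e, L_e = steady_state(**bt)
path = simulate(12000.0, 40.0, periods=140, steps_per_period=20, **bt)
peak_L = max(L for _, L in path)
print(S_e, L_e, peak_L)
```

With these values the steady state is $S^e = 6250$ and $L^e \approx 4791.7$, and the simulated population overshoots $L^e$ before turning down, which is the overshoot-and-collapse pattern discussed in the text.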

BT find the following result for the stability of the interior steady point.

Theorem 1: (Proposition 4 (iii) of Brander and Taylor) An interior steady state ($L^e > 0$ and $S^e > 0$) is a spiral node with cyclical convergence if

$$-\frac{cr}{\alpha\beta\phi K} + 4(-c - \alpha\beta\phi K) < 0$$

and an improper node allowing monotonic convergence if not.

Using the parameterization for the basic model for which BT provide a detailed justification,⁴ we perform two simulations, which are reproductions of BT’s numerical examples. Figure 1 illustrates the time series for the population size (i.e., a mountain-shaped curve) and resource stocks (i.e., a downward sloping concave–convex curve) when the intrinsic growth rate of the renewable resource is low, r = 0.04. Figure 2 illustrates the time series for the same variables when the rate is roughly nine times higher, r = 0.35. In these figures, one period represents one decade, and the horizontal axis shows 140 periods. The initial period corresponds to the year AD 400, when the first indigenous people are said to have arrived on the island, and the last period corresponds to some time in the 18th century when the first European arrived on the island, after which substantial changes in the environment mean that the dynamic model in Eq. 7 no longer holds. Figure 1 shows oscillatory dynamics, and appears to replicate what is known of Easter Island history. The island was settled by a Polynesian group about AD 400, and was covered with large palm trees at this time. For the first 300 years, the population size was small, and the resources degraded very little. However, soon after, the population size began to increase rapidly, and the resources began to decline correspondingly. The heyday of Easter Island is supposed to have been between 1100 and 1300: the population reached its maximum (about 10 000) and the statue carving was intensive. After reaching its peak population size, the island entered a period of decline, and then disappeared from history; the palm forest was entirely gone by 1400, carving ceased by 1500, and violent internecine conflict appeared. According to Theorem 1, the model can generate monotonic behavior for some combinations of parameter values, which may explain the monotonic evolution observed on some Polynesian islands. In the second simulation, we change the growth rate of the natural resources from the lower value (r = 0.04) to the higher value (r = 0.35), and use the same values of all other parameters as in the first simulation. As illustrated in Figure 2, the simulation shows entirely different dynamics: a smooth adjustment converging to the stationary state. Comparing these two figures, we observe that an island with a slow-growing resource base will exhibit overshooting and collapse, while an island with rapidly growing resources exhibits a near-monotonic adjustment of population and resource stocks towards a steady state.

Fig. 1. Slow growth rate of natural resource (r = 0.04). Vertical axis: S, L; horizontal axis: time (4th to 18th century)

Fig. 2. Rapid growth rate of natural resource (r = 0.35). Vertical axis: S, L; horizontal axis: time (4th to 18th century)

³ The system can have a corner solution depending on the values of the parameters. However, the corner solution, with L = 0, means an extinction of humans, which is not interesting. Thus, in this study, we focus only on the interior solution.

⁴ BT use the following parameter values for their simulations: $L_0 = 40$ (initial human population), $S_0 = 12\,000$ (initial stock of the renewable resource), $K = 12\,000$ (carrying capacity), $\alpha = 0.00001$ (harvesting efficiency), $\beta = 0.4$ (preference for the agricultural good), $r = 0.04$ (intrinsic growth rate of the renewable resource), $\phi = 4$ (fertility rate), $c = -0.1$ (intrinsic net birth rate).

3 Application Model with Time Delay

In this section, we investigate the hypothetical role that a delay might play in disturbing the monotonic dynamics and further disturbing the oscillatory dynamics. In order to see this disturbing or destabilizing effect, we introduce a delay in production into the basic model, taking account of the fact that it takes some time for agricultural goods to progress from seeding to harvesting:

$$H(t) = \alpha\beta L(t) S(t - \tau) \tag{9}$$

where $\tau$ is a delay in production. The current harvesting depends on the current amount of labor and the stock of natural resources at time $t - \tau$. Substituting Eq. 9 into the basic system Eq. 7 generates a dynamic system with a delay in production $\tau$.

$$\begin{cases} dS(t)/dt = rS(t)\left(1 - \dfrac{S(t)}{K}\right) - \alpha\beta L(t) S(t - \tau) \\ dL(t)/dt = L(t)\left(b - d + \phi\alpha\beta S(t - \tau)\right) \end{cases} \tag{10}$$
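A delayed system such as Eq. 10 can be stepped forward numerically once a history for $S$ on $[-\tau, 0]$ is supplied. The sketch below uses an explicit Euler scheme with a constant pre-history $S(t) = S_0$ for $t \le 0$; both the scheme and the pre-history are assumptions made for this illustration, since the chapter does not state how its simulations were computed.

```python
def simulate_delayed(S0, L0, tau, periods, dt, r, K, alpha, beta, phi, c):
    """Explicit Euler integration of Eq. 10 with c = b - d.

    Assumes the constant history S(t) = S0 for t <= 0 (hypothetical choice).
    Returns the lists of S and L values at t = 0, dt, 2*dt, ...
    """
    ab = alpha * beta
    lag = int(round(tau / dt))          # delay measured in time steps
    S = [S0] * (lag + 1)                # S[-1] is S(t); S[-1 - lag] is S(t - tau)
    L = [L0]
    for _ in range(int(round(periods / dt))):
        S_now, L_now, S_lag = S[-1], L[-1], S[-1 - lag]
        dS = r * S_now * (1.0 - S_now / K) - ab * L_now * S_lag
        dL = L_now * (c + phi * ab * S_lag)
        S.append(S_now + dt * dS)
        L.append(L_now + dt * dL)
    return S[lag:], L                   # drop the artificial pre-history


# Parameter values reported for BT; tau, dt, and the horizon are hypothetical.
bt = dict(r=0.04, K=12000.0, alpha=0.00001, beta=0.4, phi=4.0, c=-0.1)
S_e = -bt["c"] / (bt["alpha"] * bt["beta"] * bt["phi"])
L_e = (bt["r"] / (bt["alpha"] * bt["beta"])) * (
    1.0 + bt["c"] / (bt["alpha"] * bt["beta"] * bt["phi"] * bt["K"]))

# Starting exactly at the equilibrium, the delayed system should stay there.
S_path, L_path = simulate_delayed(S_e, L_e, tau=2.0, periods=10, dt=0.1, **bt)
print(S_path[-1], L_path[-1])
```

Starting at $(S^e, L^e)$ with a constant history leaves the state unchanged, which illustrates that the delay does not move the equilibrium; setting `tau=0` recovers an Euler discretization of the basic system.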

It can be verified that the delayed system has the same equilibrium point as the basic system. To investigate the stability of the delayed system, we first make a coordinate transformation such that the new system is centered at the equilibrium point $(S^e, L^e)$, and then linearize the resultant system at the origin to derive its characteristic equation. Let $\bar{S} = S - S^e$ and $\bar{L} = L - L^e$. Then the centered system is reduced to

$$\begin{cases} d\bar{S}(t)/dt = r\left(1 - \dfrac{2S^e}{K}\right)\bar{S}(t) - \alpha\beta S^e \bar{L}(t) - \dfrac{r}{K}\bar{S}(t)^2 - \alpha\beta L^e \bar{S}(t - \tau) - \alpha\beta \bar{L}(t)\bar{S}(t - \tau) \\ d\bar{L}(t)/dt = \left(L^e + \bar{L}(t)\right)\phi\alpha\beta\,\bar{S}(t - \tau) \end{cases} \tag{11}$$

Put $(\bar{S}(t), \bar{L}(t))^{T} = Ce^{\lambda t}$, where $C \in \mathbb{C}^2$ and $\lambda \in \mathbb{C}$. Comparing the linear terms, we have

$$C\lambda e^{\lambda t} = \begin{pmatrix} r\left(1 - \dfrac{2S^e}{K}\right) & -\alpha\beta S^e \\ 0 & 0 \end{pmatrix} Ce^{\lambda t} + \begin{pmatrix} -\alpha\beta L^e & 0 \\ \phi\alpha\beta L^e & 0 \end{pmatrix} Ce^{\lambda(t - \tau)}$$

In consequence, for Eq. 10, we get the characteristic equation of the linearized system around $(S^e, L^e)$,

$$\lambda^2 + \left\{-r\left(1 - \frac{2S^e}{K}\right) + \alpha\beta L^e e^{-\lambda\tau}\right\}\lambda + (\alpha\beta)^2\phi S^e L^e e^{-\lambda\tau} = 0 \tag{12}$$

and substituting Eq. 8 into this gives

$$\lambda^2 + \left\{-\left(r + \frac{2cr}{\phi\alpha\beta K}\right) + \left(r + \frac{rc}{\phi\alpha\beta K}\right)e^{-\lambda\tau}\right\}\lambda + (\alpha\beta)^2\phi S^e L^e e^{-\lambda\tau} = 0 \tag{13}$$

When $\tau = 0$, Eq. 13 becomes

$$\lambda^2 - \frac{rc}{\alpha\beta\phi K}\lambda + (\alpha\beta)^2\phi S^e L^e = 0 \tag{14}$$

Since $c < 0$ and $(\alpha\beta)^2\phi S^e L^e > 0$, both coefficients of Eq. 14 are positive, so that all the characteristic roots of Eq. 14 have negative real parts and, in this case, the equilibrium point $(S^e, L^e)$ is locally asymptotically stable for Eq. 10. For the sake of notational simplicity, we introduce new variables, p and q, defined as

$$p = r + \frac{rc}{\alpha\beta\phi K} \quad \text{and} \quad q = (\alpha\beta)^2\phi S^e L^e > 0$$

Substituting the new variables into Eq. 13, we can rewrite the characteristic equation with delay as

$$\lambda^2 + \left(-(2p - r) + pe^{-\lambda\tau}\right)\lambda + qe^{-\lambda\tau} = 0 \tag{15}$$

Substituting $\lambda = iy$ into Eq. 15 and separating real and imaginary parts gives

$$py\sin y\tau + q\cos y\tau = y^2 \tag{16}$$

$$py\cos y\tau - q\sin y\tau = (2p - r)y \tag{17}$$

Squaring and adding Eqs. 16 and 17 yields

$$y^4 + (3p^2 - 4pr + r^2)y^2 - q^2 = 0 \tag{18}$$

Let $Y = y^2 \geq 0$, where the direction of inequality is due to $y \in \mathbb{R}$. Note that if $Y = 0$, we have $q = 0$, which is a contradiction. Thus, we are concerned with $Y > 0$ and the roots $y_0$ of Eq. 18:

$$y_0 = \pm\left(\frac{-(3p^2 - 4pr + r^2) + \sqrt{(3p^2 - 4pr + r^2)^2 + 4q^2}}{2}\right)^{1/2} \tag{19}$$

From Eqs. 16 and 17,

$$\cos(y_0\tau) = \frac{y_0^2(q + 2p^2 - pr)}{p^2y_0^2 + q^2} \quad \text{and} \quad \sin(y_0\tau) = \frac{y_0\{py_0^2 - q(2p - r)\}}{p^2y_0^2 + q^2}$$

which imply that there is a $\tau_0$ such that

$$\tau_0 = \frac{1}{y_0}\arccos\frac{y_0^2(q + 2p^2 - pr)}{p^2y_0^2 + q^2} = \frac{1}{y_0}\arcsin\frac{y_0\{py_0^2 - q(2p - r)\}}{p^2y_0^2 + q^2} \tag{20}$$
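The critical delay can be evaluated directly from Eqs. 19 and 20. The following sketch takes the positive root $y_0$ and recovers the angle $y_0\tau_0$ from the sine and cosine expressions with `atan2`, which selects the branch consistent with both; the function name is hypothetical, and the parameter values are those reported for BT's simulations.

```python
import math


def critical_delay(r, K, alpha, beta, phi, c):
    """Return (tau_0, y_0) from Eqs. 19 and 20, using the positive root y_0."""
    ab = alpha * beta
    p = r + r * c / (ab * phi * K)
    S_e = -c / (ab * phi)
    L_e = p / ab                         # equals (r/ab)(1 + c/(ab*phi*K)), Eq. 8
    q = ab ** 2 * phi * S_e * L_e
    A = 3.0 * p ** 2 - 4.0 * p * r + r ** 2
    y0 = math.sqrt((-A + math.sqrt(A * A + 4.0 * q * q)) / 2.0)
    den = p * p * y0 * y0 + q * q
    cos_v = y0 * y0 * (q + 2.0 * p * p - p * r) / den
    sin_v = y0 * (p * y0 * y0 - q * (2.0 * p - r)) / den
    theta = math.atan2(sin_v, cos_v) % (2.0 * math.pi)   # y0 * tau_0 in [0, 2*pi)
    return theta / y0, y0


# Parameter values reported for BT's simulations.
tau0, y0 = critical_delay(r=0.04, K=12000.0, alpha=0.00001, beta=0.4, phi=4.0, c=-0.1)
print(tau0, y0)
```

For these values the sketch gives $\tau_0 \approx 10.165$ periods, in line with the critical value quoted later in this section.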

Then we have the following theorem.

Theorem 2: The delayed dynamic system Eq. 10 is unstable if $\tau > \tau_0$, where

$$\tau_0 = \frac{1}{y_0}\arccos\left[\frac{y_0^2(q + 2p^2 - pr)}{q^2 + p^2y_0^2}\right] = \frac{1}{y_0}\arcsin\left[\frac{y_0\{py_0^2 - q(2p - r)\}}{q^2 + p^2y_0^2}\right]$$

$$p = r + \frac{rc}{\alpha\beta\phi K} > 0, \qquad q = (\alpha\beta)^2\phi S^e L^e > 0$$

and

$$y_0 = \pm\left(\frac{-(3p^2 - 4pr + r^2) + \sqrt{(3p^2 - 4pr + r^2)^2 + 4q^2}}{2}\right)^{1/2}$$

Proof: See Appendix.

To the best of our knowledge, a nonlinear dynamic system with a time delay does not have an analytical solution.⁵ Nevertheless, it is possible to examine its dynamic behavior by simulating the system numerically. Since we performed two simulations without delay in the last section, we now conduct simulations with a production delay, and then compare the results with a time delay with those without one. By doing so, we can detect the effects on the long-run dynamics of the population and natural stocks caused by the delay in production. Figure 3 presents simulation results that occur when the delay in production is introduced into the first example, ceteris paribus. The solid lines show the time series of the population and the natural resources obtained in the current simulation, while the dotted lines are reproductions of the simulation results depicted in Fig. 1. With a delay in production, the population and the resource stocks generate more volatile fluctuations relative to those without a delay. The population reaches a much higher peak and a much lower trough, jumping to 17 500 and falling quickly to near zero, while the natural resources decline more rapidly and get closer to zero stock. These numerical simulations indicate the destabilizing effect caused by a delay in production in the oscillatory case. Figure 4 presents simulations that occur when a delay in production is introduced into the second example, ceteris paribus. As in Fig. 3, the solid lines show the simulation results with the delay in production, and the dotted lines those without it, which reproduce the results depicted in Fig. 2. Comparing these results, we observe firstly the earlier and much more volatile fluctuations in population as well as in natural resources, and secondly the near exhaustion of natural resources and the much more severe falls in population at the beginning of the 18th century. These numerical simulations again indicate the destabilizing effect of a delay even on otherwise monotonic dynamics.

In these numerical simulations, we adopt the parameterization used by BT, for which the critical value of the delay is $\tau_0 = 10.1652$.⁶ This is indeed a large delay, but we should not be too concerned. As seen in Eq. 20, the critical value depends on many other parameters. We should thus be content with the existence of the critical value, trusting that it might be considerably smaller under slightly different parameter specifications. Although it may be possible to derive its dependency on the parameters analytically, we can expect the computations to become messy, and thus confirm it numerically. Figure 5A illustrates the relationship between the delay and the maximum size of the natural resource, ceteris paribus. It displays the downward sloping borderline between the stable region and the unstable region. It can be seen that the critical value of $\tau$ gets smaller as the size of K becomes larger. Figure 5B illustrates the relationship between the delay and the exogenous net birth rate. We find the same property that $\tau_0$ is negatively related to b − d. A combination of larger K and smaller b − d in absolute values can lead to a smaller $\tau_0$.

Fig. 3. Production delay in a fluctuating economy. Vertical axis: S, L; horizontal axis: time (4th to 18th century)

Fig. 4. Production delay in a stable economy. Vertical axis: S, L; horizontal axis: time (4th to 18th century)

⁵ It is possible to construct an analytical solution when a nonlinear dynamic system is discrete (see Suzuki 1996, 2000).

⁶ See Appendix for calculations.
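The negative relation between the critical delay and the carrying capacity can be reproduced by sweeping K in the formula for $\tau_0$. The sketch below is self-contained and holds all other parameters at the values reported for BT; the grid of K values is a hypothetical choice made for illustration and is not the grid behind Fig. 5A.

```python
import math


def tau0_for(K, r=0.04, alpha=0.00001, beta=0.4, phi=4.0, c=-0.1):
    """Critical delay tau_0 of Theorem 2 as a function of the carrying capacity K."""
    ab = alpha * beta
    p = r + r * c / (ab * phi * K)
    q = ab ** 2 * phi * (-c / (ab * phi)) * (p / ab)   # (ab)^2 * phi * S_e * L_e
    A = 3.0 * p ** 2 - 4.0 * p * r + r ** 2
    y0 = math.sqrt((-A + math.sqrt(A * A + 4.0 * q * q)) / 2.0)
    den = p * p * y0 * y0 + q * q
    cos_v = y0 * y0 * (q + 2.0 * p * p - p * r) / den
    sin_v = y0 * (p * y0 * y0 - q * (2.0 * p - r)) / den
    return (math.atan2(sin_v, cos_v) % (2.0 * math.pi)) / y0


# Hypothetical grid of carrying capacities.
grid = [12000.0, 18000.0, 24000.0]
critical = [tau0_for(K) for K in grid]
for K, t0 in zip(grid, critical):
    print(f"K = {K:7.0f}  tau_0 = {t0:.3f}")
```

On this grid the critical delay falls as K rises, so a larger resource base narrows the range of delays for which the steady state remains stable.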

4 Concluding Remarks This study has investigated the long-run dynamics of renewable natural resources and the population by introducing a delay in agricultural production activities. It first confirms that the basic model without a delay exhibits stable dynamics. Then it analytically derives a critical value of the delay in production for which a loss of stability occurs. Subsequently, two numerical simulations show the destabilizing effect caused by the delay in production on the evolution of a small island economy. The resource stocks and the population fluctuate much more widely over time when the growth rate of the natural resources is low (see Fig. 3), and fluctuations are also generated even in a constantly growing economy with a higher growth rate (see Fig. 4).

Fig. 5. Downward sloping $\tau_0$ curves. A The size of the resource stock (horizontal axis: K; vertical axis: $\tau$). B The net birth rate (horizontal axis: b − d; vertical axis: $\tau$). In both panels the stable region lies below the curve and the unstable region above it.

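Appendix To prove Theorem 2, we apply the method used by Saito (2002, pp. 116–122), which makes it possible to check the simplicity of a characteristic root on the imaginary axis without tedious calculations. Let

$$P(\lambda, \tau) = \lambda^2 + \left(-(2p - r) + p e^{-\lambda\tau}\right)\lambda + q e^{-\lambda\tau}$$

and $\tau_n = \tau_0 + 2\pi n$ $(n = 0, 1, 2, \ldots)$. Then $P(iy_0, \tau_n) = 0$, and we obtain the following equations from Eq. 14:

$$\frac{\partial P}{\partial \tau}(iy_0, \tau_n) = iy_0\left[-y_0^2 - (2p - r)iy_0\right]$$

$$\frac{\partial P}{\partial \lambda}(iy_0, \tau_n) = 2iy_0 - (2p - r) + \left[p - \tau_n(q + p\,iy_0)\right]e^{-iy_0\tau_n}$$

Clearly, $\frac{\partial P}{\partial \tau}(iy_0, \tau_n) \neq 0$. We now consider the value

$$K = \frac{p^2 y_0^4 + 2q^2 y_0^2 + (2p - r)^2 q^2}{\left[((2p - r)y_0)^2 + y_0^4\right]\left[q^2 + (py_0)^2\right]}$$

We get K > 0. Furthermore, from Eq. 14,

$$\begin{aligned}
\operatorname{sign} K &= \operatorname{sign}\left[\operatorname{Re}\left(\frac{2iy_0 - (2p - r)}{-iy_0\left(-y_0^2 - (2p - r)iy_0\right)} + \frac{p}{iy_0(q + p\,iy_0)}\right)\right] \\
&= \operatorname{sign}\left[\operatorname{Re}\left(\frac{2iy_0 - (2p - r)}{-iy_0\left(-y_0^2 - (2p - r)iy_0\right)} + \frac{p e^{-iy_0\tau_n}}{iy_0(q + p\,iy_0)e^{-iy_0\tau_n}} - \frac{\tau_n}{iy_0}\right)\right] \\
&= \operatorname{sign}\left[\operatorname{Re}\left(\frac{2iy_0 - (2p - r) + p e^{-iy_0\tau_n}}{-iy_0\left(-y_0^2 - (2p - r)iy_0\right)} - \frac{\tau_n}{iy_0}\right)\right] \\
&= \operatorname{sign}\left[\operatorname{Re}\left(-\frac{\partial P(iy_0, \tau_n)/\partial \lambda}{\partial P(iy_0, \tau_n)/\partial \tau}\right)\right]
\end{aligned}$$

Hence, we can obtain $\frac{\partial P}{\partial \lambda}(iy_0, \tau_n) \neq 0$ and, by the well-known implicit function theorem, we have

$$\begin{aligned}
\operatorname{sign}\left[\operatorname{Re}\left(\left.\frac{d\lambda}{d\tau}\right|_{\lambda = iy_0,\ \tau = \tau_n}\right)\right] &= \operatorname{sign}\left[\operatorname{Re}\left(-\frac{\partial P(iy_0, \tau_n)/\partial \tau}{\partial P(iy_0, \tau_n)/\partial \lambda}\right)\right] \\
&= \operatorname{sign}\left[\operatorname{Re}\left\{\left(-\frac{\partial P(iy_0, \tau_n)/\partial \lambda}{\partial P(iy_0, \tau_n)/\partial \tau}\right)^{-1}\right\}\right] \\
&= \operatorname{sign} K > 0
\end{aligned}$$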

This implies that (Se, Le) becomes unstable if $\tau > \tau_0$ holds (see, for example, Kuang 1993). Under the current parameter specification in which K = 12 000, c = −0.1, r = 0.04, f = 4, a = 0.00001, and b = 0.4, we have p = 0.0191667, q = 0.00191667, and $y_0 = 0.0459087$. Then solving $\sin(\tau_0 y_0) = 0.449917$, i.e., $\tau_0 y_0 = \arcsin(0.449917) = 0.466673$, for $\tau_0$ yields $\tau_0 = 10.1652$. If we take $\tau > 10.1652$, then the dynamic system in Eq. 10 becomes unstable, as shown in Figs. 3 and 4. Acknowledgments. This is a revised version of the paper with the same title that was presented at the international conference on Dynamical Systems Theory and its Applications to Biology and Environmental Science, held at Shizuoka University, Shizuoka, Japan, March 15–17, 2004. We are grateful for comments from Bob Sacker and participants of the conference. We appreciate the financial support from the Japan Ministry of Education, Culture, Sports, Science and Technology (Grant-in-Aid for Scientific Research (B) 15330037 for the first two authors, and Grant-in-Aid for JSPS Fellows 00000472 for the third author). The first two authors also appreciate financial support from Chuo University (Joint Research Grant 0382). Needless to say, any remaining errors are our responsibility.
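The critical delay quoted above can be checked numerically. The following is a minimal sketch, assuming only the characteristic function P given in the Appendix and the reported values of p, q, and r; the function name `critical_delay` is a hypothetical helper introduced here for illustration and does not come from the original study.

```python
import math

def critical_delay(p, q, r):
    """Return (tau0, y0) for P(lambda, tau) =
    lambda^2 + (-(2p - r) + p*exp(-lambda*tau))*lambda + q*exp(-lambda*tau).

    y0 is the positive frequency at which a root lambda = i*y0 sits on the
    imaginary axis, and tau0 is the smallest positive delay for which it does.
    """
    c = 2.0 * p - r
    # Moduli must match: |-y^2 - c*i*y|^2 = |q + p*i*y|^2
    # which gives y^4 + (c^2 - p^2) * y^2 - q^2 = 0.
    b = c * c - p * p
    y0_sq = (-b + math.sqrt(b * b + 4.0 * q * q)) / 2.0
    y0 = math.sqrt(y0_sq)
    # From P(i*y0, tau) = 0: exp(-i*y0*tau) = (y0^2 + c*i*y0) / (q + p*i*y0)
    z = complex(y0_sq, c * y0) / complex(q, p * y0)
    cos_val, sin_val = z.real, -z.imag
    angle = math.atan2(sin_val, cos_val) % (2.0 * math.pi)
    return angle / y0, y0

# Parameter values reported in the Appendix
tau0, y0 = critical_delay(p=0.0191667, q=0.00191667, r=0.04)
print(f"y0 = {y0:.7f}, tau0 = {tau0:.4f}")
```

Using `atan2` on both the sine and the cosine of $\tau_0 y_0$ avoids having to pick the correct branch of `arcsin` or `arccos` by hand.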

References
Brander JA, Taylor NS (1998) The simple economics of Easter Island: a Ricardo–Malthus model of renewable resource use. Am Econ Rev 88:119–138
Dalton TR, Coats RM (2000) Could institutional reform have saved Easter Island? J Evolut Econ 10:489–505
Intoh M (2000) Prehistoric Oceania. In: Yamamoto M (ed) Oceania history. Yamakawa Publishing Co., Tokyo, pp 17–45
Kuang Y (1993) Delay differential equations with application in population dynamics. Academic Press, New York
Matsumoto A (2002) Economic dynamic model for small islands. Discrete Dyn Nature Soc 7:121–132
Reuveny R, Decker CS (2000) Easter Island: historical anecdote or warning for the future. Ecol Econ 35:271–287
Saito Y (2002) The necessary and sufficient condition for global stability of a Lotka–Volterra cooperative or competition system with delays. J Math Anal Appl 268:109–124
Suzuki M (1996) On some differential equations in economic models. Math J 43:129–134
Suzuki M (2000) Difference equation for a population model. Discrete Dyn Nature Soc 5:9–18

9. A Determinantal Criterion of Hopf Bifurcations and Its Application to Economic Dynamics Junichi Minagawa

Summary. We present a criterion for a class of Hopf bifurcations using the properties of bialternate products of matrices, and apply it to a certain economic system. Key words. System of differential equations, Stability criterion, Bialternate products, Criterion of Hopf bifurcations, Economic dynamics

1 Introduction Given a linear system of differential equations, the system is asymptotically stable if and only if all the roots of the characteristic equation have negative real parts. As such a stability criterion, the Routh–Hurwitz criterion is well known (e.g., Gantmacher 1959). Given a parametric system of differential equations, the system has a simple Hopf bifurcation when a pair of complex conjugate roots of the characteristic equation passes through the imaginary axis, while all other roots have negative real parts. Liu (1994) showed a criterion of simple Hopf bifurcations based on the Routh–Hurwitz criterion. Fuller (1968) gave an alternative determinantal stability criterion. The Fuller criterion has an advantage over the Routh–Hurwitz criterion in the sense that, while the Hurwitz determinants involved in the latter have elements which are themselves sums of determinants, the elements of the determinants involved in the former are simpler. In this study, a criterion of simple Hopf bifurcations based on the Fuller criterion will be shown, and then applied to a certain economic system.

Graduate School of Economics, Chuo University, 742-1 Higashi-Nakano, Hachioji, Tokyo 192-0393, Japan


2 The Routh–Hurwitz and Fuller Criteria Consider a system

$$\frac{dz_i}{dt} = \sum_{j=1}^{n} a_{ij} z_j, \quad (i = 1, 2, \ldots, n) \tag{1}$$

where $a_{ij}$ are real constant coefficients. In vector–matrix notation, the system is

$$\frac{dz}{dt} = Az \tag{2}$$

For the system to be asymptotically stable, it is necessary and sufficient that all the roots of the characteristic equation of A, namely,

$$|\lambda I_n - A| = 0 \tag{3}$$

have negative real parts (e.g., Gantmacher 1959). Let us denote Eq. 3 as a polynomial equation

$$p_n\lambda^n + p_{n-1}\lambda^{n-1} + \cdots + p_0 = 0, \quad p_n > 0 \tag{4}$$

Routh (1877) gave the following stability criterion. Theorem 1: (Routh 1877). Consider Eq. 4. Let

$$q_m\mu^m + q_{m-1}\mu^{m-1} + \cdots + q_0 = 0 \tag{5}$$

be the equation of root–pair sums of Eq. 4, i.e., let the roots of Eq. 5 be the $n(n - 1)/2\ (= m)$ values

$$\mu = \lambda_i + \lambda_j, \quad (i = 2, 3, \ldots, n;\ j = 1, 2, \ldots, i - 1) \tag{6}$$

where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the roots of Eq. 4. Let $q_m > 0$. Then for the roots of Eq. 4 to have all their real parts negative, it is necessary and sufficient that $p_0, p_1, \ldots, p_{n-1} > 0$ and $q_0, q_1, \ldots, q_{m-1} > 0$. The conditions involved are disadvantageous from the following two points of view: “First, they are not all independent, being n(n + 1)/2 conditions in number, whereas only n are necessary. Secondly, despite several ingenious methods devised by Routh, it is not easy to compute them in the general case” (Samuelson 1941). The following Routh–Hurwitz criterion improves on Theorem 1, since only n conditions are required. Theorem 2: (e.g., Gantmacher 1959). All the roots of Eq. 4 have negative real parts if and only if

$$H_1 > 0, \quad H_2 > 0, \quad \ldots, \quad H_n > 0 \tag{7}$$

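where $H_i$ are the Hurwitz determinants:

$$H_1 = p_{n-1}, \quad
H_2 = \begin{vmatrix} p_{n-1} & p_{n-3} \\ p_n & p_{n-2} \end{vmatrix}, \quad
H_3 = \begin{vmatrix} p_{n-1} & p_{n-3} & p_{n-5} \\ p_n & p_{n-2} & p_{n-4} \\ 0 & p_{n-1} & p_{n-3} \end{vmatrix}, \quad \ldots, \quad
H_n = \begin{vmatrix}
p_{n-1} & p_{n-3} & p_{n-5} & \cdots & 0 \\
p_n & p_{n-2} & p_{n-4} & \cdots & 0 \\
0 & p_{n-1} & p_{n-3} & \cdots & 0 \\
0 & p_n & p_{n-2} & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & p_0
\end{vmatrix} \tag{8}$$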
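The Hurwitz determinants of Eq. 8 are straightforward to evaluate mechanically. The sketch below is illustrative only: the helper names `det` and `hurwitz_determinants` are assumptions introduced here, and exact rational arithmetic is used so that the signs in Eq. 7 are not affected by rounding.

```python
from fractions import Fraction

def det(M):
    """Determinant by exact Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    n, sign, result = len(M), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)          # a zero column: singular matrix
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            sign = -sign                # a row swap flips the sign
        result *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return sign * result

def hurwitz_determinants(p):
    """p = [p_n, p_{n-1}, ..., p_0] as in Eq. 4; returns [H_1, ..., H_n]."""
    n = len(p) - 1

    def coef(k):
        # p_k, taken as zero outside 0..n
        return p[n - k] if 0 <= k <= n else 0

    # Entry (i, j), 1-based, of the n x n Hurwitz matrix is p_{n - 2j + i}
    H = [[coef(n - 2 * j + i) for j in range(1, n + 1)] for i in range(1, n + 1)]
    # H_k is the leading principal minor of order k
    return [det([row[:k] for row in H[:k]]) for k in range(1, n + 1)]

# (lambda + 1)(lambda + 2)(lambda + 3) = lambda^3 + 6 lambda^2 + 11 lambda + 6
stable = hurwitz_determinants([1, 6, 11, 6])
print(all(h > 0 for h in stable))
```

For the stable cubic in the usage lines, every $H_i$ is positive, which is what Theorem 2 requires.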

However, as noted in Fuller (1968), the Hurwitz determinants have elements which are themselves sums of determinants, and for n > 2 the $H_i$ become extremely cumbersome. Then Fuller posed the question of whether one can obtain alternative determinantal stability criteria in which the elements of the determinants are simpler functions of the $a_{ij}$ of matrix A, and presented such a criterion based on Theorem 1. In what follows, the stability criterion will be stated according to Fuller. We begin by introducing the bialternate product studied by Stéphanos (1900) (cited by Fuller), and consider a matrix whose characteristic equation is Eq. 5. Definition 1: (Stéphanos 1900). Let A be an n-dimensional matrix $(a_{ij})$ and B be an n-dimensional matrix $(b_{ij})$. Let F be an $m = n(n - 1)/2$-dimensional matrix $(f_{pq,rs})$ whose rows are labeled pq (p = 2, 3, . . . , n; q = 1, 2, . . . , p − 1), whose columns are labeled rs (r = 2, 3, . . . , n; s = 1, 2, . . . , r − 1), and whose elements are

$$f_{pq,rs} = \frac{1}{2}\left(\begin{vmatrix} a_{pr} & a_{ps} \\ b_{qr} & b_{qs} \end{vmatrix} + \begin{vmatrix} b_{pr} & b_{ps} \\ a_{qr} & a_{qs} \end{vmatrix}\right) \tag{9}$$

Then F is the bialternate product of A and B, and is written as $A \cdot B$. Theorem 3: (Stéphanos 1900). The characteristic roots of the matrix

$$G = 2A \cdot I_n \tag{10}$$

where A is an n-dimensional matrix $(a_{ij})$ and $I_n$ is an n-dimensional identity matrix, are the n(n − 1)/2 values

$$\lambda_i + \lambda_j, \quad (i = 2, 3, \ldots, n;\ j = 1, 2, \ldots, i - 1) \tag{11}$$

where $\lambda_i$ are the eigenvalues of A. The elements of G are

$$g_{pq,rs} = \begin{vmatrix} a_{pr} & a_{ps} \\ \delta_{qr} & \delta_{qs} \end{vmatrix} + \begin{vmatrix} \delta_{pr} & \delta_{ps} \\ a_{qr} & a_{qs} \end{vmatrix} \tag{12}$$

where $\delta_{ij}$ is the Kronecker delta, $\delta_{ij} = 0$ for i ≠ j, and $\delta_{ij} = 1$ for i = j. With p > q and r > s, Eq. 12 is

$$g_{pq,rs} = \begin{cases}
-a_{ps} & \text{if } r = q \\
a_{pr} & \text{if } r \neq p \text{ and } s = q \\
a_{pp} + a_{qq} & \text{if } r = p \text{ and } s = q \\
a_{qs} & \text{if } r = p \text{ and } s \neq q \\
-a_{qr} & \text{if } s = p \\
0 & \text{otherwise}
\end{cases} \tag{13}$$

Theorem 4: (Fuller 1968). Let $A = (a_{ij})$ be a real square matrix of dimension n > 1. Let $G = (g_{pq,rs})$ be the square matrix of dimension m = n(n − 1)/2 defined by Eq. 10, with elements given by Eq. 12, or equivalently by Eq. 13. Then for the characteristic roots of A to have all their real parts negative, it is necessary and sufficient that in the characteristic polynomial of A, namely,

$$|\lambda I_n - A| \tag{14}$$

and in the characteristic polynomial of G, namely,

$$|\mu I_m - G| \tag{15}$$

the coefficients of $\lambda^i$ (i = 0, 1, . . . , n − 1) and $\mu^i$ (i = 0, 1, . . . , m − 1) should all be positive. The Fuller criterion can be considered as an alternative solution to the second disadvantage of Theorem 1. Jury (1982, p. 109) mentioned that it is advisable, for computational purposes, to use the Fuller criterion even though some redundancy may exist. We note that, while the Routh–Hurwitz criterion is widely applied to economic dynamics, the Fuller criterion is rarely used in the economics literature. For applications of the latter, see, for example, Murata (1977) and Hadjimichalakis and Okuguchi (1979). Regarding the first disadvantage of Theorem 1, Araposthathis and Jury (1979) showed that the conditions in Theorem 1 can be reduced to 1 + n(n − 1)/2 conditions. Theorem 5: (Araposthathis and Jury 1979). Consider Eqs. 4–6. Then for the roots of Eq. 4 to have their real parts negative, it is necessary and sufficient that $p_0 > 0$ and $q_0, q_1, \ldots, q_{m-1} > 0$.
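To make Theorems 3 and 4 concrete, the following sketch builds G from Eq. 12 and tests the coefficient conditions of Theorem 4. It is a minimal illustration under stated assumptions: the function names are hypothetical, the characteristic polynomials are obtained with the Faddeev–LeVerrier recursion (a standard technique chosen here, not one named in the text), and exact fractions are used throughout.

```python
from fractions import Fraction

def bialternate_G(A):
    """G = 2 A . I_n with entries from Eq. 12; labels are pairs p > q."""
    n = len(A)
    # 0-based labels in the order (2,1), (3,1), (3,2), ... of the text
    labels = [(p, q) for p in range(1, n) for q in range(p)]
    d = lambda i, j: 1 if i == j else 0  # Kronecker delta
    return [[A[p][r] * d(q, s) - A[p][s] * d(q, r)
             + d(p, r) * A[q][s] - d(p, s) * A[q][r]
             for (r, s) in labels]
            for (p, q) in labels]

def char_poly(M):
    """Coefficients [1, c_{k-1}, ..., c_0] of |x I - M| (Faddeev-LeVerrier)."""
    k = len(M)
    M = [[Fraction(x) for x in row] for row in M]
    I = [[Fraction(int(i == j)) for j in range(k)] for i in range(k)]
    coeffs, B = [Fraction(1)], I
    for step in range(1, k + 1):
        MB = [[sum(M[i][t] * B[t][j] for t in range(k)) for j in range(k)]
              for i in range(k)]
        c = -sum(MB[i][i] for i in range(k)) / step
        coeffs.append(c)
        B = [[MB[i][j] + c * I[i][j] for j in range(k)] for i in range(k)]
    return coeffs

def fuller_stable(A):
    """Theorem 4: all non-leading coefficients of both polynomials positive."""
    p = char_poly(A)
    q = char_poly(bialternate_G(A))
    return all(c > 0 for c in p[1:]) and all(c > 0 for c in q[1:])

# Eigenvalues -1, -2, -3, so the root-pair sums are -3, -4, -5
A = [[-1, 2, 0], [0, -2, 1], [0, 0, -3]]
q_coeffs = char_poly(bialternate_G(A))

# Roots -3 and 1/2 +/- 2i: every p_i is positive, yet q_0 < 0 reveals instability
B = [[0, 1, 0], [0, 0, 1], [Fraction(-51, 4), Fraction(-5, 4), -2]]
print(fuller_stable(A), fuller_stable(B))
```

The second matrix illustrates why the coefficients of Eq. 14 alone are not sufficient for n ≥ 3: its characteristic polynomial has only positive coefficients, and the instability shows up only in the polynomial of G.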


Remark 1: (Jury 1982). The problem of obtaining the minimum number of conditions as a function of n is still an open problem.

3 Hopf Bifurcations and the Liu Criterion Following Guckenheimer and Holmes (1983) and Liu (1994), we state simple Hopf bifurcations and the Liu criterion. Consider a system

$$\frac{dz}{dt} = f_\alpha(z), \quad z \in \mathbb{R}^n, \quad \alpha \in \mathbb{R}^1 \tag{16}$$

with an equilibrium $(z^*, \alpha^*)$, and $f \in C^\infty$. Assume that

(i) The Jacobian matrix $D_z f_{\alpha^*}(z^*)$ has a simple pair of purely imaginary eigenvalues, and all other eigenvalues have negative real parts.

Then there is a smooth curve of equilibria $(z(\alpha), \alpha)$ with $z(\alpha^*) = z^*$. The eigenvalues $\lambda(\alpha)$, $\bar{\lambda}(\alpha)$ of $J(\alpha) = D_z f_\alpha(z(\alpha))$ which are purely imaginary at $\alpha = \alpha^*$ vary smoothly with $\alpha$. Moreover, if

(ii) $\dfrac{d}{d\alpha}\left(\operatorname{Re}\lambda(\alpha^*)\right) \neq 0$

then there is a simple Hopf bifurcation. The “simple” Hopf bifurcation is used to distinguish this from the Hopf bifurcations with some other eigenvalues with nonzero real parts. Let us denote the characteristic equation of the Jacobian matrix $J(\alpha)$, namely,

$$|\lambda I_n - J(\alpha)| = 0 \tag{17}$$

as a polynomial equation

$$p(\lambda; \alpha) = p_n(\alpha)\lambda^n + p_{n-1}(\alpha)\lambda^{n-1} + \cdots + p_0(\alpha) = 0, \quad p_n(\alpha) > 0 \tag{18}$$

where every $p_i(\alpha)$ is a smooth function of $\alpha$. Theorem 6: (Liu 1994). Assume there is a smooth curve of equilibria $(z(\alpha), \alpha)$ with $z(\alpha^*) = z^*$ for the system Eq. 16. Conditions (i) and (ii) for a simple Hopf bifurcation are equivalent to the following conditions on the coefficients of the characteristic polynomial $p(\lambda; \alpha)$:

(i′) $p_0(\alpha^*) > 0$, $H_1(\alpha^*) > 0$, $H_2(\alpha^*) > 0$, . . . , $H_{n-2}(\alpha^*) > 0$, $H_{n-1}(\alpha^*) = 0$

(ii′) $\dfrac{d}{d\alpha}\left(H_{n-1}(\alpha^*)\right) \neq 0$

where $H_i$ are the Hurwitz determinants.


The Liénard–Chipart criterion is well known as another stability criterion (e.g., Gantmacher 1959). As Gantmacher (1959, vol. II, p. 173) stated, the Liénard–Chipart criterion has an advantage over the Routh–Hurwitz criterion in the sense that the number of determinantal inequalities in the former is roughly half that in the latter. From the practical point of view, Manfredi and Fanti (2004) reformulated the Liu criterion by replacing the conditions based on the Routh–Hurwitz criterion with the corresponding Liénard–Chipart conditions. It is worth pointing out that the coefficient criterion of Hopf bifurcations with some other eigenvalues with nonzero real parts has been established for small values of n (= 2, 3, 4) (see Asada and Yoshida (2003) for n = 4).

4 Alternative Criterion An alternative criterion of simple Hopf bifurcations based on the Fuller criterion will now be given. The following theorem is an immediate consequence of Theorem 1, and the idea of the proof is due to Routh’s (1877) proof of Theorem 1 and Liu’s (1994) proof of Theorem 6. Theorem 7: Given Eqs. 4–6 with $q_m > 0$, Eq. 4 has a simple pair of purely imaginary roots, and all other roots have negative real parts if and only if $p_0, p_1, \ldots, p_{n-2} > 0$, $p_{n-1} \geq 0$, $q_0 = 0$, and $q_1, q_2, \ldots, q_{m-1} > 0$. Here, $p_{n-1} = 0$ when n = 2 and $p_{n-1} > 0$ when n ≥ 3. Proof: Necessity: If Eq. 4 has a simple pair of purely imaginary roots, and all other roots have negative real parts, then the left-hand side of Eq. 4 can be written as

$$p(\lambda) = r(\lambda)(\lambda^2 + s_1\lambda + s_0) \tag{19}$$

where $s_0 > 0$, $s_1 = 0$, and then $s_1^2 - 4s_0 < 0$,

$$r(\lambda) = r_{n-2}\lambda^{n-2} + r_{n-3}\lambda^{n-3} + \cdots + r_0, \quad r_{n-2} > 0 \tag{20}$$

and all (n − 2) roots of $r(\lambda)$ have negative real parts. Thus, from Theorem 1, the coefficients of $r(\lambda)$, $r_0, r_1, \ldots, r_{n-2}$, are all positive. Hence, all the coefficients of $p(\lambda)$ are positive, i.e., $p_0, p_1, \ldots, p_{n-1} > 0$, except for $p_1 = 0$ when n = 2. On the other hand, the left-hand side of Eq. 5 can be expressed as

$$q(\mu) = t(\mu)(\mu + u_0) \tag{21}$$

where $u_0 = 0$,

$$t(\mu) = t_{m-1}\mu^{m-1} + t_{m-2}\mu^{m-2} + \cdots + t_0, \quad t_{m-1} > 0 \tag{22}$$

and all (m − 1) roots of $t(\mu)$ have negative real parts. By the same reasoning, the coefficients of $t(\mu)$, $t_0, t_1, \ldots, t_{m-1}$, are all positive. Hence we have $q_0 = 0$, $q_1, q_2, \ldots, q_{m-1} > 0$. Sufficiency: The case n = 2 is clear. Let n ≥ 3 and assume that $p_0, p_1, \ldots, p_{n-1} > 0$. Then $p(\lambda) > 0$ for $\lambda \geq 0$, i.e., Eq. 4 has neither positive real roots nor zero real roots. On the other hand, if $q_0 = 0$ and $q_1, q_2, \ldots, q_{m-1} > 0$, from the factorization Eq. 21, Eq. 5 has all its real roots in the open left half plane except for one zero real root, but from Eq. 6, the real roots of Eq. 5 include twice the real parts of the complex roots of Eq. 4. Hence, Eq. 4 has a simple pair of purely imaginary roots and all other roots have negative real parts. ■ Suppose that the roots of Eq. 17 are $\lambda_1(\alpha), \lambda_2(\alpha), \ldots, \lambda_n(\alpha)$. Then there is the equation

$$q(\mu; \alpha) = q_m(\alpha)\mu^m + q_{m-1}(\alpha)\mu^{m-1} + \cdots + q_0(\alpha) = 0, \quad q_m(\alpha) > 0 \tag{23}$$

whose roots are given by

$$\mu(\alpha) = \lambda_i(\alpha) + \lambda_j(\alpha), \quad (i = 2, 3, \ldots, n;\ j = 1, 2, \ldots, i - 1) \tag{24}$$

where every $q_i(\alpha)$ is a smooth function of $\alpha$ and m = n(n − 1)/2. Let us define

$$G(\alpha) = 2J(\alpha) \cdot I_n \tag{25}$$

where $J(\alpha)$ is the Jacobian matrix of the system Eq. 16, $I_n$ is an n-dimensional identity matrix, and $\cdot$ denotes the bialternate product. Then $G(\alpha)$ is the matrix whose characteristic equation is Eq. 23. Theorem 8: Assume there is a smooth curve of equilibria $(z(\alpha), \alpha)$ with $z(\alpha^*) = z^*$ for the system Eq. 16. Conditions (i) and (ii) for a simple Hopf bifurcation are equivalent to the following conditions in the characteristic polynomial $p(\lambda; \alpha)$ and the associated polynomial $q(\mu; \alpha)$:

(i″) $p_0(\alpha^*), p_1(\alpha^*), \ldots, p_{n-2}(\alpha^*) > 0$, $p_{n-1}(\alpha^*) \geq 0$, $q_0(\alpha^*) = 0$, and $q_1(\alpha^*), q_2(\alpha^*), \ldots, q_{m-1}(\alpha^*) > 0$. Here $p_{n-1}(\alpha^*) = 0$ when n = 2 and $p_{n-1}(\alpha^*) > 0$ when n ≥ 3

(ii″) $\dfrac{d}{d\alpha}\left(q_0(\alpha^*)\right) \neq 0$

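Proof: The equivalence of (i) and (i″) follows from Theorem 7. Hence, given (i), which is equivalent to (i″), in a sufficiently small neighborhood of $\alpha^*$, $q(\mu; \alpha)$ can be expressed as

$$q(\mu; \alpha) = t(\mu; \alpha)(\mu + u_0(\alpha)) \tag{26}$$

where $u_0(\alpha^*) = 0$,

$$t(\mu; \alpha) = t_{m-1}(\alpha)\mu^{m-1} + t_{m-2}(\alpha)\mu^{m-2} + \cdots + t_0(\alpha), \quad t_{m-1}(\alpha) > 0 \tag{27}$$

$t_0(\alpha) > 0$, and $u_0(\alpha)$ and $t_0(\alpha)$ are smooth functions of $\alpha$. Let the eigenvalues of $J(\alpha)$ which are purely imaginary at $\alpha = \alpha^*$ be $\lambda^0(\alpha)$ and $\bar{\lambda}^0(\alpha)$. Then, from Eqs. 24 and 26, $\operatorname{Re}\lambda^0(\alpha) = (\lambda^0(\alpha) + \bar{\lambda}^0(\alpha))/2 = \mu^0(\alpha)/2 = -u_0(\alpha)/2$. Therefore

$$\frac{d}{d\alpha}\left(\operatorname{Re}\lambda^0(\alpha^*)\right) \neq 0
\iff -\frac{1}{2}\left[\frac{d}{d\alpha}\left(u_0(\alpha^*)\right)\right] \neq 0
\iff -\frac{1}{2}\left[\frac{d}{d\alpha}\left(t_0(\alpha^*)u_0(\alpha^*)\right)\right] \neq 0
\iff -\frac{1}{2}\left[\frac{d}{d\alpha}\left(q_0(\alpha^*)\right)\right] \neq 0 \tag{28}$$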

■ This theorem is an immediate consequence of Theorems 2, 4, and 6. In addition, Fuller mentioned the necessary condition for Eq. 14 to have at least one pair of purely imaginary roots, which is expressed as |G| = 0. In Guckenheimer et al. (1997), this condition is used for the numerical detection of candidates of Hopf bifurcation points. On this point, see also Kuznetsov (1998) and Govaerts (2000). It must be noted that Theorem 7 involves redundant conditions. The following theorem is an immediate consequence of Theorem 5, and the idea of the proof is due to Araposthathis and Jury’s (1979) proof of Theorem 5. Theorem 9: Given Eqs. 4–6 with $q_m > 0$, Eq. 4 has a simple pair of purely imaginary roots, and all other roots have negative real parts if and only if $p_0 > 0$, $q_0 = 0$, and $q_1, q_2, \ldots, q_{m-1} > 0$. Proof: Necessity: Necessity follows from Theorem 7. Sufficiency: From $p_0 > 0$, Eq. 4 has no zero real roots. Then it is deduced that Eq. 4 can either have an even number of positive real roots or no positive real roots. However, if $q_0 = 0$ and $q_1, q_2, \ldots, q_{m-1} > 0$, Eq. 5 has all its real roots in the open left half plane except for one zero real root. Thus, from Eq. 6, the real roots of Eq. 4 must be negative. Also, the real roots of Eq. 5 include twice the real parts of the complex roots of Eq. 4. Hence, Eq. 4 has a simple pair of purely imaginary roots and all other roots have negative real parts. ■ From Theorem 9, (i″) in Theorem 8 can also be replaced with (i‴) below. Theorem 10: Assume there is a smooth curve of equilibria $(z(\alpha), \alpha)$ with $z(\alpha^*) = z^*$ for the system Eq. 16. Conditions (i) and (ii) for a simple Hopf bifurcation are equivalent to the following conditions in the characteristic polynomial $p(\lambda; \alpha)$ and the associated polynomial $q(\mu; \alpha)$:

(i‴) $p_0(\alpha^*) > 0$, $q_0(\alpha^*) = 0$, $q_1(\alpha^*), q_2(\alpha^*), \ldots, q_{m-1}(\alpha^*) > 0$

(ii″) $\dfrac{d}{d\alpha}\left(q_0(\alpha^*)\right) \neq 0$

Remark 2: The problem of obtaining similar criteria of simple Hopf bifurcations without redundant conditions is still an open problem.
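The conditions of Theorem 10 can be checked mechanically for a given Jacobian family. The sketch below is an illustration under stated assumptions, not the author's procedure: the helper names are hypothetical, the one-parameter Jacobian `J` is a made-up companion matrix whose characteristic polynomial is $\lambda^3 + \lambda^2 + \lambda + (1 + \alpha)$, the characteristic polynomials come from the Faddeev–LeVerrier recursion, and condition (ii″) is approximated by a central difference.

```python
from fractions import Fraction

def bialternate_G(A):
    """G = 2 A . I_n built from the Kronecker-delta form of its elements."""
    n = len(A)
    labels = [(p, q) for p in range(1, n) for q in range(p)]
    d = lambda i, j: 1 if i == j else 0
    return [[A[p][r] * d(q, s) - A[p][s] * d(q, r)
             + d(p, r) * A[q][s] - d(p, s) * A[q][r]
             for (r, s) in labels]
            for (p, q) in labels]

def char_poly(M):
    """Coefficients [1, c_{k-1}, ..., c_0] of |x I - M| (Faddeev-LeVerrier)."""
    k = len(M)
    M = [[Fraction(x) for x in row] for row in M]
    I = [[Fraction(int(i == j)) for j in range(k)] for i in range(k)]
    coeffs, B = [Fraction(1)], I
    for step in range(1, k + 1):
        MB = [[sum(M[i][t] * B[t][j] for t in range(k)) for j in range(k)]
              for i in range(k)]
        c = -sum(MB[i][i] for i in range(k)) / step
        coeffs.append(c)
        B = [[MB[i][j] + c * I[i][j] for j in range(k)] for i in range(k)]
    return coeffs

def simple_hopf_conditions(J, alpha_star, h=Fraction(1, 1000)):
    """Return (condition (i''') holds?, finite-difference estimate of dq0/dalpha)."""
    q0 = lambda a: char_poly(bialternate_G(J(a)))[-1]
    p = char_poly(J(alpha_star))
    q = char_poly(bialternate_G(J(alpha_star)))
    # Exact arithmetic lets q0 be compared with zero; with floats use a tolerance.
    cond_i = p[-1] > 0 and q[-1] == 0 and all(c > 0 for c in q[1:-1])
    slope = (q0(alpha_star + h) - q0(alpha_star - h)) / (2 * h)
    return cond_i, slope

# Hypothetical family: characteristic polynomial x^3 + x^2 + x + (1 + alpha)
J = lambda a: [[0, 1, 0], [0, 0, 1], [-(1 + a), -1, -1]]
cond_i, slope = simple_hopf_conditions(J, alpha_star=0)
print(cond_i, slope != 0)
```

At $\alpha^* = 0$ the hypothetical family has eigenvalues $\pm i$ and $-1$, so both conditions hold and the example exhibits a simple Hopf bifurcation in the sense of Theorem 10.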

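5 Examples Example 1: For the system Eq. 16 with n = 2, the Jacobian matrix is

$$J(\alpha) = \begin{bmatrix} a_{11}(\alpha) & a_{12}(\alpha) \\ a_{21}(\alpha) & a_{22}(\alpha) \end{bmatrix} \tag{29}$$

and the matrix defined by Eq. 25 is

$$G(\alpha) = \left[a_{22}(\alpha) + a_{11}(\alpha)\right] \tag{30}$$

In this case, (i‴) in Theorem 10 is $p_0(\alpha^*) \equiv |J(\alpha^*)| > 0$ and $q_0(\alpha^*) \equiv -G(\alpha^*) = 0$. Thus, the conditions for a simple Hopf bifurcation are

$$|J(\alpha^*)| > 0, \quad -G(\alpha^*) = 0 \tag{31}$$

and

$$\frac{d}{d\alpha}\left(-G(\alpha^*)\right) \neq 0 \tag{32}$$

Example 2: For the system Eq. 16 with n = 3, the Jacobian matrix is

$$J(\alpha) = \begin{bmatrix} a_{11}(\alpha) & a_{12}(\alpha) & a_{13}(\alpha) \\ a_{21}(\alpha) & a_{22}(\alpha) & a_{23}(\alpha) \\ a_{31}(\alpha) & a_{32}(\alpha) & a_{33}(\alpha) \end{bmatrix} \tag{33}$$

and the matrix defined by Eq. 25 is

$$G(\alpha) = \begin{bmatrix} a_{22}(\alpha) + a_{11}(\alpha) & a_{23}(\alpha) & -a_{13}(\alpha) \\ a_{32}(\alpha) & a_{33}(\alpha) + a_{11}(\alpha) & a_{12}(\alpha) \\ -a_{31}(\alpha) & a_{21}(\alpha) & a_{33}(\alpha) + a_{22}(\alpha) \end{bmatrix} \tag{34}$$

In this case, (i‴) in Theorem 10 is $p_0(\alpha^*) \equiv -|J(\alpha^*)| > 0$, $q_0(\alpha^*) \equiv -|G(\alpha^*)| = 0$, $q_1(\alpha^*) \equiv |G(\alpha^*)_1| + |G(\alpha^*)_2| + |G(\alpha^*)_3| > 0$, and $q_2(\alpha^*) \equiv -\operatorname{tr}G(\alpha^*) > 0$, where $G(\alpha^*)_i$ stands for the submatrix of $G(\alpha^*)$ after deleting its i-th row and column. Then $q_1(\alpha^*) > 0$ is redundant, since $q_1(\alpha^*) = (p_2(\alpha^*))^2 + p_1(\alpha^*)$. Thus, the conditions for a simple Hopf bifurcation are

$$-|J(\alpha^*)| > 0, \quad -|G(\alpha^*)| = 0, \quad -\operatorname{tr}G(\alpha^*) > 0 \tag{35}$$

and

$$\frac{d}{d\alpha}\left(-|G(\alpha^*)|\right) \neq 0 \tag{36}$$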
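The identities used in Example 2 can be verified with elementary symmetric functions of the roots. The following worked derivation is a sketch added for illustration; the symbols $e_1, e_2, e_3$ and $\mu_k$ are notation introduced here, and $p_3 = q_3 = 1$ is assumed, as for a characteristic polynomial.

```latex
% Roots \lambda_1, \lambda_2, \lambda_3 of p(\lambda) = \lambda^3 + p_2\lambda^2 + p_1\lambda + p_0.
% Elementary symmetric functions (notation assumed for this sketch):
%   e_1 = \lambda_1 + \lambda_2 + \lambda_3, \quad
%   e_2 = \lambda_1\lambda_2 + \lambda_1\lambda_3 + \lambda_2\lambda_3, \quad
%   e_3 = \lambda_1\lambda_2\lambda_3,
% so that p_2 = -e_1, \; p_1 = e_2, \; p_0 = -e_3.
% Each root-pair sum omits exactly one root: \mu_k = e_1 - \lambda_k \ (k = 1, 2, 3).
\begin{align*}
q_2 &= -\sum_{k} \mu_k = -2e_1 = 2p_2 = -\operatorname{tr} G, \\
q_1 &= \sum_{j<k} \mu_j \mu_k
     = 3e_1^2 - 2e_1 \cdot e_1 + e_2
     = e_1^2 + e_2 = p_2^2 + p_1, \\
q_0 &= -\prod_{k} (e_1 - \lambda_k)
     = -\left(e_1^3 - e_1 \cdot e_1^2 + e_2 e_1 - e_3\right)
     = -(e_1 e_2 - e_3)
     = p_2 p_1 - p_0 = H_2 .
\end{align*}
```

Since $p_1 > 0$ already follows from the necessity part of Theorem 7, the second line shows why $q_1 > 0$ adds no information when n = 3, and the third line shows that the condition $-|G| = 0$ coincides with $H_2 = 0$ in the Liu criterion.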

6 Economic Application Consider the generalized Tobin model formulated by Hadjimichalakis (1971) and studied by Benhabib and Miyao (1981). The system is as follows:

$$\dot{k} = sf(k) - (1 - s)(\theta - q)m - nk \tag{37}$$

$$\dot{m} = m(\theta - \hat{p} - n) \tag{38}$$

$$\dot{q} = \gamma(\hat{p} - q) \tag{39}$$

$$\hat{p} = \varepsilon[m - L(k, q)] + q \tag{40}$$

where k, m, q, and $\hat{p}$ are the capital–labor ratio, the money stock per capita, and the expected and actual rates of inflation, respectively; s, $\theta$, n, $\gamma$, and $\varepsilon$ are the saving ratio, the rate of monetary expansion, the natural rate of growth, and the speeds of adjustment of expectations and of the price level, respectively; and the over-dot denotes the time derivative. Furthermore, L(k, q) and f(k) are assumed to be smooth functions. To cope with the instability of the original Tobin model, several generalizations of the model have been proposed, together with stability analyses (see the above references and those cited therein). From the viewpoint of this study, it is noted that for the system Eqs. 37–39 with $\hat{p} = \varepsilon[m - L(k, q)]$ instead of Eq. 40, Hadjimichalakis and Okuguchi (1979) examined the stability of its equilibrium by utilizing the Fuller criterion, while Benhabib and Miyao (1981) showed that the system Eqs. 37–40 has a simple Hopf bifurcation for a certain value $\gamma = \gamma^*$ by implicitly using the Liu criterion. In what follows, the Liu criterion and the alternative criterion in this study will be applied to the system Eqs. 37–40 to show their differences. Let us denote the Jacobian matrix of the system as $J(\gamma) = (a_{ij}(\gamma))$, i, j = 1, 2, 3. Then the Liu criterion becomes $-|J(\gamma^*)| > 0$, $-\operatorname{tr}J(\gamma^*) > 0$, $H_2(\gamma^*) = 0$, and $dH_2(\gamma^*)/d\gamma \neq 0$. In the above conditions,

$$H_2(\gamma^*) = \begin{vmatrix} p_2(\gamma^*) & p_0(\gamma^*) \\ 1 & p_1(\gamma^*) \end{vmatrix} \tag{41}$$

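where

$$\begin{aligned}
p_2(\gamma^*) &= -(sf' - n - \varepsilon m - \gamma^*\varepsilon L_2) \\
p_1(\gamma^*) &= \begin{vmatrix} -\varepsilon m & m(\varepsilon L_2 - 1) \\ \gamma^*\varepsilon & -\gamma^*\varepsilon L_2 \end{vmatrix}
+ \begin{vmatrix} sf' - n & (1 - s)m \\ -\gamma^*\varepsilon L_1 & -\gamma^*\varepsilon L_2 \end{vmatrix}
+ \begin{vmatrix} sf' - n & -(1 - s)n \\ \varepsilon m L_1 & -\varepsilon m \end{vmatrix} \\
p_0(\gamma^*) &= -\begin{vmatrix} sf' - n & -(1 - s)n & (1 - s)m \\ \varepsilon m L_1 & -\varepsilon m & m(\varepsilon L_2 - 1) \\ -\gamma^*\varepsilon L_1 & \gamma^*\varepsilon & -\gamma^*\varepsilon L_2 \end{vmatrix}
\end{aligned} \tag{42}$$

and $f' = df(k^*)/dk$, $L_1 = \partial L(k^*, q^*)/\partial k$, and $L_2 = \partial L(k^*, q^*)/\partial q$. On the other hand, Eqs. 35 and 36 give $-|J(\gamma^*)| > 0$, $-|G(\gamma^*)| = 0$, $-\operatorname{tr}G(\gamma^*) > 0$, and $d(-|G(\gamma^*)|)/d\gamma \neq 0$. Here,

$$-|G(\gamma^*)| = -\begin{vmatrix}
-\varepsilon m + sf' - n & m(\varepsilon L_2 - 1) & -(1 - s)m \\
\gamma^*\varepsilon & -\gamma^*\varepsilon L_2 + sf' - n & -(1 - s)n \\
\gamma^*\varepsilon L_1 & \varepsilon m L_1 & -\gamma^*\varepsilon L_2 - \varepsilon m
\end{vmatrix} \tag{43}$$

These expressions show that the determinant $H_2(\gamma^*)$ has elements which are themselves sums of determinants, while the elements of the determinant of the matrix $G(\gamma^*)$ are simple combinations of the $a_{ij}(\gamma^*)$. Note that $-\operatorname{tr}G(\gamma^*) = -2\operatorname{tr}J(\gamma^*)$ and $-|G(\gamma^*)| = H_2(\gamma^*)$.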

7 Conclusion The determinantal criterion of simple Hopf bifurcations is provided in Sect. 4 and applied to the generalized Tobin model in Sect. 6.

References
Araposthathis A, Jury EI (1979) Remarks on redundance in stability criteria and a counter-example to Fuller’s conjecture. Int J Control 29:1027–1034
Asada T, Yoshida H (2003) Coefficient criterion for four-dimensional Hopf bifurcations: a complete mathematical characterization and applications to economic dynamics. Chaos, Solitons Fractals 18:525–536
Benhabib J, Miyao T (1981) Some new results on the dynamics of the generalized Tobin model. Int Econ Rev 22:589–596
Fuller AT (1968) Conditions for a matrix to have only characteristic roots with negative real parts. J Math Anal Appl 23:71–98
Gantmacher FR (1959) The theory of matrices. Chelsea, New York
Govaerts WJF (2000) Numerical methods for bifurcations of dynamical equilibria. SIAM, Philadelphia
Guckenheimer J, Holmes P (1983) Nonlinear oscillations, dynamical systems, and bifurcations of vector fields. Springer, Berlin
Guckenheimer J, Myers M, Sturmfels B (1997) Computing Hopf bifurcations. I. SIAM J Numer Anal 34:1–21
Hadjimichalakis MG (1971) Money, expectations, and dynamics: an alternative view. Int Econ Rev 12:381–402
Hadjimichalakis MG, Okuguchi K (1979) The stability of a generalized Tobin model. Rev Econ Stud 46:175–178
Jury EI (1982) Inners and stability of dynamic systems, 2nd edn. Krieger, Malabar, FL
Kuznetsov YA (1998) Elements of applied bifurcation theory, 2nd edn. Springer, Berlin
Liu WM (1994) Criterion of Hopf bifurcations without using eigenvalues. J Math Anal Appl 182:250–256
Manfredi P, Fanti L (2004) Cycles in dynamic economic modeling. Econ Model 21:573–594
Murata Y (1977) Mathematics for stability and optimization of economic systems. Academic Press, New York
Routh EJ (1877) Stability of a given state of motion. Macmillan, London
Samuelson PA (1941) Conditions that the roots of a polynomial be less than unity in absolute value. Ann Math Stat 12:360–364
Stéphanos C (1900) Sur une extension du calcul des substitutions linéaires. J Math Pures Appl 6:73–128

Part III Economics of Space: Empirical Analysis

10. Public- and Private-School Competition: The Spatial Education Production Function David M. Brasington

Summary. School vouchers may increase the competition that public-school districts face. Greater competition may spur public schools to improve student outcomes, which reliably predict labor market productivity and earnings. Previous school competition studies did not use spatial statistics, so they failed to incorporate spillovers and the effect of omitted variables into their education production functions. Significant spatial effects are found in all regressions, and spatial statistics improve adjusted $R^2$ values. There seems to be no consistent association between private-school attendance rates and public-school achievement, or between the number of public-school districts in a county and public-school performance. Competitive effects, which seem plausible in nonspatial regressions, dissipate when spatial statistics are used. When school inputs appeared to be statistically significant in nonspatial regressions, the spatial regressions generally made the significance disappear. Poverty appeared to depress reading and writing rates, but this effect disappeared in the spatial models. Key words. Spillovers, Human capital formation, Spatial statistics, School vouchers, Private- and public-school competition

1 Introduction Economists care about schooling because school quality, as measured by proﬁciency test scores, is an important predictor of labor market productivity and earnings (Sander 1996; Bishop 1989; Loury and Garman 1995; Murnane et al. 1995). Whatever improves school quality might also improve labor market productivity and earnings. Economics Department, Louisiana State University, Baton Rouge, LA 70803, USA


The private sector and the government both provide primary and secondary schooling. The dual provision of schooling has inevitably led to comparisons between private and public schools, with many people stating that private schools do a better job with less money. Some economists speculate that private schools must compete for students, and must therefore deliver a high-quality product at low cost to survive. Public-school systems, which have local monopoly power, are under less competitive pressure to deliver high-quality and efficient services (Couch et al. 1993). School vouchers are often proposed to increase competition. Under the current system, students are assigned to tax-funded (public) schools based on their residence. If students attend the schools they are assigned to, they pay no extra tuition, but if students want to attend private schools or a tax-funded school other than the one they are assigned to, parents must continue to pay taxes for their assigned public school, and they also have to pay full tuition rates at the school they attend. A voucher program would lower the cost of attending another school. Under a voucher program, parents receive a voucher (a check from the government) that can be redeemed at the assigned school or another school. Some voucher plans restrict voucher use to public schools only, while other voucher plans allow recipients to use the voucher at any public or private school. A flurry of recent research uses education production functions to investigate the extent to which public schools respond to competition from other public and private schools. By knowing whether public schools respond more to competition from other public schools or from private schools, the most effective voucher policy may be designed. If vouchers increase school competition, student outcomes may improve, and the labor force of the future may be more productive. Previous research has failed to address some important econometric issues.
Public schooling is often thought to have positive spillovers over some spatial subgroup. Traditional education production functions ignore the possibility of spillovers. Furthermore, even the most careful education production functions are subject to omitted variable bias. Because of the public-policy importance of the voucher issue, estimates of school competition must be as unbiased and efficient as possible. This study introduces the spatial education production function. The spatial education production function uses spatial statistics, i.e., an estimation that accounts for the spatial layout of the data, to address spillovers and omitted variable bias. The study found spatial effects in all sections of the Ohio proficiency test and in all 15 spatial regressions. Therefore, education production functions that do not account for the spatial configuration of the data are vulnerable to biased, inefficient, and inconsistent parameter estimates (Anselin 1988). The specific spatial models used are designed to be insensitive to outliers, and for this reason are expected to have lower adjusted $R^2$ than nonspatial models. Nevertheless, the spatial education production functions have higher adjusted $R^2$ values than their nonspatial counterparts, suggesting that spatial statistics add to explanatory power. Most of the nonspatial regressions show a positive relationship between public-school outcomes and the number of public-school districts in the county. Once spatial statistics are used to address spillovers and omitted variables, only 13% of the regressions show a relationship. The magnitude of the effect is trivial: the elasticity of a test pass rate with respect to the number of public-school districts is 0.026.¹ Overall, this study provides weak evidence that public schools respond a little to competition from other public schools. The results are more similar to those of Rothstein (2000), who finds no effect, than to those of Hoxby (2000), who finds large competitive effects. A comparison of spatial and nonspatial estimation methods fails to support the assertion that traditional regressions bias the parameter estimate of private-school competition downward (Hoxby 1998). Neither the traditional nor the spatial education production functions indicate a consistent public-school response to competition from private schools. The response is more prevalent in the nonspatial regressions (40%) than the spatial regressions (20%), suggesting that once spatial effects are considered, the correlation between private-school attendance and public-school outcomes is weakened. In fact, when the result is significant, it is almost always negative. The implications for allowing vouchers at private schools are discussed.

¹ The number quoted is from the %PASS MATH results, which is the set of results with the highest proportion of statistically significant results. The median of the three significant parameter estimates is used (0.16), and the values of #DISTRICTS and %PASS MATH are evaluated at the sample means. All elasticities throughout the paper are evaluated at sample means.

2 Theoretical Model of School Competition

The main contribution of this chapter is its novel application of an empirical technique, and the results it yields about school competition. However, some researchers find it useful to couch empirical work in a theoretical framework that motivates the statistical test. A theoretical model of the production of education with spillovers is now presented. It is based on the work of Murdoch et al. (1993), but deviates in several respects.² The model abstracts from many of the complex issues involved in education; its purpose is to motivate the use of spatial statistics in the empirical section by showing that there may be spillovers between school districts. Each school district $i$ undertakes an activity $\alpha$, the provision of schooling, which is measured by proficiency test pass rates. Production takes the following form:

$$\alpha_i = \alpha_i(d_i, Y_i, b_i) \tag{1}$$

where $d$ is a vector of student and parent demographic characteristics, $Y$ is a vector of community demographic characteristics, and $b$ is school board administrative efficiency and effectiveness. In turn,

$$b_i = b_i(k_i, e_i(k_i), s_i) \tag{2}$$

where $k$ is a vector of variables related to competition in the schooling market, $e$ is enrollment, and $s$ is a vector of inputs chosen by the school board such as teacher quality and the pupil/teacher ratio. Vector $s$ is chosen in a framework outside the current model, but the model is flexible enough to incorporate either rent-seeking or output-maximizing behavior by the administration. Education is assumed to exhibit positive externalities (Wyckoff 1984); therefore, although the median voter chooses $\alpha$, the total amount of schooling consumption $g$ is

$$g_i = \alpha_i + \omega_i \tag{3}$$

where $\omega$ represents spill-ins from public-schooling provision in neighboring communities. The median voter's utility is a function of the total amount of schooling consumption, regardless of its source. Therefore,

$$U_i = U_i(n_i, g_i) \tag{4}$$

² Among the main differences are (1) converting from expenditures to proficiency tests as an outcome measure, (2) including variables for student and parent demographic characteristics, community demographic characteristics, variables related to competition, school characteristics, enrollment, and school board administrative efficiency and effectiveness, and (3) deriving first-order conditions from the model.

Equation 4 portrays the median voter's utility as a function of consumption of a numeraire good $n$ as well as public schooling. The utility function is assumed to be strictly quasiconcave, twice continuously differentiable, and monotonically increasing in its arguments. School district $i$ not only receives spill-ins from neighboring communities, it generates them as well. When the median voter chooses $\alpha$, part of that provision is privately consumed. Another part is the pure public portion of schooling consumed by the community in question as well as by its neighbors. Formally,

$$\alpha_i = t_i + p_i \tag{5}$$

where $t$ is the community's own provision of the public aspect of schooling, and $p$ is the consumption of the purely private portion of schooling. The proportion of a community's provision of public schooling that stays in the community and the proportion that spills over into neighboring school districts are described by the following joint product technology:

$$p_i = \theta_i \alpha_i \tag{6}$$

$$t_i = \phi_i \alpha_i \tag{7}$$

where $\theta$ is the proportion of own provision of schooling that is privately consumed, and $\phi$ is the fraction that is a pure public good. Equation 7 holds for all school districts, so

$$\omega_i = \sum_{r \neq i} \phi_r \alpha_r \tag{8}$$

where $\phi_r$ is the fraction of activity in school district $r$ that spills into jurisdiction $i$ as a public good. Because $g_i = t_i + p_i + \omega_i$, Eq. 4 may be rewritten as

$$U_i = U_i(n_i, t_i + p_i + \omega_i) \tag{9}$$

The median voter maximizes Eq. 9 subject to the following budget constraint:

$$y_i = n_i + \tau_i \alpha_i \tag{10}$$

In Eq. 10, $y$ is income and $\tau$ is the per-unit cost of education faced by the median voter. The median voter chooses $\alpha$ taking other districts' schooling decisions as given. Constrained maximization proceeds by setting up the Lagrangian

$$L = U_i\left(n_i,\ \phi_i \alpha_i + \theta_i \alpha_i + \sum_{r \neq i} \phi_r \alpha_r\right) + \lambda\,(y_i - n_i - \tau_i \alpha_i) \tag{11}$$

where $\lambda$ is the Lagrange multiplier. Partial differentiation yields the following first-order conditions, along with the budget constraint:

$$\frac{\partial U_i}{\partial n_i} - \lambda = 0 \tag{12}$$

$$\frac{\partial U_i}{\partial g_i}\,(\theta_i + \phi_i) - \tau_i \lambda = 0 \tag{13}$$

so that the marginal rate of substitution between own contribution of schooling and the numeraire good is equated to the tax price:

$$\frac{\partial U_i / \partial \alpha_i}{\partial U_i / \partial n_i} = \tau_i \tag{14}$$
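The step from the first-order conditions to Eq. 14 can be spelled out as follows. This is a sketch rather than the author's own derivation: it writes $g_i$ for total schooling consumption (the second argument of $U_i$ in the Lagrangian) and applies the chain rule. The notation $\partial U_i / \partial g_i$ is an assumption introduced here for the derivative of utility with respect to that argument.

```latex
\begin{align*}
\frac{\partial L}{\partial n_i}
  &= \frac{\partial U_i}{\partial n_i} - \lambda = 0
  \quad\Longrightarrow\quad \lambda = \frac{\partial U_i}{\partial n_i}, \\[4pt]
\frac{\partial L}{\partial \alpha_i}
  &= \frac{\partial U_i}{\partial g_i}\,(\theta_i + \phi_i) - \tau_i \lambda = 0, \\[4pt]
\frac{\partial U_i}{\partial \alpha_i}
  &= \frac{\partial U_i}{\partial g_i}\,\frac{\partial g_i}{\partial \alpha_i}
   = \frac{\partial U_i}{\partial g_i}\,(\theta_i + \phi_i),
  \qquad g_i = \phi_i \alpha_i + \theta_i \alpha_i + \sum_{r \neq i} \phi_r \alpha_r, \\[4pt]
\frac{\partial U_i / \partial \alpha_i}{\partial U_i / \partial n_i}
  &= \frac{\tau_i \lambda}{\lambda} = \tau_i .
\end{align*}
```

The spill-in term $\sum_{r \neq i} \phi_r \alpha_r$ drops out of the derivative because the median voter takes other districts' schooling decisions as given.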

The theoretical model has shown how spillovers may be involved in the production of education, and how the spillovers enter communities' choice of schooling levels. The theoretical model contains variables representing student and parent characteristics, community characteristics, and school-specific inputs. The following section discusses the inputs to education in greater detail in order to justify the choice of variables in the empirical section.

3 Literature Review and Choice of Education Production Function Variables

Education is produced using the following inputs: students and parents, the community, and school-specific factors. Each input group is discussed in turn. Every attempt is made to include the typical set of education production function variables as well as variables related to competition.

Student and parent characteristics are typically strongly related to public-school outcomes. One such characteristic is the presence of a two-parent household. A single-parent household may not devote as much attention to its children; therefore, %BOTH PARENTS is expected to be positively related to student outcomes. Another theoretically important household characteristic is income. PARENT INCOME is expected to be positively related to student achievement. Conversely, low parent education levels may imply that parents are less able to help their children with their homework, or that they do not value education highly. Therefore, %NO DIPLOMA and %HS DIPLOMA ONLY are likely to be negatively related to student outcomes compared with higher levels of parent education. The proportion of nonwhite students, %MINORITY, is also included; this has been found to be negatively related to school outcomes in many education production functions.

The characteristics of the community include demographic factors and variables related to competition. The proportion of school-district residents who are living in poverty, %POVERTY, is expected to be negatively related to school outcomes. Competitive pressures may also influence the supply of public-school quality. Competition may come from private schools as well as from other public schools. Each of these possibilities is discussed in turn.

Theoretical models have found positive (Falkinger 1994), negative (Epple and Romano 1998), and ambiguous (Ireland 1990) effects of increased private-school enrollment on public-school performance. Empirical studies have shown more consistent results. Couch et al. (1993) found that the percentage of students in a public-school district who attended private schools had a positive relationship with standardized algebra test scores in North Carolina's public schools; public schools seem to respond to competition from private schools. Borland and Howsen (1996) extended Couch et al.'s analysis to include the effect of both public- and private-school competition on public-school performance. This competition from both sources is captured in a single Herfindahl index, which is positively related to public-school performance. Hoxby (1998) also found that competition from private schools was positively related to public-school achievement. Dee (1998) showed that competition from private schools had a positive and statistically significant impact on the high-school graduation rates of neighboring public schools. In contrast, Zanzig (1997) found a negative relationship between public-school math test scores and the percentage of students in the county who attend private schools. %PRIVATE is included in the current study in order to capture the relationship between private-school competition and public-school performance.

There may also be competition between public schools. Hoxby (1996, 1998) stated that such competition should exist because efficient and high-quality education providers are rewarded with higher budgets. An increase in quality causes an increase in house prices, which increases the tax base, tax collections, and the size of the school budget. In addition, Hoxby suggests that schools are more responsive to parents' desires, as opposed to school staff desires, when there are many public-school districts in the area. Hoxby further asserts that when parents have more choice among districts, they are more involved in their children's schooling. Therefore, one would expect a positive relationship between the number of school districts and public-school outcomes. Indeed, Hoxby (1998, 2000) finds such a relationship. Using a Herfindahl index to measure competition, she found that an increase in competition is associated with a small but statistically significant increase in achievement. In contrast, when Rothstein (2000) uses Hoxby's (2000) data with different instruments, he finds no statistically significant results. Rothstein further suggests that Hoxby's failure to control for private-school attendance led to her findings about public-school competition. Zanzig (1997) generally found that the number of public-school districts in a county had a positive effect on public-school performance until a threshold of three or four districts is reached. Beyond this threshold, additional public schools depress performance. Based on the literature, the number of public-school districts in each of Ohio's 88 counties, #DISTRICTS, is included to capture public-school competition.


Finally, the characteristics of the schools themselves may influence student achievement. There are dissenters (Ornstein 1993; Fowler and Walberg 1991; Jewell 1989), but considerable evidence suggests that school-district size is consistently negatively related to student outcomes, although the magnitude is small (Haller 1992; Stern 1989; Friedkin and Necochea 1988; Brasington 1997). Therefore, COHORT SIZE is included in the education production function; a negative relationship with student performance is anticipated. Other school-specific characteristics include TEACHER SALARY, TEACHER EXPERIENCE, and TEACHER EDUCATION levels, as well as the PUPIL/TEACHER RATIO. Although the debate continues as to the significance of these factors, no education production function is complete without them. Data on per-pupil expenditure are also available, but this variable is highly correlated with TEACHER SALARY (0.81) and PUPIL/TEACHER RATIO (−0.89), and seems to cause multicollinearity problems.

The student outcomes are now described. Ohio's high-school students must pass a ninth-grade proficiency test to receive a full high-school diploma. Students who do not pass this test by the end of twelfth grade, but pass their coursework, receive a certificate of attendance instead. All students must take the test, so sample selection bias is not an issue (Hanushek and Taylor 1990).³ The 1993 proficiency test contains four sections: writing, reading, math, and citizenship. The measures of school outcome employed are the proportion of ninth-graders in each district who passed each portion of the ninth-grade proficiency test in 1993. Of the 611 Ohio school districts, 605 reported their 1993 proficiency test results.

The recent education production literature includes studies that use levels of outcomes (Hoxby 1996; Sander 1996; Kennedy and Siegfried 1997; Zanzig 1997; Dee 1998; Couch et al. 1993; Borland and Howsen 1996) as well as the value-added approach (Meyer 1996; Hanushek 1992; Gomes-Neto et al. 1997; Figlio 1999). Two ways of implementing value-added were tried. The value-added approach of Meyer (1996) yielded no additional insights; the approach of Hayes and Taylor (1996) yielded an adjusted $R^2$ of 0.01. This study therefore restricts its attention to variations in the level of outcomes, rather than value-added. The results of this study complement those of Brasington and Haurin (2006), who found that levels of proficiency are valued by the housing market, while measures of the value-added of a school are not. Definitions of the variables, as well as sources and means, are shown in Table 1.

³ Only students assessed to have a learning disability are exempt. A student is excused from the proficiency test only if the student's team leader determines each year that the student has a learning disability and exempts that student in every year of his or her high-school career.

Table 1. Variable definitions, sources, and means

| Variable | Definition and source | Mean (s.d.) |
|---|---|---|
| %PASS ALL | Proportion of ninth-grade students passing all sections of the ninth-grade proficiency test (math, citizenship, reading, and writing) in each school district in 1993 (1) | 0.51 (0.14) |
| %PASS MATH | Proportion of ninth-grade students passing the math section of the ninth-grade proficiency test in each school district in 1993 (1) | 0.61 (0.14) |
| %PASS CITIZENSHIP | Proportion of ninth-grade students passing the citizenship section of the ninth-grade proficiency test in each school district in 1993 (1) | 0.72 (0.12) |
| %PASS READING | Proportion of ninth-grade students passing the reading section of the ninth-grade proficiency test in each school district in 1993 (1) | 0.87 (0.07) |
| %PASS WRITING | Proportion of ninth-grade students passing the writing section of the ninth-grade proficiency test in each school district in 1993 (1) | 0.87 (0.08) |
| %BOTH PARENTS | Proportion of students in each school district living with two parents in 1990 (2) | 0.81 (0.09) |
| PARENT INCOME | Average parental income of students in each school district, in hundreds of thousands of dollars in 1990 (2) | 0.33 (0.11) |
| %PRIVATE | Proportion of students in grades nine through twelve living in each public school district who attended nonpublic schools in 1990 (2) | 0.06 (0.07) |
| %MINORITY | Proportion of students in each school district who were nonwhite in 1993 (1) | 0.06 (0.12) |
| COHORT SIZE | Average number of students in each grade in each high school in 1993, in thousands: school district fall average daily membership divided by the number of grade levels the school district serves, divided by the number of high schools the district has, divided by 1000 (1, 3) | 0.19 (0.14) |
| TEACHER SALARY | Average teacher salary in each school district in hundreds of thousands of dollars in 1993 (1) | 0.33 (0.05) |
| TEACHER EXPERIENCE | Average teacher experience in each school district in hundreds of years in 1993 (1) | 0.15 (0.02) |
| PUPIL/TEACHER RATIO | Total average daily membership in each school district divided by the total number of classroom teachers, in hundreds of students per teacher in 1993 (1) | 0.19 (0.02) |
| TEACHER EDUCATION | Number of teachers with a Masters degree divided by the total number of regular teachers in each school district in 1993 (1) | 0.33 (0.05) |
| %POVERTY | Proportion of persons in each school district living under the official poverty income level in 1990 (2) | 0.10 (0.07) |
| #DISTRICTS | Number of school districts in each county in 1993, in hundreds of districts (4) | 0.10 (0.07) |
| %NO DIPLOMA | Proportion of parents of school-age children in each school district in 1990 who do not have a high school diploma (2) | 0.16 (0.08) |
| %HS DIPLOMA ONLY | Proportion of parents of school-age children in each school district in 1990 who have a high school diploma but have not attended college (2) | 0.45 (0.11) |
| LAG | Spatial lag $Wx$ of the variable in question for the spatial Durbin model of Eq. 18 | - |

Means are shown with the standard deviation in parentheses. The number of observations was 605. Sources: (1) Ohio Department of Education, Information Management Services (1999); (2) School District Data Book (1994); (3) Ohio Department of Education (1993); (4) Ohio Department of Education (1985)

4 Traditional, Nonspatial Estimation

Theory provides no guide as to the functional form the education production function should take (Figlio 1999). Furthermore, a Davidson and MacKinnon (1981) test for linearity versus log-linearity yields inconclusive results; the linear functional form is adopted by default. It is only in the last few years that the literature on the economics of education has addressed endogeneity in education production functions (Akerhielm 1995; Hoxby 1998). From the school district point of view, TEACHER SALARY and PUPIL/TEACHER RATIO are choice variables and should be treated as endogenous. TEACHER EXPERIENCE and TEACHER EDUCATION are related to how long a teacher has been in the school district. These variables are probably determined simultaneously with student achievement, and are therefore treated as endogenous. Hoxby (1998) suggests that #DISTRICTS is endogenous to school quality, and that the number of streams in the area should be used as an instrument. Hoxby also suggests that %PRIVATE is endogenous to public-school quality. However, there is some evidence that school-district consolidation is not a function of public-school quality (Brasington 1999), and therefore the number of school districts in an area is not a function of school quality. In addition, a Hausman test suggests that both #DISTRICTS and %PRIVATE may be treated as exogenous in the education production function.⁴ Rothstein (2000) also argues in favor of treating #DISTRICTS as an exogenous variable.
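The chapter's footnote on the Hausman test describes running the education production function with and without the predicted values of the two suspect regressors and comparing an F statistic with a critical value. A minimal sketch of that regression-based form is given below. It is not the author's code: the data are synthetic, the instruments `z` are hypothetical, and the regressors are exogenous by construction, so the example only illustrates the mechanics.

```python
import numpy as np


def ssr(y, X):
    """Sum of squared residuals from an OLS fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)


def added_variable_f(y, X_restricted, X_extra):
    """F statistic for adding the columns of X_extra to the regression."""
    X_full = np.column_stack([X_restricted, X_extra])
    n, k_full = X_full.shape
    q = X_extra.shape[1]
    ssr_r, ssr_u = ssr(y, X_restricted), ssr(y, X_full)
    return ((ssr_r - ssr_u) / q) / (ssr_u / (n - k_full))


rng = np.random.default_rng(2)
n = 605                                   # sample size matching the chapter
z = rng.normal(size=(n, 2))               # hypothetical instruments
# Two suspect regressors (stand-ins for #DISTRICTS and %PRIVATE).
suspect = z @ np.array([[0.8, 0.1], [0.2, 0.7]]) + rng.normal(size=(n, 2))
# Outcome with an error independent of the regressors (exogenous by design).
y = 0.5 + suspect @ np.array([0.2, -0.2]) + rng.normal(size=n)

const = np.ones((n, 1))
Z = np.column_stack([const, z])
first_stage, *_ = np.linalg.lstsq(Z, suspect, rcond=None)
predicted = Z @ first_stage               # first-stage fitted values

X = np.column_stack([const, suspect])
f_stat = added_variable_f(y, X, predicted)
# 2.30 is the 10% critical value quoted in the chapter's footnote.
reject_exogeneity = f_stat > 2.30
```

If the predicted values add no explanatory power, the F statistic stays below the critical value and the regressors can be treated as exogenous, which is the conclusion the chapter reports for its own data (1.89 against 2.30).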

⁴ The form of the Hausman test is that detailed by Maddala (1992) and Ramanathan (1998). Specifically, education production functions are run with and without the predicted values of #DISTRICTS and %PRIVATE. The calculated F statistic is 1.89 compared to a critical value of 2.30 at the 0.10 level of significance, suggesting that #DISTRICTS and %PRIVATE may be treated exogenously. In unreported regressions, #DISTRICTS and %PRIVATE are treated endogenously, yielding somewhat larger parameter estimates for the two variables but no change in statistical significance.

5 The Spatial Education Production Function

The theoretical model of school-district competition includes educational spillovers. In fact, a good deal of research has investigated spillovers in public good provision.⁵ A natural way to account for externalities is to incorporate spatial autocorrelation in the statistical estimation: rarely do economic theory and statistical technique complement each other so naturally.⁶ When each school district affects the performance of neighboring school districts, spatial autocorrelation may exist (LeSage 1997a). The ordinary least-squares method does not account for the interplay between spatially close observations, which may lead to biased, inefficient, and inconsistent parameter estimates (Anselin 1988). Neighboring school districts affect each other more than school districts far away from each other (Wyckoff 1984); consequently, a spatial weight matrix must be constructed to summarize the spatial configuration of the observations.

⁵ See Bradford and Oates (1974), Voß (1991), and Reiter and Weichenrieder (1997) for surveys. See also Murdoch et al. (1993), Wyckoff (1984), and Edwards (1986).

⁶ Murdoch et al. (1993) use a spatial autoregressive model to capture spillovers in recreation expenditures in Los Angeles. Brasington and Hite (2005) use a spatial Durbin model to capture spillovers in environmental quality, and D.L. Millimet and V. Rangaprasad (2005) use spatial statistics to model spillovers in educational input hiring.

The traditional education production function takes the form

$$t_{ij} = \beta x_i + \varepsilon_i \tag{15}$$

where $t$ represents the test passes, $i$ is the school district, $j$ is the test section, $x$ is the set of student-, parent-, community-, and school-specific factors related to the test passes, and $\varepsilon$ is the error term. However, the traditional education production function ignores spillovers and other forms of spatial dependence. We first employ the Bayesian spatial error model of LeSage (1997b) to address heteroskedasticity, outliers, and omitted variables. This is the same as the traditional model in Eq. 15, but with a more complex error term:

$$
\begin{aligned}
t_{ij} &= \beta x_i + \varepsilon_i \\
\varepsilon &= \lambda W \varepsilon + \mu, \qquad \mu \sim N(0, \sigma^2 V) \\
V &= \mathrm{diag}(v_1, v_2, \ldots, v_n) \\
r / v_i &\sim \chi^2(r) / r \\
1 / \sigma^2 &\sim \Gamma(h, d)
\end{aligned}
\tag{16}
$$

In Eq. 16, $\varepsilon$ is a spatial autoregressive error term: the error term for each observation is related to the error terms of the neighboring observations. Unlike a time series, where an autoregressive lag represents nearby time periods, $\lambda W \varepsilon$ is an autoregressive lag based on nearby observations in space. The $W$ in the autoregressive term is a spatial weight matrix (LeSage 1997a) that summarizes the spatial layout of the data on a map. It identifies the nearest school districts for each observation. The spatial weight matrix $W$ in this study is constructed so that the five nearest neighbors are allowed to influence each school district. With 605 school districts in the sample, $W$ is a 605 × 605 matrix. Row 1 represents school district 1. For each column y in row 1, we must ask, "Is school district y one of the five nearest to school district 1?" If the answer is yes, the column gets a 1; if not, the column gets a 0.⁷ Each row then has five 1's. A common procedure in the spatial statistics literature is to make the sum of each row equal to unity, so with five neighbors, each neighbor is assigned a weight of 1/5 = 0.2.

⁷ The spatial statistics literature commonly assumes that an observation cannot be its own neighbor, so that the spatial weights matrix has a zero diagonal. Also, just as there is no consensus on functional form in traditional regressions, there is no consensus on the number of neighbors to use in spatial statistics, and the choice of the number of neighbors often affects estimation results. Five neighbors are chosen in the current study because, looking at a map of Ohio's school districts, it seems that on average each school district shares a border with five other school districts.
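The construction of $W$ described above can be sketched in a few lines. The example assumes that "nearest" means Euclidean distance between district centroids, which the chapter does not state, and it uses randomly generated coordinates in place of the actual Ohio data. The function name `knn_weight_matrix` is illustrative.

```python
import numpy as np


def knn_weight_matrix(coords, k=5):
    """Row-standardized k-nearest-neighbor spatial weight matrix."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    # Pairwise Euclidean distances between all points.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # An observation cannot be its own neighbor (zero diagonal in W).
    np.fill_diagonal(dist, np.inf)
    # Mark the k nearest neighbors of each row with a 1.
    nearest = np.argsort(dist, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    # Row-standardize so that each row sums to one (weights of 1/k).
    return W / W.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
centroids = rng.uniform(size=(605, 2))   # hypothetical district centroids
W = knn_weight_matrix(centroids, k=5)
```

Note that a nearest-neighbor matrix of this kind is generally not symmetric: district A may be among the five nearest to district B without the reverse being true.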


The $\lambda$ in Eq. 16 is the spatial autoregressive parameter to be estimated. It gives the degree to which the error terms of an observation and its neighbors are related. Finally, $\mu$ is a white-noise error term. The first two terms in Eq. 16 describe a spatial error model.⁸ The remaining terms distinguish the Bayesian model from the non-Bayesian spatial error model. The additional characterization of the error term helps correct for heteroskedasticity and outliers, and is discussed in greater detail in J.P. LeSage (1999), but a brief discussion is provided here. A big difference between Bayesian and non-Bayesian estimation is the use of prior information. We allow diffuse priors for $\lambda$, $\beta$, and $\sigma^2$. Ordinarily, the term $r$ in Eq. 16 would be assigned a gamma distribution with two parameters. Instead, following J.P. LeSage (1999, p. 121), we use an informative prior on $v_i$ of $r = 4$. This particular prior yields relatively constant estimates of $v_i$ in the presence of homoskedasticity, while at the same time accommodating nonconstant error variances in the presence of heteroskedasticity and outliers.⁹ Computational tricks by Barry and Pace (1999) and Pace and Barry (1998) are used to allow the sample to run in a reasonable amount of time.¹⁰ The spatial error model captures the influence of omitted variables that vary across space. Any omitted influence that varies across space will be subsumed in the error term, and the spatial autoregressive term will capture these influences. The income of parents and the race of the students are included, but the education production functions do not measure the income and racial characteristics of the neighborhood. These community characteristics may affect student performance above and beyond the influence from the parents, perhaps through volunteering efforts and tax-levy pass rates. Neighborhood crime rates are also omitted, and the accompanying feeling of security could affect student performance.
These omitted influences will be subsumed in the error term, and normally might adversely influence parameter estimates. However, the second term in Eq. 16 recognizes the correlation between the error terms of neighboring school districts. If people live in areas with other similar people, then income, race, and crime rates will be similar across space, and change in character gradually from school district to school district. If the error term for school district A is affected by income, race, and crime rates, the error term for nearby school district B is affected to a similar degree. The correlation between the error terms is captured by the spatial model. In this manner, the spatial error model addresses the influence of omitted variables in the education production function. A more complete intuitive explanation of how spatial statistics addresses omitted variables is found in Brasington and Hite (2005). A mathematical proof is available in Griffith (1988).

⁸ If $V$ is the identity matrix (every $v_i$ equal to one), the first two terms of Eq. 16 fully characterize a non-Bayesian spatial error model.

⁹ An alternative is to set the two parameters of the gamma distribution for $r$ to the informative priors of 8 and 2. Nearly identical estimates are achieved either way.

¹⁰ For example, the Bayesian spatial error model with 300 draws and 30 burn-in draws, using %PASS MATH as the dependent variable, took 629 seconds.

The Bayesian spatial error model in Eq. 16 depends on having a large number of draws to converge to the true joint posterior distribution of the parameters. With insufficient draws, parameter estimates cannot be trusted. Although convergence diagnostics are available, the true test of convergence is whether the estimates remain unchanged as draws are added. A model with 300 draws (with 30 additional burn-in draws) achieves similar results to a model with 1000 draws (with 100 additional burn-in draws), suggesting that 300 draws is sufficient.

There are other ways in which spatial statistics can capture the influence of omitted variables. One of these is a Bayesian spatial autoregressive model (LeSage 1997b):

$$t_{ij} = \rho W t_{ij} + \beta x_i + \varepsilon_i \tag{17}$$

$$
\begin{aligned}
\varepsilon &\sim N(0, \sigma^2 V) \\
V &= \mathrm{diag}(v_1, v_2, \ldots, v_n) \\
r / v_i &\sim \chi^2(r) / r \\
\sigma^2 &\sim \Gamma(h, d) \\
\rho &\sim \mathrm{uniform}(-1, 1)
\end{aligned}
$$

In Eq. 17, $\rho W t_{ij}$ is the spatial autoregressive term. This is the term that captures the essence of public good spillovers, allowing the educational production of each school district to depend on the educational production of its neighboring school districts. The theoretical section modeled spillovers as part of a joint product technology, but there may be an additional justification for the $\rho W t_{ij}$ term. School district administrators may keep their eyes on neighboring school districts and try to compete with them academically, so the outcomes of neighboring school districts may affect the administrative effort of a school district, which in turn may affect school district outcomes. Administrators care less about the performance of far-away school districts.

Compared with the Bayesian spatial error model of Eq. 16, the Bayesian spatial autoregressive model of Eq. 17 takes the influence of unobserved variables out of the error term $\varepsilon$ and controls for them with the $\rho W t_{ij}$ term. While the spatial error model guards against inefficient parameter estimates, the spatial autoregressive model guards against biased parameter estimates. If the $\rho W t_{ij}$ term is significant, and if it were omitted as in a traditional education production function, the matrix of parameter estimates $\beta$ would suffer from bias, and hypothesis testing would be invalid.

A final way in which spatial dependence is captured in the education production functions is through a spatial Durbin model (LeSage 1997b):

$$(I - \rho W)\, t_{ij} = a + \beta_1 x_i + W \beta_2 x_i + \varepsilon_i \tag{18}$$

$$
\begin{aligned}
\varepsilon &\sim N(0, \sigma^2 V) \\
V &= \mathrm{diag}(v_1, v_2, \ldots, v_n) \\
r / v_i &\sim \chi^2(r) / r \\
\sigma^2 &\sim \Gamma(h, d) \\
\rho &\sim \mathrm{uniform}(-1, 1)
\end{aligned}
$$

As in the spatial autoregressive model of Eq. 17, the spatial Durbin model of Eq. 18 contains a spatial autoregressive term $\rho W t_{ij}$ and the education production function controls $x_i$. Unlike the spatial autoregressive model, the spatial Durbin model contains the education production function controls for our neighbors, $W x_i$, as well. While the $\beta$ terms are different for $x_i$ and $W x_i$, the same $W$ is used in two places in Eq. 18. Some research (Anselin 1988) has claimed that $\rho$ and $\beta_2$ are not identified, but H.H. Kelejian and I.R. Prucha (2004) have proved that they are.¹¹

The $W x_i$ term captures spillovers in a different way than the $\rho W t_{ij}$ term. School quality may be more than a function of its own inputs and neighboring districts' school quality. It may also be a function of demographic influences in neighboring districts. Children who live on the border of two school districts may play with each other, so that peer-group effects may spill across school district boundaries. Churches, Rotary Clubs, and country clubs may draw members from different (but probably nearby) school districts; parents at these social organizations may discuss schooling with each other and form attitudes about schooling quality, dress codes, and homework levels which may influence the schooling decisions they make. $W x_i$ allows the characteristics of neighboring school districts to affect outcomes in each school district. These spillover effects may be stronger across schools than across school districts, but if they are present across school districts then they are likely to be present across schools within a district.¹²

¹¹ H.H. Kelejian and I.R. Prucha (2004) prove this for a model with both spatial autoregressive lag terms $\rho W t$ and $\lambda W \varepsilon$, but the results should follow for the spatial Durbin model as well.
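To make the differences between the three spatial specifications concrete, the sketch below generates data from each of them on a small synthetic map. It covers only the structural equations (Eqs. 16 to 18), not the Bayesian priors or the estimation itself, and every numerical value (sample size, coefficients, $\lambda$, $\rho$) is an illustrative assumption rather than an estimate from the chapter.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 50, 5

# Hypothetical row-standardized k-nearest-neighbor weights on random points.
pts = rng.uniform(size=(n, 2))
d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
np.fill_diagonal(d, np.inf)
W = np.zeros((n, n))
W[np.repeat(np.arange(n), k), np.argsort(d, axis=1)[:, :k].ravel()] = 1.0 / k

x = rng.normal(size=(n, 2))          # two illustrative covariates
beta = np.array([0.3, -0.2])         # illustrative coefficients
mu = rng.normal(scale=0.1, size=n)   # white-noise disturbance
I = np.eye(n)

# Spatial error model (Eq. 16): t = x beta + eps, eps = lam * W eps + mu,
# so eps = (I - lam W)^{-1} mu.
lam = 0.36
eps = np.linalg.solve(I - lam * W, mu)
t_sem = x @ beta + eps

# Spatial autoregressive model (Eq. 17): t = rho * W t + x beta + mu,
# so t = (I - rho W)^{-1} (x beta + mu).
rho = 0.4
t_sar = np.linalg.solve(I - rho * W, x @ beta + mu)

# Spatial Durbin model (Eq. 18): (I - rho W) t = a + x b1 + (W x) b2 + mu.
a, b2 = 0.5, np.array([0.1, 0.05])
t_sdm = np.linalg.solve(I - rho * W, a + x @ beta + (W @ x) @ b2 + mu)
```

In the spatial error model the neighbors enter only through the disturbance, in the spatial autoregressive model through their outcomes `W @ t`, and in the spatial Durbin model through both their outcomes and their characteristics `W @ x`.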

6 Estimation Results

The first set of regression results uses %PASS ALL, i.e., the percentage of students passing all four sections of the proficiency test, as the dependent variable. The first column of results in Table 2 is the nonspatial instrumental variables model. The results of the instrumental variables model are generally consistent with expectations. If all else is constant, the proficiency test pass rate is higher in school districts where children come from two-parent families with high incomes, where there is a low proportion of minority students, and in communities with low poverty rates and higher-educated citizens. The sign of COHORT SIZE is negative, but fails to achieve statistical significance at the 0.10 level. Most of the school-specific inputs show a significant relationship with student achievement, but two of these are of theoretically inconsistent sign. While school districts with better-paid teachers have higher pass rates, the results suggest that school districts with higher teacher education levels have lower pass rates, and that a higher pupil/teacher ratio raises pass rates. A correlation matrix suggests that these results are not an artifact of multicollinearity. Consistent with competitive effects, #DISTRICTS is positively related to pass rates. However, %PRIVATE is negatively related to pass rates in public-school districts, a result which is in contrast with most of the literature, but is consistent with Zanzig (1997).

The preceding analysis does not address spillovers or omitted variables. The results of the first spatial education production functions are reported in the remaining columns of Table 2.¹³ Testing for spatial effects is of supreme importance. The spatial error lag has a parameter estimate of 0.36 and is statistically significant. In fact, the spatial parameter estimates are all positive and statistically significant throughout the paper, suggesting that the use of spatial statistics is warranted.
The 0.36 estimate means that the error terms have, on average, a 0.36

12 School outcomes are not available in Ohio at the school level, but only at the school district level. For high schools, this may not matter much, as most school districts have one high school. For example, even among the urban school districts, only 24 of the 140 have more than one high school.
13 D.L. Millimet and V. Rangaprasad (2005) also have a claim to being the first, since neither of our papers has been published. The first draft of this paper was completed August 5, 1999, which may precede theirs.

Table 2. Regression results using %PASS ALL

                                   Instrumental    Spatial         Spatial         Spatial
                                   variables       error           autoregressive  Durbin
Variable                           model           model           model           model
%BOTH PARENTS                      0.29** (4.09)   0.27** (2.68)   0.34** (3.56)   0.23* (2.55)
PARENT INCOME                      0.40** (4.53)   0.15 (1.61)     0.17* (2.19)    0.15* (2.05)
%PRIVATE                           −0.19* (2.47)   −0.15 (1.57)    −0.14* (1.65)   −0.12 (1.43)
%MINORITY                          −0.15** (2.91)  −0.19** (3.26)  −0.19** (3.65)  −0.21** (3.71)
COHORT SIZE                        −0.053 (1.18)   −0.070 (1.50)   −0.080* (1.92)  −0.077* (1.86)
TEACHER SALARY                     0.58* (2.42)    0.32 (1.19)     0.23 (0.99)     0.20 (0.83)
TEACHER EXPERIENCE                 −0.30 (0.94)    −0.05 (0.13)    −0.11 (0.35)    −0.16 (0.48)
PUPIL/TEACHER RATIO                0.74* (2.13)    0.56 (1.41)     0.48 (1.35)     0.56 (1.48)
TEACHER EDUCATION                  −0.49* (2.00)   −0.20 (0.73)    −0.16 (0.71)    −0.13 (0.55)
%POVERTY                           −0.24* (2.29)   −0.14 (1.08)    −0.06 (0.53)    −0.09 (0.74)
#DISTRICTS                         0.20* (2.46)    0.13 (1.10)     0.14* (2.00)    0.26 (1.55)
%NO DIPLOMA                        −0.49** (5.58)  −0.56** (5.44)  −0.51** (6.25)  −0.61** (6.59)
%HS DIPLOMA ONLY                   −0.25** (3.95)  −0.34** (4.16)  −0.28** (4.47)  −0.38** (5.90)
LAG %BOTH PARENTS                  —               —               —               0.25 (1.48)
LAG PARENT INCOME                  —               —               —               0.17 (0.92)
LAG %PRIVATE                       —               —               —               0.23* (1.71)
LAG %MINORITY                      —               —               —               0.06 (0.65)
LAG COHORT SIZE                    —               —               —               0.041 (0.38)
LAG TEACHER SALARY                 —               —               —               0.26 (0.47)
LAG TEACHER EXPERIENCE             —               —               —               −1.20* (1.82)
LAG PUPIL/TEACHER RATIO            —               —               —               0.54 (0.73)
LAG TEACHER EDUCATION              —               —               —               −0.60 (0.94)
LAG %POVERTY                       —               —               —               0.08 (0.40)
LAG #DISTRICTS                     —               —               —               0.028 (0.15)
LAG %NO DIPLOMA                    —               —               —               0.29 (1.64)
LAG %HS DIPLOMA ONLY               —               —               —               0.36** (2.75)
CONSTANT                           0.22* (1.73)    0.37** (2.75)   0.17 (1.43)     −0.00 (0.02)
Spatial parameter estimate ρ or λ  —               0.36** (6.87)   0.26** (5.73)   0.28** (4.76)
Adjusted R²                        0.49            0.54            0.54            0.56

The number of observations was 605. Parameter estimates are shown with the absolute value of the t-statistic in parentheses. ** Statistically significant at the 0.01 level; * statistically significant at the 0.10 level. The spatial parameter is ρ for both the spatial autoregressive model and the spatial Durbin model, and λ for the spatial error model. LAG is the estimate of the spatial lag parameter for the Wx of each variable.

spatial correlation with each other: the unmeasured things in our education production function are somewhat similar to the unmeasured things of our neighbors. The next model, the spatial autoregressive model, suggests that the average correlation between one school district’s pass rate and its neighbors’ pass rates is 0.26, and the spatial Durbin model suggests this correlation is 0.28.¹⁴ The spatial Durbin model also finds that three of the explanatory variables of our neighbors help explain the proficiency pass rate of our school district. It is difficult to compare these results with those of D.L. Millimet and V. Rangaprasad (2005) because (1) they do not use proficiency tests as measures of school quality, and (2) they do not use the same spatial models and so have no estimates of λ or β₂. D.L. Millimet and V. Rangaprasad (2005) reported the estimates of ρ when the measures of school quality are the pupil/teacher ratio, expenditure per pupil, capital expenditure per pupil, teacher salary, and school size, but the results are not comparable to the estimate of ρ for the proficiency test pass rate variables used in this study. Their estimates also vary from specification to specification, ranging from small and negative to large and positive.

14 This is much larger than the spatial effects Murdoch et al. (1993) found for recreation expenditures. They found a very small (0.01) spatial autoregressive parameter estimate and statistically significant spillovers in only one of their two regressions.
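The interpretation of the spatial parameters as a relationship between a district and the average of its neighbors can be illustrated with a small sketch. It is only an illustration under stated assumptions: the five-district neighbor list, the pass rates, and the vector z are hypothetical, the weight matrix is assumed to be row-standardized, and the reduced form in the docstring is the standard textbook form of the spatial autoregressive model rather than the chapter's estimation code.

```python
# Hypothetical contiguity list for five school districts (illustrative, not the Ohio data):
# district index -> indices of the districts that border it.
NEIGHBORS = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 4], 3: [1, 4], 4: [2, 3]}


def row_standardized_weights(neighbors):
    """Build W where W[i][j] = 1 / (number of neighbors of i) if j borders i, else 0."""
    n = len(neighbors)
    W = [[0.0] * n for _ in range(n)]
    for i, js in neighbors.items():
        for j in js:
            W[i][j] = 1.0 / len(js)
    return W


def spatial_lag(W, v):
    """Return W v: for each district, the average of v over its neighbors."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]


def solve_sar(W, z, rho, tol=1e-12, max_iter=10_000):
    """Solve y = rho * W y + z by fixed-point iteration.

    With a row-standardized W and |rho| < 1 the iteration converges to
    y = (I - rho W)^(-1) z, the reduced form of the spatial autoregressive model.
    """
    y = list(z)
    for _ in range(max_iter):
        y_new = [rho * lag + z_i for lag, z_i in zip(spatial_lag(W, y), z)]
        if max(abs(a - b) for a, b in zip(y_new, y)) < tol:
            return y_new
        y = y_new
    raise RuntimeError("fixed-point iteration did not converge")


W = row_standardized_weights(NEIGHBORS)

# Hypothetical district pass rates and the neighbors' average pass rate (W y).
pass_rate = [0.60, 0.55, 0.70, 0.45, 0.65]
neighbor_pass_rate = spatial_lag(W, pass_rate)

# Hypothetical non-spatial part (X beta + e) of each district's pass rate,
# combined with rho = 0.26, the Table 2 spatial autoregressive estimate.
z = [0.45, 0.40, 0.50, 0.35, 0.48]
y = solve_sar(W, z, rho=0.26)
```

Applying the same `spatial_lag` function to an explanatory variable gives the kind of Wx regressor that the LAG rows of Table 2 refer to.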

The explanatory power is higher in the spatial models. Adjusted R² is 0.49 in the nonspatial model, while it is 0.54, 0.54, and 0.56 in the spatial models. The Bayesian component of the spatial models purposefully tries not to fit outliers, which depresses the adjusted R². If the spatial models had fit outliers like the nonspatial model, the difference in the adjusted R² might be even larger. Even if the improvement is modest, the larger adjusted R² is found in all spatial models throughout the study.

Perhaps the most significant finding in Table 2 is that the spatial models eliminate the statistical significance of all the school-specific inputs. Failure to incorporate spatial dependence seems to have attributed too much explanatory power to teacher salary, the pupil/teacher ratio, and teacher education levels. The lack of significance is consistent with Hanushek (1986) and much of the recent literature.

The other major differences concern the school competition variables. Hoxby (1998) argued that the parameter estimate of %PRIVATE is biased downward, but the spatial models that address the influence of omitted variables suggest otherwise. The nonspatial model’s parameter estimate was −0.19, but the parameter estimates of the spatial models ranged from −0.15 to −0.12, suggesting (if anything) an upward bias to least squares. While the nonspatial model showed a significant relationship for %PRIVATE, only one of the three spatial models showed a statistically significant relationship, and that barely reached the 0.10 level of significance. In any case, even the largest parameter estimate, −0.19, yields an elasticity of −0.02, so regardless of statistical significance, the economic significance of competitive effects from private schools is negligible.

The results for public school competition are similar to those for private school competition. While the instrumental variables regression shows a positive, significant relationship, the spatial models generally find no relationship. Again, the only spatial model showing a significant relationship is the spatial autoregressive model, and again the magnitude of the parameter estimate, even for the largest, shows a trivial elasticity of test passes of 0.04 with respect to the number of school districts.

The enrollment of each grade is negatively related to test passage in two of the three spatial models, and approaches significance in the third; it was insignificant in the nonspatial model. The elasticity is −0.03, and is almost exclusively found when %PASS ALL is used. These findings are consistent with those of Brasington (1997).

Table 3 shows the results of regressions when the percentage of students passing the math section is the dependent variable. Pass rates on the math section are lower than those for other sections, so it is not surprising that the results are similar to the %PASS ALL results: if a student failed any

Table 3. Regression results using %PASS MATH

                                   Instrumental    Spatial         Spatial         Spatial
                                   variables       error           autoregressive  Durbin
Variable                           model           model           model           model
%BOTH PARENTS                      0.38** (4.30)   0.27** (2.72)   0.34** (3.93)   0.24* (2.50)
PARENT INCOME                      0.09 (1.12)     0.05 (0.60)     0.10 (1.25)     0.05 (0.67)
%PRIVATE                           −0.21** (2.81)  −0.16 (1.62)    −0.14* (1.70)   −0.11 (1.32)
%MINORITY                          −0.22** (4.44)  −0.27** (4.20)  −0.26** (5.36)  −0.30** (5.15)
COHORT SIZE                        −0.031 (0.70)   −0.046 (0.95)   −0.052 (1.17)   −0.06 (1.29)
TEACHER SALARY                     0.79** (3.34)   0.58* (2.18)    0.49* (2.04)    0.37 (1.40)
TEACHER EXPERIENCE                 −0.37 (1.17)    −0.04 (0.11)    −0.10 (0.30)    −0.04 (0.14)
PUPIL/TEACHER RATIO                0.88* (2.56)    0.65 (1.52)     0.57 (1.58)     0.49 (1.38)
TEACHER EDUCATION                  −0.63** (2.60)  −0.38 (1.41)    −0.35 (1.42)    −0.23 (0.87)
%POVERTY                           −0.26* (2.50)   −0.14 (1.04)    −0.08 (0.75)    −0.06 (0.48)
#DISTRICTS                         0.16* (2.11)    0.15 (1.44)     0.15* (2.11)    0.31* (1.97)
%NO DIPLOMA                        −0.56** (6.53)  −0.67** (6.95)  −0.59** (7.02)  −0.70** (7.93)
%HS DIPLOMA ONLY                   −0.18** (2.87)  −0.22** (2.96)  −0.19** (2.94)  −0.29** (4.57)
CONSTANT                           0.33** (2.62)   0.43** (3.08)   0.23* (1.74)    0.02 (0.05)
Spatial parameter estimate ρ or λ  —               0.31** (5.07)   0.23** (5.48)   0.23** (4.04)
Adjusted R²                        0.49            0.53            0.53            0.55

The number of observations was 605. Parameter estimates are shown with the absolute value of the t-statistic in parentheses. ** Statistically significant at the 0.01 level; * statistically significant at the 0.10 level. The spatial parameter is ρ for both the spatial autoregressive model and the spatial Durbin model, and λ for the spatial error model. Spatial lags are included for the spatial Durbin model, but suppressed in output to save space: LAG PARENT INCOME, LAG %NO DIPLOMA, and LAG %HS DIPLOMA ONLY are positive.

section, it was most likely to be the math section. However, while parent income was related to %PASS ALL in three of four regressions, it was not related to %PASS MATH in any regression. The only role for parent income was found in the spatial Durbin model, where LAG PARENT INCOME was positively related to %PASS MATH. The parameter estimate (not shown) is 0.29, suggesting that the elasticity of our own math pass rate with respect to our neighbors’ income levels is 0.16. If neighboring school districts’ incomes are 10% higher, our math pass rate is 1.6% higher.

The evidence for competitive effects is stronger for the math model than for any other test section. Five out of eight parameter estimates are statistically significant, but, as before, the magnitude of the effects is trivial. Also as before, the parameter estimate of %PRIVATE is weaker in the spatial models than in the nonspatial model.

As before, the nonspatial model shows significant effects for three of four school inputs, and two of these have an unexpected sign. As before, incorporating spatial dependence kills off the statistical significance of TEACHER EDUCATION and PUPIL/TEACHER RATIO. However, TEACHER SALARY remains positive and statistically significant in two of the three spatial models. The median elasticity from the spatial models is 0.29, suggesting that a 10% rise in teacher salary is associated with a 2.9% rise in math pass rates. In other words, a 10% rise in teacher salaries might raise the average district’s pass rate from 0.61 to 0.628. COHORT SIZE is not significant in any regression, spatial or not, and %POVERTY, while significant in the nonspatial model, is insignificant in all spatial models. The magnitude of the spatial parameters ranges from 0.23 to 0.31.

The results of the %PASS CITIZENSHIP regression are easily summarized (Table 4). Competitive effects are almost completely absent. As in the math section, the incomes of neighbors seem to spill over and positively affect citizenship pass rates, but no direct relationship between parental incomes and pass rates is found. No school input is statistically significant in either the spatial or the nonspatial models. The magnitude of the spatial parameters ranges from 0.17 to 0.23. The adjusted R² shows its largest leap by going from 0.44 in the nonspatial model to 0.55 in the spatial Durbin model.
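The elasticity arithmetic quoted for the math results can be reproduced with a short sketch. The function names are hypothetical, and the calculation assumes the usual reading of an elasticity as the percentage change in the pass rate per one percent change in the input; the numbers themselves are the ones given in the text.

```python
def pct_change_in_y(elasticity, pct_change_x):
    """Percentage change in y implied by an elasticity and a percentage change in x."""
    return elasticity * pct_change_x


def new_level(y0, elasticity, pct_change_x):
    """Level of y after x changes by pct_change_x percent, starting from y0."""
    return y0 * (1.0 + pct_change_in_y(elasticity, pct_change_x) / 100.0)


# Figures quoted in the text for the math section.
salary_elasticity = 0.29            # median teacher-salary elasticity, spatial models
mean_math_pass_rate = 0.61          # average district math pass rate
neighbor_income_elasticity = 0.16   # elasticity with respect to neighbors' income

# A 10% rise in teacher salary: 0.61 * (1 + 0.029).
new_pass_rate = new_level(mean_math_pass_rate, salary_elasticity, 10.0)

# Neighbors' incomes 10% higher: percentage rise in our own math pass rate.
income_effect_pct = pct_change_in_y(neighbor_income_elasticity, 10.0)

print(f"pass rate after a 10% salary rise: {new_pass_rate:.3f}")
print(f"pass-rate rise from 10% richer neighbors: {income_effect_pct:.1f}%")
```

Both figures are percentage changes in the pass rate, not percentage-point changes, which is why the salary example moves the pass rate from 0.61 to roughly 0.628 rather than to 0.639.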
The results of the %PASS READING model are similar to those of the %PASS CITIZENSHIP model (Table 5). There is no evidence of competitive effects or of the importance of any school input to test pass rates. Parent income shows as insignificant in the nonspatial model, but becomes statistically significant in all spatial models, and while %POVERTY is statistically significant in the nonspatial model, when spatial dependence is addressed, it loses its significance. The spatial parameters range from 0.17 to 0.26.

The model with %PASS WRITING shows almost nothing of statistical significance (Table 6). The adjusted R² ranges from 0.21 in the nonspatial model to 0.25 in the spatial Durbin model. The lack of significance and low adjusted R² may stem from the way the writing section is graded. Unlike the other test sections, the writing section is not a multiple choice test. The tests are sent out-of-state and hand-graded. The pass rates are high: the average pass rate of this section is 0.87, and the standard deviation is 0.08. The lack of variation contributes to the lack of statistically significant findings. Still,

Table 4. Regression results using %PASS CITIZENSHIP

                                   Instrumental    Spatial         Spatial         Spatial
                                   variables       error           autoregressive  Durbin
Variable                           model           model           model           model
%BOTH PARENTS                      0.28** (3.60)   0.27** (3.01)   0.28** (3.36)   0.24* (2.50)
PARENT INCOME                      0.06 (0.87)     0.05 (0.70)     0.07 (0.98)     0.05 (0.67)
%PRIVATE                           0.08 (1.22)     0.03 (0.38)     0.02 (0.26)     −0.11 (1.32)
%MINORITY                          −0.16** (3.72)  −0.18** (3.41)  −0.18** (4.17)  −0.30** (5.15)
COHORT SIZE                        −0.060 (1.54)   −0.051 (1.17)   −0.057 (1.57)   −0.06 (1.29)
TEACHER SALARY                     0.25 (1.21)     0.22 (0.88)     0.22 (1.02)     0.37 (1.40)
TEACHER EXPERIENCE                 0.07 (0.26)     0.26 (0.73)     0.18 (0.59)     −0.04 (0.14)
PUPIL/TEACHER RATIO                0.06 (0.19)     −0.05 (0.15)    −0.02 (0.06)    0.49 (1.38)
TEACHER EDUCATION                  −0.14 (0.67)    −0.13 (0.53)    −0.14 (0.65)    −0.23 (0.87)
%POVERTY                           −0.31** (3.30)  −0.27* (2.21)   −0.22* (2.57)   −0.06 (0.48)
#DISTRICTS                         0.09 (1.37)     0.08 (0.78)     0.09 (1.31)     0.31* (1.97)
%NO DIPLOMA                        −0.29** (3.84)  −0.33** (3.83)  −0.32** (3.84)  −0.70** (7.93)
%HS DIPLOMA ONLY                   −0.17** (3.06)  −0.20** (3.07)  −0.18** (3.37)  −0.29** (4.57)
CONSTANT                           0.58** (5.23)   0.61** (4.73)   0.46** (3.95)   0.02 (0.05)
Spatial parameter estimate ρ or λ  —               0.22** (3.86)   0.17** (3.53)   0.23** (4.04)
Adjusted R²                        0.44            0.45            0.45            0.55

The number of observations was 605. Parameter estimates are shown with the absolute value of the t-statistic in parentheses. ** Statistically significant at the 0.01 level; * statistically significant at the 0.10 level. The spatial parameter is ρ for both the spatial autoregressive model and the spatial Durbin model, and λ for the spatial error model. Spatial lags are included for the spatial Durbin model, but suppressed in output to save space: LAG PARENT INCOME, LAG %NO DIPLOMA, and LAG %HS DIPLOMA ONLY are positive.

Table 5. Regression results using %PASS READING

                                   Instrumental    Spatial         Spatial         Spatial
                                   variables       error           autoregressive  Durbin
Variable                           model           model           model           model
%BOTH PARENTS                      0.19** (4.34)   0.13** (2.60)   0.14** (3.07)   0.10* (2.42)
PARENT INCOME                      0.04 (1.07)     0.07* (1.79)    0.08* (2.54)    0.07* (2.08)
%PRIVATE                           0.05 (1.38)     0.04 (0.97)     0.03 (0.63)     0.05 (1.50)
%MINORITY                          −0.10** (4.14)  −0.13** (4.60)  −0.14** (5.76)  −0.15** (5.40)
COHORT SIZE                        −0.038* (1.79)  −0.040 (1.63)   −0.030 (1.49)   −0.032* (1.70)
TEACHER SALARY                     0.10 (0.89)     −0.01 (0.09)    −0.02 (0.19)    −0.03 (0.27)
TEACHER EXPERIENCE                 0.09 (0.60)     0.16 (0.75)     0.17 (0.91)     0.13 (0.78)
PUPIL/TEACHER RATIO                0.10 (0.62)     0.07 (0.33)     0.05 (0.25)     0.06 (0.37)
TEACHER EDUCATION                  −0.13 (1.08)    −0.03 (0.25)    −0.05 (0.37)    −0.04 (0.33)
%POVERTY                           −0.12* (2.37)   −0.10 (1.50)    −0.08 (1.62)    −0.06 (0.95)
#DISTRICTS                         0.09* (2.39)    0.05 (0.93)     0.06* (1.83)    0.05 (0.89)
%NO DIPLOMA                        −0.25** (5.95)  −0.28** (5.30)  −0.28** (6.95)  −0.33** (6.94)
%HS DIPLOMA ONLY                   −0.05 (1.46)    −0.04 (1.08)    −0.02 (0.70)    −0.06* (1.86)
CONSTANT                           0.75** (12.40)  0.80** (10.86)  0.64** (9.49)   0.43** (3.36)
Spatial parameter estimate ρ or λ  —               0.26** (4.55)   0.17** (4.02)   0.23** (4.48)
Adjusted R²                        0.47            0.50            0.49            0.51

The number of observations was 605. Parameter estimates are shown with the absolute value of the t-statistic in parentheses. ** Statistically significant at the 0.01 level; * statistically significant at the 0.10 level. The spatial parameter is ρ for both the spatial autoregressive model and the spatial Durbin model, and λ for the spatial error model. Spatial lags are included for the spatial Durbin model, but suppressed in output to save space: LAG %BOTH PARENTS, LAG %MINORITY, LAG COHORT SIZE, LAG TEACHER SALARY, and LAG %HS DIPLOMA ONLY are positive; LAG TEACHER EDUCATION is negative.

Table 6. Regression results using %PASS WRITING

                                   Instrumental    Spatial         Spatial         Spatial
                                   variables       error           autoregressive  Durbin
Variable                           model           model           model           model
%BOTH PARENTS                      0.21** (3.30)   0.18* (2.38)    0.20** (3.28)   0.15* (2.45)
PARENT INCOME                      0.05 (0.77)     0.03 (0.51)     0.04 (0.66)     0.04 (0.69)
%PRIVATE                           0.08 (1.43)     0.07 (1.23)     0.06 (1.20)     0.10* (1.74)
%MINORITY                          −0.04 (1.09)    −0.06 (1.29)    0.05 (1.35)     −0.09* (2.24)
COHORT SIZE                        −0.036 (1.14)   −0.037 (0.98)   −0.037 (1.22)   −0.039 (1.28)
TEACHER SALARY                     0.01 (0.04)     0.05 (0.26)     0.05 (0.30)     0.02 (0.11)
TEACHER EXPERIENCE                 −0.03 (0.15)    0.02 (0.06)     0.00 (0.02)     −0.07 (0.30)
PUPIL/TEACHER RATIO                0.15 (0.61)     0.16 (0.56)     0.14 (0.55)     0.23 (0.94)
TEACHER EDUCATION                  0.00 (0.02)     −0.02 (0.09)    −0.03 (0.17)    −0.02 (0.12)
%POVERTY                           −0.15* (2.02)   −0.15 (1.59)    −0.12 (1.57)    −0.12 (1.13)
#DISTRICTS                         0.05 (0.87)     0.01 (0.18)     0.03 (0.53)     0.01 (0.12)
%NO DIPLOMA                        −0.05 (0.75)    −0.06 (0.79)    −0.06 (0.94)    −0.07 (0.98)
%HS DIPLOMA ONLY                   −0.11* (2.36)   −0.13** (2.69)  −0.11* (2.38)   −0.18** (3.89)
CONSTANT                           0.73** (8.16)   0.77** (7.66)   0.59** (5.35)   0.50* (2.32)
Spatial parameter estimate ρ or λ  —               0.20** (3.13)   0.17** (2.99)   0.18** (3.11)
Adjusted R²                        0.21            0.23            0.22            0.25

The number of observations was 605. Parameter estimates are shown with the absolute value of the t-statistic in parentheses. ** Statistically significant at the 0.01 level; * statistically significant at the 0.10 level. The spatial parameter is ρ for both the spatial autoregressive model and the spatial Durbin model, and λ for the spatial error model. Spatial lags are included for the spatial Durbin model, but suppressed in output to save space: LAG COHORT SIZE and LAG %HS DIPLOMA ONLY are positive; LAG TEACHER EDUCATION is negative.


the spatial parameters range from 0.17 to 0.20, which are similar to those of the other test sections, and three spatial lags are signiﬁcant in the spatial Durbin model. As in the reading section, %POVERTY appears to be signiﬁcant until spatial dependence is addressed.
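The spatial parameter ranges quoted for each test section can be checked against Tables 2 to 6 with a short sketch. The values below are transcribed from those tables; the dictionary layout and variable names are illustrative choices, not part of the original analysis.

```python
from statistics import mean

# Spatial parameter estimates transcribed from Tables 2-6, ordered as
# (spatial error, spatial autoregressive, spatial Durbin).
SPATIAL_PARAMETER = {
    "%PASS ALL": (0.36, 0.26, 0.28),
    "%PASS MATH": (0.31, 0.23, 0.23),
    "%PASS CITIZENSHIP": (0.22, 0.17, 0.23),
    "%PASS READING": (0.26, 0.17, 0.23),
    "%PASS WRITING": (0.20, 0.17, 0.18),
}

# Adjusted R-squared from the same tables: the nonspatial model first, then
# the three spatial models in the order used above.
ADJUSTED_R2 = {
    "%PASS ALL": (0.49, (0.54, 0.54, 0.56)),
    "%PASS MATH": (0.49, (0.53, 0.53, 0.55)),
    "%PASS CITIZENSHIP": (0.44, (0.45, 0.45, 0.55)),
    "%PASS READING": (0.47, (0.50, 0.49, 0.51)),
    "%PASS WRITING": (0.21, (0.23, 0.22, 0.25)),
}


def parameter_range(values):
    """Return the (smallest, largest) estimate in a collection."""
    return min(values), max(values)


# Range of the spatial parameter within each test section and over all 15 models.
ranges = {test: parameter_range(v) for test, v in SPATIAL_PARAMETER.items()}
overall = parameter_range([p for v in SPATIAL_PARAMETER.values() for p in v])

# Average adjusted R-squared by model type.
mean_nonspatial = mean(r2 for r2, _ in ADJUSTED_R2.values())
mean_spatial = mean(r2 for _, spatial in ADJUSTED_R2.values() for r2 in spatial)

for test, (low, high) in ranges.items():
    print(f"{test}: {low:.2f} to {high:.2f}")
print(f"all spatial models: {overall[0]:.2f} to {overall[1]:.2f}")
print(f"mean adjusted R2: nonspatial {mean_nonspatial:.2f}, spatial {mean_spatial:.2f}")
```

Keeping the transcribed estimates in one place makes it easy to recompute such summaries if a table entry is corrected.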

7 Conclusion

Educational spillovers have been discussed theoretically for a long time, but no one has accounted for them in empirical estimations of local public-schooling provision.¹⁵ Significant spatial effects are present in all 15 spatial education production functions, with spatial parameter estimates that range from 0.17 to 0.36. Researchers who ignore the spatial nature of the data run the risk of having biased, inefficient, and inconsistent parameter estimates (Anselin 1988). Researchers should account for spillovers in the production of education and omitted variable bias by incorporating spatial statistics. In addition to more accurate parameter estimates, the spatial models add to the explanatory power of education production functions. Despite the use of spatial Bayesian techniques that mitigate the fitting of outliers, the average adjusted R² improves from 0.42 to 0.46.

The influence of competition on public-school performance is the focus of the education production functions. Most recent education production function studies find that public schools respond to competition from private schools (Hoxby 1998; Borland and Howsen 1996; Dee 1998; Couch et al. 1993). This study disagrees. There seems to be no consistent association between private-school attendance rates and public-school achievement. Furthermore, although there is always a positive relationship between the number of public-school districts in a county and public-school test scores, it is only statistically significant in 8 of 20 regressions. The magnitude of its effect is also paltry, and never exceeds 0.05.¹⁶

At first glance, such slight competitive effects suggest that school vouchers will not markedly improve public-school performance; consequently, vouchers will probably not improve labor force productivity or earnings either. If so, vouchers may only serve to subsidize rich parents. Willms and Echols (1992) found that Scottish parents with a high socioeconomic status were the ones who took advantage of Britain’s school voucher system. If only rich parents use the vouchers, they will be used to make tuition at other schools more affordable for those most able to afford it.

15 Except for the working paper by D.L. Millimet and V. Rangaprasad (2005).
16 At the mean, the largest it reaches is 0.043 in the spatial Durbin model of the %PASS CITIZENSHIP regression.


However, these ﬁndings must be interpreted carefully. The data are real. As such, they represent the current competitive environment in education, and as the results of this study conﬁrm, the current environment can hardly be described as competitive. To the extent that they can afford it, parents choose where to live in part based on the quality of school their house is assigned to. However, other considerations are involved: taxes, convenience to work, pollution, availability of parks, and quality of housing stock may all be more important to parents than school quality. Also, the quality of schools may change over time, but high moving costs inhibit relocation. Residents whose children graduate may stop volunteering at the schools and stop approving school tax levies, and the tax laws make it costly to attend private schools or schools outside the attendance zone, since a parent must forego his tax contribution and pay full tuition at the new school as well. What is more, private-school competition consists predominantly of Catholic schools. School vouchers would make private schools more affordable, which would elicit a supply response that may open up a wide variety of attractive choices to parents (Merriﬁeld 2002). So parents who are not particularly interested in sending their children to Catholic schools may be more interested in sending them to a Montessori school that emphasizes African–American culture. Public economists sometimes think of education as a merit good: a pure private good that is funded by taxation and provided by the local government. Being private, its beneﬁts are conﬁned to the area in which it is produced. The signiﬁcant spatial parameter estimates of the current study imply that some of the beneﬁts of schooling spill over into neighboring areas. Therefore, schooling is not a merit good, and the presence of externalities means that a nonoptimal amount may be consumed. 
Failure to incorporate spatial statistics may lead to misleading parameter estimates. Six of the eight parameter estimates for school inputs were statistically significant in the %PASS ALL and %PASS MATH regressions, but five of the six effects disappeared in the spatial models. Poverty appeared to depress reading and writing pass rates, but this effect disappeared in the spatial models. Spatial statistics also uncovered other provocative findings, like the discovery that having richer neighbors increases math and citizenship pass rates in our own school district.

Having found evidence of spillovers in education production functions, labor and education economists are exhorted to use spatial statistics in their estimations. Public and urban economists who deal with schooling in estimations are encouraged to do the same. Furthermore, significant spatial effects may be present in the demand for, and supply of, other publicly provided commodities such as crime prevention, pollution abatement, garbage collection, libraries, and parks. Additional research is required. Pending these investigations, much of the empirical economic literature of the last 30 years may need to be rewritten to address the spatial nature of the data.

Acknowledgments. Bob Martin, Jim LeSage, John H.Y. Edwards, Kelley Pace, John Vandermosten, David Rodgers, Toshiharu Ishikawa, and other participants at the 2005 Chuo University Meeting on the Economics of Time and Space are thanked for their comments. The Tulane University Committee on Research is thanked for a $4000 Summer Fellowship in 1999. The first draft was completed June 22, 1999.

References

Akerhielm K (1995) Does class size matter? Econ Educ Rev 14:229–241
Anselin L (1988) Spatial econometrics: methods and models. Kluwer Academic, London, Dordrecht, pp 35, 58, 59, 103, 181
Barry RP, Pace RK (1999) A Monte Carlo estimator of the log determinant of large sparse matrices. Linear Algebra Appl 289:41–54
Bishop J (1989) Is the test score decline responsible for the productivity growth decline? Am Econ Rev 79:178–197
Borland MV, Howsen RM (1996) Competition, expenditures and student performance in mathematics: a comment on Couch et al. Public Choice 87:395–400
Bradford DF, Oates WE (1974) Suburban exploitation of central cities and governmental structure. In: Hochman HM, Peterson GE (eds) Redistributions through public choice. Columbia University Press, New York, London, pp 44–90
Brasington DM (1997) School district consolidation, student performance, and housing values. J Reg Anal Policy 27:43–54
Brasington DM (1999) Joint provision of public schooling: the consolidation of school districts. J Public Econ 73:373–393
Brasington DM, Haurin DR (2006) Educational outcomes and house values: a test of the value-added approach. J Reg Sci 46:245–268
Brasington DM, Hite D (2005) Demand for environmental quality: a spatial hedonic analysis. Reg Sci Urban Econ 35:57–82
Couch JF, Shughart WF, Williams AL (1993) Private school enrollment and public school performance. Public Choice 76:301–312
Davidson R, MacKinnon JG (1981) Several tests for model specification in the presence of multiple alternatives. Econometrica 49:781–793
Dee TS (1998) Competition and the quality of public schools. Econ Educ Rev 17:419–427
Edwards JHY (1986) A note on the publicness of local goods: evidence from New York State municipalities. Can J Econ 19:568–573
Epple D, Romano RE (1998) Competition between private and public schools, vouchers, and peer-group effects. Am Econ Rev 88:33–62
Falkinger J (1994) The private provision of public goods when the relative size of contributions matters. Finanzarchiv 51:358–371
Figlio DN (1999) Functional form and the estimated effects of school resources. Econ Educ Rev 18:241–252
Fowler WJ Jr, Walberg HJ (1991) School size, characteristics, and outcomes. Educ Eval Policy Anal 13:189–202
Friedkin NE, Necochea J (1988) School system size and performance: a contingency perspective. Educ Eval Policy Anal 10:237–249
Gomes-Neto JB, Hanushek EA, Leite RH, et al. (1997) Health and schooling: evidence and policy implications for developing countries. Econ Educ Rev 16:271–282
Griffith DA (1988) Advanced spatial statistics: special topics in the exploration of quantitative spatial data series. Kluwer, Dordrecht, pp 94–107
Haller EJ (1992) High-school size and student indiscipline: another aspect of the school consolidation issue? Educ Eval Policy Anal 14:145–156
Hanushek EA (1986) The economics of schooling: production and efficiency in public schools. J Econ Lit 24:1141–1177
Hanushek EA (1992) The trade-off between child quantity and quality. J Polit Econ 100:84–117
Hanushek EA, Taylor LL (1990) Alternative assessments of the performance of schools: measurement of state variations in achievement. J Hum Resour 25:179–201
Hayes KJ, Taylor LL (1996) Neighborhood school characteristics: what signals quality to homebuyers? Fed Reserve Bank Dallas Econ Rev 3:2–9
Hoxby CM (1996) Are efficiency and equity in school finance substitutes or complements? J Econ Perspect 10:51–72
Hoxby CM (1998) What do America’s “traditional” forms of school choice teach us about school choice reforms? Fed Reserve Bank New York Econ Policy Rev 4:47–59
Hoxby CM (2000) Does competition among public schools benefit students and taxpayers? Am Econ Rev 90:1209–1238
Ireland NJ (1990) The mix of social and private provision of goods and services. J Public Econ 43:201–219
Jewell RW (1989) School and school district size relationships: cost, results, minorities, and private school enrollments. Educ Urban Soc 21:140–153
Kennedy PE, Siegfried JJ (1997) Class size and achievement in introductory economics: evidence from the TUCE III data. Econ Educ Rev 16:385–394
LeSage JP (1997a) Regression analysis of spatial data. J Reg Anal Policy 27:83–94
LeSage JP (1997b) Bayesian estimation of spatial autoregressive models. Int Reg Sci Rev 20:113–129
Loury LD, Garman D (1995) College selectivity and earnings. J Labor Econ 13:289–308
Maddala GS (1992) Introduction to econometrics. Prentice-Hall, Englewood Cliffs, p 395
Merrifield J (2002) School choices: true and false. The Independent Institute, Oakland
MESA Group (1994) School district data book. National Center for Education Statistics, US Department of Education, Washington
Meyer RH (1996) Value-added indicators of school performance. In: Hanushek EA, Jorgenson DW (eds) Improving America’s schools: the role of incentives. National Academy Press, Washington, pp 97–223
Murdoch JC, Rahmatian M, Thayer MA (1993) A spatially autoregressive median voter model of recreational expenditures. Public Financ Q 21:334–350
Murnane RJ, Willett JB, Levy F (1995) The growing importance of cognitive skills in wage determination. Rev Econ Stat 77:251–266
Ohio Department of Education (1985) Maps of Ohio school districts: city, exempted, local. State of Ohio, Columbus
Ohio Department of Education (1993) Ohio educational directory: 1992–1993 school year. State of Ohio, Columbus
Ohio Department of Education, Division of Information Management Services (1999) http://www.ode.ohio.gov/www/ims/pregen_rept.html
Ornstein AC (1993) School consolidation vs. decentralization: trends, issues, and questions. Urban Rev 25:167–174
Pace RK, Barry RP (1998) Simulating mixed regressive spatially autoregressive estimators. Comput Stat 13:397–418
Ramanathan R (1998) Introductory econometrics with applications. Dryden Press, Fort Worth, Texas, USA, p 170
Reiter M, Weichenrieder A (1997) Are public goods public? A critical survey of the demand estimates for local public services. Finanzarchiv 54:374–408
Rothstein J (2005) Does competition among public schools benefit students and taxpayers? A comment on Hoxby (2000). National Bureau of Economic Research, Inc, NBER Working Paper No. 11215
Sander W (1996) Catholic grade schools and academic achievement. J Hum Resour 31:540–548
Stern D (1989) Educational cost factors and student achievement in grades three and six: some new evidence. Econ Educ Rev 8:149–158
Voß W (1991) Nutzen-spillover-effekte als problem des kommunalen finanzausgleichs: ein beitrag zur ökonomischen rationalität des ausgleichs zentralitätsbedingten finanzbedarfs, Frankfurt a.M.
Willms JD, Echols F (1992) Alert and inert clients: the Scottish experience of parental choice of schools. Econ Educ Rev 11:339–350
Wyckoff JH (1984) The nonexcludable publicness of primary and secondary public education. J Public Econ 24:331–351
Zanzig BR (1997) Measuring the impact of competition in local government education markets on the cognitive achievement of students. Econ Educ Rev 16:431–444

11. Innovation, R&D Cooperation, and the Geography of Regional Labor Acquisition

Jaakko Simonen¹ and Philip McCann²

Summary. This chapter considers the role played by geography in the promotion of innovation. In order to examine these issues, we employ empirical data from Finland to test the extent to which the variety and nature of face-to-face contacts affects the innovation performance of the firm. In addition, we also control for the geographical mobility of the labor employed by the firm. This allows us to identify the different roles which the geography of knowledge spillovers and exchanges and the geography of labor markets play in the innovation process. Our findings here suggest that local face-to-face contact is an essential feature of the innovation process. Moreover, our results concerning the importance of the variety of R&D relations also provide support for this argument. This is not to say, however, that the evidence unambiguously supports agglomeration–innovation arguments. The reason is that our labor market results point to a rather different conclusion. In particular, having controlled for the types of R&D cooperation relations which a firm exhibits, the finding that neither local labor acquisition nor local population density is related to innovation casts doubt on some of the agglomeration–innovation theories stressing the importance of local labor-market skills.

Key words. Innovation, R&D Cooperation, Labor acquisition, High technology

¹ Department of Economics, University of Oulu, PO Box 4600 FIN-90014, University of Oulu, Finland.
² Department of Economics, University of Waikato, Private Bag 3105, Hamilton, New Zealand, and Department of Economics, University of Reading, Reading RG6 6AW, UK.

1 Introduction

The role played by knowledge spillovers has become a major focus of research interest amongst economists analyzing the nature and processes of innovation and growth. Since the original work of Schumpeter (1934), the wide-ranging literature on the determinants of innovation has tended to focus on two key lines of enquiry. Firstly, research aims to identify the structural and sectoral characteristics of the firms and entrepreneurs which promote innovation. Here, some of the key issues which have emerged as playing a significant role in promoting firm innovation are the levels of R&D expenditure, the stock of human capital inputs, the sector of activity, the mode of organization, and also the size of the firm. Secondly, and more recently, over the last two decades there has also emerged widespread interest in the role which geography and spatial industrial organization may play in the innovation process. This research interest has arisen as a response to the groundbreaking work of Krugman (1991), Porter (1990), and Scott (1988) on issues related to industrial clustering and agglomeration.

The mechanisms by which knowledge is transferred are seen as a crucial driving force behind the spatial agglomeration of activities. In all three of the original Krugman, Porter, and Scott approaches, local knowledge flows, either between firms or between firms and organizations, are assumed to take place either via local tacit knowledge spillovers between individual people, or via the local movement of embodied human capital. These knowledge flows are assumed to be mediated primarily by face-to-face contact, and the requirements of proximity to facilitate this process are assumed to give rise to spatial concentration of activities. At the same time, knowledge transfer is also seen as a crucial part of the process of innovation, and there is also much empirical evidence which points not only to the localization tendencies of innovation (Jaffe et al. 1993), but also to the links between innovation and the spatial clustering of activities (Arita and McCann 2000; Acs 2002; Cantwell and Iammarino 2003).

However, none of this empirical evidence provides fundamental insights into the nature of the face-to-face knowledge exchanges associated with innovation and their links with spatial clustering. The reason is that while the literature dealing with innovation and growth generally assumes that face-to-face contact between individuals and firms is positively related to the levels of innovation, the same literature also adopts “rather diffuse and vague notions that knowledge and innovation reside ‘in the air’ or in the ‘buzz’ of urban life” (Power and Lundmark 2004, p. 1025). These vague notions limit our ability to test empirically the arguments relating agglomeration and clustering to innovation, and these limitations are compounded by
the general paucity of appropriate data. The result is that the actual role played by face-to-face contact is still largely “a missing aspect of (the) mechanisms that are considered to generate agglomeration” (Storper and Venables 2004, p. 353). This chapter examines these questions explicitly by distinguishing the effects on innovation of tacit knowledge transfers via inter-firm and inter-organizational contacts from the effects on innovation of labor mobility. In order to do this, we employ a unique dataset from Finland on firm innovation behavior, which combines firm-specific information on the nature of inter-firm and inter-organizational contacts with information on the geographical and sectoral patterns of firm labor recruitment. By relating each of these features to innovation performance, we are able to distinguish the effects of knowledge spillovers arising from face-to-face contact from those of labor mobility.

2 Innovation and Economic Geography

The hypothetical links between innovation and economic geography have arisen from a variety of different empirical and analytical sources. Various hypotheses have been developed to account for the widely observed uneven spatial distribution of innovative behavior (Gordon and McCann 2005a,b), and these can be explained individually.

Hypothesis 1: The contemporary geography of innovation is essentially a geography of the currently more innovative sectors of the economy.

This hypothesis is based on the observation that in any period there are some sectors of economic activity which will be more heavily involved in the innovation of products or processes than others. This may be because of the particular phase which has been reached in the life cycle of an industry’s product set, or because some activities with very short product cycles are more or less permanently locked into the innovative phase. If each of these industries is subject to rather different location factors, because of the nature of their production technologies or their marketing and consumption processes, the geography of innovation may then be reducible simply to a geography of industrial location. With activities remaining in the same broad locations through all of the phases of the product cycle, the places which they dominate will also appear to move through that cycle, except for the homes of the permanent innovators, which would remain continuing sites of innovation.

Hypothesis 2: The contemporary geography of innovation is essentially a result of sectoral differences in the phases of product or profit cycles.

This alternative interpretation of the product cycle geographies emphasizes significant and typical shifts in the locational requirements between the phases of a (primarily oligopolistic) industry’s product or profit cycle. From this perspective, during the early innovative phases, neither the scale of production nor the certainty of growth is sufficient for firms to attempt self-sufficiency in either production or training, while design uncertainties also militate against reliance on either distant suppliers or semiskilled labor. In this early phase, access to appropriate skills and subcontractors is a crucial condition for successful innovation and the management of uncertainties. Later on, in the mature phases of the cycle, when output scale has been achieved, production methods have become routine, and cost factors are increasingly important, both simple geographical dispersal to lower-cost locations and the spatial division of labor will become increasingly relevant (Markusen 1985). From this perspective, therefore, what is generally significant about the geography of innovative activities is not the distribution of creative or inventive potential, but the production conditions which allow infant firms and industries to survive and thrive in a nursery environment until they acquire the scale and experience to strike out on their own.

Hypothesis 3: The contemporary geography of innovation is essentially the outcome of variations in the characteristics between different places which lead to differences in the geography of creativity and entrepreneurship.

This third approach to understanding the distribution of innovative activity focuses on the geography of creativity and entrepreneurship, in the sense of place characteristics favoring the development and commercial launching of potentially successful new or improved products, either through established or new business organizations.
The emphasis here is on the factors which stimulate and enable novel developments while also facilitating the selection of those with real competitive potential. The three key sets of factors involve: a rich “soup” of skills, ideas, technologies, and cultures within which new compounds and forms of life can emerge; a permissive environment enabling unconventional initiatives to be brought to the marketplace; and vigorously competitive and critical arenas operating selection criteria which anticipate and shape those of wider future markets. The arguments underlying this third hypothesis are currently dominated by the literatures on agglomeration, creativity, and knowledge spillovers.

Hypothesis 4: The contemporary geography of innovation is essentially a result of the fact that innovation is most likely to occur in small and medium-sized enterprises, whose spatial patterns happen to be uneven.

This fourth approach to explaining the uneven geographical patterns of innovation involves another type of “milieu” argument, which is focused on
the geography of cooperation, and rests on the perception that innovation is most likely to occur in small and medium-sized enterprises (SMEs) which have neither the scale nor the risk-bearing capacity to provide all of the key inputs on their own account. Observations from so-called “new industrial districts” (Scott 1988) such as Silicon Valley (Saxenian 1994) and the Emilia-Romagna region of Italy (Scott 1988; Castells and Hall 1994) have suggested that the geographical proximity of SMEs is a necessary criterion for the development of mutual trust relations based on a shared experience of interaction with decision-making agents in different firms.

The latter two hypotheses assume that innovation is directly related to industrial clustering and agglomeration, and as a result of the work of Krugman (1991), Porter (1990), and Scott (1988), these clustering–innovation hypotheses are now the most popular analytical approaches. Indeed, the nature of the links between spatial clustering and innovation has now become a key feature of the research into innovation systems and economic growth (Caniels 2000; Cantwell and Iammarino 2003; Acs 2002; Breschi and Lissoni 2001a,b). It is specific aspects of these clustering–innovation arguments that we will test in this chapter. In particular, we focus on the role which face-to-face knowledge exchanges play in promoting innovation. In order to do this, we must first acknowledge that there are in reality four quite distinct mechanisms by which face-to-face contact and innovation can be linked.
Firstly, face-to-face contact may promote innovation by increasing the possibility of informal knowledge spillovers between firms and individuals (Krugman 1991); secondly, face-to-face contact may promote innovation by increasing the mutual transparency of competitor behavior and thereby competitor responses (Porter 1990); thirdly, face-to-face contact may promote innovation by increasing the levels of cooperation as well as competition between firms (Scott 1988); fourthly, face-to-face contact may promote innovation by increasing the inter-firm mobility of labor (Almeida and Kogut 1999). Unfortunately, a lack of appropriate data has meant that it has previously not been possible to identify these different mechanisms empirically, and therefore it has been very difficult to distinguish between, and evaluate the effects of, tacit knowledge spillovers which accrue because of inter-firm or inter-organizational relations, and embodied human-capital knowledge transfers which take place due to the mobility of labor between firms. In principle at least, it is possible to treat the first three mechanisms by which face-to-face contact and innovation can be linked, namely informal knowledge spillovers, mutual transparency, and cooperation, as being qualitatively quite similar to each other. At the same time, these three mechanisms can be regarded as a group which is quite different in nature from the fourth mechanism, namely the inter-firm mobility of labor. The reason for this becomes clear if we adopt the analogy of a simple
stock-inventory model. The first three types of mechanism generally involve highly frequent short-term transactions of relatively small quantities of knowledge or information in comparison with the total knowledge of the person(s) undertaking the transaction, whereas the inter-firm mobility and acquisition of labor involves relatively less frequent transactions in which the whole human capital of the individual is transferred for a significant period. In terms of agglomeration theory, any of these types of transaction could be related to innovation. However, while empirical data on patents are readily accessible, along with aggregate measures of labor skills and education, microeconometric data on labor mobility and innovation are very difficult to find. For this reason, although there is some evidence supporting the role played by the mobility of local human capital in promoting innovation (Angel 1991; Audretsch and Stephan 1996; Almeida and Kogut 1999; Breschi and Lissoni 2003; Franco and Filson 2000; Persson 2002; Power and Lundmark 2004), these papers are very much in the minority. In most recent studies of the geography of innovation, analyses which emphasize the role played by local informal knowledge spillovers (Acs 2002; Kaiser 2002) have tended to predominate over human capital and labor mobility explanations. The prevalence of such papers therefore leads to the impression that informal knowledge spillovers tend to dominate labor market effects, although as yet there is no actual evidence for this. The objective of this chapter is to partially fill this research gap. Using a unique dataset, our approach is both to identify and distinguish between these different potential forms of knowledge transfer, and then to relate empirically the importance of these alternative knowledge transfer mechanisms to the innovation process.
The ﬁndings of this research should contribute to our knowledge of both the economics of growth and also the economics of space.

3 Data and Methodology

R&D cooperative relationships usually begin as a result of informal knowledge spillovers, although they subsequently develop into much more complex and formal arrangements. Of all the possible types of inter-firm relations, R&D cooperative relationships require the most intense face-to-face contact in order to be both established and maintained and, for high-technology firms at least, have been demonstrated to be the type of inter-firm relations which are most associated with geographical proximity (Arita and McCann 2000). This is because the required level of trust embedded in the mutual commitments made by the firms tends to be the highest of all types of inter-firm and inter-organizational relations. Therefore, testing whether the
existence, nature, and variety of R&D cooperation relationships are related to innovation provides an indirect test of whether the existence, nature, and variety of intense and continuing face-to-face contact are related to innovation. In this chapter, we aim to identify the role which labor mobility plays in promoting innovation, after having controlled for other firm-specific characteristics, including face-to-face contact with other firms and organizations embodied in these R&D cooperation relationships. In order to do this, we employ microeconometric data on the innovation behavior and performance of Finnish high-technology firms. As well as information about the innovation behavior of the firms, these data provide us with detailed information about the structural characteristics of the firms, and also information about the R&D cooperation behavior between individual Finnish firms. After controlling for the impact on innovation of a firm’s structural characteristics, and also for the nature and variety of a firm’s R&D cooperation relationships, we are then able to extend the argument to investigate the effects on a firm’s innovation performance of the geographical and sectoral origins of the firm’s labor acquisitions. This will allow us to isolate the role in innovation performance played by knowledge transfers associated with local and nonlocal human-capital mobility from the role played by inter-firm and inter-organizational knowledge spillovers associated with R&D cooperation relationships.

The establishment-level data used for our research come from innovation surveys conducted by Statistics Finland in 1996, 2000, and 2002. The innovation surveys undertaken by Statistics Finland are conducted using the subject approach, which is in line with the European Union Community Innovation Survey (CIS) framework.
These surveys collect information about an individual establishment’s innovation performance, defined as the launching of any new or substantially improved products or processes during the previous 2 years. In addition to the innovation outputs, the innovation surveys also provide information about the firm’s innovation cooperation with other enterprises or institutions. For the purposes of this research, these innovation databases have been further expanded with information from the Finnish Business Register for 1990–2002 and the Finnish Longitudinal Employer-Employee Database (FLEED) maintained by Statistics Finland, which provide us with information about the geographical and sectoral origins of labor recently acquired by the establishment. Our combined database now includes establishment-level information regarding each of the types of innovation exhibited by an establishment, the employment size of each establishment, the turnover of each establishment, the nature and variety of the R&D cooperation relations of the firm, and the geographical and sectoral origins of the labor recently acquired by the establishment.

Our dataset consists of all the Finnish high-technology firms (defined according to the 1995 SIC and listed in Table 1) which were included in the three innovation surveys of 1996, 2000, and 2002. The basic establishment-level unit of analysis we adopt is that of the “local unit,” and the innovation behavior we ascribe to the local unit comes from the firm-level innovation data. Assigning innovation behavior to the local unit is straightforward in the case of a single-plant or single-site firm because the firm is the local unit. In the case of a multiplant firm which has establishments in different areas, each individual establishment is the unit of analysis, and in the absence of any additional information, the innovation behavior we ascribe to each establishment is that of the multiplant firm as a whole. In the case of Finnish data, this technique has previously been adopted elsewhere (Alanen et al. 2000). There are two major justifications for this technique. Firstly, if we assume that information is transferred effectively between different local units within the same firm (Orlando 2000), then it is appropriate to give the same innovation status to all the individual local units of the same firm. Secondly, the majority of firms surveyed had fewer than three establishments, and almost half had only one establishment. In the few cases where a firm had several establishments in the same local area, the unit of analysis adopted is an aggregate of all the local plants combined.

In addition, the surveys also provide information about the R&D interaction between individual firms and other firms or institutions, specifically in order to promote innovation, and this information is provided at the level of the firm. Once again, with single-plant firms or with multiplant firms whose establishments are in the same locality, assigning interaction and cooperation behavior to the local unit is straightforward.

Table 1. High-technology industries (SIC 1995)

| High-technology industry | SIC 95 |
| --- | --- |
| Manufacture of pharmaceuticals, medicinal chemicals, and botanical products | 244 |
| Manufacture of office machines and computers | 30 |
| Manufacture of radio, television, communications equipment, and apparatus | 32 |
| Manufacture of aircraft and spacecraft | 353 |
| Telecommunications | 642 |
| Computers and related activities | 72 |
| Research & development | 73 |
| Architectural and engineering activities and related technical consultancy | 742 |

However, in the case of multiplant firms with establishments in different localities, it is possible that the individual establishments do not all interact with other firms or organizations to the same extent. Once again, in the absence of any additional information, we assign the same levels of cooperation and interaction to each establishment within the same firm. The information on R&D cooperation is treated here as our major proxy for face-to-face contact.

The raw nonadjusted data show that between the different size-bands, the proportion of firms innovating is relatively high among both the very small local establishments employing fewer than 10 people and the larger establishments employing more than 50 people, whereas the probability of innovation appears to be lower among the establishments employing between 10 and 49 people. This gives rise to a J-shaped relationship between establishment size and innovation intensity, as observed by Tether et al. (1997), who found that small enterprises did not introduce a disproportionately large share of innovations relative to their employment share, and only the largest enterprises introduced more innovations than their size would suggest. A similar result also holds in terms of the probability of cooperation for R&D activities.

For econometric analysis, given the types of data we have at our disposal, the analysis is most appropriately undertaken by employing a series of probit models, in which we estimate the probability of a particular type of innovation taking place as a function of the size characteristics of the firm, the cooperation behavior of the firm, and the labor acquisition behavior of the firm. In order to test the relationship between innovation behavior and the importance of these various features, we estimate three different types of probit model. In each of these three models, the binary dependent variable is defined as being equal to 1 if the local establishment has managed to introduce new innovations during the previous 2 years, and equal to 0 if it has not.
The three separate probit models cover all three response categories of innovations, i.e., product innovations, process innovations, and new products introduced to the market. Product innovation represents the situation where a firm has introduced a product to the market which is new for the particular firm, but not for the industry. Process innovation represents a situation where a firm has introduced a new production process or a new service delivery process. New products to market represents a situation where a firm introduces a product which is new to the market as a whole, and can be considered as being the most advanced form of innovation.

The set of independent variables we employ includes dummy variables for the employment size-band categories of the local establishments, a continuous variable for the establishment turnover, plus a dummy variable indicating whether the establishment has cooperated with other firms and institutions on R&D issues during the previous 2 years. Our cooperation variable takes a value of 1 if the local establishment cooperates with at least one type of partner, and 0 if it does not have any kind of cooperation
relationship with external partners. As we have seen at the beginning of this section, the importance of the cooperation dummy variable employed here lies in the fact that inter-firm cooperation is generally regarded as being positively related to the level of face-to-face interaction between firms, because of the need to build up mutual trust and confidence. This variable is therefore employed as a proxy for face-to-face contact. In terms of local labor market variables, our data are calculated with respect to local labor market regions. There are about 80 official subregions in Finland, with an average population of some 65,000. We employ a continuous variable representing the population density of the region in order to capture any local agglomeration spillover effects which are external to the individual firm (Ciccone and Hall 1996), and we also employ detailed data about the proportions, and the respective geographical and sectoral origins, of the firm’s total employment which has been acquired recently.

We run the set of three probit models, Model 1, Model 2, and Model 3, in three different types of circumstance, giving a total of nine different probit models. Model 1 estimates the likelihood of a firm producing a product innovation, Model 2 estimates the likelihood of a firm producing a process innovation, and Model 3 estimates the likelihood of a firm producing a new-product-to-market innovation. In the first case, we test whether innovation is related to either cooperation relations, or the pattern of local and nonlocal labor mobility. In the second case, we test whether the geographical and sectoral origins of the labor acquired are related to innovation, after controlling for the number and variety of R&D cooperation relations exhibited.
In the third case, having controlled for the geographical and sectoral origins of the labor acquired, we test whether the nature and types of R&D cooperation relations entered into are significantly related to innovation. As such, our models become progressively more detailed as we move from the three first-stage models to the three third-stage models. The models estimated for the first set of circumstances are estimated on the basis of a pooled dataset from all three survey years of 1996, 2000, and 2002, while the second and third groups of models are estimated on pooled data from the 1996 and 2000 surveys. The reason for this is that the 2002 survey did not contain any questions regarding the number, variety, and types of R&D cooperation relations a firm exhibited, whereas both the 1996 and 2000 surveys did so.
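
The probit specification described in this section can be sketched in code. The block below is a minimal illustration and not the estimation routine used for the chapter: it assumes that the largest size band is the omitted reference category, treats the two labor-mobility variables as shares of employment, and uses hypothetical function names and a hypothetical column layout throughout. It builds the regressor vector for one local unit and evaluates the probit probability and log-likelihood using only the Python standard library.

```python
import math

# Employment size bands of local units; the largest band is assumed to be
# the omitted reference category.
SIZE_BANDS = ("1-9", "10-19", "20-49", "50-99", "100-249", "250+")
REFERENCE_BAND = "250+"


def normal_cdf(z):
    """Standard normal CDF, i.e. the probit link function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def design_row(size_band, log_turnover, cooperates,
               share_local, share_nonlocal, density):
    """Regressor vector for one local unit (hypothetical column layout)."""
    if size_band not in SIZE_BANDS:
        raise ValueError(f"unknown size band: {size_band}")
    # One dummy per non-reference size band.
    dummies = [1.0 if size_band == band else 0.0
               for band in SIZE_BANDS if band != REFERENCE_BAND]
    return ([1.0] + dummies +
            [log_turnover, 1.0 if cooperates else 0.0,
             share_local, share_nonlocal, density])


def innovation_probability(row, beta):
    """P(innovation = 1 | row) = Phi(row . beta)."""
    index = sum(x * b for x, b in zip(row, beta))
    return normal_cdf(index)


def log_likelihood(rows, outcomes, beta):
    """Probit log-likelihood for binary outcomes (1 = innovated, 0 = not)."""
    total = 0.0
    for row, y in zip(rows, outcomes):
        p = innovation_probability(row, beta)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total
```

Maximizing `log_likelihood` over `beta` with a numerical optimizer would give probit estimates of the kind reported in the tables; each of the three innovation outcomes would use its own 0/1 outcome vector.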

4 Results

The nine estimated probit models reported in Tables 2–4 have a goodness-of-fit of between 0.194 and 0.499, with eight models having a goodness-of-fit
of over 0.2, and with all models exhibiting an increasing goodness-of-fit with the inclusion of additional explanatory variables. These model results are very good indeed.³ Table 2 shows the results of the models which test whether innovation is related to either cooperation relations or the pattern of local and nonlocal labor mobility. In terms of the various size characteristics of the establishment, employment size appears to be positively and significantly related to the introduction of new products to market for very small establishments, and negatively related to the probability of producing process innovations by medium-sized establishments, while turnover is positively and significantly related to both product innovations and new products to market. The one variable which is consistently and positively related to innovation performance is R&D cooperation. Cooperation with other firms or organizations for R&D is positively and significantly related to all three types of innovation. As we have already mentioned, R&D cooperation relationships require the most intense face-to-face contact in order to be both established and maintained and, for high-technology firms at least, have been demonstrated to be the type of inter-firm relations which are most associated with geographical proximity (Arita and McCann 2000). Therefore, this result supports the argument that face-to-face contact is essential for all types of innovation. In terms of the proportion of labor acquired from within the local region as against outside the region, our initial results in Table 2 suggest that local labor acquisition is an essential feature of both product innovation and new products to market.
These initial findings appear to lend support to agglomeration arguments, although the lack of any positive significance for the population density variable, and even negative significance in the case of process innovations, suggests that simple urbanization assumptions may not be tenable. In order to investigate this point further, in Table 3 we present the results of similar probit models to those reported in Table 2, but in these cases, the dichotomous R&D cooperation variable of Table 2 is now broken down into four categories in order to account for the number of different types of cooperation relation entered into. The seven types of cooperation partner are: other establishments within the same firm; customers and clients; suppliers; competitors; higher education institutions; consultants; and government research institutes or nonprofit private research institutes.
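
The breakdown of the cooperation dummy into four “wideness” categories can be illustrated as follows. This is a sketch of the variable construction as it can be read from the text and from the category labels in Table 3, not the authors' code: the partner-type identifiers are hypothetical, and government and nonprofit private research institutes are assumed to count as a single partner type so that the total is seven.

```python
# Hypothetical identifiers for the seven partner types named in the text.
PARTNER_TYPES = frozenset({
    "same_firm_establishments",
    "customers_and_clients",
    "suppliers",
    "competitors",
    "higher_education_institutions",
    "consultants",
    "research_institutes",  # government or nonprofit private (assumed one type)
})


def wideness_category(partner_types):
    """Map the partner types a local unit reports to a Table 3 category."""
    reported = set(partner_types)
    unknown = reported - PARTNER_TYPES
    if unknown:
        raise ValueError(f"unknown partner types: {sorted(unknown)}")
    count = len(reported)
    if count == 0:
        return "No partners"
    if count <= 2:
        return "1-2 types of partner"
    if count <= 4:
        return "3-4 types of partner"
    return "5-7 types of partner"  # reference category in Table 3


def cooperation_dummy(partner_types):
    """Binary cooperation variable of Table 2: 1 if any partner type is reported."""
    return 1 if len(set(partner_types)) > 0 else 0
```

Under this coding, the Table 2 dummy simply collapses the three non-empty categories into one, which is why the Table 3 models can be read as a refinement of the Table 2 models.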

³ In a choice model such as this, a pseudo R² value of 0.2 is regarded as being very good indeed (McFadden 1979).

Table 2. The probit model: likelihood of introducing new innovations (pooled data for the years 1996, 2000, and 2002). Dependent variable (0,1): introduction of product innovations (Model 1), introduction of process innovations (Model 2), and introduction of new products to the market (Model 3). Entries are coefficients, with P values in parentheses

| Name of the explanatory variables | Model 1: product innovations | Model 2: process innovations | Model 3: new products to the market |
| --- | --- | --- | --- |
| Intercept/constant | −2.0055*** (0.0075) | −0.2692 (0.6995) | −1.9084*** (0.0085) |
| Categories of LUs: Very small LUs (1–9) | 0.7910*** (0.0062) | −0.1515 (0.5671) | 0.7409*** (0.0065) |
| Categories of LUs: Small LUs (10–19) | 0.5061** (0.0514) | −0.4084* (0.0844) | 0.5531*** (0.0229) |
| Categories of LUs: SMLUs (20–49) | 0.3330 (0.1651) | −0.5030** (0.0208) | 0.4365** (0.0487) |
| Categories of LUs: SMLUs (50–99) | 0.1406 (0.5545) | −0.3557* (0.0909) | 0.1907 (0.3755) |
| Categories of LUs: SMLUs (100–249) | −0.0557 (0.8067) | −0.1782 (0.3636) | 0.1700 (0.4004) |
| Categories of LUs: Large LUs (250→) | 0.0000 (reference category) | 0.0000 (reference category) | 0.0000 (reference category) |
| Log turnover | 0.1948*** (<0.0001) | 0.0593 (0.1072) | 0.1374*** (0.0004) |
| Cooperation | 1.9509*** (<0.0001) | 1.2812*** (<0.0001) | 1.1726*** (<0.0001) |
| Mobility of labor from the same subregion (2-year lag) | 0.3663*** (<0.0001) | 0.0721 (0.3104) | 0.1355* (0.0626) |
| Mobility of labor from the other subregions (2-year lag) | 0.1050 (0.3546) | 0.0545 (0.6336) | 0.1017 (0.4077) |
| Population density | −0.0001 (0.7935) | −0.0005** (0.0458) | 0.0001 (0.6509) |
| Number of observations | 1628 | 1340 | 1179 |
| Log-likelihood | −664.4871 | −736.1665 | −679.8991 |

LR statistics: χ², with P values in parentheses

| Variable | Model 1 | Model 2 | Model 3 |
| --- | --- | --- | --- |
| Categories of LUs | 18.27 (0.0026) | 17.73 (0.0033) | 10.59 (0.0601) |
| Turnover | 23.33 (<0.0001) | 2.60 (0.1066) | 12.74 (0.0004) |
| Cooperation | 704.08 (<0.0001) | 272.19 (<0.0001) | 203.70 (<0.0001) |
| Mobility of labor from the same subregion | 22.32 (<0.0001) | 1.03 (0.3107) | 3.50 (0.0614) |
| Mobility of labor from the other subregions | 0.86 (0.3532) | 0.23 (0.6341) | 0.69 (0.4066) |
| Population density | 0.07 (0.7935) | 4.00 (0.0454) | 0.21 (0.6507) |
| LR-test (diff., df, P value) | 395.14, 12, <0.001 | 174.51, 12, <0.001 | 116.49, 12, <0.001 |
| McFadden’s likelihood ratio index | 0.373 | 0.192 | 0.146 |
| Efron’s pseudo R² fit measure | 0.448 | 0.243 | 0.194 |

*** Significant at the 1% level; ** significant at the 5% level; * significant at the 10% level
LR, likelihood ratio; LU, local unit; SMLU, small to medium local unit
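
To show how the coefficients in Table 2 translate into probabilities, the sketch below evaluates the Model 1 (product innovation) probit index for an illustrative very small local unit with and without R&D cooperation. The coefficients are those reported in Table 2. The establishment's characteristics (a log turnover of 6.0, mobility shares of 0.2 and 0.1, and a population density of 100) are hypothetical values chosen only for illustration, since the chapter does not report the units or sample means of the regressors.

```python
import math

# Model 1 coefficients as reported in Table 2 (product innovations).
MODEL_1 = {
    "intercept": -2.0055,
    "very_small_lu": 0.7910,
    "log_turnover": 0.1948,
    "cooperation": 1.9509,
    "mobility_same_subregion": 0.3663,
    "mobility_other_subregions": 0.1050,
    "population_density": -0.0001,
}


def normal_cdf(z):
    """Standard normal CDF (probit link)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def product_innovation_probability(cooperates):
    """Predicted probability for a hypothetical very small local unit."""
    # All regressor values below are illustrative assumptions.
    x = {
        "intercept": 1.0,
        "very_small_lu": 1.0,
        "log_turnover": 6.0,
        "cooperation": 1.0 if cooperates else 0.0,
        "mobility_same_subregion": 0.2,
        "mobility_other_subregions": 0.1,
        "population_density": 100.0,
    }
    index = sum(MODEL_1[name] * value for name, value in x.items())
    return normal_cdf(index)


p_without = product_innovation_probability(cooperates=False)
p_with = product_innovation_probability(cooperates=True)
print(f"without cooperation: {p_without:.3f}")
print(f"with cooperation:    {p_with:.3f}")
```

Because the probit link is nonlinear, the size of the cooperation effect on the probability depends on the values chosen for the other regressors; only the sign and significance of a coefficient carry over directly from the table.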

Table 3. The probit model: likelihood of introducing new innovations (pooled data, years 1996 and 2000)

Dependent variable (0,1): Model 1, introduction of product innovations; Model 2, introduction of process innovations; Model 3, introduction of new products to the market. Cells show coefficients with P values in parentheses.

Name of the explanatory variables                          Model 1                  Model 2                  Model 3
Intercept/constant                                         −4.0887*** (0.0016)      3.7375*** (0.0046)       −2.0842 (0.1283)
Categories of LUs
  Very small LUs (1–9)                                     1.4023*** (0.0020)       −0.9267** (0.0466)       1.0464** (0.0288)
  Small LUs (10–19)                                        1.3135*** (0.0013)       −1.2751*** (0.0018)      1.1020*** (0.0088)
  SMLUs (20–49)                                            0.9465*** (0.0083)       −0.9092*** (0.0100)      0.9683*** (0.0077)
  SMLUs (50–99)                                            0.6444* (0.0643)         −0.5327* (0.1108)        0.7866** (0.0266)
  SMLUs (100–249)                                          0.2745 (0.4086)          −0.3567 (0.2239)         0.4632 (0.1317)
  Large LUs (250→)                                         0.0000 (reference)       0.0000 (reference)       0.0000 (reference)
Log turnover                                               0.3595*** (<0.0001)      −0.1478** (0.0342)       0.1779** (0.0183)
Categories of the wideness of cooperation
  No partners                                              −2.7456*** (<0.0001)     −1.6056*** (<0.0001)     −2.0730*** (<0.0001)
  1–2 types of partner                                     −0.9034*** (0.0008)      −0.7597*** (0.0006)      −1.3745*** (<0.0001)
  3–4 types of partner                                     −1.0657*** (<0.0001)     −0.3529* (0.0884)        −0.5773** (0.0131)
  5–7 types of partner                                     0.0000 (reference)       0.0000 (reference)       0.0000 (reference)
Mobility of labor from the same subregion (2-year lag)     0.0680 (0.6163)          0.2392* (0.1010)         −0.4082** (0.0140)
Mobility of labor from the other subregions (2-year lag)   0.0646 (0.7708)          −0.2597 (0.3323)         −0.2359 (0.3881)
Population density                                         −0.0004 (0.2827)         −0.0006 (0.1599)         −0.0000 (0.9236)
Number of observations                                     786                      498                      480
Log-likelihood                                             −291.4024                −259.4586                −226.2607

LR statistics                                  Model 1               Model 2               Model 3
                                               χ²        P value     χ²        P value     χ²        P value
Categories of LUs                              14.64     0.0120      14.87     0.0109      8.42      0.1346
Turnover                                       24.43     <0.0001     4.55      0.0329      5.76      0.0164
Wideness of cooperation                        392.75    <0.0001     120.29    <0.0001     173.28    <0.0001
Mobility of labor from the same subregion      0.25      0.6160      2.68      0.1014      6.32      0.0119
Mobility of labor from the other subregions    0.08      0.7709      0.99      0.3203      0.79      0.3753
Population density                             1.15      0.2828      2.00      0.1577      0.01      0.9236

LR-test (diff., df, P value)                   232.62, 14, <0.001    82.57, 14, <0.001     106.11, 14, <0.001
McFadden’s likelihood ratio index              0.444                 0.241                 0.319
Efron’s pseudo R² fit measure                  0.499                 0.306                 0.396

*** Significant at the 1% level; ** significant at the 5% level; * significant at the 10% level

As we see in Table 3, after controlling for the variety of a firm’s R&D relations, the advantage of smallness appears to become even more noticeable for product innovation and new products to market, while the negative effects on process innovations of being a medium-sized or small firm also become more pronounced. Turnover also now appears to be negatively associated with process innovations, as well as being positively related to product innovations and new products to market. R&D cooperation is highly significant, in that cooperation with fewer than three or four different types of partner is negatively related to all three types of innovation. This suggests that the agglomeration–diversity arguments of new economic geography may have some relevance. However, in this case, the effects of the geography of labor acquisition now seem to disappear for both product innovations and new products to market, whereas they were consistently significant in the results reported in Table 2. The population density variable now has no significance.

In order to clarify these issues, we further disaggregate the R&D cooperation relations into their various types and treat them as independent categories. This is in order to see whether the nature of relations, rather than simply the number or variety of such relations, is important. As we see in Table 4, all our previous firm-size results from Tables 2 and 3 are confirmed, in that smallness is an advantage for both product innovations and new products to market, and a disadvantage for process innovations, after controlling for turnover. Meanwhile, in terms of the geography of labor market variables, there appears to be no positive effect of local labor acquisition on innovation behavior whatsoever. On the other hand, in terms of the importance of geographic localization for product innovation and new products to market, we see that both R&D cooperation with other establishments within the same enterprise and cooperation with clients or customers are essential for all types of innovation, while R&D cooperation with suppliers is essential for both product and process innovations. R&D relations with universities are positively associated with new products to market, but negatively related to product and process innovations, while R&D relations with government or private nonprofit research institutes are important for all types of innovation.
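The fit statistics reported at the foot of Tables 2 to 4 can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions: an LR statistic of 2(ln L − ln L0) against an intercept-only model, McFadden's index 1 − ln L/ln L0, and Efron's measure 1 − Σ(y − p̂)²/Σ(y − ȳ)². The chapter does not state the exact formulas it used, so the function names and the toy data below are illustrative assumptions rather than the authors' computation.

```python
import math


def bernoulli_loglik(y, p):
    """Log-likelihood of binary outcomes y under predicted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))


def fit_measures(y, p_hat):
    """Return (LR statistic, McFadden's index, Efron's pseudo R2).

    Assumes textbook definitions; the intercept-only benchmark predicts
    the sample mean of y for every observation.
    """
    y_bar = sum(y) / len(y)
    ll_model = bernoulli_loglik(y, p_hat)
    ll_null = bernoulli_loglik(y, [y_bar] * len(y))
    lr_stat = 2.0 * (ll_model - ll_null)
    mcfadden = 1.0 - ll_model / ll_null
    ss_resid = sum((yi - pi) ** 2 for yi, pi in zip(y, p_hat))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    efron = 1.0 - ss_resid / ss_total
    return lr_stat, mcfadden, efron


# Hypothetical outcomes (1 = innovation introduced) and fitted probabilities.
y = [1, 0, 1, 1, 0, 0, 1, 0]
p_hat = [0.8, 0.3, 0.7, 0.6, 0.4, 0.2, 0.9, 0.5]

lr_stat, mcfadden, efron = fit_measures(y, p_hat)
print(f"LR = {lr_stat:.2f}, McFadden = {mcfadden:.3f}, Efron = {efron:.3f}")
```

In practice the fitted probabilities would come from the estimated probit model, and the LR statistic would be compared with a chi-squared distribution whose degrees of freedom equal the number of restricted coefficients.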

5 Discussion and Conclusions

The results given here suggest that R&D cooperation is always essential for the introduction of all types of innovation, and particularly R&D cooperation with other establishments within the same firm, as well as with customers and suppliers. Given that R&D cooperation relationships require the

Table 4. The probit model: likelihood of introducing new innovations (pooled data, years 1996 and 2000)

Dependent variable (0,1): Model 1, introduction of product innovations; Model 2, introduction of process innovations; Model 3, introduction of new products to the market. Cells show coefficients with P values in parentheses.

Name of the explanatory variables                              Model 1                  Model 2                  Model 3
Intercept/constant                                             −2.4894* (0.0586)        4.7462*** (0.0006)       −1.4509 (0.3054)
Categories of LUs
  Very small LUs (1–9)                                         1.1787*** (0.0119)       −1.2638*** (0.0097)      1.0146** (0.0425)
  Small LUs (10–19)                                            1.2682*** (0.0025)       −1.4347*** (0.0007)      1.1556*** (0.0076)
  SMLUs (20–49)                                                0.8635** (0.0203)        −1.0355*** (0.0047)      0.9857*** (0.0086)
  SMLUs (50–99)                                                0.6396* (0.0765)         −0.6348* (0.0682)        0.8636** (0.0181)
  SMLUs (100–249)                                              0.2949 (0.3980)          −0.3884 (0.1988)         0.5170* (0.1033)
  Large LUs (250→)                                             0.0000 (reference)       0.0000 (reference)       0.0000 (reference)
Log turnover                                                   0.3155*** (<0.0001)      −0.1911*** (0.0092)      0.1527** (0.0491)
Cooperation
  Within enterprise                                            1.0235*** (<0.0001)      0.4994** (0.0128)        0.4789** (0.0224)
  With suppliers                                               0.6635*** (0.0049)       0.5160*** (0.0063)       0.0821 (0.6881)
  With clients or customers                                    1.0696*** (<0.0001)      0.6978*** (0.0014)       0.4464** (0.0399)
  With competitors                                             −0.0985 (0.7240)         −0.4640** (0.0236)       0.0872 (0.7066)
  With consultants                                             0.2852 (0.1962)          0.2536 (0.1593)          0.0208 (0.9187)
  With universities and higher education institutes            −0.4343* (0.0589)        −0.4587* (0.0600)        0.7827*** (0.0008)
  With government or private nonprofit research institutes     0.9819*** (0.0004)       0.7516*** (0.0005)       0.4166* (0.0762)
Mobility of labor from the same subregion (2-year lag)         0.0763 (0.5787)          0.2017 (0.1884)          −0.4650*** (0.0055)
Mobility of labor from the other subregions (2-year lag)       0.2145 (0.3242)          −0.2191 (0.4308)         −0.2660 (0.3389)
Population density                                             −0.0006 (0.1296)         −0.0006 (0.1926)         −0.0000 (0.9878)
Number of observations                                         786                      498                      480
Log-likelihood                                                 −297.9683                −246.2274                −226.8095

Table 4. Continued

LR statistics                                                  Model 1               Model 2               Model 3
                                                               χ²        P value     χ²        P value     χ²        P value
Categories of LUs                                              12.83     0.0250      15.12     0.0099      9.18      0.1021
Turnover                                                       17.78     <0.0001     6.92      0.0085      3.98      0.0460
Cooperation
  Within enterprise                                            22.10     <0.0001     6.18      0.0129      5.23      0.0222
  With suppliers                                               8.11      0.0044      7.50      0.0062      0.16      0.6888
  With clients or customers                                    27.34     <0.0001     10.05     0.0015      4.21      0.0402
  With competitors                                             0.12      0.7247      5.13      0.0235      0.14      0.7061
  With consultants                                             1.68      0.1947      1.98      0.1592      0.01      0.9187
  With universities and higher education institutes            3.63      0.0569      3.65      0.0561      11.35     0.0008
  With government or private nonprofit research institutes     14.13     0.0002      12.46     0.0004      3.11      0.0777
Mobility of labor from the same subregion                      0.31      0.5787      1.72      0.1895      8.13      0.0043
Mobility of labor from the other subregions                    0.96      0.3282      0.65      0.4204      0.97      0.3240
Population density                                             2.30      0.1290      1.72      0.1902      0.00      0.9878

LR-test (diff., df, P value)                                   226.05, 24, <0.001    95.80, 24, <0.001     105.53, 24, <0.001
McFadden’s likelihood ratio index                              0.431                 0.280                 0.318
Efron’s pseudo R² fit measure                                  0.471                 0.355                 0.390

*** Significant at the 1% level; ** significant at the 5% level; * significant at the 10% level

most intense face-to-face contact in order to be both established and maintained, and that for high-technology firms, at least, R&D cooperation links have been demonstrated to be the type of inter-firm relations which are most associated with geographical proximity (Arita and McCann 2000), our finding here suggests that local face-to-face contact is an essential feature of the innovation process. Moreover, our results concerning the importance of the variety of R&D relations also provide support for this argument. In particular, the increased intensity of face-to-face contacts made possible by geographical proximity provides a firm with the ability to interact with a greater number of potential contacts, and therefore to benefit from a greater range of knowledge spillovers (Storper and Venables 2004).

This is not to say, however, that the evidence unambiguously supports agglomeration–innovation arguments. The reason is that our labor market results point to a rather different conclusion. In particular, having controlled for the types of R&D cooperation relations which a firm exhibits, the finding that neither local labor acquisition nor local population density is positively related to innovation casts doubt on some of the agglomeration–innovation theories stressing the importance of local labor market skills. As such, to the extent that industrial clustering and innovation are found to be associated, our results here, as well as other theoretical arguments (McCann and Mudambi 2004, 2005), imply that this outcome is most likely related only to the particular information problems faced by small firms, as detailed in hypothesis 4 of Sect. 2. While innovation is most likely to occur in small and medium-sized enterprises, these small firms have neither the scale nor the risk-bearing capacity to provide all of their own key inputs. As such, they need to develop external cooperation relations, and these are most easily fostered with other local firms. The result is firm-clustering. Yet, as we see from the results here, this is only one limited aspect of the relationship between geography and innovation.

No previous authors, as far as we are aware, have been able to relate data on both knowledge spillovers and labor mobility to innovation simultaneously. As such, no previous papers have been able to identify and distinguish between these two effects. On the other hand, by using data on R&D cooperation as a proxy for face-to-face contact and knowledge spillovers, and combining this with information on the geographical and sectoral origins of a firm’s labor acquisitions, our results here can begin to shed some light on the mechanisms by which innovation takes place.

Acknowledgments. We are grateful for the suggestions and comments of Mikko Puhakka, Markku Rahiala and Rauli Svento. The authors also wish to thank Statistics Finland for the provision of data.


References

Acs Z (2002) Innovation and the growth of cities. Edward Elgar, Cheltenham
Alanen A, Houtari J, Kangasharju A (2000) Constructing a new indicator for regional impact of innovativeness. Working Paper, Pellervo Economic Research Institute, Helsinki
Almeida P, Kogut B (1999) Localization of knowledge and the mobility of engineers. Manage Sci 45:905–917
Angel DP (1991) High technology agglomeration and the labor market: the case of Silicon Valley. Environ Plann A 1501–1516
Arita T, McCann P (2000) Industrial alliances and firm location behaviour: some evidence from the US semiconductor industry. Appl Econ 32:1391–1403
Audretsch DB, Stephan PE (1996) Company–scientist locational links: the case of biotechnology. Am Econ Rev 86:641–652
Breschi S, Lissoni F (2001a) Localised knowledge spillovers vs. innovative milieux: knowledge “tacitness” reconsidered. Pap Reg Sci 80:255–273
Breschi S, Lissoni F (2001b) Knowledge spillovers and local innovation systems: a critical survey. Ind Corp Change 10:975–1005
Breschi S, Lissoni F (2003) Mobility and social networks: localised knowledge spillovers revisited. CESPRI Working Paper 142, Department of Economics, Bocconi University, Milan
Caniels M (2000) Knowledge spillovers and economic growth. Edward Elgar, Cheltenham
Cantwell JA, Iammarino S (2003) Multinational corporations and European systems of innovation. Routledge, London
Castells M, Hall P (1994) Technopoles of the world: the making of 21st century industrial complexes. Routledge, London
Ciccone A, Hall RE (1996) Productivity and the density of economic activity. Am Econ Rev 86:54–70
Franco AM, Filson D (2000) Knowledge diffusion through employee mobility. Federal Reserve Bank of Minneapolis: Research Department Report 272, Minneapolis
Gordon IR, McCann P (2005a) Innovation, agglomeration and regional development. J Econ Geogr 5:523–543
Gordon IR, McCann P (2005b) Clusters, innovation and regional development: an analysis of current theories and evidence. In: Johansson B, Karlsson C, Stough R (eds) Entrepreneurship, spatial industrial clusters and inter-firm networks. Edward Elgar, Cheltenham
Jaffe AB, Trajtenberg M, Henderson R (1993) Geographic localization of knowledge spillovers as evidenced by patent citations. Q J Econ 108:577–598
Kaiser U (2002) Measuring knowledge spillovers in manufacturing and services: an empirical assessment of alternative approaches. Res Policy 31:125–144
Krugman P (1991) Geography and trade. MIT Press, Cambridge
Markusen A (1985) Profit cycles, oligopoly and regional development. MIT Press, Cambridge
McCann P, Mudambi M (2004) The location decision of the multinational enterprise: some theoretical and empirical issues. Growth Change 35:491–524
McCann P, Mudambi M (2005) Analytical differences in the economics of geography: the case of the multinational firm. Environ Plann A 37:1857–1876
McFadden D (1979) Quantitative methods for analysing travel behaviour of individuals. In: Hensher DA, Stopher PR (eds) Behavioural travel modelling. Croom Helm, London
Orlando MJ (2000) On the importance of geographic and technological proximity for R&D spillovers: an empirical investigation. Research Working Paper RWP00-02, Research Division, Federal Reserve Bank of Kansas City
Persson LO (2002) The impact of regional labour flows in the Swedish knowledge economy. In: Higano Y, Nijkamp P, Poot J, van Wyk K (eds) The region in the new economy: an international perspective on regional dynamics in the 21st century. Ashgate, Aldershot, p 221–237
Porter ME (1990) The competitive advantage of nations. Free Press, New York
Power D, Lundmark M (2004) Working through knowledge pools: labour market dynamics, the transference of knowledge and ideas, and industrial clusters. Urban Stud 41:1025–1044
Saxenian A (1994) Regional advantage: culture and competition in Silicon Valley and Route 128. Harvard University Press, Cambridge
Schumpeter JA (1934) The theory of economic development. Harvard University Press, Cambridge
Scott AJ (1988) New industrial spaces. Pion, London
Storper M, Venables AJ (2004) Buzz: face-to-face contact and the urban economy. J Econ Geogr 4:351–370
Tether BS, Smith IJ, Thwaites AT (1997) Smaller enterprises and innovation in the UK: the SPRU innovations database revisited. Res Policy 26:19–32

12. Taxation of Car Commuters’ Employer-Subsidized Parking

Rickard E. Wall

Department of Social Sciences, Informatics and Teaching, Gotland University, SE-621 67 Visby, Sweden

Summary. In order to mitigate traffic jams in dense urban areas, congestion charges are currently being planned and introduced in cities around the world. In Sweden, empirical results of the Jansson and Wall (2002) Stockholm study have drawn attention to another potential economic policy instrument, namely taxation of car commuters’ free or partly employer-subsidized parking. The Jansson and Wall study shows that to reduce traffic volumes, it is not sufficient to tax employers for offering this fringe benefit, e.g., in terms of social benefit payments. Commuters must pay taxes in some form out of their own pockets. If properly designed and enforced, taxation of commuters’ free or partly employer-subsidized parking is estimated by Jansson and Wall to reduce rush-hour car traffic into the inner city of Stockholm by as much as 10%–20%.

Key words. Taxation-free parking, Traffic congestion

1 Background and Purpose

The dramatic increase in motorized traffic during the post-World War II era has given rise to severe congestion problems in dense urban areas all over the world. In the early days of motorization, the standard approach for dealing with current or anticipated traffic congestion was investment in road transport infrastructure. During the last few decades, however, the potential of properly designed complementary measures has increasingly been recognized. For example, motorized traffic has long been subject to tolling. Traditionally, the purpose of tolling has been to finance the roads, bridges, and other kinds of transport infrastructure the tolled vehicles use. Nowadays, the explicit purpose of introducing tolling systems is often to reduce traffic volumes, typically in inner-city areas. Where this is the case, the appropriate term for the tolling system is congestion charges.

In 1975, the world’s first large-scale congestion-charge system was put into operation in Singapore, and in 1998, advanced technology was employed to facilitate the payment process. Congestion charges were introduced in London in February 2003, and the car traffic volume within the congestion-charge zone was subsequently considerably reduced. From the point of view of congestion-charge proponents, the London experience offers strong evidence that such charges are an effective means of reducing inner-city car traffic congestion. It is likely that in the years to come, a number of congestion-charge systems will be put into operation all over the globe.

On January 3, 2006, Stockholm, Sweden, will begin a 7-month congestion-charge trial. The trial will conclude on July 31, 2006, and a local referendum on the permanent implementation of congestion charges will be held on September 17, 2006. Proponents of the trial expect that the congestion charges will result, as in London, in an appreciable reduction of the Stockholm inner-city car traffic.

It perhaps comes as no great surprise that increasing the cost of driving a car in a particular area reduces the car traffic volume in that area. An interesting question, however, is whether it is possible to achieve the same results with lower charge-collection costs by implementing some other traffic-reducing system. The congestion charge system in London was the result of a careful planning process spanning some 5 years.
At an early stage of the planning process, an alternative to implementing congestion charges was considered very seriously, i.e., taxing car commuters’ free or partly employer-subsidized parking in the inner-city area of London. The idea was to tax firms on parking lots offered to employees free of charge or partially subsidized. The parking taxation alternative was abandoned during the course of the London planning process in favor of congestion charges, mainly because the firms’ behavior was not expected to be sensitive to such taxation. It seems that no one involved in the London system planning process asked the crucial question: What effect would it have on traffic congestion if employees were responsible for paying the parking tax themselves? However, this question has been raised and studied in Sweden, by Jansson and Wall (2002), and this paper is available in Swedish.

The purpose of the present chapter is to present the empirical results of the Jansson and Wall Stockholm study, and to comment on its legislative and practical consequences. The chapter is organized as follows. Section 2 highlights issues related to car traffic congestion in large cities, and provides some preliminaries on the Jansson and Wall study. Section 3 reviews the study, and Sect. 4 reports on the legal and practical consequences of the study. A few remarks in Sect. 5 conclude the chapter.

2 The Problem

The major road-traffic problem in Stockholm, as in many other large cities, is car congestion, which may be said to have three major sources.

1. Workers commuting by car from suburbs to the inner city in the morning and back in the late afternoon.
2. Workers commuting by car from suburbs to other suburbs and passing through the inner city in the morning and back in the late afternoon.
3. Car traffic within the inner city during the day, particularly after the lunch hour.

The theoretically ideal road pricing principle implies that each motorist should be charged the price-relevant social marginal cost for using space in streets and roads in proportion to the time and distance of his utilization. This means that the charge should be differentiated, for instance, with respect to when and where in the city the vehicle is used. The road pricing principle is appropriate for dealing with the three sources of traffic congestion listed above. However, today’s technology does not allow an implementation of road pricing that is fully in line with the theoretical ideal. The congestion charge systems that are currently operating or are being planned are, by necessity, second-best solutions.

Most car travelers who enter an inner-city area park there for a shorter or longer period of time. This observation allows for implementation of an alternative, or complementary, second-best solution, namely measures directed at the parking market. An important fact for Swedish conditions is that the tax legislation stipulates that an employer’s provision of free or partly subsidized employee parking is to be regarded as a fringe benefit, and therefore, individual private commuters who use such parking are subject to taxation. The benefit should be valued at its market value. (In addition, the employer must pay social benefits in proportion to the market value of the benefit.) Monthly parking costs on the commercial parking market in the inner city of Stockholm amount to SEK 2000–4000 (1 USD ≈ 8 SEK in August 2005). As the income tax rate in Sweden varies from about 30% to 50%, such a parking taxation would amount to some 5% of a typical employee’s net salary. Such costs are clearly prohibitive. Indeed, traffic planners and transport economists have long been aware that only a few percent of the car commuters into the inner city of Stockholm pay out of their own pockets anything close to market prices for parking (see, e.g., Jansson and Swahn 1987). Yet many thousands of car commuters from the suburbs do park in the inner city of Stockholm during working days, in spaces provided free by their employers.

However, there is an exception in Swedish taxation law. If an employee uses his private car, on order of his employer, for business trips during the workday, then the free parking at the workplace is not subject to taxation. No study to determine the proportion of free-parking commuters who use their cars during the workday had ever been undertaken, but it was generally believed that the vast majority of Stockholm’s private car commuters fell into this category. Therefore, the fact that some employers offered free parking to employees who commuted by car was not considered significant by the tax authorities, and this particular tax stipulation has traditionally been poorly enforced. The tax stipulation’s implications for the traffic congestion situation, particularly during rush hours in Stockholm, have gone unrealized. It seems that nobody recognized that this type of measure in the parking market is potentially useful for mitigating congestion that results from source 1 above.
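The order of magnitude of the tax at stake can be checked with a short Python sketch. It uses only the figures quoted above (a benefit valued at SEK 2000–4000 per month, income tax rates of about 30% to 50%, and 8 SEK per USD); the function names and the net monthly salary of SEK 20 000 are hypothetical, introduced purely for illustration and not taken from the Jansson and Wall study.

```python
def monthly_parking_tax(benefit_sek, tax_rate):
    """Income tax due per month on a parking benefit valued at market price."""
    return benefit_sek * tax_rate


def share_of_net_salary(tax_sek, net_salary_sek):
    """Tax on the parking benefit as a fraction of net monthly salary."""
    return tax_sek / net_salary_sek


SEK_PER_USD = 8.0                   # exchange rate quoted in the text (August 2005)
BENEFIT_RANGE = (2000.0, 4000.0)    # monthly market value of a space, SEK (from the text)
TAX_RATE_RANGE = (0.30, 0.50)       # approximate income tax rates (from the text)
NET_SALARY = 20000.0                # hypothetical net monthly salary, SEK (assumption)

lowest = monthly_parking_tax(BENEFIT_RANGE[0], TAX_RATE_RANGE[0])
highest = monthly_parking_tax(BENEFIT_RANGE[1], TAX_RATE_RANGE[1])

print(f"Monthly tax: SEK {lowest:.0f} to {highest:.0f} "
      f"(USD {lowest / SEK_PER_USD:.0f} to {highest / SEK_PER_USD:.0f})")
print(f"Share of the assumed net salary: "
      f"{share_of_net_salary(lowest, NET_SALARY):.0%} to "
      f"{share_of_net_salary(highest, NET_SALARY):.0%}")
```

The resulting share depends heavily on the assumed salary and on where in the two ranges a given commuter falls, so the sketch is meant only to show how the quoted figures combine.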

3 Relevant Empirical Evidence

Some 300 000 people work daytime hours in the inner city of Stockholm. Many of these people live in Stockholm’s suburbs and commute to the inner city on a daily basis. The aim of the Jansson and Wall Stockholm study was to estimate the extent of free or partly subsidized parking offered to these car commuters, and to calculate the likely effects on rush-hour traffic volume of strictly enforcing present tax legislation on parking benefits.

In the main study, 28 workplaces in the inner city of Stockholm were visited during November and December 2001. The sample was not randomly drawn, but was intentionally selected so as to include different kinds of relevant workplaces, that is, private firms, nonprofit organizations, and public authorities. In a complementary study, 30 schools were visited. The commuting and parking routines of employees in both studies were recorded during interviews and by visual observations of car parks. It appeared to be common knowledge among employees that free or partly subsidized parking is legally subject to taxation. Therefore, it was necessary to make use of different interview and observation “tricks” in order to reveal the true parking situation at some workplaces.

3.1 Empirical Evidence of a Wide Selection of Workplaces

A total of 28 workplaces, employing some 2850 people, were visited, meaning that this empirical part of the Jansson and Wall study covered some (2850/300 000 ≈) 1% of the total labor market in the inner city of Stockholm. The main aim of these visits was to get a rough idea of what proportion of the labor force commuted to work by car, but did not use their cars for business purposes during the working day, and thus would be taxed for parking if current tax stipulations were strictly enforced. The results are summarized in Tables 1 and 2. Table 1 presents data for workplaces at which the employer does not provide free parking, and Table 2 presents data for workplaces at which the employer does provide free parking.

An additional workplace, not listed in Table 1 or 2, was visited, at which parking was partly subsidized by the employer. The employer at this workplace gave its approximately 500 employees the opportunity to rent a parking space for SEK 500 per month. The employer supplied 45 spaces, but only about 40 spaces were rented by employees. This gives a car commuting proportion of (40/500 =) 8%.

Table 1 shows that at the 15 workplaces at which the employer does not provide free parking, only 2% of the employees commuted by car. Table 2 shows that this figure rises to 18% for those 12 workplaces at which the employer does provide free parking.

Table 1. Data for the 15 workplaces in the sample at which the employer does not provide employee parking spaces free of charge

Workplace no.   Number of employees   Car commuters (number)   Car commuters (%)
1               9                     0                        0
2               40                    2                        5
3               19                    0                        0
4               170                   2                        1
5               10                    1                        10
6               175                   5                        3
7               45                    0                        0
8               35                    0                        0
9               550                   10                       2
10              70                    2                        3
11              3                     0                        0
12              325                   3                        1
13              8                     0                        0
14              2                     1                        50
15              110                   0                        0
Total           1571                  26                       2

Table 2. Data for the 12 workplaces in the sample at which the employer does provide employee parking spaces free of charge

Workplace no.   Number of employees   Car commuters (number)   Car commuters (%)
16              130                   20                       15
17              170                   14                       8
18              33                    7                        21
19              25                    5                        20
20              13                    10                       77
21              150                   30                       20
22              7                     3                        43
23              6                     3                        50
24              4                     1                        25
25              52                    25                       48
26              25                    4                        16
27              160                   15                       11
Total           775                   137                      18

If the corresponding proportions hold true for the total labor market of the inner city of Stockholm, this would mean that employers provide their employees with about ((300 000/2850) × 137 ≈) 17 000 parking spaces free of charge, and partly subsidize an additional ((300 000/2850) × 40 ≈) 4000 spaces.

It may be argued, however, that the present sample is not only nonrandom and small in size, but is also inadequate in other respects. In order to shed more light on how car commuters would react if they were required to pay taxes on their parking spaces, another comparison between two groups of car commuters was conducted. In the supplementary study, 30 schools in the inner city of Stockholm were visited.
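The pooled proportions and the scaling step described above can be retraced with a minimal Python sketch. The workplace figures are copied from Tables 1 and 2, and the labor-market and sample sizes are those quoted in the text; the variable and function names are illustrative assumptions. Taken literally, the two scaling expressions evaluate to roughly 14 400 and 4 200 spaces, whereas the chapter quotes the rounded figures 17 000 and 4000, and the sketch makes no attempt to reproduce whatever further adjustment or rounding underlies them.

```python
# (number of employees, number of car commuters) per workplace
no_free_parking = [(9, 0), (40, 2), (19, 0), (170, 2), (10, 1), (175, 5),
                   (45, 0), (35, 0), (550, 10), (70, 2), (3, 0), (325, 3),
                   (8, 0), (2, 1), (110, 0)]                      # Table 1
free_parking = [(130, 20), (170, 14), (33, 7), (25, 5), (13, 10), (150, 30),
                (7, 3), (6, 3), (4, 1), (52, 25), (25, 4), (160, 15)]  # Table 2


def pooled_share(rows):
    """Return (total employees, total car commuters, pooled commuting share)."""
    employees = sum(e for e, _ in rows)
    commuters = sum(c for _, c in rows)
    return employees, commuters, commuters / employees


for label, rows in [("No free parking", no_free_parking),
                    ("Free parking", free_parking)]:
    employees, commuters, share = pooled_share(rows)
    print(f"{label}: {commuters}/{employees} employees commute by car ({share:.1%})")

LABOR_MARKET = 300_000    # daytime workers in the inner city (from the text)
SAMPLE = 2_850            # employees at the 28 visited workplaces (from the text)
SUBSIDIZED_COMMUTERS = 40 # car commuters at the partly subsidized workplace

scale = LABOR_MARKET / SAMPLE
free_spaces = scale * pooled_share(free_parking)[1]
subsidized_spaces = scale * SUBSIDIZED_COMMUTERS
print(f"Scaled free spaces: {free_spaces:.0f}")
print(f"Scaled partly subsidized spaces: {subsidized_spaces:.0f}")
```

Because the sample is small and was not randomly drawn, the scaled numbers should be read as rough orders of magnitude only.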

3.2 Empirical Evidence of Schools

Free parking for school employees, the majority of whom are teachers, is in most cases offered at the schoolyard or campus itself, and is therefore easy to observe. The 30 schools visited represent almost all elementary and high schools in the inner city of Stockholm. It was quite easy to determine whether the commuter’s private car is needed during working hours; for teachers, such a need is virtually absent, for obvious reasons. The supply of free parking provided by schools displayed a high degree of variation. The decision as to whether or not free parking should be provided is typically left to the school principal’s discretion. Free parking was not provided at seven of the schools visited. The results are presented in Table 3.

Table 3. Data for the 7 schools in the sample at which free parking is not offered to the personnel

School no.   Number of employees   Car commuters (number)   Car commuters (%)
I            80                    3                        4
II           90                    0                        0
III          45                    0                        0
IV           90                    5                        6
V            140                   5                        4
VI           50                    0                        0
VII          18                    2                        11
Total        570                   15                       3

The data in Table 3 are much in line with those of Table 1. That is, where the employer does not provide free parking, only a few percent of the employees use their own cars for commuting. The similarity in percentages is conspicuous—3% in the case of schools, and 2% for the wide selection of workplaces. The results indicate that only a small percentage of total commuters use private cars when free parking is unavailable. Indeed, this seems to hold true across the whole Stockholm inner-city labor market. Tables 4 and 5 report data from 22 schools at which employers provided free parking. Table 4 covers the 11 schools at which the number of free parking spaces supplied is not sufﬁcient to cover employee demand, and Table 5 covers the 11 schools at which the employer has met employee demand for free parking. The school data in Tables 4 and 5 are much in line with those in Table 2. Free parking seems to increase the car commuting percentage considerably. It is also interesting to note that an upper limit for the car commuting proportion seems to exist. Indeed, in Table 2, an average proportion of 18% of car commuting was found in relation to free parking; this happens to match the school study proportion (where supply covers demand) reported in Table 5. In the main study, one workplace was found at which parking was partly subsidized by the employer. A corresponding observation appears in the school sample. At one school, parking was offered to employees on a monthly basis for SEK 400. Some (5/75 ≈) 7% of the schools employees utilized this offer. These 7% can be compared with the 8% found in the main study’s single case wherein the subsidized parking charge was SEK 500. Schools in Stockholm are rather special workplaces in that they are often equipped with spacious schoolyards. It is convenient, and may thus be tempting, for school principals to offer free parking to school personnel in parts of these schoolyards. 
Thus, it should come as no surprise that free parking is twice as common for school employees as it is for employees elsewhere in the labor market, as observed by Jansson and Wall. To most employers, however, offering parking spaces to the employees implies explicit costs. Therefore, schools should not be regarded as representative in this particular respect, and should not be included in estimations of the total car commuter proportion into the inner city of Stockholm.

Table 4. Data for the 11 schools in the sample at which free parking is provided, but the number of parking spaces is not sufficient to cover employee demand

School no. | Number of employees | Car commuters (number) | Car commuters (%)
VIII | 130 | 3 | 2
IX | 80 | 3 | 4
X | 100 | 6 | 6
XI | 80 | 6 | 8
XII | 80 | 6 | 8
XIII | 50 | 5 | 10
XIV | 400 | 40 | 10
XV | 200 | 20 | 10
XVI | 95 | 10 | 11
XVII | 110 | 15 | 14
XVIII | 100 | 15 | 15
Total | 1425 | 129 | 9

Table 5. Data for the 11 schools in the sample at which free parking is provided and the number of parking spaces is sufficient to cover employee demand

School no. | Number of employees | Car commuters (number) | Car commuters (%)
XIX | 50 | 5 | 10
XX | 200 | 20 | 10
XXI | 200 | 30 | 15
XXII | 60 | 10 | 17
XXIII | 80 | 14 | 18
XXIV | 110 | 20 | 18
XXV | 80 | 15 | 19
XXVI | 80 | 15 | 19
XXVII | 90 | 20 | 22
XXVIII | 85 | 20 | 24
XXIX | 50 | 25 | 50
Total | 1085 | 194 | 18
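The pooled percentages in the "Total" rows of Tables 4 and 5 can be recomputed from the school-level rows. The short Python sketch below is only an illustration of that arithmetic; the function and variable names are hypothetical, and the figures are transcribed from the two tables.

```python
# Rows transcribed from Tables 4 and 5: (school no., employees, car commuters).
TABLE_4 = [
    ("VIII", 130, 3), ("IX", 80, 3), ("X", 100, 6), ("XI", 80, 6),
    ("XII", 80, 6), ("XIII", 50, 5), ("XIV", 400, 40), ("XV", 200, 20),
    ("XVI", 95, 10), ("XVII", 110, 15), ("XVIII", 100, 15),
]
TABLE_5 = [
    ("XIX", 50, 5), ("XX", 200, 20), ("XXI", 200, 30), ("XXII", 60, 10),
    ("XXIII", 80, 14), ("XXIV", 110, 20), ("XXV", 80, 15), ("XXVI", 80, 15),
    ("XXVII", 90, 20), ("XXVIII", 85, 20), ("XXIX", 50, 25),
]


def pooled_share(rows):
    """Return (total employees, total car commuters, pooled share in whole percent)."""
    employees = sum(n_employees for _, n_employees, _ in rows)
    commuters = sum(n_commuters for _, _, n_commuters in rows)
    return employees, commuters, round(100 * commuters / employees)


if __name__ == "__main__":
    print("Table 4 (supply below demand):", pooled_share(TABLE_4))
    print("Table 5 (supply covers demand):", pooled_share(TABLE_5))
```

Note that the pooled share weights each school by its number of employees, so it differs from a simple average of the school-level percentages.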


3.3 The Empirical Evidence Summed Up

The results from the two field studies of parking conditions for workers who commute by car into the inner city of Stockholm prove to be quite consistent with, and therefore reinforce, one another. If the employer provided unlimited free parking, then 15%–20% of the workforce chose to commute to work in their own cars. If the commuters themselves were responsible for paying a parking cost of SEK 2000–4000 per month, then only 2%–3% traveled to work in their own cars. Instead of driving, these employees used public transportation as the means of commuting to work. The field studies also offered two in-between observations of the parking cost and car commuter proportion: SEK 500, 8%, and SEK 400, 7%.

What new knowledge does the Jansson and Wall Stockholm study reveal, and what is its significance? Prior to the Jansson and Wall study, it was believed that avoidance of the car commuters’ tax was a minor issue. However, Jansson and Wall estimated that some 17 000 free-parking and 4000 partly subsidized commuters bring their cars into the inner city of Stockholm every weekday, typically during the morning rush hours. Traffic data show that some 70 000 cars are driven into the inner city of Stockholm between 7:00 and 10:00 a.m. on a typical weekday (see, e.g., Xylander 2001). This means that free-parking commuters may constitute as much as (17 000/70 000 ≈) 25% of rush-hour traffic, and those commuters who enjoy partly subsidized parking may make up another (4000/70 000 ≈) 5%. Traffic flow studies and calculations reveal that a reduction in traffic volume of only a few percentage points considerably reduces queuing and congestion problems. For an introduction to this topic, see, e.g., Small (1992).

In addition, prior to the Jansson and Wall study, little was known about free-parking car commuters’ sensitivity to changes in parking costs. The parking cost–car commuter proportions observed in the Jansson and Wall study are few.
In spite of this, they are plotted in Fig. 1. A nonlinear graph is applied to the observations by visual inspection. This graph illustrates a demand curve for car commuting where (1) car commuters must pay the prevailing market price for parking, (2) car commuters are offered free parking, and (3) the two cases wherein car commuters pay a subsidized price for parking. Figure 1 indicates that the reservation price for parking is low among most commuters. Indeed, a price of SEK 1000 per month may reduce the car commuting proportion to some 5%. Proper taxation of free parking in the inner city of Stockholm would result in an average net cost to the individual car commuter of some SEK 1000 per month. A decrease in the free-parking commuting proportion from 18% to 5% corresponds to a reduction in rush-hour traffic volume by some ((0.18–0.05) × 70 000 ≈) 9000 cars. The effect on the market of taxation on car commuters’ partially subsidized parking should be added to this total.

Fig. 1. A parking cost–car commuter proportion relationship in Stockholm

The calculations and results of the Jansson and Wall study are crude, but the message is clear and simple. If current tax stipulations are more strictly enforced, much of the car traffic congestion problem in Stockholm will disappear.
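The back-of-the-envelope figures in this section can be reproduced with a few lines of Python. The sketch below only repeats the arithmetic stated in the text; the constant and function names are hypothetical. Unrounded, the two traffic shares come to about 24.3% and 5.7%, and the reduction to 9100 cars, which the text reports as roughly 25%, 5%, and 9000.

```python
# Figures quoted in the text (Jansson and Wall estimates; Xylander 2001 for traffic).
RUSH_HOUR_CARS = 70_000          # cars entering the inner city, 7:00-10:00 a.m.
FREE_PARKING_COMMUTERS = 17_000  # commuters with free parking
SUBSIDIZED_COMMUTERS = 4_000     # commuters with partly subsidized parking


def traffic_share(cars, total=RUSH_HOUR_CARS):
    """Share of rush-hour traffic accounted for by a group of cars."""
    return cars / total


def cars_removed(proportion_before, proportion_after, total=RUSH_HOUR_CARS):
    """Reduction in cars, following the text's (0.18 - 0.05) x 70 000 calculation."""
    return (proportion_before - proportion_after) * total


if __name__ == "__main__":
    print(f"free parking share: {traffic_share(FREE_PARKING_COMMUTERS):.1%}")
    print(f"subsidized share:   {traffic_share(SUBSIDIZED_COMMUTERS):.1%}")
    print(f"cars removed:       {cars_removed(0.18, 0.05):.0f}")
```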

4 Political Response to the Jansson and Wall Stockholm Study

During 2003, the results of the Jansson and Wall Stockholm study became increasingly well known in Sweden. Politicians realized the implications for rush-hour traffic of more strictly enforcing the taxation of car commuters’ free parking. However, the legal response by the Swedish government, approved by its parliament, was only to amend the tax law (Svensk FörfattningsSamling 2001). The amendment (Svensk FörfattningsSamling 2004) stipulates an extension of employers’ responsibility to report fringe benefits to tax authorities. A question asking whether or not the employer provides its employees with free or partly subsidized parking was added to a particular tax form that employers in Sweden must complete for each individual employee once a year. The amendment does not modify prevailing Swedish tax legislation. Its purposes are merely to make it more evident to employers that free parking benefits are subject to taxation, and to require them to make an explicit statement about their own parking policies.

The law became effective on June 1, 2004, but the first reporting year is 2005. Therefore, little is known about the effect of this new tax stipulation on car traffic volume. However, traffic data have not yet revealed any substantial reduction of rush-hour traffic into the inner city of Stockholm. It may be that many employers, for their own reasons, are keen to offer their employees free parking and, confident that the true parking situation at their workplace will never be revealed, may report on the tax forms that free parking is not offered, thus ensuring that commuting and parking routines continue as before passage of the tax law amendment.

5 Concluding Remarks

The legislative response that followed the Jansson and Wall Stockholm study seems to lack parity with the significance of the study’s results and the potential of the tax as an instrument of economic policy. One reason for the cautiousness is Swedish politicians’ concern that privacy issues may arise if the taxation of car commuters’ free parking is more strictly enforced. Individual motorists may feel uncomfortable at having their commuting and parking routines monitored and recorded. Many people also dislike the implementation of any more taxes whatsoever. Indeed, the parking taxation issue has been reported on, and has been subject to extensive debate from a variety of angles, in every nation-wide newspaper in Sweden during the last few years (see, e.g., Wendt 2003; Lindberg 2005; Kadhammar 2005). Coverage has been extensive in many local newspapers, and also in other forms of mass media such as television and radio.

An important, perhaps even crucial, aspect of the question of which system (road pricing, congestion charges, or taxation of commuters’ free parking) should be used to complement transport infrastructure investment in order to mitigate car traffic congestion in large inner cities is the fee-collection cost of each system. The London system is based on an unsophisticated camera monitoring technique. This made the investment cost in the system low, corresponding to only some USD 15 million. The London system is labor-intensive, however, and operational costs amount to the equivalent of some USD 75 million per year. The total budget for the technologically more sophisticated Stockholm trial corresponds to several hundred million USD.
A proper and effective system for taxation of commuters’ free or employer-subsidized parking has yet to be designed, but it seems likely that a reduction in car trafﬁc in inner cities can be achieved by such a system for a cost considerably lower than that of today’s congestion charge or tomorrow’s road-pricing systems.


The idea of road pricing dates back to at least the 1960s, but not until the last decade have congestion charge systems been planned and put into operation on a large scale. Perhaps more in-depth studies and new advancements in technology will allow for more efficient and effective car-commuter free-parking taxation in the years to come.

References

Jansson JO, Swahn H (1987) Parkeringspolitikens roll i innerstaden – En samhällsekonomisk analys. (The importance of parking policy in inner-city areas – a cost benefit analysis.) Byggforskningsrådet, Rapport R49:1987, Stockholm, p 40
Jansson JO, Wall R (2002) Vad betyder fri parkering för vägtrafiksituationen i Stockholms-området? (Road traffic congestion effects of free parking in Stockholm.) Working Paper, Linköpings Universitet, unpublished
Kadhammar P (2005) Struntfrågan som blev lag. (A minor issue becomes law.) Aftonbladet, February 15, Stockholm, p 20–22
Lindberg S (2005) Skärpt kontroll av parkering på jobbet. (Taxation of car commuters’ free parking more strictly enforced.) Dagens Nyheter, Economy Section, January 22, Stockholm, p 4
Small K (1992) Urban transportation economics. Harwood Academic, Luxembourg, Sect. 3.3, p 61–74
Svensk FörfattningsSamling (Swedish Code of Statutes) SFS (2001:1244) Förordning om självdeklarationer och kontrolluppgifter. (Income tax and income tax control stipulations.) Finansdepartementet (Ministry of Finance), Stockholm
Svensk FörfattningsSamling (Swedish Code of Statutes) SFS (2004:284) Förordning om ändring i förordning (2001:1244) om självdeklarationer och kontrolluppgifter. (Stipulation of change in income tax and income tax control stipulations (2001:1244).) Finansdepartementet (Ministry of Finance), Stockholm
Wendt C (2003) P-skatt lösning på trafikkaos. (Taxation of car commuters’ free parking solves traffic jam problems.) Svenska Dagbladet, March 7, Stockholm, p 7
Xylander A (2001) Trafikmätningar i Stockholm 2000 (Car traffic flows in Stockholm 2000), Rapport 2001:1, Stockholms Gatu-och fastighetskontor (transport for Stockholm), Stockholm, p 8

13. Forest Protection System and Optimal Land-Use Management Policy in Japan

Masahiro Yabuta and Yasuhito Yamanishi

Summary. The purpose of this chapter is to give a comprehensive analysis of the present conditions and issues around forest conservation in Japan. We focus on an evaluation of Japanese forest policy, and in particular the forest protection system, which is based mainly on the public benefit of forests. We present an analytical framework of land use for protected forest, and then give an empirical examination of the enforcement and effectiveness of the forest protection system at the prefectural level. It is shown that there are large deviations in the proportion of forest protection among prefectures in Japan. Using a simple optimal choice model of land-use decisions to explain these deviations, we identify empirically some important factors behind the prefecture-level deviations in the proportion of protected forest. We then evaluate the forest protection system of Japan and suggest its policy implications.

Key words. Common pool resources, Forest protection system, Optimal land use, Regional land management, Public benefit of forest

Faculty of Economics, Chuo University, 742-1 Higashinakano, Hachioji, Tokyo, Japan

1 Introduction

The main purpose of this chapter is to give a comprehensive analysis of the present conditions and issues around forest conservation in Japan. In particular, we focus on an evaluation of Japanese forest policy, which is now based mainly on the public benefits of forests. Among the many kinds of forest policy in Japan, the forest protection system (Hoanrin Seido in Japanese) was established in order to maintain the public benefits of forests. By examining the forest protection system, we will present an analytical framework for the economic factors, as well as the empirical factors, concerning the environmental values of forests in Japan.

Forests as common pool resources (CPRs) have two natures: the nature of stock goods and the nature of flow goods. Therefore, we must carry out a synthetic evaluation of the economic benefits of the forestry sector and the public benefits of forests. The public benefits of forests have the typical nature of CPRs because of their nonexcludability and rivalry for all residents in the area. Everybody can enjoy all the amenities that a forest gives, but if somebody enjoys even a part of them, it becomes more difficult for anybody else to enjoy them. In the case of CPRs, there exists the possibility of inadequate management, such as over-exploitation or over-use. Hence, residents in the area must have rules for the management of forest resources.

The essence of the management problem in forest CPRs is how to balance the public benefits with the economic benefits, rather than how to distribute property rights. In fact, the philosophy of the Forest Law in Japan is the “consistency of revitalization of forestry sector and conservation of forest,” and the main purpose can be summarized as the “promotion of forest management and conservation to fulfill multi-functionality.” This philosophy is based on the above-mentioned thoughts. With the self-sufficiency ratio for forest products under 20%, the forestry sector itself is in decline and the sustainability of the forest stock is in danger. Moreover, the maintenance of the public benefits of the forest is also in danger. The present forest policy in Japan needs major changes in order to establish consistency between the revitalization of the forestry sector and the conservation of the forest.
From these perspectives, we will give an analytical framework of the economic as well as the environmental evaluations of the forest in Japan, mainly from the viewpoint of the forest protection system. In Sect. 2, we will show an outline of the present conditions of the forest and forest management policy in Japan. In Sect. 3, to check the purpose and the present condition of the forest protection system, we will clearly deﬁne the signiﬁcance of the forest protection system in Japan. Then we will show that there are large deviations in the proportion of forest protection among 47 prefectures in Japan (from around 10% to around 80%). In Sect. 4, we use a simple optimal choice model of land-use decisions to explain these deviations. In Sect. 5, on the basis of the model developed in Sect. 4, we will analyze the factors of the prefecture-based deviations in the proportions of forest protection from an empirical viewpoint. Then we shall evaluate the forest protection system of Japan, and suggest its policy implications.


2 Forest Conditions in Japan and the Forest Management Policy

2.1 Current Conditions of Japanese Forests

The problems of forests in Japan, such as the revitalization of the forestry sector, the conservation of forests, and the maintenance of the public benefits of forests, seem to stem from a declining forestry sector and increasing losses from inadequate forest management, rather than from the evolution of the forest resources themselves. From Fig. 1, it can be seen that, under the forest management policy followed since World War II, the forest area has been kept at around 25 million ha and the volume of the stock of forest resources has been increasing due to a trend of long-term cutting, whereas the utilization of domestic timber has been stagnant, and Japan already depends on foreign timber at a level of over 80%.

Fig. 1. The trend of the self-sufficiency ratio of timber in Japan (from Yabuta (2004)). [Chart, 1955–2000: volumes of foreign timber and domestic timber (1,000 m3, scale 0–140,000) and the self-sufficiency ratio (%, scale 0–100)]

To grasp the gist of the problems of forests in Japan, it is important to consider the four facts below.

1. The proportion of natural forest is relatively low, and man-made forest, most of which is coniferous forest (Japanese cedar, Japanese cypress, etc.), has been increasing. On the other hand, the proportion of hardwood forest has been decreasing.
2. As far as the price competitiveness of Japanese cedar or Japanese cypress against foreign timber, such as lauan and western hemlock, is concerned, it cannot necessarily be assumed that relatively cheaper foreign timber leads to an increase in imports and to a decline in the domestic forestry sector in Japan.
3. The tendency toward a decreasing number of workers in the timber and lumber sectors is serious. In these sectors, most of the workers are older, and average wages are lower than in other industrial sectors.
4. The decrease in timber sales leads to a decrease in the production of domestic timber and an increase in total cost. The net income of the forestry sector has been decreasing.

In addition, the ratio of national forest to total forest is only 31.2% for land area and 25.0% for growing stock. We must also pay attention to the high ratio of privately owned forest in 2002 (i.e., 57.7% of land area and 64.3% of growing stock). Hence, in government policies for promoting forests, we must pay attention to securing the public benefits of private forests.

From Fig. 2, it can be seen that the area and the growing stock of man-made forest have been increasing rapidly. In particular, the rapid increase in the growing stock of man-made forest suggests that the decline of the forestry sector and the decrease in the adequate management of forests have accelerated the predominance of man-made forest and have resulted in a degradation of the public benefits of forests.

As far as fact 2 above is concerned, it should be noted that there is little difference between the price of imported western hemlock and the price of Japanese cedar, both of which are used mainly for construction purposes. Therefore, it is clearly untrue that the decline in the forest industry sector in Japan is mainly due to a lack of price competitiveness. It is not price competition but quality competition that has determined the declining tendency of the Japanese timber industry.
To keep Japanese timber at a competitive level, the timber sector in Japan has been forced to introduce new drying equipment, leading to high production costs with poor management. If the market for timber products is competitive, it will strongly inﬂuence the market for inputs, including round woods, and forest workers. Although an increase in production costs in the timber sector will shift the supply curve upward, the price of timber production will remain at the international price under the free-trade system. Then the production of domestic forestry will decrease, and the stumpage value will fall. Therefore, a rise in timber production costs and a simultaneous fall in stumpage value will decrease the forest owner’s expectation of future proﬁt, and will expand the length of the optimal cutting rotation. These processes will necessarily reduce the motivation for replanting and maintenance, and will result not only in a decline of the forestry sector, but also a deterioration of the natural environment of the forests.

Fig. 2. Forestry trends in Japan (from Ministry of Agriculture, Forestry, and Fisheries of Japan (1995–2002)). a Trends in forest area (10,000 ha). b Trends in growing stock (million m3)

a Forest area (10,000 ha) | 1966 | 1976 | 1986 | 1995 | 2001
Natural forest | 1,724 | 1,598 | 1,504 | 1,475 | 1,477
Man-made forest | 793 | 938 | 1,022 | 1,040 | 1,035

b Growing stock (million m3) | 1966 | 1976 | 1986 | 1995 | 2002
Natural forest | 1,329 | 1,388 | 1,502 | 1,591 | 1,702
Man-made forest | 558 | 798 | 1,361 | 1,892 | 2,338

As for fact 3, in 1950, the number of workers in the forestry sector was around 456 000, but in 2000 the number was 67 000, showing a drastic decrease. This rapid decrease caused a high ratio of older workers (the average age is around 50 years). The average wage per day in the forestry sector is about 12 000–13 000 yen, which is lower than that of almost all other sectors. Clearly, the relatively low wage level corresponds to the low added value in the forestry sector.

Finally, for fact 4, a decrease in the net income in the forestry sector with decreasing sales and increasing costs must result in serious problems. One of the problems in the forestry sector is the cost of replanting and maintenance, rather than the direct cost of production. The value of stumpage is about 7000 yen/m3, and this price is calculated backward from the cost of cutting, carrying, and so on. Therefore, it is not possible to compensate for the cost of replanting and maintenance, such as thinning, pruning, and weeding. This fact means a negative net profit in the forestry sector as a whole. As long as this condition continues, no one will take care of the replanting and maintenance of forests. It will become impossible to preserve the public benefits of forests in the future.

2.2 Plans and Policies for the Preservation and Use of Forests in Japan

At the beginning of the period of strong economic growth after World War II, Japan decided to carry out a policy of planting and afforestation because of the deterioration of forest resources during the war, and the expectation of an increasing demand for forest products. At that time, many conifers, such as Japanese cedar and Japanese cypress, were planted to provide building or other materials. However, the tending of these trees and the modernization of the forestry sector were inadequate, and Japan imported a great deal of foreign timber, as already described above. Whereas Japan did not utilize its domestic forest resources very well, it has been able to maintain a sufficient accumulation of these forest resources. Unfortunately, this does not lead to environmental conservation. The owners of unprofitable forests, either public or private, have no incentives for positive management of their forest. Even if they manage to cut trees under the severe rules of forest management, they cannot afford the cost of replanting and care. It is clear that the quality of a forest cannot be maintained without both cutting and care.

In view of the present condition of forests in Japan, as well as the prospects for long-term demand for materials, we urgently need industrial policies which support a reduction in the cost of cutting and handling, and an improvement in labor efficiency. We need a policy for the comprehensive management of small-scale forests that gives more incentive for forestry management and realizes an efficient and stable forest industry. Currently, there is a movement to review the public benefits of forests, and the connection between the revitalization of the forestry sector and the conservation of forests. This means that we must protect the forest environment while revitalizing the declining forestry sector to retain the public benefits of forests.
The previous forest law in Japan stated that the forest has ﬁve functions: the prevention of disaster around mountain regions, protection for the living environment, the conservation of headwaters, the promotion of public health and culture, and timber production. However, the latest edition of the Forest Law in Japan, established in 2001, gives three fundamental functions of the forest. These are the maintenance of water and soil, the preservation of the natural environment, and sustainable timber production. The coordination system of the forest policy in Japan still consists of the revitalization of the forest industry and the preservation of the natural environment.


2.3 Forest Planning Systems and a Plan for the Development of Protective Forest

The first article of the Forest Law, which is the basis of forest policy in Japan, states, “This law provides basic items concerning forest, such as forest planning and protection.” From this article, it can be said that the protective forest system is fundamental to Japanese forest policy. The protective forest system began with the legislation of the Forest Law in 1898. Deforestation during World War II caused serious flood damage in the 1950s, such as the floods in the South Kii peninsula in 1953. To tackle such serious floods and soil erosion, the Japanese government enacted the Temporary Measures Law for the Development of Protective Forest in 1954. Based on this law, plans for the development of protective forest were established. The abstracts and trends of these plans are summarized in Table 1. The committee to re-examine the development of protective forest reported in 2003 that the “area of protective forest has increased four times since the Temporary Measures Law was enacted, and no more expansion of protective forest is needed within the specified plans. Therefore, the development of protective forest has been attained within the system of Forest Law.”1 Since the Temporary Measures Law lapsed at the end of the fiscal year 2003, the Japanese government has managed the development of protected forest within a nation-wide forest plan as well as a regional forest plan, both of which are based on the Forest Law. Through the transition of institutional and economic conditions, the forest protection system now has an important role in forest policy in Japan.2

Table 1. The development of the forest protection policy in Japan (from Ministry of Agriculture, Forestry and Fisheries of Japan (2004))

Period | Contents
The first period (fiscal years 1954–1963) | Concentrated development of protective forests for the prevention of disasters like flood and soil erosion in the mountain regions
The second period (fiscal years 1964–1973) | Concentrated development of protective forests for the conservation of headwaters
The third period (fiscal years 1974–1983) | Concentrated development of protective forests for the promotion of public health
The fourth period (fiscal years 1984–1993) | Minute procedures and qualitative development of protected forests (establishment of a specified forest protection system)
The fifth period (fiscal years 1994–2003) | Development of protective forests for the prevention of disasters, to conserve good drinking water, and to conserve a green environment

1 It was also reported that the significance and role of the forest protection system for forest conservation would never disappear. Historical perspectives of the forest protection system are given by Mitsui (2004).

3 The Forest Protection System of Japan

3.1 Structure of Forest Protection

Protected forest is forest that the Minister of Agriculture, Forestry and Fisheries, or the governor of each prefecture, cooperatively designates for the attainment of specified public benefits. The object of the designation is the forest as defined in the Forest Law, and the occurrence of public benefit is a necessary condition for the designation. The purposes of the designation are listed in the 25th Article of the Forest Law, and there are 17 kinds of protected forest, as listed in Table 2. The cancellation of a designation is generally prohibited because the purposes of protected forests are the prevention of disasters, and the conservation of the living environment and the natural environment. As shown in Table 2, headwater conservation forest and soil run-off prevention forest together account for around 90% of the total area of protected forest. Since 1954, when the forest protection system started, the share of protected forest in the total forested area has gradually increased from about 8% in 1954 to 40% in 2002.3

Table 2. Protective forest areas of each type (1000 ha) in 2002 (from Ministry of Agriculture, Forestry and Fisheries of Japan (1995–2002))

No. | Type of protective forest | National forest | Private forest | Total | Ratio of total protective forest (%)
1 | Headwater conservation forest | 4228 | 3216 | 7444 | 68.4
2 | Soil run-off prevention forest | 935 | 1404 | 2339 | 21.5
3 | Landslide prevention forest | 19 | 37 | 56 | 0.5
4 | Shifting sand prevention forest | 4 | 12 | 16 | 0.1
5 | Windbreak forest | 23 | 33 | 56 | 0.5
 | Flood-damage prevention forest | 0 | 1 | 1 | 0
 | Tidal waves and salty winds prevention forest | 5 | 8 | 14 | 0.1
 | Drought prevention forest | 45 | 67 | 112 | 1
 | Snow-drift prevention forest | 0 | 0 | 0 | 0
 | Fog-inflow prevention forest | 9 | 53 | 61 | 0.6
6 | Snow-avalanche prevention forest | 5 | 15 | 20 | 0.2
 | Rock-fall prevention forest | 0 | 2 | 2 | 0
7 | Fire protection forest | 0 | 0 | 0 | 0
8 | Fish-breeding forest | 8 | 46 | 53 | 0.5
9 | Navigation landmark forest | 1 | 0 | 1 | 0
10 | Public health forest | 346 | 338 | 684 | 6.3
11 | Scenic-site conservation forest | 13 | 15 | 27 | 0.3
 | Total | 5640 | 5247 | 10887 | 100
 | Ratio for total protective forest (%) | 51.8 | 48.2 | 100 |
 | Ratio for total forest area (%) | 21.1 | 19.4 | 40.6 |
 | Ratio for each national and private forest (%) | 67.8 | 28.2 | |
 | Ratio for total land area (%) | 14.1 | 12.9 | |

2 According to the nationwide forest plan for 2005–2018, which was decided at a cabinet meeting in 2003, the land is separated into 44 river basin regions, and the forest needed for water and soil maintenance covers 16.46 million ha, the forest needed for natural environmental preservation covers 3.28 million ha, and the forest needed for sustainable timber production covers 3.28 million ha. As far as the Forest Protection Area is concerned, around 1.245 million ha are planned, which are mainly for headwater conservation, soil run-off prevention, and public health as well as scenic site conservation.

3 Famous examples of protected forest areas include Matsushima in Miyagi Prefecture for scenic site conservation forest, and Mount Fuji for headwater conservation forest.

3.2 Restrictions and Incentives for Forest Protection

As mentioned in the previous section, the purpose of the designation of protected forests is the attainment of specified public benefits. Therefore, for the effective realization of this purpose, some restrictions must be imposed on the owners of protected forest. Table 3 gives the essential framework of the restrictions on protected forest. If owners of protected forests violate these restrictions, the governor of the prefecture will order them to stop their actions, or else recovery, afforestation, and restoration orders will be imposed.

The restriction on privately owned protected forest means a restriction on property rights for the sake of creating a public benefit. Therefore, the owners of protected forests should be given some compensation for their loss in maintaining public benefits. There are four categories of compensation in this system: (1) an exception in the tax system, (2) compensation for losses, (3) low-interest financing for cutting, and (4) a subsidy for afforestation. Examples of exceptions in the tax system are real-estate tax, real property acquisition tax, and special land-holding tax, from which protected forests are exempt. In addition, inheritance tax and donation tax are partially reduced.

Table 3. Restrictions on the protected forest

Main restrictions | Contents | Explanation
Forest management guidelines | Regulation for the method of cutting standing trees | Regulations on the manner of regenerative cutting (minimum management, selective cutting, temporary clear-cutting in limited areas), the standard cutting age, and the area where thinning is permitted
Forest management guidelines | Limit for the cutting of standing trees | Limits for the area of clear-cutting (less than 20 ha for headwater conservation forest, windbreak forest, fog-inflow prevention forest, and fish-breeding forest; less than 10 ha for other protected forests). Limits for the ratio of selective cutting and thinning
Forest management guidelines | Obligation for planting | Obligation for planting in places where regeneration is difficult without planting after regenerative cutting
Cutting of standing trees | Permission system (notice system for thinning) | In the case of cutting standing trees, the permission of the governor is required
Conversion of land characteristics | Permission system | Acts which convert the land characteristics are permitted only if the function of the protected forest is not damaged by the activity

4 Theoretical Analysis of the Forest Protection System

4.1 Nash Equilibrium of Land Use

Here we introduce a standard theoretical model of land use. It is assumed that $L$ (the land area under analysis) is classified into three categories: $x$ (forest protection area), $y$ (forest production area), and $z$ (other areas). Therefore we have

$$L = x + y + z \tag{1}$$

Fig. 3. Proportions of land use at the prefectural level (from Ministry of Agriculture, Forestry and Fisheries of Japan (2004)). [Figure: stacked bars by prefecture showing the shares of protection forest, production forest, and other use on a 0%–100% scale.]

where $L$ is assumed to be constant (see footnote 4). For reference, Fig. 3 shows the land-use composition based on Eq. 1 for each prefecture. The fundamental viewpoint in this model is the difference in marginal benefit stemming from each land use rather than differences in land quality.

4 For land-use decisions, Lichtenberg (1989), Hardie and Parks (1997), and Amacher et al. (2004) have developed the so-called area-based model. In the area-based model, several land units with different qualities are assumed, and the land-use composition is determined by maximizing the benefit from those units. See also Palmquist (1989).


First, we shall examine the productive forest. The growing stock of a productive forest per unit area, $Q$, is determined by growing age, $t$, and conditions for growth indicated by the vector $E$. The expected profit per unit area, $\pi$, is expressed as

$$\pi = \sum_{i=1}^{\infty} \left[ pQ(T;E) - (1-\theta)c(m) \right] e^{-i\delta T} = \frac{1}{e^{\delta T} - 1}\left[ pQ(T;E) - (1-\theta)c(m) \right], \quad \frac{\partial Q}{\partial E} > 0,\ \frac{\partial^2 Q}{\partial E^2} < 0,\ c'(m) > 0,\ c''(m) > 0 \tag{2}$$

where $T$ is the harvesting rotation age, $c$ is the unit cost of replanting, $m$ is the density of the forest, $\delta$ is the discount rate, and $\theta$ ($< 1$) is the subsidy rate for a productive forest. The owners of productive forests decide the harvesting rotation age in order to maximize their expected profit. Differentiating $\pi$ with respect to $T$ and setting $\partial\pi/\partial T = 0$ yields

$$pQ'(T;E) = \delta\left[ pQ(T;E) - (1-\theta)c(m) \right] + \delta\pi \tag{3}$$

The left-hand side of Eq. 3 expresses the marginal benefit at age $T$. The first term on the right-hand side of Eq. 3 expresses the profit loss due to no harvest at age $T$, and the second term expresses the total loss of future profit due to no harvest at age $T$. A comparative static analysis yields

$$\frac{dT}{dp} < 0, \quad \frac{dT}{d\theta} < 0, \quad \frac{dT}{dm} > 0 \tag{4}$$

From the inequalities in Eq. 4, we can recognize that an improvement in the economic conditions for managing productive forest (that is, a rise in the timber price and/or the subsidy rate) shortens the harvesting rotation age, whereas a rise in costs lengthens it. In a given area, $y$, the owners of a productive forest will gain an expected profit $\Pi_y$ (see footnote 5):

$$\Pi_y = \delta \times y \times \pi = \delta \times y \times \frac{1}{e^{\delta T} - 1}\left[ pQ(T;E) - (1-\theta)c(m) \right] \tag{5}$$
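The rotation-age condition in Eq. 3 and the signs in Eq. 4 can be checked numerically. The following is a minimal sketch, not the authors' procedure: the growing-stock function `stock` and all parameter values are hypothetical choices made only for illustration, and `c` stands for the value of $c(m)$, which rises with $m$ because $c'(m) > 0$. The sketch maximizes Eq. 2 over $T$ by grid search and reports how the maximizing rotation age moves with the timber price, the subsidy rate, and the replanting cost.

```python
import math

def stock(T, K=500.0, g=0.05):
    """Assumed growing stock Q(T): an S-shaped curve that levels off at K."""
    return K * (1.0 - math.exp(-g * T)) ** 3

def profit(T, p, theta, c, delta):
    """Expected profit per unit area, Eq. 2, with c standing for c(m)."""
    return (p * stock(T) - (1.0 - theta) * c) / (math.exp(delta * T) - 1.0)

def optimal_rotation(p=1.0, theta=0.2, c=50.0, delta=0.03):
    """Grid search for the rotation age T that maximizes Eq. 2."""
    grid = [1.0 + 0.01 * k for k in range(14901)]  # T from 1 to 150 years
    return max(grid, key=lambda T: profit(T, p, theta, c, delta))

base = optimal_rotation()
print("baseline T*:", round(base, 2))
print("higher price:", round(optimal_rotation(p=2.0), 2))
print("higher subsidy rate:", round(optimal_rotation(theta=0.5), 2))
print("higher replanting cost:", round(optimal_rotation(c=80.0), 2))
```

Under these assumed values, a higher price or subsidy rate should give a shorter rotation age and a higher replanting cost a longer one, which is the pattern stated in Eq. 4. A grid search is used instead of a root finder so that the sketch needs only the standard library.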

For other land, $z$, which is for industry and living, the comprehensive economic benefit $\Pi_z$ can be given by

$$\Pi_z = z \times R(z, x; f), \quad R_z > 0,\ R_{zz} < 0,\ R_x > 0,\ R_f > 0 \tag{6}$$

5 In this model, we compare the net benefit of each land use with the standardized net benefit of productive forest. Therefore, for simplicity, we assume that the marginal benefit of productive forest is constant. We also assume the inequality $\sigma_x(0; g) > \partial\Pi_y/\partial y > 0$.


where $R$ is the net benefit per unit area stemming from land area $z$. We shall assume that there are diminishing returns to scale, and that the conservation of forest (both protected forest and productive forest) has a positive influence on industry and living in the land area $z$. In Eq. 6, $R_x$ denotes the positive marginal benefit that the protected forest generates directly for the land area $z$. $R$, the net benefit per unit area, is also assumed to depend on various economic and social activities, represented by $f$, such as income, production, and population. Lastly, we shall investigate the protected forest. When a forest is designated as protected forest, as mentioned above, the owners must bear the costs of several restrictions and also gain several incentives. These costs and benefits are integrated and denoted as $\Pi_x$.

$$\Pi_x = \sigma(x; g) \times x, \quad \sigma_x > 0,\ \sigma_{xx} < 0 \tag{7}$$

where $\sigma$ is the net benefit per unit of protected forest. Although the incentive levels for a protected forest are different for each area, it is assumed for analytical convenience that they are given at a constant level by the forest protection policy. The shift parameter concerned is denoted by $g$ in Eq. 7. It seems that a typical owner of a forest decides whether to manage the forest as a productive forest or to receive the designation of protected forest. From Eqs. 5–7, the marginal benefit for the owner in each land area is expressed as

$$\begin{cases} \dfrac{\partial \Pi_y}{\partial y} = \delta \times \dfrac{1}{e^{\delta T} - 1}\left[ pQ(T;E) - (1-\theta)c(m) \right] = \text{constant} \\[2ex] \dfrac{\partial \Pi_z}{\partial z} = R(z, x; f) + zR_z(z, x; f) \\[2ex] \dfrac{\partial \Pi_x}{\partial x} = \sigma(x; g) + \sigma_x(x; g)\,x \end{cases} \tag{8}$$

Solving Eq. 9, subject to the land-area constraint in Eq. 1, leads to the Nash equilibrium level of land use.

$$\frac{\partial \Pi_y}{\partial y} = \frac{\partial \Pi_z}{\partial z} = \frac{\partial \Pi_x}{\partial x} \tag{9}$$

Figure 4 shows a visual image of the Nash equilibrium, assuming the existence and uniqueness of the Nash equilibrium $(x^*, y^*, z^*)$.

Fig. 4. Nash equilibrium of land use. [Figure: marginal benefit on both vertical axes; curves $\partial\Pi_z/\partial z$, $\partial\Pi_x/\partial x$, and $\partial\Pi_y/\partial y$; labels $R(0, x^{**}; f)$, $\sigma(0; g)$, and $\delta\pi$; horizontal axis from $O$ to $L$ marked with $z^*$, $y^*$, and $x^*$.]

4.2 Pareto Equilibrium of Land Use

In the model considered in the previous section, we have taken only the private benefits of the landowner into account. In this section, we shall consider the public benefit, which is an economic externality, stemming from each land use. Let $W$ be the total benefit of the area $L$, and assume that the total benefit of the area is expressed in a simple additive form as

$$W = (1 + \phi)\Pi_x + (1 + \psi)\Pi_y + (1 + \gamma)\Pi_z \tag{10}$$

where $\phi$, $\psi$, and $\gamma$ are the levels of economic externality stemming from the protected forest, the productive forest, and other land use, respectively. A social planner is assumed to draw up a land-use plan that maximizes the total benefit to the area, $W$ in Eq. 10. We also assume the inequality in Eq. 11 in consideration of the intent of the forest protection system.

$$\phi > \psi > 0 > \gamma \tag{11}$$

Because the total area is constant, it is reasonable to take the area of protected forest, $x$, and the area of land for other uses, $z$, as the control variables for the social planner. The optimal conditions for the social planner, in consideration of Eqs. 10 and 1, must satisfy the conditions in Eq. 12.

$$\begin{cases} \dfrac{\partial W}{\partial x} = (1 + \phi)\dfrac{\partial \Pi_x}{\partial x} - (1 + \psi)\delta\pi + (1 + \gamma)\dfrac{\partial \Pi_z}{\partial x} = 0 \\[2ex] \dfrac{\partial W}{\partial z} = -(1 + \psi)\delta\pi + (1 + \gamma)\dfrac{\partial \Pi_z}{\partial z} = 0 \end{cases} \tag{12}$$


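With the conditions in Eq. 11, and using $\partial\Pi_y/\partial y = \delta\pi$, a rearrangement of Eq. 12 yields

$$\begin{cases} \dfrac{\partial \Pi_z}{\partial z} = \dfrac{1 + \psi}{1 + \gamma}\dfrac{\partial \Pi_y}{\partial y} > \dfrac{\partial \Pi_y}{\partial y} = \text{constant} \\[2ex] \dfrac{\partial \Pi_x}{\partial x} = \dfrac{1 + \psi}{1 + \phi}\dfrac{\partial \Pi_y}{\partial y} - \dfrac{1 + \gamma}{1 + \phi}\dfrac{\partial \Pi_z}{\partial x} < \dfrac{\partial \Pi_y}{\partial y} = \text{constant} \end{cases} \tag{13}$$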

Equation 13 gives the conditions for Pareto efficiency in this model. The Pareto equilibrium point is given by $(x^{**}, y^{**}, z^{**})$ in Fig. 5, in contrast to the Nash equilibrium in Fig. 4. What factors generate the difference between the equilibrium points depicted in Figs. 4 and 5? The main factor is whether or not the positive externality of both protected forest and productive forest is taken into account. In addition, the setting of a protected forest has a positive influence on the other activities in the area. For instance, as the first part of Eq. 13 and Fig. 5 indicate, any positive externality contained in a productive forest can result in an increase in the attractiveness of the downstream area (i.e., the area for other activities), and any negative externality will result in a decrease in the use of the area for other activities. We can also recognize from the second part of Eq. 13 that taking into account the positive externality of a protected forest will cause the protected forest area to expand. A social planner's consideration of externalities can thus result in a decrease in the area for other activities and an increase in the protected forest area, and therefore it cannot be determined in advance whether the productive forest area increases or not.

Fig. 5. Pareto equilibrium of land use. [Figure: marginal benefit on both vertical axes; curves $\partial\Pi_z/\partial z$, $\partial\Pi_x/\partial x$, and $\partial\Pi_y/\partial y$; labels $R(0, x^{**}; f)$, $\sigma(0; g)$, and $\delta\pi$; horizontal axis from $O$ to $L$ marked with $z^*$, $z^{**}$, $y^*$, $y^{**}$, $x^*$, and $x^{**}$.]


5 Evaluation of the Forest Protection System Policy in Japan

5.1 The Data

In this section we give an empirical analysis of the land-use model developed in the previous section. The data for this empirical analysis of land-use decisions are given below.

5.1.1 Dependent Variables

For the ordinary least squares (OLS) regressions, we adopt two dependent variables. One is the ratio of the protected forest area to the total area, and the other is the ratio of the protected forest area to the total forest area in each prefecture. As for the multinomial logit model, we first investigate the ratio of the protected forest area to the total area ($x/L$), and the ratio of the area for other activities to the total area ($z/L$) in each prefecture. Comparing these with the national mean value, we categorize the 47 prefectures into three groups as follows: (i) $j = 0$, prefectures where the former ($x/L$) is lower and the latter ($z/L$) is higher than the national mean value; (ii) $j = 1$, prefectures where the former is higher or the latter is lower than the national mean value; (iii) $j = 2$, prefectures where the former is higher and the latter is lower than the national mean value. Intuitively, prefectures in category (i), $j = 0$, choose a land-use strategy that expands industry and agriculture, while prefectures in category (iii), $j = 2$, seem to have a land-use strategy that suppresses the conversion of forest area to other uses.

5.1.2 Explanatory Variables

We adopt explanatory variables that seem to be characteristic of each land use: protected forest, productive forest, and others. In practical terms, these are variables that have a strong influence on the expected profit ratio stemming from productive forest, the benefits from protected forest, and the rates of return from other land uses.
In addition, we shall incorporate some other important variables that indicate the progress of environmental policy toward the Pareto equilibrium. A list of explanatory variables is given in Table 4. For convenience, in this table the explanatory variables are categorized into three factor groups: factors of regional and natural characteristics, factors related to economic and social conditions, and factors related to environmental policy. The factors for the regional and natural characteristics of the land concern conditions such as the natural shape of the land and meteorological aspects such as temperature, rainfall, and hours of sunlight, all of which determine the growth rate of the forest. As far as the economic factors are concerned, the independent variables must correspond to the economic benefits from the various land uses of each industry. We adopt variables which stand as proxies for the marginal benefits of land and labor. Here, we choose the productivity of forestry, prefectural income per capita, the ratio of employees in manufacture, agricultural productivity, and so on. Finally, as for the policy-related factors, we employ governmental investment for forest conservation and flood control, the number of staff in environment-related sections of the prefectural government, and the area reserved for the natural environment, including natural parks and nature conservation areas. In particular, we are interested in whether or not these variables have a positive influence on the designation of a protected forest.
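The three-way grouping of prefectures described in Sect. 5.1.1 can be sketched as a small function. This is an illustrative reading of the rule, not the authors' code: the function name `classify_prefecture`, the treatment of values exactly equal to the mean (assigned to $j = 1$), and the numbers in the usage lines are all assumptions.

```python
def classify_prefecture(x_share, z_share, mean_x_share, mean_z_share):
    """Return the land-use category j (0, 1, or 2) for one prefecture.

    x_share: protected forest area / total area (x/L)
    z_share: area for other activities / total area (z/L)
    mean_*:  the corresponding national mean values
    """
    # (i) j = 0: protected forest share below the mean, other-use share above it
    if x_share < mean_x_share and z_share > mean_z_share:
        return 0
    # (iii) j = 2: protected forest share above the mean, other-use share below it
    if x_share > mean_x_share and z_share < mean_z_share:
        return 2
    # (ii) j = 1: every remaining case (assumed to include ties with the mean)
    return 1

# Hypothetical shares, for illustration only
print(classify_prefecture(0.10, 0.50, 0.23, 0.30))
print(classify_prefecture(0.40, 0.20, 0.23, 0.30))
print(classify_prefecture(0.40, 0.50, 0.23, 0.30))
```

Treating category (ii) as the residual group keeps the three categories mutually exclusive, which the multinomial logit model requires.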

5.2 Method of Estimation

In this section, we employ two methods of empirical analysis of the land-use decision model (here, the decisions about the protected forest ratio). One is a simple regression analysis by OLS. The other is an analysis by a multinomial logit model (see footnote 6). We shall give the fundamental framework of the latter. At the prefecture-wide level, despite the restrictions of natural conditions such as mountains and plains, various planning decisions, e.g., development planning, forest planning, or planning for the promotion of agriculture, are likely to have a strong influence on the composition of land use. For example, as mentioned above, the owners of a forest must choose whether or not to receive a designation of protected forest. Formally, according to the owner's choice, the governor of the prefecture or the Minister of Agriculture, Forestry, and Fisheries actually designates a protected forest. Accordingly, it is reasonable to assume that the governor of the prefecture has an important role regarding land use, and that this role shapes the actual composition of land use in the prefecture. In Japan, land-use strategies differ greatly among prefectures. Here, we shall assume that the governor of each prefecture adopts one of three types of strategy concerning land use. The first type puts great stress on the conservation of the natural environment of the forest with minimum development ($j = 2$). The second type mainly focuses on rapid regional development ($j = 0$), and the third type pursues a middle course for regional development ($j = 1$). In the multinomial logit model, the probability of choice for each category is given by Eqs. 14 and 15.

6 One rigorous explanation of the procedure is given by Greene (2003). See also McFadden (1974).

Table 4. List of fundamental data

Variable | Max. | Min. | Mean | SD | Coefficient of variation | Data source, year, etc.
Dependent variables
OLS(1) | 0.697 | 0.035 | 0.231 | 0.117 | 0.507 | Ministry of Agriculture, Forestry and Fisheries, 2002
OLS(2) | 0.815 | 0.106 | 0.349 | 0.127 | 0.362 | Ministry of Agriculture, Forestry and Fisheries, 2002
Multinomial logit | 2.000 | 0.000 | 0.851 | 0.834 | 0.979 | (Refer to the text)
Factors of regional and natural characteristics
Ratio of sloping land, i.e., area with an inclination over 15%, per mountain area | 0.925 | 0.387 | 0.708 | 0.126 | 0.178 | Ministry of Land, Infrastructure and Transport
Average temperature | 23.4 | 8.8 | 15.2 | 2.5 | 0.161 | Meteorological Agency, 2003
Annual rainfall | 3311 | 893 | 1737 | 494 | 0.284 | Meteorological Agency, 2003
Annual hours of sunlight | 2108 | 1384 | 1748 | 177 | 0.101 | Meteorological Agency, 2003
Ratio of natural forest | 87.3 | 27.2 | 51.7 | 13.4 | 0.259 | Ministry of Agriculture, Forestry and Fisheries, 2000
Factors of an economic and social character
Amount of timber production/number of workers in forestry | 5970.0 | 50.0 | 2855.5 | 1370.4 | 0.480 | Ministry of Agriculture, Forestry and Fisheries, 2000
Prefectural income per capita (¥ 1000) | 4401 | 2331 | 2869 | 361 | 0.126 | Cabinet Office, 2000
Ratio of employees in manufacture | 0.285 | 0.096 | 0.192 | 0.049 | 0.256 | Population Census, 2000
Yield of rice per 10 a | 587 | 308 | 467 | 50 | 0.108 | Ministry of Agriculture, Forestry and Fisheries, 2000
Ratio of the facility emitting soot and dust | 0.058 | 0.015 | 0.035 | 0.009 | 0.257 | Ministry of the Environment, 2002
Factors of a policy-related character
Governmental investment for land conservation per capita | 103675 | 6466 | 40676 | 20582 | 0.506 | Ministry of Internal Affairs and Communications, 2001
Numbers of staff in the environmental section of prefectural government/residents | 1.5755 | 0.3200 | 0.6801 | 0.2869 | 0.422 | Ministry of the Environment, 2004
Ratio of the reserve areas for the natural environment (natural park and nature conservation) (%) | 37.2 | 1.9 | 16.4 | 8.8 | 0.536 | Ministry of the Environment, 2004

$$\mathrm{Prob}(Y_i = j \mid x_i) = \frac{\exp(b_j x_i)}{1 + \sum_{k=1}^{2} \exp(b_k x_i)}, \quad \text{for } j = 1, 2 \tag{14}$$

$$\mathrm{Prob}(Y_i = 0 \mid x_i) = \frac{1}{1 + \sum_{k=1}^{2} \exp(b_k x_i)}, \quad \text{for } j = 0 \tag{15}$$

where $b_j$ is a vector of the estimated parameters, and $x_i$ is a vector of the explanatory variables.
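As an illustration of Eqs. 14 and 15, the choice probabilities for one prefecture can be computed as follows. This is a minimal sketch: the function name `mnl_probabilities` and the coefficient and variable values in the usage lines are hypothetical, and the coefficients for the base category $j = 0$ are fixed at zero as in Eq. 15.

```python
import math

def mnl_probabilities(x, b1, b2):
    """Return [P0, P1, P2] for one observation under Eqs. 14 and 15.

    x:  explanatory variables for the observation
    b1: coefficient vector for j = 1
    b2: coefficient vector for j = 2
    """
    s1 = sum(b * v for b, v in zip(b1, x))
    s2 = sum(b * v for b, v in zip(b2, x))
    # The base category j = 0 has score 0. Subtracting the largest score
    # before exponentiating avoids overflow and leaves the ratios unchanged.
    top = max(0.0, s1, s2)
    e0, e1, e2 = math.exp(-top), math.exp(s1 - top), math.exp(s2 - top)
    total = e0 + e1 + e2
    return [e0 / total, e1 / total, e2 / total]

# Hypothetical values, for illustration only
probs = mnl_probabilities(x=[1.0, 0.7], b1=[-0.5, 1.2], b2=[-2.0, 3.0])
print(probs, sum(probs))
```

The three probabilities sum to one by construction, so only the coefficients for $j = 1$ and $j = 2$ need to be estimated.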

5.3 Outcome of the Estimation

The outcome of the estimation is shown in Tables 5–7. Table 5 shows the outcome of the estimation by OLS. For each dependent variable, the first two columns report the estimates with all the variables, and the next two columns report the estimates with a reduced set of variables. In particular, for OLS(1), the ratio of sloping land and the ratio of natural forest have a significant positive correlation. Factors of an economic or social character have little significant correlation, whereas factors of a policy-related character have a strong influence. Tables 6 and 7 show the outcome of the estimations by the multinomial logit model. In each case listed in the tables (from case 1 to case 4), the variables are chosen to maximize significance. With the coefficients for the first choice ($j = 0$) normalized to zero, we can see that a greater level of timber production per worker in forestry means a lower probability of choosing $j = 0$ (the strategy which supports the development of industries such as agriculture and manufacturing). In this case also, the probability of choosing $j = 1$ is higher than the probability of choosing $j = 2$. We can also recognize that a larger area reserved for the natural environment supports the designation of protected forest. We also investigate these effects through the marginal effect of each variable on the choice probabilities, that is, $\partial P_j/\partial x_i$ ($j = 0, 1, 2$). For example, in case 3, the choice of $j = 2$ has a high value for the ratio of sloping land, the ratio of natural forest, and government investment for land conservation per capita (0.32, 0.015, and 0.0019, respectively). It seems that a higher value for these variables brings a higher probability of designation as protected forest.
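The derivatives $\partial P_j/\partial x_i$ mentioned above can be sketched with the standard multinomial logit expression $\partial P_j/\partial x = P_j\,(b_j - \sum_k P_k b_k)$, where the base category has $b_0 = 0$. The code below is an illustration under that textbook formula, not the authors' computation; the function names and all numerical values are hypothetical.

```python
import math

def choice_probabilities(x, betas):
    """Probabilities [P0, P1, P2]; betas holds the coefficient vectors for j = 1, 2."""
    scores = [0.0] + [sum(b * v for b, v in zip(beta, x)) for beta in betas]
    top = max(scores)  # subtract the largest score for numerical stability
    expo = [math.exp(s - top) for s in scores]
    total = sum(expo)
    return [e / total for e in expo]

def marginal_effects(x, betas):
    """Return effects[j][i] = dP_j/dx_i = P_j * (b_j[i] - sum_k P_k * b_k[i])."""
    p = choice_probabilities(x, betas)
    all_betas = [[0.0] * len(x)] + [list(b) for b in betas]  # b_0 = 0
    # Probability-weighted average of the coefficient vectors
    avg = [sum(p[k] * all_betas[k][i] for k in range(len(p))) for i in range(len(x))]
    return [[p[j] * (all_betas[j][i] - avg[i]) for i in range(len(x))]
            for j in range(len(p))]

# Hypothetical values, for illustration only
effects = marginal_effects([1.0, 0.3], [[0.5, -1.0], [-0.2, 2.0]])
for j, row in enumerate(effects):
    print("j =", j, row)
```

Because the probabilities always sum to one, the marginal effects of any one variable sum to zero across the three categories, so a variable that raises the probability of $j = 2$ must lower the probability of at least one other category.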

6 Concluding Remarks

As mentioned at the beginning, this chapter evaluates Japanese forest policy by focusing on the forest protection system. We have adopted both theoretical and empirical analyses of several factors concerning the designation of protected forest. As a result, we have clearly shown

Estimated coefﬁcient

Constant Adj-R2 DW AIC

t value

OLS(2)

Estimated coefﬁcient

t value 2.4797* 1.5685

0.0021

2.2533*

t value

Estimated coefﬁcient

t value

0.1760 −0.0177 0.0000 0.0001 0.0021 0.0000

1.1640 1.4653 0.3096 0.5591 1.3350 0.4305

0.2021 −0.0125

1.6476 1.9926

0.0019

1.6484

0.2632 −0.0157 0.0000 0.0001 0.0029 0.0000

2.1339* 1.5937 0.9866 0.5144 2.2840* 0.3352

0.0000

0.7282

0.0000

0.3830

−0.1023

0.3162

−0.2800

0.7059

0.0005 −0.3602

1.7600 0.1604

0.0005

2.0055

0.0006 −0.2152

1.5498 0.0782

0.0005

1.5119

0.0000

1.4843

0.0000

3.6230**

0.0000

1.3053

0.0000

1.9402

0.0458

0.7669

−0.0074

0.1005

0.0025

1.6028

0.0022

1.5449

0.0045

0.0047

2.7269**

−0.2671 0.4169 2.3038 −80.4508

0.9430

−0.2945 0.4855 2.1840 −91.2936

1.7408

−0.1428 0.2540 2.0419 −61.3222

** Signiﬁcant at 1%; * signiﬁcant at 5%

0.2524 −0.0082

Estimated coefﬁcient

2.3107* 0.4114

−0.0475 0.3639 2.0095 −73.7730

0.2331

Ratio of sloping land Average temperature Annual rainfall Annual hours of sunlight Ratio of natural forest Amount of timber production/number of worker in forestry Prefectural income per capita Ratio of employees in manufacture Yield of rice per 10 a Ratio of facilities emitting soot and dust Governmental investment for land conservation per capita Numbers of staff in the environmental section in prefectural government Reserved areas for the natural environment

OLS(1)

Table 5. Outcome of the estimation by OLS Variables

Table 6. Outcome of the analysis by multinominal logit model (1) Case 1 t value

4 792.570 0.358 −448.779 17.386 0.368

0.653 0.000 −0.474 0.601 0.345

1.355

t value

j=1

t value

j=2

−6 564.230 2 151.780 −160.141 68.861 0.165

−0.179 0.127 −0.105 0.395 0.255

−3.140 10.298 −2.517 0.043 0.0037

−0.156 1.221 −1.813* 0.445 2.080**

−65.533 5.617 −0.112 0.270 0.0030

−1.588 0.560 −0.113 1.928* 2.343**

0.348

0.283

0.049

0.008

1.278

0.006

1.109

−5 303.270

−0.635

−10 584.100

−0.191

0.173 −45 425.600

0.027 −0.711

13.149 −77 226.200

0.496* −0.364

0.002

0.067

0.022

0.508

0.013

0.230

0.026

0.520

0.00018

1.580

0.00032

2.163**

−139.218

−0.082

432.353

0.211

1.595

0.343

4.940

1.055

−121.527

−0.392

−24.441

−0.459

−0.831

−0.197

−1.557

−0.019

Kullback– Leibler R2

Kullback– Leibler R2

0.722

Log likelihood

1.000

Log likelihood

−1.866* −14.117

t value

Signiﬁcance (P-value): *10%; **5%; ***1%

j=2

Constant Ratio of sloping land Average temperature Ratio of natural forest Amount of timber production/number of workers in forestry Prefectural income per capita Ratio of employees in manufacture Yield of rice per 10 a Ratio of facilities emitting soot and dust Governmental investment for land conservation per capita Numbers of staff in the environmental section in prefectural government Reserved areas for the natural environment

j=1

Case 2

Table 7. Outcome of the analysis by multinominal logit model (2) Case 3 t value

j=2

t value

j=1

t value

−3.818 10.606 −2.279 0.053 0.0034

−0.191 1.480 −1.956** 0.540 2.216**

−59.516 10.746 −0.232 0.260 0.0025

−1.664* 1.261 −0.258 2.105** 2.266**

7.787 8.915 −1.091 −0.032

0.824 1.802* −2.305** −0.597

0.007

1.236

0.004

0.726

0.002

0.091

0.030

0.808

0.014

0.00018

1.809*

0.00029

2.424**

0.00004

−0.753 Log likelihood

Signiﬁcance (P-value): * 10%; ** 5%; *** 1%

−2.084** −14.766

−0.141

−1.319

Kullback– Leibler R2

0.709

−0.284 Log likelihood

j=2

t value

−39.409 8.035 0.086 0.082

−1.617 1.456 0.151 1.289

0.812

0.048

1.482

0.965

0.00013

2.810***

−2.427** −23.004

−0.058 Kullback– Leibler R2

−0.739 0.547

Constant Ratio of sloping land Average temperature Ratio of natural forest Amount of timber production/number of workers in forestry Prefectural income per capita Ratio of employees in manufacture Yield of rice per 10 a Ratio of facilities emitting soot and dust Governmental investment for land conservation per capita Numbers of staff in the environmental section in prefectural government Reserved areas for the natural environment

Case 4

j=1


that the current environmental policy of each prefecture has a positive effect on the choice of protected forest. Nevertheless, there are still some points to be clarified, which entail more detailed investigations of the forest protection system in Japan. First, additional economic factors that influence the preference for protected forest must be incorporated, because the economic explanatory variables in our empirical analysis were not sufficiently significant. Second, a much closer synthesis of the theoretical and empirical analyses may be required. Last, the multinomial logit model we applied here is based on the assumption of so-called "independence from irrelevant alternatives." This assumption is not really adequate in our model. The nested logit model, which relaxes this assumption, is well known, and we should apply it in future analysis. In addition, separate analyses for each type of owner of protected forest (nation-owned or privately owned) and for each purpose of protected forest (headwater conservation, soil run-off prevention, etc.) will make our analysis more relevant.

References

Amacher GS, Koskela E, Ollikainen M (2004) Deforestation, production intensity and land use under insecure property rights. CESifo Working Paper No. 1128
Greene WH (2003) Econometric analysis, 5th edn. Prentice Hall, Englewood Cliffs
Hardie IW, Parks PJ (1997) Land use with heterogeneous land quality: an application of an area-based model. Am J Agric Econ 79:299–310
Lichtenberg E (1989) Land quality, irrigation development, and cropping patterns in the northern high plains. Am J Agric Econ 71:187–194
McFadden D (1974) Conditional logit analysis of qualitative choice behavior. In: Zarembka P (ed) Frontiers in econometrics. Academic Press, New York
Ministry of Agriculture, Forestry and Fisheries of Japan (1995–2002) Annual report on trends of forest and forestry
Ministry of Agriculture, Forestry and Fisheries of Japan (2004) Statistical survey of forests and forestry (in Japanese). Shinrin-Ringyo Tokei Yoran
Mitsui S (2004) Protection forest system. In: Sakai M (ed) Studies on forest policy (in Japanese). J-FIC, Tokyo
Palmquist RB (1989) Land as a differentiated factor of production: a hedonic model and its implications for welfare measurement. Land Econ 65:23–28
Yabuta M (2004) Public policy for common pool resources (in Japanese). Shinhyoron, Tokyo

Part IV Economics of Space: Theoretical Analysis

14. Railway Competition in a Park-and-Ride System

Tatsuaki Kuroda¹ and Kazutoshi Miyazawa²

Summary. The scale effect of city size and the cost advantage of railway over automobile use are examined for a simple park-and-ride commuter system. The main result is that the operation constraint (i.e., the budget constraint for a rail company) is more restrictive for a monopolistic competition equilibrium than for a kinked one. Hence, it is easy to construct a case in which the perverse characteristics of kinked equilibrium, e.g., "the larger a city, the greater the number of railways, and the higher the fare is," could result in a troublesome situation for a park-and-ride system. In addition, the operational constraints might be more restrictive for the socially optimal configuration than for the market solutions. As a result, railway subsidies may be necessary to balance the budget for a better configuration.

Key words. Urban transportation, City size, Park and ride, Kinked equilibrium, Commuter railway

1 Introduction

Energy conservation and increased fuel efficiency in the transportation sector are important environmental issues on the global, as well as the local, level. Several approaches to saving energy and cutting harmful emissions in urban transportation have been proposed, such as a congestion tax on automobiles, the utilization of light rail in downtown areas, the introduction of

¹ Graduate School of Environmental Studies, Nagoya University, Nagoya 464-8601, Japan
² School of Economics, Nanzan University, Nagoya 466-8673, Japan


intelligent transport systems (ITS) technology, and the development of battery/electric-powered vehicles. An additional, classic proposal is the "park-and-ride" system, in which automobiles are used to travel from residences to nearby railway stations, and railways are used to reach the central business district (CBD), where jobs are located. In this manner, private automobile use is reduced, thereby improving energy efficiency and minimizing environmental impacts in urban areas. In several countries, parking lots at railway stations in suburban areas charge reasonable fees to encourage their use. Unfortunately, this system is employed at a very limited level; a wider introduction of the system should be encouraged.

In the literature of urban economics, most models assume automobile commuting because most research is conducted in the US, where railway transportation is not as common as commuting by car. In contrast, the prevalence of railway utilization in Japan and Europe has resulted in a stream of studies on the related issue of railway competition (see Kanemoto, 1984, among others; footnote 1). While these papers focus on a single transportation mode, in reality commuters in these countries often use both the automobile and the railway. An exception is Capozza (1973), who examines the effects of alternative transportation modes differentiated by land consumption in urban land use or population density (footnote 2). However, he does not consider the combined use of several modes. Transportation engineers discuss the importance of such systems as park-and-ride, although the policy implications of these systems have not been sufficiently evaluated from an economic point of view. Moreover, the mechanism of choice between several transportation modes, including transportation combinations, should be investigated using a spatial configuration.

In this chapter, we will develop a simple model of a park-and-ride system in an urban setting.
We imagine a monocentric city in which several railways start from a single CBD and extend to the suburban areas as spokes. In our scenario, households use automobiles ﬁrst from their residences to the nearest railway station circumferentially, where they transfer to trains bound for the CBD, or they may choose to ride in private cars directly through to the CBD. In this model, competition exists between railways and between the railway and automobiles. Direct commuting using automobiles could be

1 Jensen (1998) discussed the characteristics of railway competition, and Kanemoto and Kiyono (1995) examine a way of regulating railways.
2 Tabuchi (1993) analyzes the effect of modal choice between automobile and railway in a bottleneck congestion model wherein space or urban configuration do not matter explicitly.


considered an "outside good" in a monopolistic competition of railways which are differentiated in location or space. Thus, we can observe the possibility of a "kinked equilibrium" with its perverse characteristics, as originally pointed out by Salop in 1979, who analyzed the role of an outside good in a horizontally differentiated market. In our setting, in particular, the possibility of kinked equilibrium implies the existence of potential obstacles to promoting the park-and-ride system. Fares in the symmetrical zero-profit kinked equilibrium decrease with constant marginal costs in proportion to railway distance, and increase with city size. Although some explanation might be possible, this seems very perverse. At the social optimum, we expect all households to use the park-and-ride system, but the system is not as successful in the largest cities such as Tokyo and Los Angeles. Hence, one might conjecture that one reason for the system's failure to prevail among commuters is the perverse results in kinked equilibrium, which show a scale diseconomy in city size on the commuter side. In contrast, this possibility might justify a subsidy to railway companies, even if they themselves employ constant returns to scale technology. As discussed above, one of the main concerns in this chapter is the scale effects of city size, and the cost advantage of railway over automobiles. Our main results for the other cases are as follows: in the case of a monopoly equilibrium, the ratio of the railway fare to the automobile costs, the share of the park-and-ride system, and the unit fare are constants, and are thus independent of the city size. In monopolistic competition equilibrium, the number of railway firms is linearly proportional to city size, i.e., the radius of the circular city considered. The fares at a certain location and distance from the CBD decrease with city size.
These results imply that, from the viewpoint of commuters, the scale economy in city size works well only in a monopolistically competitive case, while city size has no effect in the monopoly case, and may even be harmful due to a kinked equilibrium. At the social optimum, the number of railway ﬁrms and the railway fares are a ﬁxed item in monopolistic competition equilibrium. Finally, in a larger city, the operation constraints (the nonnegative proﬁt condition for railways) are relaxed, which means that scale economy based on city size always works. For commuters, however, city size is useful only in the monopolistically competitive case discussed above when evaluating the market structure of the park-and-ride setting. This chapter is organized as follows. Section 2 describes the basic model and presents the results under market competition. The social optimum is examined in Sect. 3. Section 4 discusses the operational constraints in detail, and is followed by some concluding remarks in Sect. 5.


T. Kuroda and K. Miyazawa

2 Model

2.1 Basic Structure

2.1.1 Spatial Configuration of the City

The radius of the subject city is denoted by $m$. For simplicity, the CBD is assumed to be a single point. The distance from the CBD is denoted by $x$. Each household lives at a unit interval circumferentially. The number of households on the concentric circle at $x$ is $2\pi\rho x$, where $\rho$ denotes the population density as a constant (for simplicity, each household is assumed to consist of one person). The total population, or number of households, in the city is

$$N = 2\pi\rho \int_0^m x\,dx = \pi\rho m^2 \tag{1}$$

We assume that each household consumes the same amount of land. Since $\rho$ does not have much influence on our results, it is normalized as 1 hereafter. Each household commutes to an office in the CBD every day, using either an automobile only or a combination of automobile and railway (i.e., the park-and-ride system).

2.1.2 Automobile

It is assumed that roads radiate from the CBD in all directions. In other words, each household/commuter could drive to an office in the CBD radially from any residence in the city. Although this is much simplified, it allows for the contrast of automobile and railway characteristics. The cost function of an automobile for each commuter in a residential lot at $x$ is given as

$$C(x) = c x^\alpha \tag{2}$$

where $\alpha > 0$ stands for the distance elasticity of the cost of driving an automobile. If $\alpha > 1$, the marginal cost of an automobile increases as one drives for a longer distance. This may occur because the higher temperature makes the engine less efficient, or because a longer drive makes the driver less comfortable. The fixed costs of the automobile are ignored here to allow for a contrast between the automobile and railway characteristics, since in general the latter needs huge fixed costs. In addition, the fixed costs of automobiles are absorbed, or “sunk,” when a household decides to use the vehicle daily. The one-way transportation cost facing each commuter who lives at $x$ and only uses an automobile is $c x^\alpha$. Since $2\pi x$ households live at distance $x$, if all households in the city only use automobiles, the total transportation costs are

$$2\pi c \int_0^m x^{1+\alpha}\,dx = \frac{2\pi c}{2+\alpha}\, m^{2+\alpha}$$

Railway vs Automobiles in a Park-and-Ride System


2.1.3 Railway

If a railway exists, it radiates from the CBD in one direction. The marginal cost for an additional commuter is assumed to be zero to allow for the contrast of characteristics between modes, as discussed above. A fixed cost $F > 0$ is required for laying one unit distance of railway. In other words, $F$ may be interpreted as the marginal cost for a unit distance of railway. It is, in effect, a constant marginal cost, and thus we assume constant returns to scale for distance in railway technology itself. Since we assume a circular city where the hinterland of each railway would increase in the direction from the CBD to the suburbs, and assume no marginal costs for the number of commuters, one might conjecture that our model should exhibit some economy of scale due to the spatial configuration and cost structure of the railway. As discussed later, this premise turns out to be false. The railway fare for passing through $x$ is denoted by $p(x)$. The cumulative railway fare from the CBD to a station at distance $x$ is given by

$$P_x \equiv \int_0^x p(x)\,dx$$

2.1.4 Park and Ride

If there is a station at distance $x$ from the CBD, households residing at $x$ in any direction from the CBD can use the park-and-ride system. A household living at $x$ from the CBD and at a distance $y$ from the station (we denote this location by $y_x$ hereafter) drives circumferentially from the residence to the station, and switches to the railway. The park-and-ride transportation cost of the household is $C(y) + P_x$. The park-and-ride system will be used if and only if

$$C(y) + P_x \le C(x) \tag{3}$$

2.2 Monopoly Equilibrium

Suppose that $n$ railway firms operate in the city, yet the railway markets do not overlap (Fig. 1). We call this phase a monopoly. While we will discuss the conditions under which this case matters later, in this subsection we focus on the situation in which each railway firm in the city lays track to $z$ ($0 < z \le m$).

Proposition 1: For any $x$ within $z$, (i) the ratio of the railway fare to the automobile cost is $\alpha/(1+\alpha)$, and (ii) the share of a railway firm is $1/[\pi(1+\alpha)^{1/\alpha}]$.

Proof: A marginal consumer at $(x, y_x)$ is characterized by

$$c y_x^\alpha + P_x = c x^\alpha$$

The demand for a railway at $x$ is given by

$$d_x^M = 2y_x = 2\left(x^\alpha - \frac{P_x}{c}\right)^{\frac{1}{\alpha}} \tag{4}$$

[Fig. 1. Monopoly equilibrium. Curves: cost of automobile, cost of park & ride, cost of railway. Regions: demand area of park & ride, demand area of automobile.]

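The maximization problem for the monopoly is formalized as

$$\Pi^M = \max_{P_x} \int_0^z P_x d_x^M\,dx - zF \tag{5}$$

subject to Eq. 4. The optimality condition for $P_x$ is

$$d_x^M + P_x \frac{\partial d_x^M}{\partial P_x} = 0$$

which gives the optimal fare as

$$P_x^M = \frac{\alpha}{1+\alpha}\, c x^\alpha \tag{6}$$

$$p(x) = \frac{\alpha^2}{1+\alpha}\, c x^{\alpha-1} \tag{7}$$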

Substituting Eq. 6 into Eq. 4, the demand for the railway at $x$ is given by

$$d_x^M = \frac{2x}{(1+\alpha)^{\frac{1}{\alpha}}} \tag{8}$$

Each firm's share of the park-and-ride system is

$$\frac{d_x^M}{2\pi x} = \frac{1}{\pi(1+\alpha)^{\frac{1}{\alpha}}} \tag{9}$$

[Q.E.D.]
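As a rough numerical cross-check of Proposition 1, the following sketch maximizes the revenue $P_x d_x^M$ at a single distance $x$ by a plain grid search and compares the result with the closed form in Eq. 6. The parameter values ($x = 3$, $c = 2$, $\alpha = 1.5$), the function names, and the grid-search method are illustrative assumptions, not part of the original analysis.

```python
# Sketch: grid-search check of the monopoly fare in Eq. 6.
# All numerical values and function names are illustrative assumptions.

def demand(P, x, c, alpha):
    """Eq. 4: width of the park-and-ride market at distance x for fare P."""
    inner = x ** alpha - P / c
    return 2.0 * inner ** (1.0 / alpha) if inner > 0.0 else 0.0


def best_fare_grid(x, c, alpha, steps=20000):
    """Fare in [0, c * x**alpha] that maximizes revenue P * d(P) on a grid."""
    top = c * x ** alpha  # above this fare nobody uses the railway
    best_P, best_rev = 0.0, 0.0
    for i in range(steps + 1):
        P = top * i / steps
        rev = P * demand(P, x, c, alpha)
        if rev > best_rev:
            best_P, best_rev = P, rev
    return best_P


def closed_form_fare(x, c, alpha):
    """Eq. 6: P = alpha / (1 + alpha) * c * x**alpha."""
    return alpha / (1.0 + alpha) * c * x ** alpha


x, c, alpha = 3.0, 2.0, 1.5
numeric = best_fare_grid(x, c, alpha)
exact = closed_form_fare(x, c, alpha)
print(numeric, exact)
```

Because the fixed cost $zF$ does not depend on $P_x$, maximizing the integrand pointwise is equivalent to the problem in Eq. 5, which is why a one-dimensional search at a single $x$ suffices here.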


Equation 6 states that the fare charged by a monopoly railway firm is a fraction $\alpha/(1+\alpha)$ of the transportation cost of an automobile, $c x^\alpha$. The greater the value of $\alpha$, the greater the cost advantage to the monopolist, and the higher the “relative” fare. Equation 9 states that the share of each railway is constant for any $x$, and decreases with $\alpha$. Note that any change in $c$ is irrelevant to the share, while any improvement for the automobile on the distance elasticity of the cost, or “distance averter” characteristics (i.e., a decrease in $\alpha$), decreases the relative fare and thus increases the share of a monopoly. In this case, the city size is irrelevant to the unit fare as well as the share. There is no scale economy effect for commuters due to city size in this case. Note that it is implicitly assumed here that the maximized profit of a firm is nonnegative, or a sufficiently small $F$ is assumed to guarantee its occurrence, which will be discussed in Lemma 3.

Lemma 1: Under monopoly equilibrium, the unit fare is constant if and only if the transportation cost of automobiles is linear, i.e., $\alpha = 1$. The unit fare is given by $p = c/2$, which is half of the unit cost of automobile use only.

Proof: The lemma follows immediately from Eq. 7. [Q.E.D.]

Lemma 2: If a monopoly firm builds a railway to $z$ and gains positive profits, then it has an incentive to extend the railway to $z + \Delta z$.

Proof: Substituting Eqs. 6 and 8 into Eq. 5, the monopoly profit is

$$\Pi^M(z) = z\left[2\alpha(1+\alpha)^{-\frac{1+\alpha}{\alpha}}(2+\alpha)^{-1} c z^{1+\alpha} - F\right]$$

The firm operates if and only if $\Pi^M(z) \ge 0$, i.e.,

$$F \le 2\alpha(1+\alpha)^{-\frac{1+\alpha}{\alpha}}(2+\alpha)^{-1} c z^{1+\alpha} \tag{10}$$

The derivative of net return with respect to $z$ is

$$\frac{\partial \Pi^M}{\partial z} = 2\alpha(1+\alpha)^{-\frac{1+\alpha}{\alpha}} c z^{1+\alpha} - F > 0$$

The last inequality comes from Eq. 10. Therefore, the firm has an incentive to extend the railway if Eq. 10 holds. [Q.E.D.]

Lemma 2 says that a monopoly railway firm will extend the line to the edge of the city if it is profitable at some $z \in (0, m]$. The next lemma follows immediately.

Lemma 3: Railway firms operate in the city if

$$F \le 2\alpha(1+\alpha)^{-\frac{1+\alpha}{\alpha}}(2+\alpha)^{-1} c m^{1+\alpha} \equiv \bar F^M \tag{11}$$

The maximum number of profitable monopolists is $n^M = \pi(1+\alpha)^{\frac{1}{\alpha}}$, if we do not care about the integer characteristic of the quantity of firms. $\bar F^M$ is increasing in $m$. Hence, in a larger city, the operation constraint in Eq. 11 is relaxed, and thus a monopoly is more profitable. That is, railway firms enjoy all of the scale economy effect due to city size. Again, any improvement in the automobile on “distance averter” characteristics (i.e., a decrease in $\alpha$) decreases $n^M$. Note that in this case, the share of a monopoly does not depend on $x$, which means that at $n^M$, the markets of neighboring railways just touch each other (and do not overlap).
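The closed-form monopoly profit used in Lemmas 2 and 3 can be checked against a direct numerical integration of $P_x^M d_x^M$ from Eqs. 6 and 8. The sketch below does this with a midpoint rule; the parameter values and function names are assumptions chosen only for illustration.

```python
# Sketch: monopoly profit (closed form vs. midpoint-rule integration) and
# the critical cost of Eq. 11. Parameter values are illustrative assumptions.

def profit_closed(z, c, alpha, F):
    """Closed-form monopoly profit for a line of length z."""
    k = 2.0 * alpha * (1.0 + alpha) ** (-(1.0 + alpha) / alpha) / (2.0 + alpha)
    return z * (k * c * z ** (1.0 + alpha) - F)


def profit_numeric(z, c, alpha, F, steps=20000):
    """Integrate fare (Eq. 6) times market width (Eq. 8) over [0, z]."""
    h = z / steps
    revenue = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        fare = alpha / (1.0 + alpha) * c * x ** alpha
        width = 2.0 * x / (1.0 + alpha) ** (1.0 / alpha)
        revenue += fare * width * h
    return revenue - z * F


def critical_cost_monopoly(m, c, alpha):
    """Eq. 11: largest F at which a line reaching the city edge breaks even."""
    k = 2.0 * alpha * (1.0 + alpha) ** (-(1.0 + alpha) / alpha) / (2.0 + alpha)
    return k * c * m ** (1.0 + alpha)


c, alpha, F = 1.0, 1.5, 0.05
print(profit_closed(2.0, c, alpha, F), profit_numeric(2.0, c, alpha, F))
print(critical_cost_monopoly(1.0, c, alpha), critical_cost_monopoly(2.0, c, alpha))
```

Doubling $m$ multiplies the critical cost by $2^{1+\alpha}$, which is the sense in which the operation constraint is relaxed in a larger city.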

2.3 Monopolistic Competition Equilibrium

Suppose that $n$ railway firms operate in the city, and that neighboring markets overlap at some $x$. In the short run, the number of firms is supposed to be fixed (we discuss the long-run equilibrium later). Let us call this phase monopolistic competition (Fig. 2).

Proposition 2: In a symmetric equilibrium under monopolistic competition among railways, (i) the ratio of the railway fare to the automobile cost is $2\alpha(\pi/n)^\alpha$, and (ii) if $\alpha = 1$, then the unit fare is constant at $2\pi c/n$.

Proof: Suppose that the neighboring firms charge $\bar P_x$. A marginal consumer at $(x, y_x)$ is characterized by

$$P_x + c y_x^\alpha = \bar P_x + c\left(\frac{2\pi x}{n} - y_x\right)^\alpha$$

which gives demand per firm as $d_x^c = 2y_x(P_x)$.

[Fig. 2. Monopolistic competition equilibrium. Curves: cost of automobile, cost of park & ride, cost of railway. Region: demand area of a railway.]

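Note that

$$\frac{\partial d_x^c}{\partial P_x} = -\frac{2}{\alpha c\left[y_x^{\alpha-1} + \left(\frac{2\pi x}{n} - y_x\right)^{\alpha-1}\right]}$$

The maximization problem for a monopolistically competitive firm is formalized as

$$\Pi^c = \max_{P_x} \int_0^m P_x d_x^c\,dx - mF \tag{12}$$

The optimality condition is

$$d_x^c + P_x \frac{\partial d_x^c}{\partial P_x} = 0$$

From symmetry, the equilibrium condition is $P_x = \bar P_x \equiv P_x^c$. Thus, the optimal fare is

$$P_x^c = 2\alpha\left(\frac{\pi}{n}\right)^\alpha c x^\alpha$$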

If $\alpha = 1$, then the unit fare is given by $p_x^c = 2\pi c/n$. $n^M = 2\pi$ if $\alpha = 1$. Since $n > n^M$ is obvious due to the setting considered, $p_x^c < c$. The unit rail fare is always less than the unit cost of automobile use. [Q.E.D.]

The profit per firm is

$$\Pi^c = m\left[4\alpha(2+\alpha)^{-1} c \left(\frac{\pi}{n}\right)^{1+\alpha} m^{1+\alpha} - F\right]$$

Over the long run, the number of railway firms is determined by the free-entry condition $\Pi^c = 0$. In equilibrium,

$$n^c = \pi\left[\frac{4\alpha c}{(2+\alpha)F}\right]^{\frac{1}{1+\alpha}} m \tag{13}$$

$$P_x^c = (2\alpha)^{\frac{1}{1+\alpha}}\left[\frac{(2+\alpha)F}{2c}\right]^{\frac{\alpha}{1+\alpha}} m^{-\alpha} c x^\alpha \tag{14}$$

That is, the equilibrium number of railway firms is linearly proportional to city size, $m$, under monopolistic competition. In addition, the price schedule is

$$P_x^c = A c\left(\frac{x}{m}\right)^\alpha, \quad \text{where } A = (2\alpha)^{\frac{1}{1+\alpha}}\left[\frac{(2+\alpha)F}{2c}\right]^{\frac{\alpha}{1+\alpha}}$$

This means that the price is determined by relative distance, $x/m$, with the power $\alpha$. Hence, this follows the cost schedule of automobile use under equilibrium. Moreover, prices $P_x^c$ and $p(x)$ at a certain distance from the CBD are decreasing with city size, $m$. In this case, the economy of scale due to city size really matters from the viewpoint of commuters. The necessary condition that each market is overlapped is

$$P_x^c + c\left(\frac{\pi x}{n^c}\right)^\alpha \le c x^\alpha$$

Substituting Eqs. 13 and 14, the operation constraint for monopolistic competition is

$$F \le 4\alpha(1+2\alpha)^{-\frac{1+\alpha}{\alpha}}(2+\alpha)^{-1} c m^{1+\alpha} \equiv \bar F^c \tag{15}$$

Lemma 4: $\bar F^c < \bar F^M$ for any $\alpha > 0$.

Proof: From Eqs. 11 and 15, the condition $\bar F^c < \bar F^M$ is equivalent to

$$(1+2\alpha)^{\frac{1+\alpha}{\alpha}} > 2(1+\alpha)^{\frac{1+\alpha}{\alpha}}$$

or

$$1 + \frac{\alpha}{1+\alpha} > 2^{\frac{\alpha}{1+\alpha}}$$

The difference function is defined by $f(t) = 1 + t - 2^t$, where $t \equiv \alpha/(1+\alpha) \in (0,1)$ because $\alpha > 0$. Observe that $f$ is continuous and differentiable, and that $f(0) = f(1) = 0$, and $f'' < 0$. Therefore, $f(t) > 0$ for any $t \in (0,1)$. [Q.E.D.]
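The long-run results of this subsection can be illustrated numerically. The sketch below evaluates the free-entry number of firms (Eq. 13), confirms that profit per firm vanishes there, shows the fare at a fixed distance falling as the city grows, and evaluates the difference function from the proof of Lemma 4. Parameter values and function names are assumptions for illustration only.

```python
# Sketch: free-entry monopolistic competition (Eqs. 13-14) and Lemma 4.
# Parameter values are illustrative assumptions.
import math


def firms_free_entry(m, c, alpha, F):
    """Eq. 13: number of firms at which profit per firm is zero."""
    return math.pi * (4.0 * alpha * c / ((2.0 + alpha) * F)) ** (1.0 / (1.0 + alpha)) * m


def profit_per_firm(n, m, c, alpha, F):
    """Profit of one firm in the symmetric equilibrium with n firms."""
    revenue = 4.0 * alpha / (2.0 + alpha) * c * (math.pi / n) ** (1.0 + alpha) * m ** (1.0 + alpha)
    return m * (revenue - F)


def fare(x, n, c, alpha):
    """Symmetric-equilibrium cumulative fare to a station at distance x."""
    return 2.0 * alpha * (math.pi / n) ** alpha * c * x ** alpha


def lemma4_gap(alpha):
    """f(t) = 1 + t - 2**t with t = alpha / (1 + alpha); positive on (0, 1)."""
    t = alpha / (1.0 + alpha)
    return 1.0 + t - 2.0 ** t


c, alpha, F = 1.0, 1.0, 0.1
n_small = firms_free_entry(1.0, c, alpha, F)
n_large = firms_free_entry(2.0, c, alpha, F)
print(n_small, n_large)
print(fare(1.0, n_small, c, alpha), fare(1.0, n_large, c, alpha))
print([round(lemma4_gap(a), 4) for a in (0.1, 0.5, 1.0, 2.0, 10.0)])
```

With these assumed values $F$ is below the bound in Eq. 15 for both city sizes, so the overlapping-market case applies.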

2.4 Kinked Equilibrium

We now consider the stability of the equilibrium or the equilibrium process with free entry and exit of railway companies in the city. Suppose the constant marginal cost of unit distance railway service, $F$, is sufficiently small to satisfy the condition in Eq. 15. In this case, even though the initial phase is the monopoly (short run) equilibrium, new railway companies will enter until profits become zero. According to standard microeconomic theory, the stable, or long-run, equilibrium in this case would be the monopolistic competition long-run equilibrium characterized by Eqs. 13 and 14. In our model, however, this may not happen, due to the possibility of direct automobile commuting from home to the workplace. This direct commute can be interpreted as the “outside good,” first examined by Salop (1979). He found the possibility of the existence of a zero-profit kinked equilibrium, and examined the equilibrium in which the comparative statics results are really perverse.

[Fig. 3. Kinked equilibrium. Curves: cost of automobile, cost of park & ride, cost of railway. Region: demand area of a railway.]

Since the possibility of a kinked equilibrium in our model is easily proven, let us examine this phase here (Fig. 3).

Proposition 3: In a zero-profit kinked equilibrium, (i) the ratio of the railway fare to automobile use cost is $1 - (\pi/n)^\alpha$, and (ii) if $\alpha = 1$, then the unit fare is constant at $c(1 - \pi/n)$.

Proof: There exists a critical fare, $\hat P_x$, such that

$$P_x + c\left(\frac{d_x^M}{2}\right)^\alpha = c x^\alpha \quad \text{if } \hat P_x \le P_x \le c x^\alpha,$$

$$P_x + c\left(\frac{d_x^c}{2}\right)^\alpha = \bar P_x + c\left(\frac{2\pi x}{n} - \frac{d_x^c}{2}\right)^\alpha \quad \text{if } P_x \le \hat P_x$$

The critical fare is derived by setting $d_x^M = d_x^c = 2\pi x/n$.

$$\hat P_x = \left[1 - \left(\frac{\pi}{n}\right)^\alpha\right] c x^\alpha \tag{16}$$

$$\hat d_x = \frac{2\pi x}{n} \tag{17}$$

If $\alpha = 1$, then the unit fare is

$$\hat p_x = c\left(1 - \frac{\pi}{n}\right) \tag{18}$$

[Q.E.D.]

The profit per firm is

$$\Pi^k = \int_0^m \hat P_x \hat d_x\,dx - mF = m\left\{\frac{2\pi}{n}\left[1 - \left(\frac{\pi}{n}\right)^\alpha\right](2+\alpha)^{-1} c m^{1+\alpha} - F\right\}$$


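The zero profit condition requires that

$$\frac{\pi}{n}\left[1 - \left(\frac{\pi}{n}\right)^\alpha\right] = \frac{(2+\alpha)F}{2cm^{1+\alpha}} \tag{19}$$

Denote the left-hand side of Eq. 19 using a function $f(t) = t(1 - t^\alpha)$, where $t \equiv \pi/n \in (0,1)$. The maximum is given by

$$f\left((1+\alpha)^{-\frac{1}{\alpha}}\right) = \alpha(1+\alpha)^{-\frac{1+\alpha}{\alpha}}$$

The necessary condition for the existence of kinked equilibrium is that $\alpha(1+\alpha)^{-\frac{1+\alpha}{\alpha}} \ge (2+\alpha)F/(2cm^{1+\alpha})$, i.e.,

$$F \le 2\alpha(1+\alpha)^{-\frac{1+\alpha}{\alpha}}(2+\alpha)^{-1} c m^{1+\alpha} \equiv \bar F^k = \bar F^M \tag{20}$$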

Thus, the operation constraint of the kinked equilibrium is the same as the condition for monopoly equilibrium. If Eq. 20 is satisfied, the number of railway firms is given by Eq. 19, and the fare and market share are given by Eqs. 16 and 17. Note that $f(t)$ has a maximum when $n = \pi(1+\alpha)^{\frac{1}{\alpha}}$, and thus $n^k \ge n^M = \pi(1+\alpha)^{\frac{1}{\alpha}}$.³ Using Eqs. 16 and 19, the comparative static results that the number of railway firms is decreasing in $F$ and increasing in $m$ are easily obtained. Thus, the fare in the symmetric zero-profit kinked equilibrium is decreasing with the constant marginal cost per unit distance, and increasing with city size. This seems perverse, yet it has an intuitive explanation, given below. As the city size becomes greater or the marginal cost becomes smaller, more railway firms enter the market. The market share per firm shrinks. Under kinked equilibrium, the competitors of railway firms are not only neighboring firms, but implicitly also automobiles. As the market shrinks, the marginal consumer is located closer to the railway. Each railway firm can therefore charge a higher fare. As discussed above, the characteristics of kinked equilibrium may bring a troublesome situation for the park-and-ride system. As the size of a city increases, the number of railways increases and the fare, $\hat P_x$, also increases. Although this chapter discusses a deterministic model, a probabilistic model may be more appropriate if we allow consideration of heterogeneity in tastes, wherein higher railway fares would be explicitly obstructive to the promotion of park-and-ride.⁴

³ Equation 19 has two solutions. The relevant solution of $n$ is the larger one. To see this, suppose that fixed/marginal cost $F$ becomes zero. By Eq. 19, $\pi/n$ must be zero or one. The economically relevant solution is zero, which corresponds to a sufficiently large $n$.

⁴ While we focus on the park-and-ride system here, the results could be applied to a bus-and-rail network system in suburban areas.
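The comparative statics of the kinked equilibrium can be reproduced numerically. The sketch below solves Eq. 19 for the smaller root $t = \pi/n$ (the larger $n$, as in footnote 3) by bisection and reports the fare ratio $1 - t^\alpha$ from Eq. 16. The bisection routine, function names, and parameter values are assumptions made for illustration.

```python
# Sketch: zero-profit kinked equilibrium (Eqs. 16, 19, 20) by bisection.
# Parameter values are illustrative assumptions.
import math


def kinked_t(m, c, alpha, F, tol=1e-13):
    """Smaller root t = pi / n of t * (1 - t**alpha) = (2 + alpha) F / (2 c m**(1 + alpha))."""
    rhs = (2.0 + alpha) * F / (2.0 * c * m ** (1.0 + alpha))
    t_peak = (1.0 + alpha) ** (-1.0 / alpha)  # maximizer of f(t)
    if rhs > t_peak * (1.0 - t_peak ** alpha):
        raise ValueError("F exceeds the bound in Eq. 20; no kinked equilibrium")
    lo, hi = 0.0, t_peak  # f is increasing on [0, t_peak]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid * (1.0 - mid ** alpha) < rhs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fare_ratio(m, c, alpha, F):
    """Eq. 16: critical fare relative to the automobile cost c * x**alpha."""
    return 1.0 - kinked_t(m, c, alpha, F) ** alpha


def firms(m, c, alpha, F):
    """Number of railway firms implied by t = pi / n."""
    return math.pi / kinked_t(m, c, alpha, F)


for m in (1.0, 2.0):
    print(m, firms(m, 1.0, 1.0, 0.1), fare_ratio(m, 1.0, 1.0, 0.1))
```

For $\alpha = 1$ the root is available in closed form, $t = (1 - \sqrt{1 - 4k})/2$ with $k$ the right-hand side of Eq. 19, which gives a simple check on the bisection.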


3 Social Optimum

In this chapter, we implicitly assume that commuters work in the CBD to produce certain goods. Yet, given that the population of a city is fixed, the total income of the city is also fixed due to production technology and the given prices of the outputs. Next, suppose that the competitive land market works well whenever ownership of land is either absentee or public. Since the land consumption of each household is fixed, the railway fare should not provide any distortion to the land market. The land rents can simply be cancelled out between payments by households and the income of landlords in the city account. Hence, maximizing the total surplus of the city is the same as minimizing the total transport costs in the city. Thus, a city planner chooses the number of railways, $n$, and sets the fares, $P_x$, $x \in (0, m)$, to minimize the total transportation costs under a balanced budget and the participation constraints. Is it optimal to allow some households to use only automobiles? In our model, the answer is no under the assumed cost structure. The following lemma will be useful.

Lemma 5: At the optimum, all households use the park-and-ride system.

Proof: Suppose that each market is segmented. As in the monopoly case (not for pricing policy, but for each hinterland), a marginal consumer is given by $\bar y = (x^\alpha - P_x/c)^{\frac{1}{\alpha}}$. The sum of transportation costs for park-and-ride users is

$$C^{PR} = 2n\int_0^m \int_0^{\bar y} (P_x + c y^\alpha)\,dy\,dx \tag{21}$$

The sum of transportation costs for automobile users is

$$C^A = 2n\int_0^m \left(\frac{\pi x}{n} - \bar y\right) c x^\alpha\,dx \tag{22}$$

The budget constraint for the park-and-ride system is

$$2n\int_0^m P_x \bar y\,dx = mnF \tag{23}$$

Using Eq. 23, the total transportation cost is given by

$$C^{PR} + C^A = 2\pi c(2+\alpha)^{-1} m^{2+\alpha} - 2nc(1+\alpha)^{-1}\int_0^m \bar y^{1+\alpha}\,dx$$

As long as each market is segmented, i.e., $\bar y < \pi x/n$, the planner can reduce the sum of transportation costs by increasing $n$. Therefore, at the optimum, each market must be overlapped, i.e., all households use the park-and-ride system. [Q.E.D.]


Proposition 4: At the optimum, the number of railways is linearly proportional to the city size, $m$. Optimal prices $P_x^*$ and $p(x)$ at a certain location away from the CBD are decreasing with city size, $m$. The number of railway firms and the railway fare are given by

$$n^* = [2(1+\alpha)]^{-\frac{1}{1+\alpha}} n^c$$

$$P_x^* = [2(1+\alpha)]^{-\frac{1}{1+\alpha}} P_x^c$$

Proof: The sum of transportation costs is given by

$$C = \int_0^m \left(2\pi x P_x + 2n\int_0^{\frac{\pi x}{n}} c y^\alpha\,dy\right) dx = \int_0^m \left[2\pi x P_x + \frac{2nc}{1+\alpha}\left(\frac{\pi x}{n}\right)^{\alpha+1}\right] dx \tag{24}$$

The budget constraint is

$$\int_0^m 2\pi x P_x\,dx - mnF = 0 \tag{25}$$

and the participation constraints are

$$c\left(\frac{\pi x}{n}\right)^\alpha + P_x \le c x^\alpha \tag{26}$$

for $x \in (0, m]$. Substituting Eq. 25 into Eq. 24,

$$C = m\left[nF + \frac{2c}{(1+\alpha)(2+\alpha)}\,\frac{\pi^{1+\alpha}}{n^\alpha}\, m^{1+\alpha}\right] \tag{27}$$

Minimizing the total costs of Eq. 27 with respect to $n$ gives the optimal number of railways:

$$n^* = \pi\left[\frac{2\alpha c}{(1+\alpha)(2+\alpha)F}\right]^{\frac{1}{1+\alpha}} m \tag{28}$$

Note that the optimal number of railways is linearly proportional to the city size, $m$. Comparing Eq. 28 with Eq. 13, the ratio is given by $[2(1+\alpha)]^{-\frac{1}{1+\alpha}}$. That is, the number of railways in monopolistic competition is more than the optimum. This implies over-capacity. Substituting Eq. 28 into Eq. 25,

$$\int_0^m x P_x\,dx = \frac{m^2 F}{2}\left[\frac{2\alpha c}{(1+\alpha)(2+\alpha)F}\right]^{\frac{1}{1+\alpha}}$$


Conjecture that $P_x = A c x^\alpha$. As the left-hand side is then $A c m^{2+\alpha}/(2+\alpha)$,

$$A = \left(\frac{\alpha}{1+\alpha}\right)^{\frac{1}{1+\alpha}}\left[\frac{(2+\alpha)F}{2c}\right]^{\frac{\alpha}{1+\alpha}} m^{-\alpha}$$

Therefore,

$$P_x^* = \left(\frac{\alpha}{1+\alpha}\right)^{\frac{1}{1+\alpha}}\left[\frac{(2+\alpha)F}{2c}\right]^{\frac{\alpha}{1+\alpha}} m^{-\alpha} c x^\alpha \tag{29}$$

That is, the optimal price is decreasing in city size $m$. Comparing Eq. 29 with Eq. 14, the ratio is given by $[2(1+\alpha)]^{-\frac{1}{1+\alpha}}$. [Q.E.D.]

Denote the inverse of this ratio, $n^c/n^*$, by a function $g = [2(1+\alpha)]^{\frac{1}{1+\alpha}}$. Logarithmic differentiation gives $g'/g = [1 - \ln 2(1+\alpha)]/(1+\alpha)^2$. The ratio is increasing in $\alpha \in (0, e/2 - 1)$, and decreasing in $\alpha \in (e/2 - 1, \infty)$. Further note that $g(0) = g(1) = 2$. To the extent that the marginal cost of automobile use is decreasing ($\alpha < 1$), the ratio lies in $(2, e^{2/e}]$. The optimal number of railway firms is at the most one half of the monopolistically competitive firms, for instance. This would provide the rationale for public regulation of new-entry railways. In addition, to the extent that the marginal cost is increasing ($\alpha > 1$), the ratio lies in $(1, 2)$ and the optimal number of railway firms is relatively large. Interestingly, within the range of $\alpha \in (0, e/2 - 1)$, the optimal number of railway firms tends to increase when the cost performance of automobile use is improved.

Substituting Eqs. 28 and 29 into Eq. 26, the operation constraint for the social optimum is

$$F \le 2\alpha(1+\alpha)^{\frac{1}{\alpha}}(1+2\alpha)^{-\frac{1+\alpha}{\alpha}}(2+\alpha)^{-1} c m^{1+\alpha} \equiv \bar F^* \tag{30}$$

Lemma 6: $\bar F^* \gtreqless \bar F^c$ if $\alpha \lesseqgtr 1$.

Proof: Comparing Eq. 30 with Eq. 15,

$$\frac{\bar F^*}{\bar F^c} = \frac{1}{2}(1+\alpha)^{\frac{1}{\alpha}}$$

The proof is complete because the right-hand side is a decreasing function of $\alpha$ and equals 1 at $\alpha = 1$. [Q.E.D.]
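The three critical costs in Eqs. 11, 15, and 30 are easy to tabulate side by side. The following sketch evaluates them for a few values of $\alpha$ and also recovers the ratio $n^*/n^c$ directly from Eqs. 13 and 28. Function names and parameter values ($c = m = 1$, $F = 0.05$) are assumptions for illustration only.

```python
# Sketch: critical railway costs (Eqs. 11, 15, 30) and the ratio n*/n^c.
# Parameter values are illustrative assumptions.
import math


def critical_cost_monopoly(alpha, c=1.0, m=1.0):
    """Eq. 11 (equal to the kinked-equilibrium bound in Eq. 20)."""
    return 2.0 * alpha * (1.0 + alpha) ** (-(1.0 + alpha) / alpha) / (2.0 + alpha) * c * m ** (1.0 + alpha)


def critical_cost_competition(alpha, c=1.0, m=1.0):
    """Eq. 15."""
    return 4.0 * alpha * (1.0 + 2.0 * alpha) ** (-(1.0 + alpha) / alpha) / (2.0 + alpha) * c * m ** (1.0 + alpha)


def critical_cost_optimum(alpha, c=1.0, m=1.0):
    """Eq. 30."""
    return (2.0 * alpha * (1.0 + alpha) ** (1.0 / alpha)
            * (1.0 + 2.0 * alpha) ** (-(1.0 + alpha) / alpha)
            / (2.0 + alpha) * c * m ** (1.0 + alpha))


def firms_competition(alpha, c, F, m):
    """Eq. 13."""
    return math.pi * (4.0 * alpha * c / ((2.0 + alpha) * F)) ** (1.0 / (1.0 + alpha)) * m


def firms_optimum(alpha, c, F, m):
    """Eq. 28."""
    return math.pi * (2.0 * alpha * c / ((1.0 + alpha) * (2.0 + alpha) * F)) ** (1.0 / (1.0 + alpha)) * m


for a in (0.5, 1.0, 2.0):
    print(a,
          round(critical_cost_competition(a), 6),
          round(critical_cost_optimum(a), 6),
          round(critical_cost_monopoly(a), 6),
          round(firms_optimum(a, 1.0, 0.05, 1.0) / firms_competition(a, 1.0, 0.05, 1.0), 6))
```

The printed rows make the ordering in Lemmas 4 and 6 visible: the optimum bound lies above the competitive bound for $\alpha < 1$, coincides with it at $\alpha = 1$, and lies below it for $\alpha > 1$, while the monopoly bound is the largest in each case.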

4 Discussion

A possible commuter system depends on (i) the railway marginal cost per unit distance, $F$, (ii) the parameters of automobile use cost, $\alpha$ (and $c$), and (iii) city size, $m$. To sum up the results above, in a larger city, the operational constraints are relaxed in any case. This is the basic effect of an economy of scale due to city size, since we assume a constant marginal cost for distance and no marginal cost for an additional commuter in the circular city, in which each hinterland of railways becomes greater toward the suburbs. Then the order of the critical cost of railways in each type of equilibrium is as follows. If $0 < \alpha < 1$, then $\bar F^c < \bar F^* < \bar F^M = \bar F^k$. If $\alpha > 1$, then $\bar F^* < \bar F^c < \bar F^M = \bar F^k$. If $F > \bar F^M$, no railway will do business in the city, and this is justified even from the social point of view. Although we derive the equilibria of monopoly, monopolistic competition, and kinked cases, the monopoly case seems too restrictive as a stable equilibrium configuration. In other words, the average cost of a railway must be tangent to the demand curve if the case is a zero-profit equilibrium wherein no entry and exit would occur (Fig. 4). It is beyond our purpose here to predict which equilibrium would be observed more often. Yet, the operation constraint (i.e., the feasible range of $F$) is more restrictive for the monopolistic competition equilibrium. Hence, $F$ might happen to fall between $\bar F^c$ and $\bar F^k$, where the perverse characteristics of kinked equilibrium become of concern. Unlike the monopoly case, $F \le \bar F^k$ is enough for a stable (i.e., zero-profit) kinked equilibrium. That is, if a railway firm is viable in some city, kinked equilibrium has no trouble with the operational constraint. As discussed above, the characteristics of kinked equilibrium may bring a troublesome situation for the park-and-ride system. The larger a city is, the greater the number of railways, and the higher the fare, $\hat P_x$. In the case of larger cities, we should subsidize the railway sector to get rid of the trap of the equilibrium. In addition, to some extent and in reality, it seems that if $\alpha > 1$, $\bar F^* < \bar F^c < \bar F^k$. This implies that even if we could find a socially optimal configuration of urban transportation system in this manner, it would be necessary to subsidize the railway sector in order to induce the railways to choose the socially optimal configuration by balancing their budget.

[Fig. 4. Kinked demand curve and three types of equilibrium. Axis labels: price, costs. Curves: average costs, demand. Marked points: monopoly equilibrium, kinked equilibrium, monopolistic competition equilibrium.]

5 Concluding Remarks

Although the results are interesting, our model is still a prototype of an urban transportation network. Thus, there are many ways to extend the results of this chapter. First, although the lot size is fixed for simplicity here, endogenizing lot size will provide a greater reflection of spatial economics as well as reality. Second, we do not consider the effects of congestion in the automobile network. Congestion control by tax is currently an important subject (e.g., the congestion tax employed in London), and incorporating congestion-related phenomena into our park-and-ride setting would provide a truer model. Third, generalizing the form of transportation cost functions on both sides of fixed and marginal costs would provide more information. Finally, while we assume a single CBD, a monocentric city, in this model, in reality urban areas have several CBDs, e.g., Tokyo, Paris, etc. In such multicentric cities, the effects of park-and-ride might be different from those in a monocentric city.

References

Capozza D (1973) Subways and land use. Environ Plann 5:555–576
Jensen A (1998) Competition in railway monopolies. Transp Res Part E 34:267–287
Kanemoto Y (1984) Pricing and investment policies in a system of competitive commuter railways. Rev Econ Stud 51:665–681
Kanemoto Y, Kiyono K (1995) Regulation of commuter railways and spatial development. Reg Sci Urban Econ 25:377–394
Salop SC (1979) Monopolistic competition with outside goods. Bell J Econ 10:141–156
Tabuchi T (1993) Bottleneck congestion and modal split. J Urban Econ 34:414–431

15. An Analysis of the Relationship Between Manufacturer’s Profit and Spatial Economic Structure in the Retail Market

Toshiharu Ishikawa

Summary. This chapter first analyzes the relationship between manufacturer’s profit and competitive equilibria in the retail market. The analysis identifies the equilibrium in the retail market that maximizes the manufacturer’s profit, and shows the interesting relationships between the retailer’s price and market size and the manufacturer’s profit in several competitive equilibria. Second, the chapter assumes two different oligopolistic retail markets, and derives the profits of the manufacturer and the retailer in these markets. The profits obtained in each market are compared with those gained by the manufacturer in a competitive retail market. This comparison shows the interesting correspondence between the profits generated in the retail markets with different kinds of competition.

Key words. Spatial retail market structure, Spatial free-entry equilibrium, Oligopolistic market, Manufacturer’s profit

1 Introduction

There are several economic structures in the spatial retail market: free-entry, oligopoly, and monopoly. According to the economic structure established in the retail market, the retailer’s price and market size become different. It follows that the amount of goods produced by the manufacturers, and their profits, vary corresponding to the retail market structure formed. The spatial and economic structure in the retail market is thus important not only for the retailers, but also for the manufacturers. This chapter analyzes the effects of the retail market structure on the profits of the manufacturers as well as the retailers.

Faculty of Economics, Chuo University, Hachioji, Tokyo 192-0393, Japan


Mathewson and Winter (1983) derived a monopolistic manufacturer’s profits in some free-entry equilibria in the retail market, and examined the relationships between the manufacturer’s profits and the competitive equilibria in the market. Their considerations are worth extending to cover the wider relationships between market structures and the profits of the manufacturer and the retailers. Taking two kinds of economic structure, free-entry and oligopoly, into consideration, this chapter analyzes the relationship between retail market structures and the profits of the retailers and the monopolistic manufacturer.

This chapter is organized as follows. Section 2 introduces the basic assumptions and the model. Section 3 assumes a free-entry retail market, analyzes the relationship between the equilibria in the competitive retail market and the manufacturer’s profits, and delineates the features of some equilibria in the market. In Sect. 4, first, two different oligopolistic retail markets are assumed, and the profits of the manufacturer and the retailers are derived in each market. Second, the profits obtained are compared with the manufacturer’s profit gained in a competitive retail market. This comparison shows an interesting correspondence between the profits generated in the retail markets with the different kinds of competition. Section 5 summarizes these considerations of the effects of the retail market structure on the profits of the manufacturer and the retailers.

2 Basic Assumptions and the Model

The analysis is carried out with the following assumptions and model. A monopolistic manufacturer is located at the center O of a circle with radius $U$, as shown in Fig. 1. Consumers are distributed with uniform density 1 on the circumference. Each consumer purchases goods from the retailer whose delivered price is the lowest. Retailers, which are located evenly on the circumference, buy products from the manufacturer and sell them to consumers at a profit-maximizing price. The demand function of the individual consumer is

$$q = a - p_r - t\theta U \tag{1}$$

where $q$ is the quantity demanded, $a$ is the consumers’ maximum reservation price, and $p_r$ is the retailer’s price. $t\theta U$ is the transportation cost from the consumer to the retailer, $t$ is the transport cost per mile, and $\theta$ is the angle formed by two lines: one line connects the consumer (for example, point C1 in Fig. 1) and the center O, and the other connects the retailer (point R1 in Fig. 1) and the center O. The retailer’s profit $Y_r$ is given by

[Fig. 1. Circumference market model with consumers, retailers, and manufacturer. Labeled points: center O, consumer C1, retailer R1, market boundary B1; angles $\theta$ and $\theta^*$; radius $U$.]

$$Y_r = (p_r - p_m - c_r)Q - F_r \tag{2}$$

where $p_m$ is the manufacturer’s mill price, and $c_r$ and $F_r$ are the retailer’s marginal and fixed costs, respectively. The amount of sales of the goods $Q$ is obtained by Eq. 3.

$$Q = 2\int_0^{\theta^*} (a - p_r - t\theta U)\,U\,d\theta \tag{3}$$

where $\theta^*$ is the angle formed by the line connecting the retailer and the center O and the line connecting the center O and the end-point of the retailer’s market area (point B1 in Fig. 1), thereby effectively representing the market area size. The manufacturer’s profit $Y_m$ is

$$Y_m = (p_m - t_A U - c_m)NQ - F_m \tag{4}$$

where $c_m$ and $F_m$ are the manufacturer’s marginal and fixed costs, respectively. The cost of transporting products between the manufacturer and each retailer is borne by the manufacturer, and $t_A$ is the transportation cost per mile. A different transportation system and economies of scale are assumed for these shipments, and thus $t_A < t$. $N$ represents the number of retailers in the retail market, which is expressed by Eq. 5.

$$N = \frac{2\pi U}{2\theta^* U} = \frac{\pi}{\theta^*} \tag{5}$$

To avoid the usual nonexistence problem associated with integers, $N$ is assumed to be a nonnegative real number.


3 Manufacturer’s Profit and the Market Situation When the Retail Market Is Under Free-Entry Competition

3.1 Retailer’s Price and Market Size in Equilibrium in a Free-Entry Retail Market

We now derive the equilibrium price and market size of the retail firm when the retail market is under free-entry competition. There are two conditions for a spatial free-entry equilibrium to be established in the retail market. These conditions are: (1) the retailer will price the product to maximize its profit; (2) new retailers will keep entering the market until the retailer’s profit becomes zero. The first condition is given by Eq. 6 by differentiating the retailer’s profit Eq. 2 with respect to the price $p_r$ and setting the result to zero.

$$\frac{dY_r}{dp_r} = \bigl(2(a - p_r) - t\theta^* U\bigr)\theta^* U + (p_r - p_m - c_r)\left(-2\theta^* U + 2U(a - p_r - t\theta^* U)\frac{d\theta^*}{dp_r}\right) = 0 \tag{6}$$

The second condition is shown by setting Eq. 2 to zero.

$$Y_r = (p_r - p_m - c_r)\bigl(2(a - p_r) - t\theta^* U\bigr)\theta^* U - F_r = 0 \tag{7}$$

In Eq. 6, $d\theta^*/dp_r$ shows the variation of the market length $\theta^*$ when a retailer lowers the price by one unit. When $p_r'$ is denoted as the neighboring rival’s price, the value of $d\theta^*/dp_r$ will depend on the price conjectural variation $dp_r'/dp_r$, which is forecast by the retailer in question. Equation 8 determines this value.

$$\frac{d\theta^*}{dp_r} = \frac{dp_r'/dp_r - 1}{2tU} \tag{8}$$

When dpr′/dpr is set to 1, 0, and −1, Löschian, Hotelling–Smithies (Nash), and Greenhut–Ohta type competition, respectively, are represented.1 Any other value of the price conjectural variation indicates a corresponding competition type. Thus, once the manufacturer informs the retailers of its mill price and the price conjectural variation is determined, the equilibrium values of pr and θ* can be derived for the given mill price under the specified competition type by solving Eqs. 6 and 7 simultaneously.

1 See Capozza and Van Order (1978), Greenhut and Ohta (1973), Ishikawa and Toda (1990, 1998), Schöler (1993), and Tirole (1988) for the characteristics of these equilibrium solutions.
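The simultaneous solution of Eqs. 6 and 7 can be sketched numerically. The following is a minimal illustration, not the author's computation: it assumes the parameter values quoted in Sect. 3.2, writes x for the market size θ*U and v for the conjectural variation, and uses a plain Newton iteration with a finite-difference Jacobian. The function names and the starting point are hypothetical choices.

```python
# Parameter values quoted in Sect. 3.2.
a, U, t, c_r, F_r = 20.0, 10.0, 1.0, 0.75, 10.0


def residuals(p_r, x, p_m, v):
    """Left-hand sides of Eqs. 6 and 7.

    x is the market size theta* U; v is the price conjectural
    variation dp_r'/dp_r. Both argument names are illustrative.
    """
    margin = p_r - p_m - c_r
    sales = (2.0 * (a - p_r) - t * x) * x            # retailer's sales
    # 2U (a - p_r - t x) dtheta*/dp_r, with Eq. 8 substituted
    boundary_term = (a - p_r - t * x) * (v - 1.0) / t
    eq6 = sales + margin * (-2.0 * x + boundary_term)
    eq7 = margin * sales - F_r
    return eq6, eq7


def solve_free_entry(p_m, v, p_r0, x0, tol=1e-10, max_iter=50):
    """Newton iteration on (p_r, x) with a forward-difference Jacobian."""
    p_r, x, h = p_r0, x0, 1e-6
    for _ in range(max_iter):
        f1, f2 = residuals(p_r, x, p_m, v)
        if abs(f1) + abs(f2) < tol:
            break
        g1, g2 = residuals(p_r + h, x, p_m, v)
        k1, k2 = residuals(p_r, x + h, p_m, v)
        j11, j21 = (g1 - f1) / h, (g2 - f2) / h      # derivatives w.r.t. p_r
        j12, j22 = (k1 - f1) / h, (k2 - f2) / h      # derivatives w.r.t. x
        det = j11 * j22 - j12 * j21
        p_r -= (f1 * j22 - f2 * j12) / det
        x -= (j11 * f2 - j21 * f1) / det
    return p_r, x


# Loeschian case (v = 1) at the mill price reported in the text.
p_r_eq, x_eq = solve_free_entry(p_m=11.97, v=1.0, p_r0=16.0, x0=0.4)
```

Started near the reported Löschian values, the iteration settles close to pr = 16.26 and θ*U = 0.399. The system can have more than one root, so the result depends on the starting point.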


3.2 Manufacturer's Profit and the Retail Market Situation in Free-Entry Equilibria

By changing the value of both the mill price and the price conjectural variation, and solving Eqs. 6 and 7 simultaneously, it is possible to obtain the equilibrium values of the retail price, market size, and profit-maximizing mill price, and then to derive the relevant profit of the manufacturer for every competition type established in the retail market. Since it is difficult to solve Eqs. 6 and 7 simultaneously by analytical means, a numerical computation is employed to obtain the equilibrium values. In the numerical computation, a = 20, U = 10, t = 1, tA = 0.43, cm = 1, cr = 0.75, Fm = 15, and Fr = 10 are assumed for the parameters in the equations. Using the numerical computation method, Fig. 2 shows the equilibrium values of the retail price pr, the market size θ*U, the profit-maximizing mill price p*m, and the relevant profit of the manufacturer Ym, and illustrates the relationships between them when the price conjectural variation changes from 1 to −3.05.

Fig. 2. Relationship between the price conjecture, the free-entry equilibrium, the profit-maximizing mill price, and the manufacturer's profit

Referring to the four curves shown in Fig. 2, we examine the relationships among price conjecture, the retailer's price and market size, the profit-maximizing mill price, and the manufacturer's profit. As the price conjecture decreases from 1 to −3, the retail price becomes lower, while the retailer's market first decreases, and then begins to increase. When the price conjecture is 0.92, the retailer's market is at its smallest. In accordance with the change in the retailer's market, the manufacturer's optimal mill price first decreases, and then increases. However, the manufacturer's profit increases monotonically until the price conjecture decreases to −3, at which point its profit reaches a maximum.

In these conditions, two equilibria and their related market structures attract attention. First, in the equilibrium established when the price conjecture is −3, the manufacturer's profit becomes the highest, i.e., Ym = 2522.8. The market structure in this equilibrium is indicated by the points M0, M1, M2, and M3; the mill price p*m = 11.36, the retail price pr = 12.72, and the market size θ*U = 1.224. In addition, in this equilibrium, the amount of goods sold by all the retailers becomes the highest, Q = 419.0. Thus, this equilibrium would be favorable for consumers as well as the manufacturer. Second, if the Löschian equilibrium is established, the manufacturer's profit becomes the lowest, Ym = 1468.7. The market structure in this equilibrium is described by points L0, L1, L2, and L3; the mill price p*m = 11.97, the retail price pr = 16.26, and the market size θ*U = 0.399.

Furthermore, the curves shown in Fig. 2 provide other interesting facts. (1) When the manufacturer's price is decided in the range from 10.69 (point pmD in Fig. 2) to 11.36 (point pmC), the manufacturer's profit can take different values even if the profit-maximizing mill price is the same. For instance, when the mill price is 11.36, the manufacturer's profit can be 2522.8 or 1575.0. (2) When the retailer's market size is determined in the range from 0.305 (point UD in Fig. 2) to 0.399 (point UC), the retailer's price can take different values even if the market size of the retailer is the same. For example, when the retailer's market size is 0.399, which emerges when the price conjecture is 0.73 or 1, the retailer's price can be 13.58 or 16.26. This fact might provide a rationale for retailers to sell the same kind of goods at different prices.

3.3 Features of Some Free-Entry Equilibria in the Retail Market

By examining the price delivered at the end-point of the market area of the retailer, features of some equilibria, which are not fully shown in Fig. 2, can be revealed more clearly. In Fig. 3, the horizontal axis represents the length of the retailer's market in equilibrium; the vertical axis indicates the relevant frontier price, FP, which shows the price delivered at the end-point of the retailer's market area. The curve FPC illustrated in Fig. 3 describes the frontier price in every equilibrium in the retail market.

Fig. 3. Features of some free-entry equilibria (vertical axis: frontier price FP; horizontal axis: market size θ*U; marked equilibria: Löschian, dpr′/dpr = 0.92, Hotelling–Smithies, and dpr′/dpr = −3)

Figure 3 shows the following facts. (1) The retailer's market size is not minimized in the Löschian equilibrium; it is smallest in the equilibrium established when the price conjecture is 0.92, where the retailer's market size θ*U is 0.305. Thus, in this equilibrium, the number of retailers is maximized (N = 103.0). On the other hand, the retailer's market is maximized in the equilibrium established when the price conjecture is −3. In this case, the number of retailers is minimized (N = 25.7). (2) The frontier price is the highest in the Löschian equilibrium (FP = 16.6). (3) The frontier price becomes the lowest (FP = 13.7) in the equilibrium established when the price conjectural variation is −0.1, which is very close to the Hotelling–Smithies equilibrium. The Greenhut–Ohta equilibrium does not show any special feature.

4 Profits of the Manufacturer and the Retailer in an Oligopolistic Retail Market

4.1 Profits of the Manufacturer and the Retailer When the Number of Retailers in the Market Is Arranged in Order to Maximize the Individual Retailer's Profit

It is assumed in this subsection that the retailers enjoy an oligopolistic situation in the retail market, and they can control the number N of the retailers in the retail market so as to maximize the individual retailer's profit. Each retailer is assumed to be able to set the price for a product to maximize the profit in the given optimal market size. With these assumptions, the profits of the manufacturer and the retailer are derived, and the relevant retail market situation is analysed.

Since the retailer's market is expressed as 2πU/N, the sales amount Q of the retailer is given by Eq. 9.

$$Q = 2\int_{0}^{\pi/N} (a - p_r - t\theta U)\,U\,d\theta \tag{9}$$

The retailer's profit Yr is shown by Eq. 10.

$$Y_r = (p_r - p_m - c_r)\Bigl(2\int_{0}^{\pi/N} (a - p_r - t\theta U)\,U\,d\theta\Bigr) - F_r \tag{10}$$

From Eq. 10, the profit-maximizing price of the retailer can be obtained as

$$p_r = \frac{1}{2}\Bigl(a + p_m + c_r - \frac{t(\pi/N)U}{2}\Bigr) \tag{11}$$

Substituting Eq. 11 for pr in Eq. 10, the retailer's profit Yr is rewritten as a function of the number N of the retailers and the mill price pm.

$$Y_r = \frac{(\pi/N)U}{2}\Bigl(a - p_m - c_r - \frac{t(\pi/N)U}{2}\Bigr)^{2} - F_r \tag{12}$$

From Eq. 12, the optimal number N* of the retailers that maximizes the individual retailer's profit is shown as Eq. 13, which is an increasing function of the mill price.

$$N^{*} = \frac{3\pi tU}{2(a - p_m - c_r)} \tag{13}$$

Substituting N* for N in Eqs. 11 and 12, the retailer's profit can eventually be expressed as a function of the manufacturer's mill price.

$$Y_r = \frac{4}{27t}(a - p_m - c_r)^{3} - F_r \tag{14}$$

The manufacturer's profit is given by Eq. 15 as a function of its mill price.

$$Y_m = \frac{2\pi U}{3}(p_m - c_m - t_AU)(a - p_m - c_r) - F_m \tag{15}$$

The profits obtained by Eqs. 14 and 15 are described by the curves Ym and Yr in Fig. 4, using the same parameter values as in the previous section. Here, it is very interesting to examine the frontier price, FP, in this oligopolistic retail market. The frontier price is given by Eq. 16.

$$FP = \frac{1}{2}\Bigl(a + p_m + c_r + \frac{3t(\pi/N^{*})U}{2}\Bigr) = \frac{1}{2}(2a) = a \tag{16}$$

As shown by Eq. 16, the price delivered at the end-point of the retailer's market always coincides with the maximum reservation price a. In this oligopolistic retail market, then, the sales amount of the good at the boundary of the retailer's market always becomes zero irrespective of the manufacturer's mill price. Now, since the profits of the manufacturer and the retailer vary in opposite directions in reaction to changes in the mill price, it seems reasonable that the mill price is determined by negotiations between the manufacturer and the retailers. Assuming that this case is realized, let us examine the resulting mill price. If the manufacturer has strong enough negotiating power to dominate the retailers, the manufacturer sets a profit-maximizing price, pm* = 12.28. The manufacturer's profit becomes 1003.9 and the retailer's profit becomes 40.2. On the other hand, when the retailers have strong negotiating power against the manufacturer, the mill price will be set at the lower bound price, pmL = 5.35, at which the manufacturer's profit is zero and the retailer's profit becomes 387.7. According to their negotiating power, the mill price will be determined in the range from the lower bound price pmL to the optimal price pm*.

Fig. 4. Profits of the manufacturer and the retailer when the number of retailers in the market is determined in order to maximize the individual retailer's profit (vertical axis: profit Y; horizontal axis: mill price pm; curves: Yr and Ym)

4.1.1 Profits of the Manufacturer and the Retailer When the Mill Price Is Determined in Order to Equate the Manufacturer's Profit with the Individual Retailer's Profit

In the mill price negotiations, there may be at least two reference prices by which the mill price will be determined: one reference mill price is the price, pmS, which will equate the manufacturer's profit with the individual retailer's profit, and the other is the price, pmT, which will maximize the sum of the profits of the manufacturer and all the retailers. First, supposing that the mill price is settled to equate the manufacturer's profit with the individual retailer's profit, let us derive the mill price pmS, and obtain the resulting profits of the manufacturer and the retailer. The mill price pmS is derived by combining Eqs. 14 and 15. Using the same values for the parameters, the mill price is obtained as pmS = 6.47, at which Eq. 14 equals Eq. 15. The profit of each firm becomes Ym = Yr = 298.98. The established retail market structure in this case is described by the values of the items in column A1 in Table 1.
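The closed forms in Eqs. 11 and 13 to 16 are easy to check against the figures quoted above. The sketch below is only an illustration under the parameter values of Sect. 3.2; the function names are hypothetical, and the profit-maximizing mill price is taken from the fact that Eq. 15 is quadratic in pm.

```python
import math

# Parameter values quoted in Sect. 3.2.
a, U, t, t_A = 20.0, 10.0, 1.0, 0.43
c_m, c_r, F_m, F_r = 1.0, 0.75, 15.0, 10.0


def n_star(p_m):
    """Eq. 13: retailer count that maximizes the individual retailer's profit."""
    return 3.0 * math.pi * t * U / (2.0 * (a - p_m - c_r))


def retail_price(p_m, n):
    """Eq. 11: profit-maximizing retail price for a given number of retailers."""
    return (a + p_m + c_r - t * (math.pi / n) * U / 2.0) / 2.0


def retailer_profit(p_m):
    """Eq. 14."""
    return 4.0 / (27.0 * t) * (a - p_m - c_r) ** 3 - F_r


def manufacturer_profit(p_m):
    """Eq. 15."""
    return (2.0 * math.pi * U / 3.0) * (p_m - c_m - t_A * U) * (a - p_m - c_r) - F_m


def frontier_price(p_m):
    """Retail price plus transport cost to the market boundary (Eq. 16)."""
    n = n_star(p_m)
    return retail_price(p_m, n) + t * (math.pi / n) * U


# Eq. 15 is a downward-opening quadratic in p_m, so its maximizer is the
# midpoint of the two prices at which the product term vanishes.
p_m_opt = ((a - c_r) + (c_m + t_A * U)) / 2.0
```

Evaluated at `p_m_opt`, the two profit functions come out close to the 1003.9 and 40.2 reported in the text, and `frontier_price` returns the reservation price a for any admissible mill price, as Eq. 16 states.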

Table 1. The economic and spatial situation in the oligopolistic retail market

          A1       A2       B1       B2       Lösch    Profit maximum
pm        6.47     8.69     9.25     5.30     11.97    11.36
Ym        298.9    734.3    1134.7   1261.4   1468.7   2522.8
Yr        298.9    164.6    53.1     49.2     0        0
TP        1401.6   1468.7   2269.4   2522.8   1468.7   2522.8
Pr        11.48    12.96    14.63    12.72    16.26    12.72
FP        20       20       16.1     13.94    16.66    13.94
Q         72.6     49.6     12.62    16.33    2.82     16.33
N*        3.69     4.46     21.39    25.66    78.7     25.66
TP/TFC    26.99    24.63    9.38     9.29     1.83     9.29
Q/MA      4.26     3.52     4.29     6.67     3.53     6.67

4.1.2 Profits of the Manufacturer and the Retailer When the Mill Price Is Determined in Order to Maximize the Sum of Profits of the Manufacturer and the Retailers

Let us derive the mill price pmT that maximizes the sum of the profits of all the firms. This mill price is obtained by adding Eq. 14 multiplied by N* to Eq. 15, differentiating the resulting expression with respect to the mill price, and setting the derivative equal to zero. The mill price is obtained as pmT = 8.69. When the mill price is 8.69, the sum of profits of the manufacturer and the retailers is 1468.7; the manufacturer's profit becomes Ym = 734.3, and the retailer's profit is Yr = 164.6. The market situation in this case is indicated in column A2 in Table 1. As shown here, the sum of the profits, 1468.7, coincides with the manufacturer's profit in Löschian equilibrium in the competitive retail market. It is interesting that the highest total profits generated in this oligopolistic market equal the lowest profit obtained by the manufacturer in equilibrium in the competitive market. It is also important that although these two markets produce the same profits, the market structures are considerably different. In Löschian equilibrium, huge fixed costs are required to generate the profits of 1468.7, and thus the profit per fixed cost is very low, i.e., 1.83, while in the oligopolistic market, the profit per total fixed cost is very high, i.e., 24.6, since small fixed costs are incurred to produce the same profits.

4.2 Profits of the Manufacturer and the Retailer in the Case Where the Number of Retailers in the Market Is Determined in Order to Maximize the Total Profits of the Retailers

This subsection examines the profits of the manufacturer and the retailer in the case where the number N of retailers in the oligopolistic market can be arranged in order to maximize the total profits of the retailers.

The total profits TYr of the retailers can be derived in a manner similar to that used in the above subsection.

$$TY_r = (p_r - p_m - c_r)\,N\Bigl(2\int_{0}^{\pi/N} (a - p_r - t\theta U)\,U\,d\theta\Bigr) - NF_r \tag{17}$$

As the profit-maximizing retail price is given by Eq. 11, by substituting Eq. 11 into Eq. 17, the total profit of the retailers can be rewritten as Eq. 18.

$$TY_r = \frac{\pi U}{2}\Bigl(a - p_m - c_r - \frac{t(\pi/N)U}{2}\Bigr)^{2} - NF_r \tag{18}$$

From Eq. 18, the optimal number N* of the retailers in this market is derived as Eq. 19 as a function of the mill price. It is interesting that in this oligopolistic market condition, the number of retailers becomes an increasing function of the mill price.

$$N^{*} = \frac{(2/3)^{1/3}B}{A^{0.5}\bigl(-9A^{0.5}D + 3^{0.5}(-4B^{3} + 27AD^{2})^{0.5}\bigr)^{1/3}} + \frac{\bigl(-9A^{0.5}D + 3^{0.5}(-4B^{3} + 27AD^{2})^{0.5}\bigr)^{1/3}}{2^{1/3}\,3^{2/3}A^{0.5}} \tag{19}$$

where A = 4Fr, B = 2(tπU)²(a − pm − cr), and D = t²(πU)³. The substitution of Eq. 19 for N in Eq. 18 provides the total profits of the retailers as a function of the mill price. The manufacturer's profit can easily be obtained as

$$Y_m = (p_m - c_m - t_AU)\,N\Bigl(2\int_{0}^{\pi/N} (a - p_r - t\theta U)\,U\,d\theta\Bigr) - F_m \tag{20}$$

By substituting Eqs. 11 and 19 for pr and N, respectively, in Eq. 20, the manufacturer’s proﬁt is shown as a function of the mill price. Figure 5 describes the total proﬁts of all the retailers and the manufacturer as a function of the mill price, using the same values for the parameters.
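Rather than evaluating the closed form of Eq. 19, the optimal number of retailers can be found numerically from the first-order condition of Eq. 18, which is the cubic A N³ − B N + D = 0. The sketch below is an illustration under stated assumptions: the coefficients are derived here directly from Eq. 18 (for t = 1 they coincide with those given in the text), Newton's method is started to the right of the largest root, and all function names are hypothetical.

```python
import math

# Parameter values quoted in Sect. 3.2.
a, U, t, t_A = 20.0, 10.0, 1.0, 0.43
c_m, c_r, F_m, F_r = 1.0, 0.75, 15.0, 10.0


def optimal_retailer_count(p_m, tol=1e-12):
    """Largest root of A N^3 - B N + D = 0 (first-order condition of Eq. 18).

    Assumes the cubic has a positive root, as it does for the mill
    prices discussed in the text.
    """
    c = a - p_m - c_r
    A = 4.0 * F_r
    B = 2.0 * t * (math.pi * U) ** 2 * c     # derived from Eq. 18
    D = t ** 2 * (math.pi * U) ** 3
    n = math.sqrt(B / A)                     # lies to the right of the largest root
    for _ in range(100):
        step = (A * n ** 3 - B * n + D) / (3.0 * A * n ** 2 - B)
        n -= step
        if abs(step) < tol:
            break
    return n


def total_retailer_profit(p_m, n):
    """Eq. 18."""
    c = a - p_m - c_r
    return (math.pi * U / 2.0) * (c - t * (math.pi / n) * U / 2.0) ** 2 - n * F_r


def manufacturer_profit(p_m, n):
    """Eq. 20 after substituting Eq. 11 and evaluating the integral."""
    c = a - p_m - c_r
    total_sales = math.pi * U * (c - t * (math.pi / n) * U / 2.0)
    return (p_m - c_m - t_A * U) * total_sales - F_m


n_b1 = optimal_retailer_count(9.25)
```

At pm = 9.25, the reference price reported in Subsect. 4.2.1, this gives roughly 21.4 retailers, and the two profit functions nearly coincide at about 1134.7, in line with column B1 of Table 1.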

Fig. 5. Profits of all the retailers and the manufacturer when the number of retailers is determined in order to maximize total profit of the retailers (vertical axis: profit Y; horizontal axis: mill price pm; curves: TYr and Ym)


4.2.1 Profits of the Manufacturer and the Retailer When the Mill Price Is Determined in Order to Equate the Manufacturer's Profit with the Total Profits of the Retailers

Observing the two curves in Fig. 5, the mill price may be decided by negotiations between the retailers and the manufacturer in a range from the profit-maximizing mill price pm* to the zero-profit price pmL. In this case, the mill price pmS that equates the manufacturer's profit with the total profits of all the retailers can be considered as a key reference price in the negotiations. Then, supposing that the mill price is settled at the price pmS in the negotiations, let us derive the profits of the manufacturer and the retailer in this oligopolistic market. This mill price, pmS, is derived by equating the manufacturer's profit with the total profits of the retailers, and solving the resulting equation with respect to the mill price, pm. Using the same values for the parameters, the mill price is obtained as pmS = 9.25. Based on this price, the manufacturer's profit and the retailer's profit are derived as 1134.7 and 53.1, respectively. The market situation in this case is shown in column B1 in Table 1.

4.2.2 Profits of the Manufacturer and the Retailer When the Mill Price Is Determined in Order to Maximize the Sum of Profits of the Manufacturer and the Retailers

In the mill price negotiations between the firms, the other key reference price is the price pmT that maximizes the sum of profits of the manufacturer and the retailers. The price pmT is obtained in a similar way to that used in Subsect. 4.1.2. The mill price is determined as pmT = 5.30. It is interesting that this mill price pmT is determined to be less than the lower bound price pmL, 5.33, in order to maximize the sum of profits. Based on this mill price, the sum of profits of all the firms becomes 2522.8, the manufacturer's profit becomes 1261.4, and the individual retailer's profit is 49.2. The resulting market situation in this case is shown in column B2 in Table 1. The total profits of 2522.8 correspond to the highest profit of the manufacturer in the equilibrium in the competitive retail market. It is very interesting that markets with different kinds of competition produce the same total profits.

The columns "Lösch" and "Profit maximum" in Table 1 show the market situations when the retail market is under Lösch equilibrium and under the equilibrium that maximizes the manufacturer's profit in the competitive retail market. In addition, the rows TP/TFC and Q/MA show the total profit per fixed cost and the retailer's sales amount per market area, respectively. These two ratios are considered to be the indicators of the economic efficiency and spatial efficiency of the retail market formation, respectively.


Using these indicators, therefore, it is possible to compare the efficiencies of formations in the competitive and oligopolistic retail markets. A comparison of the figures provided in Table 1 shows several interesting facts. When the number of retailers in the oligopolistic market is decided in order to maximize the individual retailer's profit, and the mill price is determined in order to equate the manufacturer's profit with the retailer's profit, the economic efficiency is the highest. The spatial efficiency becomes the highest in two cases. One is the case where the competitive equilibrium is established when the price conjectural variation is −3. The other is the case where the number of retailers in the oligopolistic market is decided in order to maximize the total profits of the manufacturer and the retailers, and the mill price is determined in order to maximize the sum of profits of all the firms.
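The two indicators can be recomputed from the other rows of Table 1. The sketch below assumes, as an inference that reproduces the tabulated ratios rather than a definition stated in the text, that total fixed cost is Fm + N·Fr and that a retailer's market area is 2πU/N; the function names are hypothetical.

```python
import math

# Parameter values quoted in Sect. 3.2.
U, F_m, F_r = 10.0, 15.0, 10.0


def profit_per_fixed_cost(total_profit, n):
    """TP/TFC, assuming total fixed cost = F_m + n * F_r."""
    return total_profit / (F_m + n * F_r)


def sales_per_market_area(q, n):
    """Q/MA, assuming each retailer's market area is 2*pi*U / n."""
    market_area = 2.0 * math.pi * U / n
    return q / market_area


# Column A1 of Table 1: TP = 1401.6, Q = 72.6, N* = 3.69.
economic_a1 = profit_per_fixed_cost(1401.6, 3.69)
spatial_a1 = sales_per_market_area(72.6, 3.69)
```

Under these assumptions the A1, Lösch, and B2 columns of Table 1 are reproduced to within rounding.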

5 Conclusions

We first analyzed the relationship between the manufacturer's profit and the competitive equilibria in the retail market using the price conjectural variation. The analysis identified the equilibrium in the retail market that maximizes the manufacturer's profit; this equilibrium is established when the price conjectural variation is −3. Second, the analysis showed that even if the market size of the retailer is the same, the retailer's price could take different values; even if the manufacturer sets the same price for a product, its profit can be different. Third, interesting characteristics of some competitive equilibria were revealed: the retailer's market is minimized in the equilibrium established when the price conjecture is 0.92; the frontier price is the highest in Löschian equilibrium, while it becomes the lowest in an equilibrium close to the Hotelling–Smithies equilibrium.

We then examined the relationship between the oligopolistic market and the profits of the manufacturer and the retailer. This examination showed two things. (1) When the number of retailers in the oligopolistic market is determined to maximize the individual retailer's profit, the highest total profits generated in this market correspond to the profit obtained by the manufacturer in Löschian equilibrium in the competitive market. Although these two retail markets provide the same total profits, market characteristics such as the retailer's price and market size are quite different; the economic efficiency achieved in the oligopolistic market is higher than that of Löschian equilibrium. (2) When the number of retailers in the oligopolistic market is arranged in order to maximize the sum of profits of the manufacturer and the retailers, the highest total profits derived in this market correspond to the highest profit gained by the manufacturer in the competitive market. These two markets, which are of different competitive nature, provide the same total profits, and their economic and spatial efficiency is the same.

References

Capozza D, van Order R (1978) A generalized model of spatial competition. Am Econ Rev 68:896–908
Greenhut ML, Ohta H (1973) Spatial configurations and competitive equilibrium. Weltwirtschaftliches Arch 109:87–104
Ishikawa T, Toda M (1990) Spatial configurations, competition and welfare. Ann Reg Sci 24:1–12
Ishikawa T, Toda M (1998) An application of the frontier price concept in spatial equilibrium analysis. Urban Stud 35:1345–1358
Mathewson GF, Winter RA (1983) Vertical integration by contractual restraints in spatial markets. J Bus 56:497–517
Schöler K (1993) Consistent conjectural variations in a two-dimensional spatial market. Reg Sci Urban Econ 21:765–778
Tirole J (1988) The theory of industrial organization. MIT Press, Cambridge

16. Redistribution Through Local Competition

Frederick Guy

Summary. We have developed a model with two kinds of shops (local and out-of-town) and two kinds of consumer (mobile and immobile). We assume that local shops operate in monopolistic competition, while the market structure for out-of-town shops is a stable oligopoly among large retail chains. We show that policies which raise the net cost (price plus consumer travel costs) of shopping out of town may cause a discontinuous drop in the price level in local shops. The price drop is accompanied by both the entry of new local shops and a reduction of excess capacity in local shops. We draw the following conclusions from the model. To the extent that local shops serve a poorer clientele, a rise in prices at out-of-town shops will have a progressive distributive effect if it results in the local price reduction predicted here. Moreover, the same measures have the potential to improve allocative efﬁciency through a combination of reductions in local excess capacity and the internalization of social and environmental costs of automobile use.

1 Introduction

We have come to think of local shops (shops situated for easy access by foot, bicycle, or bus, rather than car) as having high prices and small selections. We may use them when we run out of something, or because we value the esthetic and social experience of the downtown, the old high street, or the corner store; serious modern shopping, however, is elsewhere.

Department of Management, Birkbeck, University of London, London, UK


Of course, the dominance of large out-of-town shops has its costs. For some, the local shop is all there is. Without a car, shopping out of town may be prohibitively expensive, whether in terms of time or of cash. If those without cars are, on average, poorer than those with cars, a higher price level at local shops has distributional implications. Out-of-town shopping, with the attendant shift from foot to car, from neighborhood to anonymity, also visits well-known external costs on both the natural and social environments. Even acknowledging such factors, it is tempting to view an attachment to small neighborhood shops as sentimental and impractical in the twenty-ﬁrst century. Yet, we are all familiar with areas, in or near the center of even the largest cities, where clusters of small retailers offer goods at prices often below those available in suburban malls: Manhattan and London, in the author’s experience, both offer numerous examples of this. These shops presumably manage this despite high rents, and despite the awkward logistics of delivering goods to small shops in congested cities. These shops beneﬁt from trafﬁc generated both by other shops, and by commuting and tourism. Another way of putting this is that for many consumers, the marginal cost of travel to these shops is low, because the consumers are in town anyway. In this chapter, we develop a model with two kinds of shops (local and out-of-town) and two kinds of consumer (mobile and immobile). We assume that local shops operate in monopolistic competition, while competition among the companies running out-of-town shops (supermarket chains) is Cournot in the sense that prices are relatively stable. We show that policies that raise the net cost (price plus consumer travel costs) of shopping out of town may cause a discontinuous drop in the price level of local shops. The price drop is accompanied by both the entry of new local shops and a reduction of excess capacity in local shops. 
We draw the following conclusions from the model: to the extent that local shops serve a poorer clientele, a rise in prices at out-of-town shops will have a progressive distributive effect if it results in the local price reduction predicted here. Moreover, the same measures have the potential to improve allocative efﬁciency through a combination of reductions in local excess capacity and the internalization of social and environmental costs of automobile use.

2 The Model

There are N consumers in the local area, of whom some proportion, α, are immobile, and some (1 − α) are mobile. Mobile consumers have cars. Immobile consumers are restricted to nonautomobile transport (foot, bus, or bicycle; also, for the purposes of this chapter, home delivery of goods), which we assume makes out-of-town shopping infeasible. There are two kinds of shops: local and out of town. All consumers can reach local shops, while only mobile consumers can reach out-of-town shops. Call the number of local shops M. Local shops are in monopolistic competition equilibrium, with an equilibrium price P^L. Out-of-town shops are an oligopoly, with a price P^O. The behavior of the out-of-town shops is not modeled in this chapter, and P^O is treated as parametric.

Each mobile shopper j pays a net travel cost, cj, when shopping out of town. The variation in travel costs across mobile consumers may be understood as representing different personal circumstances: whether one already drives for work or the school run, for instance, or the walking distance to the local shops. For simplicity, travel costs are distributed uniformly across the interval [d, d + e], where d is the minimum travel cost and d + e is the maximum. The density of the distribution is given by 1/e at all points in the interval. Note that nothing in the model requires a particular sign for cj: for some consumers it may be less costly to drive to out-of-town shops than to walk, cycle, or take a bus to local shops.

Also for simplicity, mobile consumers are assumed to choose between doing all shopping locally and doing all shopping out of town. A mobile consumer j will shop locally if P^L < P^O + cj, and otherwise will shop out of town. If P^L − P^O > d + e, then P^L > P^O + cj for all mobile consumers, and they all shop out of town. Conversely, if P^L − P^O < d, all mobile consumers shop locally. Assuming linear demand, each consumer's demand is given by the relation

$$Q_j = a - bP \tag{1}$$

Let Q^L be the total demand faced by a representative local shop. The demand curve consists of three segments. Where P^L − P^O > d + e, and all mobile consumers shop out of town, local demand is

$$Q^L = \alpha(a - bP^L)(N/M) \tag{2a}$$

Where d + e > P^L − P^O > d, the proportion of mobile consumers who buy locally varies with P^L, so the local shop's demand curve is

$$Q^L = \Bigl[\alpha(a - bP^L) + \frac{P^O + (d + e) - P^L}{e}(1 - \alpha)(a - bP^L)\Bigr](N/M) \tag{2b}$$

Finally, where d > P^L − P^O and all consumers shop locally, the local shop's demand curve is

$$Q^L = (a - bP^L)(N/M) \tag{2c}$$
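The three segments of Eqs. 2a to 2c can be collected into one function of the local price. This is a minimal sketch: the function and argument names are hypothetical, and the numerical values in the example are invented purely for illustration (the chapter gives none).

```python
def local_demand(p_local, p_out, alpha, a, b, d, e, n_consumers, m_shops):
    """Demand faced by a representative local shop (Eqs. 2a, 2b, 2c)."""
    gap = p_local - p_out
    if gap >= d + e:
        share = 0.0                               # Eq. 2a: no mobile consumer shops locally
    elif gap <= d:
        share = 1.0                               # Eq. 2c: every mobile consumer shops locally
    else:
        share = (p_out + d + e - p_local) / e     # Eq. 2b: term (2b')
    per_consumer = a - b * p_local                # Eq. 1
    buyers = alpha + share * (1.0 - alpha)        # fraction of consumers buying locally
    return buyers * per_consumer * (n_consumers / m_shops)


# Hypothetical values chosen only to exercise the three segments.
params = dict(p_out=4.0, alpha=0.3, a=10.0, b=1.0, d=1.0, e=2.0,
              n_consumers=1000, m_shops=20)
q_high_price = local_demand(7.5, **params)   # first segment (immobile shoppers only)
q_mid_price = local_demand(6.0, **params)    # middle segment
q_low_price = local_demand(4.5, **params)    # third segment (everyone shops locally)
```

Because the share of mobile consumers equals 0 at P^L − P^O = d + e and 1 at P^L − P^O = d, the function is continuous at both joins; only its slope changes there.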


Equation 2b requires some explanation. The term

$$\frac{P^O + (d + e) - P^L}{e} \tag{2b′}$$

represents the proportion of mobile consumers who shop locally, a proportion which changes as P^L changes. When P^L = P^O + (d + e), the lowest local price at which all mobile consumers shop out of town, the numerator and the proportion both equal zero. When P^L = P^O + d, the local price below which all mobile consumers shop locally, the proportion is 1; we can see this by substituting P^O + d for P^L in the numerator. Under the assumption that the distribution of costs is uniform within the [d, d + e] range, the proportion of those shopping locally is, within the relevant range, a linear function of P^L. However, under the assumption that demand from those consumers who shop in a given area is linear, the demand curve faced by the representative local shop is not linear in this range, because a change in price causes both a linear change in the proportion shopping locally, and a linear change in the quantity purchased by each consumer; the interaction of these gives us a quadratic in P^L.

We differentiate Eqs. 2a, 2b, and 2c with respect to P^L. When d + e < P^L − P^O (i.e., when all mobile consumers shop out of town), the slope of the demand curve faced by the local shop is

$$\frac{dQ^L}{dP^L} = -\alpha b(N/M) \tag{3a}$$

When d + e > P^L − P^O > d, so that the proportion of mobile consumers who buy locally varies with P^L, the slope of the demand curve is

$$\frac{dQ^L}{dP^L} = -\Bigl\{\alpha b + \frac{1 - \alpha}{e}\bigl[(a - bP^L) + b(P^O + d + e - P^L)\bigr]\Bigr\}(N/M) \tag{3b}$$

Finally, where d > P^L − P^O, so that all consumers shop locally, the slope of the local shop's demand curve is

$$\frac{dQ^L}{dP^L} = -b(N/M) \tag{3c}$$

The expressions above apply to segments of a demand curve (Fig. 1). Equation 3a gives us the slope of the left-most segment, and Eq. 3c that of the right-most. Both are linear and, since α < 1, the segment on the left (immobile shoppers only) is steeper. Equation 3b gives the slope of the middle segment, which changes with P^L. Since P^O + d + e − P^L > 0 in the domain of this segment, we can see that its slope is negative throughout. The second derivative of Eq. 2b is

$$\frac{d^{2}Q^L}{(dP^L)^{2}} = \frac{1 - \alpha}{e}\,2b(N/M) > 0 \tag{4}$$

so the central segment is convex to the origin. At the left-hand (upper) end of the middle segment, where P^O + d + e − P^L approaches zero, its slope approaches

$$-\Bigl\{\alpha b + \frac{1 - \alpha}{e}\bigl[a - bP^L\bigr]\Bigr\}(N/M) \tag{3b′}$$

which is flatter than the first segment's −αb(N/M); the curve is kinked where these two segments join, and remains convex to the origin. At the other end of the middle segment, where P^O + d + e − P^L approaches e, the slope approaches

$$-\Bigl\{b + \frac{(1 - \alpha)(a - bP^L)}{e}\Bigr\}(N/M) \tag{3b″}$$

which is flatter than the third segment's slope of −b(N/M). Thus, the curve is kinked here, too, but this time the kink is concave to the origin.

Fig. 1. Local monopolistic competition equilibrium with all mobile consumers shopping out of town (vertical axis: P^L; horizontal axis: Q^L; curves: demand DD′ and LRAC)

3 Comparative Statics Assuming free entry and exit by local shops with u-shaped long-run average cost curves, the monopolistic competition equilibrium may lie on any of the three segments of the curve. We are particularly interested in the question

302

F. Guy

of whether the equilibrium lies on the ﬁrst segment or the middle one. Figure 1 shows the ﬁrst of these, the low volume, high-price equilibrium for a representative local shop. From Eq. 2b, it should be clear that changes in PO and d have the same effect on the position of the middle segment of the local shop’s demand curve; that they have the same effect reﬂects the fact that, for the purposes of this model, both are simply changes in the net cost of shopping out of town. We study this effect with reference to an increase in d, which is to say a positive shock to the out-of-town travel costs of mobile consumers. Differentiating Eq. 2b with respect to d gives us dQL (1 − α )(a − bP L ) (N / M) > 0 = dd e

(5)

So an increase in travel costs raises local demand; since the second derivative is zero, we see that under the assumptions of the model this rise is linear. Turning to the effect of a change in d (or $P^O$) on the slope of the demand curve, we differentiate Eq. 3b with respect to d to obtain

$$\frac{\partial^2 Q^L}{\partial d\,\partial P^L} = -b\,\frac{1-\alpha}{e}\,(N/M) < 0 \qquad (6)$$

Thus, a positive shock to d not only shifts the curve out, but also ﬂattens it. From Eq. 4 we see that a shock to d has no effect on the curvature of the curve. The shift in the local shop’s demand curve following a positive shock to d is shown in Fig. 2 as the shift from DD′ to DD″.
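A quick numerical check of Eqs. 5 and 6 is sketched below. It assumes that mobile consumers' travel costs are uniform on [d, d + e], so that the share of them shopping locally is (P^O + d + e − P^L)/e clipped to [0, 1], and that each consumer has linear demand a − bP^L. This demand form and all parameter values are illustrative assumptions, not figures from the chapter.

```python
# Hypothetical parameter values, chosen only for illustration.
P_OUT = 2.0          # out-of-town price P^O
E = 2.0              # width e of the travel-cost distribution
ALPHA = 0.4          # share of immobile consumers
A, B = 10.0, 1.0     # individual linear demand a - b*P
N_PER_SHOP = 100.0   # consumers per local shop, N/M


def local_demand(p_local, d):
    """Demand facing one local shop (assumed form of Eq. 2b, clipped at the kinks)."""
    mobile_share = min(1.0, max(0.0, (P_OUT + d + E - p_local) / E))
    per_consumer = max(0.0, A - B * p_local)
    return (ALPHA + (1.0 - ALPHA) * mobile_share) * per_consumer * N_PER_SHOP


def shift_in_d(p_local, d, h=1e-4):
    """Central-difference estimate of dQ/dd (left-hand side of Eq. 5)."""
    return (local_demand(p_local, d + h) - local_demand(p_local, d - h)) / (2 * h)


def slope_in_p(p_local, d, h=1e-4):
    """Central-difference estimate of dQ/dP^L."""
    return (local_demand(p_local + h, d) - local_demand(p_local - h, d)) / (2 * h)


def cross_effect(p_local, d, h=1e-3):
    """Central-difference estimate of d2Q/(dd dP^L) (left-hand side of Eq. 6)."""
    return (slope_in_p(p_local, d + h) - slope_in_p(p_local, d - h)) / (2 * h)


def eq5(p_local):
    """Right-hand side of Eq. 5."""
    return (1.0 - ALPHA) * (A - B * p_local) / E * N_PER_SHOP


def eq6():
    """Right-hand side of Eq. 6."""
    return -B * (1.0 - ALPHA) / E * N_PER_SHOP


if __name__ == "__main__":
    p, d = 4.0, 1.0  # a point inside the middle segment: d < p - P_OUT < d + E
    print("shift:", shift_in_d(p, d), "vs Eq. 5:", eq5(p))
    print("cross:", cross_effect(p, d), "vs Eq. 6:", eq6())
```

The finite-difference estimates should agree with the closed forms only at prices strictly inside the middle segment; at the kinks the one-sided slopes differ, so the check is deliberately evaluated away from them.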

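[Figure 2: local shop demand curves DD′ and DD″ and the LRAC curve, with $P^L$ on the vertical axis and $Q^L$ on the horizontal axis; kinks are marked at $P^L - P^O = d_1$, $d_1 + e$, $d_2$, and $d_2 + e$.]

Fig. 2. Rise in minimum travel cost for mobile shoppers from d1 to d2 shifts the local demand curve, creating profit opportunities for local shops at lower price levels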

If the initial equilibrium is in the first segment (only immobile consumers purchase from local shops), an outward shift and flattening of the second segment will affect the equilibrium for the representative local shop only if it raises the second segment far enough that it crosses the shop’s average cost curve. When this happens, the change is discontinuous, with the equilibrium tipping from high prices and low volumes to low prices and high volumes in local shops.

The dynamics of the tipping process are illustrated in Figs. 2 and 3. The outward shift of the demand curve makes positive profits available to shops charging lower prices (Fig. 2). New local shops will then enter. The new entry divides local trade among more shops, shifting the demand curve for the representative local shop to the left, until a new equilibrium is reached: this is the shift, shown in Fig. 3, from DD″ to D*D**. The entry of additional local shops will, of course, raise the price elasticity of demand, further flattening all segments of the demand curve. This change is reflected in neither our equations nor our figures, but incorporating it would only amplify our results. The new equilibrium must be at a lower price level, due to the shape of the long-run average cost curve. Moreover, the representative local shop is operating on a larger scale, and there are more local shops. Thus, an increase in the costs associated with out-of-town shopping by mobile consumers benefits immobile consumers.

Within the range $d + e > P^L - P^O > d$, further rises in the cost of out-of-town shopping would lead to further reductions in local prices. Unlike the step-change illustrated above, however, here we will see incremental price reductions in response to incremental cost increases.
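The tipping process can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the demand form (uniform travel costs on [d, d + e], linear individual demand), the cost curve LRAC(Q) = F/Q + c + kQ, and every parameter value. The numbers are chosen so that, with N/M = 100, the first segment is tangent to the assumed LRAC at a price of 6.25 and an output of 150, which serves as the initial high-price equilibrium.

```python
# All numbers are hypothetical and chosen so that the initial equilibrium
# lies on the first (immobile-only) segment of the demand curve.
P_OUT, E, ALPHA = 2.0, 2.0, 0.4   # out-of-town price, spread of travel costs, immobile share
A, B = 10.0, 1.0                  # individual linear demand a - b*P
FIXED, C, K = 675.0, 1.0, 0.005   # assumed U-shaped LRAC(Q) = FIXED/Q + C + K*Q


def local_demand(p, d, n_per_shop):
    """Demand facing one local shop; n_per_shop plays the role of N/M."""
    mobile_share = min(1.0, max(0.0, (P_OUT + d + E - p) / E))
    return (ALPHA + (1.0 - ALPHA) * mobile_share) * max(0.0, A - B * p) * n_per_shop


def margin(p, d, n_per_shop):
    """Price minus long-run average cost at the quantity demanded."""
    q = local_demand(p, d, n_per_shop)
    return p - (FIXED / q + C + K * q) if q > 0 else float("-inf")


def best_middle_price(d, n_per_shop, steps=2000):
    """Grid search for the highest-margin price on the middle segment."""
    lo = P_OUT + d
    grid = [lo + E * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda p: margin(p, d, n_per_shop))


def entry_equilibrium(d, n_start=100.0, tol=1e-6):
    """Bisect on N/M until the best middle-segment margin is (numerically) zero.

    Assumes the margin is negative at N/M = 1 and positive at n_start,
    i.e. that the travel-cost shock has already created a profit opportunity.
    """
    n_lo, n_hi = 1.0, n_start
    while n_hi - n_lo > tol:
        n_mid = 0.5 * (n_lo + n_hi)
        if margin(best_middle_price(d, n_mid), d, n_mid) > 0:
            n_hi = n_mid   # still profitable: more shops enter, N/M falls
        else:
            n_lo = n_mid
    p = best_middle_price(d, n_hi)
    return n_hi, p, local_demand(p, d, n_hi)


if __name__ == "__main__":
    for d in (1.0, 2.0):
        p = best_middle_price(d, 100.0)
        print(f"d={d}: best middle-segment margin = {margin(p, d, 100.0):.3f}")
    print("after entry (N/M, price, output):", entry_equilibrium(2.0))
```

Under these assumed numbers the middle segment lies below the cost curve when d = 1 and rises above it when d = 2, and entry then settles at a lower price and a larger output per shop than in the initial equilibrium, which is the qualitative pattern described in the text. A grid search and bisection are used only to keep the sketch short; they are not a claim about how the equilibrium should be computed in general.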

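[Figure 3: demand curves DD″ and D*D** and the LRAC curve, with $P^L$ on the vertical axis and $Q^L$ on the horizontal axis; the new equilibrium is marked at price $P^L_2$ and quantity $Q^L_2$.]

Fig. 3. Entry of new local shops shifts the demand curve for typical local shops to the left. New monopolistic competition equilibrium at lower price ($P^L_2$) and higher capacity utilization ($Q^L_2$) for the typical local shop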

4 Policy Implications

Our model assumes that among the consumers in an area served by one set of competing local shops, some have cars and some do not. The distributional implications which we will draw from the model rest on the further assumption that those who have cars are, on average, richer than those who do not. This entails an assumption of income heterogeneity within a local urban market, in the sense that within that market high- and low-income groups co-exist. It thus differs from models of the Alonso (1964) type, in which a city’s residents are sorted and segregated spatially into internally homogeneous wealth or income groups, with the poorest in the center and so on. Our model is a reasonable approximation of many European, East Asian, and Australasian cities, if not of many North American ones.

Bearing these assumptions in mind, we see that in the case where the local equilibrium tips from the first segment of the demand curve to the middle one, our model has an important distributional implication: poor consumers become better off in the market where the tipping occurs, because prices in the local shops they depend on fall. Various policies, discussed below, may cause this tipping (or the reverse) to occur.

To draw any practical conclusions from the model, however, we need to recognize that no urban area has a single local market: each urban area has more than one, and some have a great many. Even if we assume homogeneity in the out-of-town market and within each local market in some urban area, there is no reason to expect that the demand curves and cost curves, together with travel costs for mobile consumers from that market, will all be the same in each local market. Therefore, a positive shock to travel costs and/or out-of-town prices which affects the entire urban area may tip some local markets from a high-price to a low-price equilibrium, while leaving others unaffected.
In any that are not affected, immobile consumers experience no change and mobile consumers bear higher costs.

In this light, it might seem that securing redistribution by raising the net cost of out-of-town shopping would be quite costly in terms of allocative efficiency, and that if redistribution were our goal, we would be better off turning to taxes and cash transfers. This may, in fact, be the case, but before concluding that it is, we should revisit the following points.

First, the tipping is discontinuous: at some point, a small change in the net cost of shopping out of town triggers a step change in local prices. An implication of this discontinuity is that the rise in net out-of-town costs which is necessary to gain the benefits of tipping might not be very large.

Second, in markets which tip to the low-price equilibrium, excess capacity is reduced. In the short run this is presumably mirrored by increased excess capacity in out-of-town shops; the reduction in local excess capacity is, however, a long-run equilibrium condition. In cases where the model presented here is relevant, a substantial proportion of the difference between out-of-town and local prices will be due, not to differences in logistical efficiency inherent in the two modes of delivery, but to high excess capacity in local shops.

Third, there are policies which affect the relative cost of shopping out of town and which can be calibrated to a particular local area, such as parking controls.

Finally, the progressive redistribution occurs as a result of a process which also reduces the externalities from motor vehicle traffic, notably carbon emissions (Ruth 2006), and which produces positive social externalities by re-invigorating local shopping districts, one of the elusive desiderata of generations of urban policy (Evans 1999). Indeed, an explicit policy of this nature, which restricts the development of out-of-town retail centers for exactly these reasons (Policy Planning Guidance 6, or PPG6) has recently been introduced in the UK (Office of the Deputy Prime Minister 1999). Our model provides a justification for, and explanation of, such a policy, by showing that it has distributional consequences.

It may be useful to review a range of transport and land-use policies in terms of which equilibrium they favor for local shops. In general, policies which reduce the cost of automobile transport to out-of-town shops favor the high-price, low-volume local equilibrium. Such policies include measures which reserve parking spaces for residents, whether on-street, or by allowing (or requiring) the provision of off-street resident parking: ease of coming and going in a car is a reduction in the cost of travel by automobile. Moreover, priority parking for residents may come at the expense of parking for local shops.
Although driving to local shops is not addressed explicitly in the model presented here, some provision for customer parking at local shops can be a factor in competing for mobile consumers, to the benefit of immobile consumers in the area. These issues have long been understood by those concerned with the survival of neighborhood and town-center retail businesses. The point made here is that in addition to the survival of such businesses, the prices charged by these businesses are at issue.

Similar issues arise in connection with the control of traffic congestion. Measures to manage traffic congestion often focus on reducing automobile trips to town centers (Green et al. 1977; Quinet and Vickerman 2004; Atzema et al. 2005). The points to be aware of here are: (i) to the extent that what we are calling local shops are located in areas targeted for congestion reduction, reducing congestion can have the effect of reducing (for mobile consumers) the relative cost of travel to out-of-town shops; (ii) even outside of the areas subject to congestion-reduction policies, if these policies succeed in reducing overall road traffic, the effect may be a reduction (again, only for mobile consumers) of the cost of traveling out of town.

On the other hand, traffic management policies which focus not on congested areas per se, but on reducing the overall level of automobile traffic within a given region, would in many cases raise the relative cost of shopping out of town. Again, this issue is well understood by retailers, such as those in and around the congestion-charging zone of central London. Policy interventions on behalf of such businesses have, understandably, focused on the problem of their survival, on their availability to immobile consumers, and on their employment levels. Again, the contribution of this chapter is to show that the same factors that threaten the survival of such businesses can raise the prices they charge.

5 Limitations and Future Research

The model presented here is highly stylized: there are two types of consumer and two types of shop; for mobile consumers, an all-or-nothing choice between local and out-of-town shopping; a uniform distribution of travel costs across mobile consumers; no choice beyond the local shops for immobile consumers; and a parametric out-of-town price. We believe that this simple model does the job of supporting our central arguments. First, that in a local monopolistically competitive market with one class of consumers who have an outside option and another class who do not, a change in the net costs of using the outside option can bring about a discontinuous adjustment in the local price/capacity-utilization equilibrium. Second, that if the consumers without the outside option are also poorer than those with the outside option, then the change in the net costs of using the outside option has distributional consequences.

For an assessment of the empirical relevance of the model, and for an estimated magnitude of the welfare effects (including effects on the wealthier/mobile consumers, not addressed here), we would need a more complete and flexible specification. The binary mobile–immobile distinction made in the present model seems the least problematic, since it is based on the fact of access (or not) to an automobile. On the other hand, the distribution of travel costs within the mobile group could be modeled in a number of different ways: are the results affected if costs are distributed logistically, say, or in a Poisson distribution? Is there an empirical basis for choosing a distribution of costs? The linearity of demand might also be relaxed. An explicit model of the out-of-town retail oligopoly and conditions under which its pricing might respond to those of local shops is also needed, along with a formalization of the welfare model.

The present model is silent on the question of product variety; intuitively, tipping to the low-price local equilibrium, involving as it does both higher per-retailer sales volumes and the entry of new retailers, should lead to increased variety. Would this be borne out in a Dixit–Stiglitz (1977)-type variant of the present model?

The model should be empirically relevant where two conditions apply: first, where non-automobile-oriented shops are located in a neighborhood in which both car-owning and non-car-owning households are a significant presence; second, where these retailers are close enough to the tipping point hypothesized here that small changes in the real cost of shopping out of town could cause a step-change in equilibrium prices and capacity utilization. For the hypothesized distributional effect to be important, the neighborhoods in question would have to be significantly heterogeneous as regards household wealth, and have wealth positively correlated with car ownership. While the model appears to be applicable to many European, East Asian, and Australasian cities, the effects predicted in the model need empirical verification and measurement.

Acknowledgments. I am grateful to Philip McCann for his invaluable advice and encouragement, and to Maureen Kilkenny for her comments on an early version of this chapter.

References

Alonso W (1964) Location and land use. Harvard University Press, Cambridge
Atzema O, Rietveld D, Shefer D (2005) Regions, land consumption, and sustainable growth. Edward Elgar, Cheltenham
Dixit A, Stiglitz J (1977) Monopolistic competition and optimum product diversity. Am Econ Rev 67:297–308
Evans AW (1999) The land market and government intervention. In: Cheshire P, Mills ES (eds) Handbook of regional and urban economics. North-Holland, Amsterdam
Green DL, Jones DW, Delucci MA (1977) The full costs and benefits of transportation. Springer, Heidelberg
Office of the Deputy Prime Minister, UK (1999) Policy Planning Guidance 6
Quinet E, Vickerman RW (2004) Principles of transport economics. Edward Elgar, Cheltenham
Ruth M (2006) Smart growth and climate change. Edward Elgar, Cheltenham

Subject Index

a accumulation of deﬁcits 123 active ﬁscal policy rules 113 active monetary policy rule 113 Adaptive inﬂation climate 3 Adjustment costs 55 agglomeration 220 aggregate government debt 112 aggregate wealth 111 area-base model 249

b balance of payments adjustment process 107, 121 balance of payments 102, 110 Barro 130 Bayesian spatial autoregressive model 188 Bayesian spatial Durbin model 189 Bayesian spatial error model 186, 187 Benhabib 134 bialternate product 163, 167 bifurcation value 137 bounded rationality 3 budget dynamics 123 budget equations 109 business purposes 231

c Cagan money demand function 114 Cagan-type real-money demand function 111

capital account 104 capital accumulation equation 130, 131, 135, 140 capital ﬂows 103 capital mobility 105 car commuter proportion 235 car-commuter free-parking taxation 238 central bank 103 centrifugal forces 123 characteristic equation 152 charge-collection costs 228 city size 265, 267 clustering 209 Cobb–Douglas utility function 147 coefﬁcients 116 commercial parking market 229 common pool resources (CPRs) 240 commuter railway 265 commuters’ free or employer-subsidized parking 237 commuting 229 competition 286 competitive market 295 competitive retail market 294 congestion charge systems 238 constant returns to scale 267 consumers 284 cooperation 209, 214 corridor stability 130, 135, 136, 141 cost of replanting and maintenance 243 creativity 208 criteria of simple Hopf bifurcations 169

criterion of simple Hopf bifurcations 161, 166, 171 critical value of the delay 156 cumulative instability 121 current account 103 current account surpluses 110

d 3D dynamics 115, 117 data 213 database 211 debt 112 debt accumulation 83 debt–creditor relationship 125 delay in production 145 delayed system 152 demand curve for car commuting 235 dependent variables 254 destabilizing effect 151 destabilizing feedback channel 120 differentiation coefﬁcient of probability 257 discontinuous policy function 55 disposable income 105, 113 distance elasticity of the cost 271 distribution, real income 298, 304 diversity 220 Domar 129, 130 domestic economy 124 domestic wealth 104 dynamic model of renewable resources and population 147 dynamics of temporary equilibrium 148

e economic efﬁciency 294, 295 economic externality 252 economic view 130 economy of scale 279 education production function 176, 180–182, 184, 186, 189, 190, 193, 195, 200 efﬁcient Pareto conditions 253 emissions in urban transportation 265 employee demand 234

employee parking spaces 232 employees 236 employer 236 endogeneity 184, 185 energy conservation 265 entrepreneurship 208 environmental costs 297 environmental degradation 145 environmental externalities 298, 305 environmental issues 265 EPCP income 132, 136, 137 establishment 211 expectation formation 83 expected per-capita permanent income see also EPCP income 131 explanatory variables 254 explicit costs 234 external debt position 101 externalities 178, 179

f face-to-face 207 fertility function 148 ﬁnance 228 ﬁnancial stocks 102 ﬁscal and monetary policy 117, 123 ﬂow budget equations 102 foreign economy 124 foreign exchange market policies 109 Forest Law in Japan 240 forest policy in Japan 245 forest protection system 239 free-entry competition 286 free-entry equilibrium 286 Friedman 131, 132 frontier price 288, 289 Fuller criterion 161, 164, 166, 170

g Gabisch 134 generalized Tobin model 170, 171 geographical 211 global dynamics 55 government budget constraint 105, 120, 123 government budget restriction 100 government debt 121

government expenditure 122 government expenditure shock 107 Greenhut-Ohta equilibrium 289 growth cycle 29 Guckenheimer 134, 140, 142

h Harrod 129, 130 Harrod’s knife-edge 29 harvesting rotation age 250 high-dimensional dynamic Keynesian model 83 high-technology 210 Hirsch 136, 139 Holmes 134, 140, 142 Hopf bifurcation 29, 133, 134, 137, 168 Hopf bifurcation theorem 133, 140, 142 Hopf bifurcations 165, 166 Hopf cycle 135–138, 140 Hopf cycles 131 Hotelling-Smithies equilibrium 289, 295 hysteresis 136, 140

i imperfect competition 29 independence from irrelevant alternatives 263 inner city of Stockholm 235 innovation 206, 223 instability 115, 124 integrated dynamics 114 interest-rate elasticity 116 inter-ﬁrm 214 IS-restriction 109

j Jansson and Wall (2002) Stockholm study 227 Jansson and Wall study 229 jump variable technique 100, 106, 107, 115, 118

k Keynesian legacy 83 kinked equilibrium 265, 267

knife-edge and bifurcation 40 knowledge spillovers 206 Kuznets 138

l labor market 233 labor mobility 210 land-use decision model 255 land-use policies 305 Leijonhufvud 130, 134 Leijonhufvud’s 135 Liénard–Chipard criterion 166 Lines 136 Liu criterion 165, 166, 170 LM curve 109, 111 local 215 local asymptotic stability 117 local referendum 228 local unit 212 locally asymptotically stable 119, 152 location 207 logistic function 149 long-run dynamics 154 Lorenz 134, 135 Löschian equilibrium 289, 292, 295 loss of stability 156 Lotka–Volterra predator–prey model 149

m Maastricht Treaty 121 Malthusian population dynamics 148 Mankiw 130 man-made forest 241 manufacturer 284 manufacturer’s mill price 285 manufacturer’s proﬁt 291, 292 market size 287, 288 market structure 288 maximum reservation price 284, 290 median voter 178, 179 Medio 136 mill price 293, 294 mill price negotiations 294 Miyao 134

model for small island 146 monetary shock 106 money supply 120 money supply adjustment rule 101 monopolistic competition equilibrium 267 monopolistic competition 267, 298, 299, 301, 303 monopoly equilibrium 267 motorization 227 multinominal logit model 254, 257 multiple steady states 55 Mundell–Fleming–Tobin model 103

n Nash equilibrium 251 nation-wide forest plan 246 natural forest 241 nearly constant propensity 138 Nearly Constant Propensity to Consume 137 negative externality 253 negotiating powers 291 neoclassical aggregate model of growth see also “S–S model” 130 neoclassical growth model 140 neoclassical steady state 130 neoclassical view of the economy see also “economic view” 140 nested logit model 261 net salary 230 new Keynesian theory 118 New-Keynesian Phillips curve 3 nominal adjustments 118 nonexcludability 240 nonlinear dynamic system 154 nonlinearities 108 nonproﬁt organizations 230 nonrandom 232 number of retailers 285, 293 numerical simulations 155

o oligopolistic 208 oligopolistic market 292, 294, 295 oligopolistic situation 289 OLS methods 254 omitted variable bias 187, 188, 199

open economies 108 open market operations 103 optimal market size 289 oscillatory dynamics 151 outside good 267 Owase 134, 135 ownership of land 277

p Pareto equilibrium 253 park and ride 265 parking 305 parking cost 235 parking policies 237 parking taxation 228 perfectly open economies 101, 119 periodic paths see also “Hopf cycle” 131 permanent component of consumption 132 permanent implementation of congestion charges 228 permanent income hypothesis see also “PI hypothesis” 130 permanent income hypothesis 140 Phillips curve 113 PI hypothesis 130 policy rules 124 politicians 236 pooled data 214 population growth 145 positive externality 253 price conjectural variation 286, 287 price conjecture 288 private and government wealth 105 private disposable income 110, 114 private ﬁrms 230 private schools 180, 181, 185, 190, 199, 200 probability of choice 255 probit 213 process 213 process innovation 40 product 213 product cycle 208 product innovation 40 proﬁciency test 182, 190, 192, 195 proﬁt-maximizing mill price 287, 288 proﬁt-maximizing retail price 293

public and private schools 177 public authorities 230 public benefits of forest 240 public debt 110 public regulation 279 public schools 180, 181, 185, 190, 199, 200 purchasing power parity condition 100 Putty–clay technology 29

q quality competition 242 quantity demanded 284

r R&D 210 R&D cooperation 223 real interest 113 real interest ﬂows 104 real rate of return 106 real wealth 106 redistribution 304 regional forest plan 246 relations 220 rest of the world 124 restriction on property rights 247 restrictions on the protected forest 248 retail market formation 294 retail market structure 283, 291 retail price 287 retailer’s proﬁt 294 retailers 284 revitalization of the forestry 244 rise and fall of Easter Island 146 rivalry 240 road pricing 229, 238 Robinson 136 Romer 138 Rosser 134 Routh–Hurwitz criterion 161, 162, 164, 166 rush-hour trafﬁc 237

s Sala-i-Martin 130 scale economies 182 scale economy 267

school 233 school competition 176, 177, 180, 181, 190, 193, 195, 199, 200 school inputs 180, 182, 192, 193, 195 school vouchers 176, 199, 200 schooling production 178, 179 Shaefer harvesting production 147 simple Hopf bifurcation 161, 167–170 simple Hopf bifurcations 165 simulations with a production delay 154 size 215 Smale 136, 139 social optimum 267, 277 social planner 252 Solow 129, 137 spatial efﬁciency 294, 295 spatial retail market 283 spatial statistics 177, 185–189, 192, 193, 199 specie-ﬂow mechanism 101, 107 spillovers 178, 179, 185, 188, 189, 192, 194, 195, 199, 200 S–S model 131 S–S model see also “neoclassical growth model” 130 stability 119 stability criteria 163 stability criterion 161, 162, 166 stability index 137, 142 stability properties 115 stabilization policy 83 stable dynamics 156 stationary points 108 steady state 118, 119, 121 stock-ﬂow relationships 100 strictly enforced 231 stumpage value 240 subsidies 265 subsidize the railway sector 281 subsidy 267 suburbs 230 Survey of Professional Forecasters 3 surveys 221 Swan 129, 137

t tax policy 101 temporary equilibrium 148

Temporary Measures Law for the Development of Protective Forest 245 tolling 228 total wealth 112 trafﬁc congestion 237 transitory component of consumption 132 transparency 209 transport cost per mile 284 transport infrastructure investment 237 travel cost 299 travel costs 298, 302, 304, 306 turnover 215 twin deﬁcits 102, 116 Two-Country Case 124 two-country world 125

u uncovered interest parity 99 unstable adjustment process 108 unstable mechanisms 125 urban transportation 265

v value added 182 variety 207

w wealth 114 weight matrix 186 working day 231 workplace 233