Journal of Econometrics 161 (2011) 1–5
Editorial
Introduction to measurement with theory

Preface by W. A. Barnett and W. Erwin Diewert: This special issue on "Measurement with Theory" was originally proposed by our co-editor, the famous econometrician Arnold Zellner. He worked with us very actively at all stages of our role in producing this special issue, which is in a subject area that has been important to Arnold's research throughout his eminent career. Arnold died on August 11, 2010, only a few days after the final accepted version of the special issue was completed. Sadly, he will not get to see in print this Journal of Econometrics special issue, which was his idea and for which he should get the primary credit. We shall miss that great man.

1. Overview

This special issue of the Journal of Econometrics is devoted to papers that address various measurement problems that arise when economists and statisticians attempt to construct estimates of important economic variables. The emphasis is on measurement methods that are internally consistent with the economic theory that is relevant to the use of the data. It turns out that it is not an easy task to construct best-practice, theory-based empirical estimates of important economic variables; indeed, the construction of these variables is often contentious when not formally connected to economic theory. Thus all of the papers in this special issue have a substantial economic theory component that helped guide the authors when difficult choices had to be made in the construction of the various important variables that are discussed in this issue.

The empirical construction of important economic variables is a topic that is not discussed very often in the Journal of Econometrics, although some of the most important contributions in that area have been published in that journal. But all of applied econometrics depends on economic data, and if the data are poorly constructed, no amount of clever econometric technique can overcome the fact that, generally, garbage in will imply garbage out, as has been emphasized by Zellner and Montmarquette (1971), Zellner et al. (1987), and especially Zellner and Sankar (2005). Thus, we hope that the publication of this special issue will motivate econometric journals to encourage the regular appearance of articles on economic measurement problems and will increase recognition among econometricians of the importance of theory-based economic measurement. This special issue includes major contributions to those objectives. The economic variables that are the focus of the articles in this issue are as follows:

• The measurement of the money supply or, more specifically, the flow of services from the stock of monetary assets.
• The measurement of the Consumer Price Index (CPI) or, more specifically, how scanner data can be used to improve the reliability of the CPI.
• The measurement of technical change using index number techniques.
• The measurement and reconciliation of individual household wealth, income, savings and labour supply over time in a consistent fashion, using panel data on a sample of households, with emphasis on sources of upward mobility.
• The measurement of gross employment flows (a worker perspective of flows in and out of employment) and gross job flows (an establishment perspective of flows in and out of employment).

We will discuss each of these areas in turn in the following sections.

2. Measuring the flow of monetary services

The stocks of various monetary assets (balances in savings accounts, checking accounts, etc.) can be summed into an overall asset stock, and this financial asset stock has historically played an important role in monetary theory and macroeconomics. For many countries, central banks collect information on the various components of the monetary stock and publish these estimates on a regular basis. These asset stocks are then compared with various macroeconomic indicators such as the CPI or nominal GDP, and inferences are made about the "tightness" or "looseness" of monetary policy. But are these "simple sum" asset stock estimates the "right" measures of the influence of money in the economy, when what is needed is the monetary service flow or the economic capital stock of money, defined to be the discounted present value of the flow?

William A. Barnett and Marcelle Chauvet, in their paper in this special issue, argue that it is usually most appropriate to use a measure of the services of money as the "right" monetary measure that should be compared to nominal GDP or a general measure of inflation. Even if some monetary economists do not agree with this suggestion, it seems reasonable to compare monetary service flows with the flow of output and with a measure of general inflation, which is also a flow measure. Barnett (1980) is usually viewed as the paper that began the modern index-number-theoretic literature on the measurement of monetary service flows, in a manner that is coherent with the theory on which applications of the data are based.1 Econometric inference of functional instability, when caused by the use of monetary aggregate data formulas that are internally inconsistent with the relevant nested theory, is now sometimes called "The Barnett Critique".2

1 Subsequently extended to multilateral aggregation by Barnett (forthcoming-a) and to risk aversion by Barnett and Wu (forthcoming).
2 See, e.g., Chrystal and MacDonald (1994) and Belongia and Ireland (forthcoming).
This econometric point of view is consistent with a long-standing tradition in aggregation theory. See, e.g., Zellner (1969) and Tobias and Zellner (2000). Based on that tradition, Barnett and Chauvet show how, at key times, the Divisia monetary aggregate behaves substantially differently from the financial asset stock as measured by a simple sum, and has done so since monetary assets began yielding interest. They also show that data on the economic capital stock of money, measured as the discounted service flow, behave differently from the simple sum aggregate, since money is now a joint product producing both monetary services and investment interest yield. The economic capital stock of money, which is the discounted service flow, is equal to the simple sum aggregate minus the expected discounted interest yield.3 They go on to argue that recent economic history and many common views in monetary economics might have been quite different, had policy makers used and supplied to economists the Divisia monetary aggregates (or the closely related Fisher ideal monetary services index) rather than the simple sum monetary aggregates. The econometric analysis provided in their paper supports the view that many of the "paradoxes" in monetary economics and monetary policy over the past 20 years can be understood as examples of The Barnett Critique.4

3 Recent research has extended the economic capital stock of money formula to the case of risk aversion.
4 They also argue that misperceptions about systemic risk induced by the "Great Moderation" may have been associated with the use of the simple sum monetary aggregates. Misperceptions of decreased systemic risk are often viewed as having been responsible for the increased private risk-taking that led up to the current "Great Recession". This inherently controversial case is more heavily documented in Barnett (forthcoming-b). Economic agents base their decisions on conditional probabilities and conditional expectations. The information set upon which economic agents condition affects all sectors of the economy.

3. Scanner data and the Consumer Price Index

The Consumer Price Index is probably the single most important statistic produced by a country's statistical agency, and thus improving its accuracy is important. Most economists do not realize that there are some measurement problems associated with the production of a CPI in all countries. We will briefly explain how a typical CPI is constructed. First, household expenditures are broken up into a number of groups or strata, which usually number somewhere between 200 and 1000. In each of the elementary strata and in each month, agents go to retail outlets and sample the prices of a number of items (perhaps 3 to 5) in each stratum. If the price of an item can be collected in a base period and in the current period for the same outlet, then a price relative is formed for that item (the current price divided by the base period price). The price relatives for all items in a stratum are then averaged in order to form an aggregate price relative for that stratum.5 Up to this point, there is generally little or no weighting of the individual item price relatives by their economic importance. But once the strata price relatives (which are called elementary indexes) have been calculated, there is a final stage of aggregation, where the aggregate household expenditure shares for a base period are used as weights for the elementary indexes; this weighted average is the overall CPI, which compares prices in the current period to prices in the base period.6

5 The details on exactly what form of averaging is to be used may be found in the ILO (2004). Generally, geometric averaging of the price relatives is recommended.
6 There is another level of complication in that the base period expenditure shares (collected in a separate household budget survey) generally refer to a base year which will be different from the base period month for the item prices!

From the above description of a "typical" CPI, it can be seen that it can only be a rough approximation to the "true" price level in the current period relative to the base period; i.e., only relatively few item prices are collected out of the millions of prices that exist for household consumption items in an advanced economy, and only imperfect expenditure shares for a possibly distant base year are used to weight the elementary indexes.7

7 These distant weights may not be representative of current period weights.

We are all familiar with the use of scanners to record the price and quantity purchased of each item for our retail purchases. With the growth of computer power, this opens up the possibility of getting very accurate and detailed data on all retail sales for all items that have a separate bar code that identifies the item. In particular, it would not be necessary to sample these items (all item prices would be available), and it would not be necessary to have a separate survey for expenditure weights by item, since this information could also be obtained from the scanner data. Also, it should be much cheaper to collect the price and quantity data using scanner data as opposed to having statistical agency employees laboriously record individual prices as they visit retail outlets.8 For these reasons, it can be seen that the use of scanner data to improve the accuracy of a CPI is an important research topic.

8 It seems to be possible to talk at least some retailing firms into supplying their data on sales to statistical agencies as a public good; e.g., some Dutch grocery chains supply their scanner data to Statistics Netherlands free of charge. Note that scanner data are not available for the entire range of household consumption purchases, so scanner data can only be used for components of the CPI.
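As a minimal numerical sketch of the two-stage procedure just described, the following Python fragment forms unweighted geometric-mean (Jevons) elementary indexes within each stratum and then a share-weighted upper-level index. All stratum names, price relatives, and expenditure shares are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical sampled item price relatives (current price / base price).
price_relatives = {
    "bread":    [1.04, 1.01, 1.06],   # 3 sampled items in this stratum
    "coffee":   [0.98, 1.02],
    "gasoline": [1.10, 1.12, 1.09],
}
# Base-period expenditure shares from a (separate) household budget survey.
expenditure_shares = {"bread": 0.20, "coffee": 0.10, "gasoline": 0.70}

# Stage 1: geometric mean of price relatives within each stratum (Jevons),
# the form of averaging generally recommended by the ILO (2004) manual.
elementary = {s: float(np.exp(np.mean(np.log(r))))
              for s, r in price_relatives.items()}

# Stage 2: weight the elementary indexes by base-period expenditure shares
# (a Laspeyres-type weighted average at the upper level).
cpi = sum(expenditure_shares[s] * elementary[s] for s in elementary)
print("Elementary indexes:", {s: round(v, 4) for s, v in elementary.items()})
print(f"Overall CPI (base period = 1.0): {cpi:.4f}")
```

Note how the item-level sampling enters only at stage 1, while the (possibly distant) budget-survey shares enter only at stage 2; both are the weak points identified in the text above.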
• A base month can be chosen and current item prices are always compared to their counterpart base prices; this leads to direct or fixed base index numbers.
• The item prices collected this month are compared to the corresponding item prices collected in the previous month, and the resulting month-to-month index is used to update the level of the previous month's index value; this leads to chained index numbers.

The problem with fixed base indexes is that the list of available products tends to change rapidly over time; i.e., not only do new products rapidly get introduced to the marketplace, but retail outlets often drop products for a few months (perhaps in order to promote a competitor's products) and then bring them back at a later date. Thus if we attempt to compute fixed base index numbers, we soon end up with very few item matches (and thus the accuracy of a fixed base CPI will suffer).

The above paragraph suggests that it may be best to use the chain principle in order to maximize product overlap. But if we use the chain principle, another problem emerges: the problem of chain drift. This problem dates back to Walsh (1901, p. 401) and Frisch (1936, p. 8),9 and it can be explained as follows: if we use the chain principle to compute indexes over a number of periods and the price and quantity data of the last period are exactly equal to the price and quantity data of the first period, then the chain principle will usually not give the correct answer to overall price change over the entire period (which is no change at all). Now if the prices and quantities change in a smoothly trending manner over the entire period and a superlative price index10 is used, usually the amount of chain drift will be small. But if we are computing indexes on a weekly or monthly frequency, then the Dutch experiments with scanner data showed tremendous downward drift if the chain principle was used. This downward chain drift persists even if a superlative index is used as the chain-linking formula.

9 See the paper by Ivancic, Diewert and Fox in this issue.
10 A superlative index number formula will usually approximate a household's true cost of living index very closely; see Diewert (1976) and the ILO (2004) for the details. The Fisher (1922) ideal price index (the geometric average of the Laspeyres and Paasche indexes) is an example of a superlative index.

Why does this downward drift occur? The problem appears to be periodic sales. When an item goes on sale, the volume sold can jump enormously. The superlative chain-link index for the sale period, when compared with the previous period when the item was not on sale, will record a big downward movement (which is appropriate). When the sale ends and the item reverts to its "normal" presale price, we would expect the superlative index to exactly reverse its previous downward movement, so that the chained index ends up back where it started in the presale period. But this does not tend to happen. The reason is that purchasers have loaded up on the item which was on sale, so that in the subsequent period they will tend to purchase less than they did in the period preceding the sale. Then in the post-sale period, the chained index will end up lower than it was in the presale period, even though prices in the presale and post-sale periods are identical. Thus it is this asymmetric movement of quantities surrounding a sale that explains the downward chain drift of the chained index.

Thus the question arises: how can we use scanner data in an efficient manner to improve a CPI while avoiding the chain drift problem? The present special issue has three papers that look at aspects associated with the use of scanner data in a Consumer Price Index context. The first paper, by Lorraine Ivancic, W. Erwin Diewert and Kevin J. Fox, and the second paper, by Jan de Haan and Heymerik van der Grient, suggest a "new" method of aggregation that will lead to a large amount of matching of products over time (and thus avoid the problems associated with fixed base index numbers) but will also avoid the chain drift problem. The basic idea is to adapt Gini's (1931) method for making international comparisons of prices between countries to the time series context.11 Thus each month is regarded as a "country", and a Fisher (1922) ideal price index for each month in a rolling window of, say, the last 13 months is computed; i.e., all 13 months are compared with the base month 1, giving rise to 13 monthly indexes of a fixed base nature (and thus there is no chain drift in these indexes). Then month 2 is used as the base month and all 13 months are compared to this new base month, leading to another 13 monthly indexes of a fixed base type. This process is repeated until all 13 months have been used as the base month, leading to 13 sets of index numbers. The final Gini-type parities are taken to be the geometric mean of these 13 sets of indexes. The movement in the final index between the last two months of the 13 month rolling window is used to update the ongoing monthly index. The details of the method will be left to the papers by IDF and HG, but the bottom line is that this new method for processing scanner data seems to work. It should be noted that IDF proposed the new method and illustrated it using scanner data for Australia, but HG did much more extensive experiments on the method using Dutch data. At this stage, it seems likely that this method will be adopted by statistical agencies in coming years.12

11 Balk (1981) proposed this idea many years ago in order to deal with the problem of seasonality. IDF took this idea a bit further in combining it with the rolling year idea, which was proposed by Diewert (1983, 1999) in the context of dealing with seasonality.
12 The Netherlands and some other countries are already using the new method as a check on their existing methods, which are "traditional" and not based on scanner data.
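To make the drift mechanism and the Gini-type (GEKS) remedy concrete, here is a small self-contained Python sketch. All numbers are hypothetical (good 0 goes on sale in period 1 and buyers stock up), and the 13-month rolling window of IDF and HG is reduced to a single 3-period window; the splicing details are not reproduced.

```python
import numpy as np

# Hypothetical data: good 0 goes on sale in period 1; in period 2 its price
# returns to the presale level, but buyers who stocked up purchase less.
p = np.array([[2.0, 1.0],     # period 0: presale prices
              [1.0, 1.0],     # period 1: good 0 on sale
              [2.0, 1.0]])    # period 2: prices back to presale levels
q = np.array([[10.0, 10.0],
              [40.0, 10.0],   # sale: quantity of good 0 jumps
              [ 2.0, 10.0]])  # post-sale: stocked-up buyers purchase less

def fisher(s, t):
    """Bilateral Fisher ideal price index of period t relative to period s."""
    laspeyres = (p[t] @ q[s]) / (p[s] @ q[s])
    paasche   = (p[t] @ q[t]) / (p[s] @ q[t])
    return np.sqrt(laspeyres * paasche)

# Chained Fisher: product of the period-to-period links.
chained = fisher(0, 1) * fisher(1, 2)

# GEKS: geometric mean over all link periods l of the indirect comparisons,
# treating every period in the window symmetrically as a base.
T = len(p)
geks = np.prod([fisher(0, l) * fisher(l, 2) for l in range(T)]) ** (1.0 / T)

print(f"fixed-base Fisher, period 2 vs 0: {fisher(0, 2):.4f}")  # 1.0000
print(f"chained Fisher,    period 2 vs 0: {chained:.4f}")       # below 1: drift
print(f"GEKS,              period 2 vs 0: {geks:.4f}")          # drift attenuated
```

With these numbers, the chained Fisher index ends about 12% below its starting level even though prices have returned exactly to their presale values, while the GEKS parity is within about 4% of the correct answer of 1. Because GEKS is transitive within the window, chaining GEKS links gives the same answer as a direct GEKS comparison, which is the sense in which the method is free of chain drift.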
The third paper using supermarket scanner data is by Alice O. Nakamura, Emi Nakamura and Leonard I. Nakamura (NNN). Their data set is very extensive. It consists of weekly price and quantity observations for product sales at grocery stores across the United States for the years 2001–2005, for a national sample of hundreds of grocery stores belonging to approximately 100 grocery chains. They focus on the weekly price and quantity data for all of the Universal Product Codes for three groups of products: coffee, cold cereal, and soft drinks. Their research focuses on two areas:
• What are the determinants of price variation for a particular item; i.e., how much variation in prices is there for stores in the same chain versus variation in prices across chains? The answer to this question has some relevance for the sampling procedures used in constructing a CPI: if most of the variation in prices is between chains, then sampling a single store in a chain may be sufficient to capture price movements across all stores in that chain.
• If sale prices are excluded from a price index, does this affect the overall accuracy of the resulting price index?

The answer to the first question appears to be that between-chain variation in prices is far more important than the variation in prices across the stores within a retail chain. The answer to the second question is less clear-cut: in NNN's results, excluding sale prices tends to give index values which show a greater rate of increase in prices when equally weighted Fisher indexes are used, while the results are more ambiguous in the case of weighted Fisher indexes. Thus countries which presently exclude sale prices from their CPIs should carefully evaluate their procedures to ensure that there is no upward bias in their CPIs.

4. The measurement of technical change using the output distance function

The measurement of productivity growth is an important measurement problem, since productivity growth (output growth less input growth) is the main determinant of improvement in living standards. An important component of productivity growth is technical progress: an expansion of the production possibilities set due to the development of new and improved techniques of production and management. Solow (1957), in his classic article, showed how technical progress could be measured in the context of a one output, two input model of production using the continuous time differentiation techniques that date back to the French economist Divisia (1926). Solow connected the rather mechanical total differentiation technique used by Divisia to economics by representing the technology by means of a constant returns to scale, one output, two input production function, and he also assumed competitive profit maximizing behavior on the part of producers. Using these assumptions, he was able to derive a workable nonparametric approximation to measuring technical progress using only observable price and quantity data that pertained to the production unit under consideration.

The rather restrictive Solow assumptions of only one output and two inputs were relaxed by Jorgenson and Griliches (1967). They used the Divisia–Solow technique of continuous time differentiation and competitive profit maximizing behavior with a constant returns to scale technology, but their model allowed for an arbitrary number of outputs and inputs. In the end, they too were able to obtain a nonparametric approximation to the underlying rate of technical progress using only observable price and quantity data. Instead of using a single output production function to represent the technology, as was the case with Solow's model, Jorgenson and Griliches used a transformation function t to represent the underlying technology; i.e., the output vector y is producible using the input vector x if and only if t(y, x) = 0. Unfortunately, the use of the transformation function to represent the underlying technology is problematic; i.e., if t(y, x) = 0 can represent the set of efficient output and input vectors y and x, then so can g[t(y, x)] = 0, where g(z) is any monotonic function of one variable z with g(0) = 0. Since the Divisia-type analysis of Jorgenson and Griliches relies on the differentiability of t, it is unclear what the assumption that t(y, x) is differentiable with respect to the components of y and x entails.

Shortly after the contribution of Jorgenson and Griliches, Shephard (1970) showed that distance functions could be used to represent the technology of a production unit. We will not formally define these functions here,13 but we note that the use of these functions to describe a technology avoids the ambiguities associated with the use of transformation functions to describe technologies. But there is another reason why distance functions have become the most popular way to describe technologies in the applied production function literature: distance functions can be used to represent the degree of inefficiency of a particular production unit as compared to the best practice technology in the industry. This relative inefficiency literature started with Farrell (1957) and was greatly popularized by Charnes and Cooper (1985), who introduced the term Data Envelopment Analysis (DEA) to describe the technique.14

13 See Färe et al. (1985) and Färe and Primont (1995) for more complete explanations of the usefulness of distance functions to represent technologies.
14 The underlying idea of the method for measuring inefficiency is due to Debreu (1951).

This lengthy introduction brings us to the contribution of the paper by Feng and Serletis. This contribution was invited and processed for this special issue, and was inadvertently published in a regular issue (Volume 159, issue 1, December 2010). The main contribution of their paper is that they obtain a counterpart to the continuous time Divisia results of Jorgenson and Griliches, but with three major changes:
• They represent the underlying technology by means of an output distance function rather than a transformation function;
• They do not necessarily assume competitive profit maximizing behavior; and
• They do not necessarily assume constant returns to scale.

Feng and Serletis show that when competitive profit maximizing behaviour along with a constant returns to scale technology is assumed, their results collapse down to the classic results of Jorgenson and Griliches (1967). Feng and Serletis also show that when they assume monopolistic profit maximizing behaviour, their results essentially reduce to the results of Diewert and Fox (2008), who used a more parametric translog cost function approach to derive their results. Thus the paper by Feng and Serletis fills a gap in the productivity literature and generalizes the results of earlier researchers in this important area of applied economics. Feng and Serletis conclude their paper by looking at the axiomatic properties of their suggested measure of technical progress.
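For readers who want the definition that this editorial deliberately omits, one standard formulation following Shephard (1970) is (notation ours, not Feng and Serletis's):

\[
D_o(x, y, t) \;=\; \min_{\delta > 0}\left\{\delta : \frac{y}{\delta} \in P^t(x)\right\},
\qquad
\mathrm{TC}_t \;=\; -\,\frac{\partial \ln D_o(x, y, t)}{\partial t},
\]

where P^t(x) is the set of output vectors producible from input vector x with the period-t technology. D_o(x, y, t) ≤ 1 exactly when y is producible from x, with equality on the frontier, so the distance function is uniquely pinned down by the technology and is free of the monotonic-relabeling ambiguity that afflicts transformation functions; 1 − D_o measures a unit's inefficiency relative to best practice, and the outward shift of the frontier, TC_t, is a natural measure of technical progress. Feng and Serletis's precise assumptions and decomposition are, of course, in their paper.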
5. The measurement of household wealth, income, savings and labour in a consistent accounting framework, with emphasis on sources of upward mobility

Pawasutipaisit and Townsend, in their paper in this volume, use detailed income, balance sheet, and cash flow statements, constructed for households in a long monthly panel in an emerging market economy, together with some recent contributions in economic theory, to document and better understand the factors underlying success in achieving upward mobility in the distribution of net worth. Wealth inequality is decreasing over time, and many households work their way out of poverty and low wealth over the seven year period. The accounts establish that, mechanically, this is largely due to savings rather than incoming gifts and remittances. In turn, the growth of net worth can be decomposed, household by household, into the savings rate and how productively that saving is used, the return on assets (ROA). The latter plays the larger role. ROA is, in turn, positively correlated with higher education of household members, younger age of the head, a higher debt/asset ratio, and lower initial wealth, so it seems from cross-sections that the financial system is imperfectly channeling resources to productive and poor households.

Household fixed effects account for the larger part of ROA, and this success is largely persistent, undercutting the story that successful entrepreneurs are those that simply get lucky. Persistence does vary across households, and in at least one province with much change and increasing opportunities, ROA changes as households move over time to higher-return occupations. But for those households with high and persistent ROA, the savings rate is higher, consistent with some micro-founded macro models with imperfect credit markets. Indeed, high ROA households save by investing in their own enterprises and adopt consistent financial strategies for smoothing fluctuations. More generally, growth of wealth and savings levels and/or rates are correlated with TFP and with the household fixed effects that are the larger part of ROA.

As ROA is a widely accepted indicator of success in corporate accounting, but less so in economic theory, Pawasutipaisit and Townsend also estimate total factor productivity (TFP). They find that TFP is correlated with ROA and, in turn, with the candidate covariates, but less strongly than before. The data are also adjusted for aggregate risk, consistent with the perfect-markets, capital asset pricing model at the village level, utilizing the work of Samphantharak and Townsend (2009). Pawasutipaisit and Townsend find a correlation of risk-adjusted returns with ROA, as a measure of individual talent, and a correlation of risk-adjusted returns with growth of net worth. But the overall results are weaker; for example, high risk-adjusted return households do not invest more in their own enterprises. This suggests again that the capital markets are not perfect in these data, though there remain some consumption anomalies. Their evidence of potential imperfections in the credit market is strong: in production function estimation and the divergence of marginal products of capital from the average interest rate, in the correlation of savings with persistence of ROA, in the reinvestment of the profits of high ROA households into their own enterprises, and in financial strategies, in that high ROA households are not using capital assets to smooth consumption and conversely are using consumption to finance investment deficits. Clearly there are opportunities for much further productive research in this area.
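As a stylized sketch of the savings-rate/ROA decomposition described above (our notation, and under the simplifying assumption of no gifts, remittances, or asset revaluations; Pawasutipaisit and Townsend's exact accounting is in their paper): with net worth W_t, total assets A_t, net income Y_t, and savings S_t = s_t Y_t,

\[
\frac{W_{t+1} - W_t}{W_t} \;=\; \frac{S_t}{W_t}
\;=\; s_t \times \underbrace{\frac{Y_t}{A_t}}_{\mathrm{ROA}} \times \frac{A_t}{W_t},
\]

so household-level net worth growth factors into a savings rate, the return on assets, and a leverage term A_t/W_t. The finding reported above is that cross-household variation in the ROA term contributes more to differential wealth growth than variation in the savings rate.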
6. The measurement of gross employment and gross job flows

The measurement of gross flows of workers into and out of employment has occupied applied economists for more than thirty years. For decades the Bureau of Labor Statistics (BLS) has derived these measurements from the Current Population Survey (CPS) in the United States. In other countries, statistical agencies use similar instruments, usually called labor force surveys, to measure gross worker flows. For discussion of gross worker flow problems in the context of the Current Population Survey, see Abowd and Zellner (1985). For relevant early research, see Galenson and Zellner (1956).

A coherent aggregate story has emerged from studies of worker and job flows. Gross flows greatly exceed net flows. Furthermore, worker flows (accessions and separations) exceed job flows (creations and destructions). The magnitude of the "churning" depends weakly upon the state of the economy, and upon whether the employer is growing, staying constant, or shrinking. Modeling these gross flows, especially during economic downturns and for
different demographic groups, has been a goal of many individual and agency researchers. The Quarterly Workforce Indicators (QWI) are local labor market data produced and released every quarter by the United States Census Bureau. Unlike any other local labor market series produced in the US or the rest of the world, the QWI measure employment flows for workers (accessions and separations), jobs (creations and destructions), and earnings for demographic subgroups (age and gender), economic industry (NAICS industry groups), detailed geography (block (experimental), county, Core-Based Statistical Area, and Workforce Investment Area), and ownership (private, all), with fully interacted publication tables. State participation is sufficiently extensive to permit Abowd and Vilhuber, in their paper in this volume, to present the first national estimates constructed from these data. Their national estimates from the QWI are an important enhancement to existing series, because they include demographic and industry detail for both worker and job flow data compiled from underlying micro-data that have been integrated at the job and establishment levels by the Longitudinal Employer–Household Dynamics Program at the Census Bureau. The estimates presented in their paper were compiled exclusively from public-use data series and are available for download. Abowd and Vilhuber find that prior estimates of worker reallocation rates and "churning" are seriously biased, and they directly measure the extent of the bias.
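The worker-flow and job-flow concepts discussed above are linked by standard accounting identities (our notation, not the QWI's published variable names). For establishment e in quarter t, with accessions A_{e,t} and separations S_{e,t}:

\[
\Delta E_{e,t} \;=\; A_{e,t} - S_{e,t}, \qquad
\mathrm{JC}_t \;=\; \sum_e \max(\Delta E_{e,t},\, 0), \qquad
\mathrm{JD}_t \;=\; \sum_e \max(-\Delta E_{e,t},\, 0),
\]
\[
\mathrm{Churn}_t \;=\; \sum_e \left(A_{e,t} + S_{e,t}\right) \;-\; \left(\mathrm{JC}_t + \mathrm{JD}_t\right) \;\geq\; 0,
\]

since A_{e,t} + S_{e,t} ≥ |A_{e,t} − S_{e,t}| establishment by establishment. Worker flows therefore always weakly exceed job flows, and the churning term measures the excess worker reallocation over and above what job creation and destruction require.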
References

Abowd, J.M., Zellner, A., 1985. Estimating gross labor force flows. Journal of Business and Economic Statistics 3, 254–283.
Balk, B.M., 1981. A simple method for constructing price indices for seasonal commodities. Statistische Hefte 22, 1–8.
Barnett, W.A., 1980. Economic monetary aggregates: an application of aggregation and index number theory. Journal of Econometrics 14, 11–48. Reprinted in: Barnett, W.A., Serletis, A. (Eds.), 2000. The Theory of Monetary Aggregation. Elsevier, Amsterdam, pp. 6–10.
Barnett, W.A., 2007. Multilateral aggregation-theoretic monetary aggregation over heterogeneous countries. Journal of Econometrics 136, 457–482. Reprinted in: Barnett, W.A., Chauvet, M. (Eds.), 2010. Financial Aggregation and Index Number Theory: A Survey. World Scientific, Singapore, Chapter 6 (forthcoming-a).
Barnett, W.A., 2011. Getting It Wrong: How Faulty Monetary Statistics Undermine the Fed, the Financial System, and the Economy. MIT Press, Cambridge, MA (forthcoming-b).
Barnett, W.A., Wu, S., 2005. On user costs of risky monetary assets. Annals of Finance 1, 35–50. Reprinted in: Barnett, W.A., Chauvet, M. (Eds.), 2010. Financial Aggregation and Index Number Theory: A Survey. World Scientific, Singapore, Chapter 3 (forthcoming).
Belongia, M., Ireland, P., 2010. The Barnett critique after three decades: a new Keynesian analysis. Journal of Econometrics (forthcoming).
Charnes, A., Cooper, W.W., 1985. Preface to topics in data envelopment analysis. Annals of Operations Research 2, 59–94.
Chrystal, A., MacDonald, R., 1994. Empirical evidence on the recent behaviour and usefulness of simple-sum and weighted measures of the money stock. Federal Reserve Bank of St. Louis Review 76, 73–109.
Debreu, G., 1951. The coefficient of resource utilization. Econometrica 19, 273–292.
Diewert, W.E., 1976. Exact and superlative index numbers. Journal of Econometrics 4, 115–145.
Diewert, W.E., 1983. The treatment of seasonality in a cost of living index. In: Diewert, W.E., Montmarquette, C. (Eds.), Price Level Measurement. Statistics Canada, Ottawa, pp. 1019–1045.
Diewert, W.E., 1999. Index number approaches to seasonal adjustment. Macroeconomic Dynamics 3, 1–21.
Diewert, W.E., Fox, K.J., 2008. On the estimation of returns to scale, technical progress and monopolistic markups. Journal of Econometrics 145, 174–193.
Divisia, F., 1926. L'Indice Monétaire et la Théorie de la Monnaie. Société anonyme du Recueil Sirey, Paris.
Färe, R., Grosskopf, S., Lovell, C.A.K., 1985. The Measurement of Efficiency of Production. Kluwer-Nijhoff, Boston.
Färe, R., Primont, D., 1995. Multi-Output Production and Duality: Theory and Applications. Kluwer, Amsterdam.
Farrell, M.J., 1957. The measurement of productive efficiency. Journal of the Royal Statistical Society, Series A 120, 253–290.
Fisher, I., 1922. The Making of Index Numbers. Houghton Mifflin, Boston.
Frisch, R., 1936. Annual survey of general economic theory: the problem of index numbers. Econometrica 4, 1–38.
Galenson, W., Zellner, A., 1956. International comparison of unemployment rates. In: The Measurement and Behavior of Unemployment. National Bureau of Economic Research, Princeton University Press, Princeton, pp. 439–581.
Gini, C., 1931. On the circular test of index numbers. Metron 9, 3–24.
International Labour Organization, 2004. Hill, P. (Ed.), Consumer Price Index Manual: Theory and Practice. ILO, Geneva.
Jorgenson, D.W., Griliches, Z., 1967. The explanation of productivity change. The Review of Economic Studies 34, 249–283.
Samphantharak, K., Townsend, R.M., 2009. Households as Corporate Firms: An Analysis of Household Finance Using Integrated Household Surveys and Corporate Financial Accounting. Econometric Society Monographs, Cambridge University Press, Cambridge, UK.
Shephard, R.W., 1970. Theory of Cost and Production Functions. Princeton University Press, Princeton.
Solow, R.M., 1957. Technical change and the aggregate production function. Review of Economics and Statistics 39, 312–320.
Tobias, J., Zellner, A., 2000. A note on aggregation, disaggregation and forecasting performance. Journal of Forecasting 19, 457–469.
Walsh, C.M., 1901. The Measurement of General Exchange Value. Macmillan, New York.
Zellner, A., 1969. On the aggregation problem: a new approach to an old problem. In: Fox, K., et al. (Eds.), Economic Models, Estimation, and Risk Programming: Essays in Honor of Gerhard Tintner. Springer Verlag, Berlin, pp. 365–374.
Zellner, A., Garcia-Ferrer, A., Highfield, R.A., Palm, F., 1987. Macroeconomic forecasting using pooled international data. Journal of Business and Economic Statistics 5, 53–67.
Zellner, A., Montmarquette, C., 1971. A study of some aspects of temporal aggregation problems in econometric analyses. Review of Economics and Statistics 53, 335–342.
Zellner, A., Sankar, U., 2005. On errors in the variables. In: Mythili, G., Hema, R. (Eds.), Topics in Applied Economics: Tools, Issues and Institutions. Academic Foundation, New Delhi, pp. 49–86.
William A. Barnett*
University of Kansas, United States
E-mail address: [email protected]

W. Erwin Diewert
University of British Columbia, Canada

Arnold Zellner
University of Chicago, United States

Available online 15 September 2010

* Corresponding editor.
Journal of Econometrics 161 (2011) 6–23
How better monetary statistics could have signaled the financial crisis

William A. Barnett (a,*), Marcelle Chauvet (b)
a University of Kansas, United States
b University of California at Riverside, United States
* Corresponding author. E-mail address: [email protected] (W.A. Barnett).

Article history: Available online 15 September 2010.
JEL classification: E40; E52; E58; C43; E32.
Keywords: Measurement error; Monetary aggregation; Divisia index; Aggregation; Monetary policy; Index number theory; Financial crisis; Great moderation; Federal Reserve.

Abstract: This paper explores the disconnect of Federal Reserve data from index number theory. A consequence could have been the decreased-systemic-risk misperceptions that contributed to excess risk-taking prior to the housing bust. We find that most recessions in the past 50 years were preceded by more contractionary monetary policy than indicated by simple-sum monetary data. Divisia monetary aggregate growth rates were generally lower than simple-sum aggregate growth rates in the period preceding the Great Moderation, and higher since the mid 1980s. Monetary policy was more contractionary than likely intended before the 2001 recession and more expansionary than likely intended during the subsequent recovery. © 2010 Elsevier B.V. All rights reserved.
1. Introduction

1.1. The great moderation and the current crisis

Over the past couple of decades, there has been an increasingly widely accepted view that monetary policy had dramatically improved. In fact, some credit improved policy with causing the "Great Moderation" in the economy's dynamics, whereby output volatility decreased substantially after the mid 1980s. On Wall Street, that widely believed view was called the "Greenspan Put". Even Lucas (2003), who has become a major authority on the business cycle through his path-breaking publications in that area (see e.g., Lucas, 1987), had concluded that economists should redirect their efforts towards long-term fiscal policy aimed at increasing economic growth. Since central banks were viewed as having become very successful at damping the business cycle, he concluded that possible welfare gains from further moderations in the business cycle were small. It is not our intent to take a position on what the actual causes of that low volatility had been, to argue that the Great Moderation's
causes are gone, or that the low volatility that preceded the current crisis will not return. Rather, our objective is to investigate whether changes in monetary aggregates were in line with the intended monetary policy before and during the Great Moderation, and whether there was a significant change in the latter period that would justify assertions regarding a lower risk of future recessions since the mid 1980s. We provide years of empirical evidence that do not support the view that monetary policy had significantly improved before or after the Great Moderation and, most importantly, around recessions. We also argue that this lack of support was not factored into the decisions of many decision makers, who had increased their leverage and risk-taking to levels now widely viewed as having been excessive. This conclusion will be more extensively documented in Barnett (forthcoming).

A popular media view about the recent crisis is that the firms and households that got into trouble are to blame. According to much of the popular press and many politicians, the Wall Street professionals and bankers are to blame for having taken excessive, self-destructive risk out of "greed". But who are the Wall Street professionals who decided to increase their leverage to 35:1? As is well known, they comprise a professional elite, including some of the country's most brilliant financial experts. Is it reasonable to assume that such people made foolish, self-destructive decisions out of "greed"? If so, how should we define "greed" in economic theory, so that we can test the hypothesis? What about the mortgage
lenders at the country's largest banks? Were their decisions dominated by greed and self-destructive behavior? Economic theory is not well designed to explore such hypotheses, and if the hypotheses imply irrational behavior, how would we reconcile a model of irrational behavior with the decisions of some of the country's most highly qualified experts in finance? Similarly, how would one explain the fact that the Supervision and Regulation Division of the Federal Reserve Board's staff ignored the high risk loans being made by banks? Were they simply not doing their job, or perhaps did they too believe that systemic risk had declined, so that increased risk-taking by banks was viewed to be prudent?

To find the cause of the crisis, it is necessary to look carefully at the data that produced the impression that the business cycle had been moderated permanently to a level supporting greater risk taking by investors and lenders. To find the causes of the "Great Moderation", central bank policy may be the wrong place to look. The federal funds rate has been the instrument of policy in the US for over a half century, and the Taylor rule, rather than being an innovation in policy design, is widely viewed as fitting historic Federal Reserve behavior for a half century.1 The Great Moderation in real business cycle volatility may have been produced by events unrelated to monetary policy, such as the growth of US productivity, improved technology and communications permitting better planning of inventories and management, financial innovation, the rise of China as a holder of American debt and supplier of low-priced goods, perhaps permitting an expansionary monetary policy that otherwise might have been inflationary, or even a decrease in the size and volatility of shocks, known as the "good luck" hypothesis.

1 Although some other countries have adopted inflation targeting, the Federal Reserve's policy innovation in recent years, if any, has most commonly been characterized as New Keynesian with a Taylor rule. The Greenspan Put has most commonly been characterized as an asymmetric interest rate policy more aggressive on the downside than the upside, but that was not without precedent.

This paper provides an overview of some of the data problems that produced the misperceptions of superior monetary policy. The focus of this paper is not on what did cause the Great Moderation, but rather on the possible causes of the misperception that there was a permanent reduction in systemic risk associated with improved monetary policy. This paper documents the fact that the quality of Federal Reserve data in recent years has been poor, disconnected from reputable index number theory, and inconsistent with competent accounting principles. It is postulated that these practices could have contributed to the misperceptions of a permanent decrease in systemic risk.

This paper's emphasis is on econometric results displayed in graphics. In all of the illustrations in this paper, the source of the misperceptions is traced to data problems. We find, for example, that the largest discrepancies between the microeconomic-theory-based (Divisia) monetary aggregates and the simple sum monetary aggregates occur during times of high uncertainty, such as around recessions or at the beginning or end of high interest rate phases. In particular, the growth rate of the Divisia monetary aggregates decreases much more before recessions, and increases substantially more during recessions and recoveries, than that of the simple sum aggregates. In addition, we find that the growth rate of the Divisia monetary aggregates was generally lower than the growth rate of the simple sum aggregates in the period that preceded the Great Moderation, and higher since the mid 1980s.
For recent years, this indicates, for example, that monetary policy could have been more contractionary than intended before recessions and more expansionary than intended during the most recent recovery after the 2001 recession. There is a train of thought that maintains that the current US financial crisis was prompted by excessive
money creation fueling the bubbles. The process started in early 2001, when the money supply was increased substantially to minimize the economic recession that had started in March of that year. However, in contrast with previous recessions, money supply growth continued to be high for a few years after the recession's end in November 2001. In fact, the money supply was high until mid 2004, after which it started decreasing slowly. The argument is that the monetary expansion during this first period led to both speculation and leveraging, especially regarding lending practices in the housing sector. This expansion is argued to have made it possible for marginal borrowers to obtain loans with lower collateral values. On the other hand, the Federal Reserve started increasing the target value for the federal funds rate in June 2004. We find that the rate of growth of the money supply, as measured by the Divisia monetary aggregates, fell substantially more than the rate of growth of the simple sum monetary aggregates. Thus, the official simple sum index could have veiled a much more contractionary policy by the Federal Reserve than intended, as also occurred prior to most recessions in the last 50 years. In the recent period, when money creation slowed, housing prices began to decline, leaving many homeowners with negative equity and inducing a wave of defaults and foreclosures.

We see no reason to believe that the Federal Reserve would have had "excessive" money growth as a goal. Had the Federal Reserve known that the amount of money circulating in the economy was excessive and could generate an asset bubble, monetary policy would have been reversed long before it was. Conversely, we see no reason to believe that the Federal Reserve intended an excessively contractionary monetary policy before the crisis. We provide evidence indicating that data problems may have misled Federal Reserve policy to feed the bubbles with unintentionally excess liquidity and then to burst the bubbles with an excessively contractionary policy. In short, in every illustration that we provide, the motives of the decision makers, whether private or public, were good. But the data were bad.

1.2. Overview

Barnett (1980) derived the aggregation-theoretic approach to monetary aggregation and advocated the use of the Divisia or Fisher ideal index with user cost prices in aggregating over monetary services. Since then, Divisia monetary aggregates have been produced for many countries.2 But despite this vast amount of research, most central banks continue officially to supply the simple-sum monetary aggregates, which have no connection with aggregation and index number theory. In contrast, the International Monetary Fund (2008) has provided an excellent discussion of the merits of Divisia monetary aggregation, and the Bank of England publishes Divisia aggregates officially. The European Central Bank's staff uses them in informing the Council on a quarterly basis, and the St. Louis Federal Reserve Bank provides them for the US. The simple sum monetary aggregates have produced repeated inference errors, policy errors, and needless paradoxes, leading up to the most recent misperceptions about the source of the Great Moderation. In this paper, we provide an overview of that history in chronological order.
2 For example, Divisia monetary aggregates have been produced for Britain (Batchelor, 1989; Drake, 1992; Belongia and Chrystal, 1991), Japan (Ishida, 1984), the Netherlands (Fase, 1985), Canada (Cockerline and Murray, 1981), Australia (Hoa, 1985), and Switzerland (Yue and Fluri, 1991), among many others. More recently, Barnett (2007) has extended the theory to multilateral aggregation over different countries with potentially different currencies, and Barnett and Wu (2005) have extended it to the case of risky contemporaneous interest rates, as is particularly relevant when exchange rate risk is involved. That research was particularly focused on the needs of the European Central Bank.
We conclude with a discussion of the most recent research in this area, which introduces state-space factor modeling into this literature. We also display the most recent puzzle regarding Federal Reserve data on nonborrowed reserves and show that the recent behavior of those data contradicts the definition of nonborrowed reserves. Far from resolving the earlier data problems, the Federal Reserve's most recent data may be the most puzzling that the Federal Reserve has ever published.

1.3. The history

There is a vast literature on the appropriateness of aggregating over monetary asset components using simple summation. Linear aggregation can be based on Hicksian aggregation (Hicks, 1946), but that theory only holds under the unreasonable assumption that the relative user-cost prices of the services of individual monetary assets do not change over time. This condition implies that each asset is a perfect substitute for the others within the set of components. Simple sum aggregation is an even more severe special case of that highly restrictive linear aggregation, since simple summation requires that the coefficients of the linear aggregator function all be the same. This, in turn, implies that the constant user-cost prices among monetary assets be exactly equal to each other. Not only must the assets be perfect substitutes, they must be perfect one-for-one substitutes; i.e., they must be indistinguishable assets, with one unit of each asset being a perfect substitute for exactly one unit of each of the other assets.

In reality, financial assets provide different services, and each such asset yields its own particular rate of return. As a result, the user costs, which measure foregone interest and thereby opportunity cost, are not constant and are not equal across financial assets. The relative user-cost prices of US monetary assets fluctuate considerably. For example, the interest rates paid on many monetary assets are not equal to the zero interest rate paid on currency. These observations have motivated serious concerns about the reliability of the simple-sum aggregation method, which has been disreputable in the literature on index number theory and aggregation theory for over a century. In addition, an increasing number of imperfectly substitutable short-term financial assets has emerged in recent decades. Since monetary aggregates produced from simple summation do not accurately measure the quantities of monetary services chosen by optimizing agents, shifts in the series can be spurious and can produce an erroneous appearance of instability in structural functions containing monetary services variables.

Microeconomic aggregation theory offers an appealing alternative approach to the measurement of money, compared to the atheoretical simple-sum method. The quantity index under the aggregation-theoretic approach extracts and measures the income effects of changes in relative prices and is invariant to substitution effects, which do not alter utility and thereby do not alter perceived services received. The simple-sum index, on the other hand, does not distinguish between income and substitution effects and thereby confounds substitution effects with actual monetary services received. The aggregation-theoretic monetary aggregator function, which correctly internalizes substitution effects, can be tracked accurately by the Divisia quantity index, constructed by using expenditure shares as the component growth-rate weights.
Barnett (1978, 1980) derived the formula for the user-cost price of a monetary asset, needed in computation of the Divisia index's share weights, and thereby originated the Divisia monetary aggregates. The growth-rate weights resulting from this approach differ across assets, depend on all of the quantities and interest rates entering each share, and vary over time. For a detailed description of the theory underlying this construction, see Barnett (1982, 1987).
The user-cost prices are foregone interest rates, with foregone interest measured as the difference between the rate of return on a pure investment, called the benchmark asset, and the own rate of return on the component asset. It is important to understand that the direction in which an asset's growth-rate weight will change with an interest rate change is not predictable in advance. Consider Cobb–Douglas utility. Its shares are independent of relative prices, and hence of the interest rates within the component user-cost prices. For other utility functions, the direction of the change in shares with a price change, or equivalently with an interest rate change, depends upon whether the own price elasticity of demand exceeds or is less than −1. In elementary microeconomic theory, this often overlooked phenomenon produces the famous ''diamonds versus water paradox'' and is the source of most of the misunderstandings of the Divisia monetary aggregates' weighting, as explained by Barnett (1983).

Several authors have studied the empirical properties of the Divisia index compared with the simple sum index. The earliest comparisons are in Barnett (1982) and Barnett et al. (1984). Barnett and Serletis (2000) collect together and reprint seminal journal articles from this literature.3 Barnett (1997) has documented the connection between the well-deserved decline in the policy credibility of monetary aggregates and the defects that are peculiar to simple sum aggregation. The most recent research in this area is Barnett et al. (2009), who compare the different dynamics of simple-sum monetary aggregates and the Divisia indexes, not only over time, but also over the business cycle and across high and low inflation and interest rate phases. Information about the state of monetary growth becomes particularly relevant for policymakers, when inflation enters a high-growth phase or the economy begins to weaken. Factor models with regime switching have been widely used to represent business cycles (see e.g. Chauvet, 1998, 2001; Chauvet and Hamilton, 2006; Chauvet and Piger, 2008), but without relationship to aggregation theory. Barnett, Chauvet, and Tierney's model differs from that literature in that the focus is not only on the estimated common factor, but also on the idiosyncratic terms, which reflect the divergences between the simple sum and Divisia monetary aggregates in a manner relevant to aggregation theory.

2. Monetary aggregation theory

2.1. Monetary aggregation

Aggregation theory and index-number theory have been used to generate official governmental data since the 1920s. For example, the Bureau of Economic Analysis uses the Fisher ideal index in producing the national accounts. One exception still exists. The monetary quantity aggregates and interest rate aggregates officially supplied by many central banks are not aggregation-theoretic index numbers, but rather are the simple unweighted sums of the component quantities and the quantity-weighted or arithmetic averages of interest rates. The predictable consequence has been induced instability of money demand and supply functions and a series of 'puzzles' in the resulting applied literature. In contrast, the Divisia monetary aggregates, originated by Barnett (1980, 1987), are derived directly from economic index-number theory.
3 More recent examples include Belongia (1996), Belongia and Binner (2000), Belongia and Ireland (2006), and Schunk (2001). The comprehensive survey found in Barnett and Serletis (2000) has been updated to include more recent article reprints in Barnett and Chauvet (forthcoming). Other overviews of published theoretical and empirical results in this literature are available in Barnett et al. (1992) and Serletis (2006).
Data construction and measurement procedures imply the theory that can rationalize the aggregation procedure. The assumptions implicit in the data construction procedures must be consistent with the assumptions made in producing the models within which the data are nested. Unless the theory is internally consistent, the data and their applications are incoherent. Without that coherence between aggregator function structure and the econometric models within which the aggregates are embedded, stable structures can appear to be unstable. This phenomenon has been called the 'Barnett critique' by Chrystal and MacDonald (1994) and in a very important recent paper by Belongia and Ireland (forthcoming).

2.2. Aggregation theory versus index number theory

The exact aggregates of microeconomic aggregation theory depend on unknown aggregator functions, which typically are utility, production, cost, or distance functions. Such functions must first be econometrically estimated. Hence the resulting exact quantity and price indexes become estimator and specification dependent. This dependency is troublesome to governmental agencies, which therefore view aggregation theory as a research tool rather than a data construction procedure.

Statistical index-number theory, on the other hand, provides nonparametric indexes which are computable directly from quantity and price data, without estimation of unknown parameters. Within the literature on aggregation theory, such index numbers depend jointly on prices and quantities in two periods, but not on unknown parameters. In a sense, index number theory trades dependence on unknown parameters for joint dependence on the prices and quantities of two periods. Examples of such statistical index numbers are the Laspeyres, Paasche, Divisia, Fisher ideal, and Törnqvist indexes.

The formerly loose link between index number theory and aggregation theory was tightened, when Diewert (1976) defined the class of second-order 'superlative' index numbers, which track any unknown aggregator function up to the second order. Statistical index number theory became part of microeconomic theory, as economic aggregation theory had been for decades. Statistical index numbers are judged by their nonparametric tracking ability relative to the aggregator functions of aggregation theory.

For decades, the link between statistical index-number theory and microeconomic aggregation theory was weaker for aggregating over monetary quantities than for aggregating over other goods and asset quantities. Once monetary assets began yielding interest, monetary assets became imperfect substitutes for each other, and the 'price' of monetary-asset services was no longer clearly defined. That problem was solved by Barnett's (1978, 1980) derivation of the formula for the user cost of demanded monetary services.4

2.3. The economic decision

Consider a decision problem over monetary assets. The decision problem will be defined in the simplest manner that renders the relevant literature on economic aggregation immediately applicable.5 Initially we shall assume perfect certainty.
4 Subsequently Barnett (1987) derived the formula for the user cost of supplied monetary services. A regulatory wedge can exist between the demand and supply-side user costs, if non-payment of interest on required reserves imposes an implicit tax on banks. Another excellent source on the supply side is Hancock (1991), who correctly derived the formula for the implicit tax on banks.
5 Our research in this paper is not dependent upon this simple decision problem, as shown by Barnett (1987), who proved that the same aggregator function and index number theory applies, regardless of whether the initial model has money in
Let m′t = (m1t, m2t, . . . , mnt) be the vector of real balances of monetary assets during period t, let rt be the vector of nominal holding-period yields for monetary assets during period t, and let Rt be the one-period holding yield on the benchmark asset during period t. The benchmark asset is defined to be a pure investment that provides no services other than its yield, Rt, so that the asset is held solely to accumulate wealth. Thus, Rt is the maximum holding-period yield in the economy during period t. Let yt be the real value of the total budgeted expenditure on monetary services during period t. Under conventional assumptions, the conversion between nominal and real expenditure on the monetary services of one or more assets is accomplished using the true cost of living index, p∗t = p∗t(pt), on consumer goods, where the vector of consumer goods prices is pt.6

The optimal portfolio allocation decision is:

\[ \text{maximize } u(\mathbf{m}_t) \quad \text{subject to} \quad \boldsymbol{\pi}_t' \mathbf{m}_t = y_t, \tag{1} \]

where π′t = (π1t, . . . , πnt) is the vector of monetary-asset real user costs,7 with

\[ \pi_{it} = \frac{R_t - r_{it}}{1 + R_t}. \tag{2} \]

The function u is the decision maker's utility function, assumed to be monotonically increasing and strictly concave.8 Let m∗t be derived by solving decision (1). Under the assumption of linearly homogeneous utility, the exact monetary aggregate of economic theory is the utility level associated with holding the portfolio, and hence is the optimized value of the decision's objective function:

\[ M_t = u(\mathbf{m}_t^*). \tag{3} \]
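As a concrete illustration of Eq. (2), the real user costs of a small set of monetary assets can be computed directly from the benchmark rate and the own rates. The following sketch is ours, not part of the original paper; the rates are hypothetical round numbers chosen only to show the mechanics.

```python
import numpy as np

# Hypothetical annual yields (decimal form); not actual Federal Reserve data.
R = 0.06                              # benchmark rate R_t
r = np.array([0.00, 0.02, 0.045])     # own rates: currency, checking, money fund

# Eq. (2): real user cost of each monetary asset
pi = (R - r) / (1 + R)
print(pi)  # currency has the largest user cost, since it forgoes the most interest
```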
the utility function, or money in a production function, or neither, so long as there is intertemporal separability of structure and certain assumptions are satisfied for aggregation over economic agents. The aggregator function is the derived function that has been shown in general equilibrium always to exist, if money has a positive value in equilibrium, regardless of the motive for holding money.
6 The multilateral open economy extension is available in Barnett (2007).
7 There is a long history regarding the ''price of money''. Keynes and the classics were divided about whether it was the inflation rate or ''the rate of interest''. The latter would be correct for noninterest bearing money in continuous time. In that case, as can be seen from Eq. (2), the user cost becomes Rt. More recently, Diewert (1974) acquired the formula relevant to discrete time for noninterest bearing money, Rt/(1 + Rt). Perhaps the first to recognize the relevance of the opportunity cost, Rt − rt, for interest bearing money was Hutt (1963, p. 92, footnote), and he advocated what later became known as the CE index derived by Rotemberg et al. (1995). Friedman and Schwartz (1970, pp. 151–152) document many attempts by Friedman's students to determine the user cost formula and apply index number theory to monetary aggregation. But that work preceded the user cost derivation by Barnett and the work of Diewert (1976, 1978) on superlative index number theory. Those attempts were not based on valid user cost formulas or modern index numbers. The best known initial attempt to use aggregation theory for monetary aggregation was by Chetty (1969). But he used an incorrect user cost formula, which unfortunately was adopted by many other economists in subsequent research in monetary aggregation. Through analogous economic reasoning, Donovan (1978) acquired the correct real user cost formula, (2). As a result of the confusion produced by the competing user cost formulas generated from economic reasoning, application to monetary aggregation was hindered until Barnett (1978, 1980) formally derived the formula by the normal method of proof using the sequence of flow of funds identities in the relevant dynamic programming problem. Regarding that formal method of proof, see Deaton and Muellbauer (1980). Barnett's proof and his derivation within an internally consistent aggregation theoretic framework marked the beginning of the modern literature on monetary aggregation.
8 To be an admissible quantity aggregator function, the function u must be weakly separable within the consumer's complete utility function over all goods and services. Producing a reliable test for weak separability is the subject of much intensive research, most recently by Barnett and Peretti (2009). If yt were nominal, then the user cost formula would have to be nominal, acquired by multiplying Eq. (2) by the price index used to deflate nominal to real income. Regarding the choice of deflator, see Feenstra (1986).
2.4. The Divisia index

Although Eq. (3) is exactly correct, it depends upon the unknown function, u. Nevertheless, statistical index-number theory enables us to track Mt exactly without estimating the unknown function, u. In continuous time, the monetary aggregate, Mt = u(m∗t), can be tracked exactly by the Divisia index, which solves the differential equation

\[ \frac{d \log M_t}{dt} = \sum_i s_{it}\, \frac{d \log m_{it}^*}{dt} \tag{4} \]

for Mt, where sit = πit m∗it / yt is the ith asset's share in expenditure on the total portfolio's service flow.9 The dual user cost price aggregate Πt = Π(πt) can be tracked exactly by the Divisia price index, which solves the differential equation

\[ \frac{d \log \Pi_t}{dt} = \sum_i s_{it}\, \frac{d \log \pi_{it}}{dt}. \tag{5} \]

The user cost dual satisfies Fisher's factor reversal in continuous time:

\[ \Pi_t M_t = \boldsymbol{\pi}_t' \mathbf{m}_t. \tag{6} \]

As a formula for aggregating over quantities of perishable consumer goods, that index was first proposed by François Divisia (1925), with market prices and quantities of those goods used in Eq. (4). In continuous time, the Divisia index, under conventional neoclassical assumptions, is exact. In discrete time, the Törnqvist approximation is

\[ \log M_t - \log M_{t-1} = \sum_i \bar{s}_{it} \left( \log m_{it}^* - \log m_{i,t-1}^* \right), \tag{7} \]

where \(\bar{s}_{it} = (s_{it} + s_{i,t-1})/2\). In discrete time, we often call Eq. (7) simply the Divisia quantity index.10 After the quantity index is computed from (7), the user cost aggregate most commonly is computed directly from Eq. (6).

2.5. Risk adjustment

Extension of index number theory to the case of risk was introduced by Barnett et al. (1997), who derived the extended theory from Euler equations rather than from the perfect-certainty first-order conditions used in the earlier index-number-theory literature. Since that extension is based upon the consumption capital-asset-pricing model (CCAPM), the extension is subject to the 'equity premium puzzle' of smaller-than-necessary adjustment for risk. We believe that the under-correction produced by CCAPM results from its assumption of intertemporal blockwise strong separability of goods and services within preferences. Barnett and Wu (2005) have extended Barnett, Liu, and Jensen's result to the case of risk aversion with intertemporally non-separable tastes.11

2.6. Dual space

User cost aggregates are duals to monetary quantity aggregates. Either implies the other uniquely. In addition, user-cost aggregates imply the corresponding interest-rate aggregates uniquely. The interest-rate aggregate rt implied by the user-cost aggregate Πt is the solution for rt to the equation

\[ \frac{R_t - r_t}{1 + R_t} = \Pi_t. \]

Accordingly, any monetary policy that operates through the opportunity cost of money (that is, interest rates) has a dual policy operating through the monetary quantity aggregate, and vice versa. Aggregation theory implies no preference for either of the two dual policy procedures or for any other approach to policy, so long as the policy does not violate the principles of aggregation theory. In their current state-space comparisons, Barnett, Chauvet, and Tierney model in quantity space rather than the user-cost-price or interest-rate dual spaces. Regarding policy in the dual space, see Barnett (1987) and Belongia and Ireland (2006).

2.7. Aggregation error and policy slackness

Fig. 1 displays the magnitude of the aggregation error and policy slackness produced by the use of the simple sum monetary aggregates. Suppose there are two monetary assets over which the central bank aggregates. The quantities of the two component assets are y1 and y2. Suppose that the central bank reports, as data, that the value of the simple sum monetary aggregate is Mss. The information content of that reported variable level is contained in the fact that the two components must be somewhere along the Fig. 1 hyperplane, y1 + y2 = Mss, or more formally that the components are in the set A:

\[ A = \{ (y_1, y_2) : y_1 + y_2 = M_{ss} \}. \]

But according to Eq. (3), the actual value of the service flow from those asset holdings is u(y1, y2). Consequently the information content of the information set A regarding the monetary service flow is that the service flow is in the set E:

\[ E = \{ u(y_1, y_2) : (y_1, y_2) \in A \}. \]

Note that E is not a singleton. To see the magnitude of the slackness in that information, observe from Fig. 1 that if the utility level (service flow) is Mmin, then the indifference curve does touch the hyperplane, A, at its lower right corner. Hence that indifference curve cannot rule out the Mss reported value of the simple sum monetary aggregate, although a lower level of utility is ruled out, since indifference curves at lower utility levels cannot touch the hyperplane, A. Now consider the higher utility level of Mmax and its associated indifference curve in Fig. 1. Observe that that indifference curve also does have a point in common with the hyperplane, A, at the tangency. But higher levels of utility are ruled out, since their indifference curves cannot touch the hyperplane, A. Hence the information about the monetary service flow, provided by the reported value of the simple sum aggregate, Mss, is the interval E = [Mmin, Mmax]. The supply side aggregation is analogous, but the lines of constant supplied service flow for financial firms are production possibility surfaces, not indifference surfaces, as shown by Barnett (1987).
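The discrete-time computation behind Eqs. (2), (6), and (7) is mechanical once component quantities and rates are in hand. The following sketch is ours rather than the authors'; the two-asset data are invented solely to illustrate the steps (user costs, expenditure shares, Törnqvist growth rate, and the user-cost dual via factor reversal). As footnote 10 notes, a Fisher ideal index could replace the Törnqvist step in practice.

```python
import numpy as np

# Invented two-asset example: real balances m[t, i] and own rates r[t, i]
# for two adjacent periods; none of these numbers come from the paper.
m = np.array([[100.0, 50.0],    # period t-1
              [104.0, 56.0]])   # period t
r = np.array([[0.00, 0.030],
              [0.00, 0.035]])
R = np.array([0.060, 0.065])    # benchmark rate in each period

pi = (R[:, None] - r) / (1 + R[:, None])        # Eq. (2): user cost of each asset
expend = pi * m
s = expend / expend.sum(axis=1, keepdims=True)  # expenditure shares s_it

# Eq. (7): Törnqvist (discrete Divisia) growth rate of the quantity aggregate
s_bar = 0.5 * (s[0] + s[1])
dlogM = np.sum(s_bar * (np.log(m[1]) - np.log(m[0])))
print("Divisia growth rate:", dlogM)

# Eq. (6): factor reversal yields the user-cost aggregate once M is normalized
M = np.array([1.0, np.exp(dlogM)])              # normalize M_{t-1} = 1
Pi = expend.sum(axis=1) / M
print("User-cost aggregate:", Pi)
```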
9 In Eq. (4), it is understood that the result is in continuous time, so the time subscripts are a shorthand for functions of time. We use t to be the time period in discrete time, but the instant of time in continuous time.
10 Diewert (1976) defines a 'superlative index number' to be one that is exactly correct for a quadratic approximation to the aggregator function. The discretization (7) of the Divisia index is in the superlative class, since it is exact for the quadratic translog specification of an aggregator function. In practice, the resulting ''monetary services index'' often is computed as a Fisher ideal index, rather than as a Törnqvist index. Diewert (1978) has shown that the two indexes approximate each other very well.
11 The Federal Reserve Bank of St. Louis Divisia database, which we use in this paper, is not risk corrected. In addition, it is not adjusted for differences in marginal taxation rates on different asset returns or for sweeps, and its clustering of components into groups was not based upon tests of weak separability, but rather on the Federal Reserve's official clustering. The St. Louis Federal Reserve Bank is in the process of revising its MSI database, perhaps to incorporate some of these adjustments. Regarding sweep adjustment, see Jones et al. (2005). At the present stage of this research, we felt it was best to use data available from the Federal Reserve for purposes of replicability and comparability with the official simple sum data. As a result, we did not modify the St. Louis Federal Reserve's MSI database or the Federal Reserve Board's simple sum data in any way. This decision should not be interpreted to imply advocacy by us of the official choices.
Fig. 1. Demand side aggregation error range.
3. The history of thought on monetary aggregation

The fields of aggregation and index number theory have a long history. The first book to put together the properties of all of the available index numbers in a systematic manner was the famous Fisher (1922). He made it clear in that book that the simple sum and arithmetic average indexes are the worst known indexes. On p. 29 of that book he wrote:

''The simple arithmetic average is put first merely because it naturally comes first to the reader's mind, being the most common form of average. In fields other than index numbers it is often the best form of average to use. But we shall see that the simple arithmetic average produces one of the very worst of index numbers, and if this book has no other effect than to lead to the total abandonment of the simple arithmetic type of index number, it will have served a useful purpose''.

On p. 361 Fisher wrote: ''The simple arithmetic should not be used under any circumstances, being always biased and usually freakish as well. Nor should the simple aggregative ever be used; in fact this is even less reliable''. The simple sum monetary aggregates published by the Federal Reserve are produced from the ''simple aggregative'' quantity index.12

Indeed, data-producing agencies and data-producing newspapers switched to reputable index numbers, following the appearance of Fisher's book. But there was one exception: the world's central banks, which produced their monetary aggregates as simple sums. While the implicit assumption of perfect substitutability in identical ratios might have made sense during the first half of the 20th century, that assumption became unreasonable, as interest-bearing substitutes for currency were introduced by financial intermediaries, such as interest bearing checking and saving accounts. Nevertheless, the nature of the problem was understood by Friedman and Schwartz (1970, pp. 151–152), who wrote the following:
12 While Fisher’s primary concern was price index numbers, any price index formula has an analogous quantity index number acquired by interchanging prices and quantities. Since the formulas are intended to track an increasing, linearly homogeneous, concave aggregator function in either case, the properties of the formula are the same, whether used as a price or a quantity index number.
Fig. 2. Seasonally adjusted normalized velocity during the 1970s.
''The [simple summation] procedure is a very special case of the more general approach. In brief, the general approach consists of regarding each asset as a joint product having different degrees of 'moneyness', and defining the quantity of money as the weighted sum of the aggregated value of all assets . . . . We conjecture that this approach deserves and will get much more attention than it has so far received''.

More recently, subsequent to Barnett's derivation of the Divisia monetary aggregates, Lucas (2000, p. 270) wrote: ''I share the widely held opinion that M1 is too narrow an aggregate for this period [the 1990s], and I think that the Divisia approach offers much the best prospects for resolving this difficulty''.

4. The 1960s and 1970s

Having surveyed the theory and some of the relevant historical background, we now provide some key results. We organize them chronologically, to make the evolution of views clear. We first provide results for the 1960s and 1970s. The formal econometric source of the graphical results in this section, along with further modeling and inference details, can be found in Barnett et al. (1984) and Barnett (1982).

Demand and supply of money functions were fundamental to macroeconomics and to central bank policy until the 1970s, when questions began to arise about the stability of those functions. It was common for general equilibrium models to determine real values and relative prices, and for the demand and supply for money to determine the price level and thereby nominal values. But it was believed that something went wrong in the 1970s. In Fig. 2, observe the behavior of the velocity of M3 and M3+ (later called L), which were the two broad aggregates often emphasized in that literature. For the demand for money function to have the correct sign for its interest elasticity (better modeled as user-cost price elasticity), velocity should move in the same direction as nominal interest rates. Fig. 3 provides an interest rate during the same time period. Note that while nominal interest rates were increasing during the growing inflation of that decade, the velocity of the simple sum monetary aggregates in Fig. 2 was decreasing. While the source of concern is evident, note that the problem did not exist, when the data were produced from index number theory.

Much of the concern in the 1970s was focused on 1974, when it was believed that there was a sharp structural shift in money markets. Fig. 4 displays a source of that concern. As is evident from this figure – which plots velocity against a bond rate, rather than against time – there appears to be a dramatic shift downwards in that velocity function in 1974. But observe that this result was acquired using simple sum M3. Fig. 5 displays the same cross plot of velocity against an interest rate, but with M3 computed as its Divisia index.
Fig. 3. Interest rates during the 1970s: 10 year government bond rate.
Fig. 4. Simple sum M3 velocity versus interest rate: Moody's AAA corporate bond rate, quarterly, 1959.1–1980.3.
Fig. 5. Divisia M3 velocity versus interest rate: Moody's AAA corporate bond rate, quarterly, 1959.1–1980.3.
Fig. 6. Simple sum M3 base multiplier versus interest rate: deviation from time trend of Moody's Baa corporate bond rate, monthly, 1969.1–1981.8.
Fig. 7. Divisia M3 monetary aggregate base multiplier versus deviation from time trend of Moody's Baa corporate bond interest rate, monthly, 1969.1–1981.8.
Observe that the velocity no longer is constant, either before or after 1974. But there is no structural shift.

There were analogous concerns about the supply side of money markets. The reason is evident from Fig. 6, which plots the base multiplier against a bond rate's deviation from trend. The base multiplier is the ratio of a monetary aggregate to the monetary base. In this case, the monetary aggregate is again simple sum M3. Observe the dramatic structural shift. Prior to 1974, the function was a parabola. After 1974 the function is an intersecting straight line. But again this puzzle was produced by the simple-sum monetary aggregate. In Fig. 7, the same plot is provided, but with the monetary aggregate changed to Divisia M3. The structural shift is gone.

The econometric methods of investigating these concerns at the time were commonly based on the use of the Goldfeld (1973) demand for money function, which was the standard specification used by the Federal Reserve System. The equation was a linear regression of a monetary aggregate on national income, a regulated interest rate, and an unregulated interest rate. It was widely believed that the function had become unstable in the 1970s. Swamy and Tinsley (1980), at the Federal Reserve Board in Washington, DC, had produced a stochastic coefficients approach to estimating a linear equation. The result was an estimated stochastic process for each coefficient. The approach permitted testing the null hypothesis that all of the stochastic processes are constant. Swamy estimated the processes for the model's three coefficients at the Federal Reserve Board with quarterly data from 1959:2 to 1980:4, and the econometric results were published by Barnett et al. (1984). The realizations of the three coefficient processes are displayed in Figs. 8–10. The solid line is the process's realization, when money is measured by simple sum M2. The dotted line is the realization, when the monetary aggregate is measured by the Divisia index. The instability of the coefficient is very clear, when the monetary aggregate is simple sum; but the processes look like noise around a constant, when the monetary aggregate is Divisia.
Fig. 8. Income coefficient time path.
Fig. 9. Market interest rate (commercial paper rate) coefficient time path.
Fig. 10. Regulated interest rate (passbook rate) coefficient time path.
The statistical test could not reject constancy (i.e., stability of the demand for money function), when Divisia was used. But stability was rejected, when the monetary aggregate was simple sum.

5. The monetarist experiment: November 1979–November 1982

Following the inflationary 1970s, Paul Volcker, as Chairman of the Federal Reserve Board, decided to bring inflation under control by decreasing the rate of growth of the money supply, with the instrument of policy being changed from the federal funds rate to nonborrowed reserves. The period, November 1979–November 1982, during which that policy was applied, was called the ''Monetarist Experiment''. The policy succeeded in ending the escalating inflation of the 1970s, but was followed by an unintended recession. The Federal Reserve had decided that the existence of widespread 3-year negotiated wage contracts precluded a sudden decrease in the money supply growth rate to the intended long run growth rate. The decision was to decrease from the high double-digit growth rates to about 10% per year and then gradually decrease towards the intended long run growth rate, to avoid inducing a recession.
Fig. 11. Seasonally adjusted annual M3 growth rates. Divisia (—), simple sum (– – –). The last three observations to the right of the vertical line are post sample period.
Table 1
Mean growth rates during the period November 1979–August 1982.

Monetary aggregate    Mean growth rate (% per year)
Divisia M2            4.5
Simple sum M2         9.3
Divisia M3            4.8
Simple sum M3         10.0
Fig. 11 and Table 1 reveal the cause of the unintended recession. As is displayed in Fig. 11 for the M3 levels of aggregation, the rate of growth of the Divisia monetary aggregate was substantially less than the rate of growth of the official simple-sum-aggregate intermediate targets. As Table 1 summarizes, the simple sum aggregate growth rates were at the intended levels, but the Divisia growth rates were half as large, producing a negative shock of
substantially greater magnitude than intended. For computational details, see Barnett (1984). A recession followed.

6. End of the monetarist experiment: 1983–1984

Following the end of the Monetarist Experiment and the unintended recession that followed, Milton Friedman became very vocal with his prediction that there had just been a huge surge in the growth rate of the money supply, and that the surge would work its way through the economy and produce a new inflation. He further predicted that there would subsequently be an overreaction by the Federal Reserve, plunging the economy back down into a recession. He published this view repeatedly in various magazines and newspapers, with the most visible being his Newsweek article, ''A Case of Bad Good News'', which appeared on p. 84 on September 26, 1983. We have excerpted some of the sentences from that Newsweek article below:

''The monetary explosion from July 1982 to July 1983 leaves no satisfactory way out of our present situation. The Fed's stepping on the brakes will appear to have no immediate effect. Rapid recovery will continue under the impetus of earlier monetary growth. With its historical shortsightedness, the Fed will be tempted to step still harder on the brake—just as the failure of rapid monetary growth in late 1982 to generate immediate recovery led it to keep its collective foot on the accelerator much too long. The result is bound to be renewed stagflation—recession accompanied by rising inflation and high interest rates . . . . The only real uncertainty is when the recession will begin''.

But on exactly the same day, September 26, 1983, William Barnett published a very different view in his article, ''What Explosion?'', on p. 196 of Forbes magazine. The following is an excerpt of some of the sentences from that article:

''People have been panicking unnecessarily about money supply growth this year. The new bank money funds and the super NOW accounts have been sucking in money that was formerly held in other forms, and other types of asset shuffling also have occurred. But the Divisia aggregates are rising at a rate not much different from last year's . . . the 'apparent explosion' can be viewed as a statistical blip''.

Milton Friedman would not have taken such a strong position without reason. You can see the reason from Fig. 12. The percentage growth rates in that figure are divided by 10, so should be multiplied by 10 to acquire the actual growth rates. Notice the large spike in the growth rate, which rises to near 30% per year. But that solid line is produced from simple sum M2, which was greatly overweighting the sudden new availability of super-NOW accounts and money market deposit accounts. There was no spike in the Divisia monetary aggregate, represented by the dashed line. If the huge surge in the money supply had happened, then inflation would surely have followed, unless money is extremely non-neutral even in the short run – a view held by very few economists. But there was no inflationary surge and no subsequent recession.

7. The rise of risk adjustment concerns: 1984–1993

The exact monetary quantity aggregator function, Mt = u(mt), can be tracked very accurately by the Divisia monetary aggregate, $m_t^d$, since its tracking ability is known under perfect certainty. However, when nominal interest rates are uncertain, the Divisia monetary aggregate's tracking ability is somewhat compromised. That compromise is eliminated by using the extended Divisia monetary aggregate under risk derived by Barnett et al. (1997).
Let $m_t^G$ denote the extended ''generalized'' Divisia monetary aggregate. The only difference between $m_t^G$ and $m_t^d$ is the risk-adjusted user cost formula used to compute the prices in the generalized Divisia index formula. Let $\pi_{it}^G$ denote the generalized user cost of monetary asset i. Under CCAPM (consumption capital asset pricing) assumptions, Barnett et al. (1997) prove that

\[ \pi_{it}^G = \pi_{it}^e + \phi_{it}, \]

where

\[ \pi_{it}^e = \frac{E_t(R_t - R_{it})}{E_t(1 + R_t)} \]

and

\[ \phi_{it} = \frac{E_t(1 + R_{it})\, \operatorname{Cov}\!\left( R_t, \dfrac{\partial T}{\partial C_{t+1}} \right)}{E_t(1 + R_t)\, \dfrac{\partial T}{\partial C_t}} - \frac{\operatorname{Cov}\!\left( R_{it}, \dfrac{\partial T}{\partial C_{t+1}} \right)}{\dfrac{\partial T}{\partial C_t}}, \]

where

\[ T = E_t \sum_{t=0}^{\infty} \beta^t F(c_t, m_t^G). \]

Barnett et al. (1997) show that the values of $\phi_{it}$ determine the risk premia in interest rates. Note that $\pi_{it}^G$ reduces to Eq. (2) under perfect certainty.

Fig. 12. Monetary growth rates. Divisia M2 (– – –), simple sum M2 (—), 1970–1996, from St. Louis Federal Reserve's database.

Using that extension, Barnett and Xu (1998) demonstrated that velocity will change, if the variance of an interest rate stochastic process changes. Hence the variation in the variance of an interest rate ARCH or GARCH stochastic process cannot be ignored in modeling monetary velocity. By calibrating a stochastic dynamic general equilibrium model, Barnett and Xu (1998) showed that the usual computation of the velocity function will be unstable, when interest rates exhibit stochastic volatility. But when the CCAPM adjusted variables above are used, so that the variation in variance is not ignored, velocity is stabilized.

Fig. 13 displays the simulated slope coefficient for the velocity function, treated as a function of the exact interest rate aggregate, but without risk adjustment. All functions in the model are stable, by construction. Series 1 was produced with the least stochastic volatility in the interest rate stochastic process, series 2 with greater variation in variance, and series 3 with even more stochastic volatility. Note that the velocity function slope appears to be increasingly unstable, as stochastic volatility increases. By the model's construction, the slope of the velocity function is constant, if the CCAPM risk adjustment is used. In addition, with real economic data, Barnett and Xu (1998) showed that the evidence of velocity instability is partially explained by overlooking the variation in the variance of interest rates over time.
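To make the covariance adjustment concrete, the sketch below is ours, with entirely hypothetical simulated processes standing in for the conditional expectations and the marginal utilities ∂T/∂C; it estimates $\pi_{it}^e$ and $\phi_{it}$ by replacing conditional moments with sample moments across simulated draws.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

# Hypothetical joint draws of the benchmark rate, an own rate, and next period's
# marginal utility of consumption; none of this comes from the paper's data.
R = 0.06 + 0.015 * rng.standard_normal(n)     # benchmark rate R_t
Ri = 0.03 + 0.005 * rng.standard_normal(n)    # own rate R_it
mu_next = np.exp(-2.0 * (R - 0.06)
                 + 0.1 * rng.standard_normal(n))  # stand-in for dT/dC_{t+1}
mu_now = 1.0                                  # dT/dC_t, normalized

pi_e = (R.mean() - Ri.mean()) / (1.0 + R.mean())   # certainty-equivalent part
phi = ((1.0 + Ri.mean()) * np.cov(R, mu_next)[0, 1]
       / ((1.0 + R.mean()) * mu_now)
       - np.cov(Ri, mu_next)[0, 1] / mu_now)       # covariance risk adjustment
pi_G = pi_e + phi
print(pi_e, phi, pi_G)  # here phi < 0: R_t covaries negatively with marginal utility
```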
Fig. 13. Simulated velocity slope coefficient with stochastic volatility of interest rates.
Fig. 14. M2 joint product and economic capital stock of money. M2 = simple sum joint product; CEM2 = economic capital stock part of the joint product.
Subsequently Barnett and Wu (2005) found that the explanatory power of the risk adjustment increases, if the assumption of intertemporal separability of the intertemporal utility function, T , is weakened. The reason is the same as a source of the well known equity premium puzzle, by which CCAPM under intertemporal separability under-corrects for risk. The Divisia index tracks the aggregator function, which measures service flow. But for some purposes, the economic capital stock, computed from the discounted expected future service flow, is relevant, especially when investigating wealth effects of policy. The economic stock of money (ESM), as defined by Barnett (2000) under perfect foresight, follows immediately from the manner in which monetary assets are found to enter the derived wealth constraint, (2.3). As a result, the formula for the economic stock of money under perfect foresight is
\[ V_t = \sum_{s=t}^{\infty} \sum_{i=1}^{n} \left[ \frac{p_s^*}{\rho_s} - \frac{p_s^*\,(1 + r_{is})}{\rho_{s+1}} \right] m_{is}, \]

where the true cost of living index on consumer goods is p∗s = p∗s(ps), with the vector of consumer goods prices being ps, and where the discount rate for period s is

\[ \rho_s = \begin{cases} 1 & \text{for } s = t \\ \displaystyle\prod_{u=t}^{s-1} (1 + R_u) & \text{for } s > t. \end{cases} \]
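As an illustration of the formula for Vt (ours, not the authors'): with constant expected rates, a constant price level, and martingale expectations for future balances, the infinite sum telescopes to the CE index, Σi mi (R − ri)/R. The truncated sum below verifies that numerically; all numbers are hypothetical.

```python
import numpy as np

R = 0.06                         # constant expected benchmark rate
r = np.array([0.00, 0.03])       # constant expected own rates
m = np.array([100.0, 200.0])     # current real balances, held flat (martingale)
p_star = 1.0                     # constant true cost-of-living index

# Truncated version of the V_t double sum, with rho_s = (1 + R)**(s - t)
V = 0.0
for k in range(1000):            # s - t = k; 1000 periods approximates infinity
    V += np.sum((p_star / (1 + R)**k
                 - p_star * (1 + r) / (1 + R)**(k + 1)) * m)

CE = np.sum(m * (R - r) / R)     # closed form under these assumptions
print(V, CE)                     # the two agree to high precision
```

Note that for currency (r = 0) the capital stock equals the balance itself, while interest-bearing components contribute only the discounted monetary-service portion of their value.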
The CCAPM extension of the economic capital stock formula to risk is available from Barnett et al. (2006). During the late 1980s and early 1990s, there was increasing concern about the substitution of monetary assets within the monetary aggregates (especially money market mutual funds) with stock and bond mutual funds, which are not within the monetary aggregates. The Federal Reserve Board staff considered the possibility of incorporating stock and bond mutual funds into the monetary aggregates. Barnett and Zhou (1994a) used the formulas above to investigate the problem. They produced the figures that we reproduce below as Figs. 14 and 15. The dotted line is the simple sum monetary aggregate, which Barnett (2000) proved is equal to the sum of economic capital stock of money, Vt , and the discounted expected investment return from the components. Computation of Vt requires modeling expectations. In that early paper, Barnett and Zhou (1994a) used martingale expectations rather than the more recent approach of Barnett, Chae, and Keating, using VAR forecasting. When martingale expectations are used, the index is called CE. Since the economic capital stock of money, Vt , is what is relevant to macroeconomic theory, we should concentrate on the solid lines in those figures. Note that Fig. 15 displays nearly
Fig. 15. M2+ joint product and economic capital stock of money. M2+ = simple sum joint product; CEM2+ = economic capital stock part of the joint product.
parallel time paths, so that the growth rate is about the same in either. That figure is for M2+, which was the Federal Reserve Board staff's proposed extended aggregate, adding stock and bond mutual funds to M2. But note that in Fig. 14, the gap between the two graphs is decreasing, producing a slower rate of growth for the simple sum aggregate than for the economic stock of money. The gap between the two lines is the amount motivated by investment yield. Clearly those gaps had been growing. But it is precisely that gap which does not measure monetary services. By adding the value of stock and bond mutual funds into Fig. 14 to get Fig. 15, the growth rate error of the simple sum aggregate is offset by adding in an increasing amount of assets providing nonmonetary services. Rather than trying to stabilize the error gap by adding in more and more nonmonetary services, the correct solution would be to remove the entire error gap by using the solid line in Fig. 14, which measures the actual economic capital stock of money.

8. The Y2K computer bug: 1999–2000

The next major concern about monetary aggregates and monetary policy arose at the end of 1999. In particular, the financial press became highly critical of the Federal Reserve for what was perceived by those commentators to be a large, inflationary surge in the monetary base. The reason is clear from Fig. 16. But in fact there was no valid reason for concern, since the cause was again a problem with the data.
Fig. 16. Monetary base surge.
Fig. 17. Y2K computer bug.
Fig. 18. Financial general equilibrium without required reserves.
The monetary base is the sum of currency plus bank reserves. Currency is dollar-for-dollar pure money, while reserves back deposits in an amount that is a multiple of the reserves. Hence as a measure of monetary services, the monetary base is severely defective, even though it is a correct measure of ''outside money''. At the end of 1999, there was the so-called Y2K computer bug, which was expected to cause temporary problems with computers throughout the world, including at banks. Consequently many depositors withdrew funds from their checking accounts and moved them into cash. While the decrease in deposits thereby produced an equal increase in currency demand, the decrease in deposits produced a smaller decline in reserves, because of the multiplier from reserves to deposits. The result was a surge in the monetary base, even though the cause was a temporary dollar-for-dollar transfer of funds from demand deposits to cash, having little effect on economic liquidity. Once the computer bug was resolved, people put the withdrawn cash back into deposits, as is seen from Fig. 17.
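A back-of-the-envelope calculation (ours; the 10% reserve ratio and dollar amounts are hypothetical) shows how a pure deposit-to-currency shift inflates the base without changing liquidity:

```python
# Hypothetical balance sheet, in billions of dollars.
reserve_ratio = 0.10
currency, deposits = 500.0, 1000.0
reserves = reserve_ratio * deposits
base_before = currency + reserves        # 500 + 100 = 600

shift = 50.0                             # Y2K withdrawal: deposits -> cash
currency += shift
deposits -= shift
reserves = reserve_ratio * deposits      # reserves fall by only 0.10 * 50 = 5
base_after = currency + reserves         # 550 + 95 = 645

print(base_after - base_before)          # the base "surges" by 45, yet the
                                         # public's monetary services barely changed
```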
9. The supply side

While much of the concern in this literature has been about the demand for money, there is a parallel literature about the supply of money by financial intermediaries. Regarding the aggregation theoretic approach, see Barnett and Hahm (1994) and Barnett and Zhou (1994b). It should be observed that the demand-side Divisia monetary aggregate, measuring perceived service flows received by financial asset holders, can be slightly different from the supply-side Divisia monetary aggregate, measuring service flows produced by financial intermediaries. The reason is the regulatory wedge resulting from non-interest-bearing required reserves. That wedge produces a difference between demand-side and supply-side user-cost prices and thereby can produce a small difference between the demand side and supply side Divisia aggregates.

When there are no required reserves and hence no regulatory wedge, the general equilibrium looks like Fig. 18, with the usual separating hyperplane determining the user cost prices, which are the same on both sides of the market. The production possibility surface between deposit types 1 and 2 is for a financial intermediary, while the indifference curve is for a depositor allocating funds over the two asset types. In equilibrium, the quantity of asset i demanded, mit, is equal to the quantity supplied, μit, and the slope of the separating hyperplane determines the relative user costs on the demand side, πit, which are equal to those on the supply side, γit. That diagram assumes that the same user-cost prices are seen on both sides of the market. But when non-interest-bearing required reserves exist, the foregone investment return to banks is an implicit tax on banks and produces a regulatory wedge between the demand and supply side. It was shown by Barnett (1987) that under those circumstances, the user cost of supplied financial services by banks is not equal to the demand price, (2), but rather is

\[ \gamma_{it} = \frac{(1 - k_{it})\,R_t - r_{it}}{1 + R_t}, \]

where kit is the required reserve ratio for account type i, rit again is the interest rate paid on deposit type i, and now the bank's benchmark rate, Rt, is its loan rate. Note that this supply-side user cost is equal to the demand-side formula, (2), when kit = 0, if the depositor's benchmark rate is equal to the bank's loan rate, as in classical macroeconomics, in which there is one pure investment rate of return.13

The resulting general equilibrium diagram, with the regulatory wedge, is displayed in Fig. 19. Notice that one tangency determines the supply-side prices, while the other tangency produces the demand-side prices, with the angle between the two straight lines being the ''regulatory wedge''. Observe that the demand equals the supply for each of the two component assets. Although the component demands and supplies are equal to each other, the failure of tangency between the production possibility curve and the indifference curve can result in a wedge between the growth rates of aggregate demand and supply services, as reflected in the fact that the user cost prices in the Divisia index are not the same in the demand and the supply side aggregates.

To determine whether this wedge might provide a reason to compute and track the Divisia monetary supply aggregate as well as the more common demand-side Divisia monetary aggregate, Barnett et al. (1986) conducted a detailed spectral analysis in the frequency domain. Fig. 20 displays the squared coherence between the demand and supply side Divisia monetary aggregates, where coherence measures correlation as a function of frequency. The figure provides those plots at three levels of aggregation. Note that the correlation usually exceeds 95% for all three levels of aggregation at all frequencies, but the coherence begins to decline at very high frequencies (i.e., very short cycle periods in months). Hence the difference between the demand and supply side monetary aggregates is relevant only in modeling very short run phenomena.

To put this into context, we display plots in the time domain for simple-sum M3, the supply-side M3 Divisia index (SDM3), and the demand-side M3 Divisia index (DDM3) over the same time period used in producing the frequency domain comparisons. See Fig. 21 for those plots. Notice that it takes over a decade for the difference between the demand side and supply side Divisia index to get wider than a pencil point, but the divergence between simple sum and either of the two Divisia aggregates begins immediately and is cumulative. In short, the error in using the simple-sum monetary aggregates is overwhelmingly greater than the usually entirely negligible difference between the demand and supply side Divisia monetary aggregates. Furthermore, in recent years reserve requirements have been low and largely offset by sweeps. In addition, the Federal Reserve recently began paying interest on required reserves, although not necessarily at the bank's full loan rate. The difference between the demand and supply side Divisia monetary aggregates now is much smaller than during the time period displayed in Figs. 20 and 21.

13 An excellent source on the supply side is Hancock (1985a,b, 1991), who independently acquired many of the same results described in this section, but using a different procedure for determining inputs and outputs in production. Our view is that the distinction should be based upon value added in production, rather than upon the ex-post realization of the user cost price.
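The size of the regulatory wedge is easy to quantify from the two user cost formulas. This sketch is ours; the reserve ratio and rates are hypothetical.

```python
R = 0.08            # bank loan rate, taken equal to the depositors' benchmark rate
r = 0.04            # interest rate paid on the deposit type
k = 0.10            # required reserve ratio

pi_demand = (R - r) / (1 + R)               # demand-side user cost, Eq. (2)
gamma_supply = ((1 - k) * R - r) / (1 + R)  # supply-side user cost (Barnett, 1987)
print(pi_demand, gamma_supply)
print(pi_demand - gamma_supply)             # wedge = k * R / (1 + R), the implicit tax
```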
Fig. 19. Financial equilibrium with positive required reserves.
Fig. 20. Squared coherence between the demand-side and supply-side Divisia aggregates.
Fig. 21. Simple sum M3, Divisia demand DDM3, and Divisia supply SDM3.

10. The great moderation
The most recent research on the comparison of the aggregation-based Divisia monetary aggregates with the simple sum monetary aggregates is Barnett et al. (2009). The paper proposes a latent factor Markov switching approach that separates out common dynamics in monetary aggregates from their idiosyncratic movements. The dynamic factor measures the common cyclical movements underlying the monetary aggregate indices. The idiosyncratic term captures movements peculiar to each index. The approach is used to provide pairwise comparisons of Divisia versus simple-sum monetary aggregates quarterly from 1960:2 to 2005:4. In that paper, they introduced the connection between the state-space time-series approach to assessing measurement error and the aggregation theoretic concept, with emphasis upon the relevancy to monetary aggregation and monetary policy.

10.1. The model

Let Yt be the n × 1 vector of monetary indices, where n is the number of monetary indices in the model:
\[ \Delta Y_t = \lambda\, \Delta F_t + \gamma\, \tau_t + v_t, \tag{8} \]

where Δ = 1 − L and L is the lag operator. Changes in the monetary aggregates, ΔYt, are modeled as a function of a scalar unobservable factor that summarizes their commonalities, ΔFt; an idiosyncratic component n × 1 vector, which captures the movements peculiar to each index, vt; and a scalar potential time trend τt. The factor loadings, λ, measure the sensitivity of the series to the dynamic factor, ΔFt. Both the dynamic factor and the idiosyncratic terms follow autoregressive processes:

\[ \Delta F_t = \alpha_{S_t} + \phi(L)\, \Delta F_{t-1} + \eta_t, \qquad \eta_t \sim N(0, \sigma^2), \tag{9} \]

\[ v_t = \Gamma_{S_t^h} + d(L)\, v_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim \text{i.i.d. } N(0, \Sigma), \tag{10} \]

where ηt is the common shock to the latent dynamic factor, and εt are the measurement errors. In order to capture potential nonlinearities across different monetary regimes, the intercept of the monetary factor switches regimes according to a Markov variable, St, where $\alpha_{S_t} = \alpha_0 + \alpha_1 S_t^\alpha$, and $S_t^\alpha = 0, 1$. That is, monetary indices can either be in an expansionary regime, where the mean growth
rate of money is positive ($S_t^\alpha = 1$), or in a contractionary phase with a lower or negative mean growth rate ($S_t^\alpha = 0$). We also assume that the idiosyncratic terms for each index follow distinct two-state Markov processes, by allowing their drift terms, $\Gamma_{S_t^h}$, to switch between regimes. For example, in the case of two monetary indices, n = 2, there will be two idiosyncratic terms, each one following an independent Markov process, $S_t^\beta$ and $S_t^\delta$, where $S_t^\beta = 0, 1$ and $S_t^\delta = 0, 1$. Notice that we do not constrain the Markov variables $S_t^\alpha$, $S_t^\beta$, and $S_t^\delta$ to be dependent on each other, but allow them instead to move according to their own dynamics. In fact, there is no reason to expect that the idiosyncratic terms would move in a similar manner to each other or to the dynamic factor, since by construction they represent movements peculiar to each index not captured by the common factor.

The switches from one state to another are determined by the transition probabilities of the first-order two-state Markov processes,

\[ p_{ij}^k = P\big(S_t^k = j \mid S_{t-1}^k = i\big), \qquad \sum_{j=0}^{1} p_{ij}^k = 1, \quad i, j = 0, 1, \]

with k = α, β, δ identifying the Markov processes for the dynamic factor and the two idiosyncratic terms, respectively.

The model separates out the common signal underlying the monetary aggregates from individual variations in each of the indexes. The dynamic factor captures simultaneous downturns and upturns in money growth indices. On the other hand, if only one of the variables declines, e.g. M1, this would not characterize a general monetary contraction in the model and would be captured by the M1 idiosyncratic term. A general monetary contraction (expansion) will occur when all n variables decrease (increase) at about the same time. That is, ηt and vt are assumed to be mutually independent at all leads and lags for all n variables, and d(L) is diagonal. The dynamic factor is the outcome of averaging out the discrete states. Although the n monetary indices represent different measurements of money, the estimated dynamic factor is a nonlinear combination of them, representing broader movements in monetary aggregates in the US. On the other hand, once a contraction or expansion is clearly under way, the idiosyncratic term for a particular aggregate can be highly informative near a turning point.

Dynamic factor models with regime switching have been widely used to represent business cycles. The proposed model differs from the literature in its complexity, as it includes estimation of the parameters of three independent Markov processes. The model is cast in state space form, where (11) and (12) are the measurement and transition equations, respectively:
\[ \Delta Y_t = Z \xi_t + G \tau_t, \tag{11} \]

\[ \xi_t = \mu_{S_t} + T \xi_{t-1} + u_t. \tag{12} \]
A particular state space representation for the estimated indicator using two variables is:

\[
\Delta Y_t = \begin{bmatrix} \Delta Y_{1t} \\ \Delta Y_{2t} \end{bmatrix}, \qquad
Z = \begin{bmatrix} \lambda_1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \end{bmatrix}, \qquad
G = \begin{bmatrix} \gamma_1 \\ \gamma_2 \end{bmatrix},
\]

\[
\xi_t = \begin{bmatrix} \Delta F_t \\ v_{1t} \\ v_{2t} \\ F_{t-1} \end{bmatrix}, \qquad
\mu_{S_t} = \begin{bmatrix} \alpha_{S_t} \\ \beta_{S_t} \\ \delta_{S_t} \\ 0 \end{bmatrix}, \qquad
T = \begin{bmatrix} \phi_1 & 0 & 0 & 0 \\ 0 & d_1 & 0 & 0 \\ 0 & 0 & d_2 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}, \qquad
u_t = \begin{bmatrix} \eta_t \\ \varepsilon_{1t} \\ \varepsilon_{2t} \\ 0 \end{bmatrix}.
\]
The term Ft−1 is included in the state vector to allow estimation of the dynamic factor in levels from the identity ΔFt−1 = Ft−1 − Ft−2. Barnett et al. (2009) estimate three models, one for each pair of the rates of growth of the monetary indices: simple sum M1 and Divisia M1, simple sum M2 and Divisia M2, and simple sum M3 and Divisia M3, where Divisia corresponds to the ''monetary services index'' (MSI) computed from the Divisia index by the St. Louis Federal Reserve Bank.
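To fix ideas about Eqs. (8)–(12), the following sketch (ours; all parameter values are invented for illustration, not estimates from the paper) simulates the two-variable model with first-order dynamics and independent two-state Markov chains for the factor and each idiosyncratic term.

```python
import numpy as np

rng = np.random.default_rng(7)
Tn = 200  # sample length

def markov_chain(p00, p11, n):
    """Simulate a two-state first-order Markov chain with given staying probabilities."""
    s = np.zeros(n, dtype=int)
    for t in range(1, n):
        stay = p00 if s[t-1] == 0 else p11
        s[t] = s[t-1] if rng.random() < stay else 1 - s[t-1]
    return s

# Hypothetical parameters
a0, a1, phi1, sigma = -0.5, 1.5, 0.4, 0.3      # factor intercepts, AR, shock s.d.
lam1, g = 0.8, np.array([0.01, 0.01])          # loading and trend coefficients
drift = np.array([[-0.2, 0.3], [-0.3, 0.4]])   # idiosyncratic drifts by regime
d = np.array([0.3, 0.5])                       # idiosyncratic AR coefficients

s_a = markov_chain(0.90, 0.90, Tn)             # S^alpha for the factor
s_b = markov_chain(0.95, 0.95, Tn)             # S^beta for v_1
s_d = markov_chain(0.95, 0.95, Tn)             # S^delta for v_2

dF = np.zeros(Tn)
v = np.zeros((Tn, 2))
Y = np.zeros((Tn, 2))
for t in range(1, Tn):
    dF[t] = a0 + a1 * s_a[t] + phi1 * dF[t-1] + sigma * rng.standard_normal()
    regimes = np.array([s_b[t], s_d[t]])
    v[t] = drift[[0, 1], regimes] + d * v[t-1] + 0.2 * rng.standard_normal(2)
    Y[t] = np.array([lam1, 1.0]) * dF[t] + g * t + v[t]   # Eq. (8), lambda_2 = 1

print(Y[:5])
```

In estimation, the same structure is filtered rather than simulated: the Kalman filter is combined with Hamilton-style regime probabilities to evaluate the likelihood, which is why the model is cast in the state space form (11)–(12).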
Fig. 22. Idiosyncratic terms for M3 growth (– – –), Divisia M3 growth (—), difference between Divisia M3 growth and simple sum M3 growth (– –), and NBER recessions (shaded area).
10.2. Results

There are some remarkable short run differences between simple sum and Divisia monetary aggregates, as shown in Barnett et al. (2009). Fig. 22 displays the idiosyncratic terms specific to the growth rates of Divisia M3 and simple sum M3, along with NBER-dated recessions.14 Recall that the idiosyncratic terms indicate the ways in which the two aggregates differ.

First, the idiosyncratic term for M3 is generally higher than the one for Divisia M3 before 1984, especially in the 1970s. However, after 1984 this pattern reverses, and M3 is generally lower than Divisia M3. This is also the case for the difference between the indices Divisia M3 and simple sum M3, also shown in Fig. 22. That is, before 1984 the more precise measure of the growth rate of money supply (Divisia M3) was smaller than what was perceived (as measured by the official simple sum M3). On the other hand, during the Great Moderation the rate of growth of money supply as measured by Divisia was higher than perceived, as measured by simple sum. That is, monetary policy was more contractionary before 1984 and more expansionary during the Great Moderation than shown by simple sum. This result, at the least, calls for a reexamination of the extensive literature that investigates the tightness of monetary policy before and after 1984.

Second, the idiosyncratic terms differ substantially around recessions. Compare Divisia M3's idiosyncratic downward spikes in Fig. 22 with simple sum M3's idiosyncratic behavior, and then compare the relative predictive ability of the two extracted idiosyncratic terms with respect to NBER recessions. The Divisia M3 decreases a lot more before recessions (at the peak of inflation phases) and increases substantially more during recessions and recoveries (corresponding to low interest rate phases) than the simple sum aggregate M3. This is also observed in the difference between the two indices as depicted in Fig. 22. Accordingly, the Divisia index displays a business cycle pattern more consistent with monetary policy. However, notice as well that more contractionary policy before recessions than intended could have contributed to the onset of these phases of weak economic activity.
14 This pattern is also observed between the simple sum M2 and Divisia M2 monetary aggregates. In particular, the major divergences between simple sum M3 and Divisia M3 growth coincide in time and amplitude with the differences between simple sum M2 and Divisia M2 growth. We report here the results for M3. Details on the results for M1 and M2 can be found in Barnett et al. (2009).
Fig. 23. Idiosyncratic terms for M3 growth (– – –), Divisia M3 growth (—), difference between Divisia M3 growth and simple sum M3 growth (– –), and high interest rate phases (shaded area).
The most notable differences between Divisia M3 and simple sum M3 take place between 1978 and 1982, a period that includes the changes in the Fed's operating procedures, a slowdown, and two recessions. As discussed in Section 5, the Divisia index was substantially lower than the simple sum M3, especially right before the 1980 and 1981 recessions. The only recession that was not preceded by a fall in the Divisia index was the 1990–1991 recession (the Divisia index decreased in 1989 but had a subsequent increase before the onset of this recession). Thus, with the exception of the 1990 recession, monetary policy has been more contractionary before recessions than intended. This is the case both before and during the Great Moderation.

Fig. 23 plots the idiosyncratic terms for Divisia and simple sum, but now against high interest rate phases.15 The differences between the Divisia M3 and the simple sum M3 are even more striking during these phases. Generally, at the beginning of high interest rate phases the idiosyncratic term for Divisia M3 starts to fall, and it reaches its lowest values towards the end of these phases. At the trough, the idiosyncratic term for Divisia M3 starts to increase.16 This pattern is not observed in the idiosyncratic term for simple sum M3. That is, the dynamics of these Divisia indices correspond more closely to the expected movements related to interest rates and inflation. Recall that the idiosyncratic terms show when the two series differ, i.e., what they do not have in common.

In a speech at the Fourth ECB Central Banking Conference, Chairman Bernanke (2006) discussed some potential problems with the use of simple sum monetary aggregates for monetary policy. In particular, he mentioned that between 1991 and 1992 simple sum M2 grew much more slowly than was predicted by models used by the Federal Reserve Board. In particular, the so-called P∗ model, which used simple sum M2 in the quantity theory of money, along with estimates of long-run potential output and velocity, to predict long-run inflation trends, even predicted deflation for those years. Fig. 24 shows the differences between Divisia M2 and simple sum M2. The rate of growth of Divisia M2 was substantially higher than the rate of growth of simple sum M2 during this period.
15 Barnett et al. (2009) classify a high interest rate phase as one in which the Federal Funds rate increases persistently for two quarters, and the phase lasts until the rate reaches a peak. Analogously, low interest rate phases start when the Fed Funds rate falls for two quarters, and they last until the rate reaches a trough. 16 Note, however, that this relationship is not one-to-one. For general utility functions, the direction of the change in shares with a price change, or equivalently with an interest rate change, depends upon whether the own price elasticity of demand exceeds or is less than −1. This is the case, for example, in the early and late 1960s.
Fig. 24. Idiosyncratic terms for M2 growth (– – –), Divisia M2 growth (—), difference between Divisia M2 growth and simple sum M2 growth (– –), and NBER Recession (shaded area).
11. The 2008 financial crisis

11.1. Prior to July 2004

We believe that the highly leveraged investment, borrowing, and lending that led up to the current crisis were not ''irrational'' relative to the views that had widely arisen about the Great Moderation and the Greenspan Put. The problem was the information set on which the expectations were conditioned. There is evidence that the monetary policy in recent decades may have been more expansionary than was realized by the Federal Reserve and thereby may have fed the bubbles. There is also evidence that policy may have been more contractionary than realized by the Federal Reserve at the start of the crisis. We wish to emphasize that the evidence is not unambiguous in these regards, largely as a result of unfortunate Federal Reserve data limitations. In addition to the problems emphasized in this paper, there also is the fact that the Federal Reserve reports demand deposits post-sweeps, thereby biasing downwards M1, M2, and M3, with the bias in M1 being especially severe. However, since sweeps are excluded from both simple sum and Divisia aggregates, the difference between these two indices is not affected by this problem.17 In any case, we feel that it is worthwhile providing the evidence that we have.

There is a strain of thought that maintains that the current US financial crisis was prompted by excessive money creation fueling the bubbles. The process started in early 2001, when money supply growth was increased substantially to minimize the economic recession that had started in March of that year. However, in contrast with previous recessions, money supply growth continued to be high for a few years after the recession's end in November 2001. In fact, money supply growth was high until mid 2004, after which it started decreasing slowly. The argument is that the monetary expansion during this first period led to both speculation and leveraging, especially regarding lending practices in the housing sector. This expansion, it is argued, made it possible for marginal borrowers to obtain loans with lower collateral values. When money creation slowed, housing prices began to decline, leaving many borrowers with negative equity and inducing a wave of defaults and foreclosures.
17 In order to evade reserve requirements, banks sweep demand deposits into money-market deposit savings accounts, but continue to service them as regular demand deposits. Thus, the method of aggregation is not the only problem. See, e.g., Jones et al. (2005).
Fig. 25. Idiosyncratic terms for M3 growth (– – –), Divisia M3 growth (—), difference between Divisia M3 growth and simple sum M3 growth (– –).
If this were the case, it would be worthwhile to ask what would have motivated the policy that had this outcome. We see no reason to believe that the Federal Reserve would have had the creation of 'excessive' money growth as a goal. Had the Federal Reserve known that the amount of money circulating in the economy was excessive and could generate an asset bubble, monetary policy would have been reversed long before it was. Fig. 25(a) and (b) display the idiosyncratic term for simple sum M3 and Divisia M3, as well as the difference between the two monetary aggregates for the recent period, plotted against the NBER-dated recession and high interest rate phases, respectively. First, notice that, as in the case of the previous recessions, the idiosyncratic term for Divisia M3 before the 2001 recession falls substantially more than the idiosyncratic term for simple sum M3. This is also observed in the difference between the rate of growth of the two indices. As shown in the figure, the subsequent increase in money supply growth during the 2001 recession, as measured by Divisia M3, was substantially larger than as measured by simple sum M3. In addition, Divisia money supply growth continued to be much higher than simple sum money supply growth between 2001 and 2003. Thus, the actual money supply growth post-recession was even higher than what was intended. These differences between the two indexes confounded monetary policy, since the official simple sum data understated the actual amount of money available in the economy, as measured by the Divisia index.

11.2. Subsequent to July 2004

There may be some truth to the view that the recent bubble economy was accommodated by years of excessively expansionary monetary policy. Since all bubbles eventually burst, it is thereby argued that the current problems were unavoidable. Whether or not that view is correct, it is interesting to ask what broke the bubble, even if it eventually would have burst anyway. Comparison of the Divisia and simple sum monetary aggregates and inspection of Federal Reserve data provide relevant information. The Federal Reserve had been increasing the target value for the federal funds rate since June 2004. As can be seen in Fig. 25(b), this corresponded to a decrease in the idiosyncratic terms for M3 and Divisia M3. However, Divisia M3 fell substantially more than the simple sum M3. The difference between the two indices increases substantially until the end of the sample in 2005:4.18 Thus, the official simple sum M3 masks, once again, a much more contractionary policy by the Federal Reserve than intended, as also occurred prior to most recessions in the last 50 years.

By conventional measures, the Federal Reserve has been easing its monetary policy stance by reducing its target value for the
18 We have no way of knowing the pattern of the broader aggregates, M3 and L, from 2006 on, since collection of both has been terminated by the Federal Reserve.
Fig. 26. Total reserves until very recently.
federal funds interest rate from 5.25% in September 2007 to its recent level of near zero percent in 2008 and 2009. Has the Fed thereby been engaging in actions that are stimulative to economic activity? Low interest rates do not necessarily an expansionary monetary policy make.

It is helpful to illustrate the problem with a different central bank activity: sterilized exchange rate intervention. When the Fed decides to intervene in foreign exchange markets, its foreign desk swaps dollar-denominated assets for assets denominated in a foreign currency. Left unchecked at this point, the reserves of the US banking system (and the US money supply) would change, as would the market value of the federal funds interest rate. To sterilize the foreign exchange transaction, the domestic desk of the Fed, in a subsequent operation, either buys or sells US Treasuries in a magnitude sufficient to offset the impact of the foreign desk's activity and thereby keeps the US money supply, the federal funds rate, and the reserves of the US banking system unchanged. On net, two things are accomplished by these offsetting transactions by the Fed's foreign and domestic desks: creating the symbolic gesture of ''doing something'' about the dollar's value, and exposing the US taxpayer to potential losses, if subsequent changes in the exchange rate cause losses in the market value of the foreign assets now on the Fed's books. Similarly, much recent Federal Reserve activity, including its role in bailouts, has been sterilized and has had little effect on bank reserves, while exposing the taxpayers to sub-standard asset risk.

To illustrate the point, Fig. 26, produced from Federal Reserve data, shows the total amount of reserves in the US banking system over the past five years. Note that reserves – the raw material from which loans and spending are created – are lower in mid-2008 than in August of 2003! But changes in the funds rate are usually interpreted in the media as the product of Fed policy actions. According to that view, if the funds rate declines, it must be the result of an expansionary monetary policy action. Missing from this analysis is the other side of the reserves market: those who demand reserves have some
Fig. 27. Total reserves including recent surge.
Fig. 28. Taylor rule federal funds rate.
ability to affect the price – i.e., the federal funds rate – at which reserves trade. Those demanders are banks that see the demand for reserves rise and fall along with the demand for loans. When the demand for loans falls, the demand for reserves by banks declines. Hence, the federal funds rate can decline, because of declines in the demands for loans and reserves, without the Fed taking any policy action. While a decline in the funds rate is usually interpreted as ''evidence'' of an easy policy stance, the real signal in the market may be that the economy is weakening. The Great Depression and the recent history of Japan's long stagnation reveal that low interest rates, per se, are ambiguous indicators of the relative ease of monetary policy. The missing ingredient is the flow of bank reserves, the ultimate source of credit from which all other lending grows. For better or for worse, intentional or unintentional, herein may be the pin that pricked the recent bubble. Subsequent to the Fed's publication of the discouraging Fig. 26 chart, there has been an enormous surge of reserves injected into the banking system through the Fed's lender-of-last-resort function at its discount window; through the new credit facilities, such as the Primary Dealer Credit Facility and Term Auction Facility; and through the long overdue initiation of the Fed's payment of interest on reserves—an important new reform that provides an incentive for banks to increase their holdings of reserves. See Fig. 27.

12. The most recent data

Considering these most recent results along with the many others provided in this paper, and the relevant microeconomic aggregation theory, you might find it to be worthwhile to consider the most recent behavior of the Taylor rule, which does not use money at all (a stylized version of the rule is sketched below). Fig. 28 is reproduced from the St. Louis Federal Reserve Bank's publication, Monetary Trends. That figure displays the range of the target for the federal funds rate produced from the Taylor rule along with the actual interest rate over that time period, where the actual funds rate is the dark solid line. Notice that the actual interest rate was off-target for more than three successive years. Perhaps we now have a new paradox: the appearance of instability of the Taylor rule.

As documented in this paper, monetary policy and monetary research have been plagued by bad monetary aggregates data, resulting from simple sum aggregation, which has been disreputable among aggregation and index number theorists for over a half century. In addition, we have shown that the puzzles that have arisen since the early 1970s were produced by simple sum aggregation and would be resolved if reputable index number formulas were used. With so much history and evidence and so much research documenting the data problems, it might be assumed that central banks would now be taking much care to provide high quality data that is consistent with economic theory. But look at Fig. 29, which was downloaded from the St. Louis Federal Reserve Bank web site and is produced from official Federal Reserve Board data.

Fig. 29. Nonborrowed reserves.

Recall that during Volcker's ''Monetarist Experiment'' period, the instrument of policy was nonborrowed reserves. Fig. 29 displays official recent data on nonborrowed reserves from the Federal Reserve Board. Total reserves are the sum of borrowed reserves and nonborrowed reserves. Nonborrowed reserves are those reserves that were not borrowed, while borrowed reserves are those reserves that were borrowed. Clearly everything included in borrowed reserves must be reserves, and everything contained in nonborrowed reserves must be reserves. Hence it is impossible for either borrowed reserves or nonborrowed reserves to exceed total reserves. A negative value for either borrowed reserves or nonborrowed reserves would be an oxymoron. Observe that nonborrowed reserves recently have crashed to about minus 50 billion dollars. The Federal Reserve's explanation is that they are including the new auction borrowing from the Federal Reserve in nonborrowed reserves, even though the new auction facility borrowing need not be held as reserves. Hence according to this ''data'', the instrument of monetary policy during Volcker's Monetarist Experiment period now has been driven to a very negative value, which contradicts the definition of ''nonborrowed reserves''.19
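For reference, here is a minimal sketch of the textbook Taylor rule mentioned above. The 0.5 coefficients and the 2% values for the equilibrium real rate and the inflation target are the conventional illustrative choices, and the inputs are hypothetical; this is not the calibration behind the Monetary Trends chart, which produces a range of targets by varying such assumptions.

```python
def taylor_rate(inflation, output_gap, r_star=2.0, pi_star=2.0,
                a_pi=0.5, a_y=0.5):
    """Textbook Taylor rule for the nominal federal funds target (percent):
    i = r* + pi + a_pi * (pi - pi*) + a_y * output_gap."""
    return r_star + inflation + a_pi * (inflation - pi_star) + a_y * output_gap

# Hypothetical inputs: 3% inflation, output 1% above potential -> 6.0% target
print(taylor_rate(inflation=3.0, output_gap=1.0))
```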
19 See Barnett (2009). 20 The US and ECB Divisia data are not supplied in a formally official manner. The St. Louis Federal Reserve Bank computes and supplies Divisia monetary aggregate data for the US, but the Federal Reserve Board in Washington, DC does not. The European Central Bank's (ECB) Governing Council is provided quarterly projections on economic and financial variables by the ECB's staff, along with information based upon the Divisia monetary aggregates in accordance with Barnett (2007). Since that ECB staff information is used to inform the Council on a confidential basis, the data are not provided to the public.
Since the Bank of England is the only central bank in the world that publishes Divisia money officially, it is especially interesting to look at UK data.20 As the current recession developed, the Bank of England adopted a policy of ''quantitative easing'', focusing on expanding the supply of monetary services, rather than interest rates, which already were at very low levels. Following that change in policy, there was little evidence of positive consequences. This puzzled many who were following the Bank of England's simple sum monetary aggregates. Fig. 30 displays both simple sum M4 and Divisia M4, both from the official Bank of England source. Clearly Divisia M4 reflects a tightening of policy, rather than the intended loosening implied by simple sum M4. For further details of this phenomenon, see Rayton and Pavlyk (2010).

Fig. 30. Growth in M4 simple sum (– – –) and M4 Divisia aggregate (—) for the UK.

13. Conclusion

We have shown that most of the puzzles and paradoxes that have evolved in the monetary economics literature were produced by the simple-sum monetary aggregates, provided officially by most central banks, and are resolved by use of aggregation-theoretic monetary aggregates. We argue that official central-bank data throughout the world have not significantly improved, despite the existence of better data internal to some of those central banks for their own use. We document the fact that the profession, financial firms, borrowers, lenders, and central banks have repeatedly been misled by defective central-bank monetary data over the past half century. Many commonly held views need to be rethought, since many such views were based upon atheoretical data. For example, the views on the Great Moderation need to be reconsidered, at least relative to the current crisis and the role of monetary policy. We find no reason to believe that the moderation in the business cycle during the past two decades had any appreciable connection with improved monetary policy, and in fact we find no reason to believe that there have been significant improvements in monetary policy over that time period. In particular, we believe that the increased risk-taking that produced the recent financial crisis resulted from a misperception of cyclical systemic risk. The misperception was caused by rational expectations conditioned upon a faulty information set. We do not take a position on what produced the Great Moderation, only on what did not.

We are not comfortable with the widespread view that the source of the crisis is the irrational ''greed'' of the victims of the misperceptions, and we are not aware of a definition of the word ''greed'' in the field of economics. We similarly do not believe that the policy of the Federal Reserve was intentionally too expansionary during the evolution of the bubbles that preceded the current crisis, or intentionally excessively contractionary as the bubbles burst. But we do find evidence supporting the view that the misperceptions and poor decisions in the private and public sectors of the economy were connected with defective data that are inconsistent with modern aggregation and index number theory and could have produced unrealistically excessive confidence in the capabilities of the Federal Reserve, as in Wall Street's confidence in ''The Greenspan Put''.21 In addition, we show that the misperceptions were connected with excessive liquidity that fed the bubble, and that the crisis was connected with a more contractionary policy than intended. The recent economic consequences can be understood in that context.

21 Also see Chari et al. (2008).

References
Barnett, W.A., 1978. The user cost of money. Economics Letters 1, 145–149. Reprinted in: Barnett, W.A., Serletis, A. (Eds.), 2000. The Theory of Monetary Aggregation. North Holland, Amsterdam, pp. 6–10 (Chapter 1).
Barnett, W.A., 1980. Economic monetary aggregates: an application of aggregation and index number theory. Journal of Econometrics 14, 11–48. Reprinted in: Barnett, W.A., Serletis, A. (Eds.), 2000. The Theory of Monetary Aggregation. North Holland, Amsterdam, pp. 6–10 (Chapter 1).
Barnett, W.A., 1982. The optimal level of monetary aggregation. Journal of Money, Credit and Banking 14, 687–710. Reprinted in: Barnett, W.A., Serletis, A. (Eds.), 2000. The Theory of Monetary Aggregation. North Holland, Amsterdam, pp. 125–149 (Chapter 7).
Barnett, W.A., 1983. Understanding the new Divisia monetary aggregate. Review of Public Data Use 11, 349–355. Reprinted in: Barnett, W.A., Serletis, A. (Eds.), 2000. The Theory of Monetary Aggregation. North Holland, Amsterdam, pp. 100–108 (Chapter 4).
Barnett, W.A., 1984. Recent monetary policy and the Divisia monetary aggregates. American Statistician 38, 162–172. Reprinted in: Barnett, W.A., Serletis, A. (Eds.), 2000. The Theory of Monetary Aggregation. North Holland, Amsterdam, pp. 563–576 (Chapter 23).
Barnett, W.A., 1987. The microeconomic theory of monetary aggregation. In: Barnett, W.A., Singleton, K. (Eds.), New Approaches to Monetary Economics. Cambridge University Press. Reprinted in: Barnett, W.A., Serletis, A. (Eds.), 2000. The Theory of Monetary Aggregation. North Holland, Amsterdam, pp. 49–99 (Chapter 3).
Barnett, W.A., 1997. Which road leads to stable money demand? The Economic Journal 107, 1171–1185. Reprinted in: Barnett, W.A., Serletis, A. (Eds.), 2000. The Theory of Monetary Aggregation. North Holland, Amsterdam, pp. 577–592 (Chapter 24).
Barnett, W.A., 2000. A reply to Julio J. Rotemberg. In: Belongia, M.T. (Ed.), Monetary Policy on the 75th Anniversary of the Federal Reserve System. Kluwer Academic, Boston, pp. 232–243. Reprinted in: Barnett, W.A., Serletis, A. (Eds.), 2000. The Theory of Monetary Aggregation. North Holland, Amsterdam, pp. 296–306 (Chapter 14).
Barnett, W.A., 2007. Multilateral aggregation-theoretic monetary aggregation over heterogeneous countries. Journal of Econometrics 136 (2), 457–482.
Barnett, W.A., 2009. Who's looking at the Fed's books? New York Times, A35.
Barnett, W.A., 2011. Getting It Wrong: How Faulty Monetary Statistics Undermine the Fed, the Financial System, and the Economy. MIT Press, Cambridge, MA (forthcoming).
Barnett, W.A., Chae, U., Keating, J., 2006. The discounted economic stock of money with VAR forecasting. Annals of Finance 2 (2), 229–258.
Barnett, W.A., Chauvet, M., 2010. Financial Aggregation and Index Number Theory: A Survey. World Scientific, Singapore (forthcoming).
Barnett, W.A., Chauvet, M., Tierney, H.L.R., 2009. Measurement error in monetary aggregates: a Markov switching factor approach. Macroeconomic Dynamics 13 (Suppl. 2), 381–412.
Barnett, W.A., Fisher, D., Serletis, A., 1992. Consumer theory and the demand for money. Journal of Economic Literature 30, 2086–2119. Reprinted in: Barnett, W.A., Serletis, A. (Eds.), 2000. The Theory of Monetary Aggregation. North Holland, Amsterdam, pp. 389–427 (Chapter 18).
Barnett, W.A., Hahm, J.H., 1994. Financial firm production of monetary services: a generalized symmetric Barnett variable profit function approach. Journal of Business and Economic Statistics 12, 33–46. Reprinted in: Barnett, W.A., Binner, J. (Eds.), 2004. Functional Structure and Approximation in Econometrics. North Holland, Amsterdam, pp. 351–380 (Chapter 15).
Barnett, W.A., Hinich, M.J., Weber, W.E., 1986. The regulatory wedge between the demand-side and supply-side aggregation-theoretic monetary aggregates. Journal of Econometrics 33, 165–185. Reprinted in: Barnett, W.A., Serletis, A. (Eds.), 2000. The Theory of Monetary Aggregation. North Holland, Amsterdam, pp. 433–453 (Chapter 19).
Barnett, W.A., Liu, Y., Jensen, M., 1997. CAPM risk adjustment for exact aggregation over financial assets. Macroeconomic Dynamics 1, 485–512.
Barnett, W.A., Offenbacher, E.K., Spindt, P.A., 1984. The new Divisia monetary aggregates. Journal of Political Economy 92, 1049–1085. Reprinted in: Barnett, W.A., Serletis, A. (Eds.), 2000. The Theory of Monetary Aggregation. North Holland, Amsterdam, pp. 360–388 (Chapter 17).
Barnett, W.A., Peretti, P., 2009. Admissible clustering of aggregator components: a necessary and sufficient stochastic semi-nonparametric test for weak separability. Macroeconomic Dynamics 13 (Suppl. 2), 317–334.
Barnett, W.A., Serletis, A. (Eds.), 2000. The Theory of Monetary Aggregation. In: Contributions to Economic Analysis Monograph Series. Elsevier, Amsterdam.
Barnett, W.A., Wu, S., 2005. On user costs of risky monetary assets. Annals of Finance 1, 35–50.
Barnett, W.A., Xu, H., 1998. Stochastic volatility in interest rates and nonlinearity in velocity. International Journal of Systems Science 29, 1189–1201.
Barnett, W.A., Zhou, G., 1994a. Partition of M2+ as a joint product: commentary. Federal Reserve Bank of St. Louis Review 76, 53–62.
Barnett, W.A., Zhou, G., 1994b. Financial firms' production and supply-side monetary aggregation under dynamic uncertainty. Federal Reserve Bank of St. Louis Review (March–April), 133–165. Reprinted in: Barnett, W.A., Binner, J. (Eds.), 2004. Functional Structure and Approximation in Econometrics. North Holland, Amsterdam, pp. 381–427 (Chapter 16).
Batchelor, R., 1989. A monetary services index for the UK. Mimeo, Department of Economics, City University, London.
Belongia, M., 1996. Measurement matters: recent results from monetary economics reexamined. Journal of Political Economy 104 (5), 1065–1083.
Belongia, M., Binner, J. (Eds.), 2000. Divisia Monetary Aggregates: Theory and Practice. Palgrave, Basingstoke.
Belongia, M., Chrystal, A., 1991. An admissible monetary aggregate for the United Kingdom. Review of Economics and Statistics 73, 497–503.
Belongia, M., Ireland, P., 2006. The own-price of money and the channels of monetary transmission. Journal of Money, Credit and Banking 38 (2), 429–445.
Belongia, M., Ireland, P., 2010. The Barnett critique after three decades: a New Keynesian analysis. Journal of Econometrics (forthcoming).
Bernanke, B., 2006. Monetary aggregates and monetary policy at the Federal Reserve: a historical perspective. Speech at the Fourth ECB Central Banking Conference, Frankfurt, Germany, November.
Chari, V.V., Christiano, L., Kehoe, P., 2008. Facts and myths about the financial crisis of 2008. Working Paper 666, Federal Reserve Bank of Minneapolis.
Chauvet, M., 1998. An econometric characterization of business cycle dynamics with factor structure and regime switches. International Economic Review 39 (4), 969–996.
Chauvet, M., 2001. A monthly indicator of Brazilian GDP. Brazilian Review of Econometrics 21 (1), 1–15.
Chauvet, M., Hamilton, J., 2006. Dating business cycle turning points in real time. In: Milas, C., Rothman, P., van Dijk, D. (Eds.), Nonlinear Time Series Analysis of Business Cycles. Elsevier, Amsterdam, pp. 1–54.
Chauvet, M., Piger, J., 2008. A comparison of the real-time performance of business cycle dating methods. Journal of Business and Economic Statistics 26 (1), 42–49.
Chetty, K.V., 1969. On measuring the nearness of near-moneys. American Economic Review 59, 270–281.
Chrystal, A., MacDonald, R., 1994. Empirical evidence on the recent behaviour and usefulness of simple-sum and weighted measures of the money stock. Federal Reserve Bank of St. Louis Review 76, 73–109.
Cockerline, J., Murray, J., 1981. A comparison of alternative methods for monetary aggregation: some preliminary evidence. Technical Report #28, Bank of Canada.
Deaton, A., Muellbauer, J.N., 1980. Economics and Consumer Behavior. Cambridge University Press, Cambridge.
Diewert, W.E., 1974. Intertemporal consumer theory and the demand for durables. Econometrica 42, 497–516.
Diewert, W.E., 1976. Exact and superlative index numbers. Journal of Econometrics 4, 115–145.
Diewert, W.E., 1978. Superlative index numbers and consistency in aggregation. Econometrica 46, 883–900.
Divisia, F., 1925. L'Indice monétaire et la théorie de la monnaie. Revue d'Economie Politique 39, 980–1008.
Donovan, D., 1978. Modeling the demand for liquid assets: an application to Canada. IMF Staff Papers 25, 676–704.
Drake, L., 1992. The substitutability of financial assets in the UK and the implication for monetary aggregation. Manchester School of Economics and Social Studies 60, 221–248.
Fase, M., 1985. Monetary control: the Dutch experience: some reflections on the liquidity ratio. In: van Ewijk, C., Klant, J.J. (Eds.), Monetary Conditions for Economic Recovery. Martinus Nijhoff, Dordrecht, pp. 95–125.
Feenstra, R.C., 1986. Functional equivalence between liquidity costs and the utility of money. Journal of Monetary Economics 17 (2), 271–291.
Fisher, I., 1922. The Making of Index Numbers: A Study of their Varieties, Tests, and Reliability. Houghton Mifflin, Boston.
Friedman, M., Schwartz, A., 1970. Monetary Statistics of the United States: Estimation, Sources, Methods, and Data. Columbia University Press, New York.
Goldfeld, S.M., 1973. The demand for money revisited. Brookings Papers on Economic Activity 3, 577–638.
Hancock, D., 1985a. Bank profitability, interest rates, and monetary policy. Journal of Money, Credit and Banking 17, 189–202.
Hancock, D., 1985b. The financial firm: production with monetary and nonmonetary goods. Journal of Political Economy 93, 859–880.
Hancock, D., 1991. A Theory of Production for the Financial Firm. Kluwer Academic Publishers, Norwell, MA.
Hicks, J.R., 1946. Value and Capital. Clarendon Press, Oxford.
Hoa, T.V., 1985. A Divisia system approach to modelling monetary aggregates. Economics Letters 17, 365–368.
Hutt, W.H., 1963. Keynesianism—Retrospect and Prospect. Regnery, Chicago.
International Monetary Fund, 2008. Monetary and Financial Statistics: Compilation Guide. IMF Publication Services, Washington, DC, pp. 183–184.
Ishida, K., 1984. Divisia monetary aggregates and the demand for money: a Japanese case. Bank of Japan Monetary and Economic Studies 2, 49–80.
Jones, B., Dutkowsky, D., Elger, T., 2005. Sweep programs and optimal monetary aggregation. Journal of Banking and Finance 29, 483–508.
Lucas, R.E., 1987. Models of Business Cycles. Basil Blackwell, New York.
Lucas, R.E., 2000. Inflation and welfare. Econometrica 68 (2), 247–274.
Lucas, R.E., 2003. Macroeconomic priorities. American Economic Review 93, 1–14.
Rayton, B.A., Pavlyk, K., 2010. On the recent divergence between measures of the money supply in the UK. Economics Letters 108, 159–162.
Rotemberg, J.J., Driscoll, J.C., Poterba, J.M., 1995. Money, output, and prices: evidence from a new monetary aggregate. Journal of Business and Economic Statistics 13, 67–83.
Schunk, D., 2001. The relative forecasting performance of the Divisia and simple sum monetary aggregates. Journal of Money, Credit and Banking 33 (2), 272–283.
Serletis, A. (Ed.), 2006. Money and the Economy. World Scientific.
Swamy, P.A.V.B., Tinsley, P., 1980. Linear prediction and estimation methods for regression models with stationary stochastic coefficients. Journal of Econometrics 12, 103–142.
Yue, P., Fluri, R., 1991. Divisia monetary services indices for Switzerland: are they useful for monetary targeting? Federal Reserve Bank of St. Louis Review 73, 19–33.
Journal of Econometrics 161 (2011) 24–35
Scanner data, time aggregation and the construction of price indexes

Lorraine Ivancic a, W. Erwin Diewert b, Kevin J. Fox c,∗

a Centre for Applied Economic Research, University of New South Wales, Australia
b Department of Economics, University of British Columbia, Canada
c School of Economics and Centre for Applied Economic Research, University of New South Wales, Sydney 2052, Australia
Article history: Available online 15 September 2010.
JEL classification: C43; E31.
Keywords: Price indexes; Scanner data; Chain drift; Multilateral index number methods; Rolling window GEKS.

Abstract

We examine the impact of time aggregation on price change estimates for 19 supermarket item categories using scanner data. Time aggregation choices lead to a difference in price change estimates for chained indexes which ranged from 0.28% to 29.73% for a superlative index and an incredible 14.88%–46,463.71% for a non-superlative index. Traditional index number theory appears to break down with weekly data, even for superlative indexes. Monthly and (in some cases) quarterly time aggregation were insufficient to eliminate downward drift in superlative indexes. To eliminate drift, a novel adaptation of a multilateral index number method is proposed.
1. Introduction

Aggregation of price and quantity information is fundamental to the construction of any price or quantity index. Prior to any index number calculation, decisions must be made as to how individual transaction price and quantity data are to be aggregated to obtain price and quantity vectors that will be inserted into a bilateral index number formula. Aggregation decisions in the compilation of the Consumer Price Index (CPI) are generally limited by the use of regular but infrequent surveys to collect data. However, the advent of high-frequency electronic point-of-sale ''scanner data'' has made increasingly detailed and comprehensive data on consumer purchases available to price statisticians. The use of more detailed data means that aggregation issues become even more complex when attempting to estimate price change. There are a number of dimensions over which data can potentially be aggregated before an index is calculated: transactions can be aggregated over different items, different package sizes, over different stores, over different regions or over different time periods. These aggregated prices and quantities are then inserted into a bilateral index number formula of the type studied by Fisher (1922). We are primarily concerned with how different methods of time aggregation affect estimates of price change. Only a handful of authors have used scanner data to examine this issue, including Reinsdorf (1999), Hawkes (1997), Bradley
∗ Corresponding author. Tel.: +61 2 9385 3320; fax: +61 2 9313 6337. E-mail address: [email protected] (K.J. Fox).
et al. (1997), Haan and Opperdoes (1997), Dalen (1997) and Feenstra and Shapiro (2003). Reinsdorf (1999) found that the use of different aggregation methods over time resulted in estimates of price change which differed by as much as 7.9%, while Haan and Opperdoes (1997, 10) found that 'taking unit values [average prices] over one week every month instead of unit values over the entire month as the price concept shows differences that exceed by far the differences due to alternative elementary aggregate index number formulas'. These results indicate that time aggregation decisions are likely to be important, particularly when high-frequency data are used. A limitation of existing studies is that they typically use data on a small number of product categories. For instance, Reinsdorf (1999), Hawkes (1997) and Haan and Opperdoes (1997) all had information on only one product category (coffee), while Dalen (1997) had information on four product categories (fats, detergent, breakfast cereal and frozen fish). This makes it difficult to draw broad conclusions or make generalisations from these studies. A major benefit of the current study is that we have information on 19 major supermarket item categories and over 8000 individual products. This allows us to examine whether the results found in other studies hold for a larger set of products and whether regularities, resulting from different aggregation methods and the use of different index number formulae, can be identified across different item categories. We also examine the use of fixed base indexes versus chained indexes. Fixed base indexes have the advantage of being free of ''chain drift''; that is, they will return to unity when prices in the current period return to base period levels. Chain drift is thought
to result from what is known as price oscillation or bouncing, which is often accompanied by quantity shifts; see Hill (1993, 388). Empirical work by Haan (2008) using scanner data has shown that quantity shifts in response to price discounts are substantial; they can spike up by 100-fold. Chain index drift thus becomes increasingly problematic when high-frequency (scanner) data are used. Drift-free fixed base indexes have a major disadvantage: over time, new products appear and old products disappear, and it becomes increasingly difficult to match items that are available in the current period with items which were available in the base period. As a result, the relevance of a fixed base index diminishes over time. These considerations suggest that we make use of fixed base comparisons but use each month in turn as the fixed base and then take the (geometric) average of the resulting comparisons. This would make maximum use of all possible matches across the time period under consideration, and each of the separate fixed base monthly indexes is free from chain drift. This method is precisely analogous to the multilateral method of Gini (1931), Eltetö and Köves (1964) and Szulc (1964) (GEKS). We show that the GEKS multilateral method works well with our scanner data set, spanning 15 months of data. An issue that arises with using this multilateral methodology in the CPI context is that, as each new month of data becomes available, all of the previous parities would have to be recomputed. This is not acceptable for many statistical agencies where the CPI has to remain unrevised. To overcome the problem of revisions, we propose an innovative method which we describe as a rolling window GEKS method. Understanding how best to use scanner data in the context of constructing consumer price indexes is particularly important at present as statistical agencies worldwide are becoming increasingly interested in using scanner data in their official CPI figures. To our knowledge, scanner data are currently used directly in the CPI by only a handful of statistical agencies: the Central Bureau of Statistics in the Netherlands, Statistics Norway and Statistics Switzerland. New Zealand uses scanner data to help inform weighting decisions in the CPI. The establishment of robust methods for using these scanner data, which allow maximum matching of products over time, while avoiding chain drift problems associated with the use of chained indexes, is an important priority for statistical agencies. The paper is set out as follows. Section 2 provides a discussion of the time aggregation problem and the use of unit values as prices. Section 3 describes the various unit value concepts that are used in later sections along with a description of bilateral price indexes that are calculated in Section 4. Section 3 also discusses how chain drift can be defined in a formal manner. Section 4 provides a brief description of the data and provides estimates of price change for each of the 19 food groups over a 65 week period, using various unit value concepts and both fixed base and chained index numbers. The results indicate that monthly and weekly chained indexes have a considerable amount of chain drift. A solution is proposed in Section 5 by introducing a novel drift-free multilateral index number method, which also allows for the maximum amount of product matching. Section 6 compares the new method to official CPI figures. Section 7 concludes.
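To make the averaging idea behind GEKS concrete, here is a minimal sketch of a GEKS comparison between period 0 and period t: every period serves in turn as the base (link) period, and the geometric mean of the resulting chained comparisons is taken. The bilateral Fisher helper and the toy data are illustrative assumptions, not the implementation developed in Section 5.

```python
import numpy as np

def fisher(p0, q0, p1, q1):
    """Bilateral Fisher index: geometric mean of Laspeyres and Paasche."""
    laspeyres = np.sum(p1 * q0) / np.sum(p0 * q0)
    paasche = np.sum(p1 * q1) / np.sum(p0 * q1)
    return np.sqrt(laspeyres * paasche)

def geks(prices, quantities, t):
    """GEKS index between period 0 and period t: the geometric mean, over
    every possible link period l, of the comparison 0 -> l -> t."""
    T = len(prices)
    links = [fisher(prices[0], quantities[0], prices[l], quantities[l])
             * fisher(prices[l], quantities[l], prices[t], quantities[t])
             for l in range(T)]
    return float(np.prod(links) ** (1.0 / T))

# Toy data: 2 products, 3 periods; product 1 is discounted in period 1, and
# both prices and quantities return to base levels in period 2.
prices = [np.array([1.0, 2.0]), np.array([0.5, 2.0]), np.array([1.0, 2.0])]
quantities = [np.array([10.0, 5.0]), np.array([80.0, 5.0]), np.array([10.0, 5.0])]
print(geks(prices, quantities, 2))  # -> 1.0: the comparison is free of chain drift
```

Because every bilateral comparison enters symmetrically, the resulting parities are transitive, which is exactly the property that rules out chain drift.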
2. Aggregation and the construction of unit values

Diewert (1995, 20) noted, ''at some level of disaggregation, bilateral index number theory breaks down and it becomes necessary to define the average price and total quantity. . . using what might be called a 'unilateral' index number formula''.1 In other words, at the first stage of aggregation, when we are constructing vectors of prices and quantities for two periods in order to insert these vectors into a bilateral index number formula, we are forced to aggregate the individual transactions which occur within a period into some sort of period average prices and total quantities. This leads to unit value prices as being the natural prices at this first stage of aggregation. We want the product of the aggregate price and quantity to equal the value of transactions for the specified commodity. The price that matches up with the total quantity is the unit value price, which is equal to the transacted value divided by the total quantity transacted.

While the definition of a unit value price is fairly straightforward, its implementation is not necessarily straightforward. When a statistical agency decides to calculate a unit value, it has to decide on the scope of the unit value: what items should appear in the unit value, should the aggregation be over stores in the same chain, over what region should aggregation take place and, finally, what is the length of the period over which the unit value is calculated? A unit value is, in effect, an average price over transactions, over a certain time period, over a particular product group and over stores.2 With respect to item groupings, it seems best to work with the finest classification of items that is available; i.e., use each product code as a separate unit value category. With respect to the time dimension, it might seem to be best practice if we chose a week as the unit of time rather than a month or a quarter, since if inflation in the country is very rapid, weekly indexes will be more relevant than monthly or quarterly indexes. However, as the time period becomes shorter, two problems emerge. Transactions become more sporadic and there can be a lack of matching of items between any two (short) periods. Also, price discounts lead to large fluctuations in quantities purchased, and this leads to large fluctuations in overall measures of price change; fluctuations which are not entirely reversed when the item reverts to its regular price.3 Thus heavily discounted prices typically lead to a chain drift problem. The ''optimal'' aggregation period is therefore unclear.

The issue of whether or not to aggregate items over stores was considered in tandem with the time aggregation problem, as it is of interest to know if such store aggregation mitigates the effects of the choice of time aggregation. Currently, most statistical agencies appear to aggregate items over stores to form a unit value. This implicitly assumes that stores within the aggregation unit are 'alike' or offer the same level of quality. Not aggregating items over stores to form unit values will implicitly compensate for unmeasured quality differences across stores. It may be argued that rather than using unit values, a handful of what are thought to be ''representative'' price quotes could be used. However, this course of action would involve a loss of much of the information on consumer purchases that scanner data has to offer. Furthermore, Diewert (1995, 23) argued that ''it should be evident that a unit value for the commodity provides a more accurate summary of an average transaction price than an isolated price quotation''. Balk (1998) showed that a unit value index may actually be more accurate than a single price quotation.4

1 Diewert followed Walsh (1901, 96) (1921, 88) on this point. 2 See Hawkes and Piotrowski (2003) for a range of potential aggregation units. See Balk (1998) for the theoretical properties of unit values. 3 See Feenstra and Shapiro (2003) for more discussion on this point. 4 Balk (1998, 9) argued that ''if the unit value index is appropriate for a certain commodity group then it is equal to each single price ratio, and all those price ratios are equal''. ''In practice, however, there may be small distortions''. A unit value index is able to capture these price distortions whereas a single price quote cannot.
3. Estimating price change using scanner data and the chain drift problem

A number of different index number formulae were used to calculate the overall price change. The commonly used base period weighted Laspeyres index and its current period weighted counterpart, the Paasche index, were calculated, along with theoretically more attractive ''superlative'' indexes (Fisher, Törnqvist and Walsh indexes); see Diewert (1976). Of the superlative indexes, only results obtained from the Fisher formula are reported here, as the choice of superlative index had no noticeable impact on results.5 The (fixed base) Laspeyres price index can be written as follows:

$$\mathrm{Laspeyres}_t = \sum_i w_{i0}\,\frac{p_{it}}{p_{i0}}, \qquad (1)$$

where $p_{i0}$ is the base period price of item $i$, $p_{it}$ is the price of item $i$ in period $t$, for $t = 1, \ldots, T$, and $w_{i0}$ is good $i$'s share of total expenditure in period 0. In practice, the prices are unit values for commodity class $i$ for each period $t$ of some pre-specified length (e.g. a week, month or quarter). Note that Eq. (1) aggregates unit value indexes by using appropriately defined share weights. A common counterpart to the Laspeyres price index is the Paasche price index, which can be written as follows:

$$\mathrm{Paasche}_t = \left[\sum_i w_{it}\left(\frac{p_{i0}}{p_{it}}\right)\right]^{-1}, \qquad (2)$$

where $w_{it}$ is good $i$'s share of total expenditure in period $t$, for $t = 0, \ldots, T$. The Fisher index formula is the geometric mean of the Laspeyres and Paasche indexes, i.e. $\mathrm{Fisher}_t = [\mathrm{Paasche}_t \times \mathrm{Laspeyres}_t]^{1/2}$.

For each index number formula, average prices and total quantities were aggregated in turn, over weekly, monthly and quarterly intervals. Items were in turn treated as different items if they were not located in the same store (no item aggregation over stores) or treated as the same good no matter which store they were in (item aggregation over stores). Direct (or fixed base) and chained indexes were also calculated for all of these combinations. For direct indexes, the basket of goods over which the price index is constructed is held fixed over time,6 while for chained indexes, the base period index value is incrementally updated. Two types of chained indexes were estimated in this study. First, an index we refer to as a ''fixed basket'' index was calculated using a basket of items which was matched with the direct index. No new items which appeared in the sample period were incorporated into this index over time. This type of index provides a 'pure' comparison with the direct index as it is not affected by new items which appeared in periods subsequent to the first period.7 Second, a ''flexible basket'' index that incorporated new items as they became available over time was also calculated; each chain link index used the set of all items
5 Diewert (1978) noted that all superlative indexes approximate each other to the second order and thus it should not matter which superlative index is used. Hill (2006) noted that Diewert’s result breaks down for quadratic mean of order r indexes as r becomes large in magnitude. However, for ‘‘standard’’ superlative indexes, Diewert’s approximation result appears to hold. 6 For the direct comparison between the first and the last period, the index was computed using only the products which were purchased in both periods. 7 For the ‘‘fixed basket’’ chained index, we started with the set of items which were sold in both the first and last periods. When calculating the chain link between periods 1–2, we intersected this starting set of items with the set of items which were also sold in period 2; when calculating the chain link between periods 2–3, we intersected the starting set of items with the set of items which were also sold in periods 2 and 3 and so on.
which were sold in the two adjacent periods. It is of interest to see how this second chained index behaves relative to the ''fixed chain'', as new items ''may experience price changes that differ substantially from the price changes of existing items'' (ILO, 2004, 138). One of the important features of chained indexes is that the basket of goods is able to be constantly updated, as new and disappearing items can be incorporated into estimates of price change over time. However, chained indexes may suffer from what is known as chain drift.8 Chain drift occurs when an index ''does not return to unity when prices in the current period return to their levels in the base period''; ILO (2004, 445). An objective method to test for the existence of chain drift is the multiperiod identity test, which was proposed by Walsh (1901, 401) and Szulc (1983, 540)9:

$$P(p^1, p^2, q^1, q^2)\,P(p^2, p^3, q^2, q^3)\,P(p^3, p^1, q^3, q^1) = 1. \qquad (3)$$

$P(p^1, p^2, q^1, q^2)$ and $P(p^2, p^3, q^2, q^3)$ are price indexes between periods 1 and 2, and then 2 and 3, respectively, where $p^t$ and $q^t$ are the price and quantity vectors pertaining to period $t$ for $t = 1, 2, 3$. Their product gives the chained price index between periods 1 and 3. Note that there is an additional link in the chain in Eq. (3), $P(p^3, p^1, q^3, q^1)$, which is a price index between periods 3 and 4, where the period 4 price and quantity data are the same as the period 1 data. So $P(p^3, p^1, q^3, q^1)$ takes us from period 3 directly back to period 1. Each index in Eq. (3) is referred to as a chain link. The price index formula $P$ will not suffer from chain drift or chain link bias if the product of all of these factors equals 1. Alternatively, if the index formula satisfies the time reversal test, so that $P(p^3, p^1, q^3, q^1) = 1/P(p^1, p^3, q^1, q^3)$, then the multiperiod identity test is the same as Fisher's (1922) circularity test.

In the following section, we will compute various chain indexes over our entire sample period and compare each of them with the corresponding direct indexes from the first period to the last period. If the direct and chained indexes give us the same results (and the index number formula satisfies the time reversal test), then (3) will be satisfied. However, if the direct and chained indexes are not equal, then chain drift is present. Direct and chained indexes were estimated over a 15 month period as follows:
1. quarterly estimates of direct price change compared prices in quarter 1 with quarter 5; chained estimates compared prices in all quarters, from quarter 1 to quarter 5;
2. monthly estimates of direct price change compared prices in month 1 with month 15; chained estimates compared prices in all months, from month 1 to month 15; and
3. weekly estimates of direct price change compared prices in week 1 with week 65; while chained estimates compared prices in all weeks, from week 1 to week 65.

4. Direct and chained results for Laspeyres, Paasche and Fisher indexes

We use a scanner data set collected by ACNielsen, which contains information on four supermarket chains located in one of the major capital cities in Australia. In total, over 100 stores are included in this data set, with these stores accounting for approximately 80% of grocery sales in this city; see Jain and Abello (2001). The data set contains 65 weeks of data, collected between February 1997 and April 1998.
8 This term dates back to Frisch (1936, 8): ‘‘The divergency which exists between a chain index and the corresponding direct index (when the latter does not satisfy the circular test) will often take the form of a systematic drifting’’. 9 Diewert (1993, 40–53) gave the test this name.
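To see the multiperiod identity test (3) and chain drift in operation, here is a minimal sketch with hypothetical ''price bouncing'' data: item 1 is discounted, quantities spike, and its price then returns to the base level. Only the index formulas come from the text; the data are invented for illustration.

```python
import numpy as np

def laspeyres(p0, q0, p1):
    return np.sum(p1 * q0) / np.sum(p0 * q0)

def paasche(p0, p1, q1):
    return np.sum(p1 * q1) / np.sum(p0 * q1)

def fisher(p0, q0, p1, q1):
    return np.sqrt(laspeyres(p0, q0, p1) * paasche(p0, p1, q1))

# Hypothetical price bounce: item 1 is discounted in period 2 and quantities
# spike; period 4 repeats period 1 exactly, so a drift-free chained index
# would return to 1, as required by the multiperiod identity test (3).
periods = [
    (np.array([1.0, 1.0]), np.array([10.0, 10.0])),  # period 1
    (np.array([0.5, 1.0]), np.array([90.0, 10.0])),  # period 2: discount
    (np.array([1.0, 1.0]), np.array([2.0, 10.0])),   # period 3: post-sale slump
    (np.array([1.0, 1.0]), np.array([10.0, 10.0])),  # period 4 = period 1
]

chain_laspeyres, chain_fisher = 1.0, 1.0
for (p0, q0), (p1, q1) in zip(periods, periods[1:]):
    chain_laspeyres *= laspeyres(p0, q0, p1)
    chain_fisher *= fisher(p0, q0, p1, q1)

print(f"chained Laspeyres: {chain_laspeyres:.4f}")  # 1.3636: drifts above 1
print(f"chained Fisher:    {chain_fisher:.4f}")     # 0.9045: drifts below 1
```

A direct (fixed base) index between periods 1 and 4 would equal exactly 1 here; the upward explosion of the chained Laspeyres and the milder downward drift of the chained Fisher in this toy example mirror the qualitative patterns reported in Tables 2–7 below.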
Table 1
Data: descriptive statistics.

Item category    Observations    Number of items
Biscuits         2,452,797       1327
Bread            752,884         430
Butter           225,789         79
Cereal           1,147,737       554
Coffee           514,945         205
Detergent        458,712         177
Frozen peas      544,050         231
Honey            235,649         113
Jams             615,948         389
Juices           2,639,642       1125
Margarine        312,558         98
Oil              483,146         314
Pasta            1,065,204       715
Pet food         2,589,135       1073
Soft drinks      2,140,587       966
Spreads          283,676         103
Sugar            254,453         118
Tin tomatoes     246,187         130
Toilet paper     438,525         164
Total            17,401,624      8311
Information on 19 different supermarket item categories, such as bread, biscuits and soft drinks, is included. A large number of observations on transactions exist for all item categories, with a minimum of 225,789 observations for the item category ''butter'' and a maximum of 2,639,642 observations for the item category ''juices''. An observation here refers to the average weekly price (weekly unit value) and total weekly quantity sold of each item transacted in each store in each week. For example, from Table 1, there were 2,452,797 sales observations on biscuits over the 65 week period. For each item category the data set contains price and quantity information on all of the different items, brands and package sizes which are sold in that particular item category in all of the stores in each week; for example, Table 1 shows there were 1327 different types of biscuits traded across all stores over the period. Additional information includes the item brand name, a unique 13 digit identifier (known as the European Article Number/Australian Product Number, EAN/APN) and, where relevant, the physical weight of the item.

Price change estimates are presented for Fisher, Paasche and Laspeyres indexes, and for direct and chained indexes using the methods described in Section 3 for each of the 19 major supermarket item categories. In general, the results point to a high degree of variation in index number estimates across the different methods of time aggregation and different index number formulae; see Tables 2–7. The results are presented in index terms with a base of 100, so that, e.g., 100.21 − 100 = 0.21% price change over the period. In general, the results indicate that more time aggregation leads to increasingly stable estimates of price change, for all types of indexes. However, the degree of the instability varies considerably across the different indexes. The impact of time aggregation is extremely pronounced when chained indexes are used. This is particularly true for the Laspeyres index, where a number of price change estimates appear to explode as the frequency of chaining increases. For example, Table 5 shows that Laspeyres price change estimates for the item category toilet paper based on quarterly, monthly and weekly time aggregation (with no item aggregation over stores) range from a somewhat reasonable (106.71 − 100=) 6.71% (quarterly, fixed basket) to a massive (11,955 − 100=) 11,855% (weekly, fixed basket) over the 15 month period.10 Overall, for the Laspeyres chained (fixed and
10 For Paasche indexes, the converse occurs, with chained estimates of price change falling rapidly.
flexible basket) indexes, the difference in price change estimates for the 19 item categories across different methods of time aggregation ranges from 14.88% to an incredible 46,463.71%. With item aggregation over stores and using flexible basket chained Laspeyres indexes (Table 2), over the 19 item categories the average absolute difference between weekly and quarterly price change estimates is approximately 298%. When we look at indexes where items have been disaggregated over stores (Table 5) this becomes 3176%!

The Fisher index appears to be relatively less affected by time aggregation than the Laspeyres and Paasche indexes. Despite this, even the Fisher index shows a degree of variation which seems to be a cause for concern. For example, from Table 7, the Fisher flexible basket chained estimates of price change for the item category toilet paper (no item aggregation over stores) were calculated at (100.43 − 100=) 0.43%, (98.61 − 100=) −1.39% and (79.86 − 100=) −20.14% for quarterly, monthly and weekly time aggregation, respectively. Overall, for chained (fixed and flexible basket) Fisher indexes, the difference in price change estimates for the 19 item categories across different methods of time aggregation ranges from 0.28% to a surprisingly large 29.73%. With item aggregation over stores and using the flexible basket chained Fisher index, we find that on average the absolute difference between weekly and quarterly price change estimates is approximately 8%. When we look at indexes where items have been disaggregated over stores (Table 7), the average absolute difference increases to approximately 14%.

The observed volatility and extreme nature of some of our index number estimates (which is particularly evident when low levels of aggregation are combined with chaining) are consistent with findings in the existing literature; see Feenstra and Shapiro (2003), Reinsdorf (1999) and Dalen (1997). It is known that non-superlative (Laspeyres) indexes are prone to drift when price bouncing is evident; see Frisch (1936), Forsyth and Fowler (1981) and Szulc (1983). Importantly, our results indicate that even superlative indexes, when applied to weekly supermarket data, do not seem to be able to deal well with price bouncing behaviour; chained weekly superlative indexes, while not as unstable as the chained Paasche and Laspeyres results, also give us some implausible results.

Estimates of possible bias in CPIs due to the use of a fixed basket price index formula can also be obtained from Tables 2–7. The last row in each table contains the geometric mean of that table's index estimates, across all of the 19 supermarket item categories, for the whole sample period. The geometric means of the 19 item category estimates of price change for the Laspeyres direct quarterly, monthly and weekly indexes (with item aggregation over stores) were 102.15 (quarterly), 102.90 (monthly) and 103.75 (weekly); see Table 2. The corresponding geometric means of the 19 item category estimates of price change for the Fisher direct quarterly, monthly and weekly indexes were 101.77 (quarterly), 101.95 (monthly) and 102.00 (weekly); see Table 4.
By subtracting the geometric mean of the Fisher index numbers from their Laspeyres counterparts we obtain an approximate estimate of the average bias which is introduced when the Laspeyres formula is used in place of the superlative Fisher formula.11 Bias is estimated at 0.38, 0.95 and 1.75 index points for the quarterly, monthly and weekly indexes, respectively. These estimates of bias are based on unit values which aggregate transactions of the same item over stores in the region.
11 Statistical agencies do not actually use the Laspeyres formula; they use what is now called the Lowe (1823) index. However, under certain conditions, it can be shown that the bias in the Lowe index as compared to a superlative index is likely to be of the same order of magnitude (or bigger) than the bias between the Laspeyres index and the superlative index; see ILO (2004, 272–274).
Table 2
Laspeyres index: price change estimates—item aggregation over stores.

                 Direct                         Chained (fixed basket)          Chained (flexible basket)
Item category    Quarterly  Monthly  Weekly    Quarterly  Monthly  Weekly      Quarterly  Monthly  Weekly
Biscuits         98.89      100.74   101.94    98.50      109.04   185.77      96.21      101.66   166.95
Bread            104.33     106.69   108.87    104.91     114.05   562.24      104.88     113.76   615.50
Butter           100.95     102.91   100.11    101.50     106.85   145.14      101.91     107.48   145.60
Cereal           100.27     102.00   104.02    100.94     107.45   215.57      100.65     107.01   210.04
Coffee           111.14     112.38   115.70    111.57     126.21   274.76      111.49     125.72   267.83
Detergent        102.71     105.71   105.25    103.09     112.31   165.05      102.64     111.54   164.11
Frozen peas      100.78     100.73   101.75    101.28     108.25   202.12      100.94     107.24   195.92
Honey            104.77     105.93   105.52    104.87     108.14   120.40      104.42     107.27   119.30
Jams             100.49     101.52   102.08    100.99     107.29   174.15      100.09     105.47   167.01
Juices           101.74     101.77   104.21    102.69     110.82   332.11      101.90     109.65   318.52
Margarine        104.29     102.80   104.10    106.86     124.53   1606.77     106.81     124.86   1562.35
Oil              92.93      90.87    87.37     93.48      100.48   141.16      92.82      100.05   142.56
Pasta            100.88     101.16   104.88    101.22     110.46   347.14      100.30     109.38   342.19
Pet food         100.46     101.64   103.52    101.11     106.17   165.54      100.82     105.64   161.59
Soft drinks      104.13     106.41   108.65    105.95     132.27   1074.89     105.83     132.21   1024.45
Spreads          104.86     107.88   107.14    104.98     111.163  122.84      104.70     110.64   121.94
Sugar            106.37     107.20   106.71    106.07     111.39   149.44      106.09     111.43   149.47
Tin tomatoes     101.33     98.93    101.68    101.95     110.51   165.82      101.14     109.42   164.62
Toilet paper     100.61     99.62    100.46    103.99     125.71   1656.92     103.67     124.69   1571.90
Geo mean         102.15     102.90   103.75    102.88     112.52   269.10      102.41     111.54   263.97
Table 3
Paasche index: price change estimates—item aggregation over stores.

                      Direct                         Chained (fixed basket)            Chained (flexible basket)
               Quarterly  Monthly   Weekly     Quarterly  Monthly    Weekly      Quarterly  Monthly    Weekly
Biscuits          98.44    99.68     99.71        97.24     91.93      48.12        96.38     88.75      45.28
Bread            102.83   102.89    101.66       102.79     97.14      19.33       102.35     94.48      16.91
Butter           100.30   101.25     99.07        99.84     97.74      66.45        99.98     97.85      66.50
Cereal           100.23   100.73    102.64        99.33     94.98      43.82        99.12     94.47      43.87
Coffee           109.30   110.04    111.23       108.71     98.70      35.00       108.62     98.43      35.57
Detergent        102.39   104.67    103.82       101.89     97.83      61.16       101.52     96.68      59.82
Frozen peas      100.33   100.32    100.21       100.11     93.65      44.32        99.86     92.77      44.94
Honey            104.37   105.30    104.52       104.12    102.82      89.54       103.84    102.42      89.13
Jams             100.39   100.73     98.18        99.67     95.49      46.62        99.04     94.24      46.37
Juices           100.69    99.43     98.65       100.15     91.77      27.29        99.12     90.23      27.37
Margarine        103.14    97.96    102.39       101.37     80.57       5.52       100.72     80.31       5.59
Oil               91.05    87.72     83.21        90.07     75.76      42.41        88.93     74.02      39.83
Pasta            100.37   100.63    100.92        99.78     92.05      25.75        99.25     90.25      24.17
Pet food         100.56    99.88    101.84        99.92     95.35      59.10        99.65     94.73      59.74
Soft drinks      102.77   102.31    103.32       101.33     80.19       6.06       101.01     79.36       6.22
Spreads          103.91   105.87    105.57       103.81    103.12      88.23       103.73    102.85      87.85
Sugar            106.14   106.99    106.23       105.93    101.23      66.06       105.97    101.25      66.06
Tin tomatoes     101.32    98.16     98.73       100.46     89.45      53.31        99.58     88.64      51.96
Toilet paper      99.32    96.58     87.06        96.61     76.65       3.68        96.70     76.67       3.82
Geo mean         101.40   101.01    100.27       100.62     92.06      33.03       100.20     91.10      32.58
Theoretically, it would be more appropriate to treat items sold in different stores as separate commodities in the index number formula since the various stores may have differences in the quality of their service.12 The geometric mean of the 19 item category estimates of price change for the Laspeyres direct quarterly, monthly and weekly index number estimates (with no item aggregation over stores) were 102.83 (quarterly), 104.12 (monthly) and 105.55 (weekly); see Table 5. The corresponding geometric mean of the 19 item category estimates of price change for the Fisher direct quarterly, monthly and weekly index number estimates were 101.98 (quarterly), 102.12 (monthly) and 102.35 (weekly); see Table 7. The approximate estimates of bias are 0.85, 2.00 and 3.20 index points for the quarterly, monthly and weekly indexes, respectively. These are very substantial bias estimates, which suggest that there are
12 However, the drawback to treating each item in each store as a separate item is that matching sales of items across time periods becomes more difficult. If the time period is a month or a quarter, this difficulty is not a substantial one.
potentially large gains in index accuracy when moving from the use of a fixed base index to a superlative index.13 The chained estimates of price change appear to be quite unreliable and so cannot be used to assess the choice of whether to aggregate over stores in forming unit values. However, we can look at the direct comparisons of the Fisher indexes in Tables 4 and 7 to cast some light on the differences that result from different methods of aggregating over stores. From Table 4, the geometric means of the Fisher formula direct quarterly, monthly and weekly estimates of price change over the five quarters were 101.77 (quarterly), 101.95 (monthly) and 102.00 (weekly). These estimates are based on unit values that were formed by
13 It is somewhat troublesome that the bias estimates are so much larger (for weekly and monthly data) when we use the most disaggregated unit values as our price data as opposed to when we use unit values that are aggregated over stores. This unanticipated divergence in results suggests that even superlative price indexes may just be inherently unreliable when the unit value concept is defined over short time periods and disaggregated over stores due to the irregularity of purchases and the lack of matching.
Table 4
Fisher index: price change estimates—item aggregation over stores.

                      Direct                         Chained (fixed basket)            Chained (flexible basket)
               Quarterly  Monthly   Weekly     Quarterly  Monthly    Weekly      Quarterly  Monthly    Weekly
Biscuits          98.66   100.21    100.82        97.87    100.12      94.55        96.29     94.99      86.95
Bread            103.58   104.77    105.20       103.85    105.25     104.25       103.61    103.67     102.03
Butter           100.62   102.08     99.59       100.67    102.19      98.20       100.94    102.56      98.40
Cereal           100.25   101.37    103.33       100.13    101.02      97.19        99.88    100.54      95.99
Coffee           110.22   111.20    113.44       110.13    111.61      98.07       110.05    111.24      97.61
Detergent        102.55   105.19    104.53       102.49    104.82     100.48       102.08    103.84      99.08
Frozen peas      100.55   100.52    100.98       100.70    100.68      94.64       100.40     99.74      93.83
Honey            104.57   105.61    105.02       104.49    105.45     103.83       104.13    104.81     103.12
Jams             100.44   101.12    100.11       100.33    101.22      90.10        99.56     99.69      88.00
Juices           101.21   100.59    101.39       101.41    100.84      95.21       100.50     99.47      93.37
Margarine        103.72   100.35    103.24       104.08    100.16      94.17       103.72    100.14      93.44
Oil               91.99    89.28     85.26        91.76     87.25      77.37        90.86     86.05      75.35
Pasta            100.62   100.90    102.88       100.50    100.84      94.55        99.77     99.36      90.95
Pet food         100.51   100.76    102.68       100.51    100.61      98.91       100.23    100.04      98.25
Soft drinks      103.45   104.34    105.95       103.62    102.99      80.70       103.39    102.43      79.80
Spreads          104.39   106.87    106.35       104.39    107.07     104.11       104.22    106.67     103.50
Sugar            106.26   107.10    106.47       106.00    106.19      99.36       106.03    106.22      99.36
Tin tomatoes     101.32    98.55    100.20       101.20     99.43      94.02       100.36     98.48      92.49
Toilet paper      99.96    98.09     93.52       100.23     98.16      78.13       100.13     97.77      77.51
Geo mean         101.77   101.95    102.00       101.74    101.78      94.28       101.30    100.80      92.73
Table 5
Laspeyres index: price change estimates—no item aggregation over stores.

                      Direct                         Chained (fixed basket)            Chained (flexible basket)
               Quarterly  Monthly   Weekly     Quarterly  Monthly    Weekly      Quarterly  Monthly    Weekly
Biscuits          99.77   102.11    102.99       101.60    121.16      318.33       100.65    116.05      281.30
Bread            104.81   108.10    112.48       106.18    125.77     3146.25       106.16    126.05     2815.28
Butter           101.26   103.22    100.78       102.59    113.99      193.00       102.80    114.15      193.21
Cereal           100.77   103.56    104.53       102.54    123.24      361.49       102.36    122.85      354.71
Coffee           111.97   114.25    116.98       113.70    155.80      543.34       113.72    154.65      511.04
Detergent        103.27   106.61    105.69       104.15    125.14      227.96       103.50    123.70      228.01
Frozen peas      101.27   101.51    102.88       102.35    119.17      300.51       101.92    117.13      273.91
Honey            104.87   105.97    105.85       105.32    111.22      128.45       105.05    110.65      126.76
Jams             101.50   103.28    105.61       102.23    118.08      294.13       101.40    114.53      257.39
Juices           102.33   102.86    106.13       104.12    124.84      821.30       103.51    123.64      764.47
Margarine        105.54   106.09    107.85       111.53    182.67   13,897.59       111.94    187.85   14,578.97
Oil               93.00    91.10     88.33        94.18    103.21      132.41        94.10    104.66      155.57
Pasta            101.28   102.61    108.07       102.44    122.15      790.75       101.97    123.78      788.53
Pet food         101.32   102.01    104.82       102.93    114.15      263.49       102.53    113.26      241.45
Soft drinks      106.37   108.51    113.28       111.39    175.13   46,575.10       111.82    175.88   28,420.37
Spreads          104.77   107.67    107.49       105.72    115.39      140.14       105.51    115.43      140.69
Sugar            106.97   108.44    108.51       107.43    119.64      176.18       107.20    119.17      173.62
Tin tomatoes     102.48   101.12    103.57       103.44    119.06      212.26       103.15    117.36      208.30
Toilet paper     101.49   101.24    102.66       106.71    158.29   11,955.97       107.31    162.65   11,815.05
Geo mean         102.83   104.12    105.55       104.68    127.27      612.55       104.46    126.85      579.88
aggregating sales of a particular item across stores. From Table 7, the geometric means of the Fisher formula direct quarterly, monthly and weekly estimates of price change over the five quarters were 101.98 (quarterly), 102.12 (monthly) and 102.35 (weekly). Estimates in Table 7 (since the unit values are not aggregated over stores) are uniformly higher than their counterparts in Table 4, the differences being 0.21 (quarterly), 0.17 (monthly) and 0.35 (weekly) index points. Our results show that making an inappropriate decision on store aggregation can result in a bias in the order of 0.1–0.3 percentage points a year. So this leads to the question of when to aggregate over stores. In general, it is assumed that aggregation should occur across 'alike' or homogeneous units (Balk, 1998; Dalen, 1992; Reinsdorf, 1994). Typically, in this literature, stores are considered to be homogeneous if they offer the same level of service or quality. Therefore, statistical agencies will need to determine whether the stores (or any subset of the stores) which comprise their sample are considered to be homogeneous, as incorrect aggregation will lead to biased estimates of price change. As our data set does not include information on store characteristics it is difficult to determine which aggregation method is appropriate for this particular data set.
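As a rough check on this annualized magnitude (our own back-of-envelope calculation, treating the five-quarter sample as 15 months), scaling the quarterly, monthly and weekly differences to an annual rate gives:

\[ 0.21 \times \tfrac{12}{15} \approx 0.17, \qquad 0.17 \times \tfrac{12}{15} \approx 0.14, \qquad 0.35 \times \tfrac{12}{15} \approx 0.28 \text{ index points per year,} \]

which is consistent with the quoted 0.1–0.3 range.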
However, our results do indicate that, in general, aggregating over stores to construct unit values will lead to lower estimates of price change. An important caveat does exist to the above recommendations. We have seen that as the time period over which we construct unit values becomes smaller (from quarterly to monthly to weekly) our index number estimates become increasingly volatile and unreliable. This same pattern of increased volatility is also present when we move from constructing unit values for items over all stores to constructing unit values for an item over each store. In general our results indicate that, when using scanner data, indexes which are based on highly disaggregated unit values will lead to unstable estimates of price change. Therefore, our (tentative) recommendation to not aggregate over heterogeneous stores should only be implemented when doing so does not result in unwarranted price index volatility. Tables 2–7 also indicate that index estimates of price change are generally higher for the fixed basket chained indexes relative to their flexible basket chained counterparts. Thus, looking at Table 2 (where unit values are aggregated over stores), we see that the geometric means of the 19 quarterly, monthly and
Table 6
Paasche index: price change estimates—no item aggregation over stores.

                      Direct                         Chained (fixed basket)            Chained (flexible basket)
               Quarterly  Monthly   Weekly     Quarterly  Monthly    Weekly      Quarterly  Monthly    Weekly
Biscuits          98.25    98.99     99.07        96.37     84.02      23.68        95.25     80.41      22.67
Bread            102.63   101.11     98.53       102.11     88.35       3.20       101.87     86.54       3.50
Butter           100.00   100.47     98.52        98.95     91.88      48.46        98.91     91.93      48.23
Cereal           100.04    99.96    101.92        98.39     83.71      19.75        98.04     82.69      20.11
Coffee           108.87   108.79    110.46       107.07     79.44      13.65       106.97     79.83      15.08
Detergent        102.09   104.06    102.61       101.43     87.81      37.90       100.64     86.46      37.11
Frozen peas      100.37    99.97     99.97        99.65     86.20      26.71        99.20     85.79      29.23
Honey            104.18   104.89    104.27       103.66     99.90      81.14       103.38     99.54      80.94
Jams             100.86   101.19     97.60       100.21     89.29      23.92        98.49     86.80      25.79
Juices           100.57    98.89     97.17        99.21     82.54      10.51        98.09     80.96      10.82
Margarine        102.17    97.28    100.06        96.92     55.60       0.45        96.73     54.99       0.43
Oil               90.92    87.89     84.03        89.68     77.50      54.02        88.65     73.65      42.06
Pasta            100.48    99.98     97.74        99.03     83.65       8.33        98.28     79.39       7.65
Pet food         100.44    99.25    100.90        98.85     88.78      35.64        98.48     88.18      37.41
Soft drinks      101.76   100.50    101.23        97.46     59.76       0.12        96.74     59.49       0.19
Spreads          103.82   105.47    105.11       103.49     98.77      73.13       103.27     97.86      70.60
Sugar            106.15   106.34    105.46       105.31     95.36      46.09       105.08     94.49      46.55
Tin tomatoes     100.93    97.31     97.46       100.18     83.08      35.65        99.53     83.09      37.28
Toilet paper      98.26    92.66     86.90        93.89     59.74       0.48        93.98     59.78       0.54
Geo mean         101.14   100.15     99.24        99.50     81.92      12.77        98.95     80.69      13.18
Table 7
Fisher index: price change estimates—no item aggregation over stores.

                      Direct                         Chained (fixed basket)            Chained (flexible basket)
               Quarterly  Monthly   Weekly     Quarterly  Monthly    Weekly      Quarterly  Monthly    Weekly
Biscuits          99.01   100.54    101.01        98.95    100.90      86.82        97.91     96.60      79.86
Bread            103.72   104.54    105.27       104.13    105.41     100.26       104.00    104.44      99.32
Butter           100.63   101.84     99.64       100.75    102.34      96.71       100.83    102.44      96.53
Cereal           100.41   101.74    103.22       100.45    101.57      84.50       100.18    100.79      84.47
Coffee           110.41   111.49    113.67       110.34    111.25      86.13       110.30    111.11      87.79
Detergent        102.68   105.33    104.14       102.78    104.83      92.95       102.06    103.42      91.99
Frozen peas      100.82   100.73    101.42       100.99    101.35      89.60       100.55    100.24      89.48
Honey            104.52   105.43    105.06       104.49    105.41     102.09       104.21    104.95     101.29
Jams             101.18   102.23    101.53       101.22    102.68      83.88        99.93     99.71      81.48
Juices           101.45   100.86    101.55       101.63    101.51      92.90       100.76    100.05      90.94
Margarine        103.85   101.59    103.88       103.97    100.77      79.26       104.06    101.63      79.35
Oil               91.95    89.48     86.16        91.90     89.43      84.58        91.33     87.80      80.89
Pasta            100.88   101.28    102.78       100.72    101.08      81.18       100.11     99.13      77.68
Pet food         100.88   100.62    102.84       100.87    100.67      96.90       100.49     99.94      95.04
Soft drinks      104.04   104.43    107.09       104.19    102.30      75.53       104.01    102.29      74.28
Spreads          104.29   106.56    106.29       104.60    106.76     101.23       104.39    106.28      99.66
Sugar            106.56   107.38    106.97       106.36    106.81      90.11       106.14    106.12      89.90
Tin tomatoes     101.70    99.20    100.47       101.80     99.46      86.99       101.32     98.75      88.12
Toilet paper      99.86    96.86     94.45       100.10     97.24      75.79       100.43     98.61      79.86
Geo mean         101.98   102.12    102.35       102.05    102.10      88.45       101.67    101.17      87.43
weekly measures of chained fixed basket end of period prices are 102.86, 112.52 and 269.10, respectively. The geometric means of the 19 corresponding measures of chained flexible basket Laspeyres end of period prices are 102.41, 111.54 and 263.97, so that the chained flexible basket estimates are lower than their fixed basket counterparts by 0.45, 0.98 and 5.13 index points. There are similar differences between the fixed basket and flexible basket Paasche and Fisher indexes in Tables 3 and 4, with the fixed basket estimates being higher than their flexible basket counterparts. These differences are quite pronounced when Laspeyres and Paasche indexes are used. When the superlative Fisher index is used, this result is still apparent but considerably less pronounced.14 Since the flexible basket methodology seems to be clearly ''better'' in the sense that the flexible basket comparisons make maximum use of the data pertaining to any two consecutive periods (whereas the fixed basket comparisons do not), our results
14 Our results may be due to severe discounting of discontinued items. The fixed basket method would not pick up this discounting.
suggest that it is important to introduce new items into the basket as soon as they show up in the marketplace. If our findings can be generalised to other item categories, then this implies that fixing a market basket, particularly for item categories where item turnover is high, could bias price change estimates upwards. We would expect the impact of time aggregation on direct index estimates of price change to be minimal. However, if there are substantial trends in prices within the first and last quarters, then comparing price change from the first week to the last week in the 65 weeks in our data set is different from comparing price change from the first quarter to the last quarter. In any case, the differences between some of the estimates of price change due to time aggregation are considerable. Our tentative conclusions are as follows:
• The use of weekly chained index numbers, even those based on superlative index number formulae, is not recommended due to the erratic nature of the resulting indexes.
• Fixed base or direct comparisons of a current period with a base period seem to give reasonably reliable results, at least using monthly or quarterly data. However, these fixed base
comparisons suffer from the problems associated with new and disappearing goods; i.e., over time, it becomes increasingly difficult to match items. This lack of matching seems to result in an upward bias.
• Fixed base Laspeyres or Paasche indexes have large biases and should not be used in the scanner data context.
• It appears that forming unit values by aggregating the same product over stores in the same local market leads to superlative index numbers which are consistently lower than their counterpart indexes which do not aggregate over stores. We recommend that statistical agencies using scanner data should form their unit values by not aggregating over stores if quality differences exist across the stores which comprise the sample. This type of aggregation (i.e., no item aggregation over stores) will implicitly compensate for unmeasured quality differences across stores. However, if stores offer the same (or similar) levels of quality (and this may be the case with stores which belong to the same supermarket chain), then our recommendation is reversed.

Although we believe that the above recommendations are useful, they do not resolve the problem of how a statistical agency should use scanner data to aid in computing their CPI. A statistical agency could simply fix a base month or quarter and make a series of direct comparisons of the price and quantity data in the current month with the corresponding data in the base month using a superlative formula, but due to the introduction of new items and the disappearance of old items, the amount of item matching would steadily decrease over time, resulting in increasingly unreliable indexes. The use of chained indexes would avoid this problem, but as we have seen, price and quantity bouncing makes chained indexes very unreliable. However, we believe that the problems associated with both direct and chained indexes outlined above can be solved by applying multilateral index number theory to our data.

5. The use of a multilateral index number method to eliminate chain drift

Multilateral index numbers are often used for price and output comparisons across economic entities, such as countries; see Kravis (1984), Caves et al. (1982) and Diewert (1999). These multilateral indexes satisfy Fisher's (1922) circularity test, so that the same result is achieved if entities are compared with each other directly, or with each other through their relationships with other entities. Standard bilateral index number formulae do not satisfy this circularity or ''transitivity'' requirement. The transitive GEKS multilateral index (Gini, 1931; Eltetö and Köves, 1964; Szulc, 1964) is the geometric mean of the ratios of all bilateral Fisher indexes, where each entity is taken in turn as the base.15 Consider the case where there are M entities across which we wish to make transitive comparisons. Let P_{j,l} denote a Fisher price index between entities j and l, l = 1, . . . , M, and let P_{k,l} denote a Fisher price index between k and l. Then the GEKS index between j and k can be written as follows:

\[ \mathrm{GEKS}_{j,k} \equiv \prod_{l=1}^{M} \big( P_{j,l}/P_{k,l} \big)^{1/M}. \qquad (4) \]
Table 8
Quarterly GEKS and chained (flexible) Fisher indexes.

                Item aggregation over stores      No item aggregation over stores
                GEKS        Fisher                GEKS        Fisher
Biscuits         98.34       96.29                 98.88       97.91
Bread           103.48      103.61                103.67      104.00
Butter          100.72      100.94                100.70      100.83
Cereal          100.10       99.88                100.29      100.18
Coffee          110.16      110.05                110.44      110.30
Detergent       102.40      102.08                102.56      102.06
Frozen peas     100.43      100.40                100.76      100.55
Honey           104.42      104.13                104.44      104.21
Jams            100.16       99.56                100.74       99.93
Juices          101.01      100.50                101.28      100.76
Margarine       103.65      103.72                103.78      104.06
Oil              91.61       90.86                 91.80       91.33
Pasta           100.34       99.77                100.65      100.11
Pet food        100.46      100.23                100.84      100.49
Soft drinks     103.42      103.39                104.12      104.01
Spreads         104.34      104.22                104.35      104.39
Sugar           106.25      106.03                106.51      106.14
Tin tomatoes    101.05      100.36                101.58      101.32
Toilet paper    100.03      100.13                100.03      100.43
Geo mean        101.64      101.30                101.91      101.67
It can be easily shown that this index satisfies Fisher's circularity test, so that GEKS_{j,k} = GEKS_{j,l}/GEKS_{k,l}, and, given that the bilateral Fisher index satisfies the entity reversal test, we can also write GEKS_{j,k} = GEKS_{j,l} × GEKS_{l,k}. If we treat each time period as an ''entity'' we can make transitive comparisons across time periods using Eq. (4).16 Changing notation for ease of exposition in what follows, for time periods t = 1, . . . , T and reference period 0, we can write the GEKS price index between 0 and t as:

\[ \mathrm{GEKS}_{0,t} \equiv \prod_{l=0}^{T} \big( P_{0,l} \times P_{l,t} \big)^{1/(T+1)} = \prod_{\tau=1}^{t} \mathrm{GEKS}_{\tau-1,\tau}. \qquad (5) \]
The second equality in (5) comes from the circularity property being satisfied, so the direct GEKS index between periods 0 and t, GEKS_{0,t}, is identical to a period-to-period chained index, $\prod_{\tau=1}^{t} \mathrm{GEKS}_{\tau-1,\tau}$. Given the use of the Fisher index formula in (5), the multiperiod identity test of Eq. (3) is also satisfied and any resulting index is hence free of chain drift.17 The advantage of this approach over fixed base indexes is that we can use the flexible basket approach for each of the bilateral comparisons in the GEKS index. We estimate two types of GEKS indexes: first, the ''standard'' GEKS index and, second, a Rolling Window GEKS index.
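To make the GEKS construction in Eqs. (4) and (5) concrete, the following is a minimal Python sketch; it is our own illustration, not the authors' code, and the function names and array layout are hypothetical. Matched unit values and quantities for a single item category are assumed to be given.

```python
import numpy as np

def fisher(p0, q0, p1, q1):
    """Bilateral Fisher price index between a base and a comparison
    period, given matched price and quantity vectors."""
    laspeyres = np.sum(p1 * q0) / np.sum(p0 * q0)
    paasche = np.sum(p1 * q1) / np.sum(p0 * q1)
    return np.sqrt(laspeyres * paasche)

def geks(prices, quantities):
    """GEKS index levels of Eq. (5). `prices` and `quantities` are
    (T+1) x N arrays of matched unit values and quantities; the level
    for period 0 is normalised to 1."""
    n_periods = prices.shape[0]
    # All bilateral Fisher comparisons P[j, l], with entity j as base
    P = np.empty((n_periods, n_periods))
    for j in range(n_periods):
        for l in range(n_periods):
            P[j, l] = fisher(prices[j], quantities[j],
                             prices[l], quantities[l])
    # GEKS_{0,t} = prod_l (P[0, l] * P[l, t]) ** (1 / (T + 1))
    return np.array([
        np.prod([P[0, l] * P[l, t] for l in range(n_periods)])
        ** (1.0 / n_periods)
        for t in range(n_periods)
    ])
```

Because the bilateral Fisher index satisfies the time reversal test, the resulting series is transitive: the level at any period t equals the product of the period-to-period links, as in the second equality of Eq. (5).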
5.1. Standard GEKS indexes

''Standard'' GEKS indexes were calculated by an application of Eq. (5) to the 19 item categories. As GEKS indexes provide us with a drift-free measure of price change, we compare the results to those of their chained index counterparts to determine the extent of chain drift in the latter indexes.
15 Sometimes the term ''GEKS'', or just ''EKS'', is used to refer to the method of making any bilateral index number formula transitive using the same geometric averaging technique. Here we employ the more common usage of the term so that it refers to the multilateral index based on the bilateral Fisher index formula.
16 This approach is typically not used for constructing indexes across time due to the loss of characteristicity; see Drechsler (1973). Characteristicity refers to the ''degree to which weights are specific to the comparison at hand''; Caves et al. (1982). Drechsler (1973, 17) noted that ''characteristicity and circularity are always . . . in conflict with each other''. This conflict is usually resolved in the time series context by imposing chronological ordering as the unique ordering so that the issue of transitivity or circularity is not considered.
17 Kokoski et al. (1999, 141) noted that the use of transitive multilateral index number methods would eliminate the chain drift problem in a time series context. Balk (1981) also applied multilateral indexes in the time series context to maximise matching of items across time. What is new in our proposed method is the suggestion that the last link in a rolling year multilateral index be used to update a month to month or quarter to quarter CPI.
Fig. 1. GEKS and chained index results.
GEKS and chained (flexible basket) Fisher indexes were calculated at both quarterly and monthly intervals, and with item aggregation over stores and no item aggregation over stores; see Tables 8 and 9. Quarterly and monthly GEKS and Fisher indexes are plotted for two item categories, jams and oil, to illustrate the differences between the indexes over time; see Fig. 1(a)–(h). When GEKS and Fisher indexes constructed with quarterly time aggregation are compared, the Fisher indexes tend to exhibit downward drift. The Fisher index is found to be lower than the GEKS index for 15 of the 19 item categories when there is item aggregation over stores and 14 of the 19 item categories when there is no item aggregation over stores. In some cases the drift appears to be quite small, although for a number of item categories (e.g., biscuits, pasta, jams and juices) the extent of drift was not negligible. For the item categories that exhibit downward drift, the extent of drift ranges from −0.03% to −2.05% for item aggregation over stores and −0.11% to −0.97% with no item aggregation over stores.
With monthly time aggregation, the Fisher indexes again appear to be consistently lower than the GEKS indexes; lower for 16 of the 19 item categories when there is item aggregation over stores and 15 of the 19 item categories when there is no item aggregation over stores. With monthly time aggregation (as opposed to quarterly time aggregation), the extent of downward drift observed for many item categories over a relatively short period of time is quite substantial. The downward drift ranges from approximately −0.12% to −5.13% for item aggregation over stores and approximately −0.4% to −3.9% with no item aggregation over stores. Our results show that the use of a monthly chained superlative index such as the Fisher may be problematic, particularly over longer time periods where drift may lead to increasingly (downwardly) biased estimates of price change. Overall, the results indicate that for some item categories, quarterly aggregation over time appears to be able to sufficiently smooth out the price and corresponding quantity bouncing behaviour
Table 9
Monthly GEKS, RYGEKS and chained (flexible) Fisher indexes.

                Item aggregation over stores          No item aggregation over stores
                GEKS      RYGEKS    Fisher            GEKS      RYGEKS    Fisher
Biscuits        100.12    100.11     94.99            100.53    100.51     96.60
Bread           104.11    103.95    103.67            104.10    103.97    104.44
Butter          102.30    102.34    102.56            101.91    101.93    102.44
Cereal          101.24    101.16    100.54            101.49    101.38    100.79
Coffee          111.22    111.25    111.24            111.63    111.61    111.11
Detergent       104.83    104.75    103.84            105.04    104.95    103.42
Frozen peas     100.42    100.37     99.74            100.72    100.71    100.24
Honey           105.39    105.35    104.81            105.35    105.34    104.95
Jams            100.88    100.82     99.69            101.86    101.75     99.71
Juices          100.52    100.48     99.47            100.88    100.86    100.05
Margarine        99.82     99.77    100.14            101.36    101.31    101.63
Oil              88.46     88.33     86.05             89.21     89.14     87.80
Pasta           100.40    100.32     99.36            100.97    100.90     99.13
Pet food        100.72    100.70    100.04            100.76    100.79     99.94
Soft drinks     104.47    104.43    102.43            104.41    104.31    102.29
Spreads         106.79    106.82    106.67            106.72    106.80    106.28
Sugar           107.11    107.12    106.22            107.36    107.35    106.12
Tin tomatoes     98.71     98.81     98.48             99.45     99.58     98.75
Toilet paper     97.98     97.93     97.77             97.00     97.02     98.61
Geo mean        101.76    101.73    100.80            102.05    102.02    101.17
that is captured in scanner data and leads to chain drift. However, even with quarterly time aggregation, considerable drift is still found for a number of item categories. The results for monthly chained indexes indicate that the Fisher index tends to exhibit considerable downward chain drift. Importantly, this downward drift can readily be controlled using the suggested GEKS methodology.

A potential drawback of using the GEKS method as described above is that when a new period of data becomes available all of the previous period parities must be recomputed. For a statistical agency, this continuous process of revision is likely to be unacceptable. To overcome this problem while still maintaining the attractive properties of GEKS indexes we propose the use of what we have termed a Rolling Window GEKS index.

5.2. Rolling Window GEKS

We start by considering how we could update our standard GEKS index when new data become available, without having to revise all previously reported values. One possibility is that we update the existing values by using a ''chain link'', as follows. Let GEKS_{0,T+1} denote observation T + 1 of the GEKS index calculated over all T + 2 periods. Then using (5), we can write:

\[ \frac{\mathrm{GEKS}_{0,T+1}}{\mathrm{GEKS}_{0,T}} = \prod_{t=0}^{T+1} \big( P_{t,T+1}/P_{t,T} \big)^{1/(T+2)}. \qquad (6) \]

Given the lack of chain drift, the right hand side of Eq. (6) can be used as a chain link to update the GEKS index calculated using only the data available in period T, as follows:

\[ \mathrm{GEKS}_{0,T+1} = \mathrm{GEKS}_{0,T} \times \prod_{t=0}^{T+1} \big( P_{t,T+1}/P_{t,T} \big)^{1/(T+2)}. \qquad (7) \]

This updating of the original GEKS index can be continued to extend the series as more periods go by. However, the earlier data in the sample become less and less relevant for later comparisons. Hence, we propose a Rolling Window GEKS (RWGEKS). The RWGEKS approach uses a moving window to obtain the chain links to continuously update the price series as data for new periods become available, without the need to revise parities for previous periods. The rolling window works as follows. Suppose we initially have a window that covers data for the periods 0, . . . , T, over which our first GEKS index is calculated. When a new period of data becomes available our window moves forward one period in time, and will then be comprised of data for the periods 1, . . . , T + 1, from which our next GEKS index is calculated. The first GEKS series is then updated using a chain link derived in an analogous fashion to that in Eq. (7). For each new time period that becomes available, the first time period is dropped from the rolling window and the new time period is added. Let the window length be denoted by W + 1. Then a general expression for the RWGEKS index going from period 0 to period T > W + 1 is as follows:18

\[ \mathrm{RWGEKS}_{0,T} = \mathrm{GEKS}_{0,W} \times \prod_{\tau=W+1}^{T} \prod_{t=\tau-W}^{\tau} \big( P_{\tau-1,t}/P_{\tau,t} \big)^{1/(W+1)}. \qquad (8) \]

To calculate a RWGEKS index, a decision must be made about the number of periods included in the window. A natural choice for the length of a window is 13 months as it allows strongly seasonal commodities to be compared. We will refer to an index with a window based on a 13 month period as a Rolling Year GEKS (RYGEKS) index. For the fourteenth month, the RYGEKS is:

\[ \mathrm{RYGEKS}_{0,13} = \mathrm{GEKS}_{0,12} \times \prod_{t=1}^{13} \big( P_{12,t}/P_{13,t} \big)^{1/13}. \qquad (9) \]

The RYGEKS method is the method we recommend for use by statistical agencies.19 As most statistical agencies produce monthly price series we calculate a monthly RYGEKS series. As for the standard GEKS, RYGEKS indexes were calculated for the 19 item categories, and with item aggregation over stores and no item aggregation over stores.

18 Ivancic et al. (2009) provide a detailed step-by-step explanation of the implementation of the RWGEKS index. Here we use a more compact notation consistent with Haan and van der Grient (2009).
19 While a RWGEKS index, such as the RYGEKS, will not satisfy transitivity in practice and hence will be potentially subject to chain drift, comparisons within each window are transitive. Using this approach, chain drift is therefore unlikely to be a significant problem in any context likely to be faced by a statistical agency. Also, alternative approaches to linking the indexes could be investigated, such as using different overlapping periods for doing the linking, taking the geometric mean of overlapping comparisons in multiple windows, and so forth. The most obvious approach is pursued in this paper and works well in our empirical applications. An investigation into alternative approaches is left for future research.
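The rolling-window updating in Eqs. (8) and (9) can be sketched as follows; this is again our own illustrative Python (not the authors' implementation), reusing the hypothetical fisher and geks helpers from the sketch above, with a 13-period window corresponding to the RYGEKS.

```python
import numpy as np

def rygeks(prices, quantities, window=13):
    """Rolling-window GEKS series (Eqs. (8)-(9)). A standard GEKS is
    computed on the first `window` periods; each later period is then
    spliced on using the last chain link of a GEKS computed on the most
    recent `window` periods, so earlier parities are never revised."""
    n_periods = prices.shape[0]
    levels = list(geks(prices[:window], quantities[:window]))
    for t in range(window, n_periods):
        # GEKS over the current window [t - window + 1, ..., t]
        win = slice(t - window + 1, t + 1)
        g = geks(prices[win], quantities[win])
        # Within-window chain link GEKS_{t-1,t}; the window is
        # internally transitive, so this ratio is itself a GEKS link
        levels.append(levels[-1] * g[-1] / g[-2])
    return np.array(levels)
```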
Fig. 2. GEKS and RYGEKS indexes.
It is of interest to compare the GEKS and RYGEKS series as this gives us some indication of whether the RYGEKS index is sensitive to the length of the window chosen. It also indicates whether a 13 month window is long enough to provide us with a stable price series. From Table 9, the results show that there is very little difference between the standard GEKS and RYGEKS series. The average absolute differences between the GEKS and RYGEKS price series at the end of the 15 month period ranged from 0.005% to 0.16% for item aggregation over stores and 0.01%–0.13% for no item aggregation over stores. To illustrate how close the GEKS and RYGEKS indexes are over the whole time series, both GEKS and RYGEKS series are plotted for two item categories, toilet paper and butter; see Fig. 2(a)–(d). These results are very encouraging, particularly from the point of view of a statistical agency, as they indicate that the GEKS indexes provide us with a very stable method for estimating price change. Perhaps most importantly, the GEKS indexes (unlike some of their chained index counterparts) give plausible estimates of price change. Such stability indicates that GEKS indexes can deal well with volatile price and quantity series.
6. Comparison of ABS quarterly CPI with corresponding quarterly GEKS index

GEKS indexes by product category are compared with the corresponding official CPI figures in order to determine whether there might be any potential bias in the official figures. GEKS indexes were calculated for six item categories. Categories were chosen where official CPI figures are available for what were thought to be comparable item categories; see Table 10 for the subgroup headings in the Australian CPI which were matched to our scanner data item categories. In Australia the CPI is estimated on a quarterly basis. As our scanner data time series was quite short we are able to match only 4 quarters' worth of data with the official CPI series (i.e. the first quarter ends in June 1997, the second quarter ends in September 1997, the third quarter ends in December 1997 and the fourth quarter ends in March 1998). With four quarters of data it is not possible to estimate any RYGEKS indexes. Therefore, GEKS indexes were calculated between quarters 1 and 4, for both item aggregation over stores and no item aggregation over stores.
Our scanner data set contains information from supermarket chains located in one city, yet official CPI figures for our item categories were not available at the city level. Therefore, the official CPI figures that we use will reflect price change for the relevant item category for the whole of Australia. Thus our comparisons are only indicative of possible bias in the official CPI. From Table 10 our two GEKS price indexes are very similar, with the method of item aggregation having only a minimal impact. The GEKS indexes seem to be fairly similar to the official figures, with the exception of the product category cereal. There also does not seem to be any consistent pattern between the differences in the GEKS estimates and the official figures. Overall, the (absolute) differences between the GEKS indexes and the official figures range from 0.13% to 2.08% with item aggregation over stores and 0.04%–2.11% with no item aggregation over stores. When the item category cereal is excluded the differences range from 0.14% to 0.75% with item aggregation over stores and 0.04%–1.27% with no item aggregation over stores. Over this time period the official figures seem to compare well with the GEKS figures.20

7. Conclusion
When using high-frequency data, we have shown that decisions about how to aggregate and whether or not chaining is used can have a huge impact on estimates of price change. It is known that, when price bouncing is present, the use of chained non-superlative indexes tends to result in large chain drift. However, the extent of drift for many item categories over a relatively short time period is quite surprising. It is of concern that indexes which we would typically consider to be much more stable, such as chained superlative indexes, show a troubling degree of volatility when high-frequency data are used. Thus, traditional index number theory appears to break down when high-frequency data are used.
20 Ivancic et al. (2009) proposed an adaptation of another multilateral comparisons method to this context, the Country Product Dummy (CPD) method, again borrowed from the international comparisons literature; see Summers (1973) and Diewert (2004). Results were found to be very similar to those reported here for the GEKS indexes.
Table 10
Index number comparison: ABS CPI and GEKS indexes (April 97–March 98, 4 quarters, base = 100).

Item category                                   GEKS indexes                                Official CPI figures,
Scanner data    ABS                             Item aggregation   No item aggregation      Australia
                                                over stores        over stores
Cereal          Breakfast cereals                99.59              99.62                    97.51
Bread           Bread                           101.85             102.20                   102.41
Butter          Butter                           99.75              99.93                    99.89
Juices          Fruit juice                     101.63             101.69                   100.99
Sugar           Sugar                           104.60             104.72                   105.35
Soft drinks     Soft drinks & cordial           103.90             104.70                   103.43
Geo mean                                        101.87             102.13                   101.56
Our results suggest that unit values defined over months or quarters are preferable to unit values defined over weeks. Whether or not items are aggregated over stores in constructing the unit values appears to be a relatively minor consideration compared to the choices of time aggregation and index number formula, but we did find that Fisher indexes that did not aggregate over stores were consistently higher than their counterparts formed using unit values based on aggregating over stores.

An additional contribution of the paper is the proposal of a practical rolling window multilateral index number method, which provides drift-free estimates of price change. When monthly chained Fisher indexes are compared with their drift-free counterparts, they are typically found to exhibit downward drift. This drift can be substantial. Even quarterly time aggregation may not be sufficient to mitigate this problem, indicating the importance of our proposed drift-free index method.

Acknowledgements

The authors gratefully acknowledge constructive comments from Marshall Reinsdorf, Robert Feenstra, Jan de Haan, Alice Nakamura and Fei Wong, the financial support from the Australian Research Council (LP0347654 and LP0667655) and the SSHRC of Canada, and the provision of the data set by the Australian Bureau of Statistics.

References

Balk, B.M., 1981. A simple method for constructing price indices for seasonal commodities. Statistische Hefte 22 (1), 1–8.
Balk, B.M., 1998. On the use of unit value indices as consumer price subindices. In: Lane, W. (Ed.), Proceedings of the Fourth Meeting of the Ottawa Group. Bureau of Labor Statistics, Washington, DC.
Bradley, R., Cook, B., Leaver, S.G., Moulton, B.R., 1997. An overview of research on potential uses of scanner data in the US CPI. In: Paper Presented at the International Conference on Price Indices, Voorburg.
Caves, D.W., Christensen, L.R., Diewert, W.E., 1982. Multilateral comparisons of output, input, and productivity using superlative index numbers. Economic Journal 92, 73–86.
Dalen, J., 1992. Computing elementary aggregates in the Swedish consumer price index. Journal of Official Statistics 8 (2), 129–147.
Dalen, J., 1997. Experiments with Swedish scanner data. In: Paper Presented at the International Conference on Price Indices, Voorburg.
Diewert, W.E., 1976. Exact and superlative index numbers. Journal of Econometrics 4, 115–145.
Diewert, W.E., 1978. Superlative index numbers and consistency in aggregation. Econometrica 46, 883–900.
Diewert, W.E., 1993. The early history of price index research. In: Diewert, W.E., Nakamura, A.O. (Eds.), Essays in Index Number Theory, vol. 1. North-Holland, Amsterdam, pp. 33–65.
Diewert, W.E., 1995. Axiomatic and economic approaches to elementary price indexes. Discussion Paper No. 95-01. Department of Economics, University of British Columbia.
Diewert, W.E., 1999. Axiomatic and economic approaches to international comparisons. In: Heston, A., Lipsey, R.E. (Eds.), International and Interarea Comparisons of Income, Output and Prices. Studies in Income and Wealth, vol. 61. The University of Chicago Press, Chicago, pp. 13–87.
Diewert, W.E., 2004. On the stochastic approach to linking the regions in the ICP. Discussion Paper 04-16. Department of Economics, The University of British Columbia, Vancouver, Canada.
Drechsler, L., 1973. Weighting of index numbers in international comparisons. Review of Income and Wealth 19 (1), 17–47.
Eltetö, Ö., Köves, P., 1964. On a problem of index number computation relating to international comparisons. Statisztikai Szemle 42, 507–518 (in Hungarian).
Feenstra, R.C., Shapiro, M.D., 2003. High frequency substitution and the measurement of price indexes. In: Feenstra, R.C., Shapiro, M. (Eds.), Scanner Data and Price Indexes. University of Chicago Press, Chicago, pp. 123–146.
Fisher, I., 1922. The Making of Index Numbers. Houghton Mifflin, Boston.
Forsyth, F.G., Fowler, R.F., 1981. The theory and practice of chain price index numbers. Journal of the Royal Statistical Society, Series A 144, 224–246.
Frisch, R., 1936. Annual survey of general economic theory: the problem of index numbers. Econometrica 4, 1–39.
Gini, C., 1931. On the circular test of index numbers. Metron 9 (9), 3–24.
Haan, J. de, 2008. Reducing drift in chained superlative price indexes for highly disaggregated data. In: Paper Presented at the Economic Measurement Group Workshop, Sydney.
Haan, J. de, Opperdoes, E., 1997. Estimation of the coffee price index using scanner data: the choice of the micro index. In: Paper Presented at the International Conference on Price Indices, Voorburg.
Haan, J. de, van der Grient, H., 2009. Eliminating chain drift in price indexes based on scanner data. Statistics Netherlands, The Hague.
Hawkes, W.J., 1997. Reconciliation of consumer price index trends with corresponding trends in average prices for quasi-homogeneous goods using scanner data. In: Paper Presented at the International Conference on Price Indices, Voorburg.
Hawkes, W.J., Piotrowski, F.W., 2003. Using scanner data to improve the quality of measurement in the consumer price index. In: Feenstra, R., Shapiro, M. (Eds.), Scanner Data and Price Indexes. University of Chicago Press, Chicago, pp. 17–38.
Hill, T.P., 1993. Price and volume measures. In: System of National Accounts 1993. Eurostat, IMF, OECD, World Bank and United Nations, Brussels/Luxembourg, New York, Paris and Washington, DC, p. 379.
Hill, R.J., 2006. Superlative indexes: not all of them are super. Journal of Econometrics 130, 25–43.
ILO, 2004. Consumer Price Index Manual: Theory and Practice. International Labour Organization, Geneva.
Ivancic, L., Diewert, W.E., Fox, K.J., 2009. Scanner data, time aggregation and the construction of price indexes. Discussion Paper 09-09. Department of Economics, The University of British Columbia, Vancouver, Canada.
Jain, M., Abello, R., 2001. Construction of price indexes and exploration of biases using scanner data. In: Paper Presented at the Sixth Meeting of the International Working Group on Price Indices, Canberra.
Kokoski, M.F., Moulton, B.R., Zieschang, K.D., 1999. Interarea price comparisons for heterogeneous goods and several levels of commodity aggregation. In: Heston, A., Lipsey, R.E. (Eds.), International and Interarea Comparisons of Income, Output and Prices. Studies in Income and Wealth, vol. 61. The University of Chicago Press, Chicago, pp. 123–169.
Kravis, I.B., 1984. Comparative studies of national incomes and prices. Journal of Economic Literature 22, 1–39.
Lowe, J., 1823. The Present State of England in Regard to Agriculture, Trade and Finance, second ed. Longman, Hurst, Rees, Orme and Brown, London.
Reinsdorf, M., 1994. Price dispersion, seller substitution and the US CPI. Bureau of Labor Statistics Working Paper 252, March.
Reinsdorf, M., 1999. Using scanner data to construct CPI basic component indexes. Journal of Business and Economic Statistics 17 (2), 152–160.
Summers, R., 1973. International comparisons with incomplete data. Review of Income and Wealth 19 (1), 1–16.
Szulc, B., 1964. Indices for multiregional comparisons. Przeglad Statystyczny 3, 239–254 (in Polish).
Szulc, B.J. (Schultz), 1983. Linking price index numbers. In: Diewert, W.E., Montmarquette, C. (Eds.), Price Level Measurement. Statistics Canada, Ottawa, pp. 537–566.
Walsh, C.M., 1901. The Measurement of General Exchange Value. Macmillan, New York.
Walsh, C.M., 1921. The Problem of Estimation. P.S. King & Son, London.
Journal of Econometrics 161 (2011) 36–46
Eliminating chain drift in price indexes based on scanner data

Jan de Haan*, Heymerik A. van der Grient

Statistics Netherlands, Division of Macro-economic Statistics and Dissemination, P.O. Box 24500, 2490 HA The Hague, The Netherlands
Article info
Article history: Available online 15 September 2010
JEL classification: C43; E31
Keywords: Consumer price index (CPI); Chain drift; Multilateral index number methods; Scanner data; Superlative indexes
Abstract
The use of scanner data in the CPI makes it possible to compile superlative price indexes at detailed aggregation levels since prices and quantities are available. A potential drawback is the high attrition rate of items. The usual solution to handle this problem, high-frequency chaining, can create drift in the index series due to price and quantity bouncing arising from sales. Ivancic, Diewert and Fox (2009) have recently proposed an approach that provides drift-free, superlative-type indexes through adapting multilateral index number theory. In this paper we apply their proposal to seven product groups and find promising results. We compare the results with those obtained by using the Dutch method to deal with supermarket scanner data.
1. Introduction

The advantage of using scanner data in the Consumer Price Index (CPI) is that prices and quantities on all goods are available so that the construction of weighted (preferably superlative) price indexes at detailed aggregation levels becomes feasible. But scanner data also have a number of potential drawbacks, such as a high attrition rate of goods and volatility of the prices and quantities due to sales. High-frequency chaining seems a natural solution at first sight to handle new and disappearing goods, but that could lead to drift in weighted indexes when prices and quantities oscillate or ''bounce''.1 Quantity bouncing arises from the fact that households tend to stock up during sale periods and consume from inventory at times when the goods are not on sale. According to Triplett (2003, p. 152) we require ''a theory that adequately describes search, storage, shopping, and other household activities that drive a wedge between acquisitions periodicity and consumption periodicity''. While that may be true, in our opinion it is unnecessary to wait until all problems associated with the use of scanner data are resolved. Producing official statistics will always involve making assumptions and pragmatic choices. In particular, we assume that for a homogeneous good the unit value computed across all purchases from a single retail chain during a month is an acceptable measure of the average price paid
* Corresponding author. Tel.: +31 0 703375720. E-mail address: [email protected] (J. de Haan).
1 Szulc (1983) seems to have been the first to address the problem of price bouncing and chaining.
by the representative consumer.2 Essentially we are assuming that price and quantity variation within a month represents noise in the data and is not meaningful in the context of a CPI. Still, sales cause considerable bouncing of monthly unit values and quantities. A trivial solution to the problem of drift is not to chain at all and use a direct index, as suggested by Feenstra and Shapiro (2003). This is problematic considering the small number of products that match over time. Another solution would be to exclude goods that are on sale, which is what Statistics Norway does to compute monthly chained price indexes from scanner data; see Rodriguez and Haraldsen (2006). This is unsatisfactory too: it often happens that popular goods go on sale and excluding such goods leads to biased indexes unless long-run price trends are unaffected.3 An interesting approach has recently been proposed by Ivancic et al. (2009). They adapt multilateral index number theory to provide weighted indexes which make maximum use of all possible matches in the data between any two months and are free of drift. They write: ‘‘Discussion of methods of how best to use scanner data in the context of constructing consumer price indexes is particularly important at the present moment as statistical agencies worldwide are becoming increasingly interested in using scanner data in their official CPI figures. To our knowledge, scanner data are currently used only by two statistical agencies: the Central
2 Thus we aggregate across stores belonging to one chain, which often have a common pricing policy, but we do not aggregate across different chains. This is consistent with empirical findings by Ivancic (2007). For more information on the use of unit values, see Diewert (1995), Balk (1998), and ILO et al. (2004).
3 de Haan (2008a) investigated a third option where the superlative index number formula in a chained index is allowed to change over time.
Bureau of Statistics in the Netherlands and Statistics Norway’’. In January 2010, Statistics Netherlands expanded the use of scanner data to six major supermarket chains as part of an ongoing redesign of the CPI (de Haan, 2006; van der Grient and de Haan, 2010). The method developed by Ivancic et al. (2009) is not used, however, for reasons we will explain later on. The aim of the present paper is to give some background material on this novel approach, apply it to a large Dutch scanner data set to investigate whether it works as expected, and compare the results with those obtained using the new Dutch method for treating scanner data. The paper is structured as follows. Section 2 describes the scanner data set we will utilize, which covers seven product categories and 44 months. The data come from a single supermarket chain in the Netherlands and are currently inputs to the CPI. We focus on aspects like price and quantity bouncing, the lack of matching over time, and temporarily unavailable products. Section 3 confirms what others found earlier, namely that high-frequency chaining of price indexes, including superlative ones, can lead to drift when sales occur. For the monthly chained Törnqvist index we observe downward drift in most instances and illustrate why this is the case. In Section 4 we discuss the method proposed by Ivancic et al. (2009) to eliminate chain drift and find promising results. A slightly amended version is also presented. Section 5 addresses the Dutch method to handle supermarket scanner data and compares the results with those obtained by applying the Ivancic et al. (2009) method. The Dutch method is based on monthly chained (matched-items) Jevons indexes with three modifications: the use of cut-off sampling to remove items with very low expenditure shares, imputations for temporarily ‘missing’ prices and a filter that excludes items exhibiting a sudden strong drop in both prices and quantities. Section 6 concludes and points to future work in this area. 2. Features of scanner data Supermarket scanner data have three important features which should be borne in mind when compiling price index numbers: price and quantity bouncing as a result of sales, a high attrition rate of new and disappearing items, and temporarily unavailable items or ‘missing prices’. In this section we present illustrative examples of those features. Our data set covers 191 weeks of data (from January 2005 to August 2008) on seven product categories: detergents, toilet paper, diapers, candybars, nuts and peanuts, beef, and eggs. The product categories are not a random selection; we selected them for their heavy price bouncing behaviour. The data come from a large sample of stores belonging to one of the major supermarket chains in the Netherlands and are currently used in the CPI.4 Individual items are identified by the European Article Number (EAN). For all EANs, aggregate weekly expenditures and quantities are known, as well as a short item description. Dividing expenditures by quantities purchased gives the unit value, which is our measure of (average) price. Fig. 1 shows the weekly unit values, quantities and expenditures for a detergent referred to as XXX tablets. This particular item was on the market until the beginning of 2007. There seems to be a ‘regular price’ of slightly more than 6.5 euros. In quite a number of weeks the item is on sale, with price reductions up to 50%. 
From our own experience we know that a sales period in this supermarket chain lasts for exactly a week (Monday through Sunday), which coincides with our weekly data. Nevertheless, the unit value for the week after a heavy discount is
4 The scanner data are provided to Statistics Netherlands at marginal cost. The agency has a policy of not paying for data which are directly used for the compilation of statistics. Scanner data are confidential and cannot be made publicly available.
Fig. 1. Weekly unit values, quantities and expenditures; XXX tablets.
consistently much lower than the ''regular price''. This might be due to the fact that people who wish to buy a good that is on sale but happens to be sold out are entitled to purchase it at the sale price during the next week. So the unit values for post-sales weeks often include sale prices.5

Fig. 1 also shows what we have called quantity bouncing. The quantity shifts associated with sales are dramatic. Consumers react instantaneously to discounts and purchase large quantities of the good—as a matter of fact, they hardly buy the good when it is not on sale. In this respect it is inappropriate to speak of a regular price during non-sale weeks. Note that the pattern of expenditures is almost identical to the pattern of quantities.

Starting from the data for 191 weeks, we constructed monthly data by attributing either 4 or 5 weeks to actual (calendar) months. A priori one might expect the volatility of price and quantity data to diminish if we aggregated across months instead of weeks. This is not true for XXX tablets, as Fig. 2 shows. The monthly prices and quantities exhibit bouncing similar to the weekly data. For the larger part this is a result of the irregular pattern of weekly sales. Looking at the monthly unit values, the term regular price is indeed a misnomer: sale prices are now just as common as non-sale prices.

Another aspect of supermarket scanner data is the huge attrition rate: the number of disappearing and new items is usually large. Conversely, the number of items that are available in the stores for many weeks in a row is typically low. Fig. 3 displays the number of matched items for monthly data on detergents in three ways. The downward sloping curve shows how the set of items at the beginning of the period (January 2005) shrinks over time. Only seven out of the 58 initial items can still be purchased at the end of the period (August 2008).6 The upward sloping curve should be read in reverse order: it depicts the number of matches between the last month (August 2008) and each earlier month. A comparison with the downward sloping curve indicates that the total number of different types of detergent changes little in the long run because there are almost as many entries as exits. The third curve depicts the number of monthly matched items, i.e. items which are available in consecutive months. In the short run some marked changes occur. For example, it seems as if in August 2005 the supermarket chain removed part of its detergents assortment and replenished it gradually.

Fig. 4 plots monthly unit values for YYY toilet paper. This product has been unavailable during many months—the quantities are zero, giving rise to ''holes'' in the data set. Practitioners would probably say that the prices are temporarily missing. Any monthly chained, matched-item index number method misses the price change between the last month the item was available and the month it re-enters the stores. For instance, the price increase between April 2005 and October 2007 in Fig. 4 would be left out from the computation. The practical solution is to impute the ''missing prices''. We will return to this issue in Section 5 when discussing the new Dutch method.
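To illustrate this weekly-to-monthly aggregation step, here is a small Python sketch of our own; the column names, example figures and the mapping of weeks to calendar months are hypothetical.

```python
import pandas as pd

# Hypothetical weekly scanner records: one row per EAN per week,
# with the calendar month each week has been attributed to
weekly = pd.DataFrame({
    "ean":         ["8710001"] * 4,
    "month":       ["2005-01"] * 4,
    "expenditure": [650.0, 1300.0, 330.0, 640.0],   # euros
    "quantity":    [100, 400, 50, 98],
})

# Sum expenditures and quantities over the weeks attributed to each
# month, then divide: the monthly unit value (average price paid)
monthly = weekly.groupby(["ean", "month"], as_index=False)[
    ["expenditure", "quantity"]].sum()
monthly["unit_value"] = monthly["expenditure"] / monthly["quantity"]
print(monthly)
```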
5 This explanation was suggested to us by Lida Martens. In the post-sales week there may also be some goods left on the shelves that can still be bought at the sale price.
6 The obvious lesson for price measurement is that adhering to a strict matched-item principle—in other words, using a completely fixed sample of items—is impossible. This point is also stressed by Silver and Heravi (2005). They are especially interested in the use of quality adjustment methods to account for new and disappearing items.
Fig. 4. Monthly unit values; YYY toilet paper.
The EAN is a unique identifier at the lowest level of aggregation. In some cases this level may be too detailed: goods that are identical from the consumer's perspective may have different EANs. A fraction of the ''holes'' in the data set could be attributable to this effect. Matching by EAN might thus understate the number of matched products and overstate the rate of turnover of new and disappearing products. This is perhaps just a minor issue.

3. Chained superlative indexes

3.1. The problem of chain drift
Fig. 2. Monthly unit values, quantities and expenditures; XXX tablets.
Chained indexes may suffer from what is known as chain drift or chain link bias. Chain drift occurs if a chained index ‘‘does not return to unity when prices in the current period return to their levels in the base period’’ (ILO et al., 2004, p. 445). In this section we address chain drift in superlative price indexes.7 Let p0i and s0i denote the price and expenditure share of good i in the base period 0; pti and sti denote the corresponding values in the comparison period t (t > 0). For a fixed set of goods U the Fisher and Törnqvist price indexes are defined as
∑
( / )
sti
pti
p0i −1
;
(1)
i∈U
Fig. 3. Number of matched items; detergents.
impossible. This point is also stressed by Silver and Heravi (2005). They are especially interested in the use of quality adjustment methods to account for new and disappearing items.
1/2
i∈U
PF0t = ∑
between the last month (August 2008) and each earlier month. A comparison with the downward sloping curve indicates that the total number of different types of detergent changes little in the long run because there are almost as many entries as exits. The third curve depicts the number of monthly matched items, i.e. items which are available in consecutive months. In the short run some marked changes occur. For example, it seems as if in August 2005 the supermarket chain removed part of its detergents assortment and replenished it gradually. Fig. 4 plots monthly unit values for YYY toilet paper. This product has been unavailable during many months—the quantities are zero, giving rise to ‘holes’ in the data set. Practitioners would probably say that the prices are temporarily missing. Any monthly chained, matched-item index number method misses the price change between the last month the item was available and the month it re-enters the stores. For instance, the price increase between April 2005 and October 2007 in Fig. 4 would be left out from the computation. The practical solution is to impute the ‘missing prices’. We will return to this issue in Section 5 when discussing the new Dutch method.
s0i (pti /p0i )
PT0t
=
∏
(pti /p0i )(si +si )/2 . 0
t
(2)
i∈U
If the expenditure shares of all goods would coincide (sti = s0i = 1/N, where N denotes the number of goods), the Törnqvist index reduces to the Jevons index PJ0t =
∏
(pti /p0i )1/N .
(3)
i∈U
Many statistical agencies are nowadays using the Jevons index to compile price indexes at the elementary level if expenditure data are lacking. For scanner data an unweighted index number formula seems irrelevant, but the new Dutch method for the treatment of scanner data does apply the Jevons formula, as will be outlined in Section 5. We will start by distinguishing three periods: 0, 1 and 2. The chained Fisher and Törnqvist price indexes going from period 0 to
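To make (1)–(3) concrete, here is a minimal sketch in Python. This is our own illustration, not the authors' code (their computations were done in SAS); the dict-based data layout and function names are assumptions.

```python
import math

def fisher(p0, pt, s0, st):
    """Bilateral Fisher index, Eq. (1). p0, pt map item -> price in periods
    0 and t; s0, st map item -> expenditure share. Matched items only."""
    laspeyres = sum(s0[i] * pt[i] / p0[i] for i in p0)
    paasche = 1.0 / sum(st[i] * p0[i] / pt[i] for i in p0)
    return math.sqrt(laspeyres * paasche)

def tornqvist(p0, pt, s0, st):
    """Bilateral Törnqvist index, Eq. (2): share-weighted geometric mean."""
    return math.exp(sum(0.5 * (s0[i] + st[i]) * math.log(pt[i] / p0[i])
                        for i in p0))

def jevons(p0, pt):
    """Bilateral Jevons index, Eq. (3): unweighted geometric mean."""
    return math.exp(sum(math.log(pt[i] / p0[i]) for i in p0) / len(p0))
```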
7 The attraction of superlative price indexes is that they approximate the underlying cost of living index to the second order while being easy to compute (Diewert, 1976). These indexes also have many desirable axiomatic properties; see e.g. Diewert (1995) and ILO et al. (2004). The Fisher and Törnqvist indexes are the best known superlative indexes. Ehemann (2005) addresses chain drift in Fisher and Törnqvist indexes. On chaining, see also Forsyth and Fowler (1981).
We will start by distinguishing three periods: 0, 1 and 2. The chained Fisher and Törnqvist price indexes going from period 0 to period 2 are

$$P_F^{02,chain} = \left[ \frac{\sum_{i\in U} s_i^0 (p_i^1/p_i^0)}{\sum_{i\in U} s_i^1 (p_i^1/p_i^0)^{-1}} \right]^{1/2} \left[ \frac{\sum_{i\in U} s_i^1 (p_i^2/p_i^1)}{\sum_{i\in U} s_i^2 (p_i^2/p_i^1)^{-1}} \right]^{1/2}; \qquad (4)$$

$$P_T^{02,chain} = \prod_{i\in U} (p_i^1/p_i^0)^{(s_i^0+s_i^1)/2} \prod_{i\in U} (p_i^2/p_i^1)^{(s_i^1+s_i^2)/2}. \qquad (5)$$
Price bouncing for a single good is a stylized version of a situation we often observe in supermarket scanner data. Suppose good 1 has been on sale in period 1 and its price has decreased considerably ($p_1^1/p_1^0 < 1$) while in period 2 the price returned to the initial value ($p_1^2 = p_1^0$, or $p_1^2/p_1^1 = p_1^0/p_1^1$). The prices of all other goods are assumed fixed. Expressions (4) and (5) then simplify to

$$P_F^{02,chain} = \left[ \frac{s_1^0 \{(p_1^1/p_1^0) - 1\} + 1}{s_1^2 \{(p_1^1/p_1^0) - 1\} + 1} \right]^{1/2}; \qquad (6)$$

$$P_T^{02,chain} = (p_1^1/p_1^0)^{(s_1^0 - s_1^2)/2}. \qquad (7)$$
Standard micro-economic theory assumes that, given a set of prices, the quantities are uniquely determined. So if prices bounce we would expect the quantities, and hence the expenditure shares, to return to their initial levels ($s_1^2 = s_1^0$) so that $P_F^{02,chain} = P_T^{02,chain} = 1$. However, 'distortions' may give rise to a difference between $s_1^0$ and $s_1^2$. In this stylized example we have $P_F^{02,chain} < 1$ and $P_T^{02,chain} < 1$ for $s_1^2 < s_1^0$, and $P_F^{02,chain} > 1$ and $P_T^{02,chain} > 1$ for $s_1^2 > s_1^0$.

This example does not represent our weekly data very well. From Section 2 the following pattern emerges. In week 0 good 1 is sold at the regular price and the quantity is very low or almost zero. In week 1, when the good is sold at the low sales price, the quantity is extremely high. In week 2 the price of good 1 is only slightly higher than in week 1 (though much lower than the regular price) but now the quantity is low, though not as low as in week 0. In week 3 both the price and the quantity return to their initial levels. Assuming again that the prices of the other goods stay the same, the four-period chained Törnqvist index can be written as

$$P_T^{03,chain} = (p_1^1/p_1^0)^{(s_1^0+s_1^1)/2} (p_1^2/p_1^1)^{(s_1^1+s_1^2)/2} (p_1^3/p_1^2)^{(s_1^2+s_1^3)/2}. \qquad (8)$$

Can anything be said a priori about the expected sign of chain drift in $P_T^{03,chain}$ in the case of storable goods? The first component of (8) is probably the leading term: the strong price decrease $p_1^1/p_1^0$ receives an extraordinarily large weight due to the high quantity purchased in period 1 (in particular when the quantities of the other goods have decreased, which is most likely for substitutable goods). Although the weight of the second component of (8) may even be greater, the price increase $p_1^2/p_1^1$ is small and we expect its impact to be modest. The strong price increase $p_1^3/p_1^2$ receives a relatively small weight since the quantity in period 3 returns to the period 0 level. All in all, we would expect $P_T^{03,chain}$ to be below unity, so that downward drift prevails. In real life the situation is more complicated. The sign of the drift depends on the magnitude of the price decrease and the associated quantity shifts of all goods belonging to the product group, and on the periodicity of acquisition and consumption.8

Different goods can be on sale at different times. Furthermore, the set of goods U is typically not fixed. If it were, there would be no use in chaining—direct superlative price indexes such as the Fisher and Törnqvist, given by (1) and (2), should then be used. Aggregation across time might help reduce the problem of chain drift, assuming that high frequency price and quantity variation represents noise in the data. Statistical agencies do not compile CPIs on a weekly basis anyway, so it is rather obvious to work with monthly unit values and quantities. In Section 3.2 we present some evidence on this topic.

8 Feenstra and Shapiro (2003), using data on canned tuna, found that the weekly chained Törnqvist index had an upward drift: "in periods when the prices are low, but there are no advertisements, the quantities are not high [. . . ]. Because the ads occur in the final period of the sales, the price increases following the sales receive much greater weight than the price decreases at the beginning of each sale. This leads to the dramatic upward bias of the chained Törnqvist". That consumers are misinformed without advertisements surprises us a little bit. As was shown in Section 2, in our data set we observe instantaneous responses of consumers to strong price reductions: the quantities immediately increase dramatically and drop to almost zero in after-sales weeks.
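As a quick numerical check of the stylized example in (6) and (7), the following snippet uses made-up shares for good 1 (our own illustration); it confirms that both chained indexes fall below unity when the post-sale share is below the pre-sale share.

```python
price_rel = 0.5        # p_1^1 / p_1^0: the good sells at half price in period 1
s0, s2 = 0.02, 0.01    # made-up expenditure shares of good 1 in periods 0 and 2

p_fisher = ((s0 * (price_rel - 1) + 1) / (s2 * (price_rel - 1) + 1)) ** 0.5
p_tornqvist = price_rel ** ((s0 - s2) / 2)
print(round(p_fisher, 4), round(p_tornqvist, 4))  # 0.9975 0.9965: both below 1
```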
3.2. Results

Fig. 5 confirms what others have found before (Feenstra and Shapiro, 2003; Ivancic, 2007; de Haan, 2008a; Ivancic et al., 2009): weekly chaining of superlative indexes can lead to exceptionally large drift. For detergents we observe downward drift. Fisher and Törnqvist indexes measure a totally unrealistic price decrease of more than 90% in less than four years. The downward trend of the Jevons index is much smaller. This accords with expectations, as it is the asymmetry of expenditure weights that drives chain drift in superlative price indexes. Still, the price decrease measured by the Jevons seems rather large.

Fig. 5. Weekly chained price indexes; detergents.

As can be seen from Fig. 6, aggregating price and quantity data across months instead of weeks dramatically reduces chain drift. Although we cannot be sure that the monthly chained index numbers for detergents are completely free of drift, at least they look plausible. Notice that the Fisher and Törnqvist index numbers are almost identical, notwithstanding the volatility of the monthly price and quantity data. Monthly chaining raises the superlative indexes above the Jevons index. Nevertheless, the monthly Jevons price index numbers are higher than the weekly numbers. The sensitivity of the Jevons to time aggregation surprises us a bit.

Fig. 6. Monthly chained price indexes; detergents.

Fig. 7 shows what happens if we further aggregate over time and use quarterly unit values and quantities to compute quarterly chained indexes. This is not very helpful for statistical agencies that compile monthly CPIs, but it may be considered in Australia, New Zealand and other countries where the CPI is published on a quarterly basis. The results for detergents are striking. Quarterly chained superlative indexes measure a price increase of 20% or more. We find this implausible. The Fisher and Törnqvist indexes for the last quarter differ by five points, which is remarkable too. Fig. 7 seems to suggest that quarterly data suffer from 'too much' aggregation across time—the noise in the data has been eliminated, but at the cost of messing up the trend.

Fig. 7. Quarterly chained price indexes; detergents.

4. GEKS and rolling year GEKS indexes
4.1. The basic idea and some background

Ivancic et al. (2009), henceforth IDF, have recently proposed a method for constructing price indexes that use all matches in the data between any two periods and that are, in contrast to high-frequency chained indexes, free of drift. The method is an adapted version of the multilateral GEKS (Gini, 1931; Eltetö and Köves, 1964; Szulc, 1964) approach. The GEKS index is the geometric mean of the ratios of all bilateral indexes (computed with the same index number formula) between a number of entities, where each entity is taken as the base. Let $P^{jl}$ and $P^{kl}$ be the bilateral indexes between entities j and l (l = 1, ..., M) and between entities k and l, respectively. The GEKS index between j and k can then be written as

$$P_{GEKS}^{jk} = \prod_{l=1}^{M} \left( P^{jl}/P^{kl} \right)^{1/M} = \prod_{l=1}^{M} \left( P^{jl} \times P^{lk} \right)^{1/M}, \qquad (9)$$

where the second expression holds when the bilateral indexes satisfy the 'entity reversal test', so that $P^{kl} = 1/P^{lk}$. It can easily be shown that

$$P_{GEKS}^{jk} = P_{GEKS}^{jl}/P_{GEKS}^{kl}. \qquad (10)$$

Expression (10) says that the GEKS price index satisfies the circularity or transitivity requirement: the same result is obtained if entities are compared with each other directly or via their relationships with other entities. Multilateral indexes such as the GEKS are often used to make price comparisons across countries (or regions); see Diewert (1999a) and Balk (2001, 2008) for overviews. Transitivity is particularly useful to circumvent the choice of base or bridge country, but a drawback is that a transitive index for two countries depends on the data of all other countries—there is a loss of characteristicity.9 The GEKS method can be justified as a means of preserving characteristicity as much as possible. More specifically, the GEKS price index is the solution to minimizing $\sum_{j=1}^{M} \sum_{k=1}^{M} (\ln P^{*jk} - \ln P^{jk})^2$, the sum of squared differences between the logarithms of a (multilateral) index $P^{*jk}$ for a pair of countries j, k and the direct (bilateral) index $P^{jk}$. Notice that the direct index 'counts twice' in Eq. (9), namely for l = j and l = k.

IDF adapt the GEKS method to price indexes across time by treating each time period as an entity.10 That is, j and k in expression (9) are now time periods and l is the link period. Suppose we have data on prices and quantities at our disposal for periods 0, 1, ..., T. Choosing 0 as the index reference period and denoting the comparison periods by t (t = 1, ..., T), we can write the adapted GEKS index going from 0 to t as

$$P_{GEKS}^{0t} = \prod_{l=0}^{T} \left( P^{0l}/P^{tl} \right)^{1/(T+1)} = \prod_{l=0}^{T} \left( P^{0l} \times P^{lt} \right)^{1/(T+1)}, \qquad (11)$$

provided that the bilateral indexes satisfy the time reversal test. In that case the GEKS index also satisfies this test, i.e. $P_{GEKS}^{t0} = 1/P_{GEKS}^{0t}$. The transitivity property implies that the GEKS index can be written as a period-to-period chained index, i.e.

$$P_{GEKS}^{0t} = \prod_{\tau=1}^{t} P_{GEKS}^{\tau-1,\tau}, \qquad (12)$$

which should be free of chain drift. The bilateral indexes are all matched-item indexes: only price relatives of items that are purchased in the two periods compared enter the indexes. IDF call this a flexible basket approach. The GEKS approach thus makes maximum use of all possible matches in the data between any two periods, which can be seen as its most important property. Imputations to deal with 'missing prices' are therefore unnecessary. Any matched-item index, including the GEKS, does not explicitly account for quality change.11 For many fast-moving goods purchased in supermarkets quality change is arguably a minor issue. Even if quality changes are substantial, measuring prices of matched items might suffice under competitive market circumstances.

$P_{GEKS}^{0t}$, given by (11), depends on the price and quantity data of all time periods, including t + 1, ..., T. In real time we cannot produce an index based on future data. What we can do in practice is calculate the GEKS index for the current (most recent) period T using all the available data and update the time series as time passes. It is now more convenient to write the GEKS index going from period 0 to period T as

$$P_{GEKS}^{0T} = \prod_{t=0}^{T} \left( P^{0t}/P^{Tt} \right)^{1/(T+1)} = \prod_{t=0}^{T} \left( P^{0t} \times P^{tT} \right)^{1/(T+1)}. \qquad (13)$$

9 Characteristicity is "the property that requires the transitive multilateral comparisons between members of a group of countries to retain the essential features of the intransitive binary comparisons that existed between them before transitivity" (Eurostat and OECD, 2006, p. 127). Caves et al. (1982) refer to characteristicity as the "degree to which weights are specific to the comparison at hand".
10 In the context of price indexes for seasonal goods, Balk (1984, Ch. 4) describes a method that turns out to be equivalent to the GEKS method. Note that IDF borrow an alternative method from the international comparisons literature, the Country Product Dummy (CPD) method, and adapt it to provide price indexes free of chain drift. The resulting estimates have standard errors associated with them. They argue that the lack of standard errors is a drawback of the GEKS methodology. We disagree with this view. The choice of index number formula is what matters. Index numbers that do not rely on sampling, as with scanner data, have no standard errors, or at least no sampling error (unless there would be imputations involved). The CPD approach, like any model-based approach, adds error because of the use of a stochastic model.
11 Quality change can best be seen as the appearance of new products and the disappearance of 'old' ones at the lowest possible aggregation level. From an index number point of view quality adjustment methods should therefore estimate what the prices of those products would have been if they had been available. Put otherwise, quality adjustment methods such as hedonic regression are essentially imputation methods; see Diewert et al. (2007) and de Haan (2008b). This raises the question whether the GEKS approach would still be of use if we imputed all (temporarily) 'missing prices' through hedonic regression or the like, and if so, how the imputations would affect the GEKS index.
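Sketched below is how (13) could be computed once a full matrix of bilateral indexes is available. Again this is our own illustration, not the authors' implementation; the nested-list layout for `bilateral` is an assumption.

```python
import math

def geks(bilateral):
    """GEKS index from period 0 to period T, Eq. (13). bilateral[j][k] is
    the bilateral (e.g. Törnqvist) index from period j to period k, with
    bilateral[k][j] = 1 / bilateral[j][k] (time reversal test)."""
    T = len(bilateral) - 1
    logs = [math.log(bilateral[0][t] * bilateral[t][T]) for t in range(T + 1)]
    return math.exp(sum(logs) / (T + 1))
```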
Before discussing the updating of the time series we address one other issue first. While transitivity is a useful property, it is not a necessary requirement in a time series context where chronological ordering of the price indexes is the unique ordering. The GEKS index $P_{GEKS}^{0T}$ results from minimizing $\sum_{s=0}^{T} \sum_{t=0}^{T} (\ln P^{*ts} - \ln P^{ts})^2$ for any two periods s and t. But why should this be the optimal rule for deriving a price index going from 0 to T? Minimizing the sum of squared differences is a natural choice for a comparison between countries because the direct (bilateral) indexes are 'better' than other indexes. In a time series context, where a lack of matched items is the problem, the direct index may not be best. Suppose that the number of matches gradually decreases over time. The longer the period, the less we want to rely on the direct index. In other words, while in this case the direct index $P^{0T}$ is less representative than the indirect indexes $P^{0t} \times P^{tT}$ ($t \neq 0, T$), it has twice the weight.12 We therefore alternatively consider the unweighted geometric mean of the direct and indirect indexes, which obviously also makes use of all matches in the data between any two time periods:

$$P_{ALT}^{0T} = \prod_{t=1}^{T} \left( P^{0t} \times P^{tT} \right)^{1/T}. \qquad (14)$$

It can easily be shown that $P_{ALT}^{0T}$ is not transitive. If the bilateral indexes satisfy the time reversal test then so does $P_{ALT}^{0T}$. Now we turn to updating the time series. The GEKS index for period T + 1, using price and quantity data pertaining to all periods t = 0, ..., T + 1, is

$$P_{GEKS}^{0,T+1} = \prod_{t=0}^{T+1} \left( P^{0t}/P^{T+1,t} \right)^{1/(T+2)} = \prod_{t=0}^{T+1} \left( P^{0t} \times P^{t,T+1} \right)^{1/(T+2)}. \qquad (15)$$
A drawback is that the index number for period T would be revised if we re-computed it using the extended data set.13 We denote the revised index number by $P_{GEKS(0,T+1)}^{0T}$. There is however no need to publish the revised numbers. Since the time series is free of drift, we may use the change in the GEKS index (15) between T + 1 and T (i.e. $P_{GEKS}^{0,T+1}$ divided by $P_{GEKS(0,T+1)}^{0T}$, which are both computed on the data of periods 0, ..., T + 1) as the chain link to update the time series. Due to transitivity, for bilateral price indexes that satisfy the time reversal test we have

$$P_{GEKS}^{0,T+1}/P_{GEKS(0,T+1)}^{0T} = \prod_{t=0}^{T+1} \left( P^{t,T+1}/P^{tT} \right)^{1/(T+2)}, \qquad (16)$$

so that the index for period T + 1 would become

$$P_{GEKS}^{0,T+1} = P_{GEKS}^{0T} \prod_{t=0}^{T+1} \left( P^{t,T+1}/P^{tT} \right)^{1/(T+2)}. \qquad (17)$$
The same approach could be followed to extend the time series to periods T + 2, T + 3, etc. Clearly, any index changes derived from the time series constructed in this way, for instance the annual inflation rate, are affected by the prices and quantities pertaining to earlier periods. To diminish the loss of characteristicity, IDF use a so-called rolling year approach.
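A sketch of the updating rule (17): the published series is extended by splicing on the drift-free link, so earlier numbers never need revision in publication. The data layout follows the earlier sketch and is an assumption.

```python
import math

def extend_series(series, bilateral):
    """Append the index for period T+1 to a published GEKS series, Eq. (17).
    bilateral[j][k] now covers periods 0, ..., T+1."""
    T1 = len(bilateral) - 1  # T + 1
    link = math.exp(sum(math.log(bilateral[t][T1] / bilateral[t][T1 - 1])
                        for t in range(T1 + 1)) / (T1 + 1))
    series.append(series[-1] * link)
    return series
```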
12 On the other hand, if (nearly) all items do match between period 0 and period T, then we would in fact prefer the direct index. This suggests taking a weighted average of the direct and indirect indexes, where the weights somehow depend on the number of matches. Weights can be inserted into the minimization rule (see e.g. Balk, 2008, Ch. 7), but it is not easy to see how to derive weights without making arbitrary choices.
13 In the words of Hill (2004), the GEKS index violates time fixity. Most statistical agencies would find this unacceptable.
Fig. 8. Initial and revised (44 months) GEKS-Törnqvist indexes; detergents.
We assume that, like in most countries, the CPI is a monthly statistic. The rolling year approach uses the price and quantity data for the last 13 months to compute GEKS indexes. As in (17), the most recent month-to-month index change is then chain linked to the existing time series. The choice of a 13 month moving window is optimal in the sense that it allows a comparison of strongly seasonal items.14 Longer windows could be chosen, but that would lead to a greater loss of characteristicity. Using $P_{GEKS}^{0,12}$ as the starting point for constructing a monthly time series, the rolling year GEKS (RGEKS) index for month 13 becomes

$$P_{RGEKS}^{0,13} = P_{GEKS}^{0,12} \prod_{t=1}^{13} \left( P^{12,t}/P^{13,t} \right)^{1/13} = \prod_{t=0}^{12} \left( P^{0t}/P^{12,t} \right)^{1/13} \prod_{t=1}^{13} \left( P^{12,t}/P^{13,t} \right)^{1/13}. \qquad (18)$$
The general expression for the RGEKS index going from an arbitrary base month 0 to the current month T (T > 12) is

$$P_{RGEKS}^{0T} = \prod_{t=0}^{12} \left( P^{0t}/P^{12,t} \right)^{1/13} \prod_{\tau=13}^{T} \prod_{t=\tau-12}^{\tau} \left( P^{\tau-1,t}/P^{\tau,t} \right)^{1/13}. \qquad (19)$$
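A sketch of the rolling year mechanics of (18) and (19), assuming a function `bilateral(j, k)` that returns the bilateral (e.g. Törnqvist) index from month j to month k; this is our own illustration of the procedure, not the authors' code.

```python
import math

def rgeks_series(bilateral, T, window=13):
    """Rolling year GEKS series over months 0..T, per Eqs. (18)-(19)."""
    def geks_link(months, j, k):
        # GEKS index from month j to month k within the window, Eq. (9)
        return math.exp(sum(math.log(bilateral(j, l) / bilateral(k, l))
                            for l in months) / len(months))

    months = range(window)
    series = [geks_link(months, 0, t) for t in range(window)]  # ordinary GEKS
    for tau in range(window, T + 1):
        months = range(tau - window + 1, tau + 1)  # most recent 13 months
        # splice on the month-to-month GEKS link from the current window
        series.append(series[-1] * geks_link(months, tau - 1, tau))
    return series
```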
The rolling year method can also be applied to the alternative index given by expression (14), using $P_{ALT}^{0,12}$ as the starting point. GEKS and RGEKS indexes are preferably based on superlative bilateral indexes because they satisfy the time reversal test and have other desirable axiomatic properties. IDF calculate GEKS indexes using bilateral Fisher indexes. They also estimate RGEKS indexes for (no more than) three months—their data series is only 15 months long. We chose to work with Törnqvist price indexes and compute GEKS and RGEKS indexes for a much longer time period. In addition we will use Jevons bilateral price indexes to investigate the impact of weighting and to compare the results with the monthly chained Jevons price indexes presented in Section 5. The Jevons also satisfies the time reversal test.

4.2. Results

To get an idea of the potential effects of revisions, Fig. 8 depicts two monthly GEKS-Törnqvist indexes for detergents during January 2005–January 2006. The first one uses the data of those 13 months only; the second one is based on all data that are available to us (44 months), including data from February 2006 through August 2008. The revision is downward. While being small as compared to the volatility of the index numbers, it cannot be ignored.
14 Strongly seasonal goods can only be purchased during some months of the year. For a discussion on the problems associated with seasonality, see Diewert (1999b).
Fig. 9. Rolling year GEKS-Törnqvist and GEKS-Jevons indexes.
Fig. 9 shows monthly RGEKS-Törnqvist and RGEKS-Jevons indexes for all seven product categories. The alternative indexes, in which the direct bilateral (Törnqvist or Jevons) index counts once, are also shown. The RGEKS-Törnqvist indexes show no obvious sign of drift, as expected. The highly volatile pattern is somewhat surprising, as we would expect the RGEKS approach to smooth price fluctuations. In most cases the RGEKS-Jevons is much lower than the RGEKS-Törnqvist. For example, at the end of the sample period (August 2008) the RGEKS-Jevons and -Törnqvist indexes for detergents end up at 93 and 102, respectively. A similar difference was found in Fig. 6 for the monthly chained versions. Thus, the choice of aggregation method at the elementary level makes a lot of difference. Our results suggest that low expenditure items exhibited relatively small price increases or large price decreases. The volatility of the RGEKS-Jevons is less than that of the RGEKS-Törnqvist but still substantial. Notice that in general the alternative indexes are slightly higher than their RGEKS counterparts.

Fig. 10 compares the RGEKS-Törnqvist indexes (presented in Fig. 9) with monthly chained Törnqvist indexes and direct Törnqvist indexes. Except for detergents, where we find no obvious sign of drift, monthly chaining leads to downward drift. In a number of cases the drift is severe; for toilet paper the difference
between the RGEKS-Törnqvist and the chained Törnqvist has risen to 30 index points in August 2008. Direct price indexes are of course free of chain link bias but have the drawback of relying on an increasingly smaller set of items. Fig. 10 confirms that the direct (matched items) Törnqvist index should not be used.

5. Chained Jevons indexes

Scanner data were first introduced into the Dutch CPI in 2002. Price index numbers for two supermarket chains were calculated with the Lowe formula, based on a large cut-off sample of items (EANs) for each product group. The expenditure weights of the items were updated annually, or sometimes bi-annually, and the short-term index series were chained in December to obtain long-run series. Although weighting at the item level is a strong point, it had the drawback of 'amplifying' the impact of sales, as often the more popular items go on sale, and thus led to volatile index numbers. More importantly, new items could only be introduced in December unless they were selected as replacements
Fig. 10. Rolling year GEKS-Törnqvist, chained Törnqvist and direct Törnqvist price indexes.
for disappearing items. Searching for replacement items and trying to adjust for quality changes was a very labour intensive and time consuming process. This was true also for the initial selection of the basket of items.

As from January 2010 the use of scanner data has been extended to six major supermarket chains. The Jevons instead of the Lowe index number formula is now used. To update item samples as quickly as possible and enhance efficiency, monthly chained matched-item Jevons price indexes are computed. The method has a number of potential drawbacks for which solutions had to be found.

Since the Jevons is an unweighted index, relatively unimportant items, in terms of their expenditure shares, would have the same impact on the index as more important items. To reduce this effect somewhat, a crude type of implicit weighting will be applied through cut-off sampling: important items will be included in the sample with certainty whereas unimportant items will be excluded. An item i is selected for the index between months t − 1 and t if its average expenditure share (with respect to the set of matched items) in both months, $(s_i^{t-1} + s_i^t)/2$, is above a certain threshold value. The threshold is given by $1/(N^{t-1,t} \times \chi^{t-1,t})$, where $N^{t-1,t}$ denotes the number of matched items. Initially we chose $\chi^{t-1,t} = 2$. This means that, for example, if $N^{t-1,t} = 50$, then all items with an average expenditure share of more than 1% would be selected. Note that the number of matched items in the sample, $n^{t-1,t}$, as well as the sample aggregate expenditure share, $\sum_{i=1}^{n^{t-1,t}} (s_i^{t-1} + s_i^t)/2$, will change over time. Statistical agencies usually have fixed-size samples ('panels') to compute elementary aggregate price indexes (see e.g. Balk, 2005). A sketch of this selection rule is given below.
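A minimal sketch of the cut-off sampling rule; the dict layout is an assumption, and χ is shown with its initial value of 2 (later lowered to 1.25, see below).

```python
def select_items(s_prev, s_curr, chi=2.0):
    """Cut-off sampling: keep item i if its average expenditure share over
    months t-1 and t exceeds 1 / (N * chi), N = number of matched items."""
    matched = s_prev.keys() & s_curr.keys()
    threshold = 1.0 / (len(matched) * chi)
    return [i for i in matched if 0.5 * (s_prev[i] + s_curr[i]) > threshold]
```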
As mentioned earlier, the second drawback of a strictly matched-items method is that temporarily missing items are excluded from the computation, so that price changes occurring between the last month these items were in the sample and the month they re-enter the sample will be missed. The 'missing prices' are imputed, as is often done by statistical agencies, by multiplying the last observed price by the (Jevons) price index of the matched items within the product group in question. In a way we are forcing a panel element onto the dynamic matched-items approach.
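The imputation rule just described amounts to carrying the last observed price forward with the product group's matched-item Jevons index; a one-line sketch (our illustration):

```python
def imputed_price(last_observed_price, jevons_since_last_observed):
    """Impute a temporarily 'missing' price: last observed price times the
    chained matched-item Jevons index since the item was last observed."""
    return last_observed_price * jevons_since_last_observed
```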
Fig. 11. Chained Jevons price indexes; toilet paper.
Finally, like any matched-items approach, the method does not explicitly take quality changes into account. Since implicit quality-adjustment methods have been most prominent in the Dutch CPI in the past, in this respect the new method is similar to the old one. The newly-built computer system does allow for making explicit adjustments, just in case. In particular, quantity adjustments for changes in package size or contents could be made when deemed necessary. We expect this feature to be used infrequently (and hopefully not at all).

The impact of both adjustments, cut-off sampling and imputation, on the chained matched-item Jevons price index for toilet paper is shown in Fig. 11. The unadjusted index clearly has a downward drift. Cut-off sampling ($\chi^{t-1,t} = 2$) makes things worse. Imputing 'missing prices' turns the downward trend of the sample-based index into an upward trend, particularly during 2008.

Fig. 12 compares the adjusted chained Jevons indexes for all product groups with the RGEKS-Törnqvist indexes (from Fig. 9) to assess whether both adjustments eliminate the downward bias. The evidence is a bit mixed. For toilet paper the adjusted Jevons ends at the same level as the RGEKS, but in the middle of the observation period the difference is large. For detergents, diapers, candy bars and beef the adjusted Jevons performs rather well. On the other hand, for nuts and peanuts and for eggs the adjusted Jevons has a severe downward bias.
Fig. 12. Rolling year GEKS-Törnqvist indexes and chained Jevons price indexes with imputations and based on a cut-off sample.
To find a possible explanation for this bias, we had a closer look at the data. It turned out that some items exhibit a considerable price drop compared to the previous month in combination with an even sharper drop in the quantities sold. Apparently those items have become unpopular and are dumped. We decided to build a 'dumping filter' into the CPI system which excludes items exhibiting both a price decline of more than 20% and a decrease in expenditure of more than 80% (a sketch is given below). At the same time, based on our empirical work, we chose to slightly reduce the cut-off sample by setting $\chi^{t-1,t} = 1.25$ instead of $\chi^{t-1,t} = 2$. The improved results are also shown in Fig. 12. Particularly due to the dumping filter, the strong downward bias for nuts and peanuts and for eggs has now disappeared. We conclude that although the new Dutch methodology is not without difficulties, it produces satisfactory results in most cases.
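A sketch of the dumping filter with the two cut-offs from the text; the per-item scalar arguments are a simplification of whatever the production system actually uses.

```python
def passes_dumping_filter(p_prev, p_curr, e_prev, e_curr):
    """Return False for items that appear to be dumped: a month-on-month
    price decline of more than 20% combined with an expenditure decline
    of more than 80%."""
    dumped = (p_curr / p_prev < 0.8) and (e_curr / e_prev < 0.2)
    return not dumped
```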
6. Conclusions and future work

In this paper we have applied the method developed by Ivancic et al. (2009) and computed rolling year GEKS price index numbers for seven product categories. The method performs as expected: in contrast to monthly chained superlative price indexes, the RGEKS indexes show no sign of (chain) drift.

In spite of the promising results, Statistics Netherlands decided not to implement the RGEKS method in 2010 when scanner data from six major supermarket chains were incorporated into the CPI. Even if we had wanted to, it would have been impossible due to time constraints—designing and testing an official computer system takes a lot of time and effort, and we would not have been able to develop such a system on time.15 A drawback of the RGEKS method is a lack of transparency. CPI practitioners may have difficulties in trying to come up with explanations for implausible price changes. In our opinion this is not a convincing argument against using the RGEKS approach; if a method is clearly better than others, it should be implemented, unless there are serious practical problems or high costs that would prevent this. There is one reason, apart from time constraints, why this new methodology cannot immediately be used in the Dutch CPI. Statistics Netherlands has a policy of using only methods that are widely accepted. We interpret this rather vague statement as follows: methods do not necessarily have to be widely used, but they should be accepted as good practice by experts in the field and by the international statistical community. The RGEKS method is obviously in an early stage, and more evidence is needed to get it widely accepted.

We encourage other statistical agencies—especially those that are already using scanner data and those that are interested in doing so in the near future—to consider the RGEKS method and present empirical evidence. Three issues could be addressed. First, it would be useful to compare RGEKS indexes for seasonal goods, such as fresh fruit, with scanner data based price indexes calculated using traditional methods to cope with seasonality. Second, RGEKS price indexes can be computed at various levels of product aggregation. Our computations were done at a detailed level, but it would be worthwhile to compare them to indexes at higher aggregation levels. Third, in addition to monthly indexes, RGEKS indexes can be computed for weekly and quarterly data to investigate how increased aggregation over time affects the results. As they should be drift free, we would expect weekly, monthly and quarterly RGEKS indexes to exhibit similar trends.

Statistical agencies that publish the CPI on a quarterly basis, like the Australian Bureau of Statistics and Statistics New Zealand, are most likely interested in quarterly aggregations.
15 For this study we have used a statistical package (SAS) and a spreadsheet program. This would not be allowed for producing the Dutch CPI.
Fig. 13. Monthly and quarterly rolling year GEKS-Törnqvist, quarterly chained Törnqvist and quarterly direct Törnqvist price indexes; detergents.
We did some preliminary work on this and constructed quarterly RGEKS-Törnqvist indexes, using a five quarter window, for all seven product categories. For six categories the RGEKS method appeared to be insensitive to increased aggregation over time, the quarterly RGEKS indexes being very similar to the monthly counterparts. The exception is detergents. Fig. 13 depicts the quarterly RGEKS-Törnqvist indexes for detergents together with the quarterly direct and quarterly chained Törnqvist indexes as well as monthly chained Törnqvist indexes. The latter are calculated as re-scaled three-month averages of the index numbers shown in Fig. 6. The quarterly RGEKS index is much higher than the monthly RGEKS, which is a puzzling result that calls for further investigation.

Acknowledgements

We would like to thank Erwin Diewert and participants at the eleventh Ottawa Group meeting (27–29 May 2009, Neuchâtel, Switzerland) for helpful comments. We are also grateful to Lorraine Ivancic for letting us use her SAS program to compute the price index numbers. The views expressed in this paper are those of the authors and do not necessarily reflect the views of Statistics Netherlands.

References

Balk, B.M., 1984. Studies on the construction of price index numbers for seasonal products. Ph.D. Thesis, University of Amsterdam.
Balk, B.M., 1998. On the use of unit value indices as consumer price sub-indices. In: Lane, W. (Ed.), Proceedings of the Fourth Meeting of the Ottawa Group. Bureau of Labor Statistics, Washington, DC, pp. 112–120.
Balk, B.M., 2001. Aggregation methods in international comparisons: what have we learned? ERIM Report. Erasmus Research Institute of Management, Erasmus University Rotterdam.
Balk, B.M., 2005. Price indexes for elementary aggregates: the sampling approach. Journal of Official Statistics 21, 675–699.
Balk, B.M., 2008. Price and Quantity Index Numbers: Models for Measuring Aggregate Change and Difference. Cambridge University Press, New York.
Caves, D.W., Christensen, L.R., Diewert, W.E., 1982. Multilateral comparisons of output, input, and productivity using superlative index numbers. The Economic Journal 92, 73–86.
Diewert, W.E., 1976. Exact and superlative index numbers. Journal of Econometrics 4, 115–145.
Diewert, W.E., 1995. Axiomatic and economic approaches to elementary price indexes. Discussion Paper No. 95-01. Department of Economics, University of British Columbia.
Diewert, W.E., 1999a. Axiomatic and economic approaches to international comparisons. In: Heston, A., Lipsey, R.E. (Eds.), International and Interarea Comparisons of Income, Output and Prices. Studies in Income and Wealth, vol. 61. University of Chicago Press, Chicago, pp. 13–87.
Diewert, W.E., 1999b. Index number approaches to seasonal adjustment. Macroeconomic Dynamics 3, 1–21.
Diewert, W.E., Heravi, S., Silver, M., 2007. Hedonic imputation versus time dummy hedonic indexes. IMF Working Paper No. 07/234. IMF, Washington, DC.
Ehemann, C., 2005. Chain drift in leading superlative indexes. Working Paper No. 2005-09. Bureau of Economic Analysis, Washington, DC.
Eltetö, Ö., Köves, P., 1964. On a problem of index number construction relating to international comparisons. Statisztikai Szemle 42, 507–518 (in Hungarian).
Eurostat and OECD, 2006. Methodological Manual on PPPs. European Commission, Brussels.
Feenstra, R.C., Shapiro, M.D., 2003. High frequency substitution and the measurement of price indexes. In: Feenstra, R.C., Shapiro, M.D. (Eds.), Scanner Data and Price Indexes. University of Chicago Press, Chicago, pp. 123–146.
Forsyth, F.G., Fowler, R.F., 1981. The theory and practice of chain price index numbers. Journal of the Royal Statistical Society: Series A 144, 244–246.
Gini, C., 1931. On the circular test of index numbers. Metron 9 (9), 3–24.
van der Grient, H.A., de Haan, J., 2010. The use of supermarket scanner data in the Dutch CPI. Paper presented at the Joint ECE/ILO Workshop on Scanner Data, Geneva, 10 May 2010. Also available at: www.cbs.nl.
de Haan, J., 2006. The re-design of the Dutch CPI. Statistical Journal of the United Nations Economic Commission for Europe 23, 101–118.
de Haan, J., 2008a. Reducing drift in chained superlative price indexes for highly disaggregated data. Paper presented at the Ninth Economic Measurement Group Workshop, Sydney, 10–12 December 2008.
de Haan, J., 2008b. Hedonic price indexes: a comparison of imputation, time dummy and other approaches. Working Paper No. 2008/01. Centre for Applied Economic Research, University of New South Wales, Sydney.
Hill, R.J., 2004. Superlative index numbers: not all of them are super. Journal of Econometrics 130, 25–43.
ILO, IMF, OECD, Eurostat, United Nations, World Bank, 2004. Consumer Price Index Manual: Theory and Practice. ILO Publications, Geneva.
Ivancic, L., 2007. Scanner data and the construction of price indices. Ph.D. Thesis, University of New South Wales, Sydney.
Ivancic, L., Diewert, E.W., Fox, K.J., 2009. Scanner data, time aggregation and the construction of price indexes. Discussion Paper 09-09. Department of Economics, University of British Columbia, Vancouver, Canada.
Rodriguez, J., Haraldsen, F., 2006. The use of scanner data in the Norwegian CPI: the new index for food and non-alcoholic beverages. Economic Survey 4, 21–28.
Silver, M., Heravi, S., 2005. A failure in the measurement of inflation: results from a hedonic and matched experiment using scanner data. Journal of Business and Economic Statistics 3, 269–281.
Szulc, B., 1964. Indices for multiregional comparisons. Przeglad Statystyczny 3, 239–254 (in Polish).
Szulc, B.J., 1983. Linking price index numbers. In: Diewert, W.E., Montmarquette, C. (Eds.), Price Level Measurement. Statistics Canada, Ottawa, pp. 537–566.
Triplett, J.E., 2003. Using scanner data in consumer price indexes: some neglected conceptual considerations. In: Feenstra, R.C., Shapiro, M.D. (Eds.), Scanner Data and Price Indexes. University of Chicago Press, Chicago, pp. 151–162.
Journal of Econometrics 161 (2011) 47–55
Price dynamics, retail chains and inflation measurement

Alice O. Nakamura a,∗,1, Emi Nakamura b,1, Leonard I. Nakamura c,1
a University of Alberta, Canada
b Columbia University, United States
c Federal Reserve Bank of Philadelphia, United States
Article info
Article history: Available online 22 September 2010
JEL classification: C43; E30; E31
Keywords: Price rigidity; Inflation measurement; Menu costs; Price indexes; CPI
Abstract: We use a large scanner price dataset to study grocery price dynamics. Previous analyses based on store scanner data emphasize differences in price dynamics across products. However, we also document large differences in price movements across different grocery store chains. A variance decomposition indicates that characteristics at the level of the chains (as opposed to individual stores) explain a large fraction of the total variation in price dynamics. Thus, retailer characteristics are found to be crucial determinants of heterogeneity in pricing dynamics, in addition to product characteristics. We empirically explore how the price dynamics we document affect price index measures.
1. Introduction

Price index specialists are involved in defining, compiling and assessing the measures of inflation that central bankers and macroeconomists rely on. Price dynamics have implications for the choice of a true target index for inflation. The nature of price dynamics is also an important determinant of the data collection methodologies adopted for price index programs.2 For instance, sectors or products that exhibit more volatile prices (e.g., fresh fruit and vegetables) are typically sampled more frequently than product categories with more stable prices. As well, certain types of price dynamics can cause the selected measures for a target index to be biased.3
∗ Corresponding author. Tel.: +1 604 264 1549; fax: +1 604 264 1518. E-mail addresses: [email protected] (A.O. Nakamura), [email protected] (E. Nakamura), [email protected] (L.I. Nakamura).
1 The views expressed here are those of the authors and do not necessarily reflect those of the Federal Reserve Bank of Philadelphia or of the Federal Reserve System.
2 See ILO (2004) and IMF (2004), the new International CPI and PPI Manuals.
3 A price index formula used for evaluating a target consumer price index is said to be biased if the expected value differs from the target index.
Ivancic et al. (2009) and Haan and van der Grient (2009) explain, for example, that the chain drift bias problem is caused by a particular sort of price dynamics known as "price bouncing", and they provide empirical evidence of this bias problem for Australia and The Netherlands.4 Temporary sales are a recognized source of price bouncing, and there is some evidence that the frequency of temporary sales has been increasing in the United States.5 The US Bureau of Labor Statistics (BLS) collects prices regardless of whether they are identified as "sale" or "regular" prices. The same is true for Statistics Canada.6 Thus the BLS and Statistics Canada are interested in evidence about the severity of chain drift price index bias and ways of reducing this problem. In contrast, Germany, Italy and Spain do not include price discounts for seasonal sales periods in their Consumer Price Index (CPI) data collection.7 Thus the statistical agencies in those nations are interested in empirical evidence about whether the movements of regular prices and sale prices are redundant for measuring trends and business cycle fluctuations in inflation.
4 For other sorts of price dynamics that can cause price index bias, see Diewert et al. (2009), Diewert and Nakamura (1993), Feenstra and Shapiro (2003) and Sobel (1984). The paper of Ivancic and Fox (2010) also contains related material.
5 See Pashigian (1988) and Nakamura and Steinsson (2008).
6 Statistics Canada (1996, p. 5): "Since the Consumer Price Index is designed to measure price changes experienced by Canadian consumers, the prices used in the CPI are those that any consumer would have to pay on the day of the survey. This means that if an item is on sale, the sale price is collected".
7 See the Technical Appendix in Dhyne et al. (2005).
Many previous analyses of retail price dynamics have focused on heterogeneity in pricing behavior across products, of which there is a great deal. One reason for this focus is that data broken down by product category have been more readily available. For example, the database of prices underlying the CPI for the United States is collected and organized according to product category. Also, numerous studies of retail price dynamics have used the Dominick's database at the University of Chicago Graduate School of Business, which is for one retail chain. In contrast, in this paper, we use a dataset consisting of millions of price observations per year at a large number of grocery stores in numerous retail chains to document the nature and dispersion of high frequency price dynamics across stores and chains (in addition to products).

We document a vast amount of heterogeneity across retailers in the nature of pricing behavior even for identical products. While some chains exhibit frequent price drops associated with temporary sales and "high–low" pricing schemes, others exhibit more price stability, such as that associated with "everyday low prices". We find that pricing patterns at the level of chains (as opposed to individual stores) play a particularly large role in accounting for price dynamics. There is far more variation in pricing dynamics across chains for a given product than among stores within a given chain. We carry out a variance decomposition to analyze the importance of various determinants of the prevalence of price volatility and temporary sales. Our analysis reveals that the treatment of chains is an important issue for measures of aggregate pricing dynamics. Our analysis also provides a useful reference point for existing studies of individual stores or chains, such as those based on the Dominick's database.8

Our empirical analysis confirms that temporary sales, which occur frequently in many stores, are important determinants of price dynamics in the United States. To investigate the implications of this phenomenon, we compare price index measures calculated using all prices and those calculated using only "regular prices" (i.e., using only prices excluding temporary sales). Our results also confirm the importance for the United States of the chain drift problem documented for Australia by Ivancic et al. (2009) and for The Netherlands by Haan and van der Grient (2009). Our findings indicate that the extent of chain drift is likely to differ significantly across different products and chains.

In the price index area, another contribution of our results is to show that a substantial portion of the price dynamics associated with an individual outlet might be captured by looking at price movements for any representative store from the same chain. This finding is relevant as well for considering how the elemental unit of observation for inflation measurement should be defined. For example, Diewert (forthcoming) suggests that: "[I]nstead of calculating outlet specific unit values for a commodity, a unit value could be calculated over all outlets in the market area".

Section 2 presents a simple macroeconomic model of price rigidity as a conceptual framework for our empirical results. We introduce our data in Section 3. In Section 4, we present estimates of the frequency of price change computed with, and without, temporary sales as well as regular prices.
8 Influential studies based on the Dominick’s database include Chevalier et al. (2003). Many other studies of pricing and inventory behavior use descriptive statistics computed using the Dominick’s data. For example, Woodford (2009) makes use of statistics reported for the Dominick’s database by Midrigan (2008), and Kryvtsov and Midrigan (2008) calibrate their model based on statistics computed using the Dominick’s database.
We confirm for multiple product categories that the measured frequencies (and other attributes) of price change differ greatly depending on the treatment of temporary sales. Also in the latter part of Section 4 and then in Section 5, we show that the measured attributes of price change differ greatly not just over products, but over stores, and especially for stores in different retail chains. In Section 6 we directly explore the implications of temporary sales for price index making, and find more evidence that temporary sales matter. Section 7 concludes.

2. Index number theory and models of price rigidity

The measures of price change studied by price index specialists, as well as central bankers and macroeconomists, have traditionally been defined in the context of utility theoretical models. In the economic theory of index numbers, the study of household price indexes focuses on the Konüs (1939) true cost of living index. This index is defined as the ratio of the minimum cost of achieving a certain reference utility level in a base period, given the prices prevailing at that time, versus at a later "current" period given the prices then. The target national price index for consumer products is a generalization of the Konüs true cost of living index for a single household (see Pollak, 1980, 1981 and Schultze and Mackie, 2002). Diewert (1998) explains that in the index number literature, individuals, individually and collectively, are typically viewed as maximizing utility subject to exogenously given prices and incomes.

One puzzle for central bankers and macroeconomists is that the economic circumstances faced by businesses are in constant flux, but prices for most products change relatively infrequently.9 Menu cost models were created as an aid to understanding the observed price rigidity. Utility maximizing households play a central role in these models, much as in the models that price index specialists specify in defining a target consumer price index. However, instead of treating the prices and incomes as exogenously determined, a general equilibrium framework is used.

To illustrate the role of price dynamics in macroeconomic models, we sketch out a simple menu cost model with utility maximizing households and profit maximizing decision makers for grocery stores, chains, and the producers of the products sold in retail grocery stores. The model is a simple version of standard models in the monetary economics literature. A key feature is that retailers must pay fixed costs – "menu costs" – in order to make nominal price changes. Related models of price rigidity are widely used by macroeconomists as well as policymakers for analyzing the effects of monetary policy. Parameters such as the frequency of price change have important effects on the predictions of such models for the evolution of macroeconomic variables, and for their prescriptions for optimal monetary policy. Here, we have adapted the model to recognize the existence of chains, stores, and products with potentially different menu costs and idiosyncratic shocks.10 We should note that, while we view it as a useful starting point, there are important limitations of this simple model's ability to fit retail price dynamics.11

Households are assumed to consume a continuum of differentiated products represented by the vector z. Element i, s, c of z represents the volume of product i bought at store s in chain c.
9 See Nakamura and Steinsson (2008) and Klenow and Kryvtsov (2008) for recent analyses of this phenomenon.
10 See, for example, Golosov and Lucas (2007), Midrigan (2008) or Nakamura and Steinsson (2009) for recent examples of menu cost models in the monetary economics literature. The model we present here is adapted from the model analyzed in Nakamura and Steinsson (2009).
11 For example, simple versions of this model do not generate temporary sales. See e.g., Kehoe and Midrigan (2008) for an extension of the menu cost model designed to fit this feature of the data.
That is, a particular value of z represents a particular product (e.g., a 2 L bottle of Coca Cola) purchased from a particular store in a particular chain. The composite consumption good, $C_t$, is a Dixit–Stiglitz index of the differentiated product–store–chain combinations, defined as

$$C_t = \left[ \int_0^1 c_t(z)^{\frac{\theta-1}{\theta}} \, dz \right]^{\frac{\theta}{\theta-1}}, \qquad (1)$$

where $c_t(z)$ denotes household consumption in period t, and $\theta$ denotes the elasticity of substitution for the differentiated goods. For any given level of spending in period t, households are assumed to choose the consumption bundle that yields the highest level of the index $C_t$. This implies that household demand for z is

$$c_t(z) = C_t \left( \frac{p_t(z)}{P_t} \right)^{-\theta}, \qquad (2)$$

where $p_t(z)$ is the price for z in t. The price level for the economy, $P_t$, is

$$P_t = \left[ \int_0^1 p_t(z)^{1-\theta} \, dz \right]^{\frac{1}{1-\theta}}, \qquad (3)$$

and the households are assumed to maximize their discounted expected utility given by

$$E_t \sum_{j=0}^{\infty} \beta^j \left[ \frac{1}{1-\gamma} C_{t+j}^{1-\gamma} - \frac{\omega}{\psi+1} L_{t+j}^{\psi+1} \right], \qquad (4)$$

where $E_t$ is an expectations operator conditional on information known at time t, $C_t$ is household consumption, and $L_t$ is household labor supply. As is typical for menu cost models, we assume that households discount future utility by a factor $\beta$ per period, have constant relative risk aversion equal to $\gamma$, and have parameters $\omega$ and $\psi$ that determine the level and convexity, respectively, of the disutility of labor.

On the producer side, we assume that the production function for each product–chain–store combination is

$$y_t(z) = A_t(z) L_t(z), \qquad (5)$$

where, for period t, $y_t(z)$ denotes the output level, which in equilibrium will equal consumption $c_t(z)$, and $L_t(z)$ denotes the quantity of labor employed for the production of z in period t. $A_t(z)$ is an "idiosyncratic shock" to the marginal cost of producing one unit (in quality adjusted terms) of the product–store–chain combination z. This term is a stand-in for all factors generating a desire for stores to adjust their prices. We assume that $A_t(z)$ has a distribution with mean 1 and variance $\sigma^2(z)$. Notice that our specification allows for the possibility that these idiosyncratic shocks may vary across the different product–store–chain combinations. Letting $D_{t,t+j}$ denote the producer discount factor for time period t versus t + j, the expected discounted profits are given by

$$E_t \sum_{j=0}^{\infty} D_{t,t+j} \Pi_{t+j}(z), \qquad (6)$$

where profits in t are given by

$$\Pi_t(z) = p_t(z) y_t(z) - W_t L_t(z) - K(z) I_t(z). \qquad (7)$$

$I_t(z)$ is an indicator variable that equals 1 if the firm changes its price in period t and 0 otherwise, $K(z)$ is the "menu cost" (i.e., the fixed cost of price adjustment for product i in store s and chain c) and $W_t$ is the wage rate. For simplicity, we assume that $W_t$ is common to all firms, and that all idiosyncratic cost factors are incorporated in the idiosyncratic shock, $A_t(z)$. The producer is assumed to maximize the discounted expected sum of profits, given by Eq. (6), subject to Eq. (5), which is the production function, as well as product demand and the evolution of wages, all of which may depend on the state of the macroeconomy.

Central bankers and macroeconomists are interested in determining when and how monetary policy can be used to tame inflation. Thus, they are interested in how exogenous price shocks are, or are not, passed on. In the model described above, the key parameter in determining how rapidly underlying shocks are passed on to prices is the magnitude of the menu cost $K(z)$. Standard models of how monetary policy affects the economy, including the workhorse policy models of central banks, build in assumptions about the frequency of price change based on estimated values reported in empirical studies. A key question, therefore, is what determines this cost, and whether it varies over time and across different firms in the economy.
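The decision problem implied by (6) and (7) has a simple flavor: adjust the nominal price only when the expected gain exceeds the menu cost $K(z)$. The following is a schematic one-period caricature of that trade-off, not the model's actual dynamic solution.

```python
def posted_price(p_current, p_desired, gain, menu_cost):
    """One-period caricature of the menu cost decision in Eqs. (6)-(7):
    gain(p) is the per-period profit at price p, menu_cost is K(z)."""
    if gain(p_desired) - gain(p_current) > menu_cost:
        return p_desired  # pay the menu cost and reset the price
    return p_current      # leave the nominal price unchanged
```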
3. Data

Our analysis is based on proprietary scanner price data, consisting of weekly price and quantity observations for product sales at grocery stores across the United States.12 The scanner dataset we use is from a national sample of hundreds of grocery stores belonging to numerous grocery chains. The dataset represents over 20 billion dollars of retail sales annually for thousands of UPCs, with tens of millions of observations per year. Thus, we are working with a dataset that, in multiple dimensions, is orders of magnitude larger than the typical micro price data analyzed by the BLS.13

The retail stores covered by our data are a sample of grocery stores, including some supercenters, but excluding drug stores. These data are for the years 2001 through 2005. We focus on three categories of products: coffee, cold cereal, and soft drinks. Each product category contains a large number of distinct products identified by individual UPCs. We construct weekly average prices (i.e., unit prices) for products defined at the Universal Product Code (UPC) level by dividing store level dollar sales by the sales volumes.14 Recent evidence indicates that what we refer to as temporary sales, or simply as sales, account for a large fraction of the volatility in unit prices.15 Our dataset includes a flag for whether a given price–quantity observation is associated with a temporary sale.16

Numerous previous studies have been carried out using scanner data for one or for a small number of stores or chains. For example, a number of important studies are based on the Dominick's database, maintained by the University of Chicago Graduate School of Business, which is for a single retail chain. However, without broad store and chain coverage, it is impossible to properly address some of the questions taken up in our study.
12 Our analysis is based on Information Resources, Inc. (''IRI'') data, as analyzed and interpreted by us, which are a subset of the data described in Bronnenberg et al. (2008). IRI has neither reviewed nor approved of any analyses or conclusions described in this paper.
13 It should be noted, however, that the store and chain coverage for the data are determined by the data provider instead of by a statistical sampling procedure of the type used by the BLS.
14 A UPC is an exact identifier of a product. This type of product identifier is much more specific than the product categories typically used by national statistical agencies (such as the ELI system used by the BLS or the European COICOP system). Our unit prices are for products identified by their UPCs. This is the situation in which the use of unit values is said to be ideal, unlike some other uses made of unit values to describe price movements for mixed groups of products (see Diewert, forthcoming). For more on retail sector measurement, see Nakamura (1999).
15 For more on the importance of temporary sales for explaining retail price dynamics, see Pesendorfer (2002), Hosken and Reiffen (2004a,b) and Nakamura and Steinsson (2008).
16 The temporary sales are identified using a standardized algorithm, implemented by the data vendor, which identifies cases in which prices decline temporarily by substantial amounts. This algorithm is similar to the one described in Kehoe and Midrigan (2008).
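The vendor's sale-flag algorithm itself is not public; the toy filter below only illustrates the general idea described in footnote 16 (flagging prices that sit temporarily and substantially below a local ''regular'' price), with the window length and discount threshold as arbitrary assumptions.

```python
import pandas as pd

def flag_temporary_sales(prices, window=5, min_discount=0.05):
    # Stand-in for the vendor's proprietary filter: take the rolling modal
    # price as the "regular" price and flag weeks at least min_discount below it.
    p = pd.Series(prices, dtype="float64")
    regular = p.rolling(window, center=True, min_periods=1).apply(
        lambda w: w.mode().iloc[0], raw=False
    )
    return p <= (1.0 - min_discount) * regular

print(flag_temporary_sales([2.0, 2.0, 1.5, 2.0, 2.0, 1.5, 2.0]).tolist())
# -> [False, False, True, False, False, True, False]: the V-shaped dips are flagged
```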
Fig. 1. Prices, ''regular'' prices and sales. The figure plots a price series for 12-packs of a particular soft drink UPC in the dataset. The frequent ''V'' shaped price movements are temporary sales.
A key advantage of the scanner dataset we use relative to other potential data sources, such as the CPI Research Database used in a number of studies (e.g., Hosken and Reiffen, 2004a; Bils and Klenow, 2004), is that it includes many different store quotes for most UPCs. In contrast, for the CPI Research Database, an average of just seven price quotes are collected per month for each product category and geographical area. Moreover, BLS price collection methods often result in different UPCs being collected at different stores so prices for a given UPC are only available for a single store in a particular geographical area. Even for questions that can be appropriately addressed using data for one store or chain, or for a small number of stores or chains, our data permit a check on the generality of the prior results.

4. Basic statistics on retail price dynamics

Fig. 1 depicts a typical price series from our dataset. The regular price of the product remains unchanged throughout the time interval despite frequent sales. In addition, there are missing price observations that arise due to stock-outs and in periods when a product was available but no units sold. In principle, price changes are readily observable: one simply looks for differences in value for successive prices. However, to calculate the frequency of price change for data series like the one in Fig. 1, a procedure is needed for dealing with missing observations. We are also interested in studying the frequency of price change separately for regular prices and those including temporary sales.

One approach to measuring the frequency of price change is to focus only on contiguous pairs of observations. Fig. 2 shows how price changes are recognized for this procedure. The frequency of price change is the number of contiguous pairs of price observations (regular, or regular and sale prices, as specified) that have different values (i.e., the number of price changes for contiguous pairs) divided by the total number of contiguous pairs (including those with no price change). A second procedure involves ''filling in'' the last observed price of any sort through periods with missing prices when computing the frequency of price change including sales, or filling in the last observed regular price through periods with missing prices and/or sale prices when computing the frequency of price change excluding sales.17 This procedure has the advantage that it yields many more observations for calculating the frequency of price change.
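The two procedures can be summarized in a few lines of code. The sketch below (our notation, not the authors' code) computes the weekly frequency of price change for a single UPC–store series under either the contiguous-pairs rule or the fill-in rule, and with or without temporary sale prices; the revenue weighting used in the tables that follow would simply be a weighted mean of these per-series frequencies.

```python
import pandas as pd

def freq_price_change(prices, sale_flags, fill_in=False, include_sales=True):
    # prices: weekly prices with NaN for missing weeks (stock-outs, no units sold).
    # sale_flags: True where the vendor's flag marks a temporary sale.
    p = pd.Series(prices, dtype="float64")
    flag = pd.Series(sale_flags, dtype="bool")
    if not include_sales:
        p = p.where(~flag)      # sale weeks are treated like missing regular prices
    if fill_in:
        p = p.ffill()           # carry the last observed price through gaps
    prev = p.shift(1)
    valid = p.notna() & prev.notna()   # contiguous (or filled-in) pairs
    if valid.sum() == 0:
        return float("nan")
    return float(((p != prev) & valid).sum() / valid.sum())

prices = [2.0, None, 2.0, 1.5, 2.0, 2.0]
sales  = [False, False, False, True, False, False]
print(freq_price_change(prices, sales, fill_in=True, include_sales=True))   # 0.4: counts the sale "V"
print(freq_price_change(prices, sales, fill_in=True, include_sales=False))  # 0.0: regular price never changes
```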
17 See Nakamura and Steinsson (2008) for a detailed discussion of the differences between ‘‘filled-in’’ and contiguous observations.
Fig. 2. Frequency of price change calculation for contiguous observations. Note: The figure illustrates how the first procedure can be used to estimate the frequency of price change in the presence of sales and missing values. The top row gives the sale flag value (''R'' indicates a regular price while ''S'' indicates a sale price). The ''w/Sales'' row toward the bottom gives the values of an indicator variable set equal to 1 for an observed price change (including price changes due to sales) and set equal to 0 otherwise. The very bottom row gives the values for an analogous indicator for observed price changes excluding changes due to sales. The graph illustrates our approach based on ''contiguous observations''. As explained in the text, we also consider an approach based on filling in missing prices, but that approach is not illustrated in this figure.
Tables 1 and 2 present results on the frequency of price change based on both procedures explained above for dealing with missing observations.18 The reported average weekly frequency and size of price change figures are weighted means, with the weights set equal to the dollar revenue figures for the included UPC–store–chain combinations. In Table 1, significant differences are evident across the product groups in terms of the flexibility of prices. Also, the frequencies of price change including temporary sales are much higher than when the temporary sales prices are omitted. This indicates that a large fraction of retail price variability arises from temporary sales.

Table 2 presents an analogous set of statistics for the average absolute value of log price changes. These results can be interpreted as giving roughly the average percentage change in individual prices conditional on adjustment. These percentage change figures when sales prices are included (columns 1 and 3) are high relative, say, to aggregate inflation, which averaged around 2% per year in the United States over 2001–2005. This finding suggests that the vast majority of grocery price movements are idiosyncratic to particular products, stores or chains as opposed to being common to the whole economy. Notice also that the absolute size of price change including sales is much larger than the absolute size of price change with sale prices excluded.

The two procedures discussed yield similar qualitative results, as others have also found. Hence, in the rest of our tables, we present only results based on the ''filled-in'' procedure.

Column 1 of Table 3 reports the fractions of weeks when products in the different categories were on sale. Though substantial, all these fractions are less than half. Yet, the second column of Table 3 shows that sales account for more than three fourths of the price changes.19

We next provide evidence on the variation in price dynamics across stores for a given product. Columns 1 and 2 of Table 4 report the standard deviation for the frequency of price change across UPCs within each product category, with and without including temporary sales prices. The estimated standard deviations when
18 Both types of procedures are sometimes used by the BLS in dealing with missing observations in the calculation of aggregate price indexes. The BLS does not make use of the ‘‘fill-in’’ procedure for compilation of the CPI except in exceptional cases, but this is often used by the BLS in the construction of import and export price indexes. 19 The statistic is calculated by comparing the frequency of sale and non-sale price change reported in Table 1.
Table 1. Summary statistics on frequency of price change.

                  ''Filled-in'' prices          Contiguous observations
Category          Incl. sales (%)  Excl. sales (%)  Incl. sales (%)  Excl. sales (%)
Coffee            30.2             6.83             33.8             13.1
Cold cereal       26.0             5.98             27.5             8.81
Soft drinks       54.2             7.21             57.2             27.2

Note: The table reports the weighted mean frequency of price changes for each of the product categories discussed in the paper. ''Incl./Excl. sales'' denote including/excluding temporary sale prices. The weights are the total dollar sales for each UPC–store combination.

Table 2. Summary statistics on absolute size of price changes.

                  ''Filled-in'' prices          Contiguous observations
Category          Incl. sales (%)  Excl. sales (%)  Incl. sales (%)  Excl. sales (%)
Coffee            19.0             7.98             18.7             7.40
Cold cereal       27.1             4.12             27.2             4.12
Soft drinks       20.0             8.75             20.0             8.60

Note: The table reports the weighted mean absolute size of price changes for each of the product categories discussed in the paper. The weights are the total dollar sales for a given product–store combination.

Table 3. Summary statistics on prevalence of temporary sales.

Category          Frequency of temp. sales (%)   Fraction of price changes due to temp. sales (%)
Coffee            24.4                           77.4
Cold cereal       17.8                           77.0
Soft drinks       44.0                           86.7

Note: The table reports the weighted mean frequency of temporary sales for each of the product categories discussed in the paper. The weights are the total dollar sales for a given product–store combination. Missing observations are filled in.
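For concreteness, the column 2 statistic in Table 3 can be reproduced, on our reading of footnote 19, from the ''filled-in'' columns of Table 1: the share of price changes due to sales is the gap between the frequencies including and excluding sales, relative to the frequency including sales.

```python
# Frequencies from Table 1 ("filled-in" prices), in percent per week.
freq_incl = {"Coffee": 30.2, "Cold cereal": 26.0, "Soft drinks": 54.2}
freq_excl = {"Coffee": 6.83, "Cold cereal": 5.98, "Soft drinks": 7.21}

for cat in freq_incl:
    share = 100 * (freq_incl[cat] - freq_excl[cat]) / freq_incl[cat]
    print(f"{cat}: {share:.1f}%")   # 77.4%, 77.0%, 86.7% -- Table 3, column 2
```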
sale prices are included are substantially larger than when sale prices are excluded (a range of 10%–20% versus 3%–7%). We see, therefore, that a substantial fraction of the cross-sectional variation in the frequency of price change arises from variation in the prevalence of temporary sales. Columns 3 and 4 of Table 4 show the values for the standard deviation for the frequency of price change across stores. The ranges of values in column 3 (with sale prices included) and column 4 (with sale prices excluded) are similar to the ranges of values in column 1 and column 2, respectively. However, the standard deviation for the frequency of price change across stores within chains (columns 5 and 6) is much smaller than for the frequency of price change across all stores (columns 3 and 4). Looking now at Table 5 and comparing the column 3 figures (for variation across stores within chains) with the columns 1 and 2 figures, we see that a substantial amount of the variation in the prevalence of sales across stores is accounted for by differences among chains.

We next investigate the extent to which the variability in price dynamics across stores and products can be related to differences in product popularity and store size. Table 6 reports the results of regressions of the frequency of price change and of temporary sales on total sales revenue for given UPCs. The analysis answers the question of whether UPCs tend to exhibit different price adjustments depending on the volumes sold (e.g., the volumes for Coke versus less popular soft drink products). We find a robust positive relationship between UPC sales volumes and price flexibility. For the product categories we study, an increase in total sales volume of one hundred thousand dollars is associated with an increase in the frequency of price change of 0.5–1 percentage points when temporary sales prices are included, an increase of 0.04–0.2 percentage points when temporary sales prices are excluded, and an increase in the frequency of temporary sales of 0.4–0.7 percentage points. In all cases but one, the estimated relationship is statistically significant.20

We also ran OLS regressions of the frequency of price change and temporary sales on the size of the store, where the latter is measured by the estimated annual total sales volume for all products for a store, in millions of dollars. In our dataset, the annual store sales volumes range from around 5 million dollars to over 40 million dollars. The results are reported in Table 7. We find that larger stores tend to have more frequent price changes and temporary sales. These effects are statistically significant in all categories considered.

5. Variance decompositions

To more formally analyze the role of products, stores, and chains in explaining cross-sectional variation in price dynamics, we next decompose the observed price variation into two broad classes: (1) variation common to all UPCs within a given product category (such as coffee), and (2) variation common only to all items with the same UPC.21 Within each of these broad classes, we further decompose the cross-sectional variation into variation common across all stores, variation common only to stores within a chain, and variation for items sold at specific stores. In formal terms, we estimate the following nested random effects model:

fcsi = αi + (ϕc + ϕci) + γcs + εcsi,    (8)

where fcsi is the frequency of price change for UPC i sold at store s which is part of chain c, and αi, ϕc, ϕci, γcs and εcsi are random effects assumed to be identically and independently normally distributed; i.e., αi ∼ N(µαi, σ²αi), ϕc ∼ N(µϕc, σ²ϕc), and so on. The model specified allows for a wide variety of correlation structures across products, stores and chains. The first component, αi, is common to all retail stores selling a given UPC. The second component, ϕc + ϕci, is common to all stores in a chain.22 The third term, γcs, is common only to a particular store in a given chain. Finally, the residual, εcsi, picks up all remaining variation in the frequency of price change which is specific to both a particular store (within a given retail chain) and a particular
20 This regression illustrates a statistical relationship between the frequency of retail sales and UPC sales volumes, not a causal relationship. Indeed, more frequent retail sales are one potential cause of increased UPC sales volume.
21 Nakamura (2008) carries out a related variance decomposition using a much more limited dataset.
22 Here ϕc is common to all UPCs while the second term, ϕci, applies only to a particular UPC.
UPC. The parameters of the model are estimated using maximum likelihood.23 These estimated model parameters were used to decompose the sources of variation in the frequency of price change across products and retail outlets. This decomposition was carried out separately for each product category.24

Table 8 shows the decomposition results for all prices. Looking across the rows of Table 8 for each of the product categories, the column 1 value is the fraction of the variation common to all stores selling a particular UPC (αi), the column 2 value is the fraction common to all UPCs for all the stores in a given chain (ϕc), and the column 3 value is the fraction of the variation common only to a particular UPC for a particular chain (ϕci). The remaining two columns pertain to the fraction of the variation common only to a particular store. Column 4 gives the fraction common to all UPCs at a particular store (γcs), and column 5 gives the residual variation common only to a particular UPC and store (εcsi).

We also carried out similar variance decompositions for prices excluding temporary sales. These results, shown in Table 9, are broadly similar to the Table 8 results for prices including sales. Also, Table 10 presents the corresponding estimates of the components of cross-sectional variation in the frequency of temporary sales. Similar patterns again emerge. There is far more variation in the frequency of temporary sales across chains for a given product than among stores within a given chain. Chain-wide decisions or shocks dominate the pricing dynamics. These results suggest it is important that statistical agencies collect price observations from a representative selection of retail chains. Variation in price dynamics across chains is also likely to lead to variation in the number of observations needed in a product category to calculate an accurate price index, and to variation in the importance of certain measurement problems.

23 See e.g. Baltagi (2005) for an excellent survey of these methods.
24 We include in the sample we use for estimation all cases in which a UPC is sold in at least two retail stores and is present for all five years of the dataset.

Table 4. Cross-sectional variation in price flexibility.

                  Across UPCs                Across stores              Across stores within chains
Category          Incl. (%)   Excl. (%)      Incl. (%)   Excl. (%)      Incl. (%)   Excl. (%)
                  (1)         (2)            (3)         (4)            (5)         (6)
Coffee            12.3        3.54           19.0        4.15           4.80        1.60
Cold cereal       11.1        3.13           10.6        5.35           3.88        1.34
Soft drinks       20.8        6.68           13.7        4.10           5.48        2.26

Note: The table reports the cross-sectional standard deviation in the frequency of price change across stores and across stores within particular retail chains. ''Incl./Excl.'' denote including/excluding temporary sale prices. These statistics are calculated by first calculating the cross-sectional standard deviation in the frequency of price change across stores (or across stores within chains), and then calculating the weighted mean value of this statistic. Missing observations are filled in.

Table 5. Cross-sectional variation in prevalence of retail sales.

Category          Across UPCs (%)   Across stores (%)   Across stores within chains (%)
Coffee            10.7              8.31                3.88
Cold cereal       8.20              7.55                2.94
Soft drinks       16.0              10.8                6.12

Note: The table reports the cross-sectional standard deviation in the frequency of temporary sales across stores and across stores within particular retail chains. These statistics are calculated by first calculating the cross-sectional standard deviation across stores (or across stores within chains), and then calculating the weighted mean value of this statistic. Missing observations are filled in.

Table 6. Effect of total UPC sales volume on price flexibility.

Category          Freq. including sales   Freq. excluding sales   Freq. of temporary sales
Coffee            0.934 (0.075)           0.181 (0.023)           0.678 (0.067)
Cold cereal       0.692 (0.032)           0.152 (0.011)           0.367 (0.030)
Soft drinks       0.592 (0.046)           0.039 (0.016)           0.490 (0.035)

Note: The table reports the coefficient of total UPC sales volume (in hundreds of thousands) on percentage frequency of price change or temporary sales. Standard errors are in parentheses. Missing observations are filled in.

Table 7. Effect of store size on price flexibility.

Category          Freq. including sales   Freq. excluding sales   Freq. of temporary sales
Coffee            0.282 (0.047)           0.129 (0.018)           0.139 (0.039)
Cold cereal       0.396 (0.053)           0.184 (0.027)           0.136 (0.041)
Soft drinks       0.397 (0.072)           0.030 (0.023)           0.161 (0.059)

Note: The table reports the coefficient of store size (in millions) on percentage frequency of price change or temporary sales. Standard errors are in parentheses. Missing observations are filled in.

Table 8. Variance decomposition of price flexibility (including sales).

Category          UPC (%)   Retail chain (%)   UPC–retail chain (%)   Retail store (%)   UPC–retail store (%)
                  (1)       (2)                (3)                    (4)                (5)
Coffee            37.2      20.4               27.1                   5.62               9.58
Cold cereal       42.0      30.1               19.8                   3.39               4.72
Soft drinks       53.7      12.9               24.4                   1.42               7.50

Note: The variance decomposition is estimated using the average frequency of price change for prices including sales by store and UPC. Missing observations are filled in.

Table 9. Variance decomposition of price flexibility (excluding sales).

Category          UPC (%)   Retail chain (%)   UPC–retail chain (%)   Retail store (%)   UPC–retail store (%)
                  (1)       (2)                (3)                    (4)                (5)
Coffee            16.4      18.1               28.1                   8.30               29.1
Cold cereal       9.42      39.7               24.6                   9.65               16.6
Soft drinks       32.6      8.62               33.7                   2.69               22.5

Note: The variance decomposition is estimated using the average frequency of price change for prices excluding sales by store and UPC. Missing observations are filled in.

Table 10. Variance decomposition of prevalence of retail sales.

Category          UPC (%)   Retail chain (%)   UPC–retail chain (%)   Retail store (%)   UPC–retail store (%)
                  (1)       (2)                (3)                    (4)                (5)
Coffee            38.2      15.2               30.2                   4.05               12.4
Cold cereal       36.8      26.0               28.4                   2.74               5.99
Soft drinks       48.2      13.9               23.1                   1.71               13.1

Note: The variance decomposition is estimated using the average frequency of retail sales by store and UPC. Missing observations are filled in.
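As one way to make Eq. (8) operational, the sketch below fits a simplified version of the nested random effects model on synthetic data with statsmodels. It estimates variance components for the chain (ϕc), chain-by-UPC (ϕci), and store-within-chain (γcs) effects, absorbing the crossed UPC effect αi into fixed effects for tractability; the authors' full maximum likelihood estimator treats αi as random as well, so this is an illustration, not a replication.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Synthetic frequencies with the structure of Eq. (8); all variances are made up.
rng = np.random.default_rng(0)
n_chain, n_store, n_upc = 8, 6, 15
phi_c  = rng.normal(0, 3, n_chain)                 # chain effect
phi_ci = rng.normal(0, 4, (n_chain, n_upc))        # chain-by-UPC effect
gam_cs = rng.normal(0, 1, (n_chain, n_store))      # store-within-chain effect
alpha  = rng.normal(30, 5, n_upc)                  # UPC effect
rows = [
    (c, s, i, alpha[i] + phi_c[c] + phi_ci[c, i] + gam_cs[c, s] + rng.normal(0, 2))
    for c in range(n_chain) for s in range(n_store) for i in range(n_upc)
]
df = pd.DataFrame(rows, columns=["chain", "store", "upc", "freq"])

# Chain random intercept plus variance components for UPC-within-chain and
# store-within-chain; the crossed alpha_i enters as fixed effects C(upc).
model = sm.MixedLM.from_formula(
    "freq ~ C(upc)", data=df, groups="chain", re_formula="1",
    vc_formula={"upc": "0 + C(upc)", "store": "0 + C(store)"},
)
res = model.fit()
print(res.cov_re)   # chain-level variance (phi_c)
print(res.vcomp)    # chain-by-UPC and store-within-chain variances
print(res.scale)    # residual variance (epsilon)
```

The variance shares reported in Tables 8–10 would then be each estimated component divided by the sum of all the components.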
In the context of the theoretical model presented in Section 2, variation in the frequency of price change across stores is viewed as arising from a combination of variation in menu costs K(z) and the volatility of idiosyncratic shocks σ²(z) that motivate variation in the desired price. Our empirical results suggest that the variation in these parameters across stores arises primarily from variation at the chain level. Variation in menu costs K(z) at the chain level could reflect chain-level variation in the cost-effectiveness of pricing decision making. Or it could arise from chain-specific differences in the technology for implementing price changes, including even the approach used to put price stickers on product items. Similarly, chain-level differences in the volatility of idiosyncratic shocks σ²(z) could arise from variation in bargaining ability at the chain level. It is well-known that much of the negotiation over wholesale prices for grocery products takes place at the chain level (as opposed to the store level). If some grocery chains are more successful at negotiating stable wholesale prices from producers, this could lead to lower variation in desired prices and a lower frequency of price change for those chains (e.g., the Wal-Mart ''every day low pricing'' policy).25 Finally, inventory management technologies could be another source of chain-level variation in σ²(z) and hence in the volatility of desired prices.

6. Price dynamics implications for price index construction

We now take up the implications of our results for consumer price index making. The (fixed base) period t Laspeyres price index (PLt) can be written as follows:26

PLt ≡ Σi wi0 (pit/pi0),    (9)

where pi0 is the base period price of product i, pit is the price of product i in period t for t = 1, . . . , T, and wi0 is good i's share of total expenditure in period 0.27 Eq. (9) aggregates unit value indexes by using base period share weights. The period t Paasche price index (PPt) can similarly be written as follows:

PPt ≡ [Σi wit (pi0/pit)]^(−1),    (10)
where wit is product i's share of total expenditure in period t, for t = 0, . . . , T. The Fisher index formula is the geometric mean of the Laspeyres and Paasche indexes; i.e., Fishert = [Paaschet × Laspeyrest]^(1/2). Sometimes expenditure data are not available. One way of proceeding in this case is to set all of the weights in the Fisher index equal to the reciprocal of the sample size. We refer to the resulting index hereafter as the unweighted Fisher.28 We first consider how price indexes differ depending on whether they are constructed using only regular prices or all prices.29 We also present evidence regarding the chain drift bias
25 See, for example, Fishman (2003). 26 See Diewert (forthcoming) for further details and references. 27 The prices are unit values for product classes for each period t of the specified unit time period. 28 Erwin Diewert pointed out to us that in the international Consumer Price Index Manual (ILO, 2004, p. 361, Paragraph 20.43), this index is labeled as PCSWD ; that is, it is labeled as an index that was suggested by Carruthers et al. (1980) and Dalen (1992, p. 140) in the elementary index context. But this index was also suggested by Fisher (1922, p. 472) as his formula 101 and he observed that it was very close to the unweighted geometric mean Jevons index and stated that these two indexes were his best unweighted indexes. 29 This is a particularly relevant exercise considering that a number of statistical agencies including Germany, Italy and Spain explicitly do not collect sale prices as part of the CPI, as was noted in Section 2.
problem that can result from temporary sales activity. Chained price index measures are usually recommended as one way of mitigating the product attrition problems that plague price index programs. However, it has been shown in the price measurement literature that price bouncing, of the sort caused by temporary sales, can cause chain drift bias. This part of our study builds on the related research of Ivancic et al. (2009) (IDF) and Haan and van der Grient (2009) (HG). IDF show for Australia and HG show for The Netherlands that price bouncing combined with the use of chained price indexes can lead to unacceptably large chain drift.30 We also consider how price indexes differ depending on whether prices are averaged only across stores within chains versus across all stores.

Tables 11–13 compare the values of the price indexes depending on whether they are calculated using all prices or using only regular prices. All three of these tables present results for both the weighted and unweighted Fisher index.31 Table 11 is for the case of no item aggregation across stores; that is, the unit values used to construct the index numbers are the raw price data. The results are different for the index of all prices and the index with only regular prices. For the weighted Fisher index, this difference ranges from 2% to 20%. The difference is smaller for the unweighted Fisher index, ranging from 1% to 10%. We would not expect such substantial differences in the absence of chain drift since, once a temporary sale is finished, the price returns to the regular price in effect before the sale began.32

Table 12 is for the case of item aggregation across all stores; i.e., in this case the unit values used to construct the index numbers are averages across all stores for a given UPC. As noted by HG and IDF, this procedure may ameliorate chain drift bias by reducing ''bouncing'' in prices and quantities. In all of the categories we consider, averaging across stores also reduces the difference between the indexes based on all prices and those based on just regular prices. For the weighted index, the difference between the index based on all prices and the index based only on regular prices falls to between 1% and 6%. For the unweighted index, the difference between the two indexes falls to between 0.6% and 6%. This finding is related to our findings in Sections 4 and 5 that there is a large amount of heterogeneity in pricing dynamics across stores even after controlling for the UPC. This implies that averaging across stores should yield smoother price series than using the raw data. However, IDF note that this procedure has the disadvantage that the unit values are calculated as averages across potentially heterogeneous products if consumers view differences in the locations and shopping experiences offered by different stores as changing the value to them of the purchased grocery products.

Another approach suggested by IDF is to aggregate across stores within retail chains. The results for this approach are presented in Table 13. With this approach, the difference between the indexes of all prices versus just regular prices does not diminish as much as with aggregation across all stores. This finding is consistent with our earlier finding in Sections 4 and 5 that a large fraction of the heterogeneity in pricing behavior across stores is due to differences across chains.
30 This is so even when superlative index number formulas are used (see Haan, 2008). For earlier work on this bias problem, see Szulc (1983).
31 The tables present the value of the index at the end of 12 months of data for 2004, assuming an initial value of 100 for the first month. Limiting this portion of the analysis to 12 months made the computations manageable without imposing procedures at this point to reduce the sample size. The sample used to calculate these indexes is described in Nakamura (2008). Using 12 months for the calculations in this section also means that the results are roughly comparable with those of IDF who use 15 months for their index measure calculations.
32 Recall from Section 3 that a temporary sale is identified in our data by the data vendor as a short-lived drop below the ''regular price''.
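The index formulas in Eqs. (9) and (10), and the chain drift mechanism the text describes, can be illustrated directly. In the toy series below (our numbers, not the scanner data), a temporary sale with a quantity spike and a post-sale dip leaves a chained Fisher index permanently below the direct (fixed-base) index even after every price returns to its starting value.

```python
import numpy as np

# Toy two-good example: good A goes on sale in period 1 (price halves, quantity
# spikes), then returns to its regular price while quantity dips (stockpiling).
p = np.array([[1.0, 1.0], [0.5, 1.0], [1.0, 1.0], [1.0, 1.0]])   # periods x goods
q = np.array([[10.0, 10.0], [40.0, 10.0], [2.0, 10.0], [10.0, 10.0]])

def fisher_link(p0, q0, p1, q1):
    laspeyres = (p1 * q0).sum() / (p0 * q0).sum()   # Eq. (9) with a one-period base
    paasche   = (p1 * q1).sum() / (p0 * q1).sum()   # Eq. (10)
    return np.sqrt(laspeyres * paasche)

chained = [100.0]
for t in range(1, len(p)):
    chained.append(chained[-1] * fisher_link(p[t - 1], q[t - 1], p[t], q[t]))

direct = 100.0 * fisher_link(p[0], q[0], p[-1], q[-1])
print([round(x, 1) for x in chained], round(direct, 1))
# -> [100.0, 67.1, 90.5, 90.5] 100.0: the chained index drifts about 10% below
#    the direct index even though all prices are back at their period-0 values.
```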
Table 11. Price indexes based on monthly sale and non-sale prices (no item aggregation over stores).

                  Chained Fisher                      Chained unweighted Fisher
Category          Incl. sales   Excl. sales   Diff    Incl. sales   Excl. sales   Diff
Coffee            111.3         109.6          1.7    101.7         102.3         −0.6
Cold cereal        95.9         101.8         −5.9    102.1         104.3         −2.2
Soft drinks        91.2         109.8        −18.6     94.4         102.6         −8.2

Note: The table reports the year-end price index based on alternative measurement approaches assuming an initial value of 100. ''Incl./Excl. sales'' denote indexes calculated including/excluding temporary sale prices; the ''Diff'' columns show the differences in the indexes with and without temporary sales.
Table 12. Price indexes based on monthly sale and non-sale prices (item aggregation over all stores).

                  Chained Fisher                      Chained unweighted Fisher
Category          Incl. sales   Excl. sales   Diff    Incl. sales   Excl. sales   Diff
Coffee            104.9         103.6          1.3    101.8         102.4         −0.6
Cold cereal       103.4         103.0          0.4    102.4         103.2         −0.8
Soft drinks        99.0         104.5         −5.5     96.8         102.6         −5.8

Note: The table reports the year-end price index based on alternative measurement approaches assuming an initial value of 100. ''Incl./Excl. sales'' denote indexes calculated including/excluding temporary sale prices; the ''Diff'' columns show the differences in the indexes with and without temporary sales.
Table 13. Price indexes based on monthly sale and non-sale prices (item aggregation over stores within chains).

                  Chained Fisher                      Chained unweighted Fisher
Category          Incl. sales   Excl. sales   Diff    Incl. sales   Excl. sales   Diff
Coffee            112.3         109.7          2.6    101.9         104.1         −2.2
Cold cereal        97.7         101.9         −4.2    102.6         103.3         −0.7
Soft drinks        91.9         109.7        −17.8     96.6         106.0         −9.4

Note: The table reports the year-end price index based on alternative measurement approaches assuming an initial value of 100. ''Incl./Excl. sales'' denote indexes calculated including/excluding temporary sale prices; the ''Diff'' columns show the differences in the indexes with and without temporary sales.
7. Conclusion

Our empirical results address questions related to price dynamics that are relevant to central bankers and macroeconomists as well as price index specialists. One key finding is that the treatment of temporary sales matters for inflation measurement, analysis and forecasting purposes. Our results show clearly that the implications of temporary sales for index number measurement cannot be ignored when constructing price indexes. A second finding is that, in addition to the product characteristics emphasized in the measurement and the macroeconomics literature, retailer characteristics are crucial determinants of heterogeneity in pricing dynamics. We show that a substantial fraction of this variation is accounted for by differences across chains, as opposed to among stores within chains.

Our conclusions about the importance of chain-level pricing have potential applicability as well for improving the efficiency of CPI sampling. One implication is that it would be appropriate to sample more chains and fewer stores per chain. In addition, because there are systematic differences in price dynamics between smaller and larger retail outlets, and analogous differences in price dynamics for different UPCs depending on their sales volumes, it also seems important for statistical agencies to recognize that the appropriate sampling methodology may differ depending on the popularity of products and outlets.33

Because much of the variation in price data is idiosyncratic to particular stores, averaging across stores is likely to ameliorate measurement challenges associated with price bouncing. However, this approach has the downside that it involves averaging prices over stores with potentially heterogeneous quality.
33 Previously, Greenlees and McClelland (2007, 2008) documented the importance of chain competition for outlet bias in CPI measurement (see also Reinsdorf, 1993).
Ivancic et al. (2009) therefore recommend that ''statistical agencies that have access to scanner data form their unit values by . . . [aggregating over] stores which belong to the same supermarket chain''. However, because retail price dynamics are also more similar within chains, our analysis suggests that although averaging within chains will ameliorate the chain drift problem, the improvement will be smaller than from averaging across all stores. Of course, this finding does not mean that statistical agencies should average prices across potentially heterogeneous retail chains.34 What our results imply is that the chain drift problem will not be solved solely by averaging data across stores within retail chains. A more hopeful solution appears to be the use of drift-free indexes such as the one developed by Ivancic et al. (2009).

Our results answer some of the questions considered, but leave others for future study. In a 2004 interview about the development of his own career, Arnold Zellner explained: ''I learned much about measurement and its important role in providing data to test alternative explanations or theories and to stimulate theorists to devise new theories to explain observed properties of the data, a very fundamental role of measurement in science'' (Zellner, quoted by Morrissey, 2004). Similarly, we hope our measurement findings will stimulate both empirical researchers and theorists to deepen our
34 Different amounts of transportation and other purchased services and time from household members may be involved when households change where they shop. For instance, a consumer may need to spend more time and use more gasoline driving to a superstore that offers cheaper prices on products they buy compared with a neighborhood grocery store. This type of substitution is difficult to allow for empirically, however, because some of the goods and services involved are bundled and sold together along with the location and other shopping amenities offered by the stores where a consumer chooses to shop.
collective understanding of price dynamics and to develop improved methods for price index measurement that build on this research.

Acknowledgements

We thank Erwin Diewert for inspiring us to write this paper and for comments that helped us improve it. We also thank John Greenlees, Jan de Haan, Lorraine Ivancic, Mick Silver, Randall Verbrugge, and two referees for very helpful comments, and Paul R. Flora, an Economic Analyst with the Research Department of the Federal Reserve Bank of Philadelphia, for his outstanding research assistance. This research was partially supported by a grant to Alice Nakamura and Erwin Diewert from the Social Sciences and Humanities Research Council of Canada.

References

Baltagi, B.H., 2005. Econometric Analysis of Panel Data. John Wiley & Sons, Chichester, England.
Bils, M., Klenow, P.J., 2004. Some evidence on the importance of sticky prices. Journal of Political Economy 112, 947–985.
Bronnenberg, B.J., Kruger, M.W., Mela, C.F., 2008. The IRI marketing dataset. Marketing Science 27, 745–748.
Carruthers, A.G., Sellwood, D.J., Ward, P.W., 1980. Recent developments in the retail price index. The Statistician 29, 1–32.
Chevalier, J.A., Kashyap, A.K., Rossi, P.E., 2003. Why don't prices rise during periods of peak demand? Evidence from scanner data. American Economic Review 93, 15–37.
Dalen, J., 1992. Computing elementary aggregates in the Swedish consumer price index. Journal of Official Statistics 8, 129–147.
Dhyne, E., Alvarez, L.J., Le Bihan, H., Veronese, G., Dias, D., Hoffmann, J., Jonker, N., Lunnemann, P., Rumler, F., Vilmunen, J., 2005. Price setting in the Euro area: some stylized facts from individual consumer price data. ECB Working Paper No. 524, September.
Diewert, W.E., 1995. Axiomatic and economic approaches to elementary price indexes. Discussion Paper 95-01. Department of Economics, University of British Columbia, Vancouver, Canada. http://web.arts.ubc.ca/econ/diewert/hmpgdie.htm.
Diewert, W.E., Balk, B.M., Fixler, D., Fox, K.J., Nakamura, A.O. (Eds.), 2010. Price and Productivity Measurement, vol. 6. Index Number Theory. Trafford Press. Available at: www.indexmeasures.com (forthcoming).
Diewert, W.E., 1998. Index number issues in the Consumer Price Index. The Journal of Economic Perspectives 12, 47–58.
Diewert, W.E., Armknecht, P.A., Nakamura, A.O., 2009. Methods for dealing with seasonal products in price indexes. In: Diewert, W.E., Balk, B.M., Fixler, D., Fox, K.J., Nakamura, A.O. (Eds.), Price and Productivity Measurement: Vol. 2—Seasonality. Trafford Press, pp. 5–28 (Chapter 2). www.indexmeasures.com.
Diewert, W.E., Nakamura, A.O., 1993. Essays in Index Number Theory, vol. 1. North-Holland, Amsterdam.
Feenstra, R.C., Shapiro, M.D., 2003. High-frequency substitution and the measurement of price indexes. In: Feenstra, R.C., Shapiro, M.D. (Eds.), Scanner Data and Price Indexes. NBER Book Series in Income and Wealth, University of Chicago Press, pp. 123–150 (Chapter 10). http://www.nber.org/chapters/c9733.pdf.
Fisher, I., 1922. The Making of Index Numbers. Houghton-Mifflin.
Fishman, C., 2003. The Wal-Mart you don't know. FASTCOMPANY 77, December. http://www.fastcompany.com/magazine/77/walmart.html.
Golosov, M., Lucas Jr., R.E., 2007. Menu costs and Phillips curves. Journal of Political Economy 115, 171–199.
Greenlees, J., McClelland, R., 2007. Price differentials across outlets in Consumer Price Index data, 2002–2006. Paper presented at the Tenth Meeting of the Ottawa Group, Ottawa, Canada, October. www.ottawagroup.org/Ottawa/ottawagroup.nsf/4a256353001af3ed4b2562bb00121564/ca8009e582a0c66dca257577007fbcd0/$FILE/2007%2010th%20meeting%20-%20John%20Greenlees%20and%20Robert%20McClelland%20(US%20Bureau%20of%20Labor%20Statistics)_New%20Outlets%20Bias%20in%20the%20CPI_A%20Look%20at%20Recent%20Evidence.pdf.
Greenlees, J., McClelland, R., 2008. New evidence on outlet substitution effects in consumer price index data. Bureau of Labor Statistics Working Paper 421, November. www.bls.gov/osmr/pdf/ec080070.pdf.
Haan, J. de, 2008. Reducing drift in chained superlative price indexes for highly disaggregated data. Paper presented at the Economic Measurement Group Workshop, Sydney, 10–12 December 2008.
Haan, J. de, van der Grient, H., 2009. Eliminating chain drift in price indexes based on scanner data. Working Paper, April 2009. Statistics Netherlands, Division of Macro-economic Statistics and Dissemination.
Hosken, D., Reiffen, D., 2004a. Patterns of retail price variation. Rand Journal of Economics 35, 128–146.
Hosken, D., Reiffen, D., 2004b. How do retailers determine sale products: evidence from store-level data. Journal of Consumer Policy 27, 141–177.
International Labour Office (ILO), International Monetary Fund, Organisation for Economic Co-operation and Development, Eurostat, United Nations, and The World Bank, 2004. Consumer Price Index Manual: Theory and Practice (CPI manual). http://www.ilo.org/public/english/bureau/stat/guides/cpi/index.htm.
International Monetary Fund (IMF), International Labour Office, Organisation for Economic Co-operation and Development, Eurostat, United Nations, and The World Bank. Producer Price Index Manual: Theory and Practice (PPI manual). http://www.imf.org/external/np/sta/tegppi/index.htm.
Ivancic, L., Diewert, W.E., Fox, K.J., 2009. Scanner data, time aggregation and the construction of price indexes. Working Paper. School of Economics and CAER, University of New South Wales.
Ivancic, L., Fox, K.J., 2010. Using price variation across stores to inform aggregation decisions in the CPI. Working Paper. Centre for Applied Economic Research, University of New South Wales.
Kehoe, P., Midrigan, V., 2008. Sales, clustering of price changes, and the real effects of monetary policy. Working Paper. University of Minnesota.
Klenow, P.J., Kryvtsov, O., 2008. State-dependent or time-dependent pricing: does it matter for recent US inflation? Quarterly Journal of Economics 123, 863–904.
Konüs, A.A., 1939. The problem of the true index of the cost of living. Econometrica 7, 10–29.
Kryvtsov, O., Midrigan, V., 2008. Inventories, markups, and real rigidities in menu cost models. Working Paper. http://richmondfed.org/conferences_and_events/research/2008/pdf/midrigan_paper.pdf.
Midrigan, V., 2008. Menu costs, multi-product firms, and aggregate fluctuations. Working Paper, New York University, August.
Morrissey, K., 2004. Interview with Arnold Zellner. Quad Club, Chicago, Illinois. http://www.strategy2market.com/downloads/InterviewwithArnoldZellner.pdf.
Nakamura, E., 2008. Pass-through in retail and wholesale. American Economic Review 98, 430–437. http://www.columbia.edu/∼en2198/papers/retail_wholesale.pdf.
Nakamura, E., Steinsson, J., 2008. Five facts about prices: a reevaluation of menu cost models. Quarterly Journal of Economics 123, 1415–1464.
Nakamura, E., Steinsson, J., 2009. Monetary non-neutrality in a multi-sector menu cost model. Quarterly Journal of Economics. http://www.columbia.edu/∼en2198/papers/heterogeneity.pdf.
Nakamura, L.I., 1999. The measurement of retail output and the retail revolution. Canadian Journal of Economics 32, 408–425.
Pashigian, B.P., 1988. Demand uncertainty and sales: a study of fashion and markdown pricing. American Economic Review 78, 936–953.
Pesendorfer, M., 2002. Retail sales: a study of pricing behavior in supermarkets. Journal of Business 75, 33–66.
Pollak, R.A., 1980. Group cost-of-living indexes. American Economic Review 70, 273–278.
Pollak, R.A., 1981. The social cost-of-living index. Journal of Public Economics 15, 311–336.
Reinsdorf, M., 1993. The effect of outlet price differentials in the US Consumer Price Index. In: Foss, M.F., Manser, M.E., Young, A.H. (Eds.), Price Measurements and their Uses. NBER Studies in Income and Wealth, vol. 57, pp. 227–254.
Schultze, C.L., Mackie, C. (Eds.), 2002. At What Price? Conceptualizing and Measuring Cost-Of-Living and Price Indexes. National Academy Press.
Sobel, J., 1984. The timing of sales. Review of Economic Studies 51, 353–368.
Statistics Canada, 1996. Your guide to the Consumer Price Index. Catalogue No. 62-557-XPB. http://www.statcan.gc.ca/pub/62-557-x/62-557-x1996001-eng.pdf.
Szulc, B.J. (Schultz), 1983. Linking price index numbers. In: Diewert, W.E., Montmarquette, C. (Eds.), Price Level Measurement. Statistics Canada, Ottawa, pp. 537–566.
Woodford, M., 2009. Information-constrained state-dependent pricing. Working Paper. Department of Economics, Columbia University. http://www.columbia.edu/∼mw2230/ICSDP_ME.pdf.
Journal of Econometrics 161 (2011) 56–81
Wealth accumulation and factors accounting for success

Anan Pawasutipaisit a, Robert M. Townsend b,∗
a University of Chicago, United States
b Massachusetts Institute of Technology, United States
Article info
Article history: Available online 17 September 2010.
JEL classification: D31; D24; D92.
Keywords: Wealth accumulation; Wealth inequality; Return on assets; Heterogeneity; Persistence.
Abstract

We use detailed income, balance sheet, and cash flow statements constructed for households in a long monthly panel in an emerging market economy, and some recent contributions in economic theory, to document and better understand the factors underlying success in achieving upward mobility in the distribution of net worth. Wealth inequality is decreasing over time, and many households work their way out of poverty and lower wealth over the seven year period. The accounts establish that, mechanically, this is largely due to savings rather than incoming gifts and remittances. In turn, the growth of net worth can be decomposed household by household into the savings rate and how productively that savings is used, the return on assets (ROA). The latter plays the larger role. ROA is, in turn, positively correlated with higher education of household members, younger age of the head, and with a higher debt/asset ratio and lower initial wealth, so it seems from cross-sections that the financial system is imperfectly channeling resources to productive and poor households. Household fixed effects account for the larger part of ROA, and this success is largely persistent, undercutting the story that successful entrepreneurs are those that simply get lucky. Persistence does vary across households, and in at least one province with much change and increasing opportunities, ROA changes as households move over time to higher-return occupations. But for those households with high and persistent ROA, the savings rate is higher, consistent with some micro founded macro models with imperfect credit markets. Indeed, high ROA households save by investing in their own enterprises and adopt consistent financial strategies for smoothing fluctuations. More generally growth of wealth, savings levels and/or rates are correlated with TFP and the household fixed effects that are the larger part of ROA.
1. Introduction

We use detailed income and balance sheet statements constructed for households in a long monthly panel in an emerging market economy to document and better understand the factors underlying success in achieving upward mobility in the distribution of net worth. The overall growth rate of wealth when accounting for inflation is only a modest 0.3% per year and the wealth distribution is highly skewed, with the relatively rich holding a third of net worth and the bottom half holding less than 10%. But the growth rate of wealth over time is sharply decreasing in initial wealth levels, that is, the relatively poor grow much faster than the rich, at 22% per year vs. 0.09% per year. Further decompositions show that overall inequality comes down substantially as households either transit upwardly over time across initial wealth quartiles or as wealth increases on average for the lower quartiles, so
∗ Corresponding address: Department of Economics, 50 Memorial Drive, Room E52-252c, Cambridge, MA 02142, United States. Tel.: +1 617 452 3722. E-mail address: [email protected] (R.M. Townsend).
doi:10.1016/j.jeconom.2010.09.007
that the gaps across the quartiles narrow. Geographic location and occupation contribute less to this story than we might have expected. Wealth inequality across regions increases for some occupations: for fish/shrimp and cultivation overall, and for labor and business initially. But the larger force, about 60%, is the reduction of inequality within the residual category. Some of the more successful households experience large increases in their relative position in the wealth distribution, while others fall down. Approximately 7% of households in the survey stayed at the same relative position, 43% increased their position, and almost 50% had a negative change in position. The standard deviation of relative position change is 14 points, so again there is substantial mobility within the distribution. The constructed accounts also allow an exact decomposition of these changes into net savings and incoming gifts/remittances. Savings account for 81% of the change, and, roughly, gifts decrease as initial wealth quartiles increase. The rich actually give some money away, for example. But for the second quartile of initial wealth, incoming gifts play a role equal to savings, even more so for households running businesses (as enterprises made losses in early years).
Another decomposition allows us to separate the increase in net worth into the role played by the savings rate versus how effectively those savings are used, that is, the return on assets (ROA). Growth of net worth is positively and significantly correlated with savings rates but less so than the high and consistently significant correlation of the growth of net worth with ROA, across the board. In this sense, successful households with high growth of net worth are the households who are productive — who utilize their existing assets to produce high per unit income streams.

In turn we can search for significant covariates of ROA. There is a positive correlation of high ROA with low initial net worth (i.e., the poor are especially productive), a higher education of the head or among household members (especially for those running businesses), a younger age of the head of the household, and a high debt/asset ratio (as we comment on below). In the robustness checks we control for labor hours and, related, impute a wage cost to self-employment. We also correct for measurement error in initial assets and, in exploring covariates, use IV rather than OLS regressions. We also delete poor wage earning households with few assets. Results are robust to these specifications. But by far the largest single factor in a decomposition of variance is household specific fixed effects. Related, there is considerable persistence. Households successful over the first half of the sample are very likely to be successful over the second, indicating that luck per se is not a likely explanation for this success. Auto-correlation numbers range from 0.15 to 0.83. In one fast-growing province in the poorer Northeast, persistence is much lower, and there is strong evidence that households are moving out of lower ROA occupations and into higher ones, as the local economy presents opportunities. Variability in household size also undercuts persistence, highlighting the importance of a successful individual rather than a successful household per se. Northeastern households experience more volatility in membership.

As noted in passing above, there is some borrowing, and indeed, high ROA households have higher debt/asset ratios. It thus appears that the financial system does manage to some degree to extend credit to the poor with high returns. Indeed, using more structure for production functions, and using instruments suggested by Olley and Pakes (1996) and Levinsohn and Petrin (2003), we find that high marginal product of capital (MPK) households are likely to borrow more relative both to their wealth and to others. However, there is still a divergence between estimated MPK and average interest rates, so in that sense some households are constrained (a related distortion: others utilize their own wealth at a low return rather than allowing it to be intermediated). Further, we allow interest rates to vary across households as measured in the survey data, as if there were a wedge or distortion from the average, as in Hsieh and Klenow (2009), Restuccia and Rogerson (2008) and Fernandes and Pakes (2008). Re-estimated TFP is no longer correlated with the debt/asset ratio, as if we had now correctly accounted in this way for those credit market distortions, or other things. The dynamics coming from the panel are revealing about the distortions which are harder to rationalize.
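On our reading, the decomposition just described can be operationalized as a DuPont-style identity: with savings S defined as net income less consumption, the part of net worth (NW) growth due to saving satisfies S/NW = (S/NI) × (NI/A) × (A/NW), that is, savings rate times ROA times a leverage ratio. The sketch below uses made-up household accounts and our own variable names; the authors' exact construction follows Samphantharak and Townsend (2009b) and may differ in detail.

```python
import pandas as pd

# Toy household-year accounts (arbitrary units). The identities below are
# our reading of the text, not the authors' exact formulas.
hh = pd.DataFrame({
    "net_income":  [120.0, 80.0],
    "consumption": [90.0, 70.0],
    "net_gifts":   [5.0, 12.0],
    "assets":      [1000.0, 400.0],
    "net_worth":   [800.0, 250.0],
})
hh["savings"]      = hh["net_income"] - hh["consumption"]
hh["savings_rate"] = hh["savings"] / hh["net_income"]
hh["roa"]          = hh["net_income"] / hh["assets"]
hh["leverage"]     = hh["assets"] / hh["net_worth"]
# Growth of net worth from saving alone = savings_rate * ROA * leverage = S/NW.
hh["nw_growth_from_saving"] = hh["savings_rate"] * hh["roa"] * hh["leverage"]
# Total growth adds incoming gifts/remittances, per the accounts' decomposition.
hh["nw_growth_total"] = (hh["savings"] + hh["net_gifts"]) / hh["net_worth"]
print(hh[["savings_rate", "roa", "leverage",
          "nw_growth_from_saving", "nw_growth_total"]])
```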
Consistent with a literature on growth with financial frictions, we find, using monthly data, that households with high and persistent ROA are households that tend to save more. This is consistent with the models of Buera (2008), in which poor households can save their way out of constraints, say to eventually enter high return businesses, or expand existing businesses. However, this result is not robust to annual data. In the models of Moll (2009) and Banerjee and Moll (2009), persistent ROA should increase the growth of net worth as households save their way out of constraints. Overall, however, savings levels and rates and growth of wealth are all correlated with the level of ROA, the household fixed parts of ROA, and measured TFP.
As further evidence of constraints, high ROA households tend to save by investing, that is, they accumulate physical assets. The top quartile of ROA households invest in their own enterprise activities, but this is not the case for the middle and lower ROA groups. Instead, for many, increases in net worth are accomplished with increases in financial assets or cash saving. Related, relative to others in the cross-section, high ROA households are less likely to use capital assets to smooth consumption. Instead, they use consumption to finance investment deficits. High ROA households are more actively involved in financial markets month by month, in the sense of using formal savings accounts and sources of borrowing, and engage less in informal markets, i.e., receiving fewer gifts. But high ROA households do use cash more than the low ROA households, as well. Indeed, relatively high ROA households surprisingly do seem to seek financial autarky over time, in the long run reducing their debt, reducing the amount of gifts they receive, and increasing the amount in formal savings accounts. This is so even though they retain a relatively high ROA.

As ROA is a widely accepted indicator of success in corporate accounting, but less so in economic theory, we also estimate total factor productivity (TFP), as was anticipated earlier. We find that it is correlated with ROA and in turn with candidate covariates, but less than before. The data are also adjusted for aggregate risk, consistent with the perfect markets, capital asset pricing model at the village level, utilizing the work of Samphantharak and Townsend (2009a). We find a correlation of risk-adjusted returns with ROA, as a measure of individual talent, and a correlation of risk-adjusted returns with growth of net worth. But overall results are weaker; for example, high risk-adjusted return households do not invest more in their own enterprises. This suggests again that the capital markets are not perfect in these data, though there remain some consumption anomalies which we discuss at the end.

This paper is organized as follows. Section 2 describes the data, starting from a macro-level perspective as background and moving to the micro-level of selected areas from which we have detailed information on assets, liabilities, wealth, income, consumption, investment, and financial transactions. Section 2 also describes the wealth distribution and its decomposition. Section 3 decomposes growth of net worth into productivity and saving rates, and uses correlation analysis to show their relative importance to the growth of net worth. Section 4 uses regression analysis to find out what factors are associated with success as measured by a high rate of return on assets. Section 5 shows that ROA has considerable persistence, indicating that luck per se is not systematically related to success. Section 6 shows the predictive power of ROA on physical assets accumulation. Section 7 studies the financial strategies that high ROA households use and related imperfections in credit markets. Section 8 provides a short story of a selected successful household, as a case study, to complement the overall statistical analysis, and Section 9 concludes.

2. Data

This paper uses data from the Townsend Thai monthly survey,1 an ongoing panel of households being collected since 1998.
The survey is conducted in 4 provinces (or changwat in Thai), the semi-urban changwats of Chachoengsao and Lopburi in the Central region and the more rural Buriram and Sisaket in the poorer Northeast region (see the map in Fig. 1). This paper studies the balanced panel of 531 households that are interviewed on a monthly basis, dating from January, 1999, to December, 2005
1 See Paulson et al. (1997) and Binford et al. (2004) for further details of the questionnaire design and the sampling design of this survey, respectively.
Fig. 1. Monthly per capita income.
(seven years).2 In each province one tambon (county) and four villages per tambon were selected at random from a larger 1997 survey. With only 16 villages overall, this naturally raises some questions about the representativeness of the sample. So we also report here on the background from secondary sources, which is largely consistent. For comparability over time we put variables into real terms with 2005 as the benchmark for both data sets.

2.1. Background at macro-provincial level

Fig. 1 is from the Thai Socio-Economic Survey, a nationally representative household survey conducted by the National Statistics Office of Thailand. Unfortunately there is no time series of wealth data in this survey. So we pick per capita net income to represent levels and what has happened over the past eighteen years. As is evident from the right panel, with all provinces depicted and the sample provinces highlighted, Buriram and Sisaket in the Northeast are among the poorest provinces in the entire nation. Chachoengsao and Lopburi are in the upper end of the distribution, though not the very highest (we do not have data in the SES for Chachoengsao over all years). However, it is also apparent that Lopburi is growing relatively faster. Concentrating on the period of our own study here, the left panel is the ratio of per capita income from 1996 to 2006, relative to 1996. Here one can see the heterogeneity in growth, with high growth concentrated in the Central region, but also in some parts of the Northeast and the South. Again, Lopburi stands out as growing relatively faster. Buriram is also doing quite well over this time frame. Chachoengsao and Sisaket are in the lower quartile. Evidently Sisaket is poor and tends to stay poor. Chachoengsao is relatively rich, but not growing much.
2 Attrition is typically one of the most serious problems involving panel data, and this survey is no exception. For those households with incomplete panels, the two main reasons are temporary migration (42%) and the household member with relevant information not being available (26%). However, 60% of the incomplete panels consist of wage earners, and below we drop all wage earners from some of the analysis for other reasons.
2.2. Poverty reduction

Over the seven year time frame of the survey, many households have worked their way out of poverty. We use two measures of the poverty line here. The first is the Thai official number for each province, which we can compare to either per capita income or to consumption. The second is a standard benchmark consumption of $2.16 a day at 1993 PPP. These two poverty lines generate some differences in the measure of poverty. By location, there are many more poor in the Northeast. As is evident in Fig. 2, income (especially for farmers) is erratic: monthly income displays nontrivial fluctuations. But we can still look at trends, that is, at the overall picture and at whether the number of poor people is decreasing over time, by occupation and by location (not shown). Using these high frequency monthly data, the headcount ratio fluctuates with a clear cycle in the Northeast: poverty drops at harvest, the time when many households receive income from cultivation. The figure also indicates that the fraction of poor households in business is declining substantially. There are gains for livestock and modest gains for labor. But in terms of change, the biggest gains are in Buriram (not shown). Using consumption numbers, there are fewer poor than by the income measure, especially in the Central region. By location, the biggest gains are still in Buriram. By occupation, labor and business escape poverty relatively fast.

2.3. Measurement and the accounting framework

The Townsend Thai survey interviews households on a monthly basis (and bi-weekly for food consumption). Given that households are typically a production as well as a consumption unit, we treat households as corporate firms, as described in Samphantharak and Townsend (2009b), and impose onto each of the transactions of the monthly data a financial accounting framework. Each flow transaction is classified as being associated with production activities, consumption and expenditure, or financing and saving, for instance. This allows us to construct the balance sheet and the income statement of each household. Then we add the flow variables to the previous period stocks to get the current values of stock variables for each month for each household.
Fig. 2. Headcount ratio.
The updated stock at the end of each month corresponds to the items in the balance sheet. Wealth or net worth in particular is equal to total assets minus total liabilities. It must also be equal to contributed capital plus cumulative savings. We then derive the statement of cash flow from the constructed balance sheet and income statement and double check the accounting identities between the balance sheet, the income statement and the statement of cash flow of each household, month by month, to make sure that the accounting framework is correctly imposed.
As the survey was not initially designed for this purpose, we sometimes have to make assumptions and refine the value of a transaction before we enter it into the appropriate line item. For instance, when household members migrate to work somewhere else, we ask them when they return how much they earned while they were away. If this is labor earnings, then it will show up as labor income. But they often send home remittances while they are away (and are thus treated as non-household members while away), and if we do not adjust for this we may be double-counting, as otherwise we record both a gift-in and labor income. In turn, we would overestimate the wealth of the household.
Another example is cash holding. We do not ask for the amount of cash in hand, as this question might be sensitive and affect answers to other questions or even participation in the survey, but for any transaction we ask whether it is done in cash, in kind or on credit, so in principle we know the magnitude of all cash transactions. As a first step in an algorithm, we specify that initial cash is zero and then keep track of the changes. If a household spends too much, its imputed cash balance becomes negative; we then reset the initial balance to a higher number such that in the data the household never runs out of cash. This algorithm, therefore, gives us a lower bound on cash holding (see the sketch below). Sometimes discrepancies remain, and then we have to check manually, household by household, as the answers from a household can be more complicated than the existing code can accommodate.
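To make the algorithm concrete, the following is a minimal sketch of this lower-bound construction, assuming only a monthly series of net cash flows; the function and variable names are ours, not the survey's actual processing code.

```python
import numpy as np

def reconstruct_cash(net_cash_flows):
    """Lower bound on a household's cash series from observed net cash flows.

    Start from zero initial cash, accumulate the monthly net flows, and then
    raise the initial balance by just enough that the running balance never
    becomes negative, i.e., the household never runs out of cash in the data.
    """
    flows = np.asarray(net_cash_flows, dtype=float)
    running = np.cumsum(flows)               # balances under zero initial cash
    initial_cash = max(0.0, -running.min())  # smallest top-up keeping balances >= 0
    return initial_cash, initial_cash + running

# Example: a household that spends early in the year and earns at harvest.
init, balances = reconstruct_cash([-500, -300, 200, -100, 1200, -400])
print(init)      # 800.0, the lower bound on initial cash holding
print(balances)  # [300. 0. 200. 100. 1300. 900.], never negative by construction
```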
Additional difficulties are due to recall error, changes in household composition, and appreciation in the value of assets. Regarding recall error, a household may simply forget to tell us about a relevant transaction at some point, but later, with a myriad of additional questions, we find out about that transaction; then we have to go back and fix it. As for changes in household composition, an existing member might leave the household and a new member might move in, and as they often take some assets with them or bring new assets in, we have to take appropriate care. For value appreciation, especially of land, it is possible that land was cheap in the past when the household started owning it, but that the price had gone up by the time the household sold it. We take care of this by having capital gains items in the income statement and appropriately adjusting this appreciation in the balance sheet.
Though we believe that the accounting framework gives us a more accurate measurement than otherwise, measurement errors in the survey data cannot be avoided. We will keep the possibility of measurement errors in mind and revisit this in the appropriate sections below. Lastly, there is the distinction between nominal and real terms. The accounting framework is constructed based on observed transactions, hence nominal units. We lose some of the identities when we convert the data to real terms. For most of the analysis in the paper we use real units for comparability over time. However, when we must rely on an accounting identity, we use nominal terms.3

2.4. Wealth distribution in the Townsend Thai monthly survey

Table 1 reports the distribution of average wealth, by location and overall in 1999, in nominal baht value (the exchange rate was 37.81 baht per $1 in 1999, and on average 41.03 baht per $1 for 1999–2005). The median wealth in 1999 was about $20,000. By region, the top two provinces in the Central region are evidently wealthier than the bottom two in the Northeast. Within regions, Chachoengsao is wealthier than Lopburi, and Buriram is wealthier than Sisaket. These rankings are consistent with the background data from the SES, even though income per capita was used there.4 There are five primary sources of income for households in the survey: cultivation, livestock, fish/shrimp, business and labor.
3 See Pawasutipaisit et al. (2007) for further details of how to construct the accounts from the survey data.
4 The minimum wealth in Sisaket in particular is negative: there are 2 households that were so indebted in 1999 that even if they had liquidated all their assets, they would not have been able to pay back all their debts.
Table 1
Distribution of net worth in 1999.

              Min       Max          Mean       Std. dev.   First quartile  Median     Third quartile
Chachoengsao  46,795    163,025,232  4,869,498  14,999,152  420,599         1,275,083  3,093,290
Lopburi       33,108    15,054,802   1,878,238  2,237,228   418,380         1,178,969  2,559,019
Buriram       38,741    14,966,253   993,141    1,772,507   293,521         561,562    1,030,033
Sisaket       −181,188  9,193,252    813,438    1,122,624   216,064         486,545    1,096,106
All           −181,188  163,025,232  2,220,442  8,028,631   314,217         758,426    1,784,110
Table 2
Number of households in each primary occupation.

              Cultivation  Livestock  Fish/shrimp  Business  Labor
Chachoengsao  28           3          28           12        70
Buriram       17           0          0            18        69
Lopburi       53           31         0            8         55
Sisaket       87           0          2            13        37
Total         185          34         30           51        231
Table 3
Distribution of net worth in 1999, by occupation.

             Min       Max          Mean       Std. dev.   First quartile  Median     Third quartile
Cultivation  −10,694   163,025,232  2,733,186  12,451,801  372,901         904,721    1,980,860
Livestock    241,640   41,304,880   3,567,741  7,363,605   563,000         1,598,373  2,948,033
Fish/shrimp  42,532    23,857,906   6,950,250  7,012,790   1,787,554       3,630,631  13,178,301
Business     76,703    7,757,663    1,568,968  1,640,035   528,157         1,121,270  2,057,202
Labor        −181,188  17,567,808   1,141,070  2,295,764   169,364         478,980    1,060,595
Households in the survey typically have income from multiple sources, and one can label as the primary occupation the activity that generates the highest net income over some period. Table 2 shows the number of households in each occupation, where we use the activity that generates the highest net income over the seven years (so that, by construction, these households do not ''change'' occupations/bins over time). As is evident, shrimp is associated with Chachoengsao, livestock with Lopburi. Table 3 reports the distribution of average wealth, by occupation, in 1999. By medians, fish/shrimp, livestock and business households are wealthier than those in cultivation.5 Labor households appear to be the least wealthy, in both mean and median.
We now focus on the distribution of wealth.6 Table 4 shows that households in the top 1% of the wealth distribution hold around one-third of the total wealth in the survey, and the top 5% hold about half of the total wealth. Half of the households in the survey own less than 10%. These numbers may understate wealth inequality if the rich are undersampled. These observations are similar to findings from developed countries like the United States and Britain (see Atkinson (1971), Kennickell (2003), Piketty and Saez (2003), Castañeda et al. (2003) and Cagetti and De Nardi (2008)).7
Though Table 4 shows the overall picture of wealth inequality, it does not keep households in the same group over time. Because we have panel data, we can track wealth dynamics at the household level. Using all observations (household-years), with groups defined from the distribution of average wealth in year 1999, we track the same group over time.
5 The highest net worth household is one in cultivation, but that is an exception.
6 Because households in this survey are selected by random sampling, and the survey is not explicitly designed to measure the distribution of wealth, the actual wealth distribution might be more unequal than what we report here if the rich are undersampled.
7 The degree of wealth inequality varies across locations, however: it is highest in Chachoengsao and lower in the other provinces. The top 1% in Lopburi own less than 13%, while the top 1% in Sisaket own less than 18% of the total wealth in their province.
Table 5 indicates that the wealth share is rising for many groups, especially for the initially poor. An observation common to all provinces is that the bottom half has a rising share. By this standard, wealth inequality is thus going down over time. Other well-known measures of inequality show similar reductions. For instance, the aggregate level of wealth inequality dropped from 0.71 in 1999 to 0.67 in 2005 if we use the Gini coefficient, and inequality went down from 1.26 in 1999 to 1.10 in 2005 if we use the Theil-T index.

2.5. Average growth of wealth by initial quartile, location and occupation

Table 6 documents a salient feature, that growth is decreasing in initial wealth. Every quartile has positive growth, but the magnitude is indeed remarkable for the poorest group, about 22% per year. To see where this movement comes from, Table 7 reports the growth of selected assets in the household balance sheet. Land is typically the largest component in the household portfolio, but it is not the prime mover of wealth. The growth of household assets is decreasing in initial wealth, and though other types of fixed assets are not strictly monotone decreasing in initial wealth, more upward movement comes from the bottom 50%. Growth of inventory is also large and decreasing in initial wealth, although inventory is not a big component of total assets. For financial assets, the growth of deposits in financial institutions is also decreasing in initial wealth, and the growth of cash largely shares the same pattern, highest for the least wealthy group and lowest for the wealthiest group, though mixed in the middle.
The annual percentage increase in net worth by location is presented in Table 8. On average, the total growth rate is about 0.3% per year, but this varies. On average, Lopburi has the highest growth rate (1.7%). Buriram has the lowest growth due to negative growth during the first three years, although it reaches the highest number in the table in 2003–04, at 6.8%. Chachoengsao, in contrast, is slowing down, and Sisaket is closest to 0 and does not change much compared to the others. Much of this is consistent with the national background presented earlier.
Table 4
% of net worth held by various groups defined by percentiles of the wealth distribution.

Percentile  1999     2000     2001     2002     2003     2004     2005
0–49.9      7.0969   7.4090   7.6326   7.9308   8.2568   8.5838   8.8987
50–89.9     29.6281  29.7482  30.0591  30.3522  30.8362  31.6430  32.0325
90–94.9     11.4451  11.5570  11.5567  11.6503  11.6095  11.6438  11.6950
95–98.9     18.2311  17.6575  17.5615  17.4351  17.3048  17.0155  16.9196
99–100      33.5984  33.6280  33.1899  32.6314  31.9926  31.1137  30.4540
Table 5
% of net worth held by various groups defined by percentiles of the wealth distribution in 1999, for all years.

Percentile  1999     2000     2001     2002     2003     2004     2005
0–49.9      7.0969   7.4880   7.8921   8.3557   8.9835   9.5554   10.1682
50–89.9     29.6281  29.7035  30.0337  30.4163  30.7713  31.5393  31.7903
90–94.9     11.4451  11.6228  11.5214  11.4426  11.2984  11.1476  11.0729
95–98.9     18.2311  17.5574  17.3956  17.2364  17.0702  16.7458  16.6094
99–100      33.5984  33.6280  33.1570  32.5487  31.8762  31.0117  30.3590
Table 6
Growth of net worth by the initial wealth distribution (%).

Initial wealth    1st quartile  2nd quartile  3rd quartile  4th quartile
Growth of wealth  21.9735       5.2500        3.1597        0.0984

Table 7
Growth of assets by the initial wealth distribution (%).

Growth of            1st quartile  2nd quartile  3rd quartile  4th quartile
Land                 −0.6287       −1.4747       −1.4521       −2.0327
Household assets     17.7873       11.8007       8.1558        1.5013
Agricultural assets  3.7354        2.2807        2.3511        0.5250
Business assets      4.3394        6.4423        2.9185        −1.5138
Inventory            38.7774       24.9481       21.9198       12.0782
Casha                20.8346       10.4451       11.7238       9.6324
Deposits             17.4923       10.5349       4.9382        −3.6677

a As we did not measure cash holding directly but rather construct it from the flow of all transactions, there is likely measurement error. On average cash is increasing in the data, and this may reflect unmeasured consumption expenditure. Here the growth of cash is computed after we adjust for this possible measurement error in consumption by making the growth of cash zero in the aggregate data. Without the adjustment, the growth of cash is even larger for the poorest group.
Table 8
Annual percentage increase in net worth by location.

Year          99–00    00–01    01–02    02–03   03–04    04–05    Average
Chachoengsao  1.1769   1.1223   1.3684   0.5450  −2.1385  −2.8640  −0.1316
Buriram       −7.6151  −3.8233  −0.3564  2.2057  6.8570   −1.0318  −0.6273
Lopburi       1.9094   1.4476   2.6554   2.5882  2.0597   0.0162   1.7794
Sisaket       −0.0145  −0.4705  0.3269   0.6961  −0.0945  0.1955   0.1065
Total         0.4640   0.6492   1.4455   1.1797  −0.2303  −1.7057  0.3004

The annual percentage increase in net worth by occupation is presented in Table 9. On average, households with livestock have the highest growth rate (2.2%) and households with business the second highest (1.5%). In contrast, cultivation and fish/shrimp households have negative growth, with the lowest growth for fish/shrimp (−0.58%). Labor has positive growth (1.28%) but lower than livestock and business. All occupations experience negative growth in some periods.
2.6. Decomposition of wealth inequality change by initial quartile or decile

We can also look at entire distributions of wealth. Fig. 3 shows estimates of the kernel densities of wealth distributions (in log scale) when we classify households into four groups by initial quartile, and follow each group over time.8 The wealth distributions of the poorest group and the second poorest group shift toward the right, and for the second group the support becomes wider. The wealth distribution of the relatively wealthy group (lower left panel) also becomes wider. The wealth distribution of the richest group (lower right panel) shifts slightly toward the right, but this is not noticeable from the picture. One question is how much of the reduction in wealth inequality is due to the fact that poor households grow faster, as opposed to other forces.
8 It might seem that the distribution of initial wealth would have four humps when we pool all groups together. But this is only an artifact of the four separate kernel estimations. If the four histograms of initial wealth were plotted and presented in one column, it would be obvious that they are non-overlapping across the four panels.
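For readers who want to reproduce pictures like Fig. 3, a minimal sketch of the per-quartile kernel density estimation follows. The Gaussian kernel with scipy's default bandwidth is our assumption (the paper does not report its kernel choices), and net worth is assumed positive so that logs are defined.

```python
import numpy as np
from scipy.stats import gaussian_kde

def wealth_densities_by_quartile(w_1999, w_current):
    """Kernel density of log net worth at a later date, by 1999 wealth quartile."""
    w_1999 = np.asarray(w_1999, dtype=float)
    w_current = np.asarray(w_current, dtype=float)
    # fixed 1999 boundaries: households are grouped by initial quartile
    edges = np.quantile(w_1999, [0.25, 0.50, 0.75])
    group = np.searchsorted(edges, w_1999)
    densities = {}
    for g in range(4):
        logs = np.log(w_current[group == g])
        grid = np.linspace(logs.min(), logs.max(), 200)
        # estimated separately per group, hence the caveat of footnote 8
        densities[g] = (grid, gaussian_kde(logs)(grid))
    return densities
```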
Fig. 3. Distribution of net worth.

Table 9
Annual percentage increase in net worth by occupation.

Year         99–00    00–01    01–02    02–03    03–04    04–05    Average
Cultivation  0.1744   0.3573   −0.5438  −0.0960  −0.7858  −2.4409  −0.5558
Livestock    3.8177   5.2613   1.8945   3.3591   −0.2723  −0.8516  2.2014
Fish/shrimp  1.1317   −0.4985  2.0675   −0.0792  −3.0835  −3.0560  −0.5863
Business     −2.6687  4.0802   2.6728   3.2116   3.4860   −1.4962  1.5476
Labor        −0.1010  2.4504   0.6141   2.8415   1.8395   0.0912   1.2893
One well-known and widely used measure of inequality is the Theil-L index, which is additively decomposable. Let $W_t^{j,g}$ be the wealth of household $j$, which belongs to group $g$, at time $t$, and let $N_t$ be the total number of households at time $t$. Then the Theil-L index is defined by
\[
I_t = \log\Big(\frac{1}{N_t}\sum_{j,g} W_t^{j,g}\Big) - \frac{1}{N_t}\sum_{j,g}\log W_t^{j,g}.
\]
Also let $N_t^g$ be the total number of households in group $g$ at time $t$, and let $\bar W_t^g = (1/N_t^g)\sum_j W_t^{j,g}$ be the mean wealth of group $g$, so that the Theil-L index within group $g$ is
\[
I_t^g = \log \bar W_t^g - \frac{1}{N_t^g}\sum_{j}\log W_t^{j,g}.
\]
Then inequality can be decomposed into a within ($WI_t$) and an across ($AI_t$) component, $I_t = AI_t + WI_t$, where
\[
AI_t = \log\Big(\frac{1}{N_t}\sum_{j,g} W_t^{j,g}\Big) - \sum_g \frac{N_t^g}{N_t}\log \bar W_t^g,
\qquad
WI_t = \sum_g \frac{N_t^g}{N_t}\, I_t^g.
\]
Thus the total change in inequality must come from changes in each component:
\[
\Delta I = \Delta AI + \Delta WI.
\]
To simplify the notation, let $\bar W_t$ be the aggregate mean of wealth at time $t$ and $p_t^g$ the population share of subgroup $g$ at time $t$. Then $\Delta WI$ and $\Delta AI$ can each be further decomposed into two subcomponents,9
\[
\Delta WI = \sum_g \bar p^g\,\Delta I^g + \sum_g \bar I^g\,\Delta p^g,
\]
\[
\Delta AI = \sum_g \bar p^g\Big(\frac{\bar W^g}{\bar W} - 1\Big)\Delta \log \bar W^g
+ \sum_g \Big(\frac{\bar W^g}{\bar W} - \log\frac{\bar W^g}{\bar W}\Big)\Delta p^g,
\]
where $\Delta$ and the overbar denote the time difference operator and the time average operator, respectively. Each subcomponent has its own interpretation: $\sum_g \bar p^g \Delta I^g$ is the intragroup inequality dynamics, or change in inequality within a group; $\sum_g \bar I^g \Delta p^g$ is the composition dynamics through shifts across groups with different degrees of inequality; $\sum_g \bar p^g (\bar W^g/\bar W - 1)\Delta\log \bar W^g$ is the wealth-gap dynamics, or change in the wealth differential across groups; and $\sum_g (\bar W^g/\bar W - \log(\bar W^g/\bar W))\Delta p^g$ is the composition dynamics in changes in $AI$. The latter is the Kuznets effect, i.e., shifts by the poor into higher quartiles.
If we use the quartiles of the wealth distribution in 1999 to define the groups, then $p_{1999}^g = 1/4$ for all $g$, but again we allow people to move across groups over time, and the number of households in each group may vary over time.
9 See Mookherjee and Shorrocks (1982), where the decomposition of $\Delta WI$ is exact while that of $\Delta AI$ is approximate. See Jeong (2008) for a related application using income, rather than wealth, from the Thai Socio-Economic Survey.
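As a computational companion to these formulas, here is a minimal numpy sketch of the four subcomponents; the function names are ours, wealth is assumed positive, and the caller supplies the group assignment (e.g., deciles of 1999 wealth).

```python
import numpy as np

def theil_L(w):
    """Theil-L (mean log deviation) of a positive wealth array."""
    w = np.asarray(w, dtype=float)
    return np.log(w.mean()) - np.log(w).mean()

def change_decomposition(groups_t0, groups_t1):
    """Mookherjee-Shorrocks subcomponents of the change in Theil-L.

    groups_t0, groups_t1: lists of wealth arrays, one per group, at two dates.
    Returns (intragroup, within_composition, wealth_gap, kuznets); their sum
    approximates the total change (the Delta AI part is approximate).
    """
    def stats(groups):
        n = sum(len(g) for g in groups)
        p = np.array([len(g) / n for g in groups])   # population shares
        mu = np.array([np.mean(g) for g in groups])  # group mean wealth
        I = np.array([theil_L(g) for g in groups])   # within-group Theil-L
        return p, mu, I, np.sum(p * mu)              # overall mean wealth
    p0, mu0, I0, m0 = stats(groups_t0)
    p1, mu1, I1, m1 = stats(groups_t1)
    p_bar, I_bar = (p0 + p1) / 2, (I0 + I1) / 2
    lam_bar = (mu0 / m0 + mu1 / m1) / 2              # time-averaged W^g / W
    intragroup = np.sum(p_bar * (I1 - I0))
    within_composition = np.sum(I_bar * (p1 - p0))
    wealth_gap = np.sum(p_bar * (lam_bar - 1) * (np.log(mu1) - np.log(mu0)))
    kuznets = np.sum((lam_bar - np.log(lam_bar)) * (p1 - p0))
    return intragroup, within_composition, wealth_gap, kuznets
```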
Table 10 Transition matrix from 1999 to 2005.
Some households with increasing wealth may move up, while those with decreasing wealth may move down. Although the boundaries of each cell are fixed over time, many households move in and out of the cells. The transition matrices from 1999 to 2005, where the group is the initial quartile on the left and, for comparison, the initial decile on the right, are reported in Table 10.10 By quartile, about 20% of households move up and 8% of households move down. There seems to be a considerable amount of persistence over time, as almost 72% stay in the same group. However, by decile there is more mobility: 37% of households move up, 21% of households move down and only 41% stay in the same group.
Table 11, for deciles, reports the percentage contribution of each subcomponent to the total reduction in wealth inequality, year by year and overall from beginning to end. The biggest contributions come from the compositional Kuznets change and the wealth-gap dynamics. The contributions from the two within components are smaller and sometimes negligible. For the whole period 1999–2005, the last row indicates that the decrease in wealth inequality is due to a compositional change and to convergence in the wealth differential across groups; that is, 51% and 49% of the reduction in wealth inequality are due to these two components, respectively. Note that the former is the source of inequality dynamics which Kuznets emphasized, though it was income that he had in mind, while the latter indicates convergence in wealth across groups.11 The last two columns indicate that overall within-group inequality and the composition effect can be negative. For example, in the last, overall row, the composition effect goes in the opposite direction from overall inequality, i.e., there are shifts into higher inequality groups.
By region, wealth inequality in both regions is decreasing. The Central region has much higher inequality (Theil-L is 1.1019 in 1999) than the Northeast (Theil-L is 0.6098 in 1999), but inequality also decreases faster (Theil-L falls to 0.8611 and 0.5072 in 2005 for the Central and the Northeast regions, respectively). Table 12 reports the same type of decomposition by region.
10 The number in each cell is a number of households. One can convert these to fractions (probabilities, as in a conventional transition matrix) by dividing by the sum of the numbers in each row.
11 Using quartiles yields similar results, but the orders of magnitude are lower because we have fewer bins; i.e., the decrease in wealth inequality for the whole period 1999–2005 is due to a compositional change (44.67%) and to convergence of the wealth differential across groups (39.87%). The two within components account for the rest (15.46%).
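A sketch of the transition-matrix construction behind Table 10 follows; this is our own minimal version, with bin boundaries fixed at the start-year quantiles, as in the text.

```python
import numpy as np

def transition_matrix(w_start, w_end, n_bins=4):
    """Counts of households moving between fixed wealth bins (cf. Table 10)."""
    w_start = np.asarray(w_start, dtype=float)
    w_end = np.asarray(w_end, dtype=float)
    # interior quantile boundaries of the start year, held fixed over time
    edges = np.quantile(w_start, np.linspace(0, 1, n_bins + 1)[1:-1])
    b_start = np.searchsorted(edges, w_start)
    b_end = np.searchsorted(edges, w_end)
    counts = np.zeros((n_bins, n_bins), dtype=int)
    np.add.at(counts, (b_start, b_end), 1)
    return counts

# Shares moving up, down, or staying (cf. the quartile and decile numbers):
# T = transition_matrix(w1999, w2005, n_bins=10); n = T.sum()
# up, down, stay = np.triu(T, 1).sum()/n, np.tril(T, -1).sum()/n, np.trace(T)/n
```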
Table 11
Decomposition of wealth inequality change by decile (%).

            Across-group inequality     Within-group inequality
            Wealth-gap  Composition     Intragroup  Composition
1999–2000   44.95       74.93           −17.25      −2.66
2000–2001   128.24      −23.33          4.39        −9.54
2001–2002   −90.64      184.62          11.64       −5.62
2002–2003   20.60       73.16           5.14        1.09
2003–2004   162.07      −42.08          −10.18      −9.90
2004–2005   7.42        72.89           10.97       8.72
1999–2005   48.72       50.96           0.15        −0.06
Table 12
Decomposition of wealth inequality change by decile (%), by region.

            Across-group inequality     Within-group inequality
            Wealth-gap  Composition     Intragroup  Composition
Central
1999–2000   −0.10       81.41           1.97        16.70
2000–2001   −20.88      109.05          10.02       1.82
2001–2002   34.45       60.71           3.66        1.17
2002–2003   27.11       72.39           1.50        −1.02
2003–2004   28.00       65.50           2.94        3.56
2004–2005   63.41       33.66           2.16        0.78
1999–2005   20.46       74.85           3.66        1.03
Northeast
1999–2000   17.55       86.39           −14.48      10.56
2000–2001   23.93       60.46           13.09       2.54
2001–2002   −115.77     247.44          −43.10      11.44
2002–2003   65.67       26.64           4.64        3.05
2003–2004   219.60      −164.86         67.95       −22.78
2004–2005   132.09      −47.41          13.36       1.46
1999–2005   41.59       51.85           3.48        3.03
The biggest contribution still comes from the compositional Kuznets change and the wealth-gap dynamics, while the contributions from the within components are smaller. The Kuznets effect is the largest in both regions. However, the change in the wealth differential across groups is relatively more important in the Northeast. As inequality is decreasing in both regions, this means that the average wealth of the lower and higher wealth deciles is converging, especially in the Northeast, in addition to people moving across the deciles; the latter is the primary force in the Central region. Both are consistent with the results from Tables 4 and 5, that the wealth share of the rich is decreasing while the wealth share of the poor is increasing.
Fig. 4. Decomposition of wealth inequality: location.
2.7. Decomposition of wealth inequality level by location and occupation

Let us suppress the time subscript for the moment and let $W_{OL}^{j}$ be the wealth of household $j$ with occupation $O$ at location $L$. Then the Theil-L index is defined by
\[
I = \log\Big(\frac{1}{N}\sum_{j,O,L} W_{OL}^{j}\Big) - \frac{1}{N}\sum_{j,O,L}\log W_{OL}^{j},
\]
where $N$ is the total number of households. We can decompose $I$ to12
\[
I = \Big[\log\Big(\frac{1}{N}\sum_{j,O,L} W_{OL}^{j}\Big) - \sum_L \frac{N_L}{N}\log W_L\Big]
+ \sum_L \frac{N_L}{N}\Big[\log W_L - \sum_O \frac{N_{OL}}{N_L}\log W_{OL}\Big]
+ \sum_L \frac{N_L}{N}\sum_O \frac{N_{OL}}{N_L}\, I_{OL},
\]
where $W_{OL}$ is the average wealth of occupation $O$ at location $L$, $W_L$ is the average wealth of location $L$, $N_{OL}$ is the total number of households in occupation $O$ and location $L$, $N_L$ is the total number of households in location $L$, and $I_{OL}$ is the Theil-L measure of wealth inequality of occupation $O$ at location $L$, i.e., $I_{OL} = \log W_{OL} - (1/N_{OL})\sum_j \log W_{OL}^{j}$.
Let $AL = \log((1/N)\sum_{j,O,L} W_{OL}^{j}) - \sum_L (N_L/N)\log W_L$ denote the inequality across locations, $AOWL = \sum_L (N_L/N)[\log W_L - \sum_O (N_{OL}/N_L)\log W_{OL}]$ denote the (sum of) inequality across occupations but within locations, and $WOWL = \sum_L (N_L/N)\sum_O (N_{OL}/N_L)\, I_{OL}$ denote the (sum of) inequality within occupations, within locations.
Overall, the within location-occupation category is the largest. The second largest is across locations, and the across-occupation-within-location effect is the smallest. These are plotted in Fig. 4, but on different scales. In terms of changes, Table 13 and Fig. 4 confirm that overall inequality $I$ (Theil-L) is going down over time, where the residual $WOWL$ is the biggest component among the three. Wealth inequality across locations, $AL$, has an inverted U-shape, i.e., increasing until 2002 and decreasing after that, and wealth inequality across occupations but within locations has a decreasing trend. Within occupation/location, inequality is not decreasing everywhere. For example, business almost everywhere has an increasing trend (except Sisaket). Still, cultivation and labor have decreasing inequality everywhere, and the biggest drop is in cultivation in Chachoengsao. Since most households have labor or cultivation as a primary occupation, these together drive down the aggregate value of this component, $WOWL$.
We can also reverse the order of the decomposition and look at inequality across occupations, and inequality within occupations but across locations:
\[
I = \Big[\log\Big(\frac{1}{N}\sum_{j,O,L} W_{OL}^{j}\Big) - \sum_O \frac{N_O}{N}\log W_O\Big]
+ \sum_O \frac{N_O}{N}\Big[\log W_O - \sum_L \frac{N_{OL}}{N_O}\log W_{OL}\Big]
+ \sum_O \frac{N_O}{N}\sum_L \frac{N_{OL}}{N_O}\, I_{OL},
\]
where $W_O$ is the average wealth of households with occupation $O$, and $N_O$ is the total number of households with occupation $O$. Let $AO = \log((1/N)\sum_{j,O,L} W_{OL}^{j}) - \sum_O (N_O/N)\log W_O$ denote the inequality across occupations, $ALWO = \sum_O (N_O/N)[\log W_O - \sum_L (N_{OL}/N_O)\log W_{OL}]$ denote the (sum of) inequality across locations but within occupations, and $WLWO = \sum_O (N_O/N)\sum_L (N_{OL}/N_O)\, I_{OL}$, the same as before, i.e., the residual is the same residual. Table 14 and Fig. 5 report the decompositions of the Theil-L indices by occupation and location.

12 See Acemoglu and Dell (2009) for a related application.
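The following sketch computes the three terms of this nested decomposition for a pandas DataFrame with columns wealth, location and occupation; it is our own illustrative implementation, and swapping the two grouping columns yields $AO$, $ALWO$ and $WLWO$.

```python
import numpy as np

def theil_L(w):
    w = np.asarray(w, dtype=float)
    return np.log(w.mean()) - np.log(w).mean()

def nested_theil(df, outer="location", inner="occupation"):
    """Return (A_outer, A_inner_within_outer, W_within), summing to Theil-L."""
    N = len(df)
    W = df["wealth"].to_numpy(dtype=float)
    A_outer = np.log(W.mean()) - sum(
        len(g) / N * np.log(g["wealth"].mean())
        for _, g in df.groupby(outer))
    A_inner = sum(
        len(gO) / N * (np.log(gO["wealth"].mean())
                       - sum(len(gI) / len(gO) * np.log(gI["wealth"].mean())
                             for _, gI in gO.groupby(inner)))
        for _, gO in df.groupby(outer))
    W_within = sum(
        len(cell) / N * theil_L(cell["wealth"])
        for _, cell in df.groupby([outer, inner]))
    return A_outer, A_inner, W_within  # sums to theil_L(df["wealth"])
```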
Table 13
Decomposition of Theil-L index by location and occupation.

       1999    2000    2001    2002    2003    2004    2005
AL     0.2552  0.2701  0.2806  0.2839  0.2795  0.2645  0.2553
AOWL   0.1345  0.1319  0.1272  0.1234  0.1203  0.1193  0.1178
WOWL   0.6815  0.6530  0.6155  0.5954  0.5640  0.5548  0.5329
I      1.0712  1.0551  1.0233  1.0028  0.9638  0.9386  0.9061
Table 14
Decomposition of Theil-L index by occupation and location.

       1999    2000    2001    2002    2003    2004    2005
AO     0.2437  0.2477  0.2465  0.2406  0.2338  0.2218  0.2136
ALWO   0.1552  0.1631  0.1700  0.1749  0.1740  0.1699  0.1673
WLWO   0.6723  0.6443  0.6068  0.5872  0.5561  0.5470  0.5252
I      1.0712  1.0551  1.0233  1.0027  0.9639  0.9386  0.9061
Fig. 5. Decomposition of wealth inequality: occupation.
Of course, the within-occupation, within-location number is the same as before and the largest. Inequality across occupations is the second largest, but the difference between the first and second rows is smaller than before. In terms of changes, Table 14 shows that $ALWO$ has an inverted U-shape. Inequality across locations is increasing as before, up to about 2002, now controlling in a sense for occupation. We also see that inequality across occupations, not controlling for location, is slightly increasing at the beginning, i.e., going against the overall trend. These are plotted in Fig. 5, but on different scales.13

2.8. Decomposition of growth of net worth: the mechanics

We return to the financial accounts (in nominal terms) to begin to get at the mechanics of the change in net worth for each
household. The change in net worth of each household must come from savings and net gifts received. That is, let $\Delta W_t^i = W_t^i - W_{t-1}^i$ be the change in net worth at time $t$ of household $i$, and let $S_t^i$ and $G_t^i$ be savings (saved if positive, dissaved if negative) and gifts (received if positive, given if negative) at time $t$ of household $i$. Hence:
\[
W_t^i = W_{t-1}^i + S_t^i + G_t^i.
\]
More generally, wealth or net worth at time $t$ can also be expressed as initial wealth $W_0^i$ plus accumulated savings and net gifts received up to time $t$:
\[
W_t^i = W_0^i + \sum_{j=1}^{t}\big(S_j^i + G_j^i\big).
\]
13 Looking at the subcomponents of $ALWO$ by occupation makes it clearer why both $AL$ and $ALWO$ have inverted U-shapes, due to business and labor. $ALWO$ is increasing for cultivation and fish/shrimp, while decreasing for livestock. That is, the premium across locations, the wealth differential by location, is going up consistently for cultivation and fish/shrimp, is increasing until 2002 and decreasing after that for business and labor, and is going down consistently for livestock. As we add them up, the combination of these forces produces the inverted U-shape of $ALWO$ seen in Fig. 5, which is also similar to the picture of $AL$ in Fig. 4.
We may ask which component is the larger part of the rate of total wealth accumulation in this economy in the aggregate:
\[
\frac{\frac{1}{N^g}\sum_{i=1}^{N^g}\big(W_t^i - W_0^i\big)}{\frac{1}{N^g}\sum_{i=1}^{N^g} W_0^i}
= \frac{\frac{1}{N^g}\sum_{i=1}^{N^g}\sum_{j=1}^{t} S_j^i}{\frac{1}{N^g}\sum_{i=1}^{N^g} W_0^i}
+ \frac{\frac{1}{N^g}\sum_{i=1}^{N^g}\sum_{j=1}^{t} G_j^i}{\frac{1}{N^g}\sum_{i=1}^{N^g} W_0^i},
\tag{1}
\]
Table 15
Growth of net worth and its components held by various groups defined by percentiles of the initial wealth distribution.

Percentile  Growth of wealth  Wealth weight  Savings component  Gifts component
0–24.9      25.5632           0.0679         16.1035            9.4598
25–49.9     7.1841            0.2252         3.5927             3.5914
50–74.9     4.8868            0.5362         3.6121             1.2747
75–100      1.5224            3.1638         1.5781             −0.0557
Total       2.6991            1              2.2099             0.4892

Note: the wealth weight is the group mean initial wealth relative to the overall mean, $\big(\frac{1}{N^g}\sum_{i=1}^{N^g} W_0^i\big)/\big(\frac{1}{N}\sum_{i=1}^{N} W_0^i\big)$.
where $N^g$ is the total number of households in group $g$. The left-hand side thus measures the overall aggregate growth rate of the net worth of group $g$.
We can now decompose the aggregate growth rate into a weighted sum of growth from these subgroups. Using the notation defined before, let $\bar W_t$ be the aggregate mean of wealth at time $t$, and let $p_t^g$ and $\bar W_t^g$ be the population share and mean wealth of subgroup $g$, respectively. Then the aggregate mean change in levels can be decomposed into two parts, due to a change in the means of subgroups and a change in population shares:
\[
\Delta \bar W = \sum_{g=1}^{G} \bar p^g\,\Delta \bar W^g + \sum_{g=1}^{G} \bar W^g\,\Delta p^g,
\]
where again $\Delta$ and the overbar denote the time difference operator and the time average operator, respectively. The first term captures intragroup growth, while the second term captures growth due to a compositional change in population. But as a subgroup is defined by the distribution of initial wealth, e.g., quartiles, and we follow households in the same group over time, $p_t^g$ does not change over time, so
\[
\Delta \bar W = \sum_{g=1}^{G} \bar p^g\,\Delta \bar W^g.
\]
Dividing both sides by $\bar W_0$ to compute the overall aggregate growth rate, the latter is related to the growth rates of the subgroups by:
\[
\frac{\frac{1}{N}\sum_{i=1}^{N}\big(W_t^i - W_0^i\big)}{\frac{1}{N}\sum_{i=1}^{N} W_0^i}
= \sum_{g=1}^{G} p^g\,
\frac{\frac{1}{N^g}\sum_{i=1}^{N^g}\big(W_t^i - W_0^i\big)}{\frac{1}{N^g}\sum_{i=1}^{N^g} W_0^i}\,
\frac{\frac{1}{N^g}\sum_{i=1}^{N^g} W_0^i}{\frac{1}{N}\sum_{i=1}^{N} W_0^i}.
\tag{2}
\]
Substituting (1) into (2), we can thus decompose the overall aggregate growth of each group into three components: the wealth weight in the total, savings, and gifts:
\[
\frac{\frac{1}{N}\sum_{i=1}^{N}\big(W_t^i - W_0^i\big)}{\frac{1}{N}\sum_{i=1}^{N} W_0^i}
= \sum_{g=1}^{G} p^g
\left[
\frac{\frac{1}{N^g}\sum_{i=1}^{N^g}\sum_{j=1}^{t} S_j^i}{\frac{1}{N^g}\sum_{i=1}^{N^g} W_0^i}
+ \frac{\frac{1}{N^g}\sum_{i=1}^{N^g}\sum_{j=1}^{t} G_j^i}{\frac{1}{N^g}\sum_{i=1}^{N^g} W_0^i}
\right]
\frac{\frac{1}{N^g}\sum_{i=1}^{N^g} W_0^i}{\frac{1}{N}\sum_{i=1}^{N} W_0^i}.
\tag{3}
\]
Each element in (3) is presented in Table 15, where again group $g$ refers to initial quartiles, taking $t$ to be the terminal period for which data are available. Table 15 reports the overall growth rate, the savings and gifts components (annualized values), and the wealth weight that can account for the growth of each group. The growth of wealth is decreasing in initial wealth, as noted earlier across the quartiles. The savings component is the one that accounts for most of the growth rate. Largely, the contribution of gifts decreases as wealth increases, except for the second quartile,
where the savings and gifts components are roughly the same. The wealthiest group has a negative gifts component; that is, on average they give more than they receive, and this brings down their growth of net worth. Even though the least wealthy group has a growth rate of over 25%, its fraction of initial mean wealth is only 6%, while the wealthiest group, with its 1.5% growth rate, has an initial mean over three-fold the aggregate. Thus, the overall growth of the (nominal) net worth of the economy is about 2.7% per year, and 81% of this growth is accounted for by the savings component.
Although the wealth of all groups is growing, the process is not smooth, especially at high temporal frequencies. If we look at the monthly aggregate change in the net worth of each group (from January, 1999, to December, 2005), this can be decomposed again into the aggregate savings and gifts of that quartile group $g$:
\[
\frac{1}{N^g}\sum_{i=1}^{N^g} \Delta W_t^i = \frac{1}{N^g}\sum_{i=1}^{N^g}\big(S_t^i + G_t^i\big).
\]
We can see from Fig. 6 that all groups experience negative growth at some point.
Another way to look at the role of savings and gifts in the change in net worth is to use a variance decomposition. By the accounting identity, again, the change in net worth of each household must be equal to savings plus net gifts received. This identity can be translated into a statistical relationship:
\[
1 = \frac{\mathrm{cov}\big(\Delta NW^i, S^i\big)}{\mathrm{var}\big(\Delta NW^i\big)}
+ \frac{\mathrm{cov}\big(\Delta NW^i, G^i\big)}{\mathrm{var}\big(\Delta NW^i\big)}
\quad\text{for all } i;
\]
that is, normalized wealth change can be accounted for by the comovement of wealth change with savings and likewise with net gifts.
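Because $\Delta NW^i = S^i + G^i$ month by month, the two covariance shares sum to one household by household; a minimal sketch, with our own naming, follows.

```python
import numpy as np

def covariance_shares(d_nw, savings, gifts):
    """Shares of wealth-change variance comoving with savings and with gifts."""
    d_nw = np.asarray(d_nw, dtype=float)
    v = np.var(d_nw, ddof=1)                 # ddof=1 matches np.cov's default
    s_share = np.cov(d_nw, savings)[0, 1] / v
    g_share = np.cov(d_nw, gifts)[0, 1] / v
    return s_share, g_share  # sums to 1 when d_nw == savings + gifts exactly
```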
By this metric, the variation in wealth change for most households is, again, better explained by variation in savings rather than gifts, as the distribution of savings shares is centered around 100% and that of gifts shares around 0 (not shown). This pattern holds for all changwats, with the lowest peak in Buriram (a hint that something more complicated is going on there).
In Table 16, we further decompose the growth of each group by primary occupation. The extra column (fraction) indicates how many households of that occupation are in that wealth group. All occupations in the least wealthy group (group 1) have a relatively high rate of growth in net worth, as might have been anticipated. And this negative monotonicity is largely true as one moves across quartiles by occupation. But the highest growth by occupation varies across the quartiles. The majority of households in the lowest quartile group are classified as labor (about 65%), and their growth in net worth is about 32% per year, the highest of all groups. Almost two-thirds of this growth is accounted for by the savings component. Although the fraction of initial wealth of this group is the smallest, the total weight (population multiplied by initial wealth) is the highest among all occupations. Therefore, the high growth in net worth of group 1 is mainly due to this subgroup. The highest growth rates of groups 2 and 4 are from households with livestock as the primary occupation, again mostly accounted for by savings. The highest growth rate of group 3 is from households with fish/shrimp as the primary occupation, and almost 100% of this growth is accounted for by savings.
Table 16
Growth of net worth and its components held by various occupations.

Group 1 (0–24.9)
Occupation   Growth of wealth  Fraction  Wealth weight  Savings  Gifts
Cultivation  13.0416           0.2727    1.1945         7.8134   5.2282
Livestock    6.9493            0.0152    1.7715         7.7090   −0.7597
Fish/shrimp  21.9301           0.0227    1.2395         15.7833  6.1468
Business     13.1656           0.0303    1.1532         −2.4516  15.6172
Labor        31.7788           0.6591    0.8865         20.5883  11.1905
Total        24.0804           0.2485    0.0679         15.1404  8.9400

Group 2 (25–49.9)
Cultivation  4.6225            0.3534    0.9638         0.6601   3.9623
Livestock    18.4396           0.0526    0.9665         12.7406  5.6990
Fish/shrimp  6.5792            0.0150    1.4026         5.4533   1.1259
Business     3.0652            0.1128    1.0538         −1.1467  4.2118
Labor        7.9964            0.4662    1.0052         5.3798   2.6166
Total        6.7625            0.2504    0.2252         3.3726   3.3899

Group 3 (50–74.9)
Cultivation  3.4940            0.3684    1.0143         1.7822   1.7118
Livestock    7.7690            0.0752    1.0469         7.5508   0.2181
Fish/shrimp  12.0465           0.0226    1.0626         12.0484  −0.0018
Business     4.6544            0.1278    1.0078         3.3590   1.2954
Labor        4.4952            0.4060    0.9724         3.5409   0.9543
Total        4.5803            0.2504    0.5362         3.3798   1.2004

Group 4 (75–100)
Cultivation  1.7362            0.3759    0.7572         1.5700   0.1662
Livestock    4.1846            0.1128    0.9696         3.8592   0.3254
Fish/shrimp  0.7946            0.1654    2.2519         1.2653   −0.4707
Business     2.2681            0.1203    0.4934         2.5979   −0.3298
Labor        0.3411            0.2256    0.7719         0.0384   0.3027
Total        1.4419            0.2504    3.1638         1.5012   −0.0593
Fig. 6. Savings and gifts.
However, the savings component is negative for business households in the first and second quartiles, and thus the contribution of gifts to their growth rate is over 100%. In other words, business households in these two groups either made losses rather than profits and/or consumed more than they earned, but were still growing due to net gifts received.
2.9. Return to the heterogeneity in the growth of net worth

We have seen that household net worth is growing on average, but not all households are experiencing the same thing. Table 17 is the distribution of the average growth of net worth over seven years. It emphasizes the heterogeneity in the data: there is a
Table 17
Distribution of growth of net worth (% per year).

              Min        Max       Mean    Std. dev.  First quartile  Median   Third quartile
Chachoengsao  −22.2502   51.6785   5.1254  12.0369    −1.7828         1.0123   9.1470
Lopburi       −15.2999   35.0945   5.1970  8.9357     −0.6547         3.1131   9.3896
Buriram       −322.3219  183.4458  1.0576  39.1593    −4.6787         −1.1281  5.4534
Sisaket       −306.5071  148.2795  3.5287  34.7254    −2.6326         0.0594   4.2867
All           −322.3219  183.4458  3.9306  25.9758    −2.3448         0.7861   7.5045
positive real growth rate on average, as both mean and median are positive, but strikingly 44% of households in the survey have a negative growth of net worth. This fraction varies by location, with a smaller fraction of households in the Central region, at 46% and 28% in Chachoengsao and Lopburi, respectively, and about 53% of households in the Northeast (57% and 50% in Buriram and Sisaket, respectively). However, the spread of the distribution is much wider in the Northeast, where the absolute values of the highest and lowest growth rates are several times larger than those of the Central region.
Relatedly, taking advantage of the long monthly panel, we can track the relative position of net worth within each changwat for each household. Fig. 7 shows a histogram of the change in relative positions over seven years. This is naturally centered at zero, but note that the standard deviation is 13.75, the minimum is −57 and the maximum is 80. Forty-three percent of households in the survey have increased their relative position, almost 50% of households have a negative change in relative position, and 7% stay at the same position, using percentiles. Fig. 8 gives examples of households in each changwat that experience relatively large increases and decreases in their relative position. Most of these changes are gradual increases or decreases, but some households experience a sudden change.

Fig. 7. Change in relative position.

3. Growth of net worth: a decomposition into productivity and savings rate
As savings can better explain wealth accumulation than gifts for most households, we turn our attention to it. Savings of household $i$ at time $t$, $S_t^i$, can be written as a combination of the savings rate $s_t^i$ (savings divided by net income), productivity $ROA_t^i$ (net income $\pi_t^i$ divided by assets, that is, the return on assets as typically used in corporate finance), and assets $A_t^i$ itself, i.e.,
\[
S_t^i = \frac{S_t^i}{\pi_t^i}\,\frac{\pi_t^i}{A_t^i}\,A_t^i = s_t^i\, ROA_t^i\, A_t^i.
\]
That is, how much income a household $i$ can generate from a given level of assets $A_t^i$ is measured by the rate of return on assets ($ROA_t^i$), and how much a household chooses to save out of the income generated is captured by the saving rate ($s_t^i$). If two households have the same asset levels and savings levels, then both households will have the same change in net worth (setting net gifts received equal to zero for both households, for the sake of simplicity).
Fig. 8. Relative position of net worth.
Table 18
Correlation of growth of net worth and savings rate for non-labor households.

          All                 Chachoengsao        Buriram            Lopburi             Sisaket
HH-month  0.0035 (0.6532)     0.0292* (0.0542)    0.0398 (0.1113)    0.0102 (0.4446)     0.0031 (0.8246)
HH-year   0.0314 (0.1722)     0.1049** (0.0291)   0.1768** (0.0144)  0.0719* (0.0782)    0.0171 (0.6586)
HH        0.2016*** (0.0006)  0.3790*** (0.0017)  0.2142 (0.2472)    0.4887*** (0.0000)  0.1367 (0.1773)

Notes: the number in parentheses is the significance level.
* Significant at the 10% level.
** Significant at the 5% level.
*** Significant at the 1% level.
In this case, if one household has a higher profit, then for fixed levels of $A$ and $S$ this higher ROA also means that its saving rate is lower, and the difference goes to consumption. So both experience the same change in net worth, but one household is better off than the other because it has higher productivity and is thus able to have higher consumption. Alternatively, and as we focus on here, if consumption is the same for these two households, the one with the higher profit will have a higher growth rate. While this interpretation is suitable for households that use assets to generate income, it is harder to interpret for households that have labor earnings as the primary source of income, because the assets used are mainly human capital rather than physical capital. We do not measure human capital, so we adjust for this below. More generally, in terms of the growth rate of net worth, we can write
\[
\frac{\Delta W_t^i}{W_{t-1}^i} = s_t^i\, ROA_t^i\, \frac{A_t^i}{W_{t-1}^i} + \frac{G_t^i}{W_{t-1}^i}.
\]
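As a worked check of this identity, consider one household-period; the numbers below are purely illustrative.

```python
def growth_of_net_worth(W_lag, A, net_income, savings, gifts):
    """Growth of net worth as s * ROA * (A/W) plus the gifts term."""
    s = savings / net_income   # savings rate (requires net_income > 0)
    roa = net_income / A       # return on assets
    return s * roa * (A / W_lag) + gifts / W_lag

# s = 0.25, ROA = 0.15, A/W = 0.8 gives 0.03; gifts add 0.005:
print(growth_of_net_worth(W_lag=100_000, A=80_000, net_income=12_000,
                          savings=3_000, gifts=500))  # 0.035
```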
Thus both the savings rate and productivity can determine the growth of net worth. The order of magnitude in the decomposition reduces to an empirical question. Although correlations do not allow for causal or structural interpretation, they are still informative about the relative importance of different variables. In this sense, the savings rate turns out to be less important than productivity for the growth of net worth.
We first study whether variations in the savings rate can help in explaining the growth in net worth.14 The sample contains a non-trivial portion of negative savings (when consumption is higher than net income), and in defining the savings rate as savings/net income we run into trouble when net income is negative. We drop observations with negative net income to get a more meaningful measure of the savings rate. However, this depends on how we aggregate the data. Using household-months, as the data are originally collected, 29% of household-months have a negative net income. When we aggregate to an annualized value, observations with negative net income fall to 14%, and when we aggregate over all seven years, to 4%. Naturally, households experience some losses in the short run, as there is a transitory component in income, but this is less likely to persist, and the permanent component in income plays a larger role in the longer run. Computing a savings rate has another problem when net income is positive but close to zero, as this drives the savings rate to a very high number.
Table 19
Correlation of growth of net worth and savings rate for labor households.

          All                Chachoengsao        Buriram          Lopburi          Sisaket
HH-month  0.0013 (0.8747)    0.0113 (0.4253)     0.0021 (0.8971)  0.0107 (0.5042)  0.0007 (0.9730)
HH-year   0.0257 (0.3253)    0.1742*** (0.0003)  0.0455 (0.3565)  0.0722 (0.1633)  0.0527 (0.4094)
HH        0.1710** (0.0107)  0.5161*** (0.0000)  0.1589 (0.2099)  0.0855 (0.5350)  0.1672 (0.3089)

Notes: the number in parentheses is the significance level.
** Significant at the 5% level.
*** Significant at the 1% level.
To deal with this, we raise the cutoff to a small positive number (100 baht). This drops more households from the analysis.15 The median of the distribution of the savings rate is 25% regardless of the type of observation, but the mean is negative because some households have very large negative savings rates. Tables 18 and 19 show the correlation of the growth of net worth and the savings rate, where we use household-months, averaged by mean over calendar time to get household-years, and averaged by mean to get a single number for each household. As there is some difference between labor and non-labor households in the interpretation of ROA, we report them separately. Correlations tend to be higher when we aggregate over calendar years, and over all seven years; that is, there is a stronger positive association of growth of net worth and the savings rate over the longer run. By location, there is a significant, positive and large correlation between growth of net worth and the savings rate in Chachoengsao at almost all levels, for both non-labor and labor households. And there are more significant correlations for non-labor than for labor households.
In contrast, Table 20 shows a significant and positive correlation between growth of net worth and ROA at all levels.16 The correlation numbers naturally tend to be higher when we average over all seven years, and the correlation is highest for Buriram, though lowest for Sisaket.
15 The numbers of dropped observations when we compute the savings rate are as follows.
14 In a continuous-time model where the savings rate is actually savings out of profits, $\dot W^i = S^i = \frac{S^i}{\pi^i}\frac{\pi^i}{W^i}W^i = s^i\, ROE^i\, W^i$, where ROE is the return on equity, not the return on assets. In discrete time, one can write $W_{t+1}^i = \frac{S_t^i + W_t^i}{\pi_t^i}\frac{\pi_t^i}{W_t^i}W_t^i = s_t^i\, ROE_t^i\, W_t^i$, which suggests that $\mathrm{Var}(\log(W_{t+1}^i/W_t^i)) = \mathrm{Var}(\log s_t^i) + \mathrm{Var}(\log ROE_t^i) + 2\,\mathrm{Cov}(\log s_t^i, \log ROE_t^i)$. In practice the covariance term is quite large, and this decomposition fails to be very informative.
16 In fact, most households in the survey have multiple sources of income; even for labor households, where wage earnings are the primary source of income, most also have income from other sources. Therefore, ROA is not meaningless for them either. When we include all households in the calculation, the results are quite similar; that is, all correlations are statistically significant with varying magnitudes, i.e., with labor households included, the correlations are higher in the Central region but lower in the Northeast.
Table 20
Correlation of growth of net worth and ROA for non-labor households.

          All                 Chachoengsao        Buriram             Lopburi             Sisaket
HH-month  0.3576*** (0.0000)  0.5664*** (0.0000)  0.7394*** (0.0000)  0.5497*** (0.0000)  0.2665*** (0.0000)
HH-year   0.4040*** (0.0000)  0.5081*** (0.0000)  0.7661*** (0.0000)  0.5270*** (0.0000)  0.3301*** (0.0000)
HH        0.5256*** (0.0000)  0.6830*** (0.0000)  0.7366*** (0.0000)  0.6853*** (0.0000)  0.4423*** (0.0000)

Notes: the number in parentheses is the significance level.
*** Significant at the 1% level.
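The three aggregation levels of Tables 18-20 can be reproduced with a sketch like the following, assuming a pandas panel with hypothetical columns hh, year, growth and roa.

```python
import pandas as pd
from scipy.stats import pearsonr

def correlations_by_level(panel):
    """Correlation (and p-value) of growth and ROA at three aggregation levels."""
    hh_month = pearsonr(panel["growth"], panel["roa"])
    by_year = panel.groupby(["hh", "year"])[["growth", "roa"]].mean()
    hh_year = pearsonr(by_year["growth"], by_year["roa"])
    by_hh = panel.groupby("hh")[["growth", "roa"]].mean()
    hh = pearsonr(by_hh["growth"], by_hh["roa"])
    return {"HH-month": hh_month, "HH-year": hh_year, "HH": hh}
```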
Fig. 9. Density of growth of net worth.
Table 21
Correlation of growth of net worth and ROA for non-labor households, using the split-sample measures.

     All                 Chachoengsao        Buriram           Lopburi             Sisaket
HH   0.3172*** (0.0000)  0.5235*** (0.0000)  −0.0934 (0.5936)  0.4290*** (0.0000)  0.2859*** (0.0037)

Notes: the number in parentheses is the significance level.
*** Significant at the 1% level.
Fig. 9 plots the density of growth of net worth by ROA. The density shifts to the right as ROA increases. The quartile of households with the lowest ROA tends to have lower growth of net worth, but there are quite a few exceptions, as we can see from the long right tail of the density. Also, higher ROA households tend to have more dispersion in growth. These patterns are common across provinces, even if they are more difficult to see in the picture for Sisaket.
3.1. Assessing the possibility of mismeasured total assets and net worth

If initial wealth seems to be low because it is mismeasured, and later we get a more accurate, higher measure of wealth, then there would be a positive association of low levels and high growth. Likewise, this could make for a high correlation between ROA and growth of net worth, as an ill-measured, low initial wealth means both a high initial ROA, dividing by a small number, and high growth, as just mentioned. This may be happening despite the pains we take to measure accurately. Possibly households feel more comfortable giving information as they are re-interviewed, or the enumerators get better over time at acquiring the information. However, the discipline of the accounts mitigates this, since households would have to report that they had purchased the additional assets since the last interview, or otherwise the enumerator is supposed to go back and make corrections to earlier data.
Still, we can check whether the picture we get is entirely spurious with an obvious robustness check. We compute ROA using income from the first subperiod divided by assets from the second subperiod. The correlation of the two measures of ROA is not low (0.5701) and is significant. For growth of wealth, we compute average wealth for the first half and the second half of the overall sample frame (3.5 years each) and then compute the growth rate. The correlation of the two measures of growth of net worth is slightly lower than that for ROA (0.5628) and is significant.
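A minimal sketch of this split-sample check for one household, under our own naming of the monthly series:

```python
import numpy as np

def split_sample_measures(income, assets, wealth):
    """ROA from first-half income over second-half assets, and split-half growth.

    A spuriously low early asset measure then cannot drive both a high ROA
    and high measured growth at the same time.
    """
    income, assets, wealth = (np.asarray(x, dtype=float)
                              for x in (income, assets, wealth))
    half = len(wealth) // 2
    roa_split = income[:half].mean() / assets[half:].mean()
    growth = (wealth[half:].mean() - wealth[:half].mean()) / wealth[:half].mean()
    return roa_split, growth

# Correlating these household-level numbers with the baseline measures gives
# 0.5701 for ROA and 0.5628 for growth, as reported in the text.
```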
Table 21 reassesses the correlation of growth of net worth and ROA. We obtain similar qualitative results, except for Buriram, where the correlation is now negative and not significant. The correlations are lower when we account in this way for the possibility of mismeasured total assets and net worth. In summary, although both saving rates and productivity are potentially important in explaining the growth of net worth, productivity seems to drive growth of net worth more consistently. We next study what factors are associated with household success as measured by high ROA.

4. Factors accounting for success

We first look at education as a potential factor to explain ROA. Then we use multivariate regressions and consider additional variables that are emphasized in the literature: debt normalized by assets, initial wealth, and occupation (dummies). We then recompute ROA focusing on physical assets and see whether the results are robust. Lastly, we look at the persistence of ROA over time. We focus on non-labor households, for which the ROA concept is suitable.

4.1. Education and ROA

One may ask whether education, as a measure of talent, is related to ROA, the ability to make money from assets. As the unit of the survey is the household, we need a measure of education for the household, but this depends on household composition, which may change over time. Some household members may graduate or obtain higher education as time passes. Those who have higher education levels may leave, and this might affect the productivity of the reduced household unit. On the other hand, one can think of a longer-lasting impact, even after a member has gone. We can treat the head as representative of the household, but this would be a static number, as most heads had completed their schooling before the beginning of the survey. An alternative takes each month and uses the highest education level among the members present in that month. Finally, one can take all the members in that month and use the mean or median as representative of that month. We denote these variables respectively by max_edu, mean_edu, and median_edu. To come up with one number for each household over the various months of the sample, we can average each of the three by either the mean or the median.
Averaging the monthly data by the median per year and regressing ROA on each measure of education with household-year observations, we have Table 22. Thus, high education is associated with high ROA for all measures, but particularly so for the max. We stratify by occupation in Table 23. Business households have the highest, positive and most significant regression coefficients for all measures. Estimates are negative for livestock households. Cultivation and fish/shrimp households have slightly positive estimates that are weaker, with p-values at around the 10% level. Using a different way to represent the data, such as the arithmetic mean, would produce quantitatively different results at the overall level, but the results by occupation are similar to those in Table 23, especially for business.

4.2. Multivariate regressions

We consider variables that are emphasized in the literature as being able to explain differences in economic well-being: debt normalized by assets, initial wealth, occupation (dummies), family networks,17 division of labor within the household,18 and again education, where we use mean_edu.
Table 22
Regression coefficient of ROA on education.

max_edu              mean_edu             median_edu
0.134956*** (0.000)  0.068116*** (0.000)  0.0555616*** (0.000)

Notes: the number in parentheses is the p-value.
*** Significant at the 1% level.

Table 23
Regression coefficient of ROA on education, by occupation.

            Cultivation         Livestock            Fish/shrimp          Business
max_edu     0.0265635* (0.092)  −0.0572225* (0.095)  0.0681653** (0.015)  0.1756315*** (0.001)
mean_edu    0.0303029* (0.067)  −0.0697562** (0.044) 0.0633666* (0.067)   0.1543303*** (0.007)
median_edu  0.0314343* (0.057)  −0.0697973** (0.044) 0.0619228* (0.063)   0.1424772** (0.012)

Notes: the number in parentheses is the p-value.
* Significant at the 10% level.
** Significant at the 5% level.
*** Significant at the 1% level.
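The regressions in Tables 22 and 23 are simple OLS slopes of ROA on one education measure; a sketch with statsmodels follows, where the DataFrame and column names are hypothetical.

```python
import numpy as np
import statsmodels.api as sm

def edu_slope(roa, edu):
    """Coefficient and p-value from regressing ROA on one education measure."""
    X = sm.add_constant(np.asarray(edu, dtype=float))
    fit = sm.OLS(np.asarray(roa, dtype=float), X).fit()
    return fit.params[1], fit.pvalues[1]

# By occupation, as in Table 23:
# for occ, g in df[df["occupation"] != "labor"].groupby("occupation"):
#     print(occ, edu_slope(g["roa"], g["max_edu"]))
```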
4.2. Multivariate regressions

We consider variables that are emphasized in the literature as being able to explain differences in economic well-being: debt normalized by assets, initial wealth, occupation (dummies), family networks,17 division of labor within the household,18 and again education, where we use mean_edu. Initial conditions from a previous generation might matter: we control for parental characteristics such as the education of the father and mother of the head (and spouse) and the landholding of the father and mother of the head (and spouse), the latter as a proxy for how wealthy the parents were. We also include basic demographic variables such as household size, the head's age and gender, and the sex ratio, and we control for time and location by dummies. Moreover, we control for heterogeneity across households by including a dummy variable for each household, to see whether the results are robust. We average by mean over months to get household-year observations for households with non-labor as the primary occupation.

When running regressions, it is especially important to take the possibility of measurement error in total assets and initial wealth into account, as we use them as covariates. Specifically, suppressing the household superscript for the moment, and assuming that total assets are the only variable measured with error, let

A_t = A_t^* exp(e_t),   (4)

where A_t is measured assets, A_t^* is the true value, and e_t is a measurement error. We assume a multiplicative form because assets enter as the denominator in both the dependent variable and one of the independent variables, and we want to handle the measurement error by a linear IV. That is, we use measured assets to compute ROA and the debt-asset ratio (L_t/A_t), so both ROA and the debt-asset ratio are contaminated with measurement error. We assume classical errors-in-variables:

cov(log A_t^*, e_t) = 0.

It is well known that this results in attenuation bias in OLS.19 We use the value of land at time t and lagged wealth as instruments for assets A_t, and the initial land value as an instrument for initial wealth W_0.
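The role of the instruments can be seen in a stylized simulation of the setup in Eq. (4) and footnote 19. Everything here, including the variable names, the magnitudes, and the use of a single land-value instrument, is illustrative rather than the paper's actual estimation.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000
    log_A_star = rng.normal(10.0, 1.0, n)          # true log assets
    land = log_A_star + rng.normal(0.0, 0.5, n)    # instrument: tracks A*, not e_t
    e = rng.normal(0.0, 0.3, n)                    # multiplicative measurement error

    beta1 = 0.2
    # True debt-asset ratio varies with true assets (richer, less leveraged).
    log_ratio_star = -2.0 - 0.5 * (log_A_star - 10) + rng.normal(0, 0.3, n)
    log_roa_star = 1.0 + beta1 * log_ratio_star + rng.normal(0, 0.3, n)

    # Measured logs: both sides pick up the same error, since A_t enters both.
    log_roa = log_roa_star - e                 # log(ROA_t) = log(ROA_t^*) - e_t
    log_ratio = log_ratio_star - e             # log(L_t/A_t) = log(L_t/A_t^*) - e_t

    # OLS on measured data is inconsistent; IV with land value recovers beta1.
    b_ols = np.cov(log_roa, log_ratio)[0, 1] / np.var(log_ratio, ddof=1)
    b_iv = np.cov(log_roa, land)[0, 1] / np.cov(log_ratio, land)[0, 1]
    print(f"true={beta1:.3f}  OLS={b_ols:.3f}  IV={b_iv:.3f}")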
17 Network is defined by blood relationships and is meant to capture the effect of the network (if any) on ROA.
18 The division of labor is defined from the number of days each member spends on each task, a proxy for how well managed a household is (better management should result in a higher ROA).
19 With A_t = A_t^* exp(e_t), the true and measured variables satisfy log(ROA_t^*) = log(ROA_t) + e_t and log(L_t/A_t^*) = log(L_t/A_t) + e_t. For initial wealth W_0, the true identity is W_0^* = A_0^* − L_0, while the one constructed from the accounts using measured total assets is W_0 = A_0 − L_0. As a result, initial wealth is also measured with error: W_0 = W_0^* + η_0, where η_0 = A_0 − A_0^* is an additive measurement error in assets. Therefore the regression in terms of observables is

log(ROA_t) = β_0 + β_1 log(L_t/A_t) + β_2 W_0 + Σ_{i≥3} β_i X_i + [u_t + (β_1 − 1) e_t − β_2 η_0],   (5)

where {X_i}_{i≥3} are the other control variables, which are treated as exogenous, and u_t is the error term in the true regression.
Table 24
IV regression (using all assets to compute ROA). Dependent variable: log(ROA).

                           (1)                   (2)                  (3)                   (4)
  log(debt–asset ratio)    0.2067*** (0.000)     0.06687** (0.039)    0.2116*** (0.000)     0.06048* (0.061)
  Education                0.01636 (0.362)       0.135*** (0.000)     −0.01551 (0.388)      0.1186*** (0.000)
  Household size           0.1185*** (0.000)     0.1856*** (0.000)    0.1207*** (0.000)     0.1715*** (0.000)
  Head age                 −0.01944*** (0.000)                        −0.01802*** (0.000)
  Initial wealth           −1.9e−08*** (0.000)                        −2.1e−08*** (0.000)
  Own hour work                                                       0.00566*** (0.000)    0.00449*** (0.000)
  Paid hour work                                                      0.00277*** (0.000)    0.0016** (0.019)
  Household fixed effect   No                    Yes                  No                    Yes
  Number obs.              1125                  1653                 1125                  1653
  Adjusted R2              0.2519                0.5933               0.2933                0.6010

Notes: number in parentheses is the p-value. * Represents significance at the 10% level. ** Represents significance at the 5% level. *** Represents significance at the 1% level.

Table 25
Adjusted R2 from OLS regression (using all assets to compute ROA). Dependent variable: ROA.

                           (1)      (2)      (3)      (4)      (5)
  Adjusted R-squared       0.561    0.164    0.565    0.177    0.566
  Household fixed effect   Yes      No       Yes      No       Yes
  Labor supply             No       No       No       Yes      Yes
  Other covariates         No       Yes      Yes      Yes      Yes
Table 24 reports the results from the IV regression.20 The debt/asset ratio is positive and significant, as is household size, while initial wealth and the head's age are negative and significant (in the specifications without household fixed effects); i.e., households with lower initial wealth and younger heads tend to have a higher ROA. Education retains its positive significance in two out of four specifications. Own work is defined as the number of hours the household spends on its own enterprises (cultivation, livestock, fish/shrimp, business); it is positive and significant. Paid work is also positive and significant, but its estimate is smaller than that of own labor supply.21 Other variables (not shown in Table 24) that are positive and significant when we do not include household fixed effects are the education of the head's father and the education of the spouse's mother. The results for family network and division of labor (not shown in Table 24) are not significant. There is an increasing trend in labor income, and we use net income from all sources to compute ROA. Thus, even though we exclude households with primary income from labor from the analysis of ROA, labor income, and not profits from non-labor
household activities, may still play a role. As a robustness check, we subtract labor income from each household's net income and rerun the IV regression. We obtain qualitatively similar results to Table 24, except for education.

4.3. Robustness checks: OLS, and using physical assets only to compute ROA

As a robustness check, we regress annual ROA on the same set of explanatory variables in each specification by ordinary least squares.22 Table 25 reports the adjusted R2 of five specifications. The first column reports a regression with only household dummy variables. The next four specifications correspond to those in Table 24. The explanatory power of these regressions increases several-fold when we include household fixed effects, indicating that the factors accounting for success are specific to each household and an important part of reality. The notable difference is for education, with a coefficient that is either negative, or positive but not statistically significant.23 Other variables are similar in terms of sign and statistical significance, though the coefficient on the debt/asset ratio is higher, and those on household size and initial wealth are lower in absolute value.24

Thus far we have used all assets to compute ROA. We can be less conservative and use only physical assets (deleting currency and financial assets). Since the denominator is lower, ROA mechanically goes up. But the correlation between the two measures of ROA is quite high (0.7534) and statistically significant. We run the same set of regressions as before, now using this new measure of ROA, as another robustness check. With the instruments, the debt/asset ratio is positive and significant in all specifications. Education is positive and significant in three out of four specifications. All other variables are similar in terms of sign and statistical significance. Again, R2 increases several-fold when we control for household-specific fixed effects. Using OLS and this new measure, the debt/asset ratio is positive and significant only when we do not control for fixed effects. Estimates of household size are all positive, but significant only when we control for household fixed effects. Other variables are more or less the same: education is still either not significant or significant with a negative sign, while the head's age and initial wealth are negative and significant.
20 Only some selected estimates are reported here.
21 One interpretation could be that the household is free from moral hazard or management problems with its own labor supply.
22 About 9% of the observations in the OLS regression are dropped when we run IV, because we have to drop observations with negative net income when we take the logarithm.
23 One possible interpretation of this negative effect of education is self-selection: if there are jobs with a high return to education in Bangkok but highly educated household members are still in the village, those members are probably less talented. See also Udry (1994).
24 We have done further robustness checks. First, we change the dependent variable to the return after subtracting the estimated opportunity cost; the results are quite similar when there is no household fixed effect. But when we include household fixed effects, none of the variables is significant, even though the signs are still much the same. Second, when we include households with labor as the primary occupation, the results are similar, except that all coefficients on education are negative and not significant, and the coefficient on own work is not significant and smaller than that on paid work.
Fig. 10. Persistence of ROA.

Table 26
Correlation of rank of ROA (first half and second half).

  Chachoengsao        Buriram           Lopburi             Sisaket
  0.7229*** (0.0000)  0.1569 (0.3682)   0.7373*** (0.0000)  0.8307*** (0.0000)

Notes: number in parentheses is the significance level. *** Represents significance at the 1% level.
5. Persistence of ROA

5.1. Scatter plots

Factors specific to a household can account for variation in success. A related question is whether a high ROA household today is more or less likely to be a high ROA household in the future. We compute average ROA for the first half and the second half of the overall sample frame (3.5 years each) and then rank households within each period (from lowest to highest), so that a low rank number indicates relatively low ROA in that period. Fig. 10 shows scatter plots of the rank of ROA, its fitted linear value, the 95% confidence interval, and a 45° line, by changwat, using all observations except labor households. Table 26 reports the correlation of the ranks across the two time frames for each changwat. A household with a high rank of ROA in the first half is likely to have a high rank in the second half; that is, there is considerable persistence.25 This is especially true for households in the three provinces other than Buriram. We also see that some households deviate from this pattern, as a non-negligible number of points lie far from their initial position. The fitted line is not the 45° line but rather has a slope of less than one: a household with a low rank in the first period is likely to have a higher rank in the second period, and vice versa. But overall, households which are successful over the first half of the sample are likely to be successful over the second, indicating that luck per se is not an explanation for success.
25 We also try this year by year and find basically similar results, i.e., a considerable amount of persistence except for Buriram in 2000–2001. We also try by occupation, and the results vary somewhat: there is a considerable amount of persistence for cultivation, business and livestock, but less for fish/shrimp.
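The rank-persistence statistic in Table 26 is straightforward to reproduce; the sketch below uses a made-up frame of household-level average ROA in each half of the sample.

    import pandas as pd

    # Hypothetical average ROA per household in each 3.5-year half, by changwat.
    df = pd.DataFrame({
        "changwat":  ["Lopburi"] * 4 + ["Buriram"] * 4,
        "roa_half1": [0.02, 0.10, 0.05, 0.30, 0.01, 0.20, 0.07, 0.04],
        "roa_half2": [0.03, 0.12, 0.04, 0.25, 0.15, 0.02, 0.06, 0.11],
    })

    # Rank within changwat (1 = lowest ROA) and correlate the two rankings.
    for cw, g in df.groupby("changwat"):
        corr = g["roa_half1"].rank().corr(g["roa_half2"].rank())
        print(cw, round(corr, 3))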
In Buriram, however, persistence is much lower (the correlation is 0.15 and not statistically significant). Two pieces of evidence offer some explanation for Buriram: change in occupation and change in household composition.

5.2. Occupational change and selection into higher returns

Households in the survey typically have multiple sources of income. Thus far we have utilized the primary occupation of each household over all seven years. However, if we look at the activity that generates the highest net income in each year, and define that as the primary occupation of that year, there may be occupational changes over time. Households in the Northeast change occupations more often than those in the Central region, and the highest average number of changes is in Buriram. If we look at a correlation, or run a regression of average ROA on the total number of occupations over the seven years, we find a negative and statistically significant estimate; that is, high ROA households are associated with a low number of primary occupations. However, causality cannot be inferred. So, to aid in interpretation, we compare ROA before and after a switch of occupations; that is, we want to see, for those who have an occupational change, whether the household tends to switch into an occupation that gives it a higher rate of return. This is still only an association, of course, but it is highly suggestive. Table 27 reports mean-comparison tests of ROA before and after a household changes occupation. Using all the observations, ROA after an occupational change is statistically higher than ROA before the change. By province, ROA after switching occupations is higher and statistically significant only in Buriram, though the difference is positive but not significant in Lopburi and Sisaket.

5.3. Stability of household composition

Household size in Buriram also exhibits more instability than in any other province. If ROA is related to individual talent, and individuals come and go, then a household in which most of the individuals change should have a less persistent ROA over the two subperiods. Alternatively, the coming and going could be deceptive if the housing structure is more of a boarding house. But again
Fig. 11. Mean value of physical assets: high ROA group.

Table 27
Mean-comparison tests. H0: difference in ROA = 0, H1: difference in ROA > 0.

                Obs   Mean         Std. err.   Std. dev.   Lower        Upper       Pr(T > t)
  Chachoengsao  170   −0.0283246   0.0904854   1.179785    −0.2069518   0.1503026   0.6227
  Buriram       179   0.4188116    0.1259209   1.684706    0.1703218    0.6673014   0.0005
  Lopburi       130   0.0776118    0.1006854   1.14799     −0.1215967   0.2768203   0.2211
  Sisaket       238   0.0618619    0.1106396   1.706865    −0.1561008   0.2798246   0.2883
  All           717   0.1324474    0.0562019   1.50491     0.0221072    0.2427876   0.0094
Table 28
Regression coefficient of ROA on household stability.

  ROA1                   ROA1 * sd(hhsize)      sd(hhsize)           adj-R2    Number obs.
  0.6444308*** (0.000)   −0.1778253** (0.030)   −0.0071196 (0.936)   0.3712    298

Notes: number in parentheses is the p-value. ** Represents significance at the 5% level. *** Represents significance at the 1% level.
we cannot establish causality: an ill-performing household may generate turnover of individuals. Consider the regression

ROA_{2,i} = b_0 + b_1 ROA_{1,i} + b_2 ROA_{1,i} * sd(hhsize_i) + b_3 sd(hhsize_i) + u_i,

where ROA_{2,i} and ROA_{1,i} are the average ROA of household i over the second half and the first half, and sd(hhsize_i) is the standard deviation of household size. If the estimate of b_2 is negative, it lowers the effective persistence coefficient b_1 + b_2 sd(hhsize_i), and thus households with a higher variation in household size have less persistence. The regression results are reported in Table 28. The estimate of the interaction term is negative and significant, while the estimate on ROA_1 is, as anticipated, positive and significant.
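A sketch of the Table 28 specification follows, with simulated data in place of the survey; the coefficients used to generate the data merely mimic the reported pattern.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(1)
    n = 298
    df = pd.DataFrame({
        "roa1": rng.normal(0.08, 0.05, n),       # first-half average ROA
        "sd_hhsize": rng.exponential(0.5, n),    # s.d. of household size
    })
    # Persistence that declines with household-size instability, plus noise.
    df["roa2"] = (0.64 - 0.18 * df["sd_hhsize"]) * df["roa1"] + rng.normal(0, 0.03, n)

    # ROA2 = b0 + b1 ROA1 + b2 ROA1*sd(hhsize) + b3 sd(hhsize) + u
    fit = smf.ols("roa2 ~ roa1 + roa1:sd_hhsize + sd_hhsize", data=df).fit()
    print(fit.params)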
6. The predictive power of ROA

This section studies the predictive power of ROA, specifically the association of ROA with physical investment and financial asset accumulation. We have seen that ROA and the growth of net worth are positively correlated at almost all levels. Total assets can be classified into two types: financial and physical assets. In this section, we look at the association with both types of assets. We use non-labor households and group them by ROA. In each changwat, those with ROA in the fourth quartile are classified as the high ROA group, those with ROA in the first quartile as the low ROA group, and households in the second and third quartiles as the middle group.

6.1. Physical assets versus financial assets

The average value of physical assets for the high ROA group fluctuates with an increasing trend in all four changwats. The middle and low ROA groups display quite different behavior from the high ROA group: their physical assets fluctuate, but with decreasing trends in all four changwats. For brevity, we show only the figures for the high and middle ROA groups, as the middle and low ROA groups share similar patterns, i.e., decreasing trends. Evidently, high ROA households put their wealth back into their income-generating activities, and that is why their physical assets have an increasing trend (Fig. 11). In Fig. 13 we report the middle group for financial assets, to compare with Fig. 12, the middle group for physical assets. In contrast, the average value of financial assets is growing, and this is true for almost all groups and regions. The only exceptions are the low groups of Buriram and Sisaket, and the middle group of Buriram.26
26 In Buriram, the financial assets of all groups have a decreasing trend for the first three years, even for the high ROA group. In Sisaket, the financial assets of the low ROA group display a decreasing trend for almost five years, but increase after that. Evidently, the behavior that underlies financial assets is more difficult to discern.
Fig. 12. Mean value of physical assets: middle ROA group.
Fig. 13. Mean value of financial assets.
7. Financial strategies and credit market imperfections
In fact, different households use different financial strategies. This section addresses the relationship of ROA with financial strategies and the debt-asset ratio. It also presents evidence indicative of imperfections in credit markets.

7.1. Financing the cash flow deficit

From the household budget constraint,

C_t + I_t = Y_t + F_t^1 + · · · + F_t^n,

where C_t, I_t, Y_t are consumption, investment and net income at time t, and F_t^i is financing device i at time t. Let D_t = C_t + I_t − Y_t be the overall deficit. It must by definition be financed by the devices F_t^1, . . . , F_t^n:

D_t = F_t^1 + · · · + F_t^n.

One can derive from this identity a statistical relationship:

1 = cov(D, F^1)/var(D) + · · · + cov(D, F^n)/var(D),

and this must be true for each household. We can then see by this metric which device each household uses to finance its deficit. In fact, two more types of deficit can be defined, a consumption deficit (C − Y) and an investment deficit (I − Y), and their variation can be decomposed in a similar way, putting investment and consumption on the right-hand side of the budget constraint, respectively.
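Because the shares cov(D, F^i)/var(D) follow directly from the budget identity, they can be computed mechanically for each household, as in the sketch below, which uses invented monthly series for three financing devices.

    import numpy as np

    rng = np.random.default_rng(2)
    T = 84  # months of data for one household
    # Hypothetical financing devices; by the budget identity the deficit D_t
    # is their sum, so the covariance shares sum to one by construction.
    F = {
        "borrowing": rng.normal(0.0, 3.0, T),
        "deposits":  rng.normal(0.0, 2.0, T),
        "gifts":     rng.normal(0.0, 1.0, T),
    }
    D = sum(F.values())

    shares = {name: np.cov(D, f)[0, 1] / np.var(D, ddof=1) for name, f in F.items()}
    print(shares, "sum =", round(sum(shares.values()), 6))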
Table 29
Distribution of debt-asset ratio.

                Min      Max      Mean     Std. dev.   First quartile   Median   Third quartile
  Chachoengsao  0.0000   0.3950   0.0367   0.0660      0.0021           0.0126   0.0507
  Buriram       0.0004   0.2724   0.0704   0.0635      0.0270           0.0497   0.1221
  Lopburi       0.0000   0.3830   0.0772   0.0718      0.0235           0.0590   0.1104
  Sisaket       0.0000   0.8242   0.0919   0.1103      0.0200           0.0664   0.1263
  All           0.0000   0.8242   0.0717   0.0871      0.0133           0.0440   0.1036
Table 30
Correlation of ROA and debt-asset ratio (non-labor households).

            All                Chachoengsao        Buriram             Lopburi             Sisaket
  HH-month  0.0962*** (0.000)  0.0468*** (0.0003)  0.0853*** (0.000)   0.1270*** (0.0000)  0.1177*** (0.0000)
  HH-year   0.2675*** (0.000)  0.1969*** (0.0000)  0.2267*** (0.0003)  0.3105*** (0.0000)  0.3418*** (0.000)
  HH        0.3982*** (0.000)  0.3289*** (0.0051)  0.4113** (0.0141)   0.4136*** (0.0000)  0.4827*** (0.000)

            Cultivation        Livestock           Fish/shrimp         Business
  HH-month  0.1033*** (0.000)  0.1279*** (0.000)   0.0817*** (0.0000)  0.0539*** (0.0004)
  HH-year   0.3172*** (0.000)  0.2314*** (0.0003)  0.3543*** (0.0000)  0.1125** (0.0319)
  HH        0.4222*** (0.000)  0.4392*** (0.0094)  0.5932*** (0.0006)  0.2058 (0.1432)

Notes: number in parentheses is the significance level. ** Represents significance at the 5% level. *** Represents significance at the 1% level.
What we are interested in here is not the financing devices on their own, but whether financing strategies are related to ROA and the growth of wealth. We find that high ROA households are not using capital assets to smooth consumption and, conversely, are using consumption to finance investment deficits. This is consistent with their trying to maintain, or increase, real physical assets. High ROA households are also more actively involved in financial markets, in the sense of using formal savings accounts and borrowing, and less in informal markets, i.e., fewer gifts.27

7.2. Debt-asset ratio

A household may be able to use debt to take advantage of its productivity, i.e., borrow to increase assets and earn a high return. The debt/asset ratio in the survey, even though relatively small, has an overall increasing trend. Table 29 shows the distribution of the debt/asset ratio, where we average by mean over the seven years for each household. By location, the debt/asset ratio is lowest in
27 Further details are as follows:
a. High ROA and high growth of net worth households rely more on cash to finance consumption deficits, more on borrowing to finance overall deficits, more on deposits to finance all kinds of deficits, and more on consumption to finance investment deficits.
b. High ROA and high growth of net worth households rely less on gifts to finance overall deficits and less on investment to finance consumption deficits.
c. High growth of net worth households rely more on borrowing to finance overall and investment deficits, but less on borrowing to finance consumption deficits.
d. High ROA households in the Northeast rely less on gifts to finance deficits.
e. High ROA households in Sisaket rely more on lending to finance consumption and investment deficits.
f. High ROA and high growth of net worth households in Chachoengsao rely more on consumption to finance investment deficits; high growth of net worth households rely less on cash to finance investment deficits.
g. High ROA households in Lopburi rely more on gifts to finance consumption deficits.
Chachoengsao and highest in Sisaket, and is more generally lower in the Central region than in the Northeast. Table 30 shows the correlation of ROA and the debt/asset ratio, something we have seen earlier. The correlations are higher when we aggregate over calendar years or over all seven years. They are statistically significant everywhere, with one exception: when we aggregate over seven years and stratify by business. Evidently, financial markets work to some degree, but we need to look at decisions on the margin.

7.3. Potential imperfection in credit markets

Suppose we impose the Cobb–Douglas functional form

y_jt = A_jt K_jt^{β_K} L_jt^{β_L},   (6)

where y_jt, K_jt, L_jt are value added from production activities (cultivation, livestock, fish/shrimp, business), the level of physical assets, and the labor input (both hired and own) of household j at time t, and A_jt = exp(A_0 + α_jt + u_jt), where A_0 is the common productivity or mean efficiency across households, α_jt is a specific effect for household j at time t, and u_jt is the error term for household j at time t. The parameter α_jt can be interpreted as ‘‘entrepreneurial ability’’, or talent as in Evans and Jovanovic (1989), for example, or more generally as the productivity of household j at time t. If talent is assumed to be time-invariant, α_jt = α_j for all t, then one can either put in a dummy variable for each household or use fixed-effects estimation to get a consistent estimate of (β_K, β_L). While assuming fixed talent over time (i.e., one is born with a level of talent that can never improve or worsen) simplifies the analysis, the assumption is probably too strong. We assume here instead that productivity can vary over time, and that prior to making a production decision at time t, α_jt is known to household j (but not to us). Future productivity is not known with certainty but is serially correlated with current productivity. Under this assumption, OLS on the log form of Eq. (6) gives inconsistent estimates.
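Under the simpler time-invariant-talent assumption (α_jt = α_j), the dummy-variable benchmark mentioned above is a one-line regression. The sketch below simulates such a panel; it is not the modified Olley–Pakes routine the paper actually uses.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(3)
    J, T = 100, 7
    df = pd.DataFrame({
        "hh":   np.repeat(np.arange(J), T),
        "logK": rng.normal(11.0, 1.0, J * T),
        "logL": rng.normal(7.0, 1.0, J * T),
    })
    alpha = rng.normal(0.0, 0.5, J)  # time-invariant talent alpha_j
    df["logY"] = (0.5 * df["logK"] + 0.42 * df["logL"]
                  + alpha[df["hh"]] + rng.normal(0.0, 0.3, J * T))

    # log y_jt = beta_K log K_jt + beta_L log L_jt + alpha_j + u_jt,
    # with a dummy for each household via C(hh).
    fit = smf.ols("logY ~ logK + logL + C(hh)", data=df).fit()
    print(fit.params[["logK", "logL"]])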
Fig. 14. Marginal product of capital.
Olley and Pakes (1996) construct a structural model of a firm and use investment as a proxy for underlying productivity. Intuitively, since productivity is serially correlated, a productive firm today, knowing that its productivity is likely to be high in the future as well, will invest more than a less productive one. One can thus use investment as a proxy for unobserved productivity. They also control for the fact that firm size and productivity are related to whether a firm will exit an industry. Together, these allow them to estimate their model consistently. Levinsohn and Petrin (2003) argue that investment is costly to adjust. This results in most firms having zero investment, so these observations are truncated from the Olley–Pakes estimation routine. They do not model the exit decision, but show that intermediate inputs can be used as proxies for underlying productivity and allow one to consistently estimate the parameters of interest. Investments in our data are mostly non-zero: the median and arithmetic mean of annual average investment are 1133 and 3068 baht, respectively. We thus modify the Olley–Pakes estimation routine and use investment as a proxy to estimate (β_K, β_L)28 (Table 31). The estimates for both capital and labor are positive and significant.29 The estimate of β_K is about 0.5.30 We can convert this to a marginal product of capital (MPK) for each household and compare it to the interest rate. The median monthly interest rate is 0.75% (or 9% per year). Fig. 14 plots the (time-averaged) MPK of each household against log(K) and the (median) interest rate. High-capital households are naturally associated with low MPK. However, in a neoclassical world without any imperfection, every household faces the same interest rate, and MPK should be equal to that interest rate. Intuitively, those households with MPK higher than the interest rate should be willing to borrow more than the other group, and in a perfect market this would drive MPKs closer together.
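With the Table 31 estimate in hand, the household-level MPK plotted in Fig. 14 is β_K · y/K under Eq. (6); the values below are invented for illustration.

    import numpy as np

    beta_K = 0.5                       # capital coefficient from Table 31
    r = (1 + 0.0075) ** 12 - 1         # 0.75% per month, about 9% per year

    # Hypothetical time-averaged value added and physical assets (baht).
    y = np.array([60_000.0, 25_000.0, 90_000.0])
    K = np.array([200_000.0, 900_000.0, 350_000.0])

    mpk = beta_K * y / K               # MPK implied by y = A K^bK L^bL
    print(np.round(mpk, 3), "r =", round(r, 3), mpk > r)  # True: would gain by borrowing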
28 However, we do not model entry and exit decisions. In Olley–Pakes, firms with productivity lower than a threshold leave the industry, and this is potentially true also in the household survey. Attrition in household surveys can also occur for the opposite reason, i.e., households that do well no longer have time for the interview and thus disappear from the survey. Here we use a balanced panel to study other related issues, such as inequality change and the wealth accumulation of each household over time, so we want to use the same sample to estimate the parameters of interest.
29 We have done the following robustness checks. First, we try fuel and energy as proxies, as in Levinsohn and Petrin (2003). The estimates are lower for both labor and capital, and not significant for capital. Second, we assume that talent is time-invariant and thus use OLS with a dummy variable for each household to control for talent. The estimates are lower (0.12 for labor and 0.34 for capital) but significant for both labor and capital.
30 We cannot reject the hypothesis of constant returns to scale. Nevertheless, if we impose constant returns and estimate β_K by occupation, the number is lower (0.0405–0.4433). See below for further robustness checks.
Fig. 15. Debt/Asset ratio.

Table 31
Estimates of production function.

  β_K                   β_L                    Number obs.
  0.5029177** (0.026)   0.4238552*** (0.000)   1966

Notes: number in parentheses is the p-value. ** Represents significance at the 5% level. *** Represents significance at the 1% level.
Table 32
Correlations of TFP, ROA, growth of wealth.

                      TFP                 ROA
  ROA                 0.4962*** (0.0000)
  Growth of wealth    0.3226*** (0.0000)  0.5256*** (0.0000)

Notes: number in parentheses is the significance level. *** Represents significance at the 1% level.
Evidently, the divergence of MPK from the interest rate indicates that this is not entirely true in the data. Nevertheless, we examine whether these relatively more productive households (MPK higher than the interest rate) have a higher debt/asset ratio than the other group (MPK lower than the interest rate). Fig. 15 shows the density of the debt/asset ratio for these two groups. Overall, it seems that the financial system is working to some extent, in that people with MPK higher than the average interest rate tend to borrow more than the other group. But again, the system is not working perfectly either. Combining this with what we have seen before, it seems that high ROA households are slowly accumulating physical capital. Such households seem able to borrow only to a limited extent, less than what their productivity would warrant, and that is why MPK and the interest rate are not equalized. If we use the estimate of A_jt in Eq. (6) as the measure of TFP for household j, we can see whether it is related to ROA and to our earlier results on the growth of net worth. Table 32 shows these correlations. Household-specific TFP is related to ROA and, more weakly, to the growth of net worth.31 The picture we have been drawing with ROA does not seem to be misleading.
31 Imposing constant returns to scale, the new TFP number is still correlated with ROA, but not with the growth of net worth. In a study of Korean businesses, Lee (2008) finds that TFP and ROE are not related at all.
Table 33
Correlation of TFP derived from Eq. (7).

  ROA                Debt–asset ratio   Education          Household size     Head age             Initial wealth      Network           Division of labor
  0.1226** (0.0429)  0.0064 (0.9160)    0.1506** (0.0172)  0.1254** (0.0384)  −0.2057*** (0.0006)  0.1655*** (0.0061)  −0.0504 (0.4073)  0.2291*** (0.0001)

Notes: number in parentheses is the significance level. ** Represents significance at the 5% level. *** Represents significance at the 1% level.
Part of ROA is due to differences in the capital-output ratio, not only to household-specific productivity. In other words, if the financial market were perfect, then MPK would be equalized across all households and equal to the interest rate, with different scales of K depending on productivity. Again, there are indications that this is not happening. Still, the TFP numbers may not have the interpretation we want. In a regression of TFP on initial net worth and labor, we get significant coefficients, though those factors should not show up in the residual if we had the correct specification. Nevertheless, we use the estimates of the production function parameters and TFP to get an idea of the potential gain if capital were reallocated among households in the economy. If we fix the labor input at the level observed in the data but allow capital to adjust until the MPKs are equalized, then according to the estimates, value added can increase by 200%–300% (capital is now allocated to match productivity better than before). If we allow both capital and labor to adjust, the allocations are exponential in productivity for both inputs and thus the efficiency gain is even higher.32 Although the magnitude of the efficiency gain is quite large, there are two practical issues. First, what are the obstacles that prevent the optimal allocation of K, and how can one reform the system so that resources can be allocated more efficiently? Second, one might still wonder whether these productive households could actually operate on a much larger scale, as one element in productivity might include factors like management, and a high productivity estimate may be based on a previously low level of inputs.33 Finally, all these relatively high gains reflect the reallocation of capital into the upper right tail of the distribution of productivity, and hence the numbers are sensitive to outliers.

We also adjust ROA for aggregate risk, consistent with the complete-markets, capital asset pricing model at the village level, utilizing data from Samphantharak and Townsend (2009a). We find a correlation of risk-adjusted returns with ROA as a measure of individual talent, and a correlation of risk-adjusted returns with the growth of net worth. But the overall results are weaker; for example, with some exceptions, high risk-adjusted-return households do not invest more in their own enterprises. This suggests again that the capital markets are not perfect in these data.

From the estimated parameters, we can also derive an alternative measure of TFP. Suppose that, for some reason, instead of facing a common interest rate, different households face different interest rates and thus different margins for marginal decisions. This can result in households having different MPKs. We do see in the data that many households borrow from various sources, and each household may face a distinct interest rate. We can then use the interest rate specific to each household
32 Using microdata to examine manufacturing establishments, Hsieh and Klenow (2009) find that TFP gains from manufacturing are 30%–50% in China and 40%–60% in India when resources are hypothetically reallocated to the efficiency level observed in the United States. Using calibration, Restuccia and Rogerson (2008) find that policy distortion can decrease output and TFP by 30%–50%. 33 See Bloom and Van Reenen (2007) on the importance of management in explaining productivity and what can empirically account for variation in management practices observed in their data.
and the estimates of (β_K, β_L) from Eq. (6) to compute a household's TFP. That is,

A_j = [R(1 + τ_j^k)/β_K]^{β_K} [π_j/(1 − β_K)]^{1−β_K} (1/L^{β_L}),   (7)
where R(1 + τ_j^k) is the interest rate that household j faces; τ_j^k looks like a wedge or distortion. We then check whether the computed TFP is correlated with the factors that can explain ROA. Table 33 reports the correlation coefficients. The correlation between ROA and the new TFP number is lower than before, but it is still positive and significant. Education is positive and significant. The head's age is negative, while household size is positive. There are two exceptions to the previous results: here the debt/asset ratio is positive but no longer significant, while initial wealth is now positive and significant. Adjusting for the observed interest rate may remove static distortions but not intertemporal dynamics.
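Eq. (7) can be evaluated directly once a household's borrowing wedge is known; the profit, labor and wedge values below are made up.

    beta_K, beta_L = 0.5, 0.42   # production function estimates (Table 31)
    R = 0.09                      # reference annual interest rate

    def tfp_eq7(pi, L, tau):
        """Household TFP implied by Eq. (7), given profit pi, labor L, wedge tau."""
        return ((R * (1 + tau) / beta_K) ** beta_K
                * (pi / (1 - beta_K)) ** (1 - beta_K)
                / L ** beta_L)

    # Same profit and labor, different wedges: the household facing the higher
    # effective rate must be more productive to generate the same profit.
    print(tfp_eq7(pi=30_000.0, L=2_000.0, tau=0.0))
    print(tfp_eq7(pi=30_000.0, L=2_000.0, tau=0.5))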
7.4. Savings behavior and productivity

Some recent contributions in economic theory, such as Buera (2008) and Moll (2009), illustrate the relationship between saving behavior and productivity. Buera (2008) studies the optimal saving decision of a household facing a choice between working for a wage and starting a business, while Moll (2009) studies the effect of financial frictions on aggregate productivity, emphasizing the role of persistent productivity shocks: if productivity shocks have enough persistence, self-financing by a household can eventually undo the capital misallocation created by financial frictions. Both models share the common feature that the savings of a household or firm are higher as productivity increases. However, productivity in Buera's model is fixed, whereas it is allowed to change over time in Moll's model. Allowing for a more general functional form, like CRRA utility, might give Moll a savings rate that is increasing in persistence, but this is only a conjecture. The point of Moll's work, as with others, is that households who are consistently productive but face constraints save their way out of them by increasing net worth. So we explore here whether savings behavior and the growth of wealth are related to productivity and its persistence. This subsection employs correlation analysis, using estimates of productivity from various places in the paper, to examine the relationship between savings behavior and productivity suggested by these recent contributions.

As measures of productivity, in addition to ROA, we can also use the estimate of TFP from Eq. (6) and the estimates of the household dummies in the ROA regressions (IV and OLS). Table 34 reports the correlations of these various measures of productivity with savings (levels and rates) and with the growth of wealth. Although the size of the correlations is not large (less than 0.21 for savings), most are positive and significantly different from 0. The exception is the correlation of the savings level with the household fixed effects from the IV regression. Overall, though, highly productive households seem to have higher savings rates and levels, and also higher growth of wealth. As household fixed effects in ROA are
Table 34
Correlation of productivity and savings level, rate and growth of wealth.

                     ROA                 TFP                 FE (IV)                                 FE (OLS)
                                                             w/o labor           w/ labor            w/o labor           w/ labor
  Savings level      0.1400*** (0.0012)  0.0437* (0.0526)    0.0593 (0.3426)     0.0348 (0.5758)     0.2097*** (0.0003)  0.1372** (0.0182)
  Savings rate       0.2059*** (0.0000)  0.0476** (0.0425)   0.1531** (0.0138)   0.1651*** (0.0075)  0.1258** (0.0308)   0.0362 (0.5355)
  Growth of wealth   0.5256*** (0.0000)  0.3226*** (0.0000)  0.1838*** (0.0030)  0.2491*** (0.0000)  0.3774*** (0.0000)  0.4056*** (0.0000)

Number in parentheses is the significance level. * Represents significance at the 10% level. ** Represents significance at the 5% level. *** Represents significance at the 1% level.
Fig. 16. Income, consumption and savings.
largely persistent, as shown when we subdivide the sample, the estimate of a household's dummy from the ROA regression is also an indicator of productivity, and savings and the growth of wealth are also correlated with this measure. As another measure of persistence, we regress monthly ROA on its own lag, household by household, to come up with an estimate of the autocorrelation, normalized by its standard deviation to take into account noise in the parameter estimate. The savings rate is computed for each month and then averaged by the mean to get the average savings rate for each household. Pooling over all observations, the correlation of this measure of persistence with the savings rate is 0.1571 and statistically significant at 1%. If we do not normalize by the standard deviation and use the monthly data, the correlation is 0.1779 and significant at 1%. However, using the growth of wealth and the level of savings instead, the correlation is −0.1088 and significant at 10% for the former, but not significant for the latter.34
34 By location, the correlation is higher and statistically significant at 10% for Buriram (0.3076), and at 5% for Sisaket (0.2219), but lower and not significant for Chachoengsao (0.1260) and even negative for Lopburi (−0.0763). Also, oddly, the result is not robust when we annualize the data.
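The household-by-household persistence measure can be sketched as follows; the ROA series here are simulated AR(1) processes rather than survey data.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)

    def persistence(roa):
        """AR(1) slope of monthly ROA, normalized by its standard error."""
        fit = sm.OLS(roa[1:], sm.add_constant(roa[:-1])).fit()
        return fit.params[1] / fit.bse[1]

    def ar1(rho, T=84):
        """Simulate a monthly ROA series with autocorrelation rho."""
        x = np.zeros(T)
        for t in range(1, T):
            x[t] = rho * x[t - 1] + rng.normal(0, 0.01)
        return 0.05 + x

    for j, rho in enumerate([0.1, 0.5, 0.9]):
        print(j, round(persistence(ar1(rho)), 2))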
8. Household specific fixed effects: story from a selected successful household

From the ROA regressions, a factor specific to households can account considerably for variation in success. On one hand, this result is important in its own right. But on the other, we are left with no notion of what the specific factors are that make a household successful. We now turn to a case study. The household singled out here has the highest ROA (16.85% APR) among those in Lopburi with livestock as the primary occupation. The lower left panel of Fig. 16 shows the wealth accumulation of the case study household. Besides one big jump in 2002, the wealth of this household has an overall increasing trend. There are three members, and the size and composition of the household never change: a prototypical nuclear household. At baseline (1998), the head was a 37-year-old married male with four years of education. The spouse was 30 years old with six years of education. They have one daughter, 7 years old at baseline, who was entering her last year of kindergarten, which lasts three years. The prior primary occupation of both head and spouse was corn farming, in which the head and spouse had spent 19 and 12 years, respectively. The household has multiple sources of income (cultivation,
livestock, business, labor and others), but the main source of income is livestock (dairy cows). The idea of adopting the dairy cattle occupation came from observing neighbors, and milk cooperatives sent their workers to educate villagers as well. This household thought that this could be a good occupation compared to others, although they were not certain at the time whether the return would be high. The occupational change was not without an internal argument, as the head did not want to change while the spouse insisted. There was also a problem of insufficient funds for the initial investment, but this was alleviated by help from the father of the spouse, who was willing to lend his land title to the household to use as collateral to obtain funds from the Bank for Agriculture and Agricultural Cooperatives (BAAC). Ex post, the result turned out to be very good. The number of cows and the total milk production of this household have an increasing trend. Milk production, however, is not fully controllable, and the average milk per big cow, or productivity, fluctuates over time without a clear trend. The correlation of ROA and the average productivity of the cows is 0.6946 and statistically significant. Household net income fluctuates with an overall increasing trend, and consumption also has an increasing trend, but with less fluctuation. The regression coefficient of consumption on net income is small (0.0824358) but significant (p = 0.030). The household's monthly saving is most often positive (about three-quarters of the seven years), with a mean of 9544 baht and a median of 7000 baht. The change in net worth of this household is basically explained by saving, not gifts, but there is a huge increase in net worth at one point due to a large gift-in: the father of the spouse gave land as an inheritance, and this gift was intended for the spouse.

From personal interviews with the authors in the summer of 2008, the head said that their well-being improved after the occupational change and that many additional household assets came from the earnings of this occupation. We also asked them what made them successful compared to many other livestock households. The head thought that it is mainly due to the selection of cows and the attention they both pay to them. For selection, the spouse learned from cow merchants how they select good cows, and the household used this knowledge to try to acquire them. The spouse occasionally received extra training, and the head studied the documents that she had received in the training. The household plans to expand the dairy cow activities further, but the constraints are funds and labor, of which funds are the more important. If funds were needed, the household would borrow from the BAAC using assets as collateral, with the loan size determined by the collateral. This is more than the alternative of joint-liability group borrowing allows, so essentially these funds are limited by the amount of assets they can put up as collateral.

9. Conclusion

Rather than repeat our findings here, we refer the reader to the abstract and introduction. Instead, we conclude with unanswered questions and anomalies that come from the analysis and that need to be addressed by future research. The first concerns household-specific fixed effects. Though we are able to provide some suggestive information about this factor, from correlations in the data with education and from particular case studies, we still do not understand what this household-specific factor really is.
If it is intrinsic ability or attitudes, then it needs to be measured by tests and additional survey questions. We are working on this. Second, and related, we know from this study that wealth inequality is declining, largely because poor households are moving upward in the distribution of wealth and wealth levels
across initial quartiles are converging. But we do not touch on the sources of the initial wealth inequality that is so salient to begin with. Such a study needs to take into account heterogeneity in preferences, the history of credit markets, regional and economic growth, and, again, entrepreneurial ability. These questions are the subject of current and continuing research. Third, and also related, 81% of wealth accumulation is accounted for by savings. We have focused in this paper on persistence in high returns, yet a preliminary result of Pawasutipaisit (2009) indicates that there are other motives; the precautionary motive may be dominant. In contrast, stratifying by the age of the household head, he rarely finds evidence of hump savings or saving for retirement, as in the life-cycle hypothesis. Again, see Pawasutipaisit (2009).

Observed consumption presents several anomalies. First, the distribution of consumption is much more compressed than the distribution of wealth. For instance, the consumption share of the top 1% of households is around 4%–6%, much lower than the 33% wealth share of the same group. The consumption share of the bottom 50% is about one-third, although their wealth share is less than 10%. Regarding the consumption and wealth shares of the top 1%, the numbers observed in the Townsend Thai survey are similar to those of the United States. Relatedly, recent models of Castañeda et al. (2003) and Cagetti and De Nardi (2007) generate a wealth share of the top 1% of households of roughly one-third and a consumption share for the same group of approximately 11%–16%, though the latter share is still higher than the observed US numbers (4%–6%). It might be suggested that the US discrepancy is due to the fact that consumption and wealth are measured in different surveys; our Thai numbers, however, come from the same survey and are similar to the US numbers. We have yet to understand the compression. Second, consumption is higher than income (thus saving is negative) for households with negative growth of net worth, and we are not sure what the financial strategies of these households are, that is, what they have in mind for the future (we remain concerned that the definition of the household implicit in what we do is not adequate for these households). Third, though high ROA households tend to increase consumption, overall ROA is not correlated with consumption growth, despite the fact that high ROA is associated with high accumulation of physical and financial assets (Sisaket is an exception, however). So what are these households saving for? Actually, this evidence fits the complete-markets framework better. In solutions to a planning problem, the consumption allocation is determined by the equality of weighted marginal utility across households. So, if Pareto weights are not related to productivity, then high ROA households will not have high consumption. More generally, consumption growth depends on preferences: those with relatively low risk aversion and/or low intertemporal discount rates will have higher consumption growth. So, again, if productivity and preferences are not related, then high ROA households will not be associated with high consumption growth.
On the other hand, we have found in this paper strong evidence of potential imperfections in the credit market: in the production function estimation and the divergence of marginal products of capital from the average interest rate, in the correlation of savings with the persistence of ROA, in the reinvestment of the profits of high ROA households into their own enterprises, and in financial strategies, namely that high ROA households are not using capital assets to smooth consumption and conversely are using consumption to finance investment deficits. We are working with these data toward a better characterization of the effective financial regime in place. The interaction of talent and financial market imperfections must be a huge part of what we are observing.
Acknowledgements

Financial support from the NSF, NICHD, the John Templeton Foundation, and the Bill & Melinda Gates Foundation through the Chicago Consortium on Financial Systems and Poverty is gratefully acknowledged. We would like to thank Chris Ahlin, Francisco Buera, Mariacristina De Nardi, Tae Jeong Lee, Benjamin Moll, Benjamin Olken, Anna Paulson and participants of the MIT development lunch for helpful comments. Fan Wang provided excellent research assistance.

References

Acemoglu, D., Dell, M., 2009. Beyond neoclassical growth: technology, human capital, institutions and within country differences. Mimeo, MIT.
Atkinson, A.B., 1971. The distribution of wealth and the individual life-cycle. Oxford Economic Papers 23, 239–254.
Banerjee, A.V., Moll, B., 2009. Why does misallocation persist? Mimeo, MIT.
Binford, M., Lee, T.J., Townsend, R.M., 2004. Sampling design for an integrated socioeconomic and ecological survey by using satellite remote sensing and ordination. Proceedings of the National Academy of Sciences 101 (31), 11517–11522.
Bloom, N., Van Reenen, J., 2007. Measuring and explaining management practices across firms and countries. Quarterly Journal of Economics CXXII, 1351–1408.
Buera, F., 2008. Persistency of poverty, financial frictions, and entrepreneurship. Mimeo, UCLA.
Cagetti, M., De Nardi, M., 2007. Estate taxation, entrepreneurship, and wealth. National Bureau of Economic Research, Working Paper 13160.
Cagetti, M., De Nardi, M., 2008. Wealth inequality: data and models. Macroeconomic Dynamics 12, 285–313.
Castañeda, A., Díaz-Giménez, J., Ríos-Rull, J., 2003. Accounting for earnings and wealth inequality. Journal of Political Economy 111, 818–857.
Evans, D.S., Jovanovic, B., 1989. An estimated model of entrepreneurial choice under liquidity constraints. Journal of Political Economy 97, 808–827.
Fernandes, A.M., Pakes, A., 2008. Factor utilization in Indian manufacturing: a look at the World Bank Investment Climate Surveys data. NBER Working Paper 14178.
Hsieh, C., Klenow, P., 2009. Misallocation and manufacturing TFP in China and India. Quarterly Journal of Economics CXXIV, 1403–1448.
Jeong, H., 2008. Assessment of relationship between growth and inequality: micro evidence from Thailand. Macroeconomic Dynamics 12, 155–197.
Kennickell, A.B., 2003. A rolling tide: changes in the distribution of wealth in the US, 1989–2001. Mimeo, Federal Reserve Board, Occasional staff study.
Lee, T.J., 2008. On the efficiency of Korean businesses: evaluation based on TFP. Presented at the Korea Development Economics Association, The Proceedings of the Joint Conference of the Korean Economic Associations, February 2008 (in Korean).
Levinsohn, J., Petrin, A., 2003. Estimating production functions using inputs to control for unobservables. Review of Economic Studies 70, 317–341.
Moll, B., 2009. Productivity losses from financial frictions: can self-financing undo capital misallocation? Mimeo, University of Chicago.
Mookherjee, D., Shorrocks, A., 1982. A decomposition analysis of the trend in UK income inequality. Economic Journal 92, 886–902.
Olley, S., Pakes, A., 1996. The dynamics of productivity in the telecommunications equipment industry. Econometrica 64, 1263–1297.
Paulson, A., Sakuntasathien, S., Lee, T.J., Binford, M., Townsend, R.M., 1997. Questionnaire design and data collection for NICHD grant Risk, Insurance and the Family, and NSF grants. Manuscript, The University of Chicago.
Pawasutipaisit, A., 2009. Wealth accumulation and motives to save. Mimeo, University of Chicago.
Pawasutipaisit, A., Paweenawat, A., Samphantharak, K., Townsend, R.M., 2007. User manual for the Townsend Thai monthly survey. Mimeo, University of Chicago.
Piketty, T., Saez, E., 2003. Income inequality in the United States 1913–1998. Quarterly Journal of Economics CXVIII, 1–39.
Restuccia, D., Rogerson, R., 2008. Policy distortions and aggregate productivity with heterogeneous establishments. Review of Economic Dynamics 11, 707–720.
Samphantharak, K., Townsend, R.M., 2009a. Risk and return in village economies. Mimeo, UCSD.
Samphantharak, K., Townsend, R.M., 2009b. Households as Corporate Firms: An Analysis of Household Finance Using Integrated Household Surveys and Corporate Financial Accounting (Econometric Society Monographs). Cambridge University Press.
Udry, C.R., 1994. Risk and insurance in a rural credit market: an empirical investigation in Northern Nigeria. Review of Economic Studies 61, 495–526.
Journal of Econometrics 161 (2011) 82–99
National estimates of gross employment and job flows from the Quarterly Workforce Indicators with demographic and industry detail

John M. Abowd*, Lars Vilhuber
School of Industrial and Labor Relations, Cornell University, Ithaca, NY 14853, United States
Article history: Available online 17 September 2010
JEL classification: J6; C82; C43
Abstract: The Quarterly Workforce Indicators (QWI) are local labor market data produced and released every quarter by the United States Census Bureau. Unlike any other local labor market series produced in the US or the rest of the world, the QWI measure employment flows for workers (accessions and separations), jobs (creations and destructions) and earnings for demographic subgroups (age and gender), economic industry (NAICS industry groups), detailed geography (block (experimental), county, Core-Based Statistical Area, and Workforce Investment Area), and ownership (private, all), with fully interacted publication tables. The current QWI data cover 47 states, about 98% of the private workforce in those states, and about 92% of all private employment in the entire economy. State participation is sufficiently extensive to permit us to present the first national estimates constructed from these data. We focus on worker, job, and excess (churning) reallocation rates, rather than on levels of the basic variables. This permits a comparison to existing series from the Job Openings and Labor Turnover Survey and the Business Employment Dynamics series from the Bureau of Labor Statistics (BLS). The national estimates from the QWI are an important enhancement to existing series because they include demographic and industry detail for both worker and job flow data compiled from underlying micro-data that have been integrated at the job and establishment levels by the Longitudinal Employer-Household Dynamics Program at the Census Bureau. The estimates presented herein were compiled exclusively from public-use data series and are available for download.
1. Introduction

The measurement of gross flows of workers into and out of employment has occupied applied economists for more than thirty years. For decades the Bureau of Labor Statistics (BLS) has derived these measurements from the Current Population Survey (CPS) in the United States. In other countries, statistical agencies use similar instruments, usually called labor force surveys, to measure gross worker flows.1 The measurement of gross job flows has a much more recent history, originating in work using American manufacturing establishments. The first direct simultaneous measurements of both worker and job flows using individual, establishment and job-level data integrated at the micro-data level were produced from French administrative and survey records.2 Aggregate estimates of worker and job flows for the US have been produced by
integrating tabulated data from household surveys (CPS), establishment surveys (the Job Openings and Labor Turnover Survey (JOLTS)), and establishment-level micro-data (BLS measures based on the Quarterly Census of Employment and Wages (QCEW)).3 Using data similar to those that form the basis for our work, integrated collections of worker and job flows have been produced for small groups of states4 and for the state of Maryland5 using Unemployment Insurance (UI) wage records as the micro-data basis. A coherent aggregate story has emerged. Gross flows greatly exceed net flows. Furthermore, worker flows – accessions and separations – exceed job flows – creations and destructions. The magnitude of the ‘‘churning’’ depends, weakly, upon the state of the economy, and upon whether the employer is growing, staying constant, or shrinking.6 Modeling these gross flows, especially during economic downturns and for different demographic groups, has been a goal of many individual and agency researchers. Such
* Corresponding author. E-mail address: [email protected] (J.M. Abowd).
1 See Abowd and Zellner (1985) and Poterba and Summers (1986) for early discussions of the gross worker flow problem in the context of the Current Population Survey (CPS).
2 Job flow data initiated with the work of Dunne et al. (1989) and Davis and Haltiwanger's earliest work (Davis and Haltiwanger, 1990, 1992; Davis et al., 1996). Integrated worker and job data were first produced by Abowd et al. (1999).
3 See Davis et al. (2006), Boon et al. (2008) and Davis et al. (2010). 4 See Anderson and Meyer (1994) and Burgess et al. (2000). 5 See Burgess et al. (2001). 6 Churning is defined in Burgess et al. (2000). See Abowd et al. (1999) and Burgess et al. (2001) for general summaries of flow magnitudes.
modeling forms the basis for recent work at the BLS,7 where the principal difficulty remains an inability to measure all of the flows at the same consistent microeconomic level — that of the job or employer. Only the estimates based on French employers and those using employers operating in the state of Maryland provide fully integrated microeconomic data approaches in which all of the relevant flows are measured using micro-data at the job level.

When the US Census Bureau began publishing the QWI in 2003, it marked the first time that an American statistical agency had attempted to provide labor force stock, flow, and earnings data from a consistent, integrated job-level source. The system is based on the integration of demographic, economic and job-level data using state-level UI and QCEW micro-data linked to Census Bureau censuses and surveys on households and businesses (Abowd et al., 2004). At first, there were only 18 participating states, representing about 30% of the US labor force. By 2009, all but two states (Massachusetts and New Hampshire) had joined the Local Employment Dynamics (LED) Federal/State Partnership that provides the data. Currently published QWI data cover 92% of private non-farm employment. Since the system was designed to provide consistent stock and flow information at very detailed geographic, industrial and demographic levels, the fact that participation by states was not universal was not seen as a serious drawback. However, now that participation is essentially universal, this paper constructs national estimates, for the first time, using the same industrial and demographic detail that characterizes the original QWI publication. In addition, this paper provides evidence on the statistical reliability of the national QWI estimates and on their sensitivity to missing state data.

In Section 2, we describe the basic public-use data sources that form the core of our national gross flow estimates. Section 3 provides the formulae for our worker, job, and excess reallocation rates and their components. We also describe the series from the QCEW, Business Employment Dynamics (BED), and JOLTS that we use for comparison and imputation, where needed. Section 4 describes how we handled the incomplete QWI data in forming national rates. Section 5 presents our results in both tabular and graphical form. We conclude in Section 6.

2. Data sources and definitions

2.1. Quarterly Workforce Indicators

The Census Bureau publishes the QWI state-by-state at the beginning of each calendar quarter for data covering the quarter that ended nine months earlier, and all earlier quarters, at lehd.did.census.gov/led/. The complete set of 30 QWI is available at the Cornell VirtualRDC for download at www.vrdc.cornell.edu/qwipu/. In this article we consider only the series pertaining to private ownership, and we use only state-wide totals disaggregated by NAICS sector, gender and age, fully interacted. We focus on six core variables: beginning- and end-of-quarter employment (denoted by B and E, respectively), accessions A, separations S, job creations JC, and job destructions JD. To understand how the QWI relate to similar measures published by the BLS, we present a brief summary of the data integration methods and definitions here. Detailed definitions of other variables are available in Abowd et al. (2009) and online at the sites noted above. We compare the similarities and differences between the QWI and the BLS measures from the JOLTS, QCEW and BED below.
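Given the six core variables, the rates compared in later sections follow the standard flows-literature definitions: worker reallocation A + S, job reallocation JC + JD, and excess (churning) reallocation as their difference, each scaled by average employment. The sketch below applies these textbook formulas to one hypothetical QWI cell; the paper's exact Section 3 formulae should be consulted for details.

    import pandas as pd

    # One hypothetical cell (state x NAICS sector x gender x age x quarter).
    qwi = pd.DataFrame({
        "B": [1000], "E": [1020],   # beginning/end-of-quarter employment
        "A": [180], "S": [160],     # accessions and separations
        "JC": [60], "JD": [40],     # job creations and destructions
    })

    emp = (qwi["B"] + qwi["E"]) / 2       # average employment in the quarter
    wr = (qwi["A"] + qwi["S"]) / emp      # worker reallocation rate
    jr = (qwi["JC"] + qwi["JD"]) / emp    # job reallocation rate
    er = wr - jr                          # excess ("churning") reallocation rate
    print(wr.iloc[0], jr.iloc[0], er.iloc[0])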
The fundamental data integration is performed by the Longitudinal Employer-Household Dynamics (LEHD) Program to create its infrastructure file system, which is the core database used to create the QWI. The data are provided by the LED Federal/State partnership, which currently has 48 member states, the District of Columbia, Puerto Rico and the Virgin Islands (as of October 2009). The basic linking record is the state Unemployment Insurance (UI) wage record, which records, for a particular individual and legal employer (state UI account), the total UI-covered earnings paid to that individual by the legal employer during the calendar quarter. The individual identifier used in this system (an encrypted form of the Social Security Number) permits longitudinal linking of the individual. The employer identifier used in this system, an employer's state-specific UI account number, is identical to the one used on the establishment-level summary collected by state labor market information offices, currently called the Quarterly Census of Employment and Wages and formerly known as the ES-202 report. However, subsequent edits in the BLS's Longitudinal Business Database (LDB) and within LEHD's data infrastructure may differ. The UI wage records and QCEW micro-data are provided to the Census Bureau with a two-quarter lag (the same reporting lag as at the BLS) as part of the LED partnership. Demographic data (age and gender) are integrated using the individual identifier linked to other Census Bureau demographic data, primarily a Census-enhanced version of the Social Security Number database (also with encrypted identifiers). Economic data (NAICS and geography) are integrated using the employer identifier linked to a Census-enhanced version of the QCEW data called the Employer Characteristics File. For state single-unit employers the linkage is exact. For employers with multiple work locations, the linkage is multiply imputed (Abowd et al., 2009).
QWI data are currently available for 47 states, but availability declines as one goes back in time. For the year 2000, only 38 states have provided the data. Fig. 1 shows QWI data availability expressed as a percentage of the QCEW month-one employment for all available states by quarter. In the earliest quarter we used (1993:Q1), about 30% of QCEW private employment has data in the QWI. By the end of our analysis period, 92% of private QCEW employment is represented.
Fig. 1. QWI data availability.
The QWI uses noise infusion as its confidentiality-protection mechanism (see Abowd et al. (2006, 2009) for a fuller description). Using internally computed state-wide totals is highly accurate. Suppressions are rare even at the NAICS sector × county × gender × age group level: far less than 1% of data items are suppressed because they do not meet the Census Bureau's data publication standards, and less than 0.1% of the workforce in the relevant states is subject to data suppression.
7 See Spletzer et al. (2004) and Boon et al. (2008).
2.2. Quarterly Census of Employment and Wages
The QCEW data are derived from employer reports for the UI systems of the states. The BLS publishes counts at the county, MSA, state, and national levels by industry at www.bls.gov/cew/. In this article, we focus on the state by NAICS sector tabulations for private employment. There are some suppressions at this level, but less than 1% of the values and less than 0.1% of employment are suppressed during the entire period used here. In this article, we use data from 1992:Q4 through 2008:Q4.8 The QCEW tabulations report monthly employment levels as of the payroll period covering the 12th calendar day of each month. We refer to the tabulation for the first month of the quarter as the QCEW month-1 employment. Similarly, we refer to the tabulation for the payroll period covering the 12th calendar day of the third month of the quarter as the QCEW month-3 employment.
2.3. Business Employment Dynamics
The BED data are released quarterly by the BLS at www.bls.gov/bdm/. The focus is on gross job creation and destruction at the establishment level. Gross job gains (creations) and losses (destructions) data are available at the state, (approximate) NAICS sector, and size class level. None of the two- or three-way interactions of those variables is published. Establishment-based counts are available at the NAICS sector level. There are no missing or suppressed values in the public-use data; however, some NAICS sectors have been collapsed: sectors 11–21 (natural resources and mining), 53–56 (professional and business services), and 71–72 (leisure and hospitality services) are thus not separately available. In this article, we use data on private employment from 1993:Q1 through 2008:Q4.9
2.4. Job Openings and Labor Turnover Survey
The JOLTS data are collected from a monthly establishment survey conducted by the BLS covering private non-farm establishments. The BLS publishes JOLTS data at aggregated NAICS sector and Census Region levels at www.bls.gov/jlt/. Data are collected for continuing establishments for total employment, job openings, hires, quits, layoffs, discharges, and other separations. The employment report is based on employees on payroll in the pay period that includes the 12th calendar day of the reference month. Hires in JOLTS correspond to accessions in QWI, namely, the total number of additions to the payroll at any time during the reference month. Total separations in JOLTS are the total number of terminations at any time during the reference month from quits, layoffs, discharges, and other reasons. We use monthly JOLTS data for private employment aggregated to calendar quarters from 2001:Q1 through 2009:Q2.10
3. Gross reallocation rate definitions
In this section we define all of the reallocation measures that we have created from the national QWI. In addition, we define the comparison measures that we have constructed from the QCEW and BED. For the QWI we use the following categories: for age groups, a = 0, 1, ..., 8 (all, 14–18, 19–21, 22–24, 25–34, 35–44, 45–54, 55–64, 65+); for gender groups, g = 0, 1, 2 (both, male, female); for industries, k = 11, 21, ..., 81 (19 NAICS sectors defined consistent with the Census Bureau's NAICS standard for the QWI and reported with customized aggregates for the BED by the BLS); for the states s = 01, ..., 56 (50 states, excluding DC, using FIPS codes); and, finally, the time index t = 1993:Q1, ..., 2008:Q4.11 Since gross job flows are defined only at an establishment (and not at an individual) level, we need to explicitly include the codes for the marginal categories in the definitions of the indices for demographic categories. We elaborate on this requirement below.
8 Data as provided by ftp://ftp.bls.gov/pub/special.requests/cew/ on 2009-10-09.
9 Data as provided by ftp://ftp.bls.gov/pub/time.series/bd/ on 2009-10-09.
10 Data as provided by ftp://ftp.bls.gov/pub/time.series/jt/ on 2009-10-15.
3.1. Gross worker flow measures
Gross worker flows are measured using the Worker Reallocation Rate $WRR_{agkst}$:
$$WRR_{agkst} = \frac{A_{agkst} + S_{agkst}}{\left(B_{agkst} + E_{agkst}\right)/2}$$
where
$A_{agkst}$ ≡ accessions (new hires plus recalls)
$S_{agkst}$ ≡ separations (quits, layoffs, other)
$B_{agkst}$ ≡ beginning-of-quarter employment
$E_{agkst}$ ≡ end-of-quarter employment.
WRR measures total accession and separation flows as a proportion of average employment over the quarter in the age, gender, industry and state category. The WRR, and all the reallocation measures used in this paper, are symmetric growth rates designed to approximate the logarithmic change over the time period (one quarter). In addition, the WRR, and all of the reallocation measures used in this paper, can be expressed as the sum of inflow and outflow components; the distinct accession and separation rates are defined, respectively, as:
$$AR_{agkst} = \frac{A_{agkst}}{\left(B_{agkst} + E_{agkst}\right)/2} \quad\text{and}\quad SR_{agkst} = \frac{S_{agkst}}{\left(B_{agkst} + E_{agkst}\right)/2}.$$
Details on the timing and construction of the basic QWI employment stock and flow measures used to define all of our reallocation rates can be found in Abowd et al. (2009). Methodological issues associated with inaccuracies in the individual identifiers are discussed in Abowd and Vilhuber (2005). Dynamic linking of UI and QCEW firms and establishments is discussed in Benedetto et al. (2007) for the QWI and the LEHD infrastructure files, and in Pivetz et al. (2001) for the LDB, which is the micro-data source for the BED and is derived from the QCEW. Essential details for the QWI are summarized here. When an individual/employer pair has a record in the UI wage record data, an indicator variable $m_{ijt} = 1$ is recorded for individual i at employer j in quarter t; otherwise, $m_{ijt} = 0$. Beginning-of-quarter employment is the count of all individuals working at a particular establishment for whom $m_{ij,t-1} = 1$ and $m_{ijt} = 1$; that is, the individual and employer had a UI wage record for the current quarter (t) and the previous quarter (t − 1).12
11 In the construction of the QWI, data from three successive quarters are required to compute all of the measures we use in this paper. Hence, we use data from 2008:Q4 but our reported measures stop in 2008:Q3 because 2009:Q1 was not available at the time we performed these calculations.
12 The QWI tabulations consider a wage record to be present in a given quarter if and only if at least $1.00 of covered UI wages is reported in that quarter.
Similarly, end-of-quarter employment is the count of all individuals working at a particular establishment for whom $m_{ijt} = 1$ and $m_{ij,t+1} = 1$; that is, the individual and employer had a UI wage record for the current quarter (t) and the next quarter (t + 1). An accession in quarter t occurs when $m_{ij,t-1} = 0$ and $m_{ijt} = 1$. A separation in quarter t occurs when $m_{ijt} = 1$ and $m_{ij,t+1} = 0$. Workplace characteristics are defined by the NAICS code and physical address of establishment j for the QCEW report at quarter t. Demographic characteristics are defined by the individual's gender and age as of the first day of the quarter. Accessions and separations satisfy the net job flow $JF_{agkst}$ identity
$$JF_{agkst} \equiv E_{agkst} - B_{agkst} = A_{agkst} - S_{agkst}.$$
3.2. Gross job flow measures
Gross job flows are measured in a similar fashion using the symmetric Job Reallocation Rate $JRR_{agkst}$:
$$JRR_{agkst} = \frac{JC_{agkst} + JD_{agkst}}{\left(B_{agkst} + E_{agkst}\right)/2}$$
where
$JC_{agkst}$ ≡ job creations
$JD_{agkst}$ ≡ job destructions.
JRR measures the total of job creations and destructions (called job creations/destructions in the QWI and gross job gains/losses in the BED) as a proportion of average employment over the quarter in the category. The gross job inflow and outflow rates, the Job Creation Rate (JCR) and Job Destruction Rate (JDR), can be defined as additive components of the JRR:
$$JCR_{agkst} = \frac{JC_{agkst}}{\left(B_{agkst} + E_{agkst}\right)/2} \quad\text{and}\quad JDR_{agkst} = \frac{JD_{agkst}}{\left(B_{agkst} + E_{agkst}\right)/2}.$$
Gross job flow measures are defined at an establishment, not job, level. Let $B_{agjt}$ be beginning-of-quarter employment for demographic group ag at establishment j in quarter t, and similarly let $E_{agjt}$ be end-of-quarter employment for the same category and time period. Then,
$$JC_{agjt} \equiv \max\left(E_{agjt} - B_{agjt},\, 0\right) \quad\text{and}\quad JD_{agjt} \equiv \max\left(B_{agjt} - E_{agjt},\, 0\right)$$
so that, as originally specified by Davis and Haltiwanger, job creations are the change in employment when employment is growing at the establishment and job destructions are the change in employment when employment is shrinking at the establishment. Net job flows also satisfy the identity $JF_{agkst} = JC_{agkst} - JD_{agkst}$.
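These definitions translate directly into code. The following is a minimal sketch of the flow construction, assuming a long-format pandas DataFrame named wages with one row per (person, employer, quarter) combination that has a qualifying UI wage record and integer-coded quarters; all names are illustrative, and the demographic-cell detail, QWI weighting and noise-infusion steps are omitted.

```python
import pandas as pd

def job_and_worker_flows(wages: pd.DataFrame) -> pd.DataFrame:
    """Sketch: B, E, A, S at the job level and JC, JD at the
    establishment level from presence indicators m_ijt."""
    # m_ijt = 1 whenever a (person, employer, quarter) wage record exists.
    m = wages.assign(m=1).set_index(["person", "employer", "quarter"])["m"]
    # Align m_{ij,t-1} and m_{ij,t+1} to quarter t by shifting the quarter level.
    m_prev = m.rename(lambda q: q + 1, level="quarter")
    m_next = m.rename(lambda q: q - 1, level="quarter")
    jobs = pd.concat({"m": m, "m_prev": m_prev, "m_next": m_next}, axis=1).fillna(0)
    jobs = jobs[jobs["m"] == 1].copy()        # flows are defined for active records
    jobs["B"] = jobs["m_prev"] == 1           # beginning-of-quarter employment
    jobs["E"] = jobs["m_next"] == 1           # end-of-quarter employment
    jobs["A"] = jobs["m_prev"] == 0           # accession in quarter t
    jobs["S"] = jobs["m_next"] == 0           # separation in quarter t
    # Establishment-level totals and Davis-Haltiwanger creations/destructions.
    est = jobs.groupby(["employer", "quarter"])[["B", "E", "A", "S"]].sum()
    est["JC"] = (est["E"] - est["B"]).clip(lower=0)
    est["JD"] = (est["B"] - est["E"]).clip(lower=0)
    # Symmetric rates: flows over average employment (B + E) / 2.
    avg_emp = (est["B"] + est["E"]) / 2
    est["WRR"] = (est["A"] + est["S"]) / avg_emp
    est["JRR"] = (est["JC"] + est["JD"]) / avg_emp
    return est
```

By construction, the net job flow identity JF ≡ E − B = A − S holds in this sketch at every aggregation level, while JC and JD are only meaningful at the establishment level, which is the point developed in Section 3.4 below.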
3.3. Excess flow measures
Finally, we define the excess reallocation measured using the symmetric Excess Reallocation Rate $ERR_{agkst}$:
$$ERR_{agkst} = WRR_{agkst} - JRR_{agkst}$$
which measures the difference between gross worker flow and gross job flow rates, sometimes called the labor market "churning" rate (Burgess et al., 2000). The ERR measures the rate of gross worker flow activity in each category in excess of the minimum rate required to account for the observed gross job reallocation.
Separate inflow and outflow excess reallocation rates can be defined using the components of the ERR, specifically the Excess Inflow Rate (EIR) and the Excess Outflow Rate (EOR):
$$EIR_{agkst} = AR_{agkst} - JCR_{agkst} \quad\text{and}\quad EOR_{agkst} = SR_{agkst} - JDR_{agkst}$$
where the additive and symmetric growth rate properties of the measure within categories continue to hold. Because of the net job flow identities, $EIR_{agkst} \equiv EOR_{agkst}$. We only report results for $EIR_{agkst}$, which, because of the identities, always equals $\frac{1}{2}ERR_{agkst}$; however, its statistical precision and other summaries differ from those of $ERR_{agkst}$ because the identities are only enforced exactly in the micro-data. Weighting, rounding and confidentiality protection procedures can cause the identities to hold only approximately in the public-use data (Abowd et al., 2006).
3.4. Aggregates and sub-aggregates
Great care must be taken when constructing aggregates related to integrated demographic and economic measures of gross labor market flows. As noted in Section 2, gross worker flows can be linearly aggregated both within and between establishments because the concept can be consistently defined for the individual, establishment and job. However, gross job flows among demographic categories do not aggregate within establishments because the concepts of job creation and destruction cannot be defined for the individual; these concepts only make sense when defined at the establishment level. An establishment may create one job for a man age 55 and destroy a job for a man age 35 without creating or destroying a job at the establishment level. Such creations and destructions within the establishment's demographic employment profile must be defined this way in the Quarterly Workforce Indicators in order to have any coherence associated with the age and gender categories in the gross job creation and destruction statistics. Similarly, the quarterly job and excess reallocation rates cannot be aggregated directly into annual measures because an establishment might create a job in Q2 and destroy a job in Q3, which would produce movement in the quarterly job and excess reallocation rates but would produce no movement in a Q1-to-Q1 annual rate defined across consecutive years. The fact that quarterly gross Job Reallocation Rates cannot be aggregated to annual gross Job Reallocation Rates is well known; exactly the same reasoning implies that at any level of aggregation, including national, state and industry, the demographic components of the Job Reallocation Rate and the excess reallocation rates will not aggregate linearly. All other components do aggregate in the natural manner.
The Worker Reallocation Rates aggregate naturally across all categories, including the demographic and temporal indices. The national Worker Reallocation Rate for a given demographic group is constructed directly from the appropriate aggregate components:
$$WRR_{agt} = \frac{A_{agt} + S_{agt}}{\left(B_{agt} + E_{agt}\right)/2} = \frac{1}{\left(B_{agt} + E_{agt}\right)/2} \sum_{k,s} \frac{B_{agkst} + E_{agkst}}{2}\, WRR_{agkst}$$
where the elimination of a subscript means that the variable was summed over that index.13
13 In contingency table analysis, replacing the subscript with a + sign indicates the operation of computing the marginal table with respect to the index so replaced. For time series analysis, as we do here, such notation is cumbersome because the time subscript legitimately uses the + operation. In the notation used here, the letters used in the subscript indicate the variables used to form the relevant marginal table.
We note that our definition of the national WRR equals the weighted average of the state and industry component growth rates. The national Job Reallocation Rate for a given demographic group, $JRR_{agt}$, is similarly defined as
$$JRR_{agt} = \frac{JC_{agt} + JD_{agt}}{\left(B_{agt} + E_{agt}\right)/2} = \frac{1}{\left(B_{agt} + E_{agt}\right)/2} \sum_{k,s} \frac{B_{agkst} + E_{agkst}}{2}\, JRR_{agkst}.$$
Finally, the national Excess Reallocation Rate is simply the difference between the worker and Job Reallocation Rates, $ERR_{agt} = WRR_{agt} - JRR_{agt}$. The national gross inflow and outflow rates have analogous definitions, not shown here. We note that national industry-specific gross worker, job, and excess reallocation rates can be defined analogously by aggregating over only the state index.
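As a sketch of this aggregation, assuming a pandas DataFrame cells with one row per (a, g, k, s, t) category and columns a, g, t, B, E and a rate column such as WRR (all names are illustrative), the employment-weighted average above can be computed as:

```python
import pandas as pd

def national_rate(cells: pd.DataFrame, rate: str = "WRR") -> pd.Series:
    """Aggregate category rates to national rates by demographic group and
    quarter, weighting each (k, s) cell by its average employment."""
    w = (cells["B"] + cells["E"]) / 2                       # cell weight (B + E) / 2
    keys = [cells["a"], cells["g"], cells["t"]]
    num = (w * cells[rate]).groupby(keys).sum()
    den = w.groupby(keys).sum()
    return num / den                                        # weighted average over k, s
```

Because the weights are the cells' average employment, this weighted average of rates equals the rate recomputed from the summed flows. Per Section 3.4, the same operation applied over the demographic or temporal indices would not be valid for the job and excess reallocation rates.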
3.5. QCEW and BED variables
We next define the gross job creation and destruction rates from the BLS sources. We use notation comparable to the notation used for the QWI-based measures to promote clarity in our comparisons. Variables from both the QCEW and BED must be used to create gross Job Reallocation Rates. The BLS computes gross worker and Excess Reallocation Rates by combining data from the QCEW/BED and JOLTS (Boon et al., 2008). Hence, the integration occurs at the aggregate category level, and not at the establishment level. Furthermore, the JOLTS data exclude establishment births and deaths, which results in a downward bias in the associated WRR and ERR (Davis et al., 2010). There are no demographic details in the BED data; however, to facilitate notational clarity we have carried along the age and gender indices, setting them at the appropriate values for overall comparisons. From the QCEW we use:
$QCEWB_{kst}$ = beginning-of-quarter employment (BED definition): month-3 employment from the previous quarter (t − 1) for all age and gender groups in industry k and state s
$QCEWE_{kst}$ = end-of-quarter employment (BED definition): month-3 employment from the current quarter (t) for comparable values of the indices.
From the BED we use:
$BEDJC_{kst}$ = job creations using the establishment change in employment from month 3 of quarter (t − 1) to month 3 of quarter t for all age and gender groups in industry k and state s
$BEDJD_{kst}$ = job destructions using the establishment change in employment from month 3 of quarter (t − 1) to month 3 of quarter t for all age and gender groups in industry k and state s.
Because the BED data are constructed from the BLS's longitudinally integrated QCEW (the LDB), the employment totals for the beginning and ending quarter employment defined above are fully consistent with the BED definitions of gross job creations and destructions. We define the gross Job Reallocation Rate from the BED, BEDJRR:
$$BEDJRR_{kst} = \frac{BEDJC_{kst} + BEDJD_{kst}}{\left(QCEWB_{kst} + QCEWE_{kst}\right)/2}.$$
The BLS does not publish $BEDJC_{kst}$ and $BEDJD_{kst}$ in a fully saturated form. From the published data, we are able to construct the national rate $BEDJRR_{t}$, the state rates $BEDJRR_{st}$ and an aggregated subset of the national NAICS sector rates $BEDJRR_{kt}$. We also constructed separate gross job creation and job destruction rates on these same bases.
3.5.1. JOLTS variables
We use only the national JOLTS data. The JOLTS-based worker reallocation rates are defined using:
$JOLTSA_{t}$ = total new hires nationally summed across all three months of quarter t
$JOLTSS_{t}$ = total separations nationally summed across all three months of quarter t.
For compatibility with our comparisons to QWI- and QCEW-based reallocation rates, we define the JOLTS-based Worker Reallocation Rate, JOLTSWRR:
$$JOLTSWRR_{t} = \frac{JOLTSA_{t} + JOLTSS_{t}}{\left(QCEWB_{t} + QCEWE_{t}\right)/2}.$$
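As an illustration, a minimal sketch of these two public-use rates, assuming quarterly pandas Series indexed by (k, s, t) for the BED/QCEW inputs and by t for the national JOLTS and QCEW aggregates (all variable names are illustrative):

```python
import pandas as pd

def bed_jrr(bed_jc: pd.Series, bed_jd: pd.Series,
            qcew_b: pd.Series, qcew_e: pd.Series) -> pd.Series:
    """BED-based Job Reallocation Rate: gross job gains plus losses over
    average employment, where beginning (ending) employment is QCEW
    month-3 employment of the previous (current) quarter."""
    return (bed_jc + bed_jd) / ((qcew_b + qcew_e) / 2)

def jolts_wrr(jolts_a: pd.Series, jolts_s: pd.Series,
              qcew_b: pd.Series, qcew_e: pd.Series) -> pd.Series:
    """JOLTS-based national Worker Reallocation Rate: monthly hires and
    separations summed to quarters, over QCEW average employment."""
    return (jolts_a + jolts_s) / ((qcew_b + qcew_e) / 2)
```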
4. Statistical methodology
Two distinct missing data problems must be addressed in order to form potentially reliable national gross worker and job flow measures on a consistent basis. First, a very small amount of the QWI data has been suppressed as a consequence of the publication quality rules applied by the Census Bureau. Reliable imputation of these suppressed items using the confidential micro-data is part of the Bureau's quality improvement plan for the QWI; however, we rely on the published data alone for this paper. Although tedious to implement, we use a reliable monotone missing data imputation system to complete the missing within-state QWI data. Second, some states still do not participate in the Local Employment Dynamics Federal/State Partnership that is the source of the micro-data from which the QWI are calculated, or have not provided sufficient historical data to create the QWI back to 1993:Q1. As we documented in Section 2, in the early years there is a substantial amount of missing QWI data because different states were able to recover historical data for different time periods back to the early 1990s. From 1999:Q1 forward, the state-specific missing data represent a small proportion of the working population; hence, we expect that missing data will not matter much during the 2000s. Even though the primary reasons for missing state data are nonparticipation in a voluntary data-provision program and nonexistence of historical data in the states' archives, the nonparticipating states' missing data are probably not missing completely at random. We develop a model below that assumes the missing data from nonparticipating states and suppressed items from participating states are ignorable given the published data on those states from the QCEW and the published QWI data (Rubin, 1987). Since the QCEW data are present for essentially all states, NAICS sectors and time periods, they provide the common conditioning information needed to implement this procedure.
4.1. States with published QWI data
QWI data items are only published when the individual item meets a publication quality standard that depends upon the employment levels and the number of establishments used to compute the indicator. At the state × NAICS sector × gender × age group publication level, a few data items, less than one-tenth of one percent, are not published. We imputed these unpublished numbers using a Multinomial-Dirichlet Posterior Predictive Distribution (PPD) that implements a Bayesian bootstrap procedure (Rubin, 1981). The procedure is identical to the one described below for states that do not have published QWI data. Imputation of missing data arising from the suppressions in the QWI is done before the imputation for states and periods where no QWI data were published. This implements the monotone missing data pattern.
4.2. States lacking published QWI data
Because some states are missing early years of QWI data, and because the Census Bureau does not yet publish QWI data for some states, there are missing QWI data that must be imputed. The amount of missing data varies from 70% of employment in 1993 to 8% in 2008. The number of states available to serve as complete-data donors for the imputation also varies depending upon the time period considered. There is a trade-off between the number of states with complete data and the start date used for the missing data imputation model. If we use the entire period from 1993:Q1 through 2008:Q4, only 10 states have complete data. However, from 1999:Q1 forward, 37 states have complete data. In order to take advantage of the relative completeness of recent QWI data, we adopt an overlapping sub-sample strategy for the missing data model. Sub-sample I runs from 1993:Q1 to 2001:Q4 and sub-sample II runs from 1999:Q1 to 2008:Q4. Separate missing data models were developed for each sub-sample using the multiple imputation system described below.
4.2.1. Sub-sample imputation models
For each sub-sample, we multiply impute the missing QWI data by sampling 100 implicates from a Multinomial-Dirichlet PPD that assumes the missing data are ignorable (Rubin, 1987) in the rates (not levels) given NAICS sector × gender × age group × time period. We use the same model for the trivial amount of missing data discussed in Section 4.1. For each sub-sample, we prepare the complete-data observations. There are 10 states with complete data in sub-sample I and 37 states with complete QWI data in sub-sample II.14 For these states, let
$$y_{agskt} = \left(B_{agskt}, E_{agskt}, A_{agskt}, S_{agskt}, JC_{agskt}, JD_{agskt}\right)$$
and $\bar e_{skt} = \left(QCEWB_{skt} + QCEWE_{skt}\right)/2$. The rates, which we assume to be homogeneous over (agkt) for all states, are defined as a vector $r_{agskt} = y_{agskt}/\bar e_{skt}$.
Next, we prepare the incomplete-data states. There are 40 and 13 states with incomplete data in sub-samples I and II, respectively. For those states $r_{agskt}$ is only defined for periods t when the Census Bureau published the QWI data for state s; however, $\bar e_{skt}$ is always available from the QCEW data. To form the Multinomial-Dirichlet PPD for each sub-sample m, assume that the prior probability for each complete-data state is equal, $\pi \sim Dir(\alpha)$ with $\alpha = [1, \ldots, 1]$. Then, for each incomplete-data state, sample M implicates from $Mult(1, \pi)$, where M = 100. For a given incomplete-data state s, denote the list of Bayesian bootstrap complete-data donor states by $BB_s = [bb_{s1}, \ldots, bb_{sM}]$. Form M completed data samples for state s using the algorithm
$$\text{if } r_{agskt} \text{ is missing, then } y^{(\ell,m)}_{agskt} = r_{ag(bb_{s\ell})kt} \times \bar e_{skt}; \text{ else } y^{(\ell,m)}_{agskt} = r_{agskt} \times \bar e_{skt},$$
for m = I, II. This procedure implements a Bayes bootstrap where the candidate records are the complete-data states' rate vectors and the completed data consist of either the published data for state s or imputed data from donor state $bb_{s\ell}$, for implicate $\ell = 1, \ldots, M$ in each sub-sample I, II.
4.2.2. Combining sub-samples
Once the implicates were created for each sub-sample, we combined the two sub-samples using the following weighting system, which smooths out any seams in the imputation process over the three-year period in which the sub-samples overlap:
$$w_t = \begin{cases} 0 & \text{for } t = 1993{:}Q1 \text{ to } 1998{:}Q4 \\ w_{t-1} + \frac{1}{13} & \text{for } t = 1999{:}Q1 \text{ to } 2001{:}Q4 \\ 1 & \text{for } t = 2002{:}Q1 \text{ to } 2008{:}Q4. \end{cases}$$
Then, $y^{(\ell)}_{agskt} = (1 - w_t)\, y^{(\ell,I)}_{agskt} + w_t\, y^{(\ell,II)}_{agskt}$. The implicates $y^{(\ell,m)}_{agskt}$ are constructed in this fashion because the primary differences in data quality between the 1990s and the 2000s, as reflected in Fig. 1, are due to the inclusion of 30 states beyond the original 18 included in the 2003 initial release of the QWI. Originally, large states were over-represented. States that joined in the mid-2000s were predominantly small and medium-sized states. The correlation of state population with the probability of inclusion in the early QWI years cannot be directly controlled, but its influence is reduced by our assumptions of ignorability in the rates and by the division of the missing data modeling into two sub-samples. Essentially, the over-representation of large states in the published QWI data can only influence our national estimates during the period up to 2001.
14 We used the QWI vintage available on the Cornell VirtualRDC on October 15, 2009. The District of Columbia was eliminated from the universe for QCEW private employment data because QWI data are not yet available for DC, which has an unusual industry structure that is primarily federal employment.
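A minimal sketch of the donor-sampling step for one incomplete-data state and one sub-sample, assuming rates is a dict mapping state codes to (agkt)-indexed DataFrames of rate vectors with NaN where unpublished, and ebar maps state codes to QCEW average-employment Series aligned to the same row index (all names and layouts are illustrative):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed chosen only for reproducibility of the sketch

def impute_state(rates: dict, ebar: dict, donors: list, target: str,
                 M: int = 100) -> list:
    """Multinomial-Dirichlet PPD imputation for one incomplete-data state."""
    # pi ~ Dir(1, ..., 1) over the complete-data donor states, then M draws
    # from Mult(1, pi): together, a Bayesian bootstrap (Rubin, 1981).
    pi = rng.dirichlet(np.ones(len(donors)))
    bb = rng.choice(donors, size=M, replace=True, p=pi)
    implicates = []
    for donor in bb:
        # Use the target state's own rates where published, the donor's
        # where missing; scale back to levels with the target state's ebar.
        r = rates[target].fillna(rates[donor])
        implicates.append(r.mul(ebar[target], axis=0))
    return implicates
```

Completed levels from the two sub-samples would then be blended implicate-by-implicate with the seam weights w_t defined above.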
4.3. Summary statistics
Once the QWI data have been completed, summary statistics are formed in the usual manner. The worker, job, and excess reallocation rates are aggregated to the national level using the formulae in Section 3. Measures of variability and effective missing data rates are computed using the usual Rubin (1987) formulae. We illustrate the calculations here for $WRR_{agkt}$. All other rates are handled in a comparable way. The national QWI Worker Reallocation Rate is the average over all states of the implicates:
$$WRR_{agkt} = \frac{1}{M} \sum_{\ell=1}^{M} \sum_{\forall s} \frac{\left(B^{(\ell)}_{agskt} + E^{(\ell)}_{agskt}\right)/2}{\sum_{\forall v} \left(B^{(\ell)}_{agvkt} + E^{(\ell)}_{agvkt}\right)/2}\, WRR^{(\ell)}_{agskt}$$
where $WRR^{(\ell)}_{agskt}$, $B^{(\ell)}_{agskt}$, and $E^{(\ell)}_{agskt}$ are the worker reallocation rate, beginning employment and ending employment, respectively, calculated for implicate $\ell$ in state s. The within-implicate variability is measured by the deviation of $WRR^{(\ell)}_{agskt}$ from the within-implicate mean
$$\overline{WRR}^{(\ell)}_{agkt} = \sum_{\forall s} \frac{\left(B^{(\ell)}_{agskt} + E^{(\ell)}_{agskt}\right)/2}{\sum_{\forall v} \left(B^{(\ell)}_{agvkt} + E^{(\ell)}_{agvkt}\right)/2}\, WRR^{(\ell)}_{agskt}.$$
Hence,
$$V^{(\ell)}\!\left(WRR_{agkt}\right) = \frac{1}{49} \sum_{\forall s} \frac{\left(B^{(\ell)}_{agskt} + E^{(\ell)}_{agskt}\right)/2}{\sum_{\forall v} \left(B^{(\ell)}_{agvkt} + E^{(\ell)}_{agvkt}\right)/2} \left(WRR^{(\ell)}_{agskt} - \overline{WRR}^{(\ell)}_{agkt}\right)^2$$
where 49 = number of states − 1. We note that the within-implicate variance of our indicators should, in principle, be calculated from the underlying (confidential) micro-data, but this calculation is not feasible with the public-use data used in this paper. Our within-implicate variance estimator $V^{(\ell)}\!\left(WRR_{agkt}\right)$ is
Table 1. National worker and Job Reallocation Rates, by age and gender.
Category | WRR Overall | WRR Q1 | WRR Q2 | WRR Q3 | WRR Q4 | JRR Overall | JRR Q1 | JRR Q2 | JRR Q3 | JRR Q4
National | 0.490 (0.018) [0.302] | 0.430 (0.017) [0.321] | 0.502 (0.019) [0.288] | 0.526 (0.020) [0.288] | 0.501 (0.017) [0.313] | 0.130 (0.007) [0.349] | 0.125 (0.006) [0.368] | 0.136 (0.007) [0.330] | 0.126 (0.006) [0.357] | 0.135 (0.006) [0.342]
Male | 0.496 (0.019) [0.299] | 0.437 (0.018) [0.314] | 0.513 (0.020) [0.288] | 0.528 (0.021) [0.286] | 0.505 (0.018) [0.307] | 0.143 (0.007) [0.343] | 0.135 (0.006) [0.358] | 0.151 (0.008) [0.324] | 0.138 (0.007) [0.355] | 0.148 (0.007) [0.334]
Female | 0.482 (0.018) [0.313] | 0.422 (0.017) [0.332] | 0.489 (0.019) [0.296] | 0.523 (0.020) [0.296] | 0.496 (0.017) [0.328] | 0.138 (0.007) [0.360] | 0.134 (0.007) [0.378] | 0.141 (0.007) [0.347] | 0.136 (0.007) [0.360] | 0.141 (0.006) [0.356]
Ages 14–18 | 1.204 (0.048) [0.334] | 0.916 (0.038) [0.353] | 1.268 (0.040) [0.324] | 1.511 (0.077) [0.298] | 1.114 (0.037) [0.364] | 0.380 (0.013) [0.363] | 0.308 (0.011) [0.380] | 0.456 (0.018) [0.352] | 0.404 (0.014) [0.362] | 0.349 (0.011) [0.359]
Ages 19–21 | 1.035 (0.030) [0.328] | 0.886 (0.027) [0.321] | 1.081 (0.032) [0.314] | 1.171 (0.034) [0.331] | 1.000 (0.028) [0.346] | 0.324 (0.011) [0.365] | 0.290 (0.008) [0.374] | 0.360 (0.015) [0.357] | 0.343 (0.012) [0.359] | 0.300 (0.009) [0.371]
Ages 22–24 | 0.766 (0.024) [0.315] | 0.671 (0.022) [0.324] | 0.787 (0.025) [0.304] | 0.836 (0.026) [0.306] | 0.768 (0.023) [0.327] | 0.264 (0.009) [0.366] | 0.248 (0.008) [0.369] | 0.277 (0.010) [0.364] | 0.271 (0.009) [0.369] | 0.259 (0.008) [0.360]
Ages 25–34 | 0.510 (0.019) [0.306] | 0.460 (0.018) [0.323] | 0.519 (0.019) [0.291] | 0.534 (0.020) [0.293] | 0.529 (0.018) [0.316] | 0.165 (0.007) [0.353] | 0.161 (0.007) [0.366] | 0.168 (0.007) [0.341] | 0.164 (0.007) [0.358] | 0.170 (0.007) [0.348]
Ages 35–44 | 0.375 (0.016) [0.308] | 0.342 (0.015) [0.330] | 0.378 (0.016) [0.294] | 0.384 (0.017) [0.291] | 0.395 (0.016) [0.318] | 0.139 (0.007) [0.355] | 0.137 (0.007) [0.369] | 0.140 (0.007) [0.338] | 0.137 (0.007) [0.363] | 0.144 (0.006) [0.350]
Ages 45–54 | 0.303 (0.015) [0.314] | 0.279 (0.014) [0.336] | 0.305 (0.015) [0.305] | 0.305 (0.016) [0.292] | 0.324 (0.014) [0.324] | 0.129 (0.007) [0.364] | 0.127 (0.007) [0.376] | 0.130 (0.007) [0.348] | 0.125 (0.007) [0.372] | 0.133 (0.007) [0.359]
Ages 55–64 | 0.283 (0.015) [0.305] | 0.263 (0.013) [0.335] | 0.286 (0.015) [0.294] | 0.276 (0.015) [0.278] | 0.310 (0.015) [0.313] | 0.138 (0.007) [0.365] | 0.136 (0.007) [0.385] | 0.142 (0.007) [0.346] | 0.131 (0.007) [0.369] | 0.144 (0.007) [0.357]
Ages 65–99 | 0.418 (0.027) [0.271] | 0.373 (0.020) [0.301] | 0.431 (0.024) [0.266] | 0.401 (0.025) [0.252] | 0.473 (0.039) [0.263] | 0.196 (0.008) [0.337] | 0.185 (0.008) [0.361] | 0.208 (0.009) [0.317] | 0.182 (0.008) [0.342] | 0.210 (0.009) [0.327]
Note: Computed for QWI data covering 1993:Q1–2008:Q3. Each cell reports averages, standard errors (in parentheses), and effective missing data ratios (in square brackets). For computations, see text.
almost certainly an over-estimate because the state × NAICS × gender × age group × time period reallocation rates have been computed from the population of integrated data. They are subject to edit and imputation variability at the micro-data level, which can result in some variability in the published rates that is due to estimation, but not sampling, variability. Although most of the variability in the published rates at the aggregation levels we are using is probably real, it is more conservative to use this estimator for the within-implicate variability than to use zero. The between-implicate variability is estimated by
$$B\!\left(WRR_{agkt}\right) = \frac{1}{M-1} \sum_{\ell=1}^{M} \left(\overline{WRR}^{(\ell)}_{agkt} - WRR_{agkt}\right)^2.$$
The total variance is estimated by
$$T\!\left(WRR_{agkt}\right) = \frac{1}{M} \sum_{\ell=1}^{M} V^{(\ell)}\!\left(WRR_{agkt}\right) + \frac{M+1}{M}\, B\!\left(WRR_{agkt}\right).$$
The effective missing data rate is estimated by the ratio of the between-implicate variance to the total variance,
$$MR\!\left(WRR_{agkt}\right) = \frac{B\!\left(WRR_{agkt}\right)}{T\!\left(WRR_{agkt}\right)},$$
which equals zero if there are no missing data and one if all the variability is due to missing data. The degrees of freedom for forming confidence intervals around the reallocation rates are estimated by
$$df\!\left(WRR_{agkt}\right) = (M-1)\left[1 + \frac{1}{M+1}\, \frac{\frac{1}{M}\sum_{\ell=1}^{M} V^{(\ell)}\!\left(WRR_{agkt}\right)}{B\!\left(WRR_{agkt}\right)}\right]^2.$$
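A minimal sketch of these combining rules, assuming an (M, S) NumPy array of state-level rates per implicate and an (M, S) array of employment-share weights that sum to one within each implicate (layout is illustrative):

```python
import numpy as np

def combine_implicates(wrr: np.ndarray, weights: np.ndarray):
    """Rubin (1987)-style combining of M implicates of a reallocation rate."""
    M, S = wrr.shape
    within_mean = (weights * wrr).sum(axis=1)            # per-implicate weighted mean
    point = within_mean.mean()                           # overall point estimate
    # Weighted within-implicate variance with the 1/(S - 1) correction
    # (49 when there are 50 states).
    V = (weights * (wrr - within_mean[:, None]) ** 2).sum(axis=1) / (S - 1)
    B = ((within_mean - point) ** 2).sum() / (M - 1)     # between-implicate variance
    T = V.mean() + (M + 1) / M * B                       # total variance
    missing_rate = B / T                                 # effective missing data rate
    df = (M - 1) * (1 + V.mean() / ((M + 1) * B)) ** 2   # degrees of freedom
    return point, T, missing_rate, df
```

The returned missing_rate is the quantity reported in square brackets in Tables 1–8, and the square root of T gives the standard errors in parentheses.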
Table 2. National worker and Job Reallocation Rates, by NAICS sector.
NAICS sector | WRR Overall | WRR Q1 | WRR Q2 | WRR Q3 | WRR Q4 | JRR Overall | JRR Q1 | JRR Q2 | JRR Q3 | JRR Q4
Agriculture, Forestry, Fishing and Hunting | 1.273 (0.070) [0.155] | 0.867 (0.046) [0.150] | 1.427 (0.075) [0.149] | 1.453 (0.087) [0.149] | 1.348 (0.071) [0.175] | 0.345 (0.015) [0.167] | 0.243 (0.008) [0.161] | 0.415 (0.019) [0.171] | 0.304 (0.011) [0.157] | 0.424 (0.020) [0.178]
Mining, Quarrying, and Oil and Gas Extraction | 0.329 (0.020) [0.305] | 0.297 (0.018) [0.302] | 0.348 (0.021) [0.309] | 0.342 (0.021) [0.293] | 0.330 (0.020) [0.317] | 0.111 (0.009) [0.313] | 0.106 (0.008) [0.267] | 0.122 (0.011) [0.344] | 0.102 (0.007) [0.289] | 0.115 (0.009) [0.355]
Utilities | 0.125 (0.016) [0.330] | 0.131 (0.017) [0.301] | 0.123 (0.013) [0.333] | 0.124 (0.016) [0.314] | 0.122 (0.017) [0.374] | 0.057 (0.012) [0.305] | 0.058 (0.012) [0.298] | 0.056 (0.009) [0.313] | 0.059 (0.013) [0.281] | 0.055 (0.011) [0.330]
Construction | 0.693 (0.021) [0.338] | 0.601 (0.020) [0.307] | 0.733 (0.020) [0.348] | 0.730 (0.022) [0.351] | 0.709 (0.020) [0.345] | 0.211 (0.007) [0.318] | 0.193 (0.005) [0.300] | 0.233 (0.009) [0.322] | 0.197 (0.006) [0.328] | 0.223 (0.008) [0.323]
Manufacturing | 0.260 (0.011) [0.416] | 0.238 (0.011) [0.424] | 0.263 (0.010) [0.433] | 0.283 (0.012) [0.355] | 0.258 (0.010) [0.452] | 0.087 (0.005) [0.424] | 0.086 (0.005) [0.429] | 0.087 (0.005) [0.419] | 0.087 (0.005) [0.403] | 0.088 (0.005) [0.447]
Wholesale Trade | 0.308 (0.010) [0.330] | 0.284 (0.010) [0.333] | 0.313 (0.010) [0.333] | 0.322 (0.010) [0.317] | 0.314 (0.010) [0.335] | 0.107 (0.004) [0.337] | 0.107 (0.004) [0.344] | 0.108 (0.004) [0.325] | 0.105 (0.004) [0.337] | 0.107 (0.004) [0.342]
Retail Trade | 0.511 (0.013) [0.341] | 0.434 (0.013) [0.356] | 0.505 (0.013) [0.322] | 0.536 (0.014) [0.328] | 0.573 (0.012) [0.360] | 0.128 (0.005) [0.326] | 0.137 (0.006) [0.348] | 0.118 (0.005) [0.310] | 0.117 (0.006) [0.330] | 0.143 (0.005) [0.314]
Transportation and Warehousing | 0.408 (0.014) [0.311] | 0.363 (0.014) [0.320] | 0.408 (0.014) [0.307] | 0.422 (0.014) [0.311] | 0.444 (0.014) [0.307] | 0.112 (0.006) [0.310] | 0.111 (0.007) [0.328] | 0.114 (0.007) [0.289] | 0.107 (0.006) [0.322] | 0.118 (0.006) [0.301]
Information | 0.357 (0.020) [0.287] | 0.345 (0.021) [0.280] | 0.351 (0.020) [0.293] | 0.364 (0.020) [0.293] | 0.367 (0.020) [0.281] | 0.104 (0.007) [0.330] | 0.109 (0.007) [0.336] | 0.099 (0.006) [0.328] | 0.104 (0.007) [0.332] | 0.105 (0.006) [0.323]
Finance and Insurance | 0.245 (0.011) [0.392] | 0.234 (0.011) [0.396] | 0.239 (0.010) [0.390] | 0.254 (0.011) [0.387] | 0.252 (0.011) [0.396] | 0.092 (0.006) [0.396] | 0.094 (0.007) [0.405] | 0.091 (0.006) [0.391] | 0.091 (0.006) [0.389] | 0.094 (0.006) [0.401]
Real Estate and Rental and Leasing | 0.467 (0.014) [0.305] | 0.417 (0.013) [0.301] | 0.478 (0.014) [0.296] | 0.503 (0.014) [0.323] | 0.468 (0.013) [0.301] | 0.142 (0.005) [0.308] | 0.137 (0.004) [0.304] | 0.150 (0.005) [0.304] | 0.139 (0.005) [0.322] | 0.144 (0.005) [0.302]
Professional, Scientific, and Technical Services | 0.404 (0.012) [0.347] | 0.386 (0.013) [0.346] | 0.408 (0.012) [0.338] | 0.403 (0.013) [0.348] | 0.420 (0.012) [0.357] | 0.142 (0.005) [0.367] | 0.147 (0.005) [0.367] | 0.153 (0.005) [0.362] | 0.128 (0.005) [0.375] | 0.138 (0.005) [0.363]
Management of Companies and Enterprises | 0.275 (0.015) [0.445] | 0.249 (0.016) [0.456] | 0.271 (0.015) [0.428] | 0.295 (0.016) [0.448] | 0.286 (0.014) [0.450] | 0.076 (0.007) [0.410] | 0.078 (0.008) [0.405] | 0.073 (0.007) [0.398] | 0.075 (0.008) [0.424] | 0.077 (0.006) [0.412]
Administrative, Support, Waste Management, Remediation | 1.071 (0.025) [0.293] | 0.992 (0.023) [0.283] | 1.088 (0.026) [0.280] | 1.108 (0.026) [0.303] | 1.099 (0.024) [0.308] | 0.180 (0.006) [0.316] | 0.181 (0.007) [0.303] | 0.183 (0.007) [0.319] | 0.167 (0.006) [0.327] | 0.191 (0.006) [0.315]
Educational Services | 0.447 (0.019) [0.487] | 0.323 (0.017) [0.508] | 0.488 (0.019) [0.487] | 0.546 (0.019) [0.458] | 0.431 (0.020) [0.496] | 0.129 (0.007) [0.450] | 0.099 (0.007) [0.486] | 0.157 (0.007) [0.434] | 0.135 (0.008) [0.428] | 0.124 (0.007) [0.450]
Health Care and Social Assistance | 0.341 (0.012) [0.386] | 0.316 (0.012) [0.400] | 0.337 (0.013) [0.367] | 0.371 (0.013) [0.384] | 0.339 (0.012) [0.395] | 0.093 (0.006) [0.408] | 0.094 (0.007) [0.424] | 0.091 (0.006) [0.391] | 0.095 (0.006) [0.403] | 0.092 (0.006) [0.413]
Arts, Entertainment, and Recreation | 0.744 (0.025) [0.305] | 0.564 (0.015) [0.290] | 0.855 (0.029) [0.297] | 0.850 (0.034) [0.319] | 0.704 (0.021) [0.316] | 0.259 (0.015) [0.312] | 0.159 (0.007) [0.291] | 0.369 (0.023) [0.313] | 0.247 (0.013) [0.329] | 0.263 (0.015) [0.314]
Accommodation and Food Services | 0.805 (0.028) [0.286] | 0.698 (0.026) [0.293] | 0.849 (0.029) [0.276] | 0.888 (0.029) [0.283] | 0.782 (0.025) [0.292] | 0.162 (0.006) [0.323] | 0.145 (0.005) [0.304] | 0.177 (0.009) [0.341] | 0.159 (0.006) [0.307] | 0.166 (0.006) [0.340]
Other Services (except Public Administration) | 0.509 (0.014) [0.314] | 0.461 (0.015) [0.328] | 0.522 (0.014) [0.294] | 0.553 (0.014) [0.317] | 0.501 (0.014) [0.317] | 0.165 (0.005) [0.347] | 0.159 (0.005) [0.374] | 0.171 (0.005) [0.323] | 0.167 (0.005) [0.340] | 0.162 (0.005) [0.349]
Note: Computed for QWI data covering 1993:Q1–2008:Q3. Each cell reports averages, standard errors (in parentheses), and effective missing data ratios (in square brackets). For computations, see text.
5. National Quarterly Workforce Indicators
Table 1 summarizes the national worker and job reallocation rates by age and gender. Over our analysis period, the average Worker Reallocation Rate is 49.0%; the average Job Reallocation Rate is 13.0%; and the Excess Reallocation Rate (churning) is the difference between the two, 35.9%. For the national estimates, our standard errors indicate that the first two significant digits are essentially unaffected by estimation uncertainty, especially considering the conservative nature of our within-implicate variance estimator. About 30.2% and 34.9% of the estimation uncertainty is due to the incompleteness of the QWI data for the worker and job reallocation rates, respectively. Men and women have essentially identical worker, job and excess reallocation rates. As expected,
younger workers experience massively more worker reallocation, and somewhat more job reallocation, leading to much more churning among younger workers. The data also display distinct seasonal patterns, which, as we note later, dominate both the trend and cyclical components of the series. First quarter worker reallocation is substantially lower than the annual average, while third quarter WRR is substantially above average. Over the course of the year, peak-to-trough seasonal variation in WRR is ten percentage points, 25% of the average rate. By contrast, seasonal variation in the Job Reallocation Rate is only a single percentage point peak to trough, less than 10% of the average rate. Men and women have essentially the same seasonal variation. Younger workers show a seasonal variation in both the WRR and JRR that is about 40% of the average variation, peak to trough. Prime-age WRR and JRR show much less seasonal variation, less than 7% of the average rate. Table 2 summarizes the national worker and job reallocation rates by NAICS sector. Once again, the standard errors indicate that the first two significant digits of the reallocation rates are essentially unaffected by estimation error. Industry sectors like Agriculture, Forestry, Fishing, and Hunting (11); Construction (23); Administrative, Support, Waste Management, and Remediation (56); Arts, Entertainment, and Recreation (71); and Accommodation and Food Services (72) display substantially greater than average worker reallocation rates and, for the most part, also display higher job reallocation rates. Utilities (22); Manufacturing (31–33); Finance and Insurance (52); and Management of Companies and Enterprises (55) display substantially lower worker and Job Reallocation Rates than average. It is noteworthy that for most industry sectors the effective missing data rate is comparable to the national average; however, Manufacturing (31–33), Management of Companies and Enterprises (55) and Educational Services (61) all display substantially greater than average effective missing data rates without affecting the reliability of the estimated reallocation rates. The excess inflow rates and excess reallocation (churning) rates are summarized in Tables 3 and 4 by age and gender, and NAICS sector, respectively. Since these statistics can be calculated by subtraction from the results in Tables 1 and 2, the main purpose of these tables is to display the summaries for churning rates directly, and to provide estimation error and effective missing data rates for these statistics. Estimation error, again, does not
Fig. 2. Comparison of QWI and JOLTS worker reallocation rates.
materially affect the second significant digit. Effective missing data rates are comparable to the basic rates. Tables 5 and 6 present the national accession and separation rates by age and gender, and by NAICS sector, respectively. Again, estimation uncertainty does not materially affect the second significant digit. Effective missing data rates are comparable to the Worker Reallocation Rates. Tables 7 and 8 report the same statistics for the components of the job reallocation index, the job creation and destruction rates, again by age and gender, and by NAICS sector, respectively. Our main results are also summarized graphically in Figs. 2 and 3. Fig. 2 compares the QWI Worker Reallocation Rate with the WRR estimated from the JOLTS data. The QWI WRR is plotted with its two standard error confidence band. As anticipated, the QWI WRR is greater than the JOLTS estimate; it turns out to be almost twice as large. This is probably due to the two problems noted by Davis et al. (2010), namely, the absence of establishment births and deaths from the JOLTS frame, and an underreporting bias that they document for establishments that are experiencing large contractions. The figure also indicates that the private employer worker reallocation rate estimated from the QWI has more seasonal variability (all results are seasonally
Table 3. National excess inflow and reallocation rates, by age and gender.
Category | EIR Overall | EIR Q1 | EIR Q2 | EIR Q3 | EIR Q4 | ERR Overall | ERR Q1 | ERR Q2 | ERR Q3 | ERR Q4
National | 0.180 (0.007) [0.287] | 0.153 (0.007) [0.301] | 0.183 (0.008) [0.273] | 0.200 (0.008) [0.268] | 0.183 (0.007) [0.307] | 0.359 (0.015) [0.287] | 0.305 (0.013) [0.301] | 0.367 (0.015) [0.273] | 0.400 (0.016) [0.268] | 0.366 (0.014) [0.307]
Male | 0.176 (0.008) [0.284] | 0.151 (0.007) [0.297] | 0.181 (0.008) [0.272] | 0.195 (0.009) [0.266] | 0.179 (0.007) [0.301] | 0.353 (0.015) [0.284] | 0.302 (0.014) [0.297] | 0.362 (0.016) [0.272] | 0.390 (0.017) [0.266] | 0.357 (0.015) [0.300]
Female | 0.172 (0.007) [0.295] | 0.144 (0.006) [0.311] | 0.174 (0.007) [0.279] | 0.193 (0.008) [0.273] | 0.177 (0.007) [0.321] | 0.344 (0.014) [0.295] | 0.287 (0.013) [0.311] | 0.348 (0.015) [0.279] | 0.387 (0.016) [0.273] | 0.355 (0.014) [0.321]
Ages 14–18 | 0.412 (0.021) [0.328] | 0.304 (0.015) [0.344] | 0.406 (0.017) [0.320] | 0.554 (0.036) [0.289] | 0.383 (0.015) [0.361] | 0.824 (0.042) [0.328] | 0.608 (0.031) [0.344] | 0.812 (0.034) [0.319] | 1.107 (0.072) [0.289] | 0.766 (0.031) [0.361]
Ages 19–21 | 0.356 (0.012) [0.319] | 0.298 (0.011) [0.312] | 0.361 (0.013) [0.298] | 0.414 (0.014) [0.326] | 0.350 (0.012) [0.343] | 0.711 (0.025) [0.319] | 0.596 (0.022) [0.312] | 0.721 (0.026) [0.298] | 0.827 (0.028) [0.325] | 0.700 (0.024) [0.342]
Ages 22–24 | 0.251 (0.009) [0.300] | 0.211 (0.009) [0.311] | 0.255 (0.010) [0.284] | 0.283 (0.010) [0.284] | 0.254 (0.009) [0.321] | 0.502 (0.019) [0.300] | 0.423 (0.017) [0.311] | 0.511 (0.020) [0.284] | 0.565 (0.020) [0.284] | 0.509 (0.018) [0.320]
Ages 25–34 | 0.173 (0.007) [0.287] | 0.149 (0.007) [0.304] | 0.176 (0.008) [0.273] | 0.185 (0.008) [0.266] | 0.180 (0.007) [0.307] | 0.345 (0.015) [0.287] | 0.299 (0.014) [0.304] | 0.351 (0.015) [0.273] | 0.371 (0.016) [0.266] | 0.360 (0.014) [0.307]
Ages 35–44 | 0.118 (0.006) [0.285] | 0.103 (0.005) [0.307] | 0.119 (0.006) [0.272] | 0.124 (0.006) [0.256] | 0.126 (0.006) [0.306] | 0.235 (0.012) [0.285] | 0.205 (0.011) [0.307] | 0.238 (0.012) [0.272] | 0.247 (0.013) [0.256] | 0.251 (0.011) [0.306]
Ages 45–54 | 0.087 (0.005) [0.285] | 0.076 (0.004) [0.305] | 0.088 (0.005) [0.280] | 0.090 (0.006) [0.246] | 0.095 (0.005) [0.310] | 0.174 (0.010) [0.285] | 0.152 (0.009) [0.304] | 0.175 (0.010) [0.280] | 0.180 (0.011) [0.246] | 0.191 (0.010) [0.310]
Ages 55–64 | 0.073 (0.005) [0.271] | 0.063 (0.004) [0.297] | 0.072 (0.005) [0.265] | 0.072 (0.006) [0.231] | 0.083 (0.005) [0.294] | 0.145 (0.010) [0.271] | 0.127 (0.009) [0.296] | 0.145 (0.010) [0.265] | 0.144 (0.011) [0.231] | 0.167 (0.010) [0.294]
Ages 65–99 | 0.111 (0.011) [0.254] | 0.094 (0.008) [0.280] | 0.112 (0.010) [0.255] | 0.109 (0.010) [0.232] | 0.132 (0.018) [0.250] | 0.222 (0.022) [0.254] | 0.187 (0.015) [0.280] | 0.223 (0.019) [0.254] | 0.219 (0.020) [0.232] | 0.263 (0.036) [0.250]
Note: Computed for QWI data covering 1993:Q1–2008:Q3. Each cell reports averages, standard errors (in parentheses), and effective missing data ratios (in square brackets). For computations, see text.
unadjusted) and a stronger downward trend over the period than the JOLTS estimates. We conclude that the QWI WRR is probably a more comprehensive measure of worker reallocation than its JOLTS cousin. Fig. 3 compares the QWI Job Reallocation Rate with the Job Reallocation Rate estimated from the BED. In contrast to the WRR results, the two Job Reallocation Rates are very similar. The QWI JRR has essentially the same trend as the BEDJRR. There is more seasonality, particularly in the second quarter, in the BEDJRR, but overall the two series give a strikingly similar report on the rate of job reallocation. This result is not a consequence of both series using state Unemployment Insurance administrative records to compute their rates: all four components of the two Job Reallocation Rates are estimated from different records in the state UI systems. For the QWI JRR, the numerator job creations and destructions are calculated by comparing the number of employees in each demographic group who have UI wage records in quarters t − 1 and t to those who have wage records in quarters t and t + 1. For the BEDJRR, the numerator job creations and destructions are calculated by comparing the number of persons on payroll as of the 12th calendar day in the month before the quarter begins to
Fig. 3. Comparison of QWI and BED job reallocation rates.
the number on payroll as of the 12th calendar day of the last month of the quarter. These two methods of calculating gross job creations and destructions are obviously related, but they are by
Table 4. National excess inflow and reallocation rates, by NAICS sector.
NAICS sector | EIR Overall | EIR Q1 | EIR Q2 | EIR Q3 | EIR Q4 | ERR Overall | ERR Q1 | ERR Q2 | ERR Q3 | ERR Q4
Agriculture, Forestry, Fishing and Hunting | 0.464 (0.029) [0.152] | 0.312 (0.020) [0.150] | 0.506 (0.030) [0.138] | 0.574 (0.040) [0.149] | 0.462 (0.027) [0.173] | 0.927 (0.058) [0.152] | 0.624 (0.040) [0.149] | 1.012 (0.060) [0.138] | 1.149 (0.080) [0.149] | 0.924 (0.054) [0.173]
Mining, Quarrying, and Oil and Gas Extraction | 0.109 (0.007) [0.310] | 0.096 (0.007) [0.321] | 0.113 (0.007) [0.298] | 0.120 (0.008) [0.300] | 0.108 (0.008) [0.324] | 0.218 (0.015) [0.310] | 0.191 (0.013) [0.321] | 0.226 (0.014) [0.297] | 0.240 (0.016) [0.300] | 0.216 (0.016) [0.324]
Utilities | 0.034 (0.004) [0.339] | 0.036 (0.004) [0.315] | 0.034 (0.004) [0.358] | 0.033 (0.003) [0.325] | 0.033 (0.005) [0.357] | 0.068 (0.008) [0.340] | 0.073 (0.008) [0.314] | 0.067 (0.007) [0.352] | 0.064 (0.006) [0.335] | 0.067 (0.010) [0.361]
Construction | 0.241 (0.008) [0.332] | 0.204 (0.008) [0.306] | 0.250 (0.008) [0.333] | 0.266 (0.009) [0.352] | 0.243 (0.008) [0.335] | 0.482 (0.016) [0.332] | 0.408 (0.016) [0.306] | 0.500 (0.016) [0.333] | 0.533 (0.017) [0.353] | 0.486 (0.016) [0.336]
Manufacturing | 0.087 (0.004) [0.402] | 0.076 (0.003) [0.408] | 0.088 (0.003) [0.420] | 0.098 (0.004) [0.334] | 0.085 (0.003) [0.450] | 0.174 (0.007) [0.402] | 0.152 (0.007) [0.407] | 0.177 (0.007) [0.420] | 0.196 (0.008) [0.334] | 0.170 (0.006) [0.449]
Wholesale Trade | 0.101 (0.004) [0.326] | 0.089 (0.003) [0.324] | 0.103 (0.004) [0.331] | 0.109 (0.004) [0.313] | 0.103 (0.003) [0.338] | 0.202 (0.007) [0.326] | 0.177 (0.007) [0.325] | 0.206 (0.007) [0.331] | 0.217 (0.007) [0.312] | 0.206 (0.007) [0.338]
Retail Trade | 0.191 (0.005) [0.332] | 0.149 (0.005) [0.342] | 0.193 (0.005) [0.309] | 0.210 (0.005) [0.318] | 0.215 (0.004) [0.362] | 0.382 (0.010) [0.332] | 0.297 (0.010) [0.342] | 0.387 (0.010) [0.308] | 0.420 (0.010) [0.318] | 0.430 (0.009) [0.364]
Transportation and Warehousing | 0.148 (0.005) [0.312] | 0.126 (0.004) [0.317] | 0.147 (0.005) [0.311] | 0.157 (0.005) [0.307] | 0.163 (0.005) [0.314] | 0.296 (0.010) [0.312] | 0.251 (0.009) [0.316] | 0.294 (0.010) [0.312] | 0.315 (0.010) [0.308] | 0.325 (0.010) [0.314]
Information | 0.126 (0.009) [0.264] | 0.118 (0.009) [0.253] | 0.126 (0.009) [0.277] | 0.130 (0.008) [0.258] | 0.131 (0.009) [0.267] | 0.253 (0.017) [0.264] | 0.236 (0.018) [0.253] | 0.252 (0.018) [0.278] | 0.261 (0.017) [0.259] | 0.262 (0.018) [0.266]
Finance and Insurance | 0.076 (0.003) [0.375] | 0.070 (0.003) [0.385] | 0.074 (0.003) [0.374] | 0.082 (0.003) [0.359] | 0.080 (0.003) [0.381] | 0.152 (0.006) [0.376] | 0.140 (0.006) [0.387] | 0.147 (0.005) [0.376] | 0.163 (0.006) [0.359] | 0.159 (0.006) [0.382]
Real Estate and Rental and Leasing | 0.162 (0.005) [0.308] | 0.140 (0.005) [0.300] | 0.164 (0.006) [0.297] | 0.182 (0.006) [0.326] | 0.162 (0.005) [0.306] | 0.324 (0.011) [0.307] | 0.281 (0.010) [0.299] | 0.327 (0.011) [0.297] | 0.364 (0.011) [0.326] | 0.324 (0.010) [0.306]
Professional, Scientific, and Technical Services | 0.131 (0.005) [0.334] | 0.119 (0.005) [0.333] | 0.128 (0.004) [0.319] | 0.138 (0.005) [0.329] | 0.141 (0.005) [0.355] | 0.262 (0.009) [0.333] | 0.238 (0.009) [0.334] | 0.256 (0.009) [0.317] | 0.275 (0.010) [0.329] | 0.282 (0.009) [0.355]
Management of Companies and Enterprises | 0.100 (0.005) [0.441] | 0.086 (0.005) [0.451] | 0.099 (0.005) [0.425] | 0.110 (0.006) [0.441] | 0.104 (0.005) [0.447] | 0.200 (0.011) [0.441] | 0.172 (0.011) [0.453] | 0.198 (0.011) [0.424] | 0.221 (0.011) [0.441] | 0.209 (0.010) [0.446]
Administrative, Support, Waste Management, Remediation | 0.446 (0.011) [0.295] | 0.405 (0.010) [0.299] | 0.453 (0.012) [0.272] | 0.471 (0.012) [0.302] | 0.454 (0.011) [0.307] | 0.891 (0.022) [0.295] | 0.811 (0.020) [0.300] | 0.906 (0.023) [0.272] | 0.941 (0.023) [0.302] | 0.908 (0.022) [0.307]
Educational Services | 0.159 (0.007) [0.485] | 0.112 (0.006) [0.498] | 0.166 (0.008) [0.490] | 0.205 (0.008) [0.447] | 0.153 (0.008) [0.508] | 0.318 (0.015) [0.486] | 0.224 (0.012) [0.498] | 0.331 (0.015) [0.487] | 0.410 (0.015) [0.448] | 0.307 (0.016) [0.512]
Health Care and Social Assistance | 0.124 (0.004) [0.346] | 0.111 (0.004) [0.352] | 0.123 (0.005) [0.332] | 0.138 (0.004) [0.347] | 0.123 (0.004) [0.354] | 0.248 (0.008) [0.346] | 0.222 (0.008) [0.352] | 0.246 (0.009) [0.332] | 0.276 (0.008) [0.347] | 0.246 (0.008) [0.354]
Arts, Entertainment, and Recreation | 0.242 (0.008) [0.309] | 0.202 (0.005) [0.302] | 0.243 (0.008) [0.289] | 0.302 (0.012) [0.324] | 0.221 (0.007) [0.324] | 0.484 (0.016) [0.309] | 0.405 (0.011) [0.300] | 0.486 (0.015) [0.289] | 0.603 (0.025) [0.325] | 0.441 (0.013) [0.323]
Accommodation and Food Services | 0.322 (0.012) [0.286] | 0.277 (0.012) [0.291] | 0.336 (0.013) [0.272] | 0.365 (0.013) [0.281] | 0.308 (0.011) [0.298] | 0.643 (0.024) [0.285] | 0.553 (0.023) [0.291] | 0.673 (0.026) [0.272] | 0.729 (0.025) [0.281] | 0.616 (0.023) [0.298]
Other Services (except Public Administration) | 0.172 (0.005) [0.298] | 0.151 (0.005) [0.307] | 0.175 (0.005) [0.282] | 0.193 (0.005) [0.308] | 0.169 (0.005) [0.296] | 0.344 (0.011) [0.298] | 0.303 (0.011) [0.307] | 0.351 (0.011) [0.281] | 0.385 (0.010) [0.308] | 0.339 (0.010) [0.296]
Note: Computed for QWI data covering 1993:Q1–2008:Q3. Each cell reports averages, standard errors (in parentheses), and effective missing data ratios (in square brackets). For computations, see text.
Table 5. National accession and separation rates, by age and gender.
Category | AR Overall | AR Q1 | AR Q2 | AR Q3 | AR Q4 | SR Overall | SR Q1 | SR Q2 | SR Q3 | SR Q4
National | 0.250 (0.010) [0.303] | 0.224 (0.009) [0.326] | 0.266 (0.011) [0.285] | 0.263 (0.011) [0.293] | 0.245 (0.009) [0.310] | 0.240 (0.009) [0.299] | 0.206 (0.008) [0.317] | 0.236 (0.009) [0.289] | 0.263 (0.010) [0.286] | 0.256 (0.009) [0.305]
Male | 0.253 (0.011) [0.299] | 0.228 (0.009) [0.317] | 0.274 (0.012) [0.288] | 0.264 (0.011) [0.291] | 0.244 (0.010) [0.302] | 0.243 (0.010) [0.295] | 0.209 (0.009) [0.312] | 0.239 (0.010) [0.284] | 0.265 (0.011) [0.285] | 0.261 (0.010) [0.302]
Female | 0.246 (0.010) [0.314] | 0.219 (0.009) [0.340] | 0.257 (0.011) [0.288] | 0.262 (0.010) [0.301] | 0.247 (0.009) [0.328] | 0.236 (0.009) [0.309] | 0.203 (0.008) [0.326] | 0.233 (0.009) [0.303] | 0.261 (0.010) [0.292] | 0.249 (0.009) [0.315]
Ages 14–18 | 0.660 (0.026) [0.335] | 0.504 (0.021) [0.355] | 0.775 (0.025) [0.325] | 0.764 (0.039) [0.301] | 0.594 (0.021) [0.360] | 0.543 (0.024) [0.332] | 0.412 (0.018) [0.352] | 0.492 (0.020) [0.325] | 0.748 (0.040) [0.298] | 0.520 (0.018) [0.353]
Ages 19–21 | 0.530 (0.017) [0.335] | 0.458 (0.014) [0.327] | 0.608 (0.020) [0.327] | 0.549 (0.017) [0.338] | 0.505 (0.015) [0.348] | 0.505 (0.016) [0.325] | 0.428 (0.013) [0.320] | 0.473 (0.015) [0.309] | 0.622 (0.020) [0.331] | 0.495 (0.015) [0.341]
Ages 22–24 | 0.391 (0.013) [0.318] | 0.348 (0.012) [0.328] | 0.416 (0.014) [0.310] | 0.419 (0.013) [0.310] | 0.382 (0.012) [0.324] | 0.374 (0.012) [0.314] | 0.323 (0.011) [0.323] | 0.372 (0.013) [0.300] | 0.417 (0.013) [0.308] | 0.386 (0.012) [0.324]
Ages 25–34 | 0.259 (0.010) [0.305] | 0.238 (0.010) [0.328] | 0.267 (0.011) [0.284] | 0.271 (0.011) [0.296] | 0.259 (0.010) [0.314] | 0.252 (0.010) [0.302] | 0.222 (0.009) [0.320] | 0.252 (0.010) [0.292] | 0.263 (0.010) [0.289] | 0.271 (0.010) [0.308]
Ages 35–44 | 0.190 (0.009) [0.307] | 0.179 (0.008) [0.337] | 0.194 (0.009) [0.282] | 0.196 (0.009) [0.296] | 0.191 (0.008) [0.315] | 0.185 (0.008) [0.305] | 0.164 (0.008) [0.324] | 0.185 (0.008) [0.298] | 0.188 (0.009) [0.288] | 0.204 (0.008) [0.310]
Ages 45–54 | 0.152 (0.008) [0.313] | 0.146 (0.008) [0.342] | 0.155 (0.009) [0.288] | 0.155 (0.009) [0.301] | 0.154 (0.008) [0.322] | 0.150 (0.008) [0.311] | 0.133 (0.007) [0.329] | 0.150 (0.008) [0.312] | 0.150 (0.008) [0.287] | 0.170 (0.008) [0.315]
Ages 55–64 | 0.137 (0.008) [0.307] | 0.132 (0.008) [0.344] | 0.140 (0.009) [0.281] | 0.134 (0.009) [0.289] | 0.141 (0.008) [0.316] | 0.147 (0.008) [0.303] | 0.131 (0.007) [0.329] | 0.146 (0.008) [0.306] | 0.141 (0.008) [0.272] | 0.170 (0.008) [0.303]
Ages 65–99 | 0.198 (0.015) [0.275] | 0.183 (0.011) [0.308] | 0.211 (0.015) [0.269] | 0.191 (0.013) [0.264] | 0.207 (0.020) [0.258] | 0.221 (0.014) [0.271] | 0.189 (0.010) [0.301] | 0.221 (0.013) [0.270] | 0.210 (0.012) [0.246] | 0.266 (0.021) [0.268]
Note: Computed for QWI data covering 1993:Q1–2008:Q3. Each cell reports averages, standard errors (in parentheses), and effective missing data ratios (in square brackets). For computations, see text.
Table 6. National accession and separation rates, by NAICS sector.
NAICS sector | AR Overall | AR Q1 | AR Q2 | AR Q3 | AR Q4 | SR Overall | SR Q1 | SR Q2 | SR Q3 | SR Q4
Agriculture, Forestry, Fishing and Hunting | 0.640 (0.038) [0.148] | 0.459 (0.023) [0.146] | 0.801 (0.044) [0.135] | 0.713 (0.043) [0.148] | 0.585 (0.042) [0.166] | 0.633 (0.037) [0.154] | 0.408 (0.024) [0.152] | 0.627 (0.039) [0.152] | 0.741 (0.045) [0.149] | 0.763 (0.040) [0.162]
Mining, Quarrying, and Oil and Gas Extraction | 0.169 (0.012) [0.314] | 0.156 (0.010) [0.303] | 0.195 (0.015) [0.340] | 0.174 (0.012) [0.284] | 0.151 (0.011) [0.331] | 0.160 (0.010) [0.312] | 0.141 (0.009) [0.305] | 0.153 (0.009) [0.297] | 0.168 (0.011) [0.305] | 0.179 (0.012) [0.343]
Utilities | 0.062 (0.009) [0.322] | 0.069 (0.012) [0.310] | 0.065 (0.008) [0.348] | 0.055 (0.007) [0.276] | 0.058 (0.009) [0.355] | 0.063 (0.009) [0.322] | 0.062 (0.007) [0.293] | 0.058 (0.008) [0.316] | 0.069 (0.011) [0.329] | 0.063 (0.011) [0.352]
Construction | 0.355 (0.012) [0.333] | 0.314 (0.011) [0.303] | 0.411 (0.013) [0.357] | 0.370 (0.012) [0.344] | 0.321 (0.010) [0.326] | 0.339 (0.010) [0.334] | 0.287 (0.010) [0.308] | 0.322 (0.010) [0.321] | 0.360 (0.010) [0.354] | 0.388 (0.012) [0.354]
Manufacturing | 0.130 (0.006) [0.413] | 0.122 (0.006) [0.405] | 0.137 (0.006) [0.412] | 0.139 (0.007) [0.375] | 0.121 (0.005) [0.462] | 0.130 (0.006) [0.415] | 0.116 (0.005) [0.441] | 0.126 (0.005) [0.449] | 0.143 (0.006) [0.337] | 0.137 (0.005) [0.434]
Wholesale Trade | 0.158 (0.005) [0.329] | 0.150 (0.005) [0.332] | 0.166 (0.005) [0.329] | 0.162 (0.006) [0.319] | 0.152 (0.005) [0.336] | 0.151 (0.005) [0.328] | 0.134 (0.005) [0.332] | 0.148 (0.005) [0.333] | 0.160 (0.005) [0.313] | 0.162 (0.005) [0.335]
Retail Trade | 0.260 (0.007) [0.336] | 0.207 (0.007) [0.343] | 0.262 (0.007) [0.312] | 0.273 (0.008) [0.325] | 0.302 (0.007) [0.368] | 0.251 (0.006) [0.339] | 0.227 (0.007) [0.366] | 0.243 (0.007) [0.328] | 0.263 (0.006) [0.323] | 0.270 (0.006) [0.338]
Transportation and Warehousing | 0.209 (0.008) [0.307] | 0.184 (0.008) [0.315] | 0.214 (0.008) [0.289] | 0.215 (0.007) [0.307] | 0.222 (0.008) [0.317] | 0.200 (0.007) [0.310] | 0.178 (0.007) [0.323] | 0.194 (0.007) [0.319] | 0.207 (0.008) [0.306] | 0.222 (0.007) [0.290]
Information | 0.181 (0.011) [0.287] | 0.180 (0.011) [0.277] | 0.181 (0.010) [0.290] | 0.183 (0.011) [0.298] | 0.179 (0.010) [0.283] | 0.176 (0.010) [0.288] | 0.165 (0.010) [0.284] | 0.170 (0.010) [0.294] | 0.181 (0.010) [0.290] | 0.188 (0.011) [0.282]
Finance and Insurance | 0.125 (0.006) [0.385] | 0.126 (0.006) [0.390] | 0.126 (0.006) [0.368] | 0.128 (0.006) [0.380] | 0.122 (0.006) [0.401] | 0.119 (0.006) [0.389] | 0.108 (0.006) [0.397] | 0.113 (0.005) [0.393] | 0.127 (0.005) [0.382] | 0.130 (0.006) [0.383]
Real Estate and Rental and Leasing | 0.239 (0.008) [0.309] | 0.221 (0.007) [0.303] | 0.258 (0.008) [0.300] | 0.249 (0.008) [0.324] | 0.226 (0.007) [0.306] | 0.228 (0.007) [0.305] | 0.196 (0.007) [0.303] | 0.220 (0.007) [0.294] | 0.254 (0.007) [0.321] | 0.242 (0.007) [0.304]
Professional, Scientific, and Technical Services | 0.209 (0.007) [0.347] | 0.217 (0.007) [0.345] | 0.207 (0.006) [0.335] | 0.204 (0.007) [0.347] | 0.209 (0.007) [0.365] | 0.195 (0.006) [0.350] | 0.168 (0.006) [0.352] | 0.201 (0.006) [0.340] | 0.199 (0.006) [0.353] | 0.211 (0.006) [0.356]
Management of Companies and Enterprises | 0.140 (0.009) [0.434] | 0.126 (0.009) [0.436] | 0.143 (0.008) [0.410] | 0.149 (0.010) [0.445] | 0.142 (0.008) [0.447] | 0.135 (0.008) [0.443] | 0.123 (0.008) [0.463] | 0.129 (0.008) [0.431] | 0.146 (0.007) [0.438] | 0.144 (0.007) [0.440]
Administrative, Support, Waste Management, Remediation | 0.546 (0.013) [0.299] | 0.518 (0.012) [0.287] | 0.577 (0.015) [0.289] | 0.563 (0.014) [0.304] | 0.527 (0.012) [0.315] | 0.525 (0.012) [0.291] | 0.475 (0.012) [0.286] | 0.511 (0.012) [0.274] | 0.545 (0.013) [0.303] | 0.572 (0.013) [0.302]
(continued below)
no means constrained to be similar. As well as the numerators, the denominators of these Job Reallocation Rates are estimated using different administrative records. The QWI average employment over the quarter is the average of the number of persons with UI wage records in quarters t − 1 and t with the number who have wage records in quarters t and t + 1. The BED average employment comes directly from the QCEW and is the average of employment
on the 12th calendar date in the last month of the previous quarter with the employment on the 12th calendar date of the last month of the current quarter. The QWI use the QCEW month-1 employment as a benchmark for creating the final weight used to aggregate the micro-data into the published series (Abowd et al., 2009); however the month-3 employment is never used. Given the closeness of the levels of these two series and the similarity in
J.M. Abowd, L. Vilhuber / Journal of Econometrics 161 (2011) 82–99
95
Table 6 (continued) Accession rate Overall
Separation rate Q1
Q2
Q3
Q4
Overall
Q1
Q2
Q3
Q4
0.217 (0.009) [0.432]
0.285 (0.011) [0.466]
0.238 (0.013) [0.476]
0.216 (0.010) [0.482]
0.140 (0.009) [0.487]
0.270 (0.011) [0.496]
0.260 (0.009) [0.439]
0.193 (0.009) [0.508]
Health Care and Social Assistance 0.176 0.170 0.175 (0.007) (0.007) (0.007) [0.384] [0.408] [0.361]
0.189 (0.007) [0.380]
0.170 (0.006) [0.387]
0.165 (0.006) [0.380]
0.146 (0.006) [0.381]
0.162 (0.006) [0.365]
0.182 (0.006) [0.382]
0.169 (0.006) [0.395]
Arts, Entertainment, and Recreation 0.382 0.297 0.533 (0.015) (0.008) (0.025) [0.304] [0.289] [0.305]
0.364 (0.014) [0.322]
0.331 (0.012) [0.300]
0.362 (0.013) [0.298]
0.267 (0.009) [0.291]
0.321 (0.010) [0.272]
0.486 (0.022) [0.317]
0.373 (0.013) [0.315]
Accommodation and Food Services 0.408 0.364 0.449 (0.014) (0.014) (0.016) [0.291] [0.295] [0.288]
0.435 (0.014) [0.283]
0.381 (0.013) [0.300]
0.397 (0.014) [0.286]
0.334 (0.012) [0.292]
0.400 (0.015) [0.276]
0.453 (0.015) [0.287]
0.401 (0.013) [0.290]
Other Services (except Public Administration) 0.260 0.249 0.275 (0.007) (0.008) (0.008) [0.316] [0.336] [0.285]
0.271 (0.007) [0.323]
0.246 (0.007) [0.322]
0.249 (0.007) [0.312]
0.213 (0.007) [0.322]
0.247 (0.007) [0.304]
0.281 (0.007) [0.307]
0.255 (0.007) [0.315]
Educational Services 0.230 0.182 (0.011) (0.009) [0.470] [0.506]
Note: Computed for QWI data covering 1993Q1-2008Q3. Each cell reports averages, standard errors (in parentheses), and effective missing data ratios (in square brackets). For computations, see text.
Fig. 4. QWI reallocation rates for men and women. Fig. 5. Excess Reallocation Rates (churning) by age group.
their seasonal patterns, we conclude that both series are measuring essentially the same underlying Job Reallocation Rate. This gives us additional confidence in the QWI measures when we use them to estimate Job Reallocation Rates for demographic and industry subgroups not published or estimable in the BED series. Fig. 4 confirms the similarity of national worker, job, and excess reallocation rates for men and women. The figure shows that the level, trend, and seasonality of these reallocation rates do not differ by gender. All series are within the estimation error of the overall series and of the series for the other gender, which can be confirmed by examining Tables 1 and 3. Fig. 5 provides a striking visual description of the age-related heterogeneity in national excess reallocation (churning) rates. The rates are nearly monotonic in age, with 14–18 year-olds displaying the greatest churning rate and the most seasonal variability, and 55–64 year-olds displaying the least. The national average Excess Reallocation Rate and the rate for 25–34 year-olds are essentially coincident. Fig. 6 shows that the heterogeneity in age-specific Excess Reallocation Rates is exacerbated by industrial sector variation, with Agriculture, Forestry, Fishing, and Hunting (11); Administrative, Support, Waste Management, Remediation (56); and Educational Services (61) dominating the age-related heterogeneity in churning rates.
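For readers who want to reproduce these rate definitions from the public-use series, a minimal sketch follows. It is our illustration, not the authors' production code: the array names (n_prev, n_next, A, S, JC, JD) are hypothetical, and the identities (worker reallocation = accessions + separations; job reallocation = creations + destructions; excess reallocation = their difference) follow the conventions used throughout the paper.

```python
import numpy as np

def qwi_rates(n_prev, n_next, A, S, JC, JD):
    """Quarterly reallocation rates from QWI-style inputs (illustrative).

    n_prev[t]: persons with UI wage records in both quarters t-1 and t
    n_next[t]: persons with UI wage records in both quarters t and t+1
    A, S:      accessions and separations in quarter t
    JC, JD:    job creations and destructions in quarter t
    """
    n_prev, n_next = np.asarray(n_prev, float), np.asarray(n_next, float)
    A, S, JC, JD = (np.asarray(x, float) for x in (A, S, JC, JD))
    emp = 0.5 * (n_prev + n_next)   # QWI average employment over the quarter
    wr = (A + S) / emp              # Worker Reallocation Rate
    jr = (JC + JD) / emp            # Job Reallocation Rate
    er = wr - jr                    # Excess Reallocation ("churning") Rate
    return wr, jr, er
```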
6. Conclusions

Using similar data for the 1990s and a single state (Maryland), Burgess et al. (2000) estimated that the Excess Reallocation Rate (churning) for manufacturing was 11% while that of non-manufacturing sectors was 19%. Using national data for the period from 1993:Q1 through 2008:Q3, our estimated Excess Reallocation Rate for manufacturing is 17% while the overall rate is 36%. Nationally representative data thus indicate that overall churning is considerably greater than those authors estimated. However, they also estimated that excess reallocation rates were over 70% of Worker Reallocation Rates in non-manufacturing and 46% of Worker Reallocation Rates in manufacturing. The non-manufacturing estimate holds up reasonably well in nationally representative data: we estimate that the excess reallocation rate (churning) is 73% of the worker reallocation rate nationally over our time period, with no dominant trend. In manufacturing, churning is 67% of the worker reallocation rate, a smaller gap relative to non-manufacturing than Burgess et al. found for the state of Maryland.
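As a quick consistency check on these figures (our arithmetic, using only the point estimates quoted above):

ER/WR (national) = 0.36/0.49 ≈ 0.73,
WR (manufacturing) ≈ ER/0.67 = 0.17/0.67 ≈ 0.25,

the latter agreeing with the quarterly manufacturing accession and separation rates in Table 6 (0.130 + 0.130 = 0.260).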
Table 7 National job creation and destruction rates, by age and gender.
Group | Job Creation Rate: Overall | Q1 | Q2 | Q3 | Q4 | Job Destruction Rate: Overall | Q1 | Q2 | Q3 | Q4
National | 0.070 (0.005) [0.328] | 0.071 (0.004) [0.359] | 0.083 (0.006) [0.307] | 0.063 (0.004) [0.348] | 0.062 (0.004) [0.294] | 0.060 (0.004) [0.322] | 0.054 (0.004) [0.345] | 0.053 (0.004) [0.288] | 0.063 (0.004) [0.345] | 0.073 (0.005) [0.310]
Male | 0.076 (0.005) [0.322] | 0.077 (0.004) [0.345] | 0.093 (0.007) [0.306] | 0.068 (0.004) [0.345] | 0.065 (0.005) [0.290] | 0.067 (0.004) [0.318] | 0.058 (0.004) [0.343] | 0.058 (0.004) [0.278] | 0.070 (0.004) [0.344] | 0.083 (0.005) [0.306]
Female | 0.074 (0.005) [0.338] | 0.075 (0.005) [0.372] | 0.083 (0.006) [0.316] | 0.069 (0.004) [0.354] | 0.070 (0.004) [0.307] | 0.064 (0.004) [0.332] | 0.059 (0.004) [0.352] | 0.059 (0.004) [0.311] | 0.067 (0.004) [0.346] | 0.072 (0.004) [0.319]
Ages 14–18 | 0.248 (0.011) [0.349] | 0.200 (0.008) [0.368] | 0.369 (0.018) [0.341] | 0.210 (0.009) [0.352] | 0.211 (0.009) [0.333] | 0.131 (0.008) [0.342] | 0.108 (0.006) [0.363] | 0.086 (0.006) [0.319] | 0.194 (0.011) [0.359] | 0.138 (0.008) [0.326]
Ages 19–21 | 0.175 (0.008) [0.363] | 0.160 (0.006) [0.379] | 0.248 (0.015) [0.350] | 0.135 (0.006) [0.374] | 0.155 (0.006) [0.350] | 0.149 (0.007) [0.347] | 0.130 (0.005) [0.359] | 0.112 (0.006) [0.329] | 0.208 (0.011) [0.356] | 0.146 (0.007) [0.345]
Ages 22–24 | 0.140 (0.006) [0.347] | 0.137 (0.005) [0.363] | 0.160 (0.008) [0.345] | 0.136 (0.005) [0.361] | 0.128 (0.005) [0.318] | 0.123 (0.005) [0.347] | 0.112 (0.004) [0.360] | 0.116 (0.005) [0.318] | 0.135 (0.006) [0.371] | 0.131 (0.006) [0.338]
Ages 25–34 | 0.086 (0.005) [0.328] | 0.088 (0.004) [0.356] | 0.091 (0.006) [0.306] | 0.086 (0.004) [0.344] | 0.079 (0.004) [0.305] | 0.079 (0.004) [0.327] | 0.073 (0.004) [0.346] | 0.076 (0.004) [0.298] | 0.078 (0.004) [0.349] | 0.091 (0.005) [0.314]
Ages 35–44 | 0.072 (0.005) [0.330] | 0.076 (0.005) [0.362] | 0.075 (0.005) [0.297] | 0.073 (0.004) [0.350] | 0.065 (0.004) [0.311] | 0.067 (0.004) [0.331] | 0.061 (0.004) [0.349] | 0.065 (0.004) [0.307] | 0.064 (0.004) [0.352] | 0.079 (0.005) [0.317]
Ages 45–54 | 0.065 (0.005) [0.341] | 0.070 (0.005) [0.366] | 0.068 (0.005) [0.305] | 0.065 (0.005) [0.365] | 0.059 (0.004) [0.325] | 0.063 (0.004) [0.341] | 0.057 (0.004) [0.361] | 0.062 (0.004) [0.320] | 0.060 (0.004) [0.359] | 0.074 (0.005) [0.322]
Ages 55–64 | 0.064 (0.005) [0.345] | 0.069 (0.005) [0.375] | 0.068 (0.006) [0.311] | 0.062 (0.005) [0.365] | 0.057 (0.005) [0.326] | 0.074 (0.004) [0.343] | 0.067 (0.004) [0.371] | 0.074 (0.005) [0.322] | 0.069 (0.004) [0.356] | 0.086 (0.005) [0.321]
Ages 65–99 | 0.087 (0.007) [0.315] | 0.090 (0.005) [0.356] | 0.099 (0.009) [0.306] | 0.081 (0.005) [0.341] | 0.076 (0.006) [0.253] | 0.109 (0.006) [0.313] | 0.096 (0.004) [0.358] | 0.109 (0.007) [0.265] | 0.101 (0.004) [0.322] | 0.134 (0.008) [0.305]
Note: Computed for QWI data covering 1993Q1–2008Q3. Each cell reports averages, standard errors (in parentheses), and effective missing data ratios (in square brackets). For computations, see text.
The only other attempts to estimate national Worker Reallocation Rates (Davis et al., 2006; Boon et al., 2008) appear to suffer from a serious underestimation bias due to the design of the JOLTS sampling frame and the nonrepresentativeness of the reports of large establishments, as described in Davis et al. (2010). We document the magnitude of this bias by showing that our nationally representative worker reallocation rate is, at 49%, more than double the 19% quarterly rate estimated by Davis et al. (2006). Given the demonstrated similarity between our Job Reallocation Rates and those estimated from the BED when they can be compared, it seems reasonable to conclude that our worker reallocation rates, based on the same QWI data as our job reallocation rates, are more representative of the overall private economy than the JOLTS estimates. As already noted, the national Job Reallocation Rate based on QWI inputs shows a similar level, trend, and seasonality to the job reallocation rate based on the BED as described in Spletzer et al. (2004) and Davis et al. (2006). The demonstrated contribution of estimating Job Reallocation Rates from the QWI is that they are accurate enough to permit estimation at the national NAICS sector × age × gender level, thus permitting estimation and forecasting, for the first time, of the interaction between demographic characteristics and industrial sector. Our figures indicate that job and excess reallocation rates do not differ much by gender but, as anticipated, vary enormously by age group, and by age group across industrial sectors.
We have created national gross worker and job flow statistics for the period from 1993:Q1 through 2008:Q3, characterizing worker flows by the Worker Reallocation Rate and job flows by the Job Reallocation Rate. Our methods demonstrate that the Census Bureau's Quarterly Workforce Indicators can reliably estimate these rates to two significant digits at the national level. Fully saturated estimates by NAICS sector, gender, and age group were produced. The incompleteness of the QWI data, which arises from the fact that different states joined the LED Federal/State Partnership at different dates and with historical data of different longevity, does not materially affect the quality of the national QWI data.
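Because the estimates are distributed as public-use series, the saturated sector × age × gender rates can be tabulated with a few lines of standard tooling. A minimal sketch, assuming a hypothetical flat file with one row per quarter and cell (the file name and column names are ours, not the actual distribution format):

```python
import pandas as pd

# Hypothetical layout: one row per (year, quarter, sector, sex, agegrp) cell,
# with accessions A, separations S, job creations JC, job destructions JD,
# and average employment Emp.
df = pd.read_csv("qwi_national.csv")

rates = (
    df.assign(wr=lambda d: (d["A"] + d["S"]) / d["Emp"],    # worker reallocation
              jr=lambda d: (d["JC"] + d["JD"]) / d["Emp"])  # job reallocation
      .assign(er=lambda d: d["wr"] - d["jr"])               # excess reallocation
      .groupby(["sector", "agegrp", "sex"])[["wr", "jr", "er"]]
      .mean()                                               # average over 1993Q1-2008Q3
)
print(rates.head())
```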
Table 8 National job creation and destruction rates, by NAICS sector.
NAICS sector | Job Creation Rate: Overall | Q1 | Q2 | Q3 | Q4 | Job Destruction Rate: Overall | Q1 | Q2 | Q3 | Q4
Agriculture, Forestry, Fishing and Hunting | 0.176 (0.014) [0.141] | 0.146 (0.005) [0.138] | 0.294 (0.020) [0.138] | 0.138 (0.007) [0.152] | 0.123 (0.023) [0.136] | 0.169 (0.014) [0.148] | 0.097 (0.006) [0.176] | 0.121 (0.021) [0.126] | 0.166 (0.009) [0.146] | 0.300 (0.021) [0.144]
Mining, Quarrying, and Oil and Gas Extraction | 0.060 (0.007) [0.322] | 0.061 (0.006) [0.286] | 0.082 (0.011) [0.367] | 0.054 (0.005) [0.277] | 0.043 (0.005) [0.358] | 0.051 (0.006) [0.325] | 0.045 (0.004) [0.289] | 0.040 (0.005) [0.334] | 0.048 (0.005) [0.300] | 0.072 (0.009) [0.379]
Utilities | 0.028 (0.007) [0.295] | 0.033 (0.010) [0.296] | 0.031 (0.006) [0.343] | 0.022 (0.005) [0.235] | 0.025 (0.006) [0.304] | 0.029 (0.007) [0.309] | 0.025 (0.004) [0.289] | 0.024 (0.006) [0.306] | 0.038 (0.011) [0.327] | 0.030 (0.008) [0.314]
Construction | 0.114 (0.006) [0.311] | 0.110 (0.004) [0.283] | 0.161 (0.010) [0.325] | 0.104 (0.004) [0.319] | 0.078 (0.003) [0.317] | 0.098 (0.004) [0.309] | 0.083 (0.003) [0.289] | 0.072 (0.004) [0.308] | 0.093 (0.003) [0.312] | 0.145 (0.008) [0.330]
Manufacturing | 0.043 (0.003) [0.414] | 0.046 (0.003) [0.394] | 0.049 (0.003) [0.367] | 0.042 (0.003) [0.434] | 0.036 (0.003) [0.464] | 0.044 (0.003) [0.422] | 0.040 (0.003) [0.452] | 0.038 (0.003) [0.467] | 0.045 (0.003) [0.366] | 0.052 (0.003) [0.404]
Wholesale Trade | 0.057 (0.003) [0.321] | 0.062 (0.003) [0.334] | 0.063 (0.003) [0.286] | 0.053 (0.003) [0.321] | 0.049 (0.002) [0.345] | 0.050 (0.002) [0.336] | 0.045 (0.002) [0.334] | 0.045 (0.002) [0.355] | 0.051 (0.002) [0.324] | 0.059 (0.002) [0.334]
Retail Trade | 0.069 (0.004) [0.321] | 0.058 (0.004) [0.327] | 0.068 (0.004) [0.293] | 0.063 (0.004) [0.327] | 0.087 (0.003) [0.339] | 0.059 (0.003) [0.326] | 0.079 (0.003) [0.334] | 0.050 (0.003) [0.335] | 0.053 (0.003) [0.338] | 0.056 (0.003) [0.294]
Transportation and Warehousing | 0.061 (0.004) [0.300] | 0.059 (0.005) [0.319] | 0.067 (0.005) [0.254] | 0.058 (0.003) [0.311] | 0.059 (0.004) [0.319] | 0.052 (0.004) [0.311] | 0.053 (0.004) [0.332] | 0.047 (0.004) [0.328] | 0.049 (0.004) [0.311] | 0.059 (0.003) [0.271]
Information | 0.055 (0.005) [0.315] | 0.062 (0.005) [0.316] | 0.055 (0.004) [0.316] | 0.053 (0.006) [0.315] | 0.048 (0.004) [0.313] | 0.049 (0.004) [0.328] | 0.047 (0.004) [0.331] | 0.044 (0.003) [0.313] | 0.051 (0.004) [0.342] | 0.057 (0.004) [0.328]
Finance and Insurance | 0.049 (0.004) [0.376] | 0.056 (0.004) [0.389] | 0.052 (0.004) [0.342] | 0.046 (0.004) [0.370] | 0.043 (0.003) [0.404] | 0.043 (0.003) [0.382] | 0.038 (0.004) [0.385] | 0.039 (0.003) [0.380] | 0.045 (0.003) [0.390] | 0.051 (0.004) [0.372]
Real Estate and Rental and Leasing | 0.077 (0.003) [0.302] | 0.081 (0.003) [0.299] | 0.094 (0.004) [0.301] | 0.067 (0.003) [0.313] | 0.064 (0.003) [0.295] | 0.066 (0.003) [0.318] | 0.056 (0.002) [0.309] | 0.056 (0.003) [0.299] | 0.072 (0.003) [0.326] | 0.080 (0.003) [0.339]
Professional, Scientific, and Technical Services | 0.078 (0.003) [0.351] | 0.098 (0.003) [0.352] | 0.079 (0.004) [0.340] | 0.067 (0.003) [0.348] | 0.068 (0.003) [0.364] | 0.063 (0.003) [0.361] | 0.049 (0.003) [0.365] | 0.073 (0.003) [0.344] | 0.061 (0.003) [0.370] | 0.070 (0.003) [0.365]
Management of Companies and Enterprises | 0.040 (0.005) [0.391] | 0.040 (0.005) [0.382] | 0.044 (0.005) [0.359] | 0.039 (0.006) [0.423] | 0.037 (0.004) [0.399] | 0.036 (0.004) [0.403] | 0.038 (0.005) [0.406] | 0.030 (0.003) [0.408] | 0.036 (0.003) [0.398] | 0.040 (0.004) [0.401]
Administrative, Support, Waste Management, Remediation | 0.101 (0.005) [0.319] | 0.112 (0.005) [0.310] | 0.124 (0.006) [0.308] | 0.092 (0.005) [0.316] | 0.073 (0.003) [0.345] | 0.079 (0.004) [0.323] | 0.069 (0.004) [0.314] | 0.059 (0.003) [0.333] | 0.074 (0.004) [0.341] | 0.118 (0.004) [0.301]
Educational Services | 0.071 (0.005) [0.452] | 0.070 (0.005) [0.463] | 0.052 (0.005) [0.459] | 0.080 (0.006) [0.447] | 0.084 (0.006) [0.437] | 0.057 (0.004) [0.432] | 0.029 (0.004) [0.440] | 0.105 (0.006) [0.420] | 0.055 (0.004) [0.412] | 0.040 (0.003) [0.457]
Health Care and Social Assistance | 0.052 (0.004) [0.391] | 0.059 (0.004) [0.422] | 0.051 (0.004) [0.362] | 0.051 (0.004) [0.391] | 0.047 (0.003) [0.390] | 0.041 (0.003) [0.395] | 0.035 (0.003) [0.396] | 0.039 (0.003) [0.386] | 0.044 (0.003) [0.390] | 0.045 (0.003) [0.409]
Arts, Entertainment, and Recreation | 0.140 (0.010) [0.301] | 0.095 (0.004) [0.275] | 0.291 (0.023) [0.311] | 0.062 (0.005) [0.320] | 0.110 (0.009) [0.297] | 0.120 (0.009) [0.304] | 0.064 (0.005) [0.296] | 0.078 (0.007) [0.290] | 0.185 (0.012) [0.322] | 0.153 (0.011) [0.307]
Accommodation and Food Services | 0.086 (0.004) [0.314] | 0.088 (0.004) [0.311] | 0.113 (0.008) [0.356] | 0.071 (0.003) [0.295] | 0.073 (0.003) [0.291] | 0.075 (0.004) [0.321] | 0.058 (0.002) [0.292] | 0.064 (0.003) [0.285] | 0.088 (0.004) [0.327] | 0.093 (0.004) [0.384]
Other Services (except Public Administration) | 0.088 (0.003) [0.331] | 0.097 (0.004) [0.369] | 0.100 (0.003) [0.293] | 0.079 (0.003) [0.328] | 0.076 (0.003) [0.334] | 0.077 (0.003) [0.340] | 0.061 (0.003) [0.346] | 0.072 (0.003) [0.347] | 0.089 (0.003) [0.324] | 0.086 (0.003) [0.344]
Note: Computed for QWI data covering 1993Q1–2008Q3. Each cell reports averages, standard errors (in parentheses), and effective missing data ratios (in square brackets). For computations, see text.
Fig. 6. Excess Reallocation Rates (churning) by age group and NAICS sector. (Note: NAICS sectors are referenced by their numeric abbreviation, where 00 = "All NAICS Sectors", 11 = "Agriculture, Forestry, Fishing, and Hunting", 21 = "Mining, Quarrying, and Oil and Gas Extraction", 22 = "Utilities", 23 = "Construction", 31–33 = "Manufacturing", 42 = "Wholesale Trade", 44–45 = "Retail Trade", 48–49 = "Transportation and Warehousing", 51 = "Information", 52 = "Finance and Insurance", 53 = "Real Estate and Rental and Leasing", 54 = "Professional, Scientific, and Technical Services", 55 = "Management of Companies and Enterprises", 56 = "Administrative, Support, Waste Management, Remediation", 61 = "Educational Services", 62 = "Health Care and Social Assistance", 71 = "Arts, Entertainment, and Recreation", 72 = "Accommodation and Food Services", 81 = "Other Services (except Public Administration)", 92 = "Public Administration".)
The national estimates from the QWI are an important enhancement to existing series because they include demographic and industry detail for worker and job flows compiled from underlying micro-data that have been consistently integrated by the Longitudinal Employer-Household Dynamics Program at the Census Bureau. The estimates produced for this paper were compiled exclusively from public-use data series and are available for download.
Acknowledgements

This research uses data from the Census Bureau's Longitudinal Employer-Household Dynamics (LEHD) Program, which was partially supported by the following grants: National Science Foundation (NSF) SES-9978093, SES-0339191 and ITR-0427889; National Institute on Aging AG018854; and grants from the Alfred P. Sloan Foundation. Both authors also acknowledge partial direct support by NSF grants CNS-0627680, SES-0820349, SES-0922005, and SES-0922494 and by the Census Bureau. No confidential data were used in this paper. All public-use Quarterly Workforce Indicators data can be accessed from http://www.vrdc.cornell.edu/news/data/qwi-public-use-data/. The national indicators developed in this paper can be accessed from http://www.vrdc.cornell.edu/news/data/qwi-national-data/. We are grateful for the comments and suggestions of many of our colleagues, past and present, too numerous to list here and thus listed at the website above and in the working paper version of this article. The opinions expressed in this paper are those of the authors and not of the US Census Bureau or any of the research sponsors.

References

Abowd, J.M., Corbel, P., Kramarz, F., 1999. The entry and exit of workers and the growth of employment: an analysis of French establishments. Review of Economics and Statistics 81 (2), 170–187.
Abowd, J.M., Haltiwanger, J.C., Lane, J.I., 2004. Integrated longitudinal employee-employer data for the United States. American Economic Review 94 (2), 224–229.
Abowd, J.M., Stephens, B.E., Vilhuber, L., 2006. Confidentiality protection in the Census Bureau's Quarterly Workforce Indicators. Technical Paper TP-2006-02, LEHD, US Census Bureau.
Abowd, J.M., Stephens, B.E., Vilhuber, L., Andersson, F., McKinney, K.L., Roemer, M., Woodcock, S.D., 2009. The LEHD infrastructure files and the creation of the quarterly workforce indicators. In: Dunne, T., Jensen, J.B., Roberts, M.J. (Eds.), Producer Dynamics: New Evidence from Micro Data. CRIW. University of Chicago Press for the NBER, pp. 149–230.
Abowd, J.M., Vilhuber, L., 2005. The sensitivity of economic statistics to coding errors in personal identifiers. Journal of Business and Economic Statistics 23 (2), 133–152.
Abowd, J.M., Zellner, A., 1985. Estimating gross labor force flows. Journal of Business and Economic Statistics 3, 254–283.
Anderson, P., Meyer, B., 1994. The extent and consequences of job turnover. In: Winston, C., Baily, M.N., Reiss, P.C. (Eds.), Brookings Papers on Economic Activity, Microeconomics. The Brookings Institution, pp. 177–249.
Benedetto, G., Haltiwanger, J., Lane, J., McKinney, K., 2007. Using worker flows in the analysis of the firm. Journal of Business and Economic Statistics 25 (3), 299–313.
Boon, Z., Carson, C.M., Faberman, R.J., Ilg, R.E., 2008. Studying the labor market using BLS labor dynamics data. Monthly Labor Review 131 (2), 3–16.
Burgess, S., Lane, J., Stevens, D., 2000. Job flows, worker flows, and churning. Journal of Labor Economics 18 (3), 473–502.
Burgess, S., Lane, J., Stevens, D., 2001. Churning dynamics: an analysis of hires and separations at the employer level. Labour Economics 8, 1–14.
Davis, S.J., Faberman, R.J., Haltiwanger, J., 2006. The flow approach to labor markets: new data sources and micro-macro links. Journal of Economic Perspectives 20 (3), 3–26.
Davis, S.J., Faberman, R.J., Haltiwanger, J.C., Rucker, I., 2010. Adjusted estimates of worker flows and job openings in JOLTS. In: Abraham, K., Spletzer, J., Harper, M. (Eds.), Labor in the New Economy. CRIW. University of Chicago Press for the NBER, pp. 187–221.
Davis, S.J., Haltiwanger, J., 1990. Gross job creation and destruction: microeconomic evidence and macroeconomic implications. NBER Macroeconomics Annual 5, 123–168.
Davis, S.J., Haltiwanger, J., 1992. Gross job creation, gross job destruction, and employment reallocation. Quarterly Journal of Economics 107 (3), 819–863.
Davis, S.J., Haltiwanger, J.C., Schuh, S., 1996. Job Creation and Destruction. MIT Press, Cambridge, MA.
Dunne, T., Roberts, M.J., Samuelson, L., 1989. Plant turnover and gross employment flows in the U.S. manufacturing sector. Journal of Labor Economics 7 (1), 48–71.
Pivetz, T.R., Searson, M.A., Spletzer, J.R., 2001. Measuring job and establishment flows with BLS longitudinal microdata. Monthly Labor Review 124 (4), 13–20.
Poterba, J.M., Summers, L.H., 1986. Reporting errors and labor market dynamics. Econometrica 54 (6), 1319–1338.
Rubin, D.B., 1981. The Bayesian bootstrap. The Annals of Statistics 9, 130–134.
Rubin, D.B., 1987. Multiple Imputation for Nonresponse in Surveys. Wiley, New York.
Spletzer, J.R., Faberman, R.J., Sadeghi, A., Talan, D.M., Clayton, R.L., 2004. Business employment dynamics: new data on gross job gains and losses. Monthly Labor Review 127, 29–42.