
PRINTED FROM the OXFORD RESEARCH ENCYCLOPEDIA, ECONOMICS AND FINANCE. (c) Oxford University Press USA, 2020. All Rights Reserved. Personal use only; commercial use is strictly prohibited (for details see Privacy Policy and Legal Notice).

date: 18 February 2020

Measurement Error: A Primer for Macroeconomists

Summary and Keywords

Most applied researchers in macroeconomics who work with official macroeconomic statistics (such as those found in the National Accounts, the Balance of Payments, national government budgets, labor force statistics, etc.) treat data as immutable rather than subject to measurement error and revision. Some of this error may be caused by disagreement or confusion about what should be measured. Some may be due to the practical challenges of producing timely, accurate, and precise estimates. The economic importance of measurement error may be accentuated by simple arithmetic transformations of the data, or by more complex but still common transformations to remove seasonal or other fluctuations. As a result, measurement error is seemingly omnipresent in macroeconomics.

Even the most widely used measures such as Gross Domestic Product (GDP) are acknowledged to be poor measures of aggregate welfare as they omit leisure and non-market production activity and fail to consider intertemporal issues related to the sustainability of economic activity. But even modest attempts to improve GDP estimates can generate considerable controversy in practice. Common statistical approaches to allow for measurement errors, including most factor models, rely on assumptions that are at odds with common economic assumptions which imply that measurement errors in published aggregate series should behave much like forecast errors. Fortunately, recent research has shown how multiple data releases may be combined in a flexible way to give improved estimates of the underlying quantities.

Increasingly, the challenge for macroeconomists is to recognize the impact that measurement error may have on their analysis and to condition their policy advice on a realistic assessment of the quality of their available information.

Keywords: measurement error, data revision, national accounts, seasonal adjustment, business cycles


Macroeconomic series are regularly revised, sometimes substantially. This reflects the fact that published macroeconomic series are not direct observations of macroeconomic variables. Rather, they are estimates of those variables, and estimates which are sometimes revised as more information becomes available. Table 1 gives a glimpse of the extent of the revisions for a number of Organisation for Economic Co-operation and Development (OECD) nations.

Table 1. GDP Revisions 1994Q4 to 2013Q4

[Table entries were lost in extraction. Surviving column headings: Mean Revision; Mean Absolute Revision. Surviving row labels: The Netherlands; New Zealand; United Kingdom; United States.]

Notes: These figures reflect the revision in quarter-over-quarter (Q/Q) and year-over-year (Y/Y) GDP growth rates (in percentage points) over the first three years after initial publication (Zwijnenburg, 2015, Table 2).

Table 1 shows that, to varying degrees, Gross Domestic Product (GDP) growth tends to be revised upward after its initial release. The mean absolute revision may exceed 0.5 percentage points for quarter-over-quarter (Q/Q) growth rates and approach 1 percentage point for year-over-year (Y/Y) rates. Much of this in turn stems from revisions to gross fixed capital formation or net exports; personal consumption expenditures typically undergo smaller revisions (see Zwijnenburg, 2015, Figures 7 and 8).
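The two summary statistics reported in Table 1 can be illustrated with a minimal Python sketch. The growth rates below are hypothetical and are not taken from Zwijnenburg (2015); the function name `revision_stats` and the sign convention (later estimate minus first release, so that a positive mean corresponds to an upward revision) are assumptions made for illustration.

```python
from statistics import mean

def revision_stats(first_release, later_release):
    """Return (mean revision, mean absolute revision) in percentage points.

    A revision is defined here as the later estimate minus the first one,
    so a positive mean indicates that growth tends to be revised upward.
    """
    revisions = [later - first for first, later in zip(first_release, later_release)]
    return mean(revisions), mean(abs(r) for r in revisions)

# Hypothetical Q/Q growth rates (percent), for illustration only.
first = [0.4, 0.1, -0.2, 0.6]
later = [0.7, 0.0, 0.3, 0.8]
mean_rev, mean_abs_rev = revision_stats(first, later)
print(f"mean revision = {mean_rev:.3f}, mean absolute revision = {mean_abs_rev:.3f}")
```

The mean revision measures any systematic direction in the revisions, while the mean absolute revision measures their typical size regardless of sign, so the second can be large even when the first is close to zero.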

This article is intended as an introduction to and overview of measurement error and revision in macroeconomic data. It will be useful for most applied researchers in macroeconomics who work with official macroeconomic statistics, such as those found in the National Accounts, the Balance of Payments, national government budgets, labor force statistics, etc. Many of the lessons to be discussed also apply much more broadly whenever researchers treat data as immutable rather than subject to estimation error and revision.

Macroeconomists may be familiar with measurement error in the context of debates about how well official statistics capture productivity growth. That literature examines both the sources of productivity growth and the construction of productivity statistics (e.g., see Aghion, Bergeaud, Boppart, Klenow, & Li, 2019, and the references therein). In so doing, it investigates the accuracy of productivity statistics. In contrast, the focus in this article is on the precision of these and other macroeconomic series. (Accuracy refers to the closeness of an estimate to the true value. Precision refers to the closeness of multiple estimates to each other. Estimates which are widely scattered around the true value but whose average is correct may be said to be accurate but not precise. Estimates which are tightly clustered but consistently tend to deviate from the true value in a particular direction may be said to be precise but not accurate.)
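The distinction between accuracy and precision can be made concrete with a small Python sketch. The numbers are hypothetical, and summarizing accuracy by the bias of the average estimate and precision by the standard deviation of the estimates is one simple choice among several.

```python
from statistics import mean, pstdev

def bias_and_spread(estimates, true_value):
    """Bias (accuracy) and standard deviation (precision) of a set of estimates."""
    return mean(estimates) - true_value, pstdev(estimates)

TRUE_VALUE = 2.0  # hypothetical true growth rate, percent

scattered = [0.5, 3.5, 1.0, 3.0]   # centred on the truth, widely dispersed
clustered = [2.9, 3.0, 3.1, 3.0]   # tightly grouped, but consistently too high

bias_s, spread_s = bias_and_spread(scattered, TRUE_VALUE)
bias_c, spread_c = bias_and_spread(clustered, TRUE_VALUE)
print(f"scattered: bias = {bias_s:.2f}, spread = {spread_s:.2f}")
print(f"clustered: bias = {bias_c:.2f}, spread = {spread_c:.2f}")
```

In the terminology above, the first set of estimates is accurate but not precise, and the second is precise but not accurate.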

The article first reviews sources of measurement error in macroeconomics, including definitional issues as well as the role of the statistical agency. It then presents the most widely used statistical models of measurement errors and discusses the relationship between measurement error and data revision. Thereafter, the impact of common data transformations on the importance of measurement errors is reviewed, showing conditions under which highly reliable raw data may give rise to much less reliable derived series. These results are illustrated with a number of popular macroeconomic series. Two widespread but more complex data transformations are then examined, seasonal and cyclical adjustment, which have similarities in their econometric treatment as well as their implications for measurement error and data revision. The conclusions review some of the implications for macroeconomic analysis and forecasting.

Measurement Errors and Macroeconomic Statistics

Understanding the likely importance of macroeconomic measurement errors requires some understanding of their sources and typical characteristics. Errors can result from ambiguity or disagreement about both what should be measured and how it should be measured. Examples of both types of problems in contemporary macroeconomic settings are well-known.

What Are We Trying to Measure?

As priorities change, the quantities economists seek to measure may evolve or be redefined. To the extent that economists in the past used primitive or outmoded concepts, this implies that their work may be seen to suffer from measurement errors caused by conceptual or definitional errors. The potential importance of such problems in macroeconomics may be appreciated by considering its use of GDP. Macroeconomists frequently analyze the welfare implications of a policy or other change by examining its impact on GDP (or GDP per capita). Heuristically, this stems from the notion that higher production levels should be potentially Pareto-improving for society overall. But it also ignores widely acknowledged issues with how GDP is commonly measured. Many modifications have been proposed to the System of National Accounts, which seek to correct important conceptual shortcomings in GDP and thereby better capture aggregate welfare (for more discussion, see Stiglitz et al., 2009, and Jorgenson, 2018).

One of the first of these modifications was the replacement of Gross National Product (i.e., production by a country’s people) by GDP (production within the country’s borders). Nordhaus and Tobin (1972) proposed their Measure of Economic Welfare (MEW), which added corrections for components of consumption that GDP omits, such as leisure and non-market production activity. The fact that all of these measures ignore intertemporal tradeoffs has also been acknowledged as a problem. For example, one may temporarily boost consumption by reducing investment and running down the capital stock, or boost production by increasing the extraction rate of a non-renewable resource. However, such boosts will not be sustainable, implying that short-run improvements in consumption or production may not be welfare-improving when future repercussions are considered. To address such issues, substantial effort has been made, on the one hand, to integrate capital accounts in national accounting systems (for example, see World Bank, 2006) and, on the other hand, to shift the focus from actual to sustainable levels of production or consumption (as in World Bank, 2011). These build on the concept of genuine savings proposed by Hamilton and Clemens (1999), which modifies the traditional measures of aggregate savings by adding the value of investment in human capital and subtracting the value of resource depletion (see Carey, Lange, & Wodon, 2018, for recent advances in their measurement). Welfare improvements are only sustainable if genuine savings per capita are non-negative.

To date, while the conceptual framework for these adjustments to GDP exists, the detailed series required for their accurate and timely measurement do not, which has limited their incorporation into the quarterly national accounts series typically used by macroeconomists. However, even the limited modifications that have been incorporated to date have sometimes proved problematic.

Irish GDP Growth and Intellectual Property

One such example may be seen in Figure 1, which shows a substantial change in the character of official Irish GDP growth starting in 2015Q1. Even during the Great Recession of 2008–2009, annualized quarterly growth rates prior to 2015 fluctuated in a range of ±20%. Thereafter growth became substantially more volatile, starting with a rate exceeding 100% in 2015Q1. This extraordinary growth rate (26.3% on a Q/Q basis) was largely “paper-based.” Among other things, it reflected the decision of multinational firms to move their portfolios of intellectual property assets to Ireland to take advantage of favorable local tax treatments. This was duly incorporated into the National Accounts as an inflow of foreign (intangible) capital, an increase in investment, and an increase in foreign indebtedness. Taken together with other corporate restructurings and developments in aircraft leasing (for aircraft operated abroad but owned by Irish leasing firms, another “paper-based” activity), this led the Bank of Ireland (2016, p. 7) to argue that:

the 2015 National Income and Expenditure accounts are not reflective of actual economic activity taking place in Ireland. Instead, these developments reflect the statistical on-shoring of economic activity. . . . As a result, National Accounts data now include a very significant amount of activity carried out elsewhere, but formally recorded as part of Irish GDP and GNP.

Figure 1. Irish real GDP growth.

Despite this assessment, it should be stressed that these figures were considered to be logically correct and consistent with contemporary international standards, as Bank of Ireland (2016) acknowledges. The Bank's disagreement instead lay in the definition of aggregate economic activity or welfare that was pertinent for macroeconomic analysis. This demonstrates that subtle differences in the definition of aggregate economic output can give rise to what appear to be substantial measurement errors, even in series as widely used and central to macroeconomic analysis as GDP.
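The relationship between the two growth figures quoted for 2015Q1 (26.3% on a Q/Q basis and an annualized rate exceeding 100%) can be checked with a short Python sketch. It assumes that the annualized rate is obtained by compounding the quarterly rate over four quarters, which is the usual convention but is not stated explicitly in the text; the function name is hypothetical.

```python
def annualize_quarterly(qoq_percent):
    """Convert a Q/Q growth rate (percent) to a compound annual rate (percent)."""
    return ((1.0 + qoq_percent / 100.0) ** 4 - 1.0) * 100.0

irish_2015q1 = annualize_quarterly(26.3)
print(f"Annualized growth: {irish_2015q1:.1f}%")
```

Under this convention a 26.3% quarterly increase corresponds to an annualized rate of roughly 154%, consistent with the description of a rate exceeding 100%.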

A somewhat related phenomenon occurs in Norway, a country with a smaller population but substantial offshore oil production. Rather than focus on measures of total GDP, the central bank focuses on “mainland” economic activity. (e.g., see various issues of the Norges Bank’s Monetary Policy Report.)

Having acknowledged that such conceptual disagreements or confusions may cause substantial measurement errors, the focus now shifts to more applied problems (the how rather than the what of measurement), starting with the problem facing the statistical agency.

The Role of the Statistical Agency

Statistical agencies have a complex mandate. Given limited resources, they try to produce statistics that are timely, precise, and relevant to diverse communities of end-users in both the public and private sectors. Doing so requires them to make judgmental tradeoffs between costs, precision, timeliness, and accuracy. The degree to which their statistics contain measurement errors is a reflection of some of the choices that they make, and these may vary considerably from one national statistical agency to another.

Sargent (1989) proposes two alternative models of how statistical agencies may respond to known measurement errors. He assumes that in each period they try to estimate an $n \times 1$ vector of “true” values $Z_t$ but that they can only observe an error-corrupted version of $Z_t$, namely

$$z_t = Z_t + v_t \tag{1}$$

where $z_t$, $Z_t$ and $v_t$ are assumed to be covariance stationary $n \times 1$ vectors of stochastic processes and $v_t$ is a mean-zero vector of measurement errors whose realizations are independent of $Z$. These assumptions are consistent with what has become known as the “classical” measurement error framework.

In the simplest case the statistical agency publishes zt. Note that under some circumstances agencies may choose this approach even when they are aware that measurement errors are large and can easily be reduced. For example, when transparency, objectivity, reproducibility, and simplicity are key concerns, it may be desirable to simply publish the “raw” data and leave further adjustments to the end-users. Perhaps the most common example of this involves seasonal adjustment. While the concept of interest for most users of macroeconomic series is a seasonally adjusted value, the availability of seasonally adjusted figures varies considerably from one national statistical agency to another. End-users understand that the available series contain a serially correlated measurement error (the seasonal effect) despite the fact that many standardized tools for deseasonalization are readily available. For users of U.S. data, the lack of seasonally adjusted series for aggregate stock dividends is a good example (see Pettenuzzo, Sabbatucci, & Timmermann, 2018).

Alternatively, the statistical agency may instead report

$$\tilde{z}_t = E(Z_t \mid z_t, z_{t-1}, \ldots) \tag{2}$$

That is, they use a statistical model to estimate the unobserved “true” values $Z_t$ based on current and past values of their raw series. Sargent (1989) refers to this as the case where the agency “filters” the data, and notes that standard methods for linear state-space models (such as the Kalman filter) naturally lend themselves to such applications. Since in general $E(Z_t \mid z_t, z_{t-1}, \ldots) \neq E(Z_t \mid \ldots, z_{t+1}, z_t, z_{t-1}, \ldots)$, this suggests revising previously published estimates $\tilde{z}_t$ as new raw data $z_{t+1}, z_{t+2}, \ldots$ arrive (see the OECD/Eurostat Guidelines on Revisions Policy and Analysis). This also implies that published estimates should become more precise as they are revised. This seems well suited to describing the common practice of revising recently published estimates at regular intervals after their initial release.
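The idea of an agency that “filters” its raw data can be illustrated with a minimal scalar Kalman filter in Python. This is only a sketch under strong assumptions that are not part of Sargent's more general vector framework: the true value is taken to follow a stationary AR(1) process with known parameters, the measurement error is classical, and the raw data and parameter values are hypothetical.

```python
def kalman_filter_ar1(raw, rho, q, r):
    """Filtered estimates E(Z_t | z_t, z_{t-1}, ...) for a scalar AR(1) truth.

    Assumed model (illustrative only):
        Z_t = rho * Z_{t-1} + u_t,  Var(u_t) = q
        z_t = Z_t + v_t,            Var(v_t) = r   (classical measurement error)
    Returns the filtered estimates and their error variances.
    """
    estimate = 0.0
    variance = q / (1.0 - rho ** 2)     # unconditional variance of Z_t
    estimates, variances = [], []
    for z in raw:
        # Prediction step: propagate last period's estimate through the AR(1).
        pred = rho * estimate
        pred_var = rho ** 2 * variance + q
        # Update step: weight the new raw observation by the Kalman gain.
        gain = pred_var / (pred_var + r)
        estimate = pred + gain * (z - pred)
        variance = (1.0 - gain) * pred_var
        estimates.append(estimate)
        variances.append(variance)
    return estimates, variances

raw_series = [1.2, 0.8, 1.5, 0.3, -0.4]   # hypothetical raw data z_t
filtered, error_var = kalman_filter_ar1(raw_series, rho=0.8, q=1.0, r=0.5)
print(filtered)
print(error_var)
```

In this setting the error variance of each filtered estimate is below the variance `r` of the raw measurement error, which is the sense in which filtering can improve on simply publishing the raw series. Revising earlier estimates as later data arrive would require a smoother rather than a filter, which is not shown here.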

An alternative view of data revision focuses on the occasional and often irregular revision of an entire series, usually due to changes in its definition and/or the methodology used to estimate it. Here such changes will be referred to as “benchmark” revisions. They may simply be due to changes in the units of measure (e.g., the base year used for real-value calculations). In other cases, however, statistical agencies may decide to revise their methodologies in order to better adapt them to a changing economic environment or changing user demands. For example, as new technologies introduce new products and services, there is often a lag before they are incorporated into the samples or benchmarks used by statistical agencies. If those new products and services tend to be the ones with high productivity and output growth, this suggests that initial estimates of overall production and productivity may be biased downward until benchmark revisions correct the situation. Aghion et al. (2019) provide a formal model in which the imputation of prices for products from new firms creates a systematic downward bias in estimates of productivity growth. Using U.S. data, they estimate that this understated annual productivity growth by roughly 0.5% over the 1983–2013 period. Bognanni and Zito (2016) present evidence that benchmark revisions since 1968 have tended to revise initial estimates of U.S. productivity growth rates upward by an average of 0.5%, and that the average revisions are more positive for the lowest initial estimates. Youll (2008) reviews related arguments and evidence that early estimates of U.K. GDP growth are biased downward; Bank of England fan charts from around this period showed that the Bank expected recent estimates of GDP growth to be underestimates. However, Robinson (2016) argues that the evidence of bias disappeared in the aftermath of the 2008 recession.

Data Revision

Measurement Errors and Data Revisions

The fact that macroeconomic series are revised is the clearest evidence that such series are estimates and therefore contain measurement errors. That said, there is often unnecessary confusion about the relationship between data revisions and measurement error. Fortunately, the following observations can be used to clarify matters:

  1. It is sometimes claimed that series which undergo less revision must be more “reliable.” This is false: the case in which the statistical agency simply publishes $z_t$ and never revises it shows that the absence of data revision tells us nothing about the absence or size of measurement errors.

  2. It is sometimes claimed that measurement errors must be “at least as large as” data revisions. This is based on the assumption that the statistical agency’s releases are generated by a process like that in (2). Because the information used to construct the earlier releases is only a subset of that used to construct subsequent releases, the revised estimate must be “no worse” than the earlier release. In the case where the conditional expectation in (2) is formed by linear projection, the expected mean-squared error (MSE) of the later release must be no greater than that of the earlier release. (The only case in which the two MSEs would be equal is the case in which the series was not revised.)

  3. The assumption of a process like that in (2) may be reasonable in the context of revising recently published estimates at regular intervals after their initial release, but it seems untenable in the context of benchmark revisions. Whether series after benchmark revisions are more or less accurate depends on the specific context and the task at hand. The case of Irish GDP is a clear example where the central bank felt that revised GDP (and GNP) figures were less accurate for the purpose of guiding monetary policy than the series that they replaced.
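The second observation can be illustrated with a small Python sketch of a projection-based agency. It rests on assumptions made purely for illustration: the true value has a known variance, each release is the linear projection of the true value on all noisy signals received so far, and the signals have independent errors with a common variance. The function name and numerical values are hypothetical.

```python
def projection_mse(signal_var, noise_var, n_signals):
    """MSE of the linear projection of Z on n independent noisy signals of Z."""
    return 1.0 / (1.0 / signal_var + n_signals / noise_var)

signal_var, noise_var = 1.0, 0.5         # hypothetical variances
mse_first = projection_mse(signal_var, noise_var, 1)     # first release
mse_revised = projection_mse(signal_var, noise_var, 2)   # after one more signal

# Under projection, the revision is orthogonal to the error of the revised
# release, so the variance of the revision equals the reduction in MSE.
revision_var = mse_first - mse_revised
print(mse_first, mse_revised, revision_var)
```

With these inputs the MSE falls from one release to the next, and the variance of the revision is smaller than the MSE of the first release, which is the sense in which measurement errors are “at least as large as” revisions under a process like (2).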

Establishing a tighter relationship between data revisions and measurement errors requires stronger assumptions. For example, if one assumes that the “final” published estimate is the true value $Z_t$, then the extent to which any estimate is eventually revised would precisely equal its measurement error. Kapetanios and Yates (2010) and Cunningham, Eklund, Jeffery, Kapetanios, and Labhard (2012) propose an alternative model of the statistical agency which replaces Sargent’s projection model and (2) with a model in which the agency continues to draw samples from the population of $z_t$ over time and weights the results to produce the published estimate $\tilde{z}_t$. Together with the assumption that the effective sample size of successive surveys declines at a strictly geometric rate, this enables them to deduce the overall measurement error from the declining volatility of successive data revisions. Jacobs and van Norden (2011) instead assume that measurement errors are a linear combination of news and noise measurement errors. Together with strict assumptions on the dynamics of $Z_t$, this enables them to consistently estimate overall measurement errors from the data revision process. However, neither of these methods has seen wide application.

Modeling Data Revisions

A considerable literature has studied the characteristics of data revisions in macroeconomic series (see the extensive survey by Croushore, 2011, or the shorter surveys contained in Croushore & Stark, 2001, and Jacobs & van Norden, 2011). It includes examinations of which of the two behavioral models proposed by Sargent (1989) seems to better capture the properties of published data. In particular, measurement errors $v_t$ are said to be noise when $\mathrm{Cov}(Z_t, v_t) = 0$. This is consistent with the assumptions of the classical measurement error model and (1). One implication of this is that data revisions should be partially forecastable. (To see this, note that $\mathrm{Cov}(Z_t, v_t) = 0$ together with (1) implies that $\mathrm{Cov}(z_t, v_t) \neq 0$. Once $z_t$ is observed, this gives information on the unobserved value $v_t$. If values of $v_t$ are independent across successive data releases, then $\mathrm{Cov}(z_t, v_t) > 0$. This implies that unusually low values of $z_t$ should be revised upwards, and vice versa.) It also implies that the true values $Z_t$ can be no more variable than the observed series $z_t$. (To see this, note that (1) implies $\mathrm{Var}(z_t) = \mathrm{Var}(Z_t) + \mathrm{Var}(v_t) + 2\,\mathrm{Cov}(Z_t, v_t)$. When $\mathrm{Cov}(Z_t, v_t) = 0$, this implies $\mathrm{Var}(z_t) > \mathrm{Var}(Z_t)$ whenever $\mathrm{Var}(v_t) > 0$.)

Alternatively, measurement errors $v_t$ are said to be news when $\mathrm{Cov}(z_t, v_t) = 0$, or, if the statistical agency reports $\tilde{z}_t$ instead of $z_t$, when $\mathrm{Cov}(\tilde{z}_t, v_t) = 0$. The latter condition will be satisfied where $\tilde{z}_t$ is based on (2), and implies that data revisions will be unforecastable. (Slightly more precisely, if $\tilde{z}_t$ is formed as the linear projection of $Z_t$ onto the information set spanned by $\{z_t, z_{t-1}, \ldots\}$, then the measurement error $(\tilde{z}_t - Z_t)$ must be orthogonal to $\{z_t, z_{t-1}, \ldots\}$.) It also implies that the true values $Z_t$ must be at least as variable as the observed series $\tilde{z}_t$.
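The contrasting implications of the noise and news models can be seen in a small simulation, sketched here in Python. The Gaussian distributions, variances, sample size, and variable names are all assumptions chosen for illustration; the measurement error is defined as the published value minus the true value, in line with (1).

```python
import random

def cov(a, b):
    """Sample covariance of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

rng = random.Random(0)
n = 50_000

# Noise: the release is the truth plus an independent error.
truth_noise = [rng.gauss(0, 1) for _ in range(n)]
err_noise = [rng.gauss(0, 0.5) for _ in range(n)]
release_noise = [Z + v for Z, v in zip(truth_noise, err_noise)]

# News: the truth is the release plus information not yet available.
release_news = [rng.gauss(0, 1) for _ in range(n)]
news = [rng.gauss(0, 0.5) for _ in range(n)]
truth_news = [z + e for z, e in zip(release_news, news)]
err_news = [z - Z for z, Z in zip(release_news, truth_news)]   # v = release - truth

noise_cov_release_err = cov(release_noise, err_noise)   # close to Var(v) = 0.25
news_cov_release_err = cov(release_news, err_news)      # close to 0

print("noise: Cov(release, error) =", round(noise_cov_release_err, 3))
print("news:  Cov(release, error) =", round(news_cov_release_err, 3))
print("noise: Var(release) > Var(truth):",
      cov(release_noise, release_noise) > cov(truth_noise, truth_noise))
print("news:  Var(truth) > Var(release):",
      cov(truth_news, truth_news) > cov(release_news, release_news))
```

Under noise, the published value is correlated with its own error (so revisions are partly forecastable) and is more variable than the truth; under news, the published value is uncorrelated with its error and is less variable than the truth.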

Empirical work since at least Croushore and Stark (2003) and Faust, Rogers, and Wright (2005) has found that, while results vary somewhat from one series to another, revisions to most macroeconomic series do not fit cleanly into either model. This in turn has led to the development of models in which measurement errors may be a combination of both news and noise errors. Such models will be discussed, but first a more popular approach should be considered.

Factor Models and Data Revision

The widespread adoption of dynamic factor models (DFMs) in macroeconomics reflects the usefulness of approaches pioneered by authors such as Forni, Hallin, Lippi, and Reichlin (2000) and Stock and Watson (2002). While many variations on the basic DFM have been proposed, among the most influential for policy-making institutions and forecasters is the work by Giannone, Reichlin, and Small (2008) and Banbura, Giannone, Modugno, and Reichlin (2013) and others, who adapt the DFM to the problem of integrating and updating estimates from a broad range of data sources with varying reporting frequencies and time lags. ( uses their methodology to produce reports for a variety of commercial clients). The vast majority of such models (including the DFMs of Giannone, Reichlin & Small, 2008, and Banbura et al., 2013) use only the latest available data vintage of each variable on the assumption that the effects of measurement errors should be slight in a data-rich environment. (A referee noted that such arguments implicitly assume that the “correct” number of factors are extracted, and, in some cases, that measurement errors are uncorrelated across variables—i.e., H is diagonal—the latter assumption is literally incredible.) While this may seem a reasonable approach if data revisions are “news,” a “noise” model suggests that additional information can be obtained by including multiple data vintages and attempting to filter the noise.

The simplest approach would be to include all data vintages for all series in the DFM. However, this overlooks the fact that DFMs impose structural assumptions on the relationships between variations in the factors and the idiosyncratic errors. To see this, one may write a DFM in its state-space representation,

$$y_t = A\,\alpha_t + \epsilon_t \tag{3}$$

$$\alpha_{t+1} = T\,\alpha_t + R\,\eta_t \tag{4}$$

where $\epsilon_t \sim \text{i.i.d. } N(0, H)$, $\eta_t \sim \text{i.i.d. } N(0, Q)$ and $\epsilon_t \perp \eta_s \ \forall s, t$. (For clarity, the constants in these equations have been suppressed and it is assumed that the system matrices $A$, $T$, $R$, $H$ and $Q$ are constant. Compare, for example, with equations (1) and (2) in Banbura et al., 2013.) The orthogonality assumption between $\epsilon_t$ and $\eta_s$ is both central to identification in such models and the stumbling block in using them to account for data revision.

To understand why, consider the case where the vector $y_t$ contains only the various releases of a single series, and let the scalar $Y_t$ be the unobserved “true” value of the series at time $t$. One could use (3) and (4) to estimate $\alpha_t$. However, those estimates will be biased for $Y_t$ unless the measurement errors are pure noise, since that is the only case in which the idiosyncratic errors $\epsilon_t$ are uncorrelated with the true values $\alpha_t$ and therefore the condition $\epsilon_t \perp \eta_s \ \forall s, t$ is respected.

To solve this problem, Jacobs and van Norden (2011) and Kishor and Koenig (2012) relax the orthogonality conditions. The former’s model is a special case of (3) and (4) where they set






where $l$ is the number of releases in $y_t$, $\nu_t$ is the vector of news measurement errors, $U_1$ is an $l \times l$ matrix with zeros below the main diagonal and ones everywhere else, and $R_1, R_2$ control the variability of the true values and the news shocks $\nu_t$. (The dynamics for $Y_t$ have been omitted for simplicity. See Jacobs & van Norden, 2011, equations (7) and (8) and their section “Modeling Data Revisions” for details.) Note that $A$ defines the $l$ observed releases to be the sum of the true value $Y_t$ and news errors $\nu_t$. Both are generated from the vector of i.i.d. errors $\eta_t$ via $R$, which imposes the news structure. The first row of $R$ ensures that all elements of $\eta_t$ affect $Y_t$; the remaining rows of $R$ strip away a decreasing number of these to produce the $l$ different data vintages. The result is that $R$ imposes the correlation structure between the various data releases and the true values as well as the variance inequalities that are implied by news errors. (Jacobs & van Norden, 2011, also show that in the more general case when $Y_t$ is persistent, all these parameters are identified.) This comes at the cost of a much larger state vector. In a DFM, the state vector for a comparable univariate model would be a scalar containing just the true value $Y_t$. Here it contains an additional $l$ measurement errors.
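The matrix described above as having zeros below the main diagonal and ones everywhere else can be sketched in Python as follows. This only illustrates the triangular pattern and how it cumulates shocks; it is not a reproduction of the full system matrices of Jacobs and van Norden (2011), and the helper names and shock values are hypothetical.

```python
def upper_triangular_ones(l):
    """l x l matrix with zeros below the main diagonal and ones elsewhere."""
    return [[1 if col >= row else 0 for col in range(l)] for row in range(l)]

def mat_vec(matrix, vector):
    """Plain-Python matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

U1 = upper_triangular_ones(3)
# [[1, 1, 1],
#  [0, 1, 1],
#  [0, 0, 1]]

eta = [0.4, -0.1, 0.2]            # hypothetical i.i.d. shocks, one per release
cumulated = mat_vec(U1, eta)      # row i sums shocks i through l
print(U1)
print(cumulated)
```

Each successive row involves one fewer shock than the row before it, which is the triangular pattern that allows successive releases to differ by a decreasing number of shocks.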

The Simple Arithmetic of Measurement Errors

While measurement errors may be important in some situations, there is little indication of when this is likely to be a problem in practice. To remedy this one should consider the impact of common practices which may exacerbate the seriousness of measurement errors.

True values and measurement errors have so far been considered in the context of a vector Z of arbitrary dimension. In order to focus on several commonly used univariate and bivariate transformations, the notation that has been used to this point must be modified.

A Change in Notation

In this section, $X_t, Y_t, Z_t$ are used to denote the true values of three distinct time series, and $x_t, y_t, z_t$ to denote their estimated or published values. For compactness, the time subscript $t$ (which is assumed to range from 1 to $T$) may be omitted. Each series has its associated measurement error defined as

$$\epsilon^x_t \equiv x_t - X_t, \qquad \epsilon^y_t \equiv y_t - Y_t, \qquad \epsilon^z_t \equiv z_t - Z_t.$$

It is not assumed that the $\epsilon$’s have means of zero (i.e., estimates could be biased) and they may or may not be independent of their corresponding variable’s true values (i.e., they could be news or noise). The objective is to understand the economic importance of measurement error in $z$. To do so, one may focus on its noise-to-signal (NS) ratio, defined as

$$\varphi(z)^2 \equiv \frac{E\left[(\epsilon^z)^2\right]}{\sigma_z^2},$$

with $\sigma_z^2$ the variance of $z$. Note that when $E(\epsilon^z) \neq 0$, the numerator will be larger than $\mathrm{Var}(\epsilon^z)$, so that $\varphi(z)$ captures both the variability of measurement errors as well as any bias.


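Now suppose that $z$ is not measured directly, but is instead constructed as the sum of two variables $x$ and $y$ so that

$$z = x + y.$$

Provided the true values satisfy the same identity, so that $\epsilon^z = \epsilon^x + \epsilon^y$, this in turn implies that

$$\varphi(z)^2 = \frac{\sigma_x^2}{\sigma_z^2}\,\varphi(x)^2 + \frac{\sigma_y^2}{\sigma_z^2}\,\varphi(y)^2 + \frac{2\,E(\epsilon^x \epsilon^y)}{\sigma_z^2} \tag{14}$$

where $\sigma_i^2$ is the variance of variable $i$. This last equation relates the NS ratio for $z$ to a weighted average of the NS ratios of its components $x$ and $y$, as well as a term in the cross-moment of their measurement errors. When the latter is zero, the larger $\sigma_x^2$ (or $\sigma_y^2$) is relative to $\sigma_z^2$, the more weight it receives in $\varphi(z)^2$, a result which seems intuitive.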

Less intuitive perhaps is the case where both $\sigma_x^2/\sigma_z^2 > 1$ and $\sigma_y^2/\sigma_z^2 > 1$ (as may occur when $\mathrm{Cov}(x, y) < 0$, for example). In this case, one may find $\varphi(z)^2 > \max(\varphi(x)^2, \varphi(y)^2)$. This arises simply because $z$ is less variable than either of its components $x$ or $y$, so the relative importance of their respective measurement errors becomes larger.
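A minimal Python sketch of this case follows. It assumes the decomposition takes the form described above, namely the components' NS ratios weighted by their variances relative to that of $z$, plus twice the cross-moment of the errors divided by the variance of $z$. The function name and all input values are hypothetical.

```python
def ns_ratio_of_sum(var_x, var_y, cov_xy, ns_x, ns_y, err_cross=0.0):
    """NS ratio phi(z)^2 for z = x + y, from the decomposition in the text.

    ns_x and ns_y are phi(x)^2 and phi(y)^2; err_cross is E(eps_x * eps_y).
    """
    var_z = var_x + var_y + 2.0 * cov_xy
    return (var_x / var_z) * ns_x + (var_y / var_z) * ns_y + 2.0 * err_cross / var_z

# Hypothetical inputs: two equally variable, strongly negatively correlated series
# whose measurement errors are uncorrelated with each other.
ns_z = ns_ratio_of_sum(var_x=1.0, var_y=1.0, cov_xy=-0.9, ns_x=0.02, ns_y=0.03)
print(f"phi(z)^2 = {ns_z:.3f}")
```

Here each component is five times as variable as the sum, so NS ratios of 2% and 3% for the components translate into an NS ratio of about 25% for $z$.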

This simple result leads to two rules of thumb for empirical researchers:

  1. Measurement errors may be magnified when working with the first-difference of persistent series.

  2. Measurement errors may be magnified when working with the difference of two correlated series.

Consider each of these cases in turn.

Measurement Errors in First Differences

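In the case of first differences, one may set $y_t = -x_{t-1}$ so that $z_t = x_t - x_{t-1} = \Delta x_t$. Assuming that $\sigma_y^2 = \sigma_x^2$ and $\varphi(x) = \varphi(y)$, (14) reduces to

$$\varphi(\Delta x)^2 = \frac{2\sigma_x^2}{\sigma_{\Delta x}^2}\,\varphi(x)^2 - \frac{2\,E(\epsilon^x_t \epsilon^x_{t-1})}{\sigma_{\Delta x}^2} \tag{15}$$

Even when $E(\epsilon^x_t \epsilon^x_{t-1}) = 0$, the NS ratio in first differences will be larger than that in the level of the series by a factor of $2\sigma_x^2/\sigma_{\Delta x}^2$. This ratio will typically increase with the persistence of $x$. For example, in the case where $x$ follows a stationary AR(1) process with first-order autocorrelation $\rho$, the variance ratio will be proportional to $1/(1-\rho)$ and will be greater than 1 for all $0 < \rho < 1$.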
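The AR(1) case can be worked through in a short Python sketch. It uses the standard result that, for a stationary AR(1) process, the variance of the first difference equals twice the variance of the level times one minus the autocorrelation; the function name and the values of the autocorrelation are chosen only for illustration.

```python
def first_difference_factor(rho):
    """Ratio 2*var(x)/var(dx) for a stationary AR(1) with autocorrelation rho."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    var_x = 1.0                          # normalise the variance of the level
    var_dx = 2.0 * var_x * (1.0 - rho)   # Var(x_t - x_{t-1}) = 2*var_x*(1 - rho)
    return 2.0 * var_x / var_dx

factors = {rho: first_difference_factor(rho) for rho in (0.0, 0.5, 0.9, 0.99)}
for rho, factor in factors.items():
    print(f"rho = {rho:4.2f}: NS ratio multiplied by about {factor:.1f}")
```

The factor equals $1/(1-\rho)$, so it rises from 1 for a serially uncorrelated series to about 100 when the autocorrelation is 0.99, assuming serially uncorrelated measurement errors.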

As an example, consider the U.S. unemployment rate (measured in percent). Taking the 70-year period starting in January 1948, the rate had a sample variance of 2.67 while that of monthly changes in the rate was 0.044, implying that $\sigma_x^2/\sigma_{\Delta x}^2 = 60.8$. One should therefore expect that, even when measurement errors have a mean of zero and are serially independent so that $E(\epsilon^x_t \epsilon^x_{t-1}) = 0$, the NS ratio for monthly changes in the unemployment rate should be roughly 120 times larger than that for the level of the rate itself. Kozicki and Hoffman (2004) provide evidence of similar problems with U.S. CPI data which (like the unemployment rate) is typically reported to only one decimal place. While this rounding is relatively innocuous for measuring the level of the series, they document the severe measurement error that this discretization sometimes introduces for inflation rates measured over relatively short time spans.
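The arithmetic behind the unemployment example can be reproduced in a few lines of Python, using the two sample variances as quoted in the text. With these rounded inputs the variance ratio comes out slightly below the 60.8 reported, presumably because that figure was computed from unrounded variances.

```python
var_level = 2.67      # sample variance of the unemployment rate, from the text
var_change = 0.044    # sample variance of its monthly change, from the text

variance_ratio = var_level / var_change      # about 60.7 with these rounded inputs
amplification = 2.0 * variance_ratio         # roughly 120

print(f"variance ratio = {variance_ratio:.1f}")
print(f"NS ratio amplification = {amplification:.0f}")
```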

Measurement Errors in Differences and Ratios

Now suppose that $z$ is constructed as the difference of two series so that $z = y - x$, as may be found for example in the case of fiscal surpluses or deficits, current account balances, or net savings. Alternatively, one may be interested in cases where $z$ is calculated as the ratio of two series, $z = y/x$, as would be the case for unemployment rates (unemployment as a fraction of the labor force) or productivity measures (the amount produced per unit input). In this case $\ln z = \ln y - \ln x$, so one could work with the NS ratios of the logarithms of the variables. From (14) this gives

$$\varphi(z)^2 = \frac{\sigma_x^2}{\sigma_z^2}\,\varphi(x)^2 + \frac{\sigma_y^2}{\sigma_z^2}\,\varphi(y)^2 - \frac{2\,E(\epsilon^x \epsilon^y)}{\sigma_z^2} \tag{16}$$

Aside from a change of sign on the cross-product terms, much is as before. To illustrate the practical application of this change, consider two simple examples using the U.K. Current Account and U.S. Productivity Growth.

Current Account in the United Kingdom

Table 2 shows the results for revisions to the quarterly U.K. Current Account figures from 1996Q1 to 2015Q2. (Data are taken from the revision triangles available on the Office for National Statistics (ONS) website. Revisions are measured as the first estimate minus the value reported three years later, which is the metric reported by the ONS.) The first line shows that while the current account credits and debits both have NS ratios ($\varphi$) around 2%, that for the current account balance (which is simply their difference) is almost 15%. The next line shows that this is partly due to the fact that each of these components is more than 25× as volatile as their difference. The final line of the table makes it possible to verify the decomposition in (16) by taking into account the remaining terms in $E(\epsilon^x \epsilon^y)$ and $\sigma_z^2 - \sigma_x^2 - \sigma_y^2$, which are shown in the column marked “Cross-Moments.” This shows that the NS ratio for the current account balance would be much higher still (over 100%!) were it not for the positive correlation of revisions to the credits and debits.
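The orders of magnitude in this paragraph can be checked with a short Python sketch. It uses only the rounded figures quoted in the text (NS ratios of about 2%, variance ratios of about 25, and an NS ratio of about 15% for the balance), not the actual entries of Table 2, and it assumes that the cross-moment term enters with a negative sign for a difference of two series. The function name is hypothetical.

```python
def ns_ratio_of_difference(weight_x, weight_y, ns_x, ns_y, cross_term):
    """NS ratio for z = y - x.

    weight_i is var(i)/var(z); cross_term is 2*E(eps_x*eps_y)/var(z),
    which enters with a negative sign for a difference.
    """
    return weight_x * ns_x + weight_y * ns_y - cross_term

# Rounded magnitudes quoted in the text, not the actual Table 2 entries.
no_cross = ns_ratio_of_difference(25.0, 25.0, 0.02, 0.02, cross_term=0.0)
implied_cross = no_cross - 0.15   # cross term needed to bring the ratio down to 15%

print(f"NS ratio ignoring the cross-moment: {no_cross:.2f}")
print(f"Implied cross-moment term: {implied_cross:.2f}")
```

With these rounded inputs the NS ratio would be about 100% if revisions to credits and debits were uncorrelated, so a sizeable positive cross-moment is needed to bring it down to roughly 15%.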

Table 2. Data Revision in the U.K. Current Account: 1996Q1 to 2015Q2

[Table values not recoverable; surviving row labels: NS Ratios ($\phi_i$) and Variance Ratios ($\sigma_i^2/\sigma_z^2$).]

Notes: The figures displayed under “Cross-Moments” are the relevant cross moments of Credits and Debits, calculated as $\sigma_{x,y}^2/\sigma_z^2$ for the Variance Ratio and $E(\epsilon_x\epsilon_y)/\sigma_{x,y}^2$ for the NS Ratio.

Productivity Growth in the United States

Table 3. Data Revisions in U.S. Productivity Growth: 1999Q2 to 2018Q2

[Table values not recoverable; surviving row labels: NS Ratios and Variance Ratios.]

Notes: The figures displayed under “Cross-Moments” are the relevant cross moments of Output and Hours, calculated as $\sigma_{x,y}^2/\sigma_z^2$ for the Variance Ratio and $E(\epsilon_x\epsilon_y)/\sigma_{x,y}^2$ for the NS Ratio. Growth is measured as the quarterly change in logs. The minor discrepancy in the decomposition of the NS ratio for OPH is due to small inconsistencies between the initially published figures for OPH on the one hand, and those for the Output and Hours series on the other.

Table 3 shows the results for revisions to growth in U.S. productivity (measured as real output per hour (OPH) worked) from 1999Q2 to 2018Q2. (Figures are for OPH worked in the non-farm business sector. Original vintage data are taken from the Federal Reserve Bank of St. Louis’s ALFRED for series OUTNFB, HOANBS and OPHNFB. Revisions are calculated using the first release for which all three series are available minus the value in the September 2018 vintage.) Here one sees an example that combines growth rates with the difference between two series (the logs of real output and hours worked). In the first line one again sees that the NS ratio for the difference of the two series (OPH) is larger than that for either of its components. In this case, the NS ratios for all three series are substantially larger than those in Table 2, with that for productivity growth exceeding 50% (see Anderson & Kliesen, 2006; Jacobs & van Norden, 2016; and Kurmann & Sims, 2017, for additional analysis of revisions in this and related measures of U.S. productivity growth). The decomposition shows that this largely reflects the importance of revisions in the real output series, which accounts for 0.550/0.565 = 97% of the NS ratio for productivity growth. In addition, the cyclical co-movement of output and hours implies that productivity is slightly less volatile than either of its components, as shown by the variance ratios in the second line. This in turn magnifies the relative importance of the revisions in real output for measured productivity.

Adjustment and Measurement Error

There are many situations in which macroeconomists are interested in series which abstract from some source of regularly observed variations. By far the most common among these are seasonally adjusted series, which seek to remove regular annual fluctuations. Also important in some applications are quantities (such as potential output, or structural fiscal deficits) which abstract from the influence of business cycles. While the seasonal cycle in U.S. GDP is of roughly the same magnitude as the business cycle, the problems of seasonal adjustment and related estimation errors and revisions have long occupied statisticians, while macroeconomists have largely focused on the analogous issues for business cycles. Both processes, however, contribute to the overall measurement error of cyclically adjusted series. Their estimation error is therefore often layered on top of the types of estimation error mentioned in the “Measurement Errors in Differences and Ratios” section.

Most seasonal and cyclical adjustment methods rely on the application of linear filters, so that the adjusted series may be expressed as moving averages of past, present and future values of the unadjusted series. The linear state-space models used to derive many of these filters often provide a coherent framework for evaluating the degree of measurement error which such estimation adds to the adjusted series.
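A minimal sketch of why two-sided filters generate revisions follows. It uses a centered 2×12 moving average as a stand-in for the filters used in practice, an artificial series, and a seasonal-naive forecast to fill in the missing future values; all three are assumptions made here for illustration, not the procedure of any particular adjustment program.

```python
import numpy as np

# Weights of a centered 2x12 moving average (13 terms, summing to one).
WEIGHTS = np.r_[1 / 24, np.full(11, 1 / 12), 1 / 24]
HALF = len(WEIGHTS) // 2  # 6 leads and 6 lags


def final_estimate(x, t):
    """Two-sided estimate for period t, using the 6 months before and after t."""
    return float(WEIGHTS @ x[t - HALF : t + HALF + 1])


def real_time_estimate(x, t):
    """Estimate for period t using data through t only.

    The 6 missing future values are replaced by a seasonal-naive forecast
    (the value from the same month one year earlier): an illustrative assumption.
    """
    history = x[: t + 1]
    forecast = history[-12:][:HALF]
    padded = np.r_[history, forecast]
    return float(WEIGHTS @ padded[t - HALF : t + HALF + 1])


# Hypothetical monthly series: a fixed seasonal pattern plus a level that
# drops from 10 to 9 in month 60.
seasonal = np.tile([-1.0, 2.0, -1.0], 40)  # 120 months, sums to zero each year
level = np.where(np.arange(120) < 60, 10.0, 9.0)
x = level + seasonal

t = 58  # two months before the drop
first = real_time_estimate(x, t)
final = final_estimate(x, t)
revision = first - final
print(f"real-time: {first:.3f}  final: {final:.3f}  revision: {revision:.3f}")
```

Here the real-time estimate for month 58 is 10, while the estimate made once the following six months are observed is 9.625: the two-sided filter spreads the later drop backward in time, and the difference appears as a revision.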

Seasonal Adjustment

Seasonal adjustment is a challenging problem. Many countries find it sufficiently difficult that they prefer not to release seasonally adjusted data. For example, Moulton and Cowan (2016) note several contemporary critiques of seasonal adjustment in U.S. GDP and discuss how the Bureau of Economic Analysis (BEA) is trying to address them (see also Wright, 2018, for an early assessment of the BEA’s latest seasonal adjustment methods). The most widely used seasonal adjustment approaches include the X12-ARIMA method developed at the U.S. Census Bureau (see Findley, Monsell, Bell, Otto, & Chen, 1998) and the TRAMO-SEATS model-based approach due to Gómez and Maravall (2000), which have both been incorporated in the Census Bureau’s X13ARIMA-SEATS seasonal adjustment programs. All grapple with the fundamental problem that the seasonal cycle is not constant but instead evolves over time. This means that as new observations become available, the modeling approach must decide how much of the observed change in the series is due to changes in the seasonally adjusted value, and how much is due to changes in the seasonal cycle itself. Wright (2013) provides an insightful examination of this problem in the context of U.S. Non-farm Payrolls Employment and the sharp drop due to the 2008 recession. Here, another series with important seasonal cycles is considered: dividends on U.S. stocks (see also Pettenuzzo et al., 2018, who develop an innovative high-frequency approach to seasonally adjusting dividends).

Figure 2. Monthly stock dividends.

Seasonally Adjusted Dividends

Figure 2 shows both the raw (in black) and seasonally adjusted (SA, in red) monthly values for the log of the dividend series on the Center for Research in Security Prices (CRSP) value-weighted index. Due to limitations in the Census Bureau’s X13 programs, this analysis of seasonally adjusted values will be restricted to the period starting in January 1965. Figure 2 shows very large monthly variations in dividends, reflecting the fact that most listed stocks pay quarterly dividends, but that the month in which they are paid varies from firm to firm.

Figure 3. Seasonal adjustment with monthly dummies.

To understand the importance of changes in the seasonal pattern, consider what happens if one simply regresses the monthly dividend growth rate on a constant and 11 monthly dummies. If the seasonal pattern were constant, the residuals from this regression should provide a seasonally adjusted dividend growth rate. Figure 3 compares the autocorrelations of the raw data (solid line) with those of the regression residuals (dotted line). It shows that almost all of the strong quarterly pattern of peaks and troughs in the autocorrelations of the raw series is preserved in the regression residuals, implying that the monthly dummies failed to capture much of the seasonal cycle in dividends.
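The logic of this diagnostic can be reproduced on artificial data. The sketch below does not use the CRSP series: the payout pattern, its slowly changing amplitude, and the noise level are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 12 * 50  # 50 years of monthly observations
month = np.arange(n) % 12

# Hypothetical seasonal pattern: a spike in the middle month of each quarter.
pattern = np.where(month % 3 == 1, 2.0, -1.0)
# Hypothetical slow evolution in the size of the seasonal swings.
amplitude = 1.0 + 0.8 * np.sin(2 * np.pi * np.arange(n) / n)
noise = rng.normal(0.0, 0.3, n)

g_constant = pattern + noise              # constant seasonal pattern
g_evolving = amplitude * pattern + noise  # evolving seasonal pattern


def dummy_residuals(g, month):
    """Residuals from regressing g on a constant and 11 monthly dummies."""
    X = np.column_stack(
        [np.ones(len(g))] + [(month == m).astype(float) for m in range(1, 12)]
    )
    beta, *_ = np.linalg.lstsq(X, g, rcond=None)
    return g - X @ beta


def acf(x, lag):
    """Sample autocorrelation of x at the given lag."""
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))


for name, g in [("constant", g_constant), ("evolving", g_evolving)]:
    resid = dummy_residuals(g, month)
    print(f"{name}: raw acf(3)={acf(g, 3):.2f}  residual acf(3)={acf(resid, 3):.2f}")
```

When the pattern is constant, the dummies absorb it and the residual autocorrelation at lag 3 is close to zero. When the amplitude evolves, the dummies remove only the average pattern and much of the quarterly autocorrelation survives in the residuals.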

Figure 4. Evolution of seasonal factors.

One can better appreciate how the seasonal cycle evolves over time by applying the U.S. Census Bureau’s X13 seasonal adjustment techniques to the dividend series. These results use a preliminary logarithmic transformation of the raw CRSP dividend index together with the SEATS approach, automatic model selection and no outliers. The purpose here is merely to illustrate how seasonal adjustment may introduce measurement errors; it is not meant to imply that this or any of the other X13 adjustment methods do an adequate job of seasonal adjustment for this series (see Pettenuzzo et al., 2018, for a better approach).

Figure 4 shows how the estimated seasonal factors evolved over time by comparing the annual estimates of each month’s factor in a series of 12 plots. It is evident that dividends in the middle month of each quarter are consistently higher than those in the other months and that this is the dominant source of the seasonal variation. However, it is also clear that the relative size of the “mid-quarter month” factor has changed substantially over time, growing rapidly early in the sample, then falling to substantially less than its initial value by the mid-point of the sample, and then gradually recovering somewhat. Since the 12 seasonal factors are constrained to sum to zero, it is no surprise to see that these movements are largely mirrored by movements in the opposite direction in most of the other seasonal factors. As far as measurement error is concerned, this volatility in the seasonal factors implies that simply capturing the average seasonal effect will typically be insufficient, as was seen in Figure 3.

Figure 5. Revision in seasonal factors.

Figure 6. Revision in seasonally adjusted values.

But how well do sophisticated methods capture these time-varying seasonals? One suggested diagnostic is to study how the estimated factors change as new observations become available. Revisions in the estimated factors are then interpreted as evidence of error in the initial estimates (see Findley et al., 1998, section 2.2.2; Abeln, Jacobs, & Ouwehand, 2019, propose an alternative method of seasonal adjustment that avoids revision of the seasonal factors). Figure 5 compares those initial estimates over the 2008–2011 period to those from the full data sample, showing that the revisions appear to be trivial. However, this ignores the fact that seasonal variations are much larger than the variations in the seasonally adjusted estimates shown in Figure 2. Figure 6 shows the revisions in the latter during the same period, which appear to be more economically significant. In particular, it shows that the sharp fall in dividends in early 2009 was initially underestimated as it was (incorrectly) partially attributed to a change in the seasonal factor; only in retrospect did the change in the seasonally adjusted dividends become clear. The same also occurs in reverse as dividends recover in early 2010.

These kinds of estimation errors will occur in most settings where there is a need to adjust the observed data to correct for a seasonal or other cyclical pattern that may evolve over time. In the wake of unusually large changes, such methods will tend to confuse the cyclical and cyclically adjusted changes, something that will only be corrected with hindsight. This same phenomenon occurs with business cycles.

Trends and Cycles

Just as macroeconomists find it useful to abstract from seasonal fluctuations, they also often prefer to abstract from business cycle fluctuations. For example, fiscal policy may be described in terms of a “structural” deficit, productivity growth trends may be measured in terms of “utilization-adjusted” total factor productivity, GDP growth prospects may be compared to “potential output” and unemployment and interest rates may be compared to estimates of their respective “natural” rates. In every case, macroeconomists attempt to estimate the degree to which the observed data on GDP or productivity, for example, is influenced by temporary business cycle factors. With such estimation comes another potential source of estimation error.

Some of the methods used for cyclical adjustment are conceptually similar to those used in seasonal adjustment in that they rely on removing specific dynamics from the data series (seasonal frequencies in the case of seasonal adjustment and business-cycle frequencies in the case of cyclical adjustment) using time-series techniques such as smoothing, band-pass filtering or structural state-space modeling. However, business cycle estimation also makes more use of multivariate modeling techniques (including structural vector autoregressions, co-integrating relationships and factor models) which use additional information in order to provide estimates of the business cycle that are more closely tied to economic theory.

It is an understatement to say that there is a lack of consensus about which of these many different approaches is “best.” For that reason, most national statistical agencies refrain from publishing such series. However, among macroeconomists and policy institutions such as central banks a great variety of techniques are in use and results from many different methods are frequently consulted and compared. Because estimates often vary widely across methods (for example, see Orphanides & van Norden, 2002) this makes model uncertainty a much more important contributor to estimation error in the context of business cycle measurement than was seen in the case of seasonal adjustment (see Garratt, Mitchell, & Vahey, 2014, for an analysis of the contribution of model uncertainty to business cycle estimation error).

To the extent that business cycle estimates are based on time-series techniques like those used in seasonal adjustment, studying their revision properties may give insight into the reliability of contemporary estimates of the cycle. Orphanides and van Norden (2002) examine estimates of U.S. output gaps from several such methods and find that revisions are typically the same order of magnitude as the estimated gap. They report NS ratios typically in the range 0.5 to 1.0, although they found much smaller revisions using the univariate Beveridge-Nelson decomposition. Jönsson (2018) also examines revisions from the new technique proposed by Hamilton (2018). All these reflect a more serious problem than typically encountered in seasonal adjustment. Whereas most seasonally unadjusted series show a clear peak in their spectral density at the seasonal frequencies, few macroeconomic series show a comparable peak at business cycle frequencies (see Cogley & Nason, 1995). This in turn makes separating the signal from the noise more difficult for business cycles than for seasonal fluctuations.
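The contrast in spectral shapes can be sketched with two stylized processes. Both are assumptions chosen for illustration rather than models of any actual series: a seasonal autoregression at lag 12, and a persistent first-order autoregression standing in for a typical macroeconomic aggregate. The business-cycle band is taken to be periods of 18 to 96 months, one common convention.

```python
import numpy as np


def ar_spectrum(omega, coef, lag=1):
    """Spectral density, up to a constant, of x_t = coef * x_{t-lag} + e_t."""
    return 1.0 / (1.0 - 2.0 * coef * np.cos(lag * omega) + coef**2)


# Business-cycle band for monthly data (periods of 96 down to 18 months).
omega = np.linspace(2 * np.pi / 96, 2 * np.pi / 18, 500)
persistent = ar_spectrum(omega, coef=0.95)

# An interior peak would show up as a rise followed by a fall.
slope_sign = np.sign(np.diff(persistent))
has_interior_peak = bool(np.any(np.diff(slope_sign) < 0))

# Seasonal process: evaluate at the annual frequency and just beside it.
seasonal_freq = 2 * np.pi / 12
nearby = seasonal_freq + np.array([-0.05, 0.0, 0.05])
seasonal = ar_spectrum(nearby, coef=0.9, lag=12)

print("peak inside business-cycle band:", has_interior_peak)
print("seasonal spectrum around 2*pi/12:", np.round(seasonal, 2))
```

The persistent process has a spectrum that declines smoothly across the band, so there is no distinct cyclical peak for a filter to isolate, while the seasonal process has a sharp local maximum at the annual frequency.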

This difficulty with a purely time-series approach has increased the attractiveness of multivariate approaches to business cycle estimation. According to studies of their revision properties, their success has been mixed, however. Numerous studies, such as Orphanides and van Norden (2002), have found that Phillips Curve relationships do little to improve the situation, reflecting the relatively weak link between output movements and inflation since the 1990s in most major economies. Kurmann and Sims (2017) document large revisions in widely used series of utilization-adjusted U.S. total factor productivity (TFP), but suggest an alternative adjustment method that may produce more precise estimates. Garratt, Mitchell, and Vahey (2014) suggest that structural vector autoregression (VAR) methods tend to produce better results, a conclusion shared with Coibion, Gorodnichenko, and Ulate (2017).

Several studies in the past five years have examined cyclically adjusted estimates used to guide macroeconomic policy decisions. These are informed by a variety of time-series and structural estimates as well as expert judgment. While in theory cyclically adjusted quantities should not be systematically related to cyclical downturns, studies examining initial estimates of potential output in the wake of the 2008 recession and global financial crisis find systematic overestimates of potential initially and subsequent large downward revisions. This includes Dovern and Zuber (2017), who examine OECD estimates of potential output after recessions, Kuang and Mitra (2018), who look at the European Commission’s estimates of potential output and structural balances, and Coibion, Gorodnichenko, and Ulate (2017), who look at Federal Reserve Board and Congressional Budget Office estimates for the United States, as well as International Monetary Fund and OECD estimates for a number of other countries. Kuang and Mitra (2018) also find the reverse in the years following the crisis, with initial underestimates of potential output consistently being revised upwards—they further argue that excessive pessimism about potential output during this period led to excessive fiscal austerity, which in turn tended to reinforce the underestimation of potential output. All of these studies point to the need for awareness among macroeconomic policy advisors of the relative lack of precision of such estimates.


Conclusions

Measurement error is seemingly omnipresent in macroeconomics. Some is caused by disagreement or confusion about what should be measured. Some is due to the statistical agency’s challenge of producing timely, accurate, and precise estimates with a fixed budget and a diverse clientele. The economic importance of measurement error may be accentuated when the focus shifts from levels to growth rates, or from gross values to net values or ratios. Standard seasonal adjustment methods may give less reliable estimates in the aftermath of unusually large changes. Estimates which purport to correct for business cycle fluctuations are often substantially revised in the years following major recessions and recoveries.

For macroeconomists, the challenge is increasingly to (a) recognize the impact that measurement error may have on their analysis, and (b) where possible, minimize its effects. Awareness of the simple examples here should help them recognize common situations in which measurement problems may be economically significant, and mechanisms which may increase their size and impact. That said, results for the United States (and the United Kingdom) are far from universally representative; those working with data from other sources need to carefully assess their quality. Analysis of data revisions may help identify some problems, but the absence of revisions is not a sufficient indication of reliability. In modeling measurement errors, it should be remembered that classical models of measurement errors fail to capture how published estimates may improve over time as more and better sources of information become available. A growing literature suggests methods for improving forecasts by using methods adapted to such measurement errors (e.g., see the survey by Clements & Galvão, 2019). Finally, macroeconomic policy advice should be conditioned on a realistic assessment of the quality of information that decision-makers have available.


Acknowledgments

The author thanks Dean Croushore and Jan P. A. M. Jacobs for their insights and collaboration over the years.

Further Reading

Stiglitz et al. (2009) and Jorgenson (2018) discuss efforts to measure aggregate economic welfare, while World Bank (2006, 2011) and Carey et al. (2018) discuss attempts by statistical agencies to improve the system of National Accounts.

Croushore (2011) surveys research on data revision, as do the introductory literature surveys in Croushore and Stark (2001) and Jacobs and van Norden (2011).

Anderson and Kliesen (2006), Jacobs and van Norden (2016), and Kurmann and Sims (2017) look at the revision in various measures of productivity growth.

Garratt, Mitchell, and Vahey (2014) examine the contribution of model uncertainty to business cycle estimation error.

Clements and Galvão (2019) discuss the construction of forecasts in the presence of measurement error.


References

Abeln, B., Jacobs, J. P. A. M., & Ouwehand, P. (2019). CAMPLET: Seasonal adjustment without revisions. Journal of Business Cycle Research, 15(1), 73–95.

Aghion, P., Bergeaud, A., Boppart, T., Klenow, P. J., & Li, H. (2019). Missing growth from creative destruction. American Economic Review, 109(8), 2795–2822.

Anderson, R. G., & Kliesen, K. L. (2006). The 1990s acceleration in labor productivity: Causes and measurement. Federal Reserve Bank of St. Louis Review, 88(3), 181–202.

Banbura, M., Giannone, D., Modugno, M., & Reichlin, L. (2013). Now-casting and the real-time data flow. Handbook of Economic Forecasting, 2(Part A), 195–237.

Bank of Ireland. (2016). Quarterly bulletin. Dublin, Ireland: Bank of Ireland.

Bognanni, M., & Zito, J. (2016). New normal or real-time noise? Revisiting the recent data on labor productivity. Economic Commentary, 16.

Carey, G. M., Lange, Q., & Wodon, K. (2018). The changing wealth of nations 2018: Building a sustainable future. Washington, DC: World Bank.

Clements, M. P., & Galvão, A. B. (2019). Data revisions and real-time forecasting. In Oxford Research Encyclopedia of Economics and Finance.

Cogley, T., & Nason, J. M. (1995). Effects of the Hodrick-Prescott filter on trend and difference stationary time series: Implications for business cycle research. Journal of Economic Dynamics and Control, 19(1–2), 253–278.

Coibion, O., Gorodnichenko, Y., & Ulate, M. (2017). The cyclical sensitivity in estimates of potential output. Working Paper 23580, National Bureau of Economic Research.

Croushore, D. (2011). Frontiers of real-time data analysis. Journal of Economic Literature, 49(1), 72–100.

Croushore, D., & Stark, T. (2001). A real-time data set for macroeconomists. Journal of Econometrics, 105, 111–130.

Croushore, D., & Stark, T. (2003). A real-time data set for macroeconomists: Does the data vintage matter? The Review of Economics and Statistics, 85(3), 605–617.

Cunningham, A., Eklund, J., Jeffery, C., Kapetanios, G., & Labhard, V. (2012). A state space approach to extracting the signal from uncertain data. Journal of Business & Economic Statistics, 30(2), 173–180.

Dovern, J., & Zuber, C. (2017). Recessions and instable estimates of potential output. Discussion Paper 639, Department of Economics, Heidelberg University.

Faust, J., Rogers, J. H., & Wright, J. H. (2005). News and noise in G-7 GDP announcements. Journal of Money, Credit and Banking, 37(3), 403–419.

Findley, D. F., Monsell, B. C., Bell, W. R., Otto, M. C., & Chen, B.-C. (1998). New capabilities and methods of the X-12-ARIMA seasonal-adjustment program. Journal of Business & Economic Statistics, 16(2), 127–152.

Forni, M., Hallin, M., Lippi, M., & Reichlin, L. (2000). The generalized dynamic-factor model: Identification and estimation. The Review of Economics and Statistics, 82(4), 540–554.

Garratt, A., Mitchell, J., & Vahey, S. P. (2014). Measuring output gap nowcast uncertainty. International Journal of Forecasting, 30(2), 268–279.

Giannone, D., Reichlin, L., & Small, D. (2008). Nowcasting: The real-time informational content of macroeconomic data. Journal of Monetary Economics, 55(4), 665–676.

Gómez, V., & Maravall, A. (2000). Seasonal adjustment and signal extraction time series. In D. Peña, G. C. Tiao, & R. S. Tsay (Eds.), A Course in Time Series Analysis (pp. 202–246). Wiley.

Hamilton, J. D. (2018). Why you should never use the Hodrick-Prescott filter. The Review of Economics and Statistics.

Hamilton, K., & Clemens, M. (1999). Genuine savings rates in developing countries. The World Bank Economic Review, 13(2), 333–356.

Jacobs, J. P., & van Norden, S. (2011). Modeling data revisions: Measurement error and dynamics of true values. Journal of Econometrics, 161(2), 101–109.

Jacobs, J. P., & van Norden, S. (2016). Why are initial estimates of productivity growth so unreliable? Journal of Macroeconomics, 47, 200–213.

Jonsson, K. (2018). Real-time properties of a regression-based filter. Stockholm, Sweden: National Institute of Economic Research.

Jorgenson, D. W. (2018). Production and welfare: Progress in economic measurement. Journal of Economic Literature, 56(3), 867–919.

Kapetanios, G., & Yates, T. (2010). Estimating time variation in measurement error from data revisions: An application to backcasting and forecasting in dynamic models. Journal of Applied Econometrics, 25(5), 869–893.

Kishor, N. K., & Koenig, E. F. (2012). VAR estimation and forecasting when data are subject to revision. Journal of Business & Economic Statistics, 30, 181–190.

Kozicki, S., & Hoffman, B. (2004). Rounding error: A distorting influence on index data. Journal of Money, Credit and Banking, 36, 319–338.

Kuang, P., & Mitra, K. (2018). Potential output pessimism and austerity in the European Union. University of Birmingham manuscript.

Kurmann, A., & Sims, E. (2017). Revisions in utilization-adjusted TFP and robust identification of news shocks. Working Paper 23142, National Bureau of Economic Research.

Moulton, B. R., & Cowan, B. D. (2016). Residual seasonality in GDP and GDI: Findings and next steps. Survey of Current Business, 96(7), 1–6.

Nordhaus, W. D., & Tobin, J. (1972). Is growth obsolete? Economic Research: Retrospect and prospect, 5, 1–80.

Orphanides, A., & van Norden, S. (2002). The unreliability of output gap estimates in real time. Review of Economics and Statistics, 84, 569–583.

Pettenuzzo, D., Sabbatucci, R., & Timmermann, A. (2018). High-frequency cash flow dynamics. Unpublished manuscript, Brandeis University, Stockholm School of Economics, and UCSD.

Robinson, L. (2016). Rewriting history: Understanding revisions to UK GDP. Bank Underground.

Sargent, T. J. (1989). Two models of measurements and the investment accelerator. Journal of Political Economy, 97(2), 251–287.

Stiglitz, J., Sen, A., & Fitoussi, J.-P. (2009). The measurement of economic performance and social progress revisited. Reflections and overview. Paris: Commission on the Measurement of Economic Performance and Social Progress.

Stock, J. H., & Watson, M. W. (2002). Forecasting using principal components from a large number of predictors. Journal of the American Statistical Association, 97(460), 1167–1179.

World Bank (2006). Where is the wealth of nations? Measuring capital for the twenty-first century. Washington, DC: World Bank.

World Bank (2011). The changing wealth of nations: Measuring sustainable development in the new millennium. Washington, DC: World Bank.

Wright, J. H. (2013). Unseasonal seasonals? Brookings Papers on Economic Activity, 2013(2), 65–126.

Wright, J. H. (2018). Seasonal adjustment of NIPA data. Working Paper 24895, National Bureau of Economic Research.

Youll, R. (2008). Dealing with potential bias in early estimates of GDP. Economic and Labour Market Review, 2(3), 48–52.

Zwijnenburg, J. (2015). Revisions of quarterly GDP in selected OECD countries. OECD Statistics Brief, 22, 1–12.


Notes

(1.) The same source also describes the very large revisions to the Irish national accounts released that month.

(2.) The assumption is that $E(v_t\epsilon_s) = 0$ for all $t, s$, where $\epsilon_s$ is the innovation in $Z_s$ from its Wold representation. Note that $v_t$ may be serially correlated and may be correlated across the $n$ variables in an arbitrary fashion.

(3.) To see this, note that one may rewrite (1) as $Z_t = z_t - v_t$, so $\mathrm{Var}(Z_t) = \mathrm{Var}(z_t) + \mathrm{Var}(v_t) - 2\,\mathrm{Cov}(z_t, v_t)$. When $\mathrm{Cov}(z_t, v_t) = 0$, this implies $\mathrm{Var}(Z_t) < \mathrm{Var}(z_t)$ whenever $\mathrm{Var}(v_t) > 0$.

(4.) The following derivation may be easily extended to the case where $z$ is any linear combination of $x$ and $y$. To see this, note that multiplying $x$ by a factor of $a$ will simply scale its covariance and that of its measurement error by $a$ and their respective variances by $a^2$. Extensions to the case where $z$ is a linear combination of $n$ different variables are also straightforward; see the Appendix of Jacobs and van Norden (2016).