1-3 of 3 Results

  • Keywords: prediction x
Clear all


Michael P. Clements and Ana Beatriz Galvão

At a given point in time, a forecaster will have access to data on macroeconomic variables that have been subject to different numbers of rounds of revisions, leading to varying degrees of data maturity. Observations referring to the very recent past will be first-release data, or data which has as yet been revised only a few times. Observations referring to a decade ago will typically have been subject to many rounds of revisions. How should the forecaster use the data to generate forecasts of the future? The conventional approach would be to estimate the forecasting model using the latest vintage of data available at that time, implicitly ignoring the differences in data maturity across observations. The conventional approach for real-time forecasting treats the data as given, that is, it ignores the fact that it will be revised. In some cases, the costs of this approach are point predictions and assessments of forecasting uncertainty that are less accurate than approaches to forecasting that explicitly allow for data revisions. There are several ways to “allow for data revisions,” including modeling the data revisions explicitly, an agnostic or reduced-form approach, and using only largely unrevised data. The choice of method partly depends on whether the aim is to forecast an earlier release or the fully revised values.


The recent “replication crisis” in the social sciences has led to increased attention on what statistically significant results entail. There are many reasons for why false positive results may be published in the scientific literature, such as low statistical power and “researcher degrees of freedom” in the analysis (where researchers when testing a hypothesis more or less actively seek to get results with p < .05). The results from three large replication projects in psychology, experimental economics, and the social sciences are discussed, with most of the focus on the last project where the statistical power in the replications was substantially higher than in the other projects. The results suggest that there is a substantial share of published results in top journals that do not replicate. While several replication indicators have been proposed, the main indicator for whether a results replicates or not is whether the replication study using the same statistical test finds a statistically significant effect (p < .05 in a two-sided test). For the project with very high statistical power the various replication indicators agree to a larger extent than for the other replication projects, and this is most likely due to the higher statistical power. While the replications discussed mainly are experiments, there are no reasons to believe that the replicability would be higher in other parts of economics and finance, if anything the opposite due to more researcher degrees of freedom. There is also a discussion of solutions to the often-observed low replicability, including lowering the p value threshold to .005 for statistical significance and increasing the use of preanalysis plans and registered reports for new studies as well as replications, followed by a discussion of measures of peer beliefs. Recent attempts to understand to what extent the academic community is aware of the limited reproducibility and can predict replication outcomes using prediction markets and surveys suggest that peer beliefs may be viewed as an additional reproducibility indicator.


Martin Karlsson, Tor Iversen, and Henning Øien

An open issue in the economics literature is whether healthcare expenditure (HCE) is so concentrated in the last years before death that the age profiles in spending will change when longevity increases. The seminal article “aging of Population and HealthCare Expenditure: A Red Herring?” by Zweifel and colleagues argued that that age is a distraction in explaining growth in HCE. The argument was based on the observation that age did not predict HCE after controlling for time to death (TTD). The authors were soon criticized for the use of a Heckman selection model in this context. Most of the recent literature makes use of variants of a two-part model and seems to give some role to age as well in the explanation. Age seems to matter more for long-term care expenditures (LTCE) than for acute hospital care. When disability is accounted for, the effects of age and TTD diminish. Not many articles validate their approach by comparing properties of different estimation models. In order to evaluate popular models used in the literature and to gain an understanding of the divergent results of previous studies, an empirical analysis based on a claims data set from Germany is conducted. This analysis generates a number of useful insights. There is a significant age gradient in HCE, most for LTCE, and costs of dying are substantial. These “costs of dying” have, however, a limited impact on the age gradient in HCE. These findings are interpreted as evidence against the red herring hypothesis as initially stated. The results indicate that the choice of estimation method makes little difference and if they differ, ordinary least squares regression tends to perform better than the alternatives. When validating the methods out of sample and out of period, there is no evidence that including TTD leads to better predictions of aggregate future HCE. It appears that the literature might benefit from focusing on the predictive power of the estimators instead of their actual fit to the data within the sample.