date: 17 December 2018

# Aging and Health Care Costs

## Summary and Keywords

An open issue in the economics literature is whether health care expenditure (HCE) is so concentrated in the last years before death that the age profiles in spending will change when longevity increases. The seminal article “Ageing of Population and Health Care Expenditure: A Red Herring?” by Zweifel and colleagues argued that age is a distraction in explaining growth in HCE. The argument was based on the observation that age did not predict HCE after controlling for time to death (TTD). The authors were soon criticized for the use of a Heckman selection model in this context. Most of the recent literature makes use of variants of a two-part model and seems to give some role to age as well in the explanation. Age seems to matter more for long-term care expenditures (LTCE) than for acute hospital care. When disability is accounted for, the effects of age and TTD diminish. Not many articles validate their approach by comparing properties of different estimation models. In order to evaluate popular models used in the literature and to gain an understanding of the divergent results of previous studies, an empirical analysis based on a claims data set from Germany is conducted. This analysis generates a number of useful insights. There is a significant age gradient in HCE, most pronounced for LTCE, and costs of dying are substantial. These “costs of dying” have, however, a limited impact on the age gradient in HCE. These findings are interpreted as evidence against the “red herring” hypothesis as initially stated. The results indicate that the choice of estimation method makes little difference and, where the methods differ, ordinary least squares regression tends to perform better than the alternatives. When validating the methods out of sample and out of period, there is no evidence that including TTD leads to better predictions of aggregate future HCE. It appears that the literature might benefit from focusing on the predictive power of the estimators instead of their actual fit to the data within the sample.

# Introduction

Is age an important predictor of health care expenditure (HCE) after controlling for time to death (TTD)? This question has received a lot of attention in health economics since Zweifel, Felder, and Meiers (1999) argued that age is a red herring, that is, a distraction, in explaining growth in HCE. The argument was based on the observation that, in a sample of decedents enrolled in a Swiss sickness fund, age did not predict HCE after controlling for TTD. If true, individual HCE will not increase with longevity, and therefore population aging in itself will not contribute to future growth of HCE per capita. Zweifel et al. (1999) has been very influential in downplaying the role of population aging in the debate about what causes HCE growth, but the study has been criticized for methodological weaknesses (Norton, 2016). Much of the critique concerns the method, which allegedly fails to represent the actual distribution of HCE correctly and thereby reduces the generalizability of the results (Seshamani & Gray, 2004a). The methodological critique has given rise to a number of papers using different methods to predict the age profile of health care spending. The literature generally finds that age is less important for predicting HCE when TTD is taken into account; however, the results vary significantly across the empirical models used and the type of HCE that is being predicted.

The purpose of this article is to reconcile conflicting findings by carefully reviewing and validating the methods that have been used in the literature. In particular, the advantages of using a two-part model for health care expenditure, in which health care utilization is allowed to have different determinants on the extensive and the intensive margin, are discussed. This standard modeling approach is compared to alternatives that require weaker assumptions at the cost of possibly inferior fit to the data. In addition, we evaluate how different proxies for morbidity affect the estimated end-of-life costs and age profiles in spending. Using a high-quality health insurance data set from Germany, which includes all relevant determinants of health and long-term care expenditure, we assess the accuracy of popular methods in predictions of future health and long-term care spending. The basis for our validation exercise is the prediction accuracy within the same data set.

# Literature Review

Health care expenditure rises steeply with age: one typical example is provided in Figure 1. This cross-sectional age gradient led to an early concern among policymakers and researchers that population aging will cause high future growth rates of HCE per capita. This concern was based on traditional (or “naïve,” Häkkinen, Martikainen, Noro, Nihtilä, & Peltola, 2008) projection models that multiplied current per-capita HCE within age groups by the future number of persons (as predicted by demographic projections) by age groups (Melberg & Sørensen, 2013). The “naïve” assumption of constant age profiles of HCE will automatically lead to an increase in HCE when a population ages. Fuchs (1990) contested this assumption by noting that a main reason HCE increases with age is that the risk of dying increases with age and the health costs of dying are high. If the positive relationship between age and HCE is driven primarily by the cost of dying, increased longevity will not imply higher individual HCE over a lifetime. The cost of dying is simply transferred to more advanced ages (Norton, 2000).
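The mechanics of such a “naïve” projection can be sketched in a few lines. The age groups, per-capita amounts, and population counts below are hypothetical numbers chosen only for illustration; they are not taken from any of the studies cited.

```python
def naive_projection(hce_per_capita, population):
    """Total HCE when today's age-specific per-capita HCE is held constant."""
    return sum(hce_per_capita[g] * population[g] for g in hce_per_capita)

# Hypothetical per-capita HCE by age group (assumed, not from the source)
hce_per_capita = {"0-64": 2000.0, "65-79": 5000.0, "80+": 9000.0}

# Hypothetical population today and in the projection year (same total size)
pop_today = {"0-64": 800, "65-79": 150, "80+": 50}
pop_future = {"0-64": 750, "65-79": 170, "80+": 80}

total_today = naive_projection(hce_per_capita, pop_today)
total_future = naive_projection(hce_per_capita, pop_future)

# Growth in HCE per capita that comes purely from the shift in age structure
per_capita_today = total_today / sum(pop_today.values())
per_capita_future = total_future / sum(pop_future.values())
per_capita_growth = per_capita_future / per_capita_today - 1
```

Because the age profile is held fixed, any shift of the population toward older groups mechanically raises projected per-capita HCE, which is exactly the property Fuchs (1990) questioned.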


Figure 1. Health care expenditure by age (German PHI holders).

Source: Own calculations based on Karlsson, Klein, and Ziebarth (2016).

In order to find out whether the correlation between age and HCE is driven by the cost of dying, one needs to observe HCE over the life cycle for persons of different ages, and test whether age is positively correlated with HCE after controlling for time to death (TTD) (Melberg, 2012). If age is not a predictor of HCE, when TTD is kept constant, population aging in itself cannot be a primary cause of growth in HCE (Norton, 2000). The first study that conducted this test is Zweifel et al. (1999). It suggested that HCE growth is unrelated to population aging—based on the observation that the age gradient in HCE disappears once TTD is added to the analysis. The study was soon criticized for the choice of empirical model, for the small sample size, and for not including long-term care (LTC) expenditures (Norton, 2000; Salas & Raftery, 2001; Dow & Norton, 2003). Despite these shortcomings, it is very clear that the study has had a lasting impact on how the effects of population aging on HCE are analyzed in economics.

The review covers peer-reviewed articles in English whose main theme is the relative importance of age and TTD in predicting HCE. The focus is on the empirical methods employed. The review is organized by empirical method and treats health care (HC) and long-term care (LTC) separately, because the age profiles of HCE and LTCE can potentially be very different.

## Modeling Health Care Expenditure

The choice of an appropriate empirical framework for the modeling of health care expenditure has been discussed in the literature for several decades (Jones, 2000, provides a good overview of the discussion in the 1980s and 1990s). There are two well-known distributional properties of HCE that make it difficult to use it as a dependent variable (Mullahy, 1998): HCE is always nonnegative and equals zero for a nontrivial share of the population. In addition, even though the positive range of HCE is often well approximated by a log-normal distribution, there are observations in which HCE is too high (“heavy users”) to be captured by a parametric model (Duan, Manning, Morris, & Newhouse, 1983). Common methods that account for the distributional features of HCE, and that have been used for estimating age and TTD effects, are two-part models and Heckman selection models.

However, as Madden (2008) points out, the appropriate choice of model is likely to be highly context-specific and depends crucially on (a) the theoretical perspective taken, (b) practical issues (e.g., possibility to impose valid exclusion restrictions), and (c) statistical issues (in particular measuring the actual performance of the models).

As regards theoretical issues, the largely atheoretical nature of the “red herring” literature makes the choice of model a lot easier: it appears that the main goal of the analysis is prediction of health care expenditure out of sample and, in particular, out of period (cf. Mason, Sutton, Whittaker, & Birch, 2015). We would thus prefer an estimator that delivers accurate and reliable predictions of future HCE, even if the behavioral assumptions implied by the model are wrong or if the estimated parameters are not meaningful from an economic point of view. This in turn means that we need to be less concerned about practical issues such as endogeneity, which is a huge advantage because the “red herring” literature is marred by an endogeneity problem that cannot be solved: the purported cause (death) is determined after its effect, and health care resources are often expended to avert death. It thus seems appropriate to focus on statistical issues in this context, and to evaluate the models considered solely by the accuracy of their predictions.
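To make the distinction between individual-level fit and aggregate predictive accuracy concrete, the following minimal sketch compares two sets of predictions against held-out spending. All numbers, and the labels `age_only` and `age_ttd`, are hypothetical and serve only to show that the two criteria can rank models differently; they do not reproduce any result from the literature.

```python
def mean_absolute_error(predicted, actual):
    """Average individual-level prediction error (fit criterion)."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def aggregate_error(predicted, actual):
    """Relative error in predicted total spending (projection criterion)."""
    return (sum(predicted) - sum(actual)) / sum(actual)

# Hypothetical held-out HCE for five individuals (one is close to death)
actual = [0.0, 500.0, 1000.0, 20000.0, 300.0]

# Hypothetical predictions from a model with age only and one adding TTD
age_only = [4000.0, 4000.0, 4500.0, 4500.0, 4500.0]
age_ttd = [1000.0, 1000.0, 1500.0, 16000.0, 1500.0]

mae_age = mean_absolute_error(age_only, actual)
mae_ttd = mean_absolute_error(age_ttd, actual)
agg_age = aggregate_error(age_only, actual)
agg_ttd = aggregate_error(age_ttd, actual)
```

In this constructed example the TTD model tracks individuals far better, yet the age-only model comes closer to total spending, which is the quantity a projection of future HCE ultimately targets.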

## Early Studies

Zweifel et al. (1999) study the HCE records of two samples of decedents from Swiss sickness funds. The first sample includes eight quarterly observations of HCE records for 681 decedents in the time period 1981–1992, while the second sample includes HCE records for 360 decedents in 1991–1994. They suggest that population aging and HCE growth are independent after finding an insignificant relationship between age and the logarithm of HCE in the following regression equation:

$\ln(HCE) = \beta_0 + \beta_1 A + \beta_2\, female + \sum_{q} \gamma_q Q_q + \delta \lambda + X'\theta + \varepsilon$
(1)

where $A$ is calendar age in years, $female$ is a dummy for women, $λ$ is the inverse Mills ratio obtained from a probit regression of the probability of having positive HCE, $Q_q$ is a dummy variable equal to unity in quarter $q$ before death, and $X$ is a vector of control variables including time-specific effects and insurance type. To take into account that the data are highly skewed to the right (for many persons HCE is zero while for many others HCE is positive and large), the authors use the logarithm of HCE as the dependent variable. To allow for the possibility that persons with positive HCE are a nonrandom sample, which would lead to sample selection bias, the authors include $λ$ as an additional control variable in equation 1.
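For reference, the inverse Mills ratio from a probit first stage is conventionally written as follows. The symbols $Z$ (the first-stage covariates) and $\gamma$ (the probit coefficients) are notation introduced here for illustration and are not taken from the original study.

$\lambda = \dfrac{\phi(Z'\gamma)}{\Phi(Z'\gamma)}$

Here $\phi$ and $\Phi$ denote the standard normal density and distribution function. Because $\lambda$ is a smooth function of the first-stage index, it can be close to a linear combination of the covariates in $Z$ when those covariates also appear in the expenditure equation and no exclusion restriction is available.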

The authors find that for decedents aged 65 and older the estimated partial correlation between age and HCE is not significantly different from zero, while TTD is significant and important in explaining HCE. HCE in the last quarter before death is estimated to be roughly 300% higher compared to the eighth quarter before death. The substantial effect of TTD, and the absence of an age gradient when TTD is included, implies, according to the authors, that the strong positive correlation between age and HCE is driven by the high cost of dying. More important, the authors argue that the result suggests that per person HCE is independent of population aging.

The results and conclusion of Zweifel et al. (1999) were soon questioned by Salas and Raftery (2001) and Dow and Norton (2003). Salas and Raftery (2001) argue that the study suffers from two severe weaknesses. First, TTD is an endogenous variable in equation (1) because of reverse causality, and second, the estimated $λ$ is highly collinear with age. There is no doubt that TTD is an endogenous variable in equation 1: TTD is a proxy for periods of severe sickness that may lead to death (Payne, Laporte, Deber, & Coyte, 2007), and sickness is affected by HCE. However, it is not necessary to identify the causal effect of TTD on HCE to make predictions of future HCE. If HCE is heavily concentrated toward the end of life, including TTD as an explanatory variable will produce more accurate age profiles of HCE and most likely improve predictions of future HCE. Therefore, as noted by Stearns and Norton (2004), including TTD in models predicting HCE will improve predictions irrespective of whether TTD is endogenous. The second criticism is much more serious. If the estimated $λ$ is highly correlated with age, the age coefficients can be rendered insignificant because of inflated standard errors and not because age has no relationship with HCE.

The authors of the original study replied to and dismissed the criticism (Zweifel, Felder, & Meiers, 2001). They rejected the claim that $λ$ picked up age effects by showing how the estimated coefficient on $λ$ changed between specifications in the original study. However, a more careful empirical analysis by some of the same authors showed that the collinearity problem could not be ignored (Zweifel, Felder, & Werblow, 2004).

Dow and Norton (2003) followed up on the second criticism and argue that there is no selection problem in estimating HCE because zero observations are real responses and not a censored value for an underlying latent index. They argue that it is more efficient to use a two-part model when the dependent variable is not censored (i.e., no sample selection problem).

## Two-Part Models

As a response to the purported multicollinearity problem in Zweifel et al. (1999), many studies proceeded to use two-part models. The two-part model is similar to the Heckman selection model. An identical first step estimates the probability of positive HCE, and the second step predicts HCE conditional on positive HCE, and the predictions of the first and second step can be used to estimate the conditional expectation of HCE. The difference from the Heckman selection model is that in the second step of the two-part model the inverse Mills ratio ($λ$) is not used as a control variable. The rationale for excluding $λ$ is that there is no selection problem associated with zero HCE: it is an actual observation and not a censored value for some latent true value (Dow & Norton, 2003).
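The way the two steps combine into a conditional expectation can be illustrated with a deliberately simple sketch. It replaces the probit and the second-step regression with cell shares and cell means within age groups, which is an assumption made here for brevity and not the specification used in the studies discussed; the spending figures are hypothetical.

```python
from statistics import mean

def two_part_expectation(spending):
    """E[HCE] = P(HCE > 0) * E[HCE | HCE > 0] for one group of individuals."""
    positive = [s for s in spending if s > 0]
    if not positive:
        return 0.0
    p_positive = len(positive) / len(spending)  # part 1: extensive margin
    mean_positive = mean(positive)              # part 2: intensive margin
    return p_positive * mean_positive

# Hypothetical annual HCE for two age groups (illustrative numbers only)
hce_by_age = {
    "65-74": [0, 0, 1200, 800, 0, 4000],
    "75-84": [0, 2500, 1500, 900, 6000, 3100],
}

expected = {age: two_part_expectation(v) for age, v in hce_by_age.items()}
```

Zeros enter only through the first part. No inverse Mills ratio is needed, because a zero is treated as an observed outcome and not as a censored value.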

Seshamani and Gray (2004a), Zweifel, Felder, and Werblow (2004), and, more recently, Hyun, Kang, and Lee (2015) investigate the multicollinearity problem. These studies replicate the Heckman model used in Zweifel et al. (1999) and compare the results to a two-part model. Seshamani and Gray (2004a) use English data on hospital costs for 9,366 decedents in the period 1970–1990, Zweifel et al. (2004) use two years of monthly insurance claims made by 1,095 individuals prior to death in 1999, and Hyun et al. (2015) use Korean National Health Insurance claims data on inpatient HCE. Contrary to the original study, and to Seshamani and Gray (2004a), Zweifel et al. (2004) and Hyun et al. (2015) use data on both survivors and decedents. They can therefore test the caveat mentioned in the original study: age neutrality of HCE may not hold for survivors.

In Seshamani and Gray (2004a) none of the variables in the Heckman model is a significant predictor of HCE.1 However, the variables of the model are jointly significant, which leads to the suspicion that collinearity between age and the inverse Mills ratio ($λ$) makes the model underidentified. In order to investigate this, $λ$ is regressed on the other explanatory variables, which leads to an $R^2$ of 0.99, suggesting that the collinearity problem is severe. Nevertheless, the replication in Zweifel et al. (2004) corroborates the results in the original study: there is no association between age and HCE after TTD is controlled for, and HCE grows rapidly when end of life is approaching. However, here too a strong correlation between $λ$ and the explanatory variables is noted, and the authors conclude that multicollinearity appears to be a problem. Hyun et al. (2015), on the other hand, find significant TTD and age associations with the expected signs in the Heckman model, even in the presence of a multicollinearity problem. Possible reasons for the diverging results are that Hyun et al. (2015) include survivors in the analysis and control for the number of chronic diseases.2
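The diagnostic of regressing $λ$ on the explanatory variables can be sketched as follows. The first-stage index is a hypothetical linear function of age with made-up coefficients, and age is the only regressor, so this is a stylized illustration of the mechanism and not a replication of any of the studies.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills(z):
    """Inverse Mills ratio evaluated at the probit index z."""
    return norm_pdf(z) / norm_cdf(z)

def r_squared(x, y):
    """R^2 from an OLS regression of y on a constant and a single regressor x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

ages = list(range(65, 96))
# Hypothetical probit index: probability of positive HCE rises with age
index = [0.5 + 0.05 * (a - 65) for a in ages]
lam = [inverse_mills(z) for z in index]

# Share of the variation in lambda that age alone explains
r2 = r_squared(ages, lam)
```

When the first stage contains no variable that is excluded from the expenditure equation, $λ$ is a function of the same regressors and the resulting $R^2$ is close to one, which inflates the standard errors of the age coefficients.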

Comparing the results from two-part models to the replication results of Zweifel et al. (1999) is hard within and across the three papers, because the specification of the two-part model varies considerably. Zweifel et al. (2004) use a different sample when estimating the two-part model: it is based on HCE claims in 1999 of decedents who died between 2000 and 2003, and survivors who were alive at the end of 2003. Instead of using dummy variables counting down to death, a linear TTD variable, measuring the months to death, is used. The TTD variable for survivors is coded to the maximum number of months in the observation period. The authors claim that by using TTD as a continuous variable from one point in time the problem of contemporaneous correlation between TTD and HCE is less severe. They cannot reject the hypothesis that the association between age and HCE is zero in the two-part model and conclude that the original result of no age effects still holds. For survivors, however, they find that an association between age and HCE cannot be rejected. This result is also supported by the two-part model in Hyun et al. (2015). The model used there is an “augmented” two-part model, which includes the lagged value of $HCE$ as an additional control variable. It is included to allow for a relationship between past HCE and TTD. A problem with this analysis is that the lagged dependent variable could be strongly correlated with age and thereby cause a collinearity problem similar to the one present in the Heckman model. In the two-part model using the whole sample and not including lagged HCE as a control variable, the age associations are significant (see Table 3 in Hyun et al., 2015). Achen (2000) provides a general discussion of why including lags of the dependent variable can reduce the explanatory power of other relevant explanatory variables.

The results of the two-part model in Seshamani and Gray (2004a) indicate that age and TTD significantly predict hospital costs. Consequently, that study does not support the age neutrality of HCE found in the authors’ replication of the original study. They attribute this to the econometric weaknesses of the Heckman model. They do, however, find that approaching death has a significant impact on hospital expenditure. For example, hospital costs in the last quarter of life were on average three times as high as hospital costs in quarter 2 before death. On this basis, they conclude that proximity to death is the main reason for the strong bivariate association between age and HCE.

Dow and Norton (2003) criticized the application of the Heckman selection model to corner solutions and used this literature as an example of poor practice. This methodological critique, together with the empirical evidence of a severe multicollinearity problem in the early literature presented in Seshamani and Gray (2004a) and Zweifel et al. (2004), has clearly inspired the choice of methods in later studies. To our knowledge, no later studies within the literature have applied the Heckman model, except the recent replication of the original study just discussed. The two-part model has become the standard model in this literature. Some of these studies are discussed below, grouped by the type of HCE studied.

### Hospital Expenditure

Several papers apply two-part models to predict hospital expenditures (HE) using panel data from various European countries: Dormont, Grignon, and Huber (2006); Seshamani and Gray (2004b); Wong, van Baal, Boshuizen, and Polder (2011); and Geue, Briggs, Lewsey, and Lorgelly (2014). Dormont et al. (2006) use data on hospital, ambulatory care, and pharmaceutical expenditure in 1992 and 2000. The data include, respectively, roughly 3,400 and 5,000 individuals. They examine the change in HCE between 1992 and 2000 and apply two-part and microsimulation models to decompose the change over time in HE according to changes in demography, morbidity, and medical practice. The results reveal that changes in HE over time due to aging are modest compared to those due to changes in medical practice. Seshamani and Gray (2004b) use a panel of hospital costs for roughly 90,000 decedents in the period 1970–1999. They estimate a two-part model using a probit model in the first step and a random-effects generalized linear model (GLM) with a log link function in the second step. In addition to age, sex, and year of death, they control for time-specific effects, and a range of variables measuring socioeconomic status, health status, and information about the hospital stay. The effect of TTD on HE is much larger than the effect of age: in the last three years of life costs are found to increase seven-fold, while the average increase in costs from age 65 to 80 is 30%. Geue et al. (2014) base their analysis on 60,000 Scottish decedents and survivors. The study contributes to the literature by using survival analysis to predict TTD for survivors. Both TTD and age are found to be important predictors of HCE, and the impact of TTD is found to vary by socioeconomic status.

Some of these studies include chronic diseases as explanatory variables. This may have an impact on the estimated age coefficients, given that chronic disease is strongly related to age. A possibly better way to use disease-specific information is to use HCE for different diseases as outcomes. This is done in Wong et al. (2011), where 93 disease-specific two-part models are estimated using panel data on HE during the 1995–2000 period. The authors report strong associations between TTD and HCE for most diseases. Associations between HCE and age are statistically significant but modest compared to the TTD associations, with the exception of diseases that are non-lethal and prevalent in old age. Another way of taking health into account without distorting the estimated HCE-age relationship is suggested by Geue, Lorgelly, Lewsey, Hart, and Briggs (2015), who control for a set of health indicators consistently collected within a relatively homogeneous group (with regard to age) before the start of the observation window.

### Primary Care Expenditure

It is likely that primary care expenditures (PCE) show a different age profile from HE. As one grows older one becomes more likely to be diagnosed with chronic diseases and other health problems, which require prescription drugs and regular check-ups provided by the primary care sector. Atella and Conti (2014) explore this notion using data on Italian PCE for 750,000 individuals over the period 2006–2009. The analysis is based on a two-part model: a probit model to estimate the probability of positive PCE, and, in the second step, a GLM to estimate expected PCE conditional on PCE being positive. Age is found to be a more important predictor of PCE than TTD, suggesting that the role of age and TTD depends on the type of HCE studied.

### Long-Term Care Expenditures

Werblow, Felder, and Zweifel (2007) make use of Swiss data on care disaggregated into seven components: ambulatory care, drugs, hospital outpatient, hospital inpatient, nursing home care, home care, and other services. They estimate a two-part model (with probit and ordinary least squares [OLS] as main alternatives) by treating the two equations as stochastically independent. Age and TTD enter both equations in addition to variables representing insurance status. When the analysis is carried out for different service types separately, possible dependency between the services is taken into account by means of seemingly unrelated regression (SUR) models. For surviving users of LTC services, the probability of incurring LTCE increases markedly in old age, while most of the components of their conditional HCE show a decreasing age profile. Consequently, for LTCE, aging might matter regardless of proximity to death.

Häkkinen, Martikainen, Noro, Nihtilä, and Peltola (2008) use Finnish register data for 40% of the population aged 68 and above. Service utilization is disaggregated according to type of services at the individual level. The sample is split into two depending on whether or not an individual is making use of LTC services. The authors estimate the probability of utilizing LTC and two-part models for LTCE. Age, TTD, and some additional control variables enter the regressions. Age has an important positive and increasing effect on the probability of being an LTC user, but the share is also related to TTD: those who died in 1999 had a 10-percentage-point higher probability of being an LTC user than those who survived. The relative difference between the two groups increases to over 30 percentage points among those aged 85. Among individuals not in LTC, they find that HE clearly decreased with age among deceased individuals.

De Meijer, Koopmanschap, van Doorslaer, and d’Uva (2011) study the determinants of LTCE. They first examine total LTC, institutional LTC, and home care expenditures for the entire Dutch 55+ population, conditional on age, sex, TTD, cause of death, and co-residence. Next, they examine home care expenditures for a random sample of the non-institutionalized 55+ population, conditioning additionally on morbidity and disability information. They refer to these distinct models as the “population model” and the “extended home care model.” They estimate two-part models with a probit model for the probability of using LTC and a GLM for the conditional use of LTC. In addition to TTD they focus on cause of death and disability. The authors find that those living alone or deceased from diabetes, mental illness, stroke, respiratory, or digestive disease have higher LTCE, while a cancer death is associated with lower expenditures. TTD no longer determines home care expenditures when disability is controlled for. This suggests that TTD largely approximates disability. They suggest that disability may replace TTD in LTCE projection models.

Balia and Brau (2014) study the dependencies between different types of long-term home care by using data from the first wave of the Survey on Health, Ageing and Retirement (SHARE). The study focuses on the issue of whether formal long-term home care should be considered a substitute or a complement to informal long-term home care. A simultaneous equation system of two-part models allowing for interaction between formal and informal care is estimated. Endogeneity and unobservable heterogeneity are addressed using a common latent factor approach.

Age, TTD, and disability all have sizable explanatory power. The findings suggest that indicators of age, TTD, and disability should be jointly included in models of LTC. The results also show that the link between formal and informal care depends upon the component of formal care considered.

## Studies Based on Aggregate Data

Karlsson and Klohn (2014) use ten years of Swedish administrative data at the municipality level to estimate the relative importance of TTD and age on LTCE. Age-specific probabilities of dying within the next two years, based on Swedish mortality statistics, are used as an indicator for TTD. LTC expenditures are analyzed by means of fixed effects models with age, TTD, and additional controls as right-hand side variables. They find that age increases total LTCE for the oldest age groups. The age effect declines when TTD enters the regression, although it appears to have a strong impact on total LTCE even after TTD is included. Separate estimates for institutional and domiciliary LTCE reveal differing patterns for most age groups. Age seems to be the most relevant predictor for domiciliary care, whereas TTD is much more relevant for institutional LTCE. This finding can be interpreted as a transition process from home to institutional LTC at the end of life.

A similar approach is used by Breyer, Lorenz, and Niebel (2015), who analyze health care expenditure in Germany at the aggregate level. Employing a rich set of cohort, time, and age fixed effects, the authors find that TTD picks up some of the age gradient in HCE. However, a reduction in mortality will also be associated with an increase in life expectancy of older people, which will lead to higher HCE. The net effect of these two opposing trends is positive, so that population aging is associated with an increase in HCE. Van Baal and Wong (2012) reach a similar conclusion based on aggregate data from the Netherlands: even though the TTD coefficient has a similar impact as in individual-level studies, controlling for TTD increases the unexplained component in HCE growth, so that the predicted future HCE is no smaller in this scenario.

A couple of studies (Gregersen & Godager, 2014; Gregersen, 2014) use Norwegian registry data on hospital costs. The data are aggregated to groups defined by age, gender, and municipality. Gregersen and Godager (2014) regress real per capita HE on age, gender, group-level mortality rate, and mortality interacted with age. They find that mortality-related HE (MRHE) is negatively related to age, that is, dying at higher ages is cheaper. To illustrate the impact on HE of increased longevity over the analysis period, they compare HE in 2009 with what it would have been if the mortality rate had stayed equal to the mortality rate in 1998. They find that total HE would have been 2% higher in 2009 had it not been for the improvement in mortality over time and conclude that both mortality and age should be included in a prediction of future HCE. Gregersen (2014) analyzes whether the age gradient is becoming steeper over time and finds that increasing death-related costs are an important part of the steepening of the age gradient.

## Projections of Future HCE

In the studies mentioned, the importance of age is determined by its explanatory power in panel data regressions after controlling for TTD. Another way to determine the importance of age is to compare predicted HCE growth in projection models that include only age and sex HCE profiles with predicted growth in models that include TTD as an additional explanatory variable. How much is projected HCE growth diminished when TTD is taken into account?

Breyer and Felder (2006), Stearns and Norton (2004), Polder, Barendregt, and van Oers (2006), and Geue et al. (2014) all compare a simple projection model, which uses the current age- and sex-specific HCE per capita multiplied by the future number of persons in age- and sex-specific groups, to an augmented projection model, which adjusts the age-sex specific HCE by the trend in age-sex specific mortality rates. If the average cost of decedents is higher than for survivors, adjusting for the decreasing trend in age-sex specific mortality rates will reduce projected total HCE (Melberg & Sørensen, 2013).
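The difference between the two projection models can be sketched as follows. The decomposition of per-capita HCE into a decedent and a survivor component weighted by the mortality rate is a simplifying assumption made here, and all costs, mortality rates, and population counts are hypothetical.

```python
def per_capita_hce(mortality, cost_decedent, cost_survivor):
    """Age-specific per-capita HCE as a mortality-weighted average."""
    return mortality * cost_decedent + (1 - mortality) * cost_survivor

# Hypothetical inputs per age group:
# (mortality today, mortality in projection year, decedent cost,
#  survivor cost, projected population)
groups = {
    "65-79": (0.03, 0.02, 25000.0, 4000.0, 170),
    "80+": (0.10, 0.08, 30000.0, 6000.0, 80),
}

naive_total = 0.0
augmented_total = 0.0
for m_today, m_future, c_dec, c_surv, pop in groups.values():
    # Simple model: today's per-capita HCE applied to the future population
    naive_total += per_capita_hce(m_today, c_dec, c_surv) * pop
    # Augmented model: per-capita HCE adjusted for the lower future mortality
    augmented_total += per_capita_hce(m_future, c_dec, c_surv) * pop

# Share by which the mortality adjustment lowers projected total HCE
reduction = 1 - augmented_total / naive_total
```

The reduction is positive whenever decedents cost more than survivors and mortality rates fall, and its size depends on the cost gap and on how much mortality declines.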

Breyer and Felder (2006) find that projected HCE is reduced by approximately one fifth when taking mortality rates into account. Similar results are found in Stearns and Norton (2004), Polder et al. (2006), and Geue et al. (2014): projected HCE is reduced by 25, 12, and 7%, respectively.

Van Baal and Wong (2012) criticize these studies for implicitly assuming that the growth rate of per capita HCE is identical in the simple and augmented projection model. Using a theoretical and empirical model, they show that the growth rate due to unidentified causes is higher in an augmented than in a simple projection model. They find that, when model-specific growth rates of per capita HCE are used, the augmented projection model does not produce lower projections of HCE.

## Descriptive Studies

Spillman and Lubitz (2000) combine various sources of U.S. data in a descriptive study of mean expenditures for acute and long-term care (LTC) from the age of 65 years until death and in the last two years of life. The authors find that acute care differs from LTC. Acute care expenditures, mainly for hospital care and physicians’ services, increase at a reduced rate as the age at death increases, whereas LTCE increases at an increased rate as the age at death increases. Hence both age and TTD affect LTCE.

McGrail et al. (2000) aim to assess the relative effects of age and proximity to death on costs of both acute medical care and nursing and social care. They compare all decedents in chosen age categories for the years 1987–1988 and 1994–1995 with all survivors in the same age groups. They find that costs of acute care rise with age, but that proximity to death is a more important factor in determining costs. The additional costs of dying fall with age. In contrast, costs of nursing and social care rise with age. Similar patterns were found for the two cohorts. They conclude that in planning services it is important to take into account the relatively larger impact of aging on social and nursing care than on acute care. These results correspond with what Martikainen, Murphy, Metsä-Simola, Häkkinen, and Moustgaard (2012) find for Finland.

Yang, Norton, and Stearns (2003) employ U.S. survey data in a descriptive data analysis using person month level data. They construct two subgroups, based on the individuals’ mortality status. The first subsample consists of individuals within one year of death. The second subgroup consists of individuals at least one year before death and at least one year before being censored. They compute average monthly expenditure according to service type, age group, and subsample. They find that the observed increase in average health care expenditures with age is largely because of an increasing mortality rate combined with high end-of-life expenditures. On the other hand, nursing home expenditures increase on average with both age and closeness to death. The authors conclude that closeness to death is the most important reason for higher inpatient expenditures, and aging is the most important reason for higher long-term care expenditure.

Polder et al. (2006) is a descriptive study of health insurance data covering 13% of the Dutch population in 1999. The authors find that those who die within one year have higher costs of hospital care, nursing home, and home care compared with the survivors. Both for decedents and for survivors the mean cost of LTC increases with age at least up to 90 years. Mean HE peaks around 80 years for survivors and somewhat earlier for decedents.

Forma, Rissanen, Aaltonen, Raitanen, and Jylhä (2009) use a case-control design to compare utilization of health and social services between older decedents and survivors, and to identify the respective impact of age and closeness of death on the utilization of services. Data are derived from multiple national registers. Decedents’ utilization within two years before death and survivors’ utilization in the same period of time was assessed in three age groups and by gender. The authors find that in hospital care the differences between decedents and survivors rose in the last months of the study period, whereas in long-term care there were clear differences during the whole two-year period.

The differences were smaller in the oldest age group than in younger age groups. The authors conclude that higher average age at death affects utilization of LTC services positively, while age affects utilization of hospital care negatively for the very old.

## Other Models

Weaver, Stearns, Norton, and Spector (2009) estimate the marginal effect of TTD, measured by being within two years of death, on the probabilities of nursing home and formal home care use, and whether this effect differs by availability of informal care, that is, marital status and co-residence with an adult child. The analysis is done with data from the U.S. Health and Retirement Study. They estimate simultaneous equations because the decision to reside with an adult child is considered to be made jointly with the decision of making use of formal LTC. The use of informal LTC is instrumented. They find that TTD is important for the probability of nursing home use and of formal home care use. Availability of informal support significantly reduces the effect of proximity to death.

Murphy and Martikainen (2013) also make use of Finnish register data and estimate number of days in each year spent in the hospital, LTC, or the community. They fit a multinomial vector generalized additive model separately for men and women with age and proximity to death as covariates. They find that days spent in LTC increase sharply with age and proximity to death. They also find that LTC in the period close to death increases with age. Finally, Yu, Wang, and Wu (2015) use quantile regression for total HCE and uncover substantial heterogeneity in the age gradient so that conclusions regarding the red herring hypothesis depend on the quantiles considered.

## Concluding Remarks

The literature review has shown that a large number of studies have been conducted that analyze the implications of the age gradient in HCE, but no clear consensus has emerged regarding the most appropriate method to study these issues, or concerning the relative importance of TTD and age as determinants of HCE. One complicating factor is of course that the studies are based on different data sets that cover different countries, time periods, and service types. However, the review has also made clear that the previous literature has devoted too little attention to evaluating different approaches in contrast to each other.

One aspect required for such an evaluation is a clear idea about desirable properties of an estimator. In the previous literature, the main focus seems to have been on achieving a good fit to the data used in the analysis: the choice of the Heckman selection model and later the switch to two-part models have both been justified with regard to distributional peculiarities of HCE. In our view, this emphasis has been unfortunate, given that the general focus of the literature is on predicting future costs, and not explaining variation in a given data set. This point also applies to some heroic attempts at solving the “endogeneity problem” that some authors have made.

A couple of other issues have been given too little consideration in the literature. The first is the implication of including proxies for morbidity on the right-hand side of the regression. In recent studies, such proxies are often routinely included in the analysis without consideration of the fact that they will change the interpretation of the estimated parameters. Likewise, several studies restrict their analysis samples to decedents only, even though the “red herring” hypothesis, as normally stated, refers to health care costs in the entire population.

An empirical analysis will be conducted to address the most pressing of these issues, and to come up with a more general conclusion on what methods and model specifications are useful for testing the “red herring” hypothesis. As a byproduct, the analysis will also produce some useful estimates on how HCE relates to age and TTD among privately insured individuals in Germany.

# Empirical Analysis

## Data

A claims data set from a big German private health insurer that covers the universe of insured individuals over the years 2005 to 2011 is used. All claims are observed. For large parts of the analysis, demographic variables and detailed information on the utilization of different types of services are used. The variable expenditures covers total health care spending per year (including LTCE, which is financed by a separate system, and out-of-pocket payments) expressed in 2011 Euros.

The main analysis sample covers the years 2006–2010 and has been organized around quarterly expenditures on different types of health care. In order to get complete information on HCE, all individuals with a deductible and all civil servants were dropped. Additional sample selection criteria include being at least 30 years old in the base year (2005) and being continuously insured (and thus observed) from the base year until death or censoring. Among the 45,000 individuals meeting these criteria, half were randomly sampled to constitute a validation sample used to evaluate the methods. Some descriptive statistics for our estimation sample are provided in Table 1.

Table 1. Descriptive Statistics of the Analysis Sample

| Variable | All: Mean | All: S.D. | Survivors: Mean | Survivors: S.D. | Deceased: Mean | Deceased: S.D. |
|---|---|---|---|---|---|---|
| Quarterly total expenditure | 1,373 | 3,427 | 1,298 | 3,070 | 7,756 | 12,974 |
| Quarterly outpatient expenditure | 500 | 1,281 | 490 | 1,233 | 1,408 | 3,344 |
| Quarterly hospital expenditure | 266 | 2,068 | 233 | 1,804 | 3,019 | 9,152 |
| Quarterly pharma expenditure | 268 | 1,033 | 252 | 859 | 1,650 | 5,221 |
| Quarterly LTC expenditure | 38 | 541 | 23 | 420 | 1,255 | 2,945 |
| Age | 49.9 | 11.6 | 49.7 | 11.4 | 68.5 | 15.1 |
| Employee | 0.804 | 0.397 | 0.806 | 0.395 | 0.586 | 0.493 |
| Self-Employed | 0.074 | 0.262 | 0.074 | 0.262 | 0.067 | 0.249 |
| Charlson Index 2005 | 0.173 | 0.617 | 0.163 | 0.578 | 1.037 | 1.903 |
| Year | 2007.99 | 1.41 | 2008.00 | 1.41 | 2007.25 | 1.19 |
| Time to death in quarters | | | | | 7.39 | 4.79 |
| Individuals | 22,732 | | 22,246 | | 486 | |
| Observations | 450,140 | | 444,920 | | 5,220 | |

A more detailed description of the data set and how the insured individuals compare to the entire population with private health insurance (PHI) may be found in Karlsson et al. (2016).

## Econometric Approaches

The baseline specification includes 10 age dummies representing five-year age bands (our sample consists of individuals aged 30 or older in 2005), and 20 time-to-death quarter dummies, for the number of quarters remaining before death. Methods for estimating the age and TTD fixed effects are compared. The simplest approach considered is OLS, where estimates are compared based on the two linear equations

$Display mathematics$
(2)

and

$Display mathematics$
(3)

where $A_{it}$ is individual $i$’s age (in five-year bands) in quarter $t$, $TTD_{it}$ is the remaining time to death (in quarters), and $X_{it}$ is a vector of additional control variables, including a gender dummy, occupational group fixed effects, federal state fixed effects, and a linear time trend.
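How the age and TTD indicators enter the regressor vector can be sketched as follows. The sketch assumes that the 30–34 band is the omitted reference category (the tables start at 35–39) and that survivors, or decedents more than 20 quarters from death, have all TTD indicators set to zero. The function names are hypothetical, and the remaining controls in $X_{it}$ are reduced to a single gender indicator for brevity.

```python
# Ten five-year age bands; 30-34 is assumed to be the omitted reference group.
AGE_BANDS = ["35-39", "40-44", "45-49", "50-54", "55-59",
             "60-64", "65-69", "70-74", "75-79", "80+"]
N_TTD_QUARTERS = 20


def age_band_dummies(age):
    """Return 10 indicators; all zeros for the reference group aged 30-34."""
    if age < 30:
        raise ValueError("sample is restricted to individuals aged 30 or older")
    index = min((age - 35) // 5, len(AGE_BANDS) - 1) if age >= 35 else -1
    return [1 if i == index else 0 for i in range(len(AGE_BANDS))]


def ttd_dummies(quarters_to_death):
    """Return 20 indicators for quarters 1..20 before death.

    None (no observed death) or more than 20 quarters gives all zeros.
    Quarter 1 is the final quarter before death.
    """
    dummies = [0] * N_TTD_QUARTERS
    if quarters_to_death is not None and 1 <= quarters_to_death <= N_TTD_QUARTERS:
        dummies[quarters_to_death - 1] = 1
    return dummies


def design_row(age, quarters_to_death, female):
    """Intercept, age dummies, TTD dummies, and a gender indicator."""
    return [1] + age_band_dummies(age) + ttd_dummies(quarters_to_death) + [1 if female else 0]


row = design_row(age=52, quarters_to_death=3, female=True)
print(len(row), row)
```

Dropping the 20 TTD indicators from `design_row` gives the regressors of the specification without TTD; keeping them gives the specification with TTD.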

The OLS regression (3) has well-known properties: when the assumptions of OLS are fulfilled it represents the best linear unbiased estimator and approximates the conditional expectation function $E(y_{it} \mid a_{it}, q_{it}, X_{it})$ even if the conditional expectation is a nonlinear function of observable characteristics. However, many studies analyzing the relationship between time-to-death and health care expenditure have not been based on linear regression. As mentioned previously, it is customary to take the large number of observations with zero expenditure into account by estimating a two-part model. The rationale for using this approach is that HCE has a special distribution with a large number of zeros and possibly very long right tails. Thus, we followed the approach taken by Atella and Conti (2014) and estimated a probit model in the first step and then a generalized linear model with log link and gamma distribution in the second step. Our model is thus given by the two equations

$Display mathematics$
(4)

$Display mathematics$
(5)

where $Φ(⋅)$ represents the cumulative distribution function (cdf) of the standard normal distribution.
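To make the mechanics of the two-part prediction concrete, the following sketch combines a probit first part with a log-link second part. It assumes the standard two-part formula, in which expected expenditure equals the probability of positive expenditure times the conditional mean given positive expenditure. The coefficient vectors `gamma` and `delta` and all numerical values are hypothetical placeholders, not estimates from this article.

```python
import math


def std_normal_cdf(z):
    """Standard normal cdf, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_part_expected_cost(x, gamma, delta):
    """Expected cost: P(y > 0 | x) from a probit times E[y | y > 0, x] from a log-link GLM."""
    index_any = sum(xi * gi for xi, gi in zip(x, gamma))     # probit index
    index_pos = sum(xi * di for xi, di in zip(x, delta))     # log of the conditional mean
    p_positive = std_normal_cdf(index_any)
    mean_if_positive = math.exp(index_pos)
    return p_positive * mean_if_positive


# Hypothetical coefficients: an intercept and one indicator for an older age band.
gamma = [0.0, 0.5]
delta = [math.log(1000.0), 0.4]

baseline = two_part_expected_cost([1.0, 0.0], gamma, delta)
older = two_part_expected_cost([1.0, 1.0], gamma, delta)
print(round(baseline, 2), round(older, 2))
```

Because the prediction is a product of two nonlinear functions, a coefficient in either part does not translate one-for-one into a change in expected cost, which is one reason two-part results are usually reported as marginal effects on the expenditure scale.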

## Baseline Results

Results from these different specifications are presented in the left panel of Table 2. The two leftmost columns include estimates not controlling for TTD, and the two following columns include the corresponding specifications, which include dummy variables representing TTD measured in quarters (where we give the final quarter before death the value 1).

Table 2. Main Estimation Results

| Variable | Baseline (1) | Baseline (2) | Baseline (3) | Baseline (4) | Including Morbidity (1) | Including Morbidity (2) | Including Morbidity (3) | Including Morbidity (4) | Including Morbidity (1) | Including Morbidity (2) | Including Morbidity (3) | Including Morbidity (4) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Controlling for TTD | No | No | Yes | Yes | No | No | No | No | Yes | Yes | Yes | Yes |
| Morbidity controls | | | | | 2005 | 2005 | 2005 & $t−1$ | 2005 & $t−1$ | 2005 | 2005 | 2005 & $t−1$ | 2005 & $t−1$ |
| Estimator | OLS | Two-Part | OLS | Two-Part | OLS | Two-Part | OLS | Two-Part | OLS | Two-Part | OLS | Two-Part |
| 35–39 | −3.158 (33.35) | 6.4429 (32.65) | −15.108 (33.29) | −9.9822 (34.05) | −26.601 (32.35) | −11.119 (30.27) | −25.535 (31.62) | −9.1444 (30.65) | −34.083 (32.27) | −19.666 (31.47) | −32.648 (31.59) | −16.613 (31.85) |
| 40–44 | −3.6365 (38.42) | 3.9051 (36.79) | −21.434 (38.34) | −20.192 (38.11) | −39.297 (36.84) | −24 (33.56) | −35.8 (35.71) | −24.212 (33.81) | −50.505 (36.76) | −38.364 (34.68) | −46.22 (35.72) | −36.992 (34.96) |
| 45–49 | 132.32*** (40.52) | 144.79*** (39.06) | 99.185** (39.95) | 107.61*** (39.3) | 82.632** (39.18) | 116.39*** (36.4) | 71.55* (37.86) | 106.31*** (35.83) | 61.7 (38.73) | 94.569*** (36.56) | 53.289 (37.57) | 89.236** (36.62) |
| 50–54 | 408.88*** (43.94) | 425.88*** (42.84) | 364.8*** (42.79) | 369.23*** (41.86) | 340.25*** (42.47) | 384.76*** (40) | 306.25*** (41) | 357.78*** (39.18) | 311.67*** (41.52) | 346.23*** (39.07) | 283.91*** (40.32) | 328.18*** (38.85) |
| 55–59 | 768.67*** (50.46) | 770.88*** (47.92) | 699.39*** (49.54) | 718.28*** (47.92) | 659.36*** (49.25) | 710.86*** (45.8) | 588.56*** (47.28) | 659.61*** (43.95) | 616.36*** (48.47) | 678.33*** (45.51) | 556.01*** (46.78) | 637.83*** (44.4) |
| 60–64 | 986.92*** (59.62) | 1,017*** (59.79) | 910.25*** (59.31) | 972.8*** (60.42) | 820.56*** (57.74) | 898.97*** (54.08) | 693.12*** (54.99) | 791.35*** (49.72) | 782.35*** (57.59) | 887.87*** (55.16) | 671.62*** (55.12) | 789.98*** (51.15) |
| 65–69 | 805.79*** (60.62) | 846.71*** (60.8) | 688.81*** (57.95) | 731.32*** (57.67) | 600.02*** (57.05) | 665.54*** (52.29) | 427.35*** (54.15) | 513.89*** (47.96) | 533.14*** (55.45) | 611.91*** (51.78) | 382.68*** (52.89) | 476.24*** (48.09) |
| 70–74 | 1,205*** (81.25) | 1,266*** (82.31) | 1,034*** (78.83) | 1,147*** (78.27) | 841.95*** (78.62) | 977.49*** (71.7) | 554.73*** (74.12) | 733.02*** (61.93) | 753.72*** (76.27) | 915.63*** (69.69) | 504.64*** (72.32) | 701.01*** (61.54) |
| 75–79 | 1,522*** (123.3) | 1,624*** (128.6) | 1,203*** (118.1) | 1,355*** (118.2) | 993.72*** (115.6) | 1,169*** (103.9) | 608.76*** (106.7) | 827.76*** (84.57) | 810.08*** (112.2) | 1,009*** (95.55) | 482.03*** (103.9) | 711.67*** (77.55) |
| 80+ | 2,529*** (169.4) | 2,589*** (181.3) | 1,384*** (171.9) | 1,648*** (149.1) | 1,763*** (158.1) | 1,744*** (145.4) | 1,122*** (138.7) | 1,119*** (107.2) | 911.25*** (161) | 1,170*** (120.6) | 410.28*** (144.4) | 729.95*** (94.13) |
| TTD (quarters) 1 | | | 11,301*** (736.2) | 9,284*** (653.1) | | | | | 10,354*** (743.4) | 7,658*** (596.7) | 9,629*** (743.8) | 5,888*** (523.3) |
| TTD (quarters) 2 | | | 11,532*** (939.3) | 9,594*** (866) | | | | | 10,670*** (939.4) | 7,870*** (814.3) | 10,035*** (934.8) | 5,942*** (702.7) |
| TTD (quarters) 3 | | | 7,356*** (675.5) | 6,157*** (645.4) | | | | | 6,601*** (671.1) | 4,883*** (607.6) | 6,068*** (666.8) | 3,737*** (562) |
| TTD (quarters) 4 | | | 5,679*** (533.5) | 4,729*** (481.8) | | | | | 4,947*** (530.5) | 3,762*** (438.5) | 4,541*** (515.3) | 2,815*** (360.4) |
| TTD (quarters) 5 | | | 6,050*** (596.6) | 4,963*** (524.5) | | | | | 5,329*** (596.8) | 4,031*** (503.4) | 4,980*** (589.1) | 3,390*** (470.6) |
| TTD (quarters) 6 | | | 5,724*** (666.6) | 4,823*** (671.8) | | | | | 5,014*** (660.5) | 3,757*** (640.5) | 4,669*** (650.4) | 2,984*** (488.3) |
| TTD (quarters) 7 | | | 5,142*** (783.3) | 4,312*** (846.1) | | | | | 4,460*** (779.2) | 3,456*** (863.9) | 4,191*** (772.8) | 2,815*** (633.2) |
| TTD (quarters) 8 | | | 3,946*** (674.1) | 3,342*** (709.8) | | | | | 3,302*** (666.6) | 2,602*** (707.2) | 3,065*** (659.9) | 2,149*** (535.8) |
| TTD (quarters) 9 | | | 3,262*** (586.8) | 2,521*** (500.1) | | | | | 2,645*** (572.2) | 1,741*** (392.6) | 2,349*** (564.9) | 1,370*** (330.4) |
| TTD (quarters) 10 | | | 2,852*** (462.7) | 2,084*** (367.6) | | | | | 2,260*** (439.6) | 1,352*** (290.2) | 1,972*** (435.2) | 1,104*** (285.3) |
| TTD (quarters) 11 | | | 2,617*** (491.5) | 2,003*** (410.4) | | | | | 2,023*** (468.1) | 1,259*** (309.5) | 1,749*** (458.1) | 914.53*** (244.7) |
| TTD (quarters) 12 | | | 1,852*** (416.3) | 1,435*** (376.6) | | | | | 1,272*** (400.4) | 957.68*** (353) | 1,015** (395.4) | 793.51** (352.3) |
| TTD (quarters) 13 | | | 2,078*** (416.3) | 1,547*** (333.6) | | | | | 1,531*** (398.3) | 1,043*** (296.7) | 1,348*** (395.8) | 941.77*** (302.3) |
| TTD (quarters) 14 | | | 2,227*** (496.6) | 1,650*** (407.3) | | | | | 1,643*** (483.7) | 1,118*** (346.6) | 1,522*** (478.1) | 1,037*** (347) |
| TTD (quarters) 15 | | | 1,956*** (521) | 1,328*** (408) | | | | | 1,406*** (497.7) | 892.81** (356) | 1,391*** (488.1) | 900.4*** (348.2) |
| TTD (quarters) 16 | | | 1,351*** (434.7) | 892.89*** (324.9) | | | | | 997.75** (433.3) | 744.53** (329.1) | 1,043** (424.3) | 785.01** (337.6) |
| TTD (quarters) 17 | | | 737.18* (385.1) | 415.49 (279.5) | | | | | 539.56 (383.2) | 376.58 (297.2) | 660.69* (387.8) | 556.12* (327.9) |
| TTD (quarters) 18 | | | 1,156** (493.1) | 655.41* (347.5) | | | | | 946.64** (466.1) | 467.21 (322.2) | 1,069** (464.1) | 625.99* (357) |
| TTD (quarters) 19 | | | 146.94 (601.9) | −85.855 (381.1) | | | | | 125.84 (604.2) | −28.717 (449.9) | 263.63 (615.6) | 186.36 (564.6) |
| TTD (quarters) 20 | | | 281.9 (656.2) | 132.95 (448.7) | | | | | 55.986 (700.2) | 136.16 (440.3) | 133.06 (747) | 336.78 (507.5) |
| year | 9.0498* (5.149) | 10.705** (5.124) | 24.409*** (4.751) | 26.229*** (4.649) | 21.357*** (4.992) | 26.726*** (4.939) | −14.101*** (5.016) | −8.813* (4.731) | 31.55*** (4.727) | 33.88*** (4.643) | −1.1444 (4.69) | 0.69503 (4.504) |

Note. Standard errors, clustered at the individual level, in parentheses. All specifications include a dummy variable for female sex, 16 federal state indicators, and six occupation indicators. “Avg. Cost” represents mean HCE in the estimation sample.

The age gradient in health care spending is obvious: according to all specifications, older people have significantly higher costs. We also see that costs increase rapidly in the last quarters before death: in the last two quarters, costs are more than 10,000 Euro higher—which is almost ten times the average cost for the entire sample. Over the last five years of life, the cumulative cost increase amounts to around 88,000 Euro. However, despite these large spikes in costs in the last years, taking time-to-death into account does not alter the age profile of costs by very much. When we calculate expected costs under the counterfactual scenario that none of the 2010 deaths occurred, we note a reduction by 4.7–4.8% which can be attributed to TTD.

Figure 2(a). Comparison of baseline estimates: (a) age gradient.

Figure 2(b). Comparison of baseline estimates: (b) time to death.

## Extensions

### Including Morbidity

In the literature it has been common to control for the morbidity of patients. Adding proxies for morbidity to the regressions may change the interpretation of estimates, because a progressive increase in morbidity eventually leads to death. Thus, dummy variables representing the presence of a number of different diagnoses are added.3 This is done in two steps. First we control only for the presence of a diagnosis in 2005. It can be argued that these variables, measured up to five years before death, capture unobserved heterogeneity rather than the process of increasing frailty that is attributed to the TTD coefficients. In a second step we also control for the diagnoses being present in the previous year. The idea here is to check whether time-to-death is simply a proxy for morbidity, in which case the TTD coefficients can be expected to approach zero when the variable is added.

Results are presented in the right panel of Table 2. It is clear that taking diagnoses into account flattens the age gradient somewhat. However, the results seem to refute the idea that TTD is a simple proxy for morbidity, because the rapid cost increase in the last quarters before death is visible in this specification as well. Also, partialling out TTD-related costs is associated with a reduction of expected costs by 3.8–4.7% in these specifications—and this holds regardless of whether the baseline results or the new estimates including diagnoses for the comparison are used.

### Age-Specific TTD Coefficients

One notable difference between methods in the baseline specifications was the magnitude of the TTD coefficients. One possible reason for this discrepancy could be that the TTD effects are different at different ages. We now allow for this and rerun the specifications. Figure 2(b) shows that the relationship between expenditure and time-to-death is approximately log-linear. In order to keep the number of parameters at a tractable level, we now include the logarithm of TTD and a dummy variable for having died during the sample period, and interact these variables with the different age group dummies. Our estimating equation thus becomes:

$Display mathematics$
(6)

and with the corresponding changes in the two-part model.
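The log-linear parameterization can be illustrated with a short sketch. It assumes that, within an age group, the death-related excess cost of a decedent is the coefficient on the death indicator plus the coefficient on $\ln(TTD)$ times the log of the remaining quarters, so that the death indicator alone gives the excess cost in the final quarter ($TTD=1$). The coefficient values are purely illustrative and the function name is hypothetical.

```python
import math


def death_related_cost(ttd_quarters, beta_d, beta_q):
    """Excess quarterly cost relative to a comparable survivor.

    ttd_quarters: quarters remaining before death (1 = final quarter),
                  or None for an individual who does not die in the sample.
    beta_d:       coefficient on the death indicator.
    beta_q:       coefficient on ln(TTD).
    """
    if ttd_quarters is None:
        return 0.0
    if ttd_quarters < 1:
        raise ValueError("TTD is measured in quarters, starting at 1")
    return beta_d + beta_q * math.log(ttd_quarters)


# Illustrative coefficients for one age group (not estimates from the tables).
beta_d, beta_q = 1300.0, -340.0
profile = [death_related_cost(q, beta_d, beta_q) for q in range(1, 21)]
print([round(v) for v in profile])
```

With a negative coefficient on $\ln(TTD)$, the excess cost is largest in the final quarter and declines as death lies further in the future. Interacting both coefficients with age group dummies lets this profile differ by age with two parameters per group instead of 20.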

Results regarding the two variables related to death are provided in Figure 3. The left figure plots the estimated increase in costs in the last quarter of life ($\beta_d$) compared with survivors with the same characteristics. The two estimators appear to agree that the youngest (30–34) and the oldest (80+) individuals in our sample have significantly lower costs in the last quarter of life than the rest (Figure 3(a)). The coefficient associated with the $\ln(TTD)$ variable ($\beta_q$), which shows how steeply the costs increase in the quarters leading up to death, is also much lower for these two groups. Apart from that, there are no significant differences between age groups: the trajectory of expenditure in the last 20 quarters of life appears to be largely independent of age.

Figure 3(a). Cost of dying by age groups: (a) cost increase, quarter of death.

Figure 3(b). Cost of dying by age groups: (b) time to death coefficient.

The age coefficients estimated in these regressions are reported in Table 3, where baseline estimates from Table 2 are included for comparison in columns (1) & (2). The impact of controlling for TTD on the age gradient is smaller than in the baseline estimates, but the overall effect on expected costs is nevertheless the same: taking TTD into account reduces the predicted costs by 4.4–4.6%, irrespective of the estimation method used.
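The percentage reduction attributed to TTD is the relative difference between mean HCE and the predicted mean when decedents are treated as survivors. The sketch below applies this to the rounded “Avg. Cost” and “Avg. Cost excl. TTD” values reported in Table 3. Because it starts from rounded inputs, it reproduces the reported changes only approximately; the function name is hypothetical.

```python
def percent_change(avg_cost, avg_cost_excl_ttd):
    """Relative change in mean cost when death-related costs are removed, in percent."""
    return 100.0 * (avg_cost_excl_ttd - avg_cost) / avg_cost


# Rounded values from Table 3 (OLS and two-part model, respectively).
ols_change = percent_change(1369.0, 1306.0)
tpm_change = percent_change(1370.0, 1307.0)
print(round(ols_change, 2), round(tpm_change, 2))
```

Both figures come out at roughly −4.6%, in line with the changes of −4.588% and −4.634% reported in the table.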

Table 3. Estimates with Age-Specific TTD Effects

| Variable | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Controlling for TTD | No | No | Yes | Yes |
| Estimator | OLS | TPM | OLS | TPM |
| 35–39 | −3.158 (33.35) | 6.4429 (32.65) | 0 (.) | 0 (.) |
| 40–44 | −3.6365 (38.42) | 3.9051 (36.79) | 5.7872 (38.51) | 5.128 (37.64) |
| 45–49 | 132.32*** (40.52) | 144.79*** (39.06) | 123.1*** (39.33) | 128.3*** (39.34) |
| 50–54 | 408.88*** (43.94) | 425.88*** (42.84) | 398.32*** (40.77) | 407.62*** (41.49) |
| 55–59 | 768.67*** (50.46) | 770.88*** (47.92) | 706.96*** (44.92) | 699.18*** (43.83) |
| 60–64 | 986.92*** (59.62) | 1,017*** (59.79) | 922.24*** (54.87) | 943.72*** (56.79) |
| 65–69 | 805.79*** (60.62) | 846.71*** (60.8) | 688.08*** (52.74) | 717.19*** (54.35) |
| 70–74 | 1,205*** (81.25) | 1,266*** (82.31) | 1,070*** (74.03) | 1,119*** (75.23) |
| 75–79 | 1,522*** (123.3) | 1,624*** (128.6) | 1,269*** (113) | 1,351*** (118.3) |
| 80+ | 2,529*** (169.4) | 2,589*** (181.3) | 1,731*** (160.9) | 1,798*** (165.7) |
| year | 9.0498* (5.149) | 10.705** (5.124) | 25.065*** (4.703) | 26.161*** (4.629) |
| Avg. Cost | | | 1,369 | 1,370 |
| Avg. Cost excl. TTD | | | 1,306 | 1,307 |
| Change (%) | | | −4.588 | −4.634 |
| Observations | 449,494 | 449,494 | 449,494 | 449,494 |
| Persons | 22,711 | 22,711 | 22,711 | 22,711 |

Note. Standard errors, clustered at the individual level, in parentheses. All specifications include a dummy variable for female sex, 16 federal state indicators, and six occupation indicators. “Avg. Cost” represents mean HCE in the estimation sample; “Avg. Cost excl. TTD” represents predicted per capita HCE when decedents are treated as survivors.

### Results by Service Type

Next, a comparison is made of the importance of controlling for time-to-death for different service types. The care types that are most likely to be sensitive to aging and time-to-death are inpatient medical care, outpatient medical care, and long-term care. In the data, the distinction between outpatient and inpatient care is not completely clear, because all hospital care is assigned to the same category. For that reason, the two categories are defined as “office-based care,” which refers to outpatient care administered outside the hospital, and “hospital care,” which includes inpatient and outpatient care provided in a hospital. Equation (6) for each of the three service types has thus been estimated separately. Results are provided in Table 4.

The results in Table 4 reveal considerable heterogeneity by service type. Office-based care is the largest component of total spending, but it has the flattest age profile. For this cost category, the impact of controlling for TTD is smaller than for total expenditure: removing death-related costs reduces predicted aggregate spending by as little as 2%. Nevertheless, costs are 1,500 to 3,600 Euro higher in the last quarter of life (Atella & Conti, 2014). At the opposite extreme, LTC costs are very small on average but exhibit a very steep age gradient. In this case, removing death-related costs would reduce aggregate spending by as much as 29–36% (Häkkinen et al., 2008). For pharmaceuticals, the estimated impact of TTD is quite similar to that for overall costs, whereas hospital care costs are slightly more sensitive to the inclusion of TTD.

Table 4. Results by Service Type

| Variable | Office-Based Care (1) | Office-Based Care (2) | Office-Based Care (3) | Office-Based Care (4) | Hospital Care (5) | Hospital Care (6) | Hospital Care (7) | Hospital Care (8) | Long-Term Care (9) | Long-Term Care (10) | Long-Term Care (11) | Long-Term Care (12) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Controlling for TTD | No | No | Yes | Yes | No | No | Yes | Yes | No | No | Yes | Yes |
| Estimator | OLS | TPM | OLS | TPM | OLS | TPM | OLS | TPM | OLS | TPM | OLS | TPM |
| died | | | 1,322*** (139.9) | 2,542*** (408.2) | | | 7,131*** (529.6) | 15,654*** (1,711) | | | 1,544*** (147.6) | 2,067*** (449.9) |
| lttd | | | −337.34*** (59.54) | −555.08*** (147.9) | | | −2,675*** (224.5) | −4,520*** (623.5) | | | −378.11*** (69.16) | −341.45*** (99.85) |
| 35–39 | −26.734 (16.3) | −18.616 (15.43) | −28.77* (16.29) | −22.216 (15.74) | −14.045 (14.66) | −14.62 (13.08) | −17.926 (14.58) | −20.41 (14.14) | 0.7493 (2.155) | 3.466 (2.548) | −1.7594 (2.199) | 5.4671 (3.639) |
| 40–44 | −62.697*** (18.42) | −50.528*** (16.95) | −65.683*** (18.39) | −56.364*** (17.16) | −21.456 (15.71) | −23.122* (13.91) | −27.471* (15.7) | −29.671** (15.13) | 5.4358 (5.65) | 7.3829* (4.362) | 1.7619 (5.662) | 9.1976* (5.211) |
| 45–49 | −31.219* (18.82) | −16.097 (17.4) | −36.491* (18.79) | −23.03 (17.67) | 19.602 (18.39) | 17.142 (16.12) | 7.206 (18.06) | 6.2827 (16.8) | 4.3558 (3.49) | 6.6354** (2.957) | −2.096 (3.567) | 7.5037** (3.295) |
| 50–54 | 50.445** (19.91) | 71.3*** (18.49) | 43.887** (19.84) | 61.172*** (18.66) | 95.775*** (19.69) | 92.327*** (17.47) | 77.564*** (19.03) | 73.359*** (17.33) | 8.0776* (4.514) | 19.15*** (6.768) | 0.10261 (4.417) | 20.949*** (7.081) |
| 55–59 | 189.2*** (22.18) | 201.09*** (20.44) | 179.25*** (22.16) | 191.92*** (20.7) | 196.11*** (23.38) | 191.18*** (20.72) | 166.04*** (22.61) | 166.6*** (20.3) | 8.1332* (4.483) | 16.496*** (4.612) | −3.9216 (4.749) | 16.631*** (4.566) |
| 60–64 | 213.09*** (23.1) | 235.05*** (21.97) | 201.34*** (23.14) | 223.99*** (22.16) | 274.13*** (27.11) | 269.27*** (24.49) | 243.23*** (26.83) | 249.32*** (24.58) | 18.405* (10.57) | 24.255*** (7.162) | 4.081 (10.84) | 24.237*** (7.08) |
| 65–69 | 122.9*** (23.32) | 146.99*** (22.22) | 106.43*** (22.79) | 125.69*** (21.36) | 313.86*** (30.82) | 316.87*** (28.66) | 262.03*** (29.06) | 267.19*** (26.09) | 25.435* (13.47) | 35.73*** (8.485) | 5.5185 (13.47) | 31.42*** (7.552) |
| 70–74 | 190.43*** (25.19) | 221.1*** (24.63) | 166.23*** (25.5) | 200.47*** (24.44) | 460.27*** (37.57) | 475.22*** (38.11) | 384.05*** (37.46) | 423.23*** (36.64) | 61.536*** (20.1) | 71.672*** (17.13) | 32.278 (19.96) | 60.123*** (14.57) |
| 75–79 | 222.68*** (31.21) | 259.57*** (30.8) | 177.75*** (32.23) | 223.41*** (30.09) | 559.99*** (55.66) | 578.14*** (56.25) | 417.33*** (53.01) | 450.75*** (46.44) | 278.5*** (57.76) | 261.4*** (55.89) | 224.19*** (57.2) | 219.12*** (49.54) |
| 80+ | 211.12*** (35.51) | 253.12*** (34.59) | 55.203 (42.53) | 149.06*** (32.63) | 737.96*** (63.93) | 749.89*** (64.18) | 205.2*** (74.92) | 361.53*** (44.09) | 1,109*** (105.4) | 927.46*** (122.4) | 921.08*** (96.87) | 560.83*** (87.9) |
| year | 5.8623*** (1.731) | 5.8792*** (1.732) | 8.5759*** (1.691) | 9.0108*** (1.673) | −2.4918 (2.965) | −2.4557 (2.639) | 2.1892 (2.794) | 2.6729 (2.475) | 1.7547* (1.041) | 1.8887* (1.014) | 5.1073*** (0.9413) | 6.0072*** (1.126) |
| Avg. Cost | | | 497 | 497 | | | 267 | 267 | | | 40 | 40 |
| Avg. Cost excl. TTD | | | 488 | 488 | | | 237 | 238 | | | 30 | 27 |
| Change (%) | | | −1.770 | −1.753 | | | −11.115 | −10.996 | | | −26.379 | −31.950 |
| Observations | 449,494 | 449,494 | 449,494 | 449,494 | 449,494 | 449,494 | 449,494 | 449,494 | 449,494 | 449,494 | 449,494 | 449,494 |
| Persons | 22,711 | 22,711 | 22,711 | 22,711 | 22,711 | 22,711 | 22,711 | 22,711 | 22,711 | 22,711 | 22,711 | 22,711 |

Note. Standard errors, clustered at the individual level, in parentheses. All specifications include a dummy variable for female sex, 16 federal state indicators, and six occupation indicators. “Avg. Cost” represents mean HCE in the estimation sample; “Avg. Cost excl. TTD” represents predicted per capita HCE when decedents are treated as survivors.

The estimates by service type expose large and relevant differences between estimators. They disagree somewhat regarding the age gradient in costs, and also regarding the sensitivity of the age gradient to the inclusion of TTD.

## Validation

It has been shown that, even though the estimators compared deliver qualitatively similar results in most settings, they sometimes disagree regarding effect sizes. A formal validation exercise now compares actual expenditures to those predicted by the estimators according to three criteria: the aggregate accuracy, as expressed by the predicted mean; the root mean square error ($RMSE=\sqrt{\sum_{i=1}^{N}(\hat{y}_i-y_i)^2/N}$); and the mean absolute error ($MAE=N^{-1}\sum_{i=1}^{N}|\hat{y}_i-y_i|$). The RMSE is closely related to the minimand of the OLS estimator, and hence OLS might be expected to perform particularly well according to this measure in a within-sample evaluation. On the other hand, the two-part estimator considered estimates twice as many parameters and should thus be able to perform at least as well as OLS.
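The three criteria can be computed as in the following sketch. The function name and the four toy observations are hypothetical; they are chosen only to show that a predictor can match the aggregate mean exactly while individual-level errors remain sizeable.

```python
import math


def validation_metrics(actual, predicted):
    """Mean actual, mean predicted, RMSE, and MAE for paired observations."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("actual and predicted must be non-empty and of equal length")
    errors = [p - a for a, p in zip(actual, predicted)]
    return {
        "mean_actual": sum(actual) / n,
        "mean_predicted": sum(predicted) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mae": sum(abs(e) for e in errors) / n,
    }


# Toy quarterly expenditures (hypothetical values).
actual = [0.0, 200.0, 1000.0, 4800.0]
predicted = [300.0, 300.0, 900.0, 4500.0]

metrics = validation_metrics(actual, predicted)
print(metrics)
```

In this toy case the predicted mean equals the actual mean even though the RMSE is about 224 and the MAE is 200. Aggregate accuracy and individual-level accuracy are therefore distinct criteria and need not rank estimators in the same way.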

For the validation exercise, the performance of the estimators is evaluated in four distinct samples, defined by the initial randomization and by the time period considered. For within-sample, within-period validation, the regression sample is used. Within-sample, out-of-period validation considers individuals in the regression sample observed in 2011. The validation sample is used for out-of-sample, within-period validation, using observations from 2006–2010 for that group, and, finally, an out-of-sample, out-of-period validation is conducted using the validation sample in 2011.
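The four-way design can be sketched as a random split of individuals combined with a split by calendar year. The seed, the function names, and the toy identifiers are assumptions made for illustration; the actual randomization procedure is not described in detail here.

```python
import random


def split_individuals(ids, seed=0):
    """Randomly assign half of the individuals to estimation, the rest to validation."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return set(shuffled[:half]), set(shuffled[half:])


def validation_panel(in_estimation_sample, year):
    """Label an observation with one of the four validation settings."""
    if 2006 <= year <= 2010:
        period = "within-period"
    elif year == 2011:
        period = "out-of-period"
    else:
        raise ValueError("year outside the 2006-2011 window")
    sample = "within-sample" if in_estimation_sample else "out-of-sample"
    return f"{sample}, {period}"


estimation_ids, validation_ids = split_individuals(range(10))
for person, year in [(0, 2008), (0, 2011)]:
    print(person, year, validation_panel(person in estimation_ids, year))
```

The models are fitted only on estimation-sample observations from 2006–2010; the other three settings reuse those fitted coefficients without re-estimation.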

Results for baseline estimates are presented in Table 5. In the first two rows of each panel, the actual mean is compared to the predicted mean. After that the two validation criteria RMSE and MAE are reported, and finally the number of parameters estimated in the models.

Table 5. Validation: Baseline Estimates

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Controlling for TTD | No | No | Yes | Yes |
| Estimator | OLS | Two-Part | OLS | Two-Part |
| I. Within-sample, within-period (2006–2010) | | | | |
| Mean actual | 1,368.6 | 1,368.6 | 1,368.6 | 1,368.6 |
| Mean predicted | 1,368.6 | 1,370.1 | 1,368.6 | 1,376.0 |
| RMSE | 3,424 | 3,425 | 3,357 | 3,376 |
| MAE | 1,493 | 1,494 | 1,469 | 1,474 |
| Parameters | 35 | 70 | 55 | 110 |
| II. Within-sample, out-of-period (2011) | | | | |
| Mean actual | 1,515.5 | 1,515.5 | 1,515.5 | 1,515.5 |
| Mean predicted | 1,507.1 | 1,519.4 | 1,504.4 | 1,522.6 |
| RMSE | 3,881 | 3,883 | 3,813 | 3,822 |
| MAE | 1,643 | 1,649 | 1,626 | 1,635 |
| Parameters | 35 | 70 | 55 | 110 |
| III. Out-of-sample, within-period (2006–2010) | | | | |
| Mean actual | 1,366.8 | 1,366.8 | 1,366.8 | 1,366.8 |
| Mean predicted | 1,368.5 | 1,369.6 | 1,368.3 | 1,375.8 |
| RMSE | 3,424 | 3,425 | 3,344 | 3,356 |
| MAE | 1,486 | 1,486 | 1,461 | 1,464 |
| Parameters | 35 | 70 | 55 | 110 |
| IV. Out-of-sample, out-of-period (2011) | | | | |
| Mean actual | 1,483.6 | 1,483.6 | 1,483.6 | 1,483.6 |
| Mean predicted | 1,506.7 | 1,518.5 | 1,503.9 | 1,523.4 |
| RMSE | 3,727 | 3,729 | 3,675 | 3,688 |
| MAE | 1,621 | 1,627 | 1,605 | 1,615 |
| Parameters | 35 | 70 | 55 | 110 |

One very clear result emerges from Table 5: irrespective of the validation sample, the criterion, and the independent variables included in the model, there is one clear “winner” among the estimators. OLS performs at least as well as the two-part model in terms of RMSE and MAE in every panel, and it outperforms the two-part model regarding the predicted mean for all samples but the within-sample, out-of-period case. In general, the values attained for RMSE and MAE might seem large, but a comparison with Table 1 reveals that the errors are comparable in size to, or smaller than, the standard deviations of the outcome variable.
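The statement about the predicted mean can be checked directly against the figures reported in Table 5. The following sketch copies the mean actual and mean predicted values from the table and, for each panel and specification, picks the estimator whose predicted mean lies closer to the actual mean. The dictionary layout and the helper name `closer_estimator` are illustrative assumptions.

```python
# Mean actual and mean predicted HCE as reported in Table 5
# (TPM = two-part model; "no TTD" / "TTD" = without / with time-to-death controls).
TABLE5_MEANS = {
    "I":   {"actual": 1368.6,
            "no TTD": {"OLS": 1368.6, "TPM": 1370.1},
            "TTD":    {"OLS": 1368.6, "TPM": 1376.0}},
    "II":  {"actual": 1515.5,
            "no TTD": {"OLS": 1507.1, "TPM": 1519.4},
            "TTD":    {"OLS": 1504.4, "TPM": 1522.6}},
    "III": {"actual": 1366.8,
            "no TTD": {"OLS": 1368.5, "TPM": 1369.6},
            "TTD":    {"OLS": 1368.3, "TPM": 1375.8}},
    "IV":  {"actual": 1483.6,
            "no TTD": {"OLS": 1506.7, "TPM": 1518.5},
            "TTD":    {"OLS": 1503.9, "TPM": 1523.4}},
}

def closer_estimator(actual, predicted):
    """Name of the estimator whose predicted mean is closest to the actual mean."""
    return min(predicted, key=lambda name: abs(predicted[name] - actual))

winners = {
    (panel, spec): closer_estimator(row["actual"], row[spec])
    for panel, row in TABLE5_MEANS.items()
    for spec in ("no TTD", "TTD")
}
```

With these figures, `winners` names the two-part model only for panel II (within-sample, out-of-period), in both specifications, and OLS everywhere else.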

The same results are obtained when we conduct the validation exercise for the specifications controlling for morbidity in the right part of Table 2 and for the age-specific estimates presented in Table 3: the OLS prediction performs remarkably well by all criteria.4

The picture is a bit different, however, when estimates by service type are examined. In Table 6, OLS and the two-part model (TPM) appear to be roughly equivalent for office-based care and hospital care: they deliver almost identical results according to all criteria. When it comes to LTC, the OLS estimator dominates in predicting mean future LTC expenditure, whereas the two-part model sometimes attains greater accuracy within the same time period. In summary, there is a strong case for OLS when all service types are pooled, whereas the two estimators appear to be roughly equivalent in type-specific models.
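To illustrate how a two-part prediction is assembled, the following is a minimal sketch under strong simplifying assumptions: a single categorical covariate, a saturated first part (the share of individuals with positive spending in each group), and a second part that is simply the mean among positive spenders. Applied two-part models typically use a logit or probit first part and a regression or generalized linear model in the second part. All names and data here are hypothetical.

```python
from collections import defaultdict

def fit_two_part(groups, spending):
    """Per group: (share with positive spending, mean spending among the positives)."""
    n = defaultdict(int)
    n_pos = defaultdict(int)
    sum_pos = defaultdict(float)
    for g, y in zip(groups, spending):
        n[g] += 1
        if y > 0:
            n_pos[g] += 1
            sum_pos[g] += y
    return {
        g: (n_pos[g] / n[g], sum_pos[g] / n_pos[g] if n_pos[g] else 0.0)
        for g in n
    }

def predict_two_part(model, group):
    """Expected spending = P(spending > 0) * E[spending | spending > 0]."""
    p_positive, mean_if_positive = model[group]
    return p_positive * mean_if_positive

def fit_group_means(groups, spending):
    """OLS on a full set of group dummies reduces to the group means."""
    total = defaultdict(float)
    n = defaultdict(int)
    for g, y in zip(groups, spending):
        total[g] += y
        n[g] += 1
    return {g: total[g] / n[g] for g in n}

# Hypothetical toy data: age group and annual LTC spending.
groups = ["young", "young", "young", "old", "old", "old"]
spending = [0.0, 0.0, 300.0, 0.0, 900.0, 1500.0]

two_part = fit_two_part(groups, spending)
ols = fit_group_means(groups, spending)
```

In this saturated toy case the two predictions coincide for each group. Differences between the estimators can arise once the specification is not saturated or the second part is estimated on a transformed scale.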

Finally, it should be noted that these validation results also give an answer to the fundamental question of the “red herring” literature: do we get a better prediction of future costs when we take future mortality into account? The answer to this question appears to be “no.” For example, the aggregate prediction of the two-part model in panel IV of Table 5 becomes worse when TTD is taken into account, despite improvements in accuracy at the individual level. Predictions by service type also get worse when TTD is taken into account in a two-part model, whereas the accuracy of OLS estimates is largely unaffected.
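The panel IV comparison amounts to simple arithmetic on the figures in Table 5. The sketch below computes the absolute gap between predicted and actual mean HCE for each estimator, with and without TTD. The variable names are illustrative.

```python
# Panel IV of Table 5 (out-of-sample, out-of-period, 2011).
actual_mean = 1483.6
predicted_mean = {
    ("OLS", "no TTD"): 1506.7,
    ("TPM", "no TTD"): 1518.5,
    ("OLS", "TTD"): 1503.9,
    ("TPM", "TTD"): 1523.4,
}

# Absolute deviation of the predicted mean from the actual mean, in Euros.
abs_gap = {
    key: round(abs(value - actual_mean), 1)
    for key, value in predicted_mean.items()
}

# Change in the gap when TTD is added (positive = aggregate prediction gets worse).
change_with_ttd = {
    est: round(abs_gap[(est, "TTD")] - abs_gap[(est, "no TTD")], 1)
    for est in ("OLS", "TPM")
}
```

For the two-part model the gap widens from 34.9 to 39.8 once TTD is included. For OLS it moves from 23.1 to 20.3, so neither specification gains much in aggregate accuracy from TTD.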

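Table 6. Validation by Service Type

| | Office-Based Care | Office-Based Care | Office-Based Care | Office-Based Care | Hospital Care | Hospital Care | Hospital Care | Hospital Care | Long-Term Care | Long-Term Care | Long-Term Care | Long-Term Care |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Controlling for TTD | No | No | Yes | Yes | No | No | Yes | Yes | No | No | Yes | Yes |
| Estimator | OLS | TPM | OLS | TPM | OLS | TPM | OLS | TPM | OLS | TPM | OLS | TPM |
| Column | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) |
| I. Within-sample, within-period (2006–2010) | | | | | | | | | | | | |
| Mean actual | 497.2 | 497.2 | 497.2 | 497.2 | 267.0 | 267.0 | 267.0 | 267.0 | 40.2 | 40.2 | 40.2 | 40.2 |
| Mean predicted | 497.2 | 496.9 | 497.2 | 496.9 | 267.0 | 266.9 | 267.0 | 266.8 | 40.2 | 40.3 | 40.2 | 40.7 |
| RMSE | 1,225 | 1,225 | 1,222 | 1,222 | 2,196 | 2,196 | 2,167 | 2,169 | 550 | 547 | 541 | 534 |
| MAE | 564.8 | 564.38 | 563.76 | 563.22 | 502.72 | 502.62 | 490.15 | 489.7 | 73.149 | 69.647 | 74.279 | 65.207 |
| Parameters | 35 | 70 | 37 | 74 | 35 | 70 | 37 | 74 | 35 | 69 | 37 | 73 |
| II. Within-sample, out-of-period (2011) | | | | | | | | | | | | |
| Mean actual | 526.1 | 526.1 | 526.1 | 526.1 | 311.9 | 311.9 | 311.9 | 311.9 | 53.6 | 53.6 | 53.6 | 53.6 |
| Mean predicted | 539.3 | 542.0 | 540.0 | 544.9 | 292.8 | 292.4 | 287.4 | 291.8 | 55.0 | 56.7 | 55.9 | 59.1 |
| RMSE | 1,278 | 1,278 | 1,277 | 1,277 | 2,675 | 2,675 | 2,628 | 2,623 | 661 | 660 | 655 | 655 |
| MAE | 595.87 | 596.39 | 596.35 | 597.54 | 567.92 | 567.71 | 553.24 | 555.53 | 95.017 | 94.769 | 94.214 | 94.547 |
| Parameters | 35 | 70 | 37 | 74 | 35 | 70 | 37 | 74 | 35 | 69 | 37 | 73 |
| III. Out-of-sample, within-period (2006–2010) | | | | | | | | | | | | |
| Mean actual | 498.6 | 498.6 | 498.6 | 498.6 | 265.2 | 265.2 | 265.2 | 265.2 | 34.7 | 34.7 | 34.7 | 34.7 |
| Mean predicted | 497.5 | 496.9 | 503.4 | 504.0 | 267.1 | 266.9 | 299.2 | 304.0 | 40.8 | 40.9 | 47.8 | 47.4 |
| RMSE | 1,268 | 1,268 | 1,264 | 1,264 | 2,123 | 2,123 | 2,122 | 2,142 | 487 | 486 | 480 | 469 |
| MAE | 565.77 | 565.18 | 567.21 | 567.17 | 500.49 | 500.19 | 511.42 | 515.05 | 68.944 | 65.569 | 75.254 | 63.926 |
| Parameters | 35 | 70 | 37 | 74 | 35 | 70 | 37 | 74 | 35 | 69 | 37 | 73 |
| IV. Out-of-sample, out-of-period (2011) | | | | | | | | | | | | |
| Mean actual | 534.1 | 534.1 | 534.1 | 534.1 | 295.2 | 295.2 | 295.2 | 295.2 | 42.5 | 42.5 | 42.5 | 42.5 |
| Mean predicted | 539.2 | 541.7 | 539.8 | 544.8 | 293.0 | 292.3 | 287.6 | 291.8 | 55.8 | 57.8 | 56.7 | 60.4 |
| RMSE | 1,442 | 1,442 | 1,440 | 1,440 | 2,321 | 2,322 | 2,286 | 2,282 | 535 | 537 | 531 | 540 |
| MAE | 606.17 | 606.63 | 606.82 | 607.83 | 551.97 | 551.44 | 538.09 | 540.09 | 86.662 | 87.179 | 86.417 | 88.512 |
| Parameters | 35 | 70 | 37 | 74 | 35 | 70 | 37 | 74 | 35 | 69 | 37 | 73 |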

# Conclusion

The concern among policymakers and researchers that population aging will cause high future growth rates of per capita HCE and total HCE has been one motivation for the literature on aging and HCE. The cross-sectional observation that per person HCE is much higher among older than among younger persons initiated research that brought in TTD as a variable that seems to remove some of the effect of age. After the critique of the use of a Heckman selection model in this context, most of the recent literature makes use of variants of the two-part model. Contrary to the early contributions by Zweifel et al., which claimed age to be an irrelevant variable and HCE to be explained by TTD, the recent literature seems to give some role to age as well in the explanation. Typically, age seems to matter more for LTCE than for acute hospital care. When disability is accounted for, the effects of age and TTD diminish. Few articles validate their approach by comparing properties of different estimation models.

Many of the studies give the impression of estimating demand for HC and LTC that follows from characteristics of the population. What is actually studied is the utilization of HC and LTC. Both types of care are rationed from the supply side in many countries. Then, the study results perhaps reveal more of country-specific systems of supply, prioritization, and other institutional factors than usually described in the articles. Hence, institutional characteristics may explain why empirical results might not be so robust across countries. This supply-side perspective is completely absent in most studies, but Chang, He, and Hsieh (2014) represents a rare exception.

In order to gain an understanding of the divergent results of previous studies, and to evaluate popular methods used in the literature, an empirical analysis was conducted based on a claims data set from Germany. This analysis generated a number of useful insights. First, there is a significant age gradient in HCE: the oldest-old (80+) have HCE that is €3,000 higher than that of 30-year-olds. Second, the costs of dying are substantial: over the last five years of life, an average person generates HCE of €88,000 over and above the amount applying to a comparable survivor. Third, these “costs of dying” have a limited impact on the age gradient in HCE: the effect among the oldest-old is reduced by €1,000 and per capita HCE is reduced by at most 5% in a counterfactual scenario where everyone survives. We interpret these findings as evidence against the “red herring” hypothesis as initially stated.

Controlling for morbidity at the individual level changed results quantitatively but not qualitatively: the age gradient and the TTD coefficients are both reduced by doing so, but not by enough to warrant the conclusion that either of the two only serves as a proxy for morbidity. We also did not find much evidence of heterogeneity by age being a big issue: parameters related to TTD appear to be largely unrelated to age, with the possible exception of the oldest and youngest groups. On the other hand, there is substantial heterogeneity between service types: office-based outpatient care is the service type with the flattest age gradient and the smallest “red herring” effect, whereas the opposite applies to LTC costs.

The main contribution is, however, probably in the methodological domain. Results indicate that the choice of estimation method makes little difference in practice, as the methods considered deliver quantitatively similar results. In addition, when they differ, OLS often performs better than the alternatives. This is surprising given the very limited use of OLS in the literature. Given that OLS also requires only about half as many parameters as the two-part model (Tables 5 and 6), it appears to be a superior choice.

Finally, the validation exercise seems to suggest that the whole “red herring” debate might have been misinformed. When validating our methods out of sample and out of period, we find no evidence that including TTD leads to better predictions of aggregate future HCE, at least not over the short time horizon we are able to consider. It remains an open issue whether these results also hold when predicting future HCE over a longer time period. We conclude that the literature might benefit from focusing on the predictive power of the estimators instead of their actual fit to the data within the sample.

## References

Achen, C. H. (2000). Why lagged dependent variables can suppress the explanatory power of other independent variables. Working paper, University of Michigan. Ann Arbor, 1001(2000), 48106–1248.

Atella, V., & Conti, V. (2014). The effect of age and time to death on primary care costs: The Italian experience. Social Science & Medicine, 114, 10–17.

Balia, S., & Brau, R. (2014). A country for old men? Long-term home care utilization in Europe. Health Economics, 23(10), 1185–1212.

Breyer, F., & Felder, S. (2006). Life expectancy and health care expenditures: A new calculation for Germany using the costs of dying. Health Policy, 75(2), 178–186.

Breyer, F., Lorenz, N., & Niebel, T. (2015). Health care expenditures and longevity: Is there a Eubie Blake effect? European Journal of Health Economics, 16(1), 95–112.

Chang, S., He, Y., & Hsieh, C.-R. (2014). The determinants of health care expenditure toward the end of life: Evidence from Taiwan. Health Economics, 23(8), 951–961.

Charlson, M. E., Pompei, P., Ales, K. L., & MacKenzie, C. R. (1987). A new method of classifying prognostic comorbidity in longitudinal studies: Development and validation. Journal of Chronic Diseases, 40(5), 373–383.

de Meijer, C., Koopmanschap, M., van Doorslaer, E., & d’Uva, T. B. (2011). Determinants of long-term care spending: Age, time to death or disability? Journal of Health Economics, 30(2), 425–438.

Dormont, B., Grignon, M., & Huber, H. (2006). Health expenditure growth: Reassessing the threat of ageing. Health Economics, 15(9), 947–963.

Dow, W. H., & Norton, E. C. (2003). Choosing between and interpreting the Heckit and two-part models for corner solutions. Health Services and Outcomes Research Methodology, 4(1), 5–18.

Duan, N., Manning, W. G., Morris, C. N., & Newhouse, J. P. (1983). A comparison of alternative models for the demand for medical care. Journal of Business & Economic Statistics, 1(2), 115–126.

Forma, L., Rissanen, P., Aaltonen, M., Raitanen, J., & Jylhä, M. (2009). Age and closeness of death as determinants of health and social care utilization: A case-control study. European Journal of Public Health, 19(3), 313–318.

Fuchs, V. R. (1990). The health sector’s share of the gross national product. Science, 247(4942), 534–538.

Geue, C., Briggs, A., Lewsey, J., & Lorgelly, P. (2014). Population ageing and healthcare expenditure projections: New evidence from a time to death approach. European Journal of Health Economics, 15(8), 885–896.

Geue, C., Lorgelly, P., Lewsey, J., Hart, C., & Briggs, A. (2015). Hospital expenditure at the end-of-life: What are the impacts of health status and health risks? PLoS ONE, 10(3), e0119035.

Gregersen, F. A. (2014). The impact of ageing on health care expenditures: A study of steepening. European Journal of Health Economics, 15(9), 979–989.

Gregersen, F. A., & Godager, G. (2014). The association between age and mortality related hospital expenditures: Evidence from a complete national registry. Nordic Journal of Health Economics, 2(1).

Häkkinen, U., Martikainen, P., Noro, A., Nihtilä, E., & Peltola, M. (2008). Aging, health expenditure, proximity to death, and income in Finland. Health Economics, Policy and Law, 3(2), 165–195.

Hyun, K.-R., Kang, S., & Lee, S. (2015). Population aging and healthcare expenditure in Korea. Health Economics, 25(10), 1239–1251.

Jones, A. M. (2000). Health econometrics. Handbook of Health Economics, 1, 265–344.

Karlsson, M., Klein, T. J., & Ziebarth, N. R. (2016). Skewed, persistent and high before death: Medical spending in Germany. Fiscal Studies, 37(3–4), 527–559.

Karlsson, M., & Klohn, F. (2014). Testing the red herring hypothesis on an aggregated level: Ageing, time-to-death and care costs for older people in Sweden. European Journal of Health Economics, 15(5), 533–551.

Madden, D. (2008). Sample selection versus two-part models revisited: The case of female smoking and drinking. Journal of Health Economics, 27(2), 300–307.

Martikainen, P., Murphy, M., Metsä-Simola, N., Häkkinen, U., & Moustgaard, H. (2012). Seven-year hospital and nursing home care use according to age and proximity to death: Variations by cause of death and socio-demographic position. Journal of Epidemiology and Community Health, 66(12), 1152–1158.

Mason, T., Sutton, M., Whittaker, W., & Birch, S. (2015). Exploring the limitations of age-based models for health care planning. Social Science & Medicine, 132, 11–19.

McGrail, K., Green, B., Barer, M. L., Evans, R. G., Hertzman, C., & Normand, C. (2000). Age, costs of acute and long-term care and proximity to death: Evidence for 1987–88 and 1994–95 in British Columbia. Age and Ageing, 29(3), 249–253.

Melberg, H. O. (2012). Eldrebølgen: Bedre enn sitt rykte? In H. O. Melberg & L. E. Kjekshus (Eds.), Fremtidens Helse-Norge (pp. 187–202). Bergen, Norway: Fagbokforlaget.

Melberg, H. O., & Sørensen, J. (2013). How does end of life costs and increases in life expectancy affect projections of future hospital spending? Technical report, Oslo University, Health Economics Research Programme.

Mullahy, J. (1998). Much ado about two: Reconsidering retransformation and the two-part model in health econometrics. Journal of Health Economics, 17(3), 247–281.

Murphy, M., & Martikainen, P. (2013). Use of hospital and long-term institutional care services in relation to proximity to death among older people in Finland. Social Science & Medicine, 88, 39–47.

Norton, E. (2016). Health and long-term care. In J. Piggott & A. Woodland (Eds.), Handbook of the economics of population aging (vol. 1, pp. 951–989). Amsterdam, The Netherlands: Elsevier.

Norton, E. C. (2000). Long-term care. Handbook of Health Economics, 1, 955–994.

Payne, G., Laporte, A., Deber, R., & Coyte, P. C. (2007). Counting backward to health care’s future: Using time-to-death modeling to identify changes in end-of-life morbidity and the impact of aging on health care expenditures. Milbank Quarterly, 85(2), 213–257.

Polder, J. J., Barendregt, J. J., & van Oers, H. (2006). Health care costs in the last year of life: The Dutch experience. Social Science & Medicine, 63(7), 1720–1731.

Salas, C., & Raftery, J. P. (2001). Econometric issues in testing the age neutrality of health care expenditure. Health Economics, 10(7), 669–671.

Seshamani, M., & Gray, A. (2004a). Ageing and health-care expenditure: The red herring argument revisited. Health Economics, 13(4), 303–314.

Seshamani, M., & Gray, A. M. (2004b). A longitudinal study of the effects of age and time to death on hospital costs. Journal of Health Economics, 23(2), 217–235.

Spillman, B. C., & Lubitz, J. (2000). The effect of longevity on spending for acute and long-term care. New England Journal of Medicine, 342(19), 1409–1415.

Stearns, S. C., & Norton, E. C. (2004). Time to include time to death? The future of health care expenditure predictions. Health Economics, 13(4), 315–327.

van Baal, P. H., & Wong, A. (2012). Time to death and the forecasting of macro-level health care expenditures: Some further considerations. Journal of Health Economics, 31(6), 876–887.

Weaver, F., Stearns, S. C., Norton, E. C., & Spector, W. (2009). Proximity to death and participation in the long-term care market. Health Economics, 18(8), 867–883.

Werblow, A., Felder, S., & Zweifel, P. (2007). Population ageing and health care expenditure: A school of “red herrings”? Health Economics, 16(10), 1109–1126.

Wong, A., van Baal, P. H., Boshuizen, H. C., & Polder, J. J. (2011). Exploring the influence of proximity to death on disease-specific hospital expenditures: A carpaccio of red herrings. Health Economics, 20(4), 379–400.

Yang, Z., Norton, E. C., & Stearns, S. C. (2003). Longevity and health care expenditures: The real reasons older people spend more. Journals of Gerontology Series B: Psychological Sciences and Social Sciences, 58(1), S2–S10.

Yu, T. H.-K., Wang, D. H.-M., & Wu, K.-L. (2015). Reexamining the red herring effect on healthcare expenditures. Journal of Business Research, 68(4), 783–787.

Zweifel, P., Felder, S., & Meier, M. (1999). Ageing of population and health care expenditure: A red herring? Health Economics, 8(6), 485–496.

Zweifel, P., Felder, S., & Meier, M. (2001). Reply to: Econometric issues in testing the age neutrality of health care expenditure. Health Economics, 10(7), 673–674.

Zweifel, P., Felder, S., & Werblow, A. (2004). Population ageing and health care expenditure: New evidence on the “red herring”? Geneva Papers on Risk and Insurance Issues and Practice, 29(4), 652–666.

## Notes:

(1.) They use exactly the same model as in the original study (equation 1), except for excluding control variables for insurance type, which is not relevant in the English system.

(2.) However, one would expect that controlling for chronic conditions would dampen the association between age and HCE.

(3.) The diagnoses considered are those covered in the Charlson index and include AMI, CHF, PVD, CEVD, dementia, COPD, rheumatoid arthritis, PUD, mild liver disease, HP/PAPL, renal disease, cancers, metastatic cancer, and AIDS. For definitions, see Charlson, Pompei, Ales, and MacKenzie (1987).

(4.) We also have results for the Heckman estimator available upon request. The two-part model outperforms the Heckman model in almost all cases.