
Printed from Oxford Research Encyclopedias, Economics and Finance. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

date: 18 April 2024

Heterogeneity in Cost-Effectiveness Analysis


  • Ciaran N. Kohli-Lynch, Institute of Health and Wellbeing, University of Glasgow
  • Andrew H. Briggs, Institute of Health and Wellbeing, University of Glasgow


Cost-effectiveness analysis is conducted with the aim of maximizing population-level health outcomes given an exogenously determined budget constraint. Considerable health economic benefits can be achieved by reflecting heterogeneity in cost-effectiveness studies and implementing interventions based on this analysis. The following article describes forms of subgroup and heterogeneity in patient populations. It further discusses traditional decision rules employed in cost-effectiveness analysis and shows how these can be adapted to account for heterogeneity.

This article discusses the theoretical basis for reflecting heterogeneity in cost-effectiveness analysis and methodology that can be employed to conduct such analysis. Reflecting heterogeneity in cost-effectiveness analysis allows decision-makers to define limited use criteria for treatments with a fixed price. This ensures that only those patients who are cost-effective to treat receive an intervention. Moreover, when price is not fixed, reflecting heterogeneity in cost-effectiveness analysis allows decision-makers to signal demand for healthcare interventions and ensure that payers achieve welfare gains when investing in health.


  • Health, Education, and Welfare Economics


Since the late 20th century, researchers have unraveled the genomic, epigenomic, and behavioral bases for many health conditions. It has therefore become increasingly feasible to stratify patient populations into risk- and benefit-based subgroups. These developments have been met with a clinical trend toward stratified medicine, described by the World Health Organization as assessing “per population stratum” the benefit-risk profile of a healthcare intervention (Kaplan et al., 2013). Concurrently, researchers, healthcare institutions, funding bodies, and politicians have heralded an age of “personalized” and “precision” medicine. Though vaguely defined, these terms are closely related to stratified medicine and often invoke the use of novel diagnostic technology to stratify populations and dictate patient treatment (König, Fuchs, Hansen, Mutius, & Kopp, 2017; Schleidgen, Klingler, Bertram, Rogowski, & Marckmann, 2013).

Variability in a data set describes the extent to which data points are distributed around an average value. Heterogeneity in a patient population refers to variability in sociodemographic and biological characteristics between patients. A subgroup is a set of patients defined by one of these characteristics. Heterogeneity in outcome specifically refers to variability in health and cost outcomes between individuals receiving the same treatment that can be explained by variability in the patient population.

Patient outcomes may differ relatively or absolutely. Those receiving the same relative effect from a treatment will experience a treatment-related multiplicative alteration of their baseline health or cost outcome (e.g., 50% reduction in event probability, 20% increase in treatment costs). Those receiving the same absolute effect from a treatment will experience the same absolute change in outcome (e.g., one adverse event prevented, £150 additional cost incurred).

At its core, stratified medicine aims to address heterogeneity in health outcomes. It recognizes that average health outcomes attributable to a treatment often comprise systematically different patient-level outcomes. Stratified medicine aims to reflect this heterogeneity in health outcome in healthcare decisionmaking. Reflecting heterogeneous outcomes is arguably more important when considering the cost-effectiveness of a treatment. This is because cost-effectiveness results averaged over large populations disregard heterogeneity related to both health and cost outcomes.

Health systems implement cost-effective interventions with the aim of maximizing population-level health outcomes given an exogenously determined budget constraint. Considerable health economic benefit can be achieved by reflecting heterogeneity in cost-effectiveness studies and implementing interventions based on this analysis. Establishing limited use criteria for treatments ensures that only patients in whom an intervention is cost-effective receive treatment. Moreover, by reflecting heterogeneity in their decision-making process, healthcare decision makers can signal demand for products to healthcare providers. This enables payers to realize consumer surplus at the point of equilibrium between demand and supply of a healthcare product.

Forms of Subgroup and Heterogeneity

Interventions can produce heterogeneous outcomes due to a range of sociodemographic and biological factors. Sculpher (2008) lists six forms of subgroups and heterogeneity: intervention-related factors, factors unrelated to intervention but related to health condition, factors unrelated to intervention and health condition, factors unrelated to the patient, preferences, and factors revealed over time. These forms of heterogeneity exist in all clinical areas, from chronic to acute conditions.

The remainder of this section describes these forms of subgroups and heterogeneity. These descriptions are supplemented with examples from cardiovascular disease (CVD), to highlight the existence of heterogeneity in a highly prevalent chronic condition, along with examples from other clinical areas.


Intervention-related factors are commonly considered in studies of clinical effectiveness. These characteristics typically indicate differential relative outcomes in a population and can be referred to as treatment effect modifiers. Relative benefit is often quantified by hazard ratio or relative risk of adverse event (Dickman, Sloggett, Hills, & Hakulinen, 2004).

Low-density lipoprotein cholesterol (LDL-C) is an intervention-related factor that causes differential relative treatment effect in the prevention of CVD. Patients at high risk of experiencing a CVD event are often prescribed statins, an LDL-C-reducing medication. Evidence suggests that statins produce a greater relative risk reduction for CVD in patients with higher baseline LDL-C (Mihaylova et al., 2012). This is likely due to the strong positive relationship between LDL-C and CVD risk (Navarese et al., 2018). A study of fracture risk in women with osteoporosis also found a significant difference in relative risk reduction attributable to alendronate versus placebo between subgroups defined by existing fracture class (Black et al., 2000).

Intervention-related factors may also lead to systematic differences in costs. For example, dosage for some pharmacological interventions is determined by patient body mass index (BMI). This multiplicatively increases costs for patients with an elevated BMI. Pan, Zhu, Chen, Xia, and Zhou (2016) highlight that weight-based dosing is important for a range of medications, including hydrocortisone for adrenal insufficiency, vancomycin for the treatment of bacterial infections, and aprotinin for use in cardiac surgery.


Factors unrelated to an intervention but related to health condition often alter absolute outcomes for a patient. They may cause differential absolute risk reduction, pricing, and preference valuation of clinical events.

Even when patients receive the same relative risk reduction from a treatment, absolute risk reduction may vary greatly. Consider two subgroups of a patient population in which the adverse event rate is 50% and 10%, respectively. Further, consider a treatment that reduces relative risk of adverse event by 50% across all patient subgroups. While relative risk reduction is constant, the group with higher baseline risk will receive a much greater absolute risk reduction (25 versus 5 percentage points). Based on the principles described, risk stratification is often used to determine which patients should receive preventive treatment for CVD (Grundy et al., 2018; SIGN, 2017).
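The arithmetic behind this example can be sketched in a few lines of Python. This is a minimal illustration that assumes a constant relative risk reduction applied multiplicatively to baseline risk; the function name is illustrative and is not drawn from any of the cited sources.

```python
def absolute_risk_reduction(baseline_risk: float, relative_risk_reduction: float) -> float:
    """Absolute risk reduction implied by a constant relative risk reduction."""
    return baseline_risk * relative_risk_reduction


# Baseline adverse event rates from the example in the text.
high_risk_arr = absolute_risk_reduction(baseline_risk=0.50, relative_risk_reduction=0.50)
low_risk_arr = absolute_risk_reduction(baseline_risk=0.10, relative_risk_reduction=0.50)

print(f"High-risk subgroup: {high_risk_arr:.2f} absolute risk reduction")
print(f"Low-risk subgroup:  {low_risk_arr:.2f} absolute risk reduction")
```

The same relative effect therefore yields a fivefold difference in absolute benefit between the two subgroups.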

Sculpher notes that costs and quality-of-life valuations may differ systematically based on observable patient characteristics. Evidence suggests that the direct medical cost of experiencing a stroke can vary between sexes and increases with age (Ghatnekar, Persson, Glader, & Terént, 2004) and health state valuation is consistently lower for individuals with comorbid diabetes (Sullivan & Ghushchyan, 2016).


Factors unrelated to both intervention and health condition may affect patient outcomes.

Age is an example of such a factor. Elderly individuals are typically at an increased risk of developing chronic diseases (e.g., CVD, chronic obstructive pulmonary disease, cancer) and experiencing adverse events (e.g., serious falls). These competing risks limit older individuals’ capacity to benefit from interventions and, in turn, systematically alter the cost-effectiveness of treating them.

Factors that alter risk of mortality are often unrelated to health condition and treatment effect but ultimately affect patient outcome. For example, long-term survival after liver transplantation is significantly greater for younger individuals (Duffy et al., 2010). Probability of non-CVD mortality increases with age and this competing risk can similarly limit an elderly individual’s capacity to benefit from preventive treatment. However, the CVD example is complicated by the fact that age is an independent predictor of CVD risk, and therefore the competing risk of non-CVD mortality must be weighed against the increased risk of a disease-related event (D’Agostino, Pencina, Massaro, & Coady, 2013).


Factors unrelated to the patient may affect health and cost outcomes. These factors may concern the geographic location of a treatment, the treatment provider, or other environmental factors.

Research has been conducted that considers geographic- and provider-related sources of heterogeneity in patient outcome. An analysis of multinational clinical trials provides evidence that healthcare costs vary substantially among countries (Willan et al., 2005). Indeed, Geue, Wu, Leyland, Lewsey, and Quinn (2016) have shown that even within one country, significant differentials exist between inpatient costs in rural and urban settings. Treatment success rate is also likely to vary based on physician characteristics. For example, surgeons with more experience have greater surgical success rates and their patients have better postoperative quality of life (Cahill et al., 2014), while recent evidence suggests that patients treated by female physicians have significantly lower mortality and readmission rates than those treated by male physicians (Tsugawa et al., 2017).


Preferences are an important factor that may lead to heterogeneous patient outcomes. Patient preferences are specifically relevant with regard to treatment disutility and health state valuation.

The health-related quality of life that patients attribute to different treatments and health states often varies based on observable characteristics. An online survey of 1,000 U.S. residents found that the disutility attributable to regular pill-taking varies greatly across the U.S. adult population (Hutchins, Viera, Sheridan, & Pignone, 2015). In a decision-modeling study, Pandya, Sy, Cho, Weinstein, and Gaziano (2015) showed that pill-taking disutility is a key determinant of the cost-effectiveness of preventive statin therapy for CVD-free individuals. Additionally, studies have shown that individuals differentially value health states based on age (Dolan, 2000).


Factors revealed over time are a final potential determinant of heterogeneity in patient-level cost-effectiveness. If these factors are observable, patients can be split into subgroups and a decision maker can make differential decisions based on each group’s respective outcomes.

Treatment response is a factor that is revealed over time that may allow for differential decisionmaking. Some patients receiving statin therapy for CVD prevention experience adverse effects, including myalgia and loss of memory (Kashani et al., 2006). Often these side effects can be avoided by changing dosage or choice of statin. Altering either of these treatment parameters may improve the patient’s health and cost outcomes.

The types of heterogeneity and subgroups discussed may all be employed to stratify patient populations in cost-effectiveness analysis. The National Institute for Health and Care Excellence (NICE), the body responsible for instituting cost-effective practice in the National Health Service (NHS) in England and Wales, discusses heterogeneity and subgroups in its reference case document (NICE, 2013). NICE recognizes that the use of a technology may be approved conditionally on the presence of a biomarker that predicts a patient’s response to treatment. The NICE diagnostics assessment program is responsible for establishing a cost-effective testing strategy to establish the presence of such a biomarker. More generally, NICE states that “it is important to consider how clinical and cost effectiveness may differ because of differing characteristics of patient population” (p. 52). Such heterogeneity and subgroup effects, they note, should be analyzed as part of a health technology assessment.

NICE has played a pioneering role in promoting decisionmaking based on cost-effectiveness analysis. Other healthcare decision makers around the world have also adopted pricing and reimbursement strategies for health technologies that rely on cost-effectiveness analysis (Mathes, Jacobs, Morfeld, & Pieper, 2013). Consequently, many of these decision makers address heterogeneity and subgroup cost-effectiveness in their decision-making processes. For example, decision makers in Ireland, Poland, Lithuania, Germany, and Australia directly acknowledge the role of cost-effectiveness analysis in pricing and reimbursement for diagnostic technology (AOTMiT, 2016; Health, 2017; HIQA, 2018; HTA Lithuania, 2015; IQWiG, 2017). Decision makers in other countries, including the United States, Scotland, Belgium, Norway, and Thailand, have issued clinical guidelines based on stratified cost-effectiveness analysis (Grundy et al., 2018; Kingkaew et al., 2011; Norheim et al., 2011; Roberfroid, San Miguel, & Thiry, 2013; San Miguel, Benhamed, Devos, Fairon, & Roberfroid, 2016; Teerawattananon & Praditsitthikorn, 2011). It should additionally be noted that, regardless of the process adopted by different decision-making bodies, the theoretical basis for reflecting heterogeneity in cost-effectiveness analysis exists in all health systems due to opportunity cost caused by scarcity of resource.

Cost-Effectiveness Analysis

Standard health economic decision rules dictate that an intervention should be implemented over a relevant comparator if the incremental benefits from the treatment justify the incremental expenditure required to achieve those benefits (Karlsson & Johannesson, 1996). If incremental costs, ΔC, are negative, and incremental benefits, ΔE, are positive, the treatment is considered “cost-saving” and should be implemented. If incremental costs are positive and incremental benefits are negative, the treatment is “dominated” and should not be implemented. If incremental costs and benefits are both positive or both negative, decision makers must consider the treatment’s incremental cost-effectiveness ratio (ICER).

The ICER of implementing a treatment over a comparator is equal to the treatment’s incremental costs divided by incremental benefits, mathematically formulated in Equation 1. The measure typically adopted to represent health-related quality of life in cost-effectiveness analysis is the quality-adjusted life year (QALY; Harris, 1987; Weinstein, Torrance, & McGuire, 2009). Therefore, ICERs usually represent the cost per QALY attributable to implementing a treatment.

$$\mathrm{ICER} = \frac{\Delta C}{\Delta E}$$

Equation 1

A decision maker should implement an intervention over its comparator if they believe that the cost per QALY offered by the treatment represents acceptable value for money. Willingness to pay for a unitary increase in health benefit is represented by the decision maker’s cost-effectiveness threshold, λ.

Cost-effectiveness thresholds can be defined in multiple ways. Optimal decisionmaking, which maximizes health in a population given an exogenously fixed budget, will employ a value of λ equal to the opportunity cost associated with displacing funds in the health budget (Claxton et al., 2015; Lomas, Claxton, Martin, & Soares, 2018). This value represents the marginal productivity of the health system and can be described as a supply-side estimate of the threshold (Culyer et al., 2007). Alternatively, the threshold can be defined using a demand-side approach that considers the consumption value of health. This approach derives a value for λ by undertaking preference elicitation, which ranks healthcare alongside non-healthcare purchases (Woods, Revill, Sculpher, & Claxton, 2016).

When incremental costs and benefits are both positive, the decision maker should implement the treatment over its comparator if its ICER is below the cost-effectiveness threshold, as shown in Decision Rule 1A.

Intervention funded if: $\mathrm{ICER} = \frac{\Delta C}{\Delta E} < \lambda$; $\Delta C > 0$, $\Delta E > 0$

Decision Rule 1A

When incremental costs and benefits are both negative, the decision maker should implement the treatment over its comparator if its ICER is above the cost-effectiveness threshold, as shown in Decision Rule 1B. This is because the cost savings attributable to the intervention can be spent elsewhere in the budget to produce more health than is lost.

Intervention funded if: $\mathrm{ICER} = \frac{\Delta C}{\Delta E} > \lambda$; $\Delta C < 0$, $\Delta E < 0$

Decision Rule 1B

In a situation where multiple mutually exclusive interventions are being assessed, all options should be ranked in terms of increasing health benefits. Strictly dominated interventions, those that incur more costs and produce less health than a comparator, should be excluded from the analysis. The ICER associated with implementing each non-excluded intervention compared to the next nondominated option should be estimated. At this point, the possibility of extended domination must be considered. Extended domination occurs when an intervention has an ICER in excess of that of a more effective intervention (Karlsson & Johannesson, 1996). Faced with the decision to implement either of these interventions, a rational decision maker would always choose the intervention that produces more health with a lower ICER. Hence all extendedly dominated interventions should be excluded from the analysis. Finally, when all strictly and extendedly dominated interventions have been excluded, the decision maker should choose to implement the most health-producing strategy with an ICER below their cost-effectiveness threshold.
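The ranking and exclusion steps described above can be sketched as follows. This is an illustrative Python sketch rather than an implementation taken from the cited literature: the `Strategy` type, the function names, and all cost and effect figures are hypothetical. It assumes that each strategy is described by its total cost and total effect, treats a strategy as strictly dominated when another option is at least as effective at no greater cost, and assumes that no two strategies have identical costs and effects.

```python
from typing import List, NamedTuple


class Strategy(NamedTuple):
    name: str
    cost: float    # total cost per patient (hypothetical)
    effect: float  # total health benefit per patient, e.g. QALYs (hypothetical)


def icer(reference: Strategy, candidate: Strategy) -> float:
    """ICER of the candidate compared with a less effective reference option."""
    return (candidate.cost - reference.cost) / (candidate.effect - reference.effect)


def efficient_frontier(strategies: List[Strategy]) -> List[Strategy]:
    """Exclude strictly and extendedly dominated strategies."""
    # Step 1: rank options by increasing health benefit.
    ranked = sorted(strategies, key=lambda s: (s.effect, s.cost))
    # Step 2: drop strictly dominated options.
    frontier = [
        s for s in ranked
        if not any(
            o is not s and o.cost <= s.cost and o.effect >= s.effect
            for o in ranked
        )
    ]
    # Step 3: repeatedly drop options whose ICER exceeds that of the
    # next, more effective option (extended domination).
    changed = True
    while changed:
        changed = False
        for k in range(1, len(frontier) - 1):
            if icer(frontier[k - 1], frontier[k]) > icer(frontier[k], frontier[k + 1]):
                del frontier[k]
                changed = True
                break
    return frontier


def choose(strategies: List[Strategy], threshold: float) -> Strategy:
    """Most effective remaining strategy with an ICER below the threshold."""
    frontier = efficient_frontier(strategies)
    chosen = frontier[0]
    for previous, current in zip(frontier, frontier[1:]):
        if icer(previous, current) < threshold:
            chosen = current
        else:
            break  # ICERs increase along the frontier
    return chosen


options = [
    Strategy("A", 0.0, 0.0),
    Strategy("B", 10_000.0, 1.0),
    Strategy("C", 30_000.0, 1.2),
    Strategy("D", 35_000.0, 2.0),
    Strategy("E", 40_000.0, 0.9),
]
frontier_names = [s.name for s in efficient_frontier(options)]
print(frontier_names)
print(choose(options, threshold=30_000).name)
```

With these hypothetical inputs, E is strictly dominated and C is extendedly dominated, leaving A, B, and D. At a threshold of 30,000 per unit of health benefit the sketch selects D, whereas at a threshold of 20,000 it selects B.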

Incremental net monetary benefit (INMB) is an alternative measure of cost-effectiveness that facilitates easier comparison of multiple strategies. Calculation of INMB requires converting incremental health benefits to costs that represent the monetary value of these benefits. This is achieved by multiplying health benefits by the cost-effectiveness threshold. Next, the treatment’s incremental costs are subtracted from this value, as formulated in Equation 2. Unlike the ICER, this measure is not a ratio of means and is therefore always continuous and defined.

$$\mathrm{INMB} = \lambda \cdot \Delta E - \Delta C$$

Equation 2

A policy should be adopted over a comparator if it has INMB greater than zero, as presented in Decision Rule 2. A welfare gain is achieved by implementing such a policy. When there are multiple interventions to choose among, all policies must be compared incrementally to a common comparator. In this situation, the policy with the highest INMB should be implemented. Notably, unlike with ICERs, decisionmaking based on incremental net benefit does not require separate decision rules dependent on the sign of incremental costs or benefits.

Intervention funded if: $\mathrm{INMB} > 0$

Decision Rule 2
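A short Python sketch may help to show how Decision Rule 2 applies when several policies are compared with a common comparator. The policy names, the threshold, and all incremental costs and effects below are hypothetical and chosen only for illustration.

```python
def inmb(delta_cost: float, delta_effect: float, threshold: float) -> float:
    """Incremental net monetary benefit: threshold * incremental effect - incremental cost."""
    return threshold * delta_effect - delta_cost


THRESHOLD = 30_000  # hypothetical cost-effectiveness threshold per QALY

# Hypothetical incremental (cost, QALY) outcomes, each measured against the same comparator.
policies = {
    "comparator": (0.0, 0.0),
    "policy_1": (10_000.0, 1.0),
    "policy_2": (35_000.0, 2.0),
    "policy_3": (-5_000.0, -0.1),  # cheaper and less effective than the comparator
}

values = {name: inmb(dc, de, THRESHOLD) for name, (dc, de) in policies.items()}
best = max(values, key=values.get)

for name, value in values.items():
    print(f"{name}: INMB = {value:,.0f}")
print(f"Policy with highest INMB: {best}")
```

Note that policy_3, which has negative incremental costs and negative incremental benefits, is handled by the same calculation as the other policies. No separate rule is needed for its quadrant of the cost-effectiveness plane.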

Incremental net health benefit (INHB) is a comparable measure to INMB (Stinnett & Mullahy, 1998). When calculating INHB, all incremental costs are converted to a health benefit value. This is achieved by dividing incremental costs by the cost-effectiveness threshold. Hence the costs represent the minimum amount of health that could theoretically be purchased elsewhere in the budget if the policy was not implemented. Similar to Decision Rule 2, an intervention should be implemented over its comparator if INHB is greater than zero.
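The verbal definition above can be written out explicitly. The following is a reconstruction from the description in the text, using the same symbols as the decision rules, rather than a formula quoted from Stinnett and Mullahy (1998).

```latex
\begin{align*}
\mathrm{INHB} &= \Delta E - \frac{\Delta C}{\lambda} \\
\mathrm{INMB} &= \lambda \cdot \Delta E - \Delta C = \lambda \cdot \mathrm{INHB}
\end{align*}
```

Because λ is positive, INHB and INMB always share the same sign, so the two measures lead to the same funding decision.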

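The incremental costs and effects attributable to implementing a healthcare intervention compared to a relevant comparator can be disaggregated into constituent parts, as described by Weinstein and Stason (1977). Constituents of incremental health and incremental cost are described in Equation 3 and Equation 4, respectively.

$$\Delta E = \Delta E_{le} + \Delta E_{morb} - \Delta E_{se}$$

Equation 3

$$\Delta C = \Delta C_{rx} + \Delta C_{se} - \Delta C_{morb} + \Delta C_{le}$$

Equation 4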

Incremental change in costs consists of direct treatment costs (rx), cost increases attributable to treatment-related side effects (se), cost savings due to reduced morbidity (morb), and cost increases associated with extended life expectancy (le).

Incremental change in effect is typically measured by QALYs. The constituents of incremental effect are increased benefits attributable to extension of life expectancy (le), increased benefits due to reduced morbidity (morb), and reduced benefits due to treatment-related side effects (se).

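One final measure to consider is TreatmentValue. This is defined as INMB excluding treatment costs and is presented in Equation 5.

$$\mathrm{TreatmentValue} = \mathrm{INMB} + \Delta C_{rx} = \lambda \cdot \Delta E - (\Delta C - \Delta C_{rx})$$

Equation 5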

It is possible to derive a decision rule for investing in healthcare interventions dependent on treatment cost. A decision maker should invest in an intervention if the treatment value is greater than the direct treatment costs, shown in Decision Rule 3. This rule makes it possible to determine the price at which a treatment becomes cost-effective, or the reverse-engineered price.

Intervention funded if: $\mathrm{TreatmentValue} > \Delta C_{rx}$

Decision Rule 3
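Decision Rule 3 can be illustrated with a brief Python sketch. All figures are hypothetical, and the sign convention is an assumption made for this example: side-effect costs and costs of extended life expectancy are entered as positive cost increases, while morbidity-related savings are entered as a positive amount that is subtracted.

```python
def treatment_value(
    delta_effect: float,
    threshold: float,
    cost_side_effects: float,
    cost_savings_morbidity: float,
    cost_life_extension: float,
) -> float:
    """INMB excluding direct treatment costs (assumed sign convention, see lead-in)."""
    non_treatment_costs = cost_side_effects - cost_savings_morbidity + cost_life_extension
    return threshold * delta_effect - non_treatment_costs


def should_fund(value: float, treatment_cost: float) -> bool:
    """Decision Rule 3: fund if TreatmentValue exceeds direct treatment costs."""
    return value > treatment_cost


# Hypothetical inputs: 0.5 QALYs gained at a threshold of 30,000 per QALY.
value = treatment_value(
    delta_effect=0.5,
    threshold=30_000,
    cost_side_effects=500,
    cost_savings_morbidity=2_000,
    cost_life_extension=1_000,
)

print(f"Reverse-engineered price: {value:,.0f}")
print(should_fund(value, treatment_cost=12_000))
print(should_fund(value, treatment_cost=20_000))
```

In this sketch the treatment value is also the reverse-engineered price: any direct treatment cost below it yields a positive INMB, and any cost above it yields a negative INMB.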

Stratified Cost-Effectiveness Analysis

Benefits and costs of interventions are typically averaged across large patient groups in cost-effectiveness analyses. This leads to a situation in which heterogeneity is overlooked in healthcare decisionmaking. Across populations, each constituent of incremental costs and incremental benefits may vary. For example, individuals with high levels of a biomarker may receive a greater relative risk reduction from a treatment and older individuals typically have worse health outcomes following acute illness (Goldberg et al., 1989; Kidd, Siegel, Dehdashti, & Grigsby, 2007; Patil, Krishnan, Lechtzin, & Diette, 2003).

If INMB can be reliably calculated at the individual or subgroup level, the decision to initiate treatment in the wider population can be segregated into a set of mutually exclusive decisions. Dependent on λ, it is possible to establish “limited use criteria” that avoid treating patients with INMB less than zero (Coyle, Buxton, & O’Brien, 2003).

Likewise, if TreatmentValue can be calculated at individual or subgroup level, decision makers can establish the proportion of the patient population that should be eligible for treatment at a range of different prices. Performing this analysis and making decisions based on the results allow payers to signal demand to healthcare providers.

Figure 1 demonstrates the health economic effect of disregarding heterogeneity of outcome on the cost-effectiveness plane. The figure presents a scenario in which two patient subgroups experience very different absolute health benefits from a treatment. Costs, however, are constant across the patient population. Subgroup A represents the average incremental outcomes attributable to the treatment in the total population (1 QALY gained), while Subgroups B and C represent outcomes in the population’s two constituent subgroups (−2 and 3 QALYs gained, respectively). Costs are equal to £40,000 in each of the subgroups. A cost-effectiveness threshold of £30,000/QALY is represented by a dashed line on the graph.

Figure 1. Subgroup treatment effects.

Note: Dashed line represents cost-effectiveness threshold.

If a decision maker employs a cost-effectiveness threshold of £30,000/QALY, it is possible to determine whether Subgroups A, B, and C should be treated based on their position on the cost-effectiveness plane. Subgroups with positive health and cost outcomes, those in the top-right quadrant of Figure 1, should be treated if they lie beneath the dashed line. Treating these subgroups has an associated ICER less than the decision maker’s cost-effectiveness threshold. Treatment of subgroups that lie in the top-left quadrant of the cost-effectiveness plane is dominated by no treatment (i.e., treatment is costlier and less effective than no treatment) so should not be implemented.

When the decision to implement the intervention is based on average treatment effects in the total population, the decision maker would choose to not provide treatment to anybody. The ICER associated with treating the total population is £40,000/QALY. This is in excess of the decision maker’s cost-effectiveness threshold.

The intervention should not be implemented in Subgroup B as these patients receive negative health benefits while incurring positive costs. However, with an ICER of around £13,300, treating Subgroup C represents acceptable value for money and should be implemented. The decision, based on average outcomes in the total population, is therefore correct for Subgroup B but incorrect for Subgroup C. Failing to recognize variability in treatment outcome leads to inefficient decisionmaking as patients who are cost-effective to treat do not receive treatment.
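The figures quoted for this scenario can be reproduced with a short Python sketch. The incremental costs, QALY gains, and threshold are those stated in the text; the function and variable names are illustrative only, and the ICER is reported only where incremental health benefits are positive.

```python
THRESHOLD = 30_000         # cost-effectiveness threshold, £ per QALY
INCREMENTAL_COST = 40_000  # £, constant across the population

# QALYs gained: A is the total population average; B and C are its subgroups.
qalys_gained = {"A": 1.0, "B": -2.0, "C": 3.0}


def assess(delta_cost: float, delta_effect: float, threshold: float) -> dict:
    """ICER (where incremental effect is positive), INMB, and treatment decision."""
    net_benefit = threshold * delta_effect - delta_cost
    ratio = delta_cost / delta_effect if delta_effect > 0 else None
    return {"icer": ratio, "inmb": net_benefit, "treat": net_benefit > 0}


results = {
    group: assess(INCREMENTAL_COST, qalys, THRESHOLD)
    for group, qalys in qalys_gained.items()
}

for group, result in results.items():
    print(group, result)
```

Only Subgroup C has a positive INMB at this threshold, which matches the conclusion that a decision based on the population average withholds treatment from patients who are cost-effective to treat.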

Implementing Decision Rule 2: Stratified Cost-Effectiveness Analysis With Fixed Treatment Costs

When the price of an intervention is fixed, stratified cost-effectiveness analysis can be employed to establish limited use criteria. Limited use criteria ensure that an intervention is funded only for patient groups who are cost-effective to treat.

Coyle et al. (2003) discuss the role of stratified cost-effectiveness in establishing limited use criteria in healthcare. They produce a mathematical framework that can be used to quantify the welfare gains achievable through the stratification of patient populations in cost-effectiveness analysis.

Let i be a discrete variable representing univariate subgroups of a patient population. These subgroups are mutually exclusive and collectively exhaustive (when combined, they include every member of the patient population). Further, let $\mathrm{INMB}_i$ represent the incremental net monetary benefit of an intervention in patient group i, and INMB represent the total incremental net monetary benefit in the population. The population-level INMB is equal to the sum of each subgroup’s INMB. We can define this relationship mathematically as follows:

$$\mathrm{INMB} = \sum_{i} \mathrm{INMB}_i$$

It is possible that a set of the subgroups will have INMB less than zero. An efficient limited use criterion will ensure that all subgroups with positive INMB are treated, while those with negative INMB remain untreated.

Let $\mathrm{INMB}_{s(i)}$ be the total INMB associated with only treating subgroups $i \in s(i)$, where s(i) is the subset of subgroups with INMB greater than zero. We can define $\mathrm{INMB}_{s(i)}$ as follows:

$$\mathrm{INMB}_{s(i)} = \sum_{i \in s(i)} \mathrm{INMB}_i$$

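It follows that $\mathrm{INMB}_{s(i)}$ is greater than or equal to INMB in all situations, with strict inequality whenever at least one subgroup has negative INMB:

$$\mathrm{INMB}_{s(i)} \geq \mathrm{INMB}$$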

The net benefit gain from stratification, $\Delta_S \mathrm{INMB}$, is equal to the incremental net monetary benefit of treating only cost-effective subgroups minus the total net benefit in the population. Intuitively, this is equal to the negative sum of $\mathrm{INMB}_i$ in all subgroups with negative INMB.

$$\Delta_S \mathrm{INMB} = \mathrm{INMB}_{s(i)} - \mathrm{INMB} = -\sum_{i \notin s(i)} \mathrm{INMB}_i$$

Additional complexities can be added to the mathematical framework presented that better reflect the reality of stratification in clinical practice. The framework can be extended to situations where more than one type of subgroup is used to stratify the patient population. Furthermore, stratification of patients into subgroups may require additional costs. For example, it often requires additional testing and physician time to stratify patients into biomarker-related subgroups. These additional costs can be incorporated into the $\mathrm{INMB}_i$ calculations.
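The basic framework, before any of these extensions, can be sketched in Python as follows. The subgroup labels and INMB values are hypothetical, and the sketch ignores any cost of stratification itself.

```python
from typing import Dict, Tuple


def stratification_gain(subgroup_inmb: Dict[str, float]) -> Tuple[float, float, float]:
    """Return (population INMB, INMB when treating only positive-INMB subgroups, gain)."""
    total = sum(subgroup_inmb.values())
    stratified = sum(value for value in subgroup_inmb.values() if value > 0)
    return total, stratified, stratified - total


# Hypothetical total INMB in each mutually exclusive subgroup.
subgroup_inmb = {
    "subgroup_1": -120_000.0,
    "subgroup_2": 50_000.0,
    "subgroup_3": 300_000.0,
    "subgroup_4": -30_000.0,
}

total, stratified, gain = stratification_gain(subgroup_inmb)
print(f"Population INMB (treat everyone): {total:,.0f}")
print(f"INMB treating only cost-effective subgroups: {stratified:,.0f}")
print(f"Net benefit gain from stratification: {gain:,.0f}")
```

In this example the gain equals the negative sum of the INMB in the two subgroups that are not cost-effective to treat.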

Example: Stratified Cost-Effectiveness Analysis

Coyle et al. (2003) provide an example of net benefit gain from stratification. They compare the INMB of standard of care versus tissue plasminogen activator (t-PA) for the treatment of patients with acute myocardial infarction (MI).

The acute MI patient population is stratified into eight distinct subgroups. Patients are stratified by location of infarction (anterior or inferior) and age group (<40, 41–60, 61–75, and >75). The incremental cost and incremental effectiveness of the intervention were calculated for each subgroup. Effectiveness was represented by life-year gains as opposed to QALYs. Only direct treatment costs were included in the analysis and these were assumed constant between subgroups.

Table 1 presents the subgroup-specific incremental costs and life years gained from t-PA in the eight patient populations compared to standard care. Also presented is the INMB at a range of cost-effectiveness thresholds. In the interest of simplicity, an equal number of individuals in each subgroup is assumed (N = 100 for each group). Cost-effectiveness is greater in patients with anterior infarctions. Additionally, older individuals are more cost-effective to treat.

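Table 1. Subgroup-Specific Cost-Effectiveness Results

| Location–Age Cohort | Incremental Life Years Gained | Incremental Cost ($) | NMB ($) Per Patient (λ = $25 000) | NMB ($) Per Patient (λ = $50 000) | NMB ($) Per Patient (λ = $100 000) |
| --- | --- | --- | --- | --- | --- |
| Inferior < 40 | | | | | |
| Anterior < 40 | | | | | |
| Inferior 41–60 | | | | | |
| Anterior 41–60 | | | | | |
| Inferior 61–75 | | | | | |
| Anterior 61–75 | | | | | 10 965 |
| Inferior > 75 | | | | | 14 667 |
| Anterior > 75 | | | | | 18 371 |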

Note: Incremental life years gained and incremental costs are taken from Coyle, Buxton, and O’Brien (2003). λ = alternative monetary values of life years gained. Italic numbers relate to cohorts with negative net benefit where treatment with t-PA is not optimal.

Source: Coyle et al. (2003).

The INMB for the intervention averaged across the total population is negative at a cost-effectiveness threshold of $25,000/life year. The average ICER in the population is approximately $30,000/life year. Therefore, at thresholds greater than $30,000/life year, t-PA becomes cost-effective. Based on cost-effectiveness outcomes averaged over the total population, decision makers adopting a threshold of $50,000/life year will choose to fund the intervention. However, there exist subgroups in which treatment is not cost-effective at this threshold. Welfare gains may therefore be realized by stratifying the population and making subgroup-specific treatment decisions.

Figure 2 presents the net benefit gain from stratification over a range of cost-effectiveness thresholds. Three separate stepwise curves on the graph represent $\Delta_S \mathrm{INMB}$ for stratification based on age and location, age only, and location only. The slopes of the curves change at threshold values that represent the ICER of a relevant subgroup.

Figure 2. Net benefit gain from stratification.

Source: Coyle et al. (2003).

At very low cost-effectiveness thresholds (<$10,000/life year), there is no benefit from stratification. At these thresholds, the treatment would not be cost-effective in any subgroup of the population. At very high thresholds (>$200,000/life year) there is no benefit from stratification as the treatment would be cost-effective in all subgroups of the population.

As the threshold approaches the ICER averaged across the population (approximately $30,000/life year) from below, net benefit from stratification increases. This is because nobody would be treated when cost-effectiveness is averaged across all patients. However, an increasing number of cost-effective subgroups would be treated when applying stratification. After the threshold reaches the average ICER and continues to increase, the benefit from stratification decreases. All individuals would be treated when cost-effectiveness is averaged across all patients and, as the threshold increases, fewer subgroups would be restricted from treatment under stratification. Stratification by age and location generally produces the greatest net benefit, followed closely by location only. In contrast, stratification by age alone provides relatively little net benefit.

Implementing Decision Rule 3: Signaling Demand

Stratifying treatment decisions based on heterogeneity in cost-effectiveness allows decision makers to signal demand to manufacturers (Claxton et al., 2008; Claxton, Sculpher, & Carroll, 2011). This is because reflecting heterogeneity in cost-effectiveness allows healthcare payers to signal the amount of a product they will purchase at a range of different prices.

Signaling demand is particularly relevant to situations with monopolist manufacturers and monopsonist decision makers. Such a situation occurs when a manufacturer receives patent protection for a novel treatment and NICE must determine whether to recommend its use in the NHS (Claxton et al., 2011).

As mentioned, decision makers can establish a reverse-engineered price at which an intervention becomes cost-effective in a patient population. Rational manufacturers are assumed to respond to this decision mechanism by setting their price accordingly. Funding an intervention at the cost-effective price averaged across all individuals in a patient population leads to an INMB of zero.

If an intervention’s TreatmentValue can be established for every member of a population, then it is possible to determine the reverse-engineered price, the maximum price a decision maker should pay, for every individual. Consider a graph with TreatmentValue and proportion of patient population on its vertical and horizontal axes, respectively. Due to the relationship presented in Decision Rule 3, the vertical axis can alternatively be labeled as the reverse-engineered price. Having calculated the maximum acceptable treatment price for every individual in a population, a payer can plot these data in descending order, as shown by the black curves in Figures 3A and 3B. The graph produced is a demand curve: it details the proportion of a patient population that a decision maker is willing to provide a given intervention to at a range of prices.
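
As a minimal sketch of this construction, the Python code below sorts individual reverse-engineered prices in descending order to form a demand curve and reads off the share of the population that would be treated at a given price. The prices are hypothetical, the function names `demand_curve` and `share_treated` are assumptions of this sketch, and individuals whose maximum acceptable price exactly equals the listed price are counted as treated.

```python
def demand_curve(max_prices):
    """Return (cumulative proportion treated, price) points in descending price order."""
    ordered = sorted(max_prices, reverse=True)
    n = len(ordered)
    return [((rank + 1) / n, price) for rank, price in enumerate(ordered)]


def share_treated(max_prices, price):
    """Proportion of the population whose maximum acceptable price is at least `price`."""
    return sum(p >= price for p in max_prices) / len(max_prices)


# Hypothetical reverse-engineered prices for five individuals
max_prices = [250.0, 900.0, 100.0, 600.0, 400.0]

for proportion, price in demand_curve(max_prices):
    print(f"{proportion:.0%} of the population would be treated at a price of {price:.0f}")
```

Calling `share_treated(max_prices, 500.0)` on these hypothetical values gives the proportion of individuals for whom treatment remains worthwhile at a price of 500.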

Figure 3A displays the situation in which heterogeneity in cost-effectiveness exists in a population but the decision maker does not consider this heterogeneity in the decision-making process. A manufacturer will set their price as high as possible while ensuring that the treatment is cost-effective when provided to the entire population. This price, shown at Point A on the figure, maximizes their revenue (unit price multiplied by quantity sold). At this price, welfare gain is necessarily equal to welfare loss for the decision maker.

Figure 3B displays the alternative situation in which heterogeneity is reflected in the decision-making process. It is assumed that differential pricing is not possible. At a given price, a payer should reimburse the intervention for all individuals up to and including the least cost-effective individual with positive INMBi. Any individual who is more cost-effective to treat than this person will produce welfare gains for the payer because the price is set below the level at which their INMBi would have been negative. As no patients with negative INMBi are treated, no welfare loss is experienced. A monopolist manufacturer is assumed to maximize profit. This occurs when marginal revenue is equal to the marginal cost of producing the intervention, shown at Point B.

Figure 3. (A). Demand curve: Treatment decision based on cost-effectiveness averaged across patient population. Note: Consumer’s welfare loss = welfare gain when cost-effective price in total patient population is adopted. (B). Demand curve: Treatment decision based on individual-level cost-effectiveness.

Note: Consumer realizes surplus when reflecting heterogeneity in treatment decisions.

In most cases, it will be difficult to establish TreatmentValue for every individual in a population. Moreover, a paradigm shift in health service decisionmaking would be necessary for cost-effectiveness decisions to be made at the level of individual patients. However, it is often possible to stratify populations into subgroups in which cost-effectiveness is likely to vary. This stratification can be based on risk score, age, biomarker levels, or other relevant variables. In such analysis, the demand curve would be discretized in the form of a step function.
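
The step-function form of the demand curve, and the single price a profit-maximizing monopolist might choose against it, can be sketched as follows. This is an illustration under stated assumptions rather than a description of any published analysis: the subgroup sizes, subgroup prices, and constant marginal cost are hypothetical, differential pricing is assumed not to be possible, and the function names are inventions of this sketch. With a stepwise demand curve, only the subgroup prices themselves need to be considered as candidate prices.

```python
def step_demand(subgroups):
    """subgroups: list of (n_patients, max_price). Returns [(price, cumulative patients)]."""
    steps, treated = [], 0
    for n_patients, price in sorted(subgroups, key=lambda s: s[1], reverse=True):
        treated += n_patients
        steps.append((price, treated))
    return steps


def best_single_price(subgroups, marginal_cost):
    """Step (price, quantity) that maximizes manufacturer profit at a single price."""
    return max(step_demand(subgroups), key=lambda s: (s[0] - marginal_cost) * s[1])


def payer_surplus(subgroups, price):
    """Welfare gain to the payer: value above price, summed over treated subgroups."""
    return sum(n * (p - price) for n, p in subgroups if p >= price)


# Hypothetical subgroups: (number of patients, reverse-engineered price)
subgroups = [(100, 800.0), (300, 500.0), (600, 200.0)]

price, quantity = best_single_price(subgroups, marginal_cost=50.0)
print(price, quantity, payer_surplus(subgroups, price))
```

In this hypothetical case, the payer retains a surplus from the subgroup whose reverse-engineered price exceeds the chosen price, in line with the note to Figure 3B.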

A stratified approach to NICE’s decision-making process has been proposed by Claxton et al. (2011) under the label of value-based pricing. They show that offering manufacturers a menu of market access levels, each dependent on the listed treatment price, would allow the NHS to achieve welfare gain when investing in patented products. More specifically, they show that this allows the NHS to share in the value of innovation associated with the development of a novel treatment. This is often a reasonable demand for a public-sector payer, as public expenditure often contributes to innovation through basic science and clinical research funding.

Example: Signaling Demand

The following example demonstrates the potential for signaling demand by reflecting heterogeneity in health economic decisionmaking (Kohli-Lynch & Briggs, 2016). The framework of stratified cost-effectiveness analysis was employed to establish a demand curve for PCSK9 inhibitors (PCSK9-I) for the primary prevention of CVD in statin-intolerant, hypercholesterolemic individuals in a subset of the Scottish population.

Elevated LDL-C is a well-established risk factor for CVD and PCSK9-I are a class of LDL-C-reducing medications. PCSK9-I are currently labeled by the FDA in the United States and European Medicines Agency in the EU for individuals with familial hypercholesterolemia. They are priced at more than £4,500 in the United Kingdom and are marketed as second-line treatment for familial hypercholesterolemia in statin-intolerant individuals, with the aim of preventing cardiovascular events (SMC, 2016, 2017).

The Scottish CVD Policy Model (Lawson et al., 2016; Lewsey et al., 2015), a decision-analytic state transition model, was employed to estimate cost-effectiveness outcomes attributable to PCSK9-I for a range of hypercholesterolemic individuals in the Scottish population. The model accepts an individual’s CVD risk factors (i.e., age, sex, total cholesterol, high-density lipoprotein (HDL) cholesterol, diabetes, family history of CVD, and deprivation) and estimates their discounted lifetime costs and QALYs, mediated by the occurrence of CVD events. TreatmentValue of PCSK9-I compared to no active treatment was estimated for every individual in a hypothetical cohort of approximately 4,500 Scottish adults with familial hypercholesterolemia. The treatment effect was modeled as a 58.8% reduction in LDL-C, consistent with estimates of PCSK9-I effectiveness from the literature (Kazi et al., 2015). It was assumed that patients could not tolerate alternative cholesterol-reducing medications, including statin therapy. Once TreatmentValue was estimated for everyone in the cohort, this value was divided by remaining life years, allowing for estimation of a reverse-engineered annual drug price, accounting for discounting of future costs. Next, these values were plotted on a graph in descending order to define a demand curve within the population.

Figure 4 presents the demand curve for PCSK9-I for CVD-free individuals with familial hypercholesterolemia in the cohort. The key result is presented in the green curve, which represents demand for the treatment at a cost-effectiveness threshold of £20,000/QALY. The red curves represent demand at thresholds of £10,000/QALY and £30,000/QALY, respectively.

The median reverse-engineered annual treatment price in the population was £250. At an annual PCSK9-I price of £519, the decision maker would provide the therapy to 25% of the population. At an annual PCSK9-I price of £827, the decision maker would provide the therapy to 10% of the population. At the current listed price for PCSK9-I in the Scottish NHS, not one individual in the hypothetical cohort would be cost-effective to treat.

Figure 4. Demand curve for PCSK9-I in a hypothetical cohort of 4,644 Scottish adults with familial hypercholesterolemia.

Linear regressions were run to assess the relationship between reverse-engineered price and key patient characteristics. Hypothesis testing regarding the effect of different covariates on reverse-engineered price cannot be conducted reliably with a meta-model. However, results from linear regressions showed that the Scottish CVD Policy Model predicts that several factors may be drivers of reverse-engineered price, including sex, age, and baseline cholesterol levels.

This example shows that there is a disconnect between drug pricing and health technology assessment. Decision makers could achieve considerable welfare gains by exploiting heterogeneity in patient outcomes to make differential PCSK9-I treatment decisions for adults with familial hypercholesterolemia in the Scottish population.

Subgroup-Specific Pricing

It was assumed in the previous section, “Implementing Decision Rule 3: Signaling Demand,” that differential pricing for the same treatment is not possible. However, subgroup-specific pricing is an alternative approach that would represent heterogeneity in cost-effectiveness in a patient population. This approach would require paying the reverse-engineered treatment price established for each subgroup, ensuring that no welfare loss occurs at the subgroup level.

A report from the U.S.-based Institute for Clinical and Economic Review considers the benefits associated with subgroup-specific pricing (Pearson, Dreitlein, Henshall, & Towse, 2017). It notes that such a price mechanism may catalyze a move toward value-based pricing and supports the development of value-based formulary design. Both of these will occur because subgroup-specific pricing explicitly ties reimbursement for health technology to clinical benefit. The report also notes that subgroup-specific pricing may lead to long-term cost savings. This is because new treatments are often released with “orphan” prices because they are initially targeted at a small population with high expected benefit. As the scope of population treated grows, lower value patients may receive the treatment at the orphan price. Decision makers may feel compelled to continue treatment of these lower value patients. Subgroup-specific pricing would help limit the use of high-priced treatments in broad populations while preserving access to treatment for the entire patient population.

Subgroup-specific pricing has garnered increased interest in recent years. Towse, Cole, and Zamora (2018) discuss the feasibility of implementing indication-based pricing in high-income countries. They show that rebate and discount agreements have enabled subgroup-specific pricing in the United Kingdom through the Pharmaceutical Price Regulation Scheme 2009 (Department of Health, 2008) and in Italy through managed entry agreements (Bouvy, Sapede, & Garner, 2018). The Centers for Medicare & Medicaid Services and commercial insurance plans have also piloted subgroup-specific pricing programs in the United States. Towse et al. (2018) note that federal healthcare reimbursement programs in France, Germany, and Spain do not currently engage in indication-specific pricing. Potential routes to achieving such pricing mechanisms in these countries include local government-led price negotiations, authorization of the same treatment under different trade names, and price differentiation through risk-sharing agreements.

Subgroup-specific pricing limits the capacity for payers to achieve welfare gains from reimbursement decisions. Claxton et al. (2011) show that reimbursement according to differential pricing is welfare-neutral for payers during a treatment’s patent period. Consider a newly patented treatment for which the patient population has multiple subgroups. These subgroups are mutually exclusive and collectively exhaustive. Cost-effectiveness varies between these subgroups, and therefore each has a different reverse-engineered treatment price. If the manufacturer is reimbursed according to subgroup-specific prices, no welfare gain is achieved in any subgroup. This is equivalent to the situation in which a price is paid that reflects cost-effectiveness averaged across the total population.

Net Benefit Regression and Stratified Cost-Effectiveness Analysis

Heterogeneity in cost-effectiveness of a treatment can be explored in economic evaluations alongside clinical trials. Clinical trials have historically attempted to establish clinical differences between patients receiving treatment and those receiving placebo therapy. Economic evaluations alongside such trials typically employ methodology rooted in medical statistics rather than econometrics to establish cost-effectiveness. This methodology requires separately establishing mean costs and effects for a treatment and its comparator and comparing these with traditional health economic decision rules. Hoch, Briggs, and Willan (2002) propose an alternative methodology that allows for a more thorough examination of covariate effects on cost-effectiveness: net benefit regression.

For two competing treatment strategies, the difference in each treatment’s ratio of costs (C) to effects (E) is not equal to the ratio of differences in costs to differences in effects (i.e., the ICER).

$$\frac{C_r}{E_r} - \frac{C_c}{E_c} \neq \frac{C_r - C_c}{E_r - E_c} = \text{ICER}$$

Here the subscripts r and c denote the researched treatment and its comparator, respectively.


This means that there is no simple linear relationship between the mean costs and mean effects observed in a clinical trial and the ICER of the treatment. Therefore, ICERs cannot be employed in a regression framework as a dependent variable. Consider two individuals who have the same ICER attributable to a treatment. As established, these values could have very different interpretations dependent on the sign of the incremental treatment outcomes. Therefore, the ICER does not lend itself to linear estimation models.

A similar measure to INMB is employed in net benefit regression. This value is called net monetary benefit, or NMB. It is differentiated from INMB due to the fact that it does not require incremental comparison between two treatment strategies. Instead, it is calculated by multiplying the health benefits associated with a solitary intervention in a population by the decision maker’s cost-effectiveness threshold and subtracting associated costs.

$$\mathrm{NMB} = \lambda E - C$$


The linear nature of the NMB statistic allows for the estimation of cost-effectiveness within a regression framework. In contrast to ICERs, the difference in mean NMBs for a researched treatment, NMBr, and its comparator, NMBc, produces the mean difference in NMB for the treatment, ΔNMB. This is shown as follows. Note that ΔNMB is equal to the mean INMB in the population for the intervention versus the comparator.

$$\Delta\mathrm{NMB} = \mathrm{NMB}_r - \mathrm{NMB}_c = (\lambda E_r - C_r) - (\lambda E_c - C_c) = \lambda (E_r - E_c) - (C_r - C_c) = \lambda \Delta E - \Delta C$$


Given the linearity of the NMB statistic, it is possible to employ it as the dependent variable in linear regression analysis. In a population, NMB can be established for every individual as follows:

$$\mathrm{NMB}_i = \lambda E_i - C_i$$


Ci and Ei represent the costs and effects observed for patient i throughout the course of a clinical trial, and λ represents the cost-effectiveness threshold. With complete information (fully observed Ci and Ei), this value can be established for every individual in a trial. A simple linear model can then be constructed with NMB as the dependent variable. The simplest version of such a regression model is presented in Model 1.

$$\mathrm{NMB}_i = \alpha + \delta t_i + \varepsilon_i \tag{Model 1}$$


In Model 1, α is an intercept term representing baseline NMB, ti is a binary variable representing treatment status (0 for placebo, 1 for treatment), εi is the stochastic error term, and δ is the regression coefficient on the binary treatment variable. Hence,

$$\delta = \overline{\mathrm{NMB}}_r - \overline{\mathrm{NMB}}_c = \Delta\mathrm{NMB}$$


The value of δ could be established with average costs and average effects as is typical in the medical statistics approach to economic evaluations alongside clinical trials. However, the true power of net benefit regression lies in the fact that explanatory variables can be included in the regression model to elicit the effect of covariates on cost-effectiveness. An example of such a regression model, with p covariates, is presented in Model 2.

$$\mathrm{NMB}_i = \alpha + \delta t_i + \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i \tag{Model 2}$$


Inclusion of the p predictors, x1 through xp, in Model 2 controls for variables that may confound estimates of δ. The regression coefficients, βj, represent the unitary effect of the covariates on cost-effectiveness. However, individuals are typically randomized into treatment groups for clinical trials. This randomization aims to produce treatment groups with equal distributions of observed and unobserved covariates. Hence, in an adequately large and well-randomized trial population, controlling for confounding factors will offer no great improvement on unadjusted estimates of δ.
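
The link between the unadjusted regression coefficient on treatment and the medical statistics approach can be checked numerically. The Python sketch below is a minimal illustration, assuming a small hypothetical trial data set and a threshold of 20,000 per QALY; the helper names `nmb` and `ols_simple` are assumptions of this sketch. It computes NMB for each patient, fits the simplest net benefit regression (NMB on a binary treatment indicator) by closed-form ordinary least squares, and compares the estimated δ with λΔE − ΔC calculated from arm-level means.

```python
from statistics import mean


def nmb(cost, effect, threshold):
    """Individual net monetary benefit: threshold * effect - cost."""
    return threshold * effect - cost


def ols_simple(x, y):
    """Closed-form OLS for y = a + d * x + error; returns (a, d)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    d = sxy / sxx
    return y_bar - d * x_bar, d


# Hypothetical trial records: (treated, cost, QALYs)
trial = [(0, 1000.0, 1.0), (0, 1400.0, 1.2), (0, 1200.0, 0.8),
         (1, 3000.0, 1.3), (1, 3600.0, 1.5), (1, 3300.0, 1.1)]
threshold = 20000.0

t = [treated for treated, _, _ in trial]
y = [nmb(cost, effect, threshold) for _, cost, effect in trial]
alpha, delta = ols_simple(t, y)

# Medical statistics approach: arm-level mean differences in costs and effects
d_cost = mean(c for a, c, _ in trial if a == 1) - mean(c for a, c, _ in trial if a == 0)
d_effect = mean(e for a, _, e in trial if a == 1) - mean(e for a, _, e in trial if a == 0)

print(alpha, delta, threshold * d_effect - d_cost)
```

Both routes give the same incremental net benefit in this example; the regression route has the advantage that covariates can then be added as further columns.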

The net benefit regression framework can be extended further to address subgroup-level differences in NMB for a treatment. Model 1 and Model 2 estimate average treatment effect on NMB within the population. This approach likely disregards significant variation in NMB across patient subgroups. Model 3 presents an extension of the previous regression models and aims to reflect subgroup-level differences in cost-effectiveness.

$$\mathrm{NMB}_i = \alpha + \delta t_i + \sum_{j=1}^{p} \beta_j x_{ij} + \sum_{j=1}^{p} \gamma_j t_i x_{ij} + \varepsilon_i \tag{Model 3}$$


Model 3 contains an interaction effect between patients’ covariates and the binary treatment variable. The coefficient that represents this interaction for each covariate is γj. The significance and size of this coefficient may help researchers to identify and establish important patient subgroups.
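
For intuition, consider the special case of a single binary covariate x. The interaction model is then saturated, so the least squares estimates coincide with differences in cell means: δ is the treatment effect on NMB when x = 0, and γ is the additional treatment effect when x = 1. The Python sketch below illustrates this special case only; the records are hypothetical and the function name `subgroup_delta` is an assumption of this sketch. A general model with several or continuous covariates would require a full multiple regression routine.

```python
from statistics import mean


def subgroup_delta(records, x_value):
    """Difference in mean NMB (treated minus control) among patients with x == x_value."""
    treated = [nmb for t, x, nmb in records if x == x_value and t == 1]
    control = [nmb for t, x, nmb in records if x == x_value and t == 0]
    return mean(treated) - mean(control)


# Hypothetical records: (treatment indicator t, binary covariate x, NMB)
records = [
    (0, 0, 18000.0), (0, 0, 19000.0), (1, 0, 20000.0), (1, 0, 21000.0),
    (0, 1, 15000.0), (0, 1, 16000.0), (1, 1, 14000.0), (1, 1, 15000.0),
]

delta = subgroup_delta(records, 0)          # treatment effect when x = 0
gamma = subgroup_delta(records, 1) - delta  # interaction: extra effect when x = 1

print(delta, gamma, delta + gamma)
```

A negative value of δ + γ in such an analysis would suggest that treatment is not cost-effective in the x = 1 subgroup at the chosen threshold, even if it is cost-effective on average.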

Regression techniques can be used to estimate cost-effectiveness using either randomized controlled trial (RCT) data or observational data. Randomized data from large, pragmatic trials are ideal for estimating the subgroup-level effectiveness and cost-effectiveness of a treatment. However, RCTs are often not powered to enable valid subgroup comparisons (Brookes et al., 2001). Moreover, RCTs rarely report costs in enough detail to enable valid cost estimation. In the absence of suitable randomized data, observational studies can be used to estimate subgroup-level cost-effectiveness.

Kreif et al. (2012) discuss the estimation of subgroup effects in cost-effectiveness analyses with nonrandomized data. They show that statistical methods traditionally used to analyze observational data can be employed in subgroup-level cost-effectiveness analysis. Propensity score matching, genetic matching, and inverse probability of treatment weighting can be employed to reduce the selection bias inherent in observational studies. These methods effectively create synthetic comparisons between subpopulations in the observational data that balance covariates across treatment and control populations. In an example of drotrecogin alfa therapy for severe sepsis, it is shown that these methods can be used to produce unbiased cost-effectiveness estimates when they correctly account for subgroup-specific treatment assignment.
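
To make one of these methods concrete, the following Python sketch shows inverse probability of treatment weighting applied to individual NMB values. It is a simplified illustration and not the analysis reported by Kreif et al. (2012): the records are hypothetical, the propensity scores are assumed to have been estimated already (for example, from a model of treatment assignment that includes the subgroup variables), and the function name `iptw_mean_difference` is an assumption of this sketch. Subgroup-level estimates could be obtained by applying the same function to the records of each subgroup separately.

```python
def iptw_mean_difference(records):
    """Weighted difference in mean NMB, treated minus control.

    records: list of (treated, propensity, nmb), where propensity is the
    estimated probability of receiving treatment given covariates.
    """
    weighted_sum = {0: 0.0, 1: 0.0}
    weight_total = {0: 0.0, 1: 0.0}
    for treated, propensity, nmb in records:
        # Weight by the inverse probability of the treatment actually received
        weight = 1.0 / propensity if treated == 1 else 1.0 / (1.0 - propensity)
        weighted_sum[treated] += weight * nmb
        weight_total[treated] += weight
    # Normalized weighted means in each arm
    return weighted_sum[1] / weight_total[1] - weighted_sum[0] / weight_total[0]


# Hypothetical records: (treated, propensity score, NMB)
records = [(1, 0.8, 10.0), (1, 0.5, 20.0), (0, 0.8, 5.0), (0, 0.5, 8.0)]

print(iptw_mean_difference(records))
```

The weights are normalized within each arm here, which is one common variant; other weighting schemes exist and may behave differently when propensity scores are close to 0 or 1.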

When analyzing clinical and cost-effectiveness across subpopulations in a data set, it is important to remember that not all moderators of treatment effect are observed in clinical research (Basu, 2012, 2014). This may be due to undiscovered etiological significance of observable factors. It may also occur because not all factors that are known to affect treatment outcome can be measured in the trial setting. Randomized studies aim to address the issue of unobserved heterogeneity in informative factors across arms of a trial through the randomization process.

Basu, Jena, Goldman, Philipson, and Dubois (2014) postulate that physicians and patients can engage in “passive prioritization.” This occurs when different treatment strategies are pursued for patients based on variation in observable characteristics not captured in clinical outcomes research. It follows that comparative- and cost-effectiveness analyses that rely on estimates of clinical effectiveness may overstate the achievable benefits from reflecting heterogeneity in patient outcomes as they fail to account for a pre-existing level of patient-centered care.

Identifying Subgroups: Feasibility, Validity, and Equity

Clinical feasibility, statistical validity, and equity must be considered when reflecting heterogeneity in cost-effectiveness analysis. Decision makers must determine whether subgroups identified can be operationalized in clinical practice, whether there is sufficient data to support stratified cost-effectiveness results, and whether stratification could lead to an inequitable distribution of healthcare resources (Sculpher, 2008).

Clinical feasibility should be a primary concern when conducting stratified cost-effectiveness analysis. Several patient characteristics may be routinely collected in clinical practice. These include patients’ age, sex, family history of disease, and clinical markers like blood cholesterol and systolic blood pressure. Patients can often be split into subgroups based on these routinely collected clinical characteristics with little difficulty. It may require extra cost and effort to obtain other relevant patient characteristics. For example, it is increasingly feasible to obtain genomic data from patients (NHGRI, 2018) and considerable research funding has been invested in identifying novel biomarkers for a range of health conditions. The additional costs incurred stratifying patients based on these characteristics must be accounted for in cost-effectiveness analyses.

Statistical validity is another key concern when addressing heterogeneity in cost-effectiveness analysis. When assessing cost-effectiveness in multiple subgroups, there is a danger that researchers might identify relationships due to random error rather than the existence of a real relationship. This is referred to as the multiple testing problem in statistics (Miller, 1981). Techniques to correct for multiple testing in studies of clinical effectiveness have been discussed in the literature (Benjamini & Hochberg, 1995; O’Brien & Fleming, 1979). Sculpher (2008, p. 803) argues that these rules may be too imposing and “represent arbitrary hurdles for identifying meaningful subgroups for decision making.” An alternative proposition is the pre-specification of subgroups for analysis alongside some hypothesized clinical or economic justification.

The uncertainty associated with subgrouping must be explored. In lieu of sufficient evidence from clinical trials, cost-effectiveness studies typically employ decision-analytic models to predict health and cost outcomes. These models can take many forms, from simple decision trees to complex discrete event simulators. Uncertainty can be evaluated with decision-analytic models by altering model inputs and recording the effect that this has on estimated health and cost outcomes. Traditional sensitivity analysis (TSA) involves incrementally changing one or a set of model parameters. This approach can be used to assess the effect of key modeling assumptions on predicted outcomes. Probabilistic sensitivity analysis (PSA) involves assigning each parameter of interest a distribution instead of a fixed value. The model is run repeatedly, allowing parameters to vary according to their assigned distribution, and outcomes are recorded. The distribution of outcomes produced in PSA informs researchers of the scope of parametric uncertainty in the modeling process. Both TSA and PSA can be employed at the subgroup level to gain increased understanding of inherent uncertainty in the decision-making process.
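
A subgroup-level PSA can be sketched in a few lines. The Python code below is a deliberately simplified illustration rather than a full decision-analytic model: it assumes that incremental effects follow a normal distribution and incremental costs a gamma distribution, that the two are independent, and that the subgroup parameters shown are hypothetical. The function name `probability_cost_effective` is an assumption of this sketch. For each subgroup, it draws parameter values repeatedly and records the proportion of draws in which INMB is positive at the chosen threshold.

```python
import random


def probability_cost_effective(effect_mean, effect_sd, cost_mean, cost_sd,
                               threshold, n_draws=5000, seed=1):
    """Share of PSA draws with positive incremental net monetary benefit."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Gamma parameters matched to the assumed mean and standard deviation of cost
    shape = (cost_mean / cost_sd) ** 2
    scale = cost_sd ** 2 / cost_mean
    positive = 0
    for _ in range(n_draws):
        d_effect = rng.gauss(effect_mean, effect_sd)
        d_cost = rng.gammavariate(shape, scale)
        if threshold * d_effect - d_cost > 0:
            positive += 1
    return positive / n_draws


# Hypothetical subgroups: mean and SD of incremental QALYs and incremental costs
subgroup_params = {
    "high risk": dict(effect_mean=0.30, effect_sd=0.05, cost_mean=3000.0, cost_sd=500.0),
    "low risk": dict(effect_mean=0.05, effect_sd=0.05, cost_mean=3000.0, cost_sd=500.0),
}

results = {name: probability_cost_effective(threshold=20000.0, **params)
           for name, params in subgroup_params.items()}
print(results)
```

Repeating the calculation over a range of thresholds would trace out a cost-effectiveness acceptability curve for each subgroup.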

Data limitations must be considered when undertaking stratified cost-effectiveness analysis. As with all modeling studies, RCT data should be treated as the gold standard for assessing the relationship between independent and dependent variables. Individual-level data allows researchers to model the independent effect of factors that drive heterogeneity in cost-effectiveness outcomes and to model covariate interactions. Longitudinal data sets may also provide information on the baseline distribution of risk factors, costs, and morbidity in a population.

Subgrouping patient populations requires making inferences based on a smaller amount of data than studies which disregard subgroup effects. Uncertainty will therefore be greater in stratified cost-effectiveness analyses and it may be necessary to acquire more subgroup-level data. Espinoza, Manca, Claxton, and Sculpher (2014) provide a framework to estimate the expected value of acquiring further subgroup-related data when addressing heterogeneity in cost-effectiveness analysis. Applying the value of information framework, they show that the total expected value of perfect information (tEVPI) in a population which comprises S mutually exclusive subgroups is equal to the sum of each subgroup-specific EVPI (EVPIs) weighted by the proportion of that subgroup in the population (ws):

$$\mathrm{tEVPI} = \sum_{s=1}^{S} w_s \, \mathrm{EVPI}_s$$


Equity is a final concern when implementing policies that reflect heterogeneity in cost-effectiveness. Making treatment decisions based on some patient characteristics may be deemed socially unacceptable. Stratifying populations based on sociodemographic characteristics like age, sex, race, and social class is likely to raise equity issues. However, stratification by such characteristics may be considered socially acceptable if this stratification leads to a reduction in health inequalities (Dolan, Shaw, Tsuchiya, & Williams, 2005). Another approach to limit equity issues is for decision-making bodies to pre-specify acceptable characteristics with which to stratify patient populations. However, this would considerably limit the scope of future decisionmaking.

Finally, it is important that decision makers acknowledge the opportunity cost associated with neglecting heterogeneity in cost-effectiveness. Figure 2 graphically displays the net benefit attributable to different levels of patient stratification. Much greater benefit can be achieved by stratifying the population by both age and location of infarction than age alone. In a similar study focusing on patient preference, Basu and Meltzer (2007) consider the expected value of individualized care. They specifically examine the value in stratifying treatment decisions by patient-level health state quality-of-life weighting, finding that such an approach to decisionmaking could produce large net benefit gains in the U.S. population if instituted correctly.

Espinoza et al. (2014) also consider the cost-effectiveness of subgrouping in cost-effectiveness analysis. They find that a decision maker could achieve considerable net benefit gains by stratifying patient populations into subgroups. Additionally, they prove that marginal net benefit gains diminish as additional subgroups are added to analyses. This highlights the dynamic tradeoff between increased stratification, net benefit gain, and equity.


It is increasingly feasible to stratify patients into subgroups with different treatment-related outcomes based on sociodemographic and biological characteristics. However, decisions to implement healthcare interventions are often based on cost-effectiveness results averaged across large, heterogeneous patient populations. This approach to decisionmaking often overlooks marked heterogeneity in outcomes between patients.

Adequately reflecting heterogeneity enables the establishment of limited use criteria and this leads to increases in net benefit. Moreover, by basing funding decisions on stratified cost-effectiveness outcomes, decision makers can signal demand to healthcare providers. Ultimately, accounting for heterogeneity in cost-effectiveness helps decision makers to better achieve a fundamental goal of health economics: the maximization of population health given scarcity of resources.