date: 11 December 2019

# From Clinical Outcomes to Health Utilities: The Role of Mapping to Bridge the Evidence Gap

## Summary and Keywords

The assessment of health-related quality of life is crucially important in the evaluation of healthcare technologies and services. In many countries, economic evaluation plays a prominent role in informing decision making, often requiring preference-based measures (PBMs) to assess quality of life. These measures comprise two aspects: a descriptive system where patients can indicate the impact of ill health, and a value set based on the preferences of individuals for each of the health states that can be described. These values are required for the calculation of quality-adjusted life years (QALYs), the measure of health benefit used in the vast majority of economic evaluations. The National Institute for Health and Care Excellence (NICE) has used cost per QALY as its preferred framework for economic evaluation of healthcare technologies since its inception in 1999.

However, there is often an evidence gap between the clinical measures that are available from clinical studies on the effect of a specific health technology and the PBMs needed to construct QALY measures. Instruments such as the EQ-5D have preference-based scoring systems and are favored by organizations such as NICE but are frequently absent from clinical studies of treatment effect. Even where a PBM is included, this may still be insufficient for the needs of the economic evaluation. Trials may have insufficient follow-up, be underpowered to detect relevant events, or include the wrong PBM for the decision-making body.

Often this gap is bridged by “mapping”—estimating a relationship between observed clinical outcomes and PBMs, using data from a reference dataset containing both types of information. The estimated statistical model can then be used to predict what the PBM would have been in the clinical study given the available information.

There are two approaches to mapping linked to the structure of a PBM. The indirect approach (or response mapping) models the responses to the descriptive system using discrete data models. The expected health utility is calculated as a subsequent step using the estimated probability distribution of health states. The second approach (the direct approach) models the health state utility values directly.

Statistical models routinely used in the past for mapping are unable to consider the idiosyncrasies of health utility data. Often they do not work well in practice and can give seriously biased estimates of the value of treatments. Although the bias could, in principle, go in any direction, in practice it tends to result in underestimation of cost effectiveness and consequently distorted funding decisions. This has real effects on patients, clinicians, industry, and the general public.

These problems have led some analysts to mistakenly conclude that mapping always induces biases and should be avoided. However, the development and use of more appropriate models has refuted this claim. The need to improve the quality of mapping studies led to the formation of the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) Mapping to Estimate Health-State Utility Values from Non-Preference-Based Outcome Measures Task Force to develop good practice guidance in mapping.

# Economic Evaluation in Healthcare

Healthcare resources are limited: allocating resources to one healthcare intervention means those resources cannot be used elsewhere. Economic evaluation provides a framework to compare the costs and benefits of different healthcare technologies and interventions to ensure scarce resources are used efficiently (Drummond, 2005). Economic evaluations are used across many developed healthcare systems and help many reimbursement agencies make healthcare resource allocation decisions.

Cost-effectiveness analysis (Weinstein & Stason, 1977) is the most widespread type of economic evaluation in healthcare. It compares health technologies by calculating the ratio of the difference in cost to the difference in health benefit. This ratio is the incremental cost-effectiveness ratio (ICER) and is compared to some threshold value to determine whether investment in a specific technology represents a cost-effective use of resources. The quality-adjusted life-year (QALY) (Weinstein et al., 2009) is one of the most widely used measures of health benefit in economic evaluations. It combines quantity and quality of life into a single measure of health outcome, allowing comparisons across a broad range of disease areas, treatments and patients. The QALY assigns a weight to each year of survival to account for its health-related quality of life (HRQoL). Generic preference-based measures (PBMs) are intended to be applicable to a wide range of disease areas and are central to measuring HRQoL and QALYs. A PBM comprises two elements: a survey instrument and a valuation. A generic survey instrument allows patients to describe their health using a number of dimensions or domains, each of which has a number of severity levels. Several generic instruments have been developed. Some widely used examples are the EuroQoL five-dimensional questionnaire EQ-5D (The EuroQol Group, 1990), the six-dimensional Short Form SF-6D (Brazier et al., 2002) based on the 36-item Short Form Survey, SF-36 (Ware et al., 1993), and the eight-dimensional Health Utility Index, HUI (Horsman et al., 2003). Table 1 summarizes these example measures.
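The ICER and QALY calculations described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names, costs, utilities, and threshold are all hypothetical, and discounting is ignored for simplicity.

```python
def total_qalys(utilities_per_year):
    """Sum the health utility weight attached to each year of survival.

    Each entry is the utility (1 = full health, 0 = dead) for one year;
    discounting is deliberately ignored in this sketch.
    """
    return sum(utilities_per_year)


def icer(cost_new, cost_old, qalys_new, qalys_old):
    """Incremental cost-effectiveness ratio: difference in cost / difference in QALYs."""
    delta_qalys = qalys_new - qalys_old
    if delta_qalys == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return (cost_new - cost_old) / delta_qalys


# Hypothetical inputs, for illustration only.
qalys_old = total_qalys([0.6, 0.6, 0.5, 0.5])        # four years of survival
qalys_new = total_qalys([0.7, 0.7, 0.7, 0.6, 0.5])   # five years, better HRQoL
ratio = icer(30_000.0, 10_000.0, qalys_new, qalys_old)

# With higher cost and higher benefit, the new technology is judged
# cost-effective when the ICER does not exceed the threshold.
THRESHOLD = 25_000.0  # hypothetical cost per QALY
is_cost_effective = ratio <= THRESHOLD
print(round(ratio), is_cost_effective)
```

The comparison against the threshold in this form only makes sense when the new technology is both more costly and more beneficial; other quadrants of the cost-effectiveness plane need separate handling.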

Table 1. Examples of Some Commonly Used Preference-Based Measures

| Instrument | Dimensions | Levels | Number of Health States | Range of Health Utilities |
|---|---|---|---|---|
| EQ-5D-3L | Five dimensions: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression | Three levels: no/some/extreme problems | 243 | [–0.594, 1] |
| EQ-5D-5L | Five dimensions: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression | Five levels: no/slight/moderate/severe/extreme problems | 3,125 | [–0.285, 1] |
| SF-6D | Six dimensions: physical functioning, role limitations, social functioning, pain, mental health, vitality | Between 4 and 6 levels in each dimension | 18,000 | [0.301, 1] |
| HUI3 | Eight dimensions: vision, hearing, speech, ambulation, dexterity, emotion, cognition, and pain | Between 5 and 6 levels in each dimension | 972,000 | [–0.359, 1] |

Note: The following value sets are used: EQ-5D-3L UK valuation (Dolan, 1997); EQ-5D-5L England valuation (Devlin et al., 2018); SF-6D (Brazier et al., 2002, 2008); HUI3 (Furlong et al., 1998).

A PBM assigns a value or health utility to each of the health states described by the instrument so that QALYs can be calculated. These health utilities are derived from valuation experiments designed to capture general population preferences for health states. Different countries may have different value sets, reflecting the variations in preferences across countries’ populations; however, in all value sets 1 represents full health, and 0 represents dead. These are the anchors around which other health states are valued. Negative values are possible in some PBMs and represent health states that are deemed to be worse than being dead, indicating that, according to the preferences of the general population, individuals would rather die now than spend the rest of their life in that health state. For example, the descriptive system of EQ-5D-3L comprises five dimensions: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. Each dimension has three levels: no problems, some problems, and extreme problems. In total, 243 different health states can be specified by this descriptive system. In the United Kingdom, the values attached to these health states range from –0.594 for the worst health state (extreme problems in all dimensions) to 1 for full health (no problems in any dimension).
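The number of health states follows directly from the structure of the descriptive system: it is the product of the number of levels across dimensions. A minimal sketch, in which the helper name and the tuple representation of a health state are assumptions made for illustration:

```python
from itertools import product
from math import prod


def n_health_states(levels_per_dimension):
    """Number of distinct health states = product of the levels in each dimension."""
    return prod(levels_per_dimension)


eq5d_3l_count = n_health_states([3] * 5)   # five dimensions, three levels each
eq5d_5l_count = n_health_states([5] * 5)   # five dimensions, five levels each

# Enumerate every EQ-5D-3L state as a tuple of levels
# (1 = no problems, 2 = some problems, 3 = extreme problems).
eq5d_3l_states = list(product(range(1, 4), repeat=5))
full_health = eq5d_3l_states[0]    # no problems in any dimension
worst_state = eq5d_3l_states[-1]   # extreme problems in all dimensions
print(eq5d_3l_count, eq5d_5l_count, full_health, worst_state)
```

A value set is then simply a function from each of these tuples to a utility, anchored at 1 for full health.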

Different PBMs have been shown to generate different utility values for the same patient (e.g., Longworth & Bryan, 2003; O’Brien et al., 2003; Brazier et al., 2004; Barton et al., 2004; Hernández Alava et al., 2013; Chen et al., 2016). These differences are expected as the value sets have been generated using different valuation techniques, different descriptive systems, and different samples of respondents but are problematic in terms of comparability across studies.

Many national guidelines for economic evaluation recommend and sometimes require the use of generic instruments, such as those of England and Wales (National Institute for Health and Care Excellence, 2013), Spain (Lopez-Bastida, et al., 2010), France (HAS, 2012), Thailand (Wibulpolprasert, 2008), Finland (Pharmaceuticals Pricing Board, 2017), Sweden (Pharmaceutical Benefits Board, 2003), Poland (Agency for Health Technology Assessment, 2009), New Zealand (PHARMAC, 2015), Canada (Guidelines for the Economic Evaluation of Health Technologies, 2017), Colombia (Instituto de Evaluación Tecnológica en Salud, 2014) and the Netherlands (Guideline for the Conduct of Economic Evaluations in Health Care, 2016). Some recommend the use of a specific instrument, usually the EQ-5D (International Society for Pharmacoeconomic and Outcome Research, 2018).

There is an ongoing debate around the use of wider outcome measures to allow for consistent decision making across sectors such as public health and social care where HRQoL measures may be unable to capture important benefits not directly related to health (Brazier & Tsuchiya, 2015). This is an area of continuing research; however, QALYs are likely to remain central to healthcare decision making for the foreseeable future.

# The Role of Mapping in Economic Evaluations

Clinical studies used to estimate the treatment effect of health technologies often include some patient-reported outcome measure, but such measures are not sufficient for constructing QALYs because they lack a preference-based valuation. Even though more recent clinical studies follow good practice for utility estimation and tend to include a PBM, studies conducted in the past will often form part of the evidence base as comparators in the evaluation of new health technologies. Moreover, there is not a single PBM universally recommended by all decision-making bodies; even if a PBM is included in the clinical study, it might not be the recommended one for the specific setting, or the study alone might not be sufficient to provide all the health utility information required for the economic evaluation. Clinical trials usually have only a relatively short follow-up, whereas economic evaluations may simulate the patient’s lifetime as costs and benefits differ in the long run. The information collected in the trial is, therefore, insufficient for the economic evaluation. The economic evaluation will need to be complemented with, for example, information about long-term disease progression collected via clinical data registries. These registries typically have a wealth of clinical information but very rarely collect PBM data as well. Furthermore, PBMs are updated over time with different versions producing different health utility values for the same patients. As new versions are adopted, all relevant existing evidence on the older version needs to be “converted” into the new measure. A recent significant example is the development of EQ-5D-5L, the new version of EQ-5D.

Even outside economic evaluation, PBMs such as EQ-5D are increasingly being used as generic health measures in population surveys (Devlin & Brooks, 2017). PBMs also feature significantly in the Department of Health PROMS program in the United Kingdom. Since 2009, the English NHS has collected EQ-5D data from patients undergoing elective surgery for four procedures.1 Data are used to construct hospital performance indicators informing a range of decisions (Devlin & Appleby, 2010). For example, analysts model post-operative EQ-5D score adjusted for patient pre-operative characteristics to identify cases of concern or of good practice by highlighting variations beyond the normal range.

When the relevant PBM is missing, economic analysts should attempt, if feasible, to predict the value it would have taken given the information available. In some cases, it is possible to bridge the evidence gap by using what has been referred to in the literature as “mapping,” “cross-walking,” or “transfer-to-utility.” Mapping involves finding an external reference dataset that contains both the PBM required for economic evaluation and the instrument or instruments included in the clinical study. Provided a good quality reference dataset can be found, these data can then be used to estimate the relationship between the PBM and the clinical outcome measures, thereby providing the means to bridge the gap between the evidence available in the clinical studies and the requirements of the economic evaluation. The estimated statistical relationship is then used to infer the missing PBM in the clinical study so it can be incorporated in the economic evaluation. Depending on the type of cost-effectiveness analysis, the estimated mapping model can be used to (a) predict the missing utility using the conditional expectation, $E[y|x]$, or (b) simulate the distribution of utilities across patients using the full conditional distribution, $f(y|x)$, where $y$ denotes the health state utility and $x$ the vector of conditioning variables (such as the clinical outcome measure, age and gender). These two alternative uses of the mapping model are related to important concepts often confused in the mapping literature. The unconditional distribution of utilities across patients, $f(y)$, describes how utilities vary across the whole patient population. This distribution is bounded at the top by 1, the value of full health, and at the bottom by the lowest utility value for the particular instrument used (see Table 1).
For a given set of conditioning variables $x=X$, the conditional distribution, $f(y|X)$, describes the distribution of utilities in the subpopulation of patients for whom the conditioning variables take the combination of values $X$; individuals with the same observable characteristics $X$ differ in their observed utilities due to an unobserved random component. The conditional expectation, $E[y|X]$, is the mean of the conditional distribution, $f(y|X)$; individuals with the same observable characteristics $X$ share the same mean. Therefore, the distribution of the conditional means, $E[y|x]$, reflects only the variation in the combinations of values $X$ in the population of interest. The conditional distribution, $f(y|x)$, includes, in addition, the unexplained variation of utility around the conditional means. Hence, the distribution of the conditional means differs from the conditional distribution, and in particular, the distribution of the conditional means will always have less variation than the conditional distribution.2
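As a brief sketch of why the conditional means are less dispersed, the law of total variance splits the spread of utilities across patients into the spread of the conditional means and the average unexplained spread around them, using the notation above:

$$
\operatorname{Var}(y) = \operatorname{Var}\big(E[y|x]\big) + E\big[\operatorname{Var}(y|x)\big] \;\geq\; \operatorname{Var}\big(E[y|x]\big).
$$

The second term is the unexplained variation around the conditional means. It vanishes only if $x$ predicts $y$ perfectly, so predictions based on $E[y|x]$ alone understate the variability of utilities across patients.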

Mapping is encountered frequently in Health Technology Assessment (HTA) bodies such as NICE in the United Kingdom. NICE is a non-departmental public body providing guidance to the NHS in England on the clinical and cost effectiveness of health technologies. Recommendations are based on reviews of the clinical and economic evidence. Guidance produced by NICE is also applied selectively in Northern Ireland, Scotland, and Wales, and many other countries refer to NICE guidance when formulating their decisions. NICE makes decisions across different health technologies and disease areas, making it essential to adopt a consistent approach. The “reference case” intended to achieve this consistency is detailed in the NICE methods guide (NICE, 2013). It recommends the use of EQ-5D as the measure of health benefit, but when EQ-5D data are not available mapping may be used. Kearns et al. (2013) conducted a review of 79 NICE technology appraisals and found that mapping models were used in 22% of them.

Mapping studies are sometimes conducted for specific economic evaluations but are increasingly being published as a standalone resource with no specific evaluation in mind. The Health Economics Research Centre (HERC) Database of Mapping Studies (Dakin, 2013; Dakin et al., 2018) at Oxford University provides an up-to-date catalogue of studies mapping from quality of life or clinical measures to EQ-5D. Mapping to EQ-5D is most prevalent, but other papers estimate mappings to other PBMs (Brazier et al., 2010; Goldfeld, Hamel, & Mitchell, 2012; Roset, Badia, Forsythe, & Webb, 2013; Lee, Kaneva, Latimer, & Feldman, 2014; Payakachat, Tilford, & Kuhlthau, 2014; Yang, Wong, Lam, & Wong, 2014; Chen, McKie, Khan, & Richardson, 2015). Analyses using previously estimated mapping models may require either the conditional expectation or the conditional distribution of utilities: for this reason it is important that standalone mapping studies are able to cater for both. A model that estimates the entire conditional distribution is more valuable in this case as the conditional means can be inferred from it afterward if needed.

# Evolution of Mapping Models

There are two broad approaches to mapping that arise from the structure of a PBM: indirect and direct approaches. An indirect mapping (also referred to as “response mapping”) models the discrete responses to the descriptive system of the instrument. For example, for EQ-5D-3L, five (typically independent) discrete data models are used to estimate the probability of being in each of the three levels for each dimension in the descriptive system. Once estimated, the model can be used to calculate the expected health utility as a second separate step. The direct mapping approach bypasses the responses to the descriptive system and models health utility itself in a single step, treating the variable as (at least partly) continuous.
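The second step of the indirect approach, turning predicted response probabilities into an expected health utility, can be sketched as follows. This is an illustration under stated assumptions: the dimensions are treated as independent, and `DECREMENTS` is a made-up additive value set, not a published tariff. Real value sets may include interaction terms, which is why the sketch sums over every health state rather than relying on additivity.

```python
from itertools import product

# Hypothetical utility decrements for levels 1..3 of each of five dimensions.
# These numbers are invented for illustration and are NOT a published value set.
DECREMENTS = [
    (0.0, 0.07, 0.30),
    (0.0, 0.10, 0.20),
    (0.0, 0.04, 0.10),
    (0.0, 0.12, 0.40),
    (0.0, 0.07, 0.25),
]


def utility(state):
    """Utility of a health state given as a tuple of levels (1, 2 or 3)."""
    return 1.0 - sum(DECREMENTS[d][level - 1] for d, level in enumerate(state))


def expected_utility(level_probs):
    """Expected utility from per-dimension level probabilities.

    level_probs[d][l] is the predicted probability of level l + 1 in dimension d.
    Assuming independent dimensions, the probability of a health state is the
    product of its level probabilities.
    """
    total = 0.0
    for state in product((1, 2, 3), repeat=len(level_probs)):
        prob = 1.0
        for d, level in enumerate(state):
            prob *= level_probs[d][level - 1]
        total += prob * utility(state)
    return total


# Hypothetical predicted probabilities for one patient (same in every dimension).
patient_probs = [(0.5, 0.3, 0.2)] * 5
print(round(expected_utility(patient_probs), 4))
```

Because the utilities enter only in `utility`, the same predicted probabilities can be combined with a different country's value set without re-estimating the response models.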

There are advantages and disadvantages of both approaches and the performance of each method might vary according to the PBM used, the disease area, the patient population covered by the dataset, the availability of conditioning variables, and the type of cost-effectiveness analysis. Direct mapping is a one-step process. However, modeling the health utilities directly makes the mapping model specific to a particular utility value set preventing the use of the model across countries with different value sets. It also ignores information contained in the responses to the individual dimensions. In some cases, this information may be quite useful, and it implies that estimating the entire conditional probability structure, $f(y|x)$, has additional value over estimating the conditional means as these can be derived at a later stage, if needed, from the conditional probability distribution. In terms of data requirements, both mapping approaches need responses across the full range of disease severity. The indirect approach also needs enough responses at all levels in each dimension; otherwise the mapping model is unable to predict a full conditional probability distribution across all health states. Mappings to PBMs with a larger number of levels in each dimension are more likely to encounter this problem (Khan et al., 2014b; Gray, Hernández, & Wailoo, 2018). There is often a lack of observations in the lower tail of the health utility distribution where severity of disease is high. This would not be a problem if those levels of severity were not relevant to the patient population of concern. However, this is rarely the case, and the lack of observations at more severe levels of disease is a consequence of the typically small datasets collected in clinical studies and the restricted nature of the included patient populations, where, for example, patients with severe comorbidities are excluded or underrepresented from the populations recruited. 
Yet robust estimation of that tail is often crucial for cost-effectiveness analyses. Models may track patients with chronic conditions over their entire lifetimes, including those future periods when many will experience deterioration of health to more severe states than are typically well reflected in clinical studies.

Mapping models usually have few conditioning variables in addition to the clinical instrument/s or variables to map from. Age and gender are sometimes included since these tend to be important variables included in the cost-effectiveness analyses. Conditioning on additional covariates tends to improve the mapping in terms of standard model selection statistics such as AIC and BIC or commonly used measures of predictive accuracy such as Root Mean Square Error and Mean Absolute Error. The improvement often comes at the expense of decreasing its usefulness for other cost-effectiveness analyses if those covariates are specific to the reference dataset but are missing from other studies of interest. The usual paucity of conditioning covariates makes selection of the dataset of crucial importance to ensure the general nature of the mapping model. It is often the case in practice that no alternative dataset exists.
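The model selection statistics and measures of predictive accuracy mentioned above have simple definitions, sketched here for reference. The function names and the example numbers are illustrative assumptions; `log_lik` stands for the maximized log-likelihood, `k` for the number of estimated parameters, and `n` for the sample size.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion: lower values indicate a preferred model."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: penalizes parameters more heavily as n grows."""
    return k * math.log(n) - 2 * log_lik


def rmse(observed, predicted):
    """Root mean square error between observed and predicted utilities."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error between observed and predicted utilities."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical observed utilities and model predictions.
observed = [1.0, 0.5, 0.76, 0.2]
predicted = [0.8, 0.7, 0.76, 0.4]
print(rmse(observed, predicted), mae(observed, predicted))
```

All four are summary statistics; two models with similar RMSE or MAE can still differ markedly in how they fit the tails of the utility distribution.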

# Direct Mapping Approaches

A linear regression (Hurst et al., 1997) is the first direct mapping model in the HERC database. Linear regressions are in fact the most commonly used direct mapping model.3 Health utility data have several features that make the use of linear regression questionable. Health utility data are bounded: limited at the top at 1 (the value of full health) and at the bottom by the value of the worst health state described by the instrument. Different countries and different PBMs differ on the lower boundary, but all of them are limited (see Table 1 for some examples). An important characteristic of bounded variables is that the moments of their distribution are related: as the mean response gets closer to the boundary, the variance of the variable will tend to decrease and its skewness to increase. Histograms of health utilities for the typical reference datasets exhibit a mass of observations at the upper boundary of 1, immediately followed below by a relatively large gap in the distribution before the next feasible utility value. The rest of the distribution is usually characterized by multimodality and/or skewness unlikely to be properly captured by conditioning on the few available variables. Alternative models such as tobit, Censored Least Absolute Deviation (CLAD), and two-part models have been used in the literature to capture the large proportion of observations at full health. A strand of the literature has criticized the use of the tobit model in this context, arguing that it can only be used for censored data and that health utility data are not censored, as values of health above 1 are not possible (see, e.g., Pullenayegum et al., 2010).

This criticism stems from a confusion between two different applications that lead to the same statistical model. The tobit model can arise from true data censoring where the variable of interest is not fully observed. A typical example is data on income where surveys, to encourage responses at higher levels of income, use an upper limit above which respondents only need to indicate if their income is above that limit. The data are only observed in a particular range that causes a pileup of observations, but the variable of interest is still underlying income. The same tobit model can also arise from a corner solution of choice as in Tobin (1958). Tobin developed his original model to deal with variables that are limited, such as expenditure on durable goods. Expenditure can only be positive or zero, and the variable is referred to as limited at zero. At disaggregated levels of expenditure or for luxury goods, pileups at the “corner” of zero are likely. There is no censoring in this case: the entire range of the variable is observed. In fact, the word “censored” is completely absent from Tobin’s original paper. Health utility data are analogous to this second case. The variable of interest is the actual response variable, and the pileup at one represents a corner solution. This distinction is important for prediction purposes. The linear prediction is appropriate if the pileup is due to censoring, but the (nonlinear) prediction of the response variable should be used in the case of a corner solution.
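The difference between the two predictions can be made concrete with a small sketch. It assumes a tobit model with a single upper limit at 1 and a normally distributed error with standard deviation `sigma`; the lower bound of the utility scale is ignored, and the function names and numbers are hypothetical. Writing the latent index as $x\beta$ and $z = (1 - x\beta)/\sigma$, the expected value of the observed (limited) variable is $\Phi(z)\,x\beta - \sigma\phi(z) + 1 - \Phi(z)$.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def latent_mean(xb):
    """Linear prediction: mean of the latent variable (relevant under true censoring)."""
    return xb


def corner_solution_mean(xb, sigma, upper=1.0):
    """Mean of the observed variable min(latent, upper) under normal errors."""
    z = (upper - xb) / sigma
    return norm_cdf(z) * xb - sigma * norm_pdf(z) + (1.0 - norm_cdf(z)) * upper


# Hypothetical linear index above full health: the linear prediction exceeds 1,
# while the corner-solution prediction stays below the upper limit.
xb, sigma = 1.1, 0.2
print(latent_mean(xb), corner_solution_mean(xb, sigma))
```

For patients far below the limit the two predictions nearly coincide; they diverge as the linear index approaches or exceeds 1, which is where the mass of observations at full health sits.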

Nevertheless, the evidence shows that the models above are a poor fit to the data. The majority of the papers illustrate this with reference to the conditional means. They show that the models tend to underestimate health utility at the top (where patients are in good health) and overestimate it at the bottom (where patients are in ill health). Even if interest lies only in the conditional means, the characteristics of utility data are such that the models above tend to distort the conditional means, sometimes due to ignoring the bounded nature of the data and sometimes due to the inappropriateness of the chosen parametric distribution. In cost-effectiveness models where effective treatments increase a patient’s health, the use of these mapping models will typically result in a downward bias in the estimated QALY effect of the treatment and consequently a less cost-effective outcome for the treatment. Table 9 in Gillard, Devine, Varon, Liu, & Sullivan (2012) can be used to illustrate this problem. Using data on EQ-5D-3L in individuals with migraine, they found that successful treatments reducing the number of headache days per month from more than 24 (chronic migraine) to fewer than four per month (episodic migraine) increased the individual health utility by around 0.4 on average. Fitting a linear regression to the data and mapping instead gave an estimate of the expected utility increment of 0.26, a 35% decrease. This illustration presents an extreme case; in practice, the size of the bias will depend on the effectiveness of the treatment. But these large differences accumulated over the life of the cost-effectiveness analysis can have a substantial effect on the ICERs used to inform healthcare priorities.

Problems with these models have led some researchers to wrongly conclude that mapping models always produce biased estimates and should be avoided. However, the development and use of more flexible approaches has shown this not to be the case.4 Austin and Escobar (2003) proposed the use of finite mixture models to estimate PBMs and applied them to HUI3. Mixtures of normal distributions are flexible and can approximate functional forms that are difficult to model using a single distribution. Their use is often linked to the presence of multimodality, but they can also approximate unimodal highly skewed or kurtotic distributions. Austin and Escobar (2003) used a degenerate normal distribution with mean of one and a small standard deviation to account for the mass of observations at one. Mixtures of normal distributions have been used to model other PBMs such as EQ-5D-3L (Khan, Madan, Petrou, & Lamb, 2014a; Kent et al., 2015; Vilija et al., 2017) and SF-6D (Khan et al., 2014a, 2016). Hernández Alava et al. (2012) introduced a finite mixture model specifically developed to deal with the idiosyncrasies of EQ-5D-3L data. The model is based on underlying distributions analogous to the tobit model (corner solution) extended to allow for the gap between one and the immediately previous value encountered in health utility data. The model has been applied successfully to different disease areas (Hernández Alava et al., 2014; Wailoo et al., 2014, 2015; Fuller et al., 2017) as well as different PBMs (Gray et al., 2018). Simultaneously, a separate strand of the methodological literature introduced the use of beta distributions. Beta distributions are useful for modeling health utilities data because they are bounded and are able to accommodate a number of different shapes.
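To make the idea of a finite mixture concrete, the sketch below evaluates the density and mean of a plain mixture of normal distributions. It is a generic illustration, not the specification used in any of the cited papers: the weights, means, and standard deviations are hypothetical, with a narrow component centred at one standing in for the mass of observations at full health.

```python
import math


def normal_pdf(y, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(y, weights, mus, sigmas):
    """Density of a finite mixture: weighted sum of the component densities."""
    return sum(w * normal_pdf(y, m, s) for w, m, s in zip(weights, mus, sigmas))


def mixture_mean(weights, mus):
    """Mean of the mixture: weighted average of the component means."""
    return sum(w * m for w, m in zip(weights, mus))


# Hypothetical three-component mixture (weights must sum to one).
weights = [0.3, 0.5, 0.2]
mus = [0.2, 0.7, 1.0]
sigmas = [0.15, 0.10, 0.01]   # the last component is a narrow spike at one

print(mixture_mean(weights, mus))
print(mixture_pdf(0.2, weights, mus, sigmas), mixture_pdf(0.45, weights, mus, sigmas))
```

With these values the density has separate modes near 0.2, 0.7, and 1, a shape that a single normal distribution cannot reproduce.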

Similarly, fractional response models are useful in modeling bounded data but unlike models based on beta distributions only provide estimates of the conditional means, not the conditional distribution of utilities. Basu and Manca (2012) proposed the use of two-part beta regressions to account for bounded data, skewness, and the spike of observations at the value of full health. Some studies have suggested that the beta regression is not appropriate in cases where the PBM displays negative values (Young et al., 2015) and in some cases have converted all negative values to zero ad hoc purely for convenience (Kaambwa et al., 2017; Khan & Morris, 2014) inadvertently creating a potential problem given the sensitivity of beta regressions to observations at the boundaries. Standard transformations exist and are regularly used with beta regressions (Smithson & Verkuilen, 2006) in other applications. The beta regression approach has been extended to the use of mixtures (Kent et al., 2015) and to, in addition, (a) account for the gap between full health and the next feasible value and (b) allow alternative approaches to deal with observations at the boundaries (Gray et al., 2018).5
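As an illustration of the kind of standard transformation referred to above, utilities on a scale with negative values can be rescaled linearly to the unit interval and then compressed slightly so that no observation sits exactly on a boundary. The sketch below uses the form commonly attributed to Smithson and Verkuilen (2006); the function names are hypothetical, the bounds are the EQ-5D-3L UK range from Table 1, and the sample size `n` is an assumed value.

```python
def rescale_to_unit(y, lower, upper):
    """Linear map from [lower, upper] to [0, 1]; no values are truncated."""
    return (y - lower) / (upper - lower)


def squeeze(y01, n):
    """Compress [0, 1] into the open interval (0, 1); n is the sample size."""
    return (y01 * (n - 1) + 0.5) / n


def to_utility_scale(y01, lower, upper):
    """Map a value on the unit interval back to the original utility scale."""
    return lower + y01 * (upper - lower)


LOWER, UPPER = -0.594, 1.0   # EQ-5D-3L UK value set range (Table 1)
n = 200                      # hypothetical sample size

for y in (LOWER, 0.0, UPPER):
    y01 = rescale_to_unit(y, LOWER, UPPER)
    print(y, round(y01, 4), round(squeeze(y01, n), 4))
```

Unlike setting negative values to zero, the linear rescaling preserves the ordering and relative spacing of all health states, including those valued as worse than dead.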

The flexibility of mixture models is increasingly being recognized as useful in modeling health utility data. Dakin et al. (2018) reported an increase in the use of mixture models in mapping EQ-5D. The existence of mixture model packages in standard software as well as specific code written for bespoke health utility models (see, e.g., the Stata code and accompanying papers Hernández Alava & Wailoo, 2015; Gray & Hernández Alava, 2018) as well as the recognition of problems derived from using inadequate models has helped their uptake. However, mixture models are more difficult to estimate and require more judgment and expertise on the part of the analyst than standard models. For example, the analyst needs to decide the number of components in the mixture. Mixture models are known for having multiple optima (McLachlan & Peel, 2000) and often require running a model from a large number of starting values or the use of global optimisation algorithms.
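The practice of running a mixture model from many starting values can be sketched with a deliberately simple case: a two-component normal mixture fitted by a basic EM algorithm, restarted from several random starting means, keeping the fit with the highest log-likelihood. Everything here is an assumption for illustration (the simulated data, the floor on the standard deviations, the number of starts and iterations); the bespoke utility models cited above are considerably more elaborate.

```python
import math
import random


def normal_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def log_likelihood(data, weights, mus, sigmas):
    total = 0.0
    for y in data:
        density = sum(w * normal_pdf(y, m, s) for w, m, s in zip(weights, mus, sigmas))
        total += math.log(max(density, 1e-300))  # guard against log(0)
    return total


def em_fit(data, start_mus, n_iter=100, min_sigma=0.05):
    """Basic EM for a normal mixture, started from the given component means."""
    k, n = len(start_mus), len(data)
    mean = sum(data) / n
    spread = math.sqrt(sum((y - mean) ** 2 for y in data) / n)
    weights, mus = [1.0 / k] * k, list(start_mus)
    sigmas = [max(spread, min_sigma)] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each observation.
        resp = []
        for y in data:
            dens = [w * normal_pdf(y, m, s) for w, m, s in zip(weights, mus, sigmas)]
            total = max(sum(dens), 1e-300)
            resp.append([d / total for d in dens])
        # M-step: update weights, means and standard deviations.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            mus[j] = sum(r[j] * y for r, y in zip(resp, data)) / nj
            var = sum(r[j] * (y - mus[j]) ** 2 for r, y in zip(resp, data)) / nj
            sigmas[j] = max(math.sqrt(var), min_sigma)  # floor avoids degenerate spikes
    return log_likelihood(data, weights, mus, sigmas), weights, mus, sigmas


def multi_start_fit(data, k=2, n_starts=10, seed=0):
    """Run EM from several random starts; return the best fit and all fits."""
    rng = random.Random(seed)
    fits = [em_fit(data, rng.sample(data, k)) for _ in range(n_starts)]
    return max(fits, key=lambda fit: fit[0]), fits


# Simulated toy data (not real utilities): two clusters of observations.
sim = random.Random(1)
data = [sim.gauss(0.3, 0.08) for _ in range(150)] + [sim.gauss(0.9, 0.08) for _ in range(150)]
best, fits = multi_start_fit(data)
print(best[0], sorted(best[2]))
```

Comparing the log-likelihoods stored in `fits` shows how often different starts reach the same optimum, which is a useful informal check on whether enough starting values have been tried.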

The main barrier to more widespread adoption of these methods is the lack of technical know-how, and consequent misapplication, by analysts. Sometimes mixture models are confused with piecewise models using different distributions on separate ad hoc ranges of the health utility space (Khan et al., 2016). Efforts need to be directed toward developing accessible code and training materials to facilitate implementation and interpretation of complex methods by non-specialists.

Lu, Brazier, and Ades (2013) introduced the use of a common factor model, which treats all outcomes (the PBM and the clinical outcomes) as imperfect measures of a single underlying “health” concept. This is the first and, to our knowledge, only paper in this literature that acknowledges the problems that measurement error can cause in this setting. Lu et al. (2013) demonstrated the approach in a model of linear relationships, but they do allude to the possibility of extending the model by using the more flexible approaches discussed above.

# Indirect Mapping Approaches

Indirect mapping approaches have been less popular than direct mapping approaches, and the literature is limited in comparison. However, they are gaining popularity as models for discrete data become more mainstream in this area. The first indirect mapping model is a set of independent multinomial logit models published by Tsuchiya, Brazier, McColl, and Parkin (2002) in a discussion paper. The most widely known indirect mapping paper is, however, Gray, Rivero-Arias, and Clarke (2006), which also estimated a set of independent logit models and coined the term “response mapping,” which is now widely used in the literature. This approach has intuitive appeal since it is more closely related to the actual data-generating process of health state data. There are a number of papers applying these models in the literature, mainly mapping to EQ-5D (e.g., van Hout et al., 2012; Dakin, Gray, & Murray, 2013; Brazier et al., 2014; Khan et al., 2014a; Hoyle et al., 2016). Developments in this literature include the use of models for ordered outcomes such as ordered probits/logits and generalized ordered probits (Hernández Alava et al., 2013, 2014; Wailoo et al., 2015) to account for the ordered nature of the levels in each dimension. The multivariate ordered probit model of EQ-5D-3L of Conigliani et al. (2015) relaxes the independence assumption. The model was shown to outperform competing independent dimensions models and set the ground for future avenues of research in the indirect mapping literature. However, estimating models involving high-dimensional ordinal variables remains an onerous task.
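For a single dimension, an ordered probit turns a linear index and a set of increasing cut points into a probability for each level. The sketch below shows the standard textbook form for one three-level dimension; the function names, index value, and cut points are hypothetical, and a higher index is taken to mean more severe problems.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(xb, cutpoints):
    """Probabilities of each ordered level given index xb and increasing cut points.

    P(level <= j) = Phi(cutpoint_j - xb); level probabilities are the
    differences between successive cumulative probabilities.
    """
    cumulative = [0.0] + [norm_cdf(c - xb) for c in cutpoints] + [1.0]
    return [cumulative[j + 1] - cumulative[j] for j in range(len(cumulative) - 1)]


# Hypothetical index and cut points for one three-level dimension.
probs = ordered_probit_probs(0.0, [0.0, 1.0])
print([round(p, 4) for p in probs])
```

In an independent-dimensions response mapping, one such set of probabilities is produced per dimension and the sets are then combined with a value set to obtain the expected utility.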

Recently, a new response mapping model has been published (Hernández Alava & Pudney, 2017). NICE recommends EQ-5D-3L for use in its technology appraisals. A new version of EQ-5D, the EQ-5D-5L, has been developed with a new valuation set for England, and a potential need to map in both directions between these two PBMs has arisen. The objectives for developing the model are somewhat different from those of the standard mapping models described above, and its complexity will not be needed in all circumstances. The model and the important policy implications it raises are discussed in a separate section.

# Implications for Cost-Effectiveness Analyses

The proliferation of mapping models has uncovered an important concern. Different mapping models produce different estimates of health utilities for the same dataset. Although these differences appear small in magnitude at the individual health utility level, they often translate into large differences in the results of the cost-effectiveness analysis. Occasionally, several mapping functions from different studies exist in the same disease area, all of which are potential candidates for use in cost-effectiveness analyses. Pennington and Davis (2014) compared the effect of changing the mapping model used to predict health utility on the results of a cost-effectiveness model in rheumatoid arthritis (RA). Using an economic model designed to assess the cost-effectiveness of second-line biologics in RA, they found that the use of different mapping models could have changed estimates substantially, potentially leading to different reimbursement guidelines. The choice of mapping model is therefore not simply a matter of academic interest; it has tangible effects not only on the patients, clinicians, and industry directly involved in those decisions but also on the general public.

Recognizing the need to improve the quality of mapping studies, the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) established the Mapping to Estimate Health-State Utility Values from Non-Preference-Based Outcome Measures Task Force. The Task Force developed good practice guidance on how to conduct mapping analyses, how to assess their performance and suitability, and how to incorporate the results in cost-effectiveness analyses. The panel comprised international experts from academia, industry, and decision-making bodies and published its guide in 2017 (Wailoo et al., 2017). Earlier recommendations for mapping and reporting standards exist in the literature (Longworth & Rowen, 2011; Petrou et al., 2015), but the ISPOR good practice guide is unique because of its international perspective, its coverage of all aspects of mapping, and its reflection of state-of-the-art methods at the time of publication.

# Important Considerations

It is important to recognize that mapping is fundamentally a statistical issue and should be treated with the same degree of rigor as any statistical analysis. Good quality data are the basis for sound statistical modeling. Sometimes, good quality reference datasets are difficult to find, and it is important that mapping studies present enough evidence to support their use. If the data quality is suspect, a mapping model should be used with caution; when data quality is very poor, it may be the case that a mapping model should not be estimated at all. Furthermore, as a mapping model is essentially an input into a cost-effectiveness analysis, it is not sufficient to consider the properties of the mapping model in isolation; the particular needs of the cost-effectiveness analysis should also be considered. There are different types of cost-effectiveness models that have different requirements in terms of mapping. Any mapping study published as a resource for cost-effectiveness analysis needs to demonstrate its suitability for adoption in all those different settings and needs to report everything that a cost-effectiveness analyst will need. Earlier studies only reported the coefficients of the model and, at best, their standard errors. This is insufficient for both decision model–based evaluations and those conducted alongside clinical trials. The covariance matrix of the estimated parameters should be reported so that parameter uncertainty can be reflected in the cost-effectiveness analysis. This permits the analyst to easily compute an overall measure of sampling variation by including both the parameter sampling variation in the original (typically, clinical) study and that of the mapping model, since the two sources of sampling variation are independent. Yet reporting of the covariance matrix is still not routine practice.
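To illustrate why standard errors alone are not enough, the sketch below propagates parameter uncertainty from a mapping model into a predicted mean utility. It assumes NumPy is available; the linear model, coefficient values, covariance matrix, and variable names are all hypothetical and chosen only to show the mechanics.

```python
import numpy as np

# Hypothetical mapping model: utility = b0 + b1 * clinical_score.
beta_hat = np.array([0.85, -0.20])
# Hypothetical covariance matrix of the estimated coefficients.
cov_beta = np.array([[0.0004, -0.0002],
                     [-0.0002, 0.0003]])

rng = np.random.default_rng(seed=1)
n_draws = 5000
# One coefficient vector per probabilistic sensitivity analysis iteration.
beta_draws = rng.multivariate_normal(beta_hat, cov_beta, size=n_draws)

clinical_score = 1.5  # hypothetical value of the clinical measure
x = np.array([1.0, clinical_score])
predicted_means = beta_draws @ x

# Analytic variance of the predicted mean, x' V x. It depends on the
# off-diagonal terms, so it cannot be recovered from standard errors alone.
var_with_cov = x @ cov_beta @ x
var_ignoring_cov = x @ np.diag(np.diag(cov_beta)) @ x
```

In this made-up example, ignoring the negative covariance between intercept and slope more than doubles the apparent variance of the predicted mean. Because the mapping and clinical studies are independent, the variance obtained this way can be added to the sampling variance from the clinical study.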

The appropriate method of using the estimated mapping model depends on the type of cost-effectiveness analysis. For some types of cost-effectiveness analysis, it is only necessary to predict the conditional mean health state utility value, which is just the expected value of the conditional means. Others require patient-level simulation of utilities from the mapping model, for which the full conditional probability distribution may be needed.
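The two uses can be contrasted in a short sketch. Everything here is an assumption made for illustration: the linear mean function, the rescaled beta distribution standing in for a fitted conditional distribution, and the precision parameter. Only the utility range is taken from this article. NumPy is assumed to be available.

```python
import numpy as np

LOWER, UPPER = -0.594, 1.0  # EQ-5D-3L utility range quoted in this article


def conditional_mean(score):
    """Hypothetical mapping function: mean utility given a clinical score."""
    return 0.85 - 0.20 * score


def simulate_utilities(score, n, rng, precision=5.0):
    """Draw patient-level utilities with expectation conditional_mean(score).

    A beta distribution rescaled to [LOWER, UPPER] stands in for the fitted
    conditional distribution; it is an assumption for illustration only.
    """
    mean01 = (conditional_mean(score) - LOWER) / (UPPER - LOWER)
    draws01 = rng.beta(mean01 * precision, (1.0 - mean01) * precision, size=n)
    return LOWER + (UPPER - LOWER) * draws01


rng = np.random.default_rng(seed=7)
score = 1.5
cohort_input = conditional_mean(score)                    # cohort model input
patient_draws = simulate_utilities(score, 100_000, rng)   # patient-level input
```

The cohort model needs only `cohort_input`, whereas a patient-level simulation samples from `patient_draws`; the draws average to the conditional mean but vary widely around it.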

Some authors have stated that mapping underestimates uncertainty (Brazier, Yang, Tsuchiya, & Rowen, 2010; Longworth & Rowen, 2011; Fayers & Hays, 2014). When the expected value is used, the sample variance of the mean predictor will always be smaller than the variance of the data. The variance of the mean predictor is therefore not an appropriate estimator of the variance of the utilities. If the latter is needed, a consistent estimator can be obtained by calculating the variance of the predictive distribution of the utilities, provided the mapping model is correctly specified (Hernández Alava, Wailoo, Wolfe, & Michaud, 2014; Hernández Alava & Pudney, 2017). Using the mean predictor is not a problem when only the mean utility is required for the cost-effectiveness analysis, as the calculation of QALYs is a linear function of the profile of utilities.
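Both points follow from standard results, sketched here in notation that is not the article's own: $U$ is the utility, $X$ the clinical measures, $U_t$ the utility in period $t$, and $\Delta_t$ the length of that period. The law of total variance gives

$$\operatorname{Var}(U) = \operatorname{E}\left[\operatorname{Var}(U \mid X)\right] + \operatorname{Var}\left(\operatorname{E}[U \mid X]\right) \ge \operatorname{Var}\left(\operatorname{E}[U \mid X]\right),$$

with strict inequality whenever utilities vary among individuals who share the same clinical measures. Linearity of expectation, ignoring discounting for simplicity, gives

$$\operatorname{E}[\text{QALY}] = \operatorname{E}\left[\sum_t U_t \Delta_t\right] = \sum_t \operatorname{E}[U_t]\,\Delta_t,$$

so predicted mean utilities are sufficient when only mean QALYs are required.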

In much of the existing mapping literature, model selection has used criteria such as $R^2$ (or $\bar{R}^2$), mean absolute error (MAE) and root mean squared error (RMSE) and not considered the specific type of cost-effectiveness analysis. The datasets are individual-level and display the typical characteristics of microeconometric data. This level of disaggregation inevitably entails substantial variation across individuals and introduces nonlinearities in the response variable. Furthermore, the range of the dependent variable, health utility, is rather small; for example, the widest range in Table 1 corresponds to EQ-5D-3L, with an interval length of 1.594. Measures such as $\bar{R}^2$, MAE and RMSE are insensitive given the high level of individual-level variation and the small scale of health utility. This can mask serious problems of, for example, systematic under/overprediction due to biases in the conditional means. These will, in turn, have a significant effect on the central estimate of cost-effectiveness, which is the most important figure of interest to decision makers. Nor do these measures help to assess the suitability of the mapping model for patient-level simulation cost-effectiveness models. More recent studies tend to include information criteria and plots of the residuals, but even these are of little value in assessing the adequacy of the mapping model for a particular cost-effectiveness analysis.

Mapping models are inputs into cost-effectiveness analysis. This should form part of model selection considerations. Hernández Alava et al. (2014) proposed the use of plots of observed means versus conditional predicted means by the value of clinical measure and plots of the distribution of the data versus the distribution obtained by simulation using the estimated mapping model as the data generating process. These two plots mirror the two possible ways in which the mapping model will be used in the cost-effectiveness analysis. Some cost-effectiveness analyses (typically those based on cohort models) will use the mapping model to predict the mean conditional on the values of the clinical measure(s), and in this case it is important that the mapping model is shown not to exhibit systematic under/overprediction of the conditional means. If it does show a problem, at least the cost-effectiveness analyst will be clear as to the likely direction of the impact on the cost-effectiveness analysis. Patient-level cost-effectiveness analyses (whether patients included in studies or simulated as part of a decision model) require plotting of the distribution to help the analyst determine the suitability of the mapping function for his/her own specific application.
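The first of these checks can be approximated numerically before any plotting. The sketch below is a hypothetical helper, not the procedure of any cited paper; it assumes NumPy and uses a tiny invented dataset in which the model overpredicts at high clinical scores.

```python
import numpy as np


def conditional_mean_table(clinical, observed, predicted, bin_edges):
    """Compare observed and predicted mean utility within bins of a clinical measure.

    Returns a list of (lower_edge, upper_edge, n, observed_mean, predicted_mean)
    tuples, skipping empty bins.
    """
    clinical = np.asarray(clinical, dtype=float)
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # np.digitize returns i when bin_edges[i-1] <= x < bin_edges[i].
    bin_index = np.digitize(clinical, bin_edges)
    rows = []
    for i in range(1, len(bin_edges)):
        in_bin = bin_index == i
        if in_bin.any():
            rows.append((bin_edges[i - 1], bin_edges[i], int(in_bin.sum()),
                         float(observed[in_bin].mean()),
                         float(predicted[in_bin].mean())))
    return rows


# Tiny hypothetical dataset: the model overpredicts at high clinical scores.
clinical = [0.2, 0.4, 1.2, 1.4, 2.2, 2.6]
observed = [0.90, 0.80, 0.60, 0.50, 0.10, 0.00]
predicted = [0.85, 0.85, 0.55, 0.55, 0.25, 0.25]
table = conditional_mean_table(clinical, observed, predicted, [0.0, 1.0, 2.0, 3.0])
```

Plotting the fourth and fifth columns against the bin midpoints gives the observed-versus-predicted comparison; here the first two bins agree while the last bin reveals overprediction in the most severe group.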

There has been a lot of discussion around the issue of empirical validation of the mapping function, either by estimating the model in one dataset and checking how well it predicts in a similar sample, or by estimating it in one part of a split sample and checking how well it predicts in the other part. In practice, often only one small dataset is available, which rules out this type of validation. Even in cases where such exercises can be undertaken, the additional information provided is of limited value.

# Mapping: The Case of EQ-5D-3L and EQ-5D-5L in England

EQ-5D-3L is NICE’s preferred measure of HRQoL in its technology appraisals (NICE, 2013). Concerns about the lack of sensitivity of EQ-5D-3L and the presence of ceiling effects led to the development of a new instrument, EQ-5D-5L. The new version has the same five dimensions (mobility, self-care, usual activities, pain/discomfort, and anxiety/depression) but increases the number of levels in each dimension from three to five (see Table 1). The descriptive system had already been developed at the time the NICE methods guide was written, but no valuation set had yet been released. The guide stated:

The EQ-5D-5L may be used for reference-case analyses. The descriptive system for the EQ-5D-5L has been validated, but no valuation set to derive utilities currently exists. Until an acceptable valuation set for the EQ-5D-5L is available, the validated mapping function to derive utility values for the EQ-5D-5L from the existing EQ-5D (-3L) may be used.

(NICE, 2013)

This suggests that both versions of EQ-5D could be used in appraisal submissions. A preliminary value set for England was published in 2016, with a final version published in Devlin et al. (2018), making it feasible to conduct an economic evaluation using both the descriptive system and the associated tariffs for EQ-5D-5L.

Hernández Alava and Pudney (2017) developed a new multi-equation mapping model to map from EQ-5D-5L to EQ-5D-3L and vice versa. The aim of the mapping model was to check the consistency of the two versions of EQ-5D (implied by the wording of the methods guide) and to assess the likely consequences of a move from EQ-5D-3L to EQ-5D-5L in cost-effectiveness analyses. The model was built as a system of ten ordinal regressions estimated jointly (response mapping), corresponding to the five dimensions of EQ-5D-3L and the five dimensions of EQ-5D-5L. The specification was made as flexible as possible to avoid imposing unnecessarily strong restrictions on the data. The initial study used data from the National Data Bank for Rheumatic Diseases to estimate the mapping model. The conclusions were striking. There were significant differences between the responses to the two versions of EQ-5D, so that the move from 3L to 5L was not just a uniform realignment of the response levels. Furthermore, switching to the new 5L version of EQ-5D could have important implications for the results of cost-effectiveness analyses.

Additional work (Hernández Alava et al., 2018; Pennington et al., 2019) showed that the conclusions were not restricted to rheumatoid arthritis but were general to a wide range of disease areas. Critical for NICE decision making was the fact that there was no single proportional adjustment that could be made to reconcile the differences. Results of cost-effectiveness analyses changed in different directions. On average, EQ-5D-5L assigned higher utility values to the same individual, but the scale of utilities was compressed in a smaller space. The implication is that health technologies that improve HRQoL look less cost-effective using EQ-5D-5L as the marginal increase in health is smaller due to the compression of the scale. However, technologies that extend life can look more cost-effective using EQ-5D-5L as on average higher utilities are assigned to the life years gained. In cost-effectiveness analysis of technologies that increase the quality and the length of life, the resulting effect depends on the relative strengths of each.
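The opposing directions can be seen in a small worked example. All numbers below are hypothetical, chosen only so that the 5L values are higher on average but closer together; discounting is ignored, and the function names are illustrative.

```python
def qaly_gain_quality(u_before, u_after, years):
    """QALYs gained by raising utility from u_before to u_after for `years`."""
    return (u_after - u_before) * years


def qaly_gain_survival(u_state, extra_years):
    """QALYs gained by adding `extra_years` of life lived at utility u_state."""
    return u_state * extra_years


# Hypothetical utilities for the same two health states under each version.
# The 5L values are higher on average but closer together (compressed scale).
u3l = {"worse": 0.40, "better": 0.70}
u5l = {"worse": 0.60, "better": 0.80}

# A technology that improves quality of life for 10 years.
quality_3l = qaly_gain_quality(u3l["worse"], u3l["better"], years=10)
quality_5l = qaly_gain_quality(u5l["worse"], u5l["better"], years=10)

# A technology that adds 2 years of life in the worse state.
survival_3l = qaly_gain_survival(u3l["worse"], extra_years=2)
survival_5l = qaly_gain_survival(u5l["worse"], extra_years=2)

incremental_cost = 30_000  # hypothetical incremental cost
icer_quality_3l = incremental_cost / quality_3l
icer_quality_5l = incremental_cost / quality_5l
```

With these inputs the quality-improving technology gains about 3 QALYs under 3L but about 2 under 5L, so its cost per QALY rises, while the life-extending technology gains about 0.8 QALYs under 3L and about 1.2 under 5L.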

Following this work, in August 2017 NICE issued a position statement on the use of the EQ‑5D‑5L valuation set. The NICE position was to still support the use of EQ-5D-5L for data collection in clinical studies, but the 5L valuation set was not recommended for use in submissions. At the same time, NICE supported further research in the area to help inform a review of the position statement in August 2018. Subsequently, the November 2018 update maintained the 2017 position.

# Future Research

Mapping forms an important element of economic evaluations informing healthcare resource allocation decisions. Although there have been significant methodological advances in the area, a number of challenges remain.

Detractors of mapping have often criticized it on the grounds of producing systematically biased predictions of utilities. The development of more flexible models has shown this not to be the case: properly specified, flexible models that consider the idiosyncrasies of health utility data do not suffer from this problem. Direct mapping models have advanced significantly, and many different models are now available for analysts to choose from. The indirect mapping literature has not followed the same pace, partly because modeling multivariate discrete data is more demanding. There is still scope to develop more flexible models that could be successfully used for mapping. As different preference-based measures are developed, new and existing mapping methods will require development and testing.

Model selection is another area where more research is needed. A mapping model is essentially an input into cost-effectiveness analyses, but selection criteria have not been developed with this in mind. Is it possible to formally incorporate information about the cost-effectiveness analysis in the model selection process? What are the dangers of using standard model selection criteria without taking into account that the mapping model is an input into an economic evaluation model?

Measurement error has largely been ignored to date. So far, only one paper in the mapping literature has attempted to deal with this issue. There are a number of potential sources of measurement error in mapping, stemming from the responses to the valuations, the descriptive system in clinical studies, and the impact of completing both instruments in the mapping datasets. Understanding the implications for cost-effectiveness analyses is important, particularly as the need to map consistently between many different instruments becomes a more pressing issue for decision makers.

# Acknowledgments

I am very grateful to the editor and an anonymous referee for a number of helpful comments and suggestions that have greatly improved the article. The views expressed in this article, as well as any errors or omissions, are of the author only.

## Further Reading

Ara, R., & Wailoo, A. J. (2011). NICE DSU Technical Support Document 12: The use of health state utility values in decision models.

Austin, P., & Escobar, M. (2003). The use of finite mixture models to estimate the distribution of the health utilities index in the presence of a ceiling effect. Journal of Applied Statistics, 30, 909–923.

Basu, A., & Manca, A. (2012). Regression estimators for generic health-related quality of life and quality-adjusted life years. Medical Decision Making, 32, 56–69.

Conigliani, C., Manca, A., & Tancredi, A. (2015). Prediction of patient-reported outcome measures via multivariate ordered probit models. Journal of the Royal Statistical Society: Series A (Statistics in Society), 178(3), 567–591.

Drummond, M. F., Sculpher, M. J., Claxton, K., Stoddart, G. L., & Torrance, G. W. (2015). Methods for the economic evaluation of health care programmes (4th ed.). Oxford, U.K.: Oxford University Press.

Gray, A., Rivero-Arias, O., & Clarke, P. (2006). Estimating the association between SF-12 responses and EQ-5D utility values by response mapping. Medical Decision Making, 26(1), 18–29.

Gray, L. A., Hernández, M., & Wailoo, A. (2018). Development of methods for the mapping of utilities using mixture models: Mapping the AQLQ-S to the EQ-5D-5L and the HUI3 in patients with asthma. Value in Health, 21(6), 748–757.

Hernández Alava, M., & Pudney, S. (2017). Econometric modelling of multiple self-reports of health states: The switch from EQ-5D-3L to EQ-5D-5L in evaluating drug therapies for rheumatoid arthritis. Journal of Health Economics, 55, 139–152.

Hernández Alava, M., Wailoo, A. J., & Ara, R. (2012). Tails from the Peak District: Adjusted limited dependent variable mixture models of EQ-5D questionnaire health state utility values. Value in Health, 15, 550–561.

Hernández Alava, M., Wailoo, A., Grimm, S., Pudney, S., Gomes, M., Sadique, Z., . . . Irvine, L. (2018). EQ-5D-5L versus EQ-5D-3L: The impact on cost effectiveness in the United Kingdom. Value in Health, 21(1), 49–56.

Hernández Alava, M., Wailoo, A., Wolfe, F., & Michaud, K. (2014). A comparison of direct and indirect methods for the estimation of health utilities from clinical outcomes. Medical Decision Making, 34(7), 919–930.

National Institute for Health and Care Excellence. (2013). NICE guide to the methods of technology appraisal. London, U.K.: National Institute for Health and Care Excellence.

Pennington, B., & Davis, S. (2014). Mapping from the Health Assessment Questionnaire to the EQ-5D: The impact of different algorithms on cost-effectiveness results. Value in Health, 17, 762–771.

Pennington, B., Hernández Alava, M., Pudney, S., & Wailoo, A. (2019). The impact of moving from EQ-5D-3L to -5L in NICE technology appraisals. PharmacoEconomics, 37(1), 75–84.

Wailoo, A. J., Hernández Alava, M., Manca, A., et al. (2017). Mapping to estimate health-state utility from non–preference-based outcome measures: An ISPOR Good Practices for Outcomes Research Task Force report. Value in Health, 20(1), 18–27.

## References

Agency for Health Technology Assessment. (2009). Guidelines for conducting Health Technology Assessment (HTA). Lawrenceville, NJ: ISPOR.

Ara, R., & Wailoo, A. J. (2011). NICE DSU Technical Support Document 12: The use of health state utility values in decision models.

Austin, P., & Escobar, M. (2003). The use of finite mixture models to estimate the distribution of the health utilities index in the presence of a ceiling effect. Journal of Applied Statistics, 30, 909–923.

Barton, G. R., Bankart, J., Davis, A. C., & Summerfield, Q. A. (2004). Comparing utility scores before and after hearing aid provision: Results according to the EQ-5D, HUI3 and SF-6D. Applied Health Economics and Health Policy, 3(2), 103–105.

Basu, A., & Manca, A. (2012). Regression estimators for generic health-related quality of life and quality-adjusted life years. Medical Decision Making, 32, 56–69.

Brazier, J., Connell, J., Papaioannou, D., Mukuria, C., Mulhern, B., Peasgood, T., . . . Parry, G. (2014). A systematic review, psychometric analysis and qualitative assessment of generic preference-based measures of health in mental health populations and the estimation of mapping functions from widely used specific measures. Health Technology Assessment, 18(34), 1–188.

Brazier, J., & Roberts, J. (2004). The estimation of a preference-based index from the SF-12. Medical Care, 42, 851–859.

Brazier, J., Roberts, J., Tsuchiya, A., & Busschbach, J. (2004). A comparison of the EQ-5D and SF-6D across seven patient groups. Health Economics, 13, 873–884.

Brazier, J., Roberts, J., & Deverill, M. (2002). The estimation of a preference-based single index measure for health from the SF-36. Journal of Health Economics, 21, 271–292.

Brazier, J. E., Rowen, D., & Hanmer, J. (2008). Revised SF-6D scoring programmes: A summary of improvements. PRO Newsletter, 40, 14–15.

Brazier, J., & Tsuchiya, A. (2015). Improving cross-sector comparisons: Going beyond the health-related QALY. Applied Health Economics and Health Policy, 13(6), 557–565.

Brazier, J. E., Yang, Y., Tsuchiya, A., & Rowen, D. L. (2010). A review of studies mapping (or cross walking) non-preference based measures of health to generic preference-based measures. European Journal of Health Economics, 11(2), 215–225.

Chen, G., Khan, M. A., Iezzi, A., Ratcliffe, J., & Richardson, J. (2016). Mapping between 6 Multiattribute Utility Instruments. Medical Decision Making, 36(2), 160–175.

Chen, G., McKie, J., Khan, M. A., & Richardson, J. R. (2015). Deriving health utilities from the MacNew heart disease quality of life questionnaire. European Journal of Cardiovascular Nursing, 14(5), 405–415.

Coca Perraillon, M., Shih, Y. C., & Thisted, R. A. (2015). Predicting the EQ-5D-3L preference index from the SF-12 health survey in a national US sample: A finite mixture approach. Medical Decision Making, 35, 888–901.

Conigliani, C., Manca, A., & Tancredi, A. (2015). Prediction of patient-reported outcome measures via multivariate ordered probit models. Journal of the Royal Statistical Society: Series A (Statistics in Society), 178(3), 567–591.

Dakin, H. (2013). Review of studies mapping from quality of life or clinical measures to EQ-5D: An online database. Health and Quality of Life Outcomes, 11, 151.

Dakin, H., Gray, A., & Murray, D. (2013). Mapping analyses to estimate EQ-5D utilities and responses based on Oxford Knee Score. Quality of Life Research, 22(3), 683–694.

Dakin, H., Abel, L., Burns, R., & Yang, Y. (2018). Review and critical appraisal of studies mapping from quality of life or clinical measures to EQ-5D: An online database and application of the MAPS statement. Health and Quality of Life Outcomes, 16, 31.

Devlin, N. J., Shah, K. K., Feng, Y., Mulhern, B., & van Hout, B. (2018). Valuing health-related quality of life: An EQ-5D-5L value set for England. Health Economics, 27, 7–22.

Devlin, N., & Appleby, J. (2010). Getting the most out of PROMs: Putting health outcomes at the heart of NHS decision making. London, U.K.: King’s Fund and Office of Health Economics.

Devlin, N., & Brooks, R. (2017). EQ-5D and the EuroQol group: Past, present and future. Applied Health Economics and Health Policy, 15(2), 127–137.

Dolan, P. (1997). Modeling valuations for EuroQol health states. Medical Care, 35(11), 1095–1108.

Drummond, M. (2005). Methods for the economic evaluation of health care programmes (3rd ed.). Oxford, U.K.: Oxford University Press.

EuroQol Group. (1990). EuroQol—A new facility for the measurement of health-related quality of life. Health Policy, 16, 199–208.

Fayers, P. M., & Hays, R. D. (2014). Should linking replace regression when mapping from profile-based measures to preference-based measures? Value in Health, 17(2), 261–265.

Fuller, G. W., Hernández, M., Pallott, D., Lecky, F., Stevenson, M., & Gabbe, B. (2017). Health state preference weights for the Glasgow outcome scale following traumatic brain injury: A systematic review and mapping study. Value in Health, 20(1), 141–151.

Furlong, W., Feeny, D., Torrance, G., Goldsmith, C., DePauw, S., Zhu, Z., . . . Boyle, M. (1998). Multiplicative multi-attribute utility function for the health utilities index Mark 3 (HUI3) System: A technical report. Centre for Health Economics and Policy Analysis Working Paper Series 1998–11. Hamilton, ON: Centre for Health Economics and Policy Analysis (CHEPA), McMaster University.

Gillard, P. J., Devine, B., Varon, S. F., Liu, L., & Sullivan, S. D. (2012). Mapping from disease-specific measures to health-state utility values in individuals with migraine. Value in Health, 15(3), 485–494.

Goldfeld, K. S., Hamel, M. B., & Mitchell, S. L. (2012). Mapping health status measures to a utility measure in a study of nursing home residents with advanced dementia. Medical Care, 50, 446–451.

Gray, A., Rivero-Arias, O., & Clarke, P. (2006). Estimating the association between SF-12 responses and EQ-5D utility values by response mapping. Medical Decision Making, 26(1), 18–29.

Gray, L. A., & Hernández Alava, M. (2018). BETAMIX: A command for fitting mixture regression models for bounded dependent variables using the beta distribution. Stata Journal, 18(1), 51–75.

Gray, L. A., Hernández, M., & Wailoo, A. (2018). Development of methods for the mapping of utilities using mixture models: Mapping the AQLQ-S to the EQ-5D-5L and the HUI3 in patients with asthma. Value in Health, 21(6), 748–757.

Guideline for the Conduct of Economic Evaluations in Health Care. (2016). Netherlands: Zorginstituut Nederland.

Guidelines for the Economic Evaluation of Health Technologies. (2017). (4th ed.). Ottawa, ON: CADTH.

Haute Autorité de santé. (2012). Choices in methods for economic evaluation 2012.

Hernández Alava, M., & Pudney, S. (2017). Econometric modelling of multiple self-reports of health states: The switch from EQ-5D-3L to EQ-5D-5L in evaluating drug therapies for rheumatoid arthritis. Journal of Health Economics, 55, 139–152.

Hernández, M., & Pudney, S. (2018). eq5dmap: A command for mapping between EQ-5D-3L and EQ-5D-5L. The Stata Journal, 18(2), 395–415.

Hernández Alava, M., Brazier, J., Rowen, D., & Tsuchiya, A. (2013). Common scale valuations across different preference-based measures: Estimation using rank data. Medical Decision Making, 33(6), 839–852.

Hernández Alava, M., & Wailoo, A. J. (2015). Fitting adjusted limited dependent variable mixture models to EQ-5D. Stata Journal, 15, 737–750.

Hernández Alava, M., Wailoo, A. J., & Ara, R. (2012). Tails from the Peak District: Adjusted limited dependent variable mixture models of EQ-5D questionnaire health state utility values. Value in Health, 15, 550–561.

Hernández Alava, M., Wailoo, A., Grimm, S., Pudney, S., Gomes, M., Sadique, Z., . . . Irvine, L. (2018). EQ-5D-5L versus EQ-5D-3L: The impact on cost effectiveness in the United Kingdom. Value in Health, 21(1), 49–56.

Hernández Alava, M., Wailoo, A., Wolfe, F., & Michaud, K. (2013). The relationship between EQ-5D, HAQ and pain in patients with rheumatoid arthritis. Rheumatology, 52(5), 944–950.

Hernández Alava, M., Wailoo, A., Wolfe, F., & Michaud, K. (2014). A comparison of direct and indirect methods for the estimation of health utilities from clinical outcomes. Medical Decision Making, 34(7), 919–930.

Horsman, J., Furlong, W., Feeny, D., & Torrance, G. (2003). The health utilities index (HUI®): Concepts, measurement properties, and applications. Health and Quality of Life Outcomes, 1(54).

Hoyle, C. K., Tabberer, M., & Brooks, J. (2016). Mapping the COPD Assessment Test onto EQ-5D. Value in Health, 19, 469–477.

Hurst, N., Kind, P., Ruta, D., Hunter, M., & Stubbings, A. (1997). Measuring health-related quality of life in rheumatoid arthritis: Validity, responsiveness and reliability of EuroQol (EQ-5D). Rheumatology, 36(5), 551–559.

International Society for Pharmacoeconomics and Outcomes Research. (2018). Pharmacoeconomic guidelines around the world.

Instituto de Evaluación Tecnológica en Salud. (2014). Manual para la elaboración de evaluaciones económicas en salud. Bogotá, Colombia: IETS.

Kaambwa, B., Chen, G., Ratcliffe, J., Iezzi, A., Maxwell, A., & Richardson, J. (2017). Mapping between the Sydney Asthma Quality of Life Questionnaire (AQLQ-S) and five multi-attribute utility instruments (MAUIs). Pharmacoeconomics, 35, 111–124.Find this resource:

Kearns, B., Ara, R., Wailoo, A., Manca, A., Hernández Alava, M., Abrams, K., et al. (2013). Good practice guidelines for the use of statistical regression models in economic evaluations. Pharmacoeconomics, 31(8), 643–652.Find this resource:

Kent, S., Gray, A., Schlackow, I, Jenkinson, C., & McIntosh, E. (2015). Mapping from the Parkinson’s Disease Questionnaire PDQ-39 to the Generic EuroQol EQ-5D-3L. The Value of Mixture Models Medical Decision Making, 35, 902–911.Find this resource:

Khan, K. A., Madan, J., Petrou, S., & Lamb, S. E. (2014a). Mapping between the Roland Morris Questionnaire and generic preference-based measures. Value in Health, 17(6), 686–695.Find this resource:

Khan, I., & Morris S. A. (2014). A non-linear beta-binomial regression model for mapping EORTC QLQ-C30 to the EQ-5D-3L in lung cancer patients: A comparison with existing approaches. Health and Quality of Life Outcomes, 12, 163.Find this resource:

Khan, I., Morris, S., Pashayan, N., Matata, B., Bashir, Z., & Maguirre, J. (2016). Comparing the mapping between EQ-5D-5L, EQ-5D-3L and the EORTC-QLQ-C30 in non-small cell lung cancer patients. Health and Quality of Life Outcomes, 14, 60.Find this resource:

Khan, K. A., Petrou, S., Rivero-Arias, O., Walters, S. J., & Boyle, S. E. (2014b). Mapping EQ-5D utility scores from the PedsQL Generic Core Scales. Pharmacoeconomics, 32(7), 693–706.Find this resource:

Lee, L., Kaneva, P., Latimer, E., & Feldman, L. S. (2014). Mapping the gastrointestinal quality of life index to short-form 6D utility scores. Journal of Surgical Research, 186, 135–141.Find this resource:

Longworth, L., & Bryan, S. (2003). An empirical comparison of EQ-5D and SF-6D in liver transplant patients. Health Economics, 12, 1061–1067.Find this resource:

Longworth, L., & Rowen, D. (2011). The use of mapping methods to estimate healthstate utility values. Technical Report NICE DSU Technical Support Document 10, Decision Support Unit. Sheffield, U.K.: Health Economics & Decision Science, University of Sheffield.Find this resource:

Lopez-Bastida, J., Oliva, J., Antoñanzas, F., García-Altés, A., Gisbert, R., Mar, J., et al. (2010). Spanish recommendations on economic evaluation of health technologies. European Journal of Health Economics, 11, 513–520.Find this resource:

Lu, G., Brazier, J. E., & Ades A. E. (2013). Mapping from disease-specific to generic health-related quality-of-life scales: A common factor model. Value in Health 16(1), 177–184.Find this resource:

McLachlan, G. J., & Peel, D. (2000). Finite mixture models. New York, NY: John Wiley.Find this resource:

National Institute for Health and Care Excellence. (2010). Adalimumab, Etanercept, Infliximab, Rituximab and Abatacept for the treatment of rheumatoid arthritis after the failure of a TNF inhibitor. London, U.K.: National Institute of Health and Care Excellence.Find this resource:

National Institute for Health and Care Excellence. (2013). NICE guide to the methods of technology appraisal. London, U.K.: National Institute for Health and Care Excellence.

O’Brien, B. J., Spath, M., Blackhouse, G., Severens, J. L., & Brazier, J. E. (2003). A view from the bridge: Agreement between the SF-6D utility algorithm and the Health Utilities Index. Health Economics, 12, 975–982.

Payakachat, N., Tilford, J. M., & Kuhlthau, K. A. (2014). Predicting health utilities for children with autism spectrum disorders. Autism Research, 7, 649–663.

Pennington, B., & Davis, S. (2014). Mapping from the Health Assessment Questionnaire to the EQ-5D: The impact of different algorithms on cost-effectiveness results. Value in Health, 17, 762–771.

Pennington, B., Hernández Alava, M., Pudney, S., & Wailoo, A. (2019). The impact of moving from EQ-5D-3L to -5L in NICE technology appraisals. PharmacoEconomics, 37(1), 75–84.

Petrou, S., Rivero-Arias, O., Dakin, H., et al. (2015). Preferred reporting items for studies mapping onto preference-based outcome measures: The MAPS statement. Health and Quality of Life Outcomes, 13, 106.

PHARMAC. (2015). Prescription for pharmacoeconomic analysis version 2.2.

Pharmaceutical Benefits Board. (2003). General guidelines for economic evaluations from the Pharmaceutical Benefits Board. Lawrenceville, NJ: ISPOR.

Pharmaceuticals Pricing Board. (2017). Preparing a health economic evaluation to be attached to the application for reimbursement status and wholesale price for a medicinal product. Application instructions.

Pullenayegum, E. M., Tarride, J., Xie, F., Goeree, R., Gerstein, H. C., & O’Reilly, D. (2010). Analysis of health utility data when some subjects attain the upper bound of 1: Are Tobit and CLAD models appropriate? Value in Health, 13, 487–494.

Roset, M., Badia, X., Forsythe, A., & Webb, S. M. (2013). Mapping CushingQoL scores onto SF-6D utility values in patients with Cushing’s syndrome. Patient, 6, 103–111.

Smithson, M., & Verkuilen, J. (2006). A better lemon squeezer? Maximum-likelihood regression with beta-distributed dependent variables. Psychological Methods, 11, 54–71.

Tobin, J. (1958). Estimation of relationships for limited dependent variables. Econometrica, 26(1), 24–36.

Tsuchiya, A., Brazier, J., McColl, E., & Parkin, D. (2002). Deriving preference-based single indices from non-preference-based condition specific instruments: Converting AQLQ into EQ5D indices. Discussion paper. Sheffield, U.K.: Health Economics and Decision Science, University of Sheffield.

van Hout, B., Janssen, M. F., Feng, Y. S., Kohlmann, T., Busschbach, J., Golicki, D., et al. (2012). Interim scoring for the EQ-5D-5L: Mapping the EQ-5D-5L to EQ-5D-3L value sets. Value in Health, 15(5), 708–715.

Vilija, R. J., Sun, H., Barnett, P. G., Bansback, N., Griffin, S. C., Bayoumi, A. M., . . . Owens, D. K. (2017). Mapping MOS-HIV to HUI3 and EQ-5D-3L in Patients With HIV. MDM Policy & Practice, 2(2).

Wailoo, A. J., Hernández Alava, M., Manca, A., et al. (2017). Mapping to estimate health-state utility from non–preference-based outcome measures: An ISPOR Good Practices for Outcomes Research Task Force Report. Value in Health, 20(1), 18–27.

Wailoo, A., Hernández Alava, M., & Escobar Martinez, A. (2014). Modelling the relationship between the WOMAC osteoarthritis index and EQ-5D. Health and Quality of Life Outcomes, 12(1), 37.

Wailoo, A., Hernández, M., Philips, C., Brophy, S., & Siebert, S. (2015). Modeling health state utility values in Ankylosing Spondylitis: Comparisons of direct and indirect methods. Value in Health, 18(4), 425–431.

Ware, J., Snow, K., Kolinski, M., & Gandeck, B. (1993). SF-36 Health Survey manual and interpretation guide. Boston, MA: The Health Institute, New England Medical Centre.

Weinstein, M. C., & Stason, W. B. (1977). Foundations of cost-effectiveness analysis for health and medical practices. New England Journal of Medicine, 296, 716–721.

Weinstein, M. C., Torrance, G., & McGuire, A. (2009). QALYs: The basics. Value in Health, 12(1), S5–S9.

Wibulpolprasert, S. (2008). Health technology assessment guideline. Journal of the Medical Association of Thailand, 91, 2

Yang, Y., Wong, M. Y., Lam, C. L., & Wong, C. K. (2014). Improving the mapping of condition-specific health-related quality of life onto SF-6D score. Quality of Life Research, 23, 2343–2353.

Young, T. A., Mukuria, C., Rowen, D., Brazier, J. E., & Longworth, L. (2015). Mapping functions in health-related quality of life: Mapping from two cancer-specific health-related quality-of-life instruments to EQ-5D 3L. Medical Decision Making, 35, 912–926.

## Notes:

(1.) The four procedures are hip replacement, knee replacement, and, up to September 2017, varicose vein and groin hernia surgery in England.

(3.) The vast majority of the linear regressions have been estimated using ordinary least squares (OLS). A few recent papers have used robust MM-estimators.

(4.) A number of additional models, such as splines, fractional polynomials, quantile regression, and generalized linear models (GLMs), have also been used in the literature.

(5.) Other mixture models used to estimate mapping functions include mixtures of Tobit models with and without an additional degenerate distribution at full health (Coca Perraillon et al., 2015, and Hernández Alava et al., 2012, 2014, respectively).