PRINTED FROM the OXFORD RESEARCH ENCYCLOPEDIA, ECONOMICS AND FINANCE (oxfordre.com/economics). (c) Oxford University Press USA, 2019. All Rights Reserved. Personal use only; commercial use is strictly prohibited (for details see Privacy Policy and Legal Notice).

date: 17 November 2019

Economic Evaluation of Medical Screening

Summary and Keywords

The objective of medical screening is to prevent future disease (secondary prevention) or to improve prognosis by detecting the disease at an earlier stage (early detection). This involves examination of individuals with no symptoms of disease. Because introducing a screening program is resource demanding, stakeholders emphasize the need for comprehensive evaluation prior to population-based implementation, to ensure that costs and health outcomes are reasonably balanced.

Economic evaluation of population-based screening programs involves quantifying health benefits (e.g., life-years gained) and monetary costs of all relevant screening strategies. The alternative strategies can vary by starting- and stopping-age, frequency of the screening and follow-up regimens after a positive test result. Following evaluation of all strategies, the efficiency frontier displays the efficient strategies and the country-specific cost-effectiveness threshold is used to determine the optimal, i.e., most cost-effective, screening strategy.

Similar to other preventive interventions, the costs of screening are immediate, while the health benefits accumulate after several years. Hence, the effect of discounting can be substantial when estimating the net present value (NPV) of each strategy. Reporting both discounted and undiscounted results is recommended. In addition, intermediate outcome measures, such as number of positive tests, cases detected, and events prevented, can be valuable supplemental outcomes to report.

Estimating the cost-effectiveness of alternative screening strategies is often based on decision-analytic models, synthesizing evidence from clinical trials, literature, guidelines, and registries. Decision-analytic modeling can include evidence from trials with intermediate or surrogate endpoints and extrapolate to long-term endpoints, such as incidence and mortality, by means of sophisticated calibration methods. Furthermore, decision-analytic models are unique in that a large number of screening alternatives can be evaluated simultaneously, which is not feasible in a randomized controlled trial (RCT). Still, evaluation of screening based on RCT data is valuable, as both costs and health benefits are measured for the same individual, enabling more advanced analysis of the interaction of costs and health benefits.

Evaluation of screening involves multiple stakeholders, and considerations other than cost-effectiveness, such as distributional concerns, severity of the disease, and capacity, influence decision-making. Analysis of harm-benefit trade-offs is a useful tool to supplement cost-effectiveness analyses. Decision-analytic models are often based on 100% participation, which is rarely the case in practice. If those participating are different from those not choosing to participate, with regard to, for instance, risk of the disease or condition, this would result in selection bias, and the result in practice could deviate from the results based on 100% participation. The development of new diagnostics or preventive interventions requires re-evaluation of the cost-effectiveness of screening. For example, if treatment of a disease becomes more efficient, screening becomes less cost-effective. Similarly, the introduction of vaccines (e.g., HPV-vaccination for cervical cancer) may influence the cost-effectiveness of screening. With access to individual level data from registries, there is an opportunity to better represent heterogeneity and long-term consequences of screening on health behavior in the analysis.

Keywords: population-based screening program, cost-effectiveness, decision-analytic models, screening strategies, clinical trials, uncertainty, health economics

Introduction

Medical screening involves identifying the likely presence of a disease or condition in individuals perceived to be healthy, using diagnostic tests, examinations, or other medical procedures (Wilson & Jungner, 1968). The objective of medical screening is to prevent future disease (secondary prevention) or to improve prognosis by detecting the disease at an earlier stage (early detection). Screening as a prevention approach has gained increased acceptance following advances in screening technologies and the success of screening programs (Arbyn et al., 2010; Elfström et al., 2015; Vaccarella et al., 2014). There are generally two types of screening: population-based, which involves inviting all individuals in a target group (e.g., selected by age) to an examination or screening test; and, high-risk screening, which involves restricting screening to higher-risk groups. While population-based screening includes both prevention and early detection (such as screening for cancer), high-risk screening primarily involves early detection (e.g., screening for tuberculosis among immigrants).

According to the World Health Organization (Andermann, Blancquaert, Beauchamp, & Déry, 2008; Dobrow, Hagens, Chafe, Sullivan, & Rabeneck, 2018; Wilson & Jungner, 1968), several principles and practice of screening should be considered prior to implementing population-based screening for a disease or condition. From an economic perspective, some principles are of particular interest: the target population is unambiguously defined (e.g., all newborn infants, women aged 60 years, or recent immigrants from countries with a high prevalence of tuberculosis); the test is acceptable for use on the population (e.g., not painful, sufficiently accurate); evidence for the effectiveness has been established in the scientific literature (e.g., risk reduction in cancer mortality and/or cancer incidence); the expected benefits of screening outweigh the harms (i.e., balancing the health gains with adverse events, costs and other potential negative consequences); and planned evaluation of the program from the outset. Based on these principles, a broad variety of screening programs across different disease areas have been implemented. For example, in England, ten different population-based screening programs are offered (Table 1).

Table 1. List of Population-Based Screening Programs Offered in England (a)

| Screening Program | Screening Test(s) | Target Population/Age(s) | Target Disease(s) | Number of International CEAs (Year) (b) |
| --- | --- | --- | --- | --- |
| Abdominal Aortic Aneurysm | Ultrasound test | Men aged 65 and older | Abdominal aortic aneurysm | 6 (2007–2016) |
| Cancer Screening | | | | |
| Bowel Cancer | Faecal occult blood | Every two years to all men and women aged 60–74 years | Early detection and prevention of colorectal cancer | 20 (1998–2016) |
| Breast | Mammography | Women aged 50–70 years | Early detection of breast cancer | 24 (1991–2016) |
| Cervical | Liquid-based cytology; human papillomavirus testing (triage) | Every three years for women aged 25–49; every five years for women aged 50–64 years | Early detection and prevention of cervical cancer | 52 (1990–2016) |
| Prenatal and Infant Programs | | | | |
| Fetal Anomaly | Ultrasound to measure fluid at the back of the fetus’ neck combined with blood test | Expectant mothers at between 10+0 and 14+1 weeks of their pregnancy | Down’s syndrome (T21), Edwards’ syndrome (T18), Patau’s syndrome (T13), and nine additional conditions | 2 (2007–2015) |
| Infectious Diseases in Pregnancy | Blood sample | All expectant mothers | HIV, hepatitis B, syphilis | None |
| Newborn and Infant Physical Examination | Physical examination | All infants within 72 hours of birth, and again between six to eight weeks | Congenital heart disease, developmental dysplasia of the hip, congenital cataracts, cryptorchidism (undescended testes) | None |
| Newborn Blood Spot | Blood from child’s heel | All infants five days old | Sickle cell disease, cystic fibrosis, congenital hypothyroidism, inherited metabolic diseases (IMDs) | None |
| Newborn Hearing | Automated otoacoustic emission (AOAE), automated auditory brainstem response (AABR) | All eligible infants ≤ five weeks old | Hearing impairment | None |
| Sickle Cell and Thalassaemia | Blood sample | All expectant mothers, father-to-be if mother is genetic carrier | Genetic carriers for sickle cell, thalassaemia, and other haemoglobin disorders | None |

(b) CEA Registry (12 Mar 2018); CEAs performed world-wide.

Abbreviations. CEA: Cost-effectiveness Analysis

Although screening is increasingly common, it has often been implemented without comprehensive evaluation of both costs and consequences. Because introducing a screening program to the entire population is resource demanding, stakeholders emphasize the need for evaluation prior to implementation to ensure that the health benefits and monetary costs are reasonably balanced (Canadian Agency for Drugs and Technologies in Health, 2017; Dutch National Healthcare Institute, 2016; Holland, Stewart, & Masseria, 2006; National Institute for Health and Care Excellence, 2017).

Evaluation of population-based screening programs should include both the long-term health and economic implications of a program and should cover all relevant strategies. Compared to pharmaceuticals or new technologies, there are specific properties of screening that require special consideration during the evaluation process and that complicate the sole use of randomized clinical trials (RCTs) to complete the evaluation process. For example, there are often multiple screening approaches (i.e., strategies) that need to be considered, such as the type of screening test, the screening target groups (e.g., ages for when to start and stop screening), the screening frequency (e.g., once-only, annual, biannual) and how to follow up screen-positive individuals (the triage algorithm). In addition, the benefits and costs of the program should be evaluated over a sufficiently long time horizon (often involving decades or a lifetime) to capture all relevant outcomes of the screening program. Due to these challenges, the use of complex methods (such as decision-analytic/disease simulation models) is often required to comprehensively estimate the value of a screening program.

This article provides an overview of evaluation of population-based screening programs, for any type of disease or condition, within the framework of economic evaluation. Due to the characteristics and complexities of screening, this article focuses on economic evaluation of screening programs within the context of decision-analytic modeling, which allows evaluation of an array of mutually exclusive screening algorithms over an extended analytic period. In addition, this article provides a discussion of how screening can be evaluated using primary sources of data such as RCTs and observational studies. Evaluation of screening for cancer is used as an example, drawing from several examples of economic evaluation based on decision-analytic models in this disease area.

The properties of screening and how screening can influence the natural history of disease will be defined in the next section, followed by a presentation of the framework of cost-effectiveness analysis. Subsequently, the analytic frameworks will be described, including both the use of individual level data and decision-analytic models, followed by a discussion of the challenges in evaluation of population-based screening programs. Finally, some concluding remarks will be provided.

Screening

The natural history of a disease and how medical screening could affect the disease pathway is illustrated in Figure 1. The exemplary disease has a biological onset; without screening, the disease would be detected during the clinical phase, based on symptoms. The characteristics of a screening test (i.e., sensitivity and specificity) determine the ability of the screening test to detect (or falsely detect) the disease. The sensitivity of a test is defined as the probability of having a positive test result given that the individual has the disease, while the specificity of a test is defined as the probability of having a negative test result given that the individual does not have the disease. The earlier a disease is detected by screening, the shorter the delay time. Earlier detection of the disease also increases the lead time, that is, the amount of time by which the diagnosis of the disease is advanced due to screening.

Figure 1. The natural history of disease (panel A) and illustration of where screening would affect the disease pathway, with screening for cervical cancer as an example (panel B).

Two types of biases can occur in evaluation of the effectiveness of a screening program: length bias and lead-time bias. Length bias occurs because screen-detected diseases may inherently be less aggressive forms of the disease. In contrast, an aggressive disease may have a shorter screen-detectable period, defined as the period in which the disease can be diagnosed with a screening test (Figure 1). Therefore, the most aggressive forms of diseases are diagnosed due to the development of symptoms. The less aggressive diseases may be over-represented in the screening cohort because they have a longer asymptomatic period. If the probability of cure and survival is higher for a patient with a less aggressive form of disease than a patient with a more aggressive disease, a screen-detected aggressive disease may not face the same survival improvements as a screen-detected but less aggressive disease. Lead-time bias can occur when screening falsely prolongs survival. Due to screening, the disease is diagnosed earlier than when diagnosed based on symptoms, but the outcome, in terms of time of death, remains unchanged by the screening intervention. Consequently, the patient will not benefit from screening (no additional life years), but due to early diagnosis, the individual will be aware of (and potentially anxious about) the disease for a longer period of time. Furthermore, screening may also induce overdiagnosis, that is, detect disease that would never have been diagnosed in the patient’s lifetime due to symptoms. The challenge of overdiagnosis has been addressed in several screening programs, such as breast and prostate cancer screening (Etzioni et al., 2002; Jørgensen, Gøtzsche, Kalager, & Zahl, 2017; Vickers et al., 2014), and overdiagnosis has been shown to result in unnecessary treatments and follow-ups and to reduce overall quality of life.

The majority of screening tests are imperfect (i.e., the sensitivity and specificity are less than 100%); consequently, some individuals who have the disease are falsely reassured (i.e., false negative) and sent back to routine screening, while some healthy individuals will have a false positive test result and will be advised additional follow-up. A positive test can trigger a cascade of additional testing, including potentially anxiety-inducing and painful diagnostic procedures (e.g., biopsies) and their complications (e.g., perforation of the colon during colonoscopy), to ascertain the diagnosis (Habbema et al., 2017; Henderson, Webber, & Sawaya, 2018; Lin et al., 2016). These potential negative consequences of screening are often overlooked and under-researched, but should play an important role in screening evaluation (Brewer, Salz, & Lillie, 2007; Gareen et al., 2014).
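The consequences of an imperfect test can be illustrated with a small calculation. The following is a minimal sketch, assuming a single screening round and purely hypothetical values for prevalence, sensitivity, and specificity; the function name `expected_screen_results` is illustrative and does not come from any cited study.

```python
def expected_screen_results(n, prevalence, sensitivity, specificity):
    """Expected counts of each test outcome when n people are screened once."""
    diseased = n * prevalence
    healthy = n - diseased
    return {
        "true_positive": diseased * sensitivity,
        "false_negative": diseased * (1 - sensitivity),   # falsely reassured
        "false_positive": healthy * (1 - specificity),    # advised follow-up without disease
        "true_negative": healthy * specificity,
    }


# Hypothetical inputs, not taken from any screening program
results = expected_screen_results(n=10_000, prevalence=0.01,
                                  sensitivity=0.90, specificity=0.95)
positives = results["true_positive"] + results["false_positive"]
ppv = results["true_positive"] / positives   # share of positive tests that are true cases

for outcome, count in results.items():
    print(f"{outcome}: {count:.0f}")
print(f"positive predictive value: {ppv:.2f}")
```

Under these assumed inputs, false positives outnumber true positives by more than five to one, which is why follow-up costs and harms can weigh heavily in a low-prevalence screening population.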

As screening involves inviting a healthy population without any symptoms in order to prevent or detect a small number of cases, it is essential that the test is deemed acceptable by individuals in the target population. Acceptance is important to ensure a high participation rate, because the benefits of screening can only be achieved if individuals actually participate in screening. In sum, the choice of screening test may depend on availability of required resources, existing qualifications of health personnel, and individual preferences.

In a population-based screening program, cohorts are invited to a standardized screening program with a specified follow-up algorithm, which is carefully evaluated for key performance indicators such as coverage, timely assessment, quality, and standardization of laboratories (Public Health England, 2017). In contrast to organized programs, opportunistic screening is unorganized screening of individuals and relies on an individual’s own initiative to participate and to follow up according to recommendations. Opportunistic screening could therefore contain several non-standardized screening methods (combination of screening test and age) and result in heterogeneity in the follow-up algorithm. Evaluation of the costs and health outcomes of a screening program may be impacted by opportunistic screening, particularly if it is frequently used and of relatively high quality.

Cost-Effectiveness Framework

In economic evaluation, the net present value (NPV) of all relevant (often long-term) health benefits and monetary costs of alternative screening strategies is calculated. The two most common types of evaluation are cost-effectiveness analysis (CEA), which measures health benefits in natural units (e.g., life-years gained), and cost-utility analysis (CUA), which measures health benefits in quality-adjusted life-years (QALYs). CEAs, particularly those evaluating public health programs, such as population-based screening programs, are used to inform population-level decisions, but are not aimed at directly informing individual-level decisions.

The valuation of each screening strategy is represented by the incremental cost-effectiveness ratio (ICER), calculated by dividing the difference in total costs between one screening strategy and the next least costly strategy by the difference in health benefits between the two strategies (Figure 2). Screening strategies that are more costly but provide fewer health benefits than other strategies are removed from further consideration (i.e., strongly dominated). Similarly, strategies that are less cost-efficient (i.e., have a higher ICER) than the next, more effective strategy are considered weakly dominated and are also removed from further consideration. The remaining non-dominated strategies are considered “cost-efficient” and can be plotted on the efficiency frontier (Figure 2). While all strategies on the efficiency frontier are considered “cost-efficient,” only one strategy would be considered optimal, or “cost-effective.”

Figure 2. The efficiency frontier and the ICER when comparing several screening strategies.
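The strong and weak dominance rules described above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions: the function name `efficiency_frontier`, the strategy labels, and all costs and QALYs (totals for a hypothetical cohort) are invented for the example and are not results from any study.

```python
def efficiency_frontier(strategies):
    """strategies: list of (name, cost, effect).

    Returns the non-dominated strategies as (name, cost, effect, icer),
    ordered by cost; icer is None for the least costly strategy.
    """
    # Order by cost; for equal cost, the more effective strategy comes first
    ordered = sorted(strategies, key=lambda s: (s[1], -s[2]))

    # Strong dominance: drop strategies that are not more effective
    # than a less costly strategy
    frontier = []
    for s in ordered:
        if not frontier or s[2] > frontier[-1][2]:
            frontier.append(s)

    # Weak dominance: ICERs must increase along the frontier, so drop any
    # strategy whose ICER exceeds that of the next, more effective strategy
    changed = True
    while changed:
        changed = False
        for i in range(1, len(frontier) - 1):
            prev, cur, nxt = frontier[i - 1], frontier[i], frontier[i + 1]
            icer_cur = (cur[1] - prev[1]) / (cur[2] - prev[2])
            icer_nxt = (nxt[1] - cur[1]) / (nxt[2] - cur[2])
            if icer_cur > icer_nxt:
                del frontier[i]
                changed = True
                break

    result = []
    for i, (name, cost, effect) in enumerate(frontier):
        if i == 0:
            icer = None
        else:
            icer = (cost - frontier[i - 1][1]) / (effect - frontier[i - 1][2])
        result.append((name, cost, effect, icer))
    return result


# Hypothetical cohort totals: (strategy, total cost, total QALYs gained)
strategies = [
    ("No screening", 0, 0),
    ("Every 10 years", 100_000, 10),
    ("Every 5 years", 250_000, 15),   # weakly dominated in this example
    ("Every 3 years", 300_000, 20),
    ("Every 2 years", 600_000, 18),   # strongly dominated in this example
    ("Annual", 900_000, 25),
]

frontier = efficiency_frontier(strategies)
for name, cost, effect, icer in frontier:
    print(name, cost, effect, icer)
```

In this example the ICER of each remaining strategy is recomputed against the previous strategy on the frontier after dominated strategies have been removed, which is why screening every 3 years is compared with screening every 10 years rather than every 5 years.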

In a recent review study of the Tufts Medical Center CEA Registry and the Global Health Cost-Effectiveness Analysis (GHCEA) Registry, 12% of all CEAs evaluated screening programs (Neumann et al., 2016). In contrast, the largest share of CEAs (44%) evaluated pharmaceuticals (Neumann, Sanders, Russell, Siegel, & Ganiats, 2016), usually with the aim to inform market access and reimbursement decisions. For evaluation of screening, the aim of the CEA may be broader; that is, a screening program may have already been implemented, and the aim of the CEA is to inform optimal refinement of the program following advances in screening technologies (e.g., Mendes, Bains, Vanni, & Jit, 2015). In refining an existing screening program, decision-makers are faced with the challenging task of considering a large number of candidate strategies. In this context, a CEA can help inform which strategies are more efficient than others, and thus help narrow down the number of strategies decision-makers need to consider (Pedersen, Sørbye, Burger, Lönnberg, & Kristiansen, 2015).

To identify the optimal or most “cost-effective” intervention, an external decision rule is applied in order to determine where on the efficiency frontier to operate. The decision rule indicates the amount society is willing to pay for an additional health benefit, commonly referred to as the “willingness to pay threshold” or “cost-effectiveness threshold” (CE threshold), often expressed as the additional costs per additional QALY gained (Weinstein, Torrance, & McGuire, 2009). The preferred intervention is the strategy on the efficiency frontier with the highest ICER that still falls below the CE threshold. While there is no universal criterion for what defines a CE threshold, benchmarks exist for some countries (e.g., England, Sweden and Denmark). For example, the National Institute for Health and Care Excellence (NICE), which issues guidance on high-quality and value-driven healthcare in England and Wales, has defined interventions with an ICER between £20,000 and £30,000 per QALY gained to be considered cost-effective (NICE, 2017). Importantly, the optimal strategy identified by the CE threshold is rarely the sole contributor to decision-making; other factors such as distributional and ethical concerns, feasibility, affordability, and political discourse may contribute to an alternative choice of optimal strategy. Even so, quantifying the trade-offs in health benefits and resource use improves the accountability and transparency of decision-making.
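The decision rule can be expressed as a simple walk along the efficiency frontier. The sketch below assumes the frontier has already been identified and that its ICERs increase with cost; the strategy names and ICER values are hypothetical, and the thresholds are chosen only for illustration.

```python
def optimal_strategy(frontier, threshold):
    """frontier: list of (name, icer) ordered by increasing cost.

    The first entry is the least costly strategy and has icer None.
    Returns the name of the most effective strategy whose ICER does
    not exceed the cost-effectiveness threshold.
    """
    chosen = frontier[0][0]
    for name, icer in frontier[1:]:
        if icer <= threshold:
            chosen = name
        else:
            break   # ICERs increase along the frontier, so stop here
    return chosen


# Hypothetical efficiency frontier (cost per QALY gained)
frontier = [
    ("No screening", None),
    ("Every 10 years", 10_000),
    ("Every 3 years", 20_000),
    ("Annual", 120_000),
]

print(optimal_strategy(frontier, threshold=30_000))
print(optimal_strategy(frontier, threshold=5_000))
```

With a threshold of 30,000 per QALY the rule selects screening every 3 years in this example, while a threshold of 5,000 per QALY leaves no screening as the preferred option.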

In order to increase transparency of analyses, the second panel on cost-effectiveness analysis in the United States (Neumann et al., 2016) recommended including an impact inventory table as part of the analysis. Within the context of screening evaluation, different analytic viewpoints can inform different aspects of the decision-making process. For example, while a societal perspective can inform which strategy is optimal for society, a healthcare perspective can inform whether a screening strategy is feasible. Even though a screening strategy may be cost-effective, it may not be feasible within current capacity constraints of the healthcare system. For example, a more intensive screening strategy may cause more screen-positive results, requiring more follow-up tests and thus more pressure on pathology laboratories. However, pathology resources are often limited, and changes in a screening program may cause bottlenecks in the system.

Health Benefits and Costs

Health benefits in screening are most commonly measured in life-years or QALYs, but also with intermediate outcomes, such as cases detected or events avoided (e.g., reduction in cancer mortality). In order to calculate the total health benefit of screening, monitoring of survival is required until everybody exposed and not exposed to screening has died. For a trial among 50-year-olds, this would require monitoring survival for over 40 years before the total benefit is available. For this reason, the use of intermediate outcomes from clinical studies is important for calibrating (i.e., model fitting) long-term health benefits of screening. Similarly, if the primary objective of screening is to prevent future disease, the health benefits generally will not occur immediately, but often after several years or decades, which again requires that a lifetime perspective be analyzed.

Evaluating screening from a healthcare perspective would typically include the cost of the screening test, adverse events, follow-up procedures, and treatment costs. Extending to a societal perspective would have a great impact on the estimation of costs. As population-based screening programs invite individuals without any symptoms of disease, the majority are likely to be working. Hence, in order to participate, the time allocated to travelling and the screening test would either be during working hours or leisure time, both associated with an opportunity cost of time. In addition, the cost of screening would increase because of costs related to traveling to/from screening examinations as well as the individual’s time related to participating in the screening and the follow-up examinations (if indicated). However, not all country-specific guidelines recommend this broader perspective. Countries such as the United Kingdom and Norway recommend a narrower “healthcare perspective,” under which these costs are not included.

When calculating the NPV of screening strategies, it is recommended that both health benefits and costs are discounted at an equal rate (Neumann et al., 2016); however, this has been subject to debate (Paulden, O’Mahony, & McCabe, 2017). In the calculation of NPV, another challenge emerges: the imbalance in the timing of the accumulation of costs and health benefits. While the cost of screening occurs initially, the health benefits accumulate after several years. Therefore, the effect of discounting can be substantial, and reporting both discounted and undiscounted health benefits is recommended. Supplementing the results section with intermediate health outcomes, which are highly correlated with the main health outcome (avoided cases), may overcome some of the challenges with measuring long-term health benefits.
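The effect of this timing imbalance can be shown with a stylized calculation. The sketch below assumes a 3% annual discount rate applied equally to costs and health benefits, a one-off cost today, and a single QALY gained 30 years later; all of these values are hypothetical and chosen only to make the arithmetic transparent.

```python
def present_value(stream, rate):
    """Discount a yearly stream to today; stream[t] occurs t years from now."""
    return sum(x / (1 + rate) ** t for t, x in enumerate(stream))


RATE = 0.03                          # assumed annual discount rate

costs = [1000.0] + [0.0] * 30        # screening cost paid today
benefits = [0.0] * 30 + [1.0]        # one QALY gained in year 30

undiscounted_ratio = sum(costs) / sum(benefits)
discounted_ratio = present_value(costs, RATE) / present_value(benefits, RATE)

print(f"undiscounted cost per QALY: {undiscounted_ratio:.0f}")
print(f"discounted cost per QALY:   {discounted_ratio:.0f}")
```

Because the cost is incurred immediately it is left almost untouched by discounting, while the delayed QALY shrinks to roughly 0.41 of its undiscounted value, so the discounted cost per QALY is more than twice the undiscounted figure in this example.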

Analytic Framework

Clinical Trial-Based Evaluations

The preferred methodology for evaluating the effect of a new technology, including screening methods, is through an RCT. Because the intervention is assigned to individuals at random, and blinded when possible, the RCT framework ensures that the group invited to screening is statistically equivalent to the control group, and any resulting differences in costs and health benefits are directly caused by screening alone (Angrist & Pischke, 2015; Wooldridge, 2010). In an RCT, costs and health outcomes should be collected simultaneously for all individuals, which provides an opportunity for advanced analysis of the interaction of costs and benefits. The preferred analytic approach is intention-to-treat (ITT), where the average costs and health outcomes for all invited to screening are compared with the average outcomes in the control group. The discounted health benefit is measured by area under the curve (AUC) for both life-years gained and QALYs, discounted over the time horizon, while costs will be estimated as NPV of costs accumulated over the observation period. To account for uncertainty, bootstrapping is typically applied to identify the variation in costs and health benefits between the control and the screening group. Importantly, evaluators should consider the potential impact of length and lead-time biases, which can inflate apparent survival and QALY gains, particularly when interventions are evaluated before all individuals have died. See Glick, Doshi, Sonnad, and Polsky (2007) for additional information on conducting a CEA in a trial setting.
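A nonparametric bootstrap of the trial arms can be sketched as follows. This is an illustration under stated assumptions rather than a prescribed method: individuals are resampled with replacement within each arm, costs and effects are resampled as pairs so that their correlation is preserved, and the trial data are simulated from arbitrary distributions purely to make the example runnable.

```python
import random
from statistics import mean


def bootstrap_differences(control, screened, n_boot=1000, seed=0):
    """control, screened: lists of (cost, effect), one pair per individual.

    Returns n_boot replicates of (difference in mean cost,
    difference in mean effect), screened minus control.
    """
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        # Resample individuals with replacement within each arm
        c = rng.choices(control, k=len(control))
        s = rng.choices(screened, k=len(screened))
        d_cost = mean(x[0] for x in s) - mean(x[0] for x in c)
        d_effect = mean(x[1] for x in s) - mean(x[1] for x in c)
        replicates.append((d_cost, d_effect))
    return replicates


# Simulated (hypothetical) trial data: 200 individuals per arm
data_rng = random.Random(1)
control = [(data_rng.gauss(1000, 200), data_rng.gauss(10.0, 2.0)) for _ in range(200)]
screened = [(data_rng.gauss(1300, 200), data_rng.gauss(10.2, 2.0)) for _ in range(200)]

reps = bootstrap_differences(control, screened)
d_costs = sorted(r[0] for r in reps)
lower = d_costs[int(0.025 * len(d_costs))]
upper = d_costs[int(0.975 * len(d_costs)) - 1]
print(f"95% percentile interval for the cost difference: {lower:.0f} to {upper:.0f}")
```

The same replicates can be plotted as a scatter of cost differences against effect differences to show the joint uncertainty around the incremental result.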

An advantage of individual level data from an RCT is the opportunity to obtain more detailed analysis of heterogeneity, which could potentially give recommendations for sub-groups and not only for the average population. For instance, with individual level data the effect of gender and comorbidity on costs and health benefits could be calculated. Differences in costs between cases detected at screening and cases detected later could also be estimated. Adverse events of screening are easier to identify in trials and could also be linked to short- and long-term costs and health benefits.

When an RCT has not been conducted to establish the effect of screening, specific statistical methods can be applied to observational data to identify the causal effect of screening on costs and health outcomes (Angrist & Pischke, 2015; Wooldridge, 2010). This is, however, challenging due to potential selection bias. Analysts can apply instrumental variable analysis to adjust for selection bias; however, the instrument must fulfill certain properties: 1) the instrument must have a causal effect on being allocated to screening; 2) the instrument must be randomly assigned and independent of other variables (independence assumption); and 3) the instrument cannot affect the main outcome variables (life-years gained, QALYs, or costs) other than through screening (exclusion restriction).
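One simple instrumental variable estimator for a binary instrument is the Wald estimator, which divides the instrument's effect on the outcome by its effect on screening uptake. The sketch below uses simulated data in which an unobserved factor drives both uptake and outcome; the data-generating process, the true effect of 2.0, and all variable names are assumptions made for illustration only.

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def wald_estimate(z, d, y):
    """Wald IV estimator for a binary instrument z, screening uptake d, outcome y."""
    y1 = mean(yi for zi, yi in zip(z, y) if zi == 1)
    y0 = mean(yi for zi, yi in zip(z, y) if zi == 0)
    d1 = mean(di for zi, di in zip(z, d) if zi == 1)
    d0 = mean(di for zi, di in zip(z, d) if zi == 0)
    return (y1 - y0) / (d1 - d0)


rng = random.Random(42)
TRUE_EFFECT = 2.0                        # assumed causal effect of screening

z, d, y = [], [], []
for _ in range(20_000):
    u = rng.random()                     # unobserved confounder
    zi = 1 if rng.random() < 0.5 else 0  # instrument, assigned at random
    di = 1 if u + 0.4 * zi > 0.6 else 0  # uptake depends on instrument and confounder
    yi = TRUE_EFFECT * di + 5.0 * u + rng.gauss(0, 1)
    z.append(zi)
    d.append(di)
    y.append(yi)

# Naive comparison of screened and unscreened individuals (selection bias)
naive = (mean(yi for di, yi in zip(d, y) if di == 1)
         - mean(yi for di, yi in zip(d, y) if di == 0))
iv = wald_estimate(z, d, y)
print(f"naive difference: {naive:.2f}, IV estimate: {iv:.2f}")
```

In this simulation the naive comparison overstates the effect because individuals with a high value of the unobserved factor are both more likely to be screened and have better outcomes, whereas the instrument-based estimate is close to the assumed true effect.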

Another approach to evaluate empirical, non-randomized screening data is to use a regression discontinuity design (Angrist & Pischke, 2015). This method estimates the effect of the introduction of screening on the health benefit, such as mortality, by estimating whether there is a discontinuity in trend after introduction of screening. Introduction of screening will be identified by a dummy variable, defined as 1 after screening and 0 before screening. One important assumption for using this method is a sharp introduction of screening, meaning that there was very little or no screening (including opportunistic screening) prior to the introduction of the program.
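A discontinuity of this kind can be estimated in several ways. The sketch below uses one simple variant, assumed here for illustration: a straight line is fitted separately to the periods before and after the introduction of screening, and the two fitted lines are compared at the time of introduction. The mortality series is hypothetical and noise-free.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def discontinuity_at(times, outcome, start):
    """Jump in the fitted outcome at `start`, the first period with screening."""
    pre = [(t, v) for t, v in zip(times, outcome) if t < start]
    post = [(t, v) for t, v in zip(times, outcome) if t >= start]
    slope_pre, intercept_pre = fit_line([t for t, _ in pre], [v for _, v in pre])
    slope_post, intercept_post = fit_line([t for t, _ in post], [v for _, v in post])
    return (intercept_post + slope_post * start) - (intercept_pre + slope_pre * start)


# Hypothetical mortality per 100,000: a steady downward trend, with an
# additional drop of 5 once screening is introduced in year 10
years = list(range(20))
mortality = [50 - 0.5 * t - (5 if t >= 10 else 0) for t in years]

jump = discontinuity_at(years, mortality, start=10)
print(f"estimated discontinuity: {jump:.2f}")
```

Fitting the trend on both sides separates the pre-existing decline in mortality from the shift that coincides with the introduction of screening.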

Lastly, the difference-in-differences (DID) approach can also be used to evaluate the benefit of screening (Angrist & Pischke, 2015). DID compares two groups before and after introduction of screening, where one group is allocated to screening and the other is not offered screening (e.g., DID can be applied in two different geographical areas). DID compares the trends in an outcome, such as mortality, before and after the introduction of screening. The health benefit of screening on mortality is measured as the interaction between being allocated to screening and the post-screening period (referred to as the DID estimator). A primary assumption of DID is that the trend in the two groups would have been similar if screening had not been introduced in one group.
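In its simplest form, the DID estimator is the change in the screened group minus the change in the comparison group. A minimal sketch, assuming hypothetical yearly mortality rates per 100,000 in two regions:

```python
def did_estimate(screened_pre, screened_post, comparison_pre, comparison_post):
    """Difference-in-differences of group means."""
    def avg(values):
        return sum(values) / len(values)

    change_screened = avg(screened_post) - avg(screened_pre)
    change_comparison = avg(comparison_post) - avg(comparison_pre)
    return change_screened - change_comparison


# Hypothetical mortality per 100,000, three years before and after introduction
screened_pre, screened_post = [30, 29, 28], [24, 23, 22]
comparison_pre, comparison_post = [32, 31, 30], [29, 28, 27]

effect = did_estimate(screened_pre, screened_post, comparison_pre, comparison_post)
print(f"DID estimate: {effect:.1f} per 100,000")
```

Mortality falls by 6 in the screened region and by 3 in the comparison region, so the DID estimate attributes a reduction of 3 per 100,000 to screening, provided the two regions would otherwise have followed similar trends.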

Decision-Analytic Models

In the context of evaluating screening programs, in which multiple competing strategies are evaluated and the health gains may not be observed for several decades, the use of decision-analytic modeling (simulation or mathematical models) may be the preferred approach. In contrast to statistical models, mathematical models are often designed to capture some underlying natural history process (e.g., progression from healthy to precancer to cancer; Figure 1B). No single clinical trial can capture all the short- and long-term health and economic consequences of all possible strategies needed to inform complex policy decisions, for example, surrounding the alternative preventive strategies for colorectal or cervical cancer screening. An advantage of modeling is the ability to synthesize available evidence (e.g., RCTs, observational studies, registry data) from multiple sources, extrapolate data beyond the time horizon of studies, reflect parameter and process uncertainty, and identify the most influential factors on the decision. As such, decision-analytic modeling allows evaluation of scenarios not evaluated in an RCT, such as other target groups, screening intervals or follow-up of screen-positive results.

As decisions related to health outcomes involve choices between risky or uncertain prospects, expected utility theory provides the foundation behind decision-analytic modeling (Neumann et al., 2016). Within the framework of expected utility theory, alternative pathways are characterized by numerical representations of the outcomes and the respective probability of achieving each outcome. The use of decision-analytic modeling in health policy is not without controversy, stemming from accusations of less transparent methods (Buxton et al., 1997) and strong assumptions. However, following recommendations for best practices (e.g., ISPOR-SMDM Modeling Good Research Practices) and using standardized reporting guidelines (Drummond, Sculpher, Torrance, O’Brien, & Stoddart, 2005; Husereau et al., 2013) instill greater confidence in model-based analyses. Subsequently, there is a growing dependence on decision-analytic methods to help guide drug and technology reimbursement decisions in, for example, England, Wales, Australia, and Scotland (Erntoft, 2011). The U.S. Preventive Services Task Force (USPSTF) has warned that “failure to use models to extrapolate from primary data can lead to greater errors than the models themselves would introduce” (Weinstein, Siegel, Gold, Kamlet, & Russell, 1996, p. 1257).
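The expected-value calculation that underlies a simple decision tree can be sketched as follows. Each strategy is represented as a list of pathways, each with a probability and an outcome; the disease probability, test sensitivity, and QALY values are hypothetical and serve only to show the mechanics.

```python
def expected_value(pathways):
    """pathways: list of (probability, outcome); probabilities must sum to 1."""
    total_probability = sum(p for p, _ in pathways)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("pathway probabilities must sum to 1")
    return sum(p * outcome for p, outcome in pathways)


# Hypothetical inputs
P_DISEASE = 0.02       # probability of developing the disease
SENSITIVITY = 0.80     # probability that screening detects it early

# Outcomes are expected QALYs on each pathway
no_screening = [
    (P_DISEASE, 10.0),                       # disease, detected through symptoms
    (1 - P_DISEASE, 20.0),                   # no disease
]
screening = [
    (P_DISEASE * SENSITIVITY, 16.0),         # disease, detected early
    (P_DISEASE * (1 - SENSITIVITY), 10.0),   # disease, missed by the test
    (1 - P_DISEASE, 20.0),                   # no disease
]

ev_no_screening = expected_value(no_screening)
ev_screening = expected_value(screening)
print(f"expected QALYs without screening: {ev_no_screening:.3f}")
print(f"expected QALYs with screening:    {ev_screening:.3f}")
```

The difference between the two expected values is the incremental health benefit that would enter the denominator of an ICER; costs can be attached to the same pathways and averaged in the same way.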

There are several types of decision-analytic models each defined by a set of specific characteristics. In a simplistic classification, simulation models can be static (i.e., do not allow for interactions between individuals) or dynamic (e.g., one individual can infect another individual), and they simulate a population at the cohort or individual level. Kuntz and colleagues (2013) provide a full overview of model taxonomy. The suitability of a particular model is governed by matching the attributes and characteristics of a particular type of model (Table 2) to the needs of the decision problem at hand (Roberts, Russell, Paltiel, Chambers, McEwan, & Krahn, 2012).

Table 2. Overview of Types of Models, Attributes and Examples

| Type of model (ISPOR-SMDM Modeling Good Research Practices reference) | Attributes and characteristics of model types | Strengths and weaknesses | Example application to screening program |
| --- | --- | --- | --- |
| Decision Trees (Siebert et al., 2012) | Diagrams the probability of events over a fixed time horizon | Appropriate for a short-time horizon | Abdominal aortic aneurysm (Ehlers et al., 2009) |
| State-transition Models (Siebert et al., 2012) | Simulates individuals or a cohort of individuals through a series of mutually exclusive and collectively exhaustive health states | Appropriate for a longer-time horizon, time-dependent transition probabilities, does not allow for interactions between individuals | Cervical cancer screening (Goldhaber-Fiebert et al., 2008) |
| Dynamic Transmission Models (Pitman, Nagy, & Sculpher, 2013) | Compartmentalizes individuals or a cohort of individuals by infection status | Appropriate for detailed analysis of disease transmission patterns | Chlamydia trachomatis (CT) screening program (De Vries et al., 2006) |
| Discrete Event Simulation Models (Karnon et al., 2012) | Analysis of individuals’ interactions with each other and/or within systems of constrained resources | Computationally efficient | Breast cancer screening (Wisconsin Breast Cancer Epidemiology Simulation Model; Fryback et al., 2006) |
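To make the state-transition entry in Table 2 more concrete, the sketch below runs a small cohort (Markov) model through mutually exclusive, collectively exhaustive health states. The four states, the annual transition probabilities, and the way screening is represented (an assumed yearly probability that a precancer is found and treated) are all hypothetical simplifications, not the structure of any published model.

```python
# Minimal cohort state-transition (Markov) sketch; all probabilities are hypothetical.
STATES = ["healthy", "precancer", "cancer", "dead"]


def transition_matrix(p_treat_precancer):
    """Annual transition probabilities; each row sums to 1.

    p_treat_precancer is an assumed yearly chance that screening finds and
    treats a precancer, returning the person to 'healthy'.
    """
    p_progress = 0.05  # precancer -> cancer (assumed)
    p_die = 0.01       # background mortality (assumed)
    return {
        "healthy": {"healthy": 0.97, "precancer": 0.02, "cancer": 0.0, "dead": p_die},
        "precancer": {
            "healthy": p_treat_precancer,
            "precancer": 1 - p_treat_precancer - p_progress - p_die,
            "cancer": p_progress,
            "dead": p_die,
        },
        "cancer": {"healthy": 0.0, "precancer": 0.0, "cancer": 0.85, "dead": 0.15},
        "dead": {"healthy": 0.0, "precancer": 0.0, "cancer": 0.0, "dead": 1.0},
    }


def run_cohort(matrix, cycles=50):
    """Return (life-years per person, final state distribution).

    No discounting and no half-cycle correction are applied in this sketch.
    """
    cohort = {"healthy": 1.0, "precancer": 0.0, "cancer": 0.0, "dead": 0.0}
    life_years = 0.0
    for _ in range(cycles):
        cohort = {
            to: sum(cohort[frm] * matrix[frm][to] for frm in STATES)
            for to in STATES
        }
        life_years += 1.0 - cohort["dead"]  # everyone alive accrues one year
    return life_years, cohort


ly_no_screen, _ = run_cohort(transition_matrix(0.0))
ly_screen, final = run_cohort(transition_matrix(0.5))
life_years_gained = ly_screen - ly_no_screen
```

Because the cohort is tracked as proportions rather than as individuals, this form cannot represent interactions between people, which is the limitation the table attributes to state-transition models.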

The selection of model structure and input parameters requires a multidisciplinary approach (e.g., decision analysts, clinicians, and epidemiologists). Importantly, those input parameters that inform transitions between natural history health states (e.g., progression to or from cervical precancer) should be informed using comprehensive literature reviews to guard against biased results. For transitions that are often unobservable (e.g., progression from cervical precancer to invasive cancer) or may vary from setting to setting, calibration (model fitting or dependent validation) can be used. Calibration involves a multi-step process to specify the value of an input parameter that corresponds to, or generates a good fit to, available empirical data (i.e., a calibration target) (Vanni et al., 2011). For example, due to variations in sexual behavior across settings, the incidence and clearance of a sexually transmitted infection such as human papillomavirus (HPV) can be fit to observed data on HPV prevalence (Campos et al., 2014). A goodness-of-fit measure, of which several exist (Vanni et al., 2011), is used to evaluate the fit of model outcomes to the observed data. Bayesian or approximate Bayesian calibration methods are also gaining traction (e.g., Menzies, Soeteman, Pandya, & Kim, 2017).
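The calibration idea can be sketched as a simple grid search. The example below assumes a deliberately crude two-state infection model (uninfected or infected), three made-up prevalence targets, and a sum-of-squared-errors goodness-of-fit measure; real calibrations use richer models, many more targets, and often likelihood-based or Bayesian methods.

```python
# Grid-search calibration sketch; the model, targets, and search ranges are hypothetical.
def simulate_prevalence(incidence, clearance, years):
    """Infection prevalence after each year in a two-state cohort starting uninfected."""
    prevalence, out = 0.0, []
    for _ in range(years):
        prevalence = prevalence * (1 - clearance) + (1 - prevalence) * incidence
        out.append(prevalence)
    return out


# Assumed calibration targets: "observed" prevalence at years 1, 3, and 10.
targets = {1: 0.05, 3: 0.1036, 10: 0.1409}


def sum_squared_error(incidence, clearance):
    """Goodness of fit: smaller values mean model output is closer to the targets."""
    modeled = simulate_prevalence(incidence, clearance, max(targets))
    return sum((modeled[year - 1] - obs) ** 2 for year, obs in targets.items())


# Candidate values on a coarse grid (0.01 to 0.50) for both unobserved parameters.
grid = [i / 100 for i in range(1, 51)]
sse, incidence_hat, clearance_hat = min(
    ((sum_squared_error(inc, clr), inc, clr) for inc in grid for clr in grid),
    key=lambda candidate: candidate[0],
)
```

A single best-fitting pair is reported here for brevity; in practice analysts often retain a set of well-fitting parameter combinations so that calibration uncertainty carries through to the results.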

Confidence in the model’s projections can be strengthened through an independent validation process that involves comparing model outputs to: (1) other simulation models; (2) retrospective data not used to inform model inputs or calibration; and (3) prospective data (Eddy, Hollingworth, Caro, Tsevat, McDonald, & Wong, 2012). Convergent validity compares two or more models aimed at answering similar questions and reconciles reasons why different simulation models may come to different conclusions. A primary example of comparative modeling is the Cancer Intervention and Surveillance Modeling Network (CISNET). CISNET is a simulation modeling consortium aimed at improving the understanding of important factors related to cancer control strategies by comparing multiple simulation models for six cancer sites: breast, cervical, colorectal, esophageal, lung, and prostate. Ongoing work of CISNET teams involves independent model validation against forthcoming RCTs (Rutter et al., 2016).

Finally, the ISPOR-SMDM Modeling Task Force (Eddy et al., 2012) emphasizes the need for transparent reporting of model structure, inputs, model-specific methods, and model validation, often in the form of technical appendices, which can include supplementary results. Appendices also provide the required space to include standardized checklists for economic evaluation, such as the Consolidated Health Economic Evaluation Reporting Standards (CHEERS) checklist and the impact inventory. Calibrating decision-analytic models to empirical data, independently validating the models to external data, and reporting detailed methods and results transparently comprise a few of the necessary steps required to inform health policy decisions.

Uncertainty

When evaluating screening interventions, the amount of evidence may be large, small, or nonexistent, and there may be uncertainty about effectiveness and long-term outcomes of screening technologies. A challenge of screening evaluation is to synthesize and assess all available evidence and evaluate the impact of uncertainty on outcomes, even when extensive model-based calibration approaches have been applied. When using economic evaluation and decision-analytic modeling to evaluate screening, there are multiple types of uncertainty analyses that can be conducted to help inform decision-making. Specifically, the ISPOR-SMDM Task Force distinguishes between four main types of uncertainty for decision-analytic modeling: parameter uncertainty, stochastic uncertainty, heterogeneity, and structural uncertainty (Briggs, Weinstein, Fenwick, Karnon, Sculpher, & Paltiel, 2012).

First, parameter uncertainty refers to the uncertainty stemming from estimation of model parameters that are inherently uncertain, such as the probability of experiencing a specific event or the accuracy of a diagnostic test. When incorporating multiple sources of data, these parameters may take a range of different values that should be accounted for when evaluating screening strategies. For diagnostic accuracy estimates, two common biases should be considered: (1) spectrum bias, occurring due to different disease severity of different populations; and (2) verification bias, occurring if the gold standard test has not been performed in test-negative individuals. Estimates of diagnostic accuracy that are adjusted for these biases may differ from unadjusted estimates (e.g., Mayrand et al., 2007).

Second, variability between individuals or sub-groups of individuals can either be attributed to overall random variability (referred to as stochastic uncertainty) or to individual characteristics such as age, gender, socioeconomic status and other indicators of disease risk (referred to as heterogeneity). Considering heterogeneity is essential for screening evaluations in order to determine the optimal target population for the screening program.

Lastly, for screening evaluations that rely on decision-analytic modeling, structural uncertainty relates to the structural assumptions of the model. One example is the conceptualization of the natural history of the disease that is being screened for (i.e., which health states are included in the model and how individuals can transition between the health states). Although cost-effectiveness results may differ considerably between alternative structural assumptions (Le, 2016), structural uncertainty is not usually formally quantified as part of the analysis because of the programming and computation time required to perform such analyses.

To assess the impact of these different types of uncertainties as part of the screening evaluation, a range of sensitivity analyses can be performed. A simple approach is to perform deterministic sensitivity analysis and manually vary one, two, or more parameters at a time (referred to as one-way, two-way, and multi-way sensitivity analysis, respectively), while holding all other parameters constant. An alternative approach is to vary all parameters simultaneously by assigning predefined probability distributions (e.g., beta, gamma) to each parameter and sampling multiple sets of parameter values (referred to as probabilistic sensitivity analysis, PSA). Model output can then be reported as the mean values across all parameter sets with uncertainty bounds. The ISPOR-SMDM Modeling Task Force states that:

a model-based analysis value lies not simply in its ability to generate a precise point estimate for a specific outcome but also in the systematic examination and responsible reporting of uncertainty surrounding this outcome and the ultimate decision being addressed.

(Briggs et al., 2012, p. 722)
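The two approaches to sensitivity analysis described above can be sketched with Python's standard library. The outcome function, the base-case values, and the beta and gamma distributions below are illustrative assumptions only; in a real analysis the distributions would be derived from the evidence behind each parameter.

```python
import random
import statistics


# Toy outcome model; the formula and every input value are illustrative assumptions.
def cost_per_person(p_positive, cost_test, cost_followup):
    """Expected screening cost: everyone is tested, screen-positives get follow-up."""
    return cost_test + p_positive * cost_followup


base = {"p_positive": 0.10, "cost_test": 50.0, "cost_followup": 400.0}

# One-way deterministic sensitivity analysis: vary one input, hold the rest at base case.
one_way = {
    value: cost_per_person(**{**base, "p_positive": value})
    for value in (0.05, 0.10, 0.20)
}

# Probabilistic sensitivity analysis: draw all inputs jointly from assumed distributions.
rng = random.Random(2019)  # fixed seed so the sketch is reproducible
draws = []
for _ in range(5000):
    sample = {
        "p_positive": rng.betavariate(10, 90),        # beta with mean 0.10
        "cost_test": rng.gammavariate(25, 2.0),       # gamma with mean 25 * 2 = 50
        "cost_followup": rng.gammavariate(16, 25.0),  # gamma with mean 16 * 25 = 400
    }
    draws.append(cost_per_person(**sample))

# Report the mean across parameter sets with a simple percentile-based 95% interval.
draws.sort()
psa_mean = statistics.fmean(draws)
interval_95 = (draws[int(0.025 * len(draws))], draws[int(0.975 * len(draws)) - 1])
```

Beta distributions are a common choice for probabilities because they are bounded between 0 and 1, and gamma distributions for costs because they are non-negative and right-skewed; the specific shape parameters here are arbitrary.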

Using the results of PSA, analysts can calculate the probability that each screening strategy is cost-effective at a given cost-effectiveness threshold (i.e., cost-effectiveness acceptability curves) and quantify the value of acquiring additional information (i.e., expected value of perfect information).
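A hedged sketch of both calculations follows. The PSA output is simulated from assumed normal distributions for three hypothetical strategies (the names, means, and standard deviations are invented), and strategies are compared on net monetary benefit, that is, the threshold multiplied by the health effect minus the cost.

```python
import random

# Hypothetical PSA output: per-draw (cost, life-years) for each strategy.
rng = random.Random(42)
N_DRAWS = 2000
MEANS = {  # assumed mean cost and mean life-years per person
    "no_screening": (0.0, 20.000),
    "screen_5_yearly": (300.0, 20.020),
    "screen_3_yearly": (500.0, 20.025),
}
psa = []
for _ in range(N_DRAWS):
    draw = {}
    for name, (mean_cost, mean_ly) in MEANS.items():
        cost = rng.gauss(mean_cost, 30.0) if mean_cost > 0 else 0.0
        draw[name] = (cost, rng.gauss(mean_ly, 0.005))
    psa.append(draw)


def net_benefit(cost, effect, threshold):
    """Net monetary benefit: health valued at the threshold, minus cost."""
    return threshold * effect - cost


def ceac_point(threshold):
    """Share of draws in which each strategy has the highest net benefit."""
    wins = dict.fromkeys(MEANS, 0)
    for draw in psa:
        best = max(draw, key=lambda s: net_benefit(*draw[s], threshold))
        wins[best] += 1
    return {s: n / N_DRAWS for s, n in wins.items()}


def evpi(threshold):
    """Per-person expected value of perfect information at one threshold."""
    mean_of_max = sum(
        max(net_benefit(*draw[s], threshold) for s in MEANS) for draw in psa
    ) / N_DRAWS
    max_of_mean = max(
        sum(net_benefit(*draw[s], threshold) for draw in psa) / N_DRAWS
        for s in MEANS
    )
    return mean_of_max - max_of_mean


# One acceptability-curve point per threshold (monetary units per life-year).
curve = {t: ceac_point(t) for t in (0, 10_000, 30_000, 100_000)}
evpi_at_30k = evpi(30_000)
```

Plotting the values in `curve` against the threshold gives the acceptability curve for each strategy. The expected value of perfect information is the gap between choosing the best strategy in every draw and choosing the single strategy that is best on average.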

Challenges of Screening Evaluation

Screening is a complex system that usually involves multiple stakeholders and multiple decisions to be made along the screening pathway; as such, comprehensive screening evaluation is a challenging task.

The preceding discussion of screening evaluation has considered only the health benefits and costs of screening. The optimal use of resources depends on the insurer’s objective function (i.e., a composite of factors the insurer seeks to achieve). The objective function for an insurer in the healthcare sector may include measures such as population health (e.g., mortality and morbidity), profit, and equity. Consequently, optimal resource allocation in the sector depends on how new interventions affect the insurer’s objective function relative to the cost of the new intervention.

Multiple aspects need to be addressed when evaluating screening interventions, which could be part of the insurer’s objective function. Certain challenges unique to screening evaluation are important to consider, including harm-benefit trade-offs, adverse lifestyle effects of screening, the participation rate, the heterogeneity of the target population, and “prevention versus cure.”

Harm-Benefits

As mentioned previously, all screening programs should ensure that the benefits of screening outweigh the harms, thus the harm-benefit trade-off should be considered alongside the feasibility and cost-effectiveness of a program prior to implementation. Relevant harm-benefit outcomes associated with a particular screening strategy should be identified by stakeholders and quantified (e.g., using decision-analytic modeling). For example, a study evaluating trade-offs in health benefits and resource use associated with candidate cervical cancer screening strategies quantified resource use or “harms” as the number of referrals for diagnostic colposcopy with biopsy (a semi-invasive procedure required to confirm presence of high-grade precancer), while the reduction in cancer incidence associated with each strategy represented health benefits (Burger, Pedersen, Sy, Kristiansen, & Kim, 2017). In addition to quantifying relevant benefit and harm outcomes in absolute terms, some studies have used a framework similar to the one used in CEA to evaluate incremental harm-benefit ratios (a metric analogous to the ICER). For example, a study evaluating candidate cervical cancer screening strategies calculated incremental harm-benefit ratios defined as the additional number of colposcopy referrals per additional precancers detected (Pedersen et al., 2015). Similarly, a study analyzing alternative breast cancer screening strategies evaluated incremental harm-benefit ratios defined as the additional number of false-positive findings per additional life-year gained (Van Ravesteyn et al., 2012). Quantifying such benefit-harm outcomes and metrics may help inform the non-economic efficiency of a strategy, and may help decision-makers and individuals alike decide which strategies are optimal given their preferences for harms versus benefits of screening.
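The incremental harm-benefit ratio can be computed in the same way as an ICER, with a harm measure in place of cost. The sketch below uses three hypothetical strategies with invented counts of colposcopy referrals (harm) and precancers detected (benefit) per 1,000 women screened; it assumes that dominated strategies have already been removed.

```python
# Hypothetical strategies: (name, colposcopy referrals, precancers detected) per 1,000 women.
strategies = [
    ("A", 40.0, 10.0),
    ("B", 60.0, 14.0),
    ("C", 100.0, 16.0),
]


def incremental_harm_benefit(rows):
    """Additional harm per additional unit of benefit, comparing each strategy
    with the next less beneficial one. Assumes no dominated strategies remain."""
    ordered = sorted(rows, key=lambda r: r[2])  # order by increasing benefit
    ratios = {}
    for (_, harm0, ben0), (name, harm1, ben1) in zip(ordered, ordered[1:]):
        ratios[name] = (harm1 - harm0) / (ben1 - ben0)
    return ratios


ihbr = incremental_harm_benefit(strategies)
```

With these invented inputs, moving from A to B costs 5 extra referrals per extra precancer detected, while moving from B to C costs 20, which illustrates how the ratio can rise steeply as strategies become more intensive.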

Adverse Lifestyle Effects of Screening

An evaluation of screening from a societal perspective aims to quantify all consequences of screening. One feature of screening programs addressed in the literature is the “Health Certificate Effect” (Larsen, Grotmol, Almendingen, & Hoff, 2007; van der Aalst, van Klaveren, & de Koning, 2010). The “Health Certificate Effect” is a result of participants’ misinterpretation of the screening result, with a consequence for lifestyle decisions. For instance, if a negative test result is misinterpreted as a verification of being in perfect health, fewer measures may be taken to prevent future illness (allocative inefficiency of health production). The “Health Certificate Effect” can be interpreted within the framework of Bayesian updating of disease probabilities from test results. Several studies have identified problems with understanding such information among both health professionals and the general population (Bramwell, West, & Salmon, 2006; Whiting et al., 2015). In an RCT of colorectal cancer screening, the percentage with a lifestyle-related disease was greater among the group invited to screening than in the control group, which provides support for the “Health Certificate Effect” (Aas, Iversen, & Hoff, 2017). In addition, lifestyle changes could potentially influence overall survival. In order to account for the “Health Certificate Effect” when evaluating the cost-effectiveness of screening, unrelated healthcare costs need to be included in addition to related healthcare costs (Neumann et al., 2016).
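The Bayesian updating mentioned above can be illustrated with a short calculation of the post-test probability of disease. The prevalence, sensitivity, and specificity used here are illustrative assumptions, not values for any particular screening test.

```python
def post_test_probability(prevalence, sensitivity, specificity, test_positive):
    """Probability of disease after a test result, by Bayes' theorem."""
    if test_positive:
        true_pos = prevalence * sensitivity
        false_pos = (1 - prevalence) * (1 - specificity)
        return true_pos / (true_pos + false_pos)
    false_neg = prevalence * (1 - sensitivity)
    true_neg = (1 - prevalence) * specificity
    return false_neg / (false_neg + true_neg)


# Illustrative inputs only: 1% prevalence, 80% sensitivity, 95% specificity.
p_after_negative = post_test_probability(0.01, 0.80, 0.95, test_positive=False)
p_after_positive = post_test_probability(0.01, 0.80, 0.95, test_positive=True)
```

Under these assumptions a negative result lowers the probability of disease below the pre-test 1% but does not reduce it to zero, and it says nothing about other conditions, which is why reading a negative test as a certificate of perfect health is a misinterpretation.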

Participation

Evaluating the cost-effectiveness of screening within a model-based framework often relies on specific assumptions, for example, assumptions around participation in the program. Data from trials, implementation studies, and quality indicators of screening programs show that 100% participation is rarely achieved (Aas, 2009; Elfström, Arnheim-Dahlström, von Karsa, & Dillner, 2015). Whether or not imperfect screening participation affects the cost-effectiveness of screening depends on the characteristics of those not participating compared to those participating. A specific challenge is related to the healthy screenee bias, a concept suggesting that those who choose to participate in screening are those at the lowest risk of having the disease or condition. For example, in a Danish study, cervical cancer screening participants had lower all-cause mortality rates than non-participants (Kranse et al., 2013). In a Norwegian study, the frequency of lifestyle-related diseases was substantially higher among non-participants (Aas et al., 2017). Higher risk among non-participants could be captured in simulation models by applying information based on individual-level data or by exploring different risk profiles in sensitivity analysis. Individual-level data allow for estimation of the correlation between risk profiles, survival, and costs. The net effect of differing risk profiles on the cost-effectiveness of screening is ambiguous. For example, if the risk of the disease screened for is higher among the non-participants, increasing participation would increase life-years gained; however, if background mortality is higher among the non-participants, the expected health gain of screening among non-participants is lower than among the participants. Finally, if increased risk of the disease being screened for and/or disease-specific mortality is associated with costs of screening (adverse events), treatment, and follow-up, the incremental cost of screening will likely be affected.

Socioeconomic status is also positively related to participation in some screening programs (Aas, 2009; Marlow, Chorley, Haddrell, Ferrer, & Waller, 2017; McCaffery, Wardle, Nadel, & Atkin, 2002; Petersen, 2002; Vernon, 1997). Individuals with low socioeconomic status are often less likely to participate in screening, and if this is due to a lack of knowledge of the program, it may contribute to increasing inequities in healthcare, contrary to the equity principle on which healthcare systems in many countries are founded. A consequence of bias in participation is a negative distributional effect on health in the population. To identify the distributional effect, methods have been developed to balance distributional and efficiency effects of screening (Asaria, Griffin, Cookson, Whyte, & Tappenden, 2015). In an evaluation of increased participation in the bowel cancer screening program in the United Kingdom, three interventions were compared to no screening: standard screening, a targeted reminder in deprived areas, and a universal reminder. From a cost-effectiveness standpoint, a universal reminder was the preferred alternative, while a targeted reminder was the optimal choice when the aim was to minimize unfair distribution. Several measures of inequality were applied in the article.

Prevention Versus Cure

The demand for a preventive service, such as screening, is closely but complexly related to cure (see Figure 1). Important considerations for insurers include: (1) the impact of the price of a screening method on the demand for screening; (2) the influence of developments in screening technology on the demand for screening; and (3) whether technological developments in treatment could affect the demand for preventive services (discussed in Hey and Patel’s (1983) study of the interaction between prevention and cure). Hey and Patel focus on the “appropriate allocation of expenditure in healthcare” (p. 119) and present a two-state model (health and sickness), where the movements between the states do not depend on time, only on the consumption in each health state, prevention, or cure. Optimal resource allocation of prevention is defined where the marginal cost of prevention is equal to the marginal utility of prevention. Within the framework suggested by Hey and Patel, it is possible to evaluate how changes in prices and technology affect the demand for the preventive service (such as screening) and cure (treatment). Hence, with continuous developments in both treatment and screening technologies, re-evaluations of the benefits, harms, and the economic costs of a screening program are required.

When new and better screening methods become available, the increased health gain has to be evaluated against the costs. When the treatment of a disease improves, the existing screening program needs to be re-evaluated, as screening will provide a lower incremental health gain. Furthermore, the introduction of new preventive technology, such as HPV vaccines for cervical cancer (Figure 1B), will change the existing screening program, and new evaluations of cost-effectiveness are warranted (Burger et al., 2012; Kim, Burger, Sy, & Campos, 2016; Pedersen et al., 2018).

Heterogeneity

Evaluation of a population-based screening program includes estimates of expected health benefits and costs for the whole population. Offering screening to sub-groups in which screening is not cost-effective would be an inefficient use of resources. For screening methods aimed at preventing future disease, life expectancy without screening should be longer than the expected time period the individual would have to live before benefiting from screening. Braithwaite (2011) addressed this issue in screening for aortic aneurysm and colorectal cancer by comparing the risk of complications during screening (short-term harms) with the long-term benefit of screening, and comparing this scenario with life expectancy without screening. The risk of dying due to screen-related complications and the life expectancy without screening were important components in the recommendation of screening for specific sub-groups. These individual attributes may not be apparent when evaluating a screening program at the population level.

Screening for colorectal cancer is a population-based intervention where patient heterogeneity, except for age, is generally not considered in decision analyses but may be important (Lansdorp-Vogelaar, Knudsen, & Brenner, 2011). For this cancer type, it is known that several sub-groups are at increased risk of developing and dying from colorectal cancer due to the presence of certain comorbidities (e.g., diabetes, obesity, and inflammatory bowel disease), lifestyle-related factors (e.g., smoking and heavy alcohol use), and family history and genetic predisposition (American Cancer Society, 2016). In addition, the risk of screening-related adverse events may differ among patient sub-groups (Warren, 2009). Some studies have considered personal characteristics in determining the optimal stopping age for screening (Dinh, Alperin, Walter, & Smith, 2012; Lansdorp-Vogelaar et al., 2014; Lansdorp-Vogelaar et al., 2009; Van Hees et al., 2015). A recent study identified a differential effect of screening on health outcomes by gender (Holme et al., 2018). Compared to an evaluation for the entire population, differences in relative risk between genders would influence the cost-effectiveness and may warrant screening women and men differently.

Personalized Screening

In contrast to the disease simulation models utilized for cost-effectiveness analyses to address population-level policies, clinical prediction models are a tool used to assess personalized risk at the individual level, and have been successfully applied to a range of medical conditions such as coronary heart disease (Genders, Steyerberg, & Hunink, 2012), osteoporosis (Collins, Mallett, & Altman, 2011), type-2 diabetes (Abbasi et al., 2012), and various types of cancer (e.g., Beane et al., 2008). Within cervical cancer screening, it has been proposed (Castle et al., 2007) that women should not return to routine screening if the risk of high-grade precancer exceeds 2% before the next screening round; as such, a formal tool to help clinicians assess the risk of precancer and cancer is needed in clinical practice. Similarly, for other screening programs more personalized screening algorithms are currently being discussed (Assured project, Seibert et al., 2018). However, as algorithms become more personalized, the monetary costs of such strategies may also increase, necessitating full evaluations of these personalized approaches.

Concluding Remarks

Evaluation of medical screening programs requires robust and complex methods that should capture the unique properties of the disease and the screening approach, including multiple screening strategies: the screening target ages, the screening frequency, and how to follow up screen-positive individuals (the triage algorithm). In addition, capturing all potential health benefits, harms, and monetary costs of a screening program requires a lifetime horizon. While critical to inform clinical effectiveness of screening programs, randomized trials often do not capture all necessary components of screening evaluation. Alternatively, constructing decision-analytic models, which incorporate empirical data, is the preferred method for a more comprehensive evaluation of a screening program and can be used to inform the cost-effectiveness of screening. However, while cost-effectiveness is one critical component of the decision-making process, other factors should be considered alongside cost-effectiveness analyses prior to implementation of medical screening.

References

Aas, E. (2009). Pecuniary compensation increases participation in screening for colorectal cancer. Health Economics, 18(3), 337–354.

Aas, E., Iversen, T., & Hoff, G. (2017). The effect of education on health behavior after screening for colorectal cancer. In K. Bolin (Ed.), Human capital and health behavior (pp. 207–242). Bingley, U.K.: Emerald Publishing.

Abbasi, A., Peelen, L. M., Corpeleijn, E., van der Schouw, Y. T., Stolk, R. P., Spijkerman, A. M. W., . . . Beulens, J. W. J. (2012). Prediction models for risk of developing type 2 diabetes: Systematic literature search and independent external validation study. BMJ, 345, e5900.

American Cancer Society (2016). Colorectal cancer.

Andermann, A., Blancquaert, I., Beauchamp, S., & Déry, V. (2008). Revisiting Wilson and Jungner in the genomic age: A review of screening criteria over the past 40 years. Bulletin of the World Health Organization, 86(4), 241–320.

Angrist, J. D., & Pischke, J. S. (2015). Mastering metrics: The path from cause to effect. Princeton, NJ: Princeton University Press.

Arbyn, M., Anttila, A., Jordan, J., Ronco, G., Schenck, U., Segnan, N., . . . von Karsa, L. (2010). European guidelines for quality assurance in cervical cancer screening. Second edition—summary document. Annals of Oncology, 21(3), 448–458.

Asaria, M., Griffin, S., Cookson, R., Whyte, S., & Tappenden, P. (2015). Distributional cost-effectiveness analysis of health care programmes—A methodological case study of the UK Bowel Cancer Screening Programme. Health Economics, 24(6), 742–754.

Beane, J., Sebastiani, P., Whitfield, T. H., Steiling, K., Dumas, Y.-M., Lenburg, M. E., & Spira, A. (2008). A prediction model for lung cancer diagnosis that integrates genomic and clinical features. Cancer Prevention Research, 1, 56–64.

Braithwaite, R. S. (2011). Can life expectancy and QALYs be improved by a framework for deciding whether to apply clinical guidelines to patients with severe comorbid disease? Medical Decision Making, 31(4), 582–595.

Bramwell, R., West, H., & Salmon, P. (2006). Health professionals’ and service users’ interpretation of screening test results: Experimental study. BMJ, 333, 284.

Brewer, N. T., Salz, T., & Lillie, S. E. (2007). Systematic review: The long-term effects of false-positive mammograms. Annals of Internal Medicine, 146(7), 502–510.

Briggs, A. H., Weinstein, M. C., Fenwick, E. A., Karnon, J., Sculpher, M. J., & Paltiel, A. D. (2012). Model parameter estimation and uncertainty: A report of the ISPOR-SMDM modeling good research practices task force working group-6. Value in Health, 15(6), 835–842.

Burger, E. A., Ortendahl, J. D., Sy, S., Kristiansen, I. S., & Kim, J. J. (2012). Cost-effectiveness of cervical cancer screening with primary human papillomavirus testing in Norway. British Journal of Cancer, 106(9), 1571–1578.

Burger, E. A., Pedersen, K., Sy, S., Kristiansen, I. S., & Kim, J. J. (2017). Choosing wisely: A model-based analysis evaluating the trade-offs in cancer benefit and diagnostic referrals among alternative HPV testing strategies in Norway. British Journal of Cancer, 117(6), 783–790.

Buxton, M. J., Drummond, M. F., Van Hout, B. A., Prince, R. L., Sheldon, T. A., Szucs, T., & Vray, M. (1997). Modelling in economic evaluation: An unavoidable fact of life. Health Economics, 6(3), 217–227.

Canadian Agency for Drugs and Technologies in Health (CADTH) (2017). Guidelines for economic evaluation of health technologies. Ottawa, Canada.

Campos, N. G., Burger, E. A., Sy, S., Sharma, M., Schiffman, M., Rodriguez, A. C., Hildesheim, A., Herrero, R., & Kim, J. J. (2014). An updated natural history model of cervical cancer: Derivation of model parameters. American Journal of Epidemiology, 180(5), 545–555.

Castle, P. E., Sideri, M., Jeronimo, J., Solomon, D., & Schiffman, M. (2007). Risk assessment to guide the prevention of cervical cancer. American Journal of Obstetrics and Gynecology, 197(4), 356.e1–356.e6.

Collins, G. S., Mallett, S., & Altman, D. G. (2011). Predicting risk of osteoporotic and hip fracture in the United Kingdom: Prospective independent and external validation of QFractureScores. BMJ, 342, d3651.

De Vries, R., Van Bergen, J. E., De Jong-van den Berg, L. T. W., & Postma, M. J. (2006). Systematic screening for Chlamydia trachomatis: Estimating cost-effectiveness using dynamic modeling and Dutch data. Value in Health, 9(1), 1–11.

Dinh, T. A., Alperin, P., Walter, L. C., & Smith, R. (2012). Impact of comorbidity on colorectal cancer screening cost-effectiveness study in diabetic populations. Journal of General Internal Medicine, 27(6), 730–738.

Dobrow, M. J., Hagens, V., Chafe, R., Sullivan, T., & Rabeneck, L. (2018). Consolidated principles for screening based on a systematic review and consensus process. Canadian Medical Association Journal, 190(14), E422–E429.

Drummond, M., Sculpher, M. J., Torrance, G. W., O’Brien, B. J., & Stoddart, G. L. (2005). Methods for economic evaluation in health care programmes (3rd ed.). Oxford, U.K.: Oxford University Press.

Dutch National Health Care Institute (2016). Guidelines for the conduct of economic evaluation in Health Care. Diemen, The Netherlands.

Eddy, D. M., Hollingworth, W., Caro, J. J., Tsevat, J., McDonald, K. M., & Wong, J. B. (2012). Model transparency and validation: A report of the ISPOR-SMDM modeling good research practices task force working group-7. Value in Health, 15(6), 843–850.

Ehlers, L., Overvad, K., Sørensen, J., Christensen, S., Bech, M., & Kjølby, M. (2009). Analysis of cost effectiveness of screening Danish men aged 65 for abdominal aortic aneurysm. BMJ, 338, b2243.

Elfström, K. M., Arnheim-Dahlström, L., von Karsa, L., & Dillner, J. (2015). Cervical cancer screening in Europe: Quality assurance and organisation of programmes. European Journal of Cancer, 51(8), 950–968.

Erntoft, S. (2011). Pharmaceutical priority setting and the use of health economic evaluations: A systematic literature review. Value in Health, 14(4), 587–599.

Etzioni, R., Penson, D. F., Legler, J. M., di Tommaso, D., Boer, R., Gann, P. H., & Feuer, E. J. (2002). Overdiagnosis due to prostate-specific antigen screening: Lessons from U.S. prostate cancer incidence trends. Journal of the National Cancer Institute, 94(13), 981–990.

Fryback, D. G., Stout, N. K., Rosenberg, M. A., Trentham-Dietz, A., Kuruchittham, V., & Remington, P. L. (2006). Chapter 7: The Wisconsin breast cancer epidemiology simulation model. JNCI Monographs, 2006(36), 37–47.

Gareen, I. F., Duan, F., Greco, E. M., Snyder, B. S., Boiselle, P. M., Park, E. R., Fryback, D., & Gatsonis, C. (2014). Impact of lung cancer screening results on participant health-related quality of life and state anxiety in the national lung screening trial. Cancer, 120(21), 3401–3409.

Genders, T. S., Steyerberg, E. W., & Hunink, M. G. (2012). Prediction model to estimate presence of coronary artery disease: Retrospective pooled analysis of existing cohorts. BMJ, 344, e3485.

Glick, H. A., Doshi, J. A., Sonnad, S. S., & Polsky, D. (2007). Economic evaluation in clinical trials. Oxford, U.K.: Oxford University Press.

Goldhaber-Fiebert, J. D., Stout, N. K., Salomon, J. A., Kuntz, K. M., & Goldie, S. J. (2008). Cost-effectiveness of cervical cancer screening with human papillomavirus DNA testing and HPV-16, 18 vaccination. Journal of the National Cancer Institute, 100(5), 308–320.

Habbema, D., Weinmann, S., Arbyn, M., Kamineni, A., Williams, A. E., de Kok, I. M., . . . Brown, M. (2017). Harms of cervical cancer screening in the United States and the Netherlands. International Journal of Cancer, 140(5), 1215–1222.

Henderson, J. T., Webber, E. M., & Sawaya, G. F. (2018). Screening for ovarian cancer. Updated evidence report and systematic review for the US preventive services task force. Journal of the American Medical Association, 319(6), 595–606.

Hey, J. D., & Patel, M. S. (1983). Prevention and cure? Or: Is an ounce of prevention better than a pound of cure? Journal of Health Economics, 2(2), 119–138.

Holland, W. W., Stewart, S., & Masseria, C. (2006). Policy brief—Screening in Europe. World Health Organization on behalf of the European Observatory on Health Systems and Policies.

Holme, Ø., Løberg, M., Kalager, M., Bretthauer, M., Hernan, M. A., Aas, E., . . . Hoff, G. (2018). Long-term effectiveness of sigmoidoscopy screening on colorectal cancer incidence and mortality in women and men: A randomized trial. Annals of Internal Medicine.

Husereau, D., Drummond, M., Petrou, S., Carswell, C., Moher, D., Greenberg, D., . . . Loder, E. (2013). Consolidated health economic evaluation reporting standards (CHEERS) statement. BMC Medicine, 11, 80.

Jørgensen, K. J., Gøtzsche, P. C., Kalager, M., & Zahl, P. H. (2017). Breast cancer screening in Denmark: A cohort study of tumor size and overdiagnosis. Annals of Internal Medicine, 166(5), 313–323.

Kalager, M., Adami, H.-O., Bretthauer, M., & Tamimi, R. (2012). Overdiagnosis of invasive breast cancer due to mammography screening: Results from the Norwegian screening program. Annals of Internal Medicine, 156(7), 491–499.

Karnon, J., Stahl, J., Brennan, A., Caro, J. J., Mar, J., & Möller, J. (2012). Modeling using discrete event simulation: A report of the ISPOR-SMDM Modeling Good Research Practices Task Force-4. Medical Decision Making, 32(5), 701–711.

Kim, J. J., Burger, E. A., Sy, S., & Campos, N. G. (2016). Optimal cervical cancer screening in women vaccinated against human papillomavirus. Journal of the National Cancer Institute, 109(2), djw216.

Kranse, R., van Leeuwen, P. J., Hakulinen, T., Hugosson, J., Tammela, T. L., Ciatto, S., . . . Schröder, F. H. (2013). Excess all-cause mortality in the evaluation of a screening trial to account for selective participation. Journal of Medical Screening, 20(1), 39–45.

Kuntz, K., Sainfort, F., Butler, M., Taylor, B., Kulasingam, S., Gregory, S., . . . Kane, R. L. (2013). Decision and simulation modeling alongside systematic reviews. Rockville, MD: Agency for Healthcare Research and Quality.

Lansdorp-Vogelaar, I., Gulati, R., Mariotto, A. B., Schechter, C. B., de Carvalho, T. M., Knudsen, A. B., . . . Mandelblatt, J. S. (2014). Personalizing age of cancer screening cessation based on comorbid conditions: Model estimates of harms and benefits. Annals of Internal Medicine, 161(2), 104–112.

Lansdorp-Vogelaar, I., Knudsen, A. B., & Brenner, H. (2011). Cost-effectiveness of colorectal cancer screening. Epidemiologic Reviews, 33(1), 88–100.

Lansdorp-Vogelaar, I., Van Ballegooijen, M., Zauber, A. G., Boer, R., Wilschut, J., Winawer, S. J., & Habbema, J. D. F. (2009). Individualizing colonoscopy screening by sex and race. Gastrointestinal Endoscopy, 70(1), 96–108.

Larsen, I. K., Grotmol, T., Almendingen, K., & Hoff, G. (2007). Impact of colorectal cancer screening on future lifestyle choices: A three-year randomized controlled trial. Clinical Gastroenterology and Hepatology, 5(4), 477–483.

Le, Q. A. (2016). Structural uncertainty of Markov models for advanced breast cancer: A simulation study of Lapatinib. Medical Decision Making, 36(5), 629–640.

Lin, J. S., Piper, M. A., Perdue, L. A., Rutter, C. M., Webber, E. M., O’Connor, E., . . . Whitlock, E. P. (2016). Screening for colorectal cancer: Updated evidence report and systematic review for the US Preventive Services Task Force. JAMA, 315(23), 2576–2594.

Marlow, L. A. V., Chorley, A. J., Haddrell, J., Ferrer, R., & Waller, J. (2017). Understanding the heterogeneity of cervical cancer screening non-participants: Data from a national sample of British women. European Journal of Cancer, 80, 30–38.

McCaffery, K., Wardle, J., Nadel, M., & Atkin, W. (2002). Socioeconomic variation in participation in colorectal cancer screening. Journal of Medical Screening, 9(3), 104–108.

Mayrand, M. H., Duarte-Franco, E., Rodrigues, I., Walter, S. D., Hanley, J., Ferenczy, A., . . . Franco, E. L. (2007). Human papillomavirus DNA versus Papanicolaou screening tests for cervical cancer. The New England Journal of Medicine, 357(16), 1579–1588.

Mendes, D., Bains, I., Vanni, T., & Jit, M. (2015). Systematic review of model-based cervical screening evaluations. BMC Cancer, 15, 334.

Menzies, N. A., Soeteman, D. I., Pandya, A., & Kim, J. J. (2017). Bayesian methods for calibrating health policy models: A tutorial. PharmacoEconomics, 35(6), 613–624.

National Institute for Health and Care Excellence (NICE). Guide to the methods of technology appraisal 2017.

Neumann, P. J., Anderson, J. E., Panzer, A. D., Pope, E. F., D’Cruz, B. N., Kim, D. D., & Cohen, J. T. (2018). Comparing the cost-per-QALYs gained and cost-per-DALYs averted literatures. Gates Open Research, 2, 5.

Neumann, P. J., Sanders, G. D., Russell, L. B., Siegel, J. E., & Ganiats, T. G. (2016). Cost-effectiveness in health and medicine. Oxford, U.K.: Oxford University Press.

Paulden, M., O’Mahony, J. F., & McCabe, C. (2017). Discounting the recommendations of the second panel on cost-effectiveness in health and medicine. PharmacoEconomics, 35(1), 5–13.

Pedersen, K., Sørbye, S. W., Burger, E., Lönnberg, S., & Kristiansen, I. S. (2015). Using decision-analytic modeling to isolate interventions that are feasible, efficient and optimal: An application from the Norwegian cervical cancer screening program. Value in Health, 18(8), 1088–1097.

Pedersen, K., Burger, E. A., Nygård, M., Kristiansen, I. S., & Kim, J. J. (2018). Adapting cervical cancer screening for women vaccinated against human papillomavirus infections: The value of stratifying guidelines. European Journal of Cancer, 91, 68–75.

Petersen, G. M. (2002). Barriers to preventive intervention. Gastroenterology Clinics of North America, 31, 1061–1068.

Pitman, R. J., Nagy, L. D., & Sculpher, M. J. (2013). Cost-effectiveness of childhood influenza vaccination in England and Wales: Results from a dynamic transmission model. Vaccine, 31(6), 927–942.

Public Health England. (2017). Key performance indicators for the NHS screening programmes: Definitions and data submission guidance.

Roberts, M., Russell, L. B., Paltiel, A. D., Chambers, M., McEwan, P., & Krahn, M. (2012). Conceptualizing a model: A report of the ISPOR-SMDM Modeling Good Research Practices Task Force. Medical Decision Making, 32, 678–689.

Rutter, C. M., Knudsen, A. B., Marsh, T. L., Doria-Rose, V. P., Johnson, E., Pabiniak, C., . . . Lansdorp-Vogelaar, I. (2016). Validation of models used to inform colorectal cancer screening guidelines: Accuracy and implications. Medical Decision Making, 36(5), 604–614.

Sabik, L. M., & Lie, R. K. (2008). Priority setting in health care: Lessons from the experiences of eight countries. International Journal for Equity in Health, 7(1), 4.

Seibert, T. M., Fan, C. C., Wang, Y., Zuber, V., Karunamuni, R., Parsons, J. K., . . . Dale, A. M. (2018). Polygenic hazard score to guide screening for aggressive prostate cancer: Development and validation in large scale cohorts. BMJ, 360, j5757.

Siebert, U., Alagoz, O., Bayoumi, A. M., Jahn, B., Owens, D. K., Cohen, D. J., & Kuntz, K. M. (2012). State-transition modeling: A report of the ISPOR-SMDM Modeling Good Research Practices Task Force-3. Medical Decision Making, 32(5), 690–700.

Vaccarella, S., Franceschi, S., Engholm, G., Lönnberg, S., Khan, S., & Bray, F. (2014). 50 years of screening in the Nordic countries: Quantifying the effect on cervical cancer incidence. British Journal of Cancer, 111(5), 965–969.

Van der Aalst, C. M., van Klaveren, R. J., & de Koning, H. J. (2010). Does participation to screening unintentionally influence lifestyle behaviour and thus lifestyle-related morbidity? Best Practice & Research Clinical Gastroenterology, 24(4), 465–478.

Van Hees, F., Saini, S. D., Lansdorp-Vogelaar, I., Vijan, S., Meester, R. G. S., de Koning, H. J., . . . van Ballegooijen, M. (2015). Personalizing colonoscopy screening for elderly individuals based on screening history, cancer risk, and comorbidity status could increase cost effectiveness. Gastroenterology, 149(6), 1425–1437.

Van Ravesteyn, N. T., Miglioretti, D. L., Stout, N. K., Lee, S. J., Schechter, C. B., Buist, D. S. M., . . . de Koning, H. J. (2012). What level of risk tips the balance of benefits and harms to favor screening mammography starting at age 40? Annals of Internal Medicine, 156(9), 609–617.

Vanni, T., Karnon, J., Madan, J., White, R., Edmunds, W. J., Foss, A., & Legood, R. (2011). Calibrating models in economic evaluation. PharmacoEconomics, 29, 35–49.

Vernon, S. (1997). Participation in colorectal cancer screening: A review. Journal of the National Cancer Institute, 89(19), 1406–1422.

Vickers, A. J., Sjoberg, D. D., Ulmert, D., Vertosick, E., Roobol, E. J., Thompson, I., . . . Lilja, H. (2014). Empirical estimates of prostate cancer overdiagnosis by age and prostate-specific antigen. BMC Medicine, 12(26).

Warren, J. L. (2009). Adverse events after outpatient colonoscopy in the Medicare population. Annals of Internal Medicine, 150, 849.

Weinstein, M. C., Siegel, J. E., Gold, M. R., Kamlet, M. S., & Russell, L. B. (1996). Recommendations of the panel on cost-effectiveness in health and medicine. Journal of the American Medical Association, 276(15), 1253–1258.

Weinstein, M. C., Torrance, G., & McGuire, A. (2009). QALYs: The basics. Value in Health, 12(s1), S5–S9.

Whiting, P. F., Davenport, C., Jameson, C., Burke, M., Sterne, J. A. C., Hyde, C., & Ben-Shlomo, Y. (2015). How well do health professionals interpret diagnostic information? A systematic review. BMJ Open, 5, e008155.

Wilson, J. M. G., & Jungner, G. (1968). Principles and practice of screening for disease. Public Health Papers, 34. World Health Organization, 22(11), 473.

Wooldridge, J. M. (2010). Econometric analysis of cross section and panel data (2nd ed.). Cambridge, MA: MIT Press.