
Printed from Oxford Research Encyclopedias, Communication. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

date: 25 February 2021

Numeracy in Health and Risk Messaging

  • Priscila G. Brust-Renck, Laboratory for Bioethics and Ethics in Science, Clinical Hospital of Porto Alegre, Brazil
  • Julia Nolte, Faculty of Behavioral and Cultural Studies, Heidelberg University
  • Valerie F. Reyna, Departments of Human Development and Psychology, Cornell University


The complexity of numerical information about health risks and benefits places demands on people that many are not prepared to meet. For example, much information about health is communicated numerically, such as treatment risks and effectiveness, lifestyle benefits, and the chances of side effects from medication. However, many people—especially the old, the poor, and the less educated—have difficulty understanding numerical information that would enable them to make informed health decisions. Some evidence also suggests cultural and gender differences (although their causes have been disputed). The ability to use and understand numbers (i.e., numeracy) plays an important role in how information should be displayed and communicated.

Measuring differences in numeracy provides a standard to guide one’s approach when communicating risk. Several surveys have been developed to allow for a descriptive assessment of basic and analytical mathematical skills in nationally representative samples (e.g., NAEP, NAAL, PISA, PIAAC). Other measures assess specific skills, such as perception of numbers (e.g., number line, approximation, dots tasks), individual perception of one’s own ability (i.e., Subjective Numeracy Scale), and arithmetic computation ability (i.e., Objective Numeracy Scales, Abbreviated Numeracy Scale, and Berlin Numeracy Test).

Difficulties associated with low numeracy extend well beyond the inability to understand place value or perform computations. Understanding and remediating low numeracy requires getting below the surface of errors in judgment and decision making to the deeper level of scientific theory. Despite the relevance of numbers in decision making, there is a certain level of disagreement regarding the psychological mechanisms involved in numeracy. Studies show that people have a basic mental representation of numbers in which the discriminability of two magnitudes is a function of their ratio rather than their difference (psychophysical approaches). Numerical reasoning has been identified with quantitative and analytical processes, and such computation is often seen as an accurate and objective way to process information (traditional dual-process approaches as applied to numeracy). However, these approaches do not account for the contradictory evidence that reliance on analysis is not sufficient for many decisions and has been associated with worse performance for some decisions. Studies supporting a more recent dual-process approach—one that accounts for standard and paradoxical effects of numeracy on risk communication—emphasize the role of intuition: this is a kind of advanced thinking that operates on gist representations, which capture qualitative understanding of the meaning of numbers that is relevant in decision making (Fuzzy Trace Theory). According to Fuzzy Trace Theory, people encode both actual numbers (verbatim representations) and qualitative interpretations of their bottom-line meaning (gist representations) but prefer to rely on the qualitative gist representations when possible. Thus, potential difficulties in decision making arising from deficits in numeracy can be resolved through meaningful communication of risk. 
Creating narratives that emphasize the contextually relevant underlying gist of risk and using methods that convey the meaning behind numeric presentations (e.g., use of appropriate arrays to communicate linear trends, meaningful relations among magnitudes, and inclusion relations among classes) improve understanding and decision making for both numerate and innumerate individuals.

Understanding Numeracy

In a world that is increasingly technology dependent, sufficient mathematical competence has not only become a key to pursuing a higher education: adequate numerical skills enable people to successfully navigate tasks of everyday living, including entering the workforce and maintaining a favorable health status. An influential factor contributing to how people make healthier judgments and decisions in everyday life is their ability to understand and use numbers—or numeracy (Cokely, Ghazal, & García-Retamero, 2015; Kutner, Greenberg, Jin, Boyle, Hsu, & Dunleavy, 2007; Peters, 2012; Reyna, Nelson, Han, & Dieckmann, 2009; Reyna & Brust-Renck, 2015). Individuals with limited numeracy skills, also described as lack of mathematical proficiency (or innumeracy), are at a marked disadvantage in understanding risks. For example, cancer patients are more likely to overestimate the effectiveness of an experimental cancer treatment when they lack numerical competence (i.e., when they incorrectly interpret what it means when a physician tells them a treatment is known to work in “40% of cases” such as theirs) (Weinfurt et al., 2003).

In the context of health care, correctly grasping and handling numbers is often a matter of life and death. Information crucial to medical decision making—the risks associated with engaging in unhealthy lifestyles, how much and when to take medication, survival rates, or the likelihood of success and side effects ascribed to different treatment options—is often communicated in a quantitative fashion. However, most people report feelings of insecurity when faced with tasks or problems involving numerical information (Reyna & Brainerd, 2007). Overall, insufficient numeracy has been demonstrated to have several detrimental effects on patients’ health states, as it leads to inferior health knowledge, adverse health outcomes, increased hospitalization rates, and the choice of lower-quality health facilities (Kiechle, Hnat, Norman, Viera, DeWalt, & Brice, 2015; Peters, Dieckmann, Dixon, Hibbard, & Mertz, 2007; Taha, Sharit, & Czaja, 2014).

While patients and consumers now enjoy an unprecedented access to health-related data, they are not equipped to adequately interpret the information they are exposed to on the Internet, in print, and through other types of media. Numbers are often uninterpretable for laypeople unless information is presented in a format that is easy to understand (Nelson, Reyna, Fagerlin, Lipkus, & Peters, 2008; Reyna & Brainerd, 2007). For example, when reading educational documents displaying complex numerical information about their risks from diabetes, less numerate patients were confused and had difficulty understanding their disease (Joram et al., 2012). Numerical deficiency has also been associated with poorer disease self-management and medication adherence in patient populations affected by diabetes, atrial fibrillation, or HIV (Cavanaugh et al., 2008; Kalichman, Ramachandran, & Catz, 1999; Waldrop-Valverde, Jones, Jayaweera, Gonzalez, Romero, & Ownby, 2009; Waldrop-Valverde, Jones, Gould, Kumar, & Ownby, 2010). The ability to make informed health-care choices relies on a patient’s ability to understand the potential harms and benefits associated with taking a risk (Cox, 2014).

In summary, modern research provides a torrent of numerical information for patients to use to make better decisions. Numbers are used to quantify levels of benefits of preventive behaviors (e.g., the number of added years of life through exercise) or of alternative treatments (e.g., five-year mortality rates for chemotherapy vs. surgery), as well as levels of risks or uncertainties associated with those choices (e.g., the probability of serious injury from running or of serious side effects from chemotherapy). However, most people struggle to understand the meaning of such numbers. For some people, basic knowledge about fractions or decimals can be lacking; they do not know whether a 1 in 100 risk is bigger or smaller than a 1 in 1,000 risk (Peters, 2012). For others, numbers may communicate information precisely, but those numbers may have little significance, and such information is relatively useless in decision making (Levy, Ubel, Dillard, Weir, & Fagerlin, 2014; Peters, Dieckmann, Västfjäll, Mertz, Slovic, & Hibbard, 2009; Reyna et al., 2009). For example, patients can use simple and free online tools to estimate their risk of developing various kinds of cancer. If a patient has a 20% chance of developing invasive breast cancer, this patient may lack the understanding of what 20% actually means (Reyna, 2008, 2012b): Is that a low or high risk? Should a person feel anxious or relieved by that level of risk? Hence, modern approaches to numeracy emphasize meaning (Reyna, Nelson, Han, & Pignone, 2015).
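The comparison between a 1 in 100 risk and a 1 in 1,000 risk can be made concrete with a short sketch. The Python below is illustrative only: the function name `compare_risks` is hypothetical and does not come from the cited studies. It uses exact fractions to show that, with the same numerator, the larger denominator means the smaller risk.

```python
from fractions import Fraction


def compare_risks(first: Fraction, second: Fraction) -> str:
    """Report which of two risks is larger, using exact rational arithmetic."""
    if first > second:
        return "first is larger"
    if first < second:
        return "second is larger"
    return "equal"


one_in_100 = Fraction(1, 100)
one_in_1000 = Fraction(1, 1000)

# Same numerator, larger denominator: the risk is smaller.
result = compare_risks(one_in_100, one_in_1000)

# Re-expressing both risks per 1,000 people puts them on a common scale.
per_1000 = (int(one_in_100 * 1000), int(one_in_1000 * 1000))
```

Expressed on a common denominator, the first risk affects 10 people per 1,000 and the second affects 1 per 1,000, which is the kind of magnitude relation that less numerate readers may not extract from the raw ratios.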

Demographic Differences

Differences in numeracy have been observed across a wide range of demographic characteristics, including gender, age, education, race, ethnicity, and sociodemographic status (Gonzales et al., 2004; Lemke et al., 2004; Perie, Grigg, & Dion, 2005; Perie, Moran, & Lutkus, 2005). The sources of these differences are poorly understood. Women overestimate their risk of contracting or dying from breast cancer, but men underestimate their risk, which is not zero. Overall, the less numerate make larger risk estimation errors, are more inconsistent in their use of risk estimation scales, and are less able to incorporate risk reduction information into their judgments (Black, Nease, & Tosteson, 1995; Davids, Schapira, McAuliffe, & Nattinger, 2004; Lipkus, Peters, Kimmick, Liotcheva, & Marcom, 2010; Schapira, Davids, McAuliffe, & Nattinger, 2004; Schwartz, Woloshin, Black, & Welch, 1997; Woloshin, Schwartz, Black, & Welch, 1999). However, women and older adults arrive at more inaccurate risk estimations and score lower in tests of numeracy than their male and younger counterparts (Carman & Kooreman, 2014; Galesic & García-Retamero, 2010; Keller, Siegrist, & Visschers, 2009; Kobayashi et al., 2015; Lipkus et al., 2010). In addition, older subjects were less likely to choose a treatment option with more benefits (combination of chemotherapy and hormonal therapy) than younger subjects, although it should be noted that the “better” treatment might depend on individual preferences. Lipkus et al. (2010) suggest that this seeming effect of age really reflected numeracy, because subjects low in numeracy were less accurate in their estimation of cancer-free survival rates than subjects high in numeracy (see also Peters, Slovic, Västfjäll, & Mertz, 2008; Sprague, LaVallie, Wolf, Jacobsen, Sayson, & Buchwald, 2010).

Disparities in educational attainment have also been argued to underlie the differences in numeracy that exist between countries and across cultures (e.g., Lemke et al., 2004; Reyna & Brainerd, 2007). In the latest National Assessment of Educational Progress, Hispanic, African American, and Native American students performed worse than their Caucasian and Asian peers (Grigg, Donahue, & Dion, 2007). These differences likely reflect poverty rates and educational opportunity, rather than race or ethnicity per se. Similarly, the less educated have repeatedly been shown to have more difficulties grasping the meaning of numerical information (Keller et al., 2009; Lipkus et al., 2010). As a result, innumeracy is especially widespread among those subgroups of the population who conventionally lack formal education, financial resources, and language proficiency (Gonzales et al., 2004; Kutner, Greenberg, Jin, & Paulsen, 2006; Lemke et al., 2004; Perie et al., 2005; Perie, Moran et al., 2005; Reyna & Brainerd, 2007). Basic or below-basic (as opposed to proficient) numeracy skills can be found in about every second American high-school dropout, as well as in the majority of US-based Hispanics (66%) and African Americans (58%); the latter differences are likely due to poverty and other socioeconomic factors. Nevertheless, low levels of numeracy can also occur in highly educated individuals and persist across Western and industrialized nations such as the United States (Lemke et al., 2004; Lipkus, Samsa, & Rimer, 2001). Up to 70% of American fourth graders and eighth graders perform below their respective grade levels, and many high school graduates fall short of the quantitative performance expected in college or the workplace (Perie, Grigg et al., 2005; Perie, Moran et al., 2005).

Math Anxiety

While innumeracy constitutes a cognitive barrier in the context of medical decision making, performance in mathematical tasks and decision contexts can also be hampered when numeracy is only perceived to be low: a subjective lack of numerical proficiency can lead to mathematical anxiety. Experiencing math anxiety entails feelings of concern, distress, or threat when solving mathematical and statistical tasks, with anxiety in turn interfering with mathematical performance (Lussier, 1996; Ma, 1990). Math anxiety can also occur among those performing well in mathematical tasks or those high in numeracy, and can predict mathematical performance in academic contexts irrespective of actual mathematical knowledge (Ashcraft & Krause, 2007; Lyons & Beilock, 2011; Morsanyi, Busdraghi, & Primi, 2014).

The negative impact that anxiety exerts on mathematical performance has been shown to cause negative self-perceptions and self-reported low mathematical skills (Gregoire & Desoete, 2009). Math anxiety also leads to negative attitudes toward tasks involving mathematics. Both negative attitudes toward mathematics and one’s mathematics-related skills can be coupled with an avoidance of contexts that require processing numbers or performing computations (Ashcraft, 2002). Consequently, attention is allocated to (tangential) aspects of a situation or task that are not concerned with numerical information, thus interfering with working memory performance (Silk & Parrott, 2014). (Working memory is the ability to hold and manipulate information in short-term memory, e.g., the capacity to constantly update numbers while performing mental calculations.) Reduced working memory spans that result from math anxiety are associated with an increase in error rates and elevated reaction times on mathematical tasks (Ashcraft & Kirk, 2001).

The relationship between numeracy and math anxiety can also shed light on the mechanisms through which math anxiety impacts health-related decision making and risk processing. Silk and Parrott (2014) showed that exposure to risk messages about genetically modified food elicited math anxiety when those messages included percentages, numbers embedded in written information, and statistical graphs. Higher levels of math anxiety were tied to a lower likelihood of understanding the significance of food safety and hampered comprehension of mathematical content in those individuals who subjectively or objectively lacked numeracy skills. When evaluating alternative treatment options in the context of serious health conditions, low levels of numeracy and high levels of math anxiety are both associated with a decreased ability to accurately comprehend baseline health risks and the risks associated with different treatment plans (Rolison, Morsanyi, & O’Connor, 2015). Math-anxious individuals also place less faith in their ability to make judgments about risks or evaluate the effectiveness of treatment options. This relationship persists even after controlling for objective numeracy. Conversely, math anxiety no longer predicts accuracy of risk estimates when numeracy is being taken into account. In other words, negative, numeracy-related self-concepts act as an additional barrier to patients’ active participation in medical decision-making processes, above and beyond factual numerical aptitude.

Assessment of Numeracy

Standardized Testing

As evidenced by nationally representative assessments of quantitative skills (Kirsch, Jungeblut, Jenkins, & Kolstad, 2002; Kutner et al., 2006), millions of adult Americans struggle to grasp the notion of decimals and ratio concepts, including fractions and probabilities (Reyna & Brainerd, 2007). This is alarming, as medical information is commonly expressed in ratios, such as base-rates, joint and conditional probabilities, percentages, and frequencies (Reyna et al., 2009; Wolfe & Reyna, 2010a, 2010b; Wolfe, Fisher, & Reyna, 2012; Wolfe, Fisher, Reyna, & Hu, 2012) and increasingly available to the public through a variety of uncensored commercial and noncommercial sources.

In the United States, several national surveys were developed to test adults’ basic mathematical skills (Cokely et al., 2015; Reyna et al., 2009). The National Assessment of Educational Progress (NAEP), also referred to as the nation’s report card, for example, provides a comprehensive assessment of mathematical knowledge and skills. The test is split into two different types of assessments: one trend assessment has tracked long-term developments in the performance of US high school students since 1973, and one main assessment is continuously updated to account for topical changes in taught content, education policies, and improved evaluation methods. The most recent NAEP trend assessment revealed that performance of current American twelfth graders did not show notable improvements when compared to the first NAEP trend cohort of 1973 (Perie, Grigg et al., 2005; Perie, Moran et al., 2005). In the 2007 main NAEP, 78% of the 9,000 high school seniors assessed performed under their expected grade level. This means that the large majority of high schoolers demonstrated only basic or below-basic mathematical performance (Grigg et al., 2007). In line with similar findings discussed earlier, Hispanic, African American, and Native American students performed worse than their Caucasian and Asian peers in the latest NAEP, again likely reflecting poverty and unequal opportunities for quality education.

Mirroring the weak performance of their adolescent peers, American adults perform poorly in national assessments of numerical aptitude. According to the National Adult Literacy Survey (NALS; Kirsch et al., 2002), which drew on a nationally representative sample of 26,000 participants, 22% of American adults perform at the lowest quantitative level possible, and 26% perform at the next-highest level of quantitative skill. This translates to an extensive inability to extract numbers from lengthy pieces of written information and to perform operations involving two or more steps, such as calculating an adequate dosage of medication based on body weight.

Similar results were obtained in the 2003 National Assessment of Adult Literacy (NAAL; Kutner et al., 2006), which assessed prose literacy, document literacy, and quantitative literacy in a representative sample of 19,000 Americans. Out of all domains tested, quantitative skills yielded the most disheartening results: 36% of adults (about 93 million Americans) performed at a below-basic or basic level, meaning their mathematical skills had not progressed beyond the stage of performing easy, one-step mental calculations. As the 2003 NAAL assessment expanded on the NALS by including items tapping health literacy and health numeracy, this implies that the numerical aptitude of more than one third of the United States’ adult population falls short of the skill level necessary to adopt an active role in medical decision-making contexts (Kutner et al., 2006).

Using a broader definition of numeracy, two international programs, the Program for International Student Assessment (PISA) and the Programme for the International Assessment of Adult Competencies (PIAAC), assessed both literacy and numeracy as skills needed to fully understand information and use it. Their findings do not differ from those of the NAEP reports. According to Lemke et al. (2004), PISA results show that American 15-year-olds are not sufficiently equipped to solve real-life mathematical problems or handle probabilities and fractions. In both domains, US high schoolers fall notably behind their international counterparts, with the United States ranked 29th and 24th out of 39 countries evaluated, respectively.

These instruments are typically administered to nationally representative samples and allow a descriptive assessment of the analytical level at which numbers are used and understood, revealing differences among people of different ages, genders, races, ethnicities, educational levels, and cultures (Reyna & Brainerd, 2007). Nevertheless, these factors are superficial descriptors rather than underlying causes of how individuals process and understand risk (Brust-Renck, Reyna, Corbin, Royer, & Weldon, 2015; Reyna, 2012a).

Health-Specific Tests of Numeracy

In order to account for the lack of formal assessment of numeracy, a variety of instruments have been developed that specifically assess numeracy in multiple dimensions (e.g., Davis, Kennen, Gazmararian, & Williams, 2005; Rothman et al., 2006). Among the first to develop an assessment of risk understanding in health contexts were Black et al. (1995). They measured people’s ability to correctly compare the likelihood of contracting breast cancer to the chance of dying from breast cancer, and their competence to determine how many times heads would come up if a fair coin was tossed 1,000 times.

Similarly, Weinfurt et al. (2003) created a simple, single-item measure to evaluate how well oncology patients understand the following statement: “This new treatment controls cancer in 40% of cases like yours.” In their original sample, the majority (72%) of the 318 participating cancer patients tested correctly equated the original statement with the response option, “For every 100 patients like me, the treatment will work for 40 patients.” In addition, 12% of patients openly admitted to not understanding the statement, and the remaining 16% of patients misinterpreted the statement to either mean that the treatment would reduce the cancer risk by 40%, or that the physician was 40% certain that the treatment was effective.

The Test of Functional Health Literacy in Adults (TOFHLA; Davis et al., 2005) is a standardized test whose numeracy items pertain to tasks commonly encountered in health settings, such as simple arithmetical operations, basic understanding of time, and the ability to recognize and apply numbers embedded in text. The 50-item TOFHLA provides a measure of reading comprehension and quantitative reasoning that contributes to functional health literacy by testing a patient’s ability to understand the steps required to prepare for surgery, file a Medicaid application form, or give informed consent (Parker, Baker, Williams, & Nurss, 1995). A 17-item short version also assesses universal health skills commonly required in everyday medical contexts, such as the basic, low-level competence to judge individual blood glucose levels and a family’s eligibility for financial support (S-TOFHLA; Baker, Williams, Parker, Gazmararian, & Nurss, 1999). Unlike integrative measures, however, the TOFHLA scores two separate components, a reading comprehension section and a numeracy section, and then combines them into one composite score.

Both measures have been administered in various hospital, Medicare, and emergency department settings, and a variety of populations, including HIV patients (e.g., Gazmararian, Baker, Williams, Parker, Scott, Green, & Koplan, 1999; Gazmararian, Williams, Peel, & Baker, 2003; Kalichman et al., 1999). Despite its overt popularity, the TOFHLA’s usefulness as a measure of health numeracy is restricted by its lack of validation and its length: The TOFHLA’s numeracy subtest has yet to be validated against an accredited instrument of mathematical skill.

A notably broader assessment of health numeracy is the Medical Data Interpretation Test (Schwartz, Woloshin, & Welch, 2005). This test constitutes a 20-item multiple-choice instrument, with each item offering between two and five choice options, one of which is correct. In the context of written statements resembling everyday health information, decision makers evaluate the riskiness and efficiency of pharmaceutical and surgical treatment options, compare mortality rates across different health conditions, and judge which additional pieces of information are needed to make sense of the medical information already provided. As a result, the test evaluates comprehension of epidemiological concepts (e.g., incidence) and cues the decision maker to contrast population-level and individual-level risk. It further requires responders to work with advanced mathematical concepts such as relative and absolute risk, as well as base rates. As the Medical Data Interpretation Test calls on advanced levels of numerical skills and expertise, higher levels of numeracy and education are associated with better test scores, and physicians score notably higher than equally educated postgraduates.
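The distinction between relative and absolute risk that such items probe can be illustrated with a brief sketch. The Python below is a minimal example under stated assumptions: the function `risk_reductions` and the event counts are hypothetical and are not taken from the Medical Data Interpretation Test itself.

```python
from fractions import Fraction


def risk_reductions(control_events, control_n, treated_events, treated_n):
    """Return (absolute risk reduction, relative risk reduction) as exact fractions."""
    control_risk = Fraction(control_events, control_n)
    treated_risk = Fraction(treated_events, treated_n)
    absolute = control_risk - treated_risk   # difference between the two risks
    relative = absolute / control_risk       # that difference as a share of the baseline
    return absolute, relative


# Hypothetical data: 20 of 1,000 untreated people and 10 of 1,000 treated
# people experience the outcome.
absolute, relative = risk_reductions(20, 1000, 10, 1000)
```

With these made-up counts, the same treatment effect reads as a relative risk reduction of one half (50%) but an absolute risk reduction of only 1 in 100 (one percentage point), which is why the base rate matters when interpreting either figure.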

While an isolated numeracy score allows us to evaluate numerical competence on its own, it stands to reason that successful navigation of health-care contexts necessitates an adequate integration of several skills, including numerical abilities, reading proficiency, and the ability to understand and use medical documents. Food labels present a valuable proxy for this set of skills: comprehension of food labels has been shown to be associated with both numeracy and literacy skills (Rothman et al., 2006), justifying the development of integrative measures such as the Newest Vital Sign and the Nutrition Label Survey. Completion of items in the Newest Vital Sign and the Nutrition Label Survey requires successful execution of several consecutive steps. These include understanding the content of the label and how the label is organized, extracting the information (numbers) essential to completing a task, and determining the arithmetic calculations necessary to derive a solution. For instance, the Newest Vital Sign asks people to examine the nutrition label of an ice cream container and estimate the percentage of their daily value of calories (2,500) that one serving of ice cream would satisfy. To correctly answer this question, individuals first read the label and identify those sections of the information provided that are relevant to solving the task. Second, they discern the calorie content per serving (250). Third, this number needs to be divided by 2,500 to obtain the right answer, 10% (Rothman et al., 2006).
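The final arithmetic step of the ice cream item can be written out as a small sketch. The function name `percent_of_daily_value` is hypothetical; only the two numbers (250 calories per serving and a 2,500-calorie daily value) come from the example above.

```python
def percent_of_daily_value(calories_per_serving: float, daily_calories: float) -> float:
    """Return the share of the daily calorie value supplied by one serving, in percent."""
    if daily_calories <= 0:
        raise ValueError("daily_calories must be positive")
    return 100 * calories_per_serving / daily_calories


# Values from the nutrition label example: 250 calories per serving,
# 2,500 calories as the daily value.
share = percent_of_daily_value(250, 2500)
```

The computation itself is a single division; the difficulty the integrative measures target lies in the preceding steps of locating the right numbers on the label and deciding which operation to apply.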

Other integrative measures, such as the Medical Data Interpretation Test (MDIT; Schwartz et al., 2005; Woloshin, Schwartz, & Welch, 2005), assess even more advanced skills. The MDIT tests a range of general computational skills in addition to understanding of probability and risk (e.g., chance of having a heart attack in the next 10 years), as well as quantitative reasoning skills needed to interpret common health information in context (e.g., information contained in drug labels and advertisements). Similarly, the Numeracy Understanding in Medicine Instrument (NUMi; Schapira et al., 2012) is an integrative measure of numeracy that assesses comprehension beyond computational ability by testing other skills such as number sense and graph literacy. The NUMi is based on a two-parameter Item Response Theory analysis and is designed to be a comprehensive assessment of health numeracy for use with participants who have low levels of general numeracy. Although it is a more comprehensive measure of numeracy, the NUMi combines the skills necessary to understand numbers with those necessary to understand graphs, which have been shown to be different in a series of studies (e.g., García-Retamero & Cokely, 2015a; Reyna, 1991, 2008).

Graph Literacy

Numeracy also applies to the ability to interpret numerical information in graphs. García-Retamero and Cokely (2015c) note that graph literacy (the ability to read and understand graphs) goes beyond prior tests of numeracy (see also Galesic & García-Retamero, 2010). Visual aids would be expected to improve the processing of numbers for people with sufficient graph literacy (e.g., García-Retamero & Cokely, 2014; García-Retamero, Cokely, & Hoffrage, 2015). Galesic and García-Retamero (2011) developed a graph literacy scale consisting of 13 questions and 8 visual displays (including pie charts, bar graphs, and line graphs), which are embedded in contexts resembling newspaper articles and magazine ads. Participants were required to locate relevant data in a visual display, understand the relationship between different elements of a graph, and make predictions based on the data presented. As expected, graphic representations were particularly helpful for low-numerate individuals and facilitated more accurate decisions about risk (see also Garcia-Retamero & Galesic, 2010).

In subsequent studies, graphs provided an effective means of risk communication when their elements (numerators and denominators) were well defined and clearly represented relevant risk information by making the relationship among classes clear (e.g., Garcia-Retamero & Cokely, 2015b; Okan, García-Retamero, Cokely, & Maldonado, 2012; Reyna & Brainerd, 2008). Visual aids were effective for reducing denominator neglect in participants with high graph literacy. Gaissmaier, Wegwarth, Skopec, Müller, Broschinski, and Politi (2012) also showed that comprehension of numerical information was not related to iconicity (i.e., concrete and realistic representations). The only difference that affected comprehension and recall was the difference between graphics and numbers; the actual level of iconicity of graphics did not matter. Individuals with high graph literacy had better comprehension and recall when presented with graphics instead of numbers, but the reverse was true for individuals with low graph literacy. Because people vary in graph literacy, visual aids may not be helpful to everyone, depending on the degree to which they transparently depict the simple gist of numbers and numerical relationships (Brust-Renck, Royer, & Reyna, 2013; Reyna, 2004, 2008).

A self-report measure of the ability to process and use graphically presented information was also developed to identify individuals with limited graph literacy skills (García-Retamero, Cokely, Ghazal, & Joeris, 2016). The scale was based on the approach of Fagerlin et al. (2007), which captures individuals’ own assessment of their ability. The scale comprised five items designed to assess self-reported confidence (e.g., “How good are you at working with bar charts?”) with different types of graphs (i.e., bar charts, line plots, pie charts). Results showed that the new scale was associated with measures of objective performance (i.e., objective graph literacy) and uniquely predicted graph understanding (García-Retamero et al., 2016).

Tests of Objective Numeracy

Objective numeracy scores are elicited by evaluating people’s performance in comparing numbers in size, understanding and converting probabilities, percentages, and frequencies, performing arithmetical calculations, and interpreting numbers in a given context (Cokely et al., 2015; Reyna et al., 2009). Measures of objective numeracy have been suggested to test mathematical computation skills on a continuum from crude to intermediate to advanced levels of numerical competence (Reyna et al., 2009). Rudimentary, low-level numerical skills encompass abilities such as solving simple arithmetic problems that require only a limited number of steps to complete. Tasks assessing intermediate skills call for an understanding of higher-order concepts (e.g., ratios, probabilities) and analytical reasoning skills. Advanced skills are required to draw inferences based on the information already available, such as identifying the positive predictive value of a test.
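To show why identifying a positive predictive value counts as an advanced skill, the inference can be sketched with Bayes' rule. This is a minimal illustration, not a procedure from the cited measures: the function name `positive_predictive_value` and the prevalence, sensitivity, and specificity values are hypothetical.

```python
from fractions import Fraction


def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability of having the condition given a positive test (Bayes' rule)."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical test: 1% prevalence, 90% sensitivity, 90% specificity.
ppv = positive_predictive_value(Fraction(1, 100), Fraction(9, 10), Fraction(9, 10))
```

Under these assumed values, out of 1,000 people 9 are true positives and 99 are false positives, so only 9 of the 108 positive results (1 in 12) reflect actual disease. Arriving at this answer requires combining the base rate with both test characteristics rather than reading off any single number.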

The Objective Numeracy Scale, originally developed by Schwartz et al. (1997) to test familiarity with basic probability and numerical concepts, is the most widely used instrument of numeracy. Evaluating an intermediate level of numeracy, this scale tests the understanding of chance and ratio concepts (percentages and frequencies), as well as the proficiency to convert back and forth between these formats (e.g., “Imagine that we flip a fair coin 1,000 times. What is your best guess about how many times the coin would come up heads?”). The original sample that was used to explore the test’s psychometric properties consisted of 287 female veterans, of whom 96% had graduated high school. Results showed that more than half of the women answered either none or one of the three questions correctly. In the same sample, Schwartz et al. further inspected the relationship between objective numeracy and the tested women’s comprehension of risk reduction information. More numerate women were more apt to understand the advantages of screening mammography. Higher scores in the original Schwartz et al. numeracy scale were also associated with more logically sound utility scores in a follow-up study (Woloshin, Schwartz, Moncur, Gabriel, & Tosteson, 2001).

Since its creation, the three-item numeracy scale has been modified for and administered in a wide variety of health contexts (e.g., Estrada, Barnes, Collins, & Byrd, 1999; Parrott, Silk, Dorgan, Condit, & Harris, 2005; Peshkin et al., 2015). For instance, Sheridan and Pignone (2002) obtained findings similar to those reported by Schwartz et al. in that medical students high in numeracy were superior in their ability to interpret medical data grounded in mathematical information. Using a slightly altered variant of the three-item objective numeracy scale, Schwartz, McDowell, and Yueh (2004) reported that patients diagnosed with head and neck cancer provided more consistent utility scores when they were adequately numerate.

Lipkus et al. (2001) added eight questions to the Schwartz et al. (1997) scale to expand the assessment of the numeracy skills necessary to perform arithmetic computations, transform frequencies or percentages, and understand probabilities and ratio concepts (for a detailed review, see Liberali et al., 2012). (Note that Lipkus et al. [2001] updated one item from Schwartz’s original scale, from flipping a coin to throwing a die.) Some of the new items examine absolute risk perception (“Which of the following numbers represents the biggest risk of getting a disease: 1%, 10%, or 5%?”); others assess judgments of relative risk (“If person A’s chance of getting a disease is 1 in 100 in 10 years, and person B’s risk is double that of A’s, what is B’s risk?”). Liberali et al. (2012) showed that the final list of items assessed four different constructs of mathematical proficiency: mindless matching, conversion of ratios, linear ordering, and multiplying. Unlike the Schwartz et al. (1997) sample, up to 94% of participants in Lipkus et al. (2001) had received some kind of higher education. Nevertheless, performance was remarkably similar in both samples, implying that even college-educated individuals struggle to perform well at an intermediate (health-relevant) level of numeracy.
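The arithmetic behind the items quoted from these scales can be made explicit with a short sketch. It is illustrative only: the variable names are our own, and the code is not part of either instrument or its scoring procedure.

```python
from fractions import Fraction

# Schwartz et al. (1997) item: expected number of heads in 1,000 flips of a fair coin.
expected_heads = 1000 * Fraction(1, 2)

# Lipkus et al. (2001) absolute-risk item: which percentage is the biggest risk?
risks = {"1%": Fraction(1, 100), "10%": Fraction(10, 100), "5%": Fraction(5, 100)}
biggest_risk = max(risks, key=risks.get)

# Lipkus et al. (2001) relative-risk item: B's risk is double A's risk of 1 in 100.
risk_a = Fraction(1, 100)
risk_b = 2 * risk_a              # Fraction reduces this to 1/50
risk_b_per_100 = risk_b * 100    # the same risk expressed as "x in 100"

print(expected_heads, biggest_risk, risk_b_per_100)
```

Exact fractions are used so that conversions between percentages and frequencies are not affected by floating-point rounding.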

The predictive validity of Lipkus et al. (2001) has been extensively documented, such that high-numerate participants were more likely to show better performance in decision tasks than low-numerate ones (e.g., Chapman & Liu, 2009; Gurmankin, Baron, & Armstrong, 2004; Peters et al., 2008; Peters, Hart, & Fraenkel, 2011). Nevertheless, results are inconsistent across studies. For example, Peters, Västfjäll, Slovic, Mertz, Mazzocco, and Dickert (2006) found that high-numerate participants were more prone to an irrational bias involving processing numbers than low-numerate participants were when rating the attractiveness of a risky gamble.

Peters et al. (2007) further extended the Objective Numeracy Scale by adding four more challenging items designed to assess advanced numerical skills, thus making the measure more demanding. In addition to testing familiarity with ratio concepts and probability, the 15-item scale also tests the ability to keep track of class-inclusion relations (e.g., estimate the likelihood that a woman has a breast cancer tumor based on the positive and negative predictive value of a mammogram). Results showed that high-numerate participants performed better in decision tasks (e.g., Dickert, Kleber, Peters, & Slovic, 2011; Dieckmann, Slovic, & Peters, 2009).
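The kind of class-inclusion reasoning involved in estimating the likelihood of disease after a positive test can be illustrated with Bayes’ rule. The sketch below is a minimal example under stated assumptions: the prevalence, sensitivity, and false-positive rate are hypothetical values chosen for illustration, not figures from the scale or from mammography data.

```python
from fractions import Fraction

def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """P(disease | positive test), computed with Bayes' rule."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)

# Hypothetical inputs (assumptions for illustration only).
prevalence = Fraction(1, 100)           # 1 in 100 women has the disease
sensitivity = Fraction(9, 10)           # 90% of women with the disease test positive
false_positive_rate = Fraction(9, 100)  # 9% of women without the disease test positive

ppv = positive_predictive_value(prevalence, sensitivity, false_positive_rate)
print(ppv, float(ppv))
```

With these assumed inputs, the women who test positive and have the disease are a small subset of all women who test positive, which is the nested-class relation that such items require respondents to keep track of.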

In yet another attempt to improve the distributional properties of the Objective Numeracy Scale, Weller, Dieckmann, Tusler, Mertz, Burns, and Peters (2013) proposed a short version of the scale to assess numeracy, the Abbreviated Numeracy Scale. Building on classical measures and based on an item response theory scaling approach (Rasch, 1960/1993), the authors used a one-parameter model to generate a refined scale with eight items. After analysis of item difficulty, discrimination, consistency, and guessing, the scale comprised three items from Schwartz et al. (1997; including one that was modified by Lipkus et al., 2001), two items from Lipkus et al. (2001), and one item from Peters et al. (2007). In addition, the new scale included two items from the Cognitive Reflection Test (CRT; Frederick, 2005). The CRT is a three-item assessment that, although not explicitly developed to test numeracy, requires comprehension and use of numerical information as well as a tendency to inhibit a dominant response that is incorrect and engage in further reflection that leads to the correct response. Overall results showed that more numerate individuals made objectively better decisions in decision choice tasks. The authors also observed an effect of educational level, as high-numerate participants were more likely to have a college degree or greater compared to those who did not finish college or high school.
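For readers unfamiliar with one-parameter item response models, the following minimal sketch shows the standard Rasch response function, in which the probability of a correct answer depends only on the difference between a person’s ability and an item’s difficulty. The function name and the ability and difficulty values are hypothetical; they are not estimates from Weller et al. (2013).

```python
import math

def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the one-parameter (Rasch) model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Hypothetical item difficulties on the logit scale (illustrative values only).
item_difficulties = {"easy item": -1.0, "medium item": 0.0, "hard item": 1.5}

for label, difficulty in item_difficulties.items():
    # A respondent of average ability (0.0) answering each hypothetical item.
    print(label, round(rasch_probability(0.0, difficulty), 2))
```

When ability equals difficulty the probability is .50, and it rises as ability exceeds difficulty, which is why items of varied difficulty help spread respondents across the score range.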

Another short numeracy measure, the Berlin Numeracy Test, was developed by Cokely, Galesic, Schulz, Ghazal, and Garcia-Retamero (2012), who used an adaptive testing approach to predict levels of risk literacy (i.e., “the ability to make good decisions based on information about risk;” Cokely et al., 2015, p. 22). (Although a preference is given to the two-to-three-item computer adaptive test, other versions are also used, such as the four-item paper-and-pencil test and the single item.) Their attempt to develop an easy-to-use measure of numeracy with improved discriminability and predictive power focused on testing statistical numeracy, defined as the ability to understand “the operations of probabilistic and statistical computation, such as comparing and transforming probabilities and proportions” (Cokely et al., 2012, p. 25). This focus is not surprising, as the items were similar to those on earlier scales. The overall assumption is that statistical numeracy would predict how people make decisions about risk through “accurately interpreting and acting on information about risk.”

All items are of intermediate complexity, as they require ratio judgments based on arithmetical calculations and call for answers to be given either as probabilities or as fractions (i.e., the number of times a desired event occurs out of a given number of dice throws), even though they involve more complex reasoning (e.g., “Imagine we are throwing a five-sided die 50 times. On average, out of these 50 throws how many times would this five-sided die show an odd number (1, 3 or 5)?”). In comparison to the numeracy tests by Lipkus et al. (2001) and Schwartz et al. (1997), the Berlin Numeracy Test is the most predictive of everyday risk interpretation skills (i.e., the ability to understand weather forecasts or to interpret health risks associated with the use of a new treatment; Cokely et al., 2012, 2015). Results showed that all three versions of the Berlin Numeracy Test predicted less biased decisions about risk (e.g., evaluating how much a person would benefit from using a new toothpaste), lower intertemporal discounting (less choice of smaller-sooner rather than larger-later rewards), as well as medical judgments and overconfidence (e.g., Garcia-Retamero & Cokely, 2014; Garcia-Retamero, Wicki, Cokely, & Hanson, 2014; Ghazal, Cokely, & Garcia-Retamero, 2014).
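The ratio judgment behind the quoted die item can be written out as a small calculation. The sketch assumes that each of the five sides is equally likely, which the item implies but does not state; the variable names are our own.

```python
from fractions import Fraction

sides = [1, 2, 3, 4, 5]   # a five-sided die, each side assumed equally likely
throws = 50

odd_sides = [side for side in sides if side % 2 == 1]
probability_odd = Fraction(len(odd_sides), len(sides))
expected_odd_throws = throws * probability_odd

print(probability_odd, expected_odd_throws)
```

The answer is expressed as a frequency out of the 50 throws, which matches the response format described above.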

Tests of Subjective Numeracy

Subjective measures of numeracy ask respondents to communicate their personal preferences for working with numbers and their perception of number-relevant skills. Two measures were developed to assess how confident and comfortable people feel about their ability to understand and use numbers, even though they do not measure numeracy per se (Fagerlin, Zikmund-Fisher, Ubel, Jankovic, Derry, & Smith, 2007; Woloshin et al., 2005; Zikmund-Fisher, Smith, Ubel, & Fagerlin, 2007).

The first measure of subjective numeracy was created by Woloshin et al. (2005) to assess interest in knowing statistics (STAT-interest) and confidence in one’s ability to understand them (STAT-confidence). The Interest scale is a five-item survey that seeks to determine how closely people attend to medical data presented in medical settings and the media (e.g., “I do not believe in statistics because something will either happen or not happen to me”), and the Confidence scale is a three-item questionnaire that estimates self-reported comprehension of medical statistics (e.g., “I am confident that I can make sense of medical statistics.”). Participants reported high levels of confidence and interest in health-related statistics, which were related to their performance on interpretation of medical data.

The second measure was the Subjective Numeracy Scale, which was developed to capture individuals’ own assessment of their quantitative ability to work with numbers (Fagerlin et al., 2007; Zikmund-Fisher et al., 2007). The scale includes four measures of self-reported confidence (e.g., “How good are you at working with fractions?”) and four measures of preferences for working with numbers (e.g., “How often do you find numerical information to be useful?”). Results showed that the Subjective Numeracy Scale was moderately associated with measures of objective performance. In patient populations, lower subjective numeracy has been demonstrated to be associated with objectively worse decision making (Fraenkel, Cunningham, & Peters, 2014; Miron-Shatz, Hanoch, Doniger, Omer, & Ozanne, 2014; Miron-Shatz, Hanoch, Katz, Doniger, & Ozanne, 2015). A three-item variant comprising two competence questions and one preference question has been found to perform just as well as the original long version (SNS-3; McNaughton, Cavanaugh, Kripalani, Rothman, & Wallston, 2015; McNaughton, Wallston, Rothman, Marcovitz, & Storrow, 2011).

Tests of Number Perception

Approximation tasks based on quantity comparisons have been developed as an alternative approach to studying numerical ability in those unable to perform concrete calculations, such as young children, indigenous populations lacking formal education (Pica, Lemer, Izard, & Dehaene, 2004), and patients suffering from certain types of traumatic brain damage (e.g., Dehaene & Cohen, 1991). These measures assess perception of numerical magnitudes without reliance on knowledge of mathematical symbols and formal education, relying on approximation to the concrete numbers.

Problem sets that entirely forgo numeric symbols and language are solvable through a sense of numerical magnitude, that is, an approximate cognitive system that enables crude, symbol-free estimations even among infants, children, and populations unfamiliar with symbolic math (Dehaene, 1999; Feigenson, Dehaene, & Spelke, 2004; Xu & Spelke, 2000). Accuracy in approximation tasks was related to performance in mathematics involving numbers (e.g., Booth & Siegler, 2006; Gilmore, McCarthy, & Spelke, 2010; Halberda, Mazzocco, & Feigenson, 2008; Link, Nuerk, & Moeller, 2014), indicating that these tasks may serve as measures of numerical comprehension. The resulting tests require approximate rather than exact solutions and can be categorized as either calling on a rudimentary grasp of numbers or not involving math symbols at all.

In 1991, Dehaene and Cohen presented a patient with trauma-induced acalculia (i.e., inability to perform simple arithmetic tasks) with easy, one-step arithmetic operations (additions, subtractions, or multiplications) accompanied by two proposed results, one that was slightly incorrect and one that was grossly incorrect. As an example, one of the items required the patient to add the numbers 7 and 3 and then to decide whether 17 or 11 was closer to the real sum (which in this case would be 10, making 11 the closer result). While struggling with subtraction and multiplication tasks, the patient, who was otherwise largely unable to operate with numbers, achieved surprising accuracy for the addition approximation tasks. This case study illustrates the usefulness of alternative measures in situations in which standard assessments of numeracy would be expected to fail, and it sparked the creation of comparable tests. For instance, a more recent interpretation of this task type requires participants to judge which one of two numbers is more similar in size to a third one (e.g., Ansari, Donlan, Thomas, Ewing, Peen, & Karmiloff-Smith, 2003; Paterson, Girelli, Butterworth, & Karmiloff-Smith, 2006).
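The logic of the addition item described above can be sketched as a simple comparison of distances. The function name is hypothetical and the code illustrates only how the closer candidate is determined, not how the original task was administered or scored.

```python
def closer_candidate(operand_a: int, operand_b: int, candidates: tuple) -> int:
    """Return the proposed result that lies nearest to the true sum."""
    true_sum = operand_a + operand_b
    return min(candidates, key=lambda candidate: abs(candidate - true_sum))

# The item from the text: 7 + 3, with 17 and 11 as the two proposed results.
answer = closer_candidate(7, 3, (17, 11))
print(answer)
```

Because only the relative distance to the true sum matters, the item can be answered with an approximate sense of magnitude and without an exact calculation.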

A similarly informative number-based approximation task is the Number-Line Task, a test that is administered as either a Number-to-Position or a Position-to-Number problem (Siegler et al., 2011; Siegler & Opfer, 2003). Both versions rely on participants’ judgments about the relative position of a number or a point on a line that represents a continuum, usually between zero and an upper limit (sometimes 100 or 1,000). Variants of this task (i.e., both bounded and unbounded number lines) have been used in samples of adults and children, with task scores reliably related to performance scores on regular arithmetic tests (e.g., Dackermann, Fischer, Huber, Nuerk, & Moeller, 2016; Huber, Moeller, & Nuerk, 2014; Link et al., 2014; Namkung & Fuchs, 2016; Opfer & Martens, 2012; Peeters, Degrande, Ebersbach, Verschaffel, & Luwel, 2016; Sullivan, Juhasz, Slattery, & Barth, 2011).

Unlike the aforementioned measures, symbol-free tasks often are numerosity comparison tests. One popular example of this type of test is the Panamath task (Halberda et al., 2008) that asks people to make a string of decisions about whether there are more dots of one color than of another color in an array. In a one-step comparison task used by Pica et al. (2004), Amazonian indigenous people were asked to visually determine which of two sets of spots was larger in size. A somewhat more advanced variant of the same task involved an additional computational step: In the two-step approximation task, the same test takers estimated which set of two sets of spots was the larger one when one large set of spots was pitted against two smaller sets of spots that first needed to be added together. Even though the members of said Amazonian group exhibited a limited numerical lexicon (i.e., they did not have number words beyond the number 5), the tasks revealed them to be able to compare and add approximated numbers whose size extends far beyond their finite vocabulary.

Numerical magnitude comparisons have since found frequent application, particularly in samples of children, in the context of Developmental Dyscalculia (i.e., specific learning disability affecting the acquisition of arithmetic skills), and among cases of Williams syndrome, a genetic deficit marked by severe deficiencies in arithmetic competence (e.g., Bugden & Ansari, 2015; Clayton, Gilmore, & Inglis, 2015; Paterson et al., 2006; Van Herwegen, Ansari, Xu, & Karmiloff-Smith, 2008; Xu & Spelke, 2000). Taken together, these findings support the assumption that unlearned numerical approximation skills are different from but related to learned knowledge of mathematics.

Tests of Gist Numeracy

Unlike the other measures that we have reviewed so far, all of which share similar items, there is new work on gist numeracy, which is a theory-based assessment of numeracy as the ability to use and understand numbers based on Fuzzy Trace Theory, discussed later in this article (Reyna & Brainerd, 1995, 2011). Two measures have been developed to assess gist comprehension of numbers.

The first measure is the Fuzzy Processing Preference Index (FPPI; Wolfe & Fisher, 2013), which assesses individual preferences in the integration of qualitative text information and numeric base-rates. It comprises 19 probability judgment items and four items that draw a distinction between mere pattern matching and base-rate respect. Each of the probability judgment items presents either a high or low base rate and qualitative written information that directly contradicts the base rate presented. According to Wolfe and Fisher, individuals with a preference for relying on rote, verbatim traces would be expected to ground their estimates in the exact base rates given, whereas those favoring fuzzy (i.e., gist) traces should derive an estimate based on the overall qualitative and quantitative information instead. Results show that FPPI predicts the accuracy of risk estimates in the context of breast cancer and breast cancer gene mutations with unique variance beyond objective numeracy (Weil, Wolfe, Reyna, Widmer, Cedillos-Whynott, & Brust-Renck, 2015). This result suggests that the FPPI assesses a skill other than mere numerical computation and that processing preferences factor into the way health information is perceived and interpreted.

The second measure is the Gist Numeracy Scale (Reyna, Brust-Renck, Portenoy, Gichane, & Wilhelms, 2012), which goes beyond traditional measures to emphasize qualitative understanding of the meaning (gist) of numerical information in addition to the computational skills measured by objective numeracy scales and number perception tasks. Results revealed items that fell into two categories. The Categorical Thinking Scale of Gist Numeracy reflected choices between categorical distinctions in the simplest gist associated with each option, such as choosing the sure option (or the higher chance) when there is risk involved, and avoiding options whose benefits do not compensate for the possibility of death (or an undesirably high risk). The Relative Magnitude Scale of Gist Numeracy reflected perception of relative magnitude of numbers from multiple comparisons (e.g., between two magnitudes, or on a number line). Each Gist Numeracy Scale predicted different types of decisions: the Categorical Thinking Scale predicted decisions that could be boiled down to choices between the simplest, categorical gist distinctions, and the Relative Magnitude Scale predicted decisions that required an ordinal level of representation.

Theoretical Frameworks of Numeracy

There are two relevant theoretical approaches that make predictions about numeracy. The first are psychophysical approaches, in which the subjective magnitude of quantities is perceived as a nonlinear function of objective magnitude, and the second are dual-processes approaches that contrast intuition with quantitative analysis (Reyna & Brust-Renck, 2015; Reyna et al., 2009). Within the scope of dual-processes approaches, there are two main theoretical frameworks. The first are traditional approaches that contrast intuition or affect with analytical processes, in which analytical processes are considered advanced. The second is Fuzzy Trace Theory, in which intuition differs from analysis in the type of mental representation used to process information—simple gist rather than verbatim detail—and gist-based intuition is considered advanced.

Psychophysical Approaches

Psychophysical approaches emphasize perception of quantities. According to these approaches, people have a basic mental representation of numbers in which the discriminability of two magnitudes is a function of their ratio rather than the difference between them (Brannon, 2006; Brannon & Merritt, 2011; Gallistel, 2011; Gallistel & Gelman, 2005; Merritt, Casasanto, & Brannon, 2010; Siegler, Thompson, & Schneider, 2011). This mental representation of numbers suggests that the same absolute difference (e.g., 300) is perceived as larger between smaller numbers, such as 600 and 300 (ratio of 2.0), than between larger numbers, such as 1,600 and 1,300 (ratio of about 1.2). According to these psychophysical approaches, a linear representation of the subjective magnitude of quantities would mean a precise representation of their objective magnitude (same absolute difference of 300) and is considered more advanced (Opfer & Siegler, 2007).
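The contrast between absolute differences and ratios in this example can be checked numerically. The sketch below uses the number pairs from the text; the logarithmic distance is included as one common way to model a compressive representation and is an assumption of this illustration, as are the function and key names.

```python
import math

def compare_magnitudes(smaller: float, larger: float) -> dict:
    """Describe a pair of numbers by difference, ratio, and log-scale distance."""
    return {
        "difference": larger - smaller,
        "ratio": larger / smaller,
        # Distance between the two numbers on an assumed logarithmic scale.
        "log_distance": math.log(larger) - math.log(smaller),
    }

small_pair = compare_magnitudes(300, 600)
large_pair = compare_magnitudes(1300, 1600)

for label, pair in (("300 vs. 600", small_pair), ("1,300 vs. 1,600", large_pair)):
    print(label, pair["difference"], round(pair["ratio"], 2), round(pair["log_distance"], 2))
```

Both pairs differ by 300, yet the ratio and the log-scale distance are larger for the smaller pair, which is the pattern that ratio-based discriminability predicts.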

Research has shown that adults, in particular those with higher levels of education, tend to have a more linear representation of numbers (Hubbard, Piazza, Pinel, & Dehaene, 2005; Siegler et al., 2011). In addition, children progress gradually from a logarithmic (lower numbers spaced further apart than higher ones) to a linear (similar distance between numbers) representation as they grow into adulthood (Brannon & Merritt, 2011; Opfer & Siegler, 2007; Siegler, Thompson, & Opfer, 2009). This transition from a logarithmic to a linear function occurs earlier for small numbers than for larger ones (e.g., 5- to 6-year-olds are already more linear than 3- to 4-year-olds for numbers between 0 and 10; Berteletti, Lucangeli, Piazza, Dehaene, & Zorzi, 2010; Opfer et al., 2010). Fraction arithmetic procedures, however, are more likely to favor logarithmic representation at the expense of the linear one (Hecht, Close, & Santisi, 2003; Hecht & Vagi, 2010; Opfer & DeVries, 2008; Siegler et al., 2011).
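The difference between logarithmic and linear spacing can be illustrated by placing numbers on a line that runs to an upper limit of 100. This is a minimal sketch: the particular logarithmic mapping, ln(n) / ln(upper), is an assumption chosen for illustration and is defined only for numbers of 1 or more; it is not a fitted model from the studies cited.

```python
import math

def linear_position(number: float, upper: float) -> float:
    """Proportion of the line at which a number falls under linear spacing."""
    return number / upper

def log_position(number: float, upper: float) -> float:
    """Proportion of the line under an assumed logarithmic spacing (number >= 1)."""
    return math.log(number) / math.log(upper)

upper_limit = 100
for number in (10, 20, 80, 90):
    print(number,
          round(linear_position(number, upper_limit), 2),
          round(log_position(number, upper_limit), 2))
```

Under the assumed logarithmic mapping, 10 falls at the midpoint of a line ending at 100, and the gap between 10 and 20 is much wider than the gap between 80 and 90; under linear spacing the two gaps are equal.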

Psychophysical approaches predict perceived similarity among numbers, and many studies provide evidence for a basic number system that mentally represents numbers along a logarithmic curve (although some adults seem to have linear representations of quantity in different tasks). For example, understanding of numerical magnitudes is assessed by performance on tests of numerical perception discussed earlier in this article (e.g., number line, approximation of magnitudes), which are related to better performance in achievement tests, better recall of numbers, and more normative responses in riskless evaluations and risky choices (e.g., Booth & Siegler, 2006; Schley & Peters, 2014; Thompson & Siegler, 2010). Accurate performance was measured by more exact symbolic-number mappings (i.e., absolute rather than proportional differences between numbers are weighted), which indicated more linear representations (Peters et al., 2008).

Distortions in the perception of numbers indicate greater deviation from linearity, and are more likely to occur when the numbers are of higher magnitude. Common distortions are curvilinear (i.e., logarithmic) representations, which are less accurate because they flatten out for larger numbers. However, mental representations of numbers can be trained using non-symbolic magnitudes (symbolic number mapping), which improves mathematical performance (Booth & Siegler, 2008; Ramani & Siegler, 2008; Schley & Fujita, 2014). Fraction perception, for example, is likely to be more accurately estimated (and implausible solutions are more likely to be correctly rejected) by those who understand the gist of the magnitudes of the fractions than by those who do not understand the fraction magnitudes (Siegler et al., 2011).

Traditional Dual-Process Approaches

Other theories predict that numerical reasoning is a result of quantitative and analytical processes and that the arithmetic computation of numerical information is the most accurate and objective way to process information (Epstein, 1994; Epstein, Pacini, Denes-Raj, & Heier, 1996; Kahneman, 2003, 2011; Lipkus & Peters, 2009; Stanovich, West, & Toplak, 2011). According to these dual-processes approaches, analytic reasoning is more deliberate as opposed to intuition, which is based on a fast, automatic, and associative system. The analytical system (System 2) is an advanced mode of thinking based on Cartesian dualism that is also known as “rule based” and therefore “rational” and can override the intuitive system (System 1). System 1 is more primitive and is roughly similar to Freud’s psychodynamic distinction between primary versus secondary processes (Epstein, 1994; Kahneman, 2003, 2011; Reyna, 2013). Some dual-processes approaches attribute an affective component to System 1, which provides motivation to choice processes (Damasio, 1994; Lipkus & Peters, 2009).

In this view, high numeracy is a result of analytical processing (System 2) because of its deliberative, slow reasoning, which is responsible for accurate computational abilities (Peters, 2012). Low numeracy is a result of intuitive processing (System 1), which relies on impulsive and emotional ways of thinking in some dual-process approaches (Lipkus & Peters, 2009). Measures such as the Objective Numeracy Scale explained earlier in this article are often used to assess this type of deliberative computational ability. The prediction is that subjects with a high score on the objective numeracy test process numerical information more analytically, and those with a low score process numerical information more intuitively (but see Epstein, 1994). According to the theory, higher numeracy scores predict fewer biases when solving decision problems (Peters et al., 2006; Peters & Levin, 2008).

For example, Peters et al. (2011) found that low-numeracy individuals perceived a medication to be less risky when information was presented in a percentage format (10% of patients get a blistering rash) than in a frequency format (10 out of 100 patients get a blistering rash). In this view, this effect is explained as a result of reliance on the intuitive system because of the emotion generated by thinking about 10 people receiving a blistering rash in the frequency format (see also Peters et al., 2009). However, there are some consistent exceptions to the generalization that those higher in numeracy reason at a higher level (and are less subject to biases). For example, Hess, Visschers, and Siegrist (2011) showed that subjects with low numeracy were less prone to intuitive biases and performed better at estimating risk from medical screening information (e.g., whether risk was low or high) than those with high numeracy.

Fuzzy Trace Theory

There have also been many studies that emphasize the role of intuition as fundamental to the understanding of numerical concepts and to effective use of numbers (Reyna & Brainerd, 2008, 2011, 2014). Contrary to the traditional dual-processes theories described earlier, in which both processes are serial, research based on Fuzzy Trace Theory shows that precise reasoning (i.e., verbatim) and intuition (i.e., gist) are parallel independent processes. According to the theory, verbatim-based (quantitative) reasoning captures the surface form of information rather than its bottom-line (qualitative) meaning, which is gist based. Even though gist representations are less precise than verbatim ones, more advanced reasoners seem to have a preference for fuzzy processing (Mills, Reyna, & Estrada, 2008; Reyna, 2008, 2012a; Reyna, Chick, Corbin, & Hsia, 2014; Reyna, Nelson, Han, & Pignone, 2015; Reyna, Weldon, & McCormick, 2015). For example, people who understand the meaning (gist) of numerical information (e.g., the risk is small) will make more informed decisions than people who rely solely on rote (verbatim) numbers (e.g., there is a 2% chance of side effects).

Fuzzy Trace Theory builds on previous research, including traditional theoretical approaches such as psychophysical theories (for which numerical reasoning is a result of representing numbers on a mental number line), even though the emphasis in Fuzzy Trace Theory is not on the precise representation of numerical quantities but rather on fuzzy representations of numbers. Fraenkel et al. (2012) showed that providing the gist of numerical information (e.g., explaining that 2% of adverse effects means “a small chance”) increased patient knowledge, willingness to escalate care, and likelihood of making an informed choice in a pre- and post-test comparison. Medication preferences shifted from 35% value-concordant at pre-test to 64% at post-test (see also Brewer, Richman, DeFrank, Reyna, & Carey, 2012; Elstad et al., 2015).

In reviews of the literature, Reyna and colleagues (Reyna & Brust-Renck, 2015; Reyna et al., 2009) suggest that higher numeracy does not necessarily imply a more precise (verbatim) representation of numbers, but instead a better understanding of their meaning (gist), which is consistent with Fuzzy Trace Theory. In this view, higher numeracy is reflected in gist representation (or gist numeracy), because the latter requires understanding the bottom-line (qualitative) meaning of numbers, which can be assessed by measures of gist numeracy described earlier in this article. Verbatim representations involve mindless (quantitative) calculation (e.g., Liberali, Reyna, Furlan, Stein, & Pardo, 2012; Reyna, 2008, 2012a; Reyna & Brainerd, 1995, 2011). (Note that mindful computation is sometimes required, and naturally this, too, can be a feature of higher numeracy.)

Gist numeracy is about being able to connect the dots within and between quantitative dimensions (e.g., those involving probabilities and ratios) and boiling information down to its essence (i.e., gist). According to the theory, relying on gist is a result of understanding the meaning of numerical information, even when computation is required, for example when measuring blood sugar and adjusting insulin and oral medication doses. Thus, errors can result from mere calculation without proper understanding of the numbers that are being processed (Reyna & Brust-Renck, 2015; Reyna et al., 2009). According to Fuzzy Trace Theory, if people have sufficient numeracy to understand risks and probability, they will extract the gist of the numbers (e.g., Wolfe et al., 2013, 2015). Fuzzy Trace Theory suggests that the gist be presented along with the verbatim information, contrary to some other theories that recommend presenting only numerical information and leaving the decision maker (often with no background knowledge) to interpret its meaning (Reyna & Hamilton, 2001). People often want to extract their own gist. However, providing both numerical information and an interpretation (or meaning) increases the likelihood that the risk will be understood regardless of one’s numeracy. This possibility occurs because some overall differences in risk perception can be attributed to how information is communicated rather than whether people are numerate or innumerate (Brust-Renck et al., 2013, 2015).

Gist numeracy is not just representing numbers using a qualitative format (e.g., Hawley, Zikmund-Fisher, Ubel, Jancovic, Lucas, & Fagerlin, 2008; Tait, Zikmund-Fisher, Fagerlin, & Voepel-Lewis, 2010), but it is primarily a meaningful representation of the numerical information. Many decision makers cannot extract the meaning from precise (verbatim) numerical information on their own because they lack specific knowledge. Research shows that decision makers prefer to extract gist at the lowest (least precise) level possible that will allow them to discriminate their options (i.e., categorical, such as some or no risk) and escalate to more precise distinctions (i.e., ordinal, such as low or more risk) when necessary (Reyna & Brainerd, 1991, 2011). Fuzzy Trace Theory also acknowledges the impact of affect and basic emotions, in particular by contributing to the gist interpretation of information (Reyna & Rivers, 2008; Rivers, Reyna, & Mills, 2008). For example, even though the chance of being infected with HIV-AIDS is objectively “small,” the gist interpretation of the risk can be “high” because the overall meaning is influenced by emotion—HIV-AIDS is an incurable, deadly disease. This conclusion is not limited to preference-sensitive decisions (i.e., those involving uncertainty about risks and benefits for the patient), as many have claimed. Instead, understanding (or gist) is a prerequisite to making health-relevant decisions such as those involving diabetes, in which patients do not adhere to recommendations regardless of their preferences (Joram et al., 2012).

Discussion of the Literature

Many people have difficulty understanding and interpreting numerical information about risk. This lack of understanding can become particularly important in a health and medical context in which poor decisions can lead to bad outcomes, such as lower quality of life. Studies have described differences across age and gender: older people and females are somewhat less accurate when computing risk (although there are exceptions that cut the other way). However, the reasons for these differences are unlikely to involve age and gender per se; for example, they may involve social expectations and generational differences in educational access. Similarly, less educated subjects are more likely to have difficulty interpreting numerical information, and analytical abilities are related to educational attainment across countries and cultural groups.

Measures of individual differences in numeracy have been linked to cognitive processes that predict judgment and decision-making processes and outcomes. Evidence of the effects of numeracy has been widely documented, and considerable progress has been made in measuring numeracy (see Cokely et al., 2015; Nelson et al., 2008; Reyna et al., 2009). Our examination of commonly used measures of individual differences in numeracy shows that most measures have progressed from standardized tests (e.g., Davis et al., 2005; Kutner et al., 2007) and simple instruments (e.g., Lipkus et al., 2001; Peters et al., 2007; Schwartz et al., 1997) to more specialized ones based on complex modeling using item response theory or adaptive testing approaches (e.g., Cokely et al., 2012; Weller et al., 2013) and to context-specific ones (e.g., Schapira et al., 2012; Schwartz et al., 1997). While testing of item operational parameters is a common method of scale improvement (but see Fabrigar, Wegener, MacCallum, & Strahan, 1999), the new measures incorporated some of the traditional items (e.g., Weller et al., 2013), despite little or no validity testing of those items (Schwartz et al., 1997). Number perception tasks and measures of gist numeracy, however, are based on theoretical explanations of numeracy.

Computation and precision have been central to the definition of numeracy; however, theoretical advances suggest a definition that goes beyond these abilities to also encompass perception of numerical magnitude and gist understanding of risk and probability in context (Izard, Pica, Dehaene, Hinchey, & Spelke, 2011; Reyna & Brainerd, 1994, 2008, 2011; Siegler et al., 2011). Adequate understanding of risk and probability is critical for judgment and decision making (Nelson et al., 2008; Reyna & Brust-Renck, 2015; Reyna & Hamilton, 2001; Reyna et al., 2009). This is not to say that computation is unimportant or that it is not a defining part of numeracy. However, computation is not the only ability necessary for using and understanding numbers in order to make healthier decisions; indeed, computation alone leaves decision makers less informed about the significance of numbers and how they should be used (e.g., Brewer et al., 2012; Fraenkel et al., 2012; Wolfe et al., 2015).

Despite the relevance of numbers in decision making, there is some disagreement about the underlying psychological mechanisms. Some authors emphasize that people have a basic nonsymbolic mental representation of numerical magnitude (Gallistel, 2011; Siegler et al., 2011). Others emphasize that numerical reasoning is a result of quantitative and analytical processes (Epstein, 1994; Kahneman, 2003, 2011; Peters et al., 2008). Still others emphasize the role of intuition as fundamental to the understanding of numerical concepts and to effective use of numbers (Reyna et al., 2009; Reyna & Brust-Renck, 2015).

Each of the theoretical approaches has been tested empirically. The evidence suggests that number is perceived logarithmically by many populations, but the developmental and psychophysical literatures disagree with one another (in part), and the evidence clearly challenges this view as an explanation of numeracy or decision making (Reyna et al., 2009). Dual-process views as applied to numeracy are also at odds with some of the evidence. For example, people higher in numeracy are more likely to commit specific judgment and decision errors. Fuzzy Trace Theory integrates prior approaches and makes counterintuitive but supported predictions about numeracy, judgment, and decision making. Combined, these approaches suggest that numeracy involves several abilities in addition to performance on mathematical tasks, including perception of the magnitude of quantities (i.e., number sense; Siegler et al., 2011), emotion as well as analytic thinking (e.g., Lipkus & Peters, 2009), and creating a meaningful intuitive understanding of numbers (e.g., a fuzzy gist representation; Reyna et al., 2009; Reyna & Brust-Renck, 2015). This knowledge is essential for understanding and improving health communication and medical decision making.

Further Reading