Printed from Oxford Research Encyclopedias, Business and Management. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

date: 22 September 2021

Control Variables in Management Research

  • Guclu Atinc, Department of Management and Economics, Texas A&M University-Commerce
  • Marcia J. Simmering, Department of Management, Louisiana Tech University


The use of control variables to improve inferences about statistical relationships in data is ubiquitous in management research. In both the micro- and macro-subfields of management, control variables are included to remove confounding variance and provide researchers with an enhanced ability to interpret findings. Scholars have explored the theoretical underpinnings and statistical effects of including control variables in a variety of statistical analyses. Further, a robust literature surrounding the best practices for their use and reporting exists. Specifically, researchers have been directed to report more detailed information in manuscripts regarding the theoretical rationale for the use of control variables, their measurement, and their inclusion in statistical analysis. Moreover, recent research indicates the value of removing control variables in many cases. Although there is evidence that articles recommending best practices for control variables use are increasingly being cited, there is also still a lag in researchers following recommendations. Finally, there are avenues for valuable future research on control variables.


  • Research Methods

Organizational research is primarily aimed at supporting inferences about relationships among variables, and the validity of conclusions researchers draw is strengthened by ruling out plausible alternative explanations (Pedhazur & Schmelkin, 1991). Control is a primary mechanism used by researchers to reduce the threat to validity posed by alternative explanations. Control can be enacted in a number of ways, and best practices exist for the use of control variables in statistical analysis. Misuse or misinterpretation of control variables can have deleterious effects on research (Becker, 2005; Breaugh, 2008). In this article, we define and review the concept of control, summarize findings regarding the current state of practice regarding control variables, review control best practices, and provide recommendations for future research regarding control variables.

Control and Research Design

Control in research can be enacted in a variety of ways, and the choice of how to control depends primarily on research design. Experimental research can occur in the laboratory or in the field, and this design allows the researcher to manipulate a wide variety of independent variables (Kerlinger & Lee, 2000; Shadish et al., 2002). Quasi-experimental research occurs when researchers manipulate some causal variables while others vary naturally (Shadish et al., 2002). This design provides less ability for the researcher to enact control than in an experiment. Nonexperimental research occurs when the researcher has no control over independent variables and exercises no intervention (Kerlinger & Lee, 2000). Survey research in which there is no experimental condition or manipulation is nonexperimental research.

These three designs differ in the extent to which the researcher has influence. In experiments and quasi-experiments, the researcher has more discretion over who participates, what treatments are received, and what influences can be removed. In nonexperimental research, the researcher can dictate who participates, but enacting control in other ways is highly challenging.

Pedhazur and Schmelkin (1991) present four ways in which control can be enacted in research: manipulation, elimination/inclusion, randomization, and statistical control. Manipulation occurs when the researcher applies a treatment to participants (Shadish et al., 2002), and it can only be enacted in experimental and quasi-experimental research. In the context of organizational research, this could be adopting a new variable pay plan or training leaders to use different behaviors. The control enacted through manipulation assumes that the researcher has some confidence that the inferences drawn regarding relationships among variables can be directly traced to the manipulation.

Elimination and inclusion are two sides of the same coin, and both can be used in all research designs. The purpose of elimination and inclusion is to identify and isolate variables that could have an unwanted influence on relationships between independent and dependent variables (Pedhazur & Schmelkin, 1991). With elimination, the influence of a variable is removed by removing its variance (i.e., making it constant). Examples of this would be studying only unemployed workers or collecting data only from entrepreneurial firms in order to remove the variance that could be introduced by studying other groups. Inclusion of an extraneous variable to enact control means that it is purposefully made a part of the study. For instance, employment status may be included in the statistical model as a binary variable to account for its possible effect on the study variables. The variable can then be examined as part of the research design, perhaps as a moderating variable, or can be controlled for statistically.

Randomization is used as a means of control in experimental and quasi-experimental research when the influences of extraneous variables are either hard to capture or perhaps even unknown. Participants are randomly assigned to different treatment or nontreatment (i.e., control) conditions, with the assumption that differences among participant groups that could influence inferences will be equated across groups. For instance, if participant age is expected to be related to attitudes toward a new variable pay policy but is not a relationship under study, randomly assigning participants to the experimental condition (in which variable pay is received) versus the control (where no variable pay is received) would be expected to “equate” both groups on age, thus eliminating it as an alternative explanation of findings.
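The logic of randomization can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the sample of 10,000 employees, the uniform age range, and the function name `randomize` are all hypothetical and do not come from any study discussed here.

```python
import random
from statistics import mean

def randomize(ages, seed=0):
    """Randomly split participants into treatment and control halves."""
    rng = random.Random(seed)
    shuffled = ages[:]          # copy so the input list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical sample: 10,000 employees with ages between 22 and 65.
sample_rng = random.Random(42)
ages = [sample_rng.uniform(22, 65) for _ in range(10_000)]

treatment, control = randomize(ages, seed=1)
age_gap = abs(mean(treatment) - mean(control))

print(f"Mean age, treatment: {mean(treatment):.2f}")
print(f"Mean age, control:   {mean(control):.2f}")
print(f"Absolute gap:        {age_gap:.2f}")
```

With groups of this size, the two mean ages should differ only by chance, which is the sense in which randomization "equates" groups on a variable that was never measured for assignment purposes. Smaller samples would show larger chance differences.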

Statistical control is common in the organizational sciences, and although it is primarily used in nonexperimental research, it can be applied to all empirical research designs. When statistical control is used, data are collected on variables that are expected to be both extraneous and influential to the research question (Kish, 1959). These variables are then included in statistical analysis in a way that accounts for their effects on the relationships of interest. For example, if researchers study the effect of workplace policies on employee attitudes, they may control for the sex of the respondent as a means to account for general differences in opinions that may exist between male and female workers that are not a part of the research question. Or, they may statistically control for organizational tenure in order to capture the notion that an employee who has been working at the firm longer may have different perceptions than one who has been there for less time.

A final word regarding research design and control is the importance of considering context in organizational research. Johns (2001) argues that context, or the conditions surrounding participants during the course of data collection, is critical to understanding research findings. Context can range from specific decisions made during data collection (e.g., whether the survey was on paper or online) to issues particular to a participant (e.g., an individual who is newly promoted) to organizational (e.g., data collected in a firm that was recently acquired), and even more broadly. For example, organizational research in which data were collected from workers in the latter two-thirds of the year 2020 and the beginning of 2021 must acknowledge the effects of the global COVID-19 pandemic on work, the economy, and personal well-being. The role of context and its importance should be considered by the researcher, who will make a determination as to how to enact control.

Theoretical Use and Interpretation of Control Variables

The distinction between prediction and understanding is a good way to comprehend the importance of theory. Hair et al. (2016) highlight the importance of relying on theory to come up with better answers to business problems in the form of explanation and prediction. Increasing the statistical power of a statistical model by blindly including a control variable does not automatically mean that the model is explaining or answering a scientific question. What is more important is understanding what the model is explaining.

The use of statistical analysis should be preceded by a theoretical rationale. Thus, it is important to understand the conceptual role that control variables play in order to be able to interpret a statistical finding in which they are included. Recent writings have addressed the theoretical perspectives that underlie the choice, inclusion, and interpretation of control variables in statistical analysis, with special emphasis on variables that are proxies for other constructs.

A Conceptual Understanding of Control

Hair et al. (2016) define variables as the observable and measurable characteristics of a conceptual model (p. 142). In a typical study, the relationship between a dependent (criterion) and an independent (predictor) variable is hypothesized and tested. While doing that, many researchers adopt a statistical control practice that removes the variance associated with other confounding variables (Bernerth & Aguinis, 2016; Carlson & Wu, 2012). These confounding variables, sometimes called nuisance variables (Breaugh, 2006), may be included as control variables, which are used in nonexperimental studies to yield more accurate findings (Spector & Brannick, 2011) by ruling out alternative explanations. By including control variables, researchers aim to enhance the validity of their studies.

Purification Principle

Spector and Brannick (2011) refer to the purification principle as the use of control variables to purify the studied relationships between the predictor and criterion variables. By including the control variable in statistical analysis, researchers aim to eliminate other possible explanations of substantive relationships that may be due to control variables. This purification action, while done to uncover the “true relationships” (Atinc et al., 2012; Carlson & Wu, 2012), has many downsides, as Bernerth and Aguinis (2016) outline. First, while performing the so-called purifying, the researcher, many times unintendedly, masks the true relationships by suppressing the real effect of the predictors on the criterion variables (Becker, 2005; Breaugh, 2008).

For instance, assume a study is being conducted on the firm performance and chief executive officer (CEO) compensation relationship. Agency theory (Jensen & Meckling, 1976) proposes that having governance mechanisms in place to align the interests of the principal (shareholders) and the agents (top management teams) would mitigate agency problems, and in return it is likely to result in enhanced firm performance. Hence, from a fundamental theoretical perspective, one expects to see a statistically significant relationship between executive compensation structure (predictor) and firm performance (criterion) if they are to be included in a research model together. However, in such a model, there are various conceptual and nonconceptual control variables that can be included. One of the most obvious control variables is firm size. In their conceptual model, Finkelstein et al. (2009) include firm size as an antecedent of executive compensation. Tosi et al. (2000), in their highly regarded meta-analytic study, observed that more than 40% of the variance in CEO compensation is accounted for by firm size, while firm performance only accounts for less than 5%. Based on these findings, including firm size as a control variable makes intuitive sense. On the other hand, firm size may also impact the availability of resources, CEO discretion, and access to finance and communication channels (Finkelstein et al., 2009). In that case, firm size not only may suppress executive compensation’s effect on firm performance but also may mask theoretically and potentially important relationships. DiMaggio and Powell (1983) define isomorphism as a “constraining process that forces one unit in a population to resemble other units that face the same set of environmental conditions” (p. 149). 
This defined population is further specified as an institutional field, such as the micro- and macro-subfields of management research, in which common control mechanisms are used (Atinc et al., 2012, p. 60). While firm size should not be avoided as a control variable, it should not be included solely as an isomorphic action because of the institutional field; rather, the implications of its use should be given serious consideration. For example, on many occasions, control by design, such as adopting elimination as a control technique (Pedhazur & Schmelkin, 1991) by studying similar-sized companies, may be a better way of controlling for firm size than using it as a proxy control variable (Becker et al., 2016).

Considering the limitations of the purification principle from another perspective, a potential nonperceptual control variable to be included in a model of the relationship between firm performance and executive compensation is the CEO’s risk-taking propensity. Kraiczy et al. (2014) report a possible effect of the CEO’s risk-taking propensity on firm innovation, which in turn is likely to result in higher firm performance. Because it is highly unlikely that a researcher could reach a CEO with a survey that includes a scale to measure risk-taking propensity, this variable may be omitted as a control variable. In that case, the need for purification described above may not be fulfilled.

In summary, we propose that including control variables as a way to purify the true relationships should be done with caution. Although the intention might be purification, research indicates that the use of control variables may suppress substantive effects or make interpretation of relationships challenging.

Control Variables and Residual Predictors

The influence of control variables on independent variables (or predictors) is both a statistical and conceptual concern. As noted previously, researchers can use an inclusion approach with control variables by measuring those they believe have an effect on a substantive relationship. Yet, when doing so, they must consider that the meaning of the independent variables in the analysis changes. As noted by Breaugh (2008), when a control variable is entered into a statistical analysis, shared variance between it and the independent variable to which it is related creates a semipartial correlation. With the variance from the control variable removed, the independent variable is now a residual predictor, and the researcher must consider that it carries less of the variance in the construct of interest.

Breaugh (2008) details this issue by reviewing the findings presented in Judge and Cable (2004), who predicted earnings based on an analysis of height (the independent variable), with control variables of sex, weight, and age. Breaugh (2008) demonstrated that the independent variable of height was transformed into a residual predictor when shared variance due to these control variables was removed, and that “only 40% of the original height variance was reflected in the residual height predictor variable” (p. 286). Indeed, the meaning of the residual predictor here did not capture height but some conception of proportionality, as weight and sex were controlled. Breaugh (2008) and Becker et al. (2016) recommend that authors report the percentage of shared variance between predictors and the residual predictor so that readers can interpret causal relationships, particularly when the correlation between the two is not very high.
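The idea of a residual predictor can be made concrete with simulated data. The following is a minimal sketch, not a reproduction of Breaugh's (2008) analysis: it assumes a single control variable (Breaugh's example involved three), and the coefficients 0.8 and 0.6, the sample size, and all function names are hypothetical. With one control, the share of the predictor's variance that survives residualization equals one minus the squared correlation between predictor and control.

```python
import random

def variance(xs):
    """Sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def covariance(xs, ys):
    """Sample covariance."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def residualize(x, c):
    """Return the part of x that remains after regressing x on control c."""
    slope = covariance(x, c) / variance(c)
    mx, mc = sum(x) / len(x), sum(c) / len(c)
    return [xi - (mx + slope * (ci - mc)) for xi, ci in zip(x, c)]

# Hypothetical data: the predictor shares a good deal of variance with the control.
rng = random.Random(7)
control = [rng.gauss(0, 1) for _ in range(5000)]
predictor = [0.8 * c + 0.6 * rng.gauss(0, 1) for c in control]

residual = residualize(predictor, control)
retained = variance(residual) / variance(predictor)
r_squared = covariance(predictor, control) ** 2 / (
    variance(predictor) * variance(control)
)

print(f"Share of predictor variance retained: {retained:.2f}")
print(f"One minus squared correlation:        {1 - r_squared:.2f}")
```

Reporting a figure like `retained` alongside regression results is one way to follow the recommendation above, because it tells readers how much of the original predictor is still represented in the coefficient being interpreted.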

Perceptual, Nonperceptual, and Proxy Control Variables

To better understand the conceptual role that control variables play, the nature of these variables must be considered. Control variables can be described and categorized in a number of ways. Atinc et al. (2012) distinguish between perceptual and nonperceptual control variables in their review of 812 empirical papers published between 2005 and 2009. A perceptual control variable is one that is rated or judged by a respondent. For instance, in a study where leader member exchange and employee commitment are examined, using established scales from the literature (Allen & Meyer, 1990; Scandura et al., 1986), a researcher might choose to include the positive affect indicators from the PANAS (Positive and Negative Affect Schedule) scale (Watson et al., 1988) to account for the respondents’ current psychological state. In this case, a respondent makes a judgment or rating on all three of these variables, including the perceptual control variable (positive affect). Breaugh (2006) suggests that if a perceptual control variable is included in a study, it should be corrected for unreliability (see Stauffer & Mendoza, 2001, for how to make corrections) in order to ensure that all extraneous variance is removed from the perceptual control variable.
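As a rough illustration of what correcting for unreliability involves, the sketch below applies the classical correction for attenuation, in which an observed correlation is divided by the square root of the product of the two reliabilities. This is an assumption made for illustration: it is the textbook formula, not necessarily the procedure described by Stauffer and Mendoza (2001), and the function name and the numeric values are hypothetical.

```python
import math

def disattenuate(r_xy, rel_x, rel_y):
    """Classical correction for attenuation: r_xy / sqrt(rel_x * rel_y)."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)

# Hypothetical values: an observed correlation of .30 between a perceptual
# control variable (reliability .80) and a criterion (reliability .90).
observed = 0.30
corrected = disattenuate(observed, rel_x=0.80, rel_y=0.90)

print(f"Observed:  {observed:.3f}")
print(f"Corrected: {corrected:.3f}")
```

Because measurement error attenuates observed correlations, the corrected value is never smaller in magnitude than the observed one. When both reliabilities equal 1, the correction leaves the correlation unchanged.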

A nonperceptual control variable, on the other hand, is one on which no judgment or rating is made. For example, if a respondent’s age is collected, a simple question asking age to be entered is included as a survey question. This type of question does not require a rating on an established scale to measure a latent variable like positive affect. In this case, there is no need to correct for unreliability because it does not exist for nonperceptual variables. Atinc et al. (2012), in their review of 5 years of organizational research, found that 94.7% of all articles in their sample used a nonperceptual control variable, with 98.71% use in macro-studies, and 91% use in micro-studies. The use of perceptual control variables was much lower, in about 25.8% of articles reviewed.

In many instances, nonperceptual control variables can be described as proxy variables. Pedhazur and Schmelkin (1991) define a proxy variable as a substitute (Miller & Rao, 1971) or a surrogate (Atinc et al., 2012; Becker et al., 2016) for a variable that was not directly measured. The use of demographic variables as proxy control variables in the social sciences is prevalent (Breaugh, 2008). Spector and Brannick (2011) include an example to illustrate the use of proxy variables, hypothesizing a relationship between age and job performance. Spector and Brannick (2011) argue that naturally, as employees get older, they become more experienced and possess more knowledge about their jobs which in turn might translate into better job performance. Hence, when using age as a measure, the real relationship is not between age and performance but with multiple other variables for which age is used as a proxy.

Proxy variables are frequently used as control variables in social science studies (Atinc et al., 2012). In micro-level management studies, demographic variables are often used as proxies, and in macro-level management studies, firm size and industry are frequently included (Becker et al., 2016). In many of the studies, neither a theoretical justification nor any knowledge about the strength of the relationship between the proxy variables and other focal independent variables and control variables is present. Including an unsubstantiated proxy control variable may either introduce unintended confounding effects to a study or may result in the removal of valuable variance that the researcher may be interested in studying. This is due to the imprecision with which the proxy variable measures what is intended. Echoing the contentions of Becker et al. (2016) and Breaugh (2008), we discourage researchers from using proxy variables as control variables unless there is a sound theoretical reason for including them. Proxy variables, due to their nonspecific nature, are limited in their usefulness. Theoretically justified control variables are meant to facilitate more accurate estimates of the relationships between the predictor and criterion variables; proxy variables, as surrogates, carry a higher risk of reducing that accuracy.

Control Variables in Statistical Tests

As noted previously, control can occur through research design. However, most writings on control in organizational research acknowledge the heavy use of statistical control. As academic journals in organizational science have become increasingly rigorous in terms of statistical analysis, the proper use of statistical control is an important topic. In this section, we provide an overview of the overall effects of including one or more control variables in an empirical analysis, review the use of control variables in three specific statistical tests, consider the effects of using multiple control variables in an analysis, and explore recommendations for when to remove control variables from an analysis.

Zikmund et al. (2008) define causal research as the type of research that “allows inferences to be made; seeks to identify cause and effect relationships” (p. 57). The authors also discuss three conditions necessary for causality to be present: temporal sequence, which means that the cause must occur before the effect; concomitant variation, which means that the two variables vary together systematically; and nonspurious association, which means that the covariation between a cause and an effect is not due to a third variable. The main purpose of including control variables in a study is to ensure that nonspurious association is present. This situation applies to different statistical methods. We discuss some of these statistical methods below.

When researchers enact control through the use of measured nuisance variables in statistical analysis, the effects of this action must be considered. At a fundamental level, adding any variable to analysis changes the degrees of freedom of that analysis (Bernerth & Aguinis, 2016). Degrees of freedom are equal to the maximum number of logically independent values that have the freedom to vary in a data sample. As control variables are added, degrees of freedom decline, unless sample size is increased. One danger of reducing degrees of freedom too much is overfitting a model or producing misleading results due to the model describing random error rather than substantive relationships (Draper & Smith, 1998).
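The cost of each added control variable can be seen in a short calculation. The sketch below assumes an ordinary least squares model with an intercept, for which residual degrees of freedom are n - k - 1, and uses the standard adjusted R-squared formula to show how the same raw fit is penalized as predictors accumulate. The function names, the sample size of 60, and the R-squared of .30 are hypothetical.

```python
def residual_df(n_obs, n_predictors):
    """Residual degrees of freedom for OLS with an intercept: n - k - 1."""
    df = n_obs - n_predictors - 1
    if df <= 0:
        raise ValueError("not enough observations for this many predictors")
    return df

def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n_obs - 1) / residual_df(n_obs, n_predictors)

# Hypothetical study: 60 respondents and a raw R-squared of .30,
# evaluated as if it came from models with 1, 5, or 15 predictors.
n = 60
for k in (1, 5, 15):
    print(k, residual_df(n, k), round(adjusted_r2(0.30, n, k), 3))
```

Holding the raw R-squared fixed, the adjusted value shrinks as predictors are added, which reflects the concern that a heavily parameterized model in a small sample may be describing random error.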

In any statistical analysis, the researchers’ choices to include or exclude control variables have an effect on error and findings. Sturman et al. (2021) conducted a large-scale data simulation to determine the degree to which control variable use resulted in Type I error (in which a true null hypothesis is rejected) or Type II error (in which a false null hypothesis is not rejected). While some authors warn of the potential for control variables to increase both types of error, Sturman et al. (2021) indicate that the inclusion or exclusion of control variables had almost no impact on Type I error and had only minimal impact on Type II error. Researchers can reduce the risk of Type II error by choosing theoretically appropriate control variables (Atinc et al., 2012; Spector & Brannick, 2011).

More important, Sturman et al. (2021) investigated the case of data in which true relationships among substantive variables were nonzero but of undetectable magnitude in a typical social science sample size. In their simulation, they concluded: “The inclusion of control variables increased the probability of both having and observing a statistically significant effect of X on Y” (p. 8) and that “including additional control variables, given typical relationships among microlevel variables, increases the chance of finding a statistically significant effect” (p. 8). With this finding, Sturman et al. (2021) warn of the dangers of p-hacking, which is “trying multiple analyses to obtain statistical significance” (Simonsohn et al., 2014, p. 534). Because the inclusion of different control variables can produce statistically significant findings, researchers operating in a publish-or-perish environment may be more prone to control variable use that is aimed at p-hacking (Sturman et al., 2021).

How control variables operate in specific, common statistical analyses is addressed in the following sections. Keeping in mind the issues related to error and statistical significance, a researcher must be mindful of the influence that control variables exert on results and also the proper way in which to interpret findings when control variables are in a model.


Correlation

Correlation is a statistical measure of the strength of a relationship between two variables. Of the three conditions of causality mentioned above, the correlation coefficient (r) measures only the extent to which concomitant variation is present. It is the researcher’s task to ensure that temporal sequence and nonspurious association are present. Thus, Becker (2005) recommends that researchers include the control variables in their correlation tables. Understanding the covariation between the criterion, the predictor, and the control variables is an important step toward assessing the causality between studied variables. For example, a statistically significant correlation between a dependent and an independent variable may be due to a third variable (a violation of the nonspurious association requirement), and the control variable might be that third variable.
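One way to check whether a zero-order correlation might be spurious is to compute the first-order partial correlation from the three pairwise correlations in a correlation table. The sketch below uses the standard partial correlation formula; the function name and the correlation values are hypothetical and chosen only to contrast two cases.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant."""
    denom = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denom

# Case 1: the control z relates strongly to both x and y,
# and the x-y correlation disappears once z is held constant.
spurious = partial_correlation(r_xy=0.30, r_xz=0.60, r_yz=0.50)

# Case 2: the control z is only weakly related to x and y,
# and the x-y correlation is nearly unchanged.
robust = partial_correlation(r_xy=0.30, r_xz=0.10, r_yz=0.10)

print(f"Partial r, strong third variable: {spurious:.3f}")
print(f"Partial r, weak third variable:   {robust:.3f}")
```

This is why having control variables in the correlation table is useful: the pairwise correlations are all a reader needs to gauge how much of an observed relationship could be attributable to the control.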


Regression

Regression is a statistical method used to estimate the relationships between dependent and independent variables (Hair et al., 2016). There are various forms of regression such as simple or multiple linear regression, logistic regression, probit regression, and generalized linear models. In addition, there are multiple estimation techniques such as least squares, weighted, and Bayesian. For the purpose of this discussion on control variables, we concentrate on linear regression with ordinary least squares (OLS) estimation. Using OLS, a researcher aims to find the line that fits the given data by minimizing the sum of the squares of differences between the observed and predicted values of the studied variables (Hair et al., 2016).

Breaugh (2008) includes a detailed discussion on the use of control variables in regression. A typical regression formula is as follows:

(1) $Y = a + \beta_1 X_1$

where $Y$ is the dependent variable, $a$ is the Y-intercept (the value of $Y$ when $X_1$ is zero), $X_1$ is the independent variable, and $\beta_1$ is the coefficient of the independent variable, which shows the change in $Y$ given a unit change in $X_1$ (the slope). A typical regression output includes $\beta_1$ and its corresponding statistical significance (testing the null hypothesis that $\beta_1 = 0$). Adding a control variable extends the model to:

(2) $Y = a + \beta_1 X_1 + \beta_2 X_2$

where $X_2$ is included as a control variable and $\beta_2$ is the coefficient of the control variable, which shows the change in $Y$ given a unit change in the control variable ($X_2$).

$\beta_1$ in equation (1) and $\beta_1$ in equation (2) are no longer the same coefficient, because the second equation includes a control variable ($X_2$) that is likely to have an impact on the dependent variable ($Y$), the independent variable ($X_1$), or both. In a simple regression like equation (1), the coefficient of determination, which is the proportion of variance in the dependent variable accounted for by the independent variable, is the square of the correlation between the two variables. For this reason, the coefficient of determination is denoted $r^2$ rather than $R^2$ in a simple regression. On the other hand, $R^2$ in the second equation (multiple regression) includes the additional variance explained by the newly added variable. In other words, a control variable is included in a regression model for the purpose of capturing the variance that the predictor is not able to capture so that, as previously mentioned, a purification process takes place. However, as Breaugh (2008) demonstrates, by doing purification, the researcher may not be testing the intended hypothesis but one that is conditional on the control variable. That is the fundamental reason that a theoretical justification is crucial when control variables are included in a model (Becker et al., 2016).
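A small simulation can show how the coefficient on the independent variable shifts once a correlated control variable enters the model. The sketch below is illustrative only: the data-generating values (0.6, 0.5, 0.8), the sample size, and the function names are hypothetical assumptions, and the two-predictor slopes are obtained from the closed-form solution of the OLS normal equations rather than from a statistics package.

```python
import random

def cov(xs, ys):
    """Sample covariance; cov(x, x) is the sample variance."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def slope_without_control(y, x1):
    """OLS slope for equation (1): Y regressed on X1 only."""
    return cov(x1, y) / cov(x1, x1)

def slopes_with_control(y, x1, x2):
    """OLS slopes for equation (2): Y regressed on X1 and control X2."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return b1, b2

# Hypothetical data: the control X2 influences both X1 and Y.
rng = random.Random(2021)
n = 5000
x2 = [rng.gauss(0, 1) for _ in range(n)]
x1 = [0.6 * c + rng.gauss(0, 1) for c in x2]
y = [0.5 * a + 0.8 * c + rng.gauss(0, 1) for a, c in zip(x1, x2)]

b1_alone = slope_without_control(y, x1)
b1_controlled, b2 = slopes_with_control(y, x1, x2)

print(f"Slope of X1 without control: {b1_alone:.3f}")
print(f"Slope of X1 with control:    {b1_controlled:.3f}")
print(f"Slope of control X2:         {b2:.3f}")
```

In this setup the slope without the control absorbs part of the control's effect, so it is noticeably larger than the slope with the control. Whether the smaller or the larger coefficient answers the research question depends on the theoretical role of the control, which is the point made above about justification.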

Once again, we urge researchers to follow the recommendations outlined by Becker et al. (2016). We also recommend that $R^2$ values with and without the control variables be reported, along with inclusion of the control variables in the correlation tables as previously suggested. This way, zero-order correlations between the criterion and predictor and control variables can be directly observed, which in turn aids the researcher in interpreting the results of the regression model.
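For a model with one predictor and one control, the two values can be derived directly from the zero-order correlations in a correlation table. The sketch below uses the standard two-predictor formula for the squared multiple correlation; the function name and the correlation values are hypothetical.

```python
def r2_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation for Y regressed on two predictors."""
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Hypothetical correlation table entries:
# criterion-predictor, criterion-control, predictor-control.
r_y1, r_y2, r_12 = 0.50, 0.40, 0.30

r2_without_control = r_y1 ** 2                      # predictor only
r2_with_control = r2_two_predictors(r_y1, r_y2, r_12)
delta_r2 = r2_with_control - r2_without_control     # added by the control

print(f"R-squared without control: {r2_without_control:.3f}")
print(f"R-squared with control:    {r2_with_control:.3f}")
print(f"Change in R-squared:       {delta_r2:.3f}")
```

Reporting both values lets readers see how much explained variance is attributable to the control rather than to the substantive predictor. With more than one control, the same comparison would be made by fitting the two nested models.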

Structural Equations Modelling

Structural equations modelling (SEM) is a second-generation multivariate technique in which indicators of latent variables can be directly incorporated, and many of the limitations of first-generation techniques such as regression are overcome (Hair et al., 2014). Using SEM, a researcher may incorporate both observed and unobserved (latent) variables either to explore patterns or to support existing theories. Regardless of whether a researcher chooses to utilize the traditional covariance-based SEM or the relatively new path modelling SEM, control variables can be used, and their treatment is not very different than it is in multiple regression. The need for justification is still the most important requirement for including a control variable in an SEM model. Also, as in regression, control variables should be treated like the independent variables. The biggest difference is that in SEM, one can specify the covariation between study variables. Hence, one can covary a control variable with any of the endogenous variables based on theory. While SEM might allow for more accurate statistical control, its use does not lessen the importance of justification for adding control variables as each variable added to a structural model introduces a new parameter and affects the covariance matrix. The fit indices with and without the control variables must be considered and individual coefficients must be reported and discussed. In other words, all of the Becker et al. (2016) recommendations apply regarding the treatment of control variables in SEM.

Commonly Used Control Variables

In this section, we review typical control variables that are likely to appear in both macro- and micro-research. In micro-research, age, sex, organizational tenure, and attitudes are commonly studied (Atinc et al., 2012; Bernerth et al., 2018). Age is a nonperceptual variable that can be used as both a proxy (e.g., for life experiences and years of work experience) and a direct measure of how biological age relates to study variables (Bernerth & Aguinis, 2016), and it is often considered the former. Age is often included as a control variable with little or no rationale (Atinc et al., 2012; Carlson & Wu, 2012), and thus readers must make assumptions about why age should be equalized across all respondents; otherwise, its value as a control is limited.

Organizational tenure—the length of time a respondent has worked for an employer—is a typical control variable, and age has been used as a proxy for it (Bernerth & Aguinis, 2016). Tenure may capture information like work experience, relationship with employer or supervisor, or organizational commitment. Notably, if used as a proxy variable, tenure has serious limitations, because many factors may influence an employee’s length of employment. Researchers are therefore better served to more specifically measure constructs of interest.

Sex is a control variable that can be used to capture biological characteristics, but in management research it is more often used to account for a range of experiences and behaviors that might differ, on average, between women and men at work (more appropriately called gender). Sex is included as a control variable when these differences are intended to be ignored or nullified. There are two primary limitations with the use of sex or gender as controls. First, experiences at work and in the home are more variable than ever for women and men; one can no longer assume that being female equates to having primary child care responsibility or having a lower income in a two-income family. Thus, using sex to capture elements of traditional gender roles should be avoided. Second, as attention has turned to different definitions of gender identity, researchers cannot assume that this variable can be captured as a dichotomy. Scholars distinguish between sex (a biological distinction of male or female) and gender (based on roles and expectations of men and women in society) (Phillips, 2005). Increasingly, researchers have a number of ways to capture a wider variety of respondents’ gender identity and gender-related perspectives (Smiler & Epstein, 2010), which may lead researchers to choose control variables that are more salient to organizational experiences than self-reported sex or gender. According to Smiler and Epstein (2010), categories of measures of support for and adherence to cultural gender norms that are relevant to surveys of adults in workplaces are (a) trait measures that purportedly are more common to men or women and (b) ideology measures, which capture a person’s “endorsement of a culture’s ideological beliefs about gender roles” (p. 136), such as women agreeing that they are a more nurturing gender. 
Further, these authors present measures of gender role conflict or stress (i.e., “the degree to which internalization to traditional gender roles is likely to cause stress in an individual’s life”) (Smiler & Epstein, 2010, p. 145) and measures related to the relative position of men and women in society, assessing sexism and feminist identity.

In summary, age, organizational tenure, and sex are often included as control variables as proxies for other, more difficult-to-measure phenomena. Bernerth and Aguinis (2016) argue that these variables are likely included because they are easily accessible and because doing so is normative, echoing a charge of isomorphism by Atinc et al. (2012), and that they are often not statistically related to the substantive variables studied. Indeed, Bernerth et al. (2018) called for a moratorium on the use of demographic control variables, particularly as proxies, unless there is a clearly defined rationale for their use. Not only are there theoretical limitations with the use of these control variables, but scholars also note serious problems with the conclusions that one can draw when such control variables are included in statistical analysis. Namely, results describe findings based on sexless and ageless individuals who are not truly representative of real individuals at work (Breaugh, 2008). So, what is the researcher to do? If these effects are not included, they can confound the research questions, but if they are included, interpretation of findings is limited. Becker et al. (2016) are proponents of minimizing the use of statistical controls, yet if they are used, Breaugh (2008) recommends explaining findings in terms of the residual. Notably, in instances where a more specific construct can be captured (such as child care responsibilities at home, rather than gender), it should be collected.

Turning to the notion of context (Johns, 2001), while the preceding variables may not be ideal as control variables, they can provide meaning to a study by describing the sample of respondents. Indeed, it is recommended that such data be shared to allow for increased scrutiny of findings. In their evidence-based best-practice recommendations to enhance methodological transparency of research design in the organizational sciences, Aguinis et al. (2019) note the importance of describing sample characteristics in terms of age, sex, and employment status.

Attitudes, such as job satisfaction, job strain, and so forth, are frequently used perceptual control variables that differ based on research domain. Unlike proxy variables, these controls capture specific information related to hypotheses. Yet, caution is still warranted. As noted by Breaugh (2008) and Becker (2005), interpreting results of statistical analyses when controls are used is limited, as the researcher is describing a phenomenon with a residual predictor. While some may argue that attitudes, like negative affectivity (NA), are a nuisance, others may consider them to be relevant and meaningful to the study at hand (Spector, 2006).
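To make the idea of a residual predictor concrete, the following minimal Python sketch uses invented numbers (the variables `na`, `satisfaction`, and `performance` are hypothetical and are not drawn from any study cited here). It illustrates that the coefficient for a predictor in a regression that also includes a control equals the slope of the outcome on the part of the predictor that the control does not explain, which is why findings are best described in terms of that residual.

```python
# Illustrative sketch only: invented data, plain-Python least squares.

def mean(v):
    return sum(v) / len(v)

def simple_slope(x, y):
    """Slope from regressing y on a single predictor x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residualize(x, c):
    """Part of x left over after removing what c linearly predicts."""
    b = simple_slope(c, x)
    mx, mc = mean(x), mean(c)
    return [a - (mx + b * (k - mc)) for a, k in zip(x, c)]

def partial_slope(x, c, y):
    """Coefficient of x when y is regressed on x and c together."""
    mx, mc, my = mean(x), mean(c), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    scc = sum((k - mc) ** 2 for k in c)
    sxc = sum((a - mx) * (k - mc) for a, k in zip(x, c))
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    scy = sum((k - mc) * (b - my) for k, b in zip(c, y))
    return (sxy * scc - sxc * scy) / (sxx * scc - sxc ** 2)

# Hypothetical scores for eight respondents.
na = [1, 2, 3, 4, 5, 6, 7, 8]            # control (negative affectivity)
satisfaction = [2, 1, 4, 3, 6, 5, 8, 7]  # focal predictor
performance = [3, 2, 6, 4, 8, 9, 11, 10]  # outcome

b_zero_order = simple_slope(satisfaction, performance)
b_controlled = partial_slope(satisfaction, na, performance)
b_residual = simple_slope(residualize(satisfaction, na), performance)

print(b_zero_order, b_controlled, b_residual)
```

In this sketch the last two printed values coincide: the "controlled" coefficient describes satisfaction with negative affectivity removed, not satisfaction as respondents reported it.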

In macromanagement studies, three control variables are frequently used: industry, firm size, and prior firm performance. Historically, industry has been measured with dummy variables in statistical models. Two alternatives exist. First, researchers may concentrate on a specific industry instead of multiple industries (i.e., use exclusion as a control). Second, researchers may randomly select companies from various industries and ensure that no single industry is heavily represented in the selected sample. We believe that, instead of including an “industry dummy” as a control variable, accounting for industry by other means, such as those just mentioned, would be more consistent with the recommendations of the pioneering studies (Becker et al., 2016) and may provide more useful results.
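The three ways of handling industry described here can be sketched briefly in Python. The firms, industry labels, and values are invented for illustration, and the function names are hypothetical helpers rather than part of any library.

```python
from collections import Counter

# Hypothetical firms; industry labels and values are invented.
firms = [
    {"name": "A", "industry": "retail", "roa": 0.04},
    {"name": "B", "industry": "banking", "roa": 0.01},
    {"name": "C", "industry": "software", "roa": 0.09},
    {"name": "D", "industry": "retail", "roa": 0.03},
]

def industry_dummies(rows, reference):
    """Statistical control: k-1 indicator columns, omitting the reference industry."""
    levels = sorted({r["industry"] for r in rows} - {reference})
    return [{f"ind_{lv}": int(r["industry"] == lv) for lv in levels} for r in rows]

def restrict_to_industry(rows, industry):
    """Control by exclusion: keep only firms from a single industry."""
    return [r for r in rows if r["industry"] == industry]

def industry_shares(rows):
    """Share of the sample in each industry, to check that none dominates."""
    counts = Counter(r["industry"] for r in rows)
    return {ind: n / len(rows) for ind, n in counts.items()}

dummies = industry_dummies(firms, reference="retail")
retail_only = restrict_to_industry(firms, "retail")
shares = industry_shares(firms)

print(dummies)
print([f["name"] for f in retail_only])
print(shares)
```

The share check only describes the sample; what counts as "heavily represented" is a judgment left to the researcher.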

Firm size is another popular control variable in macro-studies. Although we agree that firm size is a theoretically justified variable in the majority of the studies we reviewed, its inclusion should not be done for isomorphic reasons (Atinc et al., 2012). The statistical models should be run with and without firm size to check for differences (Becker et al., 2016).
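The with-and-without comparison can be sketched in a few lines of plain Python. The data are deliberately contrived and hypothetical (`board_independence`, `firm_size`, and `performance` are illustrative names only): performance is constructed to depend solely on firm size, so the focal coefficient looks substantial until firm size enters the model. The `ols` helper is a small normal-equations solver written for this sketch, not a library function.

```python
def ols(columns, y):
    """Least-squares coefficients [intercept, b1, ...] via the normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented system (X'X | X'y).
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[pivot] = A[pivot], A[p]
        for r in range(k):
            if r != p:
                f = A[r][p] / A[p][p]
                A[r] = [a - f * b for a, b in zip(A[r], A[p])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Contrived, hypothetical data for six firms.
firm_size = [1, 2, 3, 4, 5, 6]
board_independence = [1, 3, 2, 5, 4, 6]
performance = [2 * s for s in firm_size]  # driven by size alone, by construction

b_without = ols([board_independence], performance)[1]
b_with = ols([board_independence, firm_size], performance)[1]

print("focal coefficient without firm size:", b_without)
print("focal coefficient with firm size:   ", b_with)
```

Reporting both coefficients side by side lets readers see how much of the focal relationship depends on the control.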

Finally, prior firm performance is a ubiquitous control variable in macro-studies. In many of these studies, company-level performance is the dependent variable of interest, so accounting for prior firm performance allows for an explanation of the variance in performance during the study time frame. Even though we do not object to this practice of including prior performance, we urge researchers to be cautious about the amount of variance that is due to prior firm performance. After all, it may not be easy to fully isolate prior performance’s effect on current performance.

Review of Seminal Research

There are multiple insightful studies published during the past two decades on the use of control variables in social sciences research. Some reviewed previously published research to identify the state of the field, while others provided conceptual arguments regarding proper control variable use. Becker (2005) examined a random set of 60 articles from top management journals published between 2000 and 2002 and evaluated them on 10 dimensions related to the quality of control variable use. He then provided 12 recommendations for future research using control variables. Atinc et al. (2012) conducted a review of 812 micro- and macro-oriented articles published from 2005 to 2009 in four premier journals in the management field, finding that control variables accounted for more variance than the main effects in many studies and that justification for inclusion of the variables was not prevalent. They urged scholars to provide a theoretical foundation for statistical control. A study by Carlson and Wu (2012) extended the Becker (2005) and Atinc et al. (2012) studies in an evaluation of 162 published management articles. An analysis of the zero-order correlations between control variables and the independent and dependent variables indicated that control variables had little impact on outcomes. These authors concluded the following about control variables: “If in doubt, leave them out.” Finally, Bernerth and Aguinis (2016) conducted an in-depth review and content analysis of 580 micromanagement studies and concluded that many of the control variables were demographic factors and that authors did not provide adequate explanation about their treatment of control variables.

Other authors’ reviews provide a better understanding of the conceptual issues related to control variable use. Breaugh (2006) assessed the use of nuisance variables specifically with the multiple regression technique, finding that, depending on the relationship between the nuisance and predictor and nuisance and criterion variables, correlations and many times shared variances could be used as effect sizes instead of the traditional statistical significance tests of multiple regressions. Further, Breaugh (2008) focused on the use of nuisance variables in nonexperimental studies, warning about the limited generalizability of findings and the threat to external validity introduced by improper treatment of control variables. Spector and Brannick (2011) criticized the automatic inclusion of control variables in multiple regressions and considered this practice to be based on “methodological urban legend” (p. 287). They also highlighted the possibility of contamination and spuriousness due to inclusion of control variables and how authors may be inaccurately testing their stated hypotheses.

Becker et al. (2016) conducted a collaborative study where the lead authors shared their input on the treatment of statistical control in correlational studies, making 10 essential recommendations for organizational researchers, subsuming the prior recommendations into a compendium for researchers. Table 1 lists these recommendations, which are summarized here. First, from Carlson and Wu (2012), Becker et al. (2016) encourage researchers to err on the side of omitting control variables in order to facilitate better interpretation of the results of analyses. Second, as discussed previously, proxy control variables are limited in their ability to capture variables of interest and are to be avoided. Researchers should attempt to identify the conceptual meaning of control variables used. The third recommendation is that control variables be included in hypotheses and models as a means of addressing both the effects of control variables and the potential for creating residual predictors. Fourth, researchers are encouraged to give more details about how control variables are measured (again, supporting the notion that proxies are to be avoided) and included in analyses in order to “facilitate the interpretation and replication of results” (Becker et al., 2016, p. 161). The fifth recommendation regards the reliability and validity of control variables. Researchers often overlook construct validity for control variables, but this can reduce the accuracy of the statistical tests that use them. Sixth, researchers should be consistent in their hypotheses and analyses, such that a hypothesis without a control variable should only be tested with an analysis that does not include a control variable, to encourage appropriate model specification. The seventh suggestion by Becker et al. (2016) is to conduct comparative tests of relationships between independent and control variables, especially when perceptual control variables are used. 
The risk of not doing this is that the control variables could create contamination by affecting measured but not latent variables, or could cause a spurious relationship by affecting latent variables. Failure to do this can cause misinterpretation of the model. Eighth, researchers are encouraged to run their model both with and without the control variables in order to compare the findings. This could uncover meaningful interactions between control and independent variables. Ninth, as a means to facilitate understanding of the psychometric properties of the control variables used, authors are encouraged to report descriptive statistics for them and correlations between the control and independent variables. Further, researchers should identify the conceptual nature of any residual predictor created by inclusion of control variables. Finally, in their tenth recommendation, Becker et al. (2016) again address the notion of the residual predictor and recommend that findings not be extended to “fictional people” who are represented when statistical control is enacted. For instance, a researcher who controls for sex and age cannot generalize to a broader population from the research subjects for whom sex and age have been removed.

Table 1. Top 10 Recommendations regarding the Treatment of Control Variables

1. When in doubt, leave them out!

2. Select conceptually meaningful CVs and avoid proxies.

3. When feasible, include CVs in hypotheses and models.

4. Clearly justify the measures of CVs and methods of control.

5. Subject CVs to the same standards of reliability and validity as are applied to other variables.

6. If the hypotheses do not include CVs, do not include CVs in the analysis.

7. Conduct comparative tests of relationships between IVs and CVs.

8. Run results with and without CVs and contrast findings.

9. Report descriptive statistics and correlations for CVs, and the correlations between the measured predictors and their partialled counterparts.

10. Be cautious when generalizing results involving residual variables.

Source: Becker et al. (2016, p. 158).
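Recommendation 9 asks authors to report how strongly each measured predictor correlates with its partialled counterpart. A minimal sketch of that calculation follows, using invented values for a hypothetical predictor (`tenure`) and control (`age`). With a single control, this correlation equals the square root of one minus the squared predictor-control correlation, so a low value signals that the residual predictor differs substantially from the construct that was measured.

```python
import math

def mean(v):
    return sum(v) / len(v)

def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)

def residualize(x, c):
    """Residuals of x after regressing it on the control c (with intercept)."""
    mx, mc = mean(x), mean(c)
    slope = (sum((p - mx) * (q - mc) for p, q in zip(x, c))
             / sum((q - mc) ** 2 for q in c))
    return [p - (mx + slope * (q - mc)) for p, q in zip(x, c)]

# Hypothetical values for eight respondents.
age = [25, 30, 35, 40, 45, 50, 55, 60]   # control
tenure = [2, 5, 4, 9, 8, 14, 12, 20]     # measured predictor

tenure_partialled = residualize(tenure, age)
r_predictor_control = pearson(tenure, age)
r_measured_vs_partialled = pearson(tenure, tenure_partialled)

print("r(tenure, age):", r_predictor_control)
print("r(tenure, tenure with age removed):", r_measured_vs_partialled)
```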

Citation Trends of Seminal Research

These eight articles provide a comprehensive view of the most recent understanding of and advice regarding control variables. Each provides clear guidelines as to best practices and, as such, is a valuable resource for organizational scholars. We conducted a citation analysis of these eight studies using Google Scholar to assess the interest among social science researchers regarding the proper treatment of control variables.

Table 2. Citation Trends of Seminal Articles (as of February 2021)


| Article | No. of citations | Average number of citations per year |
| Becker (2005) | | |
| Breaugh (2006) | | |
| Breaugh (2008) | | |
| Spector and Brannick (2011) | | |
| Atinc et al. (2012) | | |
| Carlson and Wu (2012) | | |
| Becker et al. (2016) | | |
| Bernerth and Aguinis (2016) | | |

As can be seen in Table 2, there is increasing interest in this series of articles published during the past two decades. However, we are cautious in interpreting this interest and believe that a further study must be conducted on whether researchers are in fact following the recommendations of these studies. As Atinc et al. (2012) and Carlson and Wu (2012) offered isomorphism as a reason for inclusion of control variables, citation of these articles might well become a normative or an isomorphic practice. Future research may investigate the extent of proper application of the recommendations offered by these researchers in social sciences.

It should be noted that there are multiple other studies published in this area that are not included in Table 2. For instance, Edwards (2008) wrote a very insightful paper that includes a dedicated section on the use of control variables in organizational psychology. Also, more recent significant papers, for example, by Wysocki et al. (2020) and Spector (2020), were not included in this list due to the short lag time allowable for citation. Wysocki et al. (2020) underline the importance of specifying the causal structure between the study variables and the confounders (control variables), while Spector (2020) introduces the hierarchical iterative control approach as a contemporary strategy to make the most of control variables.

Evaluation of Sample Studies

In order to understand the latest practices in management research with regard to the use of control variables, we randomly selected 10 articles published in Academy of Management Journal in 2020. Five of these articles were micro-oriented and the remaining five were macro-oriented studies. Though we understand that such a small sample is not representative of the population, and the results may not generalize to all management articles published recently, we believe this analysis gives a sense of the extent to which researchers are following the practices recommended by various studies published in this area (Atinc et al., 2012; Becker, 2005; Becker et al., 2016; Bernerth & Aguinis, 2016; Breaugh, 2006, 2008; Carlson & Wu, 2012; Spector & Brannick, 2011). Table 3 reports the results of our evaluation of these sample articles.

Table 3. Evaluation of 10 Sample Studies

| Evaluation criteria | Micro-oriented studies | Macro-oriented studies |
| Basis for inclusion of CV provided | | |
| Citation for CV included | | |
| CV included in hypotheses | | |
| Measures of CV reported | | |
| Proxy CV used | | |
| Predicted sign of CV and DV | | |
| CV reported in results | | |
| CV included in correlation table | | |
| CV descriptive statistics reported | | |
| CV in discussion | | |

As can be seen in Table 3, none of the micro- or macro-studies included the control variables in their hypotheses. Also, only one of the papers predicted the signs between the control variables and the dependent variables. Finally, none of these studies published in this highly regarded journal discussed their findings on control variables in their discussion sections. Micro-oriented studies included fewer citations for all the control variables and none of them reported their control variable findings in the results section (listing demographics was not considered reporting findings). All the macro- and micro-studies utilized a control variable that can be considered a proxy variable. Most of the micro- and macro-studies included the control variables in the correlation tables and provided the descriptive statistics about them. Finally, more than half of the micro-studies and all the macro-studies incorporated a basis for inclusion.

In short, some of the current practices in management research seem to be consistent with the recommendations, whereas there needs to be much more progress with some others. This finding that best practices are not routinely followed is not surprising, as prior research has also identified this trend, despite heavy citation of control variables articles. We speculate as to the reasons for this. First, as previously identified, isomorphism in control variable use and reporting is common (Atinc et al., 2012). Second, as argued by Vandenberg (2006), much practice in research comes from doctoral student training and the peer-review process. It is possible that the vast amount of research on control variable best practices has not been emphasized enough by either, particularly as journal reviewers are more likely to focus on other, more sophisticated methodological issues. Finally, as journal space is limited, an author who wants to remove a few sentences may find that those describing the rationale behind control variable choice are easy to cut.

Our review indicates that, at this point, macro-studies may be following best practices more often. This could be due to more utilization of secondary data. We believe that micro-studies must catch up, especially with regard to providing a basis for inclusion, reporting of measures, and their findings on control variables in their results sections. Finally, it is worth warning researchers away from the practice of p-hacking, which is when a researcher “subject[s] data to many calculations or manipulations in search of an equation or classification system that captures strong patterns” (Starbuck, 2016, p. 171). If a researcher runs multiple analyses that include different control variables (with an absence of conceptual rationale) in order to have a better chance of obtaining statistically significant findings, this is p-hacking, and such practices “render the usual estimates of p-values invalid” (Starbuck, 2016, p. 171).
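Why trying different control variables invalidates p-values can be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not a reproduction of any cited analysis: the predictor and outcome are generated to be unrelated, five hypothetical candidate controls are pure noise, and significance is judged with the Fisher z approximation. It compares how often a single prespecified test is significant with how often at least one of several control specifications is.

```python
import math
import random

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)

def partial_r(x, y, c):
    """Correlation of x and y with one control c partialled from both."""
    rxy, rxc, ryc = pearson(x, y), pearson(x, c), pearson(y, c)
    return (rxy - rxc * ryc) / math.sqrt((1 - rxc ** 2) * (1 - ryc ** 2))

def p_value(r, n, n_controls):
    """Two-sided p via the Fisher z approximation (an assumption of this sketch)."""
    z = math.atanh(r) * math.sqrt(n - 3 - n_controls)
    return math.erfc(abs(z) / math.sqrt(2))

def simulate(n_studies=500, n=60, n_candidates=5, alpha=0.05, seed=1):
    """Return (rate for one planned test, rate when any specification may be reported)."""
    rng = random.Random(seed)
    honest = hacked = 0
    for _ in range(n_studies):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]  # truly unrelated to x
        controls = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(n_candidates)]
        p_plain = p_value(pearson(x, y), n, 0)
        p_each = [p_value(partial_r(x, y, c), n, 1) for c in controls]
        honest += p_plain < alpha
        hacked += min([p_plain] + p_each) < alpha
    return honest / n_studies, hacked / n_studies

honest_rate, hacked_rate = simulate()
print("false-positive rate, one planned test:", honest_rate)
print("false-positive rate, best of six specifications:", hacked_rate)
```

Because the planned test is one of the specifications searched, the second rate can never fall below the first; how far it rises depends on the number of candidate controls and their relationships with the study variables.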

Future Research

Although control variables have received a great deal of attention in management research, there are additional avenues of investigation that should be pursued. In this section, we discuss five of them. First, as evidenced by Table 3, there is still no widespread adoption of best practices for control variable use and reporting. Thus, the degree to which recommended practices have been implemented should be evaluated on a broader scale, and, more important, there should be exploration of why best practices are not adopted.

A second area of research on control variable use relates to the concept of common method variance (CMV). CMV occurs when variables in data are collected using the same method (e.g., a survey) and statistical relationships among substantive variables are believed to be influenced by shared error variance. This systematic variance is purported to bias substantive relationships due solely to data collection method (Podsakoff et al., 2003). Although there is debate as to the degree to which CMV is both present and biasing (Richardson et al., 2009), CMV is of primary concern in single-source surveys used in nonexperimental research (Podsakoff et al., 2003). As control variables are often included in these research designs as a means to account for alternative explanations, statistical control is one means of attempting to account for CMV. Specifically, if a researcher can identify a potential cause of CMV, such as a respondent’s positive affectivity or socially desirable responding, then including that variable in a statistical control strategy can reduce the unwanted effects of CMV (Simmering et al., 2015). However, use of this technique has been criticized as having the potential to remove substantive variance along with CMV (Spector, 2006) and, also, as not necessarily addressing all forms of CMV. Another fruitful area of inquiry is the degree to which the inclusion of statistical control variables in analyses reduces the presence of CMV in substantive relationships, as proposed by Siemsen et al. (2010) and Sturman et al. (2018).
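The logic of controlling for a measured cause of CMV can be sketched with simulated data. The sketch assumes, purely for illustration, that two survey measures share nothing except a common method factor (labeled `method`, standing in for something like socially desirable responding) and that this factor is measured without error; neither assumption is likely to hold fully in real data.

```python
import math
import random

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)

def partial_r(x, y, c):
    """Correlation of x and y after partialling the control c from both."""
    rxy, rxc, ryc = pearson(x, y), pearson(x, c), pearson(y, c)
    return (rxy - rxc * ryc) / math.sqrt((1 - rxc ** 2) * (1 - ryc ** 2))

rng = random.Random(42)
n = 2000

# Hypothetical single-source survey: both measures load on the method factor only.
method = [rng.gauss(0, 1) for _ in range(n)]
x = [m + rng.gauss(0, 1) for m in method]
y = [m + rng.gauss(0, 1) for m in method]

zero_order = pearson(x, y)
controlled = partial_r(x, y, method)

print("zero-order correlation:", zero_order)
print("correlation controlling for the method factor:", controlled)
```

In this contrived case the shared method factor produces the entire observed relationship, and partialling it removes that relationship. When the controlled variable also carries substantive variance, the same step would remove part of the true relationship, which is the criticism noted by Spector (2006).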

A third area for future research on control variables relates to data collected through online panel platforms (OPPs) (Porter et al., 2019). OPPs like Amazon’s Mechanical Turk (MTurk) and Qualtrics provide access to very large groups of unknown individuals, and researchers have the ability to enact control through elimination due to the qualification process that many of these platforms provide. For instance, MTurk allows for several free and additional paid “qualifications” for its workers, including options like location, age, and other researcher-specified items (e.g., being self-employed). Such items can be used to capture experience, attitudes, or other more research-specific information, limiting the final participant pool to one that excludes respondents outside the domain of interest. Yet, recent research indicates that MTurk participants, specifically, may misrepresent themselves to gain access to a paid survey and that researchers may need to use more sophisticated gatekeeping items to target respondents (Aguinis et al., 2020). Thus, whereas dishonest responding is possible, particularly because participants are paid for survey completion, this approach to elimination is one option previously not available to researchers who collected data without online panels.

A fourth area of future research is the use of metadata from online survey collections as control variables. Metadata are “a set of highly structured and/or encoded data that describes a large set of data” (Lavrakas, 2008). In typical online survey programs like Qualtrics or SurveyMonkey, metadata capturing the time the survey is started and finished, the time to completion, and whether or not the survey was fully completed are available to researchers. These metadata can be used as an estimate of careless responding, respondent distraction, and respondent reading capability (Malhotra, 2008). Scholars may remove certain respondents’ data based on such information or may control for it in analyses. Further, research indicates that the day on which a survey is made available on MTurk, or even the time of day, can create small differences in the demographic makeup of respondents, with older and employed workers more likely to participate on weekends (Casey et al., 2017). Metadata such as these have been relatively unexamined in terms of their use as control variables and provide a new area for future research.
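A brief sketch shows how completion-time metadata might be turned into either a screening rule or a control variable. The records, the field names (`seconds`, `finished`), and the cutoff of half the median completion time are all hypothetical choices made for illustration; actual export formats differ by survey platform, and no particular cutoff is endorsed by the sources cited here.

```python
from statistics import median

# Hypothetical survey metadata; values are invented.
responses = [
    {"id": 1, "seconds": 610, "finished": True},
    {"id": 2, "seconds": 95, "finished": True},
    {"id": 3, "seconds": 540, "finished": False},
    {"id": 4, "seconds": 720, "finished": True},
    {"id": 5, "seconds": 580, "finished": True},
]

def flag_speeders(rows, fraction=0.5):
    """Mark completers faster than `fraction` of the median completion time."""
    cutoff = fraction * median(r["seconds"] for r in rows if r["finished"])
    return [dict(r, speeder=r["finished"] and r["seconds"] < cutoff) for r in rows]

flagged = flag_speeders(responses)

# Option 1: control by elimination (drop incomplete and unusually fast responses).
retained = [r for r in flagged if r["finished"] and not r["speeder"]]

# Option 2: keep all completers and carry the flag forward as a 0/1 control.
speeder_control = [int(r["speeder"]) for r in flagged if r["finished"]]

print([r["id"] for r in retained])
print(speeder_control)
```

Whichever option is used, the rule and its rationale would need to be reported so that readers can judge its effect on the findings.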

Finally, because control variable use varies widely based on subdiscipline, perhaps more focus on the use of control variables in specific areas of study is warranted. Although there are some examples of such focused reviews in micro-research (Bernerth et al., 2018; Edwards, 2008; Wysocki et al., 2020), a comparable examination of control variable use in more macro-oriented topics like corporate governance and strategic management has not occurred. By assessing the different practices in the subfields of management, researchers can have a better understanding of the treatment of control variables in studies that utilize different types of data collection methods (primary vs. secondary) and have different types of outcome variables (individual performance vs. firm performance). Further, the use of control variables in more quantitative-oriented fields like operations research and supply chain may provide additional insights to the researchers of these areas.


Conclusion

The conceptual role of control variables, their use in statistical analyses, and the information reported about them in manuscripts are crucial to understanding the substantive relationships that researchers seek to explain. Over the past two decades, research on control variables has provided robust evidence regarding the proper use and reporting of statistical control variables. Yet, multiple studies have indicated that published primary research lags behind recommendations of best practices. This article summarizes these best practices and the rationale behind them, and researchers are encouraged to adopt these benchmarks in their research.


References

  • Aguinis, H., Villamor, I., Hill, N. S., & Bailey, J. R. (2019). Best practices in data collection and preparation: Recommendations for reviewers, editors, and authors. Organizational Research Methods, March, 1–16.
  • Aguinis, H., Villamor, I., & Ramani, R. S. (2020). MTurk research: Review and recommendations. Journal of Management, 47(4), 823–837.
  • Allen, N. J., & Meyer, J. P. (1990). The measurement and antecedents of affective, continuance and normative commitment to the organization. Journal of Occupational Psychology, 63, 1–18.
  • Atinc, G., Simmering, M. J., & Kroll, M. J. (2012). Control variable use and reporting in macro and micro management research. Organizational Research Methods, 15(1), 57–74.
  • Becker, T. E. (2005). Potential problems in the statistical control of variables in organizational research: A qualitative analysis with recommendations. Organizational Research Methods, 8(3), 274–289.
  • Becker, T. E., Atinc, G., Breaugh, J. A., Carlson, K. D., Edwards, J. R., & Spector, P. E. (2016). Statistical control in correlational studies: 10 essential recommendations for organizational researchers. Journal of Organizational Behavior, 37(2), 157–167.
  • Bernerth, J. B., & Aguinis, H. (2016). A critical review and best-practice recommendations for control variable usage. Personnel Psychology, 69(1), 229–283.
  • Bernerth, J. B., Cole, M. S., Taylor, E. C., & Walker, H. J. (2018). Control variables in leadership research: A qualitative and quantitative review. Journal of Management, 44(1), 131–160.
  • Breaugh, J. A. (2006). Rethinking the control of nuisance variables in theory testing. Journal of Business and Psychology, 20(3), 429–443.
  • Breaugh, J. A. (2008). Important considerations in using statistical procedures to control for nuisance variables in non-experimental studies. Human Resource Management Review, 18(4), 282–293.
  • Carlson, K. D., & Wu, J. (2012). The illusion of statistical control: Control variable practice in management research. Organizational Research Methods, 15(3), 413–435.
  • Casey, L. S., Chandler, J., Levine, A. S., Proctor, A., & Strolovitch, D. Z. (2017). Intertemporal differences among MTurk workers: Time-based sample variations and implications for online data collection. SAGE Open, 7(2), 2158244017712774.
  • DiMaggio, P. J., & Powell, W. W. (1983). The iron cage revisited: Institutional isomorphism and collective rationality in organizational fields. American Sociological Review, 48, 147–160.
  • Draper, N. R., & Smith, H. (1998). Applied regression analysis (3rd ed.). Wiley-Interscience.
  • Edwards, J. R. (2008). To prosper, organizational psychology should . . . overcome methodological barriers to progress. Journal of Organizational Behavior, 29(4), 469–491.
  • Finkelstein, S., Hambrick, D. C., & Cannella, B. (2009). Strategic leadership: Theory and research on executives, top management teams, and boards. Oxford University Press.
  • Hair, J. F., Celsi, M., Money, A., Samouel, P., & Page, M. (2016). Essentials of business research methods (3rd ed.). Routledge.
  • Hair, J. F., Hult, G. T. M., Ringle, C. M., & Sarstedt, M. (2014). A primer on partial least squares structural equation modeling (PLS-SEM). SAGE.
  • Jensen, M. C., & Meckling, W. H. (1976). Theory of the firm: Managerial behavior, agency costs and ownership structure. Journal of Financial Economics, 3(4), 305–360.
  • Johns, G. (2001). In praise of context. Journal of Organizational Behavior, 22(1), 31–42.
  • Judge, T. A., & Cable, D. M. (2004). The effect of physical height on workplace success and income: Preliminary test of a theoretical model. Journal of Applied Psychology, 89(3), 428–441.
  • Kerlinger, F. N., & Lee, H. B. (2000). Foundations of behavioral research (4th ed.). Harcourt College.
  • Kish, L. (1959). Some statistical problems in research design. American Sociological Review, 24(3), 328–338.
  • Kraiczy, N. D., Hack, A., & Kellermanns, F. W. (2014). What makes a family firm innovative? CEO risk-taking propensity and the organizational context of family firms. Journal of Product Innovation Management, 32(3), 334–348.
  • Lavrakas, P. J. (2008). Encyclopedia of survey research methods (2 Vols.). SAGE.
  • Malhotra, N. (2008). Completion time and response order effects in web surveys. Public Opinion Quarterly, 72(5), 914–934.
  • Miller, R. L., & Rao, P. (1971). Applied econometrics. Wadsworth.
  • Pedhazur, E. J., & Schmelkin, L. P. (1991). Measurement, design, and analysis: An integrated approach. Erlbaum.
  • Phillips, S. P. (2005). Defining and measuring gender: A social determinant of health whose time has come. International Journal for Equity in Health, 4(1), 1–4.
  • Podsakoff, P. M., MacKenzie, S. B., Lee, J. Y., & Podsakoff, N. P. (2003). Common method biases in behavioral research: A critical review of the literature and recommended remedies. Journal of Applied Psychology, 88(5), 879–903.
  • Porter, C. O., Outlaw, R., Gale, J. P., & Cho, T. S. (2019). The use of online panel data in management research: A review and recommendations. Journal of Management, 45(1), 319–344.
  • Richardson, H. A., Simmering, M. J., & Sturman, M. C. (2009). A tale of three perspectives: Examining post hoc statistical techniques for detection and correction of common method variance. Organizational Research Methods, 12(4), 762–800.
  • Scandura, T. A., Graen, G. B., & Novak, M. A. (1986). When managers decide not to decide autocratically: An investigation of leader–member exchange. Journal of Applied Psychology, 71(4), 579–584.
  • Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Houghton Mifflin.
  • Siemsen, E., Roth, A., & Oliveira, P. (2010). Common method bias in regression models with linear, quadratic, and interaction effects. Organizational Research Methods, 13(3), 456–476.
  • Simmering, M. J., Fuller, C. M., Richardson, H. A., Ocal, Y., & Atinc, G. M. (2015). Marker variable choice, reporting, and interpretation in the detection of common method variance: A review and demonstration. Organizational Research Methods, 18(3), 473–511.
  • Simonsohn, U., Nelson, L. D., & Simmons, J. P. (2014). P-curve: A key to the file drawer. Journal of Experimental Psychology: General, 143(2), 534–547.
  • Smiler, A. P., & Epstein, M. (2010). Measuring gender: Options and issues. In J. C. Chrisler & D. R. McCreary (Eds.), Handbook of gender research in psychology: Volume 1: Gender research in general and experimental psychology (pp. 133–157). Springer.
  • Spector, P. E. (2006). Method variance in organizational research: Truth or urban legend? Organizational Research Methods, 9(2), 221–232.
  • Spector, P. E. (2020). Mastering the use of control variables: The hierarchical iterative control (HIC) approach. Journal of Business and Psychology. Advance online publication.
  • Spector, P. E., & Brannick, M. T. (2011). Methodological urban legends: The misuse of statistical control variables. Organizational Research Methods, 14(2), 287–305.
  • Starbuck, W. H. (2016). 60th anniversary essay: How journals could improve research practices in social science. Administrative Science Quarterly, 61(2), 165–183.
  • Stauffer, J. M., & Mendoza, J. L. (2001). The proper sequence for correcting correlation coefficients for range restriction and unreliability. Psychometrika, 66(4), 593–597.
  • Sturman, M. C., Sturman, A. J., & Sturman, C. J. (2021). Uncontrolled control variables: The extent that a researcher’s degrees of freedom with control variables increases various types of statistical errors. Journal of Applied Psychology. Advance online publication.
  • Sturman, M. C., Ukhov, A., Richardson, H., & Simmering, M. (2018). Mitigating effect of additional variables on common method variance in structural equations models. Academy of Management Annual Meeting Proceedings, 2018(1), 14939.
  • Tosi, H. L., Werner, S., Katz, J. P., & Gomez-Mejia, L. R. (2000). How much does performance matter? A meta-analysis of CEO pay studies. Journal of Management, 26(2), 301–339.
  • Vandenberg, R. J. (2006). Introduction: Statistical and methodological myths and urban legends: Where, pray tell, did they get this idea? Organizational Research Methods, 9(2), 194–201.
  • Watson, D., Clark, L. A., & Tellegen, A. (1988). Development and validation of brief measures of positive and negative affect: The PANAS scales. Journal of Personality and Social Psychology, 54(6), 1063–1070.
  • Wysocki, A., Lawson, K. M., & Rhemtulla, M. (2020, October 13). Statistical control requires causal justification. PsyArXiv Preprints.
  • Zikmund, W. G., Babin, B. J., Carr, J. C., & Griffin, M. (2008). Business research methods (8th ed.). South-Western Cengage Learning.