Inferential statistical methods stem from the distinction between a sample and a population. A sample refers to the data at hand. For example, 100 adults may be asked which of two olive oils they prefer. Imagine that 60 say brand A. But of interest is the proportion of all adults who would prefer brand A if they could be asked. To what extent does 60% reflect the true proportion of adults who prefer brand A?
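One standard way to begin answering this question is a confidence interval for the population proportion. The sketch below, a minimal Python illustration using the normal-approximation (Wald) interval, applies it to the 60-out-of-100 example; the function name is illustrative.

```python
from scipy.stats import norm

def wald_ci(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a population proportion."""
    p_hat = successes / n                         # sample proportion
    z = norm.ppf(1 - (1 - confidence) / 2)        # critical value, e.g. 1.96 for 95%
    se = (p_hat * (1 - p_hat) / n) ** 0.5         # estimated standard error
    return p_hat - z * se, p_hat + z * se

# 60 of 100 sampled adults prefer brand A
low, high = wald_ci(60, 100)
print(f"95% CI for the true proportion preferring brand A: ({low:.2f}, {high:.2f})")
# roughly (0.50, 0.70): the 60% estimate is compatible with a fairly wide range of true values
```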
There are several components to inferential methods. One is the set of assumptions about how to model the probabilities of all possible outcomes. Another is how to model outcomes of interest. Imagine, for example, that there is interest in understanding the overall satisfaction with a particular automobile given an individual's age. One strategy is to assume that the typical response Y, given an individual's age X, is given by Y = β₀ + β₁X, where the slope, β₁, and intercept, β₀, are unknown constants, in which case a sample would be used to make inferences about their values. Assumptions are also made about how the data were obtained. Was this done in a manner for which random sampling can be assumed? There is even an issue related to the very notion of what is meant by probability. Let μ denote the population mean of Y. The frequentist approach views probabilities in terms of relative frequencies, and μ is viewed as a fixed, unknown constant. In contrast, the Bayesian approach views μ as having some distribution that is specified by the investigator. For example, it may be assumed that μ has a normal distribution. The point is that the probabilities associated with μ are not based on the notion of relative frequencies, and they are not based on the data at hand. Rather, the probabilities associated with μ stem from judgments made by the investigator.
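To make the regression example concrete, the unknown constants β₀ and β₁ would be estimated from a sample; ordinary least squares is the classic choice. The sketch below fits the line to simulated data; the variable names and the simulated relationship are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated sample: age (X) and a satisfaction score (Y) with noise
age = rng.uniform(20, 70, size=100)
satisfaction = 2.0 + 0.05 * age + rng.normal(scale=0.5, size=100)

# Ordinary least-squares estimates of the slope (beta_1) and intercept (beta_0)
b1, b0 = np.polyfit(age, satisfaction, deg=1)
print(f"estimated intercept: {b0:.2f}, estimated slope: {b1:.3f}")
```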
Inferential methods can be classified into three types: distribution-free, parametric, and non-parametric. The meaning of the term “non-parametric” depends on the situation, as will be explained. The choice between parametric and non-parametric methods can be crucial for reasons that will be outlined. To complicate matters, the number of inferential methods has grown tremendously during the last 50 years. Even for goals that may seem relatively simple, such as comparing two independent groups of individuals, there are numerous methods that may be used. Expert guidance can be crucial in terms of understanding what inferences are reasonable in a given situation.
Article
Doyin Atewologun
Intersectionality is a critical framework that provides us with the mindset and language for examining interconnections and interdependencies between social categories and systems. Intersectionality is relevant for researchers and for practitioners because it enhances analytical sophistication and offers theoretical explanations of the ways in which heterogeneous members of specific groups (such as women) might experience the workplace differently depending on their ethnicity, sexual orientation, and/or class and other social locations. Sensitivity to such differences enhances insight into issues of social justice and inequality in organizations and other institutions, thus maximizing the chance of social change.
The concept of intersectional locations emerged from the racialized experiences of minority ethnic women in the United States. Intersectional thinking has gained increased prominence in business and management studies, particularly in critical organization studies. A predominant focus in this field is on individual subjectivities at intersectional locations (such as examining the occupational identities of minority ethnic women). This emphasis on individuals’ experiences and within-group differences has been described variously as “content specialization” or an “intracategorical approach.” An alternate focus in business and management studies is on highlighting systematic dynamics of power. This encompasses a focus on “systemic intersectionality” and an “intercategorical approach.” Here, scholars examine multiple between-group differences, charting shifting configurations of inequality along various dimensions.
As a critical theory, intersectionality conceptualizes knowledge as situated, contextual, relational, and reflective of political and economic power. Intersectionality tends to be associated with qualitative research methods due to the central role of giving voice, elicited through focus groups, narrative interviews, action research, and observations. Intersectionality is also utilized as a methodological tool for conducting qualitative research, such as by researchers adopting an intersectional reflexivity mindset. Intersectionality is also increasingly associated with quantitative and statistical methods, which contribute to intersectionality by helping us understand and interpret the individual, combined (additive or multiplicative) effects of various categories (privileged and disadvantaged) in a given context. Future considerations for intersectionality theory and practice include managing its broad applicability while attending to its sociopolitical and emancipatory aims, and theoretically advancing understanding of the simultaneous forces of privilege and penalty in the workplace.
Article
Sebastiano Massaro and Dorotea Baljević
Organizational neuroscience—a novel scholarly domain using neuroscience to inform management and organizational research, and vice versa—is flourishing. Still missing, however, is comprehensive coverage of organizational neuroscience as a self-standing scientific field, and with it a foundational account of the potential that neuroscience holds to advance management and organizational research. This gap can be addressed by reviewing the field's main methods and systematizing the existing scholarly literature across subfields including entrepreneurship, strategic management, and organizational behavior, among others.
Article
Guclu Atinc and Marcia J. Simmering
The use of control variables to improve inferences about statistical relationships in data is ubiquitous in management research. In both the micro- and macro-subfields of management, control variables are included to remove confounding variance and provide researchers with an enhanced ability to interpret findings. Scholars have explored the theoretical underpinnings and statistical effects of including control variables in a variety of statistical analyses. Further, a robust literature surrounding best practices for their use and reporting exists. Specifically, researchers have been directed to report more detailed information in manuscripts regarding the theoretical rationale for the use of control variables, their measurement, and their inclusion in statistical analysis. Moreover, recent research indicates the value of removing control variables in many cases. Although there is evidence that articles recommending best practices for control variable use are increasingly being cited, researchers still lag in following those recommendations. Finally, there are avenues for valuable future research on control variables.
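As a minimal illustration of what removing confounding variance means in practice, the sketch below compares a regression with and without a control variable; the data, variable names, and effect sizes are simulated purely for illustration.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500

tenure = rng.normal(size=n)                     # confound: job tenure
training = 0.6 * tenure + rng.normal(size=n)    # predictor correlated with tenure
performance = 0.5 * tenure + 0.2 * training + rng.normal(size=n)

# Without the control: the training coefficient absorbs variance due to tenure
m1 = sm.OLS(performance, sm.add_constant(training)).fit()

# With the control: tenure's confounding variance is partialled out
X = sm.add_constant(np.column_stack([training, tenure]))
m2 = sm.OLS(performance, X).fit()

print(m1.params)  # inflated training effect
print(m2.params)  # training effect closer to the simulated 0.2
```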
Article
Don H. Kluemper
The use of surveys is prevalent in academic research in general, and particularly in business and management. As an example, self-report surveys alone are the most common data source in the social sciences. Survey design, however, involves a wide range of methodological decisions, each with its own strengths, limitations, and trade-offs. There is a broad set of issues associated with survey design, ranging from a breadth of strategic concerns to nuanced approaches associated with methodological and design alternatives. Further, decision points associated with survey design involve a series of trade-offs, as the strengths of a particular approach might come with inherent weaknesses. Surveys are couched within a broader scientific research process. First and foremost, the problem being studied should have sufficient impact, should be driven by a strong theoretical rationale, should employ rigorous research methods and design appropriate to test the theory, and should use appropriate analyses and employ best practices such that there is confidence in the scientific rigor of any given study and thus confidence in the results. Best practice requires balancing a range of methodological concerns and trade-offs that relate to the development of robust survey designs, including making causal inferences; internal, external, and ecological validity; common method variance; choice of data sources; multilevel issues; measure selection, modification, and development; appropriate use of control variables; conducting power analysis; and methods of administration. There are salient concerns regarding the administration of surveys, including increasing response rates as well as minimizing responses that are careless and/or reflect social desirability. Finally, decision points arise after surveys are administered, including missing data, organization of research materials, questionable research practices, and statistical considerations. A comprehensive understanding of this array of interrelated survey design issues associated with theory, study design, implementation, and analysis enhances scientific rigor.
Article
Robert P. Gephart and Rohny Saylors
Qualitative research designs provide future-oriented plans for undertaking research. Designs should describe how to effectively address and answer a specific research question using qualitative data and qualitative analysis techniques. Designs connect research objectives to observations, data, methods, interpretations, and research outcomes. Qualitative research designs focus initially on collecting data to provide a naturalistic view of social phenomena and understand the meaning the social world holds from the point of view of social actors in real settings. The outcomes of qualitative research designs are situated narratives of people's activities in real settings, reasoned explanations of behavior, discoveries of new phenomena, and the creation and testing of theories.
A three-level framework can be used to describe the layers of qualitative research design and conceptualize its multifaceted nature. Note, however, that qualitative research is a flexible and not fixed process, unlike conventional positivist research designs that are unchanged after data collection commences. Flexibility provides qualitative research with the capacity to alter foci during the research process and make new and emerging discoveries.
The first or methods layer of the research design process uses social science methods to rigorously describe organizational phenomena and provide evidence that is useful for explaining phenomena and developing theory. Description is done using empirical research methods for data collection including case studies, interviews, participant observation, ethnography, and collection of texts, records, and documents.
The second or methodological layer of research design offers three formal logical strategies to analyze data and address research questions: (a) induction to answer descriptive “what” questions; (b) deduction and hypothesis testing to address theory oriented “why” questions; and (c) abduction to understand questions about what, how, and why phenomena occur.
The third or social science paradigm layer of research design is formed by broad social science traditions and approaches that reflect distinct theoretical epistemologies—theories of knowledge—and diverse empirical research practices. These perspectives include positivism, interpretive induction, and interpretive abduction (interpretive science). There are also scholarly research perspectives that reflect on and challenge or seek to change management thinking and practice, rather than producing rigorous empirical research or evidence-based findings. These perspectives include critical research, postmodern research, and organization development.
Three additional issues are important to future qualitative research designs. First, there is renewed interest in the value of covert research undertaken without the informed consent of participants. Second, there is an ongoing discussion of the best style to use for reporting qualitative research. Third, there are new ways to integrate qualitative and quantitative data. These are needed to better address the interplay of qualitative and quantitative phenomena found together in everyday discourse, an interplay that has so far been overlooked.
Article
Rand R. Wilcox
Hypothesis testing is an approach to statistical inference that is routinely taught and used. It is based on a simple idea: develop some relevant speculation about the population of individuals or things under study and determine whether data provide reasonably strong empirical evidence that the hypothesis is wrong. Consider, for example, two approaches to advertising a product. A study might be conducted to determine whether it is reasonable to assume that both approaches are equally effective. A Type I error is rejecting this speculation when in fact it is true. A Type II error is failing to reject when the speculation is false. A common practice is to test hypotheses with the Type I error probability set to 0.05 and to declare that there is a statistically significant result if the hypothesis is rejected.
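A minimal sketch of this logic, using simulated outcomes for the two advertising approaches and the conventional 0.05 Type I error probability; the data and group labels are illustrative.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(2)

# Simulated outcomes under two advertising approaches
group_a = rng.normal(loc=5.0, scale=1.0, size=50)
group_b = rng.normal(loc=5.4, scale=1.0, size=50)

stat, p_value = ttest_ind(group_a, group_b)   # classic two-sample (Student's) t-test
alpha = 0.05                                  # Type I error probability
if p_value < alpha:
    print(f"p = {p_value:.3f}: reject equal effectiveness at the 0.05 level")
else:
    print(f"p = {p_value:.3f}: fail to reject at the 0.05 level")
```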
There are various concerns about, limitations to, and criticisms of this approach. One criticism is the use of the term significant. Consider the goal of comparing the means of two populations of individuals. Saying that a result is significant suggests that the difference between the means is large and important. But in the context of hypothesis testing it merely means that there is empirical evidence that the means are not equal. Situations can and do arise where a result is declared significant, but the difference between the means is trivial and unimportant. Indeed, the goal of testing the hypothesis that two means are equal has been criticized based on the argument that surely the means differ at some decimal place. A simple way of dealing with this issue is to reformulate the goal. Rather than testing for equality, determine whether it is reasonable to make a decision about which group has the larger mean. The components of hypothesis-testing techniques can be used to address this issue with the understanding that the goal of testing some hypothesis has been replaced by the goal of determining whether a decision can be made about which group has the larger mean.
Another aspect of hypothesis testing that has seen considerable criticism is the notion of a p-value. Suppose some hypothesis is rejected with the Type I error probability set to 0.05. This leaves open the issue of whether the hypothesis would be rejected with the Type I error probability set to 0.025 or 0.01. A p-value is the smallest Type I error probability for which the hypothesis is rejected. When comparing means, a p-value reflects the strength of the empirical evidence that a decision can be made about which has the larger mean. A concern about p-values is that they are often misinterpreted. For example, a small p-value does not necessarily mean that a large or important difference exists. Another common mistake is to conclude that if the p-value is close to zero, there is a high probability of rejecting the hypothesis again if the study is replicated. The probability of rejecting again is a function of the extent to which the hypothesis is not true, among other things. Because a p-value does not directly reflect the extent to which the hypothesis is false, it does not provide a good indication of whether a second study will provide evidence to reject it.
Confidence intervals are closely related to hypothesis-testing methods. Basically, they are intervals that contain unknown quantities with some specified probability. For example, a goal might be to compute an interval that contains the difference between two population means with probability 0.95. Confidence intervals can be used to determine whether some hypothesis should be rejected. Clearly, confidence intervals provide useful information not provided by testing hypotheses and computing a p-value. But an argument for a p-value is that it provides a perspective on the strength of the empirical evidence that a decision can be made about the relative magnitude of the parameters of interest. For example, to what extent is it reasonable to decide whether the first of two groups has the larger mean? Even if a compelling argument can be made that p-values should be completely abandoned in favor of confidence intervals, there are situations where p-values provide a convenient way of developing reasonably accurate confidence intervals. Another argument against p-values is that because they are misinterpreted by some, they should not be used. But if this argument is accepted, it follows that confidence intervals should be abandoned because they are often misinterpreted as well.
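The connection can be sketched as follows: under the same assumptions, a 0.95 confidence interval for the difference between two means excludes zero exactly when the corresponding test rejects equal means at the 0.05 level. The example below assumes normality and equal variances for simplicity, and the data are simulated.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(5.0, 1.0, 40)
y = rng.normal(5.5, 1.0, 40)

# 95% confidence interval for the difference in means (pooled-variance form)
diff = x.mean() - y.mean()
n1, n2 = len(x), len(y)
sp2 = ((n1 - 1) * x.var(ddof=1) + (n2 - 1) * y.var(ddof=1)) / (n1 + n2 - 2)
se = np.sqrt(sp2 * (1 / n1 + 1 / n2))
t_crit = stats.t.ppf(0.975, df=n1 + n2 - 2)
ci = (diff - t_crit * se, diff + t_crit * se)

print(f"95% CI for the mean difference: ({ci[0]:.2f}, {ci[1]:.2f})")
# Equal means are rejected at the 0.05 level exactly when 0 lies outside this interval.
```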
Classic hypothesis-testing methods for comparing means and studying associations assume sampling is from a normal distribution. A fundamental issue is whether nonnormality can be a source of practical concern. Based on hundreds of papers published during the last 50 years, the answer is an unequivocal Yes. Granted, there are situations where nonnormality is not a practical concern, but nonnormality can have a substantial negative impact on both Type I and Type II errors. Fortunately, there is a vast literature describing how to deal with known concerns. Results based solely on some hypothesis-testing approach have clear implications about methods aimed at computing confidence intervals. Nonnormal distributions that tend to generate outliers are one source for concern. There are effective methods for dealing with outliers, but technically sound techniques are not obvious based on standard training. Skewed distributions are another concern. The combination of what are called bootstrap methods and robust estimators provides techniques that are particularly effective for dealing with nonnormality and outliers.
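One such combination, a percentile bootstrap confidence interval for a 20% trimmed mean, can be sketched as follows on a simulated skewed sample; the trimming proportion and number of bootstrap samples are conventional illustrative choices, not prescriptions.

```python
import numpy as np
from scipy.stats import trim_mean

rng = np.random.default_rng(4)

# Skewed, outlier-prone sample (lognormal)
sample = rng.lognormal(mean=0.0, sigma=1.0, size=60)

# Percentile bootstrap CI for the 20% trimmed mean
boot = np.array([
    trim_mean(rng.choice(sample, size=sample.size, replace=True), 0.2)
    for _ in range(2000)
])
lower, upper = np.percentile(boot, [2.5, 97.5])
print(f"95% percentile bootstrap CI for the 20% trimmed mean: ({lower:.2f}, {upper:.2f})")
```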
Classic methods for comparing means and studying associations also assume homoscedasticity. When comparing means, this means that the groups are assumed to have the same amount of variance even when their means differ. Violating this assumption can have serious negative consequences in terms of both Type I and Type II errors, particularly when the normality assumption is violated as well. There is a vast literature describing how to deal with this issue in a technically sound manner.
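One widely used way of relaxing the equal-variance assumption when comparing two means is Welch's method, offered here only as a familiar example from that literature; the sketch below requests it in scipy on simulated unequal-variance groups.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(5)
group_a = rng.normal(10.0, 1.0, 30)   # smaller variance
group_b = rng.normal(10.5, 3.0, 30)   # larger variance

# equal_var=False requests Welch's test, which does not assume homoscedasticity
stat, p_welch = ttest_ind(group_a, group_b, equal_var=False)
print(f"Welch's test p-value: {p_welch:.3f}")
```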
Article
Alex Bitektine, Jeff Lucas, Oliver Schilke, and Brad Aeon
Experiments randomly assign actors (e.g., people, groups, and organizations) to different conditions and assess the effects on a dependent variable. Random assignment allows for the control of extraneous factors and the isolation of causal effects, making experiments especially valuable for testing theorized processes. Although experiments have long remained underused in organizational theory and management research, the popularity of experimental methods has seen rapid growth in the 21st century.
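A minimal sketch of why random assignment isolates causal effects: shuffling actors into conditions balances extraneous characteristics in expectation, so the difference in the dependent variable estimates the treatment effect. The variables and effect sizes below are simulated purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200

# An extraneous characteristic of the actors (e.g., prior experience)
prior_experience = rng.normal(size=n)

# Random assignment to control (0) or treatment (1)
condition = rng.permutation(np.repeat([0, 1], n // 2))

# Because assignment is random, the extraneous factor is balanced in expectation
print(prior_experience[condition == 0].mean(), prior_experience[condition == 1].mean())

# Dependent variable: a simulated treatment effect of 0.5 plus noise
outcome = 0.5 * condition + 0.3 * prior_experience + rng.normal(size=n)
print("estimated effect:", outcome[condition == 1].mean() - outcome[condition == 0].mean())
```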
Gatekeepers sometimes criticize experiments for lacking generalizability, citing their artificial settings or non-representative samples. To address this criticism, a distinction is drawn between an applied research logic and a fundamental research logic. In an applied research logic, experimentalists design a study with the goal of generalizing findings to specific settings or populations. In a fundamental research logic, by contrast, experimentalists seek to design studies relevant to a theory or a fundamental mechanism rather than to specific contexts. Accordingly, the issue of generalizability does not so much boil down to whether an experiment is generalizable, but rather whether the research design matches the research logic of the study. If the goal is to test theory (i.e., a fundamental research logic), then asking the question of whether the experiment generalizes to certain settings and populations is largely irrelevant.
Article
Jason L. Huang and Zhonghao Wang
Careless responding, also known as insufficient effort responding, refers to survey/test respondents providing random, inattentive, or inconsistent answers to question items due to lack of effort in conforming to instructions, interpreting items, and/or providing accurate responses. Researchers often use these two terms interchangeably to describe deviant behaviors in survey/test responding that threaten data quality. Careless responding threatens the validity of research findings by introducing both random and systematic errors. Specifically, careless responding can reduce measurement reliability, while under specific circumstances it can also inflate the substantive relations between variables. Numerous factors can explain why careless responding happens (or does not happen), such as individual difference characteristics (e.g., conscientiousness), survey characteristics (e.g., survey length), and transient psychological states (e.g., positive and negative affect). To identify potential careless responding, researchers can use procedural detection methods and post hoc statistical methods. For example, researchers can insert detection items (e.g., infrequency items, instructed response items) into the questionnaire, monitor participants' response time, and compute statistical indices, such as psychometric antonym/synonym, Mahalanobis distance, individual reliability, individual response variability, and model fit statistics. Applying multiple detection methods is better able to capture careless responding because the methods provide convergent evidence. Comparing results based on data with and without careless respondents can help evaluate the degree to which the data are influenced by careless responding. To handle data contaminated by careless responding, researchers may choose to filter out identified careless respondents, recode careless responses as missing data, or include careless responding as a control variable in the analysis. To prevent careless responding, researchers have tried various deterrence methods developed from motivational and social interaction theories. These methods include giving warning, rewarding, or educational messages, proctoring the process of responding, and designing user-friendly surveys. Interest in careless responding has been growing not only in business and management but also in other related disciplines. Future research and practice on careless responding in the business and management areas can also benefit from findings in other related disciplines.
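Two of the statistical indices mentioned above, individual response variability and Mahalanobis distance, can be sketched as follows on a simulated item-response matrix; the flagging thresholds are illustrative and would need justification in any real application.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(7)

# Simulated 1-5 Likert responses: 95 attentive respondents plus 5 who answer "3" to everything
responses = rng.integers(1, 6, size=(95, 20))
straightliners = np.full((5, 20), 3)
data = np.vstack([responses, straightliners]).astype(float)

# Individual response variability: near-zero SD across items suggests straightlining
irv = data.std(axis=1)

# Mahalanobis distance from the multivariate centroid flags unusual response patterns
centered = data - data.mean(axis=0)
cov_inv = np.linalg.pinv(np.cov(data, rowvar=False))
md = np.sqrt(np.einsum("ij,jk,ik->i", centered, cov_inv, centered))
md_cutoff = np.sqrt(chi2.ppf(0.999, df=data.shape[1]))   # a common chi-square-based cutoff

flagged = (irv < 0.5) | (md > md_cutoff)
print(f"{flagged.sum()} respondents flagged for follow-up inspection")
```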
Article
Wayne Crawford and Esther Lamarre Jean
Structural equation modelling (SEM) is a family of models in which multivariate techniques are used to simultaneously examine complex relationships among variables. The goal of SEM is to evaluate the extent to which proposed relationships reflect the actual pattern of relationships present in the data. SEM users employ specialized software to develop a model, which then generates a model-implied covariance matrix. The model-implied covariance matrix is based on the user-defined theoretical model and represents the user's beliefs about relationships among the variables. Guided by the user's predefined constraints, SEM software employs a combination of factor analysis and regression to generate a set of parameters (often through maximum likelihood [ML] estimation) for the model-implied covariance matrix, which represents the relationships between the variables included in the model. Structural equation modelling capitalizes on the benefits of both factor analysis and path analytic techniques to address complex research questions. Structural equation modelling consists of six basic steps: model specification; identification; estimation; evaluation of model fit; model modification; and reporting of results.
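The idea of a model-implied covariance matrix can be sketched with a toy one-factor measurement model: given hypothesized loadings and variances, the model implies a covariance matrix that estimation then tries to match to the observed one. The parameter values below are arbitrary illustrations, not output from any SEM package.

```python
import numpy as np

# Toy one-factor model with three indicators
loadings = np.array([[0.8], [0.7], [0.6]])   # factor loadings (Lambda)
factor_var = np.array([[1.0]])               # factor variance (Phi)
error_var = np.diag([0.36, 0.51, 0.64])      # unique/error variances (Theta)

# Model-implied covariance matrix: Sigma = Lambda Phi Lambda' + Theta
implied_cov = loadings @ factor_var @ loadings.T + error_var
print(implied_cov)

# Estimation (e.g., maximum likelihood) searches for parameter values that make
# this implied matrix as close as possible to the sample covariance matrix.
```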
Conducting SEM analyses requires certain data considerations as data-related problems are often the reason for software failures. These considerations include sample size, data screening for multivariate normality, examining outliers and multicollinearity, and assessing missing data. Furthermore, three notable issues SEM users might encounter include common method variance, subjectivity and transparency, and alternative model testing. First, analyzing common method variance includes recognition of three types of variance: common variance (variance shared with the factor); specific variance (reliable variance not explained by common factors); and error variance (unreliable and inexplicable variation in the variable). Second, SEM still lacks clear guidelines for the modelling process which threatens replicability. Decisions are often subjective and based on the researcher’s preferences and knowledge of what is most appropriate for achieving the best overall model. Finally, reporting alternatives to the hypothesized model is another issue that SEM users should consider when analyzing structural equation models. When testing a hypothesized model, SEM users should consider alternative (nested) models derived from constraining or eliminating one or more paths in the hypothesized model. Alternative models offer several benefits; however, they should be driven and supported by existing theory. It is important for the researcher to clearly report and provide findings on the alternative model(s) tested.
Common model-specific issues are often experienced by users of SEM. Heywood cases, nonidentification, and nonpositive definite matrices are among the most common issues. Heywood cases arise when negative variances or squared multiple correlations greater than 1.0 are found in the results. The researcher could resolve this by considering a small plausible value that could be used to constrain the residual. Nonpositive definite matrices result from linear dependencies and/or correlations greater than 1.0. To address this, researchers can attempt to ensure all indicator variables are independent, inspect output manually for negative residual variances, evaluate if sample size is appropriate, or re-specify the proposed model. When used properly, structural equation modelling is a powerful tool that allows for the simultaneous testing of complex models.
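A quick, informal way to see whether an input correlation or covariance matrix is nonpositive definite, and therefore likely to trigger the problems described above, is to inspect its eigenvalues; the matrix below contains a deliberate linear dependency for illustration.

```python
import numpy as np

# A correlation matrix with a linear dependency (variables 2 and 3 are perfectly correlated)
corr = np.array([
    [1.0, 0.6, 0.6],
    [0.6, 1.0, 1.0],
    [0.6, 1.0, 1.0],
])

eigenvalues = np.linalg.eigvalsh(corr)
print(eigenvalues)                       # a zero (or negative) eigenvalue signals the problem
print("positive definite:", bool(np.all(eigenvalues > 1e-10)))
```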
Article
Eric Volmar and Kathleen M. Eisenhardt
Theory building from case studies is a research strategy that combines grounded theory building with case studies. Its purpose is to develop novel, accurate, parsimonious, and robust theory that emerges from and is grounded in data. Case research is well-suited to address “big picture” theoretical gaps and dilemmas, particularly when existing theory is inadequate. Further, this research strategy is particularly useful for answering questions of “how” through its deep and longitudinal immersion in a focal phenomenon. The process of conducting case study research includes a thorough literature review to identify an appropriate and compelling research question, a rigorous study design that involves artful theoretical sampling, rich and complete data collection from multiple sources, and a creative yet systematic grounded theory building process to analyze the cases and build emergent theory about significant phenomena. Rigorous theory building case research is fundamentally centered on strong emergent theory with precise theoretical logic and robust grounding in empirical data. Not surprisingly then, theory building case research is disproportionately represented among the most highly cited and award-winning research.
Article
Michael G. Pratt and Gabriel R. Sala
Central to all empirical research—in particular, inductive qualitative field research—observations can provide core insights into work practices, the physical or material elements of organizations, and the integrity of research informants. Yet management research has devoted less attention to observations than it has to other methods. Hence, it is essential to provide resources and guidance to current and aspiring researchers as to what constitutes observations and how to tackle the key questions that must be addressed in designing and implementing them.
Observing, as pertains to research, can be defined as a method that involves using one’s senses, guided by one’s attention, to gather information on, for example, (a) what people are doing (acts, activities, events); (b) where they are doing it (location); and (c) what they are doing it with (objects), over a period of time. Once researchers have determined they want to engage in observation, they have to make several decisions. First, they have to figure out whether observation is a good fit with their study and research question(s). If so, various other choices must be made with regard to degree of revelation, degree of immersion, time in the field, and how to be present in the research context, and still more choices follow. Researchers need to decide when to start (and stop) observing as well as how to observe, record, and report their findings. The article provides a decision-tree model of observational methods to guide researchers through these various choices.
Article
Joel Koopman and Nikolaos Dimotakis
Experience sampling is a method aimed primarily at examining within-individual covariation of transient phenomena utilizing repeated measures. It can be applied to test nuanced predictions of extant theories and can provide insights that are otherwise difficult to obtain. It does so by examining the phenomena of interest close to where they occur, thus avoiding issues with recall and similar concerns. Alternatively, the experience sampling method (ESM) can be utilized to collect highly reliable data for investigating between-individual phenomena.
A number of decisions need to be made when designing an ESM study. Study duration and intensity (that is, total days of measurement and total assessments per day) represent a tradeoff between data richness and participant fatigue that needs to be carefully weighed. Other scheduling options need to be considered, such as triggered versus scheduled surveys. Researchers also need to be aware of the generally high potential cost of this approach, as well as the monetary and nonmonetary resources required.
The intensity of this method also requires special consideration of the sample and the context. Proper screening is invaluable; ensuring that participants and their context are applicable and appropriate to the design is an important first step. The next step is ensuring that the surveys are planned in a way compatible with the sample, and that they are designed to appropriately and rigorously collect data that can be used to accomplish the aims of the study at hand.
Furthermore, ESM data typically require proper consideration with regard to how the data will be analyzed and how results will be interpreted. Proper attention to analytic approaches (typically multilevel) is required. Finally, when interpreting results from ESM data, one must not forget that these effects typically represent processes that occur continuously across individuals' working lives—effect sizes thus need to be considered with this in mind.
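A minimal sketch of the kind of multilevel analysis typically applied to ESM data: repeated daily observations nested within persons, modeled with a random intercept per person. The variable names and data are simulated for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(8)
n_people, n_days = 50, 10

person = np.repeat(np.arange(n_people), n_days)
person_intercept = np.repeat(rng.normal(scale=0.8, size=n_people), n_days)
daily_stress = rng.normal(size=n_people * n_days)
daily_fatigue = 0.4 * daily_stress + person_intercept + rng.normal(size=n_people * n_days)

df = pd.DataFrame({"person": person, "stress": daily_stress, "fatigue": daily_fatigue})

# Random-intercept model: within-person association of daily stress and fatigue
model = smf.mixedlm("fatigue ~ stress", data=df, groups=df["person"]).fit()
print(model.summary())
```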
Article
Johannes Meuer and Peer C. Fiss
During the last decade, qualitative comparative analysis (QCA) has become an increasingly popular research approach in the management and business literature. As an approach, QCA consists of both a set of analytical techniques and a conceptual perspective, and the origins of QCA as an analytical technique lie outside the management and business literature. In the 1980s, Charles Ragin, a sociologist and political scientist, developed a systematic, comparative methodology as an alternative to qualitative, case-oriented approaches and to quantitative, variable-oriented approaches. Whereas the analytical technique of QCA was developed outside the management literature, the conceptual perspective underlying QCA has a long history in the management literature, in particular in the form of contingency and configurational theories, which have played an important role in management research since the late 1960s.
Until the 2000s, management researchers only sporadically used QCA as an analytical technique. Between 2007 and 2008, a series of seminal articles in leading management journals laid the conceptual, methodological, and empirical foundations for QCA as a promising research approach in business and management. These articles led to a “first” wave of QCA research in management. During the first wave—occurring between approximately 2008 and 2014—researchers successfully published QCA-based studies in leading management journals and triggered important methodological debates, ultimately leading to a revival of the configurational perspective in the management literature.
Following the first wave, a “second” wave—between 2014 and 2018—saw a rapid increase in QCA publications across several subfields in management research, the development of methodological applications of QCA, and an expansion of scholarly debates around the nature, opportunities, and future of QCA as a research approach. The second wave of QCA research in business and management concluded with researchers’ taking stock of the plethora of empirical studies using QCA for identifying best practice guidelines and advocating for the rise of a “neo-configurational” perspective, a perspective drawing on set-theoretic logic, causal complexity, and counterfactual analysis.
Nowadays, QCA is an established approach in some research areas (e.g., organization theory, strategic management) and is diffusing into several adjacent areas (e.g., entrepreneurship, marketing, and accounting), a situation that promises new opportunities for advancing the analytical technique of QCA as well as configurational thinking and theorizing in the business and management literature. To advance the analytical foundations of QCA, researchers may, for example, advance robustness tests for QCA or focus on issues of endogeneity and omitted variables in QCA. To advance the conceptual foundations of QCA, researchers may, for example, clarify the links between configurational theory and related theoretical perspectives, such as systems theory or complexity theory, or develop theories on the temporal dynamics of configurations and configurational change. Ultimately, after a decade of growing use and interest in QCA and given the unique strengths of this approach for addressing questions relevant to management research, QCA will continue to influence research in business and management.
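The set-theoretic core of the QCA technique can be illustrated with the standard fuzzy-set consistency and coverage measures: consistency gauges how well membership in a configuration implies membership in the outcome, and coverage gauges how much of the outcome the configuration accounts for. The membership scores and labels below are invented purely for illustration.

```python
import numpy as np

# Fuzzy-set membership scores for ten hypothetical cases
configuration = np.array([0.9, 0.8, 0.7, 0.2, 0.9, 0.1, 0.6, 0.8, 0.3, 0.7])  # e.g., "high R&D AND decentralized"
outcome = np.array([0.8, 0.9, 0.6, 0.3, 0.8, 0.2, 0.7, 0.9, 0.4, 0.6])        # e.g., "high innovation performance"

overlap = np.minimum(configuration, outcome).sum()
consistency = overlap / configuration.sum()   # degree to which the configuration is a subset of the outcome
coverage = overlap / outcome.sum()            # degree to which the outcome is accounted for by the configuration

print(f"consistency = {consistency:.2f}, coverage = {coverage:.2f}")
```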