
Article

Michael G. Pratt and Gabriel R. Sala

Central to all empirical research, and in particular to inductive qualitative field research, observations can provide core insights into work practices, the physical or material elements of organizations, and the integrity of research informants. Yet management research has devoted less attention to observations than it has to other methods. Hence, it is important to provide current and aspiring researchers with resources and guidance on what constitutes observation and on how to tackle the key questions that arise in designing and implementing observations. Observing, as it pertains to research, can be defined as a method that involves using one’s senses, guided by one’s attention, to gather information on, for example, (a) what people are doing (acts, activities, events); (b) where they are doing it (location); and (c) what they are doing it with (objects), over a period of time. Once researchers have determined they want to engage in observation, they have to make several decisions. First, they have to figure out whether observation is a good fit with their study and research question(s). If so, various other choices must be made with regard to degree of revelation, degree of immersion, time in the field, and how to be present in the research context, and still more choices follow. Researchers need to decide when to start (and stop) observing as well as how to observe, record, and report their findings. The article provides a decision-tree model of observational methods to guide researchers through these various choices.

Article

James A. Muncy and Alice M. Muncy

Business research is conducted by both businesspeople, who have informational needs, and scholars, whose field of study is business. Though some of the specifics as to how research is conducted differ between scholarly research and applied research, the general process they follow is the same. Business research is conducted in five stages. The first stage is problem formation, where the objectives of the research are established. The second stage is research design. In this stage, the researcher identifies the variables of interest and possible relationships among those variables, decides on the appropriate data source and measurement approach, and plans the sampling methodology. It is also within the research design stage that the role that time will play in the study is determined. The third stage is data collection. Researchers must decide whether to outsource the data collection process or collect the data themselves. Also, data quality issues must be addressed in the collection of the data. The fourth stage is data analysis. The data must be prepared and cleaned. Statistical packages or programs such as SAS, SPSS, STATA, and R are used to analyze quantitative data. In the case of qualitative data, coding, artificial intelligence, and/or interpretive analysis are employed. The fifth stage is the presentation of results. In applied business research, the results are typically limited in their distribution and they must be addressed to the immediate problem at hand. In scholarly business research, the results are intended to be widely distributed through journals, books, and conferences. As a means of quality control, scholarly research usually goes through a double-blind review process before it is published.

Article

Jason L. Huang and Zhonghao Wang

Careless responding, also known as insufficient effort responding, refers to survey/test respondents providing random, inattentive, or inconsistent answers to question items due to lack of effort in conforming to instructions, interpreting items, and/or providing accurate responses. Researchers often use these two terms interchangeably to describe deviant behaviors in survey/test responding that threaten data quality. Careless responding threatens the validity of research findings by bringing in random and systematic errors. Specifically, careless responding can reduce measurement reliability, while under specific circumstances it can also inflate the substantive relations between variables. Numerous factors can explain why careless responding happens (or does not happen), such as individual difference characteristics (e.g., conscientiousness), survey characteristics (e.g., survey length), and transient psychological states (e.g., positive and negative affect). To identify potential careless responding, researchers can use procedural detection methods and post hoc statistical methods. For example, researchers can insert detection items (e.g., infrequency items, instructed response items) into the questionnaire, monitor participants’ response time, and compute statistical indices, such as psychometric antonym/synonym, Mahalanobis distance, individual reliability, individual response variability, and model fit statistics. Application of multiple detection methods would be better able to capture careless responding given convergent evidence. Comparison of results based on data with and without careless respondents can help evaluate the degree to which the data are influenced by careless responding. To handle data contaminated by careless responding, researchers may choose to filter out identified careless respondents, recode careless responses as missing data, or include careless responding as a control variable in the analysis. 
To prevent careless responding, researchers have tried utilizing various deterrence methods developed from motivational and social interaction theories. These methods include giving warning, rewarding, or educational messages, proctoring the process of responding, and designing user-friendly surveys. Interest in careless responding has been growing not only in business and management but also in other related disciplines. Future research and practice on careless responding in the business and management areas can also benefit from findings in other related disciplines.
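
Two of the detection approaches named above, instructed response items and individual response variability, can be illustrated with a minimal Python sketch. The function names, the cutoff values (`min_irv`, `max_failed`), and the sample answers are hypothetical choices made for illustration and are not taken from the article; in practice cutoffs would need to be justified for the specific survey.

```python
from statistics import stdev


def individual_response_variability(responses):
    """Within-person standard deviation across a respondent's item answers."""
    return stdev(responses)


def failed_instructed_items(responses, instructed):
    """Count instructed response items answered incorrectly.

    `instructed` maps an item's position to the answer the item told
    the respondent to select (e.g. "select 1 for this item").
    """
    return sum(1 for idx, required in instructed.items()
               if responses[idx] != required)


def flag_careless(responses, instructed, min_irv=0.5, max_failed=0):
    """Flag a respondent when either index crosses its (assumed) cutoff."""
    low_variability = individual_response_variability(responses) < min_irv
    too_many_failures = failed_instructed_items(responses, instructed) > max_failed
    return low_variability or too_many_failures


# Hypothetical 5-point Likert answers; item 5 instructs "select 1".
attentive = [4, 2, 5, 1, 3, 1, 2, 5]
straightliner = [3, 3, 3, 3, 3, 3, 3, 3]
print(flag_careless(attentive, {5: 1}))
print(flag_careless(straightliner, {5: 1}))
```

Combining the two flags with a logical "or" is only one possible rule; the abstract's point about convergent evidence suggests that analysts may prefer to inspect each index separately before excluding anyone.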

Article

Eric Volmar and Kathleen M. Eisenhardt

Theory building from case studies is a research strategy that combines grounded theory building with case studies. Its purpose is to develop novel, accurate, parsimonious, and robust theory that emerges from and is grounded in data. Case research is well-suited to address “big picture” theoretical gaps and dilemmas, particularly when existing theory is inadequate. Further, this research strategy is particularly useful for answering questions of “how” through its deep and longitudinal immersion in a focal phenomenon. The process of conducting case study research includes a thorough literature review to identify an appropriate and compelling research question, a rigorous study design that involves artful theoretical sampling, rich and complete data collection from multiple sources, and a creative yet systematic grounded theory building process to analyze the cases and build emergent theory about significant phenomena. Rigorous theory building case research is fundamentally centered on strong emergent theory with precise theoretical logic and robust grounding in empirical data. Not surprisingly then, theory building case research is disproportionately represented among the most highly cited and award-winning research.

Article

James Mattingly and Nicholas Bailey

Stakeholder strategies, or firms’ approaches to stakeholder management, may have a significant impact on firms’ long-term prosperity and, thereby, on their life chances, as established in the stakeholder view of the firm. A systematic literature review surveyed the contemporary body of quantitative empirical research that has examined firm-level activities relevant to stakeholder management, corporate social responsibility, and corporate social performance, because these three constructs are often conflated in the literature. A search uncovered 99 articles published in 22 journals during the 10-year period from 2010 to 2019. Most studies employed databases reporting environmental, social, and governance (ESG) ratings, originally created for use in socially responsible investing and corporate risk assessment, but others employed content analysis of texts and primary surveys. Examination revealed a key difference in the scoring of data, in that some studies aggregated numerous indicators into a single composite index to indicate levels of stakeholder management, and other studies scored more articulated constructs. Articulated constructs provided richer observations, including governance and structural arrangements most likely to provide both stakeholder benefits and protections. Also observed were constraining influences of managerial and market myopia, sustaining influences from resilience and complexity frameworks, and recognition that contextual variables are contingencies that affect how the efficacy of stakeholder management strategies is recognized.

Article

Rhonda K. Reger and Paula A. Kincaid

Content analysis is to words (and other unstructured data) as statistics is to numbers (also called structured data)—an umbrella term encompassing a range of analytic techniques. Content analyses range from purely qualitative analyses, often used in grounded theorizing and case-based research to reduce interview data into theoretically meaningful categories, to highly quantitative analyses that use concept dictionaries to convert words and phrases into numerical tables for further quantitative analysis. Common specialized types of qualitative content analysis include methods associated with grounded theorizing, narrative analysis, discourse analysis, rhetorical analysis, semiotic analysis, interpretative phenomenological analysis, and conversation analysis. Major quantitative content analyses include dictionary-based approaches, topic modeling, and natural language processing. Though specific steps for specific types of content analysis vary, a prototypical content analysis requires eight steps beginning with defining coding units and ending with assessing the trustworthiness, reliability, and validity of the overall coding. Furthermore, while most content analysis evaluates textual data, some studies also analyze visual data such as gestures, videos and pictures, and verbal data such as tone. Content analysis has several advantages over other data collection and analysis methods. Content analysis provides a flexible set of tools that are suitable for many research questions where quantitative data are unavailable. Many forms of content analysis provide a replicable methodology to access individual and collective structures and processes. Moreover, content analysis of documents and videos that organizational actors produce in the normal course of their work provides unobtrusive ways to study sociocognitive concepts and processes in context, and thus avoids some of the most serious concerns associated with other commonly used methods. 
Content analysis requires significant researcher judgment such that inadvertent biasing of results is a common concern. On balance, content analysis is a promising activity for the rigorous exploration of many important but difficult-to-study issues that are not easily studied via other methods. For these reasons, content analysis is burgeoning in business and management research as researchers seek to study complex and subtle phenomena.
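
The dictionary-based approach mentioned above, in which a concept dictionary converts words into numerical tables, can be sketched in a few lines of Python. The categories, word lists, and sample sentence below are invented for illustration and do not come from any published dictionary; the tokenizer is a deliberately simple assumption that ignores stemming and multiword phrases.

```python
import re
from collections import Counter


def dictionary_counts(text, dictionary):
    """Count how often each category's words occur in a text.

    `dictionary` maps a category name to a set of lowercase words.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    token_counts = Counter(tokens)
    return {category: sum(token_counts[word] for word in words)
            for category, words in dictionary.items()}


# Hypothetical concept dictionary and document.
concepts = {
    "optimism": {"growth", "strong"},
    "threat": {"risk", "loss"},
}
letter = "We expect growth and strong growth, despite risk."
print(dictionary_counts(letter, concepts))
```

Applied to many documents, the resulting category counts form the rows of a numerical table that can feed further quantitative analysis, which is where the researcher judgment noted above (what goes into each word list) has its largest effect.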

Article

Guclu Atinc and Marcia J. Simmering

The use of control variables to improve inferences about statistical relationships in data is ubiquitous in management research. In both the micro- and macro-subfields of management, control variables are included to remove confounding variance and provide researchers with an enhanced ability to interpret findings. Scholars have explored the theoretical underpinnings and statistical effects of including control variables in a variety of statistical analyses. Further, a robust literature surrounding the best practices for their use and reporting exists. Specifically, researchers have been directed to report more detailed information in manuscripts regarding the theoretical rationale for the use of control variables, their measurement, and their inclusion in statistical analysis. Moreover, recent research indicates the value of removing control variables in many cases. Although there is evidence that articles recommending best practices for control variable use are increasingly being cited, researchers still lag in following these recommendations. Finally, there are avenues for valuable future research on control variables.

Article

Thomas Donaldson and Diana C. Robertson

Serious research into corporate ethics is nearly half a century old. Two approaches have dominated research; one is normative, the other empirical. The former, the normative approach, develops theories and norms that are prescriptive, that is, ones that are designed to guide corporate behavior. The latter, the empirical approach, investigates the character and causes of corporate behavior by examining corporate governance structures, policies, corporate relationships, and managerial behavior with the aim of explaining and predicting corporate behavior. Normative research has been led by scholars in the fields of moral philosophy, theology and legal theory. Empirical research has been led by scholars in the fields of sociology, psychology, economics, marketing, finance, and management. While utilizing distinct methods, the two approaches are symbiotic. Ethical and legal theory are irrelevant without factual context. Similarly, empirical theories are sterile unless translated into corporate guidance. The following description of the history of research in corporate ethics demonstrates that normative research methods are indispensable tools for empirical inquiry, even as empirical methods are indispensable tools for normative inquiry.

Article

Critical thinking is more than just fault-finding—it involves a range of thinking processes, including interpreting, analyzing, evaluating, inferencing, explaining, and self-regulating. The concept of critical thinking emerged from the field of education; however, it can, and should, be applied to other areas, particularly to research. Like most skills, critical thinking can be developed. However, critical thinking is also a mindset or a disposition that enables the consistent use and application of critical thought. Critical thinking is vital in business research, because researchers are expected to demonstrate a systematic approach and cogency in the way they undertake and present their studies, especially if they are to be taken seriously and for prospective research users to be persuaded by their findings. Critical thinking can be used in the key stages of many typical business research projects, specifically: the literature review; the use of inductive, deductive, and abductive reasoning and the relevant research design and methodology that follows; and contribution to knowledge. Research is about understanding and explaining phenomena, which is usually the starting point to solve a problem or to take advantage of an opportunity. However, to gain new insights (or to claim to), one needs to know what is already known, which is why many research projects start with a literature review. A literature review is a systematic way of searching and categorizing literature that helps to build the researchers’ confidence that they have identified and recognized prevailing (explicit) knowledge relevant to the development of their research questions. In a literature review, it is the job of the researcher to examine ideas presented through critical thinking and to scrutinize the arguments of the authors. Critical thinking is also clearly crucial for effective reasoning. Reasoning is the way people rationalize and explain. 
However, in the context of research, the three generally accepted distinct forms of reasoning (inductive, deductive, and abductive) are more analogous to specific approaches to shape how the literature, research questions, methods, and findings all come together. Inductive reasoning is making an inference based on evidence that researchers have in possession and extrapolating what may happen based on the evidence, and why. Deductive reasoning is a form of syllogism, which is an argument based on accepted premises and involves choosing the most appropriate alternative hypotheses. Finally, abductive reasoning is starting with an outcome and working backward to understand how and why, and by collecting data that can subsequently be decoded for significance (i.e., Is the identified factor directly related to the outcome?) and clarified for meaning (i.e., How did it contribute to the outcome?). Also, critical thinking is crucial in the design of the research method, because it justifies the researchers’ plan and action in collecting data that are credible, valid, and reliable. Finally, critical thinking also plays a role when researchers make arguments based on their research findings to ensure that claims are grounded in the evidence and the procedures.

Article

To understand and communicate research findings, it is important for researchers to consider two types of information provided by research results: the magnitude of the effect and the degree of uncertainty in the outcome. Statistical significance tests have long served as the mainstream method for statistical inferences. However, the widespread misinterpretation and misuse of significance tests has led critics to question their usefulness in evaluating research findings and to raise concerns about the far-reaching effects of this practice on scientific progress. An alternative approach involves reporting and interpreting measures of effect size along with confidence intervals. An effect size is an indicator of magnitude and direction of a statistical observation. Effect size statistics have been developed to represent a wide range of research questions, including indicators of the mean difference between groups, the relative odds of an event, or the degree of correlation among variables. Effect sizes play a key role in evaluating practical significance, conducting power analysis, and conducting meta-analysis. While effect sizes summarize the magnitude of an effect, the confidence intervals represent the degree of uncertainty in the result. By presenting a range of plausible alternate values that might have occurred due to sampling error, confidence intervals provide an intuitive indicator of how strongly researchers should rely on the results from a single study.
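
As a minimal sketch of the two quantities discussed above, the following Python code computes a standardized mean difference (Cohen's d, one common effect size for two groups) and a confidence interval for the raw mean difference. The interval uses a large-sample normal approximation, which is an assumption made here to keep the example within the standard library; small samples would normally call for a t-based interval. The function names and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def mean_difference_ci(group_a, group_b, confidence=0.95):
    """Normal-approximation confidence interval for the mean difference."""
    diff = mean(group_a) - mean(group_b)
    std_error = sqrt(variance(group_a) / len(group_a)
                     + variance(group_b) / len(group_b))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff - z * std_error, diff + z * std_error


# Hypothetical scores for two groups.
treated = [5, 6, 7, 8, 9]
control = [3, 4, 5, 6, 7]
print(cohens_d(treated, control))
print(mean_difference_ci(treated, control))
```

Reporting both numbers together follows the logic of the abstract: the effect size conveys magnitude, while the width of the interval conveys how much a single study leaves uncertain.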

Article

Conducting credible and trustworthy research to inform managerial decisions is arguably the primary goal of business and management research. Research designs, particularly the various types of experimental design available, are important building blocks for advancing toward this goal. Key criteria for evaluating research studies are internal validity (the ability to demonstrate causality), statistical conclusion validity (drawing correct conclusions from data), construct validity (the extent to which a study captures the phenomenon of interest), and external validity (the generalizability of results to other contexts). Perhaps most important, internal validity depends on the research design’s ability to establish that the hypothesized cause and outcome are correlated, that variation in them occurs in the correct temporal order, and that alternative explanations of that relationship can be ruled out. Research designs vary greatly, especially in their internal validity. Generally, experiments offer the strongest causal inference, because the causal variables of interest are manipulated by the researchers, and because random assignment makes subjects comparable, such that the sources of variation in the variables of interest can be well identified. Natural experiments can exhibit similar internal validity to the extent that researchers are able to exploit exogenous events creating (quasi-)randomized interventions. When randomization is not available, quasi-experiments aim at approximating experiments by making subjects as comparable as possible based on the best available information. Finally, non-experiments, which are often the only option in business and management research, can still offer useful insights, particularly when changes in the variables of interest can be modeled by adopting longitudinal designs.

Article

Alex Bitektine, Jeff Lucas, Oliver Schilke, and Brad Aeon

Experiments randomly assign actors (e.g., people, groups, and organizations) to different conditions and assess the effects on a dependent variable. Random assignment allows for the control of extraneous factors and the isolation of causal effects, making experiments especially valuable for testing theorized processes. Although experiments have long remained underused in organizational theory and management research, the popularity of experimental methods has seen rapid growth in the 21st century. Gatekeepers sometimes criticize experiments for lacking generalizability, citing their artificial settings or non-representative samples. To address this criticism, a distinction is drawn between an applied research logic and a fundamental research logic. In an applied research logic, experimentalists design a study with the goal of generalizing findings to specific settings or populations. In a fundamental research logic, by contrast, experimentalists seek to design studies relevant to a theory or a fundamental mechanism rather than to specific contexts. Accordingly, the issue of generalizability does not so much boil down to whether an experiment is generalizable, but rather whether the research design matches the research logic of the study. If the goal is to test theory (i.e., a fundamental research logic), then asking the question of whether the experiment generalizes to certain settings and populations is largely irrelevant.

Article

Hypothesis testing is an approach to statistical inference that is routinely taught and used. It is based on a simple idea: develop some relevant speculation about the population of individuals or things under study and determine whether data provide reasonably strong empirical evidence that the hypothesis is wrong. Consider, for example, two approaches to advertising a product. A study might be conducted to determine whether it is reasonable to assume that both approaches are equally effective. A Type I error is rejecting this speculation when in fact it is true. A Type II error is failing to reject when the speculation is false. A common practice is to test hypotheses with the Type I error probability set to 0.05 and to declare that there is a statistically significant result if the hypothesis is rejected. There are various concerns about, limitations to, and criticisms of this approach. One criticism is the use of the term “significant.” Consider the goal of comparing the means of two populations of individuals. Saying that a result is significant suggests that the difference between the means is large and important. But in the context of hypothesis testing it merely means that there is empirical evidence that the means are not equal. Situations can and do arise where a result is declared significant, but the difference between the means is trivial and unimportant. Indeed, the goal of testing the hypothesis that two means are equal has been criticized based on the argument that surely the means differ at some decimal place. A simple way of dealing with this issue is to reformulate the goal. Rather than testing for equality, determine whether it is reasonable to make a decision about which group has the larger mean. 
The components of hypothesis-testing techniques can be used to address this issue with the understanding that the goal of testing some hypothesis has been replaced by the goal of determining whether a decision can be made about which group has the larger mean. Another aspect of hypothesis testing that has seen considerable criticism is the notion of a p-value. Suppose some hypothesis is rejected with the Type I error probability set to 0.05. This leaves open the issue of whether the hypothesis would be rejected with Type I error probability set to 0.025 or 0.01. A p-value is the smallest Type I error probability for which the hypothesis is rejected. When comparing means, a p-value reflects the strength of the empirical evidence that a decision can be made about which has the larger mean. A concern about p-values is that they are often misinterpreted. For example, a small p-value does not necessarily mean that a large or important difference exists. Another common mistake is to conclude that if the p-value is close to zero, there is a high probability of rejecting the hypothesis again if the study is replicated. The probability of rejecting again is a function of the extent that the hypothesis is not true, among other things. Because a p-value does not directly reflect the extent the hypothesis is false, it does not provide a good indication of whether a second study will provide evidence to reject it. Confidence intervals are closely related to hypothesis-testing methods. Basically, they are intervals that contain unknown quantities with some specified probability. For example, a goal might be to compute an interval that contains the difference between two population means with probability 0.95. Confidence intervals can be used to determine whether some hypothesis should be rejected. Clearly, confidence intervals provide useful information not provided by testing hypotheses and computing a p-value. 
But an argument for a p-value is that it provides a perspective on the strength of the empirical evidence that a decision can be made about the relative magnitude of the parameters of interest. For example, to what extent is it reasonable to decide whether the first of two groups has the larger mean? Even if a compelling argument can be made that p-values should be completely abandoned in favor of confidence intervals, there are situations where p-values provide a convenient way of developing reasonably accurate confidence intervals. Another argument against p-values is that because they are misinterpreted by some, they should not be used. But if this argument is accepted, it follows that confidence intervals should be abandoned because they are often misinterpreted as well. Classic hypothesis-testing methods for comparing means and studying associations assume sampling is from a normal distribution. A fundamental issue is whether nonnormality can be a source of practical concern. Based on hundreds of papers published during the last 50 years, the answer is an unequivocal Yes. Granted, there are situations where nonnormality is not a practical concern, but nonnormality can have a substantial negative impact on both Type I and Type II errors. Fortunately, there is a vast literature describing how to deal with known concerns. Results based solely on some hypothesis-testing approach have clear implications about methods aimed at computing confidence intervals. Nonnormal distributions that tend to generate outliers are one source for concern. There are effective methods for dealing with outliers, but technically sound techniques are not obvious based on standard training. Skewed distributions are another concern. The combination of what are called bootstrap methods and robust estimators provides techniques that are particularly effective for dealing with nonnormality and outliers. 
Classic methods for comparing means and studying associations also assume homoscedasticity. When comparing means, this means that groups are assumed to have the same amount of variance even when the means of the groups differ. Violating this assumption can have serious negative consequences in terms of both Type I and Type II errors, particularly when the normality assumption is violated as well. There is vast literature describing how to deal with this issue in a technically sound manner.
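
The combination of bootstrap methods and robust estimators mentioned above can be illustrated with a percentile bootstrap confidence interval for the difference between two 20% trimmed means. This is a minimal sketch under stated assumptions rather than the article's own procedure: the trimming proportion, the number of resamples, the seed, and the data (each group includes one deliberate outlier) are all illustrative choices.

```python
import math
import random


def trimmed_mean(values, proportion=0.2):
    """Mean after dropping the lowest and highest `proportion` of values."""
    ordered = sorted(values)
    cut = math.floor(proportion * len(ordered))
    kept = ordered[cut:len(ordered) - cut]
    return sum(kept) / len(kept)


def bootstrap_ci_diff(group_a, group_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a difference in trimmed means."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group independently, with replacement.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(trimmed_mean(resample_a) - trimmed_mean(resample_b))
    diffs.sort()
    lower_index = round(alpha / 2 * n_boot)
    return diffs[lower_index], diffs[n_boot - lower_index - 1]


# Hypothetical data; each group contains one extreme outlier.
group_a = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 100]
group_b = [1, 2, 3, 4, 5, 6, 7, 8, 9, -50]
low, high = bootstrap_ci_diff(group_a, group_b)
print(low, high)
```

If the resulting interval excludes zero, that is consistent with deciding which group has the larger typical value, which mirrors the reformulated goal described earlier in the abstract. The interval makes no normality or equal-variance assumption, although it does assume independent random samples.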

Article

Rand R. Wilcox

Inferential statistical methods stem from the distinction between a sample and a population. A sample refers to the data at hand. For example, 100 adults may be asked which of two olive oils they prefer. Imagine that 60 say brand A. But of interest is the proportion of all adults who would prefer brand A if they could be asked. To what extent does 60% reflect the true proportion of adults who prefer brand A? There are several components to inferential methods. They include assumptions about how to model the probabilities of all possible outcomes. Another is how to model outcomes of interest. Imagine, for example, that there is interest in understanding the overall satisfaction with a particular automobile given an individual’s age. One strategy is to assume that the typical response Y, given an individual’s age, X, is given by Y = β₀ + β₁X, where the slope, β₁, and intercept, β₀, are unknown constants, in which case a sample would be used to make inferences about their values. Assumptions are also made about how the data were obtained. Was this done in a manner for which random sampling can be assumed? There is even an issue related to the very notion of what is meant by probability. Let μ denote the population mean of Y. The frequentist approach views probabilities in terms of relative frequencies and μ is viewed as a fixed, unknown constant. In contrast, the Bayesian approach views μ as having some distribution that is specified by the investigator. For example, it may be assumed that μ has a normal distribution. The point is that the probabilities associated with μ are not based on the notion of relative frequencies and they are not based on the data at hand. Rather, the probabilities associated with μ stem from judgments made by the investigator. Inferential methods can be classified into three types: distribution free, parametric, and non-parametric. The meaning of the term “non-parametric” depends on the situation as will be explained. 
The choice between parametric and non-parametric methods can be crucial for reasons that will be outlined. To complicate matters, the number of inferential methods has grown tremendously during the last 50 years. Even for goals that may seem relatively simple, such as comparing two independent groups of individuals, there are numerous methods that may be used. Expert guidance can be crucial in terms of understanding what inferences are reasonable in a given situation.
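
For the olive oil example above (60 of 100 adults prefer brand A), one simple frequentist answer to how well 60% reflects the true proportion is a normal-approximation (Wald) confidence interval. The sketch below assumes random sampling and a sample large enough for the approximation; the function name is hypothetical, and other intervals may behave better for small samples or proportions near 0 or 1.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Sample proportion with a Wald (normal-approximation) interval."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    # Clip to [0, 1] because a proportion cannot fall outside that range.
    return p_hat, max(0.0, p_hat - margin), min(1.0, p_hat + margin)


estimate, lower, upper = proportion_ci(60, 100)
print(estimate, lower, upper)
```

With these inputs the interval runs from roughly 0.50 to 0.70, so a sample of 100 leaves considerable uncertainty about the population proportion even though the point estimate is 0.60.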

Article

Fred Gault and Luc Soete

Innovation indicators support research on innovation and the development of innovation policy. Once a policy has been implemented, innovation indicators can be used to monitor and evaluate the result, leading to policy learning. Producing innovation indicators requires an understanding of what innovation is. There are many definitions in the literature, but innovation indicators are based on statistical measurement guided by international standard definitions of innovation and of innovation activities. Policymakers are not just interested in the occurrence of innovation but in the outcome. Does it result in more jobs and economic growth? Is it expected to reduce carbon emissions, to advance renewable energy production and energy storage? How does innovation support the Sustainable Development Goals? From the innovation indicator perspective, innovation can be identified in surveys, but that only shows that there is, or there is not, innovation. To meet specific policy needs, a restriction can be imposed on the measurement of innovation. The population of innovators can be divided into those meeting the restriction, such as environmental improvements, and those that do not. In the case of innovation indicators that show a change over time, such as “inclusive innovation,” there may have to be a baseline measurement followed by a later measurement to see if inclusiveness is present, or growing, or not. This may involve social as well as institutional surveys. Once the innovation indicators are produced, they can be made available to potential users through databases, indexes, and scoreboards. Not all of these are based on the statistical measurement of innovation. Some use proxies, such as the allocation of financial and human resources to research and development, or the use of patents and academic publications. 
The importance of the databases, indexes, and scoreboards is that the findings may be used for the ranking of “innovation” in participating countries, influencing their behavior. While innovation indicators have always been influential, they have the potential to become more so. For decades, innovation indicators have focused on innovation in the business sector, while there have been experiments on measuring innovation in the public (general government sector and public institutions) and the household sectors. Historically, there has been no standard definition of innovation applicable in all sectors of the economy (the business, public, and household sectors, and the sector of non-profit organizations serving households). This changed with the Oslo Manual in 2018, which published a general definition of innovation applicable in all economic sectors. Applying a general definition of innovation has implications for innovation indicators and for the decisions that they influence. If the general definition is applied to the business sector, it includes product innovations that are made available to potential users rather than being introduced on the market. The product innovation can be made available at zero price, which influences the innovation indicators that are used to describe the digital transformation of the economy. The general definition of innovation, the digital transformation of the economy, and the growing importance of zero price products all influence innovation indicators.

Article

Heather A. Haveman and Gillian Gualtieri

Research on institutional logics surveys systems of cultural elements (values, beliefs, and normative expectations) by which people, groups, and organizations make sense of and evaluate their everyday activities, and organize those activities in time and space. Although there were scattered mentions of this concept before 1990, this literature really began with the 1991 publication of a theory piece by Roger Friedland and Robert Alford. Since that time, it has become a large and diverse area of organizational research. Several books and thousands of papers and book chapters have been published on this topic, addressing institutional logics in sites as different as climate change proceedings of the United Nations, local banks in the United States, and business groups in Taiwan. Several intellectual precursors to institutional logics provide a detailed explanation of the concept and the theory surrounding it. These literatures developed over time within the broader framework of theory and empirical work in sociology, political science, and anthropology. Papers published in ten major sociology and management journals in the United States and Europe (between 1990 and 2015) provide analysis and help to identify trends in theoretical development and empirical findings. 
Evaluating these trends suggests three gentle corrections and potentially useful extensions to the literature that can help to guide future research: (1) limiting the definition of institutional logic to cultural-cognitive phenomena, rather than including material phenomena; (2) recognizing both “cold” (purely rational) cognition and “hot” (emotion-laden) cognition; and (3) developing and testing a theory (or multiple related theories), meaning a logically interconnected set of propositions concerning a delimited set of social phenomena, derived from assumptions about essential facts (axioms), that details causal mechanisms and yields empirically testable (falsifiable) hypotheses. This third extension can be pursued by being more consistent about how we use concepts in theoretical statements; assessing the reliability and validity of our empirical measures; and conducting meta-analyses of the many inductive studies that have been published, to develop deductive theories.

Article

Statistics used to index interrater similarity are prevalent in many areas of the social sciences, with multilevel research being one of the most common domains for estimating interrater similarity. Multilevel research spans multiple hierarchical levels, such as individuals, teams, departments, and the organization. There are three main research questions that multilevel researchers answer using indices of interrater agreement and interrater reliability: (a) Does the nesting of lower-level units (e.g., employees) within higher-level units (e.g., work teams) result in the non-independence of residuals, which is an assumption of the general linear model?; (b) Is there sufficient agreement between scores on measures collected from lower-level units (e.g., employees’ perceptions of customer service climate) to justify aggregating data to the higher level (e.g., team-level climate)?; and (c) Following data aggregation, how effective are the higher-level unit means at distinguishing between those higher-level units (e.g., how reliably do team climate scores distinguish between the teams)? Interrater agreement and interrater reliability refer to the extent to which lower-level data nested or clustered within a higher-level unit are similar to one another. While closely related, interrater agreement and reliability differ from one another in how similarity is defined. Interrater reliability is the relative consistency in lower-level data. For example, to what degree do the scores assigned by raters tend to correlate with one another? Alternatively, interrater agreement is the consensus of the lower-level data points. For example, estimates of interrater agreement are used to determine the extent to which ratings made by judges/observers could be considered interchangeable or equivalent in terms of their values.
Thus, while interrater agreement and reliability both estimate the similarity of ratings by judges/observers, they define interrater similarity in slightly different ways, and these statistics are suited to address different types of research questions. The first research question that these statistics address, the issue of non-independence, is typically measured using an intraclass correlation statistic that is a function of both interrater reliability and agreement. However, in the context of non-independence, the intraclass correlation is most often interpreted as an effect size. The second multilevel research question, concerning adequate agreement to aggregate lower-level data to a higher level, would require a measure of interrater agreement, as the researcher is looking for consensus among raters. Finally, the third multilevel research question, concerning the reliability of higher-level means, not only requires a different variation of the intraclass correlation, but is also a function of both interrater reliability and agreement. Multilevel research requires researchers to appropriately apply interrater agreement and/or reliability statistics to their data, as well as follow best practices for calculating and interpreting these statistics.
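The three questions above can be illustrated with a small sketch. It assumes the commonly used one-way ANOVA forms of the intraclass correlation, often labeled ICC(1) and ICC(2), and the single-item r_wg agreement index with a uniform null distribution. The abstract does not name specific formulas, so these choices, the function names, and the team ratings are assumptions made for this example. Equal group sizes are assumed to keep the arithmetic simple.

```python
from statistics import mean, variance

def one_way_anova(groups):
    """Return (MS between, MS within, group size) for equal-sized groups."""
    k = len(groups[0])
    if any(len(g) != k for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    grand_mean = mean([x for g in groups for x in g])
    ss_between = sum(k * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (len(groups) * (k - 1))
    return ms_between, ms_within, k

def icc1(groups):
    """Question (a): share of variance attributable to group membership."""
    msb, msw, k = one_way_anova(groups)
    return (msb - msw) / (msb + (k - 1) * msw)

def icc2(groups):
    """Question (c): reliability of the group means."""
    msb, msw, _ = one_way_anova(groups)
    return (msb - msw) / msb

def rwg(ratings, n_options):
    """Question (b): within-group agreement on one item.

    Compares the observed variance with the variance expected if raters
    responded uniformly at random on an n_options-point scale.
    """
    expected_variance = (n_options ** 2 - 1) / 12
    return 1 - variance(ratings) / expected_variance

# Hypothetical climate ratings (5-point scale) from three teams of three.
teams = [[4, 5, 4], [2, 1, 2], [3, 3, 4]]

print("ICC(1):", round(icc1(teams), 3))
print("ICC(2):", round(icc2(teams), 3))
print("r_wg per team:", [round(rwg(t, 5), 3) for t in teams])
```

Note that r_wg is computed separately for each team, because agreement is a property of a single group, whereas the two intraclass correlations are computed across all teams at once.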

Article

Intersectionality is a critical framework that provides us with the mindset and language for examining interconnections and interdependencies between social categories and systems. Intersectionality is relevant for researchers and for practitioners because it enhances analytical sophistication and offers theoretical explanations of the ways in which heterogeneous members of specific groups (such as women) might experience the workplace differently depending on their ethnicity, sexual orientation, and/or class and other social locations. Sensitivity to such differences enhances insight into issues of social justice and inequality in organizations and other institutions, thus maximizing the chance of social change. The concept of intersectional locations emerged from the racialized experiences of minority ethnic women in the United States. Intersectional thinking has gained increased prominence in business and management studies, particularly in critical organization studies. A predominant focus in this field is on individual subjectivities at intersectional locations (such as examining the occupational identities of minority ethnic women). This emphasis on individuals’ experiences and within-group differences has been described variously as “content specialization” or an “intracategorical approach.” An alternate focus in business and management studies is on highlighting systematic dynamics of power. This encompasses a focus on “systemic intersectionality” and an “intercategorical approach.” Here, scholars examine multiple between-group differences, charting shifting configurations of inequality along various dimensions. As a critical theory, intersectionality conceptualizes knowledge as situated, contextual, relational, and reflective of political and economic power. 
Intersectionality tends to be associated with qualitative research methods due to the central role of giving voice, elicited through focus groups, narrative interviews, action research, and observations. Intersectionality is also utilized as a methodological tool for conducting qualitative research, such as by researchers adopting an intersectional reflexivity mindset. Intersectionality is also increasingly associated with quantitative and statistical methods, which contribute to intersectionality by helping us understand and interpret the individual, combined (additive or multiplicative) effects of various categories (privileged and disadvantaged) in a given context. Future considerations for intersectionality theory and practice include managing its broad applicability while attending to its sociopolitical and emancipatory aims, and theoretically advancing understanding of the simultaneous forces of privilege and penalty in the workplace.

Article

A limited dependent variable (LDV) is an outcome or response variable whose value is either restricted to a small number of (usually discrete) values or limited in its range of values. The first type of LDV is commonly called a categorical variable; its value indicates the group or category to which an observation belongs (e.g., male or female). Such categories often represent different choice outcomes, where interest centers on modeling the probability each outcome is selected. An LDV of the second type arises when observations are drawn about a variable whose distribution is truncated, or when some values of a variable are censored, implying that some values are wholly or partially unobserved. Methods such as linear regression are inadequate for obtaining statistically valid inferences in models that involve an LDV. Instead, different methods are needed that can account for the unique statistical characteristics of a given LDV.
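A minimal sketch of one reason linear regression is a poor fit for a categorical LDV: with a binary outcome, a fitted straight line can predict "probabilities" below 0 or above 1, whereas a logistic model keeps predictions inside that range. The data are invented, the function names are hypothetical, and the logistic model is fitted by plain gradient ascent purely for illustration. The abstract does not prescribe a particular estimator.

```python
import math

# Hypothetical data: x is a predictor, y is a binary choice outcome.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

def fit_ols(x, y):
    """Ordinary least squares for one predictor (a linear probability model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def fit_logit(x, y, steps=5000, learning_rate=0.01):
    """Logistic regression fitted by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    for _ in range(steps):
        p = [sigmoid(b0 + b1 * a) for a in x]
        b0 += learning_rate * sum(b - q for b, q in zip(y, p))
        b1 += learning_rate * sum((b - q) * a for a, b, q in zip(x, y, p))
    return b0, b1

ols_b0, ols_b1 = fit_ols(x, y)
logit_b0, logit_b1 = fit_logit(x, y)

for value in (-2, 9):
    linear = ols_b0 + ols_b1 * value
    logistic = sigmoid(logit_b0 + logit_b1 * value)
    print(f"x={value}: linear={linear:.3f}, logistic={logistic:.3f}")
```

The same concern, that the model must respect the restricted range of the outcome, motivates separate methods for truncated and censored variables, which this sketch does not cover.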

Article

James M. Diefendorff, Faith Lee, and Daniel Hynes

Longitudinal research involves collecting data from the same entities on two or more occasions. Almost all organizational theories outline a longitudinal process in which one or more variables cause a subsequent change in other variables. However, the majority of empirical studies rely on research designs that do not allow for the proper assessment of change over time or the isolation of causal effects. Longitudinal research begins with longitudinal theorizing. With this in mind, a variety of time-based theoretical concepts are helpful for conceptualizing how a variable is expected to change. This includes when variables are expected to change, the form or shape of the change, and how big the change is expected to be. To aid in the development of causal hypotheses, researchers should consider the history of the independent and dependent variables (i.e., how they may have been changing before the causal effect is examined), the causal lag between the variables (i.e., how long it takes for the dependent variable to start changing as a result of the independent variable), as well as the permanence, magnitude, and rate of the hypothesized change in the dependent variable. After hypotheses have been formulated, researchers can choose among various research designs, including experimental, concurrent or lagged correlational, or time series. Experimental designs are best suited for inferring causality, while time series designs are best suited for capturing the specific timing and form of change. Lagged correlation designs are useful for examining the direction and magnitude of change in a variable between measurements. Concurrent correlational designs are the weakest for inferring change or causality. Theory should dictate the choice of design, and designs can be modified and/or combined as needed to address the research question(s) at hand. 
Next, researchers should pay attention to their sample selection, the operationalization of constructs, and the frequency and timing of measures. The selected sample must be expected to experience the theorized change, and measures should be gathered as often as is necessary to represent the theorized change process (i.e., when the change occurs, how long it takes to unfold, and how long it lasts). Experimental manipulations should be strong enough to produce theorized effects and measured variables should be sensitive enough to capture meaningful differences between individuals and also within individuals over time. Finally, the analytic approach should be chosen based on the research design and hypotheses. Analyses can range from t-test and analysis of variance for experimental designs, to correlation and regression for lagged and concurrent designs, to a variety of advanced analyses for time series designs, including latent growth curve modeling, coupled latent growth curve modeling, cross-lagged modeling, and latent change score modeling. A point worth noting is that researchers sometimes label research designs by the statistical analysis commonly paired with the design. However, data generated from a particular design can often be analyzed using a variety of statistical procedures, so it is important to clearly distinguish the research design from the analytic approach.
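To make the contrast between concurrent and lagged correlational designs concrete, here is a small sketch with invented data in which the dependent variable at each wave simply equals the independent variable from the previous wave (a causal lag of one wave). The series, the lag, and the helper name `pearson` are assumptions chosen for illustration only.

```python
def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sd_u = sum((a - mean_u) ** 2 for a in u) ** 0.5
    sd_v = sum((b - mean_v) ** 2 for b in v) ** 0.5
    return cov / (sd_u * sd_v)

# Hypothetical eight-wave series for one entity.
x = [1, 3, 2, 5, 4, 6, 5, 8]
# y at wave t copies x at wave t-1; the first value is arbitrary.
y = [2] + x[:-1]

# Concurrent design: both variables measured at the same wave.
concurrent = pearson(x, y)
# Lagged design: x at wave t paired with y at wave t+1.
lagged = pearson(x[:-1], y[1:])

print(f"concurrent r = {concurrent:.3f}, lagged r = {lagged:.3f}")
```

Because the measurement interval here matches the assumed causal lag, the lagged correlation recovers the relationship while the concurrent correlation understates it, which is the point made above about timing measures to the theorized change process.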