Ethnoreligious Data Collection
Summary and Keywords
Collecting and examining datasets on ethnicity and religion involves translating and codifying real-world phenomena, such as actions taken by governments and other groups, into data which can be analyzed by social science statistical techniques. This methodology is intended for phenomena which in their original form are not readily accessible to statistical analyses, i.e. “softer” phenomena and events such as government policies and conflict behavior. Thus, this methodology is not necessary for phenomena like GDP or government military spending; rather, it is based on behavior by organizations or groups of individuals which is assessed by a coder who translates this behavior into data. Aggregate data collected by this methodology should have three qualities. First, they must be reproducible. Second, the data must be transparent, in that all aspects of the data collection process and its products are clear and understandable to other researchers, to the extent that they could, in theory, be replicated. Third, they must measure what they intend to measure in a clear, accurate, and precise manner. A project which accomplishes all of this must be conceptualized properly from the beginning, including the decision on which unit of analysis to use and which cases to include and exclude. It must have appropriate sources and a tight variable design. Finally, the data must be collected in a systematic, transparent, and reproducible manner based upon those sources.
This essay describes a basic methodology for designing and collecting aggregate data on ethnicity and religion using human coders. Datasets based on this methodology should be reproducible and transparent, and must measure what they intend to measure in a clear, accurate, and precise manner. The essay describes a process for collecting such data which involves a number of steps and issues including: (1) defining the unit of analysis, (2) cases to include, (3) variable selection and design, (4) information sources for coding the variables, (5) the coding process, and (6) resources.
The essay reviews the literature, issues, and methods with which to translate and codify real-world phenomena, such as actions taken by governments and other groups, into data which can be analyzed by social science statistical techniques. The discussion focuses on major datasets on ethnicity and religion which are not demographic or survey-based and cover all major countries. As explained below, this is a relatively short list, though a selection of datasets on other topics are also discussed for purposes of comparison. Despite the emphasis on ethnicity and religion, the methodological issues reviewed here are applicable to other topics.
Ethnicity and Religion Datasets
Other than demographic and survey-based data, the number of datasets which focus on ethnicity and religion is limited. (For demographic data on religion see, among others, the CIA World Factbook, Barrett et al. (2001), Adherents.com, and Ethnologue.) The Minorities at Risk (MAR) dataset is the only major dataset which focuses on ethnic conflict and, as discussed in more detail below, is the only such dataset which uses an ethnic group as the unit of analysis and includes multiple variables on the characteristics and behavior of ethnic minorities. Other conflict datasets include ethnic conflicts. However, among these only the State Failure dataset identifies which conflicts are ethnically based and codes them separately. Others, like the PRIO/Uppsala dataset, include only data on conflict in general, and do not single out or differentiate ethnic conflicts.
The Religion and State (RAS) project is probably the most extensive dataset on government religion policy, including 62 variables on the general relationship between religion and state as well as on restrictions on the religious practices of religious minorities, regulation of the majority religion, and religious legislation. Other datasets tend to focus on narrower topics and/or do not include all states or all religious minorities in a state. Brian Grim (Grim et al. 2006; Grim and Finke 2006; 2007) collected a detailed dataset on discrimination against religious minorities, which, unlike the RAS dataset, also includes a variable for societal discrimination. Barrett et al. (2001) collected a variable for whether the state has a dominant religion, though the criteria for coding this are unclear, as it is not fully correlated with the presence of an official religion (Fox 2008). Barrett et al. (2001) also includes variables for the treatment of Christians but not other religious minorities. Other projects, such as Open Doors, similarly provide data focusing on the treatment of Christians.
Several data collections with religion variables look at state religion policy for only a portion of the world’s states. Chaves and Cann (1992), Minkenberg (2002) and Norris and Inglehart (2004) focus on Western democracies. Gill (1999) looks at Latin America. These collections have variables which are similar to those in the RAS dataset but fewer of them. Price (1999; 2002) measures the influence of religious law on several aspects of policy for 23 Muslim and 23 non-Muslim states. Fox (2004) collected variables for religious discrimination, institutions, and legitimacy for 105 religious minorities included in the MAR dataset. Most of these variables are precursors to those included in the RAS dataset.
Data Collection, General Principles
The form of aggregate data collection reviewed in this essay involves translating and codifying real-world phenomena such as actions taken by governments and other groups into data which can be analyzed by social science statistical techniques. This methodology is intended to be applied to phenomena which in their original form are in a format not readily accessible to statistical analyses. Thus, this methodology is not necessary for phenomena like GDP or government military spending. Rather it is intended to be applied to “softer” phenomena and events such as government policies and conflict behavior. It is different from survey-based data in that it is not based on asking individuals about their personal characteristics and views but, rather, is based on behavior by organizations or groups of individuals which are assessed by a coder who translates this behavior into data. Given the “soft” nature of ethnicity and religion, this methodology is ideally suited for data collection on these topics.
This methodology is based primarily on the methodologies developed to collect data for the MAR project (for more on this project, see Gurr 1993; 2000) and the RAS project (for more on this project, see Fox and Sandler 2003; 2005; Fox 2006; 2007; 2008). While the examples used in this essay are based on the datasets with which I am more familiar, the general comments and principles described are intended to be more widely applicable.
Aggregate data collected by this methodology should have three qualities. First, they must be reproducible. This means that if another party were to engage in the same data collection using the same methodology and sources, their results would be similar to those of the original data collection. Second, the data must be transparent, in that all aspects of the data collection process and its products are clear and understandable to other researchers, to the extent that they could, in theory, be replicated. In general, replication in quantitative international relations research refers to the ability to obtain the data used in a study and replicate the uses to which they were put. (See, for example, the symposium on replication in international studies research in International Studies Perspectives 4 (1), 2003 and particularly James 2003.) However, in theory the data collection process should also be reproducible, though the resources involved in the data collection processes described here make it unlikely that such replication will occur. Third, they must measure what they intend to measure in a clear, accurate, and precise manner.
A project which accomplishes all of this must be conceptualized properly from the beginning, including the decision on which unit of analysis to use and which cases to include and exclude. It must have appropriate sources and a tight variable design. Finally, the data must be collected in a systematic, transparent, and reproducible manner based upon appropriate sources. The remainder of the essay describes each of these steps in detail.
Units of Analysis
The first decision that has to be made, other than the topic of a study, is the unit of analysis. The unit of analysis is the basic entity that will be the basis for measurement and analysis in the study. The selection of a unit of analysis in many ways defines what can and what cannot eventually be done with data once they have been collected. For example, consider three projects which collected data on domestic conflict: the MAR project, the PRIO/Uppsala project, and the State Failure project. The PRIO dataset uses the state as the unit of analysis and the MAR dataset uses an ethnic minority within a state. This means that while the PRIO dataset has a single score for a state in a given year, the MAR dataset often codes several minorities in the same state.
There are both advantages and disadvantages to using ethnic minorities rather than the state as the unit of analysis. Both types of study can use data on the characteristics of the state and the international environment, but only MAR-type sub-state units of analysis allow differentiation between different minorities within a state. Thus the characteristics and behavior of these groups can be considered separately in analyses. This allows us to ask questions such as whether particular types of minority might be more or less violent. For example, Fox (2004), using the MAR data, finds that separatist minorities and religious minorities engage in significantly higher levels of violence.
In contrast, the PRIO dataset includes only global levels of conflict within a state and the traits of specific minority groups cannot be taken into account. Also, a state may have multiple ongoing conflicts in any given year. The PRIO-type of global state indicator does not account for this.
The MAR sub-state group indicators have two disadvantages in comparison to the state level of analysis. First, they do not include all instances of domestic conflict, such as civil wars that occur within the same ethnic group, including many of Latin America’s civil wars and conflicts between militant Muslims and Muslim states such as Algeria and Egypt. Second, there may be as many as 10,000 ethnic minorities in the world but the MAR project currently includes only 343. Minorities are included in the dataset if they meet one or both of two criteria: if the group is currently politically active in pursuit of group interests, and if the group suffers from persistent discrimination or differential treatment (Gurr 2000:7–9). This decision has resulted in the criticism that the dataset has issues of selection bias because it essentially includes no null cases (see, for example, Fearon and Laitin 1997). At the time of writing, the MAR project is working on including a number of null cases.
Null cases are important because they allow one to ask the question: why and when do ethnic minorities become politically active? MAR in its current form is limited to what influences the behavior of minorities which are politically active or are likely to become politically active due to high levels of discrimination. Gurr (2000:10–13) addresses these criticisms, arguing that the project has systematically collected a list of groups that are treated differentially and/or are politically active. Thus, the project represents a reasonably complete record of all serious conflicts between ethnic groups and governments. However, it is important to point out that in theory a project using sub-state groups as the unit of analysis can collect all relevant cases and MAR is currently in the process of adding null cases to the dataset.
The State Failure project uses an event as the unit of analysis, as do many international conflict databases such as the Correlates of War and the International Crisis Behavior projects. This creates a different set of advantages and disadvantages as compared to MAR and PRIO-type projects. Specifically, it codes all conflicts that occur within a state separately. Thus, it includes all violent domestic conflict and, unlike PRIO, is able to differentiate between different types of conflict. Specifically, it includes categories for ethnic wars, mass killings, revolutionary wars, and abrupt regime changes, with many events coded in multiple categories. This allows for comparisons between types of conflict and the coding of the ethnic or religious characteristics of the groups involved in a conflict. The State Failure dataset itself does not include this type of information. Fox (2004) has collected information on the religious and civilizational characteristics of the groups involved in the conflicts coded in the dataset.
However, since the State Failure dataset includes only cases where violence occurred, it is, in part, subject to the no-null-case criticism. It is possible to build a state-level dataset from this type of data, with variables for the presence or absence of a state failure in a given year. However, creating such a dataset with null minority or political groups which did not engage in violence would be, in essence, a new data collection. Also, the event as a unit of analysis can be more problematic for protracted conflicts. These projects often break such conflicts up into several events, but this can give some conflicts a greater influence on the overall results, and makes it more difficult to track a conflict over time.
The similarities and contrasts between these three datasets demonstrate that there are generally tradeoffs and compromises involved when selecting a unit of analysis, and that no single dataset is likely to have all of the desired properties and at the same time avoid all shortcomings. For example, a dataset like the PRIO dataset which simply includes all states is not subject to the disadvantages of MAR which are described above. However, in accomplishing this, the PRIO dataset gives up the ability to measure many of the nuances of domestic conflict, including the characteristics and behavior of specific minority groups.
This issue also applies to other common units of analysis. Dyads of states as the unit of analysis are subject to the same advantages and disadvantages as the state as a unit of analysis. However, most existing studies using dyads tend to focus on more simplistic questions such as whether ethnic or religious differences between states influence conflict (see, for example, Henderson 1997; 1998).
In some cases, the appropriate unit of analysis is set by the parameters of the study. For example, the RAS project looks at the extent of government involvement in religion. Thus it is clear that the appropriate unit of analysis is a government. Similarly, measures such as those used by the Polity and Freedom House projects to measure democracy must be coded using the state as the unit of analysis. However, even in this case there can be issues that influence the population of cases included in the study. These are discussed in the next section. Another example is when one wants to collect a few variables that are to be used in conjunction with an existing dataset. In this case, one must use a unit of analysis that is compatible with the unit of analysis of that dataset.
There is an additional advantage to using the state as the unit of analysis. State-level variables are compatible with the majority of existing cross-sectional social science datasets. A plurality, if not a majority, of existing datasets use this unit of analysis, including datasets such as the Polity, Freedom House, and Cingranelli–Richards (CIRI) datasets which focus on conflict, regime type, and human rights issues. These datasets are easily combined, allowing studies which use variables from multiple datasets. In addition, there are a number of state-level statistics available from sources like the UN, which measure economic development, demographics, and many other factors that social scientists may wish to correlate with the data from one’s own project. Thus, a dataset which uses the state as its unit of analysis is likely to be more widely used.
Specifically, state-level data are easily included in datasets like MAR since each ethnic minority is coded for a specific state. But integrating MAR into state-level datasets is more cumbersome. Democracy, for example, can be used as a control variable in examining the level of ethnic rebellion by simply adding the democracy score from another dataset such as the Polity dataset. However, if one wants to examine the impact of ethnic conflict on democracy, creating a state-level variable from the MAR dataset is considerably more cumbersome, though not impossible. Consequently, the Polity data, for example, are included in a large number of studies as a control variable. In contrast, while many add variables from other datasets to MAR, few add MAR variables to other datasets.
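The asymmetry described above, where attaching a state-level score to group-level records is easy but aggregating group-level records up to the state is a substantive choice, can be illustrated with a small sketch. All dataset names, groups, and scores below are hypothetical; the aggregation rule (here, the maximum) is one of several defensible choices.

```python
# Hypothetical MAR-style group-year records: one row per minority per year.
group_rows = [
    {"state": "A", "year": 2002, "group": "G1", "rebellion": 4},
    {"state": "A", "year": 2002, "group": "G2", "rebellion": 1},
    {"state": "B", "year": 2002, "group": "G3", "rebellion": 0},
]

# Hypothetical Polity-style state-year democracy scores.
democracy = {("A", 2002): 8, ("B", 2002): 3}

# Easy direction: attach the state-level score to each group record
# by a simple lookup on the (state, year) key.
for row in group_rows:
    row["democracy"] = democracy[(row["state"], row["year"])]

# Harder direction: collapse group records to one state-level value.
# Whether to take the maximum, the sum, or the mean is a substantive
# decision the researcher must make and document.
state_rebellion = {}
for row in group_rows:
    key = (row["state"], row["year"])
    state_rebellion[key] = max(state_rebellion.get(key, 0), row["rebellion"])
```

The first loop is trivial; the second forces a coding decision, which is precisely why few researchers add MAR variables to state-level datasets.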
A final issue with units of analysis is time. Aggregate data need to be coded within a specific time period. This time period can be a day, a week, a month, a year, or multiple years. Most cross-sectional datasets use the year as the time-unit of analysis. This means that the behavior, phenomenon, or quality in question is coded separately for each year. Thus in the MAR, State Failure, and PRIO datasets, the level of conflict within the relevant unit of analysis is coded separately for each year. This means that the coding for conflict in 2002, for example, is distinct from the codings for 2001 and 2003.
Cases to Include or Exclude
It is always best to include the entire universe of cases. This avoids issues of selection bias. It also has the advantage of reducing the importance of statistical significance in findings based on the dataset. Using the entire universe of cases is akin to having the actual election results in comparison to an exit poll. If a candidate has one vote more than another, that candidate has won because that one vote is a real difference. However, an exit poll uses a sample of a larger population to predict the behavior of that population. Statistically significant differences are those that are sufficiently large that there is only a small probability that these differences do not represent similar differences in the entire population. When looking at the entire population, this is not an issue because all differences are real differences. Some, such as McCloskey (1987), even argue that measures of statistical significance should not be used when examining the entire universe of cases, but the current standard of the discipline is to use measures of statistical significance as measures of the strength of a relationship.
Using the entire universe of cases does not necessarily mean using all possible cases. Rather it means using all cases which fit a clear definition. For example, the MAR dataset does not include all instances of domestic conflict. Rather, it includes all ethnic minorities which meet the criteria for inclusion in the dataset, as described above. Thus all results apply only to this limited set of cases, but within this specific definition MAR includes the entire universe of cases. Similar limitations include limiting data collection to a particular world region or time-span. The latter is an element in all aggregate datasets. As a general rule, longer time-spans are preferred. Also, it is often the case that collecting data over longer time-spans is more efficient in the long run. That is, collecting data for ten years at once is usually more efficient than coding five-year spans in two separate coding processes.
If it is not possible to include the entire universe of cases, in order to avoid issues of selection bias it is best to use random sampling techniques or to focus on a well-defined subset of states such as a specific world region (see, for example, Lai 2006) or a specific religion (see, for example, Price 1999). The latter option limits the relevance of the findings to that subset, but within that subset all potential cases are included.
However, in some cases this is not possible. For example, many social scientists studying religion want to use religiosity as a variable. Religiosity means the extent to which people are religious and is generally measured by survey questions such as: “do you believe in God?” and “how often do you attend religious services?” Such survey data can be converted into state-level variables by computing the percentage of people in a given state who answered these questions in a certain way (Barro and McCleary 2003). However, the most widely used international data collection on this topic, the World Values Survey, collected data for only 81 of the world’s states and fewer than that in any given year. Furthermore, there is selection bias in these surveys as the states included tend to be Western, Christian, and economically developed states with relatively democratic regimes. This is at least in part because the surveys tend to be funded by sources in the states in which the survey is conducted, so surveys are more likely to be performed in developed states whose academic institutions are connected to Western academia.
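The conversion of individual survey responses into a state-level variable, in the spirit of Barro and McCleary (2003), amounts to computing a percentage per state. A minimal sketch, with entirely hypothetical respondents and states:

```python
# Hypothetical individual-level survey responses to "do you believe in God?"
responses = [
    {"state": "A", "believes": True},
    {"state": "A", "believes": True},
    {"state": "A", "believes": False},
    {"state": "B", "believes": False},
    {"state": "B", "believes": True},
]

def belief_rate(rows):
    """Percent of respondents in each state answering yes."""
    totals, yes = {}, {}
    for r in rows:
        totals[r["state"]] = totals.get(r["state"], 0) + 1
        if r["believes"]:
            yes[r["state"]] = yes.get(r["state"], 0) + 1
    return {s: 100.0 * yes.get(s, 0) / totals[s] for s in totals}

# A state-level religiosity variable derived from individual responses.
religiosity = belief_rate(responses)
```

The computation itself is straightforward; the selection-bias problem discussed above lies in which states appear in the survey at all, not in the aggregation.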
In cases such as this, all results are tainted by selection bias. Yet it is worthwhile to perform such studies for two reasons. First, when there are practical obstacles to collecting data, it is appropriate to analyze the data that can be collected. These results cannot be considered definitive but they are accurate for the cases that are included and they are better than nothing. Second, based on the large number of studies using the World Values Survey (see, for example, Barro and McCleary 2003; Norris and Inglehart 2004; North and Gwin 2004; Ellingsen 2005; McCleary and Barro 2006a; 2006b; Scheve and Stasavage 2006), the use of these partial data is accepted as within the standards of the discipline if it is clear that collecting the data for all cases is not possible. That being said, it is important to make an effort to include all possible cases so that it can be said that the most comprehensive data available were used for the study.
Even when data are available and when using a unit of analysis that is straightforward, such as the state, issues of which states to include remain. For instance, there are a number of states which are officially recognized and have sovereignty, but which have populations of 10,000 people or fewer. On one hand, including these states has the advantage of creating a larger number of cases. On the other hand, are these states really comparable to the most populous ones, such as China and India, which have populations that are five orders of magnitude larger? Information on these micro-states is often scarce, making including them in the study difficult. Accordingly, most studies include only states with a certain minimum population, usually at least several hundred thousand. This is also true of studies based on other types of group, such as MAR, which uses a population cutoff of 100,000 or 1 percent of the state’s population, whichever is less.
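The MAR population cutoff just described, at least 100,000 members or 1 percent of the state’s population, whichever is less, can be expressed as a simple rule. A minimal sketch, with hypothetical population figures:

```python
def meets_population_cutoff(group_pop, state_pop):
    """MAR-style inclusion rule: the lower of the two bars applies."""
    threshold = min(100_000, 0.01 * state_pop)
    return group_pop >= threshold

# A group of 60,000 in a state of 5 million: 1 percent is 50,000,
# which is the lower bar, so the group qualifies.
small_state_case = meets_population_cutoff(60_000, 5_000_000)

# The same group in a state of 1 billion: the lower bar is 100,000,
# so the group does not qualify.
large_state_case = meets_population_cutoff(60_000, 1_000_000_000)
```

Making the rule explicit in this way is what allows other researchers to verify which cases were included and why.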
A second issue is what to do with political entities that have some but not all the attributes of states. Consider the Turkish Cypriot government of northern Cyprus. To all intents and purposes, save one, this political entity has all the attributes of a state. It is only lacking in international recognition. Similarly, the Palestinian Authority has many of the trappings of a state yet is officially an autonomous region that both is and is not part of Israel. Cases like these are relatively uncommon but politically charged. There is no set rule for including or not including such cases, but a decision must be made. That being said, collecting the data for such cases has the advantage of giving future users the option to decide for themselves. That is, if the data were collected, future users could easily decide to exclude these cases from their studies, but if the data were not collected, these cases could not be included in a study.
When using units of analysis other than the state, issues of potential cases which meet some but not all of the criteria which define that unit of analysis are likely to crop up. Accordingly, a project needs somehow to define which of these cases will and will not be included. It is best if there is a clear set of rules to make this determination.
There are two rules which should be applied when determining which cases to include. First, the variables to be collected cannot be one of the bases for including cases. For example, if one wanted to code a religious discrimination variable to be added to the MAR dataset, this would involve coding only some cases. This is because only some of the minorities in MAR are religious minorities, and the likelihood of religious discrimination against minorities which belong to the same religion as the majority group is small. Fox (2004), when coding such a variable, used religious identity as the basis for inclusion. A case was included if 80 percent of a minority’s members belonged to a different religion than 80 percent of the majority group, or to a different denomination of the same religion. When doing this type of selection there is a temptation to use the variable to be collected as a basis for determining which cases should be included, especially when it is unclear whether a case should be included. This should be avoided because it can result in an inherent selection bias.
Second, transparency is essential. Results of a data analysis are always specific to the cases which are included in a study. It is therefore essential that criteria for inclusion are clear and well defined.
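The inclusion rule Fox (2004) used, described above, illustrates how such criteria can be made explicit and checkable. A minimal sketch, assuming each group’s religious composition is available as shares; the group names and shares below are hypothetical:

```python
def religiously_distinct(minority_shares, majority_shares, cutoff=0.8):
    """A case qualifies if at least 80 percent of the minority and at least
    80 percent of the majority each belong to a single religion or
    denomination, and those two affiliations differ."""
    min_top = max(minority_shares, key=minority_shares.get)
    maj_top = max(majority_shares, key=majority_shares.get)
    return (minority_shares[min_top] >= cutoff
            and majority_shares[maj_top] >= cutoff
            and min_top != maj_top)

# A minority that is 90 percent Religion X facing a 95 percent
# Religion Y majority is included.
included = religiously_distinct(
    {"Religion X": 0.9, "Religion Y": 0.1},
    {"Religion Y": 0.95, "Religion X": 0.05},
)

# A minority sharing the majority's religion is excluded.
excluded = religiously_distinct(
    {"Religion Y": 1.0},
    {"Religion Y": 0.95, "Religion X": 0.05},
)
```

Stating the rule in this operational form, rather than deciding case by case, is what prevents the variable being coded from leaking into the inclusion decision.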
Variable Selection and Design
Variable design can only be finalized once the unit of analysis has been determined. Two issues are critically important for designing most aggregate variables. First, is there sufficient information available to code the variable? Missing data are a problem with many datasets, and the more data are missing, the less useful the dataset becomes. Missing data can also raise issues of selection bias, due to similarities between the states on which less information is available: these states tend to be non-Western, less economically developed, and smaller. Information availability is also a problem for some issues in autocratic regimes which repress the free flow of information.
Second, variables which are as specific and detailed as possible tend to be more accurate and useful. Specificity leaves coders less room for discretion. This increases inter-coder reliability, which means that two individuals looking at the same information will code the variable in the same way. This requirement is a standard one for events data coding and is discussed in more detail later in the essay. Variable specificity also increases a variable’s transparency – the extent to which exactly what is coded is clear.
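Inter-coder reliability, mentioned above, is commonly summarized in its simplest form as the percentage of cases on which two coders agree. A minimal sketch with hypothetical codings; in practice, chance-corrected statistics such as Cohen’s kappa are often preferred:

```python
# Hypothetical codings of the same eight cases by two independent coders.
coder_a = [2, 0, 1, 2, 1, 0, 2, 2]
coder_b = [2, 0, 1, 1, 1, 0, 2, 2]

def percent_agreement(a, b):
    """Share of cases, as a percentage, on which both coders agree."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)

# The coders agree on 7 of 8 cases.
agreement = percent_agreement(coder_a, coder_b)
```

Higher variable specificity raises this figure by leaving coders less room for discretion, which is exactly the mechanism the text describes.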
Specificity has two elements. The first is being specific as to who and what the variable measures. For example, Norris and Inglehart’s (2004) measures of state involvement in religion include a variable which measures whether “the state imprisons or detains some religious groups or individuals.” While it is somewhat clear who is taking the action – the state – the object of the action is not at all clear. Is the state imprisoning or detaining members of the majority religion or of a minority religion? If it is the latter, is it all minority religions or just certain minority religions? These questions are very pertinent because restrictions on the majority religion, all minority religions, or some minority religions reflect different attitudes toward religion. For example, both Iran and the USSR might have been coded the same on this variable for 1981, but for very different reasons.
The second element of specificity is to be as detailed as possible. It is easy to collapse detailed variables, but once the data have been collected, the only way to add more detail is to start a new coding process. Thus it is better to err on the side of too much detail rather than too little. Returning to the example of “the state imprisons or detains some religious groups or individuals,” this variable is lacking in this respect. It was simply coded as yes or no. This means that mass arrests where people are then jailed for years are coded the same as a few isolated incidents where arrests were made and those arrested were released shortly afterward. A more detailed variable could have been collapsed to replicate this one, but creating a more detailed variable from this one is not possible without duplicating most of the work that went into coding the original variable.
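The asymmetry described above, where detail can be discarded but never recovered, is easy to demonstrate. A minimal sketch, with a hypothetical four-level detention variable (the codes and states are illustrative, not drawn from any actual dataset):

```python
# Hypothetical detailed ordinal codes: 0 = no arrests, 1 = isolated arrests
# with quick release, 2 = sustained detentions, 3 = mass arrests.
detailed = {"State A": 0, "State B": 1, "State C": 3}

# Collapsing to the cruder yes/no variable is a one-line operation:
# any nonzero detailed code becomes 1 ("yes").
collapsed = {state: int(code > 0) for state, code in detailed.items()}
```

Going the other way, from `collapsed` back to `detailed`, is impossible: States B and C are indistinguishable once collapsed, so recovering the distinction would require recoding from the original sources.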
There are a number of ways to create a detailed variable. One way is to make use of an existing number or statistic which is a good surrogate for the phenomenon to be measured. For instance, many datasets use battle deaths as a measure of magnitude. However, there can be problems even with this type of measure. Population figures are another example: in the case of minority populations’ size, both the minority and the government, not to mention third parties, often have an interest in inflating or deflating these numbers. This was particularly true of coding the population size of Roma groups in many former Communist states for MAR. It is also important to assess how well a measure represents what we want to measure. For example, one can measure the number of people in a state who are officially members of an organized religion in order to measure religiosity in a state, but this does not differentiate between nominal members and those who attend religious services regularly, not to mention those who consider themselves religious but do not belong to recognized congregations.
Another option, used by the MAR dataset, is to create Guttman scales. These scales are ordinal ones where each level is clearly higher than the previous one, but the intervals between the various items on the scale are not necessarily the same. For example, MAR codes ethnic rebellion as follows:
1 Political banditry, sporadic terrorism.
2 Campaigns of terrorism.
3 Local rebellions: armed attempts to seize power in a locale. If they prove to be the opening round in what becomes a protracted guerrilla or civil war during the year being coded, code the latter rather than local rebellion. Code declarations of independence by a minority-controlled regional government here.
4 Small-scale guerrilla activity. (Small-scale guerrilla activity has all these three traits: fewer than 1000 armed fighters; sporadic armed attacks [fewer than 6 reported per year]; and attacks in a small part of the area occupied by the group, or in one or two other locales.)
5 Intermediate-scale guerrilla activity. (Intermediate-scale guerrilla activity has one or two of the defining traits of large-scale activity and one or two of the defining traits of small-scale activity.)
6 Large-scale guerrilla activity. (Large-scale guerrilla activity has all these traits: more than 1000 armed fighters; frequent armed attacks [more than 6 reported per year]; and attacks affecting a large part of the area occupied by group.)
7 Protracted civil war, fought by rebel military with base areas.
This variable has several advantages. It is sufficiently specific that it is transparent and that inter-coder reliability is high. Also, there is sufficient information to code the variable in most cases. The disadvantage is that it is an ordinal variable, so it is not continuous; nor is it even an interval variable, though some (e.g., Gurr 1993; 2000) use such variables in standard multivariate analyses based on the argument that they can be considered interval-like. It is basically a compromise. Producing a continuous variable was sacrificed in favor of one that measures what we want to measure and which can be coded accurately based on the available information. Another option would have been battle deaths or the number of combatants, measures included in the State Failure dataset, but these measures do not account for the type of warfare involved as the MAR rebellion measure does. Even some projects which use more “natural” statistics often use this type of scale. For example, the State Failure dataset used a Guttman scale for battle deaths where each level includes a range such as 1 to 100 or 101 to 1000. This methodology was used due to the difficulty in obtaining specific information in many cases. Most coding schemes will include compromises like this one. There is no set rule for determining the appropriate compromise, except that whatever compromise is made should be described clearly in order to assure transparency.
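The State Failure practice described above, coding a “natural” statistic such as battle deaths into ranged categories, can be sketched as follows. The 1–100 and 101–1000 ranges follow the text; the upper levels are hypothetical extensions for illustration:

```python
def deaths_to_scale(deaths):
    """Code a battle-death count into an ordinal ranged scale.
    Ranges 1-100 and 101-1000 follow the State Failure example above;
    the higher bins are illustrative assumptions."""
    bins = [(0, 0), (100, 1), (1000, 2), (10000, 3)]
    for upper, code in bins:
        if deaths <= upper:
            return code
    return 4  # above the highest defined range

# Hypothetical death counts coded onto the scale.
coded = [deaths_to_scale(d) for d in [0, 57, 850, 5000, 25000]]
```

This trades precision for codability: a source reporting “hundreds killed” can support a confident code of 2 even when the exact count is unknowable.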
The MAR project also uses these Guttman variables in another manner. It collects a list of Guttman variables which are then combined into a composite measure. For instance, it codes the following variables which all measure political restrictions on a minority group:
• Restrictions on freedom of expression
• Restrictions on free movement, place of residence
• Restrictions on rights in judicial proceedings
• Restrictions on political organization
• Restrictions on voting
• Restrictions on recruitment to police, military
• Restrictions on access to civil service
• Restrictions on attainment of high office
Each of these types of restriction is coded on the following Guttman scale:
0 = The activity is not significantly restricted for any group members.
1 = The activity is slightly restricted for most or all group members or sharply restricted for some of them.
2 = The activity is prohibited or sharply restricted for most or all group members.
These scales can be combined to form a variable that ranges from 0 to 18, which arguably has properties sufficiently similar to a continuous variable to be used in most forms of statistical analysis that would apply to a continuous variable (Gurr 1993; 2000).
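The equal-weight composite is simply the sum of the component codings. The sketch below (in Python, with hypothetical group scores) illustrates the idea; the item names abbreviate the restriction categories listed above.

```python
# A sketch of combining equally weighted Guttman-scale components into a
# composite measure. The group scores below are hypothetical; the item
# names abbreviate the restriction categories listed above.

COMPONENTS = [
    "expression", "movement", "judicial", "organization",
    "voting", "police_military", "civil_service", "high_office",
]

def composite(scores):
    """Sum the 0-2 component codings; with k items the total ranges 0 to 2k."""
    for item, value in scores.items():
        assert item in COMPONENTS and value in (0, 1, 2), (item, value)
    return sum(scores.values())

group_a = {c: 0 for c in COMPONENTS}              # no restrictions reported
group_b = dict(group_a, voting=2, high_office=1)  # restricted voting and office

print(composite(group_a))  # 0
print(composite(group_b))  # 3
```

Because each component is stored separately, the raw codings remain available for any later reweighting or disaggregation.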
However, one significant issue with such composite measures is the weighting of the individual components. The basic issue is whether all the items in the composite measure are of equal importance. Most users of such data tend to weight the individual components equally (e.g., Gurr 1993; 2000; Fox 2008) for three reasons. First, this is the simplest and most transparent weighting scheme. That is, it is easy to operationalize and it is the clearest in showing what has been done. Second, while most would agree that not all of the components of the composite measure are of equal importance, there is rarely agreement over which are the most important items. This is at least in part because the issues involved in such weighting are often normative or subjective. Third, assuming one can find a way to decide which items are more important than others which is not subject to substantial criticism, this opens the question of how much extra weight the more important items will be given. For example, if voting is determined to be more important than “restrictions on attaining high office,” is it twice as important? Three times as important? Or perhaps 1.5 times as important?
Put differently, there is often no way to produce an uncontested weighting scheme. Therefore, using the simple and transparent solution of not weighting the items in a composite variable is a reasonable default solution. This reflects the fact that many aggregate measures are to some extent based on a subjective process. The decision regarding which items to include on a scale of political discrimination or human rights violations is at least partially subjective. For instance, will women’s rights be included as one item or will there be several items which look at different specific types of women’s rights? The same can be said for violations of political and religious rights. Two projects, both looking at human rights, but making different decisions on these issues can produce composite variables for human rights which are quite different.
For example, the CIRI dataset includes three variables which look at women’s rights. It includes separate variables for political, social, and economic rights. In contrast, the Freedom House measure does not include any specific measures for women’s rights. Thus the status of women is a much heavier influence on the CIRI measure than on the Freedom House measure. In fact, the two datasets, while both measuring human rights, use significantly different sub-variables. Freedom House’s human rights component includes: (1) freedom of expression and belief, (2) associational and organizational rights, (3) rule of law, and (4) personal autonomy and individual rights. The CIRI dataset contains more categories, which are both more specific and cover more aspects of human rights. These include (1) political/extrajudicial killings, (2) unlawful/arbitrary deprivation of life, (3) disappearances, (4) torture, (5) political imprisonment, (6) freedom of speech and press, (7) freedom of religion, (8) freedom of movement, (9) freedom of assembly and association, (10) political participation, (11) worker’s rights, and (12) the three women’s rights variables noted above.
That said, the structure of this type of variable allows for a significant amount of flexibility. Since each item is coded separately, it is possible to use several weighting schemes and compare the results. Future researchers can combine the individual items in these composite variables in different ways or focus on only a few of them. The structure also allows for the use of inductive techniques for combining and weighting variables, such as factor or cluster analysis. However, these techniques can only be applied to variables once they have already been collected. The decision of which variables to collect is based on the methodological concerns discussed above as well as their theoretical relevance to the topic at hand. In any case, it is arguable that the results produced from multiple analyses using multiple weighting schemes will produce a body of research that is more useful than one based on any single weighting scheme. Others, such as Abouharb and Cingranelli (2006), prefer simply to use each component variable separately, rather than combining them.
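A comparison of weighting schemes of this kind can be sketched as follows; the scores and weights are hypothetical, chosen only to show how different schemes produce different composites from identical codings.

```python
# A sketch of applying alternative weighting schemes to identical component
# codings. The scores and weights are hypothetical.

scores = {"voting": 2, "high_office": 1, "expression": 0, "movement": 1}

def weighted_composite(scores, weights=None):
    """Equal weighting (weight 1.0) is the default for any unlisted item."""
    weights = weights or {}
    return sum(value * weights.get(item, 1.0) for item, value in scores.items())

equal = weighted_composite(scores)                          # all items weighted equally
voting_heavy = weighted_composite(scores, {"voting": 2.0})  # voting counts double

print(equal)         # 4.0
print(voting_heavy)  # 6.0
```

Running several such schemes over the same codings, and comparing the results, is exactly the kind of sensitivity analysis the disaggregated structure makes possible.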
In this context, it is important to emphasize that all such variables and the weighting schemes used to combine them have an element of subjectivity and can, from this perspective, be considered approximate measures of the general phenomena that are being measured. Yet this type of variable is potentially useful and valuable. It is the best that can be accomplished to measure some phenomena and can provide a genuine measurement of reality. Also, if the measure is transparent, it is clear what is being measured. From this perspective, such variables are accurate measures of the specific phenomena that they measure. For example, in the case of the MAR political restrictions variable described above, correlations with this variable show a general correlation with political restrictions and a specific correlation with the nine types of political restriction included in the variable.
Two factors are important for the purposes of determining the sources for the data. First, the sources must be reliable. Second, the sources must be sufficient to provide the information to code the variables. For purposes of transparency, it is also important to list clearly the sources used. Several types of source are used in events data coding.
(1) General sources can be very useful. When coding a religious rights variable, for example, general sources like the World Christian Encyclopedia (WCE) (Barrett et al. 2001) and the US State Department Reports on International Religious Freedom are useful. However, relying solely on these sources has disadvantages. For instance, they do not always provide complete information and sometimes have biases. The WCE focuses on Christian minorities and pays less attention to other minorities. The facts reported in the US State Department reports are accurate, but the reports sometimes spin those facts based on political agendas and sometimes omit information. Also, while the US State Department reports are more accurate than some critics would have it, the fact that many do not consider them completely legitimate undermines the legitimacy of variables based solely on these reports.
That being said, using only general sources often produces results similar to those using multiple sources. For example, Grim et al. (2006) compared codings on state religion policies based only on the US State Department Religious Freedom reports with similar variables in the RAS dataset, which was based on all four of the types of source described here; they found a statistically significant correlation of 0.817. However, Grim et al. (2006) implicitly support the principle of using multiple sources when they use this correlation with the RAS data as an indication of the validity of their data. Using the US State Department reports as the sole source for codings is not unusual. The CIRI dataset is also based primarily on this general source.
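A cross-dataset check of this kind amounts to correlating the two sets of codings for the same cases. A minimal sketch, using hypothetical codings rather than the actual Grim et al. and RAS data:

```python
# A sketch of validating one dataset's codings against another's for the
# same cases via a Pearson correlation. The codings below are hypothetical.
from math import sqrt

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

single_source = [0, 1, 3, 2, 5, 4, 1]  # codings based on one general source
multi_source  = [0, 1, 2, 2, 5, 5, 1]  # codings of the same cases from many sources

r = pearson(single_source, multi_source)
```

A high correlation suggests the single-source codings track the multi-source ones, though, as noted above, it cannot reveal information that both sets of sources missed.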
(2) Peer-refereed academic sources which focus on specific cases are useful and have high reliability. The problem with this type of source is that it tends to be available for some cases and not others. For example, when coding aspects of legal systems such as laws regulating religion, it is much easier to find such sources for Western states than non-Western, less economically developed states.
(3) Many NGOs and advocacy groups exist which publish reports on the issues that interest them. These reports can be useful but are also sometimes problematic. Like academic sources, they tend to be available for some cases and not others. Also, many NGOs have a political agenda and this can influence their reporting of the facts. Some cases may receive more attention than others. Facts may be spun or included selectively. It is not unheard of for facts to be distorted. Thus, all NGO and advocacy group sources need to be evaluated for reliability.
(4) Journalistic sources can also provide good and reliable information. Of course, the specific sources need to be evaluated for reliability, but mainstream press outlets with good reputations are generally reliable. Journalists’ coverage of the world is certainly uneven, with higher concentrations of journalists in some areas than in others. If one is coding variables based on newsworthy events such as wars and conflicts, this can be an advantage, since journalists tend to concentrate where such events occur. Even when this is not the case, the sheer volume of media coverage ensures that much of the information that is sought is available even for remote parts of the world. Databases like Lexis/Nexis, which include thousands of publications, make searching for information based on this type of source feasible. However, the older the event, the less useful these databases will be. Any project which covers a time period from approximately 1980 onward will not have this issue. For earlier events, projects must rely increasingly on cruder searches the further back the time period covered extends. Microfiche searches, for example, are considerably more time consuming. Another option is to use Keesing’s Record of World Events – a source which provides monthly summaries of the news on a country-by-country basis.
It is important to emphasize that the sources used may vary from case to case. Appropriate academic sources may be available for some cases and not others. Also, many media outlets, NGOs, and advocacy groups focus on specific countries and regions. The key is to get the best information available for each case on a case-by-case basis and to ensure that each case has sufficient sources to provide reliable information.
Whatever the sources used, it is always best to find multiple sources for each fact included in the variable codings. This increases the reliability and often the legitimacy of the information upon which the variable codings are based. Nevertheless, it will often occur that only one source exists. It is then important to assess the reliability of the source. If the source is reliable, it is appropriate to code the variable based on this source.
The problems inherent in relying systematically on a single source are illustrated by the comparison noted above performed by Grim et al. (2006) with the RAS dataset. This comparison was in response to criticism surrounding Grim et al.’s (2006) reliance solely on the US State Department reports. While Grim et al. (2006) argue, correctly, that these reports are reliable, the fact that the reports are from a government agency which is perceived in some quarters to have some biases undermines the legitimacy of the data collection. Fox (2008), in collecting the RAS dataset, relied on all four types of source listed above, which helps to alleviate this problem. Furthermore, while the US State Department reports were never found to have false information, the use of these additional sources uncovered additional information which clearly increased the accuracy of the codings.
Be that as it may, there are additional important issues regarding sources that can be relevant to this type of data collection. First, sources are less likely to report non-events – otherwise known as “dogs that did not bark.” One is unlikely to see a news article reporting that there were no ethnic riots in Denmark this week, unless perhaps there was one last week. Thus, if only one obscure source reports an event, one must ask why it was not reported elsewhere. Assuming there are no overt reliability issues with the source, if the issue being coded is not particularly newsworthy and/or the event took place in a location where there is little coverage by the relevant sources, there is no reason to believe the source is not accurate. However, if the event is one that would likely have been reported in other sources if it had occurred, it is most likely that the report is inaccurate and should not be coded.
A second issue is conflicting sources. This occurs often when dealing with demographic issues. For instance, when evaluating the population of a minority group in a state, the state may have an interest in downplaying that minority’s numbers, while minority advocacy groups may have an interest in exaggerating their numbers. The best solution in this case is to find a neutral source. If this is not available, the project should make an estimate based on its estimation of the reliability of the various sources. This should be clearly documented in order to ease the process of reevaluating the estimate should a new and more reliable source come to light.
It is rare that different sources will give opposite information. When this happens, there are a number of ways to try to determine the most accurate coding. First, the reliability of each source must be assessed, and this assessment can include the source’s own sources, if that information is available. Also, sometimes closer examination shows that both sources are accurate but refer to different time periods, different specific locations in the state, or some other basis for separating a single event into two separate ones.
A third issue is incomplete information. This can take two forms. In some cases it will be clear that information is not available or at least not sufficiently available to code a variable. In this case the variable should be left blank. However, often events will occur that would influence the codings but the information is never recorded at all or at least not recorded in any of the sources used by the coders. For example, a clash between opposition and government forces might occur in a remote location somewhere in the third world, but no record of this event exists in any of the sources used by the coders. Such an event, if it were known to the coders, could certainly influence the coding of a variable measuring conflict.
This is inevitable in many data collections and, by definition, it is impossible to know for which specific codings it is an issue. In practice, all such projects are based on the assumption that the variable, while not perfectly measuring reality, includes measurements that are sufficiently close to reality to be useful. As the information is less available in remote areas, less developed states, places with lower media coverage, and states which restrict information, there is also some inherent inaccuracy in datasets which code information that includes such areas. Nevertheless, as long as a reasonable effort is made to get the best information available, such datasets are considered to be within the accepted practices of the discipline. Otherwise datasets on domestic conflict such as MAR and the PRIO/Uppsala datasets would not be possible.
There are two rules of thumb for coding cases with limited information. First, if you have no direct evidence that something occurred, and you have sources that would probably report the event if it did occur, the event probably did not occur. For example, if there are no reports of battle deaths, it is appropriate to assume that none occurred. It may well be that some did in fact occur, but it is likely that if this is the case, it is a low number of battle deaths. This is because the larger the magnitude of an event, the more likely it is that it will be reported somewhere. Also, reporting verifiable facts is generally preferable to making guesses in the absence of such facts.
Second, partial information is better than no information. If there is partial information on a topic which is reliable, making an educated guess based on that information is appropriate. For instance, say information is available that a total of 5000 battle deaths occurred between 1990 and 1994 in some remote locale, but that no information beyond this is available. An appropriate solution would be to code 1000 battle deaths for each of these five years. Not coding the battle deaths would be worse than using such semi-accurate information because the semi-accurate information probably more closely resembles reality than coding no battle deaths. In such cases, it is important to record the basis for this coding so that if better information becomes available in the future, the codings can be modified accordingly.
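The even-distribution estimate described above can be sketched as follows; the helper function and the format of the recorded basis are illustrative, not part of any actual project’s procedures.

```python
# A sketch of the even-distribution estimate: only a multi-year total is
# known, so it is divided evenly across the years, and the basis for the
# estimate is recorded for future revision. All figures are hypothetical.

def spread_total(total, years):
    per_year = total / len(years)
    basis = (f"Estimate: {total} battle deaths reported for "
             f"{years[0]}-{years[-1]}; divided evenly across {len(years)} years.")
    return {year: per_year for year in years}, basis

codings, basis = spread_total(5000, list(range(1990, 1995)))
print(codings[1992])  # 1000.0
```

Keeping the recorded basis alongside the codings makes it straightforward to revise the estimate if better year-by-year figures later emerge.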
It is important to note that all large datasets will almost inevitably have some errors. Sources that seem reliable may not always be reliable. Some events that occur are not reported in the sources used by the project. Also, no matter what quality controls are employed, typos in data entry tend to occur. Over time, as the data are used, scholars identify many of these errors, which are then fixed. Nevertheless, there will almost assuredly be some errors in most major data collections. The general assumption is that these errors are random, so they do not introduce any systematic bias into the analyses based upon them.
The Coding Process
The coding process described here is not the only possible way to organize a coding project, but it is one that is effective and based on the process used by major datasets on ethnicity and religion, including the MAR and RAS projects. This section will also address the reasoning behind this methodology, the issues that must be dealt with when coding, and the advantages of this system.
The first step is that the coder will assemble sources as described above and write a report based on these sources. While a report is not strictly necessary and codings can be done directly from the primary sources, a report has a number of advantages which are described below. If there is no report, the sources used should be documented and saved so that they can be referenced in the future.
The major advantage of using a report is that this method tends to result in the use of the most sources. Projects which do not use this method (e.g., Grim et al. 2006; Abouharb and Cingranelli 2006) tend to rely on a few general sources. The more sources used, the more accurate the information on which the codings are based is likely to be.
It is critical that the reports include references that are as detailed as possible. This provides transparency, making the basis for the coding clear to both members of the project and, later, outside researchers. It is almost inevitable that some of the specific codings will be questioned or challenged and that new information will come to light that would influence the codings. In the case of new information coming to light, a properly referenced report facilitates the process of reevaluating and possibly altering the coding in question. In the case of more general criticism, if the source for the challenged coding is a reliable source which is referenced properly, it can defuse such criticism. In one case, I presented a paper at a conference that included the Polity measure of democracy. The next day I received a visit from the cultural attaché at the Romanian embassy who had attended the conference. He complained about how Romania was classified. I explained that the variable was based on the Polity project and showed him the project’s well-documented web-page, which described the reasoning and information behind the coding. This effectively defused his criticism.
In order to assure accuracy and quality, each report should be subjected to an internal referee process. Usually the referee will be the project manager or another experienced member of the project. Also, it is important that there be a single senior member of the project who reviews all reports. This helps ensure consistency in format and methodology amongst the project’s different research assistants, which is essential to assure that the results are not biased according to who coded a particular case. These evaluations of the reports are then given back to the research assistants, who make revisions. This process is repeated until the report is satisfactory.
These reports also have other uses. First, they can be the basis for a website or other publications. Both the MAR (e.g., Gurr 2000) and the RAS (e.g., Fox 2008) projects’ publications made use of the reports in this manner. Second, they assist in writing publications based on the project by providing the examples and anecdotes which can highlight the statistical results that the dataset produces. Third, they ease the process of checking why a variable was coded a certain way, even years after the original coding when the original coder is no longer available to answer questions. Finally, they can be used to code new variables. This should be done with deliberation because the coders tend to have in mind the facts that are required to code the variables for the original project, and may not have sought or included relevant information for the new variables. However, if the new variables involve issues that should have been included in the reports, using the reports in this manner is appropriate.
The second step is the coding. The research assistant who wrote the report should code the variables. This is essential because the person who wrote the report is the one most familiar with the information. Often when coding the variables, the research assistant may remember relevant information that was not included in the report. This offers an opportunity for this information to be included in the codings and for the report to be appropriately amended. After this, a single individual who is responsible for checking all codings – usually the primary investigator – will go over the report in tandem with the codings and then sit with the coder to discuss the codings. This helps to train the research assistant in the process of coding. These discussions should become shorter as the research assistant becomes more experienced. This supervision and discussion also helps to make sure that all of the coders are coding the variables in the same manner.
No matter how specific the variables, it is inevitable that some cases will come up where the variables are not sufficiently specific in some unanticipated way, and then the coders must use some discretion in coding the variables. It is important that when this happens, the means to resolve this issue be recorded and applied consistently in the future when the same issue comes up again. The principal investigator’s review of all codings helps to assure that this happens. Occasional meetings which are attended by the entire project staff where these issues are discussed can also be helpful.
The following example illustrates a number of these issues. The MAR variable for ethnic protest is a Guttman variable that is defined as follows:
0 None reported.
1 Verbal opposition.
2 Scattered acts of symbolic resistance.
3 Political organizing activity on a substantial scale.
4 A few demonstrations, rallies, strikes, and/or riots, total participation less than 10,000.
5 Demonstrations, rallies, strikes, and/or riots, total participation estimated between 10,000 and 100,000.
6 Demonstrations, rallies, strikes, and/or riots, total participation over 100,000.
Early in Phase III of the project (during the early 1990s), some coders interpreted items 4 to 6 on this scale as the number who attended the largest single demonstration in a given year, and others counted the total number of attendees at demonstrations throughout a given year. Supervision by the primary investigator and project manager as well as project meetings brought this to light. Each interpretation of the coding is a reasonable one, but it is essential for all coders to code the variable based on the same interpretation. Otherwise, the codings for this variable would be heavily influenced by who coded a specific case. The decision was taken to use the first of the above interpretations.
The final step is backup codings. The purpose of backup codings is to ensure inter-coder reliability. If two coders using the same information code the variables similarly, this increases confidence that the codings are reproducible. Ideally, both the report and codings for each case would be done separately by at least two different research assistants, but resource limitations generally make this impossible, and only the codings are done a second time based on the original report. It is preferable for all cases to have backup codings, but a reasonably sized sample of the codings can be sufficient. It is also common to check the codings against those of other datasets which collected similar variables (see, for example, Grim et al. 2006).
Generally a research assistant who was not involved in the primary coding process for that case uses the report to recode the variables. As the purpose of this endeavor is to provide a second set of codings to be compared to the original codings, the primary investigator does not comment on or alter these codings. For this reason, it is advisable to do backup codings later in the project so that the coders will have sufficient experience to do the backup codings without supervision. When the project is complete, these codings are compared to the original codings in order to assess inter-coder reliability.
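Inter-coder reliability can be quantified as simple percent agreement, or as Cohen’s kappa, which corrects for the agreement expected by chance. A sketch with hypothetical primary and backup codings:

```python
# A sketch of assessing inter-coder reliability between original and backup
# codings: simple percent agreement and Cohen's kappa, which corrects for
# chance agreement. The codings below are hypothetical.
from collections import Counter

def percent_agreement(a, b):
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    n = len(a)
    p_o = percent_agreement(a, b)  # observed agreement
    ca, cb = Counter(a), Counter(b)
    # chance agreement: product of each coder's marginal category frequencies
    p_e = sum((ca[k] / n) * (cb[k] / n) for k in set(ca) | set(cb))
    return (p_o - p_e) / (1 - p_e)

primary = [0, 2, 1, 1, 0, 2, 2, 1]  # original coder
backup  = [0, 2, 1, 0, 0, 2, 1, 1]  # backup coder, working from the same report

print(percent_agreement(primary, backup))  # 0.75
```

Kappa will always be lower than raw agreement, since some matching codings would occur even if both coders coded at random.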
A final issue is whether to alter the coding scheme after the coding process has already begun. As the project progresses, new information will often come to light which would be better coded if the coding scheme were altered either through altering or adding variables. This is certainly possible but not always advisable. Changing the coding scheme means going back over all previously coded cases and changing the codings when appropriate. Even if this involves just using the project’s reports, this can be time consuming and strain a project’s resources. The more cases that have already been coded, the more this is the case. Also, as noted above, the coders may not have had in mind the information necessary to code the new or altered variables when writing the reports. Thus, there are potential questions regarding the suitability of these reports as the basis for coding the new variables. This would require additional research for each case that was previously coded.
For instance, the RAS project coded two variables which measure whether a state supports religious education in public schools. The first measures optional religious education and the second measures mandatory religious education. It became clear in the course of the project that this topic requires far more nuanced and detailed variables. In several countries, religious education is mandatory for some but optional for others, based on location or religious affiliation. In some cases, the ability to opt out of religious education requires official approval, which is sometimes withheld. Moreover, religious education is often available for some religions with a significant number of students but not for other such religions. However, by the time this became clear, about half of the cases had already been coded and it was not feasible to introduce new variables.
In practice, resource limitations and the complexity of managing a constantly changing coding scheme often determine whether or not the coding scheme is altered. While this is not ideal, it is often necessary. Any project will learn from experience of the first round of coding how to do it better on the next round. This will result in a better coding scheme in subsequent rounds of coding.
A final but critical element of events data collection is resources. All such projects have a finite amount of funding, which means that there is a finite number of staff-hours available to do the work at hand. There are two ways in which this can be addressed. Either one can determine what is to be collected, determine the resources necessary to do this, and then seek those resources; or one can start with the available resources and determine what can be accomplished. In either case, it is essential to determine the amount of resources necessary to collect the data.
Most of the resources necessary for aggregate data collection are available to an academic with a university position: a good library and an internet connection. Most universities also have access to databases like Lexis/Nexis. The key resource which limits human aggregate data collection is manpower. Most of the manpower necessary for such a project is for generating reports and coding the data. In order to assess how much manpower a project will require, it is best to have some research assistants produce two or three sample reports. This will allow a determination of the approximate time needed per report and based on this it is possible to determine the approximate amount of manpower needed for a project. When doing so, it is also important to take into account other demands on research assistants’ time, such as staff meetings and training.
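Such an estimate is simple arithmetic. The sketch below uses hypothetical figures: hours logged on three sample reports, a target number of cases, and a fractional overhead for meetings and training.

```python
# A back-of-envelope manpower estimate from a few sample reports. All
# figures are hypothetical: hours logged on three sample reports, a target
# number of cases, and a fractional overhead for meetings and training.

def staff_hours_needed(sample_hours, n_cases, overhead_frac=0.15):
    avg_per_report = sum(sample_hours) / len(sample_hours)
    return n_cases * avg_per_report * (1 + overhead_frac)

# e.g., three sample reports took 18, 25, and 20 hours; 200 cases planned
hours = staff_hours_needed([18, 25, 20], n_cases=200)
print(round(hours))  # 4830
```

Dividing the total by the hours each research assistant can work per week then yields the staffing level, or the timeline, that the project budget must support.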
When using this manpower there are several factors that influence efficiency. First, a smaller number of research assistants working over a longer period of time tends to be more efficient because, as they gain experience with the coding process, they will work faster. They will also require less supervision later in the process, which will reduce demands on the senior staff’s time. However, the data will be available at a later date. Second, once the research assistants are producing reports on a topic, it is often the case that additional data can be collected at very little extra cost. For instance, the political restrictions variable from the MAR project described above includes nine items. Adding similar items such as restrictions on the freedom of assembly would, in practice, require minimal additional expenditure of resources. This is because when the research assistants do the research to produce their reports, the information necessary to code this variable would almost certainly be found in the same sources that they are examining in any case. Thus, when expending the resources to collect aggregate data, it is important to ask whether any extra variables relevant to the project (or even useful variables not directly relevant to the project) can be added which will not place a significant extra burden on the project’s resources.
Third, the coders themselves influence efficiency. While graduate students tend to be higher-quality coders, undergraduate coders can often do the work with sufficient supervision. Often this can be done in the context of a class where the undergraduate coders combine readings on the topic of the project with coding. This format can lower the monetary cost of manpower at the price of a higher time commitment by the project’s senior staff. Another issue is whether coders who are expert in the topic or region of the world they are coding are preferable. On one hand, their expertise can be useful, but on the other hand, it is arguable that the “naive” coder may be more objective.
This essay focuses on data produced by human coders. Other projects, most prominently the Kansas Event Data System (KEDS), have developed computer software that reads sources, usually media sources such as Reuters, and generates variables from them. This methodology, known as automated coding, has the advantage of creating transparent, accurate data without the cost of hundreds or thousands of hours of coder time. However, it is limited in two key ways. First, the types of variable it can collect are more limited. The methodology is useful for coding the number of events that occur during a given period of time based on the presence of certain keywords in the press, but it is less useful for coding more nuanced variables. For instance, it can count the conflict (or other types of) events that occur within a specific period of time, but it cannot sort out the specific types of political discrimination contained in the MAR political restrictions variable described above. Put differently, computer-generated data are superior for certain limited types of variable, but the more nuanced and detailed variables required to answer many key questions in the social sciences can currently be collected only by human coders.
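The logic of this kind of keyword-based automated event counting can be illustrated with a minimal sketch. The keyword list, sample headlines, and function name below are hypothetical and greatly simplified; real systems such as KEDS use much more sophisticated parsing of actor–verb–target patterns.

```python
from collections import Counter

# Hypothetical keyword list for flagging conflict events in news text.
CONFLICT_KEYWORDS = {"clash", "riot", "attack", "shelling", "ambush"}

def count_conflict_events(stories):
    """Count stories per year that contain a conflict keyword.

    `stories` is an iterable of (year, text) pairs; each matching
    story is treated as one event, as in simple automated coding.
    """
    counts = Counter()
    for year, text in stories:
        words = set(text.lower().split())
        if words & CONFLICT_KEYWORDS:  # any keyword present?
            counts[year] += 1
    return dict(counts)

sample = [
    (1994, "Rebels ambush convoy near the border"),
    (1994, "Trade talks resume after summit"),
    (1995, "Riot erupts in capital after election"),
]
print(count_conflict_events(sample))  # {1994: 1, 1995: 1}
```

The sketch makes the limitation discussed above concrete: the program can tally how many stories mention conflict, but nothing in it could distinguish, say, restrictions on freedom of assembly from other forms of political discrimination.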
Second, since media coverage varies from location to location, the ability to compare between cases is limited. For example, nearly every specific event that occurs in the Palestinian–Israeli conflict is well covered by the press, but the coverage of many conflicts in Africa is far more limited. Accordingly, codings based on this methodology would probably show more events occurring in the Palestinian–Israeli conflict in 1994 than in Rwanda, even though the genocidal war in Rwanda was clearly more violent in that year. Thus, computer-generated data are more appropriate for time-series analysis within a specific case than for cross-sectional analyses.
While this essay focuses on projects which collect data on religion and ethnic conflict and, to a lesser extent, on other types of conflict and human rights, the general issues discussed here are applicable to other types of events coding. The issues of defining the unit of analysis, which cases to include, variable selection, variable design, information sources, the coding process, and resources are common to all such projects. For instance, if one were to develop a dataset on migration or nationalism, all of these issues would apply.
Human-based aggregate data coding is a time-consuming and potentially expensive endeavor. The design and execution of the project must be well thought out in advance because once the time, effort, and funds have been expended, there is often no second chance. It must be done as well and efficiently as possible on the first try. The basic principles discussed in this essay are intended to aid in the design of such projects and help those managing the projects avoid potential pitfalls. The design of such a project is a small portion of the work involved, but it is the most critical aspect. Poor design will lead to a poor product, no matter how much effort was put into collecting the data. In contrast, well-designed data collection efforts that produce reproducible, transparent, accurate, and detailed data are invaluable to social science researchers.
References
Abouharb, M.R., and Cingranelli, D.L. (2006) The Human Rights Effect of World Bank Structural Adjustment, 1980–2001. International Studies Quarterly 50, 233–62.
Barrett, D.B., Kurian, G.T., and Johnson, T.M. (2001) World Christian Encyclopedia, 2nd edn. Oxford: Oxford University Press.
Barro, R.J., and McCleary, R.M. (2003) Religion and Economic Growth across Countries. American Sociological Review 68 (5), 760–81.
Chaves, M., and Cann, D.E. (1992) Religion, Pluralism and Religious Market Structure. Rationality and Society 4 (3), 272–90.
Ellingsen, T. (2005) Toward a Revival of Religion and Religious Clashes. Terrorism and Political Violence 17 (3), 305–32.
Fearon, J.D., and Laitin, D.D. (1997) A Cross-Sectional Study of Large-Scale Ethnic Violence in the Postwar Period. Unpublished paper, Department of Political Science, University of Chicago, Sept. 30.
Fox, J. (2004) Religion, Civilization and Civil War: 1945 Through the New Millennium. Lanham, MD: Lexington.
Fox, J. (2006) World Separation of Religion and State into the 21st Century. Comparative Political Studies 39 (5), 537–69.
Fox, J. (2007) Do Democracies Have Separation of Religion and State? Canadian Journal of Political Science 40 (1), 1–25.
Fox, J. (2008) A World Survey of Religion and the State. New York: Cambridge University Press.
Fox, J., and Sandler, S. (2003) Quantifying Religion: Toward Building More Effective Ways of Measuring Religious Influence on State-Level Behavior. Journal of Church and State 45 (3), 559–88.
Fox, J., and Sandler, S. (2005) Separation of Religion and State in the 21st Century: Comparing the Middle East and Western Democracies. Comparative Politics 37 (3), 317–35.
Gill, A. (1999) Government Regulation, Social Anomie and Religious Pluralism in Latin America: A Cross-National Analysis. Rationality and Society 11 (3), 287–316.
Grim, B.J., and Finke, R. (2006) International Religion Indexes: Government Regulation, Government Favoritism, and Social Regulation of Religion. Interdisciplinary Journal of Research on Religion 2 (1), 1–40.
Grim, B.J., and Finke, R. (2007) Religious Persecution in Cross-National Context: Clashing Civilizations or Regulating Religious Economies. American Sociological Review 72, 633–58.
Grim, B.J., Finke, R., Harris, J., Meyers, C., and VanErden, J. (2006) Measuring International Socio-religious Values by Coding State Department Reports. Paper delivered at the annual conference of the American Association of Public Opinion Researchers, Montreal, May 19.
Gurr, T.R. (1993) Minorities at Risk. Washington, DC: United States Institute of Peace.
Gurr, T.R. (2000) Peoples versus States: Minorities at Risk in the New Century. Washington, DC: United States Institute of Peace Press.
Henderson, E.A. (1997) Culture or Contiguity: Ethnic Conflict, the Similarity of States, and the Onset of War, 1820–1989. Journal of Conflict Resolution 41 (5), 649–68.
Henderson, E.A. (1998) The Democratic Peace through the Lens of Culture, 1820–1989. International Studies Quarterly 42 (3), 461–84.
James, P. (2003) Replication Policies and Practices at International Studies Quarterly. International Studies Perspectives 4 (1), 85–8.
Lai, B. (2006) An Empirical Examination of Religion and Conflict in the Middle East, 1950–1992. Foreign Policy Analysis 2 (1), 21–36.
McCleary, R.M., and Barro, R.J. (2006a) Religion and International Economy in an International Panel. Journal for the Scientific Study of Religion 45 (2), 149–75.
McCleary, R.M., and Barro, R.J. (2006b) Religion and Economy. Journal of Economic Perspectives 20 (2), 49–72.
McCloskey, D. (1987) The Rhetoric of Economics. Madison: University of Wisconsin Press.
Minkenberg, M. (2002) Religion and Public Policy: Institutional, Cultural, and Political Impact on the Shaping of Abortion Policies in Western Democracies. Comparative Political Studies 35 (2), 221–47.
Norris, P., and Inglehart, R. (2004) Sacred and Secular: Religion and Politics Worldwide. New York: Cambridge University Press.
North, C.M., and Gwin, C.R. (2004) Religious Freedom and the Unintended Consequences of State Religion. Southern Economic Journal 71 (1), 103–17.
Price, D.E. (1999) Islamic Political Culture, Democracy, and Human Rights. Westport: Praeger.
Price, D.E. (2002) Islam and Human Rights: A Case of Deceptive First Appearances. Journal for the Scientific Study of Religion 41 (2), 213–25.
Scheve, K., and Stasavage, D. (2006) Religion and Preferences for Social Insurance. Quarterly Journal of Political Science 1, 255–86.
Links to Digital Materials
Adherents.com. At www.adherents.com, accessed Jul. 2009. This site includes a listing of sources for religious demographic data.
Association of Religious Data Archives. At www.thearda.com, accessed Jul. 2009. The archive houses a wide variety of datasets on the general topic of religion. It also includes country profiles and, for the USA, county-by-county demographic data.
CIA World Factbook. At www.cia.gov/library/publications/the-world-factbook/index.html, accessed Jul. 2009. The factbook includes basic information on different countries.
Center for International Development and Conflict Management. At www.cidcm.umd.edu, accessed Jul. 2009. This site hosts a number of datasets including MAR, ICB, and Polity.
CIRI Human Rights Data Project. At http://ciri.binghamton.edu/index.asp, accessed Jul. 2009. The Cingranelli–Richards (CIRI) Human Rights Dataset contains standards-based quantitative information on government respect for 15 internationally recognized human rights for 195 countries, annually from 1981 to 2007.
Correlates of War project. At www.correlatesofwar.org, accessed Jul. 2009. This site houses the Correlates of War dataset as well as many other datasets including variables used in studies of international and domestic conflict.
Ethnologue – Languages of the World. At www.ethnologue.com, accessed Jul. 2009. This site provides a searchable bibliography of over 20,000 sources on world languages and on the number of speakers of particular languages in particular countries.
Freedom House. At www.freedomhouse.org, accessed Jul. 2009. Freedom House is a nonprofit organization which advocates freedom around the world. The site includes a wealth of information on this topic and houses the Freedom House dataset on political freedom and democracy.
International Peace Research Institute, Oslo (PRIO). At www.prio.no, accessed Jul. 2009. The International Peace Research Institute site houses multiple datasets on conflict as well as relevant publications.
Kansas Event Data System (KEDS). At http://web.ku.edu/keds/, accessed Jul. 2009. This site houses the KEDS event datasets.
Open Doors. At www.opendoorsuk.org, accessed Jul. 2009. Open Doors is a Christian advocacy group which provides information on the treatment of Christians worldwide.
Religion and State (RAS) dataset homepage. At www.religionandstate.org, accessed Jul. 2009. This site houses the RAS dataset and all information regarding the RAS project. The dataset includes 62 variables for 175 countries for 1990 to 2002.
State Failure Political Instability Task Force. At http://globalpolicy.gmu.edu/pitf/, accessed Jul. 2009. This site houses the State Failure dataset. It contains a list of all “state failures” including ethnic conflicts, revolutionary wars, mass killings, and abrupt regime changes from 1955 to 2006.
United Nations Statistics Division. At http://unstats.un.org/unsd/default.htm, accessed Jul. 2009. This site houses all of the general economic, development, and population statistics collected by the UN.
This research was supported by the Israel Science Foundation (Grant 896/00), the Sara and Simha Lainer Chair in Democracy and Civility, and the John Templeton Foundation. The opinions expressed in this paper are those of the author and do not necessarily reflect the views of the John Templeton Foundation.