Analyzing political text can answer many pressing questions in political science, from understanding political ideology to mapping the effects of censorship in authoritarian states. This makes the study of political text and speech an important part of the political science methodological toolbox. The confluence of increasing availability of large digital text collections, plentiful computational power, and methodological innovations has led many researchers to adopt automatic text analysis techniques for coding and analyzing textual data. In what is sometimes termed the “text as data” approach, texts are converted to a numerical representation, and techniques such as dictionary analysis, automatic scaling, topic modeling, and machine learning are used to find patterns in these data and to test hypotheses on them. These methods all make certain assumptions and need to be validated to assess their fitness for any particular task and domain.
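As a rough illustration of the “text as data” workflow described above, the following Python sketch converts texts to a bag-of-words document-term matrix and then applies a simple dictionary analysis. The toy corpus, the `economy_dictionary` word list, and the function names are hypothetical and invented for this example; real applications would need a validated dictionary and more careful preprocessing.

```python
import re
from collections import Counter

# Hypothetical toy corpus and dictionary, invented for illustration only.
documents = [
    "The government will cut taxes and reduce spending",
    "We must protect workers and raise the minimum wage",
    "Lower taxes help business and spending must fall",
]
economy_dictionary = {"taxes", "spending", "wage", "business"}


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


# Step 1: numerical representation (a bag-of-words document-term matrix).
counts = [Counter(tokenize(doc)) for doc in documents]
vocabulary = sorted(set().union(*counts))
dtm = [[c[term] for term in vocabulary] for c in counts]


# Step 2: dictionary analysis, here the share of a document's tokens
# that appear in the dictionary.
def dictionary_share(counter, dictionary):
    total = sum(counter.values())
    hits = sum(n for term, n in counter.items() if term in dictionary)
    return hits / total if total else 0.0


scores = [dictionary_share(c, economy_dictionary) for c in counts]
```

Each row of `dtm` is one document and each column one vocabulary term; `scores` holds one dictionary score per document. As the abstract stresses, such scores would still need to be validated for the task and domain at hand.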
Wouter van Atteveldt, Kasper Welbers, and Mariken van der Velden
Konstantinos V. Katsikopoulos
The polymath and political scientist Herbert Simon dared to point out that the amounts of time, information, computation, and other resources required for maximizing utility far exceed what is possible when real people have to make real decisions in the real world. In psychology, there are two main approaches to studying actual human judgment and decision making: the heuristics-and-biases and the fast-and-frugal-heuristics research programs. A distinctive characteristic of the fast-and-frugal-heuristics program is that it specifies formal models of heuristics and attempts to determine when people use them and what performance they achieve. These models rely on a few pieces of information that are processed in computationally simple ways. The information and computation are within human reach, which means that people rely on information they have relatively easy access to and employ simple operations such as summing or comparing numbers. Research in the laboratory and in the wild has found that most people use fast and frugal heuristics most of the time if a decision must be made quickly, if information is financially or cognitively expensive to gather, or if one or a few attributes of the problem strongly point towards an option. The ways in which people switch between heuristics are studied in the framework of the adaptive toolbox. Work employing computer simulations and mathematical analyses has uncovered conditions under which fast and frugal heuristics achieve higher performance than benchmarks from statistics and machine learning, and vice versa. These conditions constitute the theory of ecological rationality. This theory suggests that fast and frugal heuristics perform better than complex optimization models if the available information is of low quality or scarce, or if there exist dominant options or attributes. The bias-variance decomposition of statistical prediction error, which is explained in layperson’s terms, underpins these claims.
Research on fast and frugal heuristics suggests a governance approach not based on nudging, but on boosting citizen competence.
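To make concrete what a formal heuristic model built on “summing or comparing numbers” can look like, here is a minimal Python sketch of tallying, a heuristic commonly discussed in the fast-and-frugal literature. The abstract does not name a specific heuristic, so the choice of tallying, the cue names, and the candidate data are all assumptions made for illustration.

```python
def tally(cues):
    """Count the positive cues for one option (simple summing)."""
    return sum(1 for value in cues.values() if value)


def choose_by_tallying(option_a, option_b):
    """Pick the option with more positive cues; return None on a tie."""
    score_a, score_b = tally(option_a), tally(option_b)
    if score_a == score_b:
        return None
    return "A" if score_a > score_b else "B"


# Hypothetical binary cues a voter might have easy access to.
candidate_a = {"incumbent": True, "endorsed_by_party": True, "local": False}
candidate_b = {"incumbent": False, "endorsed_by_party": True, "local": False}

choice = choose_by_tallying(candidate_a, candidate_b)
```

The sketch uses only a few pieces of information and no weights or optimization, which is the sense in which such models are computationally simple.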
Kumail Wasif and Jeff Gill
Bayes’ theorem is a relatively simple equation but one of the most important mathematical principles ever discovered. It is a formalization of a basic cognitive process: updating expectations as new information is obtained. It was derived from the laws of conditional probability by Reverend Thomas Bayes and published posthumously in 1763. In the 21st century, it is used in academic fields ranging from computer science to social science. The theorem’s most prominent use is in statistical inference. In this regard, there are three essential tenets of Bayesian thought that distinguish it from standard approaches. First, any quantity that is not known as an absolute fact is treated probabilistically, meaning that a numerical probability or a probability distribution is assigned. Second, research questions and designs are based on prior knowledge and expressed as prior distributions. Finally, these prior distributions are updated by conditioning on new data through the use of Bayes’ theorem to create a posterior distribution that is a compromise between prior knowledge and the data. This approach has a number of advantages, especially in social science. First, it gives researchers the probability of the parameter given the data, which is the inverse of the results from frequentist inference and more appropriate for social scientific data and parameters. Second, Bayesian approaches excel at estimating parameters for complex data structures and functional forms, and provide more information about these parameters compared to standard approaches. This is possible due to stochastic simulation techniques called Markov chain Monte Carlo. Third, Bayesian approaches allow for the explicit incorporation of previous estimates through the use of the prior distribution. This provides a formal mechanism for incorporating previous estimates and a means of comparing potential results.
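The updating step described above, in which the posterior is proportional to the likelihood times the prior, can be sketched with a conjugate Beta-Binomial model, where the update has a closed form and no simulation is needed. The model choice, the prior parameters, and the data below are assumptions picked for illustration and are not taken from the article.

```python
def beta_binomial_update(prior_a, prior_b, successes, trials):
    """Return Beta posterior parameters after observing binomial data."""
    return prior_a + successes, prior_b + (trials - successes)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical prior belief about a proportion: Beta(2, 2), mean 0.5.
prior_a, prior_b = 2.0, 2.0
# Hypothetical new data: 7 successes in 10 trials, sample proportion 0.7.
successes, trials = 7, 10

post_a, post_b = beta_binomial_update(prior_a, prior_b, successes, trials)

prior_mean = beta_mean(prior_a, prior_b)
data_mean = successes / trials
posterior_mean = beta_mean(post_a, post_b)  # lies between the other two
```

Here the posterior is Beta(9, 5), whose mean of 9/14 falls between the prior mean and the sample proportion, which illustrates the sense in which the posterior is a compromise between the two. For models without a closed-form posterior, simulation methods such as Markov chain Monte Carlo would be used instead.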
Bayes’ theorem is also used in machine learning, which is a subfield of computer science that focuses on algorithms that learn from data to make predictions. One such algorithm is the naive Bayes classifier, which uses Bayes’ theorem to classify objects such as documents based on relationships learned from previously observed examples. Bayesian networks can be seen as a more complicated version of the naive Bayes classifier that map, estimate, and predict relationships in a network. They are useful for more complicated prediction problems. Lastly, the theorem has even been used by qualitative social scientists as a formal mechanism for stating and evaluating beliefs and updating knowledge.
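A minimal sketch of a multinomial naive Bayes document classifier, written in plain Python, may help show how Bayes’ theorem is applied to classification. The training documents, the labels, and the use of add-one (Laplace) smoothing are assumptions made for this example; it is not a reproduction of any particular library’s implementation.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    for words, label in labeled_docs:
        class_counts[label] += 1
        word_counts[label].update(words)
    vocabulary = {w for counter in word_counts.values() for w in counter}
    return class_counts, word_counts, vocabulary


def classify(words, class_counts, word_counts, vocabulary):
    """Return the class with the highest log posterior (up to a constant)."""
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Log prior: share of training documents in this class.
        score = math.log(class_counts[label] / total_docs)
        # Denominator for add-one smoothed word likelihoods.
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for w in words:
            if w in vocabulary:  # ignore words never seen in training
                score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled training documents.
training = [
    ("tax cut budget".split(), "economy"),
    ("budget deficit tax".split(), "economy"),
    ("border visa asylum".split(), "migration"),
    ("asylum border policy".split(), "migration"),
]
model = train(training)
predicted = classify("tax budget reform".split(), *model)
```

The classifier is “naive” because it treats words as conditionally independent given the class, so the log likelihoods of individual words are simply added to the log prior.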