Bayes’ theorem is a relatively simple equation but one of the most important mathematical principles discovered. It is a formalization of a basic cognitive process: updating expectations as new information is obtained. It was derived from the laws of conditional probability by Reverend Thomas Bayes and published posthumously in 1763. In the 21st century, it is used in academic fields ranging from computer science to social science. The theorem’s most prominent use is in statistical inference. In this regard, there are three essential tenets of Bayesian thought that distinguish it from standard approaches. First, any quantity that is not known as an absolute fact is treated probabilistically, meaning that a numerical probability or a probability distribution is assigned. Second, research questions and designs are based on prior knowledge and expressed as prior distributions. Finally, these prior distributions are updated by conditioning on new data through the use of Bayes’ theorem to create a posterior distribution that is a compromise between prior and data knowledge. This approach has a number of advantages, especially in social science. First, it gives researchers the probability of observing the parameter given the data, which is the inverse of the results from frequentist inference and more appropriate for social scientific data and parameters. Second, Bayesian approaches excel at estimating parameters for complex data structures and functional forms, and provide more information about these parameters compared to standard approaches. This is possible due to stochastic simulation techniques called Markov Chain Monte Carlo. Third, Bayesian approaches allow for the explicit incorporation of previous estimates through the use of the prior distribution. This provides a formal mechanism for incorporating previous estimates and a means of comparing potential results. Bayes’ theorem is also used in machine learning, which is a subset of computer science that focuses on algorithms that learn from data to make predictions. One such algorithm is the Naive Bayes Classifier, which uses Bayes’ theorem to classify objects such as documents based on prior relationships. Bayesian networks can be seen as a complicated version of the Naive Classifier that maps, estimates, and predicts relationships in a network. It is useful for more complicated prediction problems. Lastly, the theorem has even been used by qualitative social scientists as a formal mechanism for stating and evaluating beliefs and updating knowledge.