Understanding Bayes' Theorem
Bayes’ Theorem updates beliefs in light of new evidence. It combines three key elements: what we believed before seeing the evidence (prior), how strongly the evidence supports the hypothesis (likelihood), and how common the evidence is overall (marginal probability).
Distinguish clearly between the prior and the likelihood. Confusing the two is the most common mistake in Bayes-related problems.
In the previous article, we explored applications of the Central Limit Theorem and learned about the behavior of sums of many independent and identically distributed random variables. Now, we turn to a foundational concept in probability theory—Bayes’ Theorem. Bayes’ Theorem is a crucial tool for understanding probability and reasoning under uncertainty, and it finds widespread application across artificial intelligence and machine learning.
Core Concepts of Bayes’ Theorem
Bayes’ Theorem expresses the relationship among conditional probabilities. First, let’s define several essential terms:
When applying Bayes’ Theorem, begin by writing down the original (prior) probability, the probability of the evidence, and the relevant conditional probabilities—then examine how the evidence shifts the credibility of the event.
- Prior Probability $P(A)$: Our estimated probability that event $A$ occurs before observing any evidence.
- Likelihood $P(B \mid A)$: The probability that evidence $B$ occurs given that event $A$ has occurred.
- Marginal Probability of Evidence $P(B)$: The total probability that evidence $B$ occurs, computed using the Law of Total Probability: $P(B) = P(B \mid A)P(A) + P(B \mid \neg A)P(\neg A)$, where $\neg A$ denotes the complement of $A$ (i.e., $A$ does not occur).
- Posterior Probability $P(A \mid B)$: The updated probability that event $A$ occurs after observing evidence $B$.
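The mathematical statement of Bayes’ Theorem, for an event $A$ and evidence $B$ with $P(B) > 0$, is:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$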
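As a minimal sketch (not part of the original article), the theorem for a binary hypothesis can be written as a small Python function. The function name `posterior` and its parameter names are illustrative assumptions; the denominator is expanded with the Law of Total Probability.

```python
def posterior(prior: float, likelihood: float, false_positive_rate: float) -> float:
    """Return P(A | B) for a binary hypothesis A and evidence B.

    prior               -- P(A)
    likelihood          -- P(B | A)
    false_positive_rate -- P(B | not A)
    (All names here are illustrative, not taken from any library.)
    """
    for name, p in (("prior", prior), ("likelihood", likelihood),
                    ("false_positive_rate", false_positive_rate)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {p}")

    # Law of Total Probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("P(B) is zero, so P(A | B) is undefined")

    # Bayes' Theorem: P(A|B) = P(B|A)P(A) / P(B)
    return likelihood * prior / evidence


# Hypothetical inputs: an even prior and evidence four times likelier under A.
print(posterior(prior=0.5, likelihood=0.8, false_positive_rate=0.2))
```

With these hypothetical inputs the posterior comes out at about 0.8: evidence that is four times likelier under the hypothesis moves an even prior to roughly 80%.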
Understanding Bayes’ Theorem
To build intuition, let’s walk through a simple illustrative example.
After reading Understanding Bayes’ Theorem, take one minute to reflect: Are the core concepts clearly distinguished? Can you reproduce the solution steps? Can you restate the conclusions in your own words?
Example: Medical Testing
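Suppose a rare disease affects 1% of the population. Let $A$ be the event that a person has the disease and $B$ the event that the test is positive, so $P(A) = 0.01$. A medical test exists to detect it. If a person truly has the disease, the test returns positive with 90% sensitivity:
$$P(B \mid A) = 0.9.$$
However, if a person does not have the disease, the test still yields a false positive 10% of the time:
$$P(B \mid \neg A) = 0.1.$$
We want to compute the probability that a person actually has the disease given a positive test result, that is, $P(A \mid B)$.
Applying Bayes’ Theorem:
- Prior Probabilities: $P(A) = 0.01$ and $P(\neg A) = 1 - 0.01 = 0.99$.
- Likelihoods: $P(B \mid A) = 0.9$ and $P(B \mid \neg A) = 0.1$.
- Marginal Probability of Evidence: $P(B) = P(B \mid A)P(A) + P(B \mid \neg A)P(\neg A) = 0.9 \times 0.01 + 0.1 \times 0.99 = 0.108$.
- Posterior Probability: $P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B)} = \frac{0.009}{0.108} \approx 0.0833$.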
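The steps above can be reproduced numerically. The following is a minimal sketch using Python's standard `fractions` module so the arithmetic stays exact; the variable names are illustrative assumptions, while the three input values come from the example.

```python
from fractions import Fraction

# Inputs from the example (variable names are illustrative).
p_disease = Fraction(1, 100)             # prior P(A): 1% prevalence
p_pos_given_disease = Fraction(9, 10)    # likelihood P(B | A): 90% sensitivity
p_pos_given_healthy = Fraction(1, 10)    # P(B | not A): 10% false-positive rate

# Step 1: complement of the prior.
p_healthy = 1 - p_disease

# Step 2: marginal probability of a positive test (Law of Total Probability).
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * p_healthy

# Step 3: posterior via Bayes' Theorem.
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos

print(f"P(B)     = {p_pos} = {float(p_pos):.3f}")
print(f"P(A | B) = {p_disease_given_pos} = {float(p_disease_given_pos):.4f}")
```

Working in exact fractions shows that the posterior is exactly 1/12, which rounds to 0.0833.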
Thus, even with a positive test result, the probability that the person actually has the disease is only about 8.33%. This illustrates how—even with relatively high test sensitivity—the rarity of the disease significantly reduces the posterior probability.
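To see how strongly the prior drives this result, the sketch below recomputes the posterior for several prevalence values while holding the test characteristics from the example fixed. The prevalence values other than 1% are hypothetical, chosen only for illustration, and the helper name `posterior_given_positive` is an assumption.

```python
SENSITIVITY = 0.9          # P(B | A), from the example
FALSE_POSITIVE_RATE = 0.1  # P(B | not A), from the example


def posterior_given_positive(prevalence: float) -> float:
    """P(disease | positive test) for a given prevalence P(A)."""
    true_pos = SENSITIVITY * prevalence
    false_pos = FALSE_POSITIVE_RATE * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Only 0.01 comes from the example; the other values are hypothetical.
prevalences = [0.001, 0.01, 0.1, 0.5]
results = [(p, posterior_given_positive(p)) for p in prevalences]
for p, post in results:
    print(f"prevalence = {p:>5.1%}  ->  posterior = {post:.1%}")
```

The posterior rises from under 1% at a prevalence of 0.1% to 90% at a prevalence of 50%, even though the test itself is unchanged.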
At this point, summarize Understanding Bayes’ Theorem into a concise review table: first articulate the main conceptual thread, then verify it using a small concrete task.
After finishing Understanding Bayes’ Theorem, try working through a small example end-to-end. Then assess which steps you can now perform independently.
Summary
Through this example, we see how Bayes’ Theorem enables us to rationally update our beliefs when new evidence arrives. Rather than discarding prior knowledge, we integrate it with observed data to refine our understanding of reality. This mode of probabilistic reasoning plays a central role in machine learning and data science—especially in model selection, hyperparameter tuning, and uncertainty quantification.
In the next article, we will delve deeper into Bayesian updating, exploring the roles of priors and posteriors—and how to flexibly revise our knowledge as new data streams in over time.