Guozhen AI: Global AI field notes and model intelligence


Understanding Bayes' Theorem


Category: Probability for Beginners

Read time: 4 min


Lesson #16

Conceptual Diagram of Bayes’ Theorem

Bayes’ Theorem updates beliefs in light of new evidence. It combines three key elements: what we believed before seeing the evidence (prior), how probable the evidence is if the hypothesis is true (likelihood), and how common the evidence is overall (marginal probability).

Bayes’ Theorem Checklist Diagram

Distinguish clearly between the prior and the likelihood. Confusing $P(A \mid B)$ with $P(B \mid A)$ is the most common mistake in Bayes-related problems.

In the previous article, we explored applications of the Central Limit Theorem and learned about the behavior of sums of many independent and identically distributed random variables. Now, we turn to a foundational concept in probability theory—Bayes’ Theorem. Bayes’ Theorem is a crucial tool for understanding probability and reasoning under uncertainty, and it finds widespread application across artificial intelligence and machine learning.

Core Concepts of Bayes’ Theorem

Bayes’ Theorem expresses the relationship among conditional probabilities. First, let’s define several essential terms:

Bayes’ Theorem Comprehension Flashcard

When applying Bayes’ Theorem, begin by writing down the original (prior) probability, the probability of the evidence, and the relevant conditional probabilities—then examine how the evidence shifts the credibility of the event.

  • Prior Probability $P(A)$: Our estimated probability that event $A$ occurs before observing any evidence.
  • Likelihood $P(B \mid A)$: The probability that evidence $B$ occurs given that event $A$ has occurred.
  • Marginal Probability of Evidence $P(B)$: The total probability that evidence $B$ occurs, computed using the Law of Total Probability: $P(B) = P(B \mid A)\,P(A) + P(B \mid A')\,P(A')$, where $A'$ denotes the complement of $A$ (i.e., $A$ does not occur).
  • Posterior Probability $P(A \mid B)$: The updated probability that event $A$ occurs after observing evidence $B$.
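As a minimal sketch of the Law of Total Probability from the list above, the following Python function computes the marginal probability of the evidence from the prior and the two conditional probabilities. The function name `marginal_probability`, its parameter names, and the input values are illustrative assumptions, not part of the original lesson.

```python
def marginal_probability(prior, likelihood, likelihood_given_complement):
    """Return P(B) = P(B|A) * P(A) + P(B|A') * P(A')."""
    for value in (prior, likelihood, likelihood_given_complement):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    complement_prior = 1.0 - prior  # P(A'), the probability that A does not occur
    return likelihood * prior + likelihood_given_complement * complement_prior


# Made-up inputs for illustration: P(A) = 0.5, P(B|A) = 0.8, P(B|A') = 0.2.
p_b = marginal_probability(0.5, 0.8, 0.2)
print(f"P(B) = {p_b:.2f}")  # prints "P(B) = 0.50"
```

The two terms correspond to the two ways the evidence can arise: together with the event, or together with its complement.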

The mathematical statement of Bayes’ Theorem is:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
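The formula follows from the definition of conditional probability. A short derivation is sketched here, assuming $P(A) > 0$ and $P(B) > 0$ (a condition the lesson leaves implicit) and writing $A \cap B$ for the event that both $A$ and $B$ occur:

```latex
\begin{aligned}
P(A \cap B) &= P(A \mid B)\,P(B) = P(B \mid A)\,P(A) \\
\Longrightarrow\quad P(A \mid B) &= \frac{P(B \mid A)\,P(A)}{P(B)}
  = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid A')\,P(A')}
\end{aligned}
```

The last equality replaces the denominator with the Law of Total Probability, which is usually the form needed in practice because $P(B)$ is rarely given directly.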

Understanding Bayes’ Theorem

To build intuition, let’s walk through a simple illustrative example.

Probability Reading Roadmap Card

After reading Understanding Bayes’ Theorem, take one minute to reflect: Are the core concepts clearly distinguished? Can you reproduce the solution steps? Can you restate the conclusions in your own words?

Example: Medical Testing

Suppose a rare disease affects 1% of the population, i.e., $P(\text{Disease}) = 0.01$. A medical test exists to detect it. If a person truly has the disease, the test returns a positive result 90% of the time (the test's sensitivity):
$P(\text{Positive} \mid \text{Disease}) = 0.9$.
However, if a person does not have the disease, the test still yields a false positive 10% of the time:
$P(\text{Positive} \mid \text{No Disease}) = 0.1$.

We want to compute the probability that a person actually has the disease given a positive test result, that is, $P(\text{Disease} \mid \text{Positive})$.

Applying Bayes’ Theorem:

  1. Prior Probabilities:

    • $P(\text{Disease}) = 0.01$
    • $P(\text{No Disease}) = 1 - P(\text{Disease}) = 0.99$
  2. Likelihoods:

    • $P(\text{Positive} \mid \text{Disease}) = 0.9$
    • $P(\text{Positive} \mid \text{No Disease}) = 0.1$
  3. Marginal Probability of Evidence $P(\text{Positive})$:

    $$P(\text{Positive}) = P(\text{Positive} \mid \text{Disease})\,P(\text{Disease}) + P(\text{Positive} \mid \text{No Disease})\,P(\text{No Disease}) = 0.9 \times 0.01 + 0.1 \times 0.99 = 0.009 + 0.099 = 0.108$$
  4. Posterior Probability:

    $$P(\text{Disease} \mid \text{Positive}) = \frac{P(\text{Positive} \mid \text{Disease})\,P(\text{Disease})}{P(\text{Positive})} = \frac{0.9 \times 0.01}{0.108} \approx 0.0833$$
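The four steps above can be reproduced in a few lines of Python. This is a sketch rather than a prescribed method: the variable names are illustrative, and exact fractions are used only so that the arithmetic matches the hand calculation without floating-point noise.

```python
from fractions import Fraction

# Step 1: priors, taken from the example.
p_disease = Fraction(1, 100)             # P(Disease) = 0.01
p_no_disease = 1 - p_disease             # P(No Disease) = 0.99

# Step 2: likelihoods, taken from the example.
p_pos_given_disease = Fraction(9, 10)    # P(Positive | Disease) = 0.9
p_pos_given_no_disease = Fraction(1, 10) # P(Positive | No Disease) = 0.1

# Step 3: marginal probability of a positive result (Law of Total Probability).
p_positive = (p_pos_given_disease * p_disease
              + p_pos_given_no_disease * p_no_disease)

# Step 4: posterior via Bayes' Theorem.
p_disease_given_pos = p_pos_given_disease * p_disease / p_positive

print(p_positive, float(p_positive))                    # 27/250 0.108
print(p_disease_given_pos, float(p_disease_given_pos))  # 1/12 0.08333333333333333
```

The exact posterior is 1/12, which rounds to the 0.0833 obtained in step 4.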

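Thus, even with a positive test result, the probability that the person actually has the disease is only about 8.33%. The reason is the base rate: in a population of 1,000 people, about 10 have the disease and 9 of them test positive, while about 99 of the 990 healthy people also test positive, so only 9 of the 108 positive results are true positives. Even with relatively high test sensitivity, the rarity of the disease keeps the posterior probability low.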
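To see how strongly the result depends on the prior, the sketch below recomputes the posterior for several prevalence values while holding the test's 90% sensitivity and 10% false positive rate fixed. Only the 1% prevalence comes from the example. The other prevalence values and the helper name `posterior` are hypothetical, chosen for illustration.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(Disease | Positive) using Bayes' Theorem."""
    # Marginal probability of a positive result (Law of Total Probability).
    evidence = sensitivity * prior + false_positive_rate * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("a positive result has zero probability")
    return sensitivity * prior / evidence


# 0.01 is the prevalence from the example; the others are hypothetical.
for prevalence in (0.001, 0.01, 0.1, 0.5):
    p = posterior(prevalence, sensitivity=0.9, false_positive_rate=0.1)
    print(f"prevalence={prevalence:<6} posterior={p:.4f}")
```

With the same test, the posterior rises from under 1% at a 0.1% prevalence to 50% at a 10% prevalence and 90% at a 50% prevalence, so the test's accuracy alone does not determine how much a positive result should be trusted.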

Bayes’ Theorem Application Review Card

At this point, summarize Understanding Bayes’ Theorem into a concise review table: first articulate the main conceptual thread, then verify it using a small concrete task.

Bayes’ Theorem Application Self-Check Card

After finishing Understanding Bayes’ Theorem, try working through a small example end-to-end. Then assess which steps you can now perform independently.

Summary

Through this example, we see how Bayes’ Theorem enables us to rationally update our beliefs when new evidence arrives. Rather than discarding prior knowledge, we integrate it with observed data to refine our understanding of reality. This mode of probabilistic reasoning plays a central role in machine learning and data science—especially in model selection, hyperparameter tuning, and uncertainty quantification.

In the next article, we will delve deeper into Bayesian updating, exploring the roles of priors and posteriors—and how to flexibly revise our knowledge as new data streams in over time.
