Guozhen AI: Global AI field notes and model intelligence

4. Deriving Bayes' Theorem: Foundations of Bayesian Learning

Category: Bayesian Learning

Read time: 4 min

Structure Diagram: Bayesian Theorem Fundamentals — Deriving Bayes’ Theorem

The core idea of Bayesian learning is to integrate existing beliefs with new evidence while explicitly representing uncertainty. While reading, structure your understanding as follows: “Form of Bayes’ Theorem → Derivation of Bayes’ Theorem → Prior → Likelihood”, then return to the code, examples, or metrics in the main text for verification.

Verification Diagram: Bayesian Theorem Fundamentals — Deriving Bayes’ Theorem

After reading, perform a quick reality check using a small, concrete task: identify what the inputs are, locate where processing occurs, and verify whether the output is verifiable and acceptable. If something fails, first revisit “The Form of Bayes’ Theorem”, then proceed to “The Derivation of Bayes’ Theorem”.

In the previous chapter, we introduced the fundamental concepts of statistical inference, emphasizing the importance of decision-making under uncertainty. Next, we delve into the derivation of Bayes’ theorem—a central tool in statistical inference. Bayes’ theorem provides a principled way to update our belief about a hypothesis by combining prior knowledge with newly observed data.

The Form of Bayes’ Theorem

Bayes’ Theorem Derivation Checklist

To derive Bayes’ theorem, begin with the joint probability. Writing the same joint event in two different orders reveals the relationship among prior, likelihood, and posterior.

Bayes’ theorem describes how to compute a posterior probability from a prior probability and a likelihood. Its basic form is:

$$P(H|E) = \frac{P(E|H) \cdot P(H)}{P(E)}$$

where:

  • $P(H|E)$ is the posterior probability: the probability that hypothesis $H$ holds after observing evidence $E$.
  • $P(E|H)$ is the likelihood: the probability of observing evidence $E$ assuming hypothesis $H$ is true.
  • $P(H)$ is the prior probability: our initial belief about $H$ before observing $E$.
  • $P(E)$ is the marginal probability (or evidence): the total probability of observing $E$ across all possible hypotheses.
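As a minimal sketch of the formula above, the following Python function computes the posterior from the three quantities just defined. The function name `bayes_posterior`, its argument names, and the numbers in the usage line are illustrative assumptions, not part of the original lesson.

```python
def bayes_posterior(prior, likelihood, evidence):
    """Return P(H|E) = P(E|H) * P(H) / P(E).

    prior      -- P(H), belief in H before seeing E
    likelihood -- P(E|H), probability of E if H is true
    evidence   -- P(E), marginal probability of E
    """
    for name, value in (("prior", prior),
                        ("likelihood", likelihood),
                        ("evidence", evidence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    if evidence == 0.0:
        # Conditioning on an event of probability zero is undefined here.
        raise ValueError("evidence P(E) must be positive")
    return likelihood * prior / evidence


# Hypothetical numbers chosen only to exercise the function.
print(bayes_posterior(prior=0.3, likelihood=0.5, evidence=0.25))  # 0.6
```

Passing $P(E)$ in directly keeps the sketch close to the formula; in practice $P(E)$ usually has to be computed from other quantities.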

Deriving Bayes’ Theorem

Bayesian Learning Implementation Checklist

Don’t stop at “I understand” after reading this lesson. Go back, pick one step, and work through it by hand, then note where you get stuck. This makes subsequent learning more solid.

To derive Bayes’ theorem, start from the definition of conditional probability (assuming $P(E) > 0$):

$$P(H|E) = \frac{P(H \cap E)}{P(E)}$$

Applying the same definition with the roles of $H$ and $E$ swapped (and assuming $P(H) > 0$), we can also write:

$$P(E|H) = \frac{P(H \cap E)}{P(H)}$$

Multiplying both sides of the second equation by $P(H)$ gives the product rule:

$$P(H \cap E) = P(E|H) \cdot P(H)$$

Substituting this into the posterior formula yields:

$$P(H|E) = \frac{P(E|H) \cdot P(H)}{P(E)}$$
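As a sanity check on the derivation, the sketch below builds a small joint distribution over $H$ and $E$ and confirms, using exact fractions, that both factorizations of $P(H \cap E)$ agree and that Bayes’ theorem follows. The four joint probabilities and all helper names are made-up illustrative assumptions, not values from the lesson.

```python
from fractions import Fraction

# Hypothetical joint distribution keyed by (H is true, E is observed).
joint = {
    (True, True): Fraction(3, 20),     # P(H and E)
    (True, False): Fraction(1, 20),    # P(H and not E)
    (False, True): Fraction(4, 20),    # P(not H and E)
    (False, False): Fraction(12, 20),  # P(not H and not E)
}
assert sum(joint.values()) == 1


def marginal_h(h):
    """P(H = h), summing the joint over both values of E."""
    return sum(p for (hh, _), p in joint.items() if hh == h)


def marginal_e(e):
    """P(E = e), summing the joint over both values of H."""
    return sum(p for (_, ee), p in joint.items() if ee == e)


def e_given_h(e, h):
    """P(E = e | H = h) from the definition of conditional probability."""
    return joint[(h, e)] / marginal_h(h)


def h_given_e(h, e):
    """P(H = h | E = e) from the definition of conditional probability."""
    return joint[(h, e)] / marginal_e(e)


p_h = marginal_h(True)  # 1/5
p_e = marginal_e(True)  # 7/20

# The joint probability factorizes in either order.
assert joint[(True, True)] == e_given_h(True, True) * p_h
assert joint[(True, True)] == h_given_e(True, True) * p_e

# Bayes' theorem: posterior = likelihood * prior / evidence.
assert h_given_e(True, True) == e_given_h(True, True) * p_h / p_e

print(h_given_e(True, True))  # 3/7
```

Using `Fraction` rather than floats lets the equalities be checked exactly instead of up to rounding error.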

Next, we need to compute the marginal probability $P(E)$. Using the law of total probability:

$$P(E) = P(E|H) \cdot P(H) + P(E|\neg H) \cdot P(\neg H)$$

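Plugging this expression into the denominator of Bayes’ theorem gives a form that depends only on the prior $P(H)$, with $P(\neg H) = 1 - P(H)$, and the two conditional probabilities $P(E|H)$ and $P(E|\neg H)$.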
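Written out in full for the two-hypothesis case ($H$ and $\neg H$), the substitution reads:

$$P(H|E) = \frac{P(E|H) \cdot P(H)}{P(E|H) \cdot P(H) + P(E|\neg H) \cdot P(\neg H)}$$

The numerator reappears as the first term of the denominator, so the posterior is the share of the total probability of $E$ that is contributed by $H$.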

Case Study

To better grasp Bayes’ theorem, consider the following concrete example.

Suppose we are conducting a medical test for a disease whose prevalence in a given population is $P(\text{Disease}) = 0.01$ (i.e., the prior probability). We also know:

  • If a person has the disease, the test returns positive with probability $P(\text{Positive}|\text{Disease}) = 0.9$ (the likelihood).
  • If a person does not have the disease, the test still returns positive with probability $P(\text{Positive}|\neg\text{Disease}) = 0.05$.

We want to compute the posterior probability $P(\text{Disease}|\text{Positive})$: the probability that a person actually has the disease given a positive test result.

Applying Bayes’ theorem, we first compute the marginal probability $P(\text{Positive})$:

$$P(\text{Positive}) = P(\text{Positive}|\text{Disease}) \cdot P(\text{Disease}) + P(\text{Positive}|\neg\text{Disease}) \cdot P(\neg\text{Disease})$$

Substituting values:

$$P(\text{Positive}) = 0.9 \cdot 0.01 + 0.05 \cdot (1 - 0.01) = 0.009 + 0.0495 = 0.0585$$

Then apply Bayes’ theorem:

$$P(\text{Disease}|\text{Positive}) = \frac{P(\text{Positive}|\text{Disease}) \cdot P(\text{Disease})}{P(\text{Positive})} = \frac{0.9 \cdot 0.01}{0.0585} \approx 0.1538$$

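Thus, even with a positive test result, the probability that the patient actually has the disease is only about 15.38%. Because the disease is rare, false positives from the healthy 99% of the population (0.0495) outweigh true positives (0.009), which highlights the critical role of prior probabilities when reasoning under uncertainty.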
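The calculation above can be reproduced with a short Python sketch. The function name `posterior_from_test` and the parameter names `sensitivity` (for $P(\text{Positive}|\text{Disease})$) and `false_positive_rate` (for $P(\text{Positive}|\neg\text{Disease})$) are labels assumed here for readability. The extra priors 0.001 and 0.1 in the loop are hypothetical values added only to show how strongly the result depends on the prior.

```python
def posterior_from_test(prior, sensitivity, false_positive_rate):
    """Return P(Disease | Positive) for a binary test.

    prior               -- P(Disease)
    sensitivity         -- P(Positive | Disease)
    false_positive_rate -- P(Positive | not Disease)
    """
    # Law of total probability over Disease and not Disease.
    p_positive = sensitivity * prior + false_positive_rate * (1.0 - prior)
    # Bayes' theorem.
    return sensitivity * prior / p_positive


# Values from the case study.
p = posterior_from_test(prior=0.01, sensitivity=0.9, false_positive_rate=0.05)
print(f"{p:.4f}")  # 0.1538

# Same test, different (hypothetical) prevalences.
for prior in (0.001, 0.01, 0.1):
    value = posterior_from_test(prior, 0.9, 0.05)
    print(f"prior={prior}: posterior={value:.4f}")
# prior=0.001: posterior=0.0177
# prior=0.01: posterior=0.1538
# prior=0.1: posterior=0.6667
```

With the test characteristics held fixed, raising the prior from 0.001 to 0.1 moves the posterior from under 2% to about 67%.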

Application Retrospective Card: Bayesian Theorem Fundamentals — Deriving Bayes’ Theorem

At this point, consolidate Bayesian Theorem Fundamentals — Deriving Bayes’ Theorem into a retrospective table: first clarify the main thread, then validate it using a small task.

Application Verification Card: Bayesian Theorem Fundamentals — Deriving Bayes’ Theorem

After finishing Bayesian Theorem Fundamentals — Deriving Bayes’ Theorem, select a small example and walk through the full workflow end-to-end—then assess which steps you can now execute independently.

Summary

Bayes’ theorem offers a structured framework for updating beliefs about events, grounded in the foundational concept of conditional probability. In practice, Bayesian inference leverages both prior knowledge and new data to support more accurate and robust decision-making. In the next chapter, we will explore Bayesian Theorem Fundamentals — Prior and Posterior Distributions in depth.
