4. Deriving Bayes' Theorem: Foundations of Bayesian Learning
The core idea of Bayesian learning is to integrate existing beliefs with new evidence while explicitly representing uncertainty. While reading, structure your understanding as follows: “Form of Bayes’ Theorem → Derivation of Bayes’ Theorem → Prior → Likelihood”, then return to the code, examples, or metrics in the main text for verification.
After reading, run a quick reality check on a small, concrete task: identify the inputs, locate where the processing happens, and check whether the output can be verified and is acceptable. If something fails, first revisit “The Form of Bayes’ Theorem”, then “Deriving Bayes’ Theorem”.
In the previous chapter, we introduced the fundamental concepts of statistical inference, emphasizing the importance of decision-making under uncertainty. Next, we delve into the derivation of Bayes’ theorem—a central tool in statistical inference. Bayes’ theorem provides a principled way to update our belief about a hypothesis by combining prior knowledge with newly observed data.
The Form of Bayes’ Theorem
Bayes’ theorem describes how to compute a posterior probability from a prior probability and a likelihood. For a hypothesis $H$ and observed evidence $E$ with $P(E) > 0$, its basic form is:
$$P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}$$
where:
- $P(H \mid E)$ is the posterior probability: the probability that hypothesis $H$ holds after observing evidence $E$.
- $P(E \mid H)$ is the likelihood: the probability of observing evidence $E$ assuming hypothesis $H$ is true.
- $P(H)$ is the prior probability: our initial belief about $H$ before observing $E$.
- $P(E)$ is the marginal probability (or evidence): the total probability of observing $E$ across all possible hypotheses.
The formula follows from writing the joint probability of $H$ and $E$ in two different orders, which reveals the relationship among prior, likelihood, and posterior.
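As a minimal sketch of the formula, the function below computes a posterior from a prior, a likelihood, and a marginal probability. The function name `posterior` and the numbers are hypothetical, chosen only to illustrate the arithmetic.

```python
def posterior(prior: float, likelihood: float, evidence: float) -> float:
    """Return P(H | E) = P(E | H) * P(H) / P(E)."""
    if not 0.0 < evidence <= 1.0:
        raise ValueError("P(E) must lie in (0, 1]")
    return likelihood * prior / evidence


# Hypothetical inputs, for illustration only.
p_h = 0.3          # prior P(H)
p_e_given_h = 0.8  # likelihood P(E | H)
p_e = 0.5          # marginal probability P(E)

p_h_given_e = posterior(p_h, p_e_given_h, p_e)
print(f"P(H | E) = {p_h_given_e:.2f}")
```

With these inputs, observing the evidence raises the belief in the hypothesis from 0.30 to about 0.48, because the evidence is more probable under the hypothesis (0.8) than it is overall (0.5).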
Deriving Bayes’ Theorem
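To derive Bayes’ theorem, start from the definition of conditional probability. For a hypothesis $H$ and evidence $E$ with $P(E) > 0$:
$$P(H \mid E) = \frac{P(H \cap E)}{P(E)}$$
By symmetry, the same definition with the roles of $H$ and $E$ exchanged (and $P(H) > 0$) gives:
$$P(E \mid H) = \frac{P(H \cap E)}{P(H)}$$
From these two equations, we deduce that the joint probability can be written in two ways:
$$P(H \cap E) = P(H \mid E)\, P(E) = P(E \mid H)\, P(H)$$
Substituting the second expression into the posterior formula yields:
$$P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}$$
Next, we need to compute the marginal probability $P(E)$. Using the law of total probability over a set of mutually exclusive and exhaustive hypotheses $H_1, \dots, H_n$:
$$P(E) = \sum_{i=1}^{n} P(E \mid H_i)\, P(H_i)$$
Plugging this expression back into Bayes’ theorem gives the fully expanded form for any hypothesis $H_k$:
$$P(H_k \mid E) = \frac{P(E \mid H_k)\, P(H_k)}{\sum_{i=1}^{n} P(E \mid H_i)\, P(H_i)}$$
Do not stop at “I understand” after reading this derivation. Go back, pick one step, and work through it by hand, then note where you get stuck. This makes subsequent learning more solid.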
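The derivation can also be checked numerically. The sketch below assumes a small, made-up joint distribution over three hypotheses and two possible observations (the table values are hypothetical), and confirms that the posterior obtained from Bayes’ theorem with the law of total probability matches the posterior read directly from the joint table.

```python
import math

# Hypothetical joint distribution P(H_i, E_j); entries sum to 1.
joint = [
    [0.10, 0.20],  # H_0
    [0.25, 0.05],  # H_1
    [0.15, 0.25],  # H_2
]

prior = [sum(row) for row in joint]           # P(H_i), row sums
evidence = [sum(col) for col in zip(*joint)]  # P(E_j), column sums

# likelihood[i][j] = P(E_j | H_i) = P(H_i, E_j) / P(H_i)
likelihood = [[p / prior[i] for p in row] for i, row in enumerate(joint)]

j = 0  # index of the observed outcome E_0
n = len(prior)

# Law of total probability: P(E_j) = sum_i P(E_j | H_i) * P(H_i)
total = sum(likelihood[i][j] * prior[i] for i in range(n))

# Posterior via Bayes' theorem, and directly from the definition.
post_bayes = [likelihood[i][j] * prior[i] / total for i in range(n)]
post_direct = [joint[i][j] / evidence[j] for i in range(n)]

assert math.isclose(total, evidence[j])
assert all(math.isclose(a, b) for a, b in zip(post_bayes, post_direct))
assert math.isclose(sum(post_bayes), 1.0)

print([round(p, 4) for p in post_bayes])
```

Because the denominator is the sum of the numerators over all hypotheses, the posteriors always sum to 1; here they come out as roughly 0.2, 0.5, and 0.3.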
Case Study
To better grasp Bayes’ theorem, consider the following concrete example.
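Suppose we are conducting a medical test for a disease whose prevalence in a given population is $P(D) = 0.01$ (i.e., the prior probability). We also know:
- If a person has the disease, the test returns positive with probability $P(+ \mid D) = 0.90$ (the likelihood).
- If a person does not have the disease, the test still returns positive with probability $P(+ \mid \neg D) = 0.05$.
We want to compute the posterior probability $P(D \mid +)$: the probability that a person actually has the disease given a positive test result.
Applying Bayes’ theorem, we first compute the marginal probability $P(+)$:
$$P(+) = P(+ \mid D)\, P(D) + P(+ \mid \neg D)\, P(\neg D)$$
Substituting values:
$$P(+) = 0.90 \times 0.01 + 0.05 \times 0.99 = 0.009 + 0.0495 = 0.0585$$
Then apply Bayes’ theorem:
$$P(D \mid +) = \frac{P(+ \mid D)\, P(D)}{P(+)} = \frac{0.009}{0.0585} \approx 0.1538$$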
Thus, even with a positive test result, the actual probability that the patient has the disease is only about 15.38%—highlighting the critical role of prior probabilities when reasoning under uncertainty.
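The calculation can be sketched in a few lines of Python. The inputs are taken here as a prevalence of 1%, a sensitivity of 90%, and a false positive rate of 5%, values assumed because they reproduce the roughly 15.38% posterior quoted above. The function name `posterior_positive` is hypothetical.

```python
def posterior_positive(prevalence: float,
                       sensitivity: float,
                       false_positive_rate: float) -> float:
    """Return P(disease | positive test) for a binary test."""
    # Marginal P(+) via the law of total probability.
    p_positive = (sensitivity * prevalence
                  + false_positive_rate * (1.0 - prevalence))
    return sensitivity * prevalence / p_positive


# Assumed inputs, consistent with the ~15.38% result in the text.
p = posterior_positive(prevalence=0.01, sensitivity=0.90,
                       false_positive_rate=0.05)
print(f"P(D | +) = {p:.4f}")

# Same test, different priors: the posterior depends strongly on prevalence.
for prev in (0.001, 0.01, 0.1, 0.5):
    post = posterior_positive(prev, 0.90, 0.05)
    print(f"prevalence={prev:<6} posterior={post:.4f}")
```

Under these assumptions, out of 10,000 people about 90 true positives compete with about 495 false positives, which is why a positive result still leaves the disease fairly unlikely. Sweeping the prevalence shows the same test giving very different posteriors as the prior changes.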
At this point, consolidate this chapter into a short retrospective table: first state the main thread (form, derivation, worked example), then pick a small example, walk through the full calculation end-to-end, and assess which steps you can now carry out independently.
Summary
Bayes’ theorem offers a structured framework for updating beliefs about events, grounded in the foundational concept of conditional probability. In practice, Bayesian inference leverages both prior knowledge and new data to support more accurate and robust decision-making. In the next chapter, we will explore Bayesian Theorem Fundamentals — Prior and Posterior Distributions in depth.