Guozhen AI · Global AI field notes and model intelligence

6. Bayesian Theorem Fundamentals: Updating Rules and Examples

Category: Bayesian Learning


Structure Diagram: Bayesian Theorem Fundamentals — Update Rules and Examples

The core of Bayesian learning lies in integrating prior beliefs with new evidence, while explicitly representing uncertainty. As you read, structure your understanding as follows: “Bayesian Theorem Recap → Update Rule → Process of Updating Probabilities → Assigning Prior Probabilities”, then return to the code, examples, or metrics in the main text to verify your comprehension.

Verification Diagram: Bayesian Theorem Fundamentals — Update Rules and Examples

After reading, perform a quick reality check using a small, concrete task: Identify what the inputs are, where the processing steps occur, and whether the outputs are verifiable. If something goes wrong, first revisit the “Bayesian Theorem Recap”, then check the “Update Rule”.

In the previous article, we introduced the foundational concepts of Bayes’ theorem—including prior and posterior distributions. Now, we will delve deeper into the update rule embedded in Bayes’ theorem: how to revise our beliefs (or model parameters) in light of observed data.

Bayesian Theorem Recap

Bayesian Update Rule Decision Card

When learning the update rule, begin by writing down your initial belief, then examine how that belief changes upon observing new evidence. The more concrete the example, the clearer your Bayesian intuition becomes.

First, let’s briefly recall the formal expression of Bayes’ theorem:

$$P(H|D) = \frac{P(D|H) \cdot P(H)}{P(D)}$$

where

  • $P(H|D)$ is the posterior probability: our updated belief about hypothesis $H$ after observing data $D$;
  • $P(D|H)$ is the likelihood: the probability of observing data $D$ assuming hypothesis $H$ is true;
  • $P(H)$ is the prior probability: our belief about $H$ before seeing any data;
  • $P(D)$ is the marginal likelihood (or evidence): a normalizing constant ensuring that the posterior probabilities of all candidate hypotheses sum to 1.

The Update Rule

From the formula above, we see precisely how the posterior probability depends on both the prior and the observed data. Unlike classical statistical methods, Bayesian learning explicitly incorporates prior knowledge. Once new data arrives, we use Bayes’ rule to systematically update our degree of belief in a given hypothesis.
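The update rule can be sketched in a few lines of Python for a finite set of hypotheses. This is a minimal illustration, not a reference implementation: the function name `bayes_update` and the example numbers (likelihoods of 0.2 and 0.6) are made up purely to show the mechanics.

```python
def bayes_update(priors, likelihoods):
    """Return posterior probabilities for a finite set of hypotheses.

    priors[i]      -- P(H_i), belief before seeing the data
    likelihoods[i] -- P(D | H_i), probability of the data under H_i
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")

    # Numerator of Bayes' theorem for each hypothesis: P(D | H_i) * P(H_i)
    joint = [prior * lik for prior, lik in zip(priors, likelihoods)]

    # Marginal likelihood P(D): the sum of the numerators over all hypotheses
    evidence = sum(joint)
    if evidence == 0:
        raise ValueError("the data has zero probability under every hypothesis")

    # Dividing by P(D) makes the posteriors sum to 1
    return [j / evidence for j in joint]


# Hypothetical numbers, chosen only for illustration
example_posterior = bayes_update(priors=[0.5, 0.5], likelihoods=[0.2, 0.6])
print(example_posterior)  # approximately [0.25, 0.75]
```

With equal priors, the posterior is simply the likelihoods rescaled to sum to 1, so the hypothesis that explains the data better gains belief.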

Bayesian Learning Reading Map Card

Before reading “Bayesian Theorem Fundamentals — Update Rules and Examples”, preview the diagram showing the path from problem to result. After reading, cross-check against the main text to confirm whether you can reproduce the reasoning step-by-step.

Process of Updating Probabilities

In practice, suppose we conduct an experiment to assess whether a coin is fair. Our hypothesis space could be:

  • $H_1$: The coin is fair.
  • $H_2$: The coin is biased.

Assigning Prior Probabilities

Before collecting any data, we might assign equal prior probabilities to the two hypotheses:

  • $P(H_1) = 0.5$
  • $P(H_2) = 0.5$

Collecting Data

Suppose we flip the coin 10 times and observe 7 heads and 3 tails. We now want to compute the updated probabilities of $H_1$ and $H_2$ given this outcome.

Computing Likelihoods

Next, we compute the likelihood of the observed data under each hypothesis:

  • Under $H_1$ (fair coin, $p = 0.5$), the likelihood of 7 heads and 3 tails is:
$$P(D|H_1) = \binom{10}{7} \cdot (0.5)^7 \cdot (0.5)^3 = \frac{10!}{7!\,3!} \cdot (0.5)^{10}$$

Result: $P(D|H_1) \approx 0.1172$.

  • Under $H_2$ (biased coin, assume $p = 0.8$ for heads), the likelihood is:
$$P(D|H_2) = \binom{10}{7} \cdot (0.8)^7 \cdot (0.2)^3 = \frac{10!}{7!\,3!} \cdot (0.8)^7 \cdot (0.2)^3$$

Result: $P(D|H_2) \approx 0.2013$.
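The two likelihoods can be checked numerically. The sketch below uses only the standard library; the helper name `binomial_likelihood` is an illustrative choice, and the value $p = 0.8$ is the same assumption made for the biased coin in the text.

```python
from math import comb


def binomial_likelihood(heads, flips, p_heads):
    """Probability of exactly `heads` heads in `flips` independent flips."""
    tails = flips - heads
    return comb(flips, heads) * p_heads**heads * (1 - p_heads) ** tails


# 7 heads in 10 flips under each hypothesis
likelihood_fair = binomial_likelihood(heads=7, flips=10, p_heads=0.5)
likelihood_biased = binomial_likelihood(heads=7, flips=10, p_heads=0.8)

print(f"P(D|H1) = {likelihood_fair:.4f}")    # P(D|H1) = 0.1172
print(f"P(D|H2) = {likelihood_biased:.4f}")  # P(D|H2) = 0.2013
```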

Updating Posterior Probabilities

Now apply Bayes’ theorem to compute the posterior probabilities:

  1. First compute the marginal likelihood $P(D)$:
$$P(D) = P(D|H_1) \cdot P(H_1) + P(D|H_2) \cdot P(H_2) = 0.1172 \cdot 0.5 + 0.2013 \cdot 0.5 = 0.15925$$
  2. Then compute the posteriors:
  • For $H_1$:
$$P(H_1|D) = \frac{P(D|H_1) \cdot P(H_1)}{P(D)} = \frac{0.1172 \cdot 0.5}{0.15925} \approx 0.368$$
  • For $H_2$:
$$P(H_2|D) = \frac{P(D|H_2) \cdot P(H_2)}{P(D)} = \frac{0.2013 \cdot 0.5}{0.15925} \approx 0.632$$

Final results:

  • $P(H_1|D) \approx 0.368$
  • $P(H_2|D) \approx 0.632$

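These calculations show that, after observing 7 heads in 10 flips, our belief in the biased-coin hypothesis has risen from 0.5 to roughly 0.63, while our belief in the fair-coin hypothesis has fallen to roughly 0.37. The data favors the biased coin but does not rule out the fair one.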
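The whole update for the coin example can be reproduced end to end. This is a minimal sketch under the same assumptions as the text (equal priors, and $p = 0.8$ for the biased coin); the function name `coin_posteriors` and the dictionary keys are illustrative choices. It works from unrounded likelihoods, so the fourth decimal can differ from a hand calculation that rounds intermediate values.

```python
from math import comb


def coin_posteriors(heads, flips, p_heads_by_hypothesis, priors):
    """Posterior probability of each hypothesis after `heads` heads in `flips` flips."""
    tails = flips - heads

    # Numerators of Bayes' theorem: P(D | H) * P(H) for each hypothesis
    joint = {
        name: priors[name] * comb(flips, heads) * p**heads * (1 - p) ** tails
        for name, p in p_heads_by_hypothesis.items()
    }

    # Marginal likelihood P(D)
    evidence = sum(joint.values())

    return {name: value / evidence for name, value in joint.items()}


posteriors = coin_posteriors(
    heads=7,
    flips=10,
    p_heads_by_hypothesis={"fair": 0.5, "biased": 0.8},  # 0.8 is the text's assumption
    priors={"fair": 0.5, "biased": 0.5},
)

for name, probability in posteriors.items():
    print(f"P({name} | D) = {probability:.3f}")
```

Run as written, this should print about 0.368 for the fair coin and 0.632 for the biased coin.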

Application Retrospective Card: Bayesian Theorem Fundamentals — Update Rules and Examples

After finishing “Bayesian Theorem Fundamentals — Update Rules and Examples”, try applying it to a scenario of your own—pay close attention to whether the inputs, processing steps, and outputs align coherently.

Application Check Card: Bayesian Theorem Fundamentals — Update Rules and Examples

To adapt “Bayesian Theorem Fundamentals — Update Rules and Examples” to your own task, start small: isolate and validate just one critical decision point.

Conclusion

Through the above example, we demonstrated how Bayes’ theorem enables probabilistic updating—integrating new data to dynamically refine our beliefs about hypotheses. In practice, the power of Bayesian learning lies in its ability to formally incorporate prior knowledge and support self-correction as new evidence accumulates in changing environments.
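The idea of refining beliefs as evidence accumulates can be illustrated by feeding the same 10 flips in two batches and reusing the first posterior as the prior for the second update. The split below (4 heads in the first 5 flips, 3 heads in the next 5) is hypothetical, and the helper `update` is an illustrative name; the sketch assumes the flips are independent.

```python
from math import comb

# Probability of heads under each hypothesis (0.8 is the assumption used in the example)
P_HEADS = {"fair": 0.5, "biased": 0.8}


def update(prior, heads, flips):
    """One Bayesian update of `prior` after observing `heads` heads in `flips` flips."""
    tails = flips - heads
    joint = {
        name: prior[name] * comb(flips, heads) * p**heads * (1 - p) ** tails
        for name, p in P_HEADS.items()
    }
    evidence = sum(joint.values())
    return {name: value / evidence for name, value in joint.items()}


prior = {"fair": 0.5, "biased": 0.5}

# Hypothetical split of the 10 flips into two batches of 5
after_first_batch = update(prior, heads=4, flips=5)
after_second_batch = update(after_first_batch, heads=3, flips=5)

# Same data processed in a single step
all_at_once = update(prior, heads=7, flips=10)

print(after_second_batch)
print(all_at_once)
```

Both routes should agree up to floating-point error: yesterday's posterior serves as today's prior, and the order in which independent evidence arrives does not change the final belief.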

In the next article, we will explore Maximum A Posteriori (MAP) Estimation, advancing further into the world of Bayesian statistical inference—and introducing a practical method for parameter estimation. Stay tuned!
