Guozhen AIGlobal AI field notes and model intelligence

Category: Bayesian Learning

Lesson #12

Bayesian Learning and Statistical Inference: Overfitting and Regularization in Model Selection

Bayesian learning centers on integrating prior beliefs with new evidence while explicitly quantifying uncertainty. While reading, structure your understanding around the logical flow: “Overfitting → Concrete examples of overfitting → Regularization → Theoretical foundations of regularization”, then verify each concept using the code snippets, case studies, or evaluation metrics presented in the main text.

Bayesian Learning and Statistical Inference: Overfitting and Regularization — Verification Checklist

After reading, reinforce your understanding with a small, realistic task:

  • What is the input?
  • Where does processing occur?
  • Is the output verifiable and acceptable?

If the task fails, first check for overfitting, then revisit the Examples of Overfitting section.

In the previous chapter, we explored the Bayes factor and model comparison, learning how to select among competing models. Next, we delve into two concepts intimately tied to model selection: overfitting and regularization—both essential for ensuring the generalization capability of our Bayesian learning models.

Overfitting

Overfitting occurs when a model performs exceptionally well on training data but exhibits a sharp decline in performance on new, unseen data. This typically arises when the model is too complex relative to the amount of training data, for example because it has too many parameters, so it fits not only the underlying pattern but also the noise in the training data.

Overfitting & Regularization Decision Card

When grasping overfitting and regularization, first examine:

  • Training error vs. validation error
  • Parameter complexity
  • Prior constraints
  • Generalization performance

Examples of Overfitting

Consider polynomial regression: suppose we have a set of data points and fit them with a high-degree polynomial. On the training set, this polynomial may pass through almost every point, but on a validation set its predictive accuracy deteriorates significantly. This gap between training and validation performance is a hallmark of overfitting.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Generate synthetic data
np.random.seed(0)
X = np.sort(5 * np.random.rand(80, 1), axis=0)
y = np.sin(X).ravel() + np.random.normal(0, 0.1, X.shape[0])

# Fit polynomials of varying degrees
degrees = [1, 3, 5, 10]
plt.figure(figsize=(15, 10))

for i, degree in enumerate(degrees):
    # Expand X into polynomial features, then fit ordinary least squares
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X, y)
    y_pred = model.predict(X)  # predictions on the training inputs only

    plt.subplot(2, 2, i + 1)
    plt.scatter(X, y, s=10, label='Data')
    plt.plot(X, y_pred, label=f'Prediction (degree={degree})', color='red')
    plt.title(f'Polynomial Degree: {degree}')
    plt.legend()

plt.tight_layout()
plt.show()

In the figure above, the fit to the training data improves as the polynomial degree increases. The figure alone cannot show overfitting, because every curve is evaluated on the same points it was fitted to. Overfitting shows up when the error on held-out data stops improving, or gets worse, while the training error keeps falling.
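To make the training-versus-validation comparison concrete, here is a minimal sketch that holds out part of the data and reports both errors for each degree. The 25% split, the `random_state` values, and the `errors` dictionary are assumptions chosen for illustration, not part of the original lesson. The data recipe matches the example above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Same synthetic data recipe as the plotting example
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel() + rng.normal(0, 0.1, X.shape[0])

# Hold out 25% of the points for validation (assumed split size)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)

errors = {}  # degree -> (training MSE, validation MSE)
for degree in [1, 3, 5, 10]:
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    val_mse = mean_squared_error(y_val, model.predict(X_val))
    errors[degree] = (train_mse, val_mse)
    print(f"degree={degree:2d}  train MSE={train_mse:.4f}  val MSE={val_mse:.4f}")
```

A widening gap between the two columns as the degree grows is the signal to look for. With 80 fairly clean points the gap may be small; shrinking the training set or increasing the noise usually makes it more visible.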

Regularization

To combat overfitting, we employ regularization: a technique that adds a penalty term to the loss function to constrain model complexity, thereby reducing the risk of overfitting. Common regularization methods include L1 regularization (Lasso) and L2 regularization (Ridge).
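As a rough illustration of what "adding a penalty term to the loss function" means, the sketch below computes a mean squared error plus an L1 or L2 penalty on the weights. The helper name `penalized_loss` and its scaling are hypothetical; libraries scale the data term and the penalty differently, so treat this as a sketch rather than any library's exact objective.

```python
import numpy as np

def penalized_loss(w, X, y, alpha, penalty="l2"):
    """Mean squared error plus a weight penalty (hypothetical helper)."""
    residual = X @ w - y
    mse = np.mean(residual ** 2)
    if penalty == "l2":
        # Ridge-style penalty: sum of squared weights
        return mse + alpha * np.sum(w ** 2)
    if penalty == "l1":
        # Lasso-style penalty: sum of absolute weights
        return mse + alpha * np.sum(np.abs(w))
    raise ValueError(f"unknown penalty: {penalty!r}")

X_demo = np.array([[1.0, 2.0], [3.0, 4.0]])
y_demo = np.array([0.0, 0.0])
w_demo = np.array([2.0, -1.0])

l2_value = penalized_loss(w_demo, X_demo, y_demo, alpha=0.5, penalty="l2")
l1_value = penalized_loss(w_demo, X_demo, y_demo, alpha=0.5, penalty="l1")
print(l2_value, l1_value)  # 4.5 3.5
```

Larger values of `alpha` make large weights more expensive, which pushes the optimizer toward simpler fits.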

Bayesian Learning Reading Map Card

Don’t stop at “I understand” after reading Bayesian Learning and Statistical Inference: Overfitting and Regularization in Model Selection. Go back, pick one step, implement it yourself—and note where you get stuck. Doing so will solidify your learning for future topics.

The Principle Behind Regularization

Within the Bayesian framework, regularization corresponds to placing a prior distribution over the model parameters and taking the maximum a posteriori (MAP) estimate. A zero-mean Gaussian prior yields L2 regularization, and a zero-mean Laplace prior yields L1 regularization. In both cases, a narrower prior corresponds to a stronger penalty.
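The Gaussian case can be checked numerically. Assuming a linear model with Gaussian noise of standard deviation `sigma` and an independent zero-mean Gaussian prior of standard deviation `tau` on each weight, the MAP estimate minimizes the squared error plus `(sigma**2 / tau**2)` times the squared weight norm. The sketch below compares that closed-form MAP solution with scikit-learn's `Ridge` (intercept disabled so both solve the same problem). The data, `sigma`, and `tau` are made-up values for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
n_samples, n_features = 50, 3
X = rng.normal(size=(n_samples, n_features))
true_w = np.array([1.5, -2.0, 0.5])

sigma = 0.3  # assumed noise standard deviation
tau = 1.0    # assumed prior standard deviation on each weight
y = X @ true_w + rng.normal(0, sigma, n_samples)

# MAP estimate under the Gaussian prior: (X^T X + alpha I)^{-1} X^T y
alpha = sigma ** 2 / tau ** 2
w_map = np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

# Ridge regression with the same penalty strength and no intercept
ridge = Ridge(alpha=alpha, fit_intercept=False).fit(X, y)

print("MAP weights:  ", w_map)
print("Ridge weights:", ridge.coef_)
```

The two weight vectors should agree up to numerical precision, which is the sense in which L2 regularization and a Gaussian prior are the same thing.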

Example of Regularization

Continuing with the earlier example, we now apply Ridge regression (L2 regularization) to mitigate overfitting.

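from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler

# Apply Ridge regression to the degree-10 polynomial features.
# Standardizing the features first keeps the penalty comparable across
# powers of X and reduces the risk of an ill-conditioned fit.
plt.figure(figsize=(10, 5))
ridge_model = make_pipeline(PolynomialFeatures(10), StandardScaler(), Ridge(alpha=1.0))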
ridge_model.fit(X, y)
y_ridge_pred = ridge_model.predict(X)

plt.scatter(X, y, s=10, label='Data')
plt.plot(X, y_ridge_pred, label='Ridge Prediction (degree=10)', color='green')
plt.title('Ridge Regression with Regularization')
plt.legend()
plt.show()

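In the plot above, Ridge regression trades some fidelity to the training data for lower model complexity. The curve no longer tracks the training points as closely, and the penalty is expected to improve generalization to new data. Because this plot shows only training predictions, that improvement should be confirmed on held-out data.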
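One way to check the generalization claim is to compare training and validation error for several penalty strengths. The sketch below is an assumption-laden illustration: the split, the list of `alpha` values, the feature standardization step, and the `results` dictionary are not from the original lesson.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Same synthetic data recipe as the earlier examples
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel() + rng.normal(0, 0.1, X.shape[0])

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)

results = {}  # alpha -> (training MSE, validation MSE)
for alpha in [1e-3, 1e-1, 1.0, 10.0, 100.0]:
    model = make_pipeline(
        PolynomialFeatures(10, include_bias=False),
        StandardScaler(),  # put all polynomial terms on a common scale
        Ridge(alpha=alpha),
    )
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    val_mse = mean_squared_error(y_val, model.predict(X_val))
    results[alpha] = (train_mse, val_mse)
    print(f"alpha={alpha:7.3f}  train MSE={train_mse:.4f}  val MSE={val_mse:.4f}")
```

Training error can only rise as `alpha` grows, so the useful question is which `alpha` gives the lowest validation error. Too small a value leaves the model free to chase noise, and too large a value flattens the curve and underfits.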

Bayesian Learning and Statistical Inference: Overfitting and Regularization — Application Retrospective Card

At this point, consolidate Bayesian Learning and Statistical Inference: Overfitting and Regularization in Model Selection into a concise retrospective table: first articulate the core narrative, then validate it using a small concrete task.

Bayesian Learning and Statistical Inference: Overfitting and Regularization — Application Self-Check Card

After finishing Bayesian Learning and Statistical Inference: Overfitting and Regularization in Model Selection, try walking through a small example end-to-end. Then assess which steps you can now execute independently.

Summary

In Bayesian learning, overfitting and regularization are two foundational concepts. Recognizing overfitting—and knowing how to counteract it via regularization—empowers us to make more robust model selections. In the next chapter, we will explore Bayesian regression, focusing specifically on practical implementation and applications of linear regression models.

Through this tutorial, we hope you'll internalize the importance of balancing model complexity during selection, and learn to use regularization to achieve a good fit while reducing the risk of overfitting.
