For discrete variables, we examine the probability at each point; for continuous variables, we examine the area under the density curve over an interval. The two cases are computed differently (sums versus integrals), so the methods should not be mixed.
First determine whether the variable's possible values form a countable set. If they do, sum the point probabilities; if the values fill an interval, use a density and integrate to obtain areas.
In the previous lesson, we introduced the definition of a random variable: a quantitative outcome derived from a random experiment. Now, we turn our attention to discrete random variables and continuous random variables—foundational concepts essential for understanding probability distributions and stochastic processes.
Discrete Random Variables
Definition
When learning about discrete and continuous random variables, begin by asking: Are the possible values countable (e.g., integers) or uncountably infinite (e.g., intervals on the real line)? This distinction determines how probabilities are represented—and how expectations and other quantities are computed.
A discrete random variable is one whose possible values constitute a countable set, either finite or countably infinite. That is, its outcomes can be explicitly listed. For example, when rolling a fair six-sided die, let the random variable $X$ denote the number shown on the top face. Then $X$ takes values in the set $\{1, 2, 3, 4, 5, 6\}$.
Probability Distribution
The probability distribution of a discrete random variable is typically described using a probability mass function (PMF). For a discrete random variable $X$, its PMF is defined as:
$$p(x) = P(X = x)$$
where $p(x)$ denotes the probability that $X$ equals $x$.
Example: Rolling a Die
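Suppose we roll a fair six-sided die once, and define $X$ as the outcome. Then:
$$p(k) = P(X = k) = \frac{1}{6}, \quad k = 1, 2, \dots, 6$$
These satisfy $p(k) \ge 0$ for all $k$, and sum to unity:
$$\sum_{k=1}^{6} p(k) = 6 \cdot \frac{1}{6} = 1$$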
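The two PMF conditions above can be checked mechanically. The following is a minimal sketch, assuming the PMF is stored in a plain dictionary named `pmf` (a name chosen here for illustration); it uses exact fractions so the sum is exactly 1 rather than a floating-point approximation.

```python
from fractions import Fraction

# PMF of a fair six-sided die: every face has probability 1/6.
# The dictionary name `pmf` is an illustrative choice, not from the lesson.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# A valid PMF is nonnegative everywhere and sums to exactly 1.
all_nonnegative = all(p >= 0 for p in pmf.values())
total = sum(pmf.values())

# The probability of an event is the sum of the PMF over its outcomes,
# for example the event "the roll is even" = {2, 4, 6}.
p_even = sum(p for face, p in pmf.items() if face % 2 == 0)

print(all_nonnegative, total, p_even)
```

The same summation pattern gives the probability of any event defined on a discrete variable: pick out the outcomes in the event and add their point probabilities.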
Code Example
The following Python code uses numpy to simulate 1,000 die rolls and visualize the resulting empirical probability distribution.
import numpy as np
import matplotlib.pyplot as plt
# Simulate 1000 die rolls
np.random.seed(0)
dice_rolls = np.random.randint(1, 7, size=1000)
# Compute empirical probability distribution
values, counts = np.unique(dice_rolls, return_counts=True)
probabilities = counts / len(dice_rolls)
# Plot
plt.bar(values, probabilities)
plt.xticks(values)
plt.xlabel('Face Value')
plt.ylabel('Probability')
plt.title('Empirical Probability Distribution of Die Rolls')
plt.show()
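To see how close the empirical frequencies get to the theoretical value of 1/6, the sketch below repeats the simulation at several sample sizes and reports the largest gap. It is an illustration only: it uses the standard-library `random` module instead of numpy, and the helper names `empirical_pmf` and `max_deviation` are hypothetical.

```python
import random
from collections import Counter

def empirical_pmf(n_rolls, seed=0):
    """Relative frequency of each face over n_rolls simulated fair-die rolls."""
    rng = random.Random(seed)
    # random.randint includes both endpoints, unlike np.random.randint(1, 7).
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}

def max_deviation(pmf):
    """Largest absolute gap between an empirical frequency and 1/6."""
    return max(abs(p - 1 / 6) for p in pmf.values())

for n in (100, 1_000, 100_000):
    print(f"n={n:>7}  max |frequency - 1/6| = {max_deviation(empirical_pmf(n)):.4f}")
```

With larger samples the gap typically shrinks, although for any single seed it is not guaranteed to decrease at every step.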
Continuous Random Variables
Definition
After reading “Essential Probability Theory for AI Beginners: Generating Random Variables and Distributions — Part 5: Discrete and Continuous Random Variables”, reflect on three questions:
- What problem does this concept solve?
- At which step is it easiest to make a mistake?
- Can I walk through a small concrete example end-to-end?
A continuous random variable is one whose possible values span an uncountable subset of the real numbers, typically an interval (e.g., all of $\mathbb{R}$, or a bounded interval such as $[a, b]$). Because every individual point has probability zero, the distribution cannot be described by pointwise probabilities; it is described instead by a probability density function (PDF).
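The claim that a single point carries zero probability can be made concrete by shrinking an interval around that point. The sketch below assumes a standard normal variable as the example and uses `statistics.NormalDist` from the Python standard library; its `cdf` method serves here only as a convenient way to get interval probabilities, and the point `c = 0.5` is an arbitrary choice.

```python
from statistics import NormalDist

# Assumed example variable: a standard normal X (mean 0, standard deviation 1).
X = NormalDist(mu=0.0, sigma=1.0)

def interval_prob(c, eps):
    """P(c - eps < X < c + eps), from the library's cumulative probabilities."""
    return X.cdf(c + eps) - X.cdf(c - eps)

c = 0.5  # arbitrary point chosen for illustration
for eps in (0.1, 0.01, 0.001, 0.0001):
    print(f"eps={eps:g}  P(|X - c| < eps) = {interval_prob(c, eps):.6f}")
```

As `eps` shrinks, the probability shrinks roughly in proportion to the interval width, which is why only intervals, not single points, carry positive probability.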
Probability Density Function
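For a continuous random variable $X$, its PDF $f(x)$ satisfies:
$$P(a < X < b) = \int_a^b f(x)\,dx$$
Here, $P(a < X < b)$ denotes the probability that $X$ falls within the open interval $(a, b)$. The density itself is nonnegative and integrates to 1 over the whole real line.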
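The area interpretation can be checked numerically on a density simple enough to integrate by hand. The sketch below assumes a hypothetical density $f(x) = 2x$ on $[0, 1]$ (not from the lesson) and a hand-written midpoint rule named `integrate`. For this density the exact interval probability is $b^2 - a^2$.

```python
def f(x):
    """Hypothetical density for illustration: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def integrate(func, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width

# Total area under the density should be 1.
total_area = integrate(f, 0.0, 1.0)

# P(0.2 < X < 0.5) is the area under f between 0.2 and 0.5;
# by hand this is 0.5**2 - 0.2**2 = 0.21.
p_interval = integrate(f, 0.2, 0.5)

print(round(total_area, 6), round(p_interval, 6))
```

Note that $f(x)$ exceeds 1 for $x > 0.5$: a density value is not a probability, only areas under it are.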
Example: Normal Distribution
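A canonical example of a continuous random variable is one that follows the normal distribution (also called the Gaussian distribution), with PDF:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$
where $\mu$ is the mean and $\sigma$ is the standard deviation.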
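As a sanity check on the normal density, the formula can be coded directly and compared with a library implementation. This is a minimal sketch: the function name `normal_pdf` and the test parameters are illustrative choices, and the comparison uses `statistics.NormalDist` from the Python standard library.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density: exp(-(x - mu)^2 / (2 sigma^2)) / (sigma * sqrt(2 pi))."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

# Compare the hand-written formula with the library at a few points,
# using illustrative parameters mu = 1 and sigma = 2.
reference = NormalDist(mu=1.0, sigma=2.0)
for x in (-1.0, 0.0, 2.5):
    print(x, normal_pdf(x, 1.0, 2.0), reference.pdf(x))
```

The two columns should agree up to floating-point rounding; the peak of the density sits at $x = \mu$, and a larger $\sigma$ makes the curve lower and wider.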
Code Example
The following code plots the PDF of the standard normal distribution ($\mu = 0$, $\sigma = 1$):
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
# Mean and standard deviation
mu, sigma = 0, 1
# Range of x-values
x = np.linspace(-5, 5, 1000)
# Compute PDF values
y = norm.pdf(x, mu, sigma)
# Plot
plt.plot(x, y)
plt.title('Probability Density Function of Standard Normal Distribution')
plt.xlabel('x')
plt.ylabel('f(x)')
plt.grid()
plt.show()
When reviewing or practicing this material, write the key concepts, input conditions, operations performed, and resulting outputs side by side on a single page, so that you can revise efficiently and verify correctness later.
Summary
In today’s lesson, we covered the definitions and distinguishing features of discrete and continuous random variables, along with their respective probability representations—PMFs for discrete variables and PDFs for continuous ones. In the next part, we will explore the cumulative distribution function (CDF) and clarify its relationship with—and distinction from—the PDF. This will deepen your understanding of random variable behavior and strengthen your ability to apply these ideas to real-world problems.
We hope this lesson advances your journey through probability theory!