The PDF describes density; the CDF describes cumulative probability. In continuous distributions, the density value at a single point does not equal the probability of that point.
I’ll use area under the curve to interpret probability. If you see a PDF value greater than 1, don’t immediately assume it’s an error.
In the previous article, we introduced the fundamental concept of random variables and their classification—namely, discrete and continuous random variables. In this article, we delve deeper into two essential tools associated with these variables: the Cumulative Distribution Function (CDF) and the Probability Density Function (PDF). These concepts lay the groundwork for our further exploration of probability distributions—a prerequisite for our next article, where we’ll examine common distributions such as the binomial distribution.
1. Cumulative Distribution Function (CDF)
The Cumulative Distribution Function quantifies the probability that a random variable takes on a value less than or equal to a given threshold. Specifically, the CDF of a random variable $X$, denoted $F_X(x)$, gives the probability that $X$ is no greater than $x$:
$$F_X(x) = P(X \le x)$$
When learning about the CDF and PDF, remember: the CDF answers “How much probability has accumulated up to this value?”, while the PDF describes “How dense is the probability near this value?”
For a discrete random variable $X$, its CDF is defined as:
$$F_X(x) = \sum_{x_i \le x} P(X = x_i)$$
For a continuous random variable $X$, the CDF is defined as:
$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$$
where $f_X$ is the probability density function (PDF) of $X$.
1.1 Example: CDF of a Discrete Random Variable
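Consider a simple experiment: rolling a fair six-sided die. Let the random variable $X$ represent the outcome. Then $X$ takes values in $\{1, 2, 3, 4, 5, 6\}$, each with probability $\frac{1}{6}$. We compute $F_X(x)$:
$$F_X(x) = \begin{cases} 0, & x < 1 \\ \dfrac{\lfloor x \rfloor}{6}, & 1 \le x < 6 \\ 1, & x \ge 6 \end{cases}$$
For example, $F_X(3) = P(X \le 3) = \frac{3}{6} = \frac{1}{2}$.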
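The die example can be checked with a few lines of Python. This is a minimal sketch: the names `pmf` and `die_cdf` are illustrative and not from the original article, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def die_cdf(x):
    """Return P(X <= x) by summing the PMF over all faces not exceeding x."""
    return sum((p for face, p in pmf.items() if face <= x), Fraction(0))


# The CDF is a step function: it only jumps at the faces 1..6.
for x in [0, 1, 3, 3.5, 6, 10]:
    print(f"F({x}) = {die_cdf(x)}")
```

Note that `die_cdf(3)` and `die_cdf(3.5)` agree, because no probability mass sits between 3 and 4.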
1.2 Example: CDF of a Continuous Random Variable
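Let $X$ be a continuous random variable following the uniform distribution $U(a, b)$. Its PDF is:
$$f_X(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$
Then its CDF is:
$$F_X(x) = \begin{cases} 0, & x < a \\ \dfrac{x - a}{b - a}, & a \le x \le b \\ 1, & x > b \end{cases}$$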
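As a rough sketch, the uniform PDF and CDF on an interval from `a` to `b` can be written directly as piecewise functions. The function names and the default endpoints 0 and 1 are assumptions made for illustration; they are not part of the original text.

```python
def uniform_pdf(x, a=0.0, b=1.0):
    """Density of Uniform(a, b): constant 1/(b - a) on [a, b], zero elsewhere."""
    if b <= a:
        raise ValueError("b must be greater than a")
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_cdf(x, a=0.0, b=1.0):
    """CDF of Uniform(a, b): 0 left of a, linear on [a, b], 1 right of b."""
    if b <= a:
        raise ValueError("b must be greater than a")
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


# Evaluate the CDF below, inside, and above the support of Uniform(0, 1).
for x in [-0.5, 0.25, 0.5, 1.5]:
    print(x, uniform_cdf(x))
```

The CDF rises linearly across the support because the density is constant there.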
2. Probability Density Function (PDF)
The Probability Density Function characterizes how probability is distributed across the possible values of a continuous random variable. For discrete random variables, we use the probability mass function (PMF); for continuous ones, we use the PDF.
2.1 Definition of the PDF
For a random variable $X$ with PDF $f_X(x)$, the probability that $X$ falls within an interval $[x_1, x_2]$ is given by:
$$P(x_1 \le X \le x_2) = \int_{x_1}^{x_2} f_X(x)\,dx$$
A valid PDF satisfies two key properties:
- Non-negativity: $f_X(x) \ge 0$ for all $x$.
- Unit total area: the integral over the entire real line equals 1: $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$.
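To see why a density value above 1 is not an error, consider a uniform distribution on the interval from 0 to 0.5, whose density is 2 on its support. The sketch below checks the unit-area property and an interval probability numerically. The midpoint rule, the helper names, and the integration window from -1 to 2 (standing in for the whole real line) are all choices made for illustration.

```python
def midpoint_integral(f, lo, hi, n=10_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


def narrow_uniform_pdf(x):
    """Density of Uniform(0, 0.5): the value 2 on [0, 0.5], zero elsewhere."""
    return 2.0 if 0.0 <= x <= 0.5 else 0.0


# The density exceeds 1, yet the total area is still (approximately) 1.
# The window [-1, 2] stands in for the real line; the density is zero outside it.
total_area = midpoint_integral(narrow_uniform_pdf, -1.0, 2.0)

# Probability of landing in [0.1, 0.3] is the area under the density there.
prob_interval = midpoint_integral(narrow_uniform_pdf, 0.1, 0.3)

print("density at 0.25:", narrow_uniform_pdf(0.25))
print("total area (approx.):", round(total_area, 3))
print("P(0.1 <= X <= 0.3) (approx.):", round(prob_interval, 3))
```

The density is a height, not a probability: only areas under it are probabilities, which is why a height of 2 over a width of 0.5 is perfectly valid.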
2.2 Example: PDF of the Uniform Distribution
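Continuing with the $U(a, b)$ example, its PDF is:
$$f_X(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$
This reflects the fact that the density is constant on $[a, b]$, so subintervals of equal length are equally likely.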
3. Relationship Between CDF and PDF
For continuous random variables, the CDF and PDF are intimately linked: the PDF is the derivative of the CDF.
$$f_X(x) = \frac{d}{dx} F_X(x)$$
Conversely, if the PDF is known, the CDF can be recovered via integration:
$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$$
3.1 Example: From PDF to CDF
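Recall the uniform PDF $f_X(x) = \frac{1}{b - a}$ for $a \le x \le b$ (and $0$ otherwise). Its corresponding CDF is:
$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt = \begin{cases} 0, & x < a \\ \dfrac{x - a}{b - a}, & a \le x \le b \\ 1, & x > b \end{cases}$$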
This piecewise definition captures the characteristic shape of the uniform distribution.
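The two directions of the CDF/PDF relationship can be checked numerically for the uniform distribution on 0 to 1. This is a sketch under stated assumptions: the finite-difference step `h`, the midpoint rule, and the lower limit `lower=-1.0` (standing in for minus infinity, valid here because the density is zero below 0) are illustrative choices, as are all function names.

```python
def pdf(x):
    """Density of Uniform(0, 1)."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def cdf(x):
    """CDF of Uniform(0, 1): x clipped to the range [0, 1]."""
    return min(max(x, 0.0), 1.0)


def numeric_derivative(F, x, h=1e-6):
    """Central finite-difference approximation of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


def numeric_cdf(f, x, lower=-1.0, n=20_000):
    """Midpoint-rule approximation of the integral of f from lower to x."""
    step = (x - lower) / n
    return step * sum(f(lower + (i + 0.5) * step) for i in range(n))


# Differentiating the CDF recovers the PDF (away from the kinks at 0 and 1).
print("F'(0.4) ~", round(numeric_derivative(cdf, 0.4), 4), "vs pdf:", pdf(0.4))

# Integrating the PDF recovers the CDF.
print("integral up to 0.4 ~", round(numeric_cdf(pdf, 0.4), 3), "vs cdf:", cdf(0.4))
```

At the endpoints 0 and 1 the CDF has a kink, so the derivative is not defined there; this does not affect any probabilities, since single points carry zero probability.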
3.2 Python Example: Computing CDF and PDF
Below is a simple Python example using scipy to compute and visualize the CDF and PDF of the uniform distribution.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import uniform
# Set parameters
a, b = 0, 1 # support interval for Uniform(0, 1)
# Generate x values
x = np.linspace(-0.5, 1.5, 100)
# Compute PDF and CDF
# scipy's uniform covers [loc, loc + scale], so scale is the width b - a
pdf = uniform.pdf(x, loc=a, scale=b - a)
cdf = uniform.cdf(x, loc=a, scale=b - a)
# Plotting
plt.figure(figsize=(10, 5))
# Plot PDF
plt.subplot(1, 2, 1)
plt.title('Probability Density Function (PDF)')
plt.plot(x, pdf, label='PDF', color='blue')
plt.fill_between(x, pdf, alpha=0.2)
plt.xlim(-0.5, 1.5)
plt.xlabel('x')
plt.ylabel('Density')
plt.axhline(0, color='black', lw=1)
plt.axvline(0, color='black', lw=1)
# Plot CDF
plt.subplot(1, 2, 2)
plt.title('Cumulative Distribution Function (CDF)')
plt.plot(x, cdf, label='CDF', color='orange')
plt.axhline(1, color='black', lw=1)
plt.axvline(1, color='black', lw=1)
plt.xlim(-0.5, 1.5)
plt.xlabel('x')
plt.ylabel('Probability')
plt.axhline(0, color='black', lw=1)
# Avoid overlapping labels and display the figure
plt.tight_layout()
plt.show()