A random variable transforms uncertain experimental outcomes into computable numbers. It serves as the bridge from events to distributions, expectations, and model evaluation.
I’ll first clarify what a random variable represents. Without a clear understanding of its meaning, subsequent concepts—such as distributions and expectations—lose interpretability.
In the previous article, we covered foundational probability concepts—including conditional probability and independence—essential for understanding relationships among random events. In this article, we delve into a central topic in probability theory: random variables.
Definition of a Random Variable
In probability theory, a random variable is a function that maps outcomes of a random experiment to numerical values. More precisely, given a sample space \(\Omega\) (i.e., the set of all possible experimental outcomes), a random variable \(X\) is a measurable function mapping from \(\Omega\) to the real numbers \(\mathbb{R}\):
\[ X : \Omega \to \mathbb{R} \]
When learning about random variables, first examine how they assign numeric values to random outcomes; then determine whether those values are discrete or continuous. Distributions, expectations, and variances all originate here.
Example
Suppose we roll a fair six-sided die. The sample space is:
\[ \Omega = \{1, 2, 3, 4, 5, 6\} \]
We define a random variable \(X\) representing the number shown on the die. Its definition is:
- If the outcome is 1, then \(X = 1\)
- If the outcome is 2, then \(X = 2\)
- If the outcome is 3, then \(X = 3\)
- If the outcome is 4, then \(X = 4\)
- If the outcome is 5, then \(X = 5\)
- If the outcome is 6, then \(X = 6\)
In this example, \(X\) maps each possible die outcome to a concrete integer, enabling quantitative analysis.
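The idea that a random variable is simply a function on the sample space can be sketched in a few lines of Python. This is a minimal illustration, not part of the original example: the names `sample_space`, `X`, and `is_even` are chosen here for demonstration, and `is_even` is a hypothetical second random variable added only to show that several random variables can live on the same sample space.

```python
# Sample space for one roll of a six-sided die.
sample_space = [1, 2, 3, 4, 5, 6]


def X(outcome: int) -> int:
    """Random variable from the text: the number shown on the die."""
    return outcome


def is_even(outcome: int) -> int:
    """Hypothetical second random variable: 1 if the face is even, else 0."""
    return 1 if outcome % 2 == 0 else 0


# Apply both random variables to every outcome in the sample space.
for outcome in sample_space:
    print(f"outcome={outcome}  X={X(outcome)}  is_even={is_even(outcome)}")
```

Both functions take the same outcomes as input but produce different numbers, which is why the choice of random variable, not just the experiment, determines what gets analyzed.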
Classification of Random Variables
Random variables fall broadly into two categories: discrete random variables and continuous random variables. While this article focuses primarily on the definition of random variables, we briefly introduce both types here—and explore them in depth in the next article.
- Discrete random variable: Takes values in a finite or countably infinite set. For example, the die-roll result can only be one of \(\{1, 2, 3, 4, 5, 6\}\).
- Continuous random variable: Takes any real value within an interval. For instance, a person’s height may assume any real value in a plausible physiological range.
After reading AI-Ready Probability for Beginners: Part 4 — Random Variables and Distributions — Defining Random Variables, take one minute to reflect:
- Are key concepts clearly distinguished?
- Can the practice steps be reproduced?
- Can you restate conclusions in your own words?
Reflecting on Real-World Applications
Suppose we’re building an AI system to predict user purchase behavior. We might model the number of past purchases by a user as a discrete random variable \(X\), where \(X \in \{0, 1, 2, \dots\}\).
Another relevant random variable could be \(Y\): total spending per online shopping session. Since spending amounts can vary continuously (e.g., $29.99, $30.005, $157.82…), \(Y\) is naturally modeled as a continuous random variable.
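To make the contrast between the purchase count and the session spending concrete, here is a small simulation sketch. The distributions are assumptions chosen purely for illustration: the article does not say how purchases or spending are distributed, so the Poisson rate and the gamma shape and scale below are hypothetical values.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Assumed model (illustration only): purchase counts follow a Poisson
# distribution, so every sample is a non-negative integer.
purchase_counts = rng.poisson(lam=3.0, size=5)

# Assumed model (illustration only): session spending follows a gamma
# distribution, so samples are non-negative real numbers.
session_spend = rng.gamma(shape=2.0, scale=25.0, size=5)

print("purchase counts:", purchase_counts, purchase_counts.dtype)
print("session spend:  ", np.round(session_spend, 3), session_spend.dtype)
```

The integer dtype of the first array and the floating-point dtype of the second mirror the discrete versus continuous distinction.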
Relationship Between Random Variables and Distributions
A random variable is far more than a simple mapping—it is intrinsically linked to a probability distribution. Every random variable has an associated distribution describing the likelihood of its possible values.
Definition of a Probability Distribution
A probability distribution specifies how probability is spread over the values a random variable can take.
- For a discrete random variable \(X\), the distribution is described by a probability mass function (PMF), denoted \(p_X(x)\), defined as:
\[ p_X(x) = P(X = x) \]
- For a continuous random variable \(X\), the distribution is described by a probability density function (PDF), denoted \(f_X(x)\), satisfying:
\[ P(a \le X \le b) = \int_a^b f_X(x)\,dx \]
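Both descriptions can be checked numerically. The sketch below assumes the fair-die PMF for the discrete case and, for the continuous case, uses the standard normal density as an example PDF (a choice made here for illustration, not taken from the article). The helper `interval_probability` is a hypothetical function that approximates the probability of an interval with a midpoint Riemann sum.

```python
import math

import numpy as np

# Discrete case: PMF of a fair die, each face has probability 1/6.
die_pmf = {x: 1 / 6 for x in range(1, 7)}
pmf_total = sum(die_pmf.values())  # a valid PMF sums to 1


def normal_pdf(x):
    """Standard normal density, used here only as an example PDF."""
    return np.exp(-0.5 * x**2) / math.sqrt(2 * math.pi)


def interval_probability(pdf, a, b, n=10_000):
    """Approximate P(a <= X <= b) with a midpoint Riemann sum over n slices."""
    edges = np.linspace(a, b, n + 1)
    midpoints = (edges[:-1] + edges[1:]) / 2
    width = (b - a) / n
    return float(np.sum(pdf(midpoints)) * width)


p_within_one = interval_probability(normal_pdf, -1.0, 1.0)

print(f"sum of die PMF: {pmf_total:.6f}")
print(f"P(-1 <= X <= 1) for the example PDF: {p_within_one:.6f}")
```

Note the difference in how probabilities are obtained: the PMF is read off directly at a value, while the PDF has to be integrated over an interval.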
Case Demonstration
Let’s compute and visualize the probability distribution of a discrete random variable using Python. Revisiting our die-roll example, \(X\) takes values in \(\{1, 2, 3, 4, 5, 6\}\) with equal probability.
```python
import numpy as np
import matplotlib.pyplot as plt

# Define possible values of random variable X
values = np.array([1, 2, 3, 4, 5, 6])

# Compute probability for each value
probabilities = np.array([1/6] * 6)  # Uniform probability: 1/6 each

# Visualize the probability distribution
plt.bar(values, probabilities)
plt.xlabel('Die Face')
plt.ylabel('Probability')
plt.title('Probability Distribution of a Fair Die Roll')
plt.xticks(values)
plt.ylim(0, 1)
plt.show()
```
This code defines the PMF of \(X\) and displays it as a bar chart; each face has the same probability, \(1/6\).
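As a sanity check on the uniform PMF, one could also simulate many rolls and compare the observed frequencies with \(1/6\). This is a sketch under stated assumptions: the seed and the number of rolls are arbitrary choices made here, and the simulated die is assumed fair.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
n_rolls = 100_000

# Simulate fair die rolls; the upper bound of integers() is exclusive.
rolls = rng.integers(1, 7, size=n_rolls)

# Count how often each face 1..6 appears (index 0 is unused, so drop it).
faces = np.arange(1, 7)
counts = np.bincount(rolls, minlength=7)[1:]
empirical_pmf = counts / n_rolls

for face, freq in zip(faces, empirical_pmf):
    print(f"P(X={face}) is approximately {freq:.4f} (theory: {1/6:.4f})")
```

With this many rolls the empirical frequencies should sit close to the theoretical value, and the gap typically shrinks as `n_rolls` grows.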
Summary
In this article, we formally defined the concept of a random variable, clarified its relationship with probability distributions, introduced its two main types (discrete and continuous), and illustrated these ideas with intuitive examples and code. In the next article, we’ll dive deeper into the properties and applications of discrete and continuous random variables.
Stay tuned for more installments in this series—and continue strengthening your foundational probability knowledge for AI!