Covariance and Correlation

Concept Diagram: Covariance and Correlation

Covariance measures whether two variables change together; correlation removes the influence of scale. A high correlation does not imply causation.

Covariance and Correlation Checklist Diagram

I always start with a scatter plot. A single correlation coefficient cannot reveal outliers, nonlinear relationships, or subgroup structures.

In the previous article, we explored the properties of variance, learning how to quantify the dispersion of a single random variable. This article continues our discussion of key concepts in probability theory: covariance and correlation—essential tools for analyzing relationships between random variables, widely applied in machine learning and data analysis.

Definition of Covariance

Covariance is a measure describing the linear relationship between two random variables. Given random variables $X$ and $Y$, their covariance is defined as:

$$\text{Cov}(X, Y) = \mathbb{E}[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])]$$

Covariance & Correlation Decision Card

When learning covariance and correlation, first assess whether the two variables move in the same direction; then examine the sign and magnitude of the standardized correlation coefficient.

This formula helps us intuitively understand covariance: it quantifies how deviations of one variable from its mean relate to deviations of the other variable from its mean.
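Expanding the product inside the expectation gives an equivalent form that is often easier to compute by hand. The derivation below is a sketch that relies only on linearity of expectation and treats $\mathbb{E}[X]$ and $\mathbb{E}[Y]$ as constants:

```latex
\begin{aligned}
\text{Cov}(X, Y)
  &= \mathbb{E}\big[XY - X\,\mathbb{E}[Y] - \mathbb{E}[X]\,Y + \mathbb{E}[X]\,\mathbb{E}[Y]\big] \\
  &= \mathbb{E}[XY] - \mathbb{E}[X]\,\mathbb{E}[Y] - \mathbb{E}[X]\,\mathbb{E}[Y] + \mathbb{E}[X]\,\mathbb{E}[Y] \\
  &= \mathbb{E}[XY] - \mathbb{E}[X]\,\mathbb{E}[Y]
\end{aligned}
```

Setting $Y = X$ recovers the variance from the previous article: $\text{Cov}(X, X) = \text{Var}(X)$.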

Properties of Covariance

Interpretation of Sign:
- If $\text{Cov}(X, Y) > 0$, $X$ and $Y$ are positively associated: when one tends to increase, the other tends to increase as well.
- If $\text{Cov}(X, Y) < 0$, $X$ and $Y$ are negatively associated: when one tends to increase, the other tends to decrease.
- If $\text{Cov}(X, Y) = 0$, there is no linear relationship between them (they are uncorrelated). This does not mean they are independent, since a nonlinear dependence can still exist.

Unit Sensitivity:
- Covariance carries units equal to the product of the units of $X$ and $Y$, so its magnitude depends on the measurement scale and is hard to interpret on its own.
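To make the unit sensitivity concrete, here is a minimal sketch with made-up measurements. The arrays and the helper population_cov are illustrative assumptions, not data from this article. Converting one variable from meters to centimeters multiplies the covariance by 100 even though the relationship between the variables is unchanged:

```python
import numpy as np

def population_cov(a, b):
    """Population covariance: the mean product of deviations from the means."""
    return np.mean((a - a.mean()) * (b - b.mean()))

# Hypothetical measurements, for illustration only.
length_m = np.array([1.0, 2.0, 3.0, 4.0])   # length in meters
weight_kg = np.array([2.0, 4.0, 5.0, 9.0])  # weight in kilograms

cov_m = population_cov(length_m, weight_kg)           # units: meter * kilogram
cov_cm = population_cov(length_m * 100.0, weight_kg)  # units: centimeter * kilogram

print(cov_m)   # 2.75
print(cov_cm)  # 275.0
```

The sign is the same in both cases; only the magnitude follows the units, which is why raw covariance values are hard to compare across datasets.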

Example

Suppose $X$ and $Y$ represent a student’s study time (in hours) and exam score (in points), respectively. Observed data are shown below:

Study Time $X$ (hours)	Exam Score $Y$ (points)
1	50
2	55
3	60
4	70
5	75

First, compute the expectations:

$$\mathbb{E}[X] = \frac{1 + 2 + 3 + 4 + 5}{5} = 3$$

$$\mathbb{E}[Y] = \frac{50 + 55 + 60 + 70 + 75}{5} = 62$$

Then apply the covariance formula:

$$\text{Cov}(X, Y) = \frac{1}{5} \sum_{i=1}^{5} (X_i - \mathbb{E}[X])(Y_i - \mathbb{E}[Y])$$
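Written out for the five observations, with $\mathbb{E}[X] = 3$ and $\mathbb{E}[Y] = 62$, the sum works out as follows:

```latex
\begin{aligned}
\text{Cov}(X, Y)
  &= \frac{1}{5}\big[(-2)(-12) + (-1)(-7) + (0)(-2) + (1)(8) + (2)(13)\big] \\
  &= \frac{24 + 7 + 0 + 8 + 26}{5} = \frac{65}{5} = 13
\end{aligned}
```

The result carries the units hours times points, as noted under Unit Sensitivity.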

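import numpy as np

# Study time (hours) and exam score (points) from the table above
X = np.array([1, 2, 3, 4, 5])
Y = np.array([50, 55, 60, 70, 75])

# np.cov returns the 2x2 covariance matrix; entry [0, 1] is Cov(X, Y).
# ddof=0 divides by n, matching the 1/5 factor in the formula above.
# The default (ddof=1) divides by n - 1 and would give 16.25 instead.
cov_xy = np.cov(X, Y, ddof=0)[0, 1]
cov_xy  # approximately 13.0

The computed covariance is $\text{Cov}(X, Y) = 13$. It is positive, indicating a positive linear association between study time and exam score.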

Definition and Computation of Correlation

Correlation is the standardized version of covariance—designed specifically to eliminate unit dependence. It is typically quantified using the Pearson correlation coefficient, defined as:

$$r_{XY} = \frac{\text{Cov}(X, Y)}{\sqrt{\text{Var}(X) \cdot \text{Var}(Y)}}$$

where $\text{Var}(X)$ and $\text{Var}(Y)$ denote the variances of $X$ and $Y$, respectively.
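The definition translates almost line for line into code. The sketch below is one possible implementation, not a reference one: the helper name pearson_r and the sample arrays are assumptions made for illustration. It compares the hand-rolled value against NumPy's np.corrcoef and checks that rescaling one variable leaves the coefficient unchanged:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation from the definition: Cov(X, Y) / sqrt(Var(X) * Var(Y))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))  # population covariance (divide by n)
    return cov / np.sqrt(np.var(x) * np.var(y))     # np.var also divides by n by default

# Hypothetical data, for illustration only.
a = np.array([2.0, 4.0, 5.0, 9.0])
b = np.array([1.0, 3.0, 2.0, 6.0])

r_manual = pearson_r(a, b)
r_numpy = np.corrcoef(a, b)[0, 1]             # entry [0, 1] of the correlation matrix
r_rescaled = pearson_r(a * 1000.0 + 7.0, b)   # change the units and offset of the first variable

print(r_manual, r_numpy, r_rescaled)  # all three agree up to floating-point rounding
```

Because the numerator and denominator scale by the same factor, a positive change of units cancels out, which is exactly the unit independence that covariance lacks.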

Properties of Correlation

Range:
- The correlation coefficient $r_{XY}$ always lies in the interval $[-1, 1]$.
- $r_{XY} = 1$ indicates perfect positive linear correlation; $r_{XY} = -1$, perfect negative linear correlation; $r_{XY} = 0$, no linear correlation.
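A coefficient of zero only rules out a linear relationship. As a small illustration with constructed data (the arrays below are assumptions chosen so the arithmetic is easy to follow), here one variable is fully determined by the other, yet the correlation is zero:

```python
import numpy as np

# Constructed example: y_sq depends entirely on x_sym, but not linearly.
x_sym = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])  # symmetric around zero
y_sq = x_sym ** 2                               # [4, 1, 0, 1, 4]

# Population covariance: positive and negative deviation products cancel out.
cov = np.mean((x_sym - x_sym.mean()) * (y_sq - y_sq.mean()))
r = cov / np.sqrt(np.var(x_sym) * np.var(y_sq))

print(cov, r)  # both are zero up to floating-point rounding
```

A scatter plot of these points would show a clear parabola, which is why it helps to look at the data before relying on a single coefficient.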

Example

Continuing with the earlier example, we compute the Pearson correlation coefficient between study time and exam score.

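# Population covariance (ddof=0), so it uses the same 1/n normalization as np.var
cov_xy = np.cov(X, Y, ddof=0)[0, 1]

# Compute variances (np.var defaults to ddof=0)
var_x = np.var(X)
var_y = np.var(Y)

# Compute correlation coefficient
correlation = cov_xy / np.sqrt(var_x * var_y)
correlation  # approximately 0.991

Running this code yields $r = 13 / \sqrt{2 \cdot 86} \approx 0.991$, which indicates a strong positive linear relationship between study time and exam performance. All three quantities here use the population normalization (division by $n$).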
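As a cross-check, NumPy's np.corrcoef computes the Pearson coefficient directly. The sketch below also illustrates a normalization pitfall worth knowing about: np.cov divides by $n - 1$ by default while np.var divides by $n$, so the two defaults should not be combined in one ratio. The variable names r_builtin, r_sample, and r_mixed are chosen here for illustration:

```python
import numpy as np

# Study-time data from the example above.
X = np.array([1, 2, 3, 4, 5])
Y = np.array([50, 55, 60, 70, 75])

# Built-in Pearson correlation: entry [0, 1] of the 2x2 correlation matrix.
r_builtin = np.corrcoef(X, Y)[0, 1]

# Sample statistics (ddof=1) used consistently give the same ratio,
# because the 1/(n - 1) factors cancel.
r_sample = np.cov(X, Y, ddof=1)[0, 1] / np.sqrt(np.var(X, ddof=1) * np.var(Y, ddof=1))

# Mixing the defaults (ddof=1 covariance, ddof=0 variances) inflates r by n / (n - 1).
r_mixed = np.cov(X, Y)[0, 1] / np.sqrt(np.var(X) * np.var(Y))

print(round(r_builtin, 3))  # 0.991
print(round(r_sample, 3))   # 0.991
print(round(r_mixed, 3))    # 1.239, outside [-1, 1], so not a valid correlation
```

Whichever normalization you pick, applying it to the covariance and to both variances keeps the coefficient inside $[-1, 1]$.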

Summary

In this article, we introduced covariance and correlation, fundamental tools for investigating relationships between two random variables. By computing covariance and correlation coefficients, we gain deeper insight into underlying data structure and dependencies—laying essential groundwork for the next topic: the Law of Large Numbers.

In the following article, we will delve into the Law of Large Numbers, exploring how sample means converge toward the population mean as sample size increases. We hope you’ll apply these concepts confidently to analyze real-world problems!
