
Covariance

Measure of joint variability.


Cov(X, Y) = E[XY] - E[X]E[Y]


Core idea

Overview

Covariance measures the joint variability of two random variables, indicating the direction of their linear relationship. A positive value signifies that variables move in the same direction, while a negative value indicates an inverse relationship.

When to use: Apply this formula when you need to assess the linear dependency between two sets of data or as a step toward calculating correlation. It is used in probability distributions to determine how much variables change together.

Why it matters: It is crucial in finance for risk management and portfolio optimization, helping investors identify assets that do not move in tandem. It also underpins dimensionality reduction techniques like Principal Component Analysis (PCA) in data science.

Symbols

Variables

Cov(X, Y): Covariance
E[XY]: Mean Product
μ_{x}: Mean X
μ_{y}: Mean Y

Walkthrough

Derivation

Understanding Covariance

Covariance measures how two variables vary together: positive covariance means they tend to increase together; negative covariance means one increases as the other decreases.

X and Y have defined means (finite expectations).
Covariance is most informative for roughly linear relationships.

State the population definition:

Covariance is the expected product of deviations from each variable’s mean.

Cov(X, Y) = E[(X - E[X])(Y - E[Y])]
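The definition above is equivalent to the shortcut form Cov(X, Y) = E[XY] - E[X]E[Y] quoted at the top of the page. A short expansion shows why; it uses only linearity of expectation and the fact that E[X] and E[Y] are constants:

```latex
\begin{aligned}
\operatorname{Cov}(X, Y) &= E[(X - E[X])(Y - E[Y])] \\
&= E[XY - X\,E[Y] - Y\,E[X] + E[X]E[Y]] \\
&= E[XY] - E[X]E[Y] - E[X]E[Y] + E[X]E[Y] \\
&= E[XY] - E[X]E[Y]
\end{aligned}
```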

Give the common sample estimator:

For a sample, divide by n-1 to obtain the usual unbiased estimator of covariance.

s_{xy} = \frac{1}{n - 1} \sum_{i=1}^{n} (x_{i} - \bar{x})(y_{i} - \bar{y})

Note: Correlation is the normalised form: r = \frac{s_{xy}}{s_{x} s_{y}}, where s_{x} and s_{y} are the sample standard deviations of x and y.
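As a minimal sketch of the sample estimator and the correlation note above, the following plain-Python functions compute s_xy with the n - 1 divisor and then normalise it to r. The function names and the data (hours studied and exam scores) are made up for illustration and do not come from the source.

```python
from math import sqrt


def sample_covariance(xs, ys):
    """Sample covariance s_xy using the n - 1 divisor."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def sample_correlation(xs, ys):
    """Pearson r = s_xy / (s_x * s_y)."""
    s_x = sqrt(sample_covariance(xs, xs))  # covariance of x with itself is its variance
    s_y = sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)


# Hypothetical paired observations
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

s_xy = sample_covariance(hours, scores)  # sum of products is 6, so 6 / 4 = 1.5
r = sample_correlation(hours, scores)
```

Here `s_xy` carries the units of hours times score, while `r` is dimensionless and always lies between -1 and 1.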

Result

s_{xy} = \frac{1}{n - 1} \sum_{i=1}^{n} (x_{i} - \bar{x})(y_{i} - \bar{y})

Source: AQA A-Level Mathematics — Statistics (Bivariate Data)

Free formulas

Rearrangements

Solve for Cov(X, Y)

Covariance

Cov(X, Y) = E[XY] - μ_{x} μ_{y}

This rearrangement is a notational substitution in the standard covariance formula: the expected values E[X] and E[Y] are replaced by the corresponding mean symbols μ_{x} and μ_{y}, giving an alternative form of the same expression.

Difficulty: 2/5


Visual intuition

Graph


The graph of covariance plotted against an independent variable does not follow a single fixed shape, as the result depends on the specific joint distribution of the data. Because the formula represents a constant scalar value for a given set of variables, the plot typically appears as a horizontal line or a single point for any specific dataset.

Graph type: constant

Why it behaves this way

Intuition

Imagine a scatter plot of data points (X, Y); covariance describes the overall direction of the linear trend of these points relative to the center defined by the means of X and Y. Its magnitude also depends on the scale of X and Y, so on its own it is not a standardized measure of strength.

Cov(X,Y)

A numerical measure indicating the extent and direction to which two random variables, X and Y, change together.

A positive value means X and Y tend to increase or decrease together; a negative value means one tends to increase as the other decreases.

E[XY]

The expected value or average of the product of corresponding outcomes of X and Y.

Represents the overall tendency of X and Y to co-occur in certain ranges, without adjusting for their individual means.

E[X]

The expected value or mean of the random variable X.

The central tendency or average outcome for variable X.

E[Y]

The expected value or mean of the random variable Y.

The central tendency or average outcome for variable Y.
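To make the symbols above concrete, here is a small sketch that computes E[X], E[Y] and E[XY] from a discrete joint distribution and checks that the shortcut form agrees with the definition in terms of deviations from the means. The joint probabilities and helper names are hypothetical, chosen only so the arithmetic is exact with fractions.

```python
from fractions import Fraction as F

# Hypothetical joint distribution: maps (x, y) to its probability
joint_pmf = {
    (0, 0): F(2, 5),
    (0, 1): F(1, 10),
    (1, 0): F(1, 10),
    (1, 1): F(2, 5),
}


def expectation(pmf, g):
    """E[g(X, Y)] for a discrete joint pmf."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())


e_x = expectation(joint_pmf, lambda x, y: x)        # E[X]
e_y = expectation(joint_pmf, lambda x, y: y)        # E[Y]
e_xy = expectation(joint_pmf, lambda x, y: x * y)   # E[XY]

# Shortcut form: E[XY] - E[X]E[Y]
cov_shortcut = e_xy - e_x * e_y

# Definition: expected product of deviations from the means
cov_definition = expectation(joint_pmf, lambda x, y: (x - e_x) * (y - e_y))
```

With these probabilities both expressions evaluate to 3/20, a positive value, because most of the probability sits on outcomes where X and Y are both low or both high.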

Signs and relationships

- E[X]E[Y]: This term is subtracted to isolate the *joint variability* of X and Y from the product of their individual average behaviors. If X and Y are independent, their joint behavior is simply the product of their individual means, E[XY] = E[X]E[Y], so the covariance is zero.

Free study cues

Insight

Canonical usage

The unit of covariance is the product of the units of the two random variables being analyzed.

Common confusion

A common mistake is to confuse covariance with the Pearson correlation coefficient, which is a standardized, dimensionless measure of linear relationship. Covariance retains the units of the product of the variables.

Unit systems

$X$ Depends on the quantity X represents · The unit of the random variable X.

$Y$ Depends on the quantity Y represents · The unit of the random variable Y.

$Cov(X, Y)$ Unit(X) * Unit(Y) · The unit of covariance is the product of the units of the two variables X and Y. For example, if X is in meters and Y is in kilograms, Cov(X,Y) will be in meter-kilograms.
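The unit dependence described above can be checked numerically. In this sketch, with made-up height and mass data and illustrative function names, converting heights from meters to centimeters multiplies the covariance by 100, while the correlation, being dimensionless, stays the same.

```python
from math import isclose, sqrt


def sample_covariance(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def sample_correlation(xs, ys):
    """Pearson r = s_xy / (s_x * s_y)."""
    s_x = sqrt(sample_covariance(xs, xs))
    s_y = sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)


# Hypothetical measurements
heights_m = [1.5, 1.6, 1.7, 1.8]
masses_kg = [50.0, 60.0, 65.0, 80.0]
heights_cm = [100 * h for h in heights_m]  # same data, different unit

cov_m_kg = sample_covariance(heights_m, masses_kg)    # units: meter-kilograms
cov_cm_kg = sample_covariance(heights_cm, masses_kg)  # units: centimeter-kilograms

# Rescaling X by 100 rescales the covariance by 100 (up to floating-point error)
covariance_scales = isclose(cov_cm_kg, 100 * cov_m_kg)

# The correlation does not change under the unit conversion
correlation_unchanged = isclose(
    sample_correlation(heights_m, masses_kg),
    sample_correlation(heights_cm, masses_kg),
)
```

This is why two covariances computed in different units cannot be compared directly, whereas two correlations can.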

One free problem

Practice Problem

A financial analyst determines that the expected value of the product of the returns of two stocks (X and Y) is 45. If the average return of stock X is 5 and the average return of stock Y is 8, find the covariance of the two returns.

Mean Product: 45

Mean X: 5

Mean Y: 8

Solve for: Cov(X, Y)

Hint: Subtract the product of the means from the expected product.
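Following the hint, one way to set out the substitution, using only the values given in the problem, is:

```latex
\operatorname{Cov}(X, Y) = E[XY] - \mu_{x}\,\mu_{y} = 45 - (5)(8) = 45 - 40 = 5
```

The positive result suggests that the two returns tend to move in the same direction.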


Where it shows up

Real-World Context

Comparing study time and exam score trends.

Study smarter

Tips

Covariance depends on the scale of the variables, making direct comparisons difficult.
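If X and Y are independent, their covariance is zero. The converse does not hold in general: zero covariance rules out a linear relationship, not every form of dependence.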
The formula Cov(X, Y) = E[XY] - E[X]E[Y] is known as the shortcut or computational formula.
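The independence tip runs in one direction only, and a standard textbook-style example makes that visible. In the sketch below (the distribution is an illustrative assumption, not taken from the source), X is uniform on {-1, 0, 1} and Y = X², so Y is completely determined by X, yet the shortcut formula gives a covariance of exactly zero.

```python
from fractions import Fraction as F

# X is uniform on {-1, 0, 1}; Y = X**2 is fully determined by X
pmf_x = {-1: F(1, 3), 0: F(1, 3), 1: F(1, 3)}

e_x = sum(p * x for x, p in pmf_x.items())           # E[X]
e_y = sum(p * x**2 for x, p in pmf_x.items())        # E[Y] = E[X^2]
e_xy = sum(p * x * x**2 for x, p in pmf_x.items())   # E[XY] = E[X^3]

# Shortcut formula: Cov(X, Y) = E[XY] - E[X]E[Y]
cov = e_xy - e_x * e_y

# Dependence check: compare P(X = 1, Y = 1) with P(X = 1) * P(Y = 1)
p_joint = pmf_x[1]                                # X = 1 forces Y = 1
p_product = pmf_x[1] * (pmf_x[-1] + pmf_x[1])     # P(Y = 1) = P(X = -1) + P(X = 1)
```

The covariance is zero because the relationship is symmetric rather than linear, while `p_joint` (1/3) differs from `p_product` (2/9), which confirms that X and Y are not independent.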

Avoid these traps

Common Mistakes

Mixing up means for X and Y.
Interpreting covariance as correlation.


References

Sources

Wikipedia: Covariance
Sheldon M. Ross, A First Course in Probability
Walpole, Myers, Ye, and Shafer, Probability and Statistics for Engineers and Scientists, 9th Edition
IUPAC Gold Book: Covariance (C01373)
AQA A-Level Mathematics — Statistics (Bivariate Data)


Related Formulas

Variance (Expectation)
