Linear regression

Predicted value from a linear model.

Core idea

Overview

Linear regression is a fundamental statistical method used to model the relationship between a scalar dependent variable and one independent variable. It represents the best-fit line through a set of data points by minimizing the sum of the squared differences between observed and predicted values.

When to use: Use this model when you want to predict a continuous numerical value from a single input variable and a scatter plot shows a roughly linear trend. It is appropriate when the rate of change of y with x is roughly constant across the range of the data. For inference (confidence intervals, hypothesis tests), the residuals (errors) are also assumed to be approximately normally distributed.

Why it matters: It serves as the bedrock for predictive analytics in finance, science, and social research. By quantifying the strength of relationships, it allows organizations to forecast future trends, such as sales growth or disease progression, based on historical inputs.

Symbols

Variables

β₀ = Intercept (variable)
β₁ = Slope (variable)
x = Input x (variable)
ŷ = Prediction (variable)

Walkthrough

Derivation

Formula: Simple Linear Regression

Simple linear regression models the relationship between x and y with a best-fit line ŷ = β₀ + β₁x, typically fitted by least squares.

  • The relationship is approximately linear.
  • Residuals are independent with roughly constant variance (homoscedasticity).
  • For inference, residuals are often assumed approximately normal.

1. State the model (prediction line):

ŷ = β₀ + β₁x

β₀ is the intercept and β₁ is the slope; the model predicts ŷ from x.

2. Define the least-squares objective:

Choose β₀ and β₁ to minimise the sum of squared residuals (ordinary least squares):

SSE(β₀, β₁) = Σ (yᵢ - β₀ - β₁xᵢ)², summed over the data points i = 1, …, n

Note: This criterion leads to closed-form solutions for β₀ and β₁ in simple regression.

Result

ŷ = β₀ + β₁x

Source: Edexcel A-Level Mathematics — Statistics (Regression)
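
The note in step 2 says the least-squares criterion has closed-form solutions. As a minimal sketch, assuming the standard textbook results β₁ = Sxy / Sxx and β₀ = ȳ - β₁x̄ (where Sxy and Sxx are the sums of cross-products and squared deviations about the means), the fit could be computed as below. The function name fit_simple_ols and the data are hypothetical, chosen only for illustration.

```python
from statistics import mean


def fit_simple_ols(xs, ys):
    """Return (beta0, beta1) minimising the sum of squared residuals."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-products and sum of squared deviations about the means
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    beta1 = s_xy / s_xx              # slope
    beta0 = y_bar - beta1 * x_bar    # intercept: line passes through (x_bar, y_bar)
    return beta0, beta1


# Hypothetical data lying exactly on the line y = 1 + 2x
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
beta0, beta1 = fit_simple_ols(xs, ys)
print(beta0, beta1)
```

Because the sample points lie exactly on a line, the fitted intercept and slope recover that line; with noisy data the same formulas give the line with the smallest sum of squared residuals.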

Free formulas

Rearrangements

Solve for ŷ

Make ŷ the subject

ŷ is already the subject of the formula: ŷ = β₀ + β₁x.

Difficulty: 1/5

Solve for β₀

Make β₀ the subject

To make β₀ (intercept) the subject of the linear regression equation ŷ = β₀ + β₁x, subtract β₁x from both sides: β₀ = ŷ - β₁x.

Difficulty: 2/5

Solve for β₁

Make the Slope (β₁) the Subject in Linear Regression

Rearrange the linear regression equation to isolate the slope (β₁): subtract β₀ from both sides, then divide by x (which requires x ≠ 0), giving β₁ = (ŷ - β₀) / x.

Difficulty: 2/5
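
The three rearrangements of ŷ = β₀ + β₁x can be checked numerically. The sketch below is illustrative only: the function names and the sample values are hypothetical, and solve_slope assumes x is non-zero because the rearrangement divides by x.

```python
def predict(beta0, beta1, x):
    """y_hat = beta0 + beta1 * x"""
    return beta0 + beta1 * x


def solve_intercept(y_hat, beta1, x):
    """beta0 = y_hat - beta1 * x"""
    return y_hat - beta1 * x


def solve_slope(y_hat, beta0, x):
    """beta1 = (y_hat - beta0) / x, defined only for x != 0"""
    if x == 0:
        raise ZeroDivisionError("the slope cannot be recovered when x is 0")
    return (y_hat - beta0) / x


# Hypothetical values: intercept 2, slope 3, input 4
y_hat = predict(2, 3, 4)
recovered_intercept = solve_intercept(y_hat, 3, 4)
recovered_slope = solve_slope(y_hat, 2, 4)
print(y_hat, recovered_intercept, recovered_slope)
```

Each rearrangement undoes the forward calculation, so feeding the prediction back in recovers the coefficient that produced it.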

Visual intuition

Graph

The graph of ŷ against x is a straight line. The slope β₁ sets its steepness, and the intercept β₀ sets where it crosses the vertical axis (at x = 0). Each one-unit increase in x changes the prediction by β₁. For a student of Data and Computing, the key point is that the slope is multiplied by x: a small change in β₁ barely moves predictions near x = 0 but moves them a lot when x is large in magnitude. If ŷ and β₀ are both held fixed, the product β₁x is fixed, so a larger x would have to be matched by a proportionally smaller slope; this is a property of the rearranged equation, not a relationship that appears in fitted data.

Graph type: linear

Why it behaves this way

Intuition

Imagine a straight line drawn through a scatter plot of data points, positioned to minimize the total squared vertical distances from each point to the line.
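
To make the "total squared vertical distance" idea concrete, the sketch below compares the sum of squared residuals for three candidate lines on a small made-up data set. The data, the function name sse, and the two alternative lines are hypothetical; the values 2.2 and 0.6 are the least-squares intercept and slope for this particular data, worked out by hand.

```python
def sse(beta0, beta1, xs, ys):
    """Sum of squared vertical distances from each point to the line."""
    return sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical scatter of five points
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

best = sse(2.2, 0.6, xs, ys)      # least-squares line for this data
shifted = sse(2.0, 0.6, xs, ys)   # same slope, lower intercept
steeper = sse(2.2, 0.7, xs, ys)   # same intercept, steeper slope

print(best, shifted, steeper)
```

Moving the line up, down, or tilting it away from the least-squares position increases the total, which is what "best fit" means here.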

ŷ · Predicted value of the dependent variable
This is the output value the model estimates for a given input, representing a point on the fitted regression line.
β₀ · Y-intercept of the regression line
It represents the predicted value of the dependent variable when the independent variable (x) is zero.
β₁ · Slope of the regression line
It quantifies how much the predicted dependent variable (ŷ) changes for every one-unit increase in the independent variable (x).
x · Value of the independent variable
This is the input value used to make a prediction about the dependent variable.

Signs and relationships

  • β₁: The sign of β₁ indicates the direction of the linear relationship: a positive β₁ means ŷ increases as x increases, while a negative β₁ means ŷ decreases as x increases.

Free study cues

Insight

Canonical usage

The predicted value (ŷ) must have the same units as the observed dependent variable (y). Consequently, the intercept (β₀) must share these units, and the slope (β₁) must have units of y per unit of x.

Common confusion

A common mistake is misinterpreting the units of the intercept (β0) or slope (β1) coefficients, especially when x or y are dimensionless, percentages, or have non-obvious units.

Unit systems

y · Context-dependent (e.g., USD, score, kg) · The dependent variable, whose value is being predicted.
x · Context-dependent (e.g., hours, age, m) · The independent variable, used to make predictions.
ŷ · Same as y · The predicted value of the dependent variable.
β₀ · Same as y · The intercept, representing the predicted value of y when x is zero.
β₁ · units(y)/units(x) · The slope coefficient, representing the change in y for a one-unit increase in x.
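
Because the slope carries units of y per unit of x, changing the units of x changes the slope but not the intercept. The sketch below illustrates this with hypothetical numbers (a score model with x first in hours, then in minutes); the function name rescale_slope is an assumption made for the example.

```python
import math


def rescale_slope(beta1, factor):
    """Slope after replacing x with x_new = factor * x (e.g. hours to minutes: factor 60)."""
    return beta1 / factor


# Hypothetical model: 40 points baseline, 5 points per hour of study
beta0 = 40.0
beta1_per_hour = 5.0
beta1_per_minute = rescale_slope(beta1_per_hour, 60)

hours = 3.0
pred_from_hours = beta0 + beta1_per_hour * hours
pred_from_minutes = beta0 + beta1_per_minute * (hours * 60)

# The trap: using the per-hour slope with x measured in minutes
wrong_prediction = beta0 + beta1_per_hour * (hours * 60)

print(math.isclose(pred_from_hours, pred_from_minutes), wrong_prediction)
```

The two correctly converted predictions agree, while mixing a per-hour slope with minutes inflates the slope term by a factor of 60.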

One free problem

Practice Problem

A retail analyst finds that a store's daily revenue follows an equation where the base revenue is 500 dollars and every additional customer adds 15 dollars. If there are 40 customers today, what is the predicted revenue?

Intercept: 500
Slope: 15
Input x: 40

Solve for: ŷ

Hint: Multiply the slope by the number of customers and add the intercept.
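
The hint can be checked with a few lines of code. This is a minimal sketch of the substitution only, using the values stated in the problem; the variable names are chosen for illustration.

```python
# Values taken from the problem statement
intercept = 500      # base revenue in dollars
slope = 15           # dollars added per customer
customers = 40       # input x

# y_hat = intercept + slope * x
predicted_revenue = intercept + slope * customers
print(predicted_revenue)  # 1100
```

The slope term contributes 15 × 40 = 600 dollars on top of the 500-dollar base.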

Where it shows up

Real-World Context

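When predicting a test score from study hours, linear regression calculates the prediction ŷ from the intercept β₀, the slope β₁, and the input x (hours studied). The prediction gives the expected score for a given amount of study; comparing it with the observed scores shows how much spread the line leaves unexplained, which should be weighed before drawing a conclusion from the data.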

Study smarter

Tips

  • Always check for a linear pattern using a scatter plot before applying this model.
  • Be cautious of outliers, as they can disproportionately pull the regression line away from the true trend.
  • Avoid extrapolation by only making predictions within the range of your observed x-values.
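
One way to enforce the extrapolation tip in code is to refuse predictions outside the observed x-range. The sketch below is one possible guard, not a standard library feature; the function name, the coefficients, and the observed range are all hypothetical.

```python
def predict_within_range(beta0, beta1, x, x_min, x_max):
    """Return beta0 + beta1 * x, but only for x inside the observed range."""
    if not (x_min <= x <= x_max):
        raise ValueError(
            f"x={x} lies outside the observed range [{x_min}, {x_max}]; "
            "predicting here would be extrapolation"
        )
    return beta0 + beta1 * x


# Hypothetical fitted line and observed x-range of 0 to 5
inside = predict_within_range(1.0, 2.0, 3.0, x_min=0.0, x_max=5.0)
print(inside)

try:
    predict_within_range(1.0, 2.0, 10.0, x_min=0.0, x_max=5.0)
except ValueError as err:
    print("refused:", err)
```

Raising an error is a deliberately strict choice; a softer variant could return the prediction together with a warning.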

Avoid these traps

Common Mistakes

  • Mixing up β₀ (the intercept) and β₁ (the slope).
  • Using x in the wrong units.

References

Sources

  1. Wikipedia: Linear regression
  2. James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R. Springer.
  3. Statistics by McClave, Benson, Sincich
  4. Wikipedia: Dimensional analysis
  5. Montgomery, D. C., Peck, E. A., & Vining, G. G. (2021). Introduction to Linear Regression Analysis (6th ed.). Wiley.
  6. Britannica: Linear regression
  7. Edexcel A-Level Mathematics: Statistics (Regression)