
Recall (Sensitivity)

Ability to find all positive instances.


This public page keeps the free explanation visible and leaves the premium worked solutions, advanced walkthroughs, and saved study tools inside the app.

Core idea

Overview

Recall, also known as sensitivity or the true positive rate, quantifies the ability of a classification model to identify all relevant instances within a dataset. It calculates the ratio of correctly predicted positive observations to the total number of actual positives, focusing on the cost of false negatives.

When to use: Use recall when the primary goal is to minimize false negatives, ensuring that as many positive cases as possible are captured. It is particularly critical in scenarios like medical diagnostics or emergency alert systems where missing a positive result carries a high risk.

Why it matters: High recall is essential in safety-critical applications because it ensures that fewer actual threats or diseases go undetected. In business, it helps in lead generation or fraud detection where capturing every potential opportunity or risk is prioritized over the inconvenience of false alarms.

Symbols

Variables

R = TP / (TP + FN)

R = Recall, TP = True Positives, FN = False Negatives

Walkthrough

Derivation

Understanding Recall (Sensitivity)

Recall is the fraction of actual positives that are correctly detected, measuring how well the classifier finds positive cases.

  • Binary classification setting.
  • Confusion-matrix counts TP and FN are available.

Step 1. Identify the needed confusion-matrix counts:

TP are correctly predicted positives; FN are actual positives missed by the model.

Step 2. Compute recall:

R = TP / (TP + FN)

Divide true positives by all actual positives. High recall means few missed positives.

Note: In medicine, high recall (sensitivity) is often prioritised to reduce missed diagnoses.
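As a minimal sketch of the two steps (plain Python; the function name and the zero-denominator guard are our own choices):

```python
def recall(tp: int, fn: int) -> float:
    """Recall (sensitivity): fraction of actual positives that were detected."""
    actual_positives = tp + fn  # every case that truly belongs to the positive class
    if actual_positives == 0:
        raise ValueError("recall is undefined when there are no actual positives")
    return tp / actual_positives

# A screening model that detects 90 of 100 diseased patients (10 missed):
print(recall(tp=90, fn=10))  # 0.9
```

The guard matters in practice: a dataset slice with no actual positive cases has no defined recall.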

Result

R = TP / (TP + FN)

Source: OCR A-Level Computer Science — Algorithms and Data

Free formulas

Rearrangements

Solve for R

Make R the subject: R = TP / (TP + FN)

Exact symbolic rearrangement generated deterministically for R.

Difficulty: 3/5

Solve for TP

Make TP the subject: TP = R × FN / (1 - R)

Exact symbolic rearrangement generated deterministically for TP.

Difficulty: 3/5

Solve for FN

Make FN the subject: FN = TP × (1 - R) / R

Exact symbolic rearrangement generated deterministically for FN.

Difficulty: 3/5
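The three rearrangements can be checked numerically. A small sketch, assuming the standard formula R = TP / (TP + FN) and using made-up counts:

```python
# Numerical check of the rearrangements with sample counts (values are illustrative).
tp, fn = 40, 10

r = tp / (tp + fn)           # R = TP / (TP + FN)
tp_back = r * fn / (1 - r)   # TP = R * FN / (1 - R)
fn_back = tp * (1 - r) / r   # FN = TP * (1 - R) / R

print(r)  # 0.8

# Each rearrangement recovers the original count (up to floating-point error):
assert abs(tp_back - tp) < 1e-9
assert abs(fn_back - fn) < 1e-9
```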

The static page shows the finished rearrangements. The app keeps the full worked algebra walkthrough.

Visual intuition

Graph

The graph of Recall (R) against the number of false negatives (FN), with TP held fixed, is a hyperbolic curve. As FN increases, recall decreases non-linearly, approaching zero asymptotically as the denominator TP + FN grows without bound.

Graph type: hyperbolic
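To see the shape concretely, a tiny sketch that holds TP fixed while FN grows (the counts are invented):

```python
# Recall as FN grows, with TP held fixed: hyperbolic decay toward 0.
TP = 50
for fn in (0, 50, 150, 450):
    r = TP / (TP + fn)
    print(fn, r)  # recall drops quickly at first, then flattens toward 0
```

Note the decay is not linear: going from FN = 0 to FN = 50 halves recall (1.0 to 0.5), while the next 100 false negatives only take it from 0.5 to 0.25.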

Why it behaves this way

Intuition

Visualize a set of items, some truly positive. Recall is the fraction of these truly positive items that a classifier successfully 'picked up' or identified, illustrating its coverage of the positive class.

  • TP: the number of positive instances correctly identified by the model. These are the 'hits': cases that were truly positive and that the model successfully detected.
  • FN: the number of positive instances incorrectly classified as negative. These are the 'misses': actual positive cases the model failed to detect.
  • TP + FN: the total number of actual positive instances in the dataset, whether detected or missed. This is the complete set of all cases that truly belong to the positive class.
  • Recall: the proportion of all actual positive instances correctly identified by the model, i.e. how good the model is at finding the relevant items so that as few true positives as possible are overlooked.

Signs and relationships

  • Denominator (TP + FN): The sum of true positives and false negatives represents all actual positive instances. Using this as the denominator normalizes the count of correctly identified positives, showing the proportion of all relevant items that the model found.

Free study cues

Insight

Canonical usage

Recall is used to calculate a dimensionless performance metric, representing a ratio of counts, typically reported as a decimal between 0 and 1 or as a percentage.

Common confusion

Students sometimes attempt to assign units to the individual components (TP, FN) or the final recall value, overlooking its nature as a dimensionless ratio of counts.

Dimension note

Recall is a ratio of counts (true positives to actual positives), making it a dimensionless quantity. It quantifies a proportion and does not carry physical units.

Unit systems

  • TP (count): the number of correctly identified positive instances (True Positives).
  • FN (count): the number of positive instances incorrectly identified as negative (False Negatives).

One free problem

Practice Problem

A diagnostic test for a rare disease correctly identified 85 patients with the condition. However, 15 patients who actually had the disease were incorrectly told they were healthy. Calculate the Recall (Sensitivity) of this test.

True Positives: 85
False Negatives: 15

Solve for: R (Recall)

Hint: Divide the correctly identified positives by the total number of actual positive cases, which is the sum of TP and FN.

The full worked solution stays in the interactive walkthrough.

Where it shows up

Real-World Context

Medical screening where missing cases is risky.

Study smarter

Tips

  • Remember that recall does not account for false positives; use it alongside precision for a balanced view.
  • In highly imbalanced datasets, recall is often a more informative metric than simple accuracy.
  • Increasing the classification threshold typically decreases recall while potentially increasing precision.
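The threshold effect in the last tip can be demonstrated with toy data (all scores and labels below are invented for illustration):

```python
# Invented scores from a hypothetical classifier; label 1 marks an actual positive.
scores = [0.95, 0.90, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1,    1,    0,    1,    1,    0,    0,    1]

def recall_at(threshold: float) -> float:
    """Recall when every score >= threshold is predicted positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    return tp / (tp + fn)

for t in (0.1, 0.5, 0.8):
    print(t, recall_at(t))  # recall falls from 1.0 to 0.6 to 0.4 as the threshold rises
```

Raising the threshold turns borderline true positives into false negatives, which is exactly why recall falls.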

Avoid these traps

Common Mistakes

  • Confusing recall with precision.
  • Using FP instead of FN in the denominator.

Common questions

Frequently Asked Questions

What does recall measure?

Recall is the fraction of actual positives that are correctly detected, measuring how well the classifier finds positive cases.

When should recall be used?

Use recall when the primary goal is to minimize false negatives, ensuring that as many positive cases as possible are captured. It is particularly critical in scenarios like medical diagnostics or emergency alert systems where missing a positive result carries a high risk.

Why does high recall matter?

High recall is essential in safety-critical applications because it ensures that fewer actual threats or diseases go undetected. In business, it helps in lead generation or fraud detection where capturing every potential opportunity or risk is prioritized over the inconvenience of false alarms.

What are common mistakes with recall?

Confusing recall with precision. Using FP instead of FN.

Where does recall show up in practice?

Medical screening where missing cases is risky.

Any study tips?

Remember that recall does not account for false positives; use it alongside precision for a balanced view. In highly imbalanced datasets, recall is often a more informative metric than simple accuracy. Increasing the classification threshold typically decreases recall while potentially increasing precision.

References

Sources

  1. Wikipedia: Precision and recall
  2. Wikipedia: Sensitivity and specificity
  3. James, G., Witten, D., Hastie, T., Tibshirani, R. An Introduction to Statistical Learning: with Applications in R. Springer, 2013.
  4. Hastie, T., Tibshirani, R., Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2009.
  5. OCR A-Level Computer Science — Algorithms and Data