Confusion matrix accuracy
Accuracy from true/false positives/negatives.
Core idea
Overview
Accuracy is the most intuitive performance measure for classification models, representing the ratio of correctly predicted observations to the total number of samples. It combines both true positive and true negative results to provide a broad assessment of how often the classifier is correct across all classes.
When to use: Accuracy is most useful when the target classes in the dataset are nearly balanced, meaning there is a similar number of samples for each label. It is appropriate when the costs of false positives and false negatives are roughly equal.
Why it matters: It allows stakeholders to quickly grasp the reliability of a system in general terms, such as an OCR engine or a simple sentiment analyzer. High accuracy indicates a model that performs well across the entire distribution, assuming the data is not skewed.
Symbols
Variables
TP = True Positives, TN = True Negatives, FP = False Positives, FN = False Negatives, acc = Accuracy
Walkthrough
Derivation
Confusion Matrix Accuracy
Accuracy measures the proportion of all predictions that were correct: acc = (TP + TN) / (TP + TN + FP + FN). It is reliable only when class distributions are balanced.
- The dataset's class distribution is reasonably balanced.
- False positives and false negatives carry similar costs.
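As a minimal sketch, the formula translates directly into code; the function name and signature below are my own choice, not part of any library:

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Proportion of correct predictions: (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total

print(accuracy(45, 50, 2, 3))  # 0.95
```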
Define the Confusion Matrix Terms
Every prediction falls into one of these four categories. TP and TN are correct; FP and FN are errors.
Sum Correct Predictions
The total number of predictions the classifier got right: TP + TN.
Sum All Predictions
Every sample in the evaluation set, regardless of outcome: TP + TN + FP + FN.
Calculate Accuracy
Dividing correct predictions by the total gives accuracy as a value between 0 and 1 (multiply by 100 for %).
Example
A spam filter with TP=45, TN=50, FP=2, FN=3 achieves 95% accuracy.
Note: Accuracy alone can be misleading on imbalanced datasets — consider also precision, recall, and F1-score.
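A short sketch of the imbalance pitfall: on a set that is 95% negative, a model that predicts "negative" for every sample scores 95% accuracy while catching no positives at all (the counts below are illustrative):

```python
# Imbalanced set: 95 negatives, 5 positives; model predicts "negative" for everything.
tp, tn, fp, fn = 0, 95, 0, 5

accuracy = (tp + tn) / (tp + tn + fp + fn)      # 0.95 -- looks excellent
recall = tp / (tp + fn) if (tp + fn) else 0.0   # 0.0  -- misses every positive

print(f"accuracy={accuracy:.2f}, recall={recall:.2f}")
# prints: accuracy=0.95, recall=0.00
```

This is why precision, recall, and F1-score should accompany accuracy whenever the classes are skewed.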
Result
Source: A-Level Data & Computing — Machine Learning
Free formulas
Rearrangements
Solve for acc
Make acc the subject
acc is already the subject of the formula.
Difficulty: 1/5
Solve for TP
Make TP the subject
Rearrange the confusion matrix accuracy formula to express True Positives (TP) as the subject in terms of Accuracy (acc), True Negatives (TN), False Positives (FP), and False Negatives (FN): TP = (acc(TN + FP + FN) - TN) / (1 - acc), valid when acc < 1.
Difficulty: 2/5
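Rearranging acc = (TP + TN) / (TP + TN + FP + FN) for TP gives TP = (acc(TN + FP + FN) - TN) / (1 - acc), valid when acc < 1. A quick numerical check, reusing the spam-filter counts from the worked example:

```python
# Known values from the spam-filter example: TN=50, FP=2, FN=3, acc=0.95.
acc, tn, fp, fn = 0.95, 50, 2, 3

tp = (acc * (tn + fp + fn) - tn) / (1 - acc)
print(tp)  # recovers TP = 45, up to floating-point rounding
```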
Visual intuition
Graph
Plotted against True Positives with the total number of samples N held constant, accuracy is a straight line: acc = (TP + TN) / N has slope 1/N and y-intercept TN/N. Small TP values correspond to a model struggling to identify positive cases, while large TP values indicate a model successfully capturing more of the target class. The y-intercept TN/N is the accuracy the model would achieve from true negatives alone, before identifying any positives.
Graph type: linear
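The linear behaviour is easy to verify numerically: with the total sample count N fixed, acc = (TP + TN) / N grows by exactly 1/N per extra true positive. A minimal sketch (the counts are illustrative):

```python
# Total samples N fixed at 100 with TN = 50; FP + FN absorb the remainder,
# so accuracy rises linearly in TP with slope 1/N = 0.01.
N, tn = 100, 50

for tp in (0, 10, 20, 30):
    acc = (tp + tn) / N
    print(tp, acc)
# prints: 0 0.5 / 10 0.6 / 20 0.7 / 30 0.8 -- each extra 10 TP adds 0.1
```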
Why it behaves this way
Intuition
Imagine a 2x2 grid (the confusion matrix) where actual outcomes are rows and predicted outcomes are columns; accuracy is the sum of the diagonal elements (correct predictions) divided by the sum of all four cells.
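That diagonal reading can be sketched in a few lines of Python; the [[TN, FP], [FN, TP]] layout below is one common convention (actuals as rows, predictions as columns), and the counts reuse the spam-filter example:

```python
# Confusion matrix:    predicted neg   predicted pos
# actual neg        [[      TN,             FP      ],
# actual pos         [      FN,             TP      ]]
matrix = [[50, 2],
          [3, 45]]

correct = matrix[0][0] + matrix[1][1]    # sum of the diagonal
total = sum(sum(row) for row in matrix)  # all four cells
print(correct / total)  # 0.95
```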
Free study cues
Insight
Canonical usage
Accuracy is a dimensionless ratio representing the proportion of correctly classified instances out of the total number of instances, typically expressed as a decimal or percentage.
Common confusion
A common confusion is failing to understand that accuracy, while a numerical value, does not have physical units. It is a proportion and should be interpreted as such, often converted to a percentage for clarity and ease of interpretation.
Dimension note
Accuracy is a dimensionless quantity because it is a ratio of counts (number of correct predictions to total predictions). It represents a proportion and therefore has no physical units.
One free problem
Practice Problem
An email spam filter processed 100 messages. It correctly identified 45 as spam and 50 as legitimate. However, it mistakenly marked 2 legitimate emails as spam and failed to catch 3 spam messages. Calculate the accuracy of the filter.
Solve for: acc
Hint: Accuracy is the sum of correct predictions (TP and TN) divided by the total number of samples.
Where it shows up
Real-World Context
Reporting overall model correctness on a test set.
Study smarter
Tips
- Always verify class distribution before reporting accuracy as the primary metric.
- Incorporate the confusion matrix to see where the specific errors are occurring.
- Use accuracy as a baseline to compare different model architectures on the same dataset.
Avoid these traps
Common Mistakes
- Ignoring class imbalance, which can make a trivial majority-class predictor look accurate.
- Computing accuracy from TP alone instead of from TP + TN.
Common questions
Frequently Asked Questions
What does accuracy measure?
Accuracy measures the proportion of all predictions that were correct: acc = (TP + TN) / (TP + TN + FP + FN). It is reliable only when class distributions are balanced.
When should accuracy be used?
Accuracy is most useful when the target classes in the dataset are nearly balanced, meaning there is a similar number of samples for each label. It is appropriate when the costs of false positives and false negatives are roughly equal.
Why does accuracy matter?
It allows stakeholders to quickly grasp the reliability of a system in general terms, such as an OCR engine or a simple sentiment analyzer. High accuracy indicates a model that performs well across the entire distribution, assuming the data is not skewed.
What are common mistakes?
Ignoring class imbalance, and computing accuracy from TP alone.
Where does accuracy show up in practice?
Reporting overall model correctness on a test set.
How should I study this?
Always verify class distribution before reporting accuracy as the primary metric. Incorporate the confusion matrix to see where the specific errors are occurring. Use accuracy as a baseline to compare different model architectures on the same dataset.
References
Sources
- Wikipedia: Confusion matrix
- James, G., Witten, D., Hastie, T., & Tibshirani, R. An Introduction to Statistical Learning. Springer.
- Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed.). Springer.
- Géron, A. (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (2nd ed.). O'Reilly Media.
- A-Level Data & Computing — Machine Learning