
Binary Cross-Entropy Calculator

Loss function for binary classification.


Formula first

Overview

Binary cross-entropy measures the mismatch between two probability distributions, typically the true labels and the predicted probabilities in a binary classification task. The loss is the negative log of the probability the model assigns to the correct class, so it is close to zero for confident correct predictions and grows without bound as that probability approaches zero.

Symbols

Variables

L = Loss, y = Actual Label (0 or 1), p = Predicted Prob (the model's predicted probability that y = 1)
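
With these symbols, the standard per-example form of binary cross-entropy can be written as follows. This is the textbook expression, using the natural logarithm as in the hint for the practice problem:

```latex
L = -\bigl[\, y \ln(p) + (1 - y) \ln(1 - p) \,\bigr]
```

When y = 1 only the first term survives, and when y = 0 only the second term survives.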


Apply it well

When To Use

When to use: This equation is the standard loss function for binary classification problems where the output is a single probability between 0 and 1. It is usually paired with a sigmoid activation function in the final layer of a neural network, which keeps the output strictly between 0 and 1.
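
As a minimal sketch of that pairing, the snippet below passes a raw score through a sigmoid and feeds the result to the loss. The function names and the example logit of 2.0 are illustrative assumptions, not part of the calculator.

```python
import math

def sigmoid(z: float) -> float:
    """Map a raw score (logit) z to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def bce(y: int, p: float) -> float:
    """Binary cross-entropy for one example: -[y ln p + (1 - y) ln(1 - p)]."""
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

logit = 2.0                      # hypothetical raw output of the final layer
p = sigmoid(logit)               # predicted probability that y = 1
loss_if_positive = bce(1, p)     # loss when the true label is 1
loss_if_negative = bce(0, p)     # loss when the true label is 0

print(f"p = {p:.4f}")
print(f"loss if y = 1: {loss_if_positive:.4f}")
print(f"loss if y = 0: {loss_if_negative:.4f}")
```

This plain version is fine for moderate logits; for very large negative values `math.exp(-z)` can overflow, which is why libraries typically compute the loss directly from the logit.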

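Why it matters: For a linear model with a sigmoid output (logistic regression), the loss is smooth and convex in the weights. For deeper networks it is no longer convex, but it still gives smooth gradients that gradient descent can use to update model weights. Because confident but incorrect predictions are penalized heavily, the model is pushed to assign high probability to the correct class.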
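
To make the penalty concrete, the sketch below evaluates the loss for a true label y = 1 at several predicted probabilities. It also prints p - y, the well-known gradient of the loss with respect to the logit when p comes from a sigmoid. The helper name and the chosen probabilities are illustrative assumptions.

```python
import math

def bce_positive(p: float) -> float:
    """Loss for a true label y = 1, where the formula reduces to -ln(p)."""
    return -math.log(p)

# Predicted probabilities for the correct class, from confident-right to confident-wrong.
for p in (0.9, 0.5, 0.1, 0.01):
    # With a sigmoid output, the gradient of the loss w.r.t. the logit is p - y (here y = 1).
    grad_wrt_logit = p - 1.0
    print(f"p = {p:<4}  loss = {bce_positive(p):.4f}  dL/dz = {grad_wrt_logit:+.2f}")
```

The loss rises from about 0.105 at p = 0.9 to about 4.605 at p = 0.01, while the gradient with respect to the logit stays bounded between -1 and 0.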

Avoid these traps

Common Mistakes

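  • Using p=0 or p=1 directly: ln(0) is undefined, so clip p to a small interval such as [ε, 1-ε] before taking the log.
  • Forgetting the (1-y) term: without it, examples with y = 0 contribute nothing to the loss.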
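
Both traps can be avoided with a small helper like the sketch below. The name `safe_bce` and the clipping tolerance of 1e-12 are assumptions chosen for illustration; libraries use their own defaults.

```python
import math

EPS = 1e-12  # assumed clipping tolerance, not a standard value

def safe_bce(y: int, p: float, eps: float = EPS) -> float:
    """Binary cross-entropy with p clipped to [eps, 1 - eps] to avoid ln(0)."""
    p = min(max(p, eps), 1.0 - eps)
    # Keep both terms: the second one carries the loss when y = 0.
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

# p = 0 or p = 1 no longer raises an error; the loss is large but finite.
print(safe_bce(1, 0.0))
print(safe_bce(0, 1.0))

# A negative example (y = 0) is scored by the (1 - y) term: -ln(1 - 0.85).
print(f"{safe_bce(0, 0.85):.4f}")
```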

One free problem

Practice Problem

A machine learning model identifies a transaction as fraudulent (y = 1). The model's predicted probability of fraud is 0.85. Calculate the binary cross-entropy loss for this specific prediction.

Actual Label (0/1): 1
Predicted Prob: 0.85

Solve for: Loss (L)

Hint: When y = 1, the formula simplifies to L = -ln(p).
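
As a quick numerical check of the hint, the short sketch below evaluates -ln(p) for the values in this problem. The variable names are illustrative.

```python
import math

y = 1      # actual label: the transaction is fraudulent
p = 0.85   # predicted probability of fraud

# With y = 1 the (1 - y) term vanishes, leaving L = -ln(p).
loss = -math.log(p)
print(f"L = {loss:.4f}")  # prints "L = 0.1625"
```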

