Cross-Entropy (Bernoulli) Calculator
Cross-entropy between true Bernoulli(p) and model Bernoulli(q).
Overview
Cross-entropy for a Bernoulli distribution quantifies the mismatch between the true binary outcome probability p and the model's predicted probability q. It is the standard loss function in binary classification, penalizing the model according to how far its predicted distribution is from the target distribution.
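Written with natural logarithms (so the result is in nats), the cross-entropy between Bernoulli(p) and Bernoulli(q) is:

```latex
H(p, q) = -\bigl[\, p \ln q + (1 - p) \ln(1 - q) \,\bigr]
```

When p = 1 this reduces to -ln q, and when p = 0 it reduces to -ln(1 - q).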
Variables
- H(p, q) — cross-entropy between the true and model distributions (in nats)
- p — true probability of the positive outcome
- q — model's predicted probability of the positive outcome
When To Use
When to use: Apply this equation when evaluating binary classification models, where each example has exactly one of two mutually exclusive outcomes. It is the standard loss function for training logistic regression and neural-network classifiers with a sigmoid output.
Why it matters: Cross-entropy is preferable to mean squared error for classification because it produces much larger gradients when the model is confidently wrong, which typically speeds up convergence under gradient-based optimization such as gradient descent.
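One way to see the stronger-gradient claim is to differentiate each loss with respect to q:

```latex
\frac{\partial H(p, q)}{\partial q} = -\frac{p}{q} + \frac{1 - p}{1 - q}
\qquad \text{versus} \qquad
\frac{\partial}{\partial q}\,(q - p)^2 = 2\,(q - p)
```

For a positive example (p = 1) predicted with small q, the cross-entropy gradient has magnitude 1/q, which grows without bound as q approaches 0, while the squared-error gradient never exceeds 2 in magnitude.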
Common Mistakes
- Using percentages instead of probabilities (0.7 not 70).
- Taking the ln of 0 (q must be strictly between 0 and 1); in code, clip q away from the endpoints, as in the sketch below.
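A minimal Python sketch covering both points; the function name and the clipping constant eps are illustrative choices, not prescribed values:

```python
import numpy as np

def bernoulli_cross_entropy(p, q, eps=1e-12):
    """Cross-entropy H(p, q) in nats between Bernoulli(p) and Bernoulli(q).

    q is clipped to [eps, 1 - eps] so the logarithm never sees an exact 0 or 1.
    """
    q = np.clip(q, eps, 1.0 - eps)
    return -(p * np.log(q) + (1.0 - p) * np.log(1.0 - q))

# Probabilities, not percentages: pass 0.9, never 90.
print(bernoulli_cross_entropy(p=1.0, q=0.9))  # about 0.105 nats
```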
Practice Problem
A machine learning model predicts a 0.7 probability (q) that an image contains a cat. The actual image is indeed a cat (p = 1.0). Calculate the binary cross-entropy for this prediction in nats.
Solve for: H(p, q) in nats.
Hint: Since p = 1, the term (1-p) becomes zero, meaning you only need to calculate -ln(q).
The full worked solution stays in the interactive walkthrough.
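If you want to verify your own arithmetic afterward, a minimal sketch using only Python's standard math module:

```python
import math

# With p = 1, the (1 - p) term vanishes and H(1, q) reduces to -ln(q).
q = 0.7
print(f"H(1, {q}) = {-math.log(q):.3f} nats")
```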
Sources
- Wikipedia: Cross-entropy
- Cover, Thomas M., and Joy A. Thomas. Elements of Information Theory. 2nd ed. Wiley-Interscience, 2006.
- Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016.