Logistic Regression (Log-Odds)
Calculates the log-odds of an event occurring based on a linear combination of predictor variables.
This public page keeps the free explanation visible and leaves premium worked solving, advanced walkthroughs, and saved study tools inside the app.
Core idea
Overview
Logistic regression is a statistical model used to predict the probability of a binary outcome (e.g., yes/no, pass/fail) based on one or more predictor variables. Unlike linear regression, it models the log-odds (logit) of the outcome, which ensures that the predicted probabilities fall between 0 and 1. This transformation allows for the use of linear models for classification problems, making it a powerful tool in fields like psychology for predicting categorical behaviors or states.
When to use: Apply this formula when you need to model the relationship between a set of independent variables and a dichotomous (binary) dependent variable. It's used to predict the probability of an event occurring, such as predicting whether a student will pass an exam or if a patient will respond to a treatment.
Why it matters: Logistic regression is fundamental for understanding and predicting categorical outcomes in many scientific and practical domains. In psychology, it helps researchers identify factors influencing decisions, diagnoses, or behavioral choices. Its ability to provide probabilities makes it invaluable for risk assessment, intervention planning, and developing predictive models for real-world applications.
Symbols
Variables
p = Probability of Event, β₀ = Intercept, β₁ = Coefficient for x₁, x₁ = Predictor Variable 1, βₖ = Coefficient for xₖ, xₖ = Predictor Variable k
Walkthrough
Derivation
Formula: Logistic Regression (Log-Odds)
ln(p / (1 − p)) = β₀ + β₁x₁ + β₂x₂ + ... + βₖxₖ
This formula expresses the log-odds of a binary outcome as a linear combination of predictor variables.
- The dependent variable is binary (dichotomous).
- Observations are independent.
- There is a linear relationship between the independent variables and the log-odds of the dependent variable.
- No multicollinearity among independent variables.
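Under these assumptions, the right-hand side is simply a weighted sum. A minimal Python sketch of both sides of the equation (the coefficient and predictor values are illustrative, not from any fitted model):

```python
import math

def log_odds(p):
    """Logit: ln(p / (1 - p)) for a probability 0 < p < 1."""
    return math.log(p / (1 - p))

def linear_predictor(beta0, betas, xs):
    """Right-hand side of the model: beta0 + beta1*x1 + ... + betak*xk."""
    return beta0 + sum(b * x for b, x in zip(betas, xs))

# The model asserts that log_odds(p) equals the linear predictor.
# Illustrative (assumed) coefficients and predictor values:
eta = linear_predictor(-2.0, [0.5, 1.2], [3.0, 0.5])  # -2.0 + 1.5 + 0.6 = 0.1
```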
Define the Probability and Odds:
Let `p` be the probability that the dependent variable `Y` equals 1 (the event of interest). The probability of `Y` equaling 0 is then `1-p`.
Step
The odds of an event occurring are defined as the ratio of the probability of the event occurring to the probability of it not occurring: odds = p / (1 − p).
Introduce the Logit Transformation:
To transform the odds, which range from 0 to infinity, into a variable that ranges from negative infinity to positive infinity, we take the natural logarithm: logit(p) = ln(p / (1 − p)). This is known as the logit transformation.
Model the Logit as a Linear Function:
The core idea of logistic regression is to model this logit (log-odds) as a linear combination of the independent variables (`x_i`) and their respective coefficients (`β_i`), plus an intercept (`β₀`). This linear predictor can then be transformed back to a probability.
Note: The coefficients `β_i` represent the change in the log-odds for a one-unit increase in `x_i`, holding other variables constant.
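This constant-shift interpretation can be checked numerically; a small sketch with assumed coefficient values:

```python
beta0, beta1 = -1.0, 0.4  # assumed illustrative values

def eta(x):
    # Linear predictor for a single-predictor model
    return beta0 + beta1 * x

# A one-unit increase in x shifts the log-odds by exactly beta1,
# regardless of the starting value of x:
delta_low = eta(3) - eta(2)    # 0.4
delta_high = eta(10) - eta(9)  # 0.4
```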
Result
Source: Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE Publications. Chapter 19: Logistic Regression.
Free formulas
Rearrangements
Solve for p
Logistic Regression: Make p the subject
To make `p` (Probability of Event) the subject, convert the log-odds back to odds, then solve for `p`: p = 1 / (1 + e^(−L)), where L is the log-odds.
Difficulty: 3/5
Solve for β₀
Logistic Regression: Make β₀ the subject
To make `β₀` (Intercept) the subject, subtract the sum of all other predictor terms from the log-odds: β₀ = ln(p / (1 − p)) − (β₁x₁ + ... + βₖxₖ).
Difficulty: 1/5
Solve for β₁
Logistic Regression: Make β₁ the subject
To make `β₁` (Coefficient for x₁) the subject, isolate the term containing `β₁` and then divide by `x₁`.
Difficulty: 2/5
Solve for x₁
Logistic Regression: Make x₁ the subject
To make `x₁` (Predictor Variable 1) the subject, isolate the term containing `x₁` and then divide by `β₁`.
Difficulty: 2/5
Solve for βₖ
Logistic Regression: Make βₖ the subject
To make `βₖ` (Coefficient for xₖ) the subject, isolate the term containing `βₖ` and then divide by `xₖ`.
Difficulty: 2/5
Solve for xₖ
Logistic Regression: Make xₖ the subject
To make `xₖ` (Predictor Variable k) the subject, isolate the term containing `xₖ` and then divide by `βₖ`.
Difficulty: 2/5
The static page shows the finished rearrangements. The app keeps the full worked algebra walkthrough.
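Two of the rearrangements above translate directly into code; a sketch of the `p` and `β₁` cases (single-predictor form, names assumed):

```python
import math

def p_from_log_odds(L):
    # Make p the subject: p = 1 / (1 + e^(-L))
    return 1 / (1 + math.exp(-L))

def beta1_from_log_odds(L, beta0, x1):
    # Single-predictor case: beta1 = (L - beta0) / x1
    return (L - beta0) / x1
```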
Visual intuition
Graph
Graph unavailable for this formula.
On the log-odds scale, the graph is a straight line whose slope is the coefficient β₁: every one-unit increase in x₁ shifts the predicted log-odds by the same amount, regardless of the starting value. For a psychology student, this means larger values of x₁ consistently push the predicted log-odds in one direction, indicating a steadily increasing (or decreasing) likelihood of the event occurring.
Graph type: linear
Why it behaves this way
Intuition
Imagine an S-shaped curve (sigmoid function) that maps a linear combination of predictor variables to a probability between 0 and 1, representing the likelihood of a binary outcome.
Signs and relationships
- \ln(...): The natural logarithm maps the odds (which range from 0 to infinity) onto the real line (negative infinity to positive infinity), making the left-hand side suitable for modeling with a linear function.
- \beta_i: A positive `β_i` indicates that as `x_i` increases, the log-odds of the event occurring increase (and thus the probability p increases). A negative `β_i` indicates the opposite effect.
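The sign behaviour can be verified numerically; a sketch with assumed coefficients:

```python
import math

def prob(beta0, beta1, x):
    # Probability via the inverse logit of the linear predictor
    return 1 / (1 + math.exp(-(beta0 + beta1 * x)))

# Positive coefficient: probability rises as x increases
rising = prob(0.0, 0.8, 2.0) > prob(0.0, 0.8, 1.0)     # True
# Negative coefficient: probability falls as x increases
falling = prob(0.0, -0.8, 2.0) < prob(0.0, -0.8, 1.0)  # True
```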
Free study cues
Insight
Canonical usage
The equation is used to model the log-odds (logit) of a binary outcome as a linear function of one or more predictors, where the resulting logit is a dimensionless quantity.
Common confusion
Mistaking the regression coefficient (beta) for a direct change in probability (p) rather than a change in the log-odds.
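To see why a coefficient is not a probability change, compare the same one-unit step at two different baselines; a sketch with an assumed β of 0.8:

```python
import math

def prob(eta):
    # Inverse logit
    return 1 / (1 + math.exp(-eta))

beta = 0.8  # assumed coefficient

# The same one-unit step in x changes p by different amounts
# depending on where you start on the curve:
change_near_middle = prob(0.0 + beta) - prob(0.0)  # ~0.19
change_in_tail = prob(4.0 + beta) - prob(4.0)      # ~0.01
```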
Dimension note
The logit function transforms a probability ratio into a value on the real number line. Because it is the natural logarithm of a ratio of two probabilities (the odds), the left-hand side of the equation is strictly dimensionless.
One free problem
Practice Problem
A logistic regression model predicts the likelihood of a student passing an exam. The intercept (β₀) is -2.5. For every hour of study (x₁), the coefficient (β₁) is 0.8. If a student studies for 4 hours, what is the log-odds of them passing the exam?
Solve for: the log-odds, ln(p / (1 − p))
Hint: Calculate the linear predictor (right side of the equation).
The full worked solution stays in the interactive walkthrough.
Where it shows up
Real-World Context
Predicting the likelihood of a patient developing a specific mental health condition based on demographic factors and symptom severity scores.
Study smarter
Tips
- The left side, `ln(p/(1-p))`, is called the log-odds or logit function.
- The right side, `β₀ + β₁x₁ + ... + βₖxₖ`, is the linear predictor, often denoted as `L` or `η`.
- To convert log-odds back to probability `p`, use the inverse logit function: `p = 1 / (1 + e^(-L))`.
- Interpret `β` coefficients as the change in log-odds for a one-unit increase in the corresponding `x` variable, holding others constant.
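The tips above combine into a quick check that exp(β) acts as an odds ratio; the values below are assumed for illustration:

```python
import math

beta1 = 0.8  # assumed coefficient

# exp(beta1) is the odds ratio: a one-unit increase in x1
# multiplies the odds p/(1-p) by this factor.
odds_ratio = math.exp(beta1)

def odds(p):
    return p / (1 - p)

# Verify with the inverse logit: start at log-odds 0.3, then add beta1.
p0 = 1 / (1 + math.exp(-0.3))
p1 = 1 / (1 + math.exp(-(0.3 + beta1)))
ratio = odds(p1) / odds(p0)  # equals exp(beta1)
```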
Avoid these traps
Common Mistakes
- Interpreting `β` coefficients directly as changes in probability, rather than changes in log-odds.
- Assuming a linear relationship between predictors and the probability itself, instead of the log-odds.
- Not checking for multicollinearity among predictor variables, which can inflate standard errors of coefficients.
References
Sources
- Wikipedia: Logistic regression
- Hosmer, D. W., Lemeshow, S., & Sturdivant, R. X. (2013). Applied Logistic Regression (3rd ed.). Wiley.
- James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R. Springer.
- Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE Publications.
- Howell, D. C. Statistical Methods for Psychology.