Cohen's Kappa (Agreement)
Measure of inter-rater agreement that accounts for chance.
This public page keeps the free explanation visible and leaves the premium worked solutions, advanced walkthroughs, and saved study tools inside the app.
Core idea
Overview
Cohen's Kappa is a statistical measure used to assess the reliability of agreement between two raters who classify items into mutually exclusive categories. Unlike simple percent agreement, it accounts for the agreement that would occur by random chance, providing a more conservative and accurate estimate of inter-rater consistency.
When to use: This equation is used when two independent observers are categorizing data and you need to ensure their observations are reliable. It is specifically designed for nominal or categorical data rather than ordinal or continuous scales.
Why it matters: In fields like clinical psychology, high inter-rater reliability ensures that diagnoses are consistent regardless of the clinician. Without adjusting for chance, researchers might overestimate the validity of their observational data, leading to flawed conclusions.
Symbols
Variables
κ = Cohen's Kappa, p_o = Observed Agreement, p_e = Expected (Chance) Agreement
Walkthrough
Derivation
Formula: Cohen's Kappa
Measures inter-rater agreement between two observers for categorical data, correcting for the agreement expected purely by chance.
- Two raters categorise the same set of items independently.
- Categories are mutually exclusive and exhaustive.
Calculate observed agreement p_o and chance agreement p_e:
p_o is the proportion of items on which the raters agree. p_e is the expected proportion of agreement if both raters chose categories at random, in proportion to their observed category frequencies.
Apply the Kappa formula: κ = (p_o - p_e) / (1 - p_e)
Subtracting p_e in both the numerator and the denominator removes chance agreement. κ = 0 means agreement is no better than chance; κ = 1 means perfect agreement.
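To make both quantities concrete, here is a minimal Python sketch (an illustration, not part of the course material) that computes p_o, p_e, and κ from two hypothetical raters' category labels; the data and the helper name cohens_kappa are assumptions for this example.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two equally long lists of categorical labels."""
    assert len(rater_a) == len(rater_b) and len(rater_a) > 0
    n = len(rater_a)

    # Observed agreement p_o: proportion of items the raters label identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement p_e: for each category, the probability that both raters
    # pick it independently, using each rater's observed category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n) for c in set(rater_a) | set(rater_b))

    # Agreement beyond chance, scaled by the maximum possible beyond-chance
    # agreement. (Undefined if p_e == 1, i.e. no variation at all.)
    return (p_o - p_e) / (1 - p_e)

# Illustrative data: two raters coding ten clips as 'agg' (aggressive) or 'neu' (neutral).
a = ['agg', 'agg', 'neu', 'agg', 'neu', 'neu', 'agg', 'neu', 'agg', 'neu']
b = ['agg', 'agg', 'neu', 'neu', 'neu', 'neu', 'agg', 'agg', 'agg', 'neu']
print(round(cohens_kappa(a, b), 3))  # p_o = 0.8, p_e = 0.5, so kappa = 0.6
```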
Result
Source: GCSE Psychology — Research Methods
Free formulas
Rearrangements
Solve for κ
Make κ the subject
Cohen's Kappa is already the subject of the formula.
Difficulty: 1/5
Solve for p_o
Make p_o the subject
Rearranges the formula to solve for observed agreement.
Difficulty: 2/5
Solve for p_e
Make p_e the subject
Rearranges the formula to solve for expected agreement.
Difficulty: 3/5
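For reference, the finished rearrangements follow by straightforward algebra from κ = (p_o - p_e) / (1 - p_e):
- p_o = κ(1 - p_e) + p_e
- p_e = (p_o - κ) / (1 - κ)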
The static page shows the finished rearrangements. The app keeps the full worked algebra walkthrough.
Visual intuition
Graph
The graph is a linear function: Kappa (κ) changes at a constant rate relative to the observed agreement (p_o). As the independent variable p_o increases, κ rises along a straight line with a slope determined by the constant probability of chance agreement (p_e).
Graph type: linear
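Written out with p_e held constant, κ = p_o / (1 - p_e) - p_e / (1 - p_e): a straight line in p_o with slope 1 / (1 - p_e) and intercept -p_e / (1 - p_e), so a larger chance agreement p_e makes the line steeper and pushes its starting point further below zero.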
Why it behaves this way
Intuition
Visualize two overlapping regions: one representing the total observed agreement between raters, and another representing the agreement expected purely by chance.
Signs and relationships
- p_o - p_e: This numerator term represents the observed agreement that is *beyond* what would be expected by random chance. Subtracting p_e isolates the agreement truly due to the raters' consistency.
- 1 - p_e: This denominator term represents the maximum possible agreement that could occur beyond chance. It normalizes the 'agreement beyond chance' (p_o - p_e), scaling Cohen's Kappa to a meaningful range, typically from -1 to 1 (see the worked numbers below).
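To make the 'beyond chance' idea concrete with illustrative numbers (assumed for this page, not from the syllabus): if p_o = 0.70 and p_e = 0.50, the raters agree 0.20 beyond chance out of a maximum possible 1 - 0.50 = 0.50, giving κ = 0.20 / 0.50 = 0.40.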
Free study cues
Insight
Canonical usage
Cohen's Kappa is a dimensionless statistical coefficient used to quantify inter-rater agreement, ranging from -1 to 1.
Common confusion
Students might mistakenly look for units for Cohen's Kappa or its components, but all terms in the formula (p_o, p_e, and κ) are dimensionless proportions or coefficients.
Dimension note
Cohen's Kappa is inherently dimensionless as it is a ratio of differences between proportions (which are themselves dimensionless). It represents a coefficient of agreement.
One free problem
Practice Problem
Two research assistants are coding video clips for aggressive behavior. They agree on 85% of the clips (p_o = 0.85). Given the frequency of the behaviors, the expected agreement by chance is calculated to be 40% (p_e = 0.40). What is Cohen's Kappa for these raters?
Solve for: κ
Hint: Subtract the chance agreement from the observed agreement, then divide by the difference between 1 and the chance agreement.
The full worked solution stays in the interactive walkthrough.
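If you want to sanity-check your own answer after attempting the problem, here is a minimal Python sketch that simply applies the formula to the stated proportions (the rounding choice is an assumption):

```python
# Plug the stated proportions into kappa = (p_o - p_e) / (1 - p_e).
p_o, p_e = 0.85, 0.40
kappa = (p_o - p_e) / (1 - p_e)
print(round(kappa, 2))  # compare this with your own working
```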
Study smarter
Tips
- A kappa of 1 indicates perfect agreement, while 0 indicates agreement no better than chance.
- Negative values suggest that agreement is actually worse than what would be expected by random guessing.
- Interpret results using established benchmarks, such as Landis and Koch’s scale where 0.61–0.80 is considered substantial (a quick lookup sketch follows these tips).
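As a study aid, a small Python sketch that maps a kappa value onto the Landis and Koch (1977) descriptive labels; the cut-offs follow their published benchmarks, and the function name is an assumption for this example.

```python
def landis_koch_label(kappa):
    """Map a kappa value to the Landis & Koch (1977) descriptive benchmarks."""
    if kappa < 0.00:
        return "poor (worse than chance)"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"

print(landis_koch_label(0.75))  # prints "substantial"
```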
Avoid these traps
Common Mistakes
- Using simple percent agreement when chance agreement is high, which inflates apparent reliability.
Common questions
Frequently Asked Questions
What does Cohen's Kappa measure?
Measures inter-rater agreement between two observers for categorical data, correcting for the agreement expected purely by chance.
When should it be used?
This equation is used when two independent observers are categorizing data and you need to ensure their observations are reliable. It is specifically designed for nominal or categorical data rather than ordinal or continuous scales.
Why does it matter?
In fields like clinical psychology, high inter-rater reliability ensures that diagnoses are consistent regardless of the clinician. Without adjusting for chance, researchers might overestimate the validity of their observational data, leading to flawed conclusions.
What is a common mistake?
Using simple percent agreement when chance agreement is high.
How should the result be interpreted?
A kappa of 1 indicates perfect agreement, while 0 indicates agreement no better than chance. Negative values suggest that agreement is actually worse than what would be expected by random guessing. Interpret results using established benchmarks, such as Landis and Koch’s scale where 0.61–0.80 is considered substantial.
References
Sources
- Wikipedia: Cohen's kappa
- Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE Publications.
- Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37-46.
- Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159-174.
- Agresti, A. (2013). Categorical Data Analysis (3rd ed.). John Wiley & Sons.
- GCSE Psychology — Research Methods