Cohen's Kappa (Agreement)
Measure of inter-rater agreement that accounts for chance.
This public page keeps the free explanation visible and leaves the premium worked solutions, advanced walkthroughs, and saved study tools inside the app.
Core idea
Overview
Cohen's Kappa is a statistical measure used to assess the reliability of agreement between two raters who classify items into mutually exclusive categories. Unlike simple percent agreement, it accounts for the agreement that would occur by random chance, providing a more conservative and accurate estimate of inter-rater consistency.
When to use: This equation is used when two independent observers are categorizing data and you need to ensure their observations are reliable. It is specifically designed for nominal or categorical data rather than ordinal or continuous scales.
Why it matters: In fields like clinical psychology, high inter-rater reliability ensures that diagnoses are consistent regardless of the clinician. Without adjusting for chance, researchers might overestimate the validity of their observational data, leading to flawed conclusions.
Symbols
Variables
κ = Cohen's Kappa, p_o = Observed Agreement, p_e = Expected (Chance) Agreement
Walkthrough
Derivation
Formula: Cohen's Kappa
Measures inter-rater agreement between two observers for categorical data, correcting for the agreement expected purely by chance.
- Two raters categorise the same set of items independently.
- Categories are mutually exclusive and exhaustive.
Calculate observed agreement p_o and chance agreement p_e:
p_o is the proportion of items on which the raters agree. p_e is the expected proportion of agreement if both raters chose categories at random, in proportion to their observed category frequencies.
Apply the Kappa formula: κ = (p_o - p_e) / (1 - p_e)
Subtracting p_e in both the numerator and the denominator removes chance agreement. κ = 0 means agreement is no better than chance; κ = 1 means perfect agreement.
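To make both quantities concrete, here is a minimal Python sketch (an illustration, not part of the course material) that computes p_o, p_e, and κ from two hypothetical raters' category labels; the data and the helper name cohens_kappa are assumptions for this example.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two equally long lists of categorical labels."""
    assert len(rater_a) == len(rater_b) and len(rater_a) > 0
    n = len(rater_a)

    # Observed agreement p_o: proportion of items the raters label identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement p_e: for each category, the probability that both raters
    # pick it independently, using each rater's observed category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n) for c in set(rater_a) | set(rater_b))

    # Agreement beyond chance, scaled by the maximum possible beyond-chance
    # agreement. (Undefined if p_e == 1, i.e. no variation at all.)
    return (p_o - p_e) / (1 - p_e)

# Illustrative data: two raters coding ten clips as 'agg' (aggressive) or 'neu' (neutral).
a = ['agg', 'agg', 'neu', 'agg', 'neu', 'neu', 'agg', 'neu', 'agg', 'neu']
b = ['agg', 'agg', 'neu', 'neu', 'neu', 'neu', 'agg', 'agg', 'agg', 'neu']
print(round(cohens_kappa(a, b), 3))  # p_o = 0.8, p_e = 0.5, so kappa = 0.6
```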
Result
Source: GCSE Psychology — Research Methods
Free formulas
Rearrangements
Solve for κ
Make κ the subject
Cohen's Kappa is already the subject of the formula.
Difficulty: 1/5
Solve for p_o
Make p_o the subject
Rearranges the formula to solve for observed agreement.
Difficulty: 2/5
Solve for p_e
Make p_e the subject
Rearranges the formula to solve for expected agreement.
Difficulty: 3/5
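For reference, the finished rearrangements follow by straightforward algebra from κ = (p_o - p_e) / (1 - p_e):
- p_o = κ(1 - p_e) + p_e
- p_e = (p_o - κ) / (1 - κ)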
The static page shows the finished rearrangements. The app keeps the full worked algebra walkthrough.
Visual intuition
Graph
The graph is a linear function: Kappa (κ) changes at a constant rate relative to the observed agreement (p_o). As the independent variable p_o increases, κ rises along a straight line with a slope determined by the constant probability of chance agreement (p_e).
Graph type: linear
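Written out with p_e held constant, κ = p_o / (1 - p_e) - p_e / (1 - p_e): a straight line in p_o with slope 1 / (1 - p_e) and intercept -p_e / (1 - p_e), so a larger chance agreement p_e makes the line steeper and pushes its starting point further below zero.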
Why it behaves this way
Intuition
Visualize two overlapping regions: one representing the total observed agreement between raters, and another representing the agreement expected purely by chance.
Signs and relationships
- p_o - p_e: This numerator term represents the observed agreement that is *beyond* what would be expected by random chance. Subtracting p_e isolates the agreement truly due to the raters' consistency.
- 1 - p_e: This denominator term represents the maximum possible agreement that could occur beyond chance. It normalizes the 'agreement beyond chance' (p_o - p_e), scaling Cohen's Kappa to a meaningful range, typically from -1 to 1 (see the worked numbers below).
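To make the 'beyond chance' idea concrete with illustrative numbers (assumed for this page, not from the syllabus): if p_o = 0.70 and p_e = 0.50, the raters agree 0.20 beyond chance out of a maximum possible 1 - 0.50 = 0.50, giving κ = 0.20 / 0.50 = 0.40.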
Free study cues
Insight
Canonical usage
Cohen's Kappa is a dimensionless statistical coefficient used to quantify inter-rater agreement, ranging from -1 to 1.
Common confusion
Students might mistakenly look for units for Cohen's Kappa or its components, but all terms in the formula (p_o, p_e, and κ) are dimensionless proportions or coefficients.
Dimension note
Cohen's Kappa is inherently dimensionless as it is a ratio of differences between proportions (which are themselves dimensionless). It represents a coefficient of agreement.
One free problem
Practice Problem
Two research assistants are coding video clips for aggressive behavior. They agree on 85% of the clips (p_o = 0.85). Given the frequency of the behaviors, the expected agreement by chance is calculated to be 40% (p_e = 0.40). What is Cohen's Kappa for these raters?
Solve for: κ
Hint: Subtract the chance agreement from the observed agreement, then divide by the difference between 1 and the chance agreement.
The full worked solution stays in the interactive walkthrough.
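If you want to sanity-check your own answer after attempting the problem, here is a minimal Python sketch that simply applies the formula to the stated proportions (the rounding choice is an assumption):

```python
# Plug the stated proportions into kappa = (p_o - p_e) / (1 - p_e).
p_o, p_e = 0.85, 0.40
kappa = (p_o - p_e) / (1 - p_e)
print(round(kappa, 2))  # compare this with your own working
```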
Study smarter
Tips
- A kappa of 1 indicates perfect agreement, while 0 indicates agreement no better than chance.
- Negative values suggest that agreement is actually worse than what would be expected by random guessing.
- Interpret results using established benchmarks, such as Landis and Koch’s scale where 0.61–0.80 is considered substantial (a quick lookup sketch follows these tips).
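As a study aid, a small Python sketch that maps a kappa value onto the Landis and Koch (1977) descriptive labels; the cut-offs follow their published benchmarks, and the function name is an assumption for this example.

```python
def landis_koch_label(kappa):
    """Map a kappa value to the Landis & Koch (1977) descriptive benchmarks."""
    if kappa < 0.00:
        return "poor (worse than chance)"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"

print(landis_koch_label(0.75))  # prints "substantial"
```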
Avoid these traps
Common Mistakes
- Using simple percent agreement when chance agreement is high, which inflates apparent reliability.
Common questions
Frequently Asked Questions
What does Cohen's Kappa measure?
Measures inter-rater agreement between two observers for categorical data, correcting for the agreement expected purely by chance.
When should it be used?
This equation is used when two independent observers are categorizing data and you need to ensure their observations are reliable. It is specifically designed for nominal or categorical data rather than ordinal or continuous scales.
Why does it matter?
In fields like clinical psychology, high inter-rater reliability ensures that diagnoses are consistent regardless of the clinician. Without adjusting for chance, researchers might overestimate the validity of their observational data, leading to flawed conclusions.
What is a common mistake?
Using simple percent agreement when chance agreement is high.
How should the result be interpreted?
A kappa of 1 indicates perfect agreement, while 0 indicates agreement no better than chance. Negative values suggest that agreement is actually worse than what would be expected by random guessing. Interpret results using established benchmarks, such as Landis and Koch’s scale where 0.61–0.80 is considered substantial.
References
Sources
- Wikipedia: Cohen's kappa
- Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE Publications.
- Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37-46.
- Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159-174.
- Agresti, A. (2013). Categorical Data Analysis (3rd ed.). John Wiley & Sons.
- GCSE Psychology — Research Methods