
Chi-Square Test (χ²)

Difference between observed and expected frequencies.


Core idea

Overview

The Chi-Square test is a non-parametric statistical method used to evaluate the significance of the difference between observed frequencies and expected frequencies in categorical data. In psychology, it is a foundational tool for determining if the distribution of certain behaviors or traits deviates significantly from a theoretical or null hypothesis distribution.

When to use: Apply this test to categorical data (nominal, or ordinal treated as unordered categories) when you need to compare observed counts against the counts predicted under a null hypothesis. It assumes that observations are independent and, as a common rule of thumb, that the expected frequency in each category is at least 5, because the chi-square approximation becomes unreliable with smaller expected counts.
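The expected-frequency condition can be checked before running the test. The following is a minimal sketch, assuming the rule-of-thumb threshold of 5; the function name `small_expected_categories` and the example counts are hypothetical and used only for illustration.

```python
def small_expected_categories(expected, minimum=5):
    """Return the indices of categories whose expected count is below `minimum`."""
    return [i for i, e in enumerate(expected) if e < minimum]


# Hypothetical expected counts for four response categories.
expected = [12.0, 8.5, 4.2, 15.3]
too_small = small_expected_categories(expected)
print(too_small)  # indices of categories that fall below the threshold
```

If the returned list is not empty, a common remedy is to combine sparse categories or collect more data before relying on the test.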

Why it matters: It lets psychologists judge whether a result, such as an apparent preference for a specific therapy, is plausibly explained by chance alone or points to a genuine underlying effect. This helps in evaluating theories about social behavior, personality distributions, and survey results in diverse populations.

Symbols

Variables

χ² = Chi-Square statistic
O = Observed frequency
E = Expected frequency

Walkthrough

Derivation

Formula: Chi-Square Test (χ²)

Calculation of the test statistic for categorical data association.

  • Data is nominal.
  • Expected frequencies are high enough.

1. State the contribution formula:

(O - E)^2 / E

Each category contributes the squared difference between its observed (O) and expected (E) frequency, divided by E. Summing these contributions over all categories gives the overall deviation from the null hypothesis.

Result

\chi^2 = \sum \frac{(O - E)^2}{E}

Source: A-Level Psychology — Research Methods / Statistics
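The statistic described in this walkthrough, the sum of (O - E)^2 / E over all categories, can be sketched in a few lines. This is an illustrative sketch rather than a reference implementation; the function name `chi_square` and the example counts are assumptions made for the example.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected frequencies must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical example: 60 participants choose among three options,
# and the null hypothesis predicts an even split of 20 each.
observed = [30, 20, 10]
expected = [20, 20, 20]
statistic = chi_square(observed, expected)
print(statistic)
```

The middle category matches its expectation exactly and so adds nothing to the total; only the first and third categories contribute.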

Visual intuition

Graph


The Chi-Square distribution is defined only for non-negative values and is positively skewed. With 1 or 2 degrees of freedom the curve is highest near zero and falls away continuously; with 3 or more it starts at the origin (0,0), rises to a peak, and then tapers off toward the x-axis as values increase. The shape depends on the degrees of freedom rather than directly on the sample size, and it becomes more symmetrical and bell-shaped as the degrees of freedom increase. The distribution describes a sum of squared standard normal deviates, and the area under the curve to the right of an obtained test statistic is the probability of a value at least that large if the null hypothesis is true.
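To connect the curve to a probability, the area in the right-hand tail can be computed directly. The sketch below assumes whole-number degrees of freedom and uses closed-form expressions that hold in that case; the function name `chi_square_tail` is hypothetical, and in practice a statistics package or a printed critical-value table would normally be used instead.

```python
import math


def chi_square_tail(x, df):
    """Upper-tail area P(X >= x) for a chi-square variable with integer df >= 1."""
    if not isinstance(df, int) or df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^k / k! for k = 0 .. df/2 - 1
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: two-sided normal tail plus a finite correction series.
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2.0))
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return tail + 2.0 * math.exp(-half) / math.sqrt(2.0 * math.pi) * total


# Tail areas at the familiar 5% critical values for 1 to 4 degrees of freedom.
for critical, df in [(3.841, 1), (5.991, 2), (7.815, 3), (9.488, 4)]:
    print(df, round(chi_square_tail(critical, df), 3))
```

The critical value grows with the degrees of freedom, which is why the same statistic can be significant in one design and not in another.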


Why it behaves this way

Intuition

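Imagine comparing two sets of bars on a graph: one showing the actual counts from your data (observed frequencies), and another showing the counts you'd expect if a particular hypothesis (like 'no difference') were true (expected frequencies). The Chi-Square statistic summarizes how far apart the two sets of bars are.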

χ²
The Chi-Square statistic, representing the overall discrepancy between observed and expected frequencies.
A higher value indicates a greater difference between what was observed and what was expected, suggesting the observed pattern is less likely to be due to chance.
O
The observed frequency or count in a specific category.
What you actually counted or saw happen in your experiment or survey for a given group or response.
E
The expected frequency or count in a specific category, typically derived from a null hypothesis or theoretical distribution.
What you would expect to count or see happen in your experiment or survey if there were no real effect or difference (i.e., if chance alone were operating).
Σ
The summation operator, indicating the sum of the calculated values for all categories.
This symbol means you add up the individual contributions from each category to get the total statistic.

Signs and relationships

  • (O - E)^2: The difference between observed and expected frequencies is squared to ensure that all deviations (whether O is greater or smaller than E) contribute positively to the total statistic.
  • /E: Dividing the squared difference by the expected frequency (E) normalizes the contribution of each category. This means that a given absolute difference (O - E) counts for more when the expected frequency is small than when it is large.
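The effect of dividing by E is easiest to see with numbers. The sketch below applies the same absolute difference to a small and a large expected count; the helper name `contribution` and the counts are hypothetical.

```python
def contribution(observed, expected):
    """Contribution of one category: (O - E)^2 / E."""
    return (observed - expected) ** 2 / expected


# The same absolute difference of 5, against two hypothetical expected counts.
small_e = contribution(25, 20)     # (25 - 20)^2 / 20
large_e = contribution(205, 200)   # (205 - 200)^2 / 200
print(small_e, large_e)
```

A miss of 5 is a large relative deviation when only 20 were expected, and a minor one when 200 were expected, so the first category contributes ten times as much.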

Free study cues

Insight

Canonical usage

The Chi-Square test statistic is a dimensionless value representing the discrepancy between observed and expected frequencies.

Common confusion

Students sometimes incorrectly attempt to assign units to observed or expected frequencies, or to the Chi-Square statistic itself, rather than recognizing them as dimensionless counts or scores.

Dimension note

The Chi-Square statistic is inherently dimensionless because it is derived from ratios of frequencies (counts), which are themselves dimensionless quantities.

Unit systems

count · Observed frequency, representing the number of occurrences in a specific category.
count · Expected frequency, representing the theoretically predicted number of occurrences in a specific category.

One free problem

Practice Problem

A clinical psychologist expects 10 patients to select a specific coping mechanism based on a baseline study. If 15 patients actually select that mechanism, what is the Chi-Square contribution (χ²) for this specific category?

Observed: 15
Expected: 10

Solve for: the Chi-Square contribution of this category

Hint: Subtract the expected value from the observed value, square the result, and then divide by the expected value.
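Following the hint step by step, the arithmetic for this single category works out as below, with O = 15 and E = 10 taken from the problem statement.

```latex
\frac{(O - E)^2}{E} = \frac{(15 - 10)^2}{10} = \frac{25}{10} = 2.5
```

This is the contribution of one category only; a full test statistic would add the contributions from every other category.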


Where it shows up

Real-World Context

Testing if a preference for a study method differs by gender.
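A question like this is usually set out as a table of counts, with one variable in the rows and the other in the columns. The sketch below assumes the standard approach for a test of association, in which each expected count is the row total times the column total divided by the grand total; that rule is not stated on this page, and the function names and counts are hypothetical.

```python
def expected_counts(table):
    """Expected count per cell under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Chi-square statistic summed over every cell of the table."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical counts: rows are two gender groups, columns are
# two study methods (flashcards, practice papers).
table = [[30, 20],
         [20, 30]]
statistic = chi_square_independence(table)
print(expected_counts(table), statistic)
```

Each participant appears in exactly one cell, which keeps the observations independent as the test requires.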

Study smarter

Tips

  • Always use raw frequency counts rather than percentages or proportions.
  • Ensure that each participant or observation contributes to only one category.
  • A higher Chi-Square value suggests a greater discrepancy between your data and the null hypothesis.
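The first tip matters because the statistic depends on sample size. The sketch below feeds the same hypothetical result into the formula twice, once as raw counts and once as percentages; the helper `chi_square` and all numbers are illustrative assumptions.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical study: 200 participants, null hypothesis of an even split.
raw_observed = [120, 80]
raw_expected = [100, 100]

# The same data rescaled to percentages of the sample.
pct_observed = [60, 40]
pct_expected = [50, 50]

from_counts = chi_square(raw_observed, raw_expected)
from_percentages = chi_square(pct_observed, pct_expected)
print(from_counts, from_percentages)
```

Percentages behave as if the sample contained exactly 100 people, so here they halve the statistic; with a sample smaller than 100 they would inflate it instead.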

Avoid these traps

Common Mistakes

  • Using percentages instead of raw frequencies.
  • Incorrectly calculating degrees of freedom.
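For the degrees-of-freedom mistake, the two usual cases can be written down explicitly. This sketch assumes the standard rules (categories minus one for a single variable with no parameters estimated from the data, and rows minus one times columns minus one for a contingency table); the function names are hypothetical.

```python
def df_goodness_of_fit(n_categories):
    """One variable with k categories: df = k - 1."""
    return n_categories - 1


def df_independence(n_rows, n_cols):
    """Contingency table with r rows and c columns: df = (r - 1) * (c - 1)."""
    return (n_rows - 1) * (n_cols - 1)


print(df_goodness_of_fit(4))   # four response categories
print(df_independence(2, 3))   # 2 x 3 table
```

A frequent slip is to use the number of participants, or the total number of cells, in place of these values.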


References

Sources

  1. Gravetter, F. J., & Wallnau, L. B. (2017). Statistics for the Behavioral Sciences (10th ed.). Cengage Learning.
  2. Wikipedia: Chi-squared test
  3. Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE Publications.
  4. Aron, A., Aron, E., & Coups, E. Statistics for Psychology.
  5. A-Level Psychology: Research Methods / Statistics