
Pearson's r (Calculation)

Detailed calculation of the Pearson correlation coefficient.

Core idea

Overview

Pearson's r, or the product-moment correlation coefficient, quantifies the strength and direction of the linear relationship between two continuous variables. It is calculated by dividing the sum of products of deviations from the means by the square root of the product of the two sums of squared deviations, one for each variable.

When to use: Use this metric when analyzing two interval- or ratio-level variables that appear to share a linear trend. Significance tests on r assume bivariate normality, and r itself is sensitive to outliers, so data should be screened for extreme values before calculation.
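To illustrate the sensitivity to outliers, here is a minimal Python sketch. The data values and the helper name `pearson_r` are made up for illustration and are not taken from the source.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r as SP / sqrt(SS_x * SS_y)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical data: five points with a clear upward trend.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r_clean = pearson_r(xs, ys)

# The same data plus one extreme point at (15, 0).
r_with_outlier = pearson_r(xs + [15], ys + [0])

print(f"without outlier: r = {r_clean:.3f}")
print(f"with outlier:    r = {r_with_outlier:.3f}")
```

In this invented sample, a single extreme point is enough to turn a strong positive r into a negative one, which is why a scatterplot check before calculation is worthwhile.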

Why it matters: In psychological research, it identifies how variables like personality traits and behavioral outcomes co-vary. Understanding these associations allows for the development of predictive models and the validation of measurement scales.

Symbols

Variables

r = Pearson's r
SP = Sum of Products
SS_x = Sum of Squares for X
SS_y = Sum of Squares for Y

Walkthrough

Derivation

Formula: Pearson's r (Correlation Coefficient)

Measures the linear relationship between two continuous variables, ranging from −1 to +1.

  • Both variables are continuous and measured on interval/ratio scales.
  • The relationship is linear.
  • Bivariate normality (for inference).
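
1. Compute the covariance:

\mathrm{cov}(X, Y) = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n - 1} = \frac{SP}{n - 1}

Covariance captures the joint variability of X and Y. Here n is the number of paired observations.

2. Standardise by the product of the SDs:

r = \frac{\mathrm{cov}(X, Y)}{s_X \, s_Y}

Dividing by both standard deviations confines r to the range [−1, +1].

Result

Because s_X = \sqrt{SS_x / (n - 1)} and s_Y = \sqrt{SS_y / (n - 1)}, the n − 1 terms cancel, leaving:

r = \frac{SP}{\sqrt{SS_x \, SS_y}}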
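The two steps above can be sketched in Python as follows. This is an illustrative sketch rather than the source's own worked solution: the function names and the sample data are assumptions, and the sample (n − 1) forms of covariance and standard deviation are used.

```python
from math import sqrt
from statistics import stdev  # sample standard deviation (n - 1 denominator)

def deviations(values):
    """Deviation of each value from the mean of its own variable."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def pearson_r_covariance(xs, ys):
    """Step 1: covariance. Step 2: divide by the product of the SDs."""
    dx, dy = deviations(xs), deviations(ys)
    cov = sum(a * b for a, b in zip(dx, dy)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))

def pearson_r_shorthand(xs, ys):
    """Equivalent shorthand: r = SP / sqrt(SS_x * SS_y)."""
    dx, dy = deviations(xs), deviations(ys)
    sp = sum(a * b for a, b in zip(dx, dy))
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical paired scores, invented for this example.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_cov = pearson_r_covariance(xs, ys)
r_short = pearson_r_shorthand(xs, ys)
print(round(r_cov, 4), round(r_short, 4))
```

For this sample SP = 6, SS_x = 10 and SS_y = 6, so both routes give r = 6 / √60, roughly 0.77. The agreement reflects the n − 1 terms cancelling between the covariance and the two standard deviations.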

Source: University Psychology — Statistics

Free formulas

Rearrangements

Solve for

Pearson's r (Calculation) - Shorthand Notation

This rearrangement simplifies the definitional formula for Pearson's r by introducing common shorthand notations for the Sum of Products (SP) and the Sums of Squares (SS_x and SS_y).

Difficulty: 2/5


Visual intuition

Graph

The graph of this equation is a constant horizontal line because the formula calculates a single scalar value representing the strength of a relationship between two variables. Regardless of the independent variable plotted on the x-axis, the resulting correlation coefficient remains fixed for a given dataset.

Graph type: constant

Why it behaves this way

Intuition

Imagine a scatter plot of data points; Pearson's r describes how tightly these points cluster around a straight line, and whether that line slopes upwards or downwards.

r
A dimensionless index that quantifies the strength and direction of the linear association between two continuous variables.
A value of +1 means perfect positive linear correlation, −1 means perfect negative linear correlation, and 0 means no linear correlation.

(X-\bar{X})(Y-\bar{Y})
The product of the deviations of individual data points from their respective means.
If both X and Y are simultaneously above or below their means, this product is positive. If one is above and the other is below, it is negative. This term captures how X and Y 'co-vary' around their averages.

SP = Σ(X-\bar{X})(Y-\bar{Y})
The sum of the products of deviations, representing the total co-variation between X and Y across all data points.
A large positive sum indicates that X and Y generally increase or decrease together. A large negative sum indicates that as one increases, the other generally decreases.

\sqrt{SS_x SS_y}
The geometric mean of the sums of squares of deviations for X and Y, serving as a normalization factor.
This term scales the co-variation (numerator) by the individual variability within X and Y, ensuring that 'r' is a standardized measure independent of the variables' units or magnitudes.

Signs and relationships

  • Σ(X-\bar{X})(Y-\bar{Y}): The sign of the numerator determines the sign of 'r'. If most data points show (X-\bar{X}) and (Y-\bar{Y}) having the same sign (both positive or both negative), their product is positive, leading to a positive sum and a positive 'r'.
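As a rough illustration of how the per-point products set the sign of the numerator, the sketch below lists each product for a small made-up sample in which Y falls as X rises. The variable names and values are hypothetical.

```python
# Hypothetical sample: anxiety tends to fall as hours of practice rise.
hours = [1, 2, 3, 4, 5]
anxiety = [9, 7, 6, 4, 2]

mean_x = sum(hours) / len(hours)
mean_y = sum(anxiety) / len(anxiety)

# One product of deviations per data point.
products = [(x - mean_x) * (y - mean_y) for x, y in zip(hours, anxiety)]
sp = sum(products)

for x, y, p in zip(hours, anxiety, products):
    sign = "+" if p > 0 else "-" if p < 0 else "0"
    print(f"X={x}, Y={y}: product = {p:6.2f} ({sign})")

print(f"SP = {sp:.2f}")
```

Here every point sits either above the mean on X and below it on Y, or the reverse (one point lies exactly on the X mean), so no product is positive and the sum is negative. A negative SP gives a negative r.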

Free study cues

Insight

Canonical usage

Pearson's r is a dimensionless statistic, meaning its value does not carry any units, regardless of the units of the original variables.

Common confusion

A common mistake is attempting to assign units to the Pearson correlation coefficient itself, or incorrectly assuming that the two variables X and Y must have the same units.

Dimension note

Pearson's r is a ratio of the covariance of two variables to the product of their standard deviations. The units of the original variables X and Y cancel out in the calculation, resulting in a pure number without units.
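A quick way to see the units cancel is to rescale one variable and recompute. The sketch below uses invented reaction times and scores, and the helper name `pearson_r` is an assumption for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r as SP / sqrt(SS_x * SS_y)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical data: time on task (seconds) and a test score.
seconds = [30, 45, 60, 90, 120]
scores = [12, 15, 14, 20, 22]

# Same times expressed in minutes instead of seconds.
minutes = [s / 60 for s in seconds]

r_seconds = pearson_r(seconds, scores)
r_minutes = pearson_r(minutes, scores)
print(f"{r_seconds:.6f} {r_minutes:.6f}")
```

Changing the unit multiplies SP by 1/60 and SS_x by (1/60)², so the factor cancels in SP / √(SS_x · SS_y) and the two values agree up to floating-point rounding.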

Unit systems

X: any consistent unit · The unit of the first continuous variable. For example, 'score', 'seconds', or 'dollars'.
Y: any consistent unit · The unit of the second continuous variable. Can be different from the unit of X. For example, 'score', 'seconds', or 'dollars'.
r: none · The Pearson correlation coefficient is a pure number between −1 and +1, indicating strength and direction of linear association.

One free problem

Practice Problem

A clinical psychologist is studying the link between hours of meditation (X) and anxiety scores (Y). After calculating the sums, they find the sum of products (SP) is −45, the sum of squares for X (SS_x) is 50, and the sum of squares for Y (SS_y) is 72. Calculate Pearson's r.

Sum of Products (SP) = −45
Sum of Squares X (SS_x) = 50
Sum of Squares Y (SS_y) = 72

Solve for: r

Hint: Divide the sum of products by the square root of the product of the two sums of squares.
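The hint can be checked with a few lines of Python. This is only a sketch of the arithmetic using the three sums given in the problem; the variable names are chosen here for illustration.

```python
from math import sqrt

# Sums given in the practice problem.
sp = -45    # sum of products
ss_x = 50   # sum of squares for X (hours of meditation)
ss_y = 72   # sum of squares for Y (anxiety scores)

# r = SP / sqrt(SS_x * SS_y)
r = sp / sqrt(ss_x * ss_y)
print(r)  # -0.75
```

Since 50 × 72 = 3600 and √3600 = 60, the result is −45 / 60 = −0.75, a fairly strong negative linear relationship between meditation hours and anxiety in this example.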


Study smarter

Tips

  • Ensure the relationship is linear via a scatterplot before computing.
  • Values closer to -1 or +1 indicate stronger relationships.
  • A value of zero indicates no linear association exists between the variables.
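The last tip refers to linear association only. As a hedged illustration with made-up data, the sketch below builds a sample in which Y is completely determined by X (Y = X²) and yet r is zero. The helper name `pearson_r` is hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r as SP / sqrt(SS_x * SS_y)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical U-shaped data: Y depends entirely on X, but not linearly.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.3f}")
```

The positive and negative products of deviations cancel exactly in this symmetric sample, so SP = 0 and r = 0. This is one reason to inspect the scatterplot first rather than reading r = 0 as "no relationship".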
