Pearson's r (Calculation)
Detailed calculation of the Pearson correlation coefficient.
Core idea
Overview
Pearson's r, or the product-moment correlation coefficient, quantifies the strength and direction of the linear relationship between two continuous variables. It is calculated by dividing the sum of the products of paired deviations from the means by the square root of the product of the two sums of squared deviations, one for each variable.
When to use: Use this metric when analyzing two interval- or ratio-level variables that appear to share a linear trend. Significance tests on r assume bivariate normality, and the coefficient itself is sensitive to outliers, so data should be screened for extreme values before calculation.
Why it matters: In psychological research, it identifies how variables like personality traits and behavioral outcomes co-vary. Understanding these associations allows for the development of predictive models and the validation of measurement scales.
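The outlier sensitivity noted under "When to use" can be illustrated with a minimal sketch. The data below are made up for illustration, and `pearson_r` is a hypothetical helper written for this example, not a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from paired deviations (illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

x = [1, 2, 3, 4, 5, 6]
y_clean = [2, 4, 6, 8, 10, 12]     # perfectly linear, made-up values
y_outlier = [2, 4, 6, 8, 10, -20]  # same data with one extreme final score

r_clean = pearson_r(x, y_clean)
r_outlier = pearson_r(x, y_outlier)
print(r_clean, r_outlier)
```

A single extreme point is enough here to turn a perfect positive correlation into a negative one, which is why screening comes before calculation.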
Symbols
Variables
r = Pearson's r, SP = sum of products of deviations, SS_x = sum of squares for X, SS_y = sum of squares for Y
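As a rough sketch of how these symbols map onto a calculation from raw scores, the function below computes SP, SS_x and SS_y and combines them into r. The function name, the error handling, and the sample data are assumptions made for illustration.

```python
import math

def pearson_r(xs, ys):
    """Return Pearson's r for two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, n >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # SP: sum of products of paired deviations from the means
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # SS_x and SS_y: sums of squared deviations for each variable
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has zero variance")
    return sp / math.sqrt(ss_x * ss_y)

# Made-up illustrative data (not from the source)
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = pearson_r(hours, scores)
print(round(r, 3))
```

For this sample, SP = 6, SS_x = 10 and SS_y = 6, so r = 6 / √60.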
Walkthrough
Derivation
Formula: Pearson's r (Correlation Coefficient)
Measures the linear relationship between two continuous variables, ranging from −1 to +1.
- Both variables are continuous and measured on interval/ratio scales.
- The relationship is linear.
- Bivariate normality (for inference).
Compute covariance:
Covariance captures the joint variability of x and y.
Standardise by the product of SDs:
Dividing by both standard deviations confines r to the range [−1, +1].
Result
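The steps above can be written out as follows. This sketch assumes the sample covariance and sample standard deviations with n − 1 denominators; that choice is an assumption, and because the n − 1 terms cancel, the result is the same with n.

```latex
\begin{align*}
\operatorname{cov}(X,Y) &= \frac{\sum (X-\bar{X})(Y-\bar{Y})}{n-1} \\[4pt]
s_X &= \sqrt{\frac{\sum (X-\bar{X})^2}{n-1}}, \qquad
s_Y = \sqrt{\frac{\sum (Y-\bar{Y})^2}{n-1}} \\[4pt]
r &= \frac{\operatorname{cov}(X,Y)}{s_X \, s_Y}
   = \frac{\sum (X-\bar{X})(Y-\bar{Y})}
          {\sqrt{\sum (X-\bar{X})^2 \, \sum (Y-\bar{Y})^2}}
\end{align*}
```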
Source: University Psychology — Statistics
Free formulas
Rearrangements
Solve for
Pearson's r (Calculation) - Shorthand Notation
This rearrangement simplifies the definitional formula for Pearson's r by introducing common shorthand notations for the Sum of Products (SP) and the Sums of Squares (SS_x and SS_y).
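In that shorthand, using the symbols from the Variables list, the formula can be sketched as:

```latex
\begin{align*}
SP   &= \sum (X-\bar{X})(Y-\bar{Y}) \\
SS_x &= \sum (X-\bar{X})^2 \\
SS_y &= \sum (Y-\bar{Y})^2 \\
r    &= \frac{SP}{\sqrt{SS_x \, SS_y}}
\end{align*}
```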
Difficulty: 2/5
Visual intuition
Graph
The graph of this equation is a constant horizontal line because the formula calculates a single scalar value representing the strength of a relationship between two variables. Regardless of the independent variable plotted on the x-axis, the resulting correlation coefficient remains fixed for a given dataset.
Graph type: constant
Why it behaves this way
Intuition
Imagine a scatter plot of data points; Pearson's r describes how tightly these points cluster around a straight line, and whether that line slopes upwards or downwards.
Signs and relationships
- Σ(X-\bar{X})(Y-\bar{Y}): The sign of the numerator determines the sign of r. If most data points have (X-\bar{X}) and (Y-\bar{Y}) with the same sign (both positive or both negative), their products are positive, leading to a positive sum and a positive r.
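A small sketch can make the sign behaviour of the numerator concrete. The data are made up, and `sum_of_products` is a hypothetical helper name used only for this example.

```python
def sum_of_products(xs, ys):
    """Sum of products of paired deviations from the means (the numerator of r)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

xs = [1, 2, 3, 4]
rising = [10, 12, 15, 19]   # Y tends to be above its mean when X is above its mean
falling = [19, 15, 12, 10]  # Y tends to be below its mean when X is above its mean

sp_rising = sum_of_products(xs, rising)
sp_falling = sum_of_products(xs, falling)
print(sp_rising, sp_falling)
```

When the deviations mostly share a sign the sum is positive; when they mostly have opposite signs the sum, and therefore r, is negative.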
Free study cues
Insight
Canonical usage
Pearson's r is a dimensionless statistic, meaning its value does not carry any units, regardless of the units of the original variables.
Common confusion
A common mistake is attempting to assign units to the Pearson correlation coefficient itself, or incorrectly assuming that the two variables X and Y must have the same units.
Dimension note
Pearson's r is a ratio of the covariance of two variables to the product of their standard deviations. The units of the original variables X and Y cancel out in the calculation, resulting in a pure number without units.
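The cancellation of units can be checked numerically. In the sketch below, the same made-up meditation times are expressed in minutes and in hours; `pearson_r` is an illustrative helper, not a library call.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from paired deviations (illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

minutes = [30, 60, 90, 120, 150]   # made-up meditation times
anxiety = [40, 35, 33, 28, 20]     # made-up anxiety scores
hours = [m / 60 for m in minutes]  # same quantity in different units

r_minutes = pearson_r(minutes, anxiety)
r_hours = pearson_r(hours, anxiety)
print(math.isclose(r_minutes, r_hours))
```

Rescaling X multiplies both SP and √SS_x by the same factor, so the ratio is unchanged up to floating-point rounding.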
One free problem
Practice Problem
A clinical psychologist is studying the link between hours of meditation (X) and anxiety scores (Y). After calculating the sums, they find that the sum of products (SP) is −45, the sum of squares for X (SS_x) is 50, and the sum of squares for Y (SS_y) is 72. Calculate Pearson's r.
Solve for: r
Hint: Divide the sum of products by the square root of the product of the two sums of squares.
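Following the hint, the arithmetic can be checked with a few lines of Python. The variable names are illustrative; the three values come from the problem statement.

```python
import math

sp = -45    # sum of products
ss_x = 50   # sum of squares for X
ss_y = 72   # sum of squares for Y

# 50 * 72 = 3600, and sqrt(3600) = 60
r = sp / math.sqrt(ss_x * ss_y)
print(r)  # -0.75
```

The result, r = −0.75, indicates a strong negative linear relationship: more hours of meditation go with lower anxiety scores in this sample.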
Study smarter
Tips
- Ensure the relationship is linear via a scatterplot before computing.
- Values closer to −1 or +1 indicate stronger linear relationships.
- A value of zero indicates no linear association between the variables; a nonlinear relationship may still exist.
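The tips on checking linearity and on reading a value of zero can be illustrated with a made-up example in which Y depends perfectly on X, but not linearly. `pearson_r` is an illustrative helper, not a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from paired deviations (illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]  # perfect U-shaped (quadratic) relationship

r = pearson_r(x, y)
print(r)
```

Here the positive and negative products of deviations cancel, so r is zero even though Y is completely determined by X. A scatterplot would reveal the curve that the coefficient misses.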