Point-Biserial Correlation
The point-biserial correlation is a special case of the Pearson correlation that quantifies the linear association between a dichotomous (binary) variable and a continuous variable. It is commonly used when a natural grouping (e.g., gender, pass/fail) needs to be related to a metric outcome. Mathematically, it is identical to the Pearson correlation when the dichotomous variable is coded as 0 and 1.
When to Use
- One variable is dichotomous (exactly two categories, e.g., yes/no, male/female)
- The other variable is metric (interval or ratio scaled)
- You want to determine the strength and direction of the association between the two variables
- Observations are independent of one another
- As an alternative to the independent-samples t-test when you want to report a measure of association rather than a group difference
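The link to the t-test in the last bullet can be made concrete. With 0/1 coding, the pooled-variance independent-samples t statistic and the point-biserial correlation satisfy $t = r_{pb}\sqrt{(n-2)/(1-r_{pb}^2)}$ with $n-2$ degrees of freedom. The following is a minimal sketch of that relationship, not a reference implementation: the scores are made up for illustration, and the helper names `pearson_r` and `pooled_t` are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pooled_t(group1, group0):
    """Independent-samples t statistic with pooled variance (group 1 minus group 0)."""
    n1, n0 = len(group1), len(group0)
    m1, m0 = sum(group1) / n1, sum(group0) / n0
    ss1 = sum((v - m1) ** 2 for v in group1)
    ss0 = sum((v - m0) ** 2 for v in group0)
    pooled_var = (ss1 + ss0) / (n1 + n0 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1 / n1 + 1 / n0))


# Made-up illustration data (assumption, not taken from the text)
group0 = [12.0, 15.0, 11.0, 14.0]
group1 = [18.0, 17.0, 21.0, 16.0, 19.0]

codes = [0] * len(group0) + [1] * len(group1)
scores = group0 + group1
n = len(scores)

r_pb = pearson_r(codes, scores)
t_from_r = r_pb * math.sqrt((n - 2) / (1 - r_pb ** 2))
t_direct = pooled_t(group1, group0)

print(f"r_pb = {r_pb:.4f}")
print(f"t from r_pb = {t_from_r:.4f}, pooled t = {t_direct:.4f}")
```

Both t values agree up to floating-point error, which is why the significance test for $r_{pb}$ and the pooled-variance t-test lead to the same conclusion.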
Assumptions
- Dichotomous variable with exactly two categories (0/1 coded)
- Continuous variable approximately normally distributed in both groups
- Independent observations (no repeated measures)
- Homogeneity of variances across both groups (desirable)
Formula
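The point-biserial correlation can be calculated directly from group statistics:

$$r_{pb} = \frac{M_1 - M_0}{s_n} \sqrt{\frac{n_1 n_0}{n^2}}$$

Here, $M_1$ is the mean of group 1, $M_0$ is the mean of group 0, $s_n$ is the standard deviation of all values (computed with $n$ in the denominator), $n_1$ and $n_0$ are the group sizes, and $n$ is the total sample size.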
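Alternatively, you can simply compute the Pearson correlation between the 0/1-coded dichotomous variable $x$ and the continuous variable $y$; the result is identical:

$$r_{pb} = r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$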
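The equivalence of the two routes can be checked numerically. The sketch below computes the coefficient once from group statistics and once as a plain Pearson correlation. It assumes the standard deviation of all values is taken with divisor $n$; the function names and the small data set are hypothetical and serve only as illustration.

```python
import math


def point_biserial(codes, values):
    """r_pb from group means, the SD of all values (divisor n), and group sizes."""
    if set(codes) != {0, 1}:
        raise ValueError("codes must contain exactly the values 0 and 1")
    g1 = [v for c, v in zip(codes, values) if c == 1]
    g0 = [v for c, v in zip(codes, values) if c == 0]
    n1, n0, n = len(g1), len(g0), len(values)
    m1, m0 = sum(g1) / n1, sum(g0) / n0
    mean_all = sum(values) / n
    s_n = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    return (m1 - m0) / s_n * math.sqrt(n1 * n0 / n ** 2)


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up illustration data (assumption, not taken from the text)
codes = [0, 0, 0, 1, 1, 1, 1]
values = [4.1, 5.3, 3.8, 6.0, 7.2, 5.9, 6.6]

r_formula = point_biserial(codes, values)
r_pearson = pearson_r(codes, values)

print(f"group-statistics formula: {r_formula:.6f}")
print(f"Pearson on 0/1 codes:     {r_pearson:.6f}")
```

If the sample standard deviation (divisor $n-1$) is used instead, the factor under the square root has to change to $n_1 n_0 / (n(n-1))$ for the two results to match.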
Example
Practical Example: Gender and Reaction Time
A psychology experiment investigates whether there is an association between gender (male = 0, female = 1) and reaction time (in milliseconds).
Data (n = 10):
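- Male (0): 320, 345, 310, 298, 330 ms → $M_0 = 320.6$ ms
- Female (1): 290, 275, 305, 280, 295 ms → $M_1 = 289.0$ ms
Calculation:
- Overall mean: $\bar{y} = 304.8$ ms
- Standard deviation of all values (divisor $n$): $s_n \approx 20.91$ ms
- $n_1 = 5$, $n_0 = 5$, $n = 10$

$$r_{pb} = \frac{289.0 - 320.6}{20.91} \sqrt{\frac{5 \cdot 5}{10^2}} \approx -0.76$$

Interpretation: There is a strong negative association ($r_{pb} \approx -0.76$). Because female is coded as 1, the negative sign means that female participants show shorter reaction times on average than male participants.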
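The arithmetic of this example can be reproduced with Python's standard library. This is a sketch under one assumption: the standard deviation of all values is the population form (divisor $n$), which `statistics.pstdev` computes. Variable names are chosen for illustration.

```python
import math
import statistics

# Reaction times in milliseconds from the example
male = [320, 345, 310, 298, 330]     # coded 0
female = [290, 275, 305, 280, 295]   # coded 1

all_values = male + female
n0, n1, n = len(male), len(female), len(all_values)

m0 = statistics.mean(male)
m1 = statistics.mean(female)
overall_mean = statistics.mean(all_values)
s_n = statistics.pstdev(all_values)  # population SD: divisor n (assumption)

# Group-statistics formula for the point-biserial correlation
r_pb = (m1 - m0) / s_n * math.sqrt(n1 * n0 / n ** 2)

print(f"M0 = {m0:.1f} ms, M1 = {m1:.1f} ms")
print(f"overall mean = {overall_mean:.1f} ms, s_n = {s_n:.2f} ms")
print(f"r_pb = {r_pb:.2f}")
```

The sign depends only on the coding: swapping the 0 and 1 labels flips the sign of the coefficient and leaves its magnitude unchanged.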
Effect Size
The point-biserial correlation coefficient is itself an effect size measure and lies on the same scale as Pearson's $r$:

| $\lvert r_{pb} \rvert$ | Interpretation |
|---|---|
| 0.10 | Small effect |
| 0.30 | Medium effect |
| 0.50 | Large effect |

These conventions follow Cohen (1988). Additionally, the coefficient of determination $r_{pb}^2$ can be calculated to express the proportion of explained variance.
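As a small illustration, the benchmarks and the coefficient of determination can be wrapped in a helper. The function name `cohen_label` is hypothetical. Treating each benchmark as the lower bound of its category, and the label "below small" for values under 0.10, are assumptions of this sketch rather than part of Cohen's conventions. The input coefficients are arbitrary illustration values.

```python
def cohen_label(r):
    """Label |r| using Cohen's (1988) benchmarks of 0.10, 0.30 and 0.50.

    Assumption: each benchmark is treated as the lower bound of its category.
    """
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "below small"  # label chosen for this sketch


# Arbitrary illustration values (assumption, not results from the text)
for r in (0.05, -0.25, 0.42, -0.76):
    r_squared = r ** 2  # proportion of explained variance
    print(f"r = {r:+.2f}: {cohen_label(r)} effect, r^2 = {r_squared:.3f}")
```

The label depends on the absolute value only, so the direction of the association does not affect the classification.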
Further Reading
- Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Lawrence Erlbaum Associates.
- Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE Publications.
- Glass, G. V. & Hopkins, K. D. (1996). Statistical Methods in Education and Psychology (3rd ed.). Allyn & Bacon.