Biserial Correlation

In Relationship between Correlation and t Test and Relationship between Correlation and Chi-square Test we introduced the point-biserial correlation coefficient, which is simply Pearson’s correlation coefficient when one of the samples is dichotomous.

The biserial correlation coefficient also applies when one of the samples is recorded as dichotomous, but where the variable underlying that sample is actually continuous and normally distributed (e.g. a pass/fail grade derived from a test score). In such cases, the point-biserial correlation generally understates the strength of the association. The biserial correlation coefficient provides a better estimate in this case.

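Assuming that we have two samples X = {x1, …, xn} and Y = {y1, …, yn} where the xi are 0 or 1, the biserial correlation coefficient, denoted rb, is calculated as follows:

rb = (m1 − m0) · p0 · p1 / (s · y)

where n0 = the number of elements in X which are 0, n1 = the number of elements in X which are 1 (and so n = n0+n1), p0 = n0/n, p1 = n1/n, m0 = the mean of {yi: xi = 0}, m1 = the mean of {yi: xi = 1}, s is the standard deviation of Y and y is the height of the standard normal density at the point z that divides the distribution into the proportions p0 and p1, i.e. y = φ(z) where Φ(z) = p0. Here φ and Φ are the standard normal density and cumulative distribution functions. Since φ is symmetric about 0, the same y results if p1 is used in place of p0.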

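The calculation can be sketched in Python as follows. This is a minimal illustration, not the Real Statistics implementation: the function name `biserial_corr` and the sample data are hypothetical, the formula used is rb = (m1 − m0) · p0 · p1 / (s · y) with y the standard normal density at the point whose cumulative probability is p0, and s is assumed to be the population standard deviation (the choice under which the point-biserial coefficient equals Pearson’s r).

```python
from statistics import NormalDist, fmean, pstdev


def biserial_corr(x, y):
    """Biserial correlation of a 0/1 sequence x with a numeric sequence y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    y0 = [yi for xi, yi in zip(x, y) if xi == 0]
    y1 = [yi for xi, yi in zip(x, y) if xi == 1]
    if len(y0) + len(y1) != len(x) or not y0 or not y1:
        raise ValueError("x must contain only 0s and 1s, with both present")

    n = len(x)
    p0, p1 = len(y0) / n, len(y1) / n
    m0, m1 = fmean(y0), fmean(y1)
    s = pstdev(y)  # assumption: population standard deviation of Y

    std_normal = NormalDist()          # mean 0, standard deviation 1
    z = std_normal.inv_cdf(p0)         # point that splits the curve into p0 | p1
    height = std_normal.pdf(z)         # ordinate of the density at that point
    return (m1 - m0) * p0 * p1 / (s * height)


# Hypothetical data, not the data of Figure 1
x_demo = [0, 0, 0, 0, 1, 1, 1, 1]
y_demo = [1, 2, 3, 4, 3, 4, 5, 6]
print(round(biserial_corr(x_demo, y_demo), 4))
```

For this toy data p0 = p1 = 0.5, so z = 0 and the density height is 1/√(2π); the group means differ by 2 and the population standard deviation of Y is 1.5.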

Example 1: Calculate the biserial correlation coefficient for the data in columns A and B of Figure 1.

Figure 1 – Biserial Correlation Coefficient

The biserial correlation of -.06821 (cell J15) is calculated as shown in column L. Note that the value is a little more negative than the point-biserial correlation (cell C4).
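
The difference in magnitude between the two coefficients can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the true correlation `rho`, the `cutoff` used to dichotomize, the sample size and the helper `pearson` are all hypothetical choices, and the biserial value is obtained by rescaling the point-biserial value by √(p0 · p1) / y, where y is the standard normal density at the point with cumulative probability p0.

```python
import random
from statistics import NormalDist, fmean


def pearson(a, b):
    """Pearson's correlation coefficient of two equal-length sequences."""
    ma, mb = fmean(a), fmean(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / (va * vb) ** 0.5


random.seed(0)
rho = 0.6       # hypothetical true correlation between the latent variable and y
cutoff = 0.5    # hypothetical threshold at which the latent variable is split
n = 20000

latent, y = [], []
for _ in range(n):
    u, v = random.gauss(0, 1), random.gauss(0, 1)
    latent.append(u)
    y.append(rho * u + (1 - rho ** 2) ** 0.5 * v)  # corr(u, y) = rho by construction

# Only the dichotomized version of the latent variable is "observed"
x = [1 if u > cutoff else 0 for u in latent]

r_pb = pearson(x, y)  # point-biserial = Pearson's r on the 0/1 data
p1 = sum(x) / n
p0 = 1 - p1
height = NormalDist().pdf(NormalDist().inv_cdf(p0))
r_b = r_pb * (p0 * p1) ** 0.5 / height  # biserial via rescaling

print(f"point-biserial: {r_pb:.3f}, biserial: {r_b:.3f}, true rho: {rho}")
```

Because √(p0 · p1) / y is always greater than 1, the biserial coefficient has the same sign as the point-biserial coefficient but a larger magnitude, and in this simulated setting it lands much closer to the correlation used to generate the data.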

Real Statistics Function: The following function is provided in the Real Statistics Resource Pack.

BCORREL(R1, R2) = the biserial correlation coefficient corresponding to the data in column ranges R1 and R2, where R1 is assumed to contain only 0’s and 1’s.

The biserial correlation coefficient for Example 1 can also be calculated using the BCORREL function, as shown in cell G6 of Figure 1.