Two Sample Hypothesis Testing for Correlation

We now extend the approach for one sample hypothesis testing of the correlation coefficient to two samples.

Theorem 1: Suppose r1 and r2 are as in the Theorem 1 of Correlation Testing via Fisher Transformation where r1 and r2 are based on independent samples and further suppose that ρ1 = ρ2. If z is defined as follows, then z ∼ N(0,1).

where

Proof: By Theorem 1 of Correlation Testing via Fisher Transformation for i = 1, 2

By Property 1 and 2 of Basic Characteristics of the Normal Distribution it follows that

where s is as defined above. Since ρ1 = ρ2 it follows that ρ´1 = ρ´2, and so

from which the result follows.

We can use Theorem 1 to test whether the correlation coefficients of two populations are equal based on taking a sample from each population and comparing the correlation coefficients of the samples.

Example 1: A sample of 40 couples from London is taken comparing the husband’s IQ with his wife’s. The correlation coefficient for the sample is .77. Is this significantly different from the correlation coefficient of .68 for a sample of 30 couples from Paris?

H0: ρ1 = ρ2

$r'_1$  = FISHER(r1) = FISHER(.77) = 1.020

$r'_2$  = FISHER(r2) = FISHER(.68) = 0.829

s = SQRT(1/(n1 – 3) + 1/(n2 – 3)) = SQRT(1/37 + 1/27) = 0.253

z = ($r'_1$ – $r'_2$)/s = (1.020 – .829) / .253 = 0.755

p-value = 2(1 – NORMSDIST(z) = 1 – NORMSDIST(.522)) = 0.45

We next perform either one of the following tests:

p-value = .45 > .05 = α

zcrit = NORMSINV(1 – α/2) = NORMSINV(.975) = 1.96 > .755 = z

In either case the null hypothesis is not rejected.

Note that in Example 1 the couples from Paris are selected independently from the couples from London. A different test is required if the samples are dependent.

Click here for an example on how to perform Two Sample Hypothesis Testing for Correlation with Dependent Samples.

Real Statistics Functions: The following supplemental functions are provided in the Real Statistics Resource Pack.

Correl2Test(r1, n1, r2, n2, alpha, lab): array function which outputs z, p-value (two-tailed), lower and upper (i.e. lower and upper bound of the 1 – alpha confidence interval), where r1 and n1 are the correlation coefficient and sample size for the first sample and r2 and n2 are similar values for the second sample. If lab = TRUE then the output takes the form of a 4 × 2 range with the first column consisting of labels, while if lab = False (default) then output takes the form of a 4 × 1 range without labels.

Correl2Test(R1, R2, R3, R4, alpha, lab) = CorrelTest(r1, n1, r2, n2, alpha, lab) where r1 = CORREL(R1, R2), n1 = the common sample size between R1 and R2 (i.e. the number of pairs from R1 and R2 which both contain numeric data), r2 = CORREL(R3, R4) and n2 = the common sample size between R3 and R4.

If alpha is omitted it defaults to .05.

Observation: Correl2Test(.77,40,.68,30,.05) generated the values z = .755, p-value = .45, consistent with what we observed above, plus lower = -.296 and upper = .596. Since 0 is in the confidence interval (-.296, .596) the test is not significant and we cannot reject the null hypothesis that the two correlation coefficients are equal.

4 Responses to Two Sample Hypothesis Testing for Correlation

1. Gustaf says:

Hi and thank you for the nice informative pages.
Since I am not a very experienced user I must ask.
I use your correl2test(r1, n1, r2, n2, alpha, lab)
as follows =correl2test( 0,569;10190;0,641; 2039;0,05)
but not get only one number instead of 4
I get -4,652529256

By the way I have excel2010

Regards Gustaf

• Charles says:

Gustaf,
Correl2Test is an array function and so you can’t simply highlight one cell and press the Enter key. You need to highlight a column range with at least 4 cells and press Ctrl-Shift-Enter. See Array Formulas and Functions for more details.
Charles

• Gustaf says:

Thank you Charles,
I will check it out.

I had some trouble with your p-value also so I solved it like this:
The cell for the p-value: =IF(AW4>=0; 2*(1-NORM.DIST(AW4;0;1;TRUE));2*NORM.DIST(AW4;0;1;TRUE))
where AW4 is the z.
The NORM.DIST is a new function from Excel. The other one NORMSDIST does not work anymore apparently.

I have a question though. I now tested the hypothesis that of equality. What if I tested Ho: rho1>rho2. Any tips for me there?

I am very thankful for your commitment to these pages you offer by the way.

/Gustaf

• Charles says:

Gustaf,
You could also use the formula =2*(1-NORM.DIST(ABS(AW4);0;1;TRUE)) or =2*(1-NORM.S.DIST(ABS(AW4);TRUE)).
The formula NORMSDIST still works on my computer. I understood that Excel still supports this function, but wants people to migrate to NORM.DIST.
I beleive that if you are testing Ho: rho1>rho2, then you should use a one-tail test, i.e. =1-NORM.S.DIST(ABS(AW4);TRUE).
Charles