We show how to calculate Spearman’s rho and how to use it for hypothesis testing.

**Example 1**: The table in Figure 1 associates the IQ of each adolescent in a sample with the number of hours they listen to rap music per month. Determine the strength of the correlation between IQ and rap music using both the Pearson’s correlation coefficient and Spearman’s rank correlation. Compare the results.

To calculate Spearman’s rho, we need to determine the rank for each of the IQ scores and each of the Rap scores. E.g. the rank of the first IQ score (cell A4 in Figure E3.1) is =RANK.AVG(A4,A$4:A$13,1), and so we put this formula in cell C4. If you are using Excel 2007 you would use the supplemental function RANK_AVG instead of RANK.AVG (as explained in Ranking).

We now calculate both correlation coefficients as follows:

Pearson’s correlation = CORREL(A4:A13,B4:B13) = -0.036

Spearman’s rho = CORREL(C4:C13,D4:D13) = -0.115

The above analysis shows there isn’t much of a correlation between IQ and listening to rap music based on the sample, although the Spearman’s rho is closer to zero (indicating independent samples) than the Pearson’s.

**Observation**: For sufficiently high values of (some say *n* > 10), Theorem 1 and 2 of One Sample Hypothesis Testing for Correlation can now be used for hypothesis testing using the Spearman’s rank coefficient instead of the correlation coefficient. For lower values, the table of critical values found in Spearman’s Rho Table can be used

When conducting an analysis, if you discover the presence of outliers (e.g. via a histogram or scatter diagram), proceed as follows:

Calculate the Pearson’s correlation coefficient for the sample with and without the outliers. If there isn’t much difference, then you can be pretty confident that the outliers are not influencing the results. You can also calculate the Spearman’s rank coefficient. If this is pretty similar to the Pearson’s correlation coefficient, this is also a good indicator that the outliers are not substantially influencing the results.

If there are clear differences then you will need to be cautious about how you treat the outliers.

Similarly if you test the sample data for the *x* and y variables and see that either one of them is not roughly normal (using the techniques described in Testing for Normality and Symmetry), then you will need to use the Spearman’s coefficient rather than Pearson’s.

Figure 2 displays a scatter diagram for the data in Example 1 We see that the data is pretty randomly scattered although there is a potential outlier where the rap music listening spikes to 45. This gives some evidence that the Spearman’s rho might be a better choice.

**Observation**: When there are no ties in the ranking, there is alternative way of calculating Spearman’s rho using the following property.

**Property 1**: When there are no ties, Spearman’s rho =

where *d _{i} = rank x_{i} – rank *y

_{i}.**Example 2**: Calculate Spearman’s rho for the data from Example 1 using Property 1.

**Figure 3 – Alternative way of calculating rho for Example 1**

Using Property 1 and the table in Figure 3:

**Example 3**: Repeat the analysis for Example 1 of One Sample Hypothesis Testing for Correlation using Spearman’s rho.

As for Example 1, Spearman’s rho is the correlation coefficient on the ranked data, namely CORREL(C5:C19,D5:D19) = -.674 (see Figure 4).

We now use the table in Spearman’s Rho Table to find the critical value for the two-tail test where *n *= 15 and *α* = .05. Interpolating between the values for *n* = 14 and 16, we get a critical value of .525. Since the absolute value of rho is larger than the critical value, we reject the null hypothesis that there is no correlation between cigarette smoking and longevity.

Since *n* = 15 ≥ 10, we can use a t-test instead of the table. By Theorem 1 of One Sample Hypothesis Testing for Correlation, we use the test statistic

Since |*t*| = 3.29 > 2.16 = *t _{crit} *= TINV(.05,13), we again conclude that there is a significant negative correlation between the number of cigarettes smoked and longevity.

**Observation**: To conduct a one-tail test, use the table in Spearman’s Rho Table with *α* divided by 2.

**Real Statistics Excel Function**: The following function is provided in the Real Statistics Resource Pack:

**RhoCRIT**(*n, α, t*) = the critical value of the Spearman’s rho test for samples of size *n*, for the given value of alpha, and *t* = 1 (one tail) or 2 (two tails).

Thank you so much. Life has become much easier with real-statistics.

Walid (MSc, PhD)

Hi ! I tried using the formula RhoCRIT to find a critical value for my data, but all it shows is #N/A .

I already installed the resources from the site following the step-by-step guide, so it should’ve worked.

My sample size is 275, alpha 0.05, 2 tailed. Can you use the formula to calculate it ? Thanks !

Hi Taufiq,

The RhoCRIT function is only valid for values of n between 4 and 30 (see http://www.real-statistics.com/statistics-tables/spearmans-rho-table/). For n = 275 you use the t-test as described in http://www.real-statistics.com/correlation/spearmans-rank-correlation/spearmans-rank-correlation-detailed/ (see example following Figure 4 on that webpage).

Charles

Thank you sir

Sir

In example 1, you wrote:”We now calculate both correlation coefficients as follows:

Pearson’s correlation = CORREL(A4:A13,B4:B13) = -0.036

Spearman’s rho = CORREL(C4:C13,D4:D13) = -0.115

The above analysis shows there isn’t much of a correlation between IQ and listening to rap music based on the sample, although the Spearman’s rho is closer to zero (indicating independent samples) than the Pearson’s.”

Spearman’s rho is closer to zero?

kindly give me the answer of this..How will i rank the same 4 scores example 1,2,3,4,5,5,5,5,6,7,8

Valen,

I don’t completely understand your question, but the rankings for the 11 numbers you listed are 1, 2, 3, 4, 6.5, 6.5, 6.5,6.5, 9, 10, 11.

Charles

Sir,

What is the different between Pearson’s correlation and Spearman’s rho ?

What’s the correlation we should use?

Thank

Vinh,

Pearson’s correlation is the usual correlation coefficient calculated on the raw data. Spearman’s rho is Pearson’s correlation calculated on the ranks of the raw data. Generally you should use Pearson’s correlation.

Charles

thanks for the explanation.

I think it would be a little clearer if you explained that even tho

its called “spearman’s rho”, if is often represented r with a subscript of s (to distinguish it from pearson’s corr. coeff)

Also the calculation of t shows “r” (implying pearson’s coef)

when i believe it should be “r sub s”

Kirk,

You are correct. I actually used r with a subscripted s in one place on the webpage but not in the others. I have now made this change. Thanks for your valuable suggestions.

Charles

Sir, thank You for such perfect resource!

Just one question. Sir, in case of significant amount of the connected ranks does the spearsman rho show still correct result? Starting from which number of connected rank groups the correction for spearsman rho coefficient should be applied? or it will be not applicable at all?

Kate,

By connected ranks do you mean tied ranks?

Charles

Yes, could you please explain how you would do this if there are tied ranks in one of the values. I know there is a formula, but I don’t understand how exactly you would substitute values into that formula.

Lily,

If there are no ties you can use Property 1 of the referenced page to calculate Spearman’s rho. The more ties, the less accurate the results.

Even if there are ties, however, you can use the approach shown in Example 3 of the referenced page (using RANK.AVG and CORREL). I believe this approach produces the correct result even if there are a lot of ties.

Charles

Kate,

If there are no ties you can use Property 1 to calculate Spearman’s rho. The more ties, the less accurate the result. In any case you can use the approach shown in Example 3 of the referenced page. I believe this approach produces the correct result even if there are a lot of ties.

Charles

Sir, the command for finding critical value of t statistic ie.RhoCrit do not work if n= 4:30. Please give suggetion, because it takes more time to get critical value in other way.

Lalit,

I just tried =RhoCRIT(4), =RhoCRIT(5), …, =RhoCRIT(30) and they work perfectly. They generate the values found in the table shown on the webpage http://www.real-statistics.com/statistics-tables/spearmans-rho-table/.

What version of the Real Statistics Resource Pack are you using? You can find this out by entering the formula =VER() in any cell.

Charles