On this webpage we show how to use Spearman’s rank correlation for hypothesis testing. In particular, we show how to test whether there is a correlation between two random variables by testing whether or not Spearman’s rho = 0 (the null hypothesis).

For low values of rho, a table of critical values can be used (see Spearman’s Rho Table). For higher values (generally about *n* > 10), Theorem 1 of Correlation Testing via t Test and Theorem 1 of Correlation Testing via Fisher Transformation is applied using Spearman’s rho in place of Pearson’s correlation *r.*

In general, however, Kendall’s tau is often the preferred non-parametric approach since it has more desirable statistical properties.

**Example 1**: Repeat the analysis for Example 1 of Correlation Testing via t Test using Spearman’s rho, i.e. test whether Spearman’s rho is significantly different from zero based on the sample data in range B4:C18 of Figure 1.

**Figure 1 – Hypothesis testing of Spearman’s rho**

Spearman’s rho is the correlation coefficient on the ranked data, namely CORREL(D4:D18,E4:E18) = -.674. Alternatively it can be computed using the Real Statistics formula =SCORREL(D4:D18,E4:E18).

We now use the table in Spearman’s Rho Table to find the critical value of .521 for the two-tail test where = 15 and = .05. Since the absolute value of rho is larger than the critical value, we reject the null hypothesis that there is no correlation.

Since *n* = 15 ≥ 10, we can use a t-test instead of the table. By Theorem 1 of Correlation Testing via t Test, we use the test statistic

Since |*t*| = 3.29 > 2.16 = *t _{crit} *= TINV(.05,13), we again conclude that there is a significant negative correlation between the number of cigarettes smoked and longevity. The details of the analysis are shown in Figure 2.

For Excel 2007 users, replace the formula in cell H11 by TINV(H10,H7) and the formula in cell H12 by TDIST(ABS(H9),H7,2).

**Observation**: To conduct a one-tail test use the table in Spearman’s Rho Table with *α *multiplied by 2.

**Real Statistics Excel Function**: The following function is provided in the Real Statistics Resource Pack:

**RhoCRIT**(*n, α, tails, h*) = the critical value of the Spearman’s rho test for samples of size *n*, for the given value of *alpha* (default .05), and *tails* = 1 or 2 (default). If *h* = TRUE (default) harmonic interpolation is used; otherwise linear interpolation is used.

**SCORREL**(R1, R2, *lab, tails, alpha*): an array function which outputs a column range consisting of Spearman’s correlation coefficient (rho) and the t-stat and p-value which test the null hypothesis that rho = 0. If *lab* = TRUE then a column of labels are added to the output (default is FALSE). *tails* = 1 or 2 (default),* alpha* = significant level (default .05)

For Example 1, RhoCRIT(15, .05, 2) = .521, as described above.

For Example 1 of Spearman’s Correlation, SCORREL(B4:B18,C4:C18,TRUE) returns the output shown in Figure 2.

**Figure 2 – Output from SCORREL function**

Hi Charle,

Really you’re doing a great work.

I would participate and suggest a formula to get the exact rank.

In cell D4 the formula is:

=((COUNTIF(B$4:B$100,”=” & B4)*(COUNTIF(B$4:B$100,”<" & B4)+1))+((COUNTIF(B$4:B$100,"=" & B4)))*(COUNTIF(B$4:B$100,"=" & B4)-1)/2)/COUNTIF(B$4:B$100,"=" & B4)

In cell E4 same formula just replace column B with C

Pls try it and let me know your view point.

Thanks

Dr. Buenas noches, escuseme, como puedo determinar la correlación de Spearman con tres o más variables?.

Dr Good evening, excuse me ¿how can I do a Spearman Correlation with three or more variables?

Gerardo,

To calculate the Pearson’s correlation for three or more variables you can use the Real Statistics RSquare(R1,j) function as described on the webpage

http://www.real-statistics.com/multiple-regression/multiple-correlation-advanced/

To do the same thing for the Spearman’s correlation, first you could use RANK.AVG (or RANK_AVG for Excel 2007 users) to get the ranks of the data elements (range R1) and then use the RSquare(R1,j) function.

Alternatively, you can perform multiple linear regression on the ranked data.

Charles

Dr thank you very much

The =SCORREL($L$22:$L$37;Q22:Q37;TRUE) just returns “roh”, but not all the values and fields. Am I using it wrong?

Anyway, Thank you for the Statistics Package.

Jan,

SCORREL is an array function which outputs multiple cells. You need to highlight a multiple cell range and press Ctrl-Shift-Enter. See the following webpage for more details:

Array functions and formulas

Charles

Thank you for your help. I didn’t know that function.

The knowledge shared is so much helpful. Keep it up.

As a Lecturer in the Statistics department I do appreciate.

By Narris Twesigye.

How does one interpret the spearman statistical test result if the statistical p-value is equal to the alpha value (i.e. statistical p=0.05 while alpha=0.05) although the spearman rho calculated (absolute) is found to be less than the tabulated (critical) spearman rho?

Mary,

The value of alpha = .05 is a somewhat arbitrary number, and so in any case, the best thing is to simply report the p-value without worrying too much as to whether it is a significant result or not. It is obviously significant for alpha of.051 and not significant for alpha of .025.

Regarding the critical value, what was your calculated Spearman’s rho value and what did the table say was the critical value? Given a p-value of .05, I presume that the calculated and critical rho values are almost the same. If not, probably something is wrong.

Charles

Charles,

(re-posting as the punctuation messed it up.)

Thanks for a great website – I am making my way through it. You make a complex subject easy to understand.

I am puzzled by this example of life expectancy and cigarettes in figure 2. As I understand it H0 is that they are uncorrelated. The alpha is 0.05 and the p-value of the spearman rho statistic is 0.005821. By comparing the t stat to the critical t you conclude that H0 should not be rejected as t stat is less than t crit (i.e. they are uncorrelated). Clearly they are correlated as a simple xy graph will show. Should you not be comparing H12 to H10 and concluding that it is significant and rejecting the null hypothesis. Elsewhere on your website for significance you talk about comparing abs(rho) to rho-crit in order to check for significance and clearly here abs(-0.674) greater than 0.521 so it is significant. Should you be comparing abs(t stat) to t-crit?

Or have I mis-understood something here?!

Thanks again for a great website.

In your comment you state that “By comparing the t stat to the critical t you conclude that H0 should not be rejected”.

I don’t see where this is stated. On the webpage I clearly state that the null hypothesis should be rejected.

Charles

Charles,

Sorry I wasn’t very clear. In figure 2 above in cell H13 the result is ‘no’ for significance.

Yes, you are correct. I have now replaced Figure 2 with one that contains the correct formula. Thanks for identifying this error.

Charles

Dear Sir

please tell me what should be the answer if values remain same. i mean if i take values 1,1,1,1,1 for variable A and 1,1,1,1,1 for variable B in spearman rho correlation.

It will be undefined since the variance of each sample is 0 and so the denominator of the formula used to compute the correlation coefficient will be zero.

Charles

Hi Charles. Thanks for making Real-Statistics available to all. Do you know of a way to use SCORREL when there are missing data? For example suppose we have want to calculate SCORREL(A5:A19,B5:B19,TRUE) but we delete the observation in cell B10. Is there an easy way to do this? I can revise to SCORREL((A5:A9,A11:A19),(B5:B9,B11:B19),TRUE) to do the calculation but I have 100s of correlations to calculate and there are random missing values scattered throughout my data set. Any thoughts would be appreciated. Thanks, Tim.

Tim,

Suppose your data is stored say in range A1:B100 and you are trying to calculate SCORREL(A1:A100,B1:B100). Highlight range D1:E100 and enter the array formula =DELROWBLANK(A1:B100) in the highlighted range and press Ctrl-Shft-Enter. The result will be the same data with any pairs with one or two empty cells removed. Now suppose 10 rows were removed (thus the range D91:E100 contains #N/A). The result you are now looking for can be calculated using the formula =SCORREL(D1:D90,E1:E90).

You can also use the Reformatting a Data Range by Rows data analysis tool to remove rows with empty cells.

The above assumes that missing data is equivalent to empty cells. If instead missing data is equivalent to some non-numeric value (e.g the word “missing”) then you would use the array formula DELROWNonNum instead.

In a future release I will automatically remove non-numeric entries from the data before calculating SCORREL.

Charles

Thank you Charles, but could you tell me how exactly can I use SCORREL function to correlate the questionnaire and the observation?

Veton,

The referenced webpage describes how to use the SCORREL function or the Correlation data analysis tool to show the correlation between between two paired data sets. If you need need further information, you need to explain further what additional information you need.

Charles

Hi! Could somebody please help on my Master`s thesis I am doing. I have an observation instrument and a questionnaire to compare and make a correlation between them based on some categories and points. I need Spearman`s correlation function to do this in Excel 2007. I would be so grateful if you could help me.

You can do this as explained on the referenced webpage. In particular, you can use the SCORREL function found in the Real Statistics Resource Pack. There is a version for Excel 2007 which you can download for free.

Charles

Excel freezes everytime i try using the =Scorrel function!! i really do need it, what should i do?? thanks!

Richard,

If you send me an Excel spreadsheet with your data, I will try to figure out what is going wrong. Send it to czaiontz@gmail.com. Also which version of Excel are you using?

Charles

Charles,

I’m experimenting with the Spearman Rho, using ranked data that contain ties. The examples given in text books use rank date with no ties. My interest is comparing the rho value using SCORREL and the conventional 1-sum(d^2)/n(n^2-1) formula.

Thanks

Dan

Dan,

I have not tried to make such a comparison, but would be interested to in hearing what you discover.

Charles

Dear Charles,

The clarity of your presentation and examples combined with the comprehensive cross-referencing of all relevant techniques make your site a real winner. After 25 years without touching stats it has been a tough week getting back into it. I have fought with the examples on many other sites but the simplicity of your presentation has allowed me to quickly overcome the hurdles I had faced elsewhere. Thank-you for your outstanding work,

Mart

i am now doing a proyect based on spearman and i have come to a problem, what should i do if i have only one sample instead, i have nothing to compare that sample with. i was wondering how could i calculate it, if its possible to do so.

thanks

Natalie,

Correlation coefficients such as Spearman’s only make sense if there are at least two data sets. Sorry, but I don’t know of any way to calculate this coefficient with only one sample.

Charles

Sir, the command for finding critical value of t statistic ie.RhoCrit do not work if n= 4:30. Please give suggetion, because it takes more time to get critical value in other way.

Lalit,

I just tried =RhoCRIT(4), =RhoCRIT(5), …, =RhoCRIT(30) and they work perfectly. They generate the values found in the table shown on the webpage http://www.real-statistics.com/statistics-tables/spearmans-rho-table/.

What version of the Real Statistics Resource Pack are you using? You can find this out by entering the formula =VER() in any cell.

Charles

Sir, thank You for such perfect resource!

Just one question. Sir, in case of significant amount of the connected ranks does the spearsman rho show still correct result? Starting from which number of connected rank groups the correction for spearsman rho coefficient should be applied? or it will be not applicable at all?

Kate,

By connected ranks do you mean tied ranks?

Charles

Yes, could you please explain how you would do this if there are tied ranks in one of the values. I know there is a formula, but I don’t understand how exactly you would substitute values into that formula.

Lily,

If there are no ties you can use Property 1 of the referenced page to calculate Spearman’s rho. The more ties, the less accurate the results.

Even if there are ties, however, you can use the approach shown in Example 3 of the referenced page (using RANK.AVG and CORREL). I believe this approach produces the correct result even if there are a lot of ties.

Charles

Kate,

If there are no ties you can use Property 1 to calculate Spearman’s rho. The more ties, the less accurate the result. In any case you can use the approach shown in Example 3 of the referenced page. I believe this approach produces the correct result even if there are a lot of ties.

Charles

thanks for the explanation.

I think it would be a little clearer if you explained that even tho

its called “spearman’s rho”, if is often represented r with a subscript of s (to distinguish it from pearson’s corr. coeff)

Also the calculation of t shows “r” (implying pearson’s coef)

when i believe it should be “r sub s”

Kirk,

You are correct. I actually used r with a subscripted s in one place on the webpage but not in the others. I have now made this change. Thanks for your valuable suggestions.

Charles

Sir,

What is the different between Pearson’s correlation and Spearman’s rho ?

What’s the correlation we should use?

Thank

Vinh,

Pearson’s correlation is the usual correlation coefficient calculated on the raw data. Spearman’s rho is Pearson’s correlation calculated on the ranks of the raw data. Generally you should use Pearson’s correlation.

Charles

kindly give me the answer of this..How will i rank the same 4 scores example 1,2,3,4,5,5,5,5,6,7,8

Valen,

I don’t completely understand your question, but the rankings for the 11 numbers you listed are 1, 2, 3, 4, 6.5, 6.5, 6.5,6.5, 9, 10, 11.

Charles

Sir

In example 1, you wrote:”We now calculate both correlation coefficients as follows:

Pearson’s correlation = CORREL(A4:A13,B4:B13) = -0.036

Spearman’s rho = CORREL(C4:C13,D4:D13) = -0.115

The above analysis shows there isn’t much of a correlation between IQ and listening to rap music based on the sample, although the Spearman’s rho is closer to zero (indicating independent samples) than the Pearson’s.”

Spearman’s rho is closer to zero?

Hi ! I tried using the formula RhoCRIT to find a critical value for my data, but all it shows is #N/A .

I already installed the resources from the site following the step-by-step guide, so it should’ve worked.

My sample size is 275, alpha 0.05, 2 tailed. Can you use the formula to calculate it ? Thanks !

Hi Taufiq,

The RhoCRIT function is only valid for values of n between 4 and 30 (see http://www.real-statistics.com/statistics-tables/spearmans-rho-table/). For n = 275 you use the t-test as described in http://www.real-statistics.com/correlation/spearmans-rank-correlation/spearmans-rank-correlation-detailed/ (see example following Figure 4 on that webpage).

Charles

Thank you sir

Thank you so much. Life has become much easier with real-statistics.

Walid (MSc, PhD)