Kendall’s Tau Correlation

Kendall’s tau correlation is another non-parametric correlation coefficient which is defined as follows.

Let x1, …, xn be a sample for random variable x and let y1, …, yn be a sample for random variable y of the same size n. There are C(n, 2) possible ways of selecting distinct pairs (xi, yi) and (xj, yj). For any such assignment of pairs, define each pair as concordant, discordant or neither as follows:

  • concordant if (xi > xj and yi > yj) or (xi < xj and yi < yj)
  • discordant if (xi > xj and yi < yj) or (xi < xj and yi > yj)
  • neither if xi = xj or yi = yj (i.e. ties are not counted).

Now let C = the number of concordant pairs and D = the number of discordant pairs. Then define tau as

Kendall's tau

We can use Kendall’s tau for hypothesis testing even when x and y are not binormally distributed and even when there are outliers. Click on the following topics for more details:

15 Responses to Kendall’s Tau Correlation

  1. Marc Grodent says:

    Dear Charles,

    Congratulations for your web site.

    I’m using Gaussian copulas to generate correlated multivariate data.
    In this context, the Kendall’s Tau is used to assess the correlation between the variables.
    It seems that the KCORREL function returns a “#VALUE!” when the sample size exceeds 621…

    Best regards,

    Marc

    • Charles says:

      Marc,
      I see the problem that you have described and will try to figure out what the problem is. I should have an answer shortly. Thanks for identifying this bug.
      Charles

    • Charles says:

      Marc,
      Thanks for identifying this error. It turns out there was an overflow error when calculating the correction for ties. I have now corrected this and will provide the new version of the KCORREL function in the next release of the software, which should be available in a few days.
      Charles

  2. Eny says:

    Oh… one more thing (I forgot), if we choose to do Spearman’s or Kendall’s, and our data sets are comprised of 2 variables, where Var 1 is ordinal data (Likert-like rank data: 1, 2, 3, 4 and 5) and Var 2 is interval data (measurement/estimation), should I convert first the interval data sets into ranked data set before I run the analysis?

    The total number of my data sets is 47 pairs.

    Thank you again,

    Eny

    • Charles says:

      Eny,
      For Spearman’s you need to convert the data sets into ranked data. For Kendall’s tau this is not necessary (at least if you follow the procedure outlined in the website), although you will get the same result if you do. These measures of correlation are especially useful when there are outliers. Often Kendall’s tau is preferred over Spearman’s rho because it has some additional statistical properties (e.g. there is a commonly accepted measure of standard error).
      Charles

  3. Eny says:

    Dear Dr Zaiontz,

    I ‘accidentally’ found your website yesterday when I was scouring the internet to get some guidance on correlation analyses (non-parametric).

    Could you please enlighten me: when should we use Spearman’s rho correlation, and when should we use Kendall’s tau correlation?

    From my searching, I noted that both are for non-parametric and non-normally distributed data, and both can deal with two ordinal data sets, or one ordinal data set and one interval data set.

    So, when we can use one or the other? What is the urgency of using one over the other?

    I downloaded your Real Statistics Add-Ins, however, I did not see any option for either Spearman’s, Kendall’s or even Pearson’s. I did see “T-Test and Non-Parametric Equivalent”, but these options did not give Spearman’s or Kendall’s results.

    How should I do Spearman’s or Kendall’s using your data pack?

    Thank you,

    Eny

    • Charles says:

      Dear Eny,

      Pearson’s can be calculated by the Excel function CORREL (or the equivalent function PEARSON).

      Spearman’s correlation for the data in ranges R1 and R2 can be calculated in Excel by the formula =CORREL(RANK.AVG(R1,R1,1),RANK.AVG(R2,R2,1)).

      I explain how to calculate Kendall’s tau on the referenced webpage, but there isn’t a simple standard Excel formula that I know of.

      In any case I plan to include functions for both Spearman’s rho and Kendall’s tau in the next release of the Real Statistics add-in, which is planned for early next week.

      Charles

      • Eny says:

        Hi Charles,

        I did write “=CORREL(RANK.AVG(R1,R1,1),RANK.AVG(R2,R2,1))” as per your suggestion and I got #N/A as the answer.

        My data are set in range N7:N53 for Var 1, and range O7:O53 for Var 2.

        I am a bit confused with the R syntax you mentioned, but I wrote:

        =CORREL(RANK.AVG(N7:N53,R1,1),RANK.AVG(O7:O53,R2,1)), and I got #N/A.

        I used Excel 2010.

        Thank you,
        Eny

    • Charles says:

      Eny,
      The latest release of the Real Statistics software contains the new functions SCORREL and KCORREL which calculate Spearman’s and Kendall’s coefficients. You can also use the new Correlation data analysis tool to get these results.
      Charles

      • Eny says:

        Hi Charles,

        Thanks a bunch for this. Do I need to to convert the data sets into ranked data prior to analyzing with “=SCORREL(N7:N53,O7:O53)” ?

        Thanks again,
        Eny

Leave a Reply

Your email address will not be published. Required fields are marked *