Generalized Extreme Studentized Deviate Test

The Generalized Extreme Studentized Deviate (ESD) Test is a generalization of Grubbs’ Test and handles more than one outlier. All you need to do is provide an upper bound on the number of potential outliers.

We test the null hypothesis that the data has no outliers vs. the alternative hypothesis that there are at most k outliers (for some user specified value of k).

To test the data set S with n elements is we generate k test statistics G1, G2, …, Gk where each Gj is a two-tailed Grubbs’ statistic, defined as follows:

S1 = S

 j is the mean of Sj and sj is the standard deviation of Sj


Sj+1 = Sj − {xj} where xj = the element  in Sj such that |x− | is maximized

Essentially you run k separate Grubbs’ tests, testing whether Gj > Gj-crit  where Gj-crit is Gcrit as described above, but adjusted for the correct value of the sample size; i.e. n is replaced by − + 1. Now let r be the largest value of jk such that Gj > Gj-crit. Then we conclude there are r outliers, namely x1, …, xr. If r = 0 there are no outliers.

Note that if Gj > Gj-crit and h < j, then both xh and xj are outliers even if Gh ≤ Gh-crit.

Example 2: Identify all the outliers in the data set shown in range A5:B15 of Figure 1.

ESD (Grubbs') Test

Figure 1 – First trial of the ESD Test

Looking at the data set, we see five potential outliers: 3, 40, 350, 410 and 440. As we did in Grubbs’ Test we need to test for normality. In fact, if we were to run the Shapiro-Wilks test it would show that the data set without the five potential outliers is normally distributed. We therefore use the ESD Test with k = 5 (for five outliers); in fact, just to be sure we will set k = 6.

The Grubbs’ Test for the first outlier us shown on the right side of Figure 1. This is the two-tailed version of the test shown in Figure 2. We see that the minimum data value is 3 (cell E5) and the maximum value is 440 (cell E6). We see from cells E9 and E10 that the maximum value is further away from the mean than the minimum value, and so our first test is of 440 as an outlier.

The test is not significant, but as we shall see, this doesn’t necessarily mean that 440 is not an outlier. We now run the test five more times. The next two trials are shown in Figure 5.

ESD Test

Figure 2 – Trials 2 and 3 of the ESD Test

The data set for the second trial (range I5:J15) is the same as for the first trial, but with the data element 440 removed. The second trial shows that once again the maximum value (410) is further away from the mean than the minimum value (3). This means that our second trial is a test as to whether 410 is an outlier. Once again the test is not significant.

Removing 410, we get the data for the third trial as shown in range O5:P15. Once again the maximum value (350) is further away from the mean than the minimum value (3), but this time the test is significant, which means that 350 is an outlier. But this automatically classifies 440 and 410 as outliers too.

We summarize the results of all six trials in Figure 6.

Trial outlier G G-crit sig
1 440 2.497556 2.757735 no
2 410 2.729992 2.73378 no
3 350 2.714963 2.708246 yes
4 3 2.721414 2.680931 yes
5 40 2.83852 2.651599 yes
6 220 1.707766 2.619964 no

Figure 3 – ESD Test Summary

Figure 3 confirms that 3, 40, 350, 410 and 440 are outliers (220 is not an outlier).

Real Statistics Function: The Real Statistics Resource Pack provides the following array function to perform the ESD test.

ESD(R1, lab, alpha): outputs a 4 × k column range with the following entries: potential outlier, G, Gcrit and test significance in each column

If lab = TRUE (default FALSE) then the output is a 4 × (k+1) range with a column of labels added. alpha = the significance level (default .05). The potential outlier is either the maximum or minimum value in R1, depending on which is farthest away from the mean of R1. The test significance if “yes” if G > Gcrit and “no” otherwise.

If lab = FALSE then the value of  is set to the number of columns in the highlighted range, while if lab = TRUE then the value of  is set to the number of columns in the highlighted range minus 1 (the extra column contains the labels).

For Example 1, if you highlight the range AN23:AT26, enter the formula =ESD(A5:A15,TRUE) and press Ctrl-Shft-Enter, then the output that appears is displayed in Figure 4.

ESD Test Real Statistics

Figure 4 – Output from ESD formula

Since the highlighted range contains 7 columns and lab = TRUE, k = 6.

Real Statistics Function: The Real Statistics Resource Pack also provides the following simpler array function to perform the ESD test.

OUTLIERS(R1, alpha): outputs a column range with up to k outliers where k is the number of columns in the highlighted range. alpha defaults to .05.

For Example 1, if you highlight the range AV23:AV31, enter the supplemental formula =OUTLIERS(A4:A14) and press Ctrl-Shft-Enter, then the output that appears is displayed in Figure 5.

Outliers Grubbs' test Excel

Figure 5 – Output from OUTLIERS formula

Since the highlighted range contains 9 rows, k = 9. As you can see from Figure 5, even if we perform the ESD test with 9 trials, we still get the same five outliers.

17 Responses to Generalized Extreme Studentized Deviate Test

  1. Jesse says:

    Hi Chuck, Thank you for the clear definition and great example.
    In the text above Figure 1 for Example 2:
    “Identify all the outliers in the data set shown in range A18:B28 of Figure 1”
    Do you mean “in range A5:B15 of Figure 1” or am I missing something?

    • Charles says:

      Hi Jesse,
      Yes, you are correct. Thanks for finding this mistake. I have corrected the error on the webpage.
      I really appreciate your help in making the website better and easier to use.

  2. Sue says:

    What is the smallest number of values for which this statistical approach is valid? I tend to have only 5 or 6 data points, which ESD makes all outliers until I get down to 2 points which obviously stops working. Is there a method for such a small sample?

    • Charles says:

      Most statistical methods don’t really work that well with such small samples (5 or 6 data points), unless that values are really extreme. E.g. with data such as 2, 3, 6, 8, 400, it is pretty easy to see that 400 is an outlier, but not so clear with 2, 3, 6, 8, 15.

  3. Pingback: Part 2 – Anomaly Detection on Twitter Sentiment Stream following Microsoft SQL Server 2016 Announcement | Forefront Analytics

  4. Tom Paulson says:

    Should the ESD and OUTLIER functions give the same answer? I am finding that when I have a smaller set of data (n=14) sometimes OUTLIER returns all the values as being outliers, whereas running ESD gives what appears to be the real couple of outliers.

    • Charles says:

      They won’t necessarily give the same answers.

      • Tom Paulson says:

        Is there a way to determine which one would be more correct to use in a given situation?

        • Charles says:

          There isn’t even a clear definition of what is an outlier. Is it related to IQR units from the median or standard deviations from the mean? Is an outlier a naturally occurring thing: e.g. in a sample of 10 perhaps I should consider a data element that is 3 standard deviations from the mean to be an outlier, but should I do the same in a sample of 10,000? In general, I would use both tests to discover potential outliers and then evaluate each to determine whether or not I should do something special with that data element

  5. Rick says:

    Along the lines of the first comment, how is the value for k determined, or even estimated? If running repeated Shapiro-Wilk tests for normality, removing one value at a time, doesn’t this itself give you the number of outliers, without the need for the ESD?

    • Charles says:

      One way to estimate k is to plot the data.
      You don’t necessarily need to use ESD to find the outliers, but it can be helpful in doing so.

  6. Yvette says:

    Hi Charles , Thanks for the fantastic contents.
    But just one question: why not just do the test from the k+1 to 1?

    • Charles says:

      I am not completely sure that I understand what you are suggesting, but he problem is that you don’t really know the correct value of k to use.

  7. Alec Scott says:

    Excellent site, thanks.

    Just above Figure 4, you have “=ESD(A4:A14,TRUE)”. I think this should be: “=ESD(A5:B15,TRUE)” to be consistent with Example 2 on this page.

    • Charles says:

      Thanks for catching this mistake. I have just made your suggested change on the webpage.

  8. Bob Krile says:

    It appears you are doing a 2-sided test, since you are looking at both minimum and maximum deviation from the mean. As such, shouldn’t the cell E17 in Figure 1 be defined as TINV(E15/2,E16), rather than TINV(E15,E16)?

Leave a Reply to Sue Cancel reply

Your email address will not be published. Required fields are marked *