The **Generalized Extreme Studentized Deviate** (**ESD**) **Test **is a generalization of Grubbs’ Test and handles more than one outlier. All you need to do is provide an upper bound on the number of potential outliers.

We test the null hypothesis that the data has no outliers vs. the alternative hypothesis that there are at most *k* outliers (for some user specified value of *k*).

To test the data set *S* with *n* elements is we generate *k* test statistics *G*_{1}, *G*_{2}, …, *G _{k}* where each

*G*is a two-tailed Grubbs’ statistic, defined as follows:

_{j}*S*_{1} = *S*

*x̄** _{j}* is the mean of

*S*and

_{j}*s*is the standard deviation of

_{j}*S*

_{j}*S _{j+}*

_{1}=

*S*− {

_{j}*x*} where

_{j}*x*= the element in

_{j}*S*such that |

_{j}*x*−

_{j }*x̄*| is maximized

Essentially you run *k* separate Grubbs’ tests, testing whether *G _{j} > G_{j}-crit* where

*G*is

_{j}-crit*G*as described above, but adjusted for the correct value of the sample size; i.e.

_{crit}*n*is replaced by

*n*−

*j*+ 1. Now let

*r*be the largest value of

*j*≤

*k*such that

*G*>

_{j}*G*. Then we conclude there are

_{j}-crit*r*outliers, namely

*x*

_{1}, …,

*x*. If

_{r}*r*= 0 there are no outliers.

Note that if *G _{j} > G_{j}-crit* and

*h < j*, then both

*x*and

_{h}*x*are outliers even if

_{j}*G*.

_{h}≤ G_{h}-crit**Example 2**: Identify all the outliers in the data set shown in range A5:B15 of Figure 1.

**Figure 1 – First trial of the ESD Test**

Looking at the data set, we see five potential outliers: 3, 40, 350, 410 and 440. As we did in Grubbs’ Test we need to test for normality. In fact, if we were to run the Shapiro-Wilks test it would show that the data set without the five potential outliers is normally distributed. We therefore use the ESD Test with *k* = 5 (for five outliers); in fact, just to be sure we will set *k* = 6.

The Grubbs’ Test for the first outlier us shown on the right side of Figure 1. This is the two-tailed version of the test shown in Figure 2. We see that the minimum data value is 3 (cell E5) and the maximum value is 440 (cell E6). We see from cells E9 and E10 that the maximum value is further away from the mean than the minimum value, and so our first test is of 440 as an outlier.

The test is not significant, but as we shall see, this doesn’t necessarily mean that 440 is not an outlier. We now run the test five more times. The next two trials are shown in Figure 5.

**Figure 2 – Trials 2 and 3 of the ESD Test**

The data set for the second trial (range I5:J15) is the same as for the first trial, but with the data element 440 removed. The second trial shows that once again the maximum value (410) is further away from the mean than the minimum value (3). This means that our second trial is a test as to whether 410 is an outlier. Once again the test is not significant.

Removing 410, we get the data for the third trial as shown in range O5:P15. Once again the maximum value (350) is further away from the mean than the minimum value (3), but this time the test is significant, which means that 350 is an outlier. But this automatically classifies 440 and 410 as outliers too.

We summarize the results of all six trials in Figure 6.

Trial | outlier | G | G-crit | sig |

1 | 440 | 2.497556 | 2.757735 | no |

2 | 410 | 2.729992 | 2.73378 | no |

3 | 350 | 2.714963 | 2.708246 | yes |

4 | 3 | 2.721414 | 2.680931 | yes |

5 | 40 | 2.83852 | 2.651599 | yes |

6 | 220 | 1.707766 | 2.619964 | no |

**Figure 3 – ESD Test Summary**

Figure 3 confirms that 3, 40, 350, 410 and 440 are outliers (220 is not an outlier).

**Real Statistics Function**: The Real Statistics Resource Pack provides the following array function to perform the ESD test.

**ESD**(R1, *lab, alpha*): outputs a 4 × *k* column range with the following entries: potential outlier, *G*, *G _{crit}* and test significance in each column

If *lab* = TRUE (default FALSE) then the output is a 4 × (*k*+1) range with a column of labels added. *alpha* = the significance level (default .05). The potential outlier is either the maximum or minimum value in R1, depending on which is farthest away from the mean of R1. The test significance if “yes” if *G* > *G _{crit}* and “no” otherwise.

If *lab* = FALSE then the value of is set to the number of columns in the highlighted range, while if *lab* = TRUE then the value of is set to the number of columns in the highlighted range minus 1 (the extra column contains the labels).

For Example 1, if you highlight the range AN23:AT26, enter the formula =ESD(A5:A15,TRUE) and press **Ctrl-Shft-Enter**, then the output that appears is displayed in Figure 4.

**Figure 4 – Output from ESD formula**

Since the highlighted range contains 7 columns and *lab* = TRUE, *k* = 6.

**Real Statistics Function**: The Real Statistics Resource Pack also provides the following simpler array function to perform the ESD test.

**OUTLIERS**(R1, *alpha*): outputs a column range with up to *k* outliers where *k* is the number of columns in the highlighted range. *alpha* defaults to .05.

For Example 1, if you highlight the range AV23:AV31, enter the supplemental formula =OUTLIERS(A4:A14) and press **Ctrl-Shft-Enter**, then the output that appears is displayed in Figure 5.

**Figure 5 – Output from OUTLIERS formula**

Since the highlighted range contains 9 rows, *k* = 9. As you can see from Figure 5, even if we perform the ESD test with 9 trials, we still get the same five outliers.

It appears you are doing a 2-sided test, since you are looking at both minimum and maximum deviation from the mean. As such, shouldn’t the cell E17 in Figure 1 be defined as TINV(E15/2,E16), rather than TINV(E15,E16)?

Bob,

In Excel TINV is already the two-tailed inverse.

Charles

Excellent site, thanks.

Just above Figure 4, you have “=ESD(A4:A14,TRUE)”. I think this should be: “=ESD(A5:B15,TRUE)” to be consistent with Example 2 on this page.

Alec,

Thanks for catching this mistake. I have just made your suggested change on the webpage.

Charles

Hi Charles , Thanks for the fantastic contents.

But just one question: why not just do the test from the k+1 to 1?

Yvette,

I am not completely sure that I understand what you are suggesting, but he problem is that you don’t really know the correct value of k to use.

Charles

Along the lines of the first comment, how is the value for k determined, or even estimated? If running repeated Shapiro-Wilk tests for normality, removing one value at a time, doesn’t this itself give you the number of outliers, without the need for the ESD?

Rick,

One way to estimate k is to plot the data.

You don’t necessarily need to use ESD to find the outliers, but it can be helpful in doing so.

Charles

Should the ESD and OUTLIER functions give the same answer? I am finding that when I have a smaller set of data (n=14) sometimes OUTLIER returns all the values as being outliers, whereas running ESD gives what appears to be the real couple of outliers.

Tom,

They won’t necessarily give the same answers.

Charles

Is there a way to determine which one would be more correct to use in a given situation?

There isn’t even a clear definition of what is an outlier. Is it related to IQR units from the median or standard deviations from the mean? Is an outlier a naturally occurring thing: e.g. in a sample of 10 perhaps I should consider a data element that is 3 standard deviations from the mean to be an outlier, but should I do the same in a sample of 10,000? In general, I would use both tests to discover potential outliers and then evaluate each to determine whether or not I should do something special with that data element

Charles

Pingback: Part 2 – Anomaly Detection on Twitter Sentiment Stream following Microsoft SQL Server 2016 Announcement | Forefront Analytics

What is the smallest number of values for which this statistical approach is valid? I tend to have only 5 or 6 data points, which ESD makes all outliers until I get down to 2 points which obviously stops working. Is there a method for such a small sample?

Sue,

Most statistical methods don’t really work that well with such small samples (5 or 6 data points), unless that values are really extreme. E.g. with data such as 2, 3, 6, 8, 400, it is pretty easy to see that 400 is an outlier, but not so clear with 2, 3, 6, 8, 15.

Charles

Hi Chuck, Thank you for the clear definition and great example.

In the text above Figure 1 for Example 2:

“Identify all the outliers in the data set shown in range A18:B28 of Figure 1”

Do you mean “in range A5:B15 of Figure 1” or am I missing something?

Hi Jesse,

Yes, you are correct. Thanks for finding this mistake. I have corrected the error on the webpage.

I really appreciate your help in making the website better and easier to use.

Charles