A number of unplanned comparisons are available. A few of the commonly used post-hoc tests (e.g. Fisher’s LSD, Student Newman-Keuls (SNK) and Tukey’s B) are not very accurate and usually should not be used. The key issue is to correct for experiment-wise error.

Although the Bonferroni and Dunn/Sidák correction factors can be used, since we are considering unplanned tests, we must assume that all pairwise tests will be made (or at least taken into account). For *k* groups, this results in *m* = *C*(*k*, 2) = *k*(*k*–1)/2 tests. For an experiment-wise error of α we need to use *α*/*m* as the alpha for each test (Bonferroni) or 1 – (1 – *α*)^{1/m} (Dunn/ Sidák). This makes these tests too conservative.

More useful tests are Tukey’s HSD and REGWQ. These tests are designed only for pairwise comparisons (i.e. no complex contrasts). We also describe extensions to Tukey’s HSD test (Tukey-Kramer and Games and Howell) where the sample sizes or variances are unequal. We also describe the Scheffé test, which can be used for non-pairwise comparisons.

We discuss these tests on this webpage. Where the variances are unequal we can also use the Brown-Forsythe F* Test.

General guidelines are:

- Tukey’s test is usually the safe choice. It is a good choice for comparing large numbers of means
- REGWQ test is even better (i.e. has more power) for comparing all pairs of means, but should not be used when group sizes are different
- Hochberg’s GT2 (not reviewed here) is best when sample sizes are very different
- Games-Howell is useful when uncertain about whether population variances are equivalent.

**Tukey’s HSD (Honestly Significant Difference)**

The idea behind this test is to focus on the largest value of the difference between two group means. The relevant statistic is

and *n* = the size of each of the group samples. The statistic *q* has a distribution called the studentized range *q* (see Studentized Range Distribution). The critical values for this distribution are presented in the Studentized Range q Table based on the values of *α*, *k* (the number of groups) and *df _{W}*. If

*q > q*then the two means are significantly different.

_{crit}This test is equivalent to

Picking the largest pairwise difference in means allows us to control the experiment-wise for all possible pairwise contrasts; in fact, Tukey’s HSD keeps experiment-wise *α* = .05 for the largest pairwise contrast, and is conservative for all other comparisons.

Note that the statistic *q* is related to the usual *t* statistic by *q* = *t*. Thus we can use the following *t* statistic

The critical value for *t* is now given by *t _{crit} = q_{crit}* /. If

*t > t*then we reject the null hypothesis that H

_{crit}_{0}:

*μ*, and similarly for other pairs.

_{max}= μ_{min}As described above, to control type I error, we can’t simply use the usual critical value for the distribution, but instead use a critical value based on the largest difference of the means.

From these observations we can calculate confidence intervals in the usual way:

or equivalently

**Example 1**: Analyze the data from Example 3 of Planned Comparisons using Tukey’s HSD test to compare the population means of women taking the drug and the control group taking the placebo.

Using the Studentized Range q Table with *α* = .05, *k* = 4 and *df** _{W}* = 44, we get

*q*= 3.7775. Note that since there is no table entry for

_{crit}*df*= 44, we need to interpolate between the entries for

*df*= 40 and

*df*= 48. Alternatively, we can employ Excel’s table lookup capabilities. We can also use the Real Statistics function QCRIT(4,44,.05,2,FALSE), as described below, to get the same result of 3.7775.

The critical value for differences in means is

Since the difference between the means for women taking the drug and women in the control group is 5.83 – 3.83 = 1.75 and 1.75 is smaller than 1.8046, we conclude that the difference is not significant (just barely). The following table shows the same comparisons for all pairs of variables:

**Figure 1 – Pairwise tests using Tukey’s HSD for Example 1**

From Figure 1 we see that the only significant difference in means is between women taking the drug and men in the control group (i.e. the pair with largest difference in means). We can also use the t-statistic to calculate the 95% confidence interval as described above. In Figure 2 we compute the confidence interval for the comparison requested in the example as well as for the variables with maximum difference.

**Figure 2 – Tukey HSD confidence intervals for Example 1**

**Real Statistics Function**: The following function is provided in the Real Statistics Resource Pack:

**QCRIT**(*k, df, α, tails, h*) = the critical value of the Studentized range *q* for *k* independent variables, the given degrees of freedom and value of alpha, and *tails* = 1 (one tail) or 2 (two tails, default). If *h* = TRUE (default) harmonic interpolation is used; otherwise linear interpolation is used.

This function is based on the table of critical values provided in Studentized Range q Table. Note too that in the previous example we found that QCRIT(4,44,.05,2,FALSE) = 3.7775 using linear interpolation (between the table values of *df* = 40 and *df* = 48). If harmonic interpolation were used (see Interpolation) then we would have obtained the value QCRIT(4,44) = 3.7763.

The Real Statistics Resource Pack also provides the following functions which provide estimates for the Studentized range distribution and its inverse based on a somewhat complicated algorithm.

**QDIST**(*q, k, df*) = the value of the Studentized range distribution at *q* for* k* independent variables and *df* degrees of freedom.

**QINV**(*p, k, df, tails*) = the inverse of the Studentized range distribution at *p* for *k* independent variables, *df* degrees of freedom and *tails* = 1 or 2 (default 2).

**Observation**: Note that the values calculated by QCRIT and QINV will be similar, at least within the range of alpha values in the table of critical values. E.g. QINV(.015,4,18,2) = 4.82444 while QCRIT(4,18,.015,2) = 4.75289.

Note that QDIST outputs a two-tailed value. E.g. QDIST(4.82444,4,18) = 0.15. To get the usual cdf value for the Studentized range distribution, you need to divide the result from QDIST by 2, which for this example is .0075, as confirmed by the fact that QINV(.0075,4,18,1) = 4.82444.

Finally note that the algorithm used to calculate QINV (and QDIST) is pretty accurate except at low values of *p* and *df*. In particular, for *df* = 1 and certainly when *p* ≤ .025, QCRIT will be more accurate than QINV (at least for those values found in the table of critical values). This is also true when *df* = 2 and *p* ≤ .01 or when *df* = 3 and *p* = .001.

**Real Statistics Data Analysis Tool**: The Real Statistics Resource Pack contains a Tukey’s HSD Test data analysis tool which produces output very similar to that shown in Figure 2.

For example, to produce the first test in Figure 2, follow the following steps: Enter **Ctrl-m** and select the **Analysis of Variance** data analysis tool from the list. On the dialog that appears, select the **Single Factor Anova** option. A dialog box similar to that shown in Figure 1 of Confidence Interval for ANOVA appears. Enter A3:D15 in the **Input Range**, check **Column headings included with data**, select the **Tukey** **HSD** option and click on the **OK** button.

The report shown in Figure 3 now appears. We see that only MC-WD is significant, although WC-WD is close.

**Figure 3 – Real Statistics Tukey HSD data analysis**

**Tukey-Kramer Test**

When sample sizes are unequal, the Tukey test can be modified by replacing by in the above formulas. In particular, the standard error for the *q* statistic becomes

Note that the Real Statistics Tukey HSD data analysis tool described above actually performs the Tukey-Kramer Test when the sample sizes are unequal.

**Example 2**: Analyze the data in range A3:D15 of Figure 4 using the Tukey-Kramer test to compare the population means of women taking the drug and the control group taking the placebo. This example is the same as Example 1 but with some data missing, and so there are unequal sample sizes.

**Figure 4 – Data and ANOVA for Example 2**

Enter **Ctrl-m** and select **Single Factor Anova and Follow-up Tests** from the menu. A dialog box similar to that shown in Figure 1 of Confidence Interval for ANOVA appears. Enter A3:D15 in the **Input Range**, check **Column headings included with data**, select the **Tukey** **HSD** option and click on **OK**.

The output is shown in Figure 5.

**Figure 5 – Tukey-Kramer Data Analysis**

**Games and Howell Test**

A better alternative to Tukey-Kramer when variances are unequal is Games and Howell. Here the standard error becomes

Thus we use different pooled variances for each pair instead of the same pooled variance *MS _{W}*. We employ the following test

where the standard error is as above and *q′ _{crit} *is the critical value of the Studentized range statistic but with the degrees of freedom given by

*df′*as defined in Two Sample t-Test with Unequal Variances, namely

In this way we also take care of the case where the variances are unequal in exactly the same manner as in Theorem 1 of Two Sample t-Test with Unequal Variances, except that we now use the q-statistic instead of the t-statistic. Note that the supplemental function **DF_POOLED** can be used to calculate *df′*.

**Example 3**: Repeat Example 2 using the Games-Howell test.

We repeat the same steps as we used in Example 2 except that we choose the Games-Howell option. The output of the test is shown in Figure 6.

**Figure 6 – Games-Howell Data Analysis**

**REGWQ Test**

The Ryan, Einot, Gabriel, Welsh Studentized Range Q (**REGWQ**) test uses what is known as a step-down approach. No confidence intervals are calculated.

First, the equality of all of the means is tested at the *α _{k}* level. If the null hypothesis of equality of means is rejected, then each subset of

*k*– 1 means is tested at the

*α*level; otherwise, the procedure stops.

_{k-1}In general, if the hypothesis of equality of a set of *p* means is rejected at the *α _{p}* level, then each subset of

*p*– 1 means is tested at the

*α*level; otherwise, the set of

_{p-1}*p*means is considered not to differ significantly and none of its subsets is tested. One continues in this manner until no subsets remain to be tested.

First we arrange the sample means in descending order *x̄ _{1}* ≥

*x̄*≥ … ≥

_{2}*x̄*. At each stage

_{k}*p*, we define

*α*as follows:

_{p}Now, the equality of the means *x̄ _{i}*, …,

*x̄*where

_{j}*p = j – i + 1*is rejected when

as for Tukey’s HSD test, except that the *α _{p} *are defined as above. The only problem with this approach is that

*q*values are required for values of

_{crit}*α*are usually not found in the Studentized Range q Table. To partially address this issue we provide a second table of values for

*q*(also in Studentized Range q Table) for values of

*α*= .025, .005 and .001. The best we can do for values not in the table is to interpolate.

**Example 4**: Apply the REGWQ to the data in Example 1 of Confidence Interval for ANOVA.

**Figure 7 – REGWQ for Example 4**

There are three tables in Figure 7. The table in the upper left hand side of the figure consists of the variables sorted from highest to lowest mean. We will refer to the index (i.e. rank) of these variables, where *i* = 1 refers to the variable with highest mean (i.e. women who take the drug) and *i* = 4 refers to the variable with lowest mean (i.e. men in the control).

The table in the upper right side of the figure is used to determine the critical values of mean differences. These are based on the possible values of *p*, namely 2, 3 and 4. For each *p*, first we determine the required values of *α _{p}* (range R8:T8) as described above. For each

*p*we then find the

*q*values from the Studentized Range q Table (interpolating as necessary – via rows 10 and 11 in Figure 4) corresponding to

_{crit}*p*,

*df*= 44 and

*α*(range R12:T12). We multiply these values by

_{p}to get the critical values for the mean differences corresponding to *p* = 2, 3 and 4 (range R13:T13).

The final table iterates all possible pairs of variables (given by the indices* i* and *j*, which point to the entries in the first table). For each pair the value of *p* is calculated as *i + j *– 1 (row 17). The difference of the means corresponding to indices *i* and *j* is calculated in row 20. Each mean difference is then compared with the critical value calculated in the second table for the value of *p* in that column. The pairs whose mean difference is larger than the critical value are significant.

For our example, the only mean differences in the first two columns are significant. This means that there is a significant difference between women who take the drug and men in the control (first column) and there is also a significant difference between women who take the drug and women in the control.

**Dunnett’s Test**

This test is used when we want to compare one group (usually the control treatment) with the other groups. In this case, Dunnett’s test is more powerful than the other tests described on this webpage.

The test is similar to Tukey’s HSD, except that instead of testing

we test whether

where *n* = size of the group samples, *x̄*o = mean of the control group, *x̄ _{j}* is the mean of any other group and

*t*is the (two-tailed) Dunnett’s critical value given in the Dunnett’s Table. The table contains the values

_{d}*t*(

_{d}*k, df*) where

_{W}, α*k*= the number of groups (treatments) including the control.

The test can also be used when the group means are not equal via

although, since Dunnett’s Table is based on equal group sizes, the above formula is only accurate if the group sizes are not too different.

**Real Statistics Function**: The following function is provided in the Real Statistics Resource Pack:

**DCRIT**(*k, df, α, h*) = the critical value *t _{d}* for

*k*independent variables, the given degrees of freedom and value of alpha. If

*h*= TRUE (default) harmonic interpolation is used; otherwise linear interpolation is used.

**Real Statistics Data Analysis Tool**: We now show how to use the Dunnett’s Test data analysis tool to address Example 5.

**Example 5**: Assuming that Method 1 is the control group in Example 1, compare the mean of this method with the means of the other methods using Dunnett’s test. The data is repeated in Figure 8.

Enter **Ctrl-m**, select **Analysis of Variance** and press the **OK** button. Next select **Single Factor Anova **from the dialog box that appears. A dialog box similar to that shown in Figure 1 of Confidence Interval for ANOVA appears. Enter A3:D11 in the **Input Range**, check **Column headings included with data**, select the **Dunnett’s Test** option and click on the **OK** button.

The output is similar to that shown in Figure 8, although initially the contrast coefficients in range G4:G7 are blank. You need to enter a +1 and -1 in two of these cells, one of which must correspond to the control group (Method 1 in this example). The output for the comparison of Methods 1 and 3 is shown in Figure 8.

**Figure 8 – Dunnett’s Test**

We see that there is a significant difference between Method 1 and 3 (cell L11). We can of course compare Method 1 with Method 2 or Method 4 by simply changing where the value for the +1 contrast is placed.

Note that cell I11 contains the formula =DCRIT(COUNT(I4:I7),H11,L2). Thus the critical value in Dunnett’s Table when *k* = 4, *df* = 28 and *α* = .05 is 2.483.

**Scheffé Test**

To carry out Scheffé’s test follow the following steps:

- Calculate the planned comparison t-test
- Square the t-statistic to get
*F*(since*F = t*^{2}) - Find the critical value of
*F*with*df*degrees of freedom for given value of_{B}, df_{W}*α*and multiply it by*df*. Thus the critical value is_{B}*df** FINV(_{B}*α, df*)._{B}, df_{W} - If
*F*> the critical value then reject null hypothesis

Since Scheffé’s test is very conservative, it is not recommended for pairwise comparisons where better tests are available, but it can be useful for more complicated comparisons where the other unplanned tests don’t apply.

**Real Statistics Data Analysis Tool**: Scheffé’s test can be carried out using the **Single Factor Anova and Follow-up Tests **data analysis tool provided by the Real Statistics Resource Pack. We show how this is done in the following example.

**Example 6**: Carry out the test in Example 2 from Planned Comparisons for ANOVA using Scheffé’s test.

The steps are very similar to those used for Tukey’s HSD, as described above, namely enter **Ctrl-m** and select **Single Factor Anova and Follow-up Tests** from the menu. A dialog box similar to that shown in Figure 1 of Confidence Interval for ANOVA will appear. Enter I21:L29 in the **Input Range**, check **Column headings included with data**, select the **Scheffe** option and click on **OK**.

A report similar to that shown in Figure 9 will appear but with no numbers in the shaded range O23:O26. You must now enter the contrast values. Setting the contrast coefficients for Method 1 and 2 to -0.5 and the contrast coefficient for Method 4 to 1, we get the output shown in Figure 5.

**Figure 9 – Scheffé data analysis for Example 6**

This figure shows there is no significant difference between Method 4 and the average of Methods 1 and 2.

** **

Hello,

Great website! Doing these things manually in Excel really does allow for much more flexibility. I have some questions about the Tukey/Tukey-Kramer tests.

First, a conceptual question: my interpretation of the Tukey test is that one computes q-values for each pair of means compared, using the mean values, MS_W, and n. This is compared to a critical q-value that depends only on df_W, m, and alpha.

Where I’m confused is how the test “focuses on the largest value of the difference between two group means.” I see only individual paired comparisons made.

Second, under the Tukey-Kramer test, the s.e. for the q statistic is modified as you note. However, instead replacing (1/n) with 2 * (1/n_i + 1/n_j), should it not be 0.5 * (1/n_i + 1/n_j)? For example, in the trivial case of comparing n_i = n_j = 1, multiplying by 0.5 backs out the original Tukey equation with n = 1. I checked an Excel dataset using Prism, and my q-values check out with the latter.

Thanks!

Bowen,

Thanks for your excellent questions. Let me give you a quick response, although I plan to look at your questions more carefully and provide a more complete response shortly-

The s.e. provided is not for the q statistic, but for the t-statistic that corresponds to the q-statistic (i.e the q-value divided by the square root of 2.). Having said this I can see how this might be confusing. I will either explain things better on the website or change the data analysis tool to provide the s.e. of the q statistic. The resulting test and confidence interval should be the same, however. The q statistic is based on the largest difference of mean values. Note that no matter which pair of means you choose the q value does not change.

Charles

Dear Charles

Thank you so much for this tool that is helping lot of persons to solve our statistical analysis. Really appreciate it.

I would like to know about Tukey’s HSD Test, but please you should know that I’m a beginner in these kind of study. So my question is:

How do I set the contrast values that should I enter in the shaded range G5:G8 and you explained (Since Tukey’s HSD Test is a pairwise comparison you should only use two contrasts: one with 1 and the other with -1.) ?

I understood is that I should only choice two groups to compare and write (1) in the group with the higher value of mean and (-1) in the group with lower value of mean, between the two groups that I want to compare. For example in the same Figure 3, If I want to compare MD with MC, should I write in shaded range G5:G8 MD with (1) and MC with (-1)

Please, I hope you can help me, and apologize for my question and take part of your time….if you have some information that I should read about it, please feel free to advice me.

Thanks once more and have great rest of the week,

ED

Dear Ed,

You have understood things very well. Based on the real-world analysis that you want to accomplish you need to choose the two groups you want to compare. Once you do this you assign a contrast value of +1 to one and -1 to the other exactly as you described.

If you do other followup tests (e.g. the Contrasts option of the Single Factor Anova data analysis tool) then you can assign other values to the contrast besides 1 and -1, but for Tukey’ HSD 1 and -1 are the only choices.

Charles

Dear Charles,

I have been really busy with my experiments and classes, but now coming back to carry out my stats analysis from data obtained and using your “”AMAZING” tools,

Thank you a lot,

Best regard

ED

Charles

Another outstanding effort with Release 2.11.

Perhaps some words about the new q-distribution functions you recently added might be included here? And, an observation: it doesn’t seem that all of the new functions are yet included in the tools inventory listings.

Regards, Rich

Rich,

Glad you like the new release. I have already started updating the website to explain the new capabilities. I hope to have this completed this weekend.

Charles

Ok. Here is my question again, but better asked (I hope). Yes, Tukey HSD tells me whether there is a significant difference in a pair of means. Now I have multiple yes/no answers. How can I now group them according to their differences? Example: group A (treatments 1 and 2) is significantly different from group B (treatments 3 and 4, 5, and 6). How can I tell if there is a group C (treatments 5 and 6) and/or more groups of statistically different treatments just by looking at the yes/no answers? My goal is to make a table with the means of the treatments from largest to smallest in a row in excel. I would like to underline the treatments that are within the same group. How can I do this?

I just figured it out! I wasn’t showing any differences because all of my contrast values were 1 instead of 1 and -1. Can you explain why we need to use these values, how we determine which we should use, and how to come up with the value?

Elizabeth,

I am pleased that you figured this out yourself. How you set the contrast values depends on what you want to test. What is important is that the sum of the contrast values is equal to 0 for any single test. Any variable which is not part of the analysis gets a value of 0.

For example, if you want to compare variable x2 with variable x4 just set the 2nd contrast value to 1 and the 4th to -1 (or vice versa). If you want to compare x1 with the average of the values of x2 and x3, just set the 1st contrast to 1 and the 2nd and 3rd to -.5.

Suppose x1 represents men who live in the US, x2 represents women who live in the US, x3 represents men who live in Canada and x4 represents women who live in Canada. If you want to compare men vs. women set x1 = x3 = .5 and x2 = x4 = -.5 (alternatively you could set x1 = x3 = -1 and x2 = x4 = +1 and get the same result). If instead you want to compare people in the US with Canadians set x1 = x2 = .5 and 3 = x4 = -.5.

I tend to use contrast values so that the positive contrasts add up to 1 and the negative contrasts add up to -1, but this isn’t essential.

Charles

Thank you so much for your quick response! I get it now 🙂

Is there any way I tell it to read replicates within a treatment across a row instead of down columns (as in the option to group by row or column in excel’s regular ANOVA analysis) without transposing the data (i.e. all my data has the dependent variables listed down the columns and replicate measurements across the same row)?

David,

Unfortunately the Real Statistics ANOVA data analysis tool currently only accepts two formats: the format that you referenced and the standard format (similar to that used by tools such as SPSS). The only solution I can think of is to transpose the data.

Charles

Hello,

Thank you for this excellent tool to further my statistic ability when using excel. I’m currently experiencing a small difficulty in performing my own analysis, however.

When I analyze my data set, the shaded grey column labeled “c” in the Tukey output is empty. Because of this, a lot of the remaining outputs cannot be calculated.

additionally, there is no column for individual “yes” or “no” on whether each group is different. Based on the ANOVA p value, there is significant difference, but I’m unable to discern which groups.

Thank you,

Addie

Addie,

You just need to choose which two groups you want to compare and put a +1 in the grey column corresponding to one of those groups and -1 in the grey column corresponding to the other. This is explained on the webpage http://www.real-statistics.com/one-way-analysis-of-variance-anova/unplanned-comparisons/.

Charles

Thank you! I missed that when I first read. And thank you for this tool pack. It will significantly help my work.

Charles:

There’s a little bug in Real Statistics 3.2 when using Single Factor Anova (first option of Analysis of Variance menu):

When I used 4 samples (different size: 19, 17, 15 and 18), I got an error in the Random Factor Table: It seems there’s an error calculating the estimate group variance. For my example, that value is negative. For that reason I obtained “#NUM!” in lower and upper bounds for the mean. Perhaps is necessary to use the ABS Excel function in the numerator for getting a positive number?

The values I use for my example are the following.

Thank you.

William.

G1 G2 G3 G4

78 81 83 47

45 59 32 90

77 58 33 37

2 48 84 15

45 43 35 28

46 98 68 5

30 16 56 84

96 6 21 36

44 99 2 2

31 18 31 4

72 37 86 49

28 6 39 36

29 2 17 2

68 67 53 65

51 6 24 77

15 50 79

99 84 86

73 94

78

William,

It is quite possible for the value to be negative, in which case the remaining values will be meaningless. I will explain this once I update the webpage. The value for df can also be less than one, in which case there will be an error since Excel’s chi-square function will round df down to zero.

Charles

Thank you for your answer, Charles. I didn’t know that value could be negative and df could be less than one.

Regards.

William.

William,

Obviously in these cases the estimates generated by the formulas are not particularly useful since a negative variance cannot exist and a df of less than 1 (even if not rounded off) will produce a huge confidence interval.

Charles

Thanks for the great website!

How would I apply the Tukey HSD test to a two factor ANOVA? I’ve successfully used your add-on to use the Tukey following a single way ANOVA, but am at a loss on how to get it to apply to the other ANOVAs.

Thank you!

John,

I haven’t yet addressed Tukey HSD for two factor ANOVA. You should be able to use the one-factor version for the main effects. The following is a reference explaining how to use Tukey HSD for interactions: http://ocw.mit.edu/courses/brain-and-cognitive-sciences/9-07-statistical-methods-in-brain-and-cognitive-science-spring-2004/lecture-notes/21_anova4.pdf. I hope this is helpful.

Charles

Hello,

Above it says:”Example 1: Analyze the data from Example 1 of Confidence Interval for ANOVA using Tukey’s HSD test to compare the population means of women taking the drug and the control group taking the placebo.”

However if I follow the link it says:”Example 1: Find the confidence intervals for each of the methods in Example 3 of Basic Concepts for ANOVA.”

Following that link it says:”Example 3: Repeat the analysis for Example 2 where the last participant in group 1 and the last two participants in group 4 leave the study before their reading tests were recorded.”

Example 2 on the same page says:”Example 2: A school district uses four different methods of teaching their students how to read and wants to find out if there is any significant difference between the reading scores achieved using the four methods. It creates a sample of 8 students for each of the four methods.

I cannot find any examples comparing women taking drugs vs. placebos or that has 44 degrees of freedom. Do you mean this page?

http://www.real-statistics.com/one-way-analysis-of-variance-anova/planned-comparisons/

David,

Yes, the example should refer to Example 3 on the Planned Comparisons webpage. I have just changed the link.

Thanks for finding this mistake.

Charles

Hi, thanks for all this useful help and resources. I have tried to match the tukey hsd results with minitab output and by formula from books. With the books i have almost the same std error and confidence intervals that real stats output. I dont understand quite well the c^2/n part .

With minitab output they use the t test version of tukey and the formulas for the family ratio error and standard error of diferences yield different results from the formulas used in your website for t tukey hsd. Minitab say they use 0,007 for individual error alpha if you go for 0,05 family error rate in the case of 10 comparisons. but it doesnt follow the 1-(1-alpha)^c family error formula or explain how they get that number.

This are the numbers Iam working with, this are from 3 plants workers ages. first row are labels not data.

1 2 3

29 32 25

27 33 24

30 31 24

27 34 25

28 30 26

Help!! :)) thank you very much.

Marcy,

I don’t use Minitab so that it is difficult for me to comment about what results they get. I have checked my results with another popular software product and the results for your data are exactly the same.

Regarding .007 for individual alpha for familywise alpha of .05 with 10 comparisons, this doesn’t seem to have anything to do with the data that you have included in your comment since you only have 3 groups.

Charles

Evening,Charles.

Thanks for your wonderful website.

It helps me a lot for my final project problem.

However, I have a question for you regarding how to perform a table like Figure 1.

Could you please tell me how to create the table in a great detail?

Thanks!

You can download the

Real Statistics Examples Part 1file that contains the worksheet in Figure 1 from the webpagehttp://www.real-statistics.com/free-download/real-statistics-examples-workbook/

Charles

Thanks for your help, sir!

But, I really get confused, when I found that there are many sheets to be checked.

if you don’t mind, would you please tell me in what sheet that I can find the result like figure 1?

Thank you very much

regards,

Dillah

Tukey 1

Thank you very much, Sir!

Thanks for this fantastic resource. I have a question about confidence intervals for differences between means when variances of groups are heterogenous – that is, cases where you’d be using Welch’s test and the Games and Howell test. Would that CI be calculated in the standard way from the standard error and t-statistic, but with the SE and degrees of freedom for t taken from the Games and Howell calculations? Or does the difference in variances introduce an additional wrinkle that needs to be considered?

Thanks again,

Alistair

The confidence intervals can be calculated in the usual way.

Charles

Thank you. I also had a question about a possible typo. In the introduction to this section, you write “Games-Howell is useful when uncertain about whether population variances are equivalent.”, but in the Games-Howell section, the first sentence reads “A better alternative to Tukey-Kramer when sample sizes are unequal is Games and Howell. ” I’m guessing the second of those is also supposed to say “variances” instead of “sample sizes,” but just wanted to double-check.

Alistair,

You are correct. I had corrected this on my offline copy of the site, but not on the website itself. Thanks very much for catching this error and helping to improve the website. Much appreciated.

I have now made the correction to the referenced webpage.

Charles

Charles,

I want to write my own Excel program for computing the critical q value in Tukey’s test.

Problem is that I cannot find the statistic q in the formulas provided by Excell.

Finding q-crit via q-crit = (2)exp0.5 * t-crit doesn’t seem to work to me since q=f(alpha, k, dfW) and t=f(alpha, dfW).

I went to your the Studentized Range q Table which gave of course al the values necessary to perform Tukey’s HSD test, but I feel this is not being as practical as having it formulated on the same Excel sheet together with Anova and ICC computations. Could you help me to find a solution? I realize that your anwer may very well be that working with the Real Statistics Data Analysis Tool would be a far better solution, but I want to do it my own way…

Thank you very much, you have always been a great support.

Erik

Erik,

The Real Statistics Resource Pack contains the QDIST function, which calculates the p-values for the Studentized Range function. It also contains QINV which calculates the critical values. These functions don’t use the Studentized Range q Table, but instead calculate the values of the distribution. The program to do this is not easy.

In any case, you can call these functions from within software that you write yourself. See the following webpage for details as to how to do this.

http://www.real-statistics.com/real-statistics-environment/calling-real-statistics-functions-in-vba/

Charles

Hi Charles,

I’ve run several one-way ANOVAs with Tukey’s HSD and consistently come up with the problem of the q statistic coming up with #DIV/0! in the spreadsheet. I’m pretty confident in my setup so I’m not sure why. Here’s some dummy data that gives this result, if needed:

all gut tentacle Neo

4 1 3 0

3 2 1 2

5 3 2 1

3 2 1 1

5 3 2 2

10 5 4 2

2 1 2 2

3 1 2 1

P value is significant and visually I can see where the significant differences are, but can’t seem to convince the formula to report the q-stat correctly.

Anna,

You need to fill in the c column in the output, by placing a +1 in any one of the highlighted cells and a -1 in another (these are the contrast coefficients). E.g. in your problem, if you place +1 next to Group 1 and -1 next to Group 3, you would compare Group 1 with Group 3. You cam compare any pair of groups without inflating the experimentwise error (which is the point of Tukey HSD).

Charles

Hi Sr:

How much can differ cuantitative, the sample size when we go to use dunnett´s test?

Felix,

The version of the test in the Real Statistics Resource pack assumes that all sizes of all the groups are equal. The test can be modified to handle unequal sample sizes. This is done in the same manner as Tukey-Krammer modifies the Tukey HSD test to handle unequal sample sizes.

Charles

Hello,

I’ve been trying to use both contrast and Tukey HSD for One way Anova.Then I got different result for both comparison analyses. Contrast give sig. result while Tukey HSD not on a pairwise comparison in my analysis.Can you enlighten me why this happen?

Fahmi,

It is not so uncommon to find that different tests for the same situation give conflicting results. Each test uses its own set of assumptions and has its own advantages and disadvantages.

First and foremost, you need to make sure that each test is suitable for the analysis that you are undertaking (and that the assumptions of that test are met). In your situation, Tukey HSD is used for multiple pairwise comparisons, while contrasts can be used for comparisons that are not pairwise. While you can use Tukey’s HSD to perform multiple pairwise comparisons, you cannot use it it to compare more than two variables at a time (although you can use contrasts for this).

Charles

Dear Charles,

As I’m a beginner in all of this, I’m a bit confused about the output of the Tukey HSD test. How can I know at which level of significance two groups are different? (based on the P-values, I want to add the appropriate ‘*, **, ***’ symbols to my charts. However, I can’t find the P-values associated with each pairwise comparison)

Kind regards

Francis,

You can calculate the p-value by using the QDIST function. E.g. for Figure 3, p-value = =QDIST(3.663207,4,44) = .060185.

Charles

Thanks. I really appreciate your help.

I am showing a significant between group difference via ANOVA and a significant difference between 2 of my 3 groups on post hoc Tukey HSD. However, I do not see a p-value in the generated output chart.

When I try the QDIST function mentioned above “=QDIST(q-stat,# of levels,df)”, the formula outputs “1”.

Can anybody help with this?

Austin,

If you are using one of the latest releases of Real Statistics, you should see a p-value in the output. This won’t be the case if you are using the Mac version.

If the QDIST function outputs 1, then p-value = 1 and you clealry don’t have a significant result.

Charles

Brilliant site, thank you! I typed in your data and ran the add-in function for this; however, the DCRIT function, namely =DCRIT(COUNT(I3:I6),H10,L1) in my case, returns #VALUE!. I cannot get the function to work with different parameters either. Note, other RealStats functions are working in Excel during this instance as well. Thoughts?

Dustin,

Glad you like the site.

The formula on the referenced webpage is =DCRIT(COUNT(I4:I7),H11,L2). Your formula should work provided you have shifted all the data one row up. If so, then I suggest that you send me an Excel file with your data so that I can see what went wrong.

Charles

I had the same problem.

Also with my data the function DCRIT results in an error. But I found out that it is because the size of my dataset (i.e. “k”) is bigger than 240 (it is 294) and so the formula doesn’t recognize it as a number of the Dunnett’s table and return an error. Eventually, I put manually the right number (i.e. the number of df lower, 240) and the formula works as it should 🙂

Anyway, it’s an awesome tools! Thank you very much!

Have a nice day

Davide,

Last week I also came across this error in the DCRIT function. It will be corrected in the next release of the software, which should be available in the next couple of days.

Charles

Dear Porfessor,

In other page I write a sugestions for write the p-value.

But I need calculate…

And I try follow what you write in this page and not obtain the values.

Example:

q-stat df

-18.74358529 4285

ah… the sample/gropups are 7.

I try use and not obtain any p-value. Say there are a error in formula.

It seems to me that my values are very big because if I rduce the df I obtain values.

Thanks

José

Good morning,

Since the tool for the Dunnett’s test shows only if there is significance or not (with “yes” or “no”), how can I know the value of the p-values in a Dunnett’s test? Is there a way to do that?

Thank you very much!

Kind regards

Davide

Davide,

The best I can do at present is make an estimate based on the Table of Critical Values for Dunnett’s Test as shown in

Dunnett’s Test Table

E.g. suppose k = 10 and df = 20. From the table I see that the critical values at .01, .05 and .10 are respectively 3.694, 2.946 and 2.600. Thus if the test statistic is 3.2, then I know that the p-value is somewhere between .01 and .05; a rough estimate is something like .03. I can make this more precise by using the DCRIT function. In fact, I see that DCRIT(10,20,.03) = 3.070667, which is lower than 3.2. Thus I need a value lower than .03 (but higher than .01). I try .02 next. Since DCRIT(10,20,.02) = 3.2265, which is higher than 3.2, I need a value higher than .02 (but lower than .03).

I can iterate this “divide and conqueror” approach to whatever degree of accuracy I desire. I settle for .0212 since DCRIT(10,20,.0212) = 3.200038. Note that this approach uses harmonic interpolation, as explained on the webpage

Interpolation

I had planned to create a Real Statistics function that would do all of this for you for the previous release, but I ran out of time. I will likely put it in the next release.

Charles

Dear Charles,

Your website really opens my mind! I tried analysing my data with Weich’s followed by the Games and Howell Test but there was no show-up of P-value. Did I miss something?Is there a formula like QDist to calculate the P-value in this test?

Regards,

Long

In the latest release of the Real Statistics software (Rel 4.11) the p-value is given for the Games-Howell test. In any case, you can use the QDIST function.

Charles

Dear Charles,

In figure 4, there is a description of a calculation.

I was trying to manually calculate the Description part myself, and couldn’t understand how did you calculate the Std Err (column L in the figure).

Shouldn’t it be s/sqrt(n)? i.e. for MD for example SQRT(2.87/11)=0.51 (you got 0.48)

Thanks,

Nadav

Nadav,

That too is a reasonable approach, and in fact I used to use that calculation. I recently changed the approach to use the common variance MSE found in cell I15, since this seemed to be the more commonly used approach. A case can be made for either approach.

Charles

Hi Charles,

Thank you for the answer, make sense.

You have great site with very comprehensive explanations and examples, thank you!

Nadav

Glad I could help.

Charles

Hi Charles,

I have just discovered your add-in and website and am pleasantly surprised that it’s so well detailed. I was playing around with the add-in and am confused by the output I see for the Tukey HSD after the one-way ANOVA. I don’t see the pairwise significance table shown in “Fig 1” or the table shown in “Fig 2” on the Unplanned Comparisons webpage. I only see the table shown in “Fig 3”. I guess another way of describing what I’m seeing (using your Worksheet example #2), is that I don’t see the Tukey’s HSD table, but only the Tukey HSD/Kramer table. Am I missing something here? Thanks.

Jamie,

In the current implementation (as shown in Figure 3), you need to make the pairwise comparisons one at a time. For whichever pairwise comparisons you desire, you place a 1 and a -1 in the shaded area corresponding to the two pairs you want to compare.

Charles

Thanks for the quick reply Charles. I see now how it works though I’m afraid I don’t find it very intuitive. As a friendly suggestion, it would be nicer to have the pairwise table generated automatically as in Fig. 1.

On a different note, I have noticed that when I close Excel and reopen, that RealStats no longer appears under the Add-In menu, but when I go to the Add-ins options it is still checked. Then I need to uncheck, hit ok, and then recheck to reload it…and then it works again. Perhaps this is a bug? I’m using Office 2013 and don’t seem to have this trouble with other add-ins. Cheers.

Jamie,

Thanks for your input. Others have also suggested this sort of change to Tukey’s HSD. I will eventually made this change.

Regarding RealStats disappearing from the Add-in menu, please see the following webpage:

Disappearing Addin Menu

Charles

Once again, thank you Charles for your speedy and helpful reply.

Cheers,

Hi Charles,

Thank you for the excellent program for everybody.

When I read the outcomes by Scheffé’s test using RealStats.xlam, the P value was lower than other program. The cell for P value by Scheffé’s test did not include the coefficient of 1/dfB. (Sorry for not using a small capital for B.) When the F value was multiplied by this coefficient, the P value obtained was exactly same as those by other programs. I would be happy, if you could have a chance to check.

Best regards,

Kats

No problem

Charles

Kats,

I don’t see the problem with 1/dfB that you are referring to.

Can you send me an Excel file with an example where the Real Statistics test has a p-value that is lower than some other problem and what value you believe is correct.

Charles

Hi when you say :Since the difference between the means for women taking the drug and women in the control group is 5.83 – 3.83 = 1.75 and 1.75 is smaller than 1.8046, we conclude that the difference is not significant

What does that imply ?is thit different ? And how to select the best combination between this six man.woman categories ?

Sam,

Not significant means that you cannot conclude that the population means are different.

If you are using a post hoc test such as Tukey HSD, then you can do any combination of categories that are interesting to you (without increasing the experimentwise error rate).

Charles

Thanks for these programs! I’ve installed the program for Mac 2016 and I’m doing a One-way ANOVA with Tukey HSD. Regardless of whether the groups have the same N value or the individual cell values, the output returns a Q-test with std err, lower, upper, x-crit, and Cohen d of zero. The df and q-crit are reported. Any reason why this is or what I’m doing wrong?

Did you fill in the shaded part of the output? You need to place a +1 in one cell and -1 in another cell in the shaded area.

Charles

Hi Charles,

Thanks so much for the amazing resources you have created, between all of the web pages and the RealStats Excel package. I’m following along with this example using the RealStats Excel functions (in Excel 2016 on a Mac), and for some reason, “=QCRIT(4,44,0.05,2)” returns a different value for me than what you show on this page. (3.7767272 vs. 3.7775). Any ideas about this?

Hi Stephanie,

=QCRIT(4,44,.05,2,FALSE) has value 3.7775, while =QCRIT(4,44,.05,2) has the value you calculated. The first formula uses linear interpolation, while the second uses harmonic interpolation (the default).

I have now updated the webpage to make this clearer. Thanks for bringing this to my attention. I appreciate your help in improving the website.

Charles

Dear Charles,

after a one-way unbalanced ANOVA on four groups, I did post-hoc Scheffé test with experiment-wise alfa = 0.05.

This because I want to compare also group of means, not only pairwise comparisons.

But after applying Scheffé to a pair of the above means with the latest Real Statistics pack, I obtain the following result:

F-stat = 7.23, F-crit = 8.12, p-value = 0.000215, lower = -38.8, upper = 1.12

and for another pair of means:

F-stat = 3.72, F-crit = 8.12, p-value = 0.014, lower = -25.9, upper = 4.98

So, while both comparisons are not significant, because F-stat < F-crit, p-values look like they were significant! I can not understand this, could you please explain me?

Thank you very much

Best Regards

Piero

Piero,

As stated on the website, you shouldn’t use Scheffe Test for pairwise comparisons.

Charles

Charles,

I understand this. But in my design I have 4 groups (A,B,C,D) of different sizes, and I need all pairwise comparisons (I can use Tukey-Kramer for this), but I should test also:

A vs (B+C+D)

This latter can’t be done with Tukey-Kramer, for this reason I was thinking to use Scheffé or maybe Games-Howell, I don’t know which is better for unbalanced design.

Do you think it’s fine to present in my study Tukey-Kramer results for pairwise comparisons, but to use a different test for the latter more complex contrast?

Thank you very much for all your suggestions

Regards

Piero

Piero,

Games-Howell is useful when the variances of the groups are different. If the sample sizes are different but the homogeneity of variances assumption holds then you can use Tukey-Kramer instead.

Based on the info you have provided, Tukey-Kramer would probably be the correct choice if you only had pairwise comparisons.

If you have a lot of non-pairwise comparisons, then Scheffe’ is probably the best choice. If you have many pairwise comparisons and only one non-pairwise comparison you might consider using Tukey-Kramer for the pairwise comparisons and Bonferroni/Contrasts for the non-pairwise comparisons.

Charles

Charles,

thank you very much for your suggestions, now you clarified me a lot of doubts!

Piero

Hello Charles

very useful all the information that you publish, this made me understand many things.

However I want to ask you about the value of “c” in the table that Games Howell throws, since, for my data that column is empty, therefore all the cells below that use the value “c” give me error.

thank you very much

Ana,

You need to manually fill in this column by placing a +1 in one of the cells and a -1 in another of the cells. These two cells determine which pairwise comparison you want to make. To make multiple such comparisons, just change which cells get the +1 and -1.

Charles