Suppose we conduct an ANOVA and reject the null hypothesis that all the sample means are equal. Since there are significant differences among the means, it would be useful to find out where the differences are. To accomplish this we perform extended versions of the two-sample comparison t-tests described in Two Sample t-Test with Equal Variances.
In fact, the current trend is to avoid the omnibus ANOVA altogether and jump immediately to the comparison tests. Using this approach, as we will see, only the value of MSW (the mean square within groups, also called MSE) from the ANOVA is required for the analysis.
For example, suppose you want to investigate whether a college education improves earning power, considering the following five groups of students:
- High School Education
- College Education: Biology majors, Physics majors, English majors, History majors
You select 30 students from each group at random and find out their salaries 5 years after graduation. The omnibus ANOVA shows there are differences between the mean salaries of the five groups. You would now like to pinpoint where the differences lie. For example, you could ask the following questions:
- Do college educated people earn more than those with just a high school education?
- Do science majors earn more than humanities majors?
- Is there a significant difference between the earnings of physics and biology majors?
The null hypothesis (two-tail) for each of these questions is as follows, where µ1 is the population mean salary for the high school group and µ2, µ3, µ4, µ5 are the population mean salaries for the biology, physics, English and history majors respectively:

H0: (µ2 + µ3 + µ4 + µ5)/4 = µ1

H0: (µ2 + µ3)/2 = (µ4 + µ5)/2

H0: µ3 = µ2
These tests are done employing something called contrasts.
Definition 1: Contrasts are simply weights c1, …, ck, one for each group, such that

c1 + c2 + ⋯ + ck = 0
The idea is to turn a test such as the ones listed above into a test on a weighted sum of means

c1x̄1 + c2x̄2 + ⋯ + ckx̄k
using the appropriate values of the contrasts as weights. For example, for the first question listed above, we use the contrasts

c1 = -1, c2 = c3 = c4 = c5 = 1/4

and redefine the null hypothesis as

H0: -µ1 + (µ2 + µ3 + µ4 + µ5)/4 = 0
We then use a t-test as described in the following examples. Note that we could have used the following contrasts instead:

c1 = -4, c2 = c3 = c4 = c5 = 1
The results of the analysis will be the same. The important thing is that the sum of the contrasts adds up to 0 and that the contrasts reflect the problem being addressed.
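To make the idea concrete, here is a minimal sketch in Python (using NumPy, with made-up salary figures rather than the actual survey data) of evaluating the first question's contrast as a weighted sum of means:

```python
import numpy as np

# Hypothetical mean salaries: high school, biology, physics, English, history
# (illustrative numbers only -- not taken from the example data)
means = np.array([52_000, 61_000, 67_000, 55_000, 56_000], dtype=float)

# Contrasts for question 1: college graduates vs. high school graduates
c = np.array([-1, 0.25, 0.25, 0.25, 0.25])
assert np.isclose(c.sum(), 0)   # a valid set of contrasts sums to 0

weighted_sum = c @ means        # estimate of -mu1 + (mu2+mu3+mu4+mu5)/4
print(weighted_sum)             # 7750.0 for these made-up figures
```

Multiplying all the weights by the same constant (e.g. using -4, 1, 1, 1, 1) scales the weighted sum but leaves the eventual t-statistic unchanged, since the standard error scales by the same factor.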
For the second question, we use the contrasts

c1 = 0, c2 = c3 = 1/2, c4 = c5 = -1/2
The third question uses the contrasts

c1 = c4 = c5 = 0, c2 = -1, c3 = 1
Example 1: Compare methods 1 and 2 from Example 3 of Basic Concepts for ANOVA.
We set the null hypothesis to be:
H0 : µ1 – µ2 = 0, i.e. neither method (Method 1 or 2) is significantly better
When the null hypothesis is true, this is the two-sample case investigated in Two Sample t-Test with Equal Variances where the population variances are unknown but equal. As before, we use the following t-test, which has distribution T(n1 + n2 – 2):

t = (x̄1 – x̄2) / (s √(1/n1 + 1/n2))

where s² is the pooled estimate of the common variance.
But as we have seen, s² can be replaced by MSW, which is based on all the groups and therefore has dfW degrees of freedom, and so when the null hypothesis is true

t = (x̄1 – x̄2) / √(MSW (1/n1 + 1/n2)) ~ T(dfW)
Figure 1 – Comparison test of Example 1
Since p-value = .02373 < .05 = α, we reject the null hypothesis and conclude that there is a significant difference between methods 1 and 2.
Observation: In fact, there is a generalization of this approach, namely the use of the statistic

t = (c1x̄1 + c2x̄2 + c3x̄3 + c4x̄4) / √(MSW (c1²/n1 + c2²/n2 + c3²/n3 + c4²/n4))

where the cj are constants such that c1 + c2 + c3 + c4 = 0. As before t ~ T(dfW). Here the denominator is the standard error of the contrast. We now summarize this result in the following property.
Property 1: If the cj are contrasts, then the statistic

t = (c1x̄1 + ⋯ + ckx̄k) / s.e.   where s.e. = √(MSW (c1²/n1 + ⋯ + ck²/nk))

has distribution T(dfW).
Observation: Since by Property 1 of F Distribution, t ~ T(df) is equivalent to t² ~ F(1, df), it follows that Property 1 is equivalent to

(c1x̄1 + ⋯ + ckx̄k)² / (MSW (c1²/n1 + ⋯ + ck²/nk)) ~ F(1, dfW)
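A minimal sketch of this test in Python (assuming SciPy is available; the group data below are hypothetical) might look as follows:

```python
import numpy as np
from scipy import stats

def contrast_test(groups, c):
    """Two-tail t-test for a contrast c over a list of group samples."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    c = np.asarray(c, dtype=float)
    assert abs(c.sum()) < 1e-12, "contrast weights must sum to 0"
    n = np.array([len(g) for g in groups])
    means = np.array([g.mean() for g in groups])
    df_w = n.sum() - len(groups)                        # within-groups df
    msw = sum(((g - g.mean())**2).sum() for g in groups) / df_w
    se = np.sqrt(msw * (c**2 / n).sum())                # standard error
    t = (c * means).sum() / se
    p = 2 * stats.t.sf(abs(t), df_w)                    # two-tail p-value
    return t, df_w, p

# With two groups and c = (1, -1) this reduces to the pooled two-sample t-test
t, df, p = contrast_test([[51, 54, 49, 52], [60, 58, 63, 57]], [1, -1])
```

With more than two groups, MSW pools the variance across all the groups, so the test gains degrees of freedom relative to an ordinary two-sample t-test on the same pair of groups.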
Example 1 (continued): We can use Property 1 with c1 = 1, c2 = -1 and c3 = c4 = 0 as follows:
Figure 2 – Use of contrasts for Example 1
Here the standard error (in cell N14) is expressed by the formula =SQRT(I15*R11) and the t-stat (in cell O14) is expressed by the formula =S11/N14. As before p-value = .0237 < .05 = α.
Observation: The t-test can also be used to create a confidence interval for a contrast, exactly as was done in Confidence Interval for ANOVA.
Example 2: Compare method 4 with the average of methods 1 and 2 from Example 3 of Basic Concepts of ANOVA.
We test the following null hypothesis
H0: µ4 = (µ1 + µ2) / 2

using Property 1 with c4 = 1, c1 = c2 = -.5 and c3 = 0.
Figure 3 – Use of contrasts for Example 2
Since p-value = .9689 > .05 = α, we can’t reject the null hypothesis, and so conclude there is no significant difference between method 4 and the average of methods 1 and 2.
Observation: The above analysis employs a two-tail test. We could also have conducted a one-tail test using the null hypothesis H0: µ4 ≤ (µ1 + µ2) / 2. In that case we would use the one-tail version of TDIST in calculating the p-value.
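For a sketch of the relationship between the two-tail and one-tail p-values (the t-statistic and degrees of freedom below are hypothetical):

```python
from scipy import stats

t, df_w = 2.1, 20                       # hypothetical t-stat and dfW

p_two = 2 * stats.t.sf(abs(t), df_w)    # two-tail p-value
p_one = stats.t.sf(t, df_w)             # one-tail p-value (H1: contrast > 0)
assert abs(p_two - 2 * p_one) < 1e-12   # for t > 0, p_two = 2 * p_one
```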
Observation: Contrasts are essentially vectors and so we can speak of two contrasts being orthogonal. Per Definition 8 of Matrix Operations, assuming that the group sample sizes are equal (i.e. ni = nj for all i, j), contrasts (c1,…,ck) and (d1,…,dk) are orthogonal provided

c1d1 + c2d2 + ⋯ + ckdk = 0
Geometrically this means the contrasts are at right angles to each other in space.
Assuming there are k groups, if you have k – 1 contrasts C1, …, Ck-1 that are pairwise orthogonal, then any other contrast can be expressed as a linear combination of these contrasts. Thus you only ever need to look at k – 1 orthogonal contrasts.

Note that if C1, …, Ck-1 are pairwise orthogonal, then the sum of squares between groups can be expressed as

SSB = SS(C1) + SS(C2) + ⋯ + SS(Ck-1)

where for any contrast C = (c1,…,ck)

SS(C) = (c1x̄1 + ⋯ + ckx̄k)² / (c1²/n1 + ⋯ + ck²/nk)

Thus any k – 1 pairwise orthogonal contrasts partition SSB.
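The partition SSB = SS(C1) + ⋯ + SS(Ck-1), where SS(C) = (Σ cj x̄j)² / (Σ cj²/nj), can be checked numerically. Here is a sketch with randomly generated (hypothetical) balanced data and Helmert-style orthogonal contrasts:

```python
import numpy as np

rng = np.random.default_rng(42)
k, n = 4, 5                                   # 4 groups of 5 (balanced)
groups = [rng.normal(loc=m, size=n) for m in (0.0, 1.0, 2.0, 3.0)]

means = np.array([g.mean() for g in groups])
grand = np.concatenate(groups).mean()
ssb = n * ((means - grand) ** 2).sum()        # SS between groups

# Three pairwise-orthogonal (Helmert-style) contrasts for k = 4
contrasts = [np.array([1.0, -1.0, 0.0, 0.0]),
             np.array([1.0, 1.0, -2.0, 0.0]),
             np.array([1.0, 1.0, 1.0, -3.0])]

def ss_contrast(c):
    # SS(C) = (sum cj * mean_j)^2 / (sum cj^2 / nj)
    return (c @ means) ** 2 / ((c ** 2) / n).sum()

assert np.isclose(ssb, sum(ss_contrast(c) for c in contrasts))
```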
Thus, if none of the t-tests for a set of k–1 pairwise orthogonal contrasts are significant, then the ANOVA F-test will also not be significant. Consequently, if the omnibus ANOVA F-test is significant, then at least one of k–1 pairwise orthogonal contrasts will be significant.
A non-significant F-test does not imply that all possible contrasts are non-significant. Also a significant contrast doesn’t imply that the F-test will be significant.
In general to reduce experiment-wise error you should make the minimum number of meaningful tests, preferring orthogonal contrasts to non-orthogonal contrasts. The key point is to make only meaningful tests.
When the group sample sizes are not equal (i.e. unbalanced group samples), we need to modify the definition of orthogonal contrasts. In this case, contrasts (c1,…,ck) and (d1,…,dk) are orthogonal provided

c1d1/n1 + c2d2/n2 + ⋯ + ckdk/nk = 0
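The distinction matters: a pair of contrasts can satisfy the balanced condition Σ cj dj = 0 yet fail the unbalanced condition Σ cj dj / nj = 0. A quick sketch with made-up group sizes:

```python
import numpy as np

c = np.array([1.0, -1.0, 0.0, 0.0])
d = np.array([0.5, 0.5, -0.5, -0.5])
n = np.array([10, 12, 8, 15])          # unequal (hypothetical) group sizes

orth_balanced = np.isclose((c * d).sum(), 0)        # sum of cj*dj
orth_unbalanced = np.isclose((c * d / n).sum(), 0)  # sum of cj*dj/nj
print(orth_balanced, orth_unbalanced)  # True False
```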
The assumptions for contrasts are the same as those for ANOVA, namely
- Independent samples
- Within each group, participants are independent and randomly selected
- Each group has the same population variance
- Each group is drawn from a normal population
The same tests that are employed to test the assumptions of ANOVA (e.g. QQ plots, Levene's test, etc.) can be used for contrasts. Similarly, the same corrections (e.g. transformations) can be used for contrasts. In addition, two other approaches can be used with contrasts, namely Welch's correction when the variances are unequal or a rank test (e.g. the Mann-Whitney U test or ANOVA on ranked data) when normality is violated. Keep in mind that ranks are only ordinal data, and so linear combinations (including averages) cannot be used, only comparisons of the form µ1 = µ2. Also, any conclusions drawn from ANOVA on ranked data apply to the ranks and not to the observed data.
Observation: As described in Experiment-wise Error Rate, in order to address the inflated experiment-wise error rate, either the Dunn/Sidák or Bonferroni correction factor can be used.
Dunn/Sidák correction was described in Experiment-wise Error Rate. To test k orthogonal contrasts in order to achieve an experiment-wise error rate of αexp, the error rate α of each contrast test must be such that 1 – (1 – α)^k = αexp. Thus α = 1 – (1 – αexp)^(1/k). E.g. if k = 4 then to achieve an experiment-wise error rate of .05, α = 1 – (.95)^(1/4) = .012741.
Bonferroni correction simply divides the experiment-wise error rate by the number of orthogonal contrasts. E.g. for 4 orthogonal contrasts, to achieve an experiment-wise error rate of .05, simply set α = .05/4 = .0125. Note that the Bonferroni correction is a little more conservative than the Dunn/Sidák correction, since αexp / k < 1 – (1 – αexp)^(1/k).
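The two per-test alpha values are easy to compare directly; a sketch for the k = 4 case above:

```python
alpha_exp, k = 0.05, 4

sidak = 1 - (1 - alpha_exp) ** (1 / k)   # Dunn/Sidak per-test alpha
bonferroni = alpha_exp / k               # Bonferroni per-test alpha

print(round(sidak, 6), bonferroni)       # 0.012741 0.0125
assert bonferroni < sidak                # Bonferroni is slightly more conservative
```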
In the above calculations we have assumed that the contrasts have equal values for α. This is not strictly necessary. E.g. in the example above, for the Bonferroni correction, we can use .01 for the first three contrasts and .02 for the fourth. The important thing is that the sum be .05 and that the split be determined prior to seeing the data.
If the contrasts are not orthogonal then the above correction factors are too conservative, i.e. they over-correct.
Example 3: A drug is being tested for its effect on prolonging the life of people with cancer. Based on the data in the left side of Figure 3, determine whether there are significant differences in the 4 groups, and check (1) whether there is a difference in life expectancy between the people taking the drug and those taking a placebo, (2) whether there is a difference in effectiveness of the drug between men and women and (3) whether there is a difference in life expectancy for people with this type of cancer for men versus women.
Figure 3 – Data and ANOVA output for Example 3
The ANOVA output in Figure 3 shows (p-value = .00728 < .05 = α) there is a significant difference between the 4 groups. We now address the other questions to try to pinpoint where the differences lie. First we investigate whether the drug provides any significant advantage.
Figure 4 – Contrast test for effectiveness of drug in Example 3
Figure 4 shows the result with uncontrolled type I error and then the results using the Bonferroni and Dunn/Sidák corrections. The tests are all significant, i.e. there is a significant difference between the population means of those taking the drug and those in the control group taking the placebo.
We next test whether there is a difference in effectiveness of the drug between men and women.
Figure 5 – Contrast test for effectiveness of drug for men vs. women
Since p-value = .0547 > α in all the tests, we conclude there is no significant difference between longevity of men versus women taking the drug.
The final test is to determine if men and women with this type of cancer have different life expectancy (whether or not they take the drug).
Figure 6 – Contrast life expectancy of men vs. women
The result is significant (p-value = .0421 < .05 = α) if we don’t control for type I error, but the result is not significant if we use the Bonferroni or Dunn/Sidák correction (p-value = .0421 > .0167 = α).
Real Statistics Excel Function: The Real Statistics Resource Pack provides the following function:
DunnSidak(αexp, k) = 1 – (1 – αexp)^(1/k), the corrected value of α
Real Statistics Data Analysis Tool: The Real Statistics’ Single Factor Anova and Follow-up Tests data analysis tool provides support for performing contrast tests. Use of this tool is described in Example 4.
Example 4: Repeat the analysis from Example 2 using the Contrasts option of the Single Factor Anova and Follow-up Tests supplemental data analysis tool.
Enter Ctrl-m and select Single Factor Anova and Follow-up Tests from the menu. A dialog box will appear as in Figure 7.
Figure 7 – Dialog box for Single Factor Anova and Follow-up Tests
Enter the sample range in Input Range, click on Column headings included with data and check the Contrasts option. Select the type of alpha correction that you want, namely no experiment-wise correction, the Bonferroni correction or the Dunn/Sidák correction (as explained in Experiment-wise Error Rate). In any case you set the alpha value to be the experiment-wise value (defaulting as usual to .05).
Note that the contrast output that results from the tool will not initially contain any contrast values. You need to fill in the desired contrasts directly in the output (e.g. for Example 4 you need to fill in the range O32:O35 in Figure 8 with the contrasts you desire).
When you click on OK, the output from this tool is displayed (as in Figure 8). The fields relating to effect size are explained in Effect Size for ANOVA.
Figure 8 – Real Statistics Contrast data analysis
Caution: If your Windows setting for the decimal separator is a comma (,) and the thousands separator is a period (.), then you may get incorrect values for alpha when using the Bonferroni or Dunn/Sidák corrections.