Planned Comparisons

Suppose we conduct an ANOVA and reject the null hypothesis that all the population means are equal. Since there are significant differences among the means, it would be useful to find out where the differences lie. To accomplish this, we perform extended versions of the two-sample comparison t-tests described in Two Sample t-Test with Equal Variances.

In fact, the current trend is to avoid using the omnibus ANOVA altogether and jump immediately to the comparison tests. Using this approach, as we will see, only the value of MSW (the within-groups mean square, also called MSE) from ANOVA is required for the analysis.

For example, suppose you want to investigate whether a college education improves earning power, considering the following five groups of students:

  • High School Education
  • College Education: Biology majors, Physics majors, English majors, History majors

You select 30 students from each group at random and find out their salaries 5 years after graduation. The omnibus ANOVA shows there are differences between the mean salaries of the five groups. You would now like to pinpoint where the differences lie. For example, you could ask the following questions:

  1. Do college educated people earn more than those with just a high school education?
  2. Do science majors earn more than humanities majors?
  3. Is there a significant difference between the earnings of physics and biology majors?

The null hypothesis (two-tail) for each of these questions is as follows:

  1. \mu_{HS} = \frac{\mu_{Bio} + \mu_{Phy} + \mu_{Eng} + \mu_{His}}{4}
  2. \frac{\mu_{Bio} + \mu_{Phy}}{2} = \frac{\mu_{Eng} + \mu_{His}}{2}
  3. \mu_{Phy} = \mu_{Bio}

These tests are done employing something called contrasts.

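Definition 1: Contrasts are simply weights ci such that

\sum_{i=1}^k c_i = 0

The idea is to turn a test such as the ones listed above into a weighted sum of means

\sum_{i=1}^k c_i \mu_i

using the appropriate values of the contrasts as weights. For example, for the first question listed above, we use the contrasts

c_{HS} = 1, \quad c_{Bio} = c_{Phy} = c_{Eng} = c_{His} = -\frac{1}{4}

and redefine the null hypothesis as

\sum_i c_i \mu_i = \mu_{HS} - \frac{\mu_{Bio} + \mu_{Phy} + \mu_{Eng} + \mu_{His}}{4} = 0

We then use a t-test as described in the following examples. Note that we could have used the following contrasts instead:

c_{HS} = 4, \quad c_{Bio} = c_{Phy} = c_{Eng} = c_{His} = -1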

The results of the analysis will be the same. The important thing is that the sum of the contrasts adds up to 0 and that the contrasts reflect the problem being addressed.
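
As a rough illustration of why rescaling the weights does not change the result, the following Python sketch computes the contrast t statistic (the weighted sum of group means divided by its standard error) for the weights 1, -1/4, -1/4, -1/4, -1/4 and for a rescaled version such as 4, -1, -1, -1, -1. The group means, the MSW value and the helper name contrast_t are made up for illustration and do not come from any real data set.

```python
from math import sqrt, isclose

def contrast_t(weights, means, sizes, msw):
    """t statistic for a contrast: sum(c * mean) / sqrt(MSW * sum(c^2 / n))."""
    estimate = sum(c * m for c, m in zip(weights, means))
    std_err = sqrt(msw * sum(c * c / n for c, n in zip(weights, sizes)))
    return estimate / std_err

# Hypothetical summary data in the order HS, Bio, Phy, Eng, His (salaries in $1000s)
means = [38.0, 52.0, 58.0, 45.0, 44.0]
sizes = [30] * 5
msw = 90.0  # made-up within-groups mean square

c_frac = [1, -0.25, -0.25, -0.25, -0.25]
c_int = [4, -1, -1, -1, -1]

# Both sets of weights are valid contrasts: they sum to 0
assert isclose(sum(c_frac), 0.0, abs_tol=1e-12) and sum(c_int) == 0

t_frac = contrast_t(c_frac, means, sizes, msw)
t_int = contrast_t(c_int, means, sizes, msw)
print(t_frac, t_int)  # the two t statistics agree
```

Multiplying every weight by 4 multiplies both the numerator and the standard error by 4, so the ratio is unchanged.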

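For the second question, by using the contrasts

c_{HS} = 0, \quad c_{Bio} = c_{Phy} = \frac{1}{2}, \quad c_{Eng} = c_{His} = -\frac{1}{2}

the null hypothesis once again can be expressed in the form:

\sum_i c_i \mu_i = 0

The third question uses the contrasts

c_{Phy} = 1, \quad c_{Bio} = -1, \quad c_{HS} = c_{Eng} = c_{His} = 0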

Example 1: Compare methods 1 and 2 from Example 3 of Basic Concepts for ANOVA.

We set the null hypothesis to be:

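H0 : µ1 – µ2 = 0, i.e. the population means for Method 1 and Method 2 are equal

When the null hypothesis is true, this is the two-sample case investigated in Two Sample t-Test with Equal Variances where the population variances are unknown but equal. As before, we use the following t-test

t = \frac{\bar{x}_1 - \bar{x}_2}{s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}

which has distribution T(n1 + n2 – 2).

With only two groups, s² = MSW and dfW = n1 + n2 – 2. When there are more than two groups, we use MSW from the full ANOVA in place of s²; it pools the variance estimates of all k groups and has dfW = n – k degrees of freedom, where n is the total sample size. Thus, when the null hypothesis is true,

t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{MS_W \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}

which has distribution T(dfW). For Example 1, we have the following results: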

Figure 1 – Comparison test of Example 1

Since p-value = .02373 < .05 = α, we reject the null hypothesis and conclude that there is a significant difference between methods 1 and 2.

Observation: In fact, there is a generalization of this approach, namely the use of the statistic

t = \frac{\sum_j c_j \bar{x}_j}{\sqrt{MS_W \sum_j \frac{c_j^2}{n_j}}}

where the cj are constants such that c1 + c2 + c3 + c4 = 0. As before t ~ T(dfW). Here the denominator is the standard error. We now summarize this result in the following property.

Property 1: If the cj are contrasts, then, when the null hypothesis \sum_j c_j \mu_j = 0 is true, the statistic t has distribution T(dfW) where

t = \frac{\sum_j c_j \bar{x}_j}{\sqrt{MS_W \sum_j \frac{c_j^2}{n_j}}}

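Observation: Since by Property 1 of F Distribution, t ~ T(df) is equivalent to t² ~ F(1, df), it follows that Property 1 is equivalent to

F = t^2 = \frac{\left(\sum_j c_j \bar{x}_j\right)^2}{MS_W \sum_j \frac{c_j^2}{n_j}} \sim F(1, df_W)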

Example 1 (continued): We can use Property 1 with c1 = 1, c2 = -1 and c3 = c4 = 0 as follows:

Figure 2 – Use of contrasts for Example 1

Here the standard error (in cell N14) is expressed by the formula =SQRT(I15*R11) and the t-stat (in cell O14) is expressed by the formula =S11/N14. As before p-value = .0237 < .05 = α.
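
The worksheet calculation can be mirrored outside Excel. The following is a minimal Python sketch, assuming raw data are available as one list per group; the data values and the helper names one_way_msw and contrast_test are invented for illustration and are not the data behind Figure 2.

```python
from math import sqrt

def one_way_msw(groups):
    """Return (MSW, dfW) for a list of samples, one list per group."""
    ssw = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        ssw += sum((x - mean) ** 2 for x in g)
    df_w = sum(len(g) for g in groups) - len(groups)
    return ssw / df_w, df_w

def contrast_test(groups, weights):
    """Return (estimate, standard error, t statistic, dfW) for a contrast."""
    if abs(sum(weights)) > 1e-12:
        raise ValueError("contrast weights must sum to 0")
    msw, df_w = one_way_msw(groups)
    means = [sum(g) / len(g) for g in groups]
    estimate = sum(c * m for c, m in zip(weights, means))
    # standard error = sqrt(MSW * sum(c_j^2 / n_j))
    std_err = sqrt(msw * sum(c * c / len(g) for c, g in zip(weights, groups)))
    return estimate, std_err, estimate / std_err, df_w

# Made-up scores for four methods (hypothetical, for illustration only)
groups = [
    [4, 6, 5, 5],
    [8, 7, 9, 8],
    [6, 5, 7, 6],
    [5, 4, 6, 5],
]
estimate, std_err, t_stat, df_w = contrast_test(groups, [1, -1, 0, 0])
print(f"estimate={estimate:.3f}  s.e.={std_err:.3f}  t={t_stat:.3f}  df={df_w}")
```

The two-tailed p-value would then be obtained from the t distribution with dfW degrees of freedom, as TDIST does in Excel; if SciPy is available, 2 * scipy.stats.t.sf(abs(t_stat), df_w) is one way to get it.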

Observation: The t-test can also be used to create a confidence interval for a contrast, exactly as was done in Confidence Interval for ANOVA.

Example 2: Compare method 4 with the average of methods 1 and 2 from Example 3 of Basic Concepts of ANOVA.

We test the following null hypothesis

H0: µ4 = (µ1 + µ2) / 2

using Property 1 with c4 = 1, c1 = c2 = -.5 and c3 = 0.

Figure 3 – Use of contrasts for Example 2

Since p-value = .9689 > .05 = α, we can’t reject the null hypothesis, and so conclude there is no significant difference between method 4 and the average of methods 1 and 2.

Observation: The above analysis employs a two-tail test. We could also have conducted a one-tail test using the null hypothesis H0: µ4 ≤ (µ1 + µ2) / 2. In that case we would use the one-tail version of TDIST in calculating the p-value.

Observation: Contrasts are essentially vectors and so we can speak of two contrasts being orthogonal. Per Definition 8 of Matrix Operations, assuming that the group sample sizes are equal (i.e. ni = nj for all i, j), contrasts (c1,…,ck) and (d1,…,dk) are orthogonal provided

\sum_{i=1}^k c_i d_i = 0

Geometrically this means the contrasts are at right angles to each other in space.
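
As a quick check of this idea, the sketch below tests pairwise orthogonality for the contrasts behind the three salary questions, assuming equal group sizes and the group order HS, Bio, Phy, Eng, His (the ordering is an assumption). The fourth contrast, English versus History, is a hypothetical addition used only to complete a set of k – 1 = 4 contrasts.

```python
from fractions import Fraction as F
from itertools import combinations

def dot(c, d):
    """Ordinary dot product; zero means orthogonal when group sizes are equal."""
    return sum(a * b for a, b in zip(c, d))

# Assumed group order: HS, Bio, Phy, Eng, His
contrasts = {
    "college vs HS": [F(1), F(-1, 4), F(-1, 4), F(-1, 4), F(-1, 4)],
    "science vs humanities": [F(0), F(1, 2), F(1, 2), F(-1, 2), F(-1, 2)],
    "Phy vs Bio": [F(0), F(-1), F(1), F(0), F(0)],
    "Eng vs His": [F(0), F(0), F(0), F(1), F(-1)],  # hypothetical 4th contrast
}

# Every contrast must sum to 0
assert all(sum(c) == 0 for c in contrasts.values())

pairs = {
    (a, b): dot(contrasts[a], contrasts[b])
    for a, b in combinations(contrasts, 2)
}
for (a, b), value in pairs.items():
    print(f"{a} . {b} = {value}")
```

Exact fractions are used so that the dot products come out as exactly 0 rather than as a small floating-point residue.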

Assuming there are k groups, if you have k – 1 contrasts C1, …, Ck-1 that are pairwise orthogonal, then any other contrast can be expressed as a linear combination of these contrasts. Thus you only ever need to look at k – 1 orthogonal contrasts. Since

\sum_{i=1}^k c_i \cdot 1 = 0

for any contrast C = (c1,…,ck), each of the Cj is orthogonal to the vector of ones (1,…,1) and so k – 1 contrasts (and not k) are sufficient.

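Note that if C1, …, Ck-1 are pairwise orthogonal, then SS between groups,

SS_B = \sum_{j=1}^{k-1} SS_{C_j}

where the sum of squares for a contrast C = (c1,…,ck) is SS_C = \frac{\left(\sum_i c_i \bar{x}_i\right)^2}{\sum_i c_i^2 / n_i}. Also each

df_{C_j} = 1

and so

df_B = k - 1 = \sum_{j=1}^{k-1} df_{C_j}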

Thus any k – 1 pairwise orthogonal contrasts partition SSB.
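
The partition can be checked numerically. The sketch below assumes a balanced design and the usual sum of squares for a contrast, (Σ ci x̄i)² / Σ (ci² / ni); the group means, group size and MSW are made-up values chosen only to keep the arithmetic easy to follow.

```python
from math import isclose

means = [10.0, 12.0, 15.0, 11.0]  # made-up group means
n = 5                             # equal group sizes (balanced design)
msw = 4.0                         # made-up within-groups mean square
k = len(means)

# Three pairwise orthogonal contrasts for k = 4 groups
contrasts = [
    [1, -1, 0, 0],
    [0, 0, 1, -1],
    [0.5, 0.5, -0.5, -0.5],
]

def ss_contrast(c):
    """Sum of squares for one contrast: (sum c*mean)^2 / sum(c^2 / n)."""
    estimate = sum(ci * m for ci, m in zip(c, means))
    return estimate ** 2 / sum(ci * ci / n for ci in c)

grand_mean = sum(means) / k
ss_between = n * sum((m - grand_mean) ** 2 for m in means)
ss_parts = [ss_contrast(c) for c in contrasts]

# Omnibus F and the squared t statistic of each contrast
f_omnibus = (ss_between / (k - 1)) / msw
t_squared = [ss / msw for ss in ss_parts]
print(ss_parts, ss_between, f_omnibus)
```

Here the three contrast sums of squares add up to SSB, and the omnibus F equals the average of the three squared t statistics.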

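Because any k–1 pairwise orthogonal contrasts partition SSB and each contrast has one degree of freedom, the omnibus F statistic is the average of the k–1 squared t statistics for these contrasts. Thus, if the omnibus ANOVA F-test is significant, then at least one of the k–1 pairwise orthogonal contrasts has a t² at least as large as F, and if all of these t statistics are small, then F is small as well. This is not a strict guarantee about significance, however, since each t² is compared with the critical value of F(1, dfW), which differs from the critical value of F(k–1, dfW) used for the omnibus test.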

A non-significant F-test does not imply that all possible contrasts are non-significant. Also a significant contrast doesn’t imply that the F-test will be significant.

In general to reduce experiment-wise error you should make the minimum number of meaningful tests, preferring orthogonal contrasts to non-orthogonal contrasts. The key point is to make only meaningful tests.

When the group sample sizes are not equal (i.e. unbalanced group samples), we need to modify the definition of orthogonal contrasts. In fact, contrasts (c1,…,ck) and (d1,…,dk) are orthogonal provided

\sum_{i=1}^k \frac{c_i d_i}{n_i} = 0
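
To see what the unbalanced definition changes, the following sketch weights each product ci di by 1/ni. The group sizes and contrasts are hypothetical; the point is that two contrasts that are orthogonal in a balanced design need not be orthogonal once the sizes differ.

```python
from fractions import Fraction as F

def weighted_dot(c, d, sizes):
    """Sum of c_i * d_i / n_i; zero means orthogonal for unequal n_i."""
    return sum(ci * di / F(ni) for ci, di, ni in zip(c, d, sizes))

sizes = [10, 20, 30]                    # made-up unequal group sizes
c = [F(1), F(-1), F(0)]                 # group 1 vs group 2
d_plain = [F(1, 2), F(1, 2), F(-1)]     # plain average of groups 1, 2 vs group 3
d_weighted = [F(1, 3), F(2, 3), F(-1)]  # size-weighted average of 1, 2 vs group 3

plain = weighted_dot(c, d_plain, sizes)
weighted = weighted_dot(c, d_weighted, sizes)
print(plain, weighted)
```

With these sizes the plain-average contrast is no longer orthogonal to c, while the version that weights groups 1 and 2 in proportion to their sample sizes is.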

The assumptions for contrasts are the same as those for ANOVA, namely

  • Independent samples
  • Within each group, participants are independent and randomly selected
  • Each group has the same population variance
  • Each group is drawn from a normal population

The same tests that are employed to test the assumptions of ANOVA (e.g. QQ plots, Levene’s test, etc.) can be used for contrasts. Similarly, the same corrections (e.g. transformations) can be used for contrasts. In addition, two other approaches can be used with contrasts, namely Welch’s correction when the variances are unequal or a rank test (e.g. Mann-Whitney U test or ANOVA on ranked data) when normality is violated. Keep in mind that ranks are only ordinal data and so linear combinations (including averages) cannot be used, only comparisons of type µ1 = µ2. Also, any conclusions drawn from ANOVA on ranked data apply to the ranks and not to the observed data.

Observation: As described in Experiment-wise Error Rate, in order to address the inflated experiment-wise error rate, either the Dunn/Sidák or Bonferroni correction factor can be used.

Dunn/Sidák correction was described in Experiment-wise Error Rate. To test k orthogonal contrasts in order to achieve an experiment-wise error rate of αexp, the error rate α of each contrast test must be such that 1 – (1 – α)^k = αexp. Thus α = 1 – (1 – αexp)^(1/k). E.g. if k = 4 then to achieve an experiment-wise error rate of .05, α = 1 – ∜.95 = 0.012741.

Bonferroni correction simply divides the experiment-wise error rate by the number of orthogonal contrasts. E.g. for 4 orthogonal contrasts, to achieve an experiment-wise error rate of .05, simply set α = .05/4 = .0125. Note that the Bonferroni correction is a little more conservative than the Dunn/Sidák correction, since αexp / k < 1 – (1 – αexp)^(1/k).
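
Both corrections are one-line calculations. The sketch below reproduces the k = 4 figures quoted above; the function names bonferroni and dunn_sidak are chosen here for illustration.

```python
def bonferroni(alpha_exp, k):
    """Per-test alpha under the Bonferroni correction."""
    return alpha_exp / k

def dunn_sidak(alpha_exp, k):
    """Per-test alpha under the Dunn/Sidak correction."""
    return 1 - (1 - alpha_exp) ** (1 / k)

alpha_exp, k = 0.05, 4
a_bonf = bonferroni(alpha_exp, k)
a_sidak = dunn_sidak(alpha_exp, k)
print(f"Bonferroni: {a_bonf:.6f}   Dunn/Sidak: {a_sidak:.6f}")

# Running k independent tests at the Dunn/Sidak alpha gives back alpha_exp
recovered = 1 - (1 - a_sidak) ** k
```

The Bonferroni value is slightly smaller than the Dunn/Sidák value, which is why it is the more conservative of the two.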

In the above calculations we have assumed that the contrasts have equal values for α. This is not strictly necessary. E.g. in the example above, for the Bonferroni correction, we can use .01 for the first three contrasts and .02 for the fourth. The important thing is that the sum be .05 and that the split be determined prior to seeing the data.

If the contrasts are not orthogonal then the above correction factors are too conservative, i.e. they over-correct.

Example 3: A drug is being tested for its effect on prolonging the life of people with cancer. Based on the data in the left side of Figure 3, determine whether there are significant differences in the 4 groups, and check (1) whether there is a difference in life expectancy between the people taking the drug and those taking a placebo, (2) whether there is a difference in effectiveness of the drug between men and women and (3) whether there is a difference in life expectancy for people with this type of cancer for men versus women.


Figure 3 – Data and ANOVA output for Example 3

The ANOVA output in Figure 3 shows (p-value = .00728 < .05 = α) there is a significant difference between the 4 groups. We now address the other questions to try to pinpoint where the differences lie. First we investigate whether the drug provides any significant advantage.

Figure 4 – Contrast test for effectiveness of drug in Example 3

Figure 4 shows the result with uncontrolled type I error and then the results using the Bonferroni and Dunn/Sidák corrections. The tests are all significant, i.e. there is a significant difference between the population mean of those taking the drug and that of the control group taking the placebo.

We next test whether there is a difference in effectiveness of the drug between men and women.

Figure 5 – Contrast test for effectiveness of drug for men vs. women

Since p-value = .0547 > α in all the tests, we conclude there is no significant difference between longevity of men versus women taking the drug.

The final test is to determine if men and women with this type of cancer have different life expectancy (whether or not they take the drug).

Figure 6 – Contrast life expectancy of men vs. women

The result is significant (p-value = .0421 < .05 = α) if we don’t control for type I error, but the result is not significant if we use the Bonferroni or Dunn/Sidák correction (p-value = .0421 > .0167 = α).
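
The decisions for the last two contrasts can be laid out side by side. This sketch uses only the p-values reported above (.0547 and .0421) and assumes k = 3 because three contrasts are tested in Example 3; the dictionary labels are illustrative.

```python
alpha_exp = 0.05
k = 3  # three contrasts are tested in Example 3

thresholds = {
    "none": alpha_exp,
    "Bonferroni": alpha_exp / k,
    "Dunn/Sidak": 1 - (1 - alpha_exp) ** (1 / k),
}
# p-values as reported in the text for the second and third contrasts
p_values = {
    "drug effectiveness, men vs. women": 0.0547,
    "life expectancy, men vs. women": 0.0421,
}

decisions = {
    (test, name): p < alpha
    for test, p in p_values.items()
    for name, alpha in thresholds.items()
}
for (test, name), significant in decisions.items():
    print(f"{test:35s} {name:11s} significant={significant}")
```

Only the uncorrected test of life expectancy for men versus women comes out significant, in line with the conclusions above.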

Real Statistics Excel Function: The Real Statistics Resource Pack provides the following function:

DunnSidak(αexp, k) = α = 1 – (1 – αexp)^(1/k)

Real Statistics Data Analysis Tool: The Real Statistics’ Single Factor Anova and Follow-up Tests data analysis tool provides support for performing contrast tests. Use of this tool is described in Example 4.

Example 4: Repeat the analysis from Example 2 using the Contrasts option of the Single Factor Anova and Follow-up Tests supplemental data analysis tool.

Press Ctrl-m and select Single Factor Anova and Follow-up Tests from the menu. A dialog box will appear as in Figure 7.

Figure 7 – Dialog box for Single Factor Anova and Follow-up Tests

Enter the sample range in Input Range, click on Column headings included with data and check the Contrasts option. Select the type of alpha correction that you want, namely no experiment-wise correction, the Bonferroni correction or the Dunn/Sidák correction (as explained in Experiment-wise Error Rate). In any case you set the alpha value to be the experiment-wise value (defaulting as usual to .05).

Note too that the contrast output that results from the tool will not contain any contrasts. You need to fill in the desired contrasts directly in the output (e.g. for Example 4 you need to fill in the range O32:O35 in Figure 8 with the contrasts you desire).

When you click on OK, the output from this tool is displayed (as in Figure 8). The fields relating to effect size are explained in Effect Size for ANOVA.

Figure 8 – Real Statistics Contrast data analysis

Caution: If your Windows settings use a comma (,) as the decimal separator and a period (.) as the thousands separator, then you may get incorrect values for alpha when using the Bonferroni or Dunn/Sidák corrections.

25 Responses to Planned Comparisons

  1. RUI CHEN says:

    Could you please give me some guidance about how to do the planned contrast for the categorical dependent variable?

    Thank you!

    • Charles says:

      It is not clear to me what you are asking. Can you give an example?

      • RUI CHEN says:

        In your examples, the dependent variables are continuous. But the dependent variable in my experiment is categorical (buy or not) with a 2 by 2 design, so I think the data cannot be analyzed by a t-test. Do you have any experience with planned contrasts in this case?
        Thank you so much!

        • Charles says:

          Since your dependent variable is binary categorical, which test are you using (ANOVA would not be a good choice)?

          • RUI CHEN says:

            Thanks a lot Charles!
            I run logistic regression for the interaction effect. But I don’t know how to do the planned contrast next.

          • Charles says:

            Sorry, but I don’t support contrasts for logistic regression yet.

  2. Rajeev Gupta says:

    Could you please tell me when to use the t-test for equal variances and when to use the one for unequal variances?
    I am a bit confused.

    Thanks in Advance.

    • Charles says:


      When the variances of the two samples are relatively equal then the results of the two tests are approximately the same. It turns out that the t-test with equal variances gives a good result even when the variances are pretty unequal, but when the variance of one of the samples is quite different (approximately when the variance of one sample is more than four times the variance of the other sample), you should use the t-test with unequal variances instead of the t-test with equal variances.

      Since the t-test with unequal variances can always be used (at least where the other assumptions for the t-test are satisfied), when in doubt just use this test.


  3. Declan says:

    Good day Charles 🙂

    I often think that a t test with equal variances is more like an academic exercise. For example, nobody knows what the population mean or population standard deviation is, yet there are equations for it. If you don’t know it, use the sample standard deviation/mean etc. Based on this assumption, I’ve been applying the same logic to t tests, as it’s unlikely that the samples will have the same variances.

    Interesting comment though that you made about the variance of one sample needing to be four times the variance of the other sample. Didn’t know that. Thanks. 🙂 In R, I think I’ve been doing the same thing with var.equal=FALSE. 🙂 But I will now have to be more careful.

    Merry Christmas to you.

    • Charles says:

      Keep in mind that “the variance of one sample needs to be at most four times the variance of the other sample” is a rule of thumb. To be on the safe side you can always use the test with unequal variances. When the variances are equal or fairly equal, the results will be almost the same as for the t test with equal variances.

  4. Declan says:

    Hi Charles,

    When I try Dunn/Sidak correction and Bonferroni Correction, I get an error:

    ‘A run time error has occurred. The analysis tool will be aborted. Type Mismatch.’

    If I do contrasts with no correction, then it’s fine. but not the other 2 options.


    • Charles says:

      Thanks for reminding me about this error. It is related to the fact that some users employ a period as the decimal point and others use a comma. I have now resolved this problem, and it should be fixed in the next release of the software, which will be available later this week.

  5. terry says:

    could you kindly elaborate on how to fill in the desired contrasts directly in the output. i am lost.

    • Charles says:

      E.g. you would fill in the contrasts in the range O7:O10 in Figure 2. Initially all the cells in this range are blank and you need to fill in contrast values that sum to 0 such that the positive values add up to 1 and the negative values add up to -1. E.g. 1, 0, -1, 0 or -.5, +.5, +.5, -.5.

  6. terry says:

    if its not too much to ask, could you send me an email that i could send a more exemplified question? just need to know i am doing the right thing. thanks for your response

  7. RUI CHEN says:

    Thanks Charles!

  8. Bárbara says:

    Hello Charles,

    I installed the program and ran a one-way ANOVA for some data with three groups. The ANOVA was significant, and now I would like to know which group is different. I’m trying to run some multiple comparison test, like Games-Howell, but it’s not working. I’m using the Excel in a Macbook, and actually, in my options when I open the program I do not have “Single Factor Anova and Follow-up Tests “, just “Analysis of Variance”. What can be happening? Thank you, Bárbara!

  9. Piero says:

    Dear Charles,

    I want to test differences between four groups of kids:
    group A doesn’t practice any sport (control group)
    groups B,C,D do practice different sports

    so, by assuming one-way ANOVA results significant, I want to test if there are differences between control group and sports group (A vs B+C+D) and also between each pair of sports group (that is, B vs C, B vs D, C vs D).
    These contrasts are not orthogonal because the test is unbalanced (different group sizes).
    I think that the Real Statistics contrast analysis is the right approach to be used in my case, but which is the right correction factor I should use, in order to keep the experiment-wise error rate to alpha=0.05?

    And conversely, if I wish to test all possible pairs of means, what could be the better approach, being the sample unbalanced?

    Thank you very much for your help
    Best Regards
