**Univariate case**

As we describe in Planned Comparisons, for a sample with *m* groups, **contrasts** are linear combinations $\psi = \sum_{j=1}^m c_j \mu_j$ of the group means based on contrast coefficients (i.e. weights) $c_j$ such that $\sum_{j=1}^m c_j = 0$. Hypothesis testing using contrasts consists of using a null hypothesis in the form $\sum_{j=1}^m c_j \mu_j = 0$ based on the appropriate values of the contrast coefficients as weights.

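As usual we use $\hat\psi = \sum_{j=1}^m c_j \bar{x}_j$ as an estimate for $\sum_{j=1}^m c_j \mu_j$, and note that under the null hypothesis $\hat\psi$ is a random variable with zero mean and variance $\sigma^2 \sum_{j=1}^m c_j^2/n_j$, which we estimate by $MS_W \sum_{j=1}^m c_j^2/n_j$. Thus, when the null hypothesis is true the test statistic

$$t = \frac{\sum_{j=1}^m c_j \bar{x}_j}{\sqrt{MS_W \sum_{j=1}^m c_j^2/n_j}}$$

has distribution $T(df_W)$. Since $t \sim T(df)$ is equivalent to $t^2 \sim F(1, df)$, it follows that this test is equivalent to

$$F = \frac{\left(\sum_{j=1}^m c_j \bar{x}_j\right)^2}{MS_W \sum_{j=1}^m c_j^2/n_j} \sim F(1, df_W)$$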
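The univariate contrast test can be sketched in Python as follows. This is a minimal illustration, not the Real Statistics implementation: the function name `contrast_t_test` and the sample data are hypothetical, and the code assumes NumPy and SciPy are available. It computes both the *t* and the *F* form of the test so that the equivalence of the two p-values can be seen directly.

```python
import numpy as np
from scipy import stats

def contrast_t_test(groups, c):
    """Test H0: sum_j c_j * mu_j = 0 from one-way ANOVA data (hypothetical helper)."""
    c = np.asarray(c, dtype=float)
    if not np.isclose(c.sum(), 0.0):
        raise ValueError("contrast coefficients must sum to zero")
    groups = [np.asarray(g, dtype=float) for g in groups]
    n = np.array([g.size for g in groups])
    means = np.array([g.mean() for g in groups])
    df_w = int(n.sum()) - len(groups)
    # Within-groups mean square (pooled variance estimate)
    ms_w = sum(((g - g.mean()) ** 2).sum() for g in groups) / df_w
    psi_hat = float(c @ means)                       # estimated contrast
    se = float(np.sqrt(ms_w * (c ** 2 / n).sum()))   # its standard error
    t = psi_hat / se
    return {
        "psi_hat": psi_hat, "se": se, "df_w": df_w,
        "t": t, "p_t": 2 * stats.t.sf(abs(t), df_w),   # two-tailed t test
        "F": t ** 2, "p_F": stats.f.sf(t ** 2, 1, df_w),  # equivalent F test
    }

# Made-up data for three groups; compares group 1 with group 2
groups = [[4.1, 5.0, 5.9, 5.2], [6.8, 7.4, 6.1, 7.0], [5.5, 6.2, 5.8, 6.6]]
result = contrast_t_test(groups, [1, -1, 0])
print(result)
```

The two p-values in the returned dictionary agree up to floating-point error, which is the equivalence stated above.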

Contrasts $\sum_{j=1}^m c_j \mu_j$ and $\sum_{j=1}^m d_j \mu_j$ are **orthogonal** provided $\sum_{j=1}^m c_j d_j / n_j = 0$, which in the balanced case, where $n_1 = n_2 = \cdots = n_m$, is the same as stating that the vectors $(c_1, \ldots, c_m)$ and $(d_1, \ldots, d_m)$ are orthogonal.

Note that the null hypotheses of orthogonal contrasts are independent of one another; i.e. the result of one has no impact on the result of the other.

If $\psi_1, \ldots, \psi_{m-1}$ are *m* – 1 contrasts that are pairwise orthogonal, then any other contrast can be expressed as a linear combination of these contrasts. Thus you only ever need to look at *m* – 1 orthogonal contrasts. Since $\sum_{j=1}^m c_j = 0$ for any contrast, each of the $\psi_j$ is orthogonal to the unit vector (1, …, 1) and so *m* – 1 contrasts (and not *m*) are sufficient. Also by Corollary 1 of Orthogonal Vectors and Matrices, we can always find a set of *m* – 1 contrasts.
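One concrete way to build such a set in the balanced case is the Helmert-style construction sketched below, where each contrast compares the average of the first *i* groups with group *i* + 1. This is only one possible choice, the function name `helmert_contrasts` is hypothetical, and the sketch assumes equal group sizes so that orthogonality reduces to an ordinary dot product. The last lines express another contrast as a linear combination of the constructed ones.

```python
import numpy as np

def helmert_contrasts(m):
    """Return an (m-1) x m array whose rows are pairwise orthogonal contrasts."""
    rows = []
    for i in range(1, m):
        row = np.zeros(m)
        row[:i] = 1.0 / i   # average of the first i groups ...
        row[i] = -1.0       # ... minus the next group
        rows.append(row)
    return np.array(rows)

C = helmert_contrasts(4)
gram = C @ C.T              # off-diagonal entries are the pairwise dot products
print(C)
print(gram)

# Any other contrast is a linear combination of the m - 1 rows of C
target = np.array([1.0, 0.0, 0.0, -1.0])
coef, *_ = np.linalg.lstsq(C.T, target, rcond=None)
print(coef, C.T @ coef)
```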

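Note that if $\psi_1, \ldots, \psi_{m-1}$ are pairwise orthogonal contrasts, then the between-groups sum of squares satisfies $SS_B = SS_{\psi_1} + \cdots + SS_{\psi_{m-1}}$. Also each $df_{\psi_j} = 1$, and so $m - 1 = df_{\psi_1} + \cdots + df_{\psi_{m-1}}$. Thus any *m* – 1 pairwise orthogonal contrasts partition $SS_B$.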
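The partition can be checked numerically. The sketch below uses randomly generated balanced data (not data from this site) and assumes the usual sum of squares for a contrast, $SS_\psi = (\sum_j c_j \bar{x}_j)^2 / \sum_j (c_j^2/n_j)$, which is the numerator of the *F* statistic for that contrast; `contrast_ss` is a hypothetical helper name.

```python
import numpy as np

def contrast_ss(means, n, c):
    """Sum of squares attributable to the contrast with coefficients c."""
    c = np.asarray(c, dtype=float)
    return (c @ means) ** 2 / (c ** 2 / n).sum()

rng = np.random.default_rng(0)
m, n_per = 4, 8
# One row per group; group population means 10, 12, 9, 13 are made up
data = rng.normal(loc=[[10], [12], [9], [13]], scale=2.0, size=(m, n_per))
means = data.mean(axis=1)
n = np.full(m, n_per)

# Between-groups sum of squares (balanced, so the grand mean is data.mean())
ss_b = (n * (means - data.mean()) ** 2).sum()

# Three pairwise orthogonal contrasts for four balanced groups
contrasts = [[1, -1, 0, 0], [0, 0, 1, -1], [0.5, 0.5, -0.5, -0.5]]
ss_parts = [contrast_ss(means, n, c) for c in contrasts]
print(ss_b, sum(ss_parts))
```

The two printed values agree up to rounding, as the partition requires.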

**Multivariate case**

Multivariate **contrasts** are linear combinations $\psi = \sum_{j=1}^m c_j \mu_j$ of the group mean vectors $\mu_j$ based on contrast coefficients (i.e. weights) $c_j$ such that $\sum_{j=1}^m c_j = 0$. Hypothesis testing using contrasts consists of using a null hypothesis in the form $\sum_{j=1}^m c_j \mu_j = 0$ (the zero vector) based on the appropriate values of the contrast coefficients as weights.

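As usual we use $\hat\psi = \sum_{j=1}^m c_j \bar{X}_j$ as an estimate for $\sum_{j=1}^m c_j \mu_j$, and note that under the null hypothesis $\hat\psi$ is a random vector with zero mean and covariance matrix

$$\left(\sum_{j=1}^m \frac{c_j^2}{n_j}\right) \Sigma$$

where $\Sigma$ is the common population covariance matrix of the groups.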

Contrasts $\sum_{j=1}^m c_j \mu_j$ and $\sum_{j=1}^m d_j \mu_j$ are **orthogonal** provided $\sum_{j=1}^m c_j d_j / n_j = 0$, which in the balanced case, where $n_1 = n_2 = \cdots = n_m$, is the same as stating that the vectors $(c_1, \ldots, c_m)$ and $(d_1, \ldots, d_m)$ are orthogonal.

Note that the null hypotheses of orthogonal contrasts are independent of one another; i.e. the result of one has no impact on the result of the other.

If $\psi_1, \ldots, \psi_{m-1}$ are *m* – 1 contrasts that are pairwise orthogonal, then any other contrast can be expressed as a linear combination of these contrasts. Thus you only ever need to look at *m* – 1 orthogonal contrasts. Since $\sum_{j=1}^m c_j = 0$ for any contrast, each of the $\psi_j$ is orthogonal to the unit vector (1, …, 1) and so *m* – 1 contrasts (and not *m*) are sufficient. Also by Corollary 1 of Orthogonal Vectors and Matrices, we can always find a set of *m* – 1 contrasts.

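Let $\psi = \sum_{j=1}^m c_j \mu_j$ be a contrast. Then we can define the hypothesis *SSCP* matrix for $\psi$ as follows:

$$H_\psi = \frac{\left(\sum_{j=1}^m c_j \bar{X}_j\right)\left(\sum_{j=1}^m c_j \bar{X}_j\right)^T}{\sum_{j=1}^m c_j^2/n_j}$$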

Note that if $\psi_1, \ldots, \psi_{m-1}$ are pairwise orthogonal contrasts, then the hypothesis matrix *H* can be partitioned as $H = H_{\psi_1} + \cdots + H_{\psi_{m-1}}$. Also each $df_{\psi_j} = 1$, and so $m - 1 = df_{\psi_1} + \cdots + df_{\psi_{m-1}}$.

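Since in our representation of the data we exchange the roles of rows and columns and represent contrasts as column vectors, we actually calculate

$$H_\psi = \frac{(\bar{X}^T C)(\bar{X}^T C)^T}{\sum_{j=1}^m c_j^2/n_j}$$

where $\bar{X}$ is the *m* × *k* matrix whose rows are the group mean vectors and *C* is the *m* × 1 column vector of contrast coefficients.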
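The following sketch computes the hypothesis SSCP matrix of a contrast with the group means stored as rows, and checks on randomly generated balanced data that three pairwise orthogonal contrasts partition *H*. The helper name `contrast_sscp` and the data are illustrative assumptions, not part of the Real Statistics tool.

```python
import numpy as np

def contrast_sscp(means, n, c):
    """Hypothesis SSCP matrix for one contrast; `means` is m x k, one row per group."""
    c = np.asarray(c, dtype=float)
    psi_hat = c @ means                               # length-k contrast of mean vectors
    return np.outer(psi_hat, psi_hat) / (c ** 2 / n).sum()

rng = np.random.default_rng(1)
m, n_per, k = 4, 8, 3
X = rng.normal(size=(m, n_per, k))                    # made-up data: group x case x variable
means = X.mean(axis=1)                                # m x k matrix of group mean vectors
n = np.full(m, n_per)

# Omnibus hypothesis matrix: sum_j n_j (mean_j - grand)(mean_j - grand)^T
dev = means - X.reshape(-1, k).mean(axis=0)
H = (dev * n[:, None]).T @ dev

contrasts = [[1, -1, 0, 0], [0, 0, 1, -1], [0.5, 0.5, -0.5, -0.5]]
H_parts = [contrast_sscp(means, n, c) for c in contrasts]
print(np.allclose(sum(H_parts), H))
```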

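The test we use to determine whether to retain or reject the null hypothesis is similar to the omnibus MANOVA tests, and uses the test statistic

**Wilks’ Lambda**: $\Lambda_\psi = \dfrac{|E|}{|H_\psi + E|}$

where *E* is the error (within-groups) *SSCP* matrix. For the Wilks’ Lambda test, we use $df_1 = k$ and $b = 1$, and so the test becomes

$$F = \frac{1 - \Lambda_\psi}{\Lambda_\psi} \cdot \frac{df_2}{df_1} \sim F(df_1, df_2)$$

where $df_1 = k$ and $df_2 = n - m - k + 1$ (with *n* the total sample size, *m* the number of groups and *k* the number of dependent variables).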
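A minimal Python sketch of this test is given below. The function names `wilks_f` and `manova_contrast_test` are hypothetical and the data are randomly generated; the code is meant only to illustrate the formulas above, not to reproduce the Real Statistics tool. The final lines convert the value Λ = 0.606269 quoted in a reader comment on this page into an *F* statistic, under the assumption (not stated in this section) that the example has *n* = 32 observations, *m* = 4 groups and *k* = 3 dependent variables.

```python
import numpy as np
from scipy import stats

def wilks_f(lam, n, m, k):
    """Convert Wilks' Lambda for a single contrast (b = 1) into F, df1, df2 and p-value."""
    df1, df2 = k, n - m - k + 1
    F = (1 - lam) / lam * df2 / df1
    return F, df1, df2, stats.f.sf(F, df1, df2)

def manova_contrast_test(groups, c):
    """groups: list of (n_j x k) arrays; c: contrast coefficients summing to zero."""
    c = np.asarray(c, dtype=float)
    groups = [np.asarray(g, dtype=float) for g in groups]
    n_j = np.array([g.shape[0] for g in groups])
    k = groups[0].shape[1]
    means = np.vstack([g.mean(axis=0) for g in groups])
    # Error SSCP matrix: pooled within-group cross-products
    E = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in groups)
    psi_hat = c @ means
    H_psi = np.outer(psi_hat, psi_hat) / (c ** 2 / n_j).sum()
    lam = np.linalg.det(E) / np.linalg.det(H_psi + E)
    F, df1, df2, p = wilks_f(lam, int(n_j.sum()), len(groups), k)
    return lam, F, df1, df2, p

# Made-up data: 4 groups, 8 cases each, 3 dependent variables
rng = np.random.default_rng(42)
groups = [rng.normal(loc=s, size=(8, 3)) for s in (0.0, 0.2, 1.0, 1.3)]
lam, F, df1, df2, p = manova_contrast_test(groups, [-0.5, -0.5, 0.5, 0.5])
lam2, *_ = manova_contrast_test(groups, [-1, -1, 1, 1])   # rescaled contrast
print(lam, lam2, F, df1, df2, p)

# Assumed dimensions n = 32, m = 4, k = 3 for the quoted Lambda
F_check, _, _, p_check = wilks_f(0.606269, n=32, m=4, k=3)
print(F_check, p_check)
```

Rescaling the contrast coefficients leaves Λ unchanged, since both the numerator and the denominator of $H_\psi$ scale by the same factor, so the test result does not depend on the scale chosen for the coefficients.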

**Example 1**: We have seen that there are significant differences between the four groups in Example 1 of Manova Basic Concepts, but we still don’t know where these differences are. Based on the observations we made when looking at the charts in Figure 3 of Manova Basic Concepts, we would like to answer the following questions:

- (a) Is there a significant difference between the clay and salty groups?
- (b) Is there a significant difference between the loam and sandy groups?
- (c) Do the loam/sandy groups have a higher mean vector than the clay/salty groups?

To answer the first question we select the **Contrasts** option in the Real Statistics MANOVA data analysis tool (see Figure 1 of Real Statistics Manova Support). The result is as shown in Figure 1.

**Figure 1 – Contrasts option of MANOVA data analysis tool**

Don’t worry about the #DIV/0! and zero entries; we still haven’t filled in the contrast coefficients in the shaded area.

For question (a), we use the following coefficients: 1 for clay, -1 for salty and 0 for loam and sandy. The results are shown in the top half of Figure 2.

**Figure 2 – Contrasts for questions (a) and (b)**

This time we get non-error values for all the cells. Key formulas used to create the contrast table in Figure 2 are given in Figure 3.

**Figure 3 – Representative formulas from Figure 2**

The p-value (cell S10) = 0.896 > .05 = *α*, and so there is no significant difference between the mean vectors of the clay and salty groups.

For question (b) we can again select the **Contrasts** option in the Real Statistics MANOVA data analysis tool and enter the contrast with coefficients 1 for loam, -1 for sandy and 0 for clay and salty. Alternatively you can simply copy the output for the contrast for question (a) right below it (as in the bottom part of Figure 2) and simply change the contrast coefficients. As we can see from Figure 2, the p-value for question (b) is .060268, which is again larger than *α* = .05, and so we conclude there is no significant difference between the mean vectors of the loam and sandy groups.

For question (c) we use contrasts 1/2 for loam and sandy and -1/2 for clay and salty. This time, as we see in Figure 4, we get a p-value of .004132 < .05 = *α*, which shows there is a significant difference between the mean vectors.

**Figure 4 – Contrasts for question (c)**

**Observation**: It turns out that the three contrasts evaluated above are mutually orthogonal. You don’t necessarily have to use only orthogonal contrasts. In fact, it is best to use the contrasts that correspond to the tests that you believe are necessary to perform based on the actual experiment under consideration.
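The orthogonality of the three contrasts can be verified with a few lines of Python. The sketch assumes the groups are ordered clay, salty, loam, sandy and that all four groups have the same size (taken as 8 here purely for illustration); `are_orthogonal` is a hypothetical helper implementing the condition $\sum_j c_j d_j / n_j = 0$.

```python
import numpy as np
from itertools import combinations

def are_orthogonal(c, d, n):
    """Check sum_j c_j * d_j / n_j == 0 for contrasts c and d with group sizes n."""
    c, d, n = (np.asarray(v, dtype=float) for v in (c, d, n))
    return bool(np.isclose((c * d / n).sum(), 0.0))

# Assumed group order: clay, salty, loam, sandy
contrasts = {
    "a": [1, -1, 0, 0],             # clay vs. salty
    "b": [0, 0, 1, -1],             # loam vs. sandy
    "c": [-0.5, -0.5, 0.5, 0.5],    # loam/sandy vs. clay/salty
}
balanced = [8, 8, 8, 8]             # assumed equal group sizes
checks = {
    (p, q): are_orthogonal(contrasts[p], contrasts[q], balanced)
    for p, q in combinations(contrasts, 2)
}
print(checks)

# With unequal group sizes the same coefficients need not stay orthogonal
unbalanced = [8, 6, 8, 8]
print(are_orthogonal(contrasts["a"], contrasts["c"], unbalanced))
```

The second check illustrates why the definition weights each product by $1/n_j$: with unequal group sizes, contrasts (a) and (c) are no longer orthogonal even though their coefficient vectors still have a zero dot product.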

**Example 2**: We have seen that the loam/sandy groups have a significantly higher mean vector than the clay/salty groups. Which of the dependent variables (yield, water and herbicide) account for this difference?

When we ran the Contrasts option of MANOVA for Example 1, we saw that there were two other tests conducted for question (c), as shown in Figure 4, namely the **simultaneous confidence intervals** and the **Bonferroni confidence intervals**. These two tests were also produced for questions (a) and (b), but we didn’t show the results since there weren’t significant differences in the mean vectors and so these tests were not useful.

The two tests are extensions of the same follow-up tests conducted for Hotelling’s *T*^{2} tests as described in Hotelling’s T-square Tests. We now describe the MANOVA version of these tests.

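For each *p*, 1 ≤ *p* ≤ *k*, we define the contrast mean and standard error for the *p*th dependent variable by

$$\psi_p = \sum_{j=1}^m c_j \bar{x}_{jp} \qquad se_p = \sqrt{\frac{e_{pp}}{n-m} \sum_{j=1}^m \frac{c_j^2}{n_j}}$$

The 1 – *α* simultaneous confidence interval is

$$\psi_p \pm \sqrt{\frac{(n-m)\,df_1}{df_2} F_{crit}} \cdot se_p$$

where *E* = [$e_{pq}$] is the error *SSCP* matrix.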

Here $F_{crit}$ = FINV(*α*, $df_1$, $df_2$), $df_1 = k$ and $df_2 = n - m - k + 1$.

The results (see range A40:L43 of Figure 4) show that zero lies in the confidence interval for all three dependent variables, and so no single dependent variable accounts for the difference in mean vectors observed in Example 1(c); rather, it is the combination of the dependent variables.

Generally we use the Bonferroni confidence intervals instead of the simultaneous confidence intervals since the intervals tend to be narrower. The 1 – *α* Bonferroni confidence interval is

$$\psi_p \pm t_{crit} \cdot se_p$$

where the contrast mean for the *p*th dependent variable $\psi_p$ and standard error $se_p$ are as defined above and $t_{crit}$ = TINV(*α*/*k*, *df*) where *df* = *n* – *m*.

The results (see range A45:L48 of Figure 4) show that although the Bonferroni confidence intervals are indeed narrower than the simultaneous confidence intervals, zero still lies in the confidence interval for all three dependent variables, and so no single dependent variable accounts for the difference in mean vectors observed in Example 1(c); rather, it is the combination of the dependent variables.
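Both kinds of interval can be sketched in Python as follows. Several things here are assumptions rather than the site's own implementation: the helper name `contrast_intervals`, the randomly generated data, the standard error $se_p = \sqrt{(e_{pp}/(n-m)) \sum_j c_j^2/n_j}$, and the simultaneous multiplier $\sqrt{(n-m)\,df_1/df_2 \cdot F_{crit}}$. Excel's FINV(α, df1, df2) is a right-tailed inverse, which corresponds to `scipy.stats.f.isf`, and TINV(α/k, df) is a two-tailed inverse, which corresponds to `scipy.stats.t.isf(alpha / (2 * k), df)`.

```python
import numpy as np
from scipy import stats

def contrast_intervals(groups, c, alpha=0.05):
    """Simultaneous and Bonferroni intervals for each dependent variable (sketch)."""
    c = np.asarray(c, dtype=float)
    groups = [np.asarray(g, dtype=float) for g in groups]
    n_j = np.array([g.shape[0] for g in groups])
    n, m, k = int(n_j.sum()), len(groups), groups[0].shape[1]
    means = np.vstack([g.mean(axis=0) for g in groups])
    E = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in groups)
    psi = c @ means                                    # contrast mean per variable
    se = np.sqrt(np.diag(E) / (n - m) * (c ** 2 / n_j).sum())
    df1, df2 = k, n - m - k + 1
    f_crit = stats.f.isf(alpha, df1, df2)              # Excel: FINV(alpha, df1, df2)
    t_crit = stats.t.isf(alpha / (2 * k), n - m)       # Excel: TINV(alpha/k, n-m)
    sim = np.sqrt((n - m) * df1 / df2 * f_crit) * se   # assumed simultaneous half-width
    bon = t_crit * se                                  # Bonferroni half-width
    return {"psi": psi, "se": se, "sim_half": sim, "bon_half": bon,
            "simultaneous": (psi - sim, psi + sim),
            "bonferroni": (psi - bon, psi + bon)}

# Made-up data: 4 groups, 8 cases each, 3 dependent variables
rng = np.random.default_rng(3)
groups = [rng.normal(loc=s, size=(8, 3)) for s in (0.0, 0.2, 1.0, 1.3)]
half = contrast_intervals(groups, [-0.5, -0.5, 0.5, 0.5])
full = contrast_intervals(groups, [-1, -1, 1, 1])      # same contrast, doubled
print(half["simultaneous"], half["bonferroni"])
print(full["simultaneous"])
```

Doubling the coefficients doubles both the contrast mean and the interval half-widths, so the intervals for the two scalings describe the same comparison on different scales. Choosing coefficients whose positive part sums to +1 and negative part sums to -1 keeps the interval on the scale of a difference between two averages.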

**Effect size**

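We use the following value for the Mahalanobis distance squared as a measure of effect size for the multivariate contrast *ψ*:

$$D^2 = \hat\psi^T S^{-1} \hat\psi \qquad S = \frac{E}{n-m}$$

where $\hat\psi = \sum_{j=1}^m c_j \bar{X}_j$ is the sample contrast and *S* is the pooled within-groups covariance matrix.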

For example, the effect size for Example 1(a) is 0.16071 (cell S11 in Figure 3).
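A short Python sketch of this effect size follows, assuming $D^2 = \hat\psi^T S^{-1} \hat\psi$ with the pooled covariance matrix $S = E/(n-m)$. The helper name `contrast_effect_size` and the data are hypothetical.

```python
import numpy as np

def contrast_effect_size(groups, c):
    """Mahalanobis distance squared of a multivariate contrast (sketch)."""
    c = np.asarray(c, dtype=float)
    groups = [np.asarray(g, dtype=float) for g in groups]
    n = sum(g.shape[0] for g in groups)
    m = len(groups)
    means = np.vstack([g.mean(axis=0) for g in groups])
    E = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in groups)
    S = E / (n - m)                         # pooled within-groups covariance
    psi_hat = c @ means                     # sample contrast vector
    # Solve S x = psi_hat instead of forming the inverse explicitly
    return float(psi_hat @ np.linalg.solve(S, psi_hat))

# Made-up data: 4 groups, 8 cases each, 3 dependent variables
rng = np.random.default_rng(7)
groups = [rng.normal(loc=s, size=(8, 3)) for s in (0.0, 0.1, 0.8, 1.0)]
c = np.array([1.0, -1.0, 0.0, 0.0])
d2 = contrast_effect_size(groups, c)
d2_scaled = contrast_effect_size(groups, 2 * c)
print(d2, d2_scaled)
```

Unlike Wilks' Lambda, this measure depends on the scale of the coefficients: doubling them multiplies $D^2$ by four, which is another reason to fix the scaling of the contrast before comparing effect sizes.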

Dear Dr. Charles Zaiontz, I am a physician practicing in Udine (Italy). I am studying the statistics by utilizing your very interesting program “Real Statistics Using Excel”. In “MANOVA follow up using Contrasts” and in “Simultaneous confidence intervals”, you suggest to use the contrast [-0.5 0.5 -0.5 0.5] to calculate the “p – standard error”, sep.

If I use the contrast [1 -1 -1 1], while I get the same Wilk’s Lambda and the same F – value (0.606269 and 5.628408, respectively), the “Simultaneous confidence intervals” values, [-4.126 – +22.264; -12.421 – +9.634; -0.803 – +3.590], double: [-8.2526 – +44.5276; -24.8428 – +19.2678; -1.6056 – 7.181]. Why should I use [-0.5 0.5 -0.5 0.5] contrasts and not [-1 1 -1 1]?

Best regards, Roberto Mioni.

Roberto,

I typically choose contrasts such that the sum of the negative contrasts equals -1 and the sum of the positive contrasts equals +1. The contrasts that you suggest will probably produce the same outcome.

Charles

Dear Dr. Charles Zaiontz, thanks for answering my question.

However the contrasts that I suggest do not produce the same simultaneous confidence intervals and, probably, are incorrect. In fact the sum of positive contrasts must be equal to +1, while the sum of negative contrasts must be equal to -1, just as you say.

I was not aware of this rule!

Congratulations again for your program.

Roberto Mioni.

Hello Charles, and thank you, this is very helpful!

I have a bit of an exceptional case with my data. I am investigating the use of various policy instruments by municipalities based on a survey they answered. I divided them between 3 size categories (SMALL, MEDIUM and LARGE). The dependent variables are “Number of instruments currently used”, “Nbr. of inst. used in the past”, “Nbr. of inst. planned for future implementation”, “Nbr. of inst. never considered for implementation” and “Nbr. of inst. where the respondent didn’t answer”.

My issue is that “Nbr. of inst. used in the past” is equal to zero for the SMALL and MEDIUM categories. This causes Wilk’s Lambda to be either >1 or induces a division by zero when I use Real Statistics. Would you know a way around this?

Again, thank you! My statistics are very rusty and your website has been of great help.

Nicolas,

Sorry, but I don’t know enough about what you are trying to accomplish to offer a way around the problem you are having.

Charles