There is a problem with the analysis given for Example 1 of ANOVA with Repeated Measures for One Within Subjects Factor, namely that the sphericity requirement for ANOVA with matched samples has not been met. In fact the requirements for ANOVA with matched samples are as follows:
- Subjects are independent and randomly selected from the population
- Normality (or at least symmetry): although overly restrictive, it is sufficient for this condition to be met for each treatment (i.e. each column in the analyses that we present)
- Sphericity
Sphericity is an assumption which is stronger than homogeneity of variances, and is defined as follows.
Definition 1: Suppose you have k samples of equal size. The data has the sphericity property when the variances of the pairwise differences between the samples are all equal.
Such data has the compound symmetry property when all the variances are equal and all the covariances are equal.
Observation: Obviously if there are only two treatments then sphericity is automatically satisfied.
Property 1: Compound symmetry implies sphericity
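Proof: Suppose that the data satisfies the compound symmetry property. We consider the case with k = 3; the general case follows similarly. Then the covariance matrix has the form
$$\begin{bmatrix} \sigma^2 & c & c \\ c & \sigma^2 & c \\ c & c & \sigma^2 \end{bmatrix}$$
where σ² is the common variance and c is the common covariance. This can then be simplified to
$$\sigma^2 \begin{bmatrix} 1 & \rho & \rho \\ \rho & 1 & \rho \\ \rho & \rho & 1 \end{bmatrix}$$
where ρ = c/σ² is the correlation coefficient. By Property 5 of Basic Concepts of Correlation, for any i ≠ j
$$\operatorname{var}(x_i - x_j) = \operatorname{var}(x_i) + \operatorname{var}(x_j) - 2\operatorname{cov}(x_i, x_j) = 2\sigma^2 - 2\rho\sigma^2 = 2\sigma^2(1 - \rho)$$
Since this value is the same for every pair, it follows that the data satisfies the sphericity property.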
Observation: The converse is not true since it is easy enough to give a counter-example.
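One possible counter-example can be sketched as follows. The covariance matrix below is hypothetical (it is not taken from the examples on this page); it is built so that entry (i, j) equals a[i] + a[j], with an extra constant lam added on the diagonal. It is a valid covariance matrix (its leading principal minors are 4, 15 and 44, so it is positive definite), its variances and covariances all differ, and yet every pairwise difference has the same variance.

```python
# Hypothetical covariance matrix: S[i][j] = a[i] + a[j], plus lam on the diagonal.
a = [1.0, 2.0, 3.0]
lam = 2.0
k = len(a)
S = [[a[i] + a[j] + (lam if i == j else 0.0) for j in range(k)] for i in range(k)]


def diff_variances(S):
    """Return var(x_i - x_j) = s_ii + s_jj - 2*s_ij for every pair i < j."""
    k = len(S)
    return {
        (i, j): S[i][i] + S[j][j] - 2 * S[i][j]
        for i in range(k)
        for j in range(i + 1, k)
    }


variances = [S[i][i] for i in range(k)]
covariances = [S[i][j] for i in range(k) for j in range(i + 1, k)]
dv = diff_variances(S)

print("variances:  ", variances)    # all different, so no compound symmetry
print("covariances:", covariances)  # all different as well
print("var of differences:", dv)    # all equal, so sphericity holds
```

Here every variance of a difference reduces to 2 * lam, whatever the values in a, which is why sphericity holds without compound symmetry.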
Example 1: Determine whether the data for Example 1 of ANOVA with Repeated Measures for One Within Subjects Factor (repeated in Figure 0) meets the sphericity assumption.
Figure 0 – Data for Example 1
Since we have 4 treatments T0, T1, T2 and T3 the covariance matrix S is an array of form [cij] where cij = cov(Ti, Tj) for i ≠ j and cij = var(Ti) for i = j. For this example S is given in Figure 1.
Figure 1 – Covariance matrix for Example 1
The worksheet in Figure 1 can be created by highlighting the range W25:Z28 and entering the supplemental array formula =COV(H6:K20), which references the data in the worksheet in Figure 0. The COV function is described in Method of Least Squares for Multiple Regression; alternatively, you can use the standard Excel techniques described following Figure 2 of Method of Least Squares for Multiple Regression.
Clearly the variances (the values along the main diagonal) are different and the covariances are different, and so compound symmetry doesn’t strictly hold, but it is not clear whether these differences are significant.
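Next we calculate the variances of the pairwise differences between the treatment values. This can either be calculated directly or by using Property 5 of Basic Concepts of Correlation, namely that
$$\operatorname{var}(x_i - x_j) = \operatorname{var}(x_i) + \operatorname{var}(x_j) - 2\operatorname{cov}(x_i, x_j)$$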
The results for Example 1 are given in Figure 2. For example, cell AD6 is simply the difference between the score for subject 1 before the training and 1 week after (i.e. =H6-I6, referencing the cells in Figure 0) and cell AD22 can be calculated either as =VAR(AD6:AD20) or as =W25+X26-2*X25 (referencing the cells in Figure 1).
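For readers working outside Excel, the two routes to the variance of a pairwise difference can be compared in a few lines of Python. This is only a sketch: the scores below are made up for illustration (they are not the data of Figure 0), and the helper name sample_cov is an assumption, chosen to mirror the sample (n − 1) versions used by VAR and COV.

```python
from itertools import combinations

# Hypothetical scores: rows are subjects, columns are treatments.
data = [
    [10, 12, 15],
    [8, 11, 13],
    [12, 12, 18],
    [9, 14, 14],
    [11, 13, 19],
]


def sample_cov(x, y):
    """Sample covariance with an n - 1 denominator; sample_cov(x, x) is the variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


cols = [list(c) for c in zip(*data)]  # one list of scores per treatment
k = len(cols)
S = [[sample_cov(cols[i], cols[j]) for j in range(k)] for i in range(k)]

results = {}
for i, j in combinations(range(k), 2):
    diffs = [a - b for a, b in zip(cols[i], cols[j])]
    direct = sample_cov(diffs, diffs)              # like =VAR(AD6:AD20)
    via_cov = S[i][i] + S[j][j] - 2 * S[i][j]      # like =W25+X26-2*X25
    results[(i, j)] = (direct, via_cov)
    print(f"T{i} - T{j}: direct = {direct:.4f}, from covariance matrix = {via_cov:.4f}")
```

Both columns of output agree for every pair, which is the identity the worksheet relies on.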
Figure 2 – Sphericity for Example 1
From Figure 2, we conclude that the variances of the 6 pairs are clearly not equal, and so it appears that the sphericity assumption has not been met. The resolution is to use a correction factor called epsilon to reduce the degrees of freedom in the test described in ANOVA with Repeated Measures for One Within Subjects Factor. The most common version of epsilon, due to Greenhouse and Geisser, takes a value between 1/(k−1) and 1, where k = the number of treatments (i.e. the number of columns, 4 in our example). The value of epsilon is given by the formula:
$$\hat{\varepsilon} = \frac{k^2 (\bar{c}_d - \bar{c})^2}{(k-1)\left(\sum_i \sum_j c_{ij}^2 - 2k \sum_i \bar{c}_i^2 + k^2 \bar{c}^2\right)}$$
Here S = [cij] = the sample covariance matrix, $\bar{c}$ = the mean of all the elements in S, $\bar{c}_d$ = the mean of all the elements on the diagonal of S (i.e. the mean of the variances) and $\bar{c}_i$ = the mean of the elements in row i of S. For our example, we have $\hat{\varepsilon}$ = 0.493 as calculated in cell AR12 of Figure 3 (with references to cells in the worksheet shown in Figure 0).
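As a rough illustration of how the Greenhouse and Geisser epsilon described above can be computed from a covariance matrix, here is a minimal Python sketch. It uses the standard expression in terms of the mean of all entries, the mean of the diagonal entries and the row means of S. The function name gg_epsilon and the two test matrices are assumptions made for illustration; they are not the worksheet data.

```python
def gg_epsilon(S):
    """Greenhouse-Geisser epsilon for a k x k covariance matrix given as nested lists."""
    k = len(S)
    mean_all = sum(sum(row) for row in S) / k**2       # mean of all entries
    mean_diag = sum(S[i][i] for i in range(k)) / k     # mean of the variances
    row_means = [sum(row) / k for row in S]            # mean of each row
    sum_sq = sum(v * v for row in S for v in row)      # sum of squared entries
    num = k**2 * (mean_diag - mean_all) ** 2
    den = (k - 1) * (sum_sq - 2 * k * sum(m * m for m in row_means) + k**2 * mean_all**2)
    return num / den


# Hypothetical matrices, not the data of Example 1.
compound = [[4, 1, 1], [1, 4, 1], [1, 1, 4]]    # compound symmetry
unequal = [[1, 0, 0], [0, 1, 0], [0, 0, 10]]    # one variance much larger

print(gg_epsilon(compound))  # epsilon is 1 when sphericity holds
print(gg_epsilon(unequal))   # about 0.64, between 1/(k-1) = 0.5 and 1
```

The closer the result is to 1, the closer the data is to sphericity; the closer it is to 1/(k−1), the stronger the violation.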
Figure 3 – GG and HF epsilon for Example 1
In general, $\hat{\varepsilon}$ is viewed as too conservative (i.e. the correction factor should be higher), and so another correction factor, the Huynh and Feldt epsilon, is also commonly used. It turns out that this factor tends to be slightly too high.
The Huynh and Feldt epsilon is calculated as follows, where n = the number of subjects.
$$\tilde{\varepsilon} = \frac{n(k-1)\hat{\varepsilon} - 2}{(k-1)\left(n - 1 - (k-1)\hat{\varepsilon}\right)}$$
For our example, $\tilde{\varepsilon}$ = 0.538, as calculated in cell AR19 of the worksheet in Figure 3.
There is one more commonly used correction factor, namely 1/(k−1). This is the lower bound: the correction for sphericity can’t be lower than this.
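The Huynh and Feldt epsilon and the lower bound are simple enough to check by hand. The sketch below assumes the usual one-group form of the Huynh and Feldt formula, written in terms of the Greenhouse and Geisser value, n subjects and k treatments; the function names are illustrative only. It plugs in the rounded value .493 reported for Example 1, so the result can differ from the worksheet in the third decimal place.

```python
def hf_epsilon(eps_gg, n, k):
    """Huynh-Feldt epsilon from the GG epsilon, n subjects and k treatments (one group)."""
    num = n * (k - 1) * eps_gg - 2
    den = (k - 1) * (n - 1 - (k - 1) * eps_gg)
    # Note: the raw value can exceed 1; by convention it is then truncated to 1.
    return num / den


def lower_bound(k):
    """Smallest possible value of the sphericity correction factor."""
    return 1 / (k - 1)


n, k = 15, 4        # 15 subjects and 4 treatments, as in Example 1
eps_gg = 0.493      # rounded GG epsilon reported for Example 1
eps_hf = hf_epsilon(eps_gg, n, k)

print(f"lower bound = {lower_bound(k):.3f}")
print(f"GG epsilon  = {eps_gg:.3f}")
print(f"HF epsilon  = {eps_hf:.3f}")  # roughly .537, close to the .538 in Figure 3
```

The three values come out in the expected order: lower bound, then the Greenhouse and Geisser epsilon, then the Huynh and Feldt epsilon.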
Real Statistics Functions: The Real Statistics Resource Pack provides the following two functions, which calculate the GG epsilon and HF epsilon correction factors for both the one factor and two factor repeated measures Anova.
GGEpsilon(R1, ngroups, raw) = Greenhouse and Geisser epsilon value for the data in range R1 where ngroups = the number of independent groups; if raw = TRUE then R1 contains raw data, otherwise it contains a covariance matrix
HFEpsilon(R1, ngroups, nsubj) = Huynh and Feldt epsilon value for the data in range R1 where ngroups = the number of independent groups; if nsubj = 0 then R1 contains raw data, otherwise it contains a covariance matrix which is derived from raw data with nsubj subjects (corresponding to rows).
For the one factor case ngroups = 1. For Example 1, referring to Figure 0, we see that GGEpsilon(H6:K20) = .493 and HFEpsilon(H6:K20) = .538. Referring to Figure 1, we see that GGEpsilon(W25:Z28,1,FALSE) = .493 and HFEpsilon(W25:Z28,1,15) = .538.
Example 2: Revise the analysis for Example 1 by using the GG and HF epsilon correction factors.
We correct the column and error degrees of freedom by multiplying the dfGroups and dfE values shown in Figure 2 of ANOVA with Repeated Measures for One Within Subjects Factor (and duplicated in the range N3:T8 of Figure 4 below) by an epsilon correction factor. E.g. the revised value of dfE using the Greenhouse and Geisser epsilon is dfE ∙ ε̂ = 42 ∙ .493 = 20.7. These revised values of the degrees of freedom are then used to calculate new values of MSGroups and MSE, and then F = MSGroups/MSE. Note that the revised degrees of freedom won’t necessarily be whole numbers. The results are summarized in Figure 4.
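The arithmetic of the correction can be sketched as follows. The degrees of freedom and the epsilon value come from Example 1, but the two sums of squares are hypothetical placeholders (the values of Figure 4 are not reproduced here), and the function name corrected_anova is an assumption.

```python
def corrected_anova(ss_groups, ss_error, df_groups, df_error, eps):
    """Scale both degrees of freedom by eps, then recompute the mean squares and F."""
    df_g = eps * df_groups
    df_e = eps * df_error
    ms_g = ss_groups / df_g
    ms_e = ss_error / df_e
    return df_g, df_e, ms_g, ms_e, ms_g / ms_e


k, n = 4, 15
df_groups = k - 1                # 3
df_error = (k - 1) * (n - 1)     # 42
ss_groups, ss_error = 120.0, 420.0   # hypothetical sums of squares

f_uncorrected = (ss_groups / df_groups) / (ss_error / df_error)
df_g, df_e, ms_g, ms_e, f_corrected = corrected_anova(
    ss_groups, ss_error, df_groups, df_error, eps=0.493
)

print(f"corrected dfGroups = {df_g:.3f}, corrected dfE = {df_e:.3f}")
print(f"F before = {f_uncorrected:.3f}, F after = {f_corrected:.3f}")
```

Because both degrees of freedom are multiplied by the same factor, the F statistic itself does not change; what changes is the reference distribution, and hence the p-value. Outside Excel, a routine such as SciPy's scipy.stats.f.sf accepts non-integer degrees of freedom and could be used to obtain that p-value.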
Figure 4 – ANOVA analysis using GG and HF epsilon
Observation: Since Excel only uses whole number degrees of freedom, the actual values used in the above calculations are rounded down to the nearest whole number, e.g. dfGroups = 1 and dfE = 22 for the HF correction (dfE = 20 for the GG correction). You may need to interpolate to get slightly more accurate results. The above analysis shows that the null hypothesis should still be rejected (p < .05 = α).
Real Statistics Data Analysis Tool: The One Factor Repeated Measures Anova supplemental data analysis tool contained in the Real Statistics Resource Pack can be used to automatically perform analysis of variance for repeated measures, including the calculation of the GG and HF epsilon correction factors.
For Example 2, press Ctrl-m and double click on Analysis of Variance (as shown in Figure 0 of Anova Confidence Interval). Next select Repeated Measures Anova: one factor from the dialog box that appears. A dialog box will now appear as shown in Figure 5.
Figure 5 – Dialog box for Repeated Measures Anova
Fill in the dialog box as shown in the figure and click on OK. The output is as shown in Figure 6.
Figure 6 – One Factor Repeated Measures Anova
Observation: A rule of thumb is that if the estimated epsilon is greater than .75 then the Huynh and Feldt epsilon should be used; otherwise the Greenhouse and Geisser epsilon should be used. When epsilon is very low, it might be better to use MANOVA since it does not rely on the sphericity assumption. A second rule of thumb is that for epsilon < .7 you should use MANOVA provided your sample size is at least 10 + k; otherwise you should use ANOVA with one of the corrections.
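One way to read these two rules of thumb together is sketched below. This is an interpretation rather than a definitive procedure: it assumes that the quantity compared with .7 and .75 is the Greenhouse and Geisser epsilon, and that the MANOVA rule is checked first. The function name suggest_approach is hypothetical.

```python
def suggest_approach(eps_gg, n, k):
    """Suggest an analysis from the GG epsilon, n subjects and k treatments."""
    # Low epsilon and a large enough sample: MANOVA avoids the sphericity assumption.
    if eps_gg < 0.7 and n >= 10 + k:
        return "MANOVA"
    # Otherwise stay with ANOVA and pick the correction factor.
    if eps_gg > 0.75:
        return "ANOVA with HF correction"
    return "ANOVA with GG correction"


print(suggest_approach(0.80, 15, 4))  # mild violation
print(suggest_approach(0.72, 15, 4))  # between the two thresholds
print(suggest_approach(0.50, 12, 4))  # low epsilon, but fewer than 10 + k subjects
```

These thresholds are guidelines only, so the output is a starting point for judgment rather than a decision.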
Observation: For more information about sphericity, including other ways of calculating the Greenhouse and Geisser epsilon, as well as Mauchly’s test for sphericity, click here.
Observation: We can also calculate an effect size for the omnibus ANOVA. This is omega squared, although the formula used is different from the one used for ANOVA with independent variables.
For Example 2, ω² = .14 using the GG epsilon correction factor. As we have noted elsewhere, the effect size for comparisons is more meaningful than the effect size of the omnibus ANOVA.