Some examples of this type of analysis are:
- A study is made of 10 subjects each of whom is asked to take a test on reading comprehension, mathematical ability and knowledge of history
- A study is made of 10 monkeys, each of which is given training once a week for 5 weeks, with its score recorded each week
- A study is made of 10 married couples, and the husband’s IQ is compared with his wife’s
The important characteristic of each of these examples is that the treatments are not independent of each other. The most common of these analyses is to compare the results of some treatment given to the same participant over a period of time (like the second example above).
For the version of ANOVA with repeated measures with one within-subjects factor, we can use Excel’s Two Factor ANOVA without Replication data analysis tool. Essentially the following meanings are given to the terms in Definition 2 of Two Factor ANOVA without Replication: MSRow = MSA and MSCol = MSB and similarly for the df and SS forms. The column terms (representing the within-subjects factor) are the ones that are of interest. The row terms represent the subjects.
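The correspondence above can be sketched outside Excel. The following is a minimal Python illustration, not the Excel tool itself: the function name `rm_anova` and the layout of `data` (one row per subject, one column per treatment level) are assumptions made for this example. It splits the total sum of squares into row (subject), column (within-subjects factor) and error parts and forms F from the column and error mean squares.

```python
from statistics import mean


def rm_anova(data):
    """Two-factor ANOVA without replication on a subjects-by-treatments table.

    data: list of rows, one row per subject, one column per treatment level
    (hypothetical layout chosen for this sketch).
    """
    n = len(data)       # number of subjects (rows)
    a = len(data[0])    # number of treatment levels (columns)

    grand = mean(x for row in data for x in row)
    row_means = [mean(row) for row in data]
    col_means = [mean(col) for col in zip(*data)]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_row = a * sum((m - grand) ** 2 for m in row_means)  # subjects
    ss_col = n * sum((m - grand) ** 2 for m in col_means)  # within-subjects factor
    ss_err = ss_total - ss_row - ss_col

    df_col = a - 1
    df_err = (a - 1) * (n - 1)
    ms_col = ss_col / df_col
    ms_err = ss_err / df_err
    return {"F": ms_col / ms_err, "df_col": df_col, "df_err": df_err,
            "MS_col": ms_col, "MS_err": ms_err}


# Made-up scores for 4 subjects measured at 3 time points.
result = rm_anova([[1, 2, 3], [2, 3, 5], [3, 5, 6], [4, 4, 7]])
print(result["F"], result["df_col"], result["df_err"])
```

With 15 subjects and 4 time points, as in Example 1 below, the same function would give df_col = 3 and df_err = 42.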
Since the same subject is involved, the different treatments are not independent of each other. This results in an additional assumption called sphericity, which is described in Sphericity.
As usual we start with an example.
Example 1: A program has been developed to reduce the levels of stress for working women. In order to determine whether the program is successful a sample of 15 women was selected and their level of stress was measured (low scores indicate higher levels of stress) before the program, as well as 1, 2 and 3 weeks after the beginning of the program. Based on the data in Figure 1 (range G5:K20) determine whether the program is effective in reducing stress.
Figure 1 – Data for Example 1
We use Excel’s Anova: Two-Factor Without Replication data analysis tool (Figure 2) to carry out the analysis.
Figure 2 – Output from Anova: Two Factor Without Replication
For this problem we aren’t interested in the analysis of the rows, only the columns, which correspond to variations by time. Since the test statistic F = 29.13 > 2.83 = F-crit (or p-value = 2.4E-10 < .05 = α), we reject the null hypothesis and conclude that there is a significant difference between the means.
As usual, we can do further analysis to discover where the differences are, and so in this way determine whether the program is effective. These correspond to the planned and unplanned comparison tests for one-way ANOVA.
Example 2: Compare treatment means before and after the program for the data in Example 1. Determine whether the program is effective and determine the effect size.
We use the same approach as for independent treatments, except that we can only use MSE in calculating the standard error when the sphericity requirement is met and the contrast involves all the treatment levels. When these requirements are met then using MSE yields more power.
Since the sphericity requirement is not met for Example 2 (see Sphericity for details), it is better to calculate the standard deviation in a fashion similar to that for paired data as described in Paired Sample t-Test. We use the contrast weights (1, -1/3, -1/3, -1/3) for Example 1 to compute the equivalent of the differences between the paired data, and then compute the mean, ignoring the sign, and standard deviation of these differences. The standard error of the mean is then the standard deviation divided by $\sqrt{n}$, where n = 15 is the number of subjects. This analysis is presented in Figure 3.
Figure 3 – t test on contrast for Example 2
Some representative formulas in Figure 3 are given in Figure 4.
Figure 4 – Representative formulas for Figure 3
Since p < .05, we conclude that there is a significant difference between stress before and after the program, and so the program appears to be effective.
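The paired-style contrast test used for Example 2 can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reproduction of the worksheet in Figure 3: the function name `contrast_t` and the sample scores are hypothetical, and only the contrast weights (1, -1/3, -1/3, -1/3) come from the text.

```python
from math import sqrt
from statistics import mean, stdev


def contrast_t(data, weights):
    """t test on a within-subjects contrast, computed as for paired data.

    data: one row per subject, one column per treatment level.
    weights: contrast weights, one per treatment level, summing to zero.
    """
    # One contrast value per subject: the weighted sum of that subject's scores.
    values = [sum(w * x for w, x in zip(weights, row)) for row in data]
    n = len(values)
    m = mean(values)
    s = stdev(values)     # sample standard deviation (n - 1 in the denominator)
    se = s / sqrt(n)      # standard error of the mean contrast value
    return {"mean": m, "sd": s, "se": se, "t": m / se, "df": n - 1}


# Made-up scores for 3 subjects: before, then 1, 2 and 3 weeks after.
scores = [[6, 3, 3, 3], [10, 3, 6, 9], [8, 3, 3, 3]]
print(contrast_t(scores, (1, -1/3, -1/3, -1/3)))
```

The resulting t statistic is compared against the t distribution with n - 1 degrees of freedom, which is why the summary of Example 2 reports t(14) for 15 participants.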
The effect size d for this contrast (cell F26 in Figure 3) is
The effect size r for this contrast is
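As a rough sketch, the two effect sizes can be computed as shown below. Both definitions are assumptions made for this illustration: d is taken as the absolute mean of the per-subject contrast values divided by their standard deviation, and r uses the common conversion from a t statistic, r = sqrt(t² / (t² + df)). The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def d_from_contrast(values):
    """Effect size d: |mean| of the contrast values over their standard deviation."""
    return abs(mean(values)) / stdev(values)


def r_from_t(t, df):
    """Effect size r from a t statistic and its degrees of freedom (assumed formula)."""
    return sqrt(t * t / (t * t + df))


# d for made-up per-subject contrast values.
print(d_from_contrast([3, 4, 5]))

# r for the t statistic reported for Example 2, t(14) = -6.65.
print(round(r_from_t(-6.65, 14), 2))
```

Under this conversion the t statistic reported for Example 2 corresponds to r of about 0.87.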
Observation: We can summarize the results of this analysis as follows:
To investigate the effects of a stress reduction program for working women, the stress levels of 15 participants were measured before the program and then 1, 2 and 3 weeks after the start of the program. The overall repeated measures analysis of variance showed a significant difference between weeks (F(3, 42) = 29.13, p < .05). The mean stress score before the program was 12.53, which increased to a mean of 22.73 three weeks after the start of the program (higher scores indicate lower levels of stress), a difference of 10.20. A contrast on this difference was significant (t(14) = -6.65, p < .05). Using the standard deviation of the contrast differences for each participant yielded an effect size of d = 1.17, showing the importance of the program in treating stress.
Observation: As usual, if you don’t conduct the omnibus ANOVA test, you can run a – 1 planned orthogonal contrasts without adjusting alpha, where a is the number of treatments. It is also commonly acceptable to run a – 1 planned contrasts even if they aren’t all orthogonal. If you conduct both the omnibus ANOVA and planned contrasts, then you will need to adjust alpha using the Bonferroni or Dunn-Sidák correction as described in Planned Comparisons for ANOVA.
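For reference, the two alpha adjustments named above can be sketched in a few lines of Python. The function names are hypothetical, and m stands for the number of contrasts being tested.

```python
def bonferroni_alpha(alpha, m):
    """Per-test alpha under the Bonferroni correction for m tests."""
    return alpha / m


def sidak_alpha(alpha, m):
    """Per-test alpha under the Dunn-Sidak correction for m tests."""
    return 1 - (1 - alpha) ** (1 / m)


# Per-test alpha for 3 contrasts at a familywise alpha of .05.
print(bonferroni_alpha(0.05, 3))
print(sidak_alpha(0.05, 3))
```

The Dunn-Sidák value is always slightly larger than the Bonferroni value, so it is marginally less conservative.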
For unplanned tests, contrasts can be used with alpha adjusted by the Bonferroni or Dunn-Sidák correction. Alternatively, use Tukey’s HSD test (for pairwise comparisons) or the Scheffé test (for compound comparisons).
For Tukey’s HSD, the test statistic is
$$q = \frac{\bar{x}_i - \bar{x}_j}{s.e.}$$
where s.e. is the pairwise standard error (as was done for Example 2) and not the usual MSE-based standard error $\sqrt{MS_E/n}$. The critical value of the studentized range q is based on the values of α, a (the number of treatments) and df = n – 1.
For the Scheffé test, again don’t use MSE to compute the standard error, and instead of using dfB * FINV(α, dfB, dfW) as the critical value, use (a – 1) * FINV(α, a – 1, n – 1).