Two within subjects factors

We now consider in detail the case where there are two within-subjects factors: Training and Skill in the following example.

Example 1: A company has created a new training program for their customer service staff. To test the effectiveness of the program they took a sample of 10 employees and assessed their performance in three areas:  Product (knowledge of the company’s products and services), Client (their ability to relate to the customer with politeness and empathy) and Action (their ability to take action to help the customer). They then had the same 10 employees take the training course and rated their performance after the program in the same three areas. Based on the data in Figure 1, determine whether the program was effective.

Figure 1 – Data for Example 1

We use two factor ANOVA with repeated measures to perform this analysis. We interrupt the analysis of this example to give some background, after which we will resume the analysis.

Definition 1: We modify the structural model of Definition 1 of Two Factor ANOVA with Replication as follows.  Note that we will use a to indicate the number of levels for factor A (instead of r) and b to indicate the number of levels for factor B  (instead of c). Also m = the number of subjects/participants.

We use terms such as x̄i (or x̄i.) as an abbreviation for the mean of {xijk: 1 ≤ j ≤ b, 1 ≤ k ≤ m}. We also use terms such as x̄j (or x̄.j) as an abbreviation for the mean of {xijk: 1 ≤ i ≤ a, 1 ≤ k ≤ m}.

We define the effects αi and βj where

αi = μi – μ and βj = μj – μ

Similarly, we define ai and bj where

ai = x̄i – x̄ and bj = x̄j – x̄

We use δij for the effect of level i of factor A with level j of factor B, i.e. the interaction of level i of factor A and level j of factor B. Thus, δij = μij – μi – μj + μ. Similarly, we have

dij = x̄ij – x̄i – x̄j + x̄

It is easy to show that

Σi αi = Σj βj = 0 and Σi δij = Σj δij = 0

Finally, we can represent each element in the sample as

xijk = μ + αi + βj + δij + εijk

where εijk denotes the error (or unexplained) amount, where


where γk is a random effect corresponding to the subjects/participants. All the interaction terms are also random effects; together they make up the error components of our model.

As before we have the sample version

xijk = x̄ + ai + bj + dij + eijk

where  eijk is the counterpart to εijk in the sample. The null hypotheses for main effects are:

H0: µ1. = µ2. = ··· = µa. (Factor A)

H0: µ.1 = µ.2 = ··· = µ.b (Factor B)

These are equivalent to:

H0: αi = 0 for all i (Factor A)

H0: βj = 0 for all j (Factor B)

In addition there is a null hypothesis for the effects due to interaction between factors A and B.

H0: δij = 0 for all i, j

Definition 2: Using the terminology from Definition 1 of Two Factor ANOVA with Replication, except that we use a for the number of levels in factor A (instead of r) and b for the number of levels in factor B (instead of c), and also adding C = the participant factor and m = number of participants, define:

Two within subjects factors

We can also define four types of between group terms.


And similarly for BetAC and BetBC. There is also the following BetABC version:

image2432 image2433

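Property 1:

SS_T = SS_A + SS_B + SS_C + SS_{AB} + SS_{AC} + SS_{BC} + SS_{ABC}

df_T = df_A + df_B + df_C + df_{AB} + df_{AC} + df_{BC} + df_{ABC}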

Proof: It is clear that

image2436 image2437

If we square both sides of the equation, sum over i, j and k and then simplify (with various terms equal to zero as in the proof of Property 2 of Basic Concepts of ANOVA), we get the first result. The second result is trivial.

Property 2: If a sample is made as described in Definition 1 of Basic Concepts of ANOVA, with the xijk independently and normally distributed and with all \sigma^2_{.j} (or \sigma^2_{i.}) equal, then

image1469 image2438

Proof: The proof is similar to that of Property 1 of Basic Concepts of ANOVA.

Theorem 1: Suppose a sample is made as described in Definitions 1 and 2 of Two Factor ANOVA with Replication, with the xijk independently and normally distributed.

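If all μi are equal and all \sigma^2_{i} are equal then

\frac{MS_A}{MS_{AC}} \sim F(df_A, df_{AC})

If all μj are equal and all \sigma^2_{j} are equal then

\frac{MS_B}{MS_{BC}} \sim F(df_B, df_{BC})

Also, under certain circumstances,

\frac{MS_{AB}}{MS_{ABC}} \sim F(df_{AB}, df_{ABC})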

Proof: The result follows from Property 2 and Theorem 1 of F Distribution.

Property 3:

image2442 image2443 image2444 image2445

Observation: We use the following tests:

ANOVA repeated measures tests

If the null hypothesis for factor A is true, then \sum\limits_i {\alpha^2_i} = 0, and so


whereas if the null hypothesis is not true then \frac{MS_A}{MS_{AC}} > 1. The results are similar for the other null hypotheses.
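As a rough illustration of these tests, the following Python sketch computes the sums of squares, degrees of freedom and the three F ratios (A against A × C, B against B × C, and A × B against A × B × C) for a balanced design with two within-subjects factors. The function name `rm_anova_two_within`, the nested-list layout `x[i][j][k]` and the small data set are assumptions made for this illustration: the scores are invented and are not the data of Example 1, and the sums of squares use the standard textbook formulas for a balanced design rather than any particular worksheet layout.

```python
from itertools import product


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def rm_anova_two_within(x):
    """x[i][j][k] = score at level i of factor A, level j of factor B, subject k."""
    a, b, m = len(x), len(x[0]), len(x[0][0])
    A, B, C = range(a), range(b), range(m)

    # Grand mean, factor means and two-way cell means
    grand = mean(x[i][j][k] for i, j, k in product(A, B, C))
    mA = [mean(x[i][j][k] for j, k in product(B, C)) for i in A]
    mB = [mean(x[i][j][k] for i, k in product(A, C)) for j in B]
    mC = [mean(x[i][j][k] for i, j in product(A, B)) for k in C]
    mAB = [[mean(x[i][j]) for j in B] for i in A]
    mAC = [[mean(x[i][j][k] for j in B) for k in C] for i in A]
    mBC = [[mean(x[i][j][k] for i in A) for k in C] for j in B]

    ss = {
        "A": b * m * sum((mA[i] - grand) ** 2 for i in A),
        "B": a * m * sum((mB[j] - grand) ** 2 for j in B),
        "C": a * b * sum((mC[k] - grand) ** 2 for k in C),
        "AB": m * sum((mAB[i][j] - mA[i] - mB[j] + grand) ** 2
                      for i, j in product(A, B)),
        "AC": b * sum((mAC[i][k] - mA[i] - mC[k] + grand) ** 2
                      for i, k in product(A, C)),
        "BC": a * sum((mBC[j][k] - mB[j] - mC[k] + grand) ** 2
                      for j, k in product(B, C)),
        # Three-way residual, computed directly rather than by subtraction
        "ABC": sum((x[i][j][k] - mAB[i][j] - mAC[i][k] - mBC[j][k]
                    + mA[i] + mB[j] + mC[k] - grand) ** 2
                   for i, j, k in product(A, B, C)),
        "T": sum((x[i][j][k] - grand) ** 2 for i, j, k in product(A, B, C)),
    }
    df = {
        "A": a - 1, "B": b - 1, "C": m - 1,
        "AB": (a - 1) * (b - 1), "AC": (a - 1) * (m - 1), "BC": (b - 1) * (m - 1),
        "ABC": (a - 1) * (b - 1) * (m - 1), "T": a * b * m - 1,
    }
    ms = {term: ss[term] / df[term] for term in df}
    # Each effect is tested against its own interaction with the subject factor C
    f = {
        "A": ms["A"] / ms["AC"],
        "B": ms["B"] / ms["BC"],
        "AB": ms["AB"] / ms["ABC"],
    }
    return {"SS": ss, "df": df, "MS": ms, "F": f}


# Invented scores: 2 levels of A x 3 levels of B x 4 subjects
scores = [
    [[10, 12, 11, 9], [8, 9, 7, 10], [7, 6, 8, 9]],
    [[11, 12, 13, 10], [12, 11, 10, 14], [13, 12, 15, 14]],
]
result = rm_anova_two_within(scores)
print(result["F"])
```

The p-values would then be read from the F distribution with the matching pairs of degrees of freedom, e.g. (df A, df A × C) for factor A.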

Example 1 (continued): We now show how to apply the analysis to Example 1. There are two treatment factors: Training (with levels pre and post) and Skill (with levels Product, Client and Action) plus the Subject factor. We begin by creating the worksheet in Figure 2.

Figure 2 – Data for Example 1 plus treatment means

In addition to the raw data (range A3:G14), Figure 2 shows the means for the interaction between the various factors: Training × Skill (B15:G15), Training × Subject (J5:K14), Skill × Subject (M5:O14), as well as the means for each factor: Training (J15:K15), Skill (M15:O15) and Subject (H5:H14). Finally the grand mean is shown in cell H15.

From the information in Figure 2 we create the tables in Figure 3.

Figure 3 – Construction of ANOVA for Example 1

Row 4 contains the values of a, b, m and n for Example 1. E.g. a = the number of Training levels = COUNTA(J3:K3) and n = PRODUCT(R4:T4) or COUNT(B5:G14). The various degrees of freedom are calculated from these values as described in Definition 2. Some examples are given in Figure 4.
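For readers who want to check the degrees of freedom outside Excel, here is a minimal Python sketch that applies the usual formulas to the values a = 2, b = 3 and m = 10 of Example 1. The function name `rm_degrees_of_freedom` is hypothetical, and the formulas are the standard ones for this design (they agree with the df values quoted for A × C, B × C and A × B × C elsewhere on this page); the layout of the cells in Figure 3 may differ.

```python
def rm_degrees_of_freedom(a, b, m):
    """Degrees of freedom for two within-subjects factors A, B and subjects C."""
    n = a * b * m  # total number of observations
    return {
        "A": a - 1,
        "B": b - 1,
        "C": m - 1,
        "AB": (a - 1) * (b - 1),
        "AC": (a - 1) * (m - 1),
        "BC": (b - 1) * (m - 1),
        "ABC": (a - 1) * (b - 1) * (m - 1),
        "T": n - 1,
    }


# Example 1: Training has 2 levels, Skill has 3 levels, 10 subjects
df = rm_degrees_of_freedom(a=2, b=3, m=10)
print(df)
```

The component values add up to the total, n – 1 = 59, which is a quick sanity check on the table.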

Figure 4 – Representative df  from Figure 3

The count values in column R are used to calculate the SS values in column S. In all cases the count value is equal to \frac{n}{df+1}. The values for SS are calculated as in Definition 2. Some examples are given in Figure 5.

Figure 5 – Representative SS from Figure 3

From the table in Figure 3, we arrive directly at the ANOVA analysis in Figure 6.

Figure 6 – ANOVA with repeated measures for Example 1

Note that the p-values for both main effects and the interaction effect are significant, but before we conclude this definitively we need to take sphericity into account. For the Training factor there are only two levels and so sphericity is automatic. Since the Skill factor has more than two levels we need to calculate the GG and HF epsilon correction factors (see Figure 7) as we did in Sphericity for the one within subjects factor analysis.

Figure 7 – Sphericity correction factors for Skill (B) effect

Here the covariance matrix is calculated using the supplemental array formula =COV(M5:O14). We see that GG epsilon is close to 1 and HF epsilon is 1, and so it appears that there is no problem with sphericity.
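To make the GG epsilon calculation more concrete, the sketch below applies the standard Greenhouse-Geisser formula to a k × k sample covariance matrix. The function name `gg_epsilon` and both matrices are invented for illustration; they are not the covariance matrix obtained from Figure 2, so the printed values say nothing about Example 1.

```python
def gg_epsilon(cov):
    """Greenhouse-Geisser epsilon from a k x k sample covariance matrix."""
    k = len(cov)
    diag_mean = sum(cov[i][i] for i in range(k)) / k
    grand_mean = sum(sum(row) for row in cov) / k ** 2
    row_means = [sum(row) / k for row in cov]
    sum_sq = sum(v * v for row in cov for v in row)

    numerator = (k * (diag_mean - grand_mean)) ** 2
    denominator = (k - 1) * (
        sum_sq - 2 * k * sum(r * r for r in row_means) + k ** 2 * grand_mean ** 2
    )
    return numerator / denominator


# Equal variances and equal covariances: sphericity holds exactly
spherical = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]
# Very unequal variances: sphericity is violated
unequal = [[1, 0, 0], [0, 4, 0], [0, 0, 9]]

print(gg_epsilon(spherical))
print(gg_epsilon(unequal))
```

Epsilon always lies between the lower bound 1/(k – 1) and 1. The corrected test multiplies both degrees of freedom of the F test by epsilon before the p-value is looked up.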

We also need to worry about the sphericity of the interaction effect. For this we will simply look at the lower bound which for the interaction effect is equal to \frac{1}{(a-1)(b-1)} = \frac{1}{(2-1)(3-1)} = .5. Figure 8 provides a revised version of Figure 6 taking sphericity into account.

Figure 8 – ANOVA with repeated measures corrected for sphericity

We see from Figure 8 that the main effects and interaction effect are all significant. We now use planned and/or unplanned tests to better understand what is going on.

Observation: As described in ANOVA with Repeated Measures with One Within Subjects Factor, where the sphericity assumption holds, we can use the same types of tests that we used for ANOVA with independent variables. For these tests the standard error is based on MSE, where the appropriate value of MSE is employed: MSA×C with dfA×C = (a–1)(m–1) for the main effect A, MSB×C with dfB×C = (b–1)(m–1) for the main effect B, and MSA×B×C with dfA×B×C = (a–1)(b–1)(m–1) for the interaction effect A × B.

Where sphericity doesn’t hold it is best to use the contrast-specific standard error and not use an MSE type value. In fact, it is probably better to use the contrast-specific standard error in all cases. Unplanned tests can also be used as described in ANOVA with Repeated Measures with One Within Subjects Factor.

Example 2: For the data in Example 1 determine whether there is a significant difference in the training program for each of the three skills.

Figure 9 – Comparison of means for interaction

As we can see from Figure 9, it appears that training improves performance for each skill, although only slightly in the case of the Product skill. We need to determine whether any of these differences are significant. We do this by performing pairwise comparisons.

Figure 10 – Contrast for pre and post training for Product

The test for the Product skill in Figure 10 shows there is no significant difference between the means for Product pre-test and post-test (p > .05). If we perform the same analysis for Client we see there is a significant difference: t(9) = -4.63, p = .001 < .05, d = 1.47. Similarly for Action: t(9) = -11.21, p = 3E-05 < .05, d = 2.43. Note that the contrast analysis is equivalent to the t-test on paired samples. Even if we use a Bonferroni or Dunn-Sidák correction, the results will be similar.
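The remark about the Bonferroni and Dunn-Sidák corrections can be checked with a few lines of Python. The sketch assumes a familywise α = .05 spread over the three planned comparisons (one per skill) and uses the two p-values reported above; the helper name `adjusted_alphas` is hypothetical.

```python
def adjusted_alphas(alpha, k):
    """Per-test significance levels for k comparisons."""
    bonferroni = alpha / k
    sidak = 1 - (1 - alpha) ** (1 / k)
    return bonferroni, sidak


bonferroni, sidak = adjusted_alphas(alpha=0.05, k=3)

# p-values quoted in the text for the Client and Action comparisons
p_values = {"Client": 0.001, "Action": 3e-05}
for skill, p in p_values.items():
    print(skill, p < bonferroni, p < sidak)
```

Both p-values fall below either corrected threshold (roughly .0167 and .0170), so the Client and Action results remain significant after correction.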

If we had chosen to make the 6 unplanned tests using Tukey’s HSD, including a test of Product pre-training vs. post-training, then from the above analysis we know that tobs = -1.34. We need to compare the absolute value of this observed value of t with the critical value of t (as described in Unplanned Comparisons for ANOVA). First we look up the q critical value q(k, df, α) where α = .05, k = ab = 6 and df = m – 1 = 9. From the Studentized Range q Table, we have q(6, 9, .05) = 5.024, and so tcrit = \frac{q_{crit}}{\sqrt{2}} = 3.55. Since |tobs| = 1.34 < tcrit = 3.55 we conclude that there is no significant difference between the Product performance before and after the training.
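The conversion from the studentized range critical value to a critical value of t is simple arithmetic, sketched here in Python with the numbers quoted above (q = 5.024 taken from the table, tobs = -1.34). The function name `tukey_t_crit` is hypothetical.

```python
import math


def tukey_t_crit(q_crit):
    """Convert a studentized range critical value into a critical value of t."""
    return q_crit / math.sqrt(2)


q_crit = 5.024   # q(6, 9, .05) as read from the Studentized Range q Table
t_obs = -1.34    # observed t for Product pre-training vs. post-training

t_crit = tukey_t_crit(q_crit)
print(round(t_crit, 2), abs(t_obs) > t_crit)
```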

Example 3: For the data in Example 1 determine whether the training has had the same effect on the Action skill as it has had on the Client skill.

We test the following null hypothesis:

         H0: μAction×Post – μAction×Pre = μClient×Post – μClient×Pre

We now compare the differences between Post and Pre training for Action with the differences between Post and Pre training for Client.

Figure 11 – Contrast for Training effect on Client vs Action

Here we see that p = .64 > .05 = α, and so we conclude there is no significant difference between training effect on Client skills vs. Action skills.
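A contrast of this kind can be sketched in Python by computing one contrast score per subject and then running a one-sample t test on those scores, which uses the contrast-specific standard error. The function name `contrast_t`, the column order and the five rows of scores are all assumptions for illustration; the scores are invented and do not reproduce Figure 11.

```python
import math
import statistics


def contrast_t(rows, coeffs):
    """t statistic and df for a within-subjects contrast.

    rows:   one list of scores per subject
    coeffs: contrast coefficients, one per column (they should sum to zero)
    """
    scores = [sum(c * v for c, v in zip(coeffs, row)) for row in rows]
    m = len(scores)
    std_err = statistics.stdev(scores) / math.sqrt(m)  # contrast-specific s.e.
    return statistics.mean(scores) / std_err, m - 1


# Invented data; columns are Client pre, Client post, Action pre, Action post
rows = [
    [5, 7, 4, 8],
    [6, 7, 5, 8],
    [4, 7, 5, 9],
    [6, 8, 6, 9],
    [5, 6, 4, 7],
]
# (Action post - Action pre) - (Client post - Client pre)
coeffs = [1, -1, -1, 1]

t, df = contrast_t(rows, coeffs)
print(t, df)
```

The resulting t is compared with the t distribution on m – 1 degrees of freedom. With coefficients (-1, 1) on a pre/post pair of columns the same function reduces to the paired-samples t test.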

Real Statistics Data Analysis Tool: The contrasts used in Examples 2 and 3 can also be produced using the Contrasts option of the Repeated Measures Anova and Contrasts data analysis tool found in the Real Statistics Resource Pack.

First press Ctrl-m and select Repeated Measures Anova and Contrasts from the menu. A dialog box will appear as in Figure 5 of Sphericity. Next enter the appropriate range in the Input Range field, select the Contrasts option, choose the appropriate Alpha correction for contrasts option (see Figure 5 of Sphericity) and click on OK.

For Example 2 you should choose the range J5:K14 in the worksheet from Figure 2 as the input. For Example 3 you should choose the range M5:O14 as the input. In both cases you will need to deselect the Headings included in the range option and add the headings in manually.

18 Responses to Two within subjects factors

  1. Andrew K. says:

    Hi Charles,

    Thank you for this excellent resource!

    I am interested in a repeated-measures ANOVA with 2 within-subjects factors that models the effect of each repeated factor (factorA – 5 levels; factorB – 4 levels) and their interaction (factorA*factorB – 20 levels) on a continuous dependent variable. I am also interested in performing contrasts within levels of each factor separately and the interaction levels.

    The problem I am having is when I switch from using SPSS to your code in Excel the results change slightly, enough to make a difference in my original SPSS-based conclusions. I modified your worksheet “ANOVA Match 2.1” for my data and went over it calc by calc to make sure I didn’t make mistakes. I used the “Follow up 2-factor ANOVA” tool to build all contrasts, but based the data on descriptive stats output from the “Mixed Repeated Measures ANOVA” tool.

    To summarize:

    Effect of factorA and factorB:
    When comparing SPSS output and Excel almost all ANOVA table values match – i.e. SS, df (sphericity & GG epsilon), MS, and F values match – but the p-values don’t match because Excel p-values are slightly higher.

    Effect of the interaction:
    After comparing outputs I found 1) GG epsilon is larger in Excel, and 2) p-value differs (due at least to difference in GG epsilon and possibly unknown bias in aforementioned calculation of p-values).

    When comparing levels in either factor, the p-values in Excel are higher than SPSS. Note: I don’t know how to perform contrasts on the interaction levels in SPSS, but my concerns still apply.

    So, my loaded question is: Why could SPSS and Excel output be different?

    Ultimately I want to test using your Excel tools but have concerns when those results don’t match SPSS. Any assistance you can provide is greatly appreciated!

    -Andrew K.

    • Charles says:

      I don’t know why the results from SPSS and Excel would be different.
      As you probably know, I have not yet implemented Repeated Measures ANOVA with two between subject factors. The Mixed Repeated Measures ANOVA tool handles the case with one between subjects factor and one within subjects factor. The GG epsilon values will be different.
      The description on the referenced webpage should be correct as are the spreadsheet calculation in the Examples workbook. I am using the approach found in David Howell’s Statistical Methods for Psychology textbook.
      In this case, I would tend to rely on the SPSS output, since they have already packaged the tools that you need, whereas perhaps you need to work a little harder with the Excel version that I have described on the website.

  2. mahpishanu says:

    Thanks a lot for your nice stats explanation and useful tool.
    I have the following data and I need to find the difference between (old and young), (young and kid) and (kid and young).

    Would it be possible that you could please help me?
    here is my data:

    subjects measures conds values
    1 weight old 71
    2 weight old 64
    3 weight old 79
    1 height old 172
    2 height old 151
    3 height old 179
    1 heartbeats old 72
    2 heartbeats old 99
    3 heartbeats old 90
    1 weight young 84
    2 weight young 83
    3 weight young 59
    1 height young 161
    2 height young 176
    3 height young 198
    1 heartbeats young 80
    2 heartbeats young 79
    3 heartbeats young 62
    1 weight kid 23
    2 weight kid 25
    3 weight kid 30
    1 height kid 146
    2 height kid 98
    3 height kid 58
    1 heartbeats kid 68
    2 heartbeats kid 86
    3 heartbeats kid 84

    • Charles says:

      Based on the type of data you have presented it seems like you have two dependent variables, namely weight and heartbeat and one independent variable age group (kid, young, old). This is compatible with MANOVA. See MANOVA for more details.

      Whether this is really the test to use depends on the hypotheses you want to test.


  3. Michael Orendurff says:

    I’m having some trouble filling out the dialog box, and I’m not sure how arrange the rows and columns in excel. How is a two factor x two level RM ANOVA different from a one-factor four level RM ANOVA in the column and row layout?

    My ANOVA table always has missing cells that do not calculate. I’m running iOS 10.11.1 (El Capitan) and Excel 2011 for Mac.

  4. Thomas Green says:

    Hi. Your work is excellent, but I’d appreciate some help. I can’t find the menu entry for ‘two-way anova with repeated measures on both factors’. It’s there in the multivariate workbook , but not on my computer. When I look at the RealStatistics menu in Excel I see ‘Analysis of Variance’; when I click on that I’m offered Anova:one factor, Anova:two factors, Anova:three fixed factors, Manova single factor, Nested anova:two factors, Repeated measures:one factor, Repeated measures: mixed, and Ancova. None of those seem to be what I want.

    I’ve had the same problem on both the Windows version and the Mac version. I’m sure I’m missing something! If you could tell me where to find the right button I’d be very grateful.

    Best wishes and thank you in advance

    Thomas Green

    • Charles says:

      Sorry Thomas, but I don’t yet provide a two-way anova with repeated measures on both factors option.

      • Thomas Green says:

        Aieee! There’s one in the Workbook, or am I mistaken? It’s exactly my design – each participant did the same 3 programming problems (factor 1) with or without syntax highlighting (factor 2). I could pool the scores over problems to get a simple test of the effect of highlighting, but the graphs suggest an interaction effect – syntax highlighting seems to be more useful on the harder problems, as you’d expect. What’s my best route forward – planned comparisons? It’s a bit cheeky of me to ask for stats advice, when you’ve already put in so much unpaid work just for the benefit of total strangers. Please feel free to ignore me if you like.

        Many thanks,


        • Charles says:


          Yes, there is an example of your design described on the website, and you can indeed find the example in the Examples Workbook. I just haven’t implemented it yet in the software. You can use planned comparisons to pinpoint the effect. I give quite a few examples (plus support in the Real Statistics Resource Pack) of how to do this for two factor ANOVA (but not for repeated measures) and one factor ANOVA with repeated measures.


  5. san2x says:

    hai charles,
    how about if my study is just two-way within subjects?? what do you think are its assumptions???

    thank you 🙂

    • Charles says:

      What do you mean by “how about if my study is just two-way within subjects??” How does your study differ from the type of analysis described on the referenced webpage?

      The assumptions for Anova with repeated measures (in general) are:
      – Dependent variables have interval measure
      – Independent variables should be repeated measure
      – No significant outliers
      – Distribution of each group should be normal (although the test is pretty robust to violations of this assumption)
      – Sphericity (although the correction factors can be used when this assumption is violated).


  6. David says:

    Also, I see from your site you can adapt a two-way ANOVA with repeated measures on only one factor for an “Unbalanced model”. Can you do the same here for a two-way ANOVA with repeated measures on both factors?

    • Charles says:

      I have not yet worked on using regression approaches (or general linear models) for this situation. I plan to turn my attention to this situation shortly.

  7. David says:

    I’ve installed the Resource Pack for Excel 2007. Is the “Repeated Measures ANOVA” function in the 2007 Resource Pack suitable for Repeated Measures on 2 factors, or only for 1 factor?

    After I format my data to be in a table arranged similar to what you show in Figure 1, can I use to Resource Pack to generate the rest of the readouts?

    Using the “Repeated Measures ANOVA” option with the “Contrasts” box ticked gives a readout, but it is not what I observe in your Figure 2.

    Is Figure 2 generated using some other options in your Resource Pack, or do you have to actually create the treatment means tables yourself every time?

    Thank you for this tool pack and useful website!

  8. Marianna Lemus says:

    I just would like to thank you for creating something straight forward, with practical examples and yet enough theory that one can remain focus on what we are trying to achieve without getting lost in theory. I am talking, obviously, for those that have a pre- understanding and just want to double check the test and methods chosen with regards to the data and specific parameters are right. Well done indeed and please, keep this going. THANK YOU!
