ANCOVA requires the same assumptions as ANOVA (normality, homogeneity of variance and random independent samples). In addition, ANCOVA requires the following assumptions:
- For each level of the independent variable, the relationship between the dependent variable (y) and the covariate (x) is linear
- The lines expressing these linear relationships are all parallel (homogeneity of regression slopes)
- The covariate is independent of the treatment effects (i.e. the covariate and the independent variable are independent)
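One informal way to probe the last assumption is to run a one-way ANOVA on the covariate itself, using the treatment groups as the factor; a large F statistic would suggest that the covariate differs across treatments. The Python sketch below is only an illustration of this idea. The function name one_way_f and the covariate values are hypothetical and are not taken from the example data.

```python
from statistics import mean

def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups (hypothetical helper)."""
    k = len(groups)                          # number of treatment groups
    n = sum(len(g) for g in groups)          # total sample size
    grand_mean = mean(v for g in groups for v in g)
    # Between-groups and within-groups sums of squares
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Made-up covariate values for three treatment groups (illustration only)
covariate_by_group = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_covariate = one_way_f(covariate_by_group)
print(f_covariate)
```

The resulting statistic would be compared with an F distribution with k - 1 and n - k degrees of freedom, for example via Excel's F.DIST.RT function.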
Example 1: Show that the assumptions hold for the data in Example 1 of Basic Concepts of ANCOVA.
We start by creating a box plot of the reading scores for each of the four methods (using the data from Figure 1 of Basic Concepts of ANCOVA). See Figure 1.
Figure 1 – Box plot for data in Example 1
Each plot looks relatively symmetric and the variances don’t appear to be wildly different. As we can see from the data in Figure 1 of Basic Concepts of ANCOVA, the variances for the reading scores vary from 44.8 to 164.8, which is likely to be an acceptable range to meet the homogeneity of variances assumption.
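A quick numerical companion to the box plot is the ratio of the largest to the smallest group variance; for the variances quoted above this is 164.8/44.8, or about 3.7. The Python sketch below shows how such a ratio could be computed. The function name variance_ratio and the scores are hypothetical placeholders, not the data from the example, and rules of thumb for an acceptable ratio vary between authors.

```python
from statistics import variance

def variance_ratio(groups):
    """Return (largest sample variance) / (smallest sample variance) across groups."""
    variances = [variance(values) for values in groups.values()]
    return max(variances) / min(variances)

# Made-up scores per method (illustration only, not the example data)
scores = {
    "Method 1": [12, 15, 19, 22, 25],
    "Method 2": [10, 18, 21, 27, 30],
    "Method 3": [14, 16, 17, 20, 24],
}
print(round(variance_ratio(scores), 2))

# Ratio for the variances reported in the text
print(round(164.8 / 44.8, 2))
```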
We now turn our attention to the ANCOVA-specific assumptions. We create a scatter diagram of the y data values against the x data values for each of the four methods. This is done by creating a scatter diagram for Method 1 in the usual way and then choosing Design > Data|Select Data and clicking on the Add button on the left side. Enter the name Method 2 and specify the range for the x and y values in the dialog box that appears. After repeating this procedure for Method 3 and Method 4 and adding linear trend lines for each method, the resulting chart is as in Figure 2.
Figure 2 – Checking whether regression lines are parallel
Although the four lines are not exactly parallel, their slopes are quite similar, which suggests that the homogeneity of slopes assumption is met. We can check this more formally by testing the complete regression model (y, x, t, x*t) against the full regression model (y, x, t). If there is no significant difference between the two models, then the interaction terms are not significant, which is consistent with the homogeneity of regression slopes assumption. We conduct the same type of test in Testing the Significance of Extra Variables on the Regression Model.
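The visual comparison of trend lines can be complemented by computing the least-squares slope for each method separately, which is what each chart trend line represents. The following Python sketch assumes made-up (x, y) pairs; the function name slope and the data are hypothetical and do not reproduce the values behind Figure 2.

```python
from statistics import mean

def slope(xs, ys):
    """Least-squares slope of y on x for a single group."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Made-up covariate (x) and outcome (y) values per method (illustration only)
data = {
    "Method 1": ([1, 2, 3, 4], [2.0, 4.1, 5.9, 8.0]),
    "Method 2": ([1, 2, 3, 4], [3.0, 4.8, 7.1, 9.0]),
}
slopes = {method: slope(xs, ys) for method, (xs, ys) in data.items()}
for method, b in slopes.items():
    print(method, round(b, 3))
```

Similar slopes are only an informal indication; the model comparison test gives a formal check.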
First we use Excel’s regression data analysis tool to create the complete model (see Figure 3) using the range B4:H39 from Figure 1 of Regression Approach to ANCOVA when prompted for the Input X range.
Figure 3 – Complete model (y, x, t, x*t) for data in Example 1
Now we test (see Figure 4) whether there is a significant difference between the complete and full models (as described in Figure 5 of Regression Approach to ANCOVA and Figure 3 above).
Figure 4 – Testing homogeneity of regression line slopes
Row 6 of Figure 4 computes the difference between the R-Square values of the complete and full models. Row 7 computes the difference between the residual degrees of freedom of the two models. The F statistic (cell AB8) is then calculated by the formula =AB6*Z7/(AB7*(1-Z6)). Since the p-value for this statistic is larger than .05, we conclude that there is no significant difference between the two models, and so we have no evidence against the homogeneity of regression slopes assumption.
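The worksheet formula has the form of the usual F test for a change in R-square between nested regression models. The Python sketch below mirrors that formula under the assumption that cells Z6 and Z7 hold the R-square value and the residual degrees of freedom of the complete model, which the text does not state explicitly. The function name f_change and the input numbers are hypothetical, not the values from Figure 4.

```python
def f_change(r2_complete, r2_full, df_res_complete, df_res_full):
    """F statistic for the change in R-square between two nested models.

    Mirrors =AB6*Z7/(AB7*(1-Z6)), assuming Z6 is the complete model's
    R-square and Z7 is its residual degrees of freedom (an assumption).
    """
    delta_r2 = r2_complete - r2_full          # plays the role of cell AB6
    delta_df = df_res_full - df_res_complete  # plays the role of cell AB7
    return (delta_r2 * df_res_complete) / (delta_df * (1 - r2_complete))

# Made-up inputs (illustration only)
f_stat = f_change(r2_complete=0.60, r2_full=0.50,
                  df_res_complete=20, df_res_full=22)
print(round(f_stat, 4))
```

The p-value would then come from the right tail of an F distribution with delta_df and df_res_complete degrees of freedom, for example via Excel's F.DIST.RT function.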
Alternatively, we can obtain the p-value for this test directly by using the Real Statistics function
RSquareTest(B4:H39, B4:E39, A4:A39) = 0.4615