When the homogeneity of variances assumption is not met, especially with unequal sample sizes, Welch’s Test is a good approach for performing a one-way ANOVA.
Property 1: If F is defined as follows:
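For reference, the standard textbook form of Welch’s statistic is sketched below. The notation is an assumption introduced here and may differ from the site's own: k is the number of groups, and n_j, \bar{x}_j and s_j^2 are the size, mean and sample variance of group j.

```latex
\[
w_j = \frac{n_j}{s_j^2}, \qquad
w = \sum_{j=1}^{k} w_j, \qquad
\bar{x}_w = \frac{1}{w}\sum_{j=1}^{k} w_j \bar{x}_j
\]
\[
F = \frac{\dfrac{1}{k-1}\displaystyle\sum_{j=1}^{k} w_j\,(\bar{x}_j - \bar{x}_w)^2}
         {1 + \dfrac{2(k-2)}{k^2-1}\displaystyle\sum_{j=1}^{k} \dfrac{1}{n_j-1}\left(1-\dfrac{w_j}{w}\right)^2}
\]
\[
df_1 = k-1, \qquad
df_2 = \frac{k^2-1}{3\displaystyle\sum_{j=1}^{k} \dfrac{1}{n_j-1}\left(1-\dfrac{w_j}{w}\right)^2}
\]
```

Under the null hypothesis that all group means are equal, F approximately follows an F distribution with df_1 and df_2 degrees of freedom, where df_2 is generally not an integer.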
Example 1: Repeat Example 1 of Kruskal-Wallis using the data in range E19:G29 of Figure 1 by performing Welch’s Test.
Figure 1 – Welch’s Test
We see from row 33 of Figure 1 that the variances of the three groups are 16.2, 86.5 and 265.6, and so we suspect there is a significant difference between the variances. This is confirmed by Levene’s test (on the medians), which yields the p-value Levene(E20:G29,1) = 0.005478 < .05. Thus the standard one-way ANOVA is not the appropriate test to use. We employ Welch’s test instead, as shown in Figure 1.
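Outside Excel, the check on the variances could be sketched as follows. This is a minimal illustration, assuming that Levene’s test on the medians is computed as a one-way ANOVA on the absolute deviations from each group's median. The function names and the data are hypothetical (the values from Figure 1 are not reproduced here), and the p-value helper only covers the three-group case, where the numerator degrees of freedom equal 2.

```python
from statistics import mean, median

def one_way_f(groups):
    """Classical one-way ANOVA: return (F, df1, df2)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [mean(g) for g in groups]
    grand = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df1, df2 = k - 1, n_total - k
    return (ss_between / df1) / (ss_within / df2), df1, df2

def levene_median(groups):
    """Levene's test on the medians: ANOVA on |x - group median|."""
    deviations = [[abs(x - median(g)) for x in g] for g in groups]
    return one_way_f(deviations)

def f_upper_tail_df1_2(f_stat, df2):
    """P(F > f_stat) for F(2, df2); closed form valid only when df1 = 2."""
    return (1.0 + 2.0 * f_stat / df2) ** (-df2 / 2.0)

# Hypothetical samples with visibly different spreads (not the article's data).
samples = [
    [51, 54, 49, 52, 50, 53, 55, 48],
    [60, 45, 72, 58, 39, 66, 51],
    [70, 35, 90, 48, 62, 28, 81, 55, 44],
]
f_stat, df1, df2 = levene_median(samples)
print("Levene F:", f_stat, "df:", (df1, df2))
print("p-value:", f_upper_tail_df1_2(f_stat, df2))
```

A small p-value here plays the same role as the Levene(E20:G29,1) result above: it signals that the equal-variance assumption behind the classical ANOVA is doubtful.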
We see from Figure 1 that the p-value = .041355 < .05 = α, and so we conclude that there is a significant difference between the means of the three groups.
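The calculation behind Figure 1 can be sketched in a few lines of Python. This is an illustrative sketch using the standard textbook form of Welch’s statistic, not the worksheet formulas themselves. The function names and the sample data are hypothetical, and the p-value helper relies on a closed form that holds only when there are three groups (df1 = 2).

```python
from statistics import mean, variance

def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way test on a list of samples."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Weight each group by n_j / s_j^2 (sample variance, n - 1 denominator).
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    w_total = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / w_total
    between = sum(w * (m - weighted_mean) ** 2
                  for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / w_total) ** 2 / (n - 1)
              for w, n in zip(weights, sizes))
    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)  # usually not an integer
    return f_stat, df1, df2

def f_upper_tail_df1_2(f_stat, df2):
    """P(F > f_stat) for F(2, df2); closed form valid only when df1 = 2."""
    return (1.0 + 2.0 * f_stat / df2) ** (-df2 / 2.0)

# Hypothetical samples with unequal sizes and variances (not the article's data).
samples = [
    [51, 54, 49, 52, 50, 53, 55, 48],
    [60, 45, 72, 58, 39, 66, 51],
    [70, 35, 90, 48, 62, 28, 81, 55, 44],
]
f_stat, df1, df2 = welch_anova(samples)
print("F =", f_stat, "df1 =", df1, "df2 =", df2)
print("p-value =", f_upper_tail_df1_2(f_stat, df2))
```

For more than three groups the tail probability needs a general F distribution routine (for example from a statistics library) in place of the df1 = 2 shortcut.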
Note that if we had used ANOVA (see Figure 2) we would have come to the opposite conclusion, namely that there is no significant difference between the means (since p-value = .14 > .05 = α).
Figure 2 – ANOVA on the same data
Real Statistics Function: The Real Statistics Resource Pack contains the following array function where R1 is the data without headings, organized by columns:
WELCH_TEST(R1, lab): outputs a column range with the values F, df1, df2 and p-value for Welch’s test for the data in range R1.
If lab = TRUE a column of labels is added to the output, while if lab = FALSE (default) no labels are added.
For Example 1, the result of WELCH_TEST(E20:G29,TRUE) is similar to range D40:E43 of Figure 1. The main difference is that this function uses the Real Statistics F_DIST function instead of the Excel function F.DIST (or FDIST) to calculate the p-value and so obtains a more accurate result.
Real Statistics Data Analysis Tool: The Real Statistics Resource Pack provides access to Welch’s test via the One Factor Anova data analysis tool, as described in the following example.
Example 2: Repeat Example 1 using the Real Statistics data analysis tool.
Press Ctrl-m, double-click on Analysis of Variance, and select Anova: one factor in the dialog box that appears. Then fill in the resulting dialog box as shown in Figure 3.
Figure 3 – Dialog box for Welch’s data analysis tool
The output is shown in Figure 4.
Figure 4 – Welch’s test data analysis tool
Note that the results shown in Figure 4 agree with those in Figure 1 except that the p-value is slightly lower. The reason is that Figure 1 uses the formula =FDIST(E40,E41,E42), which is equivalent to =1-F.DIST(E40,E41,E42,TRUE). Both of these formulas truncate the value in E42 down to an integer, i.e. they evaluate =FDIST(4.315278,2,11). The calculation in Figure 4 uses F_DIST instead of F.DIST and so is more exact, since the full value of df2 = 11.69964 is used.
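The effect of the truncation can be checked numerically. The sketch below is not how Excel or Real Statistics computes the p-value; it relies on the fact that for df1 = 2 the upper tail of the F distribution has the closed form (1 + 2F/df2)^(-df2/2), which accepts a non-integer df2. The function name is hypothetical; the values of F and df2 are the ones quoted above.

```python
import math

def f_upper_tail_df1_2(f_stat: float, df2: float) -> float:
    """Upper-tail probability P(F > f_stat) for an F(2, df2) distribution.

    Closed form valid only for numerator df = 2; df2 may be fractional.
    """
    return (1.0 + 2.0 * f_stat / df2) ** (-df2 / 2.0)

F_STAT = 4.315278   # F statistic quoted for Example 1
DF2 = 11.69964      # denominator degrees of freedom quoted for Example 1

# FDIST / F.DIST effectively use the integer part of df2.
p_truncated = f_upper_tail_df1_2(F_STAT, math.floor(DF2))
# Keeping the fractional df2, as the text says F_DIST does.
p_full = f_upper_tail_df1_2(F_STAT, DF2)

print(f"df2 truncated to 11: p = {p_truncated:.6f}")
print(f"df2 = 11.69964:      p = {p_full:.6f}")
```

With the truncated df2 this reproduces the p-value .041355 from Figure 1, and with the full df2 the p-value comes out slightly lower, consistent with the comparison above.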
As can be seen from Figure 3, data for Welch’s test can be organized in standard format. The first 10 of the 27 rows of the data for Example 1 in standard format are shown in Figure 5.
Figure 5 – Data in standard format
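As a rough illustration of the layout, the sketch below stacks column-oriented data into one row per observation. It assumes that "standard format" means a group label in the first column and the data value in the second, which is an interpretation made here since Figure 5 is not reproduced. The function name and the data are hypothetical.

```python
def to_standard_format(columns):
    """Stack {group label: values} into (group, value) rows, skipping blanks."""
    return [(label, value)
            for label, values in columns.items()
            for value in values
            if value is not None]

# Hypothetical data with unequal group sizes; None marks an empty cell.
wide = {
    "Group 1": [12, 15, 11],
    "Group 2": [20, 31, None],
    "Group 3": [8, 40, 27],
}

rows = to_standard_format(wide)
for group, value in rows[:4]:
    print(group, value)
```

Empty cells are dropped, so the number of rows equals the total number of observations rather than the size of the original range.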