Resampling for ANOVA

Another approach to handling ANOVA type analyses when the assumptions are violated is to use resampling, as described in Resampling Procedures.

Example 1: Repeat Example 1 of Kruskal-Wallis using bootstrapping (the data is repeated in Figure 1).

Figure 1 – Sample data

The sample data contains 27 data elements: 10 New, 9 Old and 8 Control. As can be seen in Kruskal-Wallis, the data violates the homogeneity of variance assumption, and so we can’t be sure whether ANOVA will yield valid results. We therefore use the Resampling data analysis tool as follows.

Press Ctrl-m and double-click on the Resampling data analysis tool from the menu. Next fill in the dialog box that appears as shown in Figure 2 and click on the OK button.

Figure 2 – Resampling dialog box

The output is shown in Figure 3.

Figure 3 – Bootstrapping test for ANOVA

The data analysis tool first calculates the F-stat for the sample data. This can also be done using the Excel or Real Statistics single-factor (one-way) ANOVA data analysis tool or via the ANOVA1 function. For Example 1, F-stat = ANOVA1(A4:C13) = 2.109681.
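For readers who want to see the calculation behind the F-stat outside Excel, here is a minimal Python sketch of the one-way ANOVA F statistic. The function name f_stat and the toy numbers are illustrative assumptions; they are not the Figure 1 data, and the sketch is not a description of how ANOVA1 is implemented.

```python
from statistics import fmean


def f_stat(groups):
    """One-way ANOVA F statistic for a list of groups (each a list of numbers)."""
    k = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total number of observations
    grand_mean = fmean(x for g in groups for x in g)
    means = [fmean(g) for g in groups]
    # Between-groups and within-groups sums of squares
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    # F = MS_between / MS_within with k-1 and n-k degrees of freedom
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Made-up toy data (NOT the data in Figure 1), with unequal group sizes
toy = [[1, 2, 3], [2, 3, 4, 5], [3, 4, 5]]
print(f_stat(toy))
```

The groups may have different sizes, as in Example 1, since each group contributes its own size to both sums of squares.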

The data analysis tool now creates a new sample of size 27 (the size of the original sample) by randomly drawing 27 elements from the original sample with replacement. It places the first 10 in the New group, the next 9 in the Old group and the remaining 8 in the Control group, and then calculates the F-stat for this new sample. This is repeated 10,000 times (since Iterations is set to 10,000 in Figure 2).

For each iteration, the data analysis tool determines whether the bootstrap F-stat is larger than 2.109681 (the F-stat for the original sample). The p-value for the test is equal to the count of bootstrap F-stats > 2.109681 divided by 10,000. As we can see from Figure 3, for Example 1, p-value = .1452 (cell P26). Based on α = .05, this means that we cannot reject the null hypothesis that the three groups have equal means.
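The bootstrap procedure just described can be sketched in Python as follows. This is only an illustration under stated assumptions: the names f_stat, split and bootstrap_anova are hypothetical, the numbers are made up (they are not the Figure 1 data), and the Real Statistics tool may differ in details such as its random number generator.

```python
import random
from statistics import fmean


def f_stat(groups):
    """One-way ANOVA F statistic; returns inf if there is no within-group spread."""
    k, n = len(groups), sum(len(g) for g in groups)
    grand_mean = fmean(x for g in groups for x in g)
    means = [fmean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:  # degenerate resample, avoid dividing by zero
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def split(values, sizes):
    """Cut a flat list into consecutive groups of the given sizes."""
    groups, start = [], 0
    for size in sizes:
        groups.append(values[start:start + size])
        start += size
    return groups


def bootstrap_anova(groups, iterations=10_000, seed=None):
    """Return (observed F, bootstrap p-value)."""
    rng = random.Random(seed)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    f_obs = f_stat(groups)
    exceed = 0
    for _ in range(iterations):
        # Draw len(pooled) values WITH replacement, then refill the groups in order
        resample = rng.choices(pooled, k=len(pooled))
        if f_stat(split(resample, sizes)) > f_obs:
            exceed += 1
    # p-value = share of bootstrap F-stats larger than the observed F-stat
    return f_obs, exceed / iterations


# Made-up data with group sizes 10, 9 and 8 (NOT the data in Figure 1)
new = [12, 15, 9, 20, 14, 11, 17, 13, 16, 10]
old = [8, 30, 5, 22, 3, 18, 27, 6, 12]
control = [9, 11, 10, 12, 8, 10, 11, 9]
print(bootstrap_anova([new, old, control], iterations=2000, seed=0))
```

Because the draws are random, the p-value changes slightly from run to run unless a seed is fixed; more iterations reduce that variation.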

Observation: The analysis can also be done using randomization. The approach is identical to that described above, except that the samples of size 27 are drawn without replacement (e.g. by using the SHUFFLE function instead of the RANDOMIZE function), so that each new sample is a rearrangement of the original 27 values across the three groups.
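A randomization version might look like the following Python sketch. The only change from bootstrapping is that the pooled values are shuffled rather than drawn with replacement. The function names are hypothetical, and the sketch is not the implementation behind SHUFFLE or the Real Statistics tool.

```python
import random
from statistics import fmean


def f_stat(groups):
    """One-way ANOVA F statistic; returns inf if there is no within-group spread."""
    k, n = len(groups), sum(len(g) for g in groups)
    grand_mean = fmean(x for g in groups for x in g)
    means = [fmean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def regroup(values, sizes):
    """Cut a flat list into consecutive groups of the given sizes."""
    groups, start = [], 0
    for size in sizes:
        groups.append(values[start:start + size])
        start += size
    return groups


def randomization_anova(groups, iterations=10_000, seed=None):
    """Return (observed F, randomization p-value)."""
    rng = random.Random(seed)
    pooled = [x for g in groups for x in g]   # a copy, so the input is not modified
    sizes = [len(g) for g in groups]
    f_obs = f_stat(groups)
    exceed = 0
    for _ in range(iterations):
        rng.shuffle(pooled)                   # reorder: sampling WITHOUT replacement
        if f_stat(regroup(pooled, sizes)) > f_obs:
            exceed += 1
    return f_obs, exceed / iterations
```

Every shuffled sample contains exactly the original values, so only their assignment to groups varies between iterations.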

Alternatively, the sampling can be performed on the residuals (i.e. the raw data minus the appropriate group mean) instead of the raw data, using either bootstrapping or randomization. For Example 1 this can be done by selecting the ANOVA (via errors) option in the Resampling dialog box as shown in Figure 2. The output from the randomization version of the test is shown in Figure 4.

Figure 4 – Randomization test on residuals for ANOVA
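Resampling on the residuals can be sketched in Python as shown below. How the ANOVA (via errors) option treats the observed statistic is not spelled out above, so this sketch assumes that the observed F-stat is still computed from the raw data and that each resampled set of residuals is regrouped using the original group sizes. All function names are hypothetical.

```python
import random
from statistics import fmean


def f_stat(groups):
    """One-way ANOVA F statistic; returns inf if there is no within-group spread."""
    k, n = len(groups), sum(len(g) for g in groups)
    grand_mean = fmean(x for g in groups for x in g)
    means = [fmean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def residuals(groups):
    """Raw data minus the mean of its own group."""
    return [[x - fmean(g) for x in g] for g in groups]


def residual_resampling_anova(groups, iterations=10_000, replace=False, seed=None):
    """Return (observed F, p-value) by resampling residuals.

    replace=False gives the randomization version, replace=True the bootstrap version.
    """
    rng = random.Random(seed)
    pooled = [r for g in residuals(groups) for r in g]
    sizes = [len(g) for g in groups]
    f_obs = f_stat(groups)                    # assumption: observed F uses the raw data
    exceed = 0
    for _ in range(iterations):
        if replace:
            draw = rng.choices(pooled, k=len(pooled))   # with replacement
        else:
            draw = rng.sample(pooled, k=len(pooled))    # without replacement
        regrouped, start = [], 0
        for size in sizes:
            regrouped.append(draw[start:start + size])
            start += size
        if f_stat(regrouped) > f_obs:
            exceed += 1
    return f_obs, exceed / iterations
```

Subtracting each group mean removes the group effects, so the resampled residuals mimic data for which the null hypothesis of equal means holds.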

6 Responses to Resampling for ANOVA

  1. Nick Chan says:

    Is there a way for your program to do bootstrapping in a two-way Anova?

    • Charles says:

      Nick,
      The Real Statistics software does not yet provide a bootstrapping capability for two-way Anova. You can adapt the approach described on the website for one-way Anova to two-way Anova.
      Charles

  2. Gustavo says:

    Dear Charles,

    I got a problem using the Resampling tool for ANOVA. When I was filling the dialog box (just like your example), Excel told me that alpha must be a number between 0 and 5, and the Bin Size could not be a decimal number, because the tool gives an error. So, I had to put 1. Could you help me?

    Thanks a lot!

    • Charles says:

      Gustavo,

      These sorts of error messages result from the fact that the decimal symbol (comma vs period) in default values is not being recognized correctly. I have tried to correct this, but it seems to be difficult to do this in all the various languages that Excel supports (and the various ways of assigning default values).

      Generally, the solution to the problem with alpha is to re-enter the value (i.e. ignore the default value). Thus, if the default value of alpha is written as .05 (or even ,05), simply re-enter this value using whatever convention is typical for your version of Excel (i.e. .05 or ,05).

      Regarding the bin size, you should be able to enter a decimal value, using whatever decimal symbol is typical for your version of Excel (i.e. comma or period).

      Charles

  3. Manfred Becker says:

    Hi,
    I want to compare two/three groups with different sample sizes on the results of an intelligence test (general results and results on the test factors: verbal, quantitative, nonverbal). Which statistic do I need?
    Kind regards Manfred

    • Charles says:

      Manfred,
      As always, it depends on what you are trying to analyze. If you are comparing the intelligence scores of two groups (i.e. two independent samples), you probably want a two-sample t test. With three groups, you probably want ANOVA. But since you have three dependent variables, MANOVA looks to be the likely test. The t test doesn’t require that the two samples be equal in size. You can use ANOVA even if the samples are unequal in size, but then you will need to use regression to do the analysis. MANOVA requires that all the samples be equal in size, although if there are only two groups then you can use Hotelling’s T-square test, which doesn’t require that the samples have the same size.
      All these tests are described in the Real Statistics website and are included in the Real Statistics software.
      Charles
