Box’s test is used to determine whether two or more covariance matrices are equal. Bartlett’s test for homogeneity of variance presented in Homogeneity of Variances is derived from Box’s test. One caution: Box’s test is sensitive to departures from normality. If the samples come from non-normal distributions, then Box’s test may simply be testing for non-normality.

Suppose that we have *m* independent populations and we want to test the null hypothesis that the population covariance matrices are all equal, i.e.

$H_0: \Sigma_1 = \Sigma_2 = \cdots = \Sigma_m$

Now suppose that $S_1, \ldots, S_m$ are sample covariance matrices from the $m$ populations, where each $S_j$ is based on $n_j$ independent observations, each consisting of a $k \times 1$ column vector (or alternatively a $1 \times k$ row vector).

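Now define $S$ as the pooled covariance matrix

$$S = \frac{1}{n-m} \sum_{j=1}^{m} (n_j - 1) S_j$$

where $n = \sum_{j=1}^{m} n_j$, and define the following:

$$M = (n-m) \ln|S| - \sum_{j=1}^{m} (n_j - 1) \ln|S_j|$$

$$c = \frac{2k^2 + 3k - 1}{6(k+1)(m-1)} \left( \sum_{j=1}^{m} \frac{1}{n_j - 1} - \frac{1}{n-m} \right)$$

If the null hypothesis holds, then $M(1-c)$ has approximately a chi-square distribution with $df = k(k+1)(m-1)/2$ degrees of freedom.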

The null hypothesis (of equal covariance matrices) is rejected when $M(1-c) > \chi^2_{crit}$ (or p-value < $\alpha$).
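The chi-square version of the test can be sketched in a few lines of Python. This is only an illustration, not the Real Statistics implementation: the function names `log_det` and `box_m_chi2` are made up for this example, and the code assumes the standard definitions of the pooled matrix, $M$, $c$ and $df$, which are restated in the comments. It uses only the standard library, so the determinant is computed by plain Gaussian elimination.

```python
import math

def log_det(matrix):
    """ln|A| for a positive definite matrix A, via Gaussian elimination."""
    a = [list(map(float, row)) for row in matrix]
    size = len(a)
    total = 0.0
    for col in range(size):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            raise ValueError("singular covariance matrix: ln|S_j| is undefined")
        a[col], a[pivot] = a[pivot], a[col]
        # For a positive definite matrix, |A| is the product of |pivots|.
        total += math.log(abs(a[col][col]))
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return total

def box_m_chi2(cov_mats, ns):
    """Return (M, c, df, M*(1-c)) for covariance matrices with sample sizes ns."""
    m = len(cov_mats)          # number of groups
    k = len(cov_mats[0])       # each matrix is k x k
    n = sum(ns)
    # Assumed pooled matrix: S = sum((n_j - 1) * S_j) / (n - m)
    pooled = [[sum((nj - 1) * s[i][j] for s, nj in zip(cov_mats, ns)) / (n - m)
               for j in range(k)] for i in range(k)]
    # Assumed statistic: M = (n - m) ln|S| - sum((n_j - 1) ln|S_j|)
    M = (n - m) * log_det(pooled) - sum(
        (nj - 1) * log_det(s) for s, nj in zip(cov_mats, ns))
    # Assumed correction factor c and degrees of freedom df
    c = ((2 * k**2 + 3 * k - 1) / (6 * (k + 1) * (m - 1))
         * (sum(1 / (nj - 1) for nj in ns) - 1 / (n - m)))
    df = k * (k + 1) * (m - 1) / 2
    return M, c, df, M * (1 - c)

# Made-up 2 x 2 covariance matrices for two groups of 11 observations each.
M, c, df, stat = box_m_chi2([[[1, 0], [0, 1]], [[4, 0], [0, 4]]], [11, 11])
print(f"M = {M:.3f}, c = {c:.4f}, df = {df:.0f}, M(1-c) = {stat:.3f}")
```

The p-value can then be read from a chi-square table or computed with a routine such as `scipy.stats.chi2.sf(stat, df)`, if SciPy is available.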

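This estimate works pretty well provided $n_j > 20$, $m \le 5$ and $k \le 5$. A better estimate can be obtained using the $F$ distribution by defining the following:

$$c_2 = \frac{(k-1)(k+2)}{6(m-1)} \left( \sum_{j=1}^{m} \frac{1}{(n_j-1)^2} - \frac{1}{(n-m)^2} \right)$$

$$df = \frac{k(k+1)(m-1)}{2} \qquad df_2 = \frac{df + 2}{|c_2 - c^2|}$$

$$a^{+} = \frac{df}{1 - c - df/df_2} \qquad F^{+} = \frac{M}{a^{+}}$$

$$a^{-} = \frac{df_2}{1 - c + 2/df_2} \qquad F^{-} = \frac{df_2 \, M}{df \, (a^{-} - M)}$$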

If $c_2 > c^2$ define $F = F^{+}$, while if $c_2 < c^2$ define $F = F^{-}$. Then $F \sim F(df, df_2)$. The null hypothesis is rejected if $F > F_{crit}$.

**Observation**: If any of the $S_j$ is not invertible then $|S_j| = 0$, and so $\ln|S_j|$ will be undefined. Thus $M$ will be undefined and the test will fail.
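One common way to end up with a non-invertible $S_j$ is to have too few observations: a sample covariance matrix computed from $n_j$ observations has rank at most $n_j - 1$, so it is singular whenever $n_j \le k$. The following sketch illustrates this with made-up data; the helper names `sample_cov` and `det` are hypothetical, and exact fractions are used so that a zero determinant is not blurred by rounding.

```python
from fractions import Fraction

def sample_cov(rows):
    """Sample covariance matrix (divisor n - 1) of a list of observations."""
    n = len(rows)
    k = len(rows[0])
    means = [sum(Fraction(r[i]) for r in rows) / n for i in range(k)]
    return [[sum((Fraction(r[i]) - means[i]) * (Fraction(r[j]) - means[j])
                 for r in rows) / (n - 1)
             for j in range(k)] for i in range(k)]

def det(matrix):
    """Exact determinant by Gaussian elimination on Fractions."""
    a = [row[:] for row in matrix]
    size = len(a)
    result = Fraction(1)
    for col in range(size):
        # Find a row with a non-zero entry in this column to use as pivot.
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # no pivot available: matrix is singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result            # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return result

# Made-up data: only 3 observations on k = 3 variables, so n_j <= k.
rows = [(1, 2, 3), (4, 6, 5), (7, 1, 2)]
S_j = sample_cov(rows)
print("|S_j| =", det(S_j))   # zero, so ln|S_j| is undefined
```

Checking that every $n_j$ exceeds $k$ before running the test rules out this particular cause, although a matrix can also be singular for other reasons (for example, when one variable is an exact linear combination of the others).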

**Example 1**: Determine whether the covariance matrices for Young, Middle and Old are equal in Example 1 of ANOVA with Repeated Measures with One Between Subjects Factor and One Within Subjects Factor.

**Figure 1 – Covariance matrices for Example 1**

The sample covariance matrices for Young, Middle and Old are calculated (see Figure 1) using the COV supplemental array function from the data in Figure 1 of ANOVA with Repeated Measures with One Between Subjects Factor and One Within Subjects Factor. Since the $n_j$ ($j$ = 1, 2, 3) are all equal, the pooled covariance is simply the average of the Young, Middle and Old covariance matrices.
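To see why, write $n_0$ for the common sample size (a symbol introduced here only for illustration) and assume the usual pooled estimate, which weights each $S_j$ by $n_j - 1$ and divides by $n - m$. Then $n - m = m(n_0 - 1)$ and

$$S = \frac{1}{n-m}\sum_{j=1}^{m}(n_j - 1)S_j = \frac{n_0 - 1}{m(n_0 - 1)}\sum_{j=1}^{m} S_j = \frac{1}{m}\sum_{j=1}^{m} S_j$$

With three groups of equal size, each matrix therefore gets weight 1/3.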

The calculations required for Box’s test are given in Figure 2.

**Figure 2 – Box’s test for Example 1**

$m$ = number of matrices = 3 (Young, Middle, Old), $k$ = the size of each covariance matrix = 5 (each matrix is 5 × 5), $n_1 = n_2 = n_3$ = number of subjects in each sample = 7 and so $n = n_1 + n_2 + n_3 = 21$. In columns Q, R, S and V, *nn* = $n_1 - 1 = 6$ for Young, *nn* = $n_2 - 1 = 6$ for Middle, *nn* = $n_3 - 1 = 6$ for Old and *nn* = $n - m = 18$ for Pooled. The other entries are as described above.
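With these values, the degrees of freedom and the correction factor can be checked by hand. This is a sketch that assumes the standard definitions $df = k(k+1)(m-1)/2$ and $c = \frac{2k^2+3k-1}{6(k+1)(m-1)}\left(\sum_j \frac{1}{n_j-1} - \frac{1}{n-m}\right)$; the numbers are recomputed here rather than read from Figure 2.

$$df = \frac{5 \cdot 6 \cdot 2}{2} = 30$$

$$c = \frac{2 \cdot 25 + 15 - 1}{6 \cdot 6 \cdot 2}\left(\frac{3}{6} - \frac{1}{18}\right) = \frac{64}{72} \cdot \frac{8}{18} = \frac{32}{81} \approx 0.395$$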

Generally we use a significance level of α = .001 for this test. From Figure 2 we see that *M* = 34.81 and that neither the chi-square test nor the *F* test is significant. We therefore have no reason to reject the null hypothesis that the three covariance matrices are equal.
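The two test statistics for this example can be recomputed from *M* = 34.81 alone. The Python sketch below is an illustration under stated assumptions: the function name `box_f_approximation` is made up, and the formulas for $c$, $c_2$, $df_2$, $a^{+}$, $a^{-}$, $F^{+}$ and $F^{-}$ are the standard ones for Box's approximations, restated in the comments.

```python
def box_f_approximation(M, k, ns):
    """Return (chi2_stat, df, F, df2) for Box's M with k variables, sample sizes ns."""
    m = len(ns)
    n = sum(ns)
    df = k * (k + 1) * (m - 1) / 2
    # Assumed chi-square correction factor c
    c = ((2 * k**2 + 3 * k - 1) / (6 * (k + 1) * (m - 1))
         * (sum(1 / (nj - 1) for nj in ns) - 1 / (n - m)))
    # Assumed second-order term c2 used by the F approximation
    c2 = ((k - 1) * (k + 2) / (6 * (m - 1))
          * (sum(1 / (nj - 1) ** 2 for nj in ns) - 1 / (n - m) ** 2))
    df2 = (df + 2) / abs(c2 - c**2)     # not defined when c2 equals c**2
    if c2 > c**2:
        a_plus = df / (1 - c - df / df2)
        F = M / a_plus                                # F = F+
    else:
        a_minus = df2 / (1 - c + 2 / df2)
        F = df2 * M / (df * (a_minus - M))            # F = F-
    return M * (1 - c), df, F, df2

# Example 1: M = 34.81, k = 5 variables, three groups of 7 subjects each.
chi2_stat, df, F, df2 = box_f_approximation(M=34.81, k=5, ns=[7, 7, 7])
print(f"chi-square statistic = {chi2_stat:.2f} on {df:.0f} df")
print(f"F = {F:.3f} on ({df:.0f}, {df2:.1f}) df")
```

Under these assumptions the chi-square statistic comes out at about 21.06 on 30 degrees of freedom, which is below the mean (30) of that distribution, and $F$ is about 0.67, so both agree with the conclusion that the test is not significant.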

Dear Charles,

For my research project I want to examine whether adolescents (high versus low trait aggression and boys versus girls) differ in their preferences for different types of violent media content.

I have the following dependent variables (gathered through a content analysis of adolescents’ favorite television programs):

– overall aggression

– 3 subtypes of aggression (physical, verbal and indirect aggression)

– aggression in different contexts (graphic, realistic, humorous, rewarded, punished)

My data is not normally distributed. Therefore, I have considered testing non-parametrically. However, I would have to conduct so many separate Mann-Whitney tests I believe the loss of power would be insurmountable. So, I have returned to my original plan of testing my hypotheses with one two-way ANOVA (with overall aggression as the dependent variable and sex and trait aggression level as between-subjects factors) and two MANOVA’s (one with the three subtypes of aggression and one with all the context variables, both with sex and trait aggression level as between-subjects factors). Does this make sense?

If so, do I understand correctly that if the Levene’s test is not significant for all dependent variables the (M)ANOVA is robust enough to test the non normally distributed data with? Do I have to test the Levene’s test separately for the two factors? And what if Levene’s test is significant for some dependent variables? I have now tested it with both factors simultaneously (I don’t know whether that is correct) but then three of my context variables are significant.

Thank you very much in advance.

Kind regards,

Amber

Amber,

1. I don’t really have enough information about your situation to be able to judge whether or not the approach that you are using makes sense.

2. Regarding the MANOVA assumptions, please see the following webpage:

http://www.real-statistics.com/multivariate-statistics/multivariate-analysis-of-variance-manova/manova-assumptions/

Charles

Hello Charles, Thank you for all the information!

I have conducted a Manova of 10 dependent variables between 2 groups. each group n=50.

Box’s M: F(55,31014)=2.175, p=0.000

What does it mean? I understand that the covariance is unequal. Does this invalidate the subsequent analysis?

Levene’s test of equality of error variances showed significance for 4 of the dependent variables.

do you have any recommendations/suggestions for me?

references for articles that can help?

Thank you for your time and help!

Shiran

The significant result for Box’s M test indicates that either the normality assumption fails or the equal covariance assumption fails. This could invalidate MANOVA. If the sample sizes across the groups are the same, then this is less of a problem and you should be able to use MANOVA anyway. I would then use the Pillai Trace.

When you say “between 2 groups” do you mean that you have two independent variables? If so you can use Hotelling’s T-square test, which is a special case of MANOVA. In this case, there is a version of the test when the covariance matrices are unequal. See the following webpage:

http://www.real-statistics.com/multivariate-statistics/hotellings-t-square-statistic/hotellings-t-square-unequal-covariance-matrices/

With so many dependent variables and so small a sample size, I would be concerned about the power of the test.

Charles

thank you very much for your answer!

I meant 1 independent variable (culture) with 2 groups (usa/russia).

is there anything else i can do for strengthening the results?

thank again!

shiran

Essentially this just means that you have dependent variables for USA and the same dependent variables for Russia (sic). It sounds like I am missing something. What are you trying to test?

Charles

I hope i can explain myself..

i’m looking for the differences between usa/russia participants regarding several statements. i’m assuming that russia will have a lower mean across all statements.

Hi,Charles

I have received a warning message:

Box’s Test of Equality of Covariance Matrices is not computed because there are fewer than two nonsingular cell covariance matrices.

What wrong and how can i deal with it ?

best,

Sirinna

Sirinna,

This sounds like an error message from SPSS. I am not familiar with the error messages from SPSS, SAS, etc. The website focuses on Excel.

Charles

Hello. I am very sorry but I don’t really understand a lot. I am doing Mixed-design ANOVA and my normality assumption is not met. And I have Levene’s test: p = 0,71 which means there is homogeneity of variances, right?

Do I have to check this Box’s test? SPSS automatically calculated it and p = 0,013. What does that mean? Is this data important for me?

Thank you very much!

Sincerely

LanaLo,

Yes, your interpretations of Levene’s and Box’s tests are correct. If you are conducting an ANOVA you really don’t need to use Box’s test; Levene’s test should be sufficient. Also note that Box’s test is very sensitive to departures from normality; since your data is not normal, it is not surprising that you get a significant Box’s test even though the homogeneity of variance assumption is actually met.

Charles

Hello Charles,

I am conducting research on advertising effectiveness. When studying gender interaction on the dependent variables (during MANOVA), I found that my Box’s M test was significant. I have checked the normality of my data and found it normal. Also, I have large and unequal sample sizes, so is it allowable in my case to proceed with MANOVA even if Box’s M was significant? Further, I have also read that Pillai’s criterion should be used instead of Wilks’ if Box’s M is significant, but I was unable to find any reference. Please advise me on the matter and if possible provide me with a reference.

Regards

Danish,

You can proceed with MANOVA, but it is important to report the reservations you have based on Box’s M test.

Regarding which criteria to use, see

Field, A. (2009) Discovering statistics using SPSS. 3rd Ed. SAGE.

Other sources are

Olson, C.L. (1976) On choosing a test statistic in multivariate analysis of variance. Psychological Bulletin, 83, 579-586.

Olson, C.L. (1979) Practical considerations in choosing a MANOVA test statistic. Psychological Bulletin, 86, 1350-1352.

Stevens, J.P. (1980) Power of the multivariate analysis of variance tests. Psychological Bulletin, 88, 728-737.

Charles

Hello Charles,

I have a design where there is one intergroup factor (2 levels) and one repeated measures factor (3 measures). I used to think that for assumptions you must check both sphericity (Mauchly’s test) and homogeneity of variances (Levene’s test). However, I was recently told that I could use M Box test for this. Which approach do you recommend?

Also – having read above that M Box is sensitive to non-normal data, can one try M Box, and if significant, check sphericity + homogeneity instead?

What about checking correlation of means*st.devs only – seems crude, is it used?

hello Artek,

Since Box’s M test is sensitive to non-normality, I tend not to use it. If you see that the data is normal, then I guess it is ok to use.

Usually I check for homogeneity of variances. I do this by comparing the sample variances (or using Levene’s test if I have any doubt). I then rely on the GG and HF sphericity correction factors. I have never tried to use the method you describe in your last paragraph.

Charles


Charles

I have conducted Shapiro-Wilk normality tests (passed) on data then used in a 2-Way Anova. I am aware that the study is underpowered, with only 7 participants each of three training groups. Box’s Test is contravened (p<0.05) with a significance of 0.034. Does this invalidate the subsequent results/analysis? Or should I be using p<0.01 for this test. If the latter should I not be using a similar p value for other tests (e.g. Levene's)?

Barry

Barry,

If you are performing 2-way Anova, I would use Levene’s test (instead of Box’s test) to check homogeneity of variances. In any case, the Box’s Test result is close to .05 and so is not too concerning.

Charles

Hi, could you please help, I’m doing my dissertation and I’ve performed a 2 factor MANOVA but my Box test is showing .000 sig? What shall I do? Should I no longer use this test? I’m trying to test the difference between divorced/intact families and the duration of both marital statuses.

Charlie,

Make sure that the problem isn’t due to outliers. You should check for univariate outliers and multivariate outliers (using Mahalanobis distances). If so, you could rerun the test without these outliers. In this case you should still report the existence of the outliers.

Box’s M test is quite sensitive to violations of normality. If you believe the test is giving a significant result because the data is not multivariate normally distributed, then results from MANOVA might still be valid (although it will likely be difficult to determine whether the problem is normality or homogeneity of covariance matrices).

Alternatively, don’t use MANOVA, but use separate ANOVA tests instead. Not ideal, but it might be the best you can do.

Charles

Hi Charles

What does it mean when the Box’s M test is violated? Can you then perform Discriminant analysis anyway and just be aware of it when you interpret your results?

Best,

Julie

Julie,

Violation of this test could mean that any results obtained from a test that depends on the assumption of equal covariance matrices are invalid. It really depends on how much the covariance matrices differ.

A particular problem with Box’s M test is that a significant result may be due to a violation of normality and not to unequal covariance matrices. The specific test that you are using may be pretty robust to violations of normality but not to heterogeneous covariance matrices (in fact this is usually the case).

Charles

Hi Charles

I don’t know how I can interpret Box’s test….. when it is significant, does it mean that our groups are not homogeneous? am I right?

Fatima,

Yes, you are correct. A significant result means that we reject the null hypothesis that the groups have equal covariance matrices.

Charles

Does this work for scalar observations (i.e. k = 1)?

Also, if I want to use a similar test to check for sameness of several regression coefficients, could I do it? Here’s what I want to do:

I have several models: “y = alpha + beta*x”, each based on n scalar observations. I want to use the above test, plugging in (k = 1) and replacing your S_j with alpha_j or beta_j (separate tests). Would that allow me to test the null hypothesis that all the true alpha’s or beta’s are the same?

Tim,

When k = 1 (i.e. there is only one variable), Box’s M test is known as Bartlett’s test.

I have no reason to believe that you can use this test to determine whether the alpha or beta values are the same.

Charles

this is nice but what is the relationship b/n sample proportion Box’s test!

Sorry, but I don’t understand your question.

Charles

Hi there, I have received an error message:

Box’s test of equality of covariance is not computed because there are less than two nonempty cells.

Levene’s test of equality of error variances is not computed because there are less than two nonempty cells.

Can someone explain what this means, and what I should do please?

Liz

Liz,

It just means that you have too little data for these tests to be used. If you believe that you do indeed have more than two data elements then this probably means that you haven’t entered the arguments to these functions correctly.

Charles

In the “F-minus” equation, should the denominator read “(a-minus – df2)” instead of “(a-minus – M)”? The former agrees with the results in Fig.2 and the latter does not.

Todd,

It looks like the mistake is in Fig 2. I will fix the website and issue new versions of the software and the multivariate examples workbook.

Thanks for catching this error.

Charles