Homogeneity of Variances

Certain tests (e.g. ANOVA) assume that the populations being compared have equal variances. Whether this assumption is plausible can be assessed using the following approaches:

  • Comparison of graphs (esp. box plots)
  • Comparison of variance, standard deviation and IQR statistics
  • Statistical tests
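
As a rough illustration of the second approach, the sketch below computes the sample variance, standard deviation and IQR for each group so that they can be compared side by side. The data and the helper name spread_summary are made up for this example, and Python with NumPy is an assumed choice, not something prescribed by this page.

```python
import numpy as np

def spread_summary(groups):
    """Return sample variance, standard deviation and IQR for each named sample."""
    summary = {}
    for name, values in groups.items():
        x = np.asarray(values, dtype=float)
        q1, q3 = np.percentile(x, [25, 75])
        summary[name] = {
            "var": x.var(ddof=1),  # sample variance (n - 1 in the denominator)
            "sd": x.std(ddof=1),   # sample standard deviation
            "iqr": q3 - q1,        # interquartile range
        }
    return summary

# Hypothetical data for three groups (made up for illustration)
groups = {
    "A": [51, 47, 55, 49, 53, 50],
    "B": [48, 60, 41, 57, 45, 52],
    "C": [50, 52, 49, 51, 48, 53],
}

summary = spread_summary(groups)
for name, stats_for_group in summary.items():
    print(name, {key: round(value, 2) for key, value in stats_for_group.items()})

# Ratio of the largest to the smallest sample variance, a quick informal check
variances = [s["var"] for s in summary.values()]
print("largest/smallest variance ratio:", round(max(variances) / min(variances), 2))
```

A large ratio between the biggest and smallest variance is only an informal warning sign; it does not replace the statistical tests listed below.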

The F test presented in Two Sample Hypothesis Testing of Variances can be used to determine whether the variances of two populations are equal. For three or more populations, the following statistical tests for homogeneity of variances are commonly used:

  • Levene’s test (includes the Brown-Forsythe test)
  • O’Brien’s test
  • Fligner-Killeen test
  • Bartlett’s test
  • Conover’s test
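
For the two-population case mentioned above, a minimal sketch of a two-sided F test might look as follows. It assumes that both samples come from normal populations. The function name f_test_variances and the data are hypothetical, and the two-sided p-value is taken as twice the smaller tail probability, which is one common convention.

```python
import numpy as np
from scipy import stats

def f_test_variances(x, y):
    """Two-sided F test of H0: var(x) == var(y), assuming normal populations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    f = x.var(ddof=1) / y.var(ddof=1)  # ratio of sample variances
    df1, df2 = len(x) - 1, len(y) - 1
    # Two-sided p-value: double the smaller of the two tail probabilities
    tail = min(stats.f.cdf(f, df1, df2), stats.f.sf(f, df1, df2))
    return f, df1, df2, min(1.0, 2 * tail)

# Hypothetical samples (made up for illustration)
x = [51, 47, 55, 49, 53, 50]
y = [48, 60, 41, 57, 45, 52, 44]

f, df1, df2, p = f_test_variances(x, y)
print(f"F = {f:.3f}, df = ({df1}, {df2}), two-sided p = {p:.4f}")
```

Swapping the two samples inverts F but leaves the two-sided p-value unchanged, so the order of the samples does not affect the conclusion.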

Using the terminology from Definition 1 of Basic Concepts for ANOVA, the following null and alternative hypotheses are used for all of these tests:

H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2

H_1: Not all variances are equal (i.e. \sigma_i^2 \neq \sigma_j^2 for some i, j)
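
As an illustration of testing these hypotheses in practice, the sketch below runs several of the listed tests on three made-up samples. It assumes Python with SciPy, which is not the tool used on this page; only tests with a direct SciPy function (scipy.stats.levene, scipy.stats.bartlett and scipy.stats.fligner) are shown, so O’Brien’s and Conover’s tests are omitted.

```python
from scipy import stats

# Hypothetical samples (made up for illustration)
a = [23.1, 25.4, 24.8, 22.9, 26.0, 24.2, 25.1, 23.7]
b = [21.0, 29.5, 18.7, 31.2, 24.9, 16.8, 27.3, 30.1]
c = [24.0, 24.9, 23.5, 25.2, 24.4, 23.9, 25.0, 24.6]

tests = {
    "Levene (mean-centered)": stats.levene(a, b, c, center="mean"),
    "Brown-Forsythe (median-centered Levene)": stats.levene(a, b, c, center="median"),
    "Bartlett": stats.bartlett(a, b, c),
    "Fligner-Killeen": stats.fligner(a, b, c),
}

alpha = 0.05  # assumed significance level
for name, result in tests.items():
    decision = "reject H0" if result.pvalue < alpha else "fail to reject H0"
    print(f"{name}: statistic = {result.statistic:.3f}, p = {result.pvalue:.4f} -> {decision}")
```

The tests can disagree on the same data: Bartlett’s test is known to be sensitive to departures from normality, whereas the median-centered and rank-based tests are generally more robust.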

200 thoughts on “Homogeneity of Variances”

      • Thank you so much for your response Charles.

        May I know what the appropriate statistical test is for a data set with two dependent variables, one with homogeneous variance and the other with non-homogeneous variance? I’m investigating the cytotoxic effects of a carcinogenic compound in an experimental setting with four treatment groups and one control at two distinct time points (referred to as the dependent variables in the question). I intended to use one-way ANOVA, but one of the dependent variables has unequal variance. Can I perform Welch’s ANOVA with Games-Howell as the post hoc test in this case even though one of the dependent variables has homogeneous variance?

        Thanks,
        Geetha

        • Hi Charles,

          I just realised that Welch’s ANOVA cannot be used when a control is involved (0 variance). So, can I go with Kruskal-Wallis, which has Bonferroni as its post hoc test? Could you please advise me on this?

          Thanks,
          Geetha

          • Geetha,
            1. Welch’s ANOVA and Kruskal-Wallis can both handle a control group. When you say that the variance is zero, does this mean that you only have one sample element from the control? This is a problem for all of the tests.
            2. The Real Statistics website presents a number of options for post-hoc analysis after Kruskal-Wallis.
            Charles

        • Hello Geetha,
          I don’t know what you mean by “one of the dependent variables has unequal variance”. If you are using one-way ANOVA, then there is only one dependent variable.
          Charles

    • If the variances are not homogeneous, then ANOVA may not be viable even if the normality tests are good. It really depends on how different the group variances are. With a balanced model (i.e. the group sizes are the same), ANOVA is probably OK even when the variance of one group is 3 or 4 times that of another.
      Charles

  1. Which test do you use to check whether samples (fish and water samples collected from three different locations) meet the parametric assumptions before carrying out a one-way ANOVA in GraphPad?

    • Hello Reuben,
      The two main assumptions for one-way ANOVA are normality and homogeneity of variances. You can typically use Shapiro-Wilk to test normality and Levene’s test for homogeneity of variances.
      Charles

  2. I’m trying to run a three-way ANOVA. I have 3 factors with 2, 3 and 7 levels. They don’t have homogeneity of variance, but they do have equal sample sizes. Can I still run a Tukey’s post hoc test?

    • Hello Alex,
      Tukey’s post hoc test requires homogeneity of variances. For one-way ANOVA Games-Howell is a good test when the variances are not equal, but I don’t know of a version of Games-Howell when there is more than one factor.
      You can use bootstrapping.
      Charles

  3. Hi
    I wonder which equation I should use for a split-split plot design. Levene’s test? Could you help me with this?
    Thank you in advance

  4. Hi Charles,
    I want to do a split plot ANOVA, but one of the two groups doesn’t meet the homogeneity assumption. Is it still okay to use split plot ANOVA? If not, what should I do instead? I have the same sample size for each group.
    Thank you

  5. Hi Charles,

    I have a question about the Games Howell test I performed after I did a Welch’s Anova.
    I have triplicate samples for three different conditions with unequal variances. My problem now is that I do not get a result for the Games-Howell test because the area for the “c” is empty. I am a little confused. Could you explain what the “c” next to the groups stands for? Or what could be the explanation for why I get no results for that test? And when I do have a result, how do I see which samples are significantly different?

    Thanks in advance

    • Victoria,
      You need to fill in two of the cells in the c area: one with the contrast value +1 and the other with the contrast value -1. The comparison will be between the two corresponding variables. You can do this multiple times to make multiple comparisons. This is explained on the Unplanned Comparisons webpage.
      Charles

  6. Hi Charles,

    I hope everything is going well with you. Are you eventually planning to show an example for the Conover squared rank test for testing homogeneity of variance? I’m always glad to see some new techniques on here! Keep up the great work!

    • Kevin,
      Thanks for your continued support. I do expect to add this capability. I did decide to add the Fligner-Killeen test instead several months ago.
      Charles

  7. Hello Charles,
    I want to use Levene’s test to check the variability of my data, but I can’t remember the formula or how to carry out the process manually, without the help of software.
    Thank you in advance

  8. I am doing a homogeneity test under a randomized complete block design. Should I exclude the block factor and run the test with the treatments only?

    • Mugo,
      If I remember correctly, the homogeneity of variances assumption for RCBD is the same as for two-way ANOVA, namely that the variances of the interactions between treatment-block levels are all equal. If there are no replications, then none of these variances are defined and so there is nothing to test.
      Charles

  9. Hi Charles,

    Thank you for your great website.
    I have questions about my recent project. It is an ERP study and there are 6 participants in each group. My questions are regarding the interpretation of behavioral data. Would you mind helping me with this matter?

  10. Hi Charles,

    I am loving the RealStats add-in for Excel, and am impressed at how thoughtful you are with responding to comments. Thank you!

    I would like to do a repeated measures ANOVA; however, a Levene’s test has shown that my groups have unequal variances. I have read in one of your responses to a comment that repeated measures ANOVA is fairly robust to unequal variances, and read (maybe online somewhere) that it is robust up to 1 group having 4x the variance of another; however, my most extreme case shows one group with 11x the variance of another. Do you have any recommendations for what I should do? Further, how should I then best do a post-hoc test? I’m familiar with Games-Howell for unequal variances, but I’m not sure whether this applies with repeated measures.

    Thanks so much for your help! I’ll be sure to cite the RealStats add-in in my publication.

    Matt

    • Actually, I think I figured it out. Once I realized your website had the nice navigation bar along the right side I could make a lot more sense of the content.
      I first did a Friedman test and followed that up with a Wilcoxon Signed-Ranks Test for Paired Samples. Does this sound like the right approach? Is the Friedman test even necessary? Thanks, again! Wonderful website and I love that I could download the Example Workbooks!

    • Matt,
      It is true that ANOVA is generally pretty robust to violations of up to 3x or 4x (provided the group sizes are equal). For repeated measures ANOVA, however, you are interested in the variances of differences between the groups. I suggest that you look at the following webpage:
      Sphericity
      You can use contrasts and Tukey HSD as post-hoc tests. These are described on the following webpage:
      https://real-statistics.com/anova-repeated-measures/one-within-subjects-factor/
      Charles

  11. Hi Charles,

    Thanks for all the useful information provided on the website. I have a question about my current research. I’m comparing 2 groups in a repeated measures design. My groups are, however, very unequal in sample size. One group has about 300 participants, and the other has around 70 participants. When running the analyses, most assumptions are violated, such as Levene’s test and Box’s test of equality of covariance.

    When I look for what to do when these assumptions are violated, I see that Welch’s test is recommended for unequal variances and that the Games-Howell test seems to be a good choice for unequal sample sizes.

    I am wondering, however, with these big differences in sample size (although the standard deviations in both groups are pretty similar), whether it is not better to match the participants and make the groups equal (in sample size) instead of going forward with the analyses and using the post-hoc tests to correct for the violated assumptions and unequal sample sizes. I was wondering what you think is the best option in my situation.
    Many thanks in advance!

    • Pete,

      First, I need to understand whether you are using (1) a repeated measures design or (2) a one-way ANOVA with replications.
      In case (1), you wouldn’t use Welch’s test or Games-Howell. I will assume for now that you have case (2).

      In general, I would avoid removing data to make the group sizes equal. One-way ANOVA is pretty resilient to unequal sample sizes and so I would go with that approach. If the variances are roughly equal, you don’t need to use Welch’s test, but can use ANOVA. Also, if you get a significant result, the real action is in the follow-up tests anyway.

      With unequal group sample sizes but roughly equal variances, you can use Tukey-Kramer instead of Games-Howell as the follow-up test.

      Charles

  12. Hi Charles,

    I ran a three-way ANOVA. Levene’s found homogeneity was violated and sample sizes were unequal. I’m a little confused about how to resolve this. I read somewhere that you can use a more conservative significance level. Is this true?

    Eleanor

    • Eleanor,
      If you send me an Excel file with your data and the three-way ANOVA test that you ran, I can try to figure out whether your test is valid.
      Charles

    • Gagandeep,
      As shown on the referenced webpage, Levene’s test is ANOVA on the absolute value of the residuals. Thus, if you get a significant result, you can apply Tukey’s HSD test as post hoc test. Here the data won’t be the original data, but the absolute value of the residuals.
      Charles
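
The equivalence described in the reply above can be checked numerically. The sketch below, which assumes Python with SciPy and uses made-up data, runs a one-way ANOVA on the absolute residuals from the group means and compares it with scipy.stats.levene using center="mean". The variable names are illustrative only.

```python
import numpy as np
from scipy import stats

# Hypothetical samples with unequal group sizes (made up for illustration)
groups = [
    np.array([12.1, 14.3, 11.8, 13.5, 12.9]),
    np.array([10.2, 17.9, 8.4, 19.1, 13.0, 15.6]),
    np.array([13.1, 12.7, 13.4, 12.9]),
]

# Absolute residuals: |x_ij - mean of group i|
abs_resid = [np.abs(g - g.mean()) for g in groups]

# One-way ANOVA on the absolute residuals versus mean-centered Levene's test
anova_on_resid = stats.f_oneway(*abs_resid)
levene_mean = stats.levene(*groups, center="mean")

print(f"ANOVA on |residuals|:   F = {anova_on_resid.statistic:.4f}, p = {anova_on_resid.pvalue:.4f}")
print(f"Levene (center='mean'): W = {levene_mean.statistic:.4f}, p = {levene_mean.pvalue:.4f}")
```

The two lines should print the same statistic and p-value. Because the test is an ordinary ANOVA on abs_resid, a post hoc procedure such as Tukey’s HSD would be applied to abs_resid rather than to the original data; recent SciPy releases offer scipy.stats.tukey_hsd for this.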

  13. Hi Charles,

    Are you familiar with the Conover test for homogeneity of variance? It might also be called the Conover-Iman test. It’s another nonparametric test for homogeneity of variance based on ranks, similar to Fligner-Killeen but involving the square of the ranks and their sums. I found the formula on a site somewhere and tried to implement it on a test dataset, but got confused on all the multiple summation symbols. I was just wondering if you had access to it, and if so, if you might consider making a page here about it. Seeing a raw example worked out would be very helpful…thanks very much for all you do here!

  14. Charles,

    I ran a one-way ANOVA on groups of equal size. If I use a 95% confidence level, the result is that my variances are significantly different, but if I use a 99% confidence level, the difference is not significant. Can you help me understand why this is happening, or point me to a reference to read about it?

    Thanks in advance! Your website is very helpful.

    • Valeria,
      This should not happen. Are you using Levene’s test?
      If you send me an Excel file with your data and test, I will try to figure out what is happening.
      Charles

  15. Hi Charles,

    I had a quick question, if that’s alright. I ran a Levene’s test and ANOVA on my data (comparing 11 groups with equal sample sizes) but Levene’s test was violated. Here’s the problem: ANOVA says there is significance between my groups but Welch’s ANOVA says there isn’t.

    Is it true that you can still use ANOVA even if Levene’s is violated as long as your sample sizes are equal?

    Thanks!

    • Lola,
      When you say that Levene’s test is violated, I assume that you mean that there is a significant result.
      ANOVA tends to be fairly robust to violations of the homogeneity of variance assumption when the sample sizes are equal, but this is not absolute. If the variances are really different, then I would use Welch’s ANOVA.
      Charles
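
For readers who want to see what Welch’s ANOVA computes, here is a minimal sketch using the standard Welch formulas. It assumes Python with NumPy and SciPy; the function name welch_anova and the data are hypothetical, and this is not the Real Statistics implementation.

```python
import numpy as np
from scipy import stats

def welch_anova(*samples):
    """Welch's one-way ANOVA for unequal variances; returns (F, df1, df2, p)."""
    k = len(samples)
    n = np.array([len(s) for s in samples], dtype=float)
    means = np.array([np.mean(s) for s in samples])
    variances = np.array([np.var(s, ddof=1) for s in samples])

    w = n / variances                      # weights: n_i / s_i^2
    grand = np.sum(w * means) / np.sum(w)  # weighted grand mean
    lam = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))

    numerator = np.sum(w * (means - grand) ** 2) / (k - 1)
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam
    f = numerator / denominator
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)         # approximate denominator df
    return f, df1, df2, stats.f.sf(f, df1, df2)

# Hypothetical samples with clearly different spreads (made up for illustration)
g1 = [4.1, 5.0, 3.8, 4.6, 5.2]
g2 = [6.9, 3.1, 8.4, 2.2, 7.5, 5.0]
g3 = [5.9, 6.1, 6.0, 5.8, 6.3]

f, df1, df2, p = welch_anova(g1, g2, g3)
print(f"Welch F = {f:.3f}, df = ({df1}, {df2:.2f}), p = {p:.4f}")
```

With only two groups, this statistic reduces to the square of Welch’s t statistic, which gives a convenient sanity check. The formulas divide by each sample variance, so a group with zero variance would cause a division by zero.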

  16. Very nice posts on ANOVA and I benefit a lot from them. Here is a question:
    Some textbooks say that one of the assumptions is equal variance for the residual terms. Is this equivalent to equal variance of the populations? Thanks.

  17. I want to run an ANOVA; it’s a 2×2 with 1 repeated measures factor and 1 independent variable. Levene’s test is significant on one occasion, but it is only p = 0.041. My groups are equal with 21 in each. Am I still able to run this test?

    Many thanks,

    Katie

  18. I have a question and will be grateful for any help at this point.

    I have tested four participants on 4 occasions under 4 different conditions, i.e. each condition has 4 sets of data.
    I was told to use the repeated measures ANOVA.
    They all satisfy the assumption of sphericity, but have unequal variances, and 2 of the 4 conditions are not normally distributed.
    Is it still possible to do the ANOVA?

    • K,
      Homogeneity of variances is not an assumption in this type of analysis (you are testing the same participants based on Time and Treatment). Repeated Measures ANOVA is pretty robust to violations of normality, and so you should be fine provided you don’t have outliers or very skewed data.
      Charles

  19. I am fascinated by this website, and very appreciative that it exists. I have a question and hope you can help. My participants are all from one cohort. I want to look at 80 students, 40 in each group. Group A will not get the intervention and Group B will receive the intervention. I will have all participants take the same assessment/survey and then retake the same assessment/survey 3 hours later. During the 3 hours the intervention group will have experienced the intervention. Am I correct that I would do a mixed ANOVA? How would I test for violations of the assumptions?
    Thank you for any assistance.

    • Linda,
      Yes, this sounds like a repeated measures ANOVA with one repeated measures Time factor (before/after) and one Treatment factor.
      You can test for normality using Shapiro-Wilk and for homogeneity of variances using Levene’s test. You can use an epsilon correction instead of testing for sphericity.
      Charles
      PS. Actually I wrote too quickly. With only two time periods, you don’t need to worry about sphericity.

      • Thank you so much for your response; that does help quite a bit. If my post test scores are somewhat similar, is there a test to calculate the strength of association? I am looking at anxiety and how much it decreases, and whether the intervention group’s anxiety decreases more significantly than the control group’s. I actually expect both groups to have less anxiety on their post test/survey, but I believe the intervention group will have a more significant drop in anxiety. So how would I calculate the impact of the intervention?
        Thank you again for providing some clarity in a subject that can cause a great deal of stress 🙂

        • Linda,
          In general, the strength of a linear association can be measured by the correlation coefficient.
          For these types of tests, however, the effect size measurement (Cohen’s d, r = correlation, partial eta, omega, etc.) is what is usually used.
          Charles

  20. Hi Charles,

    I want to conduct a 2 by 3 Mixed ANOVA however I have a significant Levene’s value for one of the levels of my independent variable (all others are okay). My sample size is 60 and all other assumptions have been met. Am I still able to use the ANOVA and if so how do I justify it?

    Thank you,

    Rachel

    • Rachel,
      Technically no, since the assumptions have been violated. Whether the test is valid really depends on several factors, e.g.: (1) by how much is the homogeneity of variances assumption violated and (2) how far away from the alpha value are the p-values that you obtained.
      Charles

      • Hi Charles,

        I have found this really useful. I have exactly the same situation as Rachel here. I have a 3 x 3 Mixed ANOVA design, but one of the IVs has a significant Levene’s test, with the p value being .026. Is that too far away to use the ANOVA?

        Your guidance on this would be really appreciated. Thank you so much for your time.

        Best
        Jenna

        • Jenna,
          It really depends on a number of factors, such as whether the variables with the higher means also have higher or lower variances, but it is likely that you will get reasonable results with a p value this high.
          Charles

  21. Hi Charles,

    This site has been very helpful so far, I wonder if you can please help me further.

    I have one dependent variable and one independent variable with 14 levels. I am looking at the difference in the concentration of DNA found in 14 different areas on 4 different participants. The group sizes are all equal, but the data fails the homogeneity of variance test. Can you tell me why this might be? The data also failed the normality test.
    I am wondering what test I can use here. I want to know if there is any significant difference between the different areas in terms of the DNA concentration. I want to know which areas are different too. I could transform the data to make it normal if necessary.

    Thank you so much!
    Dejana

  22. Hi!
    I have a 50-item test for grade 1 students for four different school years; each school year has a different population. I ran OPLM for each school year. I have a slope result from OPLM for each school year, and I want to know whether the slopes are significantly different from one another. I want to use SPSS, but I don’t know which test to use or which post hoc analysis to use. Thanks.
