Homogeneity of Variances

Certain tests (e.g. ANOVA) require that the variances of different populations are equal. This can be determined by the following approaches:

  • Comparison of graphs (esp. box plots)
  • Comparison of variance, standard deviation and IQR statistics
  • Statistical tests

The F test presented in Two Sample Hypothesis Testing of Variances can be used to determine whether the variances of two populations are equal. For three or more variables the following statistical tests for homogeneity of variances are commonly used:

  • Levene’s test
  • Fligner Killeen test
  • Bartlett’s test

Using the terminology from Definition 1 of Basic Concepts for ANOVA, the following null and alternative hypotheses are used for all of these tests:

H0: \sigma_1^2 = \sigma_2^2 = ⋯ = \sigma_k^2

H1: Not all variances are equal (i.e. \sigma_i^2 ≠ \sigma_j^2 for some i, j)
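
To make these hypotheses concrete, here is a minimal sketch of how the statistic for the mean-centered version of Levene's test could be computed with only the Python standard library. The function name levene_statistic and the sample data are made up for this illustration. Turning W into a p-value requires the F(k-1, N-k) distribution, which this sketch does not attempt.

```python
from statistics import mean

def levene_statistic(*groups):
    """Return (W, df1, df2) for the mean-centered Levene's test.

    W is the one-way ANOVA F statistic computed on the absolute
    deviations of each observation from its own group mean.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviations of each value from its group's mean
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    ss_between = sum(len(zi) * (m - z_grand) ** 2 for zi, m in zip(z, z_means))
    ss_within = sum((v - m) ** 2 for zi, m in zip(z, z_means) for v in zi)
    df1, df2 = k - 1, n_total - k
    w = (ss_between / df1) / (ss_within / df2)
    return w, df1, df2

# Hypothetical data: three groups with visibly different spreads
w, df1, df2 = levene_statistic([1, 2, 3], [2, 4, 6], [3, 6, 9])
print(w, df1, df2)
```

A large W relative to the F(df1, df2) distribution is evidence against H0. Replacing the group mean with the group median inside the absolute deviations gives the more robust median-centered variant.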


182 Responses to Homogeneity of Variances

  1. Ned from Norn Iron says:

    Many thanks for this… The easy to follow guide to Levene’s and Bartlett’s included in your download is just what I needed to sort out a tricky analytical problem…

    • admin says:

      I am very pleased that the site has been useful for you. I hope that you will use it again in the future.

      • mugo says:

        I am doing a homogeneity test under a randomized complete block design. Should I exclude the block factor and use only the treatments when doing the test?

        • Charles says:

          If I remember correctly, the homogeneity of variances assumption for RCBD is the same as for two-way ANOVA, namely that the variances of the interactions between treatment-block levels are all equal. If there are no replications, then none of these variances are defined and so there is nothing to test.

  2. Sriya says:


    Can you please let me know what transformation method I should be using if both standard deviation to means and means to variances are not proportional? There is no strong correlation for both?


    • Charles says:

      There is no easy answer to your question. It all depends on your data. There are an unlimited number of transformations as well (1/x, x^2, etc.). It also may turn out that a particular transformation creates more problems than it solves.

  3. Colin says:


    Will you add a real statistics function for “Bartlett’s Test” ?


    • Charles says:

      Box’s Test is a multivariate extension of Bartlett’s Test. It is already included in the Real Statistics Resource Pack (see multivariate statistics portion of the website).

  4. Deborah says:

    Hi, I just wanted to ask, what happens if your levene’s test is positive so homogeneity of variance cannot be assumed in a factorial independent measures ANOVA. I know that you have to change the significance to p=0.01 instead of 0.05 (or something along those lines) but what do I do in terms of SPSS? I have run the test as normal but I don’t know how I am supposed to interpret my results considering levene’s positivity.

    Please help!!! Thank you

  5. sonia says:

    sir i wanted to ask why homogeneity of variance is so important?please tell me in some points…

    • Charles says:

      Homogeneity of variances is a requirement for many of the most used statistical tests, including ANOVA. Fortunately most such tests are pretty forgiving, and as long as the variances are not too unequal the tests give pretty accurate results. But when the requirement is sufficiently violated, the results of these tests can be quite unreliable.

  6. Valerian says:

    Hi! I want to ask about the interpretation of Bartlett’s test in the Gen stat discover program for ANOVA. I am unable to interpret it.

  7. praveen kumar says:

    I want do homogeneity test for two variances please tell me to do the test

    • Charles says:

      Levene’s test can be used for two variances. You can use the LEVENE function as described on the referenced webpage.

  8. umar iqbal says:

    sir i want to know how do i find the relationship between export growth and variation between real and nominal exchange rate based on the measure of 3 months and 6 months i have collected data but now i m confused how do i apply non-parametric test on it and which test…

  9. Lucy says:

    Dear Sir,
    I am analyzing a field experiment on 4 maize varieties. The varieties were replicated three times in one location. Should I examine the homogeneity and normality tests?


    • Charles says:

      It really depends on what you are trying to test. The ANOVA tests will require homogeneity of variance and normality, but they can be quite forgiving even if these assumptions aren’t completely satisfied.

    • Cho says:

      I have conducted a experiment as two randomized complete block design ( 2 RCB) with different fertilization treatments, including control (no fertilizer application), PK, NK, NP and NPK fertilization with four replications at one location for four seasons (2 dry seasons and 2 wet seasons). The short duration rice varieties, hybrid (Pale Thwe-1) and HYV (Yadanar Toe) were tested. Can I do combine analysis for those 4 seasons’ yield data? Need to do homogeneity test? How to approach homogeneity test?

      • Charles says:

        Hello Cho,
        1. You can certainly combine the yields for the 4 seasons. You can also perform a factorial ANOVA, which will let you look at the fertilizer treatments, the seasons and the rice variety.
        2. Generally I use Levene’s test to check homogeneity of variances. I also look at the actual values for variances to see whether they are very different or not.

  10. Alfiya says:

    Dear Sir,
    I am doing the One-Way ANOVA analysis. My p-value =0,233 for Levene’s test. Since my data was not normally distributed I transformed it. Do I need to perform Levene’s test again for transformed data? (I have tried and p-value is less than 0.05)
    Thank you

    • Charles says:

      Since you will be testing the transformed data you need to make sure that the assumptions are satisfied on the transformed data. Since homogeneity of variance is an assumption for One-way ANOVA this assumption needs to be verified for the transformed data. Levene’s test is a way of checking this.

  11. Kennedy says:

    I love your website it is so useful in helping me solve statistical problems. I am a little confused about how to perform hypothesis testing when the observations are just given as one total without actually listing them separately – 125 observations (Southern States) and 132 observations (Northern States) with a sample mean of 87 and 88 respectively and a population variance of 7.0 and 6.2 respectively. Level of significance is .01 is there evidence that the workers in southern states are receiving less pay than workers in northern states?

    • Charles says:

      Since you have the population variances you can use a two sample test using the normal distribution, as described in Theorem 1 of Comparing Two Means.

      The null hypothesis is mean1 <= mean2 (these are population means, with 1 = Northern and 2 = Southern). The test statistic is z = (m1-m2)/stdev, where m1-m2 = 88-87 = 1 (sample means) and stdev = sqrt(var) where var = v1/n1 + v2/n2 = 6.2/132 + 7.0/125 (v1 and v2 are the population variances). If NORMSDIST(z) > .99 then you reject the null hypothesis and conclude that workers in the southern states receive less pay. This is a one-tailed test. If you want a two-tailed test you need to replace .99 by .995.

      If instead of the population variance you had the sample variances you would use Theorem 1 of Two Sample t Test instead.
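
For readers who want to check the arithmetic outside Excel, the following is a rough sketch of the same one-tailed z test in Python. It assumes that 7.0 and 6.2 are population variances (as the question states), so they are divided by the sample sizes directly. The helper normal_cdf is a stand-in for Excel's NORMSDIST and is defined here only for this example.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF, the counterpart of Excel's NORMSDIST."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Summary figures from the question; 7.0 and 6.2 are treated as
# population variances (not standard deviations).
n_south, mean_south, var_south = 125, 87.0, 7.0
n_north, mean_north, var_north = 132, 88.0, 6.2

# Standard error of the difference between the two sample means
std_err = sqrt(var_north / n_north + var_south / n_south)
z = (mean_north - mean_south) / std_err
p_lower = normal_cdf(z)

# One-tailed test at alpha = .01: reject when the CDF value exceeds .99
reject = p_lower > 0.99
print(round(z, 3), reject)
```

For the two-tailed version, the comparison value .99 would be replaced by .995, as noted in the reply.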


  12. Sumit says:

    Can one perform t-test or ANOVA using CV if the variance between group/s is not similar? If yes, how does one do it? I am a statistics illiterate

    • Charles says:

      There is a version of the t-test which you can use when the variances are not similar. See the webpage two sample t-test with unequal variances.

      There are also substitute tests for ANOVA when the variances are not equal. See the Dealing with non-heterogeneity of variances topic on the referenced webpage.

      You use the abbreviation “CV”. What does this stand for?
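
As an illustration of the unequal-variance t test mentioned in this reply, the sketch below computes the test statistic and the Welch-Satterthwaite degrees of freedom from raw data. The function name welch_t and the two samples are invented for the example, and the p-value (which needs the t distribution) is left out.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for the two-sample t test with unequal variances."""
    # Squared standard error contributed by each sample
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical samples: same size, but the second is four times as variable
t_stat, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(t_stat, df)
```

Note that the degrees of freedom come out lower than the n1 + n2 - 2 used by the equal-variance t test, which is how this version of the test accounts for the unequal variances.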


  13. saif salim says:

    Dear sir please I need your help:
    1- 40 students are divided into two groups of 20 each [the control group (i.e CG) and the experimental group(i.e. EG)] they are considered equal in their level of English language study. The CG read version A of a text with certain rhetorical organization , on the other hand the EG read version B of the same text with different rhetorical organization . They are asked to read these two versions and recall information from them so the amount of information recalled and speed of reading spent are recorded . I want to test the following hypothesis ” To what extent will the change (in the rhetorical pattern) affect the ease of information recalled and speed of reading as well? How can I use ANOVA test please when the CG have two marks of amount of information recalled and marks of speed of reading and the EG have also two marks also

    • Charles says:

      It sounds like you have one independent variable Rhetorical organization (RO) and two dependent variables: Information recalled (IR) and Reading speed (RS). You can use MANOVA, or more simply Hotelling’s T-square as described on the webpage Hotelling’s T-square.

      • saif salim says:

        Dear sir, I consider the students (i.e. CG and EG) as dependent variables since there is no difference between them and considered equal, the RO (i.e.the two versions of texts version A and B with different rhetorical organization) as independent variable since it is varied . However the IR and SR are the responses or marks gathered and not variables .Can I use ANOVA test to examine the responses i.e. the IR (amount of information recalled) then do the same procedures to examine the RS (speed of reading) with my thanks and regards sir.

        • Charles says:

          Sorry, but I still don’t understand the question. E.g. in the statement “I consider the students (i.e. CG and EG) as dependent variables since there is no difference between them and considered equal…”, I don’t understand why this statement would make CG and EG dependent variables. I really don’t understand the other statements either.

  14. niki says:


    i conducted a two way anova and the levene’s test p value is 0.001, my study has a continuous dv (eating style) and two categorical iv’s
    Not sure what to do as ive used this analysis to show the effects of two other eating styles?

    • Charles says:

      Sorry, but I don’t understand your question. Please explain in a little more detail what you have done and what you want to accomplish.

      • niki says:

        i have conducted a two way anova to see if there is a sig difference in the means of eating style scores when participants reported their stress levels (high vs low stress) and sleep quality (good vs poor). I have conducted this for both emotional and external eating style but when i conducted the analysis for restrained eating style the homogeneity assumption was violated what can i do?

        • Charles says:

          Look at the Dealing with non-heterogeneity of variances topic towards the end of the referenced webpage.

  15. brain matienga says:

    dear sir
    I am doing research on the effects of wet/dry and wet feeding troughs on feeding grower pigs . so i have 2 treatments wet feeding and dry feeding and 2 blocks that is males and female so i have 24 males and 24 females all pigs are of same age and genetic .Can you help me on the method of data analysis to use. I am measuring the feed in take daily and weighing pigs on entery and then once weekly for 8 weeks

    • Charles says:

      It all depends on what you are trying to test, but based on what you have described it sounds like a three factor Anova. One fixed factor with two levels for feeding type, one fixed factor with two levels for gender and one repeated factor with 9 levels for week. The dependent variable is weight.

  16. Kevin says:


    Great info here…could you possibly do a page on the Fligner Killeen test (nonparametric assessment of homogeneity of variance with >2 groups), using concrete numbers? Every “example” I can find either just gives the chi-square result without the computation details, or just uses the abstract symbols and summation notation to explain it…really confusing and infuriating! Why can’t they just use an example with raw numbers? Thanks again!

  17. ojirobe says:

    Good day sir! Pls. I am currently working on a multiple regression problem, but I don’t know how to test for the homogeneity of variances. I am using the spss package which has a built in levene’s test procedures. Is it expected of me that check for the equality of variances of the DV and all IV? When I tried comparing the means using oneway anova, I kept on getting response like: homogeneity of variances cannot be performed cos only one group has a computed variance and also because the sum of case weights is less than the number of groups. Any form of explanation or references given would be appreciated.
    Thank you in anticipation of a swift response

    • Charles says:

      Levene’s test is a good way to test for homogeneity of variances. If you are performing one-way ANOVA then you need to test the homogeneity of the k groups in the test. I don’t use SPSS and so I am not able to explain the error message you received.

    • Kevin says:

      Hi there,

      I use SPSS, and have received an error that I believe is very similar to yours in wording. In my experience, it usually means that the two variables were entered incorrectly…the IV was labelled as the DV and vice versa. Try switching the group’s around and see if that doesn’t solve the problem.


  18. Naza says:

    Dear Sir,
    I used Levene’s test to check homogeneity of variance for my study. The result shows that p-value of Levene’s test is 0.165, while p-value for F-test is 0.000. Can I conclude that the variances is equal?
    Thank you.

    • Charles says:

      A p-value of .165 indicates that there is no significant difference between the variances.
      I don’t know which F-test you are referring to.

  19. Stan says:

    Good afternoon– I am trying to calculate the sample size required to test whether a sample variance is = 0.63 vs. H1: var < 0.63 with, say, 80% power, alpha=0.05. Could you give me some advice on how to calculate the sample size required for this test?
    Thanks! Stan

    • Charles says:

      This is a one sample variance test (using the chi-square distribution). You can find more information about this test at One Sample Variance Test.

      You can estimate the sample size by pressing Ctrl-m and choosing the Statistical Power and Sample Size data analysis tool. On the resulting dialog box choose the One sample variance and Sample size options.


  20. Duygu says:

    Hi, I am comparing 7 groups defined under one independent variable. My problem is with the homogeneity of variance test as equal variance is one of the assumptions for the one way anova. Although my groups have equal sample size, they show unequal variance in the Levene test (p-value:0.001). Can I still use the Anova test or should I use the Kruskal-Wallis H test? The sample size is 20 for each group, there are 7 groups, and the data is normally distributed but with unequal variance. Please kindly guide me regarding choosing the correct anova and post hoc tests for this analysis. Many thanks.

    • Charles says:

      With unequal variances you should probably choose Welch’s test. A post-hoc test for unequal variances is Games-Howell.

  21. Boobala Krishnan says:

    Dear sir,
    I am in a confusion with my data i have. I want to run an one way ANOVA in SPSS for which fails the assumption of homogeneity of variances, where the p<.05(.032). i dont know to proceed with which test to follow and to report the result. please help in this regard.

  22. Michelle says:

    Good afternoon,
    I am very confused; I have been carrying out a Welch test as my data violated the assumption of homogeneity and following with the Games-Howell post hoc test. However my output showed that I do have a sig. difference between groups (0.03) but then failed to identify between which groups the difference occurred on the post hoc test. I don’t know where to go to resolve this. Would a different post hoc test show me where the differences occurred?

    Kind Regards.

  23. Kaiyu says:

    Hi Charles,
    I plan to perform a 3 (A) x 3 (B) ANOVA. However, the Levene’s test results indicate that the error variance is not equal across groups. I like to ask for your advice if I could still conduct ANOVA. If not, what tests should be done? In addition, what are the tests available for comparing group means (B1, B2 and B3) under A1, A2, and A3? Thank you.

    • Charles says:

      For one-way ANOVA, the usual choice is Welch’s Test — see the webpage Dealing with Non-Homogeneous Variances.

      There isn’t a similar test for two-way ANOVA. If you don’t care about the interaction of the two factors (i.e. reduce the problem to one-way ANOVA), you can simply use Welch’s Test. Otherwise you can try to make some transformations that address the problem with the variances (as described on the above webpage). If the variances are not too unequal (e.g. the p-value for Levene’s test is not too small), then you can use two-way ANOVA but report that there is a problem.

      Unfortunately, none of the solutions is that great.
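
For the one-way case, the statistic for Welch's test can be sketched as follows. This is a minimal illustration using the commonly published formula, not the website's own implementation. The function name welch_anova and the data are assumptions made for the example, and the p-value (from the F(df1, df2) distribution) is not computed.

```python
from statistics import mean, variance

def welch_anova(*groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA."""
    k = len(groups)
    n = [len(g) for g in groups]
    m = [mean(g) for g in groups]
    # Weight each group by n_i / s_i^2, so noisier groups count for less
    w = [ni / variance(g) for ni, g in zip(n, groups)]
    w_sum = sum(w)
    m_weighted = sum(wi * mi for wi, mi in zip(w, m)) / w_sum
    numerator = sum(wi * (mi - m_weighted) ** 2 for wi, mi in zip(w, m)) / (k - 1)
    lam = sum((1 - wi / w_sum) ** 2 / (ni - 1) for wi, ni in zip(w, n))
    denominator = 1 + 2 * (k - 2) * lam / (k ** 2 - 1)
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return numerator / denominator, df1, df2

# Hypothetical data for three groups with unequal spreads
f_stat, df1, df2 = welch_anova([1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [3, 3, 4, 5, 5])
print(f_stat, df1, df2)
```

With only two groups, F reduces to the square of the unequal-variance t statistic and df2 to the Welch-Satterthwaite degrees of freedom, which is a handy sanity check.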


      • Luna Shrestha says:

        Hello Charles,
        I found you very helpful. Interesting to read your reply. I am having problem to do statistics for my data.
        I am doing research on drying of apple at two different temperatures (50 C and 70 C) with 9 different pretreatments (Untreated drying at 50 C, Untreated drying at 70 C, Hot water blanching at 50 C drying at 70 C, Hot water blanching at 70 C drying at 70 C and so on…..), drying at different intervals of time (30 min, 60 min until 240 min)
        I am measuring colour measurement in each experiment : change in colour Delta E , Change in lightness (Delta L), Change in redness (Delta a).
        I want to see effect of temperature, pretreatment, and drying time on colour change.

        I ran Anova but I face a problem with Levene’s test. It shows heterogeneity. I am not sure that I am doing the right analysis. I do not have much idea about statistics.
        I would be really grateful if you could help me in this matter.

        Thanking you in advance
        Best regards

        • Charles says:

          It is hard for me to answer your question based on the limited amount of data that you have provided. It sounds like you are measuring three types of colour-related change: colour Delta E , lightness (Delta L)and redness (Delta a). If so, then this seems like a MANOVA or multivariate regression problem. What are the scales of the three colour-oriented changes: continuous? discrete? On what groups of data did you perform Levene’s test? What was the actual p-value?

          • Luna Shrestha says:

            Hello Charles,

            Thanks! I have factor A = Pretreatment with 9 sub level; factor B = Drying temperature with 50 C and 70 C; Factor C = Drying Time with sub level of 30 min, 60 min, 90 min, 120 min, 150 min, 240 min.

            Dependent Variable = Colour change (Delta E).
            The scale of Colour change is continuous ( Scale).

            I just checked homogeneity of test (levene’s test ), it shows p<0,05 .

            Even when I did MANOVA, Box's test was not performed and there's no value in some of the tables.

            I am very bad in statistics. Please help me


          • Charles says:

            What hypothesis are you trying to test?

  24. Portia says:

    Sir what if in a 1 way independent ANOVA, in your homogeneity of variances the variances are not equal? Should I use Welch? If so, how am i going to report it and tell that I must use the Welch in spss? thankyou

    • Charles says:

      Yes, generally you would use Welch’s test. You report that you are using this test since the homogeneity of variances assumption required for ANOVA is not satisfied.

  25. Tiffany Cleveland says:

    Can you tell me in layman terms, why homogeneity of variance is an important consideration in statistics and what are some methods used to determine if variances are homogenous?

    Any help with simple examples is greatly appreciated.

    • Charles says:


      Homogeneity of variance is important since some of the statistical tests are not valid (i.e. they don’t give accurate results) if this assumption is not met (more precisely, if the variances are too different). A good example of this is ANOVA. If the homogeneity of variance assumption is not met then you would usually use Welch’s Test instead of one-way ANOVA.

      The typical tests used to determine whether the variances are sufficiently homogeneous are described in the referenced webpage (e.g. Levene’s Test).


  26. Earn says:


    Could you please help me to analyze the data by using One-Way Anova as i have got these result as follows:

    1. Homogeneity of Variance =.011 (It means that I need to take a look on Welch Statistic because the normal assumption has not been met)
    2. Sig. in Welch’ Statistic = .06
    3. Whereas, Sig. in Anova Table = .03
    4. More importantly, there are * asterisk appeared in Post Hoc Tests which can show where the difference exactly is for that variable.

    So, the question is “how can i conclude this result, also how to choose sig. to report when it has not been met assumption of variance”?

    Hopefully, you will reply me soon and I am really looking forward to seeing your response.

    Thanks very much in advance,

    • Charles says:


      1. When you say that “Homogeneity of Variance =.011”, does this mean that p = .011 for Levene’s test? If the homogeneity of variance assumption is not met then usually you would use Welch’s test.

      2. Since p = .06, then using an alpha value of .05, you have a non-significant result and so you wouldn’t usually perform any post-hoc tests

      3. Since the homogeneity of variance assumption is not met, I would tend to use the Welch’s result and not the ANOVA result.

      4. Which post-hoc test did you use? Games-Howell would likely be a good choice. The fact that you see an asterisk, means that you are probably not using Excel, and so I don’t know how to interpret it (although in many software packages this probably indicates a significant result).


  27. fooz says:

    Dear Charles
    please help me how to analyze this hypothesis;
    -There will be no statistically significant difference in the perceptions of EFL teachers and students with respect to teachers ( verbal and nonverbal )’ immediacy behavior in their classrooms.
    since the variances of two populations are not equal. Levene’s Test for Equality of Variances indicate sig=0.000
    thank you

    • Charles says:

      In order for me to answer your question, you need to provide a more detailed explanation of what you are trying to test. If you are trying to perform ANOVA but the homogeneity of variances assumption is not met based on Levene’s test, then a common approach is to use Welch’s ANOVA test instead. See Welch’s Test for more details.

      • fooz says:

        Would you please help me
        I confused between Anova or T.test for independent sample.since I have two population( teachers- students) and I have two factors ( verbal – nonverbal).
        and if I choose Anova which kind should I use.
        thank you

        • fooz says:

          dear Charles,
          I would like to add more details to clarify my question.
          -I have 50 teachers and 277 students. I used two versions of questionnaire as an instrument. one for students and the other is for teachers. each questionnaire includes two parts:
          1- verbal
          2- nonverbal
          The purposes of the study are to explore the perceptions of EFL teachers and students concerning teachers’ verbal and nonverbal immediacy behaviors and to see if there is any difference between their perceptions.
          thank you in advance

          • Charles says:

            As usual, the devil is in the details. It really depends on what sort of hypotheses you want to test. E.g. if you want to see whether there is a significant difference between the teachers’ and students’ verbal scores you can conduct a t test. This has the advantage that the variances can be unequal and the sample sizes can be unequal. You can also perform a similar test comparing nonverbal scores. If you need to consider interactions, then you can use ANOVA. The problem with this is that you have said that the homogeneity of variances assumption fails.

  28. Alison says:

    Dear sir,

    How can I tell if my Homogeneity of variance assumption is met? I ran my data and calculations through SPSS and my output lists my Levene’s Test Sig = .620. My Independent Sample T-Test output reveals my Sig (two-tailed) is .050. Is there a rule to remember when the assumption is met and when it is not? Any help is appreciated.
    Thank you!

    • Charles says:

      Generally a significance level of .05 is used for most statistical tests, including Levene’s Test. Since p-value = .620 is much higher than .05, you can’t reject the null hypothesis that the variances are equal.

      Your t test is right on the boundary with a p-value of .05. Since I typically use p-value < .05 as the measure of significance, then I would say that this test shows a non-significant result, i.e. you don't have enough evidence to reject the null hypothesis. Now although a significance level of .05 is commonly used, it is quite arbitrary. Regarding your question "Is there a rule to remember when the assumption is met and when it is not?", I would say that Levene's test shows that the homogeneity assumption is met. I don't know how you are using the t test and generally it is not used to test the homogeneity of variances assumption, and so I can't comment on this. Charles

  29. magnus says:

    Given the mean and standard deviation of a variable in panel dataset of countries, how do I determine homogeneity or heterogeneity?

    • Charles says:


      You don’t have enough data to determine whether the variances are equal across the various countries. Generally, you need to know the raw data by country. The problem of determining whether the variances are all equal in this case is equivalent to determining whether any list of positive numbers are all equal. I don’t know of any test that does this.

      Whether the variances are equal enough probably relates to what you will use this information for. For some tests as long as the largest variance is not more than 3 or 4 times the smallest variance you should be ok.

      Please provide the context for which the question of homogeneity of variances is relevant.
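
The rule of thumb in this reply (largest variance no more than 3 or 4 times the smallest) is easy to check once the raw data by group is available. Below is a small sketch; the function name variance_ratio and the three data sets are hypothetical, and the cutoff of 4 is just the upper end of the range quoted above.

```python
from statistics import variance

def variance_ratio(*groups):
    """Return the largest sample variance divided by the smallest."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)

# Hypothetical raw data for three countries
ratio = variance_ratio([1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [1, 3, 5, 7, 9])

# Informal check only; this is not a formal test of homogeneity
within_rule_of_thumb = ratio <= 4
print(ratio, within_rule_of_thumb)
```

This is only a quick screen and says nothing about statistical significance; a formal test such as Levene's still needs the raw observations.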


  30. Selena says:


    and thank you for such a helpful website!

    I’m trying to compare the means of 4 different age groups, which are not equal (e.g. group 1=195, group 2= 499, group 3=688 and group 4=693). The aim is to compare the mean preference score that these groups assigned to 5 images to determine whether different age groups preferred different images. However the Levene test is significant for mean scores of images 2 and 4, and non significant for images 1, 3 and 5. I am just wondering why this might be? The groups are quite uneven, so I would have expected a significant Levene statistic for all of them. Perhaps I did something wrong…?

    thank you for your advice!

    • Charles says:

      I am not following the scenario completely. When you say “e.g. group 1=195, …”, do you mean that group 1 has 195 elements?
      Generally when you perform Levene’s test, you use all the data as input, and don’t carry out the test separately for different images (unless you are testing these images separately).
      Please clarify.

      • Selena says:

        Hi Charles,

        thank you for your reply.

        You are correct; I have 4 age groups, so age group 1 (10-20 yrs) has 195 respondents, age group 2 (21-30 yrs) has 499 respondents and so on.

        I am indeed doing 5 tests separately to assess any differences in opinion among the younger/older respondents for 5 different images. So for instance, will younger respondents rate image A higher than the older respondents? but I would also like to determine if this will also be the pattern for images B, C, D and E. If I was investigating opinion for just one image, I would not have spotted this issue – but having done the 5 tests (in SPSS), I noticed that Levene’s test was significant for Images B and C, and not for A, D and E. I can’t understand why this should be the case as the groups are the same in all 5 tests. I know you do not use SPSS, but was just wondering if in your experience you might have come across something like this?

        Thank you,

        • Charles says:

          Are you saying that when you use Levene’s test on images B and C together you see that the variances are significantly different? Or are you applying Levene’s test to the data in B alone and then to C alone?

  31. Chokoreeto says:


    I am using SPSS to find out if there are significant difference for usage of system by different age groups.

    I have six different age groups (independent variables) with unequal sample sizes. In this case, I used one-way ANOVA, and intends to look at the statistics for between subjects ANOVA’s columns. However, because the sample sizes for six age groups are different, should I look at the Welch’s test instead?

    My data is:

    Levene’s statistics: 2.906, df1: 5, df2: 145, sig: 0.016

    Between groups- sum of squares: 12.894, df=5, mean square: 2.579, F: 4.637, sig: 0.001
    Within groups- sum of squares: 80.646, df: 145, mean square: 0.556
    total: 93.539 df: 150

    Welch- statistic: 5.789, df1:5, df2: 60.418, sig: 0.000

    Please help me.

    • Charles says:

      You can use one-way ANOVA even when the group sizes are unequal, but you need to make sure that the homogeneous variance assumption is met. This is especially important when the sample sizes are unequal. Since Levene’s test shows a significant difference in the group variances, I would use Welch’s ANOVA.
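
As a side check on the ANOVA table quoted in the question, the mean squares and F statistic can be recomputed from the sums of squares and degrees of freedom. The sketch below uses only the figures posted above; the variable names are chosen for this example.

```python
# Figures copied from the ANOVA table in the question
ss_between, df_between = 12.894, 5
ss_within, df_within = 80.646, 145

# Mean square = sum of squares / degrees of freedom
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F is the ratio of the between-groups to the within-groups mean square
f_stat = ms_between / ms_within
ss_total = ss_between + ss_within
print(round(ms_between, 3), round(ms_within, 3), round(f_stat, 3), round(ss_total, 3))
```

The recomputed values agree with the reported table up to rounding, so the table itself is internally consistent. The question of which result to report still turns on Levene's test, as explained in the reply.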

  32. Laura says:


    I was wondering if you could help me. For an assignment I have been asked to test whether my physical activity (number of steps I take) differed from baseline after I used a goal setting intervention. I recorded my number of steps for two weeks at baseline and two weeks during the intervention. During the intervention I set myself the goal of walking 10,000 steps each day, and I conducted an independent-samples t-test. My results came back insignificant; however, my Levene’s test for equality of variances was almost significant (0.052). As Levene’s is so close to being violated I was wondering if I have carried out the wrong type of t-test?
    Many thanks,


    • Charles says:


      Since it is always you who is the subject of the experiment baseline and after intervention, it sounds like you have a paired t test scenario instead of an independent samples t test scenario. In either case, Levene’s test is not particularly relevant: in the paired t test, there is only one sample (the differences) and in an independent samples t test, you can always use the unequal variance option, although with a p-value = .052, I wouldn’t expect there to be much of a difference between the equal variance and unequal variance versions of the t test.

      I do have one very serious concern about the test. If you are always the one doing the walking, it may be that you are violating the random samples assumption of these tests (and almost any other test). You haven’t given me enough information about the scenario to determine whether this is a problem, but I wanted to alert you to the possibility.


  33. Susanne says:


    I was wondering what the difference between ‘Levene’s test of Equality of Error Variances’ and ‘Levene’s test for Equality of Variances’ is.
    I am trying to do a three-way ANOVA, but I’m not very good at SPSS or statistics. When I perform a t-test my Levene’s test (for Equality of Variances) is not significant. However, once I try to combine the data in a three-way ANOVA it says that the Levene’s test of Equality of Error Variances is 0.014 and thus significant. I’m not sure what to do. Does this mean that my ANOVA won’t be reliable?
    Please help!
    I’ve seen that over the years you’ve helped a lot of people and I really have to say that you are a huge help to, I am guessing, a lot of students, who are just as desperate about SPSS / statistics as I am. Thank you for that!

    • Charles says:

      My website is about statistics using Excel. On the site I describe Levene’s test and 3-way ANOVA using Excel. I don’t use SPSS and so don’t know the details of these tests in SPSS.

  34. Nikki says:

    I am conducting a two-way ANOVA and produced a Levene’s test where p > 0.05, which shows that the assumption of equality of variances was met. But I don’t have a null hypothesis. Would I need one?
    I probably sound stupid right now but I really need help, I am so hopeless at statistics!

    Thank you.

    • Charles says:

      The null hypothesis is that the variances are equal. As you have stated, there is no evidence that this hypothesis has been violated. You can’t say definitively that the variances are equal; a non-significant result only means that the data do not provide enough evidence, at the 5% significance level, to conclude that they differ.

  35. Beth says:

    I am conducting a statistical test which looks at the differences in anxiety between two independent groups (with one group receiving training). The two groups will be tested pre and post. I have run a Shapiro-Wilk test and have found the data is non-normally distributed. Additionally, I have completed a non-parametric Levene test on pre vs post for both of the groups (would this be correct, as opposed to pre vs pre?) and have found both groups to be heterogeneous. From reading this post I believe this would result in me running a Welch’s test to identify significant differences. Would this be correct? Additionally, research has suggested I could use a Mann-Whitney U test to test for significance. Would this also be correct?

    • Charles says:


      I am a bit confused. I understand that you have two groups, one receiving training and a different independent group that did not receive training. So far this sounds like a t test using unequal variances. If each group is far from normally distributed then you should consider the Mann-Whitney test. Since the t test is pretty robust to violations of normality, the t test may still be the best choice.

      The part that confuses me is your reference to pre- and post. Do you also have pre and post data? (Also, pre and post what?)


      • Beth says:

        Thank you for your reply, yes you are correct in reference to the data. I have two independent groups where anxiety levels are assessed pre and post season using the Sport Anxiety Scale. I was under the impression that a T test could not be used if the data was non-normally distributed? Also the two groups are unequal.

        Best wishes,

        • Charles says:

          It really depends on how far from normal the data is. In any case, if you are combining pre and post with training and not training, then you probably want to use a two factor mixed ANOVA model. Training is a fixed factor and pre/post is a repeated measures factor.

          • Beth says:

            The data is very non-normally distributed (.00). I was under the impression that I would use an ANOVA; however, the data shows that the homogeneity of variance is violated. Nevertheless, I ran a non-parametric Levene’s test, which I have been told would not be correct seeing as I have unequal sample sizes. Would any other tests be appropriate for testing for homogeneity of variance for non-normally distributed data and unequal sample sizes?

            Thanks again,

  36. Jade says:


    I have conducted a two-way ANOVA and a Levene’s test, and homogeneity cannot be assumed. Does this mean that all post hoc tests conducted would be non-parametric, i.e. Mann-Whitney U? We re-ran normality tests on some interactions and found them to be parametric, so we thought we should run a t-test; however, if homogeneity can’t be assumed, would we run a Mann-Whitney?

    • Charles says:


      If your post hoc test is to compare two groups which have fairly equal variances, then you could simply use a t test. In fact, even if they had unequal variances, you could use Welch’s version of the t test. You would use Mann-Whitney if the data was far from normally distributed, but in that case you probably shouldn’t have used ANOVA in the first place.

      In any case, in general in order to reduce experimentwise error, you would use a different collection of post hoc tests (Tukey’s HSD, Games-Howell, etc.). I suggest that you look at the following webpage


  37. maryam says:

    I have conducted a two-way ANOVA and its homogeneity of variance assumption has been violated, so what should I do now?

  38. Manjishtha Bhattacharyya says:


    I log transformed my data (experimental) and on conducting one way ANOVA in SPSS, I found that the homogeneity of variance was not significant (Levene’s Test). Can that happen?

    Also, to correct the ANOVA I have included Welch test. Am I right?

    Please advise.


  39. Nathaniel says:

    Dear Gurus,

    I have some statistical dilemmas with unbalanced sample sizes of 15 groups (treatments). All have a sample size of 10, except two with 9 each (Q1. Can this be resolved with Type III ANOVA?). Besides, Levene’s test showed a p 0.05). Given the violation of homogeneity of variance, (Q2.) would it be okay to use Type III ANOVA without any transformation whatsoever? (Q3.) How best should I get around these hurdles?

    • Charles says:

      Dear Nathaniel,
      Q1: If this is a one-way ANOVA, I wouldn’t worry too much about the small difference in sample size
      Q2: A value of p = .05 is probably good enough to consider the homogeneity of variance assumption to be met. I would look at the actual values and make sure that the largest variance is not more than 3 or 4 times the smallest.
      Q3: Not sure you have any hurdles to get around.

  40. Sammia says:

    Hello. I have run a one-way ANOVA and the assumption of homogeneity of variance is violated. Is it okay to use Welch’s test? Is it necessary to run a post hoc test in this case?

  41. Sammia says:

    The value of Levene’s test is 0.026 in ANOVA; I then ran Welch’s test and the value is 0.015. Is it necessary to run a post hoc test?

    • Charles says:

      Since the result from Welch’s test is significant, if you want to pinpoint the source of the difference in means, then you need to perform a post hoc test.

  42. yassineisa says:

    For the result of a parametric statistical test to be valid the data should ?

  43. khushi says:

    Sir, before applying ANOVA is it necessary to test for all three assumptions of ANOVA (independence of observations, homogeneity of variance and normality of residuals)? Can we apply ANOVA if any of the assumptions is violated?

    • Charles says:

      You don’t really test for independence of observations. You need to make sure that when you create the sample you do this in a way that the assumption holds. ANOVA is pretty robust to violations of normality, especially with balanced models where all the groups are symmetric, but you need to make sure that the data doesn’t depart too much from normality. ANOVA is more sensitive to violations of the homogeneity of variances assumption, but even here, as long as the ratio of the largest group variance to the smallest group variance is not more than 3 or 4, you usually shouldn’t have a problem.
      If any of these assumptions is violated (too much), then you shouldn’t apply ANOVA.

      • Ming says:

        Sir, what references can I use to support the argument that as long as the ratio of the largest group variance to the smallest group variance is not more than 3 or 4, the ANOVA test still works fine?

        Thank you.

        • Charles says:

          This is merely a rule of thumb that is generally (but not always) true. You probably won’t see it stated in a scientific journal.
          For a specific test you can use Levene’s test (or something similar) to see whether the homogeneity of variances assumption is met.
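
          The rule of thumb is easy to check before running any formal test. A minimal sketch in pure Python follows; the function name `variance_ratio` and the data are hypothetical, and the cutoff of 3 to 4 is only the informal guideline mentioned above, not a formal criterion.

```python
from statistics import variance


def variance_ratio(groups):
    """Ratio of the largest sample variance to the smallest one."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


# Hypothetical data: three treatment groups.
groups = [[23, 25, 21, 27, 24], [30, 36, 28, 41, 33], [19, 20, 22, 18, 21]]
ratio = variance_ratio(groups)
print(ratio, "looks fine" if ratio <= 4 else "check homogeneity more carefully")
```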

  44. aarju says:

    Sir, before applying ANOVA, is it necessary to test all three assumptions of ANOVA (normality of residuals, homogeneity of variance and independence of observations)? Can we apply ANOVA if any one of the assumptions is violated?
    Thank you, sir.

    • Charles says:

      You don’t really test for independence of observations. You need to make sure that when you create the sample you do this in a way that the assumption holds. ANOVA is pretty robust to violations of normality, especially with balanced models where all the groups are symmetric, but you need to make sure that the data doesn’t depart too much from normality. ANOVA is more sensitive to violations of the homogeneity of variances assumption, but even here, as long as the ratio of the largest group variance to the smallest group variance is not more than 3 or 4, you usually shouldn’t have a problem.
      If any of these assumptions is violated (too much), then you shouldn’t apply ANOVA.

  45. Takwa says:

    Dear Charles,
    Can equality of variance using Levene’s test be used as a safety net for the comparability of the two independent groups? What measurements should I take to ensure that groups don’t differ substantially from one another at the beginning of the study or to increase group comparability?
    I do thank you for your tremendous help.

    • Charles says:

      Yes, Levene’s test can be used to determine whether there is a significant difference between variances.
      Sorry, but I don’t understand your second question.

      • Takwa says:

        Thank you for replying and for easing my mind. I was thinking that I need to see if there is a significant difference between groups at a pretest, and use this pretest as a covariate for the analyses.

  46. Weronika says:


    I am doing research where I test 24 dependent variables against one independent variable. I have tested the homogeneity of variances with Levene’s test. For one part of the data I obtained values higher than 0.05, which means that the variances are homogeneous and I can perform a one-way ANOVA test for those variables. However, for the other part of the data I obtained results lower than 0.05, so the variances are not homogeneous and I should use Welch’s ANOVA.

    As this is all happening in one study, does it mean that for part of the data I should perform the one-way ANOVA test and for the other part Welch’s test? Or should I perform Welch’s test for all of the data? How should I proceed further with the post hoc tests: should I use the post hoc test for equal variances for part of the data and, for the rest, the post hoc test where equal variances are not assumed?

    Thank you in advance for your answer.

    • Charles says:

      Based on the paucity of information that you have provided, I can’t say for sure, but it does sound like you might need to use Welch’s ANOVA for at least a part of the analysis. Games-Howell might be the appropriate follow-up test after Welch’s.
      Are you sure that you have 1 independent variable and 24 dependent variables? In this case, it is hard to see how you would use ANOVA or Welch’s test.

  47. Claire says:

    Hi Charles,
    I’m running a one-way ANOVA to figure out if one value (six groups) predicts another value (on a scale 0-100). Levene’s test was p>0.001, so I thought I’d use Welch’s test instead of the ANOVA.

    However, I read that ANOVA is robust enough to be used anyway if sample sizes are equal. My sample sizes are all identical, but the CI for the samples overlapped quite a lot.
    Should I still use Welch (+a post hoc test afterwards)?

    Thank you so much!

    • Charles says:

      ANOVA is pretty sensitive to violations of homogeneity of variances, although a little less so when the group sizes are equal. If p < .001 for Levene's test, then I would be cautious about using ANOVA. Welch's ANOVA may be a good choice if the groups are normally distributed or at least reasonably symmetric and there aren't outliers.

  48. Amanda says:

    Can you run a parametric t test if normality is met but homogeneity of variance is violated?

  49. David says:


    I’m reporting data from two experiments, with one independent variable, five mediating variables and three dependent variables. Levene’s test provides a p-value > .05 for all my variables, so I don’t have to worry about that. However, I can’t seem to figure out if I should report the results of Levene’s test in the appendix for all the mediating variables as well as for the dependent variables. Is there any reason that Levene’s test doesn’t make sense for mediating variables, or should I just report the findings for all the variables?

    Thanks in advance!

    • Charles says:

      You only need to confirm that the mediating variables meet the homogeneous variance assumption if the test that you are using requires this assumption.

  50. Smruthi says:


    I wanted to conduct a two-way ANOVA for the following variables: the independent factors are temperature (20, 30 and 40 degrees C) and metal concentration (100, 150, 200….450 ppm) and the dependent variable is metal uptake percentage. Unfortunately, both the normality test and Levene’s test were found to be significant (assumptions violated, 0.000).

    Thus, I cannot carry out two-way ANOVA. Any suggestions on what test I could apply, as I am quite interested to understand if there is an interaction between the two independent variables on the dependent variable? I could not figure out exactly how to transform my data.

    If not, then am I right in thinking that I need to forget the interaction of the two factors (i.e. reduce the problem to one-way ANOVA), and simply use Welch’s test?

    Thank you.

  51. Nur says:

    Hi sir,
    If the data violates the assumption of homogeneity of variance-covariance from the Box’s M Test, can we still use MANOVA?

  52. Sandra says:

    Hi, Sir.

    I would like to use ANOVA for statistical analysis.
    However, the data is not normal. What should I do?

    • Charles says:

      ANOVA is pretty robust to violations of normality, especially when the group data are symmetric and the group sizes are equal.
      If normality is violated you could try to use Kruskal-Wallis (provided the homogeneity of variances assumption is met).
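
      For reference, the Kruskal-Wallis statistic is just a calculation on the ranks of the pooled data. The following is a minimal pure-Python sketch under stated assumptions: the function names and data are hypothetical, tied values receive average ranks, and no tie correction is applied to H (so it is only exact when there are no ties). H is compared with a chi-square distribution with k − 1 degrees of freedom.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (without tie correction)."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


# Hypothetical data: three groups of skewed measurements.
groups = [[1.2, 3.4, 0.8, 9.1], [2.5, 7.7, 12.0, 4.4], [0.3, 0.9, 1.1, 2.0]]
print(kruskal_wallis_h(groups))
```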

  53. S says:

    Hi, I am doing a split plot ANOVA for 2 groups with 3 data collection points. Levene’s test is significant at time point 2 (but OK, i.e. non-significant, at the others). Can I proceed with the split plot ANOVA as normal or is there something I need to do? I know with other ANOVAs you use the ‘assumption of equal variance’ line on the SPSS table, but in the split plot SPSS output the Levene’s test comes out in a separate box without these 2 lines. Any suggestions about how to interpret this are very welcome.

    • Charles says:

      I can’t comment about SPSS since I don’t use SPSS.
      If Levene’s test fails, then you might need to use one of the nonparametric replacements for ANOVA.

  54. jed says:

    I have a 50-item test for grade 1 students for four different school years; each school year has a different population. I ran OPLM for each school year. I have a slope result from OPLM for the different school years and I want to know if the slopes are significantly different from one another. I want to use SPSS but I don’t know what test to use, or what post hoc analysis to use. Thanks.

  55. Dejana says:

    Hi Charles,

    This site has been very helpful so far, I wonder if you can please help me further.

    I have one dependent variable and one independent variable with 14 levels. I am looking at the difference in the concentration of DNA found in 14 different areas on 4 different participants. The group sizes are all equal, but the data fails the homogeneity of variance test. Can you tell me why this might be? The data also failed the normality test.
    I am wondering what test I can use here? I want to know if there is any significant difference between the different areas in terms of the DNA concentration. I want to know which areas are different too. I could transform the data to make it normal if necessary.

    Thank you so much!

  56. Rachel Hasson says:

    Hi Charles,

    I want to conduct a 2 by 3 Mixed ANOVA however I have a significant Levene’s value for one of the levels of my independent variable (all others are okay). My sample size is 60 and all other assumptions have been met. Am I still able to use the ANOVA and if so how do I justify it?

    Thank you,


    • Charles says:

      Technically no, since the assumptions have been violated. Whether the test is valid really depends on several factors, e.g.: (1) by how much is the homogeneity of variances assumption violated and (2) how far away from the alpha value are the p-values that you obtained.

      • Jenna Lee says:

        Hi Charles,

        I have found this really useful. I also have exactly the same situation as Rachel here. I have a 3 x 3 Mixed ANOVA design but one of the IVs has a significant Levene’s, with the p value being .026. Is that too far away to use the ANOVA?

        Your guidance on this would be really appreciated. Thank you so much for your time.


        • Charles says:

          It really depends on a number of factors, such as whether the variables with the higher means also have higher or lower variances, but it is likely that you will get reasonable results with a p value this high.

  57. Linda E. says:

    I am fascinated by this website, and very appreciative that it exists. I have a question and hope you can help. My participants are all from one cohort. I want to look at 80 students, 40 in each group. Group A will not get the intervention and Group B will receive the intervention. I will have all participants take the same assessment/survey and then all participants retake the same assessment/survey 3 hours later. During the 3 hours the intervention group will have experienced the intervention. Am I correct that I would do a mixed ANOVA? How would I test for violations/assumptions?
    Thank you for any assistance.

    • Charles says:

      Yes, this sounds like a repeated measures ANOVA with one repeated measures Time factor (before/after) and one Treatment factor.
      You can test for normality using Shapiro-Wilk and for homogeneity of variances using Levene’s test. You can use an epsilon correction instead of testing for sphericity.
      PS. Actually I wrote too quickly. With only two time periods, you don’t need to worry about sphericity.

      • Linda E. says:

        Thank you so much for your response, that does help quite a bit. If my post test scores are somewhat similar, is there a test to calculate the strength of association? I am looking at anxiety and how much it decreases, and whether the intervention group’s anxiety decreases more than the control group’s. I actually expect both groups to have less anxiety on their post test/survey, but I believe the intervention group will have a larger drop in anxiety. So how would I calculate the impact of the intervention?
        Thank you again for providing some clarity in a subject that can cause a great deal of stress 🙂

        • Charles says:

          In general, the strength of a linear association can be measured by the correlation coefficient.
          For these types of tests, however, the effect size measurement (Cohen’s d, r = correlation, partial eta, omega, etc.) is what is usually used.
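
          As one example of such an effect size, Cohen’s d for two independent groups can be computed directly. This is a minimal sketch in pure Python; the function name and data are hypothetical, it uses the pooled standard deviation, and applying it to each participant’s change score (post minus pre) is just one possible way to summarize the impact of an intervention, not the only one.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(x, y):
    """Cohen's d for two independent samples, using the pooled SD."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var)


# Hypothetical change in anxiety score (post minus pre) for each participant.
control_change = [-2, -1, -3, 0, -2, -1]
intervention_change = [-6, -4, -7, -3, -5, -6]
print(cohens_d(intervention_change, control_change))
```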

  58. K says:

    I have a question and will be grateful for any help at this point.

    I have tested four participants on 4 occasions under 4 different conditions, i.e. each condition has 4 sets of data.
    I was told to use the repeated measures ANOVA.
    They all satisfy the assumption of sphericity, but have unequal variance, and 2 of the 4 conditions are not normally distributed.
    Is it still possible to do the ANOVA?

    • Charles says:

      Homogeneity of variances is not an assumption in this type of analysis (you are testing the same participants based on Time and Treatment). Repeated Measures ANOVA is pretty robust to violations of normality, and so you should be fine provided you don’t have outliers or very skewed data.

  59. Katie says:

    I want to run an ANOVA. It’s a 2×2 with 1 repeated measures factor and 1 independent variable. Levene’s test is significant on one occasion, but the p-value is only 0.041. My groups are equal with 21 in each. Am I still able to run this test?

    Many thanks,


  60. george says:

    Very nice posts on ANOVA and I benefit a lot from them. Here is a question:
    Some textbooks say that one of the assumptions is equal variance for the residual terms. Is it equivalent to equal variance of the populations? Thanks.

  61. Lola says:

    Hi Charles,

    I had a quick question, if that’s alright. I ran a Levene’s test and ANOVA on my data (comparing 11 groups with equal sample sizes) but Levene’s test was violated. Here’s the problem: ANOVA says there is significance between my groups but Welch’s ANOVA says there isn’t.

    Is it true that you can still use ANOVA even if Levene’s is violated as long as your sample sizes are equal?


    • Charles says:

      When you say that Levene’s test is violated, I assume that you mean that there is a significant result.
      ANOVA tends to be fairly robust to violations of the homogeneity of variance assumption when the sample sizes are equal, but this is not absolute. If the variances are really different, then I would use Welch’s ANOVA.

  62. Valeria says:


    I ran one-way ANOVA in equal size groups and if I use a 95% confidence level, the result is that my variances are significantly different, but if I use a 99% confidence level it results in non significant difference. Can you help me understand why is this happening or a reference to read about it?

    Thanks in advance! Your website is very helpful.

    • Charles says:

      This should not happen. Are you using Levene’s test?
      If you send me an Excel file with your data and test, I will try to figure out what is happening.

  63. Kevin Bluxome says:

    Hi Charles,

    Are you familiar with the Conover test for homogeneity of variance? It might also be called the Conover-Iman test. It’s another nonparametric test for homogeneity of variance based on ranks, similar to Fligner-Killeen but involving the square of the ranks and their sums. I found the formula on a site somewhere and tried to implement it on a test dataset, but got confused on all the multiple summation symbols. I was just wondering if you had access to it, and if so, if you might consider making a page here about it. Seeing a raw example worked out would be very helpful…thanks very much for all you do here!

    • Charles says:

      I thought about adding Conover Squared Ranks test when I added Fligner-Killeen.
      I will add this to the website shortly.

  64. Gagandeep says:


    Just like we have Fisher/Tukey for classification of means, is there an option to classify variances?

    • Charles says:

      As shown on the referenced webpage, Levene’s test is ANOVA on the absolute value of the residuals. Thus, if you get a significant result, you can apply Tukey’s HSD test as a post hoc test. Here the data won’t be the original data, but the absolute value of the residuals.
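
      The idea that Levene’s test is ANOVA on the absolute residuals can be sketched in a few lines of pure Python. The function names and data here are hypothetical, and residuals are taken from the group means (using group medians instead would give the Brown-Forsythe variant). The transformed data returned by `abs_residuals` is what a post hoc test would then be applied to.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    f = (ss_between / (k - 1)) / (ss_within / (n_total - k))
    return f, k - 1, n_total - k


def abs_residuals(groups):
    """Replace each value by its absolute deviation from its group mean."""
    return [[abs(x - mean(g)) for x in g] for g in groups]


def levene_f(groups):
    """Levene's statistic: one-way ANOVA F on the absolute residuals."""
    return one_way_anova_f(abs_residuals(groups))


# Hypothetical data: the second group is visibly more spread out.
groups = [[10, 11, 12, 10, 12], [5, 15, 2, 18, 10], [9, 10, 11, 10, 10]]
print(levene_f(groups))
```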

  65. Eleanor says:

    Hi Charles,

    I ran a three-way ANOVA. Levene’s found homogeneity was violated and sample sizes were unequal. I’m a little confused about how to resolve this. I read somewhere that you can use a more conservative significance level. Is this true?


    • Charles says:

      If you send me an Excel file with your data and the three-way ANOVA test that you ran, I can try to figure out whether your test is valid.

  66. Pete says:

    Hi Charles,

    Thanks for all the useful information provided on the website. I have a question about my current research. I’m comparing 2 groups in a repeated measures design. My groups are, however, very unequal in sample size. One group has about 300 participants, and the other has around 70 participants. When running the analyses most assumptions are violated, like Levene’s test and Box’s test of equality of covariance.

    When I look for what to do when these assumptions are violated, I see that with unequal variance the Welch’s test is recommended and with unequal sample sizes the Games-Howell test seems to be the one for good use.

    I am wondering, however, with these big differences in sample size (although the standard deviations in both groups are pretty similar), if it is not better to match the participants and make the groups equal (in sample size) instead of going forward with the analyses and using the post-hoc tests to correct for the violated assumptions and unequal sample sizes. I was wondering what you think is the best option in my situation.
    Many thanks in advance!

    • Charles says:


      First, I need to understand whether you are using (1) a repeated measures design or (2) a one-way ANOVA with replications.
      In case (1), you wouldn’t use Welch’s test or Games-Howell. I will assume for now that you have case (2).

      In general, I would avoid removing data to make the group sizes equal. One-way ANOVA is pretty resilient to unequal sample sizes and so I would go with that approach. If the variances are roughly equal, you don’t need to use Welch’s test, but can use ANOVA. Also, if you get a significant result, the real action is in the follow-up tests anyway.

      With unequal group sample sizes but roughly equal variances, you can use Tukey-Kramer instead of Games-Howell as the follow-up test.


  67. Shaun says:

    Hi Charles

    Great website,
    If I sent you a completed Anova test, would you be able to assist me in interpreting it.

  68. Matt Converse says:

    Hi Charles,

    I am loving the RealStats add-in for Excel, and am impressed at how thoughtful you are with responding to comments. Thank you!

    I would like to do a repeated measures ANOVA; however, a Levene’s test has shown that my groups have unequal variances. I have read in one of your responses to a comment that the repeated measures is fairly robust to there being unequal variances, and read (maybe online somewhere), that it is robust up to 1 group having 4x the variance of another; however, my most extreme case shows one group with 11x the variance of another. Do you have any recommendations of what I should do? Further, how to then best do a post-hoc test. I’m familiar with Games-Howell for unequal variances, but not sure if this applies with repeated measures?

    Thanks so much for your help! I’ll be sure to cite the RealStats add-in in my publication.


    • Matt Converse says:

      Actually, I think I figured it out. Once I realized your website had the nice navigation bar along the right side I could make a lot more sense of the content.
      I first did a Friedman test and followed that up with a Wilcoxon Signed-Ranks Test for Paired Samples. Does this sound like the right approach? Is the Friedman test even necessary? Thanks, again! Wonderful website and I love that I could download the Example Workbooks!

    • Charles says:

      It is true that ANOVA is generally pretty robust to violations of up to 3x or 4x (provided the group sizes are equal). For repeated measures ANOVA, however, you are interested in the variances of differences between the groups. I suggest that you look at the following webpage:
      You can use contrasts and Tukey HSD as post-hoc tests. These are described on the following webpage:
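
      To see what “variances of differences” means in practice, the quantities can be computed directly from the data. This is a minimal pure-Python sketch; the function name and measurements are hypothetical, and it only reports the variances of the pairwise differences between conditions (it does not perform a formal sphericity test).

```python
from itertools import combinations
from statistics import variance


def difference_variances(conditions):
    """Variance of the paired differences for every pair of conditions.

    `conditions` is a list of equal-length lists, one per condition,
    with the same subject at the same position in each list.
    """
    result = {}
    for (i, a), (j, b) in combinations(enumerate(conditions), 2):
        result[(i, j)] = variance([x - y for x, y in zip(a, b)])
    return result


# Hypothetical data: 5 subjects measured under 3 conditions.
conditions = [
    [12, 15, 11, 14, 13],
    [14, 18, 12, 17, 15],
    [20, 19, 25, 16, 30],
]
for pair, var in difference_variances(conditions).items():
    print(pair, var)
```

      If these variances are of very different sizes, that is the situation in which a correction for a sphericity violation becomes relevant.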

  69. Mohadese says:

    Hi Charles,

    Thank you for your great website.
    I have questions about my recent project. It is an ERP study and there are 6 participants in each group. My questions are regarding the interpretation of behavioral data. Would you mind helping me with this matter?

  70. Ibrahim Bello says:

    Hello Charles,
    I want to use Levene’s test to check the variability of my data, but I can’t remember the formula or how to carry out the process manually, without the help of software.
    Thank you in advance
