Assumptions for ANOVA

To use the ANOVA test we make the following assumptions:

  • Each group sample is drawn from a normally distributed population
  • All populations have a common variance
  • All samples are drawn independently of each other
  • Within each sample, the observations are sampled randomly and independently of each other
  • Factor effects are additive

The presence of outliers can also cause problems. In addition, we need to make sure that the F statistic is well behaved. In particular, the F statistic is relatively robust to violations of normality provided:

  • The populations are symmetrical and uni-modal.
  • The sample sizes for the groups are equal and greater than 10

In general, as long as the sample sizes are equal (called a balanced model) and sufficiently large, the normality assumption can be violated provided the samples are symmetrical or at least similar in shape (e.g. all are negatively skewed).
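
One rough way to check whether the group samples are "similar in shape" is to compare their sample skewness. The sketch below is only an illustration: the data are hypothetical, the helper name `sample_skewness` is made up for this example, and the moment-based coefficient used here is just one of several skewness definitions.

```python
from statistics import mean

def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (0 for perfectly symmetric data)."""
    m = mean(values)
    n = len(values)
    m2 = sum((x - m) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - m) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical balanced design: three groups of 11 observations each
groups = {
    "A": [4, 5, 5, 6, 6, 6, 7, 7, 8, 9, 12],
    "B": [3, 4, 4, 5, 5, 5, 6, 6, 7, 8, 11],
    "C": [5, 6, 6, 7, 7, 7, 8, 8, 9, 10, 14],
}

for name, values in groups.items():
    # Same sign and similar magnitude suggests similarly shaped samples
    print(name, len(values), round(sample_skewness(values), 2))
```

Skewness values with the same sign and similar magnitude across groups are consistent with the "similar in shape" condition; a box plot per group is a useful visual complement.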

The F statistic is not so robust to violations of homogeneity of variances. A rule of thumb for balanced models is that if the ratio of the largest variance to smallest variance is less than 3 or 4, the F-test will be valid. If the sample sizes are unequal then smaller differences in variances can invalidate the F-test. Much more attention needs to be paid to unequal variances than to non-normality of data.
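
The rule of thumb above can be checked with a few lines of code. This is a minimal sketch with hypothetical data; the function name `variance_ratio` is an assumption made for the example.

```python
from statistics import variance

def variance_ratio(groups):
    """Ratio of the largest to the smallest sample variance across groups."""
    variances = [variance(g) for g in groups]  # sample variance (n - 1 denominator)
    return max(variances) / min(variances)

# Hypothetical balanced data: three groups of six observations
groups = [
    [12, 15, 14, 10, 13, 16],
    [22, 25, 19, 28, 24, 21],
    [31, 30, 34, 29, 33, 32],
]

ratio = variance_ratio(groups)
print(f"largest/smallest variance = {ratio:.2f}")
print("within the rule of thumb" if ratio < 3 else "look more closely at the variances")
```

The rule is stated for variances, so a ratio of standard deviations has to be squared first: for instance, standard deviations in the ratio 1 : 0.8 correspond to a variance ratio of about 1.56.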

We now look at how to test for violations of these assumptions and how to deal with any violations when they occur.

87 Responses to Assumptions for ANOVA

  1. Hin Yan says:

    Sir, I have a question.

    I am running a series of one-way ANOVA in a 3-group comparison.
    All groups have sample sizes > 300 but are unbalanced (e.g. 300, 500, 600; the ratio is always around 1:1.5:2). Levene tests are significant in some comparisons (let’s say p<0.01), but I note that the ratios of largest to smallest variance are very small (e.g. largest SD : smallest SD = 1:0.8). Could I say that ANOVA is still robust in these conditions even if it violates the assumption of equal variance?

    or am i supposed to employ Welch's test?


    • Charles says:

      Hin Yan,
      Anova is not that robust to violations of equal variance, especially with unbalanced models. Welch’s Anova (or resampling) is probably a better choice.
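
For readers who want to see what Welch's ANOVA computes, here is a minimal sketch of the usual textbook formulas for the statistic and its degrees of freedom. It is not taken from the website; the function name `welch_anova` is made up for this example, and it stops short of the p-value, which would come from the upper tail of an F(df1, df2) distribution.

```python
from statistics import mean, variance

def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA (unequal variances allowed)."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Each group is weighted by n_i / s_i^2, so noisy groups count for less
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_w = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_w

    between = sum(w * (m - weighted_mean) ** 2 for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total_w) ** 2 / (n - 1) for w, n in zip(weights, sizes))

    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)  # usually not an integer
    return f_stat, df1, df2

# Hypothetical unbalanced data with visibly different spreads
groups = [
    [10, 11, 12, 11, 10],
    [14, 18, 10, 20, 9, 16, 12],
    [13, 25, 7, 30, 2, 21, 15, 19, 9],
]
f_stat, df1, df2 = welch_anova(groups)
print(f"F = {f_stat:.3f}, df1 = {df1}, df2 = {df2:.2f}")
```

Because each group enters through its own variance, the statistic does not rely on a pooled variance estimate, which is the part of the classical F-test that breaks down when variances differ in an unbalanced design.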

    • Pauline Dugas says:

      I need an answer to this question and how they got the answer.
      Which of the following is NOT an assumption of the one-way randomized ANOVA?
      a. the data are interval or ratio
      b. the underlying distribution is skewed
      c. the variance among the populations being compared are homogeneous
      d. all of the alternatives are correct.
      Thank You, Pauline.

  2. Alessia says:

    Hi Charles,

    I have a question related to repeated-measures ANOVA, basically I had a sample of 8 participants that were tested on three different conditions. I did get a significant result, is this valid?

  3. Li says:

    I have some repeated measurements for the same sample at 4 time points and I want to test if the difference between their means is significant. The assumption of independent measurements is violated. How can I resolve this? What do you suggest?

  4. Luke says:

    Hi Charles,

    Forgive me if I have this confused, but I believe that you have made a mistake in stating that an assumption of ANOVA is that “Each group sample is drawn from a normally distributed population”. The link that you provided for correcting for this assumption encourages identifying this violation by plotting histograms. This link further says that the non-normality may be fixed by transformations of the response variable. I think this is being confused with the fact that transformations of the response variable are a way to correct for a violation of the assumption of linearity.

    From what I have learnt, the assumption is that the residuals are normally distributed, which can occur when the response variable is not normally distributed.

    • Charles says:

      For one-way Anova with say 4 groups (aka levels), you need to make sure that each group is normally distributed.
      For two-way Anova with say 2 groups for factor A and 3 groups for factor B, you need to make sure that all 2 x 3 = 6 groups are normally distributed.
      Fortunately, Anova is pretty robust to violations of normality.
      Normality can be corrected (if necessary) by using a transformation (not just to address linearity) or by using a different test.

    • Charles says:

      One more thing. I read the links you provided and they did seem to be confusing. Here is another link which may be helpful:

  5. Matt says:

    Hi Charles,

    I work with bacteria in soil and water. I am running statistics (or trying to) on my data, which constantly violates normality and equal variances. I know the Welch ANOVA is recommended for unequal variances and that the Kruskal-Wallis ANOVA is for non-normality. What can be done if both of these ANOVA assumptions are violated at the same time? I understand transformations are useful in combating variance issues but I’d like to keep that as a last resort. Do you have any recommendations?


  6. serena says:

    Hi Charles,

    I have a convenience sample of size 60. My IV is major of study (three levels) and my DV is hours of study a week. I would like to run an ANOVA to determine differences in the means of these three groups. What are the consequences (both theoretical and practical) of the fact that my sample is not random? Will it “just” limit my ability to generalize my results? Or will it prevent me to use the test altogether? What do you suggest in these cases?

    Another, related question: other colleagues of mine also use both ANOVA and the t-test with non-random samples (which can vary in size from 20 to 100) but, and this is what puzzles me, they say that they do so without any inferential goal in mind… Basically they told me that all they want to do by using these tests is check whether the means are different among the groups in their sample. BUT, and this is my question, why run these tests if you do not have any inferential goal in mind? By inferential I mean saying something about the population from your sample (even if non-random). In my understanding, these tests are made for inferential statistics. What do you think about it? Is there something I am missing here?

    I very much appreciated your website, and will greatly benefit from your advice. Thanks so much in advance for taking the time to answer my questions!

    • Charles says:


      The whole point of using ANOVA is to generalize your results from the random samples to the corresponding populations. If you are only interested in using the results for the given sample, then, as you have said, there is no point in doing any inferential analysis. You can simply compare the sample means and draw no conclusions about the population means.

      You say that the samples are not random, but how specifically were they drawn? Very often samples that are called random are not really random. E.g. in a lot of university research samples are drawn from the student body, based on students volunteering to participate. This is not really a random sample, but lots of research papers are written based on such samples.


      • serena says:

        Thank you very much! The sample I believe is convenient: I am asking people in my class (they come from different majors) how many hours a week they study using an online survey. I agree with you that sometimes we think we are collecting a random sample but we really aren’t. I guess my population can be my class in this case, as it is a very large class and I am only collecting a sample of 60 students in there?

        • Charles says:

          If you are sharing the results with other people, the important thing is to describe accurately the limitations of your sampling technique even if you use the standard analysis tool assuming random samples.

          • serena says:

            I have another quick question for another stats assignment. Thanks in advance for your help!

            I am working on a stats assignment for which I am required to design a little study, collect my data, and run an ANOVA. In my study, my IV would be “social media platform used”, with three levels being: Snapchat, Twitter, Facebook. The DV is the number of posts posted per day.

            The three categories of my IV are not mutually exclusive: should they be in order to run an ANOVA? If this is a potential issue, what is the best way to deal with it? Do I have to ask people to self-classify in one group to begin with? Or could I ask let’s say subject1 to provide an answer for each of the three groups, and then subsequently put subject1 in one of the three groups based on the highest score (for ex if subject 1 says Snapchat 2, Twitter 4, Facebook 6, then I would assign the subject to the Facebook group)? Is this theoretically correct? If I do so, would it be a within group research design (with the same subject measured three time)?

            My DV should measure the hours spent on each of these platforms (by the same subject) or the hours spent in general on social media?
            I wonder if I might violate the assumption of independency of the samples.

            Thanks so much!

          • Charles says:

            I am reluctant to answer someone’s homework assignment, but the approach to use really depends on what the objective of the study is. One of your approaches might work for some situations, but not for most others. Another approach is to view this as a repeated measures ANOVA where you allow multiple types of measurements per subject.

  7. John says:

    I have a general question on your article – which I found very useful.

    A question that constantly comes up is related to the nature of the treatments in an experiment, and whether or not ANOVA and means separation is acceptable, or regression analysis should be performed. Following is the question:

    If treatment means are not independent of each other, is it still acceptable to do ANOVA and means separation, or is regression analysis the proper approach? For example, if treatments represent a continuum of concentration, such as 0X, 0.5X, 1.0X, 1.5X and 2.0X, to me the treatments are not independent and the samples are therefore not independent of each other. Am I reading your article correctly?

    I am sorry about my terrible grammar in the previous post. I failed to review before submitting. My bad!!

    Thank you so much for your time and trouble.

    • Charles says:

      In your example, you have 5 treatment groups. If you use a sample of 50 then as long as you assign 10 elements from this sample to each group at random, then you have independent group samples.

  8. Piero says:

    Dear Dr. Charles,

    I have to perform a set of unpaired t-test on independent samples on a large number of endpoint variables (that is, I have to compare several male vs. female population characteristics).
    For some variable, normality assumption is violated; for some other, homogeneity of variances is violated; for some other variable, both assumptions are not met.
    The two samples are almost equal size (n about 50).

    What is the best non-parametric test to use for such cases?
    Do you think that, for a better consistency of all results, I should use the same method for testing all endpoint variables, independently from which assumptions are violated (if any) for each single variable?

    Thank you very much for your valuable help.
    Best Regards

    • Charles says:


      You can use the t test even if the variances are unequal. The test is pretty robust even if normality is violated provided that the data is reasonably symmetric.

      If you meet the assumptions, the Mann-Whitney test is usually the best nonparametric test to use.
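
The t test for unequal variances referred to above is usually called Welch's t test. Below is a minimal sketch of its statistic and the Welch-Satterthwaite degrees of freedom, using hypothetical data; the function name `welch_t` is an assumption for this example, and the p-value (from a t distribution with `df` degrees of freedom) is left out.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for the two-sample t test without assuming equal variances."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation; generally not an integer
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical measurements for two independent samples
males = [71.2, 68.5, 74.1, 69.9, 72.3, 70.4]
females = [64.8, 70.1, 59.3, 73.6, 61.0, 67.7, 76.2]
t, df = welch_t(males, females)
print(f"t = {t:.3f}, df = {df:.2f}")
```

When the two sample variances and sizes are similar, `df` comes out close to the pooled value n1 + n2 - 2 and the result is nearly the same as the classical t test.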


  9. demonceau says:

    Dear Mister Zaiontz,
    I would like to observe how physical performances outcomes are influenced by 2 categorical variables (1- physical activity level (low vs high) and 2-presence of a disease (0-1)). I would like to use a 2-way ANOVA where y= physical outcome, x1= presence of the disease, x2= sedentary/exercising.
    The problem: The sample size is then not the same in each subgroup. So the model is unbalanced. One of the assumptions for the use of this type of ANOVA is therefore not met.
    I read somewhere that we can get round this assumption by using type I sum of squares (sequential) instead of the usual type III SS (unique). Is it true? Can we draw the same conclusions about the significance of the effects of the 2 variables and their interaction? I guess that it would be too easy and there must be some tricky considerations?
    What is your opinion? Should I rather reduce/match the sample size in order to get equal groups?
    It is not the first time that I find useful and clear answers to my questions on your website and I’m very grateful. I hope that my junk language was understandable for stat expert. Thx a lot.

  10. A. S. says:

    Hi, you state that one of the assumptions are “Factor effects are additive” – is this an assumption that needs to be tested? How can I do that? Can you explain what this means a little?

    • Charles says:

      Additive just means that you can use the usual ANOVA equations to model what is going on (as described on the website). I don’t test for this assumption explicitly. I probably should drop this assumption from the list since it is confusing.

  11. André says:

    For the ANOVA test, can’t we assume normality by the central limit theorem if we have a large enough sample size?

    • Charles says:

      This is likely to be true, but you should check for normality just in case. As long as the data is not too far from normality you should be ok.

  12. Tofi says:

    I am conducting a one-way within-subjects ANOVA but sphericity is violated. I violated Mauchly’s test and got a value of .702, so I guess I have to use the Greenhouse-Geisser correction. How would I report that in the results? Do I mention that Mauchly’s test was violated and report the Greenhouse-Geisser correction instead?

    • Charles says:

      How you report the results really depends on the requirements for publications in your discipline, but in general I would report that Mauchly’s test was violated and report the Greenhouse-Geisser correction (and even the Huynh-Feldt correction).

      • Tofi says:

        Thank you so much for your reply.
        I am writing my thesis and I have been reading around it and wasn’t sure on what to do . Will I still be okay to carry on with my ANOVA analysis even if I have this violation?

  13. nyasha says:

    what are the effects of violating the factor effects since they are additive

  14. don says:

    Outline the assumptions which underlie the analysis of variance (ANOVA) and the possible methods for their detection and remedy?

    I understand the first part of the question … the assumptions that underlie ANOVA but what are their possible methods of detection and remedy ?

    • Charles says:


      The brief answer is as follows:

      Normality – ANOVA is quite robust to violations of normality, especially if each group is reasonably symmetric. If this assumption is strongly violated then you can use an alternative test (e.g. Kruskal-Wallis or Brown-Forsythe) or a transformation could be employed

      Outliers – If some data are outliers, then you should check to make sure that there wasn’t some error in measurement or in copying the data. If that is not the case, then you can use a rank-based test instead (e.g. Kruskal-Wallis), use a transformation, or go ahead and perform the ANOVA twice, once with the outlier and once with the outlier removed.

      Homogeneity of Variances – This is covered on the website. See


  15. tinashe says:

    thank u sir

  16. Xia says:

    Hello Charles,
    May I ask some questions about ANOVA and two sample t-test?
    1) For experiment 1, there are three experimental groups. Two group data sets passed normality test, one failed (P=0.045). I used Kruskal-Wallis One Way Analysis of Variance on Ranks to compare the three groups. Is this the right choice? Or I should use ANOVA, since the p values is close to 0.05? What if the P values for the third group is 0.014?
    2) For experiment 2, I have two sets of data. Group A: 1.12, 1.07, 1.12, normality test P<0.001. Group B: 0.05, 0.12, 0.35, normality test P=0.430. Because group A failed normality test, I used Mann-Whitney Rank sum test to compare the two groups, with P=0.077. However, if you look at the raw data, group A values are much bigger than group B values. It does not make sense that there is no significant difference between these two groups. Just for curiosity, I also run t-test to compare these two groups, with P=0.000542. In this situation (two data sets, only one pass the normality test), is the nonparametric test the correct test I have to use to compare these two groups?
    Thank you very much!

    • Charles says:

      Hello Xia,

      1) For data that is so close to normality (p = .045), generally I would just use ANOVA provided the homogeneity of variances assumption is met. ANOVA is much more sensitive to violations of this assumption and is pretty robust to violations of normality. Even if one group has p = .014 when testing normality probably ANOVA is the right way to go provided the data is relatively symmetrical and there aren’t problems with outliers. You can use a box plot to see whether the data is relatively symmetric.

      2) For two samples, if each group is relatively symmetric, I would use the t test. Without seeing your data I can’t say why the results from the MW test are so different from those of the t test; generally they would be similar if the data is symmetric.


      • Xia says:

        Hello Charles,
        Thank you very much for your reply!
        1) For experiment 1, both data sets that failed the normality test (p=0.045 and p=0.014) are not symmetric, according to the box plot. Therefore, a nonparametric test should be used for the analysis, right?
        2) For experiment 2, there are two experimental groups. I only have three values for each group. The data for group A are: 1.12, 1.07, 1.12 (normality test P<0.001). The data for group B are: 0.05, 0.12, 0.35 (normality test P=0.430). The results from t-test (p=0.000542) and Mann-Whitney Rank sum test (p=0.077) are very different.
        Thank you!

        • Charles says:


          1) Yes you would normally use a nonparametric test.

          2) With only three data points in each group, I would not expect too much from either statistical test. Given that the first group is symmetric (at least from what you can see from the box plot) and the second group is normal, I would use the t test result. Also, just looking at the data indicates that the population means are likely to be different. Again, with such small samples I would be very cautious about any conclusions.
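
A side note on why the rank test cannot reach p < 0.05 in experiment 2: with three observations per group there are only 20 ways to split six values into two groups of three, so the smallest possible exact two-sided p-value is 2/20 = 0.1. The sketch below enumerates those splits for the data quoted in the question; the helper names are made up for this example.

```python
from itertools import combinations

def u_statistic(a, b):
    """Mann-Whitney U for sample a: number of pairs with a > b (ties count 0.5)."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)

def exact_two_sided_p(a, b):
    """Exact two-sided p-value obtained by enumerating every split of the pooled data."""
    pooled = list(a) + list(b)
    n_a = len(a)
    centre = n_a * len(b) / 2  # expected U under the null hypothesis
    observed = abs(u_statistic(a, b) - centre)
    extreme = total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        first = [pooled[i] for i in idx]
        second = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(u_statistic(first, second) - centre) >= observed:
            extreme += 1
    return extreme / total

group_a = [1.12, 1.07, 1.12]
group_b = [0.05, 0.12, 0.35]
print(exact_two_sided_p(group_a, group_b))  # 0.1
```

Every value in group A exceeds every value in group B, which is the most extreme arrangement possible, yet the exact p-value is still 0.1. A reported value below 0.1 for this test therefore has to come from an approximation rather than the exact distribution, and it illustrates how little power a rank test has with three observations per group.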


  17. elsayedamr says:

    Thank you very much Sir for your effort : I have 2 questions:
    1. How can I judge the factor effect and how could I judge is it additive or not …?
    2. if the assumption of homogeneity of variance is not met .. i.e. significant Levene test … what do you recommend to use Welch ANOVA or Brown-Forsythe test..?
    thank you very much again.

    • Charles says:

      1. I don’t understand what you mean by judging the factor effect.

      2. Usually Welch ANOVA.


      • elsayedamr says:

        I mean the last assumption “Factor effects are additive” .. I could not understand it .. and how to test for it.

        • Charles says:

          This assumption is based on the fact that ANOVA is essentially a type of linear regression. See Regression Model for ANOVA.

          I wouldn’t explicitly worry about this assumption. The usual ANOVA tests will essentially show whether this assumption has been met.
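
As a hedged illustration of what "additive" usually means here, the standard textbook form of the one-way model is shown below. The notation is an assumption made for this example and may differ from the symbols used on the website.

```latex
% One-way ANOVA: observation j in group i is the sum of three separate pieces
y_{ij} = \mu + \alpha_i + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim N(0, \sigma^2) \ \text{independently}

% Two-way ANOVA with interaction: again a sum of effects plus random error
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}
```

Here \mu is the overall mean, \alpha_i and \beta_j are the factor effects, and the error term has the same variance \sigma^2 in every group. "Additive" means the effects enter as a sum rather than, say, as a product; when effects are multiplicative, a log transformation of the response is a common way to bring the data closer to this form.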


  18. Rae says:

    Can ANOVA still be used if most of the data sets show normality but not all of them? Out of 21 data sets, 3 don’t show normality according to shapiro-wilk and Kolmogorov-smirnov tests.

    • Charles says:

      ANOVA is quite robust for violations of normality. It should be valid provided the data in these three groups are not too skewed.

  19. Raheem Khan says:

    what is the validity of anova?

    • Charles says:

      A test is valid if it measures what it claims to measure. I don’t think that ANOVA is the type of test this definition is intended to apply to, but if I do apply it to ANOVA, I guess I have to conclude that when the assumptions of ANOVA are met then ANOVA does measure what it is supposed to subject to the type I and type II error rates.

  20. Lola says:

    Do the samples have to be random or can you use this test on data collected from random samples?

    Also same question for chi-square and t-tests?


    • Charles says:

      The data is collected from random samples, but the data is not random for any of these tests.

      • Lola says:

        So it’s not a rule that you can only use these tests if the sample was collected in a random manner?

        • Charles says:

          It depends on what you mean by random. For most tests, samples should indeed be collected in a random manner. That doesn’t make the numbers random. These values must be collected randomly from the population that we are studying.

          • Lola says:

            I mean for example if we use survey data should the sample of respondents be for example stratified or systematic taken from a complete sampling frame whereby all members of the population stand an equal chance of selection. As opposed to say a convenience based or self select survey?

          • Charles says:

            Obviously a random sample is better. Many studies are conducted with self-selected participants because it is easier to get a sample in this way. Although statistical analyses can be made, the results may not be reliable since the sample is not random.

  21. Lionel Isong says:

    Before now, my major concern was the ”assumptions of the ANOVA”, but this your analysis has been of great help. I visited and was able to surprise my lecturer during ST 525 lecture and i did well even in exams. I’ve also developed interest for design of experiments in advancement. I REMAIN GRATEFUL.

  22. ammara ilyas says:

    These assumption are used after fitting the model???

    • Charles says:

      You should make sure the assumptions hold before you spend a lot of energy building and analyzing the model. ANOVA is pretty robust to violations of normality, but not so robust to violations of homogeneity of variances. Thus if the variances are very different, the results of ANOVA can be completely inaccurate.
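
One common check of homogeneity of variances before running ANOVA is Levene's test. The sketch below computes the median-centred version of the statistic by hand, which amounts to a one-way ANOVA F statistic on the absolute deviations from each group's median. The function name `levene_median` and the data are hypothetical; the p-value, which would come from an F(k - 1, N - k) distribution, is not computed here.

```python
from statistics import mean, median

def levene_median(groups):
    """Median-centred Levene statistic W; large values suggest unequal variances."""
    # Replace each observation by its absolute distance from the group median
    z = [[abs(x - median(g)) for x in g] for g in groups]
    k = len(z)
    n_total = sum(len(g) for g in z)
    group_means = [mean(g) for g in z]
    grand_mean = sum(sum(g) for g in z) / n_total

    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(z, group_means))
    within = sum((x - m) ** 2 for g, m in zip(z, group_means) for x in g)
    # Undefined (division by zero) if every deviation equals its group mean
    return (n_total - k) / (k - 1) * between / within

# Hypothetical data: the third group is clearly more spread out
groups = [
    [20, 21, 19, 22, 20, 21],
    [25, 27, 23, 26, 24, 28],
    [30, 41, 22, 45, 18, 36],
]
print(f"W = {levene_median(groups):.3f}")
```

A large W relative to the F(k - 1, N - k) distribution points to unequal variances, in which case Welch's ANOVA is the usual fallback.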

  23. Sunny Toka says:

    I am trying to use ANOVA for a statistical analysis of flood hazards. How do I use rainfall data and flood events in my analysis, since there is no flood data in Africa?
    urgent attention sir

  24. Chris Tudeka says:

    what are the necessary conditions for orthogonal contrast?

  25. Steve says:

    Could you please enlighten me on this: anxiety is assessed with the use of three groups of participants utilising differing amounts of resources (no resources, two resources and five resources). The groups are also assessed in terms of gender differences between male and female. Is this a one-way ANOVA, a factorial ANOVA or something else? If something else, what would it be?

    • Charles says:

      In your description you have defined two factors. Factor A = resources (3 levels) and B = gender (2 levels). This fits the description of factorial anova. Of course, if you don’t care about the influence of gender, then you won’t include factor B in the model, which results in a one-way anova.

  26. joe says:


    You say
    “All populations have a common variance”

    but you are calculating the variance SSB.
    Maybe it’s because the assumption is about the population variance and you calculate the sample variance?

    • Charles says:

      As usual, inferential statistics makes inferences about populations based on the observed sample. Since there is some chance that what we conclude based on the sample is not indeed true of the population, the results are probabilistic in nature.

  27. Abiodun Oluwasegun says:

    There is this question I was asked to solve. It goes thus: “what are the assumptions required in the use of ANOVA for a regression analysis”. Does this mean the ANOVA assumptions or the regression assumptions? I need an urgent answer.

    • Charles says:

      It sounds like they want the assumptions for regression, but I don’t know the intention of someone else when they ask such a question.

  28. mike says:

    If my whole population is being used (so I don’t have sample). Can I use ANOVA for that? Could someone please let me know his/her idea, and also if knows any reference regarding that.

    • Charles says:

      If you have the entire populations, then you don’t need inferential statistics (since you have 100% of the information); you only need to use descriptive statistics. E.g. suppose you have three groups with means 23.4, 26.1 and 26.2; then you know that the three groups have different means since you can see that the values are different. You could use the effect sizes to get an idea of how big the differences between the means are.

  29. Jerome says:

    help me and provide answers to the following questions: (1) assumptions of analysis of variance? (2) implications of non parametrics?

  30. Josue says:

    Do you have any formal reference regarding the F statistic is relatively robust to violations of normality provided the two listed conditions? This reference would be valuable for a text I need to write for school.
    Thank you.

    • Charles says:

      There are a number of references regarding robustness to violations to normality, with slight differences from one to the other. Here is one such reference:

      Zar, J. H. (2010). Biostatistical Analysis, 5th ed. Pearson.


  31. muhammad saleem says:

    is there any data u provide to check the ANOVA assumptions , their violations and effect on results. kindly show these things on real data

  32. moses says:

    nice summary

  33. Michael Marks says:

    Does a one-way ANOVA require that the responses be linear with group?
    Also is a one-way ANOVA applicable for analysis of data such as response to increasing doses of a drug?

    • Charles says:

      1. There is no linearity assumption for ANOVA
      2. Yes, ANOVA can be used to compare responses to different drug doses.

  34. Colin says:


    You wrote: “the F statistic is relatively robust to violations of normality provided: The cell sizes are equal and greater than 10”. Does the cell size mean sample size?


    • Charles says:

      It means the group sample sizes. I have just revised the webpage to try to make this clearer.

  35. Noel Chimwanda says:

