Assumptions for ANOVA

To use the ANOVA test we make the following assumptions:

  • Each group sample is drawn from a normally distributed population
  • All populations have a common variance
  • All samples are drawn independently of each other
  • Within each sample, the observations are sampled randomly and independently of each other
  • Factor effects are additive

The presence of outliers can also cause problems. In addition, we need to make sure that the F statistic is well behaved. In particular, the F statistic is relatively robust to violations of normality provided:

  • The populations are symmetrical and unimodal
  • The sample sizes for the groups are equal and greater than 10

In general, as long as the sample sizes are equal (called a balanced model) and sufficiently large, the normality assumption can be violated provided the samples are symmetrical or at least similar in shape (e.g. all are negatively skewed).

The F statistic is not so robust to violations of homogeneity of variances. A rule of thumb for balanced models is that if the ratio of the largest variance to smallest variance is less than 3 or 4, the F-test will be valid. If the sample sizes are unequal then smaller differences in variances can invalidate the F-test. Much more attention needs to be paid to unequal variances than to non-normality of data.
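
As a rough illustration of this rule of thumb, the following Python sketch computes the ratio of the largest to the smallest sample variance for a balanced design. The data and the function name `variance_ratio` are hypothetical and not taken from this page; the cutoff of 3 is simply the lower end of the range quoted above.

```python
from statistics import variance


def variance_ratio(groups):
    """Return the largest sample variance divided by the smallest.

    Assumes every group has at least two observations and a non-zero variance.
    """
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


# Hypothetical balanced data: three groups of six observations each.
groups = [
    [23, 25, 21, 27, 24, 26],
    [30, 28, 35, 31, 27, 33],
    [22, 29, 18, 31, 25, 20],
]

ratio = variance_ratio(groups)
print(f"largest/smallest variance ratio: {ratio:.2f}")
if ratio < 3:
    print("within the rule of thumb for a balanced model")
else:
    print("variances differ a lot; check homogeneity more carefully")
```

A ratio like this is only a screening device; it does not replace a formal test of homogeneity of variances.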

We now look at how to test for violations of these assumptions and how to deal with any violations when they occur.

114 Responses to Assumptions for ANOVA

  1. Noel Chimwanda says:


  2. Colin says:


    You wrote: “the F statistic is relatively robust to violations of normality provided: The cell sizes are equal and greater than 10”. Does the cell size mean sample size?


    • Charles says:

      It means the group sample sizes. I have just revised the webpage to try to make this clearer.

  3. Michael Marks says:

    Does a one-way ANOVA require that the responses be linear with group?
    Also is a one-way ANOVA applicable for analysis of data such as response to increasing doses of a drug?

    • Charles says:

      1. There is no linearity assumption for ANOVA
      2. Yes, ANOVA can be used to compare responses to different drug doses.

  4. moses says:

    nice summary

  5. muhammad saleem says:

    is there any data u provide to check the ANOVA assumptions , their violations and effect on results. kindly show these things on real data

  6. Josue says:

    Do you have any formal reference regarding the F statistic is relatively robust to violations of normality provided the two listed conditions? This reference would be valuable for a text I need to write for school.
    Thank you.

    • Charles says:

      There are a number of references regarding robustness to violations to normality, with slight differences from one to the other. Here is one such reference:

      Zar, J. H. (2010) Biostatistical Analysis, 5th Ed. Pearson


  7. Jerome says:

    help me and provide answers to the following questions: (1) assumptions of analysis of variance? (2) implications of non parametrics?

  8. mike says:

    If my whole population is being used (so I don’t have sample). Can I use ANOVA for that? Could someone please let me know his/her idea, and also if knows any reference regarding that.

    • Charles says:

      If you have the entire populations, then you don’t need to use inferential statistics (since you have 100% of the information); you only need to use descriptive statistics. E.g. suppose you have three groups with means 23.4, 26.1 and 26.2, then you know that the three groups have different means since you can see that the values are different. You could use the effect sizes to get an idea of how big the differences between the means are.

  9. Abiodun Oluwasegun says:

    There is this question I was asked to solve. It goes thus: “what are the assumptions required in the use of ANOVA for a regression analysis”. Does this mean the ANOVA assumptions or the regression assumptions? I need an urgent answer.

    • Charles says:

      It sounds like they want the assumptions for regression, but I don’t know the intention of someone else when they ask such a question.

  10. joe says:


    You say
    “All populations have a common variance”

    but you are calculating the variance SSB.
    Maybe it’s because the assumption is about the population variance and you calculate the sample variance?

    • Charles says:

      As usual, inferential statistics makes inferences about populations based on the observed sample. Since there is some chance that what we conclude based on the sample is not indeed true of the population, the results are probabilistic in nature.

  11. Steve says:

    Could you please enlighten me on this: anxiety is assessed with the use of three groups of participants utilising differing amounts of resources (no resources, two resources and five resources). The groups are also assessed in terms of gender differences between male and female. Is this a one-way ANOVA, a factorial ANOVA or something else? If something else, what would it be?

    • Charles says:

      In your description you have defined two factors. Factor A = resources (3 levels) and B = gender (2 levels). This fits the description of factorial anova. Of course, if you don’t care about the influence of gender, then you won’t include factor B in the model, which results in a one-way anova.

  12. Chris Tudeka says:

    what are the necessary conditions for orthogonal contrast?

  13. Sunny Toka says:

    I am trying to use ANOVA in a statistical analysis of flood hazards. How do I use rainfall data and flood events in my analysis, since there is no flood data in Africa?
    urgent attention sir

  14. ammara ilyas says:

    Are these assumptions used after fitting the model?

    • Charles says:

      You should make sure the assumptions hold before you spend a lot of energy building and analyzing the model. ANOVA is pretty robust to violations of normality, but not so robust to violations of homogeneity of variances. Thus if the variances are very different, the results of ANOVA can be completely inaccurate.

  15. Lionel Isong says:

    Before now, my major concern was the ”assumptions of the ANOVA”, but this your analysis has been of great help. I visited and was able to surprise my lecturer during ST 525 lecture and i did well even in exams. I’ve also developed interest for design of experiments in advancement. I REMAIN GRATEFUL.

  16. Lola says:

    Do the samples have to be random or can you use this test on data collected from random samples?

    Also same question for chi-square and t-tests?


    • Charles says:

      The data is collected from random samples, but the data is not random for any of these tests.

      • Lola says:

        So it’s not a rule that you can only use these tests if the sample was collected in a random manner?

        • Charles says:

          It depends on what you mean by random. For most tests, samples should indeed be collected in a random manner. That doesn’t make the numbers random. These values must be collected randomly from the population that we are studying.

          • Lola says:

            I mean for example if we use survey data should the sample of respondents be for example stratified or systematic taken from a complete sampling frame whereby all members of the population stand an equal chance of selection. As opposed to say a convenience based or self select survey?

          • Charles says:

            Obviously a random sample is better. Many studies are conducted with self-selected participants because it is easier to get a sample in this way. Although statistical analyses can be made, the results may not be reliable since the sample is not random.

  17. Raheem Khan says:

    what is the validity of anova?

    • Charles says:

      A test is valid if it measures what it claims to measure. I don’t think that ANOVA is the type of test this definition is intended to apply to, but if I do apply it to ANOVA, I guess I have to conclude that when the assumptions of ANOVA are met, ANOVA does measure what it is supposed to, subject to the type I and type II error rates.

  18. Rae says:

    Can ANOVA still be used if most of the data sets show normality but not all of them? Out of 21 data sets, 3 don’t show normality according to the Shapiro-Wilk and Kolmogorov-Smirnov tests.

    • Charles says:

      ANOVA is quite robust for violations of normality. It should be valid provided the data in these three groups are not too skewed.

  19. elsayedamr says:

    Thank you very much Sir for your effort : I have 2 questions:
    1. How can I judge the factor effect and how could I judge is it additive or not …?
    2. if the assumption of homogeneity of variance is not met .. i.e. significant Levene test … what do you recommend to use Welch ANOVA or Brown-Forsythe test..?
    thank you very much again.

    • Charles says:

      1. I don’t understand what you mean by judging the factor effect.

      2. Usually Welch ANOVA.


      • elsayedamr says:

        I mean the last assumption “Factor effects are additive” .. I could not understand it .. and how to test for it.

        • Charles says:

          This assumption is based on the fact that ANOVA is essentially a type of linear regression. See Regression Model for ANOVA.

          I wouldn’t explicitly worry about this assumption. The usual ANOVA tests will essentially show whether this assumption has been met.


  20. Xia says:

    Hello Charles,
    May I ask some questions about ANOVA and two sample t-test?
    1) For experiment 1, there are three experimental groups. Two group data sets passed the normality test, one failed (P=0.045). I used Kruskal-Wallis One Way Analysis of Variance on Ranks to compare the three groups. Is this the right choice? Or should I use ANOVA, since the p value is close to 0.05? What if the P value for the third group is 0.014?
    2) For experiment 2, I have two sets of data. Group A: 1.12, 1.07, 1.12, normality test P<0.001. Group B: 0.05, 0.12, 0.35, normality test P=0.430. Because group A failed normality test, I used Mann-Whitney Rank sum test to compare the two groups, with P=0.077. However, if you look at the raw data, group A values are much bigger than group B values. It does not make sense that there is no significant difference between these two groups. Just for curiosity, I also run t-test to compare these two groups, with P=0.000542. In this situation (two data sets, only one pass the normality test), is the nonparametric test the correct test I have to use to compare these two groups?
    Thank you very much!

    • Charles says:

      Hello Xia,

      1) For data that is so close to normality (p = .045), generally I would just use ANOVA provided the homogeneity of variances assumption is met. ANOVA is much more sensitive to violations of this assumption and is pretty robust to violations of normality. Even if one group has p = .014 when testing normality probably ANOVA is the right way to go provided the data is relatively symmetrical and there aren’t problems with outliers. You can use a box plot to see whether the data is relatively symmetric.

      2) For two samples, if each group is relatively symmetric, I would use the t test. Without seeing your data I can’t say why the results from the MW test are so different from those of the t test; generally they would be similar if the data is symmetric.
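
      The box-plot symmetry check mentioned in point 1 can also be expressed numerically. The sketch below uses Bowley's quartile skewness, which is one possible summary and is swapped in here only for illustration; the function name and the data are hypothetical.

```python
from statistics import quantiles


def quartile_skewness(data):
    """Bowley's quartile skewness, computed from the three quartiles.

    Returns 0 when the median sits exactly midway between the quartiles
    (a symmetric box), a positive value when the upper part of the box is
    longer, and a negative value when the lower part is longer.
    Assumes the first and third quartiles differ.
    """
    q1, q2, q3 = quantiles(data, n=4)
    return (q3 + q1 - 2 * q2) / (q3 - q1)


# Hypothetical samples.
symmetric = [1, 2, 3, 4, 5, 6, 7]
right_skewed = [1, 2, 3, 4, 5, 10, 20]

print(quartile_skewness(symmetric))
print(quartile_skewness(right_skewed))
```

      Values near zero suggest a roughly symmetric box; this complements the box plot and says nothing about outliers.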


      • Xia says:

        Hello Charles,
        Thank you very much for your reply!
        1) For experiment 1, both data sets that failed the normality test (p=0.045 and p=0.014) are not symmetric, according to the box plot. Therefore, a nonparametric test should be used for the analysis, right?
        2) For experiment 2, there are two experimental groups. I only have three values for each group. The data for group A are: 1.12, 1.07, 1.12 (normality test P<0.001). The data for group B are: 0.05, 0.12, 0.35 (normality test P=0.430). The results from t-test (p=0.000542) and Mann-Whitney Rank sum test (p=0.077) are very different.
        Thank you!

        • Charles says:


          1) Yes you would normally use a nonparametric test.

          2) With only three data points in each group, I wouldn’t expect too much from either statistical test. Given that the first group is symmetric (at least from what you can see from the box plot) and the second group is normal, I would use the t test result. Also just looking at the data indicates that the population means are likely to be different. Again, with such small samples I would be very cautious about any conclusions.
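
          To see why a rank test cannot say much with three observations per group, here is a minimal Python sketch that enumerates every way of splitting the six pooled values into two groups of three and computes an exact two-sided Mann-Whitney p-value. The function name is hypothetical and the enumeration approach (with tied pairs counted as 0.5) is an assumption of this sketch, not necessarily what any particular software package does.

```python
from itertools import combinations


def exact_mann_whitney_p(a, b):
    """Exact two-sided p-value for the Mann-Whitney U statistic.

    Enumerates all splits of the pooled data into groups of the original
    sizes. Tied pairs contribute 0.5 to U.
    """
    def u_stat(x, y):
        return sum(
            1.0 if xi > yi else 0.5 if xi == yi else 0.0
            for xi in x
            for yi in y
        )

    pooled = list(a) + list(b)
    n_a, n = len(a), len(pooled)
    center = n_a * len(b) / 2  # mean of U under the null hypothesis
    observed = abs(u_stat(a, b) - center)

    extreme = total = 0
    for idx in combinations(range(n), n_a):
        chosen = set(idx)
        x = [pooled[i] for i in idx]
        y = [pooled[i] for i in range(n) if i not in chosen]
        total += 1
        # Count splits at least as far from the center as the observed one.
        if abs(u_stat(x, y) - center) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Data quoted in the question above.
group_a = [1.12, 1.07, 1.12]
group_b = [0.05, 0.12, 0.35]
p_exact = exact_mann_whitney_p(group_a, group_b)
print(p_exact)
```

          There are only 20 possible splits, and the observed data are the most extreme one, so the exact two-sided p-value is 2/20 = 0.1. No arrangement of three values per group can do better, which is why the rank test stays above 0.05 here even though the groups do not overlap. A reported value such as 0.077 is consistent with a normal approximation rather than an exact calculation.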


  21. tinashe says:

    thank u sir

  22. don says:

    Outline the assumptions which underlie the analysis of variance (ANOVA) and the possible methods for their detection and remedy?

    I understand the first part of the question … the assumptions that underlie ANOVA but what are their possible methods of detection and remedy ?

    • Charles says:


      The brief answer is as follows:

      Normality – ANOVA is quite robust to violations of normality, especially if each group is reasonably symmetric. If this assumption is strongly violated then you can use an alternative test (e.g. Kruskal-Wallis or Brown-Forsythe) or a transformation could be employed.

      Outliers – If some data are outliers, then you should check to make sure that there wasn’t some error in measurement or in copying the data. If that is not the case, then you can use a rank-oriented test instead (e.g. Kruskal-Wallis), use a transformation or go ahead and perform the ANOVA, once with the outlier and another time with the outlier removed.

      Homogeneity of Variances – This is covered on the website. See


  23. nyasha says:

    what are the effects of violating the factor effects since they are additive

  24. Tofi says:

    I am conducting a one-way within-subject ANOVA but my sphericity test is violated. Mauchly’s test was violated and I got a value of .702, so I guess I have to use the Greenhouse-Geisser correction. How would I report that in the results? Do I mention that Mauchly’s test was violated and report the Greenhouse-Geisser correction instead?

    • Charles says:

      How you report the results really depends on the requirements for publications in your discipline, but in general I would report that Mauchly’s test indicated a violation of sphericity and report the Greenhouse-Geisser correction (and even the Huynh-Feldt correction).

      • Tofi says:

        Thank you so much for your reply.
        I am writing my thesis and I have been reading around it and wasn’t sure on what to do . Will I still be okay to carry on with my ANOVA analysis even if I have this violation?

  25. André says:

    For the ANOVA test, can’t we assume normality by the central limit theorem if we have a large enough sample size?

    • Charles says:

      This is likely to be true, but you should check for normality just in case. As long as the data is not too far from normality you should be ok.

  26. A. S. says:

    Hi, you state that one of the assumptions are “Factor effects are additive” – is this an assumption that needs to be tested? How can I do that? Can you explain what this means a little?

    • Charles says:

      Additive just means that you can use the usual ANOVA equations to model what is going on (as described on the website). I don’t test for this assumption explicitly. I probably should drop this assumption from the list since it is confusing.

  27. demonceau says:

    Dear Mister Zaiontz,
    I would like to observe how physical performances outcomes are influenced by 2 categorical variables (1- physical activity level (low vs high) and 2-presence of a disease (0-1)). I would like to use a 2-way ANOVA where y= physical outcome, x1= presence of the disease, x2= sedentary/exercising.
    The problem: The sample size is then not the same in each subgroup. So the model is unbalanced. One of the assumptions for the use of this type of ANOVA is therefore not met.
    I read somewhere that we can get round this assumption by using type I sum of squares (sequential) instead of the usual type III SS (unique). Is it true? Can we draw the same conclusions about the significance of the effects of the 2 variables and their interaction? I guess that it would be too easy and there must be some tricky considerations?
    What is your opinion? Should I rather reduce/match the sample size in order to get equal groups?
    It is not the first time that I find useful and clear answers to my questions on your website and I’m very grateful. I hope that my junk language was understandable for stat expert. Thx a lot.

  28. Piero says:

    Dear Dr. Charles,

    I have to perform a set of unpaired t-test on independent samples on a large number of endpoint variables (that is, I have to compare several male vs. female population characteristics).
    For some variable, normality assumption is violated; for some other, homogeneity of variances is violated; for some other variable, both assumptions are not met.
    The two samples are almost equal size (n about 50).

    What is the best non-parametric test to use for such cases?
    Do you think that, for a better consistency of all results, I should use the same method for testing all endpoint variables, independently from which assumptions are violated (if any) for each single variable?

    Thank you very much for your valuable help.
    Best Regards

    • Charles says:


      You can use the t test even if the variances are unequal. The test is pretty robust even if normality is violated provided that the data is reasonably symmetric.

      If you meet the assumptions, the Mann-Whitney test is usually the best nonparametric test to use.


  29. John says:

    I have a general question on your article – which I found very useful.

    A question that constantly comes up is related to the nature of the treatments in an experiment, and whether or not ANOVA and means separation is acceptable, or regression analysis should be performed. Following is the question:

    If treatment means are not independent of each other, is it still acceptable to do ANOVA and means separation, or is regression analysis the proper approach? For example, if treatments represent a continuum of concentration, such as 0X, 0.5X, 1.0X, 1.5X and 2.0X, to me the treatments are not independent and the samples are therefore not independent of each other. Am I reading your article correctly?

    I am sorry about my terrible grammar in the previous post. I failed to review before submitting. My bad!!

    Thank you so much for your time and trouble.

    • Charles says:

      In your example, you have 5 treatment groups. If you use a sample of 50 then as long as you assign 10 elements from this sample to each group at random, then you have independent group samples.
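
      The random assignment described here can be sketched in a few lines of Python. The subject identifiers, the dose labels and the helper name `assign_to_groups` are hypothetical; the seed is fixed only so that the illustration is reproducible.

```python
import random


def assign_to_groups(subjects, group_labels, seed=None):
    """Randomly split subjects into equally sized groups, one per label."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    size, remainder = divmod(len(shuffled), len(group_labels))
    if remainder:
        raise ValueError("subjects cannot be split into equal groups")
    return {
        label: shuffled[i * size:(i + 1) * size]
        for i, label in enumerate(group_labels)
    }


# 50 hypothetical subjects assigned at random to the 5 dose levels.
doses = ["0X", "0.5X", "1.0X", "1.5X", "2.0X"]
assignment = assign_to_groups(range(50), doses, seed=1)

for dose, members in assignment.items():
    print(dose, sorted(members))
```

      Each subject lands in exactly one group and the choice does not depend on the dose ordering, which is what makes the group samples independent.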

  30. serena says:

    Hi Charles,

    I have a convenience sample of size 60. My IV is major of study (three levels) and my DV is hours of study a week. I would like to run an ANOVA to determine differences in the means of these three groups. What are the consequences (both theoretical and practical) of the fact that my sample is not random? Will it “just” limit my ability to generalize my results? Or will it prevent me to use the test altogether? What do you suggest in these cases?

    Another, related question: other colleagues of mine also use both ANOVA and the t-test with non-random samples (which can vary in size from 20 to 100) but, and this is what puzzles me, they say that they do so without any inferential goal in mind… Basically they told me that all they want to do by using these tests is check whether the means are different among the groups in their sample. BUT, and this is my question, why run these tests if you do not have any inferential goal in mind? By inferential I mean saying something about the population from your sample (even if non-random). In my understanding, these tests are made for inferential statistics. What do you think about it? Is there something I am missing here?

    I very much appreciated your website, and will greatly benefit from your advice. Thanks so much in advance for taking the time to answer my questions!

    • Charles says:


      The whole point of using ANOVA is to generalize your results from the random samples to the corresponding populations. If you are only interested in using the results for the given sample, then, as you have said, there is no point in doing any inferential analysis. You can simply compare the sample means and draw no conclusions about the population means.

      You say that the samples are not random, but how specifically were they drawn? Very often samples that are called random are not really random. E.g. in a lot of university research samples are drawn from the student body, based on students volunteering to participate. This is not really a random sample, but lots of research papers are written based on such samples.


      • serena says:

        Thank you very much! The sample I believe is convenient: I am asking people in my class (they come from different majors) how many hours a week they study using an online survey. I agree with you that sometimes we think we are collecting a random sample but we really aren’t. I guess my population can be my class in this case, as it is a very large class and I am only collecting a sample of 60 students in there?

        • Charles says:

          If you are sharing the results with other people, the important thing is to describe accurately the limitations of your sampling technique even if you use the standard analysis tool assuming random samples.

          • serena says:

            I have another quick question for another stats assignment. Thanks in advance for your help!

            I am working on a stats assignment for which I am required to design a little study, collect my data, and run an ANOVA. In my study, my IV would be “social media platform used”, with three levels being: Snapchat, Twitter, Facebook. The DV is the number of posts posted per day.

            The three categories of my IV are not mutually exclusive: should they be in order to run an ANOVA? If this is a potential issue, what is the best way to deal with it? Do I have to ask people to self-classify in one group to begin with? Or could I ask let’s say subject1 to provide an answer for each of the three groups, and then subsequently put subject1 in one of the three groups based on the highest score (for ex if subject 1 says Snapchat 2, Twitter 4, Facebook 6, then I would assign the subject to the Facebook group)? Is this theoretically correct? If I do so, would it be a within group research design (with the same subject measured three time)?

            My DV should measure the hours spent on each of these platforms (by the same subject) or the hours spent in general on social media?
            I wonder if I might violate the assumption of independency of the samples.

            Thanks so much!

          • Charles says:

            I am reluctant to answer someone’s homework assignment, but the approach to use really depends on what the objective of the study is. One of your approaches might work for some situations, but not for most others. Another approach is to view this as a repeated measures ANOVA where you allow multiple types of measurements per subject.

  31. Matt says:

    Hi Charles,

    I work with bacteria in soil and water. I am running statistics (or trying to) on my data, which constantly violates normality and equal variances. I know that Welch’s ANOVA is recommended for unequal variances and that the Kruskal-Wallis test is for non-normality. What can be done if both of these ANOVA assumptions are violated at the same time? I understand transformations are useful in combating variance issues but I’d like to keep that as a last resort. Do you have any recommendations?


  32. Luke says:

    Hi Charles,

    Forgive me if I have this confused, but I believe that you have made a mistake in stating that an assumption of ANOVA is that “Each group sample is drawn from a normally distributed population”. The link that you provided for correcting for this assumption encourages identifying this violation by plotting histograms. This link further says that the non-normality may be fixed by transformations of the response variable. I think this is being confused with the fact that transformations of the response variable are an appropriate correction when there is a violation of the assumption of linearity.

    From what I have learnt, the assumption is that the residuals are normally distributed, which can occur when the response variable is not normally distributed.

    • Charles says:

      For one-way Anova with say 4 groups (aka levels), you need to make sure that each group is normally distributed.
      For two-way Anova with say 2 groups for factor A and 3 groups for factor B, you need to make sure that all 2 x 3 = 6 groups are normally distributed.
      Fortunately, Anova is pretty robust to violations of normality.
      Normality can be corrected (if necessary) by using a transformation (not just to address linearity) or by using a different test.

    • Charles says:

      One more thing. I read the links you provided and they did seem to be confusing. Here is another link which may be helpful:

  33. Li says:

    I have some repeated measurements for the same sample at 4 time points and I want to test if the difference between their means is significant. The assumption of independent measurements is violated. How can I resolve this? What do you suggest?

  34. Alessia says:

    Hi Charles,

    I have a question related to repeated-measures ANOVA, basically I had a sample of 8 participants that were tested on three different conditions. I did get a significant result, is this valid?

  35. Hin Yan says:

    Sir, I have a question.

    I am running a series of one-way ANOVA in a 3-group comparison.
    All groups have sample size > 300, but are unbalanced (e.g. 300, 500, 600, or let’s say the ratio is always around 1:1.5:2). Levene tests are significant in some comparisons (let’s say p<0.01), but I note that the ratios of largest to smallest variance are very small (e.g. largest SD : smallest SD = 1:0.8). Could I say that ANOVA is still robust in these conditions even if it violates the assumption of equal variance?

    or am i supposed to employ Welch's test?


    • Charles says:

      Hin Yan,
      Anova is not that robust to violations of equal variance, especially with unbalanced models. Welch’s Anova (or resampling) is probably a better choice.
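
      For readers who want to see what Welch’s ANOVA computes, here is a minimal Python sketch of the test statistic and its degrees of freedom, using the standard Welch formulas with weights n_i / s_i². The function name and the data are hypothetical, and obtaining a p-value from the F distribution is left to a statistics package or table.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA.

    Each group is weighted by n_i / s_i^2, so groups with large variances
    or few observations count for less. Assumes at least two groups, each
    with at least two observations and non-zero variance.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_w = sum(weights)

    # Variance-weighted grand mean.
    grand = sum(w * m for w, m in zip(weights, means)) / total_w

    between = sum(w * (m - grand) ** 2 for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total_w) ** 2 / (n - 1) for w, n in zip(weights, sizes))

    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2


# Hypothetical unbalanced groups with visibly different spreads.
groups = [
    [12.1, 11.8, 12.4, 12.0, 11.9],
    [13.5, 10.2, 15.1, 9.8, 14.0, 12.7, 11.1],
    [12.9, 13.1, 12.8, 13.3, 13.0, 12.7, 13.2, 12.9],
]
f_stat, df1, df2 = welch_anova(groups)
print(f"F = {f_stat:.3f}, df1 = {df1}, df2 = {df2:.2f}")
```

      With only two groups the statistic reduces to the square of Welch’s t statistic, which is a convenient sanity check.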

    • Pauline Dugas says:

      I need an answer to this question and how they got the answer.
      Which of the following is NOT an assumption of the one-way randomized ANOVA?
      a. the data are interval or ratio
      b. the underlying distribution is skewed
      c. the variance among the populations being compared are homogeneous
      d. all of the alternatives are correct.
      Thank You, Pauline.

  36. a'yun says:

    hi charlie,
    I read that ANOVA is robust to violations of homogeneity if the numbers of subjects in the groups are approximately equal… what do you think?

  37. Tareq says:

    I have scatter data for (Y, X) and I need to fit a nonlinear regression between them. How do I interpret the regression results? In other words, if the t-test of the coefficient is significant, is that enough for my null hypothesis? And what about the F-test in nonlinear regression: is it necessary or not?

  38. Cheng says:

    Hi sir,

    My survey questions are based on a five-point Likert scale and the distributions are never normal. But based on the central limit theorem, I have a sample size of more than 30, which means my sample mean is normally distributed.
    I have used the t-test and ANOVA to perform the analysis. Am I correct?

    • Charles says:

      Without getting into all the details of the central limit theorem (esp. the fact that you need a continuous set of values), it is best to check that your data is really normally distributed by using a test like Shapiro-Wilk. Since your data will have a lot of tied values, in your case, it is better to use the d’Agostino-Pearson test for normality.
      If the data is at least symmetric, the t test and ANOVA will perform pretty well even if the data are not normally distributed. In general, a 9-point Likert scale will perform better than a 7-point Likert scale (more like a continuous function) and a 7-point Likert scale will perform better than a 5-point Likert scale.

  39. ABYGAIL says:

    good day
    please explain why we assume normality of distribution and homogeneity of variance when doing analysis of variance

    • Charles says:

      Otherwise the test doesn’t give valid results. If the assumptions don’t hold, perhaps when we conclude that there is no significant difference, there is in fact a significant difference or vice versa with probability greater than the assumed 5%.

  40. Manuel says:

    Can you please explain why the standard deviation of a measurement is not used in a Design of Experiments and thus only the averages are used in the ANOVA of the DOE?

    Very nice post!

    • Charles says:

      ANOVA and DOE focus on detecting differences in the means (averages) between various groups. This is because the mean tends to be the more relevant statistic in many experiments. There are tools that do compare the standard deviation, such as Levene’s test. Note too that the standard deviation is used in DOE and ANOVA, but the objective of these tools is to compare means.

  41. Hi Charles,

    Great post!

    I was just a bit confused when you say one of the assumptions of ANOVA is that the factors need to be additive. What about interactions?


    • Charles says:

      Yes, the interactions need to be additive as well. This just means that the error terms need to have zero mean. I wouldn’t worry about this assumption and wouldn’t explicitly test for it.

  42. Victor Job says:

    Great post!

    • Helen Bower says:

      Does anyone know of an academic reference or text for the above comment, “normality assumption can be violated provided the samples are symmetrical or at least similar in shape”. I’ve hunted for hours.
      All my 4 groups are negatively skewed so I wanted to justify doing a 2 x 2 ANOVA for a dissertation (even though several assumptions are violated – using Levene’s test, skewness z scores & Shapiro-Wilk).

  43. Ibrahim says:

    hi..pls can you give a situation each when some of the above assumption may not be valid?

    • Charles says:

      It is pretty easy to create data which violates one of the assumptions. Try it yourself. The usual problems are homogeneity of variances and normality.

  44. Ibrahim says:

    pls add me to WhatsApp statistical group if there’s any available from your end sir..+2348083449340

  45. Ibrahim says:

    can you pls state other differences between the characteristic function and the moment generating function sir
