One-way Analysis of Variance (ANOVA)

Essentially, Analysis of Variance (ANOVA) extends two-sample hypothesis testing for comparing means (when variances are unknown) to more than two samples. In this part of the website we deal with the simplest case, namely One-way ANOVA.
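
As a rough illustration of what the one-way test computes, the sketch below works out the F statistic by hand for three small made-up samples. The function name `one_way_anova` and the data are assumptions for illustration only and are not part of the website's Excel tooling. Turning F into a p-value would additionally need the F distribution, which is left out here.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples (hypothetical helper)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up data: three samples of three observations each
samples = [
    [4, 5, 6],
    [6, 7, 8],
    [8, 9, 10],
]
f_stat, df_b, df_w = one_way_anova(samples)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With exactly two groups, this F statistic equals the square of the pooled two-sample t statistic, which is the sense in which ANOVA extends the two-sample test.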

16 Responses to One-way Analysis of Variance (ANOVA)

  1. Luiz Fabrizio Stoppiglia says:

    You have simply the best material for teaching statistics. Thanks a lot for producing this!

  2. Subhu says:

    Hi Charles:

    Thank you very much for this website. I have benefited from your website on a number of occasions.

    I have a question about the ANOVA test:
    1. Is it necessary to have the whole population to do ANOVA, or can we use the average values of the population instead?
    2. I have several p-values from a number of ANOVA tests. Is it possible to combine all these p-values into one p-value? Is there any way of averaging the p-values for one system?

    Thanks very much.

    • Charles says:

      Subhu,
      1. There is no point in running an ANOVA if you have access to the whole population’s data. You can just look at descriptive statistics on the population. If by population, you mean sample, then I am not sure what average values you are referring to. Perhaps a more concrete example would be helpful in understanding what you are trying to accomplish.
      2. I can’t see any benefit in averaging p-values. What is it that you are trying to accomplish?
      Charles

      • Subhu says:

        Thank you for your reply Charles.
        I will explain in a bit more detail what I am trying to do:
        1. I have a reactor and I am recording velocity data at different cross-sections.
        2. I have three heights at which I am taking these data.
        3. I am measuring the velocity using different instruments. I want to compare the difference between these instruments based on the time-averaged velocity values.

        So, here are my questions:
        1. Should I use the instantaneous values or the time-averaged velocity values for the ANOVA?
        2. By doing ANOVA at three different heights, I will have 3 p-values. Is there a way to combine these 3 p-values into just one, which will represent the whole system?

        I hope that I have conveyed the problem to you. Thanks very much in advance.
        Subhu

        • Charles says:

          Subhu,

          As always, the answer to your questions depends on what you are trying to prove/test. It sounds like you have three factors (i.e. independent variables): cross-section, height, instrument. Velocity is the dependent variable. If you want to understand the interactions between these factors then you probably should use a three-factor ANOVA (instead of a one-factor ANOVA). You may also have a fourth factor, namely time, although this may be equivalent to the cross-section factor.

          Question 1: Assuming time and cross-section are equivalent, and you don’t care about differences at the cross-section level, then you could use time-averaged velocity; otherwise you would need to use velocity at each cross-section. It is really up to you and to what you are trying to study. Generally it is best to keep all the detail, but at some point (certainly at the fourth factor level) too much data makes any analysis too complicated.

          Question 2: If you make height a factor, then only one p-value is created for the height factor instead of 3 p-values.

          Charles
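
To make the point about Question 2 concrete, here is a minimal sketch of a balanced two-factor ANOVA with replication, in which height is treated as a factor and so yields a single F statistic rather than one per height. It uses only two factors (instrument and height) to keep the sketch short, not the three-factor design suggested above. The function name `two_factor_anova`, the instrument labels and the velocity values are all made up for illustration, and the p-values (which need the F distribution) are not computed.

```python
from statistics import mean

def two_factor_anova(cells, levels_a, levels_b):
    """F statistics for a balanced two-factor design with replication.

    cells maps (level_a, level_b) -> list of r observations (hypothetical layout).
    """
    a, b = len(levels_a), len(levels_b)
    r = len(cells[(levels_a[0], levels_b[0])])  # replicates per cell (assumed equal)

    grand = mean([x for obs in cells.values() for x in obs])
    mean_a = {i: mean([x for j in levels_b for x in cells[(i, j)]]) for i in levels_a}
    mean_b = {j: mean([x for i in levels_a for x in cells[(i, j)]]) for j in levels_b}

    ss_a = b * r * sum((mean_a[i] - grand) ** 2 for i in levels_a)
    ss_b = a * r * sum((mean_b[j] - grand) ** 2 for j in levels_b)
    ss_ab = r * sum(
        (mean(cells[(i, j)]) - mean_a[i] - mean_b[j] + grand) ** 2
        for i in levels_a for j in levels_b
    )
    ss_w = sum(
        (x - mean(cells[(i, j)])) ** 2
        for i in levels_a for j in levels_b for x in cells[(i, j)]
    )

    ms_w = ss_w / (a * b * (r - 1))
    return {
        "instrument": (ss_a / (a - 1)) / ms_w,
        "height": (ss_b / (b - 1)) / ms_w,
        "interaction": (ss_ab / ((a - 1) * (b - 1))) / ms_w,
    }

# Made-up time-averaged velocities: 2 instruments x 3 heights x 2 replicates
instruments = ["A", "B"]
heights = [1, 2, 3]
velocity = {
    ("A", 1): [1.0, 2.0], ("A", 2): [2.0, 3.0], ("A", 3): [3.0, 4.0],
    ("B", 1): [2.0, 3.0], ("B", 2): [3.0, 4.0], ("B", 3): [4.0, 5.0],
}
f_stats = two_factor_anova(velocity, instruments, heights)
for factor, f in f_stats.items():
    print(f"{factor}: F = {f:.2f}")
```

The output has one F statistic for instrument, one for height and one for their interaction, instead of a separate instrument test at each height.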

  3. I appreciate, cause I discovered exactly what I was looking for.
    You’ve ended my 4 day lengthy hunt! God Bless you man. Have a nice day.
    Bye

  5. Meera says:

    Dear Sir,

    Is it mandatory to do a post hoc test after ANOVA? I need to prove that the variable ‘annual income’ influences error count. Can I simply do an ANOVA and leave it at that?

    • Charles says:

      Meera,
      If ANOVA gives you sufficient information for the test you are trying to make then you can leave it at that. Particularly if you get a non-significant result then you will typically want to leave it at that. If, however, you get a significant result, then usually you will want to better pinpoint what is causing the significant result, which is where the post-hoc tests come into play.
      Charles

  6. Meera says:

    Dear Sir,

    I didn’t quite get this part “If, however, you get a significant result, then usually you will want to better pinpoint what is causing the significant result, which is where the post-hoc tests come into play.”
    I got significant results at p<0.01.

    Meera.

    • Charles says:

      Meera,
      If, for example, you had four groups, the significant result from the ANOVA test tells you that there are significant differences among the means of the four groups, but it doesn’t tell you which group(s) have different means. If you want to better understand this then you need to conduct some follow-up tests. See the website for more details and examples about this.
      Charles
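
As a sketch of what such a follow-up test looks at, the code below computes the pairwise studentized range (Tukey q) statistics for four balanced groups. The group names and data are made up, and `tukey_q_statistics` is a hypothetical helper, not a Real Statistics function. Deciding significance would require comparing each q against a studentized range critical value for 4 groups and the within-groups degrees of freedom, which is not done here.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

def tukey_q_statistics(groups):
    """Pairwise Tukey q statistics for equal-sized groups (hypothetical helper)."""
    names = list(groups)
    k = len(names)
    n = len(groups[names[0]])  # observations per group (assumed equal)

    # Pooled within-groups mean square, as in the one-way ANOVA table
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    df_within = k * n - k
    ms_within = ss_within / df_within

    se = sqrt(ms_within / n)
    q = {
        (a, b): abs(mean(groups[a]) - mean(groups[b])) / se
        for a, b in combinations(names, 2)
    }
    return q, df_within

# Made-up data: g4 clearly differs, the other three groups are close together
groups = {
    "g1": [4, 5, 6],
    "g2": [5, 6, 7],
    "g3": [4, 5, 6],
    "g4": [9, 10, 11],
}
q_stats, df_w = tukey_q_statistics(groups)
for pair, q in sorted(q_stats.items(), key=lambda item: -item[1]):
    print(pair, round(q, 3))
```

Sorting the pairs by q shows which group differences drive the overall result; in this made-up data every large q involves g4.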

  7. Meera says:

    I will Sir. Thank you for helping me out.

    Meera

  8. Meera says:

    Dear Sir,

    I tried doing the post-hoc test but got no significant result. I’m confused. My ANOVA result was significant at p < 0.01.
    What should I do?

    Meera

    • Charles says:

      Meera,
      Which post-hoc test did you perform? Most likely the problem is that you need to fill in the highlighted range with contrast values as described on the webpages Planned Comparisons and Unplanned Comparisons.
      Charles

      • Meera says:

        Dear Sir,

        I went for Scheffe’s. I was not using the Resource Pack, though. When I tried it in the Resource Pack it said “Compile error in hidden module: frmAnova1”

        Meera

        • Charles says:

          Meera,
          That is not good. What version of Windows and Excel are you using? Are you able to use other Real Statistics Resource Pack capabilities?
          Depending on what you are trying to demonstrate, Scheffe’s is usually not the best post-hoc test to use. Usually Tukey HSD gives better results.
          Charles
