To help introduce the basic concepts we start with the following example.

**Example 1**: A new fertilizer has been developed to increase the yield on crops, and the makers of the fertilizer want to better understand which of the three formulations (blends) of this fertilizer are most effective for wheat, corn, soy beans and rice (crops). They test each of the three blends on one sample of each of the four types of crops. The crop yields for the 12 combinations are as shown in Figure 1.

**Figure 1 – Data for Example 1**

We interrupt the analysis of this example to give some background, after which we will resume the analysis.

**Definition 1**: We define the **structural model** as follows.

A **factor** is an independent variable. A *k* factor ANOVA addresses *k* factors.

A **level** is some aspect of a factor; these are what we called groups or treatments in the one factor analysis discussed in Basic Concepts for ANOVA.

In Example 1 there are two factors: blends and crops. The blend factor has 3 levels and the crop factor has 4 levels.

In general, suppose we have two factors A and B. Factor A has *r* levels and factor B has *c* levels. We organize the levels for factor A as rows and the levels for factor B as columns. We use the index *i* for the rows (i.e. factor A) and the index *j* for the columns (i.e. factor B). Thus we use an *r* × *c* table where the entries in the table are $x_{ij}$ for $1 \le i \le r$ and $1 \le j \le c$.

We use terms such as $\bar{x}_i$ (or $\bar{x}_{i.}$) as an abbreviation for the mean of $\{x_{ij} : 1 \le j \le c\}$. Similarly, we use terms such as $\bar{x}_j$ (or $\bar{x}_{.j}$) as an abbreviation for the mean of $\{x_{ij} : 1 \le i \le r\}$.

We estimate the level means from the total mean for factor A by $\mu_i = \mu + \alpha_i$ where $\alpha_i$ denotes the effect of the $i$th level for factor A (i.e. the departure of the $i$th level mean $\mu_i$ for factor A from the total mean $\mu$). We have a similar estimate for the sample of $\bar{x}_i = \bar{x} + a_i$. Note that

$$\sum_{i=1}^{r} \alpha_i = 0 = \sum_{i=1}^{r} a_i$$

Similarly we estimate the level means from the total mean for factor B by $\mu_j = \mu + \beta_j$ where $\beta_j$ denotes the effect of the $j$th level for factor B (i.e. the departure of the $j$th level mean $\mu_j$ for factor B from the total mean $\mu$). We have a similar estimate for the sample of $\bar{x}_j = \bar{x} + b_j$. As for factor A,

$$\sum_{j=1}^{c} \beta_j = 0 = \sum_{j=1}^{c} b_j$$
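The sample effects $a_i$ and $b_j$ can be checked numerically. The following is a minimal sketch in Python, assuming a made-up 3 × 4 table (the numbers are hypothetical and are not the data from Figure 1); the variable names are chosen here for illustration only.

```python
from statistics import mean

# Hypothetical 3 x 4 table: r = 3 rows (factor A), c = 4 columns (factor B).
# These values are invented for illustration; they are NOT the Figure 1 data.
x = [
    [10.0, 12.0, 14.0, 16.0],
    [11.0, 15.0, 13.0, 17.0],
    [16.0, 18.0, 17.0, 21.0],
]
r, c = len(x), len(x[0])

# Total (grand) mean, row means and column means
grand_mean = mean(v for row in x for v in row)
row_means = [mean(row) for row in x]
col_means = [mean(x[i][j] for i in range(r)) for j in range(c)]

# Sample effects: departure of each level mean from the total mean
a = [m - grand_mean for m in row_means]
b = [m - grand_mean for m in col_means]

print("row effects a_i:", a, "sum:", sum(a))
print("column effects b_j:", b, "sum:", sum(b))
```

Up to floating-point rounding, both sums come out as zero, which is the constraint stated above.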

The two-way ANOVA will either test for the main effects of factor A or factor B, namely

$H_0$: $\mu_{1.} = \mu_{2.} = \cdots = \mu_{r.}$ (Factor A)

or

$H_0$: $\mu_{.1} = \mu_{.2} = \cdots = \mu_{.c}$ (Factor B)

If testing for factor A, the null hypothesis is equivalent to

$H_0$: $\alpha_i = 0$ for all $i$

If testing for factor B, the null hypothesis is equivalent to

$H_0$: $\beta_j = 0$ for all $j$

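Finally, we can represent each element in the sample as $x_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$ where $\varepsilon_{ij}$ denotes the error (or unexplained amount). As before we have the sample version $x_{ij} = \bar{x} + a_i + b_j + e_{ij}$ where $e_{ij}$ is the counterpart to $\varepsilon_{ij}$ in the sample.

**Observation**: Since $a_i = \bar{x}_i - \bar{x}$ and $b_j = \bar{x}_j - \bar{x}$, it follows that

$$e_{ij} = x_{ij} - \bar{x} - a_i - b_j = x_{ij} - \bar{x}_i - \bar{x}_j + \bar{x}$$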

**Definition 2**: Using the terminology of Definition 1, define
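For reference, the usual sums of squares, degrees of freedom and mean squares for a two-factor layout with one observation per cell can be sketched as follows in the notation of Definition 1. The labels $SS_{Row}$, $SS_{Col}$, $SS_E$ and $SS_T$ are the ones used elsewhere on this page; the layout below is an assumption and not a copy of the original table.

```latex
\begin{aligned}
SS_{Row} &= c \sum_{i=1}^{r} (\bar{x}_i - \bar{x})^2, & df_{Row} &= r - 1, & MS_{Row} &= SS_{Row} / df_{Row} \\
SS_{Col} &= r \sum_{j=1}^{c} (\bar{x}_j - \bar{x})^2, & df_{Col} &= c - 1, & MS_{Col} &= SS_{Col} / df_{Col} \\
SS_{E} &= \sum_{i=1}^{r} \sum_{j=1}^{c} (x_{ij} - \bar{x}_i - \bar{x}_j + \bar{x})^2, & df_{E} &= (r - 1)(c - 1), & MS_{E} &= SS_{E} / df_{E} \\
SS_{T} &= \sum_{i=1}^{r} \sum_{j=1}^{c} (x_{ij} - \bar{x})^2, & df_{T} &= rc - 1, & MS_{T} &= SS_{T} / df_{T}
\end{aligned}
```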

**Correction**: The term $x_{ij}$ in the formula for $SS_E$ in the above table should not have a bar over it.

**Property 1**:

Proof: Clearly

If we square both sides of the equation, sum over *i, j* and then simplify (with various terms equal to zero as in the proof of Property 2 of Basic Concepts for ANOVA), we get the first result. For the second,
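The identity behind this argument can be sketched as follows, assuming that Property 1 is the usual partition of the total sum of squares and of the degrees of freedom (the statement itself is not reproduced here, so this is an assumption). The cross-product terms vanish when summing over $i$ and $j$ because the deviations $\bar{x}_i - \bar{x}$, $\bar{x}_j - \bar{x}$ and the residuals each sum to zero.

```latex
\begin{aligned}
x_{ij} - \bar{x} &= (\bar{x}_i - \bar{x}) + (\bar{x}_j - \bar{x}) + (x_{ij} - \bar{x}_i - \bar{x}_j + \bar{x}) \\
\sum_{i=1}^{r} \sum_{j=1}^{c} (x_{ij} - \bar{x})^2
  &= c \sum_{i=1}^{r} (\bar{x}_i - \bar{x})^2
   + r \sum_{j=1}^{c} (\bar{x}_j - \bar{x})^2
   + \sum_{i=1}^{r} \sum_{j=1}^{c} (x_{ij} - \bar{x}_i - \bar{x}_j + \bar{x})^2 \\
SS_T &= SS_{Row} + SS_{Col} + SS_E \\
(r - 1) + (c - 1) + (r - 1)(c - 1) &= rc - 1
\end{aligned}
```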

**Property 2**: If a sample is made as described in Definition 1, with the $x_{ij}$ independently and normally distributed and with all (or ) equal, then

Proof: The proof is similar to that of Property 1 of Basic Concepts for ANOVA.

**Theorem 1**: Suppose a sample is made as described in Definitions 1 and 2, with the $x_{ij}$ independently and normally distributed.

If all $\mu_i$ are equal and all are equal then

If all $\mu_j$ are equal and all are equal then

Proof: The result follows from Property 2 and Theorem 1 of F Distribution.

**Property 3**:

**Observation**: We use the following tests:

Recall that the assumptions for using these tests are:

- All samples are drawn from normally distributed populations
- All populations have a common variance
- All samples were drawn independently from each other
- Within each sample, the observations were sampled randomly and independently of each other
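To make the tests concrete, here is a minimal sketch in Python of the whole calculation, assuming the standard formulas for two-factor ANOVA without replication. The data are hypothetical (not the Figure 1 values), and the function name and dictionary keys are invented for this illustration. The critical values 5.14 and 4.76 are the ones quoted for Example 1 (α = .05, 3 rows and 4 columns), so they only apply to a 3 × 4 table.

```python
from statistics import mean

def two_factor_anova_without_replication(x):
    """Return SS, df and F values for an r x c table with one value per cell."""
    r, c = len(x), len(x[0])
    grand = mean(v for row in x for v in row)
    row_means = [mean(row) for row in x]
    col_means = [mean(row[j] for row in x) for j in range(c)]

    # Sums of squares for rows, columns, error and total
    ss_row = c * sum((m - grand) ** 2 for m in row_means)
    ss_col = r * sum((m - grand) ** 2 for m in col_means)
    ss_e = sum(
        (x[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(r)
        for j in range(c)
    )
    ss_t = sum((v - grand) ** 2 for row in x for v in row)

    # Degrees of freedom and mean squares
    df_row, df_col = r - 1, c - 1
    df_e = df_row * df_col
    ms_e = ss_e / df_e

    return {
        "ss_row": ss_row, "ss_col": ss_col, "ss_e": ss_e, "ss_t": ss_t,
        "df_row": df_row, "df_col": df_col, "df_e": df_e,
        "f_row": (ss_row / df_row) / ms_e,
        "f_col": (ss_col / df_col) / ms_e,
    }

# Hypothetical 3 x 4 data, invented for illustration only
x = [
    [10.0, 12.0, 14.0, 16.0],
    [11.0, 15.0, 13.0, 17.0],
    [16.0, 18.0, 17.0, 21.0],
]
result = two_factor_anova_without_replication(x)

# Critical values quoted in Example 1 for alpha = .05 with df (2, 6) and (3, 6)
F_CRIT_ROW, F_CRIT_COL = 5.14, 4.76
print("reject H0 for rows:", result["f_row"] > F_CRIT_ROW)
print("reject H0 for columns:", result["f_col"] > F_CRIT_COL)
```

The sketch compares each *F* statistic with a tabulated critical value rather than computing a p-value, which keeps it free of external libraries.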

We now return to Example 1 and show how to conduct the required analysis using Excel’s **Anova: Two-factor Without Replication** data analysis tool.

**Example 1** (continued): The output from the data analysis tool is shown in Figure 2.

**Figure 2 – Two factor ANOVA without replication data analysis tool**

There are two null hypotheses: one for the rows and the other for the columns. Let’s look first at the rows:

$H_0$: there is no significant difference in yield between the (population) means of the blends

Since the p-value for the rows = .0068 < .05 = *α* (or *F* = 12.83 > 5.14 = *F-crit*) we reject the null hypothesis, and so at the 95% level of confidence we conclude there is a significant difference in the yields produced by the three blends.

The null hypothesis for the columns is

$H_0$: there is no significant difference in yield between the (population) means for the crop types

Since the p-value for the columns = .1446 > .05 = *α* (or *F* = 2.63 < 4.76 = *F-crit*) we can’t reject the null hypothesis, and so at the 95% level of confidence we cannot conclude that there is a significant difference in the yields for the four crops studied.

**Observation**: Although the analysis in Figure 2 was produced automatically by Excel’s data analysis tool, the same result can be produced using Excel formulas, just as we were able to do in Basic Concepts for ANOVA for one-way ANOVA. The most interesting cells are the ones corresponding to the four sums of squares. We show how to calculate the values for each of those cells in Figure 3.

**Figure 3 – Key formulas for analysis from Figure 2**

The formulas for calculating $SS_{Row}$ and $SS_{Col}$ in Definition 2 involve taking squared deviations of the group means. E.g. $SS_{Row}$ can be calculated via the formula =DEVSQ(I6:I8)/H6. Alternatively we can take squared deviations from the sums of each group, as is done in Figure 3.

**Real Statistics Excel Capabilities**: The Real Statistics Resource Pack contains a number of supplemental functions and the Two Factor ANOVA data analysis tool which support Two Factor ANOVA without Replication. You can get more information about these in Two Factor ANOVA with Replication.
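Returning to the two ways of obtaining $SS_{Row}$ described above (from group means or from group sums), the following minimal Python sketch shows that they agree. It assumes hypothetical data rather than the cells referenced in Figures 2 and 3, and the helper `devsq` is written here to mimic Excel's DEVSQ (sum of squared deviations from the mean).

```python
def devsq(values):
    """Sum of squared deviations from the mean, mimicking Excel's DEVSQ."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

# Hypothetical 3 x 4 table (rows = factor A levels); not the Figure 1 data
x = [
    [10.0, 12.0, 14.0, 16.0],
    [11.0, 15.0, 13.0, 17.0],
    [16.0, 18.0, 17.0, 21.0],
]
c = len(x[0])  # number of values in each row

row_sums = [sum(row) for row in x]
row_means = [s / c for s in row_sums]

# Using group means: multiply the squared deviations by the group size
ss_row_from_means = c * devsq(row_means)
# Using group sums: divide the squared deviations by the group size
ss_row_from_sums = devsq(row_sums) / c

print(ss_row_from_means, ss_row_from_sums)
```

The two values coincide because each row sum is *c* times the row mean, so the squared deviations of the sums are $c^2$ times those of the means.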

Hi Charles, I want to know which two-way ANOVA is appropriate for testing for a significant difference in the means of some data with two treatments across three different age groups.

Ajibola,

You have 2 levels for the Treatment factor and 3 levels for the Age factor. Now the question for you is how many subjects do you have for each of the 6 combinations of Treatment x Age? If one then you need Two Factor ANOVA without Replication. If more than one then you need Two Factor ANOVA with Replication.

Charles

Hello,

Thank you for your great addin. I have been working with your Gage R&R feature. The way that I understand the number of categories is that it is the (stddev(part)/stddev(gage))*sqrt(2). Your Gage R&R report uses (stddev(part)/stddev(total))*sqrt(2). Am I misunderstanding a variable, or should the top formula be used?

Thanks

Neil,

I believe that I am using the formula with gage and not total variation.

Charles

Hi Charles

I have carried out a biofilm assay with two different nanoparticles at 6 different concentrations (0, 100, 200, 300, 400, 500) without replication. Please let me know whether I can use two way ANOVA without replication for this dataset. I find a significant difference between the two types of nanoparticles when I do the test. However, there are no significant differences between the different concentrations. What confuses me is that the difference between columns (different concentrations) is for both types of nanoparticles. However, when I replicate and do t-test assuming equal variance between control and different concentrations of nanoparticles, I find there is a significant difference. Please let me know which method and analysis is appropriate.

Niluka,

If I understand your scenario correctly, it seems that you can use Two Factor ANOVA without replication. The difference between the columns is for both types of nanoparticles (combined).

I don’t understand what you mean by “when I replicate and do t-test”. If you like, you can send me an Excel file with this data and analysis so that I can better understand what you are trying to do.

Charles

I’m using the Two Factor Anova data analysis tool and just noticed that the means for each of the categories in the two factors (the categorical “total” means as opposed to the means at each intersection of variables, if you will) are calculated by averaging the corresponding cells in the matrix of means, as opposed to either calculating them directly from the data or using a weighted average. This seems to be producing inaccurate results.

Madison,

The Two Factor Anova supports two input data formats: Excel format and standard (i.e. stacked) format. When Excel format is used, I believe that marginal means are based on the original data, but when standard format is used then the marginal means are the average of the group means, as you have observed.

For balanced models (when all interactions have the same number of elements), both approaches yield the same result. This is not the case for unbalanced models. For unbalanced models, you should choose the Regression option on the Two Factor Anova dialog box. This will use the standard format approach to calculating the marginal means. This is the preferred approach as described on the following webpage:

Unbalance Approach to Two Factor Anova

Charles

The distinction between balanced and unbalanced models was what I was missing, thank you! Your site is an excellent resource, and it is very much appreciated!

Charles,

Which test can I use when the assumptions for using a Two Factor Anova, especially the one of a common variance, are not met? I understand that the Kruskal-Wallis test can only be used in place of a One Way Anova.

Thank you again,

Erik

Erik,

You can use Scheirer-Ray-Hare as a substitute for Two Factor ANOVA with Replication. This test has limited power, but it is a possible approach.

Charles

Hello Mr Charles.

Can you please tell me why I get a p value 2.76E-08 while performing Two way ANOVA without replication between Season and Species Density?

Thank you

Have A Nice Day

Anila Ajayan

Anila,

A p value of 2.76E-08 is a small value which is equal to .0000000276. This value is written in what is called scientific notation.

Charles

can u do anova for me….if yes , then reply to me at santoshpandey511@gmail.com

No. You will need to do this yourself. I am providing you the tools and explanations, but you need to do the work.

Charles

I am attempting to perform this ANOVA, but my variance results column has #DIV/0! for all values, leading to #NUM in my P-value and F-crit value boxes. What is the problem with my data?

Hilary,

If you send me an Excel file with your data and calculations, I will try to figure out what is going on. You can get my email address on the webpage

Contact Us

Charles

I’m experiencing a similar problem. This time, I can’t do the Tukey HSD test follow-up for two factor anova because the variances have #DIV/0! for all values. Hope you can assist as well.

You need to fill in the contrast column (labeled c) in the output with 1 for one group and -1 for another group. In this way you compare two groups.

Charles

Hi Charles,

I have the same problem. Since my data doesn’t have replication, the variance of the response in every interaction of the factors cannot be calculated, so it resulted in #DIV/0!. Could you please give me any suggestions?

Thank you

Regards,

Zahra

Hi Zahra,

Sorry, but when you say that you “have the same problem”, whom are you referring to?

In the case where there is no replication, there is no interaction factor, and so you cannot analyze it. You can, however, analyze the two main factors, as described on the referenced webpage.

Charles

Which post hoc test should I use in the excel toolpack when I find significance in the results of the two factor anova without replication?

Thanks

Kelly,

You can use the usual post hoc tests for ANOVA with replication, except those that test the interaction (since you are assuming there is no interaction). Probably most useful are contrasts and Tukey’s HSD. An example of the Tukey’s HSD is given on the following webpage:

Question (3) of the Spring 2005 exam from http://www.math.utah.edu/~treiberg/M3081Finalsamplesoln.pdf

Charles

Hi, I’m doing two way ANOVA with replication but results for P and F value are not coming normal. My data has 6 columns and 24 rows for each column. I am confused about what is happening.

Please explain better what you mean by “P and F value are not coming normal”.

Charles

HI Charles

Many thanks for your excellent website.

I am having trouble finding the ‘ANOVA: two factor without replication’ tool in the data analysis toolkit. I go to the ‘analysis of variance section’ and check the ‘Anova: two factors’ box but there is then no option for ‘without replication’. Hence I cannot understand how to arrive at the output for this example. (similarly with the example for Anova: 2 factor with replication).

I hope you can help.

Regards

Hi Sean,

For ANOVA without replication choose the **Anova: two factors** option. Then in the dialog box that appears insert 1 in the **Number of Rows per Sample** field.

Charles

How can I do multiple comparisons like LSD or DMRT between treatments with this two way ANOVA without replication?

I don’t support these follow up tests at present. I don’t support LSD because I find other tests are better.

Charles

Can you tell me the names of some other tests? Which one is better for drawing conclusions? Thank you.

Which follow up test is best depends on a number of things (equal sample size or not, homogeneity of variances or not, etc.). Generally I use Tukey’s HSD post-hoc test for ANOVA with replication. See the following webpage:

Unplanned Comparisons

I have not thought about what sort of post-hoc tests are appropriate for ANOVA without replication.

Charles

Because the degrees of freedom of the error are equal to zero when trying to calculate the interaction effect, can we conclude that there are no interaction effects in this case? Is that right?

Thank you

There are no interaction effects in two factor ANOVA without replication.

Charles

What is meant by the error values that come out in the ANOVA table (2 way without replication)? How should they be interpreted?

Like any error value, the lower the value the better the fit of the model.

Charles

Nice work and thanks.

You have written:

rows = .0068 < 05 = α

It ought to be:

rows = .0068 05 = α

should be:

columns = .1446 > 0.05 = α

(Or you could write 0.05 as .05, same thing)

Thanks again.

Posting the above comment also dropped the decimal on my first example for the correction. Must be something in the way the HTML is conveyed.

Test:

05 needs to be written either 0.05, or .05.

Strange!

Dave,

Thanks for catching this typo and for helping improve the accuracy of the website. I have now revised the webpage to include the decimal point.

Charles

When I use Analysis of variance: two factors, the message “number of rows per sample must be a positive integer” appears. What does this mean? Please teach me, thanks.

In a Two factor ANOVA there are two factors, which I will call Row and Column. Suppose the Row factor has 3 levels and the Column factor has 4 levels. If, say, there are 240 elements in the sample, with 20 elements in each combination of Row and Column levels (3 x 4 x 20 = 240), then the value for number of rows per sample is 20.

Charles

The math here is not correct, possible typo on the greater than sign>

(or F = 2.63 > 4.76 = F-crit)

Chris,

Thanks for catching this typo. It should indeed state (or F = 2.63 < 4.76 = F-crit). I have now corrected this mistake on the referenced webpage. Thanks again for bringing this to my attention. Charles

Should the Figure 2 labels read without replication?

Jeff,

Yes, you are correct. The caption for Figure 2 should read “without replication”. Thanks for catching this typing mistake. I have now corrected the caption on the webpage.

Charles

very nice website! It is good to learn Stats with easy-to-use samples.

Sir

I think there is a mistake about SSE in the table of definition 2. It may be a typo.

Sir

Sorry, you are right. The SSE formula is different with the textbook I read, but it is correct.

What happens to SSAB in the two factor without replication? Why is it not shown.

Ed,

Because it is Anova w/o replication SSAB becomes the error term SSW (or SSE).

Charles

Either figure 1 is incorrect or the opening paragraph, from figure 1 there are 4 crops and 3 blends, where as your opening paragraph states “four blends on one sample of each of the three types of crops”.

Thanks J for finding the typo. The figure is correct but the opening paragraph is not. It should state “three blends on one sample of each of the four types of crops”. The website has now been corrected. Thanks again for catching the error. Charles.

Could you please define two factor anova without replication.

This is defined on the referenced webpage.

Charles