To help introduce the basic concepts we start with the following example.

**Example 1**: A new fertilizer has been developed to increase the yield on crops, and the makers of the fertilizer want to better understand which of the three formulations (blends) of this fertilizer is most effective for wheat, corn, soybeans and rice (the crops). They test each of the three blends on one sample of each of the four types of crops. The crop yields for the 12 combinations are as shown in Figure 1.

**Figure 1 – Data for Example 1**

We interrupt the analysis of this example to give some background, after which we will resume the analysis.

**Definition 1**: We define the **structural model** as follows.

A **factor** is an independent variable. A *k* factor ANOVA addresses *k* factors.

A **level** is some aspect of a factor; these are what we called groups or treatments in the one-factor analysis discussed in Basic Concepts for ANOVA.

In Example 1 there are two factors: blends and crops. The blend factor has 3 levels and the crop factor has 4 levels.

In general, suppose we have two factors A and B. Factor A has *r* levels and factor B has *c* levels. We organize the levels for factor A as rows and the levels for factor B as columns. We use the index *i* for the rows (i.e. factor A) and the index *j* for the columns (i.e. factor B). Thus we use an *r* × *c* table where the entries in the table are {*x_{ij}*: 1 ≤ *i* ≤ *r*, 1 ≤ *j* ≤ *c*}.

We use terms such as *x̄_{i.}* (or *x̄_{i}*) as an abbreviation for the mean of {*x_{ij}*: 1 ≤ *j* ≤ *c*}. Similarly, we use terms such as *x̄_{.j}* (or *x̄_{j}*) as an abbreviation for the mean of {*x_{ij}*: 1 ≤ *i* ≤ *r*}.

We estimate the level means from the total mean for factor A by *μ_{i.} = μ + α_{i}* where *α_{i}* denotes the effect of the *i*th level for factor A (i.e. the departure of the *i*th level mean *μ_{i.}* for factor A from the total mean *μ*). We have a similar estimate for the sample: *x̄_{i.} = x̄ + a_{i}*. Note that Σ *a_{i}* = 0.

Similarly we estimate the level means from the total mean for factor B by *μ_{.j} = μ + β_{j}* where *β_{j}* denotes the effect of the *j*th level for factor B (i.e. the departure of the *j*th level mean *μ_{.j}* for factor B from the total mean *μ*). We have a similar estimate for the sample: *x̄_{.j} = x̄ + b_{j}*. As for factor A, Σ *b_{j}* = 0.

The two-way ANOVA tests for the main effects of factor A and of factor B, namely

H_{0}: *μ _{1.} = μ_{2.} =⋯= μ_{r.} *(Factor A)

or

H_{0}: *μ. _{1} = μ._{2} =⋯= μ._{c} *(Factor B)

If testing for factor A, the null hypothesis is equivalent to

H_{0}: *α_{i}* = 0 for all *i*

If testing for factor B, the null hypothesis is equivalent to

H_{0}: *β_{j}* = 0 for all *j*

Finally, we can represent each element in the sample as *x_{ij} = μ + α_{i} + β_{j} + ε_{ij}* where *ε_{ij}* denotes the error (or unexplained amount). As before we have the sample version *x_{ij} = x̄ + a_{i} + b_{j} + e_{ij}* where *e_{ij}* is the counterpart to *ε_{ij}* in the sample.
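To make the sample decomposition concrete, here is a short Python (NumPy) sketch on a hypothetical 3 × 4 yield table; the numbers are invented for illustration and are not the Example 1 data. It computes the effects *a_{i}* and *b_{j}* and checks that the decomposition reproduces every observation and that the effects sum to zero:

```python
import numpy as np

# Hypothetical 3 x 4 yield table (rows = blends, columns = crops).
# These values are invented for illustration; they are NOT the Example 1 data.
x = np.array([[120.0, 100.0, 110.0, 105.0],
              [125.0, 118.0, 122.0, 119.0],
              [112.0, 108.0, 115.0, 109.0]])

x_bar = x.mean()                          # total mean
a = x.mean(axis=1) - x_bar                # row effects a_i
b = x.mean(axis=0) - x_bar                # column effects b_j
e = x - x_bar - a[:, None] - b[None, :]   # errors e_ij

# The decomposition x_ij = x_bar + a_i + b_j + e_ij reproduces every entry,
# and the effects each sum to zero.
print(np.allclose(x, x_bar + a[:, None] + b[None, :] + e))  # True
print(np.isclose(a.sum(), 0.0), np.isclose(b.sum(), 0.0))   # True True
```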

**Observation**: Since

**Definition 2**: Using the terminology of Definition 1, define

| Term | Definition | df | MS |
|------|------------|----|----|
| *SS_{Row}* | *c* Σ_{i} (*x̄_{i.}* − *x̄*)² | *df_{Row}* = *r* − 1 | *MS_{Row}* = *SS_{Row}* / *df_{Row}* |
| *SS_{Col}* | *r* Σ_{j} (*x̄_{.j}* − *x̄*)² | *df_{Col}* = *c* − 1 | *MS_{Col}* = *SS_{Col}* / *df_{Col}* |
| *SS_{E}* | Σ_{i,j} (*x_{ij}* − *x̄_{i.}* − *x̄_{.j}* + *x̄*)² | *df_{E}* = (*r* − 1)(*c* − 1) | *MS_{E}* = *SS_{E}* / *df_{E}* |
| *SS_{T}* | Σ_{i,j} (*x_{ij}* − *x̄*)² | *df_{T}* = *rc* − 1 | |

**Correction**: The term *x_{ij}* in the formula for *SS_{E}* in the above table should not have a bar over it.

**Property 1**: *SS_{T} = SS_{Row} + SS_{Col} + SS_{E}* and *df_{T} = df_{Row} + df_{Col} + df_{E}*

Proof: Clearly

*x_{ij} − x̄ = (x̄_{i.} − x̄) + (x̄_{.j} − x̄) + (x_{ij} − x̄_{i.} − x̄_{.j} + x̄)*

If we square both sides of the equation, sum over *i, j* and then simplify (with various terms equal to zero as in the proof of Property 2 of Basic Concepts for ANOVA), we get the first result. For the second, *df_{T} = rc − 1 = (r − 1) + (c − 1) + (r − 1)(c − 1) = df_{Row} + df_{Col} + df_{E}*.
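The partition of the sums of squares is easy to verify numerically. A short Python sketch (the table values are invented, not the Example 1 data):

```python
import numpy as np

# Invented r x c table (no replication), used only to check the identity.
x = np.array([[4.0, 6.0, 5.0, 7.0],
              [8.0, 9.0, 7.0, 10.0],
              [5.0, 6.0, 6.0, 7.0]])
r, c = x.shape
x_bar = x.mean()

ss_row = c * ((x.mean(axis=1) - x_bar) ** 2).sum()
ss_col = r * ((x.mean(axis=0) - x_bar) ** 2).sum()
resid = x - x.mean(axis=1, keepdims=True) - x.mean(axis=0, keepdims=True) + x_bar
ss_e = (resid ** 2).sum()
ss_t = ((x - x_bar) ** 2).sum()

# SS_T = SS_Row + SS_Col + SS_E, and the same identity for the df
print(np.isclose(ss_t, ss_row + ss_col + ss_e))            # True
print(r * c - 1 == (r - 1) + (c - 1) + (r - 1) * (c - 1))  # True
```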

**Property 2**: If a sample is made as described in Definition 1, with the *x_{ij}* independently and normally distributed and with all *μ_{i.}* (or all *μ_{.j}*) equal, then *SS_{Row}*/*σ²* (or *SS_{Col}*/*σ²*) and *SS_{E}*/*σ²* have independent chi-square distributions with *df_{Row}* (or *df_{Col}*) and *df_{E}* degrees of freedom respectively.

Proof: The proof is similar to that of Property 1 of Basic Concepts for ANOVA.

**Theorem 1**: Suppose a sample is made as described in Definitions 1 and 2, with the *x_{ij}* independently and normally distributed.

If all *μ_{i.}* are equal and all *σ_{i.}* are equal then *F = MS_{Row}/MS_{E}* ~ *F*(*df_{Row}*, *df_{E}*)

If all *μ_{.j}* are equal and all *σ_{.j}* are equal then *F = MS_{Col}/MS_{E}* ~ *F*(*df_{Col}*, *df_{E}*)

Proof: The result follows from Property 2 and Theorem 1 of F Distribution.

**Property 3**:

**Observation**: We use the following tests: *F = MS_{Row}/MS_{E}* with *df_{Row}*, *df_{E}* degrees of freedom for factor A, and *F = MS_{Col}/MS_{E}* with *df_{Col}*, *df_{E}* degrees of freedom for factor B.

Recall that the assumptions for using these tests are:

- All samples are drawn from normally distributed populations
- All populations have a common variance
- All samples were drawn independently from each other
- Within each sample, the observations were sampled randomly and independently of each other
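Putting the pieces together, the two tests can be sketched in a few lines of Python, assuming SciPy is available for the F distribution. The function name `anova2_no_rep` and the data table below are invented for illustration; the yields are not the Example 1 data:

```python
import numpy as np
from scipy.stats import f

def anova2_no_rep(x):
    """Two-factor ANOVA without replication on an r x c array.

    Returns ((F, p) for rows, (F, p) for columns)."""
    r, c = x.shape
    x_bar = x.mean()
    ss_row = c * ((x.mean(axis=1) - x_bar) ** 2).sum()
    ss_col = r * ((x.mean(axis=0) - x_bar) ** 2).sum()
    ss_e = ((x - x_bar) ** 2).sum() - ss_row - ss_col   # Property 1
    df_row, df_col, df_e = r - 1, c - 1, (r - 1) * (c - 1)
    f_row = (ss_row / df_row) / (ss_e / df_e)
    f_col = (ss_col / df_col) / (ss_e / df_e)
    # f.sf is the survival function (1 - cdf), i.e. the p-value
    return (f_row, f.sf(f_row, df_row, df_e)), (f_col, f.sf(f_col, df_col, df_e))

# Invented 3 x 4 table (3 blends x 4 crops); NOT the Example 1 data.
x = np.array([[120.0, 100.0, 110.0, 105.0],
              [125.0, 118.0, 122.0, 119.0],
              [112.0, 108.0, 115.0, 109.0]])
(f_a, p_a), (f_b, p_b) = anova2_no_rep(x)
print(f"rows: F = {f_a:.2f}, p = {p_a:.4f}")
print(f"cols: F = {f_b:.2f}, p = {p_b:.4f}")
```

This mirrors what Excel's data analysis tool computes, with the error term playing the role of the denominator in both tests.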

We now return to Example 1 and show how to conduct the required analysis using Excel’s **Anova: Two-factor Without Replication** data analysis tool.

**Example 1** (continued): The output from the data analysis tool is shown in Figure 2.

**Figure 2 – Two factor ANOVA without replication data analysis tool**

There are two null hypotheses: one for the rows and the other for the columns. Let’s look first at the rows:

H_{0}: there is no significant difference in yield between the (population) means of the blends

Since the p-value for the rows = .0068 < .05 = *α* (or *F* = 12.83 > 5.14 = *F-crit*) we reject the null hypothesis, and so at the 95% level of confidence we conclude there is a significant difference in the yields produced by the three blends.

The null hypothesis for the columns is

H_{0}: there is no significant difference in yield between the (population) means for the crop types

Since the p-value for the columns = .1446 > .05 = *α* (or *F* = 2.63 < 4.76 = *F-crit*) we can’t reject the null hypothesis, and so at the 95% level of confidence we conclude there is no significant difference in the yields for the four crops studied.
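The critical values in Figure 2 can also be reproduced outside Excel, e.g. with SciPy’s F distribution:

```python
from scipy.stats import f

# Example 1 has r = 3 blends and c = 4 crops, so the rows test has
# df = (r - 1, (r - 1)(c - 1)) = (2, 6) and the columns test has df = (3, 6).
print(round(f.ppf(0.95, 2, 6), 2))  # 5.14 (F-crit for rows)
print(round(f.ppf(0.95, 3, 6), 2))  # 4.76 (F-crit for columns)
```

`f.ppf(0.95, df1, df2)` is the inverse cdf, matching Excel’s F.INV(0.95, df1, df2).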

**Observation**: Although the analysis in Figure 2 was produced automatically by Excel’s data analysis tool, the same result can be produced using Excel formulas, just as we were able to do in Basic Concepts for ANOVA for one-way ANOVA. The most interesting cells are the ones corresponding to the four sums of squares. We show how to calculate the values for each of those cells in Figure 3.

**Figure 3 – Key formulas for analysis from Figure 2**

The formulas for calculating *SS_{Row}* and *SS_{Col}* in Definition 2 involve taking squared deviations of the group means. E.g. *SS_{Row}* can be calculated via the formula =DEVSQ(I6:I8)/H6. Alternatively we can take squared deviations from the sums of each group, as is done in Figure 3.
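The equivalence of the two approaches is easy to check: since each row sum is *c* times the row mean, DEVSQ of the sums is *c*² times DEVSQ of the means. A small Python sketch (the values below are illustrative; the cell references I6:I8 and H6 belong to Figure 3 and are not reproduced here):

```python
import numpy as np

# Excel's DEVSQ(range) is the sum of squared deviations from the mean.
def devsq(v):
    v = np.asarray(v, dtype=float)
    return ((v - v.mean()) ** 2).sum()

# Illustrative row means for a table with c = 4 columns;
# the corresponding row sums are s_i = c * m_i.
c = 4
means = np.array([5.5, 8.5, 6.0])
sums = c * means

# SS_Row from the means needs a factor of c; from the sums, a divisor of c.
print(np.isclose(c * devsq(means), devsq(sums) / c))  # True
```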

**Real Statistics Excel Capabilities**: The Real Statistics Resource Pack contains a number of supplemental functions and the Two Factor ANOVA data analysis tool which support Two Factor ANOVA without Replication. You can get more information about these in Two Factor ANOVA with Replication.

Either Figure 1 is incorrect or the opening paragraph is: from Figure 1 there are 4 crops and 3 blends, whereas your opening paragraph states “four blends on one sample of each of the three types of crops”.

Thanks J for finding the typo. The figure is correct but the opening paragraph is not. It should state “three blends on one sample of each of the four types of crops”. The website has now been corrected. Thanks again for catching the error. Charles.

Could you please define two-factor ANOVA without replication?

This is defined on the referenced webpage.

Charles

What happens to SSAB in the two-factor without replication case? Why is it not shown?

Ed,

Because it is ANOVA without replication, SSAB becomes the error term SSW (or SSE).

Charles

Sir

I think there is a mistake in the SSE formula in the table of Definition 2. It may be a typo.

Sir

Sorry, you are right. The SSE formula is different from the one in the textbook I read, but it is correct.

Very nice website! It is good to learn stats with easy-to-use examples.

Should the Figure 2 labels read without replication?

Jeff,

Yes, you are correct. The caption for Figure 2 should read “without replication”. Thanks for catching this typing mistake. I have now corrected the caption on the webpage.

Charles

The math here is not correct; possibly a typo on the greater-than sign:

(or F = 2.63 > 4.76 = F-crit)

Chris,

Thanks for catching this typo. It should indeed state (or F = 2.63 < 4.76 = F-crit). I have now corrected this mistake on the referenced webpage. Thanks again for bringing this to my attention.

Charles