A probably very basic question about multi-factorial ANOVA. Assume a two-way design where we test both main effects A, B, and the interaction A:B. When testing the main effect for A with type I SS, the effect SS is calculated as the difference $RSS_{1} - RSS_{A}$, where $RSS_{1}$ is the residual sum of squares for the model with just the intercept, and $RSS_{A}$ is the RSS for the model with factor A added. My question concerns the choice of the error term:

**How do you justify that the error term for this test is typically calculated from the RSS of the full model A + B + A:B that includes both main effects and the interaction?**

$$
F_{A} = \frac{(RSS_{1} - RSS_{A}) / (df_{RSS_{1}} - df_{RSS_{A}})}{RSS_{A+B+A:B} / df_{RSS_{A+B+A:B}}}
$$

… as opposed to taking the error term from the unrestricted model from the actual comparison (RSS from just the main effect A in the above case):

$$
F_{A} = \frac{(RSS_{1} - RSS_{A}) / (df_{RSS_{1}} - df_{RSS_{A}})}{RSS_{A} / df_{RSS_{A}}}
$$

This makes a difference, as the error term from the full model is often (though not always) smaller than the error term from the unrestricted model in the comparison. The choice of error term therefore seems somewhat arbitrary: it leaves room to shift the p-value in a desired direction just by adding or removing factors that aren't really of interest but change the error term anyway.

In the following example, the F-value for A changes considerably depending on the choice for the full model, even though the actual comparison for the effect SS stays the same.

    > DV  <- c(41,43,50, 51,43,53,54,46, 45,55,56,60,58,62,62,
    +          56,47,45,46,49, 58,54,49,61,52,62, 59,55,68,63,
    +          43,56,48,46,47, 59,46,58,54, 55,69,63,56,62,67)
    > IV1 <- factor(rep(1:3, c(3+5+7, 5+6+4, 5+4+6)))
    > IV2 <- factor(rep(rep(1:3, 3), c(3,5,7, 5,6,4, 5,4,6)))
    > anova(lm(DV ~ IV1))                 # full model = unrestricted model (just A)
              Df  Sum Sq Mean Sq F value Pr(>F)
    IV1        2  101.11  50.556  0.9342 0.4009
    Residuals 42 2272.80  54.114

    > anova(lm(DV ~ IV1 + IV2))           # full model = A+B
              Df  Sum Sq Mean Sq F value   Pr(>F)
    IV1        2  101.11   50.56  1.9833   0.1509
    IV2        2 1253.19  626.59 24.5817 1.09e-07 ***
    Residuals 40 1019.61   25.49

    > anova(lm(DV ~ IV1 + IV2 + IV1:IV2)) # full model = A+B+A:B
              Df  Sum Sq Mean Sq F value    Pr(>F)
    IV1        2  101.11   50.56  1.8102    0.1782
    IV2        2 1253.19  626.59 22.4357 4.711e-07 ***
    IV1:IV2    4   14.19    3.55  0.1270    0.9717
    Residuals 36 1005.42   27.93
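To make the role of the error term explicit, the following sketch recomputes the F-value for A by hand from the residual sums of squares, holding the numerator $RSS_{1} - RSS_{A}$ fixed and swapping only the model that supplies the denominator. The helper `f_for_A` and the object names are made up for illustration; `deviance()` is used because for an `lm` fit it returns the RSS. If the reasoning above is right, the three rows should match the F-values for `IV1` in the three `anova()` tables.

```r
# Data from the example above
DV  <- c(41,43,50, 51,43,53,54,46, 45,55,56,60,58,62,62,
         56,47,45,46,49, 58,54,49,61,52,62, 59,55,68,63,
         43,56,48,46,47, 59,46,58,54, 55,69,63,56,62,67)
IV1 <- factor(rep(1:3, c(3+5+7, 5+6+4, 5+4+6)))
IV2 <- factor(rep(rep(1:3, 3), c(3,5,7, 5,6,4, 5,4,6)))

fit_null <- lm(DV ~ 1)                    # intercept only
fit_A    <- lm(DV ~ IV1)                  # unrestricted model of the comparison
fit_AB   <- lm(DV ~ IV1 + IV2)            # candidate "full" model A+B
fit_full <- lm(DV ~ IV1 + IV2 + IV1:IV2)  # candidate "full" model A+B+A:B

# Numerator: type I effect SS for A, identical in all three tests
ss_A <- deviance(fit_null) - deviance(fit_A)
df_A <- df.residual(fit_null) - df.residual(fit_A)

# Hypothetical helper: F test for A with the error term taken from 'error_fit'
f_for_A <- function(error_fit) {
  ms_error <- deviance(error_fit) / df.residual(error_fit)
  f_val <- (ss_A / df_A) / ms_error
  p_val <- pf(f_val, df_A, df.residual(error_fit), lower.tail = FALSE)
  c(F = f_val, p = p_val)
}

res <- rbind("A"       = f_for_A(fit_A),
             "A+B"     = f_for_A(fit_AB),
             "A+B+A:B" = f_for_A(fit_full))
print(round(res, 4))
```

Only the denominator (and its degrees of freedom) differs between the rows, which is exactly the choice the question is about.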

The same question applies to type II SS, and more generally to any general linear hypothesis, i.e., to a model comparison between a restricted and an unrestricted model within a full model. (For type III SS, the unrestricted model is always the full model, so the question doesn't arise there.)


#### Best Answer

*This is a very old question, and I believe that @gung's answer is very good (+1). But as it was not entirely convincing for @caracal, and as I don't fully follow all its intricacies either, I would like to provide a simple figure illustrating how I understand the issue.*

Consider a two-way ANOVA (factor A has three levels, factor B has two levels) with both factors being obviously very significant:

SS for factor A is huge. SS for factor B is much smaller, but from the top figure it is clear that factor B is nevertheless very significant as well.

Error SS for the model containing both factors is represented by one of six Gaussians, and when comparing SS for factor B with this error SS, the test will conclude that factor B is significant.

Error SS for the model containing only factor B, however, is massive, because it absorbs all the variation due to factor A. Comparing SS for factor B with this massive error SS will definitely make B appear non-significant, which is clearly not the case.

That is why it makes sense to use error SS from the full model.
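The argument can be checked with a small simulation. This is only a sketch under assumed settings, none of which come from the original figure: a balanced 3 x 2 design, cell means shifted by 0/10/20 for A and by 0/1 for B, unit-variance Gaussian noise, 20 observations per cell, and an arbitrary seed. It tests B once against the error term of the B-only model and once against the error term of the full model.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Assumed design: 3 levels of A, 2 levels of B, 20 replicates per cell
n_per_cell <- 20
d <- expand.grid(rep = seq_len(n_per_cell), A = factor(1:3), B = factor(1:2))

# Assumed effects: A is huge, B is small but real, no interaction
a_eff <- c(0, 10, 20)[as.integer(d$A)]
b_eff <- c(0, 1)[as.integer(d$B)]
d$y <- a_eff + b_eff + rnorm(nrow(d), sd = 1)

fit_null <- lm(y ~ 1, data = d)
fit_B    <- lm(y ~ B, data = d)          # model with only factor B
fit_full <- lm(y ~ A + B + A:B, data = d)

# Effect SS for B (B entered first; the design is balanced)
ss_B <- deviance(fit_null) - deviance(fit_B)
df_B <- df.residual(fit_null) - df.residual(fit_B)

# Error mean squares from the two candidate models
ms_err_B    <- deviance(fit_B)    / df.residual(fit_B)
ms_err_full <- deviance(fit_full) / df.residual(fit_full)

F_Bonly <- (ss_B / df_B) / ms_err_B
F_full  <- (ss_B / df_B) / ms_err_full
p_Bonly <- pf(F_Bonly, df_B, df.residual(fit_B),    lower.tail = FALSE)
p_full  <- pf(F_full,  df_B, df.residual(fit_full), lower.tail = FALSE)

print(c(ms_err_B = ms_err_B, ms_err_full = ms_err_full))
print(c(F_Bonly = F_Bonly, p_Bonly = p_Bonly, F_full = F_full, p_full = p_full))
```

Under these assumptions the error mean square of the B-only model is inflated by the variation due to A, so the same effect SS for B yields a far smaller F than it does against the full-model error term.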
