How to justify the error term in factorial ANOVA

A probably very basic question about multi-factorial ANOVA. Assume a two-way design where we test both main effects, A and B, and the interaction A:B. When testing the main effect for A with type I SS, the effect SS is calculated as the difference $RSS(1) - RSS(A)$, where $RSS(1)$ is the residual sum of squares for the model with just the intercept, and $RSS(A)$ is the RSS for the model with factor A added. My question concerns the choice of the error term:

How do you justify that the error term for this test is typically calculated from the RSS of the full model A + B + A:B that includes both main effects and the interaction?
$$F_{A} = \frac{(RSS_{1} - RSS_{A}) \,/\, (df_{RSS_{1}} - df_{RSS_{A}})}{RSS_{A+B+A:B} \,/\, df_{RSS_{A+B+A:B}}}$$

… as opposed to taking the error term from the unrestricted model from the actual comparison (RSS from just the main effect A in the above case):
$$F_{A} = \frac{(RSS_{1} - RSS_{A}) \,/\, (df_{RSS_{1}} - df_{RSS_{A}})}{RSS_{A} \,/\, df_{RSS_{A}}}$$

This makes a difference, as the error term from the full model is often (though not always) smaller than the error term from the unrestricted model in the comparison. The choice of error term thus seems somewhat arbitrary, creating room for desired p-value changes just by adding or removing factors that aren't really of interest but change the error term anyway.

In the following example, the F-value for A changes considerably depending on the choice for the full model, even though the actual comparison for the effect SS stays the same.

```r
> DV  <- c(41,43,50, 51,43,53,54,46, 45,55,56,60,58,62,62,
+          56,47,45,46,49, 58,54,49,61,52,62, 59,55,68,63,
+          43,56,48,46,47, 59,46,58,54, 55,69,63,56,62,67)
> IV1 <- factor(rep(1:3, c(3+5+7, 5+6+4, 5+4+6)))
> IV2 <- factor(rep(rep(1:3, 3), c(3,5,7, 5,6,4, 5,4,6)))
> anova(lm(DV ~ IV1))                           # full model = unrestricted model (just A)
          Df  Sum Sq Mean Sq F value Pr(>F)
IV1        2  101.11  50.556  0.9342 0.4009
Residuals 42 2272.80  54.114
> anova(lm(DV ~ IV1 + IV2))                     # full model = A+B
          Df  Sum Sq Mean Sq F value    Pr(>F)
IV1        2  101.11   50.56  1.9833    0.1509
IV2        2 1253.19  626.59 24.5817  1.09e-07 ***
Residuals 40 1019.61   25.49
> anova(lm(DV ~ IV1 + IV2 + IV1:IV2))           # full model = A+B+A:B
          Df  Sum Sq Mean Sq F value    Pr(>F)
IV1        2  101.11   50.56  1.8102    0.1782
IV2        2 1253.19  626.59 22.4357 4.711e-07 ***
IV1:IV2    4   14.19    3.55  0.1270    0.9717
Residuals 36 1005.42   27.93
```
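For concreteness, the two F ratios from the formulas above can be reproduced "by hand" from the residual sums of squares, using the same data as in the transcript:

```r
# Same data as in the transcript; compute F for factor A under both
# choices of error term.
DV  <- c(41,43,50, 51,43,53,54,46, 45,55,56,60,58,62,62,
         56,47,45,46,49, 58,54,49,61,52,62, 59,55,68,63,
         43,56,48,46,47, 59,46,58,54, 55,69,63,56,62,67)
IV1 <- factor(rep(1:3, c(3+5+7, 5+6+4, 5+4+6)))
IV2 <- factor(rep(rep(1:3, 3), c(3,5,7, 5,6,4, 5,4,6)))

m1  <- lm(DV ~ 1)                     # intercept only
mA  <- lm(DV ~ IV1)                   # unrestricted model of the comparison
mAB <- lm(DV ~ IV1 + IV2 + IV1:IV2)   # full model A + B + A:B

ssA <- deviance(m1) - deviance(mA)    # effect SS for A (type I)
dfA <- df.residual(m1) - df.residual(mA)

# Error term from the full model (what anova() reports above):
F_full  <- (ssA / dfA) / (deviance(mAB) / df.residual(mAB))
# Error term from the unrestricted model of the comparison itself:
F_unres <- (ssA / dfA) / (deviance(mA) / df.residual(mA))
round(c(F_full = F_full, F_unres = F_unres), 4)   # 1.8102 and 0.9342
```

The two values match the IV1 rows of the third and the first `anova()` table above, respectively.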

The same question applies to type II SS, and more generally to any general linear hypothesis, i.e., to a model comparison between a restricted and an unrestricted model nested within a full model. (For type III SS, the unrestricted model is always the full model, so the question doesn't arise there.)
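Such a nested model comparison can be carried out directly with `anova()` on two fitted models; note that it then uses the unrestricted model's own error term (the second formula above), not that of some larger full model. A minimal sketch with simulated data (all names and numbers are illustrative):

```r
# Simulated balanced two-way layout; test factor A via a nested comparison.
set.seed(1)
IV1 <- gl(3, 10)              # factor A, 3 levels
IV2 <- gl(2, 5, 30)           # factor B, 2 levels, crossed with A
DV  <- rnorm(30) + 2 * as.numeric(IV2)

mR <- lm(DV ~ IV2)            # restricted model: A dropped
mU <- lm(DV ~ IV1 + IV2)      # unrestricted model of this comparison
print(anova(mR, mU))          # test of A, error term taken from mU

# The F above equals (Delta RSS / Delta df) / (RSS_U / df_U):
F_manual <- ((deviance(mR) - deviance(mU)) / (df.residual(mR) - df.residual(mU))) /
            (deviance(mU) / df.residual(mU))
```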

This is a very old question, and I believe that @gung's answer is very good (+1). But as it was not entirely convincing for @caracal, and as I don't fully follow all its intricacies either, I would like to provide a simple figure illustrating how I understand the issue.

Consider a two-way ANOVA (factor A has three levels, factor B has two levels) with both factors being obviously very significant:

[Figure: factorial ANOVA sums of squares, showing the six Gaussians of the 3×2 design]

SS for factor A is huge. SS for factor B is much smaller, but from the top figure it is clear that factor B is nevertheless very significant as well.

Error SS for the model containing both factors is given by the spread within each of the six Gaussians (one per cell of the 3×2 design), and when comparing SS for factor B with this error SS, the test will conclude that factor B is significant.

Error SS for the model containing only factor B, however, is massive! Comparing SS for factor B with this massive error SS will definitely result in B appearing not significant. Which is clearly not the case.

That is why it makes sense to use error SS from the full model.
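The figure's argument can also be checked numerically. A minimal sketch (hypothetical effect sizes: a huge A effect, a small but real B effect):

```r
# Factor A has a huge effect, factor B a small but real one. Testing B against
# the B-only model's error term buries B's effect in A's unexplained variance;
# the error term from the model including A recovers it.
set.seed(42)
A  <- gl(3, 40)                       # 3 levels, 40 obs each
B  <- gl(2, 20, 120)                  # 2 levels, crossed with A
DV <- c(0, 20, 40)[A] + c(0, 1.5)[B] + rnorm(120)

anova(lm(DV ~ B))       # error SS still contains all of A's variance: B looks n.s.
anova(lm(DV ~ A + B))   # error SS from the fuller model: B clearly significant
```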
