I have been trying to understand the MATLAB function `anovan`, which performs n-way ANOVA, in order to test the effects of multiple factors on my data. What caught my eye on the help page for this function is that, in its example, the p-value for factor X1 changes from non-significant (p > 0.05) to significant (p < 0.05) when the model is changed from the default (`'linear'`) to `'interaction'`.
How should I interpret this result? Also, if I don't know whether the two factors I'm testing interact, which model should I use?
Best Answer
I'll use the example in the linked help page with dependent variable `y` and factors `g1`, `g2`, `g3`.
When the `model` option is set to `linear`, no interaction terms are considered in the full model. For type III sums of squares (the default in `anovan`), the full model is always `y ~ g1 + g2 + g3`. The restricted models for the model comparisons in the 3 tests of main effects are as follows:
- `g1`: restricted model `y ~ g2 + g3`
- `g2`: restricted model `y ~ g1 + g3`
- `g3`: restricted model `y ~ g1 + g2`
When the `model` option is set to `interaction`, pairwise interactions are considered in the full model. For type III sums of squares, the full model is now always `y ~ g1 + g2 + g3 + g1:g2 + g1:g3 + g2:g3`. The restricted models for the model comparisons in the 3 tests of main effects are now as follows:
- `g1`: restricted model `y ~ g2 + g3 + g1:g2 + g1:g3 + g2:g3`
- `g2`: restricted model `y ~ g1 + g3 + g1:g2 + g1:g3 + g2:g3`
- `g3`: restricted model `y ~ g1 + g2 + g1:g2 + g1:g3 + g2:g3`
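One way to get a feel for the two full models is to fit both and look at how much residual variation and how many residual degrees of freedom each leaves. This is a minimal R sketch using the help-page data; the names `full_lin`, `full_int`, and `error_terms` are my own and are not part of `anovan`.

```r
# Data from the anovan help-page example
y  <- c(52.7, 57.5, 45.9, 44.5, 53.0, 57.0, 45.9, 44.0)
g1 <- factor(c(1, 2, 1, 2, 1, 2, 1, 2))
g2 <- factor(c('hi', 'hi', 'lo', 'lo', 'hi', 'hi', 'lo', 'lo'))
g3 <- factor(c('may', 'may', 'may', 'may', 'june', 'june', 'june', 'june'))

# Full model without interactions (corresponds to model = 'linear')
full_lin <- lm(y ~ g1 + g2 + g3)

# Full model with all pairwise interactions (corresponds to model = 'interaction')
full_int <- lm(y ~ g1 + g2 + g3 + g1:g2 + g1:g3 + g2:g3)

# Error (residual) sum of squares and residual degrees of freedom of each full model
error_terms <- data.frame(
  model    = c("linear", "interaction"),
  error_ss = c(deviance(full_lin), deviance(full_int)),
  error_df = c(df.residual(full_lin), df.residual(full_int))
)
print(error_terms)
```

With only 8 observations, the model with pairwise interactions estimates 7 coefficients, so it leaves a single residual degree of freedom.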
The ANOVA F-test is based on the difference in error sum of squares of the full and restricted models, relative to the error sum of squares of the full model.
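To make that comparison concrete, the F statistic for `g1` under the no-interaction model can be computed by hand from the two error sums of squares and checked against `anova()`. This is only a sketch under the model-comparison view described here; the variable names (`rss_r`, `rss_f`, `f_stat`, and so on) are illustrative.

```r
# Data from the anovan help-page example
y  <- c(52.7, 57.5, 45.9, 44.5, 53.0, 57.0, 45.9, 44.0)
g1 <- factor(c(1, 2, 1, 2, 1, 2, 1, 2))
g2 <- factor(c('hi', 'hi', 'lo', 'lo', 'hi', 'hi', 'lo', 'lo'))
g3 <- factor(c('may', 'may', 'may', 'may', 'june', 'june', 'june', 'june'))

# Test of g1 without interactions: restricted model drops g1
restricted <- lm(y ~ g2 + g3)
full       <- lm(y ~ g1 + g2 + g3)

# Error sums of squares and residual degrees of freedom
rss_r <- deviance(restricted); df_r <- df.residual(restricted)
rss_f <- deviance(full);       df_f <- df.residual(full)

# F = (difference in error SS per dropped df) / (error mean square of the full model)
f_stat <- ((rss_r - rss_f) / (df_r - df_f)) / (rss_f / df_f)
p_val  <- pf(f_stat, df_r - df_f, df_f, lower.tail = FALSE)

# The built-in model comparison should give the same F and p
builtin <- anova(restricted, full)
print(c(F = f_stat, p = p_val))
print(builtin)
```

The denominator uses the error term of the full model in the comparison, so changing what counts as the full model changes the test even when the numerator stays the same.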
Since the model comparisons for the main effects differ between the two cases, their p-values differ as well. This should not be the case when you use type II or type I sums of squares, since including interactions does not change their choices of full and restricted models for the tests of main effects. So which set of comparisons should you choose? I'm afraid this has to follow from theoretical considerations, i.e., from which comparisons actually test the hypotheses you have.
The help page example in R:
```r
y  <- c(52.7, 57.5, 45.9, 44.5, 53.0, 57.0, 45.9, 44.0)
g1 <- factor(c(1, 2, 1, 2, 1, 2, 1, 2))
g2 <- factor(c('hi', 'hi', 'lo', 'lo', 'hi', 'hi', 'lo', 'lo'))
g3 <- factor(c('may', 'may', 'may', 'may', 'june', 'june', 'june', 'june'))

# don't consider interactions to replicate MATLAB's model='linear' option
# model comparisons for tests of main effects with SS type III
anova(lm(y ~ g2 + g3), lm(y ~ g1 + g2 + g3))
anova(lm(y ~ g1 + g3), lm(y ~ g1 + g2 + g3))
anova(lm(y ~ g1 + g2), lm(y ~ g1 + g2 + g3))

# consider pairwise interactions, test main effects and interactions with SS type III
# replicate MATLAB's model='interaction' option, switch to effect-coding first
options(contrasts=c(unordered="contr.sum", ordered="contr.poly"))
drop1(lm(y ~ g1 + g2 + g3 + g1:g2 + g1:g3 + g2:g3), ~ ., test="F")
```