Hopefully, the answer to this question is simple.
Why do I get different results when I am using Anova from the car package and the aov_ez from the afex package?
    library(car)
    library(afex)
    set.seed(100)
    dv <- runif(30,90,110)
    factor_a <- factor(rep(c("a","b"),15))
    factor_b <- factor(rep(c("f","m"),each=15))
    id <- factor(1:30)
    data <- data.frame(dv,factor_a,factor_b,id)
    data$dv[data$factor_b=="f" & data$factor_a=="a"] <- data$dv[data$factor_b=="f" & data$factor_a=="a"] *1.05
    aov_ez(dv="dv",between = c("factor_a","factor_b"), id="id",data=data )
    Anova(lm(dv~factor_a*factor_b, data=data),type = "III")
The results differ for the main effect of factor_b:
Using Anova, the main effect is significant; using aov_ez, it is not. How is that possible?
In both cases there is a significant interaction with very similar F-values, but the F-values of the main effects differ.
    > aov_ez(dv="dv",between = c("factor_a","factor_b"), id="id",data=data )
    Contrasts set to contr.sum for the following variables: factor_a, factor_b
    Anova Table (Type 3 tests)

    Response: dv
                 Effect    df   MSE       F ges p.value
    1          factor_a 1, 26 20.96  4.16 + .14     .05
    2          factor_b 1, 26 20.96    0.55 .02     .46
    3 factor_a:factor_b 1, 26 20.96 9.03 ** .26    .006
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘+’ 0.1 ‘ ’ 1

    > Anova(lm(dv~factor_a*factor_b, data=data),type = "III")
    Anova Table (Type III tests)

    Response: dv
                      Sum Sq Df   F value    Pr(>F)
    (Intercept)        89775  1 4282.2927 < 2.2e-16 ***
    factor_a             267  1   12.7159  0.001434 **
    factor_b             147  1    7.0288  0.013478 *
    factor_a:factor_b    189  1    9.0269  0.005822 **
    Residuals            545 26
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
I am quite new to R: which of these two packages should I use? And does anyone know why the results differ?
Best Answer
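First note that you have a between-subjects design with only dichotomous IVs, so every term has a single degree of freedom. (The design is close to balanced but not exactly so: the cell sizes are 8, 7, 7 and 8.) With lm's default dummy coding, Anova(lm(dv~factor_a*factor_b, data=data), type = "III") therefore produces the same $p$-values as $t$-tests of the regression coefficients, with each $F$ equal to the corresponding squared $t$: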
    > print(summary(lm(dv~factor_a*factor_b, data=data)))
    […]
    Coefficients:
                        Estimate Std. Error t value Pr(>|t|)
    (Intercept)          105.933      1.619  65.439  < 2e-16 ***
    factor_ab             -8.450      2.370  -3.566  0.00143 **
    factor_bm             -6.282      2.370  -2.651  0.01348 *
    factor_ab:factor_bm   10.069      3.351   3.004  0.00582 **
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    […]
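To make the link between the coefficient table and the Anova table concrete, here is a minimal base-R sketch. It rebuilds the data from the question, fits the model with dummy coding requested explicitly (so it does not depend on the session's contrasts option), and squares the $t$ statistics. The object names `sel`, `fit_dummy` and `f_from_t` are my own and not from the question. The squared values should line up with the F column of the Anova output above.

```r
# Recreate the data from the question (base R only)
set.seed(100)
dv <- runif(30, 90, 110)
factor_a <- factor(rep(c("a", "b"), 15))
factor_b <- factor(rep(c("f", "m"), each = 15))
data <- data.frame(dv, factor_a, factor_b, id = factor(1:30))
sel <- data$factor_b == "f" & data$factor_a == "a"
data$dv[sel] <- data$dv[sel] * 1.05

# Fit with dummy (treatment) coding, stated explicitly
fit_dummy <- lm(dv ~ factor_a * factor_b, data = data,
                contrasts = list(factor_a = "contr.treatment",
                                 factor_b = "contr.treatment"))

# For a term with one degree of freedom, the Type III F statistic
# is the square of the t statistic of its coefficient
t_dummy <- coef(summary(fit_dummy))[, "t value"]
f_from_t <- t_dummy^2
print(round(f_from_t[-1], 2))   # drop the intercept

# Number of observations in each factor_a x factor_b cell
print(table(data$factor_a, data$factor_b))
```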
So why does aov_ez give different $p$s? It seems that aov_ez codes the IVs with effects coding (contr.sum) by default, rather than dummy coding as in base R's lm; the message "Contrasts set to contr.sum for the following variables: factor_a, factor_b" tells you so. When we set check_contrasts to FALSE, aov_ez uses dummy coding and we get concordant $F$s and $p$s:
    > print(aov_ez(dv="dv",between = c("factor_a","factor_b"), id="id",data=data, check_contrasts = F ))
    Anova Table (Type 3 tests)

    Response: dv
                 Effect    df   MSE        F ges p.value
    1          factor_a 1, 26 20.96 12.72 ** .33    .001
    2          factor_b 1, 26 20.96   7.03 * .21     .01
    3 factor_a:factor_b 1, 26 20.96  9.03 ** .26    .006
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘+’ 0.1 ‘ ’ 1
    Warning message:
    Calculating Type 3 sums with contrasts != 'contr.sum' for: factor_a, factor_b, factor_a, factor_b
     Results likely bogus or not interpretable!
     You probably want check_contrasts = TRUE or options(contrasts=c('contr.sum','contr.poly'))
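The switch can also be made in the other direction, by giving lm effects coding instead of taking it away from aov_ez. The sketch below is only an illustration: `fit_sum` and `f_sum` are names I made up, and the car cross-check runs only if that package is installed. If the reasoning above is right, the squared $t$ values should come out close to the default aov_ez table in the question (roughly 4.16, 0.55 and 9.03).

```r
# Recreate the data from the question (base R only)
set.seed(100)
dv <- runif(30, 90, 110)
factor_a <- factor(rep(c("a", "b"), 15))
factor_b <- factor(rep(c("f", "m"), each = 15))
data <- data.frame(dv, factor_a, factor_b, id = factor(1:30))
sel <- data$factor_b == "f" & data$factor_a == "a"
data$dv[sel] <- data$dv[sel] * 1.05

# Same model, but with effects (sum-to-zero) coding for both factors
fit_sum <- lm(dv ~ factor_a * factor_b, data = data,
              contrasts = list(factor_a = "contr.sum",
                               factor_b = "contr.sum"))

# Squared t statistics of the one-df terms (intercept dropped)
f_sum <- coef(summary(fit_sum))[, "t value"]^2
print(round(f_sum[-1], 2))

# Optional cross-check with car::Anova, only if car is available
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Anova(fit_sum, type = "III"))
}
```

Setting the contrasts per model, as here, keeps the change local; options(contrasts=c('contr.sum','contr.poly')), as suggested in the warning, changes it for the whole session.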
Why exactly the package author thinks that dummy coding in this situation makes the results "likely bogus or not interpretable" goes further into the weeds of ANOVA than I'm familiar with. But at least now we know what aov_ez is doing.
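One way to get a feel for the warning, offered as a sketch rather than as the package author's reasoning, is to look at what the factor_b coefficient estimates under each coding. In a model with an interaction, the dummy-coded coefficient is the m-minus-f difference within factor_a == "a" only, whereas the sum-coded coefficient is half the f-minus-m difference averaged over both levels of factor_a. The names `cell`, `simple_b_at_a` and `main_b` are mine.

```r
# Recreate the data from the question (base R only)
set.seed(100)
dv <- runif(30, 90, 110)
factor_a <- factor(rep(c("a", "b"), 15))
factor_b <- factor(rep(c("f", "m"), each = 15))
data <- data.frame(dv, factor_a, factor_b, id = factor(1:30))
sel <- data$factor_b == "f" & data$factor_a == "a"
data$dv[sel] <- data$dv[sel] * 1.05

# Cell means: rows are factor_a levels, columns are factor_b levels
cell <- tapply(data$dv, list(data$factor_a, data$factor_b), mean)

fit_dummy <- lm(dv ~ factor_a * factor_b, data = data,
                contrasts = list(factor_a = "contr.treatment",
                                 factor_b = "contr.treatment"))
fit_sum <- lm(dv ~ factor_a * factor_b, data = data,
              contrasts = list(factor_a = "contr.sum",
                               factor_b = "contr.sum"))

# Dummy coding: effect of factor_b at the reference level of factor_a
simple_b_at_a <- cell["a", "m"] - cell["a", "f"]
print(c(coefficient = unname(coef(fit_dummy)["factor_bm"]),
        from_cells  = simple_b_at_a))

# Sum coding: half the difference between the unweighted marginal means
main_b <- (mean(cell[, "f"]) - mean(cell[, "m"])) / 2
print(c(coefficient = unname(coef(fit_sum)["factor_b1"]),
        from_cells  = main_b))
```

So the two tables in the question answer different questions about factor_b: a simple effect at one level of factor_a versus an effect averaged over factor_a. With a sizeable interaction, as here, those can differ a lot.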