Solved – ANOVA using count data

I'd like to perform a two-way ANOVA on count data. I was told to first fit a GLM and then do the ANOVA. My first problem is that

fit1 <- aov(glm(Branches~Accession*Location, data=branches, family=quasipoisson)) summary(fit1) 

and

fit2 <- glm(Branches~Accession*Location, data=branches, family=quasipoisson) Anova(fit2, test="F") 

don't result in the same p-values. What is the mistake here? Which is the right way of doing this, or is it wrong to do an ANOVA following a GLM anyway?

My second problem is that I don't know how I can do a post hoc test. For example, when I do a Tukey's test, should I use the ANOVA model of the GLM or the GLM itself?

I don't know how aov handles glm objects, but the documentation for aov mentions only lm objects.

My advice, then, is to not use aov, but just use car::Anova directly to produce an analysis of deviance table. Another option is emmeans::joint_tests.

For post-hoc testing, I recommend the emmeans package, since it explicitly lists supported model objects.

if(!require(car)){install.packages("car")} if(!require(emmeans)){install.packages("emmeans")}  set.seed(1234)  Accession = factor(rep(c("1", "2", "3"), 1, each=6)) Location  = factor(rep(c("A", "B"), 9))  Branches  = as.numeric(Accession) * 2 +            as.numeric(Location) + rnorm(length(Accession), 0, 1)  Branches = round(Branches)  branches = data.frame(Accession, Location, Branches)   str(branches)  # # #  fit1 <- glm(Branches~Accession*Location, data=branches, family=quasipoisson)  summary(fit1)  library(car)  Anova(fit2)     ### Analysis of Deviance Table (Type II tests)    ###     ### Response: Branches    ###                   LR Chisq Df Pr(>Chisq)        ### Accession            36.153  2  1.411e-08 ***    ### Location              5.912  1    0.01504 *      ### Accession:Location    1.841  2    0.39837        ### ---    ### Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1   library(emmeans)   marginal = emmeans(fit1, ~ Accession)  pairs(marginal)     ### NOTE: Results may be misleading due to involvement in interactions    ###    ###  contrast   estimate        SE  df z.ratio p.value    ###  1 - 2    -0.3389399 0.1363010 Inf  -2.487  0.0345    ###  1 - 3    -0.7680533 0.1260328 Inf  -6.094  <.0001    ###  2 - 3    -0.4291135 0.1128866 Inf  -3.801  0.0004    ###     ### Results are averaged over the levels of: Location     ### Results are given on the log (not the response) scale.     ### P value adjustment: tukey method for comparing a family of 3 estimates  

Similar Posts:

Rate this post

Leave a Comment