I have read so many textbooks and R tutorials that I am going crazy here. How do you decide which model to use? I really hope this comes with experience, but with the number of modern techniques coming out and the evidence for and against transformations, etc., how is anyone supposed to actually create a model that produces the correct result?
All I want to know is whether there is a significant difference in the number of points in a plot covered with wood between two treatments (Low and High elephant impact). I would also like to know if any of the effects are significant. Each site has 5 plots (1,2,3,4,5). The number of points covered with wood was counted in each plot in 2013 and then again in 2014 and 2015. Therefore I have repeated measures.
My response variable is Number = number of points covered with wood.
My fixed effects or predictor variables are Year (2013, 2014, 2015) and Site (High and Low).
To account for the repeated measure, Year and Site are also my random effects. Or should this actually be Plot (1,2,3,4,5)?
The first option is to use a GLMM, as I have both random and fixed effects; because I have count data, I selected the Poisson family:
model<-glmer(Number~Year*Treatment+(1|Year:Treatment),data=data,family=poisson)
Firstly, can Year and Treatment act as both fixed and random effects in the same model? I haven't included plot as I'm assuming the repeated measure is actually Year; is that correct?
Secondly, if my data is not normally distributed, should I log-transform it and then run the GLMM?
Or should I rather leave it untransformed and use a linear mixed effects model (LME) instead?
model1<-lmer(Number~Year*Treatment+(1+Year|Treatment),data=data,REML=FALSE)
For the LME, should I stipulate a distribution? Or does it automatically use the Gaussian distribution (Normal distribution)?
Again, can Year and Treatment be both fixed and random effects?
Could this actually be non-linear?
Best Answer
If you have count data as the response variable, then you should be using a GLMM. A Poisson model is appropriate so long as the data are not over-dispersed or zero-inflated; if they are, you will need to consider other GLMMs.
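As a rough illustration of the over-dispersion check, one common heuristic is to compare the sum of squared Pearson residuals with the residual degrees of freedom; a ratio well above 1 hints at over-dispersion. The sketch below is only that, a sketch: the helper name overdispersion_ratio and the simulated data frame sim are made up for the example, and the demonstration uses a plain Poisson glm from base R so it runs without extra packages. The same helper should also work on a glmer fit, since lme4 provides residuals() and df.residual() methods for its models.

```r
# Hypothetical helper: ratio of the Pearson chi-square statistic to the
# residual degrees of freedom. Values near 1 are what a Poisson model expects.
overdispersion_ratio <- function(model) {
  pearson <- residuals(model, type = "pearson")
  sum(pearson^2) / df.residual(model)
}

# Simulated stand-in data (NOT the asker's data): 3 years x 2 treatments,
# 5 observations per combination, counts drawn from a true Poisson.
set.seed(42)
sim <- data.frame(
  Year      = factor(rep(2013:2015, each = 10)),
  Treatment = factor(rep(c("Low", "High"), times = 15))
)
sim$Number <- rpois(nrow(sim), lambda = 8)

fit   <- glm(Number ~ Year * Treatment, data = sim, family = poisson)
ratio <- overdispersion_ratio(fit)
print(ratio)
```

If the ratio is clearly larger than 1 on the real data, alternatives such as a negative binomial GLMM or an observation-level random effect are the usual next things to look at.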
If I understood the description correctly, then you have 3 repeated measures in 2 sites, where each site has 5 plots. So plots are nested within sites, but you don't have enough sites, or plots, to treat them as nested with the usual syntax (1|site/plot); that syntax expands to (1|site) + (1|site:plot), and a variance for site cannot be estimated reliably from only two sites. Instead you could use the combination of site and plot as the grouping factor, (1|site:plot). Treatment is clearly a fixed effect and there is no justification for treating it as random. There are only 3 years, so this can be treated as fixed too.
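To see why the grouping factor has to be the combination of site and plot, note that the plot labels 1 to 5 are reused in both sites, so plot on its own would wrongly treat plot 1 at the Low site and plot 1 at the High site as the same unit. A minimal base R sketch, assuming columns named site, plot and Year (the data frame dat and the column plot_id are invented for the illustration):

```r
# Hypothetical layout matching the description: 2 sites x 5 plots x 3 years.
dat <- expand.grid(
  plot = factor(1:5),
  site = factor(c("Low", "High")),
  Year = factor(2013:2015)
)

# Plot labels alone only distinguish 5 groups ...
n_plot_labels <- nlevels(dat$plot)

# ... whereas site:plot identifies the 10 physical plots.
dat$plot_id <- interaction(dat$site, dat$plot, sep = ":", drop = TRUE)
n_plots <- nlevels(dat$plot_id)

# Each physical plot should appear once per year, i.e. 3 times.
obs_per_plot <- table(dat$plot_id)

print(c(labels = n_plot_labels, plots = n_plots))
print(obs_per_plot)
```

Writing (1|site:plot) in the model formula builds this same combined factor internally, so creating plot_id by hand is optional; it is mainly useful for checking that the data really contain 10 plots with 3 observations each.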
So I would suggest a model such as:
glmer(Number~Year*Treatment+(1|site:plot),data=data,family=poisson)
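For the question of whether the effects are significant, one possible approach (not the only one) is a likelihood ratio test that compares the model above with a reduced model lacking the Year:Treatment interaction. The sketch below assumes the lme4 package is installed, treats Year as a factor, and uses simulated stand-in data; the objects dat, plot_effect, full, reduced and lrt are invented for the example and the numbers it produces say nothing about the real data.

```r
library(lme4)

# Simulated stand-in data (NOT the asker's data): 2 treatments x 5 plots x 3 years.
set.seed(1)
dat <- expand.grid(
  plot      = factor(1:5),
  Treatment = factor(c("Low", "High")),
  Year      = factor(2013:2015)
)
dat$site <- dat$Treatment  # assumption: each site corresponds to one treatment level

# One random intercept per physical plot (10 in total), on the log scale.
plot_index  <- as.integer(interaction(dat$site, dat$plot, drop = TRUE))
plot_effect <- rnorm(10, sd = 0.3)
dat$Number  <- rpois(nrow(dat), lambda = exp(2 + plot_effect[plot_index]))

# Suggested model, and the same model without the interaction.
full    <- glmer(Number ~ Year * Treatment + (1 | site:plot),
                 data = dat, family = poisson)
reduced <- update(full, . ~ . - Year:Treatment)

# Likelihood ratio test for the Year:Treatment interaction.
lrt <- anova(reduced, full)
print(lrt)
```

With so few plots the chi-square approximation behind this test is rough, so the resulting p-value is better read as a guide than as an exact answer.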