In our study we are looking at the change in the numbers of acoustically tagged fish detections with respect to tidal state (ebbing, flowing), light period (dawn, day, dusk, night) and month (February, March, April, May, June). Detection data is in the form of counts per light stage and corresponding tidal stage and month. Detections are transformed through log transformation and tide, light and month are all set as factors.
I've followed the protocol by Zuur et al (2009) and am therefore using mixed effects models to account for the repeated measures by each unique fish (fish.id is the random effect).
My issue arises at the model selection stage where anova of the models with sequential fixed effects left out:
lmer.1 <- lmer(logdetections ~ tide + light + month + tide:light + (1|fish.id), data=Sea_Bass_Data_2log, REML=FALSE) lmer.2 <- lmer(logdetections ~ light + month + tide:light + (1|fish.id), data=Sea_Bass_Data_2log, REML=FALSE) lmer.3 <- lmer(logdetections ~ tide + month + tide:light + (1|fish.id), data=Sea_Bass_Data_2log, REML=FALSE) lmer.4 <- lmer(logdetections ~ tide + light + tide:light + (1|fish.id), data=Sea_Bass_Data_2log, REML=FALSE) lmer.5 <- lmer(logdetections ~ tide + light + month + (1|fish.id), data=Sea_Bass_Data_2log, REML=FALSE) lmer.6 <- lmer(logdetections ~ 1 + (1|fish.id), data=Sea_Bass_Data_2log, REML=FALSE)
An anova of the above models finds that 3 models have the same AIC value and weights.
> anova(lmer.1, lmer.2, lmer.3, lmer.4, lmer.5, lmer.6) Data: Sea_Bass_Data_2log Models: lmer.6: logdetections ~ 1 + (1 | fish.id) lmer.4: logdetections ~ tide + light + tide:light + (1 | fish.id) lmer.5: logdetections ~ tide + light + month + (1 | fish.id) lmer.1: logdetections ~ tide + light + month + tide:light + (1 | fish.id) lmer.2: logdetections ~ light + month + tide:light + (1 | fish.id) lmer.3: logdetections ~ tide + month + tide:light + (1 | fish.id) Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq) lmer.6 3 450.23 461.86 -222.12 444.23 lmer.4 10 390.65 429.43 -185.32 370.65 73.583 7 2.781e-13 *** lmer.5 11 382.41 425.06 -180.20 360.41 10.242 1 0.001373 ** lmer.1 14 378.02 432.31 -175.01 350.02 10.383 3 0.015579 * lmer.2 14 378.02 432.31 -175.01 350.02 0.000 0 < 2.2e-16 *** lmer.3 14 378.02 432.31 -175.01 350.02 0.000 0 1.000000 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
I've used the "bbmle" package to look at AICc and weights and find the same result:
> bbmle::AICctab(lmer.1, lmer.2, lmer.3, lmer.4, lmer.5, lmer.6, base = T, weights = T) AICc dAICc df weight lmer.2 379.3 0.0 14 0.318 lmer.3 379.3 0.0 14 0.318 lmer.1 379.3 0.0 14 0.318 lmer.5 383.2 3.9 11 0.045 lmer.4 391.3 12.0 10 <0.001 lmer.6 450.3 71.0 3 <0.001
I then used the "MuMIn" package function model.avg but am no closer to understanding its output for model selection.
> model.avg(lmer.1, lmer.2, lmer.3) Call: model.avg(object = lmer.1, lmer.2, lmer.3) Component models: ‘124’ ‘234’ ‘1234’ Coefficients: (Intercept) lightday lightdusk lightnight month3 month4 month5 full 0.2603615 0.4141133 0.09135152 0.2023542 0.1938293 0.5564969 0.4062781 subset 0.2603615 0.6211699 0.13702728 0.3035313 0.1938293 0.5564969 0.4062781 month6 lightdawn:tideFlooding lightday:tideFlooding lightdusk:tideFlooding full 0.3912216 0.002832931 -0.1251204 -0.09993087 subset 0.3912216 0.008498793 -0.1251204 -0.09993087 lightnight:tideFlooding tideFlooding lightday:tideEbbing lightday:tideFlooding full 0.03964991 0.005665860 0.2070566 -0.1251204 subset 0.03964991 0.008498789 0.6211699 -0.1251204 lightdusk:tideEbbing lightdusk:tideFlooding lightnight:tideEbbing full 0.04567576 -0.09993087 0.1011771 subset 0.13702728 -0.09993087 0.3035313 lightnight:tideFlooding full 0.03964991 subset 0.03964991
Can anyone advise on how I should select the correct model?
Best Answer
The first three models you've constructed differ in the ways the parameters are defined, but they have the same number of the parameters and the fits are equivalent in every way except for the numerical values of the parameters.
We can illustrate this with a plain linear model – mixed models just complicate the issue.
set.seed(101) dd <- expand.grid(light=c("day","dusk","night"), tide=c("base","Flooding","Ebbing")) dd$y <- rnorm(nrow(dd)) ## add one more row so fit isn't perfect dd <- rbind(dd,dd[1,]) dd$y[nrow(dd)] <- rnorm(1)
Use model.matrix
to see what parameters R will construct when fitting the model (you could also use names(coef(...))
on the output of lm()
, or names(fixef(...))
on the output of (g)lmer
).
tmpf <- function(f) { model.matrix(f,data=dd) } colnames(m1 <- tmpf(~light+tide+light:tide)) ## [1] "(Intercept)" "lightdusk" ## [3] "lightnight" "tideFlooding" ## [5] "tideEbbing" "lightdusk:tideFlooding" ## [7] "lightnight:tideFlooding" "lightdusk:tideEbbing" ## [9] "lightnight:tideEbbing"
If we use the *
operator, we get the interaction plus the main effects; if we redundantly specify the main effects, R silently drops them.
all.equal(m1,tmpf(~light*tide)) ## TRUE all.equal(m1,tmpf(~light+light*tide)) ## TRUE all.equal(m1,tmpf(~light+tide+light*tide)) ## TRUE
If we use :
but leave out one of the main effects we get the same number of parameters (9), but they are organized differently:
colnames(m2 <- tmpf(~light+light:tide)) ## [1] "(Intercept)" "lightdusk" ## [3] "lightnight" "lightday:tideFlooding" ## [5] "lightdusk:tideFlooding" "lightnight:tideFlooding" ## [7] "lightday:tideEbbing" "lightdusk:tideEbbing" ## [9] "lightnight:tideEbbing"
As I explain elsewhere, it rarely makes sense to test the model with interactions present but main effects missing; the only ways that I know of to do this are to construct the dummy variables yourself (either by hand, or by constructing the model matrix, dropping the terms you don't want, and using the remaining model matrix columns as (numeric) predictor variables.
The MuMIn
package tries to do the right thing: from ?dredge
,
By default, marginality constraints are respected, so “all possible combinations” include only those containing interactions with their respective main effects and all lower order terms.
library(MuMIn) full_model <- lm(y~light*tide,data=dd,na.action="na.fail") (dmods <- dredge(full_model)) ## Model selection table ## (Int) lgh tid lgh:tid df logLik AICc delta weight ## 8 -0.27460 + + + 10 23.541 -247.1 0.00 1 ## 1 0.24500 2 -8.291 22.3 269.38 0 ## 3 -0.16790 + 4 -5.948 27.9 274.98 0 ## 2 0.07096 + 4 -7.821 31.6 278.72 0 ## 4 -0.25820 + + 6 -5.543 51.1 298.17 0
As you can see dredge
has not tried to fit any models with the interaction but missing some main effects.
Similar Posts:
- Solved – Understanding what a factor is in a model
- Solved – REML or ML to compare two mixed effects models with differing fixed effects, but with the same random effect
- Solved – How to build a linear mixed-effects model in R
- Solved – How to build a linear mixed-effects model in R
- Solved – Random effects for within subjects study in R