I'm having trouble performing model selection for glmer in R. I'm using the package lme4 with the following model:
```
glo_mo <- glmer(aban ~ year + hab + wlv + gra + cov + (1|lodge), data = aban, family='binomial', na.action = na.omit)
```

```
str(aban)
Classes ‘spec_tbl_df’, ‘tbl_df’, ‘tbl’ and 'data.frame': 67 obs. of 9 variables:
 $ lodge  : chr "2" "52" "34" "39" ...
 $ year   : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
 $ hab    : chr "for" "for" "for" "for" ...
 $ wlv    : num 7 1 NA NA 4 NA NA -4 44 NA ...
 $ dlv    : num 5 NA NA NA 7 NA NA 2 4 NA ...
 $ gra    : num 3 0 0 0 3 NA 0 8 5 4 ...
 $ cov    : num 3.92 16.46 1.78 1.25 2.48 ...
 $ for_str: num 4.4 4.06 3.65 5.54 4.14 5.69 8.61 5.84 6.23 4.36 ...
 $ aban   : Factor w/ 2 levels "0","1": 1 2 1 2 1 2 2 2 1 2 ...
```
When I run the model:
```
summary(glo_mo)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: aban ~ year + hab + wlv + gra + cov + (1 | lodge)
   Data: aban

     AIC      BIC   logLik deviance df.resid
    76.4     89.7    -31.2     62.4       42

Scaled residuals:
    Min      1Q  Median      3Q     Max
-1.7283 -1.1100  0.5375  0.7449  1.4179

Random effects:
 Groups Name        Variance Std.Dev.
 lodge  (Intercept) 0.09585  0.3096
Number of obs: 49, groups:  lodge, 32

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.360995   0.824027  -0.438    0.661
year2        0.605911   0.650404   0.932    0.352
habstep     -0.340842   0.926110  -0.368    0.713
wlv          0.005414   0.009677   0.559    0.576
gra          0.032089   0.086737   0.370    0.711
cov          0.023428   0.022942   1.021    0.307

Correlation of Fixed Effects:
        (Intr) year2  habstp wlv    gra
year2   -0.239
habstep -0.470  0.033
wlv     -0.127 -0.051 -0.155
gra     -0.666 -0.130  0.411  0.313
cov     -0.130 -0.074 -0.647  0.185 -0.170
```
Then I tried to standardize the model and use the function dredge to select the best models automatically, but dredge did not work. I get the following error:
```
stad <- standardize(glo_mo, standardize.y=F)
options(na.action = "na.fail")
mset <- dredge(stad)
Error in dredge(glo_mo) : 'global.model' uses 'na.action' = "na.omit"
```
This blocks me from continuing with model selection. Based on my previous steps, and with the aim of selecting the best models:
1. What is wrong in my script?
2. Is AIC the only criterion for selecting the best models? Do I have to run each of the model combinations to select the best one, or can I use a function such as dredge or step to do that?
3. What are the other options for selecting the best models in glmer with lme4 (or other recommended packages)?
Best Answer
You have too few observations to include that many predictors in your initial model. Also, note that for binary data the effective sample size is determined by the smaller of the two outcome frequencies (the number of zeros or the number of ones), not by the total number of rows. Hence, your data contain very little information, and you are unlikely to obtain meaningfully stable results.
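To make the effective sample size concrete, here is a minimal sketch in base R. The data frame `toy` is a hypothetical stand-in for `aban` (its first values mimic the `str()` output in the question; the remaining `cov` values are made up), so the numbers are illustrative only.

```r
# Hypothetical toy data standing in for the poster's `aban` data frame
toy <- data.frame(
  aban = factor(c(0, 1, 0, 1, 0, 1, 1, 1, 0, 1)),
  wlv  = c(7, 1, NA, NA, 4, NA, NA, -4, 44, NA),
  gra  = c(3, 0, 0, 0, 3, NA, 0, 8, 5, 4),
  cov  = c(3.92, 16.46, 1.78, 1.25, 2.48, 2.00, 5.10, 0.70, 9.30, 4.40)
)

# Rows with any NA in a model variable are dropped by na.omit,
# so count outcomes only among the complete cases
cc <- toy[complete.cases(toy), ]

# Frequencies of zeros and ones among the rows the model can use
counts <- table(cc$aban)
print(counts)

# Effective sample size for a binary outcome: the smaller of the two counts
n_eff <- as.integer(min(counts))
cat("rows used:", nrow(cc), " effective sample size:", n_eff, "\n")
```

Running the same steps on the real data (restricted to the variables in the model formula) shows how many zeros and ones are left among the 49 rows that glmer actually used.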
Finally, as noted in the comments by EdM, model selection, especially with such a small sample, can be very dangerous. It would be best to just report the results from your full model.
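As for the error message itself: the model in the question was fitted with `na.action = na.omit`, and that setting is stored in the model call, so changing `options(na.action = ...)` afterwards does not affect it. dredge refuses such a model because sub-models could otherwise be fitted to different subsets of rows. A minimal sketch of one way around this, assuming the lme4 and MuMIn packages are installed, is to drop incomplete rows first and refit with `na.action = na.fail`. The data frame `toy` and its values are simulated and hypothetical, and the caveats above about model selection with a small sample still apply.

```r
library(lme4)
library(MuMIn)

# Simulated, hypothetical data; not the poster's data set
set.seed(42)
n <- 60
toy <- data.frame(
  lodge = factor(sample(1:20, n, replace = TRUE)),
  year  = factor(sample(1:2, n, replace = TRUE)),
  cov   = runif(n, 0, 20),
  gra   = rpois(n, 3)
)
toy$aban <- rbinom(n, 1, 0.5)
toy$gra[sample(n, 8)] <- NA   # introduce some missing values

# Keep only the model variables and only rows complete on all of them
vars <- c("aban", "year", "gra", "cov", "lodge")
cc <- toy[complete.cases(toy[, vars]), vars]

# Refit the global model on complete cases with na.action = na.fail,
# so every sub-model is fitted to exactly the same rows
full <- glmer(aban ~ year + gra + cov + (1 | lodge),
              data = cc, family = binomial, na.action = na.fail)

# dredge now accepts the global model (random data may give singular-fit messages)
mset <- dredge(full)
print(mset)
```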