Solved – Measuring goodness of fit for mixed logistic regression model – inconsistent results from R squared and AUC

I am trying to assess the goodness of fit or accuracy of 6 generalised linear mixed models. I first assessed this using AUC (calculated with the function auc1 described here) and got results ranging from 0.65 to 0.82, so I concluded that my models explain a reasonable amount of the variation in my datasets.
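For reference, the AUC here is just the concordance probability: the chance that a randomly chosen presence gets a higher predicted probability than a randomly chosen absence. A minimal sketch in Python (the post's own code is R; this is an illustration of the statistic, not of the auc1 function itself):

```python
def concordance_auc(y_true, y_score):
    """AUC as the probability that a randomly chosen positive (y = 1)
    outranks a randomly chosen negative (y = 0); ties count as 1/2."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    pairs = [(p, n) for p in pos for n in neg]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs)
    return wins / len(pairs)

# 3 of the 4 positive/negative pairs are correctly ordered -> AUC = 0.75
print(concordance_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.2]))
```

Note that this depends only on the ranking of the fitted probabilities, not on their actual values, which matters for the comparison with R^2 below.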

However, I then tested them again using Nakagawa and Schielzeth's (2012) method for obtaining R^2 from GLMMs, and Nagelkerke's modified R^2 based on the likelihood ratio (both in R's MuMIn package). I would expect slight differences between the methods, but the results I got are totally inconsistent. Does anyone have a suggestion for why this might be? Nakagawa and Schielzeth's R^2 function in MuMIn comes with a warning that it is at an experimental stage and should be used with caution. Should I therefore just rely on the results from the AUC calculations?
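Nagelkerke's R^2 (what MuMIn's r.squaredLR reports with the adjustment) is computed from the log-likelihoods of the null and fitted models, so it measures something quite different from the AUC's rank-based discrimination. A sketch in Python of the standard formula, assuming you have the two log-likelihoods and the sample size:

```python
import math

def nagelkerke_r2(ll_null, ll_full, n):
    """Nagelkerke's adjusted R^2 from the log-likelihoods of the
    null (intercept-only) and fitted models, with sample size n.
    Rescales Cox & Snell's R^2 so its maximum attainable value is 1."""
    cox_snell = 1.0 - math.exp((2.0 / n) * (ll_null - ll_full))
    max_r2 = 1.0 - math.exp((2.0 / n) * ll_null)
    return cox_snell / max_r2

# e.g. hypothetical log-likelihoods of -100 (null) and -80 (fitted), n = 150
print(nagelkerke_r2(-100.0, -80.0, 150))
```

A model that improves ranking only slightly but sharpens the predicted probabilities can move this number a lot while barely moving the AUC, and vice versa.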

Details about the models:
I am using my models to try to identify predictive factors for the presence or absence of particular viruses in several wildlife species. The fixed effects are various individual traits (e.g. sex, age, month captured, presence of parasites, scaled mass index (SMI)), and the random effects are site (nested in region), year, and observer (because of potential bias in measurements). Models differ by virus or host species, and explanatory variables not relevant to certain species are not included in those models, but the models are on the whole very similar.

Variation in goodness-of-fit measures:

Model      AUC     Nakagawa & S's     Nagelkerke's
                   r.squaredGLMM      r.squaredLR
                   (R^2m/R^2c)*       (adj. R^2)
Model 1    0.78    0.96/0.96          0.32
Model 2    0.72    0.22/0.22          0.09
Model 3    0.81    0.96/0.96          0.18
Model 4    0.78    0.09/0.37          0.07
Model 5    0.84    0.28/0.54          0.16
Model 6    0.65    0.08/0.08          0.04

note: * R^2m refers to variation explained by the fixed effects only; R^2c includes the random effects as well (i.e. the full model)

The concordance probability ($c$-index; ROC area) is not sensitive enough for comparing models; it should be used as a pure discrimination measure. The gold standard is the likelihood and its derivatives (generalized $R^2$, likelihood ratio $\chi^2$, etc.).
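The point above can be made concrete with a toy example in Python (an illustration, not taken from the original models): the $c$-index depends only on the ordering of the predicted probabilities, so shrinking all predictions toward 0.5 leaves it unchanged while the log-likelihood (and hence any likelihood-based $R^2$) clearly worsens.

```python
import math

def concordance_auc(y_true, y_score):
    """c-index: fraction of positive/negative pairs correctly ordered."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    pairs = [(p, n) for p in pos for n in neg]
    return sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in pairs) / len(pairs)

def log_likelihood(y_true, y_prob):
    """Bernoulli log-likelihood of the observed outcomes."""
    return sum(math.log(p if y == 1 else 1.0 - p)
               for y, p in zip(y_true, y_prob))

y = [1, 1, 0, 0]
sharp = [0.90, 0.80, 0.30, 0.10]     # confident predictions
blurred = [0.55, 0.54, 0.46, 0.45]   # same ordering, shrunk toward 0.5

# identical discrimination, very different likelihood:
print(concordance_auc(y, sharp), concordance_auc(y, blurred))
print(log_likelihood(y, sharp), log_likelihood(y, blurred))
```

Both sets of predictions give a $c$-index of 1.0, but the blurred model has a much lower log-likelihood, so AUC and likelihood-based $R^2$ measures can legitimately disagree.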
