I am fitting a quasi-binomial regression, and I am using this line of R code to plot the residuals:
plot(residuals(mylogit) ~ predict(mylogit,type="link"), xlab=expression(hat(eta)), ylab="Deviance residuals")
What should I expect to see in this plot if the model is a good one?
Best Answer
In principle, you would like to check if your residuals are "quasi-binomial distributed".
The issue with the quasi-families is that there is no clear generating model, so there is no quasi-binomial distribution that we could test against.
@havefun's advice ("You are plotting the residuals vs the fitted value, so you expect the points to scatter around zero without particular patterns") will often, but not always, be correct. Simulations for the binomial show that in particular situations (few observations, low counts), the deviance residuals of a correctly specified model are indeed not homogeneous (see the vignette mentioned below). I guess we can assume that the same applies to the quasi-binomial.
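To see this for yourself, here is a minimal sketch in base R. It simulates data from a correctly specified binomial GLM in the most extreme low-count case (one trial per observation) and draws the same plot as in the question. The sample size, coefficients, and variable names are made up for illustration.

```r
# Sketch: deviance residuals of a CORRECT binomial model with low counts.
set.seed(1)                           # arbitrary seed, for reproducibility
n_obs  <- 200                         # hypothetical sample size
trials <- 1                           # low counts: one trial per row (binary response)
x      <- runif(n_obs, -2, 2)         # hypothetical predictor
p      <- plogis(-1 + 1.5 * x)        # true success probability (assumed coefficients)
y      <- rbinom(n_obs, size = trials, prob = p)

# Fit the same model that generated the data, so it is correctly specified
fit <- glm(cbind(y, trials - y) ~ x, family = binomial)

eta <- predict(fit, type = "link")        # linear predictor
res <- residuals(fit, type = "deviance")  # deviance residuals

plot(res ~ eta, xlab = expression(hat(eta)), ylab = "Deviance residuals")
abline(h = 0, lty = 2)
```

The points fall into two curved bands (one for successes, one for failures) rather than a structureless cloud around zero, even though the model is right. Increasing `trials` should make the bands blur into a more familiar scatter.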
For a binomial, you could use the DHARMa R package, which uses simulations from the fitted model to transform the residuals of any GL(M)M into a standardized space. Once this is done, you can visually assess / test residual problems such as deviations from the distribution, residual dependency on a predictor, heteroskedasticity or autocorrelation in the normal way. See the package vignette for worked-through examples.
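A rough sketch of what that workflow can look like, assuming the DHARMa package is installed. The data are simulated here purely for illustration; in practice you would pass your own fitted binomial model to `simulateResiduals()`.

```r
# Sketch only: requires DHARMa, e.g. install.packages("DHARMa")
library(DHARMa)

set.seed(1)                           # arbitrary seed
n_obs  <- 200                         # hypothetical sample size
trials <- 5                           # hypothetical number of trials per row
x      <- runif(n_obs, -2, 2)         # hypothetical predictor
y      <- rbinom(n_obs, size = trials, prob = plogis(-1 + 1.5 * x))
fit    <- glm(cbind(y, trials - y) ~ x, family = binomial)

# Simulate from the fitted model and compute scaled (quantile) residuals
sim <- simulateResiduals(fittedModel = fit, n = 250)

plot(sim)                     # QQ plot of scaled residuals and residuals vs. predicted
plotResiduals(sim, form = x)  # scaled residuals against a predictor
testDispersion(sim)           # simulation-based test for over-/underdispersion
```

For a correctly specified model the scaled residuals should look roughly uniform on [0, 1], so patterns in these plots are easier to interpret than patterns in raw deviance residuals.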
If your reason for using the quasi-binomial is to account for overdispersion, you could instead use a binomial with an observation-level random effect (see, e.g. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4517959/), which would have the advantage that the generating model is known, and we can use binomial residual tests.
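As an illustration, one common way to fit such a model is with `glmer()` from the lme4 package. The sketch below assumes lme4 is installed and uses simulated overdispersed data; the column names (`y`, `fail`, `x`, `obs`) are hypothetical.

```r
# Sketch only: requires lme4, e.g. install.packages("lme4")
library(lme4)

set.seed(1)                           # arbitrary seed
n_obs  <- 200                         # hypothetical sample size
trials <- 10                          # hypothetical number of trials per row
x      <- runif(n_obs, -2, 2)         # hypothetical predictor

# Overdispersion: extra normal noise on the logit scale for every observation
eta <- -1 + 1.5 * x + rnorm(n_obs, sd = 1)
y   <- rbinom(n_obs, size = trials, prob = plogis(eta))

d <- data.frame(y = y, fail = trials - y, x = x,
                obs = factor(seq_len(n_obs)))   # one factor level per row

# Binomial GLMM with an observation-level random intercept
fit_olre <- glmer(cbind(y, fail) ~ x + (1 | obs), family = binomial, data = d)
summary(fit_olre)
```

The estimated variance of the `obs` random intercept plays the role that the dispersion parameter plays in the quasi-binomial fit, and because the model is a fully specified binomial GLMM, simulation-based residual checks can be applied to `fit_olre`.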