What is the best way to analyze a data set where the response variable is count data and the explanatory variables are continuous? None of my variables are normally distributed. Are GLMs a good option?
Best Answer
They are. You may want to look at Poisson regression (in R: glm(..., family=poisson, ...)) or, if you have overdispersion, negative binomial (NegBin) regression or, if you have "too many" zeros, zero-inflated Poisson (ZIP) regression.
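To make the choice between these models more concrete, here is a minimal sketch in plain Python (standard library only) of a Poisson regression with a single predictor, followed by two rough diagnostics: the Pearson dispersion statistic and the share of zeros the fitted model expects. The function names, the one-predictor restriction, and the toy data are assumptions made for illustration; in practice you would use R's glm or a comparable library routine rather than hand-rolled code.

```python
import math

def fit_poisson(x, y, max_iter=50, tol=1e-10):
    """Fit log(mu) = b0 + b1 * x by Newton-Raphson.

    With the canonical log link this is the same update as Fisher
    scoring / IRLS. Illustrative only: no step-size control.
    """
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from intercept-only fit
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood)
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix entries (negative Hessian)
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system for the Newton step
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

def pearson_dispersion(x, y, b0, b1):
    """Pearson chi-square divided by residual degrees of freedom (n - 2)."""
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / (len(y) - 2)

def expected_zero_fraction(x, b0, b1):
    """Average probability of a zero count under the fitted Poisson model."""
    return sum(math.exp(-math.exp(b0 + b1 * xi)) for xi in x) / len(x)

if __name__ == "__main__":
    # Hypothetical toy data, not taken from the question
    x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
    y = [0, 1, 0, 3, 2, 7, 5, 12]
    b0, b1 = fit_poisson(x, y)
    print("coefficients:", b0, b1)
    print("dispersion:", pearson_dispersion(x, y, b0, b1))
    print("observed zeros:", sum(yi == 0 for yi in y) / len(y))
    print("expected zeros:", expected_zero_fraction(x, b0, b1))
```

A dispersion statistic well above 1 hints at overdispersion, which is where negative binomial regression comes in; an observed share of zeros clearly larger than the expected one hints at zero inflation. Both are informal checks, not formal tests.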
Whether the predictors are normally distributed does not matter (except for analyses of influential data points). What you probably have in mind is whether the residuals are normally distributed. This is an important assumption in Ordinary Least Squares (OLS) – more specifically, for inference in OLS. However, your data are counts, so the residuals will not be normal, and you are not considering OLS anyway.