There seem to be too many points clustered around negative values in all of the plots.
And while plots 3 & 4 seem to have random enough patterns, plots 1 & 2 seem to have a negatively sloped trend.
If these violate the linearity and homogeneity-of-variance assumptions, I should stop using the regression model, correct?
Best Answer
Yes, the residual plots for variables 1 & 2 are problematic. I don't necessarily see any heterogeneity of variance (heteroscedasticity), or even non-linearity, but they certainly show non-independence. You can very clearly guess if a residual will be above or below 0 based on whether its neighbors are.
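One rough way to put a number on "you can guess a residual from its neighbors" is to sort the residuals by the predictor and correlate each one with the next. The sketch below uses simulated data, not the data from the question; the helper name `lag1_autocorr` and the trend slope are made up for illustration.

```python
import numpy as np

def lag1_autocorr(residuals, order_by):
    """Correlation between each residual and its neighbor after sorting by order_by."""
    r = np.asarray(residuals)[np.argsort(order_by)]
    return np.corrcoef(r[:-1], r[1:])[0, 1]

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0, 10, size=n)  # hypothetical predictor

# Residuals with no structure: neighbors tell you nothing.
independent = rng.normal(size=n)

# Residuals with a negatively sloped trend in x (assumed slope, for illustration).
trending = -0.75 * (x - x.mean()) + rng.normal(size=n)

ac_independent = lag1_autocorr(independent, x)
ac_trending = lag1_autocorr(trending, x)
print(f"independent residuals: lag-1 correlation = {ac_independent:.2f}")
print(f"trending residuals:    lag-1 correlation = {ac_trending:.2f}")
```

A value near 0 is what you would hope to see; a value well above 0 means neighboring residuals tend to share a sign, which is the pattern described above.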
I do want to clear up a small misunderstanding. You state that you think there may be too many residuals below 0. The assumption is not that 50% of the residuals must be below 0 and 50% above; it is that the mean of the residuals is 0. If the distribution of the residuals is skewed, the mean won't equal the median, and you can validly have different numbers of residuals above and below 0.
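To see this with numbers, here is a small simulation, assuming right-skewed errors (a shifted exponential, chosen only for illustration) and an ordinary least-squares fit with an intercept. The coefficients and sample size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.uniform(0, 10, size=n)

# Right-skewed errors with mean zero: exponential(1) shifted down by 1.
errors = rng.exponential(scale=1.0, size=n) - 1.0
y = 2.0 + 0.5 * x + errors

# OLS with an intercept: design matrix has a column of ones.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

mean_resid = residuals.mean()
frac_below = (residuals < 0).mean()
print(f"mean of residuals:      {mean_resid:.2e}")
print(f"fraction below zero:    {frac_below:.2f}")
```

The mean comes out at zero up to floating-point error, while clearly more than half of the residuals fall below zero, so an unbalanced count on its own is not a violation.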
I am perplexed, though. When the model includes an intercept, OLS makes the residuals sum to 0 and leaves them uncorrelated with each predictor, so the linear trend you see in your top two plots should not happen. What code / program did you use to fit the data and generate these residuals? Did you force the intercept to be 0? That is the only thing I can think of that would produce the plots you show.
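As a sketch of why a suppressed intercept is a plausible culprit, the simulation below fits the same made-up data twice, once with an intercept and once forced through the origin, and compares the residuals. The true intercept of 5 and the other numbers are assumptions for illustration, not values from the question.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.uniform(0, 10, size=n)
y = 5.0 + 0.5 * x + rng.normal(size=n)  # true intercept is far from 0

def ols_residuals(X, y):
    """Fit by least squares and return the residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

# Fit 1: intercept included (column of ones in the design matrix).
res_with = ols_residuals(np.column_stack([np.ones(n), x]), y)

# Fit 2: intercept forced to 0 (predictor column only).
res_without = ols_residuals(x[:, None], y)

corr_with = np.corrcoef(x, res_with)[0, 1]
corr_without = np.corrcoef(x, res_without)[0, 1]
print(f"with intercept:    mean = {res_with.mean():.2e}, corr with x = {corr_with:.2e}")
print(f"without intercept: mean = {res_without.mean():.2f}, corr with x = {corr_without:.2f}")
```

With the intercept, both the mean and the correlation with `x` are zero up to rounding. Without it, the fitted slope is too steep to compensate, and the residuals pick up a nonzero mean and a strong negative trend against the predictor.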