# Interpreting a “wavy” QQ plot

I am observing the following QQ plot produced from an OLS linear regression fit of my data:

Many other SE questions discuss QQ plot interpretation, but this is an extremely regular (but non-linear) pattern that I'm not sure how to interpret. To me this suggests that the linear mean function poorly estimates the response, but what can I learn from this QQ plot? (Perhaps it suggests the data were generated from a beta distribution?)

The residuals seem to follow a Gaussian distribution, and the fitted plot seems pretty okay (although I don't know how to check for equal variance).

Any help with interpretation of these results would be greatly appreciated. If it helps, the outcome is a text sentiment score in the range (-2, 2).

Edit: A histogram of the residuals. A one-sample Kolmogorov-Smirnov test (`ks.test(resid(md), y=pnorm)`) leads me to reject the null hypothesis that the residuals are normally distributed.


A "flatter" stretch of a QQ plot means that, over the corresponding range of normal scores on the X-axis, you have more data than would be expected under a normal probability model. Here the flat stretches cover Z-scores from (low) to -2, from -1 to 1, and from 2 to (high). For instance, on a normal curve, you'd expect about 68% of the data to lie within 1 SD of the mean. However, in your residual distribution, you have far more than 68% in that interval. Projecting the curve's value at X = -1 and X = 1 seems to give a Y of about -0.33 to 0.33. That means that the central $$\pm$$ 0.33 SD of the residual distribution holds about 68% of the data, a much higher concentration than in a normal distribution.
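To put rough numbers on that comparison, here is a minimal sketch in Python using only the standard library (in R the same quantities are `pnorm(1) - pnorm(-1)` and `pnorm(0.33) - pnorm(-0.33)`). It assumes, as the reading above does, that the Y-axis is in SD units of the residuals and that the values at X = -1 and X = 1 really are about -0.33 and 0.33. The helper name `central_mass` is made up for illustration.

```python
from statistics import NormalDist

# Standard normal reference model (mean 0, SD 1).
std_normal = NormalDist(mu=0.0, sigma=1.0)


def central_mass(z: float) -> float:
    """Probability that a standard normal value falls within +/- z."""
    return std_normal.cdf(z) - std_normal.cdf(-z)


# Share of data a normal model places within 1 SD of the mean.
within_1_sd = central_mass(1.0)

# Share a normal model places within 0.33 SD of the mean, the interval
# that (by eye, an assumption) holds the middle of the plotted residuals.
within_033_sd = central_mass(0.33)

print(f"Normal mass within +/-1 SD:    {within_1_sd:.3f}")
print(f"Normal mass within +/-0.33 SD: {within_033_sd:.3f}")
```

Under a normal model the interval of $$\pm$$ 0.33 SD should hold only about a quarter of the data, whereas the plot suggests it holds roughly two thirds of the residuals. That gap is the excess central concentration that the flat middle stretch reflects.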