I want to run an ANOVA, so I am checking for normality. I have tested each group, and the residuals (grouped together), for normality. My sample does not look approximately normal. However, I have an outlier (5.95 SD from the mean). It is a true value, not a data entry error. When I delete this value, the sample looks close to a normal distribution. How should I deal with this value? Is it best to use a non-parametric test? A transformation? Can I just remove the value?
You could use Cook's distance as an aid for your decision. It measures the effect that removing this observation would have on your analysis. Observations with a large Cook's distance merit further attention; those with a small Cook's distance, despite lying far outside the range of your other observations, should not do much harm. As you do not say which statistical software you use, I cannot tell you exactly how to do that. I use R myself, where I would look at the fourth diagnostic plot of the fitted model:
    plot(lm.fit, which=4, cook.levels=cutoff)
Here lm.fit is the fitted model object (as returned by lm()), and cutoff is whatever Cook's distance threshold you choose.
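To make this concrete, here is a minimal sketch in R on simulated data, since I do not have your data. The data frame `dat`, the group labels, the planted outlier, and the 4/n cutoff are all assumptions for illustration only; 4/n is just one common rule of thumb, not a fixed standard.

```r
# Simulated one-way layout (hypothetical data): three groups of 10
set.seed(1)
dat <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 10)),
  y     = rnorm(30, mean = rep(c(10, 12, 11), each = 10), sd = 1)
)
dat$y[30] <- dat$y[30] + 8   # plant one extreme value in group C

# One-way ANOVA fitted as a linear model
fit <- lm(y ~ group, data = dat)

# Cook's distance for every observation
cd <- cooks.distance(fit)

# Assumed rule-of-thumb cutoff; choose your own threshold as appropriate
cutoff <- 4 / nrow(dat)
flagged <- which(cd > cutoff)
print(cd[flagged])

# Fourth diagnostic plot: Cook's distance by observation index
plot(fit, which = 4)
abline(h = cutoff, lty = 2)   # dashed line at the chosen cutoff

# Sensitivity check: refit without the flagged observations and compare
keep   <- cd <= cutoff
fit_wo <- lm(y ~ group, data = dat[keep, ])
print(anova(fit))
print(anova(fit_wo))
```

If the two ANOVA tables lead to the same conclusion, the outlier probably matters little for your result; if they differ, that is worth reporting alongside whichever analysis you choose.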
My apologies, this suggestion would probably have been better suited for a comment, but I lack the reputation to comment.