Solved – Normality assumption in linear regression

As an assumption of linear regression, the normality of the distribution of the error is sometimes wrongly "extended" or interpreted as the need for normality of the y or x.

Is it possible to construct a scenario/dataset where X and Y are non-normal but the error term is normal, so that the resulting linear regression estimates are still valid?

Expanding on Hong Ooi's comment with an image. Here is an image of a dataset where neither marginal is normally distributed but the residuals still are, so the assumptions of linear regression still hold:

[Image: scatter plot of y against x, with marginal histograms of x and y]

The image was generated by the following R code:

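    library(psych)

    # x is binary (Bernoulli with p = 0.3), so its marginal is not normal
    x <- rbinom(100, 1, 0.3)
    # y is normal around 5 + 5 * x with sd 1, so its marginal is a two-component mixture
    y <- rnorm(length(x), 5 + x * 5, 1)

    # scatter plot with marginal histograms
    scatter.hist(x, y, correl=FALSE, density=FALSE, ellipse=FALSE, xlab="x", ylab="y")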
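To check the claim numerically rather than by eye, one option is to fit the regression and compare a normality test on y with the same test on the residuals. The following is only a sketch and not part of the original answer: the seed, the larger sample size of 1000, and the choice of the Shapiro-Wilk test are all assumptions made for illustration. It uses base R only.

```r
# Illustrative check; seed, sample size and choice of test are assumptions.
set.seed(42)
n <- 1000                                  # larger than the original 100 to give the test more power
x <- rbinom(n, 1, 0.3)                     # binary predictor, clearly non-normal
y <- rnorm(n, mean = 5 + 5 * x, sd = 1)    # same model as above: y = 5 + 5x + N(0, 1) noise

fit <- lm(y ~ x)                           # ordinary least squares fit
est <- coef(fit)                           # intercept and slope, both expected near 5

# Shapiro-Wilk test: a small p-value is evidence against normality
p_y   <- shapiro.test(y)$p.value               # marginal of y (mixture of two normals)
p_res <- shapiro.test(residuals(fit))$p.value  # residuals of the fitted model

print(est)
print(c(p_y = p_y, p_res = p_res))
```

Under these assumptions the test should strongly reject normality for y, because its marginal is bimodal, while the residuals should usually not be rejected, because the noise was generated as normal. A call such as qqnorm(residuals(fit)) gives a visual version of the same check.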
