As an assumption of linear regression, the normality of the error distribution is sometimes wrongly "extended" to, or interpreted as, a requirement that y or x themselves be normally distributed.

Is it possible to construct a scenario/dataset where X and Y are non-normal but the error term is normal, so that the resulting linear regression estimates are valid?


#### Best Answer

Expanding on Hong Ooi's comment with an image. Here is an image of a dataset in which none of the marginal distributions are normal but the residuals still are, so the assumptions of linear regression still hold:

The image was generated by the following R code:

```r
library(psych)
x <- rbinom(100, 1, 0.3)
y <- rnorm(length(x), 5 + x * 5, 1)
scatter.hist(x, y, correl=F, density=F, ellipse=F, xlab="x", ylab="y")
```
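To check the claim numerically rather than visually, the same simulation can be sketched in Python (an assumed translation of the R code above, using numpy/scipy rather than anything from the original answer): x is binary, so its marginal is far from normal, and y's marginal is a two-component mixture, yet the residuals from the fitted line pass a normality test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# x is Bernoulli(0.3): its marginal is as non-normal as it gets
x = rng.binomial(1, 0.3, size=1000).astype(float)
# y given x is normal, but y's marginal is a mixture of N(5,1) and N(10,1)
y = rng.normal(5 + 5 * x, 1)

# Ordinary least squares fit of y = b0 + b1 * x
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# Shapiro-Wilk: the marginal of y is clearly non-normal,
# while the residuals are consistent with normality
print("y marginal p-value:   ", stats.shapiro(y).pvalue)
print("residuals p-value:    ", stats.shapiro(residuals).pvalue)
print("estimated coefficients:", beta)
```

The estimated intercept and slope land near the true values (5 and 5), which is the point of the answer: validity of the estimates depends on the conditional (error) distribution, not the marginals.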

### Similar Posts:

- Solved – Calculating the confidence interval for simple linear regression coefficient estimates
- Solved – What to do first when there are violations of assumption in Simple Regression?