Solved – Is sum of squared residuals smaller than sum of squared errors in linear regression

In the linear regression model, the true error vector $U = Y - X\beta$ is based on the true value of the unknown coefficient vector $\beta$. Meanwhile, the OLS residual vector $U^* = Y - X\beta^*$ uses the OLS estimator $\beta^*$ of $\beta$, so that $U^*$ and $\beta^*$ are the estimators of $U$ and $\beta$.

Then is it true that $U'U < {U^*}'U^*$?

No.

Suppose the true relationship has some slope with $\beta \neq \beta^*$, but you observe just two data points and fit a simple linear regression. The fitted straight line connects the two points exactly, so the residuals are zero and hence so is ${U^*}'U^*$, while the true errors $U$ are generally nonzero. In that case $U'U > {U^*}'U^* = 0$.
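A minimal numerical sketch of this two-point case (the data-generating values below are my own illustrative assumptions, not from the question):

```python
import numpy as np

rng = np.random.default_rng(0)

beta_true = np.array([1.0, 2.0])                 # assumed true intercept and slope
X = np.column_stack([np.ones(2), [0.0, 1.0]])    # only two data points
U = rng.normal(size=2)                           # true (unobserved) errors
Y = X @ beta_true + U

# With two points and two parameters, OLS fits Y exactly.
beta_star, *_ = np.linalg.lstsq(X, Y, rcond=None)
U_star = Y - X @ beta_star

print("U'U   (errors):   ", U @ U)               # strictly positive in general
print("U*'U* (residuals):", U_star @ U_star)     # zero up to machine precision
```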

In general, recall that OLS is, by definition, the procedure that minimizes the sum of squared residuals over all candidate coefficient vectors, so no other coefficient vector for the same set of regressors $X$, including the true $\beta$, can produce a smaller sum of squares. Hence ${U^*}'U^* \le U'U$ always.
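To make this explicit, standard OLS algebra gives a one-line decomposition. Writing $U = U^* + X(\beta^* - \beta)$ and using the normal equations $X'U^* = 0$,

$$
U'U = \big(U^* + X(\beta^* - \beta)\big)'\big(U^* + X(\beta^* - \beta)\big) = {U^*}'U^* + (\beta^* - \beta)'X'X(\beta^* - \beta) \ge {U^*}'U^*,
$$

since the cross term $2(\beta^* - \beta)'X'U^*$ vanishes and $X'X$ is positive semi-definite.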

In relation to this, the many posts on overfitting on this site and elsewhere provide further counterexamples: by making your model increasingly complex, e.g. by including powers of regressors, interactions, etc., you can only reduce the "training error" of your model (that is, obtain a smaller sum of squared residuals), even as the fitted model moves further from the truth.
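As a small illustration of that point (the sample and the polynomial degrees below are arbitrary choices of mine), fitting increasingly high-order polynomials to the same data never increases the in-sample sum of squared residuals:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
x = np.linspace(0.0, 1.0, n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)       # truth is a straight line

for degree in (1, 3, 5, 9):
    X = np.vander(x, degree + 1, increasing=True)        # columns 1, x, ..., x^degree
    beta_star, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta_star) ** 2)
    print(f"degree {degree}: in-sample RSS = {rss:.4f}") # weakly decreasing in degree
```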
