Question about the Total, Explained, and Residual Sum of Squares. I am working with the simple linear regression model.
Could you help me clarify why the residual sum of squares (SSE, where E stands for errors),
$$\text{SSE} = \sum_{i=1}^n (\hat{Y}_i - Y_i)^2,$$
is the amount of variation in $Y$ which is not explained by the model, and why the explained sum of squares (SSR, where R stands for regression),
$$\text{SSR} = \sum_{i=1}^n (\hat{Y}_i - \bar{Y})^2,$$
is the amount of variation in $Y$ which is explained by the model? Here the $Y_i$ are the observations, $\bar{Y}$ is the mean of the observations, and the $\hat{Y}_i$ are the values predicted by the model.
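For concreteness, here is a minimal numpy sketch (synthetic data, all names hypothetical) that fits a least-squares line and checks the decomposition $\text{SST} = \text{SSR} + \text{SSE}$ numerically:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data from a simple linear model (hypothetical example)
n = 100
x = rng.uniform(0, 10, size=n)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=n)

# Least-squares fit of y = b0 + b1 * x
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat

# The three sums of squares
sse = np.sum((y - y_hat) ** 2)         # unexplained (residual) variation
ssr = np.sum((y_hat - y.mean()) ** 2)  # explained variation
sst = np.sum((y - y.mean()) ** 2)      # total variation

print(sse + ssr, sst)  # agree up to floating-point error
```

With an intercept in the model, the cross term $\sum_i (\hat{Y}_i - \bar{Y})(Y_i - \hat{Y}_i)$ vanishes, which is exactly why the two pieces add up to the total.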
Best Answer
If you take a normal regression model, $$Y_i \mid X_i \sim \mathcal{N}(X_i^\text{T}\beta,\, \sigma^2),$$ the density of the data $(Y_1, \ldots, Y_n)$ writes, up to a normalising constant, as \begin{align*}&\exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^n (Y_i - X_i^\text{T}\beta)^2\right\}\\ &\qquad=\exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^n \left[(Y_i - X_i^\text{T}\hat{\beta})^2 + (X_i^\text{T}\hat{\beta} - X_i^\text{T}\beta)^2\right]\right\}\\ &\qquad=\exp\left\{-\frac{1}{2\sigma^2}[\text{SSE} + \text{SSR}]\right\}\\ &\qquad=\exp\left\{-\frac{1}{2\sigma^2}\text{SSE}\right\}\times\exp\left\{-\frac{1}{2\sigma^2}\text{SSR}\right\},\end{align*} where the cross term $2\sum_i (Y_i - X_i^\text{T}\hat{\beta})(X_i^\text{T}\hat{\beta} - X_i^\text{T}\beta)$ vanishes because the residual vector is orthogonal to the column space of the design matrix (and taking $X_i^\text{T}\beta = \bar{Y}$, the intercept-only fit, recovers the SSR of the question, $\sum_i (\hat{Y}_i - \bar{Y})^2$). Only the second factor depends on the parameter $\beta$ and hence characterises the model fit, while the first factor captures the residual variability of the $Y_i$'s around their best prediction, or projection, $X_i^\text{T}\hat{\beta}$.
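As a sanity check on the orthogonal decomposition above, here is a small numpy sketch (synthetic data, hypothetical names) showing that $\sum_i (Y_i - X_i^\text{T}\beta)^2 = \sum_i (Y_i - X_i^\text{T}\hat{\beta})^2 + \sum_i (X_i^\text{T}\hat{\beta} - X_i^\text{T}\beta)^2$ for an arbitrary $\beta$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic regression data (hypothetical example)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, -2.0])
y = X @ beta_true + rng.normal(size=n)

# Least-squares estimate beta_hat
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# An arbitrary parameter value beta (not the estimate)
beta = np.array([0.3, 4.0])

lhs = np.sum((y - X @ beta) ** 2)
rhs = np.sum((y - X @ beta_hat) ** 2) + np.sum((X @ beta_hat - X @ beta) ** 2)
print(lhs, rhs)  # equal up to floating-point error, since the residual
                 # y - X @ beta_hat is orthogonal to the column space of X
```

The equality holds for any $\beta$ because $y - X\hat{\beta}$ is orthogonal to the column space of $X$, while $X\hat{\beta} - X\beta$ lies inside it.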