It is said that $S_{xx} = \sum_{i=1}^n (x_i - \overline x)^2 = \sum_{i=1}^n x_i^2 - n\overline x^2$.

I suspect this is simple algebra but I am missing something still. How does this work?

Further, Wikipedia mentions that the MSE is $\sum_i \frac{(X_i - \overline X)^2}{n-2}$. However, my text notes that SSE is $\sum_i (Y_i - \hat{Y})^2$.

However, it should be the case that $MSE = \frac{SSE}{n-2}$.

Can Xs and Ys be used interchangeably like this? It seems wrong to me.


#### Best Answer

$$\begin{align} S_{xx} &= \sum_i (x_i - \overline x)^2 \\ &= \sum_i (x_i^2 - 2\overline x x_i + \overline x^2) \\ &= \sum_i x_i^2 - 2\overline x \sum_i x_i + \sum_i \overline x^2 \\ &= \sum_i x_i^2 - 2\overline x \sum_i x_i + n\overline x^2 \end{align}$$

since $\overline x$ is a constant wrt $i$. Now we note that $\overline x = \frac{\sum_i x_i}{n} \Rightarrow \sum_i x_i = n\overline x$. So

$$\begin{align} S_{xx} &= \sum_i x_i^2 - 2n\overline x^2 + n\overline x^2 \\ &= \sum_i x_i^2 - n\overline x^2 \end{align}$$

Thus endeth the required algebra.
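If it helps to see the identity hold on actual numbers, here is a minimal sketch that computes $S_{xx}$ both ways. The function names and the sample data are made up for illustration; they are not from your text or from Wikipedia.

```python
def sxx_centered(xs):
    """S_xx as the sum of squared deviations from the sample mean."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs)


def sxx_shortcut(xs):
    """S_xx via the shortcut form: sum of squares minus n times the squared mean."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum(x * x for x in xs) - n * x_bar ** 2


if __name__ == "__main__":
    data = [2.0, 4.0, 4.0, 5.0, 7.0]  # arbitrary illustrative values
    # The two results should agree up to floating-point rounding.
    print(sxx_centered(data), sxx_shortcut(data))
```

The two forms agree algebraically, but in floating point the shortcut form can lose precision when the mean is large relative to the spread, so the centered form is usually the safer one to compute with.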

As for the next part of the question, MSEs can be calculated for any estimator. An estimator is a special kind of random variable.

This is difficult to explain in words, but basically: In your regression problem, you have the random variables $\{Y_i\}_{i=1}^n$, which you're trying to estimate by the **estimators** $\{\hat Y_i\}_i$. The observed values of $\{Y_i\}_i$ are $\{y_i\}_i$, and the observed values of $\{\hat Y_i\}_i$ are $\{\hat y_i\}_i$. The observed values of an estimator are also called **estimates**.

Now, since the $\{\hat Y_i\}_i$ are estimators, you can calculate their MSEs. This is what your text does. A section of the Wikipedia article does the same.
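To make the regression version concrete, here is a rough sketch of $SSE = \sum_i (y_i - \hat y_i)^2$ and $MSE = \frac{SSE}{n-2}$ for a simple linear regression. It assumes an ordinary least-squares fit with slope $S_{xy}/S_{xx}$; the helper names `fit_line` and `sse_and_mse` are hypothetical, and note that only the $y$ residuals enter the SSE.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b*x (assumes S_xx > 0)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sse_and_mse(xs, ys):
    """Return (SSE, MSE) with MSE = SSE / (n - 2); needs n > 2."""
    intercept, slope = fit_line(xs, ys)
    fitted = [intercept + slope * x for x in xs]  # the observed y-hat values
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return sse, sse / (len(xs) - 2)


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # illustrative data only
    ys = [2.1, 3.9, 6.2, 7.8, 10.1]
    print(sse_and_mse(xs, ys))
```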

Now, forget about regression. Suppose you just have a vector of observed values $\{x_i\}_i$. If you get these values by sampling from an infinite population, your $\{X_i\}_i$ are *also* random variables. But we're generally not interested in the behaviour of these variables on their own. We're more concerned with things like $\overline X$ (the sample mean) or $S^2_X$ (the sample variance) and so on.

Now, $\overline X$ is also an estimator: It estimates the population mean $\mu$. So you can also calculate an MSE for $\overline X$. This is done in a different section of the same Wikipedia article, and I'm guessing this is what you found odd.

(If sampling from an infinite population sounds weird, consider it as sampling from a normal distribution or some other distribution. The "population" is basically all the points under the curve, and thus is infinitely large.)
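As a rough illustration of the MSE of $\overline X$ as an estimator of $\mu$, the sketch below repeatedly draws samples from a normal distribution and averages the squared error of the sample mean. The function name, the parameter values and the seed are arbitrary choices for illustration. For independent draws with variance $\sigma^2$, the result should come out close to $\sigma^2/n$.

```python
import random


def mse_of_sample_mean(mu, sigma, n, reps, seed=0):
    """Monte Carlo estimate of E[(X-bar - mu)^2] for n draws from N(mu, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        total += (x_bar - mu) ** 2
    return total / reps


if __name__ == "__main__":
    mu, sigma, n = 5.0, 2.0, 10  # illustrative values only
    estimate = mse_of_sample_mean(mu, sigma, n, reps=20000)
    # Compare the simulated MSE with the theoretical value sigma^2 / n.
    print(estimate, sigma ** 2 / n)
```

Notice that no $Y$ or regression line appears here: this is an MSE for a different estimator, which is why the formulas look so different.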
