# Solved – Questions on Definitions and Notation (MSE, SSE, Sxx)

It is said that $S_{xx} = \sum_{i=1}^n (x_i - \overline{x})^2 = \sum_{i=1}^n x_i^2 - n\overline{x}^2$.

I suspect this is simple algebra but I am missing something still. How does this work?

Further, Wikipedia mentions that the MSE is $\sum_i \frac{(X_i - \overline{X})^2}{n-2}$. However, my text notes that SSE is $\sum_i (Y_i - \hat{Y}_i)^2$.

However, it should be the case that $MSE = \frac{SSE}{n-2}$.

Can Xs and Ys be used interchangeably like this? It seems wrong to me.


$\begin{align}
S_{xx} &= \sum_i (x_i - \overline{x})^2 \\
&= \sum_i (x_i^2 - 2\overline{x} x_i + \overline{x}^2) \\
&= \sum_i x_i^2 - 2\overline{x} \sum_i x_i + \sum_i \overline{x}^2 \\
&= \sum_i x_i^2 - 2\overline{x} \sum_i x_i + n\overline{x}^2
\end{align}$

since $\overline{x}$ is a constant with respect to $i$. Now we note that $\overline{x} = \frac{\sum_i x_i}{n} \Rightarrow \sum_i x_i = n\overline{x}$. So

$\begin{align}
S_{xx} &= \sum_i x_i^2 - 2n\overline{x}^2 + n\overline{x}^2 \\
&= \sum_i x_i^2 - n\overline{x}^2
\end{align}$

Thus endeth the required algebra.
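If it helps to see the identity hold on actual numbers, here is a small Python check. The sample values and the function names `sxx_definition` and `sxx_shortcut` are made up for illustration; they are not from the question or any textbook.

```python
def sxx_definition(xs):
    """Sum of squared deviations from the sample mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs)


def sxx_shortcut(xs):
    """Sum of squares minus n times the squared sample mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sum(x * x for x in xs) - n * mean ** 2


# Arbitrary illustrative sample (an assumption, not data from the post).
xs = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]

# The two values agree up to floating-point rounding.
print(sxx_definition(xs), sxx_shortcut(xs))
```

In floating point the two forms can differ in the last few digits, and the shortcut form can lose precision when the mean is large relative to the spread, so compare them with a tolerance rather than `==`.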

As for the next part of the question, MSEs can be calculated for any estimator. An estimator is a special kind of random variable.

This is difficult to explain in words, but basically: In your regression problem, you have the random variables $\{Y_i\}_{i=1}^n$, which you're trying to estimate by the estimators $\{\hat{Y}_i\}_i$. The observed values of $\{Y_i\}_i$ are $\{y_i\}_i$, and the observed values of $\{\hat{Y}_i\}_i$ are $\{\hat{y}_i\}_i$. The observed values of an estimator are also called estimates.

Now, since the $\{\hat{Y}_i\}_i$ are estimators, you can calculate their MSEs. This is what your text does. A section of the Wikipedia article does the same.
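To make the regression quantities concrete, here is a rough Python sketch that computes the observed SSE and divides it by $n-2$. It assumes simple linear regression with an intercept and a slope fitted by least squares, which is the usual setting where $n-2$ appears. The data and the helper names `fit_line` and `sse_and_mse` are invented for illustration.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sse_and_mse(xs, ys):
    """Return (SSE, SSE / (n - 2)) for the fitted line."""
    intercept, slope = fit_line(xs, ys)
    fitted = [intercept + slope * x for x in xs]  # the estimates y-hat_i
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, fitted))
    mse = sse / (len(xs) - 2)  # two fitted parameters: intercept and slope
    return sse, mse


# Made-up data (an assumption, not from the post).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
sse, mse = sse_and_mse(xs, ys)
print(sse, mse)
```

Note that the sum runs over the $y_i$ and their fitted values $\hat y_i$; the $x_i$ only enter through the fitted line.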

Now, forget about regression. Suppose you just have a vector of observed values $\{x_i\}_i$. If you get these values by sampling from an infinite population, your $\{X_i\}_i$ are also random variables. But we're generally not interested in the behaviour of these variables on their own. We're more concerned with things like $\overline{X}$ (the sample mean) or $S^2_X$ (the sample variance) and so on.

Now, $\overline{X}$ is also an estimator: It estimates the population mean $\mu$. So you can also calculate an MSE for $\overline{X}$. This is done in a different section of the same Wikipedia article, and I'm guessing this is what you found odd.

(If sampling from an infinite population sounds weird, consider it as sampling from a normal distribution or some other distribution. The "population" is basically all the points under the curve, and thus is infinitely large.)
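The idea that $\overline X$ has its own MSE can be illustrated with a short simulation. The sketch below assumes a normal population with known mean and standard deviation (all values chosen arbitrarily), draws many samples, and averages the squared error of the sample mean. For an unbiased estimator like $\overline X$ the MSE equals its variance, $\sigma^2/n$, so the simulated value should land near that. The function name `mse_of_sample_mean` is hypothetical.

```python
import random


def mse_of_sample_mean(mu, sigma, n, trials, seed=0):
    """Monte Carlo estimate of E[(X-bar - mu)^2] for normal samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        total += (x_bar - mu) ** 2
    return total / trials


# Arbitrary illustrative parameters (assumptions, not from the post).
estimate = mse_of_sample_mean(mu=10.0, sigma=2.0, n=25, trials=20000)
theory = 2.0 ** 2 / 25  # sigma^2 / n = 0.16
print(estimate, theory)
```

This is a different quantity from the regression $SSE/(n-2)$: here the target is the fixed number $\mu$, and the error is averaged over repeated samples rather than over the points of one sample.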
