On page 72 of *Introductory Statistics, A Conceptual Approach Using R* (Routledge, 2012), the authors first compute the variance of a sample of size $n$ using:

$$\sigma^2=\dfrac{\sum_{i=1}^n(Y_i-\mu)^2}{n}$$

Then, because they do not know the mean $\mu$ of the population, they replace it with the sample mean $\overline{Y}$:

$$\hat{\sigma}^2=\dfrac{\sum_{i=1}^n(Y_i-\overline{Y})^2}{n}$$

Next they say they use "expectation algebra" to show that:

$$E(\hat{\sigma}^2)=\sigma^2-\frac{\sigma^2}{n}$$

I've tried a number of things. For example, I tried:

$$\begin{align*}
E(\hat{\sigma}^2)
&=E\left[\frac{\sum(Y-\overline{Y})^2}{n}\right]\\
&=\frac1n E\left[\sum Y^2-2\overline{Y}\sum Y+\sum\overline{Y}^2\right]\\
&=\frac1n E\left[\sum Y^2-n\overline{Y}^2\right]\\
&=\frac1nE\left[\sum Y^2\right]-\overline{Y}^2
\end{align*}$$

But I have been unable to make this equal to $\sigma^2-\sigma^2/n$. Any suggestions would be helpful, allowing me to continue my reading.


#### Best Answer

I haven't checked that reference, but I assume the $Y_i$ are independent with $E(Y_i)=\mu$ and $\operatorname{Var}(Y_i)=\sigma^2$ for $i=1,2,\dots,n$, i.e. all observations have the same (finite) mean $\mu$ and (finite) variance $\sigma^2$.

First note that $E(Y_i^2)=\operatorname{Var}(Y_i)+E^2(Y_i)=\sigma^2+\mu^2$. Also, for $\bar{Y}=\dfrac{\sum_{i=1}^n Y_i}{n}$ we have $E(\bar{Y})=\dfrac{\sum_{i=1}^n E(Y_i)}{n}=\dfrac{n\mu}{n}=\mu$. In addition, by independence of the $Y_i$, we have $\operatorname{Var}(\bar{Y})=\dfrac{\sum_{i=1}^n \operatorname{Var}(Y_i)}{n^2}=\dfrac{n\sigma^2}{n^2}=\dfrac{\sigma^2}{n}$. It follows that $E(\bar{Y}^2)=\operatorname{Var}(\bar{Y})+E^2(\bar{Y})=\sigma^2/n+\mu^2$.

In the last line you wrote, you should take the expectation of $\bar{Y}^2$ as well, since $\bar{Y}$ is a random variable and cannot be pulled out of the expectation, i.e. $E(\hat{\sigma}^2)=\dfrac{1}{n}E\left(\sum_{i=1}^n Y_i^2\right)-E(\bar{Y}^2)=\dfrac{1}{n}\cdot n\cdot E(Y_i^2)-\sigma^2/n-\mu^2$. Now substitute $E(Y_i^2)=\sigma^2+\mu^2$ to get $E(\hat{\sigma}^2)=\sigma^2-\sigma^2/n$.
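The result can also be checked numerically. The following is a minimal sketch, not something taken from the book: it assumes normally distributed observations (the derivation itself only needs independent $Y_i$ with a common mean and variance), and the function names, sample size, and parameter values are chosen purely for illustration. It averages the divide-by-$n$ estimator over many simulated samples and compares that average with $\sigma^2-\sigma^2/n$.

```python
import random


def biased_variance(sample):
    """Variance estimate that divides by n rather than n - 1."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((y - mean) ** 2 for y in sample) / n


def mean_biased_variance(n, mu, sigma, trials, seed=0):
    """Average the divide-by-n estimator over many simulated samples."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(trials):
        # Assumption: observations are i.i.d. normal with mean mu, sd sigma.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        total += biased_variance(sample)
    return total / trials


# Illustrative values only.
n, mu, sigma = 5, 10.0, 2.0
simulated = mean_biased_variance(n, mu, sigma, trials=100_000)
expected = sigma**2 - sigma**2 / n  # theoretical E(sigma_hat^2)

print(f"simulated mean of estimator: {simulated:.3f}")
print(f"sigma^2 - sigma^2/n:         {expected:.3f}")
```

With a small $n$ the gap between $\sigma^2$ and the simulated average is easy to see; as $n$ grows, the bias term $\sigma^2/n$ shrinks toward zero.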