Solved – Expectation of the variance of the sampling set without replacement

Select $n$ numbers without replacement from the set $\{1,2,\ldots,m\}$ to form the sample $S=\{a_1,a_2,\ldots,a_n\}$. I want to calculate the expected variance of the sample, $\mathbb{E}[\operatorname{Var}(S)]$, and the maximum variance over all samples, $\max\{\operatorname{Var}(S)\}$.

Besides, what's the distribution of the sample variance?

We know that

$$\widehat{\operatorname{Var}}(\mathbf{a}) = \frac{1}{n-1}\left(\sum_{i=1}^n a_i^2 - \frac{1}{n}\left(\sum_{i=1}^n a_i \right)^2 \right)$$

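is an unbiased estimator of the population variance (taken with divisor $m-1$, the convention under which unbiasedness holds when sampling without replacement), which is easily computed as $\frac{1}{m-1}\sum_{i=1}^m \left(i - \frac{m+1}{2}\right)^2 = \frac{m(m+1)}{12}$. This answers the first question: the expected sample variance is $m(m+1)/12$.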
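As a sanity check on the unbiasedness claim, here is a minimal sketch that averages the sample variance over every $n$-subset of $\{1,\ldots,m\}$ using exact rational arithmetic. The function names `sample_variance` and `expected_sample_variance` are my own, and exhaustive enumeration is only practical for small $m$.

```python
from fractions import Fraction
from itertools import combinations


def sample_variance(a):
    """Sample variance with divisor n - 1, computed exactly as a Fraction."""
    n = len(a)
    total = sum(a)
    total_sq = sum(x * x for x in a)
    # (1/(n-1)) * (sum a_i^2 - (sum a_i)^2 / n), rewritten over a common denominator
    return Fraction(n * total_sq - total * total, n * (n - 1))


def expected_sample_variance(m, n):
    """Average the sample variance over all n-subsets of {1, ..., m}.

    Each subset is equally likely under sampling without replacement,
    and the variance does not depend on the order of the draws.
    """
    values = [sample_variance(s) for s in combinations(range(1, m + 1), n)]
    return sum(values, Fraction(0)) / len(values)


# Compare the enumerated expectation with m(m+1)/12 for a few small cases.
for m, n in [(5, 2), (6, 3), (8, 4)]:
    print(m, n, expected_sample_variance(m, n), Fraction(m * (m + 1), 12))
```

In each row the last two printed values should coincide, whatever $n \ge 2$ is chosen.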

I will only sketch how to maximize the variance. I claim it is maximized when the $a_i$ are in two contiguous blocks: that is, $\mathbf{a}$ is in the form

$$\mathbf{a} = (1, 2, \ldots, k, m-l+1, m-l+2, \ldots, m).$$

(Evidently $k+l = n$.) To prove this claim, suppose $\mathbf{a}$ is not in this form: then you can find a gap in one of the end sequences and increase the variance by changing one of the components of $\mathbf{a}$ to that gap. It remains only to maximize the variance among these special forms of $\mathbf{a}$; this is done by making the end sequence lengths as balanced as possible; that is, by setting $k=l$ when $n$ is even and otherwise by setting either $k=l+1$ or $l=k+1$. When $n=2k$ is even, the maximum variance equals

$$ n \frac{\left(3 m^2-3 m n+n^2 -1\right)}{12 (n-1)}.$$

When $n=2k+1$ is odd, the maximum variance is

$$(n+1) \frac{\left(3 m^2-3 m n+n^2\right)}{12 n}.$$
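Since the argument above is only a sketch, a brute-force comparison may be reassuring. The following is a minimal check, assuming small $m$ so that all subsets can be enumerated; the helper names (`brute_force_max`, `closed_form_max`, `two_block_sample`) are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import combinations


def sample_variance(a):
    """Sample variance with divisor n - 1, computed exactly as a Fraction."""
    n = len(a)
    total = sum(a)
    total_sq = sum(x * x for x in a)
    return Fraction(n * total_sq - total * total, n * (n - 1))


def brute_force_max(m, n):
    """Largest sample variance over all n-subsets of {1, ..., m}."""
    return max(sample_variance(s) for s in combinations(range(1, m + 1), n))


def closed_form_max(m, n):
    """The two closed-form expressions quoted above (even and odd n)."""
    core = 3 * m * m - 3 * m * n + n * n
    if n % 2 == 0:
        return Fraction(n * (core - 1), 12 * (n - 1))
    return Fraction((n + 1) * core, 12 * n)


def two_block_sample(m, n):
    """Balanced two-block sample (1..k, m-l+1..m) with k + l = n and k - l in {0, 1}."""
    k = (n + 1) // 2
    l = n - k
    return list(range(1, k + 1)) + list(range(m - l + 1, m + 1))


# Check every (m, n) with 2 <= n <= m <= 9.
for m in range(2, 10):
    for n in range(2, m + 1):
        best = brute_force_max(m, n)
        assert best == closed_form_max(m, n)
        assert best == sample_variance(two_block_sample(m, n))
print("closed forms agree with brute force for all m <= 9")
```

For example, with $m=5$ and $n=3$ the balanced sample is $(1,2,5)$, whose variance is $13/3$, matching the odd-$n$ formula.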
