Select $n$ numbers without replacement from the set $\{1,2,\ldots,m\}$ to form the set $S=\{a_1,a_2,\ldots,a_n\}$. I want to calculate the expected variance of the sampled set, $\mathbb{E}[\mathrm{Var}(S)]$, and the maximum variance over all possible samples, $\max\{\mathrm{Var}(S)\}$.
Besides, what's the distribution of the sample variance?
Best Answer
We know that
$$\widehat{\mathrm{Var}}(\mathbf{a}) = \frac{1}{n-1}\left(\sum_{i=1}^n a_i^2 - \frac{1}{n}\left(\sum_{i=1}^n a_i \right)^2 \right)$$
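is an unbiased estimator of the population variance (defined with divisor $m-1$, as is appropriate when sampling without replacement), which is easily computed as $\frac{1}{m-1}\sum_{i=1}^m \left(i - \frac{m+1}{2}\right)^2 = (m+1)m/12$. This answers the first question: the expected sample variance is $(m+1)m/12$ for every $n \ge 2$.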
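As a sanity check on the expected value, here is a minimal Python sketch that averages the sample variance over every equally likely $n$-subset of $\{1,\ldots,m\}$ using exact rational arithmetic. The function names `sample_variance` and `expected_sample_variance` are illustrative choices, not part of the original answer, and the enumeration is only practical for small $m$.

```python
from fractions import Fraction
from itertools import combinations


def sample_variance(a):
    """Unbiased sample variance (divisor n - 1), computed exactly."""
    n = len(a)
    total = sum(a)
    total_sq = sum(x * x for x in a)
    # (1/(n-1)) * (sum a_i^2 - (sum a_i)^2 / n), written over a common denominator
    return Fraction(n * total_sq - total * total, n * (n - 1))


def expected_sample_variance(m, n):
    """Average the sample variance over all n-subsets of {1, ..., m}."""
    variances = [sample_variance(s) for s in combinations(range(1, m + 1), n)]
    return sum(variances, Fraction(0)) / len(variances)


if __name__ == "__main__":
    m = 6
    for n in range(2, m + 1):
        # Each line should show the same value as m(m+1)/12
        print(n, expected_sample_variance(m, n), Fraction(m * (m + 1), 12))
```

The average does not depend on $n$, which is what unbiasedness predicts.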
I will only sketch how to maximize the variance. I claim it is maximized when the $a_i$ are in two contiguous blocks: that is, $\mathbf{a}$ is in the form
$$\mathbf{a} = (1, 2, \ldots, k, m-l+1, m-l+2, \ldots, m).$$
(Evidently $k+l = n$.) To prove this claim, suppose $\mathbf{a}$ is not in this form: then you can find a gap in one of the end sequences and increase the variance by moving one of the components of $\mathbf{a}$ into that gap. It remains only to maximize the variance among these special forms of $\mathbf{a}$; this is done by making the end sequence lengths as balanced as possible, that is, by setting $k=l$ when $n$ is even and otherwise by setting either $k=l+1$ or $l=k+1$. When $n=2k$ is even, the maximum variance equals
$$ n \frac{\left(3 m^2-3 m n+n^2 -1\right)}{12 (n-1)}.$$
When $n=2k+1$ is odd, the maximum variance is
$$(n+1) \frac{\left(3 m^2-3 m n+n^2\right)}{12 n}.$$
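The two closed forms can be checked against a brute-force search for small $m$. The sketch below is an illustration only: the function names are hypothetical, the variance is assumed to be the unbiased sample variance with divisor $n-1$, and the exhaustive search over subsets grows too quickly to use for large $m$.

```python
from fractions import Fraction
from itertools import combinations


def sample_variance(a):
    """Unbiased sample variance (divisor n - 1), computed exactly."""
    n = len(a)
    total = sum(a)
    total_sq = sum(x * x for x in a)
    return Fraction(n * total_sq - total * total, n * (n - 1))


def brute_force_max_variance(m, n):
    """Largest sample variance over all n-subsets of {1, ..., m}."""
    return max(sample_variance(s) for s in combinations(range(1, m + 1), n))


def closed_form_max_variance(m, n):
    """Closed forms quoted above for even and odd n."""
    q = 3 * m * m - 3 * m * n + n * n
    if n % 2 == 0:
        return Fraction(n * (q - 1), 12 * (n - 1))
    return Fraction((n + 1) * q, 12 * n)


def formulas_agree(max_m):
    """Compare both computations for all 2 <= n <= m <= max_m."""
    return all(
        brute_force_max_variance(m, n) == closed_form_max_variance(m, n)
        for m in range(2, max_m + 1)
        for n in range(2, m + 1)
    )


if __name__ == "__main__":
    print(formulas_agree(8))
```

For example, with $m=5$ and $n=3$ the maximizing samples are $(1,2,5)$ and $(1,4,5)$, both with variance $13/3$, which matches the odd-$n$ formula.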