Is it possible to calculate or estimate the range of a population if I know the mean, population size and standard deviation?
I am doing research in which the mean age of the population is 29.9 +/- 5.3 SD and the population size is 21. What will be the range of the population age?
Best Answer
When writing the lower part of this answer, I missed that the population size is given as 21; I somehow thought the question was about a general population size. For a known population size of 21 there is a mathematical maximum range, not only a minimum one. First, some considerations regarding the maximum possible range:
Note that it is possible to have a very small observation if all observations larger than the mean 29.9 are not much larger. Here is how to find the smallest possible observation. Let's say 20 observations have value $29.9+\epsilon$ with $\epsilon>0$ and one observation has value $\delta<29.9$. Then for the mean: $$ 29.9=\frac{20\cdot(29.9+\epsilon)+\delta}{21}\Rightarrow \delta=29.9-20\epsilon.$$ The variance is $5.3^2=28.09$ (treating 5.3 as the population SD, i.e. dividing by 21), so $$ \frac{20\epsilon^2+(\delta-29.9)^2}{21}=\frac{20\epsilon^2+(20\epsilon)^2}{21}=28.09, $$ thus $$\frac{420\epsilon^2}{21}=20\epsilon^2=28.09\Rightarrow \epsilon=\sqrt{28.09/20}\approx 1.185$$ and $\delta=29.9-20\epsilon\approx 6.20$, so that's the smallest observation you can have, but only if it is the only observation smaller than the mean and all other observations are larger.
On the other hand, the same argument with $\epsilon<0$ shows what can happen if 20 observations are smaller than the mean and only one is larger. Then we get $\epsilon\approx -1.185$ and $\delta=29.9-20\epsilon\approx 53.60$, which is the biggest observation you can have, again only if all other observations are smaller than the mean.
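The two extreme cases can be checked numerically. The following is a minimal sketch, not part of the original answer; the function names `extreme_observations` and `check` are made up for illustration, and it assumes that 5.3 is the population SD (variance divided by $n$) and that the other $n-1$ observations are all equal. Using $5.3^2 = 28.09$, it gives roughly 6.20 and 53.60 for $n=21$.

```python
import math

def extreme_observations(mean, sd, n):
    """Smallest and largest single value compatible with the given
    mean, population SD (variance divides by n) and population size n,
    assuming the other n - 1 observations are all equal."""
    eps = sd / math.sqrt(n - 1)   # offset of the n - 1 equal observations from the mean
    offset = (n - 1) * eps        # distance of the lone observation from the mean
    return mean - offset, mean + offset

def check(mean, sd, n):
    """Rebuild the data set with one small outlier and return its
    mean and population SD, which should reproduce the inputs."""
    lo, _ = extreme_observations(mean, sd, n)
    eps = sd / math.sqrt(n - 1)
    data = [mean + eps] * (n - 1) + [lo]
    m = sum(data) / n
    var = sum((x - m) ** 2 for x in data) / n
    return m, math.sqrt(var)

lo, hi = extreme_observations(29.9, 5.3, 21)
print(round(lo, 2), round(hi, 2))
print(check(29.9, 5.3, 21))
```

If the reported 5.3 is instead a sample SD (dividing by $n-1$), the offsets shrink slightly, so the result depends on which convention was used.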
For an unknown population size, this was the original answer:
No, it's not. Technically, any mean and SD are compatible with any range above a minimum possible range. Such a minimum exists for any given population size, though I haven't checked or computed it.
But the range can be arbitrarily larger than that. Note that a Gaussian distribution, which is usually taken as the basis for using the mean and SD for estimation, is theoretically unbounded: if infinitely many observations were available, it would range from minus to plus infinity. This already shows that you can have an arbitrarily large range with any given mean and SD.
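The same point can be seen without a Gaussian by repeating the one-outlier construction for growing population sizes. This is a rough sketch under the same assumptions as that construction (population SD, all other observations equal); the function name `largest_possible_value` is hypothetical. The largest admissible observation is then $\text{mean} + \text{SD}\cdot\sqrt{n-1}$, which grows without bound as $n$ increases.

```python
import math

def largest_possible_value(mean, sd, n):
    """Largest single observation compatible with the given mean and
    population SD when the other n - 1 observations are all equal."""
    return mean + sd * math.sqrt(n - 1)

# The bound keeps growing as the population size increases.
for n in (21, 100, 10_000, 1_000_000):
    print(n, round(largest_possible_value(29.9, 5.3, n), 1))
```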
Obviously there are physical bounds in your real example; in a general situation, however, nothing can be said except that the range is greater than or equal to the minimum possible one.