I have the following problem.
Consider an election in which n persons vote. The goal of the election is to estimate the weight of a bag. Each person can vote for an integer value: 1 kg, 2 kg, 3 kg, …
For example, consider that 10 persons have voted and that the result is:
- value 1 kg : 1 vote
- value 2 kg : 2 votes
- value 3 kg : 3 votes
- value 4 kg : 0 votes
- value 5 kg : 2 votes
- value 6 kg : 2 votes
I can calculate the weighted average of this vote:
((1 x 1) + (2 x 2) + (3 x 3) + (5 x 2) + (6 x 2)) / 10 = 3.6 kg
So this means that on average people think that the bag has a weight of 3.6 kg.
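The arithmetic above can be reproduced with a short Python sketch. The dictionary `votes` and the helper `weighted_mean` are illustrative names chosen here, not anything from a library.

```python
# Vote counts from the example: guessed weight in kg -> number of votes.
votes = {1: 1, 2: 2, 3: 3, 4: 0, 5: 2, 6: 2}


def weighted_mean(counts):
    """Return the average guess, weighting each value by its vote count."""
    total_votes = sum(counts.values())
    return sum(value * n for value, n in counts.items()) / total_votes


mean_kg = weighted_mean(votes)
print(mean_kg)  # 3.6
```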
However, I would like to know how good this estimate is. What is the probability that it is accurate? Are there statistical methods that can be used for this kind of problem, for example to calculate a confidence interval?
Thanks for your help.
Best Answer
The standard error (not the standard deviation) can be used to estimate how well you know the mean. This is not necessarily how accurate the mean is; it is more a measure of how much you should expect the mean to change if you collect more data.
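As a rough sketch, the standard error for the vote counts in the question could be computed as follows. It uses the usual formula (sample standard deviation divided by the square root of the number of votes) and assumes the votes are independent draws from a single distribution, which may not hold for this data. The variable names are illustrative.

```python
import math
import statistics

# Vote counts from the question: guessed weight in kg -> number of votes.
votes = {1: 1, 2: 2, 3: 3, 4: 0, 5: 2, 6: 2}

# Expand the counts into one entry per voter: [1, 2, 2, 3, 3, 3, 5, 5, 6, 6].
samples = [value for value, n in votes.items() for _ in range(n)]

mean_kg = statistics.mean(samples)
sd_kg = statistics.stdev(samples)          # sample standard deviation (n - 1)
se_kg = sd_kg / math.sqrt(len(samples))    # standard error of the mean

print(f"mean = {mean_kg:.2f} kg")          # mean = 3.60 kg
print(f"sd   = {sd_kg:.3f} kg")            # sd   = 1.776 kg
print(f"se   = {se_kg:.3f} kg")            # se   = 0.562 kg
```

Note how the standard deviation (about 1.78 kg) describes the spread of individual guesses, while the standard error (about 0.56 kg) describes the uncertainty of the mean and shrinks as more votes come in.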
But to use this reliably you probably need some strong assumptions (e.g. i.i.d. and normally distributed) that won't hold for your data (nobody guesses a negative weight). Without such assumptions, your distribution could always yield extreme values with low probability that make the mean meaningless.
It won't give you a probability, in particular if you cannot define "right". In your case, the weight probably won't be 2.5 but 2.5001342424… Nor does this remove systematic error.
Take the famous dress-gate example. On average, people think the dress is some ugly gray, while according to those who have seen it in reality (and in better pictures) it is clearly black and blue. The average doesn't correct every kind of error. It only helps if you have reason to believe there is some symmetric (e.g. Gaussian) error distribution added to your observations.
This can hold for physical processes. But if you are measuring on the wrong scale (say, time instead of frequency, linear instead of log space, etc.; there is not even a guarantee that your measurement device is using the correct data representation for this!) then symmetry is easily lost. Say we have a transmitter sending with strength $5+N(0,1)$. The signal strength drops quadratically with distance (a reasonable model for many physical signals). The error in the observed volume will no longer be symmetrical: $\frac{5+N(0,1)}{d^2}$. But the observation will have a measurement error, too!
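
The remark about measuring on the wrong scale can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the error is taken to be Gaussian on the log scale, and the centre `log(5)` and spread `1.0` are arbitrary choices, not values from the question. On the log scale the mean and median nearly coincide; after converting to the linear scale the mean is pulled well above the median, so averaging no longer cancels the error.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Assumption: symmetric Gaussian error on the log scale (values are made up).
mu, sigma = math.log(5.0), 1.0
log_obs = [random.gauss(mu, sigma) for _ in range(20000)]

# The same observations expressed on the linear scale.
lin_obs = [math.exp(x) for x in log_obs]

# Mean minus median is a crude indicator of asymmetry.
log_gap = statistics.mean(log_obs) - statistics.median(log_obs)
lin_gap = statistics.mean(lin_obs) - statistics.median(lin_obs)

print(f"log scale:    mean - median = {log_gap:.3f}")
print(f"linear scale: mean - median = {lin_gap:.3f}")
```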