I am a complete stranger to statistics (apart from mandatory courses in college), but lately I ran into an interesting real-world scenario.
Recently I started jogging. I take my GPS phone with me to measure time and distance. I picked a route (about 4km, but I don't know exactly) and started running. I have observed that the distance measured is not constant, although the route is the same. Sometimes, the distance measured is a little more than 4km and sometimes a little less. I am interested in the precision of my phone GPS.
But there is one small problem: when I arrive, I usually (but not always) check the phone, and if the distance is less than 4km, I continue running to reach 4km. So even though I have all the measurements saved in the history, I cannot use them to calculate the mean value, because sometimes I run a little bit more to reach 4km. What I do know, however, is that over my 20 runs, the minimum distance was about 3.9km and the maximum 4.05km.
So what I am interested in is how precise my GPS is and how long the path is.
I expect the answer to be something along the lines of "with confidence X%, the path is Y km long and the GPS has Z% distance variation". Or would I have to know the exact mean to calculate that?
Best Answer
As I mentioned, even though you cannot compute a confidence interval for the mean of the distribution, you can determine a nonparametric tolerance interval. A tolerance interval tells you, with a given level of confidence, what percentage of the distribution of your run lengths falls within the interval. It is constructed using upper and lower order statistics as the end points. Since you can say that the minimum distance out of 20 runs was 3.9 km and the maximum was 4.05 km, that range can be used as a tolerance interval with a certain confidence and coverage.

Table A.16 in Hahn and Meeker's book "Statistical Intervals" provides the answer. From it we see that you can be 99% confident that the interval spanned by the sample minimum and maximum includes at least 75% of the distribution. To claim higher coverage you would need a larger sample size: a sample of 50 gives 99% confidence in 90% coverage, a sample of 100 gives 99% confidence in 95% coverage, and 99% confidence in 99% coverage requires a sample size of 500. The required sample size for a given coverage goes down somewhat if you lower the confidence level from 99% to, say, 95% or 90%.
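If you want to compute these numbers yourself rather than look them up in a table, a minimal sketch is below. It assumes the standard result that, for n i.i.d. observations from a continuous distribution, the coverage of the interval from the sample minimum to the sample maximum follows a Beta(n − 1, 2) distribution; exact values may differ slightly from rounded table entries. The function and variable names are just illustrative.

```python
import numpy as np
from scipy.stats import beta

def coverage_confidence(n, p):
    """Confidence that the interval [sample min, sample max] from n i.i.d.
    continuous observations covers at least a proportion p of the
    distribution.  The coverage of that interval is Beta(n - 1, 2)
    distributed, so the confidence is P(coverage >= p)."""
    return 1 - beta.cdf(p, n - 1, 2)

def required_sample_size(p, conf):
    """Smallest n for which [min, max] is a tolerance interval with
    coverage p at confidence level conf (simple linear search)."""
    n = 2
    while coverage_confidence(n, p) < conf:
        n += 1
    return n

# 20 runs, as in the question: confidence that [3.9 km, 4.05 km]
# covers at least 75% of the distribution of run lengths.
print(coverage_confidence(20, 0.75))

# Sample sizes needed for 99% confidence at various coverages.
for p in (0.75, 0.90, 0.95, 0.99):
    print(p, required_sample_size(p, 0.99))
```

Note that this only tells you where most of your measured distances fall; it does not separate the true path length from the GPS error, and the truncation from "running a bit extra to reach 4km" still affects where that interval sits.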