My x and y data (of size n) have a non-normal distribution. I'm measuring the mean of each distribution (I'm using Python):

`x_m = np.mean(x)`
`y_m = np.mean(y)`

I would like to measure an error that represents the standard deviation of the mean (from the 16th and 84th percentiles). Do you think this is the right way to do it?

`y1 = numpy.percentile(y, 16) / n`
`y2 = numpy.percentile(y, 84) / n`

Is there any other way to measure such an error?

Thank you for your help.


#### Best Answer

I'm afraid you've gotten confused. The standard deviation of a random variable is defined according to a simple formula, namely: $$\sigma_y = \sqrt{E[(Y - \mu_y)^2]}$$ (This can be rearranged in various ways, so you may see somewhat different formulae too; see the Wikipedia page on standard deviation.)

Now, for a normal distribution, it turns out that about 68% of the data (i.e., the data between the 16th and 84th percentiles) lie within one standard deviation of the mean, but this in **no way** implies that the standard deviation is, in general, connected to those percentiles.
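To make the point concrete, here is a minimal sketch comparing the standard deviation with half the 16th-to-84th percentile spread. The two samples, the seed, and the helper `half_width_16_84` are made up for illustration and are not from the question. Both distributions have a true standard deviation of 1, yet for the exponential one the percentile half-width is about 0.83 in theory, so the two numbers only agree in the normal case.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility


def half_width_16_84(data):
    """Half the distance between the 16th and 84th percentiles."""
    p16, p84 = np.percentile(data, [16, 84])
    return (p84 - p16) / 2


# Two hypothetical samples whose true standard deviation is 1 in both cases.
normal = rng.normal(loc=0.0, scale=1.0, size=100_000)
skewed = rng.exponential(scale=1.0, size=100_000)

for name, data in [("normal", normal), ("exponential", skewed)]:
    std = np.std(data, ddof=1)
    half = half_width_16_84(data)
    print(f"{name:12s} std = {std:.3f}   (p84 - p16) / 2 = {half:.3f}")
```

For the normal sample the two columns should nearly coincide; for the exponential sample the percentile-based number noticeably underestimates the standard deviation.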

If you want to report standard deviations, calculate them with the formula above and go from there.

Edit: As @User603 points out, one could use percentiles to derive a (weak) *lower bound* on a standard deviation using Chebyshev's inequality. Chebyshev's inequality is usually paraphrased as saying that "most" values are near the mean. In particular, at least $100(1 - 1/k^2)$ percent of the values are within $k$ standard deviations of the population mean, *regardless of the distribution*. Note that these bounds are pretty loose, and they get even looser when you have sample means and standard deviations instead of population values, so this is of great theoretical interest but is rarely used to analyze data.
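As a rough illustration of how loose the Chebyshev bound is, the sketch below compares the guaranteed fraction $1 - 1/k^2$ with the fraction actually observed in a skewed sample. The log-normal data, the seed, and the helper `fraction_within` are assumptions made for this example. It treats the sample as the whole population (so it uses `ddof=0`), in which case the inequality holds exactly.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, chosen only for reproducibility

# Hypothetical, strongly skewed data standing in for y.
y = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)

mu = y.mean()
sigma = y.std()  # ddof=0: the sample is treated as the full population


def fraction_within(data, center, spread, k):
    """Fraction of values strictly closer than k * spread to center."""
    return np.mean(np.abs(data - center) < k * spread)


for k in (1.5, 2, 3):
    bound = 1 - 1 / k**2
    observed = fraction_within(y, mu, sigma, k)
    print(f"k = {k}: Chebyshev guarantees >= {bound:.3f}, observed {observed:.3f}")
```

The observed fractions should sit well above the guaranteed ones, which is why the bound is seldom useful for estimating a standard deviation from percentiles.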

Therefore, I'd reiterate my suggestion that you just calculate the standard deviation directly. In numpy, it's just:

`numpy.std(y, ddof=1) `

(Setting the `ddof=1` keyword argument divides by $n - 1$ instead of $n$, which gets you a less biased estimate of the sample standard deviation; see the numpy std docs for more details.)
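A small sketch of what `ddof` changes, using made-up values for `y`. The last line is an assumption about what the question means by "standard deviation of the mean": if that is the usual standard error of the mean for independent observations, it is the sample standard deviation divided by $\sqrt{n}$, and it does not require the data to be normal.

```python
import numpy as np

# Made-up values, chosen so the arithmetic is easy to follow.
y = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = y.size

std_population = np.std(y)      # ddof=0 (the default): divides by n
std_sample = np.std(y, ddof=1)  # ddof=1: divides by n - 1

# The same sample standard deviation written out by hand.
manual = np.sqrt(np.sum((y - y.mean()) ** 2) / (n - 1))

# Assumption: "standard deviation of the mean" means the standard error.
sem = std_sample / np.sqrt(n)

print(f"ddof=0: {std_population:.3f}")
print(f"ddof=1: {std_sample:.3f} (by hand: {manual:.3f})")
print(f"standard error of the mean: {sem:.3f}")
```

The difference between the two `ddof` settings shrinks as `n` grows, so it matters mostly for small samples.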
