Just learning some stats, so please forgive me if this is simple, but I couldn't find a good explanation.

Let $X \sim \mathcal{N}(\mu,\sigma^2)$ and $Y = e^X$. To find an approximately 95% confidence interval, note

\begin{align*}
P(a \leq Y \leq b) & = P(a \leq e^X \leq b) \\
& = P(\log a \leq X \leq \log b) \\
& = P\left(\frac{\log a - \mu}{\sigma} \leq Z \leq \frac{\log b - \mu}{\sigma}\right) \\
& = \frac{1}{\sqrt{2\pi}} \int_{\frac{\log a - \mu}{\sigma}}^{\frac{\log b - \mu}{\sigma}} e^{-z^2/2} \, dz \\
& \triangleq 0.95,
\end{align*}

for which we know

\begin{align*}
\frac{\log b - \mu}{\sigma} & \approx 2 \iff b = e^{\mu + 2\sigma}, \\
\frac{\log a - \mu}{\sigma} & \approx -2 \iff a = e^{\mu - 2\sigma}.
\end{align*}

Then, my understanding of a confidence interval (CI) would lead me to believe 95% of the values of $Y$ should lie within the interval

$$

[e^{\mu + \sigma^2/2} - e^{\mu - 2\sigma},\; e^{\mu + \sigma^2/2} + e^{\mu + 2\sigma}],

$$

where $e^{\mu + \sigma^2/2}$ is the mean of $Y$. Is this correct? Specifically, when we speak of a "95% confidence interval," do we mean that 95% of the values lie within the *mean* of the random variable, or another average like median or mode?

Finally, to clear up a source of confusion on notation. For a normally-distributed random variable $X \sim \mathcal{N}(\mu,\sigma^2)$, the variance $\sigma^2$ is also the square of the *standard deviation* (SD) $\sigma$, for which an approximate 95% confidence interval is $[\mu - 2\sigma, \mu + 2\sigma]$. Similarly for a lognormally-distributed random variable $Y = e^X$, its variance is given by $(e^{\sigma^2} - 1) e^{2\mu + \sigma^2}$, and I believe its standard deviation would again just be the square root of this (by definition), namely $\left(\sqrt{e^{\sigma^2} - 1}\right) e^{\mu + \sigma^2/2}$. But now we don't have that an approximate 95% confidence interval is $[\text{mean} - 2\,\text{SD}, \text{mean} + 2\,\text{SD}]$ since the pdf of $Y$ is not symmetric.

So, is the $\text{mean} \pm \text{SD}$ property for a confidence interval only valid for normal random variables?


#### Best Answer

> Is this correct?

No.

i) This isn't a *confidence* interval you're calculating (since those are for parameters or functions of them), nor is it really a prediction interval, a tolerance interval, or any of the more common statistical intervals … since for starters it's based on known population values, not on a sample.

ii) You already calculated the limits of an interval that includes 95% of the probability; it's $(a,b)$, *not* $(\text{mean}-a,\ \text{mean}+b)$.
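To make point ii) concrete, here is a minimal Python sketch that compares the probability content of $(a,b)$ with that of the interval proposed in the question. The parameter values ($\mu = 0$, $\sigma = 1$) and the helper names `lognormal_cdf` and `coverage` are my own choices for illustration; they are not from the question.

```python
from math import exp, log
from statistics import NormalDist


def lognormal_cdf(y, mu, sigma):
    """P(Y <= y) for Y = exp(X) with X ~ N(mu, sigma^2); zero for y <= 0."""
    if y <= 0:
        return 0.0
    return NormalDist(mu, sigma).cdf(log(y))


def coverage(lo, hi, mu, sigma):
    """Probability that Y falls in [lo, hi]."""
    return lognormal_cdf(hi, mu, sigma) - lognormal_cdf(lo, mu, sigma)


mu, sigma = 0.0, 1.0  # illustrative values only (an assumption)

# Limits derived in the question, using the "approximately 2" cutoff.
a = exp(mu - 2 * sigma)
b = exp(mu + 2 * sigma)
mean_y = exp(mu + sigma**2 / 2)  # mean of the lognormal Y

central = coverage(a, b, mu, sigma)
proposed = coverage(mean_y - a, mean_y + b, mu, sigma)

print(f"(a, b) = ({a:.4f}, {b:.4f}), coverage = {central:.4f}")
print(f"(mean - a, mean + b) coverage = {proposed:.4f}")
```

With these values the interval $(a,b)$ carries the same probability as $\pm 2$ SDs of a normal, while the interval built around the mean of $Y$ covers far less than 95%.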

> do we mean that 95% of the values lie within the mean of the random variable

No. The mean is a single value. How can 95% of a continuous distribution lie "within" a single value?

> But now we don't have that an approximate 95% confidence interval is $[\text{mean} - 2\,\text{SD}, \text{mean} + 2\,\text{SD}]$ since the pdf of $Y$ is not symmetric.

Just because the density isn't symmetric doesn't of itself mean that a symmetric interval can't include 95% of the probability.

It *doesn't* include 95%, as it happens, though it's often fairly close to 95% for unimodal distributions. However, while it works pretty well for $\pm 2\sigma$, that doesn't always carry over nearly as well to other numbers of SDs not close to 2.
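As a rough check of this, the sketch below computes the probability that a lognormal $Y$ lies within $k$ SDs of its mean, using the mean and SD formulas quoted in the question. The values of $\mu$, $\sigma$ and $k$ are arbitrary choices for illustration, and the function names are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def lognormal_mean_sd(mu, sigma):
    """Mean and SD of Y = exp(X), X ~ N(mu, sigma^2)."""
    mean = exp(mu + sigma**2 / 2)
    sd = sqrt(exp(sigma**2) - 1) * mean
    return mean, sd


def coverage_within_k_sd(k, mu, sigma):
    """P(mean - k*SD <= Y <= mean + k*SD) for the lognormal Y."""
    mean, sd = lognormal_mean_sd(mu, sigma)
    x = NormalDist(mu, sigma)

    def cdf(y):
        # Y is positive, so no probability lies at or below zero.
        return x.cdf(log(y)) if y > 0 else 0.0

    return cdf(mean + k * sd) - cdf(mean - k * sd)


# Illustrative parameter values only (an assumption, not from the post).
for sigma in (0.25, 0.5, 1.0):
    row = ", ".join(
        f"k={k}: {coverage_within_k_sd(k, 0.0, sigma):.4f}" for k in (1, 2, 3)
    )
    print(f"sigma={sigma}: {row}")
```

For the values tried here, the $k=2$ coverage stays in the neighbourhood of 95%, whereas for $\sigma = 1$ the $k=1$ coverage is well above the roughly 68% that holds for a normal.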

> So, is the $\text{mean} \pm \text{SD}$ property for a confidence interval only valid for normal random variables?

(Again, keeping in mind that it's not a confidence interval)

Well, actually, for normal random variables, 95% of the distribution is within 1.96 sd's of the mean and 95.4% is within 2 sd's of the mean.

Those numbers are calculated from the normal distribution function; $\Phi(1.96)-\Phi(-1.96)=0.9500$ and $\Phi(2)-\Phi(-2)=0.9545$.
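If you want to reproduce those two figures yourself, one way is Python's standard-library `statistics.NormalDist`; the helper name `central_mass` is just a label I've picked for this sketch.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, SD 1


def central_mass(k):
    """Phi(k) - Phi(-k): probability within k SDs of the mean."""
    return Z.cdf(k) - Z.cdf(-k)


print(f"{central_mass(1.96):.4f}")  # within 1.96 SDs
print(f"{central_mass(2.0):.4f}")   # within 2 SDs
print(f"{Z.inv_cdf(0.975):.4f}")    # cutoff that gives exactly 95% central mass
```

The last line goes the other way: it asks which cutoff leaves 2.5% in each tail, which is where the 1.96 comes from.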
