This is a problem from the "7th Kolmogorov Student Olympiad in Probability Theory":
Given one observation $X$ from a $\operatorname{Normal}(\mu,\sigma^2)$ distribution with both parameters unknown, give a confidence interval for $\sigma^2$ with a confidence level of at least 99%.
It seems to me that this should be impossible. I have the solution, but haven't read it yet. Any thoughts?
I'll post the solution in a couple days.
[Follow-up edit: Official solution posted below. Cardinal's solution is longer, but gives a better confidence interval. Thanks also to Max and Glen_b for their input.]
Best Answer
Viewed through the lens of probability inequalities and connections to the multiple-observation case, this result might not seem so impossible, or, at least, it might seem more plausible.
Let $\renewcommand{\Pr}{\mathbb P}\newcommand{\Ind}[1]{\mathbf 1_{(#1)}}X \sim \mathcal N(\mu,\sigma^2)$ with $\mu$ and $\sigma^2$ unknown. We can write $X = \sigma Z + \mu$ for $Z \sim \mathcal N(0,1)$.
Main Claim: $[0,X^2/q_\alpha)$ is a $(1-\alpha)$ confidence interval for $\sigma^2$ where $q_\alpha$ is the $\alpha$-level quantile of a chi-squared distribution with one degree of freedom. Furthermore, since this interval has exactly $(1-\alpha)$ coverage when $\mu = 0$, it is the narrowest possible interval of the form $[0,b X^2)$ for some $b \in \mathbb R$.
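To make the main claim concrete, here is a minimal Python sketch that computes the right endpoint $X^2/q_\alpha$. The function names `chi2_1_quantile` and `variance_upper_bound` are my own, not from the original solution. It assumes only the standard identity $\Pr(Z^2 \leq q) = 2\Phi(\sqrt q) - 1$, so the one-degree-of-freedom quantile can be obtained from a normal quantile without any extra libraries.

```python
from statistics import NormalDist


def chi2_1_quantile(alpha: float) -> float:
    """alpha-level quantile of a chi-squared distribution with 1 degree of freedom.

    Uses P(Z^2 <= q) = 2*Phi(sqrt(q)) - 1 for a standard normal Z,
    so sqrt(q) is the (1 + alpha)/2 quantile of the standard normal.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf((1.0 + alpha) / 2.0)
    return z * z


def variance_upper_bound(x: float, alpha: float) -> float:
    """Right endpoint X^2 / q_alpha of the interval [0, X^2 / q_alpha)."""
    return x * x / chi2_1_quantile(alpha)


# Illustrative values only: alpha = 0.01 (99% confidence), observation x = 1.0.
q = chi2_1_quantile(0.01)
upper = variance_upper_bound(1.0, 0.01)
print(f"q_alpha = {q:.6g}, interval = [0, {upper:.6g})")
```

At the 99% level the multiplier $1/q_\alpha$ is in the thousands, so the interval is valid but very wide, which is about what one would expect from a single observation.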
A reason for optimism
Recall that in the $n \geq 2$ case, with $T = \sum_{i=1}^n (X_i - \bar X)^2$, the typical $(1-\alpha)$ confidence interval for $\sigma^2$ is $$ \Big(\frac{T}{q_{n-1,1-\alpha/2}}, \frac{T}{q_{n-1,\alpha/2}} \Big) \>, $$ where $q_{k,a}$ is the $a$-level quantile of a chi-squared with $k$ degrees of freedom. This, of course, holds for any $\mu$. While this is the most popular interval (called the equal-tailed interval for obvious reasons), it is neither the only one nor even the one of smallest width! As should be apparent, another valid selection is $$ \Big(0,\frac{T}{q_{n-1,\alpha}}\Big) \>. $$
Since $T \leq \sum_{i=1}^n X_i^2$, the interval $$ \Big(0,\frac{\sum_{i=1}^n X_i^2}{q_{n-1,\alpha}}\Big) \> $$ also has coverage of at least $(1-\alpha)$.
Viewed in this light, we might then be optimistic that the interval in the main claim is true for $n = 1$. The main difference is that there is no zero-degree-of-freedom chi-squared distribution for the case of a single observation, so we must hope that using a one-degree-of-freedom quantile will work.
A half step toward our destination (Exploiting the right tail)
Before diving into a proof of the main claim, let's first look at a preliminary claim that is not nearly as strong or satisfying statistically, but perhaps gives some additional insight into what is going on. You can skip down to the proof of the main claim below, without much (if any) loss. In this section and the next, the proofs—while slightly subtle—are based on only elementary facts: monotonicity of probabilities, and symmetry and unimodality of the normal distribution.
Auxiliary claim: $[0,X^2/z^2_\alpha)$ is a $(1-\alpha)$ confidence interval for $\sigma^2$ as long as $\alpha > 1/2$. Here $z_\alpha$ is the $\alpha$-level quantile of a standard normal.
Proof. $|X| = |-X|$ and $|\sigma Z + \mu| \stackrel{d}{=} |-\sigma Z+\mu|$ by symmetry, so in what follows we can take $\mu \geq 0$ without loss of generality. Now, for $\theta \geq 0$ and $\mu \geq 0$, $$ \Pr(|X| > \theta) \geq \Pr( X > \theta) = \Pr( \sigma Z + \mu > \theta) \geq \Pr( Z > \theta/\sigma) \>, $$ and so with $\theta = z_{\alpha} \sigma$, we see that $$ \Pr(0 \leq \sigma^2 < X^2 / z^2_\alpha) \geq 1 - \alpha \>. $$ This works only for $\alpha > 1/2$, since that is what is needed for $z_\alpha > 0$.
This proves the auxiliary claim. While illustrative, it is unsatisfying from a statistical perspective since it requires an absurdly large $\alpha$ to work.
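For anyone who wants to see the auxiliary claim numerically, the sketch below evaluates the exact coverage $\Pr(|Z + a| > z_\alpha) = \Phi(a - z_\alpha) + \Phi(-a - z_\alpha)$ for a few values of $a = \mu/\sigma \geq 0$. The helper name `aux_coverage`, the choice $\alpha = 0.75$, and the grid of $a$ values are all my own illustrative assumptions.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal


def aux_coverage(a: float, alpha: float) -> float:
    """Exact P(X^2 > z_alpha^2 * sigma^2) when mu/sigma = a >= 0.

    Only meaningful for alpha > 1/2, so that z_alpha > 0.
    """
    if not 0.5 < alpha < 1.0:
        raise ValueError("the auxiliary claim needs 1/2 < alpha < 1")
    z = STD.inv_cdf(alpha)
    # P(|Z + a| > z) = P(Z > z - a) + P(Z < -z - a)
    return STD.cdf(a - z) + STD.cdf(-a - z)


alpha = 0.75
coverages = [aux_coverage(a, alpha) for a in (0.0, 0.5, 1.0, 2.0, 5.0)]
for a, c in zip((0.0, 0.5, 1.0, 2.0, 5.0), coverages):
    print(f"a = {a:>3}: coverage = {c:.4f} (claimed >= {1 - alpha:.2f})")
```

At $a = 0$ the coverage is $2(1-\alpha)$ rather than $1-\alpha$, which shows how much the one-sided bound in the proof gives away.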
Proving the main claim
A refinement of the above argument leads to a result that will work for an arbitrary confidence level. First, note that $$ \Pr(|X| > \theta) = \Pr(|Z + \mu/\sigma| > \theta / \sigma ) \>. $$ Set $a = \mu/\sigma \geq 0$ and $b = \theta / \sigma \geq 0$. Then, $$ \Pr(|Z + a| > b) = \Phi(a-b) + \Phi(-a-b) \>. $$ If we can show that the right-hand side increases in $a$ for every fixed $b$, then we can employ an argument similar to the one in the previous section. This is at least plausible, since we'd like to believe that if the mean increases, then it becomes more probable that we see a value with a modulus that exceeds $b$. (However, we have to watch out for how quickly the mass is decreasing in the left tail!)
Set $f_b(a) = \Phi(a-b) + \Phi(-a-b)$. Then $$ f'_b(a) = \varphi(a-b) - \varphi(-a-b) = \varphi(a-b) - \varphi(a+b) \>. $$ Note that $f'_b(0) = 0$ and for positive $u$, $\varphi(u)$ is decreasing in $u$. Now, for $a \in (0,2b)$, it is easy to see that $\varphi(a-b) \geq \varphi(-b) = \varphi(b) \geq \varphi(a+b)$, while for $a \geq 2b$ we have $a + b \geq a - b \geq b \geq 0$ and so $\varphi(a-b) \geq \varphi(a+b)$ directly. These facts taken together imply that $$ f'_b(a) \geq 0 $$ for all $a \geq 0$ and any fixed $b \geq 0$.
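As a sanity check on the monotonicity step (not a substitute for the proof), here is a small Python sketch that evaluates $f_b(a)$ and $f'_b(a)$ on a grid. The grid range $[0, 6]$ with step $0.1$ and the names `f`, `f_prime`, `min_derivative`, and `is_nondecreasing` are my own choices.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal


def f(a: float, b: float) -> float:
    """f_b(a) = Phi(a - b) + Phi(-a - b) = P(|Z + a| > b)."""
    return STD.cdf(a - b) + STD.cdf(-a - b)


def f_prime(a: float, b: float) -> float:
    """Derivative of f_b in a: phi(a - b) - phi(a + b)."""
    return STD.pdf(a - b) - STD.pdf(a + b)


grid = [i / 10 for i in range(61)]  # 0.0, 0.1, ..., 6.0

# Smallest derivative over the grid; the claim says this should not be negative.
min_derivative = min(f_prime(a, b) for a in grid for b in grid)

# Direct check that f_b is nondecreasing in a along the grid
# (a tiny tolerance absorbs floating-point rounding in the CDF).
is_nondecreasing = all(
    f(a2, b) >= f(a1, b) - 1e-12
    for b in grid
    for a1, a2 in zip(grid, grid[1:])
)

print("min f'_b(a) on grid:", min_derivative)
print("f_b nondecreasing in a on grid:", is_nondecreasing)
```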
Hence, we have shown that for $a \geq 0$ and $b \geq 0$, $$ \Pr(|Z + a| > b) \geq \Pr(|Z| > b) = 2\Phi(-b) \>. $$
Unraveling all of this, if we take $\theta = \sqrt{q_\alpha} \sigma$, we get $$ \Pr(X^2 > q_\alpha \sigma^2) \geq \Pr(Z^2 > q_\alpha) = 1 - \alpha \>, $$ which establishes the main claim.
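If you would rather see the coverage empirically, the following Monte Carlo sketch estimates how often $\sigma^2 < X^2/q_\alpha$ holds. The function name `coverage`, the parameter values, the number of trials, and the seed are arbitrary choices of mine. The estimates are subject to simulation noise, so treat them as a rough check only.

```python
import random
from statistics import NormalDist


def coverage(mu: float, sigma: float, alpha: float,
             trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(sigma^2 < X^2 / q_alpha) for X ~ N(mu, sigma^2)."""
    rng = random.Random(seed)
    # q_alpha for chi-squared with 1 d.f., via P(Z^2 <= q) = 2*Phi(sqrt(q)) - 1.
    z = NormalDist().inv_cdf((1.0 + alpha) / 2.0)
    q = z * z
    hits = 0
    for _ in range(trials):
        x = rng.gauss(mu, sigma)
        if sigma * sigma < x * x / q:
            hits += 1
    return hits / trials


alpha = 0.01
cov_centered = coverage(mu=0.0, sigma=2.0, alpha=alpha)  # should be near 1 - alpha
cov_shifted = coverage(mu=3.0, sigma=2.0, alpha=alpha)   # should be at least about 1 - alpha
print(f"mu = 0: {cov_centered:.4f}   mu = 3: {cov_shifted:.4f}")
```

With $\mu = 0$ the estimate should sit close to $1-\alpha$, since that is the case where the bound is tight. Moving $\mu$ away from zero should only push it up.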
Closing remark: A careful reading of the above argument shows that it uses only the symmetry and unimodality of the normal distribution. Hence, the approach works analogously for obtaining confidence intervals from a single observation from any symmetric unimodal location-scale family, e.g., Cauchy or Laplace distributions.
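To illustrate the closing remark, here is a sketch for the Cauchy case. It rests on my own assumptions: I parameterize $X = \sigma Z + \mu$ with $Z$ standard Cauchy and build the interval for the scale $\sigma$ (the Cauchy has no variance), namely $[0, |X|/t_\alpha)$ with $t_\alpha = \tan(\pi\alpha/2)$ the $\alpha$-level quantile of $|Z|$. The function names are hypothetical. The code evaluates the exact coverage $F(a - t_\alpha) + F(-a - t_\alpha)$ for a few values of $a = \mu/\sigma$, where $F$ is the standard Cauchy CDF.

```python
import math


def cauchy_cdf(x: float) -> float:
    """CDF of the standard Cauchy distribution."""
    return 0.5 + math.atan(x) / math.pi


def cauchy_abs_quantile(alpha: float) -> float:
    """alpha-level quantile of |Z| for standard Cauchy Z: solves (2/pi)*atan(t) = alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return math.tan(math.pi * alpha / 2.0)


def scale_upper_bound(x: float, alpha: float) -> float:
    """Right endpoint |X| / t_alpha of the interval [0, |X| / t_alpha) for the scale."""
    return abs(x) / cauchy_abs_quantile(alpha)


def exact_coverage(a: float, alpha: float) -> float:
    """Exact P(|Z + a| > t_alpha), i.e. coverage when mu/sigma = a >= 0."""
    t = cauchy_abs_quantile(alpha)
    return cauchy_cdf(a - t) + cauchy_cdf(-a - t)


alpha = 0.01
coverages = [exact_coverage(a, alpha) for a in (0.0, 0.5, 1.0, 3.0, 10.0)]
print([round(c, 4) for c in coverages])
```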