Solved – Bootstrapped p-value

I have a p-value that I generate via resampling.

Resamples = 5000

Positive findings = 1000

P-value = 1000/5000 = 0.2

How can I compute the 95% confidence interval for this p-value?

I would assume it's a function of the number of positive findings and the number of resamples (1000 and 5000 in this case, respectively).

Is the answer p-value +- 1.96*sqrt(1000/5000 * (1 – 1000/5000) / 1000)?

Why this matters: Resampling is very expensive in my code, and I'd like to stop resampling as soon as the 95% confidence interval for the bootstrapped p-value no longer includes 0.05. Right now I'm doing a million resamples to estimate every p-value, and it is very slow.

If you want to minimize the number of resamples, you are probably better off estimating the $p$-value as (# positives + 1) / (# resamples + 1); see Davison and Hinkley (1997, chapter 4). In that case you can get a good estimate of the Monte Carlo confidence interval from the 2.5th and 97.5th percentiles of the beta distribution with parameters # positives + 1 and # resamples + 1 - # positives. I discussed the logic behind using the beta distribution on pages 9 and 10 of this presentation.
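As a rough illustration of the beta-percentile interval described above, here is a minimal Python sketch using only the standard library. The function names (`beta_cdf_int`, `beta_ppf_int`, `mc_pvalue_interval`) are made up for this example and do not come from the original post. The code assumes integer counts, so the beta CDF can be written as a binomial tail sum, and it finds the percentiles by simple bisection.

```python
import math


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x, for positive integer a and b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a),
    with each term evaluated in log space to avoid overflow.
    """
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    n = a + b - 1
    log_x, log_1mx = math.log(x), math.log1p(-x)
    log_n_fact = math.lgamma(n + 1)
    total = 0.0
    for j in range(a, n + 1):
        log_term = (log_n_fact - math.lgamma(j + 1) - math.lgamma(n - j + 1)
                    + j * log_x + (n - j) * log_1mx)
        total += math.exp(log_term)
    return min(total, 1.0)


def beta_ppf_int(q, a, b, tol=1e-10):
    """Quantile of Beta(a, b) by bisection on the CDF (hypothetical helper)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf_int(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def mc_pvalue_interval(positives, resamples, level=0.95):
    """Return (p_hat, lower, upper) for a Monte Carlo p-value.

    p_hat is (positives + 1) / (resamples + 1); the interval uses the
    percentiles of Beta(positives + 1, resamples + 1 - positives).
    """
    p_hat = (positives + 1) / (resamples + 1)
    a = positives + 1
    b = resamples + 1 - positives
    tail = (1 - level) / 2
    return p_hat, beta_ppf_int(tail, a, b), beta_ppf_int(1 - tail, a, b)


# Numbers from the question: 1000 positives out of 5000 resamples.
p_hat, lower, upper = mc_pvalue_interval(1000, 5000)
# Stopping rule from the question: stop once 0.05 is outside the interval.
can_stop = not (lower <= 0.05 <= upper)
print(p_hat, lower, upper, can_stop)
```

If SciPy is available, `scipy.stats.beta.ppf(q, a, b)` should give the same percentiles far faster, which matters when the check runs after every batch of resamples. For comparison, the normal approximation in the question would need the number of resamples under the square root, i.e. sqrt(p(1 - p) / 5000), not the number of positives.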

Davison, A.C. and D.V. Hinkley (1997). Bootstrap Methods and Their Application. Cambridge University Press.
