Solved – How is approximately unbiased bootstrap better than a regular bootstrap with regards to hierarchical clustering

I asked this question at BioStar but did not get a reply, so I'm posting it here.

What is a simple explanation of what an approximately unbiased bootstrap is with regards to hierarchical clustering?

From what I read, it alters the sample size during randomization to calculate p-values.

How is this approach better than the regular bootstrap, which keeps the sample size intact while resampling? Also, is the randomization done with replacement?

Edit: There is an R package, pvclust, that calculates both an ordinary bootstrap p-value and an approximately unbiased (AU) p-value. My apologies for being unclear; I had assumed the difference was due to a difference in the bootstrap method itself. Thanks for all the answers and comments!

Helen: I am the author of books on the bootstrap, but I don't know what you mean by the term "approximately unbiased bootstrap". My guess, based on your description, is that you may be talking about the $m$ out of $n$ bootstrap. The $m$ out of $n$ bootstrap takes an original sample of size $n$ and draws $m$ observations with replacement from that sample, where $m < n$. Each such sample of size $m$ is an $m$ out of $n$ bootstrap sample. Most of the time the ordinary bootstrap provides consistent estimates of the parameter, but there are situations where it fails to be consistent. In those cases the $m$ out of $n$ bootstrap is often consistent, as long as $m$ approaches infinity at a slower rate than $n$. One example is the estimation of a population mean when the sampling distribution does not have a finite variance. Such results have been proven in papers by Peter Bickel and his coauthors.
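To make the distinction concrete, here is a minimal sketch of the $m$ out of $n$ bootstrap in Python with NumPy. The function name and parameters are my own illustration, not pvclust's API; the ordinary bootstrap is simply the special case $m = n$:

```python
import numpy as np

def m_out_of_n_bootstrap(data, stat, m, n_boot=2000, seed=0):
    """Draw n_boot resamples of size m (with replacement) from `data`
    and apply `stat` to each. Hypothetical helper for illustration only."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    return np.array([stat(rng.choice(data, size=m, replace=True))
                     for _ in range(n_boot)])

rng = np.random.default_rng(1)
sample = rng.normal(size=100)                       # original sample, n = 100
reps_m = m_out_of_n_bootstrap(sample, np.mean, m=50)    # m out of n, m = 50
reps_n = m_out_of_n_bootstrap(sample, np.mean, m=100)   # ordinary bootstrap, m = n
print(reps_m.shape, reps_n.shape)
```

Both versions resample with replacement; the only difference is the resample size, which is what lets the $m$ out of $n$ variant recover consistency in the heavy-tailed cases Bickel and coauthors studied.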

Bill Huber has shown that my guess was wrong. It appears that the paper refers to a p-value estimate computed by bootstrapping, for which the authors chose the modifier "approximately unbiased". It is not a variant of the bootstrap itself.
