Why do these statements not follow logically from a 95% CI for the mean?

I've been reading Hoekstra et al.'s 2014 paper, "Robust misinterpretation of confidence intervals", which I downloaded from Wagenmakers's website.

On the penultimate page the following image appears.


According to the authors, False is the correct answer to all these statements. I am not very sure why the statements are false, and as far as I can tell the rest of the paper does not attempt to explain this.

I believe that 1-2 and 4 aren't correct because they assert something about the probable value of the true mean, when the true mean has a definite value that is unknown. Is this a convincing distinction?

Regarding 3, I understand that one is not meant to make assertions about the likelihood the null hypothesis is incorrect, though I'm not so sure of the reason why.

Similarly, 6 can't be true because it implies that the true mean is changing from experiment to experiment.

The one I really don't understand at all is 5. Why is that one wrong? If I have a process that produces CIs containing the true mean 95% of the time, why shouldn't I say I have 95% confidence that the population value is between 0.1 and 0.4? Is it because we might have some special information about the sample we just took that would make us think it is likely to be one of the 5% that do not contain the true mean? For example, 0.13 might lie inside the confidence interval yet not be considered a plausible value within some specific research context, e.g. because that value would conflict with previous theory.

What does confidence mean in this context, anyway?

The very meaning of question (5) depends on some undisclosed interpretation of "confidence." I searched the paper carefully and found no attempt to define "confidence" or what it might mean in this context. The paper's explanation of its answer to question (5) is

"… [it] mentions the boundaries of the CI whereas … a CI can be used to evaluate only the procedure and not a specific interval."

This is both specious and misleading. First, if you cannot evaluate the result of the procedure, then what good is the procedure in the first place? Second, the statement in the question is not about the procedure, but about the reader's "confidence" in its results.

The authors defend themselves:

"Before proceeding, it is important to recall the correct definition of a CI. A CI is a numerical interval constructed around the estimate of a parameter. Such an interval does not, however, directly indicate a property of the parameter; instead, it indicates a property of the procedure, as is typical for a frequentist technique."

Their bias emerges in the last phrase: "frequentist technique" (written, perhaps, with an implicit sneer). Although this characterization is correct, it is critically incomplete. It fails to notice that a confidence interval is also a property of the experimental methods (how samples were obtained and measured) and, more importantly, of nature herself. That is the only reason why anyone would be interested in its value.

I recently had the pleasure of reading Edward Batschelet's Circular Statistics in Biology (Academic Press, 1981). Batschelet writes clearly and to the point, in a style directed at the working scientist. Here is what he says about confidence intervals:

"An estimate of a parameter without indications of deviations caused by chance fluctuations has little scientific value.

"Whereas the parameter to be estimated is a fixed number, the confidence limits are determined by the sample. They are statistics and, therefore, dependent on chance fluctuations. Different samples drawn from the same population lead to different confidence intervals."

[The emphasis is in the original, at pp. 84-85.]

Notice the difference in emphasis: whereas the paper in question focuses on the procedure, Batschelet focuses on the sample and specifically on what it can reveal about the parameter and how much that information can be affected by "chance fluctuations." I find this unabashedly practical, scientific approach far more constructive, illuminating, and, ultimately, useful.

A fuller characterization of confidence intervals than offered by the paper therefore would have to proceed something like this:

A CI is a numerical interval constructed around the estimate of a parameter. Anyone agreeing with the assumptions underlying the CI construction is justified in saying they are confident that the parameter lies within the interval: this is the meaning of "confident." This meaning is broadly in accord with conventional non-technical meanings of confidence because under many replications of the experiment (whether or not they actually take place) the CI, although it will vary, is expected to contain the parameter most of the time.

In this fuller, more conventional, and more constructive sense of "confidence," the answer to question (5) is true.
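
The replication claim in that characterization (the CI varies from sample to sample but is expected to contain the parameter most of the time) can be checked with a small simulation. What follows is only a sketch under assumptions that are mine, not the paper's or Batschelet's: a normal population with known standard deviation `sigma`, a z-interval for the mean, and illustrative values for `true_mean`, `n`, and the number of replications.

```python
import random
from statistics import NormalDist, fmean


def z_interval(sample, sigma, level=0.95):
    """Two-sided CI for the mean, assuming a normal population with known sigma."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    half_width = z * sigma / len(sample) ** 0.5
    centre = fmean(sample)
    return centre - half_width, centre + half_width


def coverage(true_mean=0.25, sigma=1.0, n=50, replications=20_000, seed=1):
    """Fraction of replicated experiments whose CI contains the fixed true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replications):
        # The parameter stays fixed; only the sample (and hence the CI) changes.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        lo, hi = z_interval(sample, sigma)
        hits += lo <= true_mean <= hi
    return hits / replications


if __name__ == "__main__":
    print(f"fraction of intervals covering the true mean: {coverage():.3f}")
```

The printed fraction should land close to 0.95. Note what the simulation does and does not show: it describes the long-run behavior of the procedure, which is exactly the sense in which the fuller definition above licenses being "confident" in any one interval.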
