A woman I was working for asked me to do a one-way ANOVA on some data. I replied that the data were repeated measures (time series) data, and that I thought the assumption of independence was violated. She replied that I should not worry about the assumptions: just do the test, and she would take into account that the assumptions might not have been met.
That did not seem right to me. I did some research and found this wonderful blog post by David Robinson, K-means clustering is not a free lunch, which exposed me to the No Free Lunch Theorem. I have looked at the original paper and some follow-on work, and frankly the math is a bit over my head.
The gist of it, according to David Robinson, seems to be that the power of a statistical test comes from its assumptions, and he gives two great examples. As I wade through the other articles and blog posts about it, it always seems to be referenced in terms of either supervised learning or search.
So my question is, does this theorem apply to statistical tests in general? In other words, can one say that the power of a t-test or ANOVA comes from its adherence to the assumptions, and cite the No Free Lunch Theorem?
I owe my former boss a final document regarding the work I did, and I would like to know if I can reference the No Free Lunch Theorem in stating that you cannot just ignore the assumptions of a statistical test and say you'll take that into account when evaluating the results.
Best Answer
I don't know of a proof, but I'll bet this applies quite generally. An example is an experiment with 2 subjects in each of 2 treatment groups. The Wilcoxon test cannot possibly be significant at the 0.05 level, but the t-test can: with only 2 subjects per group there are just 6 possible rank arrangements, so the smallest attainable two-sided Wilcoxon p-value is 2/6, about 0.33, whereas the t-test's normality assumption lets it go much lower. You could say that more than half of the t-test's power comes from its assumptions and not just from the data. To your original problem, it is not appropriate to proceed as if the observations within a subject are independent. Taking things into account after the fact is certainly not good statistical practice, except in very special circumstances (e.g., cluster sandwich estimators).
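Below is a minimal sketch of that point, assuming Python with a recent SciPy (the data values are made up purely for illustration): even with complete separation between the two groups, the exact two-sided Wilcoxon/Mann-Whitney p-value cannot fall below 2/6, while the t-test, leaning on its normality assumption, easily reaches conventional significance.

```python
from scipy import stats

# Hypothetical data: 2 subjects per group, maximally separated.
group_a = [1.0, 1.1]
group_b = [10.0, 10.2]

# Exact Mann-Whitney / Wilcoxon rank-sum test: with n = 2 per group there are
# only C(4, 2) = 6 possible rank arrangements, so the smallest achievable
# two-sided p-value is 2/6, about 0.333 -- it can never reach 0.05.
u_stat, u_p = stats.mannwhitneyu(group_a, group_b,
                                 alternative="two-sided", method="exact")
print(f"Wilcoxon/Mann-Whitney p = {u_p:.3f}")   # ~0.333

# Two-sample t-test on the same data: the normality assumption supplies the
# extra power, and the p-value drops far below 0.05.
t_stat, t_p = stats.ttest_ind(group_a, group_b)
print(f"t-test p = {t_p:.6f}")                  # well below 0.05
```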
Similar Posts:
- Solved – Assumptions behind cross-validation
- Solved – Using Kruskal Wallis vs One Way Anova with small sample size
- Solved – trust the result of a one-way ANOVA with many treatments each with very few replicates
- Solved – Zero inflated negative binomial in Stata