Student’s t vs Mann-Whitney U for small equal samples

I have two groups of data to compare using a t-test, each of sample size n=5. Checking whether the test's normality assumption holds is difficult because the samples are so small.

I have read that a t-test is robust to non-normality in this case since the sample sizes are equal. However, if the data were highly non-normal it might no longer be robust. An alternative would be to perform a Mann-Whitney U test, but this test has less power than the t-test (though I'm not sure how much less). I encounter situations like this all the time and realise there may not be a definitive answer, but I am curious how others in the community would approach this sort of problem. I typically just go with the t-test and tell others to do the same, since they are more familiar with it than with the MW test.

How I would approach this: I would not use a t-test if the requirements for applying it are not satisfied (or not known to be satisfied). This seems obvious to me and is good practice for any hypothesis test. Mere familiarity with one test over another is not a justification. Can you give details of where you read that a comparison of central tendency between two small samples from unknown and unverifiable population distributions is robust to non-normality? I find this hard to accept.

Nonparametric tests like M-W are usually read as comparing population medians rather than means (strictly, that reading holds when the two distributions have the same shape and differ only by a shift), and the median is the more robust measure. This matters especially in your case, where you cannot check the population distribution. Do you have any prior information or evidence that the two samples were drawn from the same population? A description of the experiment might help to judge.

You may be pushing the boundaries of what hypothesis testing can do for you here. With such small samples (and no repeat experiments available, I presume?) and without any information on the population, I would not want to invoke and quantify statistical concepts like power, significance, confidence intervals, etc. All you can really do is list some descriptive statistics for the two samples (quote the two medians or their difference, relative to their sum; the ranges and the maximum range of their difference, etc.). However, if you could repeat the experiment by drawing many samples of size n=5, that would change the picture a lot. Is that an option for you?
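To illustrate how little a test can say with n=5 per group, here is a minimal sketch in plain Python. It lists the medians as descriptives and computes an exact two-sided Mann-Whitney p-value by enumerating every way of splitting the ten pooled values into two groups of five. The data values and the function names (`mann_whitney_u`, `exact_mw_pvalue`) are made up for illustration and are not taken from the question.

```python
from itertools import combinations
from statistics import median


def mann_whitney_u(x, y):
    """U statistic for x: count of pairs (xi, yj) with xi > yj; ties count 0.5."""
    return sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )


def exact_mw_pvalue(x, y):
    """Exact two-sided p-value: enumerate every split of the pooled data."""
    pooled = list(x) + list(y)
    n, m = len(x), len(y)
    centre = n * m / 2  # value of U expected under the null hypothesis
    observed = abs(mann_whitney_u(x, y) - centre)
    extreme = total = 0
    for idx in combinations(range(n + m), n):
        chosen = set(idx)
        gx = [pooled[i] for i in idx]
        gy = [pooled[i] for i in range(n + m) if i not in chosen]
        if abs(mann_whitney_u(gx, gy) - centre) >= observed:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical data: two groups of five with no overlap at all.
group_a = [1.1, 2.3, 2.9, 3.4, 4.0]
group_b = [4.2, 5.1, 5.8, 6.3, 7.7]

print("medians:", median(group_a), median(group_b))
print("U:", mann_whitney_u(group_a, group_b))
print("exact two-sided p:", exact_mw_pvalue(group_a, group_b))
```

With two groups of five there are only 252 possible splits, so even completely separated samples like these cannot give a two-sided p-value below 2/252 (about 0.0079). The test statistic can only take a handful of values, which is one concrete way to see the limits of testing at this sample size.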
