Solved – How can a t-test be used to compare the distributions between groups of data

I understand that the t-test is used to test the difference in means between two populations when the populations have relatively similar variances, the observations are independent, and the populations are normal (especially important with smaller sample sizes).

However, I was wondering how t-tests are used to look at the difference in the distributions of data between two groups. I am asking because this is essentially the phrasing used by the question I'm trying to answer: it asks me to use a t-test to compare whether the distributions of the item of interest are different.

The reason I am confused is this: I understand that the mean is a product of the distribution, and that t-tests can be strongly affected by outliers, so a t-test might give some information about two distributions. But I can imagine two cases that yield the same t-statistic. In one, the two distributions are very similar but the effect size is large simply because they are centered at different means; in the other, the two distributions look quite different, with unequal variances and so on. So how would one be able to tell anything about the distributions from a t-test?

The typical setup for a two-sample t-test is:

$$X_1, \dots, X_n \overset{iid}{\sim} N(\mu_x, \sigma^2)$$

$$Y_1, \dots, Y_m \overset{iid}{\sim} N(\mu_x + \delta, \sigma^2)$$

$$H_0: \delta = 0$$

$$H_a: \delta \ne 0$$

(Or do it one-sided.)

Under this setup, if you find that the two distributions are different, the only way that can happen is if they differ in mean.
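To make that concrete, here is a minimal R sketch under the stated model (the sample sizes and $\delta = 0.5$ are arbitrary choices for illustration); `var.equal = TRUE` gives the classic pooled two-sample t-test:

    set.seed(1)
    # simulate under the stated model: common variance, means differing by delta
    x <- rnorm(50, mean = 0, sd = 1)
    y <- rnorm(50, mean = 0.5, sd = 1)  # delta = 0.5
    t.test(x, y, var.equal = TRUE)      # classic pooled two-sample t-test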

Then you might want to say that the variances are unequal, or at least allow for that possibility, and test for a mean difference anyway. That gets you to Welch's test… which still only tests for a difference in means. There might be a difference in variance, and that might be more interesting than a difference in means, but Welch's test should not be catching differences in variance.

A simulation in R confirms this.

    set.seed(2019)
    times <- 10000
    N <- 1000
    Ps <- rep(NA, times)
    for (i in 1:times){
        # the default t-test in R is the Welch test
        Ps[i] <- t.test(rnorm(N, 0, 1), rnorm(N, 0, 5))$p.value
    }
    length(Ps[Ps < 0.1])/times
    length(Ps[Ps < 0.05])/times

At the $0.1$-level, we reject about 10% of the time, and at the $0.05$-level, we reject about 5% of the time. This is with a fairly large sample size of 1000, so even subtle differences in mean would be discovered; yet this large difference in variance is not. So you're right that the t-test doesn't do much for you if you want to examine differences that aren't just the mean.
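If a variance difference is what you care about, a test aimed at variances will find it where the Welch test does not. As a quick sketch using the same simulated populations, R's built-in F test of equal variances, `var.test` (which itself assumes normality), rejects easily:

    set.seed(2019)
    x <- rnorm(1000, 0, 1)
    y <- rnorm(1000, 0, 5)
    t.test(x, y)$p.value    # Welch test: typically a large p-value
    var.test(x, y)$p.value  # F test of equal variances: essentially zero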

However, others have noticed this, too, and there are tests for distribution differences in general. The classic full-distribution test is the Kolmogorov-Smirnov (KS) test. It examines the largest (technically, supremum) vertical distance between two (empirical) CDFs. The KS test is known to lack power against differences far out in the tails, but it's still a popular test. Some others include Anderson-Darling and Kuiper. Some playing around with simulations indicates to me that Kuiper is the best of the three at detecting tail differences, though I have not been especially thorough in my investigation of this.
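For instance, base R's `ks.test` picks up the same variance difference that the Welch simulation above missed; one could experiment similarly with Anderson-Darling or Kuiper implementations from contributed packages:

    set.seed(2019)
    x <- rnorm(1000, 0, 1)
    y <- rnorm(1000, 0, 5)
    # two-sample Kolmogorov-Smirnov test: compares the full empirical CDFs
    ks.test(x, y)   # rejects decisively: the distributions differ in spread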

What you elect to explore will depend on what you want to know. Perhaps it's good enough for you to know that the means are different, in which case, t-testing or Welch-testing might be totally fine!
