Solved – Problem with the Kolmogorov-Smirnov test

I compared two data sets using the KS test. The first set is empirical data X1, and the second is expected data X2, which is randomly sampled from a normal distribution with mean $\mu$ and standard deviation $\sigma$.

The length of X2 is $10^6$. In the plot of their CDFs, the two appear to have similar distributions. When I ran the KS test with X1 of length $10^3$, the result was H=0, which was correct.

However, when I ran the KS test with X1 longer than $10^3$, I got the wrong result (H=1), even though the plot showed that the two samples follow similar distributions. The attached plot shows a case where I got the wrong result; for it, I used X1 size = $10^6$ and X2 size = $10^6$. Empirical CDF in red, expected CDF in blue. I used the standard Matlab command (kstest2).


Does anyone have an explanation for this?

> When I did KS-test with X1 length is $10^3$, the result is H=0, which was correct.

This is incorrect. You have failed to reject the null hypothesis, but that does not mean H=0 is the correct result. All you can say is that, at that particular sample size, the test is not powerful enough to reject the null hypothesis and conclude that the two empirical CDFs are statistically different.
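To put rough numbers on "not powerful enough", here is the textbook large-sample rejection rule for the two-sample KS test. This is only a sketch: $n$ and $m$ are my labels for the sizes of X1 and X2, I assume $\alpha = 0.05$, and `kstest2` may apply a slightly different finite-sample correction, so treat the values as approximate.

$$D_{n,m} = \sup_x \left| F_{1,n}(x) - F_{2,m}(x) \right|, \qquad \text{reject } H_0 \text{ if } D_{n,m} > c(\alpha)\sqrt{\frac{n+m}{nm}}, \qquad c(\alpha) = \sqrt{-\tfrac{1}{2}\ln\frac{\alpha}{2}}$$

With $\alpha = 0.05$, $c(\alpha) \approx 1.358$. For $n = 10^3$ and $m = 10^6$ the threshold is about $0.043$, so the two CDFs must differ by roughly 4% somewhere before the test rejects. For $n = m = 10^6$ the threshold drops to about $0.0019$, so a gap of 0.2% is already enough.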

> However, when I tried KS-test with X1 length > $10^3$, I got wrong results (H=1)

The result is correct. With more samples, the KS test correctly rejects your null hypothesis. It should be rejected because the two distributions are similar but not identical.
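A small simulation can illustrate this. The sketch below is not `kstest2`: it is a plain-Python version that computes the KS statistic directly and compares it with the textbook large-sample threshold at $\alpha = 0.05$. The mean shift of 0.05, the sample sizes (kept below $10^6$ so the pure-Python run stays short), the seed, and the function names are all assumptions chosen for illustration.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(x1, x2):
    """Two-sample KS statistic D = max |F1(x) - F2(x)|."""
    a, b = sorted(x1), sorted(x2)
    n, m = len(a), len(b)
    # Both empirical CDFs are step functions, so the maximum gap
    # is attained at one of the observed data points.
    return max(abs(bisect_right(a, v) / n - bisect_right(b, v) / m)
               for v in a + b)


def critical_value(n, m, alpha=0.05):
    """Large-sample rejection threshold for D (textbook approximation)."""
    c = math.sqrt(-0.5 * math.log(alpha / 2))
    return c * math.sqrt((n + m) / (n * m))


def reject_by_size(sizes, shift=0.05, m=100_000, seed=0):
    """For each X1 size n, return (D, threshold, H) with H = D > threshold."""
    rng = random.Random(seed)
    # X2 plays the role of the "expected" N(0, 1) sample.
    x2 = [rng.gauss(0.0, 1.0) for _ in range(m)]
    results = {}
    for n in sizes:
        # X1 is deliberately close to, but not the same as, X2's
        # distribution: its mean is off by `shift` (an assumed value).
        x1 = [rng.gauss(shift, 1.0) for _ in range(n)]
        d = ks_statistic(x1, x2)
        thr = critical_value(n, m)
        results[n] = (d, thr, d > thr)
    return results


results = reject_by_size([1_000, 10_000, 100_000])
for n, (d, thr, h) in results.items():
    print(f"n={n:>7}  D={d:.4f}  threshold={thr:.4f}  H={int(h)}")
```

The true gap between the two CDFs for this shift is about 0.02 and does not depend on the sample size, while the threshold shrinks roughly like $1/\sqrt{n}$. At the largest size the threshold is far below 0.02, so the test rejects; at the smaller sizes the outcome depends on the particular draw.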

Practically, I wouldn't even bother to run the KS test here. It's clear the two distributions are very close, so why bother? Statistics is not magic; it can't tell you anything that is not in your data.
