I have a sample of data, and I want to know whether it is Gaussian-distributed or not. The mean of my data is not zero.
To check whether I'm using the K-S test correctly, I generated some Gaussian-distributed data and added some bias:
    from scipy import stats
    data = stats.norm.rvs(size=10000) + 1
    print(stats.kstest(data, 'norm'))
This gives a p-value of 0.0. If I subtract the bias, I get something like 0.7-0.8, depending on the seed, of course. Does the data need to have $\mu = 0$? If so, does it also need $\sigma^2 = 1$? What if my distribution has a different, unknown $\sigma^2$?
Best Answer
According to this SO question and the docs, the default reference distribution for SciPy's `kstest` with `'norm'` is a standard normal distribution, $N(0,1)$, with $\mu = 0$ and $\sigma = 1$. See the SO question for more instructions on how to change the reference distribution.
In answer to your more specific question, you were using it correctly. With the bias removed, you had $p \approx 0.7$ because both the sample and the reference distribution were $N(0,1)$. With 1 added to every value in the sample, its mean no longer matched the reference, and $p = 0$.
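As a minimal sketch of changing the reference distribution, `kstest` accepts an `args` tuple that is forwarded to the named distribution's CDF; for `'norm'` this is `(loc, scale)`, i.e. the mean and standard deviation. The seed and the variable names `res_default` and `res_shifted` below are illustrative choices, not part of the original question.

```python
from scipy import stats

# Sample from N(1, 1); random_state=0 is an arbitrary seed for reproducibility.
data = stats.norm.rvs(loc=1, scale=1, size=10000, random_state=0)

# Default reference is N(0, 1), so the shifted sample is rejected.
res_default = stats.kstest(data, 'norm')

# args=(loc, scale) makes the reference N(1, 1), matching the sample.
res_shifted = stats.kstest(data, 'norm', args=(1, 1))

print(res_default.statistic, res_default.pvalue)
print(res_shifted.statistic, res_shifted.pvalue)
```

This works when $\mu$ and $\sigma$ are known in advance. If they are instead estimated from the same sample being tested (for example `args=(data.mean(), data.std(ddof=1))`), the standard K-S p-value is no longer valid and tends to come out too large; the Lilliefors variant of the test is intended for that case.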