# Solved – Calculate p-value of multivariate normal distribution

I want to calculate $$p$$-values using the statistics program R.

I want to compare multiple groups to a placebo group, so I want to test the null hypotheses $$H^{11}_0 : \mu_1 = \mu_2, \ldots, H^{1m}_0 : \mu_1 = \mu_m$$. My test statistics $$T^1,\ldots,T^m$$, where $$T^1$$ is the test statistic for $$H^{11}_0$$ and so on, are asymptotically normally distributed, and under the null hypothesis the vector of test statistics $$(T^1,\ldots,T^m)$$ has a multivariate normal distribution with mean $$\mu=(0,\ldots,0)$$ and covariance matrix $$Cov$$.
Let's say $$m=3$$, $$\mu=(0,0,0)$$, and
$$\begin{align} Cov = \begin{pmatrix} 1 & 0.5 & 0.1 \\ 0.5 & 1 & 0.1 \\ 0.1 & 0.1 & 1 \end{pmatrix}. \end{align}$$

Now suppose I've observed $$T_1 = 0.2$$, $$T_2 = 1.3$$, $$T_3 = -0.4$$. I would then compute the first $$p$$-value as

`p_1 = 1 - pmvnorm(lower = c(-T_1, -Inf, -Inf), upper = c(T_1, Inf, Inf), mean = mu, sigma = Cov)`

But I'd get the same $$p_1$$ for any covariance matrix with diagonal elements equal to 1, which seems wrong, since the correlations between the test statistics aren't taken into account. I don't know how else to calculate the $$p$$-value, so any advice would be helpful.
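As a sanity check (a minimal base-R sketch; the variable name is mine), this invariance is exactly what should happen: the region $$(-T_1, T_1) \times (-\infty,\infty) \times (-\infty,\infty)$$ constrains only the first coordinate, so its probability is just the standard normal marginal of $$T_1$$ and the off-diagonal entries of $$Cov$$ drop out entirely:

```r
# The box constrains only the first coordinate, so the probability
# reduces to the N(0, 1) marginal and Cov's correlations never enter:
T1 <- 0.2
p1 <- 1 - (pnorm(T1) - pnorm(-T1))  # same as 2 * pnorm(-T1)
print(p1)  # ~0.8415 for every Cov with unit diagonal
```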

Thank you!


I'm not very good at drawing in 3 dimensions, so picture a 2D view of what you're calculating with that definition of $$p_1$$: it is a rectangular region extending to infinity that just touches the observed point $$(T_1, T_2, T_3)$$ at a corner. While that is certainly a value that can be calculated, it is not particularly meaningful.

It is much more common to construct a hypothesis test by calculating the probability that a random point would have been "less likely" (e.g. would have a lower probability density) than the observed $$(T_1, T_2, T_3)$$. This can also be interpreted as the probability that a random point would be further from the origin in the Mahalanobis metric.

The formal test statistic is then:

$$d = \sqrt{(\mathbf{x}-\boldsymbol{\mu})^\mathrm{T}\,\boldsymbol{\Sigma}^{-1}\,(\mathbf{x}-\boldsymbol{\mu})}$$

where $$\boldsymbol{\Sigma}$$ is the covariance matrix, $$\boldsymbol{\Sigma}^{-1}$$ is the precision matrix, and $$\mathbf{x}$$ is the vector $$(T_1, T_2, T_3)$$ in your notation. If the true population parameters $$\boldsymbol{\mu}$$ and $$\boldsymbol{\Sigma}$$ are known, then $$d^2 \sim \chi^2_k$$ (read: $$d^2$$ has the chi-squared distribution with $$k$$ degrees of freedom, where $$k$$ is the dimension of $$\mathbf{x}$$; here $$k = 3$$). If, on the other hand, $$\boldsymbol{\mu}$$ and $$\boldsymbol{\Sigma}$$ are empirical estimates from the same sample, then $$d^2$$ follows (up to a scale factor) Hotelling's T-squared distribution.
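Here is a base-R sketch of this test for the numbers in your question (`mahalanobis()` and `pchisq()` are both in base R; the variable names are my own):

```r
# Setup taken from the question
x   <- c(0.2, 1.3, -0.4)                  # observed (T_1, T_2, T_3)
mu  <- c(0, 0, 0)
Cov <- matrix(c(1.0, 0.5, 0.1,
                0.5, 1.0, 0.1,
                0.1, 0.1, 1.0), nrow = 3, byrow = TRUE)

# Squared Mahalanobis distance d^2 = (x - mu)' Cov^{-1} (x - mu)
d2 <- mahalanobis(x, center = mu, cov = Cov)  # ~2.2134 for this Cov

# Upper-tail chi-squared p-value with k = 3 degrees of freedom
p <- pchisq(d2, df = length(x), lower.tail = FALSE)
print(c(d2 = d2, p = p))
```

Unlike the rectangular-region definition, changing the off-diagonal entries of `Cov` changes `d2`, and hence the $$p$$-value, so the correlation between your test statistics is actually taken into account.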

Of course, only you can know exactly which hypothesis you want to test. I'm just showing one common way that other people have approached this kind of problem; I can't know your specific situation in detail. Think carefully about which definition is most useful for what you are trying to accomplish!
