Solved – If a multivariate Gaussian distribution is truncated, what will the new distribution be?

I have a covariance matrix (COVAR) and a vector of mean values (OMEGA) for a three-dimensional Gaussian distribution, as follows:

COVAR = 1.0e-12 * [ 0.2498   -0.4832    0.2140
                   -0.4832    0.9543   -0.4456
                    0.2140   -0.4456    0.3245];

and

OMEGA = 1.0e-06 * [-0.0334    0.1460   -0.1079];

The problem I am working on is one of parameter estimation. I am able to make the 2-dim contour plots (1- and 2-sigma) and marginalize them to get a 1-dim distribution for each parameter.
The contours extend into the negative region, which I want to get rid of because these parameters cannot take negative values. In a way, I am looking for a truncated Gaussian distribution (correct me if I am wrong!).

This is how I remove the negative estimates from my data: I replace them with zeros.

samples = mvnrnd(OMEGA, COVAR, 100000);
data = samples;
data(data < 0) = 0;

Having done that, I can easily find the new mean and new covariance matrix pertaining to the modified data set.

OMEGA = mean(data);
COVAR = cov(data);

I do not quite understand how to interpret these data. Is the new distribution still a multivariate Gaussian? It is definitely truncated.

I want my data, or the contour plot, to be restricted to the positive quadrant only. The idea is to compare with a Bayesian analysis in which the prior is set up so that the parameters cannot take negative values (e.g. a flat prior on the range [0, 1], which contains no negative values).

I should also mention how I make the contour plots. Since the initial data were Gaussian, I can obtain the covariance matrix of (say) the first two parameters by removing the 3rd row and 3rd column, as follows:

COV12(1,1) = COVAR(1,1);
COV12(1,2) = COVAR(1,2);
COV12(2,1) = COVAR(2,1);
COV12(2,2) = COVAR(2,2);
OM12(1) = OMEGA(1);
OM12(2) = OMEGA(2);

The same can be done for COV13 and COV23.

Contour / error ellipse:

ellipsedata(COV12, OM12, 1000, [2.0 1.0]); % 2-sigma and 1-sigma contours

I am not sure whether this is the correct method for the new data, because it is valid only for a Gaussian distribution.

If you truncate a distribution, you end up with a truncated version of that distribution. So if you truncate a multivariate normal distribution, you end up with a truncated multivariate normal distribution. The truncated multivariate normal distribution is parametrized by a vector of means $\boldsymbol{\mu}$, a covariance matrix $\boldsymbol{\Sigma}$, and truncation points $\boldsymbol{a}, \boldsymbol{b}$. Notice that the mean and covariance of the resulting distribution, $\boldsymbol{\mu^*}$ and $\boldsymbol{\Sigma^*}$, do not have to match $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$.
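For reference, the density can be sketched in standard notation. The symbols $\phi$ (the untruncated multivariate normal density), $\mathbf{1}\{\cdot\}$ (an indicator function) and $\boldsymbol{X}$ are introduced here for illustration, and the inequalities are meant componentwise. The truncated density is the original Gaussian density restricted to the box between $\boldsymbol{a}$ and $\boldsymbol{b}$, renormalized by the probability mass that the box contained:

$$
f(\boldsymbol{x}) = \frac{\phi(\boldsymbol{x}; \boldsymbol{\mu}, \boldsymbol{\Sigma}) \, \mathbf{1}\{\boldsymbol{a} \le \boldsymbol{x} \le \boldsymbol{b}\}}{P(\boldsymbol{a} \le \boldsymbol{X} \le \boldsymbol{b})}, \qquad \boldsymbol{X} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma}).
$$

For a non-negativity constraint like the one in the question, this would correspond to $\boldsymbol{a} = \boldsymbol{0}$ and $\boldsymbol{b} = +\infty$ in every coordinate.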

However, from your description I can't see why you should use a truncated multivariate normal distribution for your data. If you have data with non-negative support, why not simply use a distribution that has built-in non-negative support rather than truncating one that hasn't? Actually, if the truncation is not part of your model, then I imagine that using a truncated multivariate normal distribution would cause more problems than it solves: its parameters $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ are not the mean and covariance of the resulting distribution, so you would have to translate $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ back and forth to $\boldsymbol{\mu^*}$ and $\boldsymbol{\Sigma^*}$ during your analysis, and $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ won't have any intuitive meaning.
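The gap between the original parameters and the moments of the truncated distribution can be checked empirically. The following is only a minimal sketch using rejection sampling, assuming `OMEGA` and `COVAR` still hold the original (untruncated) mean vector and covariance matrix from the question; the names `keep`, `truncated`, `acceptRate`, `OMEGA_star` and `COVAR_star` are made up for illustration.

```matlab
% Sketch: sample from the truncated distribution by rejection.
% Assumes OMEGA (1x3) and COVAR (3x3) are the ORIGINAL Gaussian parameters.
% mvnrnd requires the Statistics and Machine Learning Toolbox.
samples = mvnrnd(OMEGA, COVAR, 100000);

% Keep only the draws whose coordinates are all non-negative.
keep      = all(samples >= 0, 2);
truncated = samples(keep, :);

% Fraction of draws that fall inside the allowed region.
acceptRate = mean(keep);

% Empirical mean and covariance of the truncated sample (mu* and Sigma*).
OMEGA_star = mean(truncated, 1);
COVAR_star = cov(truncated);
```

Comparing `OMEGA_star` and `COVAR_star` with `OMEGA` and `COVAR` shows how far the truncated moments have moved. If `acceptRate` is small, many more draws may be needed for stable estimates. Note also that discarding rows is not the same as replacing negative entries with zeros: the latter censors the data and piles probability mass on the boundary, so it does not produce a sample from the truncated distribution.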

If you want to read more about the truncated multivariate normal distribution, you can check the following paper, which describes an R package for it (the first link Google returned for me), and the references it provides:

Wilhelm, S., and Manjunath, B. G. (2010). tmvtnorm: A package for the truncated multivariate normal distribution and student t distribution. The R Journal, 2/1, 25-29.

or search for other papers. The take-away message is that the truncated multivariate normal distribution is not an easy creature to work with, so you should consider whether it is really the best choice for your model.
