Solved – Fitting a mixture of two Gaussians

I want to fit a mixture of two Gaussian densities to my financial data. The data can be found here: the variable is called dat.

The probability density of a mixture is given by:
$$f(l) = \pi\,\phi(l;\mu_1,\sigma_1^2) + (1-\pi)\,\phi(l;\mu_2,\sigma_2^2)$$

The quantile $\mathrm{VaR}_\alpha$ can be computed by using a numerical algorithm to solve the following:
$$\alpha = P(L \leq \mathrm{VaR}_\alpha) = \pi\,F_1(\mathrm{VaR}_\alpha;\mu_1,\sigma_1^2) + (1-\pi)\,F_2(\mathrm{VaR}_\alpha;\mu_2,\sigma_2^2)$$
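
As a rough illustration of what "solving numerically" can look like, the sketch below applies base R's uniroot() to the mixture CDF. The parameter values are made up for the example (they are not fitted to dat), and the names w (used instead of pi so the built-in constant is not masked), pmix() and qmix() are hypothetical helpers.

```r
# Hypothetical parameter values for illustration only (not fitted to dat)
w      <- 0.7     # mixing weight of component 1 (pi in the formula above)
mu1    <- 0.00
sigma1 <- 0.01
mu2    <- -0.01
sigma2 <- 0.03

# Mixture CDF: weighted sum of the two normal CDFs
pmix <- function(q) {
  w * pnorm(q, mean = mu1, sd = sigma1) +
    (1 - w) * pnorm(q, mean = mu2, sd = sigma2)
}

# Solve pmix(q) = alpha for q with a bracketing root finder.
# The mixture quantile always lies between the smallest and the largest
# component quantile, which gives a valid bracket for uniroot().
qmix <- function(alpha) {
  comp_q <- c(qnorm(alpha, mu1, sigma1), qnorm(alpha, mu2, sigma2))
  uniroot(function(q) pmix(q) - alpha,
          lower = min(comp_q), upper = max(comp_q), tol = 1e-10)$root
}

var95 <- qmix(0.95)
pmix(var95)  # should reproduce the requested probability level
```

The bracket works because at the smaller component quantile both component CDFs are at most alpha, and at the larger one both are at least alpha, so the mixture CDF crosses alpha in between.
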
I use mixtools in R:

install.packages("mixtools")
library(mixtools)
mixture <- normalmixEM(dat, k = 2, fast = TRUE)

This uses the EM algorithm.
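
To make the EM steps concrete, here is a minimal hand-written sketch for a two-component normal mixture in base R. It is not the mixtools implementation; the simulated vector x stands in for dat (which is not available), and the starting values and stopping rule are arbitrary choices for the illustration.

```r
set.seed(1)
# Simulated stand-in for dat (assumption: the real data are not available)
x <- c(rnorm(700, mean = 0, sd = 0.01), rnorm(300, mean = -0.01, sd = 0.03))

# Starting values (arbitrary choices for this sketch)
w     <- 0.5                                    # weight of component 1
mu    <- as.numeric(quantile(x, c(0.25, 0.75))) # two distinct starting means
sigma <- rep(sd(x), 2)

loglik <- numeric(0)
for (iter in 1:500) {
  # E-step: posterior probability that each point belongs to component 1
  d1 <- w * dnorm(x, mu[1], sigma[1])
  d2 <- (1 - w) * dnorm(x, mu[2], sigma[2])
  r1 <- d1 / (d1 + d2)

  # Log-likelihood at the current parameter values
  loglik <- c(loglik, sum(log(d1 + d2)))

  # M-step: responsibility-weighted updates of weight, means and sds
  w     <- mean(r1)
  mu    <- c(sum(r1 * x) / sum(r1),
             sum((1 - r1) * x) / sum(1 - r1))
  sigma <- c(sqrt(sum(r1 * (x - mu[1])^2) / sum(r1)),
             sqrt(sum((1 - r1) * (x - mu[2])^2) / sum(1 - r1)))

  # Stop once the log-likelihood has essentially stopped increasing
  if (iter > 1 && loglik[iter] - loglik[iter - 1] < 1e-8) break
}

c(weight = w, mu1 = mu[1], mu2 = mu[2], sigma1 = sigma[1], sigma2 = sigma[2])
```

The log-likelihood recorded in loglik should never decrease from one iteration to the next, which is a convenient sanity check on any EM implementation.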

I now want to calculate the 0.95 quantile of the mixture distribution. I do this with a loop, a kind of grid search. I assume that the quantile (due to the characteristics of my data) will be below 0.3, so the grid ends at 0.3.

pi <- mixture$lambda[1]
mu1 <- mixture$mu[1]
mu2 <- mixture$mu[2]
sigma1 <- mixture$sigma[1]
sigma2 <- mixture$sigma[2]

quantile <- 0
probabilitylevel <- 0.95
dummy1 <- 0

# the loop lasts for about 20-40 seconds
for (i in 1:100000) {
  quantile[i] <- i / (1000000 / 3)
}
dummy1 <- probabilitylevel - (pi * pnorm(quantile, mean = mu1, sd = sigma1) + (1 - pi) * pnorm(quantile, mean = mu2, sd = sigma2))

min(abs(dummy1))
which.min(abs(dummy1))
quantileresult <- which.min(abs(dummy1)) / (1000000 / 3)

The result is 0.025371, which seems to be correct when I check it with:

pi * pnorm(quantileresult,mean=mu1,sd=sigma1) + (1-pi) * pnorm(quantileresult,mean=mu2,sd=sigma2) 

I look at the plot:

plot(density(dat), col = "red")
curve(expr = pi * dnorm(x, mu1, sigma1) + (1 - pi) * dnorm(x, mu2, sigma2), lwd = 2, col = "black", add = TRUE)
curve(dnorm(x, mean(dat), sd(dat)), add = TRUE, lty = 3, col = "orange", lwd = 2)

which gives


It looks like the normal mixture (black) fits the data much better. The dotted orange line is the single univariate normal distribution fitted to the data set; it does not fit the data as well as the mixture density. Is this interpretation correct?

Finally, we look at the individual component densities and compare them to the mixture:

plot(density(dat), col = "red")
curve(dnorm(x, mu1, sigma1), add = TRUE, lty = 2, col = "darkgreen")
curve(dnorm(x, mu2, sigma2), add = TRUE, lty = 2, col = "blue")
curve(expr = pi * dnorm(x, mu1, sigma1) + (1 - pi) * dnorm(x, mu2, sigma2), lwd = 2, col = "black", add = TRUE)

which gives the following plot:


The first density has a higher peak; the second density is shifted to the left and has a lower peak and a higher variance.

Are my calculations and interpretations correct?

Just so this thread gets an answer (since we can't access your data any more, I don't think much more answering will be happening): what you are doing seems to be perfectly fine.

You could change your search for a quantile of a mixture to use KScorrect::qmixnorm(), as per the related question "Compute quantile function from a mixture of Normal distribution".
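
A minimal sketch of that call, assuming the KScorrect package is installed and that qmixnorm() takes the arguments p, mean, sd and pro (probabilities, component means, component standard deviations and mixing proportions). The parameter values below are made up for illustration and are not fitted to your data; w is used in place of pi.

```r
# Hypothetical mixture parameters for illustration only (not fitted to dat)
w      <- 0.7
mu1    <- 0.00
sigma1 <- 0.01
mu2    <- -0.01
sigma2 <- 0.03

# Only run if the package is available (assumption: it may not be installed)
if (requireNamespace("KScorrect", quietly = TRUE)) {
  q95 <- KScorrect::qmixnorm(p = 0.95,
                             mean = c(mu1, mu2),
                             sd   = c(sigma1, sigma2),
                             pro  = c(w, 1 - w))

  # Plug the quantile back into the mixture CDF as a check
  check <- w * pnorm(q95, mu1, sigma1) + (1 - w) * pnorm(q95, mu2, sigma2)
  print(c(quantile = q95, cdf_at_quantile = check))
}
```

With the fitted object from the question you would pass mixture$mu, mixture$sigma and mixture$lambda instead. As far as I know, qmixnorm() approximates the quantile by interpolation, so the plugged-back probability may differ slightly from 0.95.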
