# Solved – R: random sampling for multivariate normal and log-normal distributions

I want to generate random monthly (m) temperature (T) and Precipitation (P) data considering that both variables are intercorrelated (rTP[m])
The tricky thing is that my random variables that have specific quantitative properties: temperatures are normally distributed, while precipitations follow a log-normal distribution and should be log-transformed

mvrnorm of the package MASS could be used.

``mT=c(1,2,4,7,10,15,17,18,17,10,5,1) mP=c(3.9,3.7,3.9,4.1,4.5,4.7,4.8,4.8,4.4,4.1,4.2,3.9) #log-transformed sdT=c(1,1,1,1,1,1,1,1,1,1,1,1) sdP=c(0.7,0.8,0.7,0.6,0.4,0.4,0.4,0.5,0.6,1,0.8,0.6)  #log-transformed rTP=c(0.4,0.4,0.4,0.4,0.4,0.4,0.4,0.4,0.4,0.4,0.4,0.4) covTP=rTP*(sdT*sdP)  simP=NULL for (m in 1:12) { out=mvrnorm(500, mu = c(mT[m],mP[m]), Sigma = matrix(c(sdT[m]*sdT[m],covTP[m],covTP[m],sdP[m]*sdP[m]), ncol = 2), empirical = TRUE) simP[m]=mean(exp(out[,2])-1) } ``

In this case I generate two random time-series that are inter-correlated which is great.

However, simulated precipitations (simP) are on average higher than the observed one (mP)

``plot(exp(mP)-1, type="l", lwd=2, ylim=c(0,250)); points(simP, type="l", lwd=2, lty=2) ``

I could use rlnorm or rlnorm.rplus to consider than precipitations are log-transformed, but then I have troubles with temperatures that are normally distributed.

My question is: How can I create random sampling for variables that have specific quantitative properties (log-normal and normal distributions)?

Contents

The the problem is in calculating the mean of the original distribution in the plot statement. The correct way is

``plot(exp(mP + sdP^2/2), type="l", lwd=2, ylim=c(0,250)); points(simP, type="l", lwd=2, lty=2) ``

Mean and standard deviation do not commute with logarithm or exponent. In general

``log(mean(mydata)) != mean(log(mydata)) ``

and similarly also for `sd`, `exp` in any combination of the above You should also make sure that `mP` and `sdP` are the mean and sd of the transformed data and not the logarithms of the mean and sd of the raw data. ie you want

``mP <- mean(log(P)) stP <- sd(log(P)) ``

You can check the formula for the mean and other statistics of the log normal on the wikipedia

Rate this post