Solved – Simulating the power of a Monte Carlo test

Using R, I am trying to simulate how the power of a Monte Carlo two-sample test for central tendency changes with sample size. However, my simulation results do not show power increasing with sample size, which is clearly wrong.

Could someone please advise where I am going wrong? Below is the basic set-up.

The null hypothesis is:

$H_0: \mu_X - \mu_Y = 0$

The alternative hypothesis is:

$H_1: \mu_X - \mu_Y > 0$

The test statistic is:

$\bar{X} - \bar{Y}$, the difference in sample means.
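Under $H_0$, this statistic has sampling distribution $\bar{X} - \bar{Y} \sim N\left(0,\ \sigma_X^2/n_X + \sigma_Y^2/n_Y\right)$, which is the null distribution my Monte Carlo samples should be approximating.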

For my simulation, I draw two samples $X \sim N(\mu_X, \sigma_X)$ and $Y \sim N(\mu_Y, \sigma_Y)$ of sizes $n_X$ and $n_Y$ respectively. I run 100 simulations at each value of the sample size of $X$.

Below is my simulation code, with comments:

    mc.pvalue<-function(nx, ny, mu.x, mu.y, sig.x, sig.y){
      x<-rnorm(nx, mu.y, sig.x)            # Generate samples under H1: mu_x > mu_y
      y<-rnorm(ny, mu.x, sig.y)
      obs.diff<-mean(x)-mean(y)            # Test-stat is observed diff. in means
      se.x<-sd(x)/sqrt(length(x))          # Calculate standard errors
      se.y<-sd(y)/sqrt(length(y))
      count=0                              # Set counter equal to zero
      for(i in 1:1000){                    # Generate 1000 pairs of samples and calculate
        x1<-rnorm(nx, mu.y, se.x)          # the difference in means between each pair.
        y1<-rnorm(ny, mu.y, se.y)          # The samples have been generated under
        sim.diff<-mean(x1)-mean(y1)        # H0: mu_x = mu_y
        if(sim.diff<=obs.diff){count=count+1}
      }
      count/1000                           # Estimated p-value for test statistic
    }

    # Calculate 100 estimated p-values for the test statistic, then find the
    # proportion of rejections at the 0.05 level of significance.

    calc.pvalues<-function(nx, ny=10, mu.x=21, mu.y=20, sig.x=0.5, sig.y=0.25,
                           n.sim=100){
      pvalues<-replicate(n.sim, mc.pvalue(nx, ny, sig.x, sig.y, mu.x, mu.y))
      sum(pvalues<0.05)/n.sim
    }

    chge.sample.size<-seq(2, 100, by=0.05)          # Sample size of x increasing
                                                    # from 2 to 100
    result<-sapply(chge.sample.size, calc.pvalues)  # Apply calc.pvalues to estimate
                                                    # the power for each sample size
                                                    # of x

    plot(chge.sample.size, result)                  # Plot increasing sample size of X
                                                    # against test power

When I edit the code to calculate the power of a t-test instead, it works perfectly:

    calc.pvalue.t<-function(nx, ny, sig.x, sig.y, mu.x, mu.y){
      x<-rnorm(nx, mu.x, sig.x)
      y<-rnorm(ny, mu.y, sig.y)
      t.test(x, y, alternative="greater",
             paired=FALSE, conf.level=0.95)$p.value
    }

    calc.t<-function(nx, ny=10, mu.x=21, mu.y=20, sig.x=0.5, sig.y=0.25,
                     n.sim=100, alpha=0.05){
      p.values<-replicate(n.sim, calc.pvalue.t(nx, ny, sig.x,
                                               sig.y, mu.x, mu.y))
      sum(p.values<alpha)/n.sim
    }

    chge.sample.size<-seq(2, 10, by=1)
    tpower1<-sapply(chge.sample.size, calc.t)

    plot(chge.sample.size, tpower1)
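As a rough sanity check, the simulated t-test power can be compared against the analytical power from base R's power.t.test. This is only an approximation for my set-up, since power.t.test assumes equal group sizes and a common sd (here I plug in the larger sd of 0.5), but the shape of the curve should agree:

    # Approximate analytical power for comparison (assumes equal n and a
    # common sd, unlike my set-up with ny fixed at 10 and unequal sds).
    analytic.power<-sapply(2:10, function(n)
      power.t.test(n=n, delta=1, sd=0.5, sig.level=0.05,
                   type="two.sample", alternative="one.sided")$power)
    analytic.power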

Something seems to be going wrong in the first part of my code, where I carry out the Monte Carlo test.

I think the reason may be that the empirical distribution changes with every call to the function "mc.pvalue", because the standard errors it uses are re-estimated from each new pair of samples. So the 100 p-values are calculated against different empirical distributions.
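To illustrate what I mean, here is a minimal sketch of the alternative I have in mind (my own guess at a fix, not verified): the null samples are drawn from one fixed distribution with a common mean, the sds are held at the observation-level estimates sd(x) and sd(y) rather than the standard errors, and the p-value is the proportion of simulated differences at least as large as the observed one.

    # Sketch: null distribution held fixed within one call.
    mc.pvalue2<-function(nx, ny, mu.x, mu.y, sig.x, sig.y){
      x<-rnorm(nx, mu.x, sig.x)                 # data under H1: mu_x > mu_y
      y<-rnorm(ny, mu.y, sig.y)
      obs.diff<-mean(x)-mean(y)
      sim.diff<-replicate(1000, {               # under H0 both samples share a
        x1<-rnorm(nx, 0, sd(x))                 # common mean; sds are fixed at
        y1<-rnorm(ny, 0, sd(y))                 # the observation-level estimates
        mean(x1)-mean(y1)
      })
      mean(sim.diff>=obs.diff)                  # one-sided p-value: P(sim >= obs | H0)
    }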
