Solved – Calculating probability for bivariate normal distributions based on bootstrapped regression coefficients

Dear all,
I was encouraged to ask this question here as well as on stackoverflow and would be very appreciative of any answers…

Due to hetereoscedasticity I'm doing bootstrapped linear regression (appeals more to me than robust regression). I'd like to create a plot along the lines of what I've done in the script here. However the fill=int is not right since int should (I believe) be calculated using a bivariate normal distribution.

  • Any idea how I could do that in this setting?
  • Also is there a way for bootcov to return bias-corrected percentiles?

sample script:

library(ggplot2)  library(Hmisc)  library(Design) # for ols()  o<-data.frame(value=rnorm(10,20,5),               bc=rnorm(1000,60,50),               age=rnorm(1000,50,20),               ai=as.factor(round(runif(1000,0,4),0)),               Gs=as.factor(round(runif(1000,0,6),0)))   reg.s<-function(x){           ols(value~as.numeric(bc)+as.numeric(age),data=x,x=T,y=T)->temp      bootcov(temp,B=1000,coef.reps=T)->t2       return(t2)      }   dlply(o,.(ai,Gs),function(x) reg.s(x))->b.list  llply(b.list,function(x) x[["boot.Coef"]])->b2   ks<-llply(names(b2),function(x){      s<-data.frame(b2[[x]])      s$ai<-x      return(s)      })    ks3<-do.call(rbind,ks)  ks3$ai2<-with(ks3,substring(ai,1,1))   ks3$gc2<-sapply(strsplit(as.character(ks3$ai), "\."), "[[", 2)    k<-ks3  j<-dlply(k,.(ai2,gc2),function(x){      i1<-quantile(x$Intercept,probs=c(0.025,0.975))[1]      i2<-quantile(x$Intercept,probs=c(0.025,0.975))[2]       j1<-quantile(x$bc,probs=c(0.025,0.975))[1]      j2<-quantile(x$bc,probs=c(0.025,0.975))[2]       o<-x$Intercept>i1 & x$Intercept<i2       p<-x$bc>j1 & x$bc<j2       h<-o & p      return(h)      })   m<-melt(j)  ks3$int<-m[,1]     ggplot(ks3,aes(x=bc,y=Intercept,fill=int)) +   geom_point(,alpha=0.3,size=1,shape=21) +   facet_grid(gc2~ai2,scales = "free_y")+theme_bw()->plott  plott<-plott+opts(panel.grid.minor=theme_blank(),panel.grid.major=theme_blank())  plott<-plott+geom_vline(x=0,color="red")  plott+xlab("BC coefficient")+ylab("Intercept")  

First I would like to point out that to get the object ks3 you do not need to do that much of the book-keeping code. Use the features of ddply:

ks3 <- ddply(o,.(ai,Gs),function(d){      temp <- ols(value~as.numeric(bc)+as.numeric(age),data=d,x=T,y=T)      t2 <- bootcov(temp,B=1000,coef.reps=T)       data.frame(t2$boot.Coef,ai=d$ai[1],gc2=d$Gs[1]) }) 

Also scales=free works much better if you want to see what kind of graph you intend to provide.

Concerning your first question, if you estimate the standard errors of coefficient using bootstrap, you are absolutely correct in using estimated percentiles. There is no need to use the percentiles of bivariate normal, since then you assume normality and bootstrap is used exactly for the purpose of not attaching oneself to some theoretical distribution.

For your second question, first you need to define what do you mean by bias-corrected percentiles. Furthermore your question implies that bootcov returns non bias-corrected percentiles, but it does not return any percentiles at all.

Update

If we know that coefficients are normal we can use confidence ellipses. The mean and covariance matrix are estimated using bootstrap. The following code checks which of the observations of the bivariate sample falls into 0.95 confidence ellipse region:

 confeps <- function(x, level=0.95, t=qchisq(level,2)) {        cv <- cov(x)       ch <- t(chol(x))       x <- sweep(x, 2, apply(x, 2, mean), "-")       y <- solve(ch,t(x))       tt <- y[1, ]^2 + y[2, ]^2       tt < t     } 

Here is the example how this function works:

      bs <- cbind(a <- rnorm(1000), 2-a/3+rnorm(1000)/5)       mn <- apply(bs, 2, mean)       require(ellipse)       eps <- ellipse(cov(bs), centre=mn, npoints=1000)       plot(eps, type="l", xlim=range(bs[,1]), ylim=range(bs[, 2]))       col <- rep(1, nrow(x))       col[!confeps(bs)] <- 2       points(bs[, 1], bs[, 2], col=col) 

Similar Posts:

Rate this post

Leave a Comment