# Solved – Calculating probability for bivariate normal distributions based on bootstrapped regression coefficients

Dear all,
I was encouraged to ask this question here as well as on stackoverflow and would be very appreciative of any answers…

Due to hetereoscedasticity I'm doing bootstrapped linear regression (appeals more to me than robust regression). I'd like to create a plot along the lines of what I've done in the script here. However the `fill=int` is not right since `int` should (I believe) be calculated using a bivariate normal distribution.

• Any idea how I could do that in this setting?
• Also is there a way for `bootcov` to return bias-corrected percentiles?

sample script:

``library(ggplot2)  library(Hmisc)  library(Design) # for ols()  o<-data.frame(value=rnorm(10,20,5),               bc=rnorm(1000,60,50),               age=rnorm(1000,50,20),               ai=as.factor(round(runif(1000,0,4),0)),               Gs=as.factor(round(runif(1000,0,6),0)))   reg.s<-function(x){           ols(value~as.numeric(bc)+as.numeric(age),data=x,x=T,y=T)->temp      bootcov(temp,B=1000,coef.reps=T)->t2       return(t2)      }   dlply(o,.(ai,Gs),function(x) reg.s(x))->b.list  llply(b.list,function(x) x[["boot.Coef"]])->b2   ks<-llply(names(b2),function(x){      s<-data.frame(b2[[x]])      s\$ai<-x      return(s)      })    ks3<-do.call(rbind,ks)  ks3\$ai2<-with(ks3,substring(ai,1,1))   ks3\$gc2<-sapply(strsplit(as.character(ks3\$ai), "\."), "[[", 2)    k<-ks3  j<-dlply(k,.(ai2,gc2),function(x){      i1<-quantile(x\$Intercept,probs=c(0.025,0.975))[1]      i2<-quantile(x\$Intercept,probs=c(0.025,0.975))[2]       j1<-quantile(x\$bc,probs=c(0.025,0.975))[1]      j2<-quantile(x\$bc,probs=c(0.025,0.975))[2]       o<-x\$Intercept>i1 & x\$Intercept<i2       p<-x\$bc>j1 & x\$bc<j2       h<-o & p      return(h)      })   m<-melt(j)  ks3\$int<-m[,1]     ggplot(ks3,aes(x=bc,y=Intercept,fill=int)) +   geom_point(,alpha=0.3,size=1,shape=21) +   facet_grid(gc2~ai2,scales = "free_y")+theme_bw()->plott  plott<-plott+opts(panel.grid.minor=theme_blank(),panel.grid.major=theme_blank())  plott<-plott+geom_vline(x=0,color="red")  plott+xlab("BC coefficient")+ylab("Intercept")  ``
Contents

First I would like to point out that to get the object `ks3` you do not need to do that much of the book-keeping code. Use the features of `ddply`:

``ks3 <- ddply(o,.(ai,Gs),function(d){      temp <- ols(value~as.numeric(bc)+as.numeric(age),data=d,x=T,y=T)      t2 <- bootcov(temp,B=1000,coef.reps=T)       data.frame(t2\$boot.Coef,ai=d\$ai[1],gc2=d\$Gs[1]) }) ``

Also `scales=free` works much better if you want to see what kind of graph you intend to provide.

Concerning your first question, if you estimate the standard errors of coefficient using bootstrap, you are absolutely correct in using estimated percentiles. There is no need to use the percentiles of bivariate normal, since then you assume normality and bootstrap is used exactly for the purpose of not attaching oneself to some theoretical distribution.

For your second question, first you need to define what do you mean by bias-corrected percentiles. Furthermore your question implies that `bootcov` returns non bias-corrected percentiles, but it does not return any percentiles at all.

Update

If we know that coefficients are normal we can use confidence ellipses. The mean and covariance matrix are estimated using bootstrap. The following code checks which of the observations of the bivariate sample falls into 0.95 confidence ellipse region:

`` confeps <- function(x, level=0.95, t=qchisq(level,2)) {        cv <- cov(x)       ch <- t(chol(x))       x <- sweep(x, 2, apply(x, 2, mean), "-")       y <- solve(ch,t(x))       tt <- y[1, ]^2 + y[2, ]^2       tt < t     } ``

Here is the example how this function works:

``      bs <- cbind(a <- rnorm(1000), 2-a/3+rnorm(1000)/5)       mn <- apply(bs, 2, mean)       require(ellipse)       eps <- ellipse(cov(bs), centre=mn, npoints=1000)       plot(eps, type="l", xlim=range(bs[,1]), ylim=range(bs[, 2]))       col <- rep(1, nrow(x))       col[!confeps(bs)] <- 2       points(bs[, 1], bs[, 2], col=col) ``

Rate this post