# Solved – Calculating prediction variance from JAGS model of a bernoulli outcome in R

I have a model of a bernoulli random process I fit using JAGS via the `rjags` package in R. Here are some example data, as well as code to fit the given models in JAGS via `rjags`:

``#generate some binary 0/1 data set.seed(666) x1 = rnorm(1000)           # some continuous variables  x2 = rnorm(1000) z = 1 + 2*x1 + 3*x2        # linear combination with a bias pr = 1/(1+exp(-z))         # pass through an inv-logit function y = rbinom(1000,1,pr)      # bernoulli response variable  #fit a JAGS model.  require(rjags)  jags.model = " model { for (i in 1:N){ y[i] ~ dbern(p[i]) p[i] <- 1 / (1 + exp(-z[i])) z[i] <- a + b*x1[i] + c*x2[i] } a ~ dnorm(0, .0001) b ~ dnorm(0, .0001) c ~ dnorm(0, .0001) } " #setup data as list data = list(y=y, x1=x1, x2=x2, N = length(y))  #run JAGS model j.model <- jags.model(file = textConnection(jags.model),                       data=data,                       n.chains=3)  #sample from the posterior jags.out   <- coda.samples (model = j.model,                             variable.names = c('a','b','c'),                             n.iter = 1000) ``

To prove the model fits, here is some code to generate predicted vs. observed of the z variable (probability of being 0 or 1).

``#grab coefficients, compare fitted vs observed of z to prove this fits.  cf <- summary(jags.out)\$statistics[,1] fitted <- cf + cf*x1 + cf*x2 plot(z ~ fitted, pch=16, cex = 0.2, ylab='observed',xlab='fitted') ``

Now that the model fits, What I would like to do is plot the relationship between one of the predictor variables, say `x1`, and the probabilty of the 0/1 outcome (the prediction for `z`), as well as +/- the uncertainty on this prediction. Getting the mean prediction across the range of `x1` values is fairly straightforward using the model coefficients:

``#okay, not plot the relationship between x1 and z, holding x2 constant x1.range <- seq(min(x1),max(x1), by = (max(x1)-min(x1))/100) z.fit <- cf + x1.range*cf + mean(x2) * cf  #plot relationship plot(z.fit~ x1.range, cex=0) lines(smooth.spline(z.fit~x1.range), lwd=2, col='purple') `` My question: How do I estimate the uncertainty associated with this relationship? An estimate of the standard deviation or the variance would be fine. Anything that would allow me to shade a confidence interval on this figure, given the model I have fit.

Contents

I think the best way to generate uncertainty values around each prediction is to send the desired predictions through the original JAGS model. This way, the model will sample from the distributions of all parameters, regardless of whether they are positive or negative, and generate a distribution around each prediction. This distribution will reflect the uncertainty associated with all parameters, at the value of each predictor of interest. Aspects of this distribution can then be used for plotting. To implement this in R, first generate a mean of `x2`, and a range of `x1` values for which you want to make predictions for. Here I am going to predict over the entire range of `x1` values in the data, divided into 101 intervals across the range of `x1` values.

``x1.range <- seq(min(x1),max(x1), by = (max(x1)-min(x1))/100) x2.m     <- mean(x2) ``

I then modify the JAGS model to make predictions.

``jags.model = " model { #model code for (i in 1:N){ y[i] ~ dbern(p[i]) p[i] <- 1 / (1 + exp(-z[i])) z[i] <- a + b*x1[i] + c*x2[i] }  #predictions over x1, holding x2 constant at its mean value for (i in 1:N.sim){   z.x1[i] <- a + b*x1.range[i] + c*x2.m }  a ~ dnorm(0, .0001) b ~ dnorm(0, .0001) c ~ dnorm(0, .0001) } " ``

``#setup data as list data = list(y=y, x1=x1, x2=x2,             x1.range = x1.range, x2.m = x2.m,             N = length(y), N.sim = length(x1.range)) ``

Compile the model, and the use the `coda` function to sample from the prediction posterior:

``#compile the model require(rjags) j.model <- jags.model(file = textConnection(jags.model),                       data=data,                       n.chains=3)  #sample from the prediction posterior x1.jags.out <- coda.samples(model=j.model, variable.names = c('z.x1'),n.iter=1000) ``

This will generate a mean and standard deviation for each predicted value of `y`, at each condition of `x1` and `x2` simulated, taking the full distribution and uncertainty of all parameter values into account. Finally, for plotting, grab the estimated mean and standard deviation (or whatever your favorite metric of variance is), and plot:

``#generate plot, plotting mean and +/- 1 standard deviation.  plot(x1.pred ~ x1.range, cex=0) lines(smooth.spline(x1.pred~ x1.range), lwd=2, col='purple') lines(smooth.spline((x1.pred+x1.pred.sd) ~ x1.range), lty=2, col='purple') lines(smooth.spline((x1.pred-x1.pred.sd) ~ x1.range), lty=2, col='purple')  #shade error region polygon(c(x1.range,rev(x1.range))         ,c(x1.pred+x1.pred.sd,rev(x1.pred - x1.pred.sd))         ,col=adjustcolor('purple',0.3)         ,lty=0 ) `` If specific probabilities are desired, then these `z` values need to be inverse logit transformed.

Rate this post