Solved – Predicted probabilities for probit model in R – categorical variable

I am running a probit regression with a random effect:


where Explan is a three-level categorical variable.

I want to calculate the mean predicted probabilities for each level of Explan. I tried doing so using this code:

newdata=data.frame(Explan="First") predict(m1,newdata,type="response") 

where First is a level of the categorical Explan variable.

However I get the following error message:

Error: (p <- ncol(X)) == ncol(Y) is not TRUE 

Were this a logit model, I would simply strip the model of the intercept and then back-transform the model summary coefficients to get the predicted values that I'm after, but I am unsure of how I would go about this with a mixed-effects probit model.

Any help in extracting the predicted probabilities would be greatly appreciated.

This question and excellent exchange was the impetus for creating the predictInterval function in the merTools package. predictInterval is designed to use the arm::sim functions to generate distributions of parameters in the model and then to use those distributions to generate simulated values of the response given the newdata provided by the user. It's simple to use — all you would need to do is:

library(merTools) preds <- predictInterval(m1, newdata = newdata, n.sims = 999) 

You can specify a whole host of other values to predictInterval including setting the interval for the prediction intervals, choosing whether to report the mean or median of the distribution, and choosing whether or not to include the residual variance from the model.

It's not a full prediction interval because the variability of the theta parameters in the lmer object are not included, but all of the other variation is captured through this method, giving a pretty decent approximation.

merTools also allows you to easily construct simulated scenarios for analysis. For example, if you wanted to compute the effect of changing a level of a factor variable for the average observation you could do this:

subExample <- list(Explan= "First") example5 <- draw(m1, type = 'average', varList = subExample)  predictInterval(m1, newdata = example5, n.sims = 1000) 

Check out the package vignettes here and here to learn more.

Similar Posts:

Rate this post

Leave a Comment