I'm using R and the `lme4` package to compute mixed effects models with a binary outcome (`glmer`). I have included continuous predictors (e.g. how many hours per week a person cares for an elderly relative) and – as it is a comparison of 6 countries – country as a random intercept (see my question here for details on the model).

Now I'd like to interpret the fixed effects while taking the random effects (the country intercepts) into account.

I already do this for the fixed effects without considering random effects: I multiply each "x" value of a predictor by its related estimate (`xbeta`, first line in the sample code below) and then use a formula to convert the fixed-effects intercept + each `xbeta` from the log-odds (linear predictor) scale to probabilities:

```r
mydf.vals$xbeta <- mydf.vals$value * (fixef(fit)[coef.pos])
mydf.vals$prob <- (1 / (1 + exp(-(fixef(fit)[1] + mydf.vals$xbeta))))
```

(this is roughly the same approach as described in this question)
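As an aside (not from the original post): the manual inverse-logit formula used above is exactly what base R's `plogis()` computes, which makes the conversion less error-prone. A quick check with hypothetical values standing in for `fixef(fit)[1]` and an `xbeta` term:

```r
# The inverse logit: plogis(x) is identical to 1 / (1 + exp(-x)).
# Hypothetical values standing in for fixef(fit)[1] and one xbeta term:
intercept <- -3.4601
xbeta <- 2 * 0.7426
prob.manual <- 1 / (1 + exp(-(intercept + xbeta)))
prob.plogis <- plogis(intercept + xbeta)
all.equal(prob.manual, prob.plogis)  # TRUE
```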

*Now my question is:* If I'd like to see how a coefficient / an odds ratio varies between countries (random intercept), would it be correct to retrieve the random effects (with `ranef`) to get the intercept estimates for each country level and then repeat the above formula for each country level?

for example:

```r
ranef(fit)
# $g2ctry
#    (Intercept)
# 30 -0.05605686
# 39  0.44139287
# 44 -0.23863412
# 46 -0.29867999
# 48  0.41890025
# 49 -0.27687811

rand.ef <- ranef(fit)[[1]]
for (i in 1:nrow(rand.ef)) {
  mydf.vals$prob <- (1 / (1 + exp(-(rand.ef[i, ] + mydf.vals$xbeta))))
  # plot probability curve here...
}
```

Or more generally: if I have random effects from a random-intercept model, why and how should I link the random effects (the intercepts of the group levels) to the fixed-effects coefficients? (That is, what are the benefits of knowing the random effects of the grouping variable (random intercept)?)
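One point worth keeping in mind (my understanding of lme4, not stated in the post): `ranef()` returns each group level's *deviation* from the fixed-effects intercept, not an absolute intercept, so a per-country curve needs fixef + ranef — this sum is what `coef(fit)` reports per group. A minimal sketch with hypothetical numbers in place of the fitted estimates:

```r
# Hypothetical estimates standing in for fixef(fit) and ranef(fit):
b0 <- -3.4601                          # fixed-effects intercept
beta.days <- 0.7426                    # fixed slope of the predictor
u <- c("30" = -0.056, "39" = 0.441)    # random-intercept deviations per group
days <- 0:9

# per-group probability curves: inverse logit of (b0 + u_j + beta * x)
probs <- sapply(u, function(uj) plogis(b0 + uj + beta.days * days))
```

Group 39's curve lies above group 30's everywhere, because its intercept deviation is larger; in a random-intercept-only model the slope is identical for all groups, so the curves are shifted, not reshaped.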

**EDIT: some kind of reproducible example**

Since my data set is quite large and I don't know if I can upload it, this example uses the data set `sleepstudy` from the `lme4` package. The function for creating the plots (`sjp.glmer`) is taken from the current development snapshot of my sjPlot package.

```r
library(lme4)
# create binary response (note: dichotomize on Reaction, not on the
# not-yet-existing Reaction.dicho column)
sleepstudy$Reaction.dicho <- ifelse(
  sleepstudy$Reaction <= median(sleepstudy$Reaction, na.rm = TRUE), 0, 1)
# fit model
fit <- glmer(Reaction.dicho ~ Days + (1 | Subject),
             sleepstudy,
             family = binomial("logit"))
# plot random effects and probability curve of fixed effects
sjp.glmer(fit, showContPredPlots = TRUE)
```

The above code produces the following two plots:

The first plot shows the random effects as retrieved by `ranef`, plus confidence intervals, computed as the standard errors from `arm::se.ranef` * 1.96.
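For reference, these confidence bands are just a normal-approximation interval; with hypothetical numbers standing in for one entry of `ranef(fit)` and `arm::se.ranef(fit)`, the computation looks like:

```r
# Hypothetical conditional mode and its standard error:
est <- 0.441    # one random-intercept estimate from ranef(fit)
se <- 0.19      # its standard error, e.g. from arm::se.ranef(fit)
ci <- est + c(-1, 1) * 1.96 * se   # approximate 95% interval: lower, upper
```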

The "probability curve" in the 2nd plot is calculated by adding the fixed-effects intercept (-3.4601, see `summary(fit)`) to each value of `Days` (range 0 to 9) multiplied by the estimate of `Days`, and converting the result to a probability.

Example:

```r
x1 = (1 / (1 + exp(-(-3.4601 + 0 * 0.7426))))
x2 = (1 / (1 + exp(-(-3.4601 + 1 * 0.7426))))
# and so on, with the Days value running from 0 to 9
```
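The same series can be computed in one vectorized call, again using `plogis()` as the inverse logit (the numbers are the estimates quoted above):

```r
# all ten probabilities at once; probs[1] corresponds to x1, probs[2] to x2, ...
days <- 0:9
probs <- plogis(-3.4601 + days * 0.7426)
```

Since the `Days` estimate is positive, the resulting probabilities increase monotonically across the ten values.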

Now, this "probability curve" is only based on the fixed effects. My question is whether it would make sense to plot this curve for each intercept estimate from the random effects, i.e. to use the above formula and replace `-3.4601` with each random-effects value (which are, in this case, random intercepts, i.e. the intercepts of each group level).

Would this be an appropriate way to interpret the "differences" or variance of `Days` in each group?

**EDIT 2: Another example**

Let me give a comprehensive example of what I want to do:

```r
library(lme4)
library(reshape2)
library(ggplot2)

# create binary response
sleepstudy$Reaction.dicho <- ifelse(
  sleepstudy$Reaction <= median(sleepstudy$Reaction, na.rm = TRUE), 0, 1)

# fit model
fit <- glmer(Reaction.dicho ~ Days + (1 | Subject),
             sleepstudy,
             family = binomial("logit"))

# get random effects (random intercept estimates)
rand.ef <- ranef(fit)[[1]]

# find unique values of the continuous coefficient, for the x axis
vals.unique <- sort(unique(sleepstudy$Days))

# melt variable
mydf.vals <- data.frame(melt(vals.unique))

# add a "counter" from 1 to the number of unique values.
# In this particular case, "Days" is also a "normal" sequence,
# so there's not much benefit here - however, if you have e.g.
# "work hours per week", you may have values from 20 to 80, with certain
# values not in the data (if no one works 25 hours). In that case,
# the following "counter" sequence makes sense.
mydf.vals <- cbind(seq(from = 1, to = nrow(mydf.vals), by = 1), mydf.vals)

# set colnames. x = x-axis value, "value" is the "real" observed data value
colnames(mydf.vals) <- c("x", "value")

# calculate x-beta by multiplying the original values of "Days"
# with the estimate of "Days"
mydf.vals$xbeta <- mydf.vals$value * fixef(fit)[2]

# the data frame for plotting
final.df <- data.frame()
final.grp <- c()

# example only for the first 6 grouping levels,
# the plot would otherwise be too overloaded...
for (i in 1:6) {
  # y-value (probability), by adding the x-betas from "Days"
  # to each random intercept estimate (estimate for each group level)
  mydf.vals$prob <- (1 / (1 + exp(-(rand.ef[i, ] + mydf.vals$xbeta))))
  final.df <- rbind(final.df,
                    cbind(days = mydf.vals$x, prob = mydf.vals$prob))
  # need to add the grp vector to the data frame later,
  # else "x" and "prob" would be coerced to factors
  final.grp <- c(final.grp,
                 rep(row.names(rand.ef)[i], times = length(mydf.vals$x)))
}

# add grp vector
final.df$subject <- final.grp

# plot probability curves
ggplot(final.df, aes(x = days, y = prob, colour = subject)) +
  geom_point() +
  geom_line()
```

Above we see the probability curves for each `Subject` along each `Days` value, calculated by adding each Subject's intercept and each Days value's contribution on the log-odds scale before converting to probabilities. So, I "linked" random effects and fixed effects.

Is this OK, or is it nonsense to do that? My aim is to be able to say: taking `Subject` variance into account, we see a delayed reaction for Subject 310, while Subject 331 is much more likely to show high reaction times.
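An alternative worth noting (standard lme4 functionality, not taken from the post): `predict.merMod` can produce these subject-specific curves directly, because `type = "response"` applies the inverse link and the random effects are included by default when `Subject` appears in `newdata`. A sketch:

```r
library(lme4)

# same dichotomized outcome as above
sleepstudy$Reaction.dicho <- ifelse(
  sleepstudy$Reaction <= median(sleepstudy$Reaction, na.rm = TRUE), 0, 1)
fit <- glmer(Reaction.dicho ~ Days + (1 | Subject),
             sleepstudy, family = binomial("logit"))

# predicted probabilities for one subject across all Days values,
# with that subject's random intercept included automatically
nd <- data.frame(Days = 0:9, Subject = "308")
p <- predict(fit, newdata = nd, type = "response")
```

This avoids hand-rolling the inverse logit; passing `re.form = NA` instead would give the population-level (fixed-effects-only) curve.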


#### Best Answer

Finally… @BenBolker was right with `predict` and `plogis`. What I am exactly looking for is the predicted values for model terms (i.e. `plogis(predict(fit, type = "terms"))`); however, I'm not sure how to get predicted values for model terms from `merMod` objects. `predict.merMod` has no `type = "terms"` option.
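Since `predict.merMod` lacks `type = "terms"`, one workaround (my own sketch, not an official lme4 feature) is to rebuild the per-term contributions on the link scale from the fixed-effects design matrix — roughly what `type = "terms"` computes for a `glm`, up to centering:

```r
library(lme4)

sleepstudy$Reaction.dicho <- ifelse(
  sleepstudy$Reaction <= median(sleepstudy$Reaction, na.rm = TRUE), 0, 1)
fit <- glmer(Reaction.dicho ~ Days + (1 | Subject),
             sleepstudy, family = binomial("logit"))

# each column of X scaled by its coefficient is one term's contribution
# on the link (log-odds) scale
X <- model.matrix(fit)
term.contrib <- sweep(X, 2, fixef(fit), `*`)

# probability implied by the intercept plus the "Days" term
p.days <- plogis(term.contrib[, "(Intercept)"] + term.contrib[, "Days"])
```

The row sums of `term.contrib` reconstruct the full fixed-effects linear predictor, so individual columns can be inspected or plotted term by term.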
