I have a mixed design data set where participants respond to each of three interventions and also report various demographics. The intervention is thus repeated-measures and each demographic measure is between-subjects.

I want to model this as a linear mixed effects model with random slopes and intercepts, but I can't figure out the correct way to express that with the lme function.

If we call the participant variable "n", the intervention "int", the dependent variable "effect", and use "sex" and "age" (categorical) as two of the demographic variables, then using R's lme function, I'm thinking one of the following should be correct:

`model <- lme(effect ~ int*sex*age, random=~int|n, data=data, method="ML")`

`model <- lme(effect ~ int*sex*age, random=~int|n/int, data=data, method="ML")`

I haven't been able to work out when to structure the random effects as nested and whether it makes any sense to have intervention on both sides of the bar as both a term in the random model and a grouping variable.


#### Best Answer

@Roland already answered your question in his comment, so my answer is likely redundant.

From what you describe about your study design, you have a single grouping variable: *subject* (or `n` per your notation). For each subject, you have multiple measurements of the response variable `effect`. If this variable can be assumed to be continuous, then you can indeed model it using a linear mixed effects model. Otherwise, you may need to use a generalized linear mixed effects model (e.g., Poisson mixed effects model for a count response variable).

Your *intervention* variable (`int` in your notation) is a categorical variable with 3 levels. Presumably, these are the only levels you are interested in for your study, which justifies including this variable as a predictor in the fixed effects portion of your model, where it is allowed to interact with the other two predictors (namely, `sex` and `age`): `int * sex * age`.

The only situation in which you would have treated `int` as a nested grouping factor, as in `~1|n/int`, would have been if:

1. The three interventions in your study were a representative subset of a larger set of interventions which could not all be included in your study, and
2. The interventions assigned to any given subject were specific to that subject and were not used for any other subject. (If the same subset of interventions was used for all subjects, the grouping factors *subject* and *intervention* would be fully crossed.)

Even if conditions (1) and (2) listed above were satisfied for your study, you couldn't have a syntax such as `~int|n/int` in the random effects portion of your model. The correct syntax would be `~variable|n/int`, where `variable` is one whose values change from one intervention to another within each subject.
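Condition (2) can be checked from the data itself. The following is a minimal sketch on a made-up long-format data frame (the data and the helper names `subjects_per_int`, `fully_crossed` and `nested` are assumptions for illustration, not part of your data set): it counts how many distinct subjects each intervention level occurs with.

```r
# Hypothetical long-format data: 4 subjects, each receiving the same 3 interventions
d <- expand.grid(
  int = c("a", "b", "c"),
  n   = paste0("s", 1:4),
  stringsAsFactors = TRUE
)

# Number of distinct subjects observed under each intervention level
subjects_per_int <- tapply(d$n, d$int, function(x) length(unique(x)))

# Fully crossed: every intervention level occurs with every subject
fully_crossed <- all(subjects_per_int == nlevels(d$n))

# Nested in subject: every intervention level occurs with exactly one subject
nested <- all(subjects_per_int == 1)

fully_crossed
nested
```

In a design like the one you describe, where every participant responds to the same three interventions, the first check should come out true and the second false.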

One way to think of a grouping variable is as a 'container' for repeated observations of a response variable. In your case, the 'container' is the *subject*. Those repeated observations are 'grouped together' in the same container.

In addition to the response variable, you'll also have various predictor variables which you wish to relate to the response variable. The values of these predictor variables can be (i) the same for all grouped observations in the 'container' (e.g., `sex`) or (ii) different for different grouped observations in the 'container' (e.g., `int`).
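Whether a predictor falls under case (i) or case (ii) can be read off the data. Here is a small sketch, assuming a toy data frame that reuses your variable names; the helper `varies_within()` is hypothetical and written only for this illustration.

```r
# Toy long-format data: 4 subjects, 3 rows each (one per intervention)
d <- data.frame(
  n   = factor(rep(paste0("s", 1:4), each = 3)),
  int = factor(rep(c("a", "b", "c"), times = 4)),
  sex = factor(rep(c("F", "M", "F", "M"), each = 3))
)

# TRUE if the predictor takes more than one value inside at least one group
varies_within <- function(x, group) {
  any(tapply(x, group, function(v) length(unique(v))) > 1)
}

varies_within(d$sex, d$n)  # sex is constant within each subject
varies_within(d$int, d$n)  # int changes within each subject
```

A predictor for which this returns `FALSE` is constant within each container, and one for which it returns `TRUE` changes within at least one container.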

If the values of a predictor variable are *the same* within a container, that predictor variable cannot appear in the random effects portion of your model; it can only appear in the fixed effects portion. For example, `sex` can only appear in the fixed effects portion of your model:

`lme(effect ~ int*sex*age, random=~int|n, data=data, method="ML")`

If the values of a predictor variable are *different* within a container, then the predictor variable can appear *both* in the fixed effects portion of the model *as well as* in the random effects portion. For example, `int` can appear only in the fixed effects portion of your model:

`lme(effect ~ int*sex*age, random=~1|n, data=data, method="ML")`

or in both the fixed effects and the random effects portions of your model:

`lme(effect ~ int*sex*age, random=~1 + int|n, data=data, method="ML")`

A variable like `sex` is a between-container predictor variable. Since "container" is the same as *subject*, the proper terminology for `sex` in your study is a *between-subject predictor variable*. This terminology indicates that the values of `sex` change across subjects, but not within subjects as the intervention changes.

A variable like `int` is a *within-container predictor variable* (where "container" is *subject*). In other words, `int` is a within-subject variable, whose values change within a subject.

I concur with Roland that the first model you listed seems reasonable, though you might want to test if you can simplify the fixed effects portion of the model.
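One way to test such a simplification is a likelihood ratio comparison of two fits that differ only in their fixed effects, both estimated with `method="ML"` as in your calls. The sketch below runs on simulated data (the data, the two-level `age` coding and the object names `m_full`, `m_add` and `cmp` are all assumptions for illustration). It uses a random intercept only, purely to keep the example small; the random effects specification should be the same in both models being compared.

```r
library(nlme)

set.seed(1)

# Simulated design: 24 subjects, balanced over sex and a two-level age factor
subj <- data.frame(
  n   = factor(1:24),
  sex = factor(rep(c("F", "M"), each = 12)),
  age = factor(rep(c("young", "old"), times = 12))
)

# Three rows per subject, one per intervention
d <- subj[rep(1:24, each = 3), ]
d$int <- factor(rep(c("a", "b", "c"), times = 24))

# Response: intervention shifts + subject-specific intercept + residual noise
u <- rnorm(24, sd = 1)
d$effect <- 10 + 1 * (d$int == "b") + 2 * (d$int == "c") +
  u[as.integer(d$n)] + rnorm(nrow(d), sd = 0.5)

# Full factorial fixed effects versus main effects only, same random part
m_full <- lme(effect ~ int * sex * age, random = ~ 1 | n,
              data = d, method = "ML")
m_add  <- lme(effect ~ int + sex + age, random = ~ 1 | n,
              data = d, method = "ML")

# Likelihood ratio test of the simpler model against the full one
cmp <- anova(m_add, m_full)
cmp
```

If the test gives no evidence in favour of the full model, the interaction terms are candidates for removal. Intermediate models (for example, keeping only the two-way interactions) can be compared in the same way.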