I have some questions about how to translate my data structure and research Qs to syntax for the **lmer** function (lme4 package).

I am looking to predict teenagers **scores** on a mental health questionnaire (0-10) from 3 variables: **age** (continuous), **sex** (M/F) and my variable of interest, let's call it **X** for simplicity (continuous). I demeaned both my continuous predicted variables. Both the teenager and their parent filled out the questionnaire about the teenager, adding a third (within-subject) variable **rater**. I restructured the dataset into long format such that each subject has 2 rows for each outcome value: parent-report & self-report.

Number of subjects ~85

Number of outcome observations ~170

I am interested in the following effects:

- The fixed effect of X on outcome score (main interest)
- The interaction between X and sex on the outcome ("does X affect the outcome in one sex but not the other?")
- The fixed effect of sex on scores
- The fixed effect of age on scores

But I would also like to know whether the effects above are dependant on who is the rater? In this sense, rater is not a nuisance grouping variable whose effect on the outcome I want to account for. I would like to perform a test similar to a MANOVA but given that some subjects are missing some observations, I would prefer to use mixed models. As I understand it, linear mixed models can be used for multiple outcome data but I do not know how to phrase the syntax such that:

- I declare non-independant observations within subjects (rater falls within subject)
- I do not have a random slope for every subject (I have a relatively small sample size)

Using some specific examples, I'd like to know which (if any!) of the following capture my needs…

`m1 <- lmer(score ~ X*sex+ age + (1+rater), data = mydata ) m2 <- lmer(score ~ X*sex + age + (1|rater), data = mydata ) # same as m1? m3 <- lmer(score ~ X*sex + age + (1|ID/rater), data = mydata ) # Error: number of levels of each grouping factor must be < number of observations # An issue related to missing data?? `

Any help (for any part of the above) is appreciated!

**Contents**hide

#### Best Answer

Using some specific examples, I'd like to know which (if any!) of the following capture my needs…

`m1 <- lmer(score ~ X*sex+ age + (1+rater), data = mydata ) m2 <- lmer(score ~ X*sex + age + (1|rater), data = mydata ) # same as m1? m3 <- lmer(score ~ X*sex + age + (1|ID/rater), data = mydata ) # Error: number of levels of each grouping factor must be < number of observations # An issue related to missing data??`

Model `m1`

does not contain random effects. It should produce an error.

Model `m2`

fits fixed effects for `X`

, `sex`

,`age`

and the `X:sex`

interaction. This will not provide an answer the question of *"whether the effects above are dependant on who is the rater"*. This also fits random intercepts for `rater`

, but, since `rater`

appears to be a binary indicator, this does not make much sense, as the software would be estimating a variance for a normally distributed variable from only 2 observations.

Model `m3`

fits the same fixed effects as `m2`

so also does not provide answer the question of *"whether the effects above are dependant on who is the rater"*. The random effects term `(1|ID/rater)`

is equivalent to `(1|ID) + (1|ID:rater)`

. For the first part there are 85 levels, and for the 2nd part there are 170 levels, the sum of which is 255 and since there are only 170 total observations, this produces the error above.

The model :

`m4 <- lmer(score ~ X*sex*rater + age*rater + (1 | ID)) `

given in the @ErikRuzek solves most of the problems. However, this will provide only fixed effects for `age`

, `sex`

and `rater`

(ie for when it is 0 or 1) along with fixed effects for the interactions `age:rater`

, `sex:rater`

`X:sex`

and `sex:rater`

and `X: sex:rater`

, but will not allow any of these to vary by `ID`

. Regarding the comment to Erik's answer about whether it *"capture the fact that rater is a within-subject variable that is nested within subject"*, yes it does, by fitting fixed effects for `rater`

. If you wanted to allow all of these to vary by `ID`

then you would want to fit random slopes. The maximal model would be

`m5 <- lmer(score ~ X*sex*rater + age*rater + (1 + X*sex*rater + age*rater| ID)) `

However, it is very unlikely that the data would support such a complex random structure. You might instead try something like:

`m6 <- lmer(score ~ X*sex*rater + age*rater + (1 + rater + sex:rater + X:sex:rater + age:rater| ID)) `

although again, I would expect this to be overfitted, so simplifying again to something like:

`m6 <- lmer(score ~ X*sex*rater + age*rater + (1 + rater | ID)) `

If this, too, will not converge, or results in a singular fit, then you would need to stick with model `m4`

. If it does converge and is not singular, then you could try adding the 2-way interactions as random slopes. The choice of random slopes should also be informed by the underlying theory of the data generation process.

Regarding missing values, I would recommend multiple imputation. In addition to Erik's suggestion, I would also suggest looking at the mice package in R.

### Similar Posts:

- Solved – Intraclass correlation coefficient (ICC) for two raters using a mixed effects model for a design with repeated measurements
- Solved – Multilevel model where an interaction is a varying slope
- Solved – Computing inter-rater reliability in R with variable number of ratings
- Solved – Computing inter-rater reliability in R with variable number of ratings
- Solved – Repeatability estimation using a mixed model with repeated measures by four raters