Let's say I have outcome data at four time-points (baseline, 3 months, 6 months, 12 months) which I want to regress on an explicit time variable ($t_1 = 0$, $t_2 = 1$, $t_3 = 2$, $t_4 = 3$) to understand linear change.
I typically adjust for baseline differences in the outcome using a random intercept, e.g.:
$$Y_{it} = \beta_0 + \beta_1 Time_{it} + U_i + e_{it}$$
where $i$ = subject, $t$ = time, $\beta_0$ is a fixed intercept, $\beta_1$ is the slope of the explicit time variable, $U_i$ is the random intercept, and $e_{it}$ is subject- and time-varying error.
However, my supervisor adjusts for baseline differences by including both the baseline measurement as a covariate and a random intercept, e.g.:
$$Y_{it} = \beta_0 + \beta_1 Time_{it} + \beta_2 Baseline_i + U_i + e_{it}$$
I know that other people adjust for baseline variation in the outcome by just including baseline measurement as a covariate and no random intercept.
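In the same notation, and assuming the same fixed effects as in my supervisor's model, I take this third approach to be something like:
$$Y_{it} = \beta_0 + \beta_1 Time_{it} + \beta_2 Baseline_i + e_{it}$$
i.e. with no $U_i$ term, so any remaining within-subject correlation is left to the error term.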
My questions are:
- Which of the above approaches is valid for adjusting for baseline differences (if any) and why?
- In particular, is it appropriate to adjust for baseline variation with a random intercept and no baseline covariate, and why?
- Do you have any references on the topic?
Best Answer
The random effects (here, intercepts) do not adjust for baseline; they just allow each subject to be vertically shifted by their own customized amount. You'll get more outcome variation explained (and much lower residual variance) when you adjust for baseline in addition. Random effects handle intra-subject correlation. This can also be handled by modeling the correlation structure explicitly using generalized least squares (which used to be called growth curve models). I like structures such as AR(1) for this purpose.
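To make the correlation point concrete, here is a sketch of the within-subject correlation each choice implies. It assumes $U_i$ and $e_{it}$ are independent with variances $\sigma_U^2$ and $\sigma_e^2$; these symbols and $\rho$ are introduced here for illustration and are not in the question. A random intercept alone implies the same correlation for every pair of time points:
$$\text{Corr}(Y_{it}, Y_{is}) = \frac{\sigma_U^2}{\sigma_U^2 + \sigma_e^2}, \quad t \neq s$$
whereas an AR(1) structure lets the correlation decay with the separation between measurements:
$$\text{Corr}(Y_{it}, Y_{is}) = \rho^{|t - s|}$$
For example, with $\rho = 0.6$ and time coded 0 to 3, measurements one step apart correlate at $0.6$ and those three steps apart at $0.6^3 = 0.216$. Neither expression involves the baseline value itself, which is why conditioning on baseline is a separate modeling decision.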
Note that the baseline distribution often has a different shape than the distribution of the measurements at follow-up, making it hard to model baseline as an outcome but easy to condition on it as a covariate.