# Solved – A mixed-effects model for repeated measurements vs multiple time point-wise comparisons with a simpler test

I have a pretty standard situation: a study in which repeated measurements are taken from the same individuals. There are two factors: "Group" (with 10 individuals in each of two groups) and "Day" (time is treated here as a categorical variable). To keep things simple, let's consider only two time points, Day 1 and Day 2. In R, the data look as follows (ID: subject ID; Group: group label; Day: factor with 2 levels indicating the day of sampling; BW: body weight, kg):

```
dat
     ID Group   Day       BW
1   ID1     A Day 1 2333.231
2   ID2     A Day 1 2615.744
3   ID3     A Day 1 2282.484
4   ID4     A Day 1 2796.806
5   ID5     A Day 1 2262.759
6   ID6     A Day 1 2520.216
7   ID7     A Day 1 2606.598
8   ID8     A Day 1 2617.347
9   ID9     A Day 1 2439.651
10 ID10     A Day 1 2515.900
11 ID11     B Day 1 2692.253
12 ID12     B Day 1 2208.707
13 ID13     B Day 1 2343.652
14 ID14     B Day 1 2564.080
15 ID15     B Day 1 2411.044
16 ID16     B Day 1 2774.001
17 ID17     B Day 1 2634.651
18 ID18     B Day 1 2514.433
19 ID19     B Day 1 2198.449
20 ID20     B Day 1 2505.220
21  ID1     A Day 2 2314.214
22  ID2     A Day 2 2302.396
23  ID3     A Day 2 2319.029
24  ID4     A Day 2 2533.612
25  ID5     A Day 2 2290.300
26  ID6     A Day 2 2168.727
27  ID7     A Day 2 2466.597
28  ID8     A Day 2 2223.379
29  ID9     A Day 2 2441.762
30 ID10     A Day 2 2288.917
31 ID11     B Day 2 1984.846
32 ID12     B Day 2 2702.819
33 ID13     B Day 2 2793.834
34 ID14     B Day 2 2563.337
35 ID15     B Day 2 2666.664
36 ID16     B Day 2 2399.159
37 ID17     B Day 2 2586.255
38 ID18     B Day 2 2193.912
39 ID19     B Day 2 2797.592
40 ID20     B Day 2 3043.074
```

Here is a graphical representation of these data (data points coming from the same subject are connected with dashed lines to make it easier to understand the structure of this dataset):

In order to test the effects of Group and Day, I could fit a mixed-effects model using e.g. the nlme package for R:

```r
library(nlme)

# Fit the model: fixed effects of Day, Group and their interaction,
# with a random intercept for each subject (ID)
M <- lme(BW ~ Day * Group, random = ~ 1 | ID, data = dat)

# Check the significance of effects:
anova(M)
#             numDF denDF  F-value p-value
# (Intercept)     1    18 5564.085  <.0001
# Day             1    18    0.326  0.5753
# Group           1    18    2.849  0.1087
# Day:Group       1    18    3.631  0.0728
```

Thus, according to the fitted mixed-effects model (which was adequate for these data; diagnostics were run but are not presented here), neither of the examined factors (Day and Group) has a significant effect on the response variable at the 5% level, and the interaction between the two factors is not significant either.

This is the type of analysis that I would do for such a dataset if I were asked to. However, many people in my organisation are not familiar with mixed-effects models. What they would typically do is apply a series of t-tests (or similar tests) to detect the effect of "Group" on each of the sampling dates. For example, for the data shown above one would conduct a t-test for Day 1 and another t-test for Day 2, getting the following results:

Day 1: P ≈ 0.86

Day 2: P < 0.05
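For anyone who wants to reproduce the day-by-day comparison, here is a minimal sketch in base R. It rebuilds the data frame from the values listed above and runs one two-sample t-test per day. The names `bw`, `tt` and `p_by_day` are mine, and I am assuming R's default Welch (unequal-variance) version of the test; a pooled-variance t-test or a nonparametric alternative would give somewhat different P-values.

```r
# Body weights in the same order as the printed data:
# Day 1 (ID1-ID20), then Day 2 (ID1-ID20)
bw <- c(
  2333.231, 2615.744, 2282.484, 2796.806, 2262.759,  # Day 1, Group A
  2520.216, 2606.598, 2617.347, 2439.651, 2515.900,
  2692.253, 2208.707, 2343.652, 2564.080, 2411.044,  # Day 1, Group B
  2774.001, 2634.651, 2514.433, 2198.449, 2505.220,
  2314.214, 2302.396, 2319.029, 2533.612, 2290.300,  # Day 2, Group A
  2168.727, 2466.597, 2223.379, 2441.762, 2288.917,
  1984.846, 2702.819, 2793.834, 2563.337, 2666.664,  # Day 2, Group B
  2399.159, 2586.255, 2193.912, 2797.592, 3043.074
)

dat <- data.frame(
  ID    = factor(rep(paste0("ID", 1:20), times = 2)),
  Group = factor(rep(rep(c("A", "B"), each = 10), times = 2)),
  Day   = factor(rep(c("Day 1", "Day 2"), each = 20)),
  BW    = bw
)

# One Welch two-sample t-test of Group within each day
# (assumption: this is the "simple test" being used)
tt <- lapply(split(dat, dat$Day), function(d) t.test(BW ~ Group, data = d))

# Collect the two P-values, named by day
p_by_day <- sapply(tt, function(x) x$p.value)
print(round(p_by_day, 3))
```

Note that each of these tests uses only the 20 observations from its own day, so the pairing of Day 1 and Day 2 measurements within a subject plays no role in either of them.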

Thus, they would claim that there was a significant Group effect on Day 2. I tried to explain that this result would not be correct because of the correlation in the data, which arises from the repeated measurements made on the same subjects. However, a colleague of mine asked a question that I could not answer easily. He said:

"OK, the observations are correlated, I get that. But for now, forget about the fact that we have data from Day 1 and suppose that there are data only from Day 2. Observations in Group A and Group B are independent of each other, and so we are allowed to apply a t-test or something similar. When we do apply a t-test [as shown above], we get a significant Group effect. How should we then treat this result?"

And this is exactly the point where I got stuck. Indeed, if one has only the information from Day 2 and does a simple t-test, one reaches a very different (and, in principle, justified) conclusion from the one obtained with the mixed-effects model. Which method of analysis should be trusted, then? Is the Group effect real?
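One thing that may help when comparing the two analyses is to make sure they are asked the same question. The "Group" row of the anova table is not specifically a test of the Day 2 difference. The sketch below is my own illustration, not an established recipe: it assumes R's default treatment contrasts and re-fits the same model with Day 2 as the reference level, so that the `GroupB` coefficient becomes the estimated B minus A difference on Day 2. The names `M2`, `day2_group` and `raw_diff` are introduced here for illustration only.

```r
library(nlme)

# Same data as listed at the top of the post
bw <- c(
  2333.231, 2615.744, 2282.484, 2796.806, 2262.759,  # Day 1, Group A
  2520.216, 2606.598, 2617.347, 2439.651, 2515.900,
  2692.253, 2208.707, 2343.652, 2564.080, 2411.044,  # Day 1, Group B
  2774.001, 2634.651, 2514.433, 2198.449, 2505.220,
  2314.214, 2302.396, 2319.029, 2533.612, 2290.300,  # Day 2, Group A
  2168.727, 2466.597, 2223.379, 2441.762, 2288.917,
  1984.846, 2702.819, 2793.834, 2563.337, 2666.664,  # Day 2, Group B
  2399.159, 2586.255, 2193.912, 2797.592, 3043.074
)
dat <- data.frame(
  ID    = factor(rep(paste0("ID", 1:20), times = 2)),
  Group = factor(rep(rep(c("A", "B"), each = 10), times = 2)),
  Day   = factor(rep(c("Day 1", "Day 2"), each = 20)),
  BW    = bw
)

# Make Day 2 the reference level: with treatment contrasts, the GroupB
# coefficient is then the B - A difference on Day 2
dat$Day <- relevel(dat$Day, ref = "Day 2")

# Same model structure as M, only the reference level of Day differs
M2 <- lme(BW ~ Day * Group, random = ~ 1 | ID, data = dat)

# Estimate, standard error, DF, t-value and p-value for the Day 2 contrast
day2_group <- summary(M2)$tTable["GroupB", ]
print(day2_group)

# For comparison: the raw difference in Day 2 group means
raw_diff <- with(subset(dat, Day == "Day 2"),
                 mean(BW[Group == "B"]) - mean(BW[Group == "A"]))
print(raw_diff)
```

Because the design is balanced and complete, the model-based estimate should coincide with the raw difference in Day 2 means. What differs from the t-test is how the standard error is obtained: the model pools variance information across both days rather than using Day 2 alone.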

I feel like I am missing an important piece of the justification for using the mixed model. Any hint would be highly appreciated.
