I have a dataset with repeated measures for each subject, and a linear mixed-effects model doesn't fit it well. Are there any smoothing methods (e.g. splines) that work for longitudinal data, so that I can find a function to fit the data?
Best Answer
A fundamental method that can be used with any regression model without extra code is that of B-splines.
In these methods, a single covariate is expanded into several basis columns such that the fitted function is a piecewise polynomial of some maximum degree $d$ with $d - 1$ continuous derivatives at the knots. Typical usage is a cubic spline, meaning the first and second derivatives at the knots are continuous, but the third derivative is allowed to change.
To help illustrate the parameter expansion, I will use an overly simplified example. Suppose we wished to fit a piecewise linear function of a covariate $x$ with a single knot at 1. Then we could expand $x$ into two new variables, $x^-$ and $x^+$. If $x_i < 1$, we set $x_i^- = x_i$ and $x_i^+ = 0$. If $x_i \ge 1$, we set $x_i^- = 1$ and $x_i^+ = x_i - 1$. (Note: this is a simplification of what's really done; transformations are performed on the expanded variables to improve numerical stability. See De Boor's algorithm if you're really interested.) Note that the new fitted model will be continuous in $x$ but allows for different slopes above and below 1.
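The toy expansion above can be sketched directly in R. This is only an illustration under assumed data: the simulated `x` and `y`, the slopes 2 and -1, and the names `x_minus`, `x_plus` and `pw_fit` are made up for the example and are not what `bs()` does internally.

```r
set.seed(1)
knot <- 1
x <- sort(runif(200, -1, 3))

# Assumed truth: slope 2 below the knot, slope -1 above it, continuous at x = 1
y <- ifelse(x < knot, 2 * x, 2 * knot - (x - knot)) + rnorm(200, sd = 0.1)

# Manual expansion of x into the two new variables
x_minus <- pmin(x, knot)      # equals x below the knot, fixed at 1 above it
x_plus  <- pmax(x - knot, 0)  # 0 below the knot, x - 1 above it

# Ordinary linear regression on the expanded variables
pw_fit <- lm(y ~ x_minus + x_plus)
coef(pw_fit)  # slope estimates for below and above the knot
```

The coefficient on `x_minus` estimates the slope below the knot and the coefficient on `x_plus` the slope above it, while the fit stays continuous at the knot because `x_minus + x_plus` always equals `x`.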
And most relevantly to this question, note that we have done nothing to the actual regression model itself (i.e. no adding penalties to the likelihood, etc.). So you can expand your parameter space and then plug directly into your regression model.
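To see that nothing happens to the regression model itself, it may help to look at what `bs()` returns: a plain numeric matrix of basis columns. The sketch below uses made-up data (`x`, `y`) and hypothetical object names; it fits the same model once with the precomputed matrix and once with `bs()` inside the formula.

```r
library(splines)

set.seed(2)
x <- seq(0, 1, length.out = 50)
y <- sin(2 * pi * x) + rnorm(50, sd = 0.1)

# The expanded covariate: one column per basis function
B <- bs(x, df = 6)
dim(B)  # 50 rows, 6 columns

# Same regression, two ways of supplying the expanded covariate
fit_matrix  <- lm(y ~ B)
fit_formula <- lm(y ~ bs(x, df = 6))

# The fitted values should agree
all.equal(unname(fitted(fit_matrix)), unname(fitted(fit_formula)))
```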
In R, it's really easy to use B-splines; you just call `bs(x, ...)` in the formula for any regression model. But I suggest reading the help file to understand all the options.
Here's a simple example using linear regression. We could easily switch out `lm` for any other regression model.
```r
library(splines)

x <- sort(rnorm(1000))  # sorting just for plotting purposes
y <- sin(x) + rnorm(1000, sd = 0.1)

# Linear regression on a B-spline expansion of x
bspl_fit <- lm(y ~ bs(x, df = 10))

plot(x, y)
lines(x, sin(x), col = 'blue', lwd = 2)
lines(x, predict(bspl_fit), col = 'red', lwd = 2)
legend('topleft', c('True', 'Spline Fit'), lwd = 2, col = c('blue', 'red'))
```
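For the repeated-measures setting in the question, one way to swap out `lm` is a random-intercept mixed model with the spline in the fixed effects. The following is a minimal sketch, assuming simulated data and using `lme()` from the `nlme` package as one possible choice of model; the variable names (`subject`, `time`, `subj_shift`) and the random-effects structure are assumptions for illustration, not a recommendation for any particular dataset.

```r
library(splines)
library(nlme)

set.seed(42)
n_subj <- 30
n_obs  <- 10

# Simulated longitudinal data: each subject follows sin(time) plus its own shift
dat <- data.frame(
  subject = factor(rep(seq_len(n_subj), each = n_obs)),
  time    = rep(seq(0, 6, length.out = n_obs), times = n_subj)
)
subj_shift <- rnorm(n_subj, sd = 0.5)
dat$y <- sin(dat$time) + subj_shift[as.integer(dat$subject)] +
  rnorm(nrow(dat), sd = 0.1)

# B-spline expansion of time in the fixed effects, random intercept per subject
lme_fit <- lme(y ~ bs(time, df = 5), random = ~ 1 | subject, data = dat)

# Population-level (fixed effects only) fitted curve at the observed times
dat$pop_fit <- fitted(lme_fit, level = 0)

plot(dat$time, dat$y, col = 'grey')
first_subj <- dat$subject == dat$subject[1]  # every subject shares the same times
lines(dat$time[first_subj], dat$pop_fit[first_subj], col = 'red', lwd = 2)
```

The mean structure here is exactly the `lm` example with `bs()`; only the model around it changes, which is the sense in which the spline expansion plugs into any regression model.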