I am trying to explain a time series with the help of other related series. I get really nice fits using a standard LM approach with a Newey-West variance-covariance matrix. The fit even improves drastically when I replace some of the explanatory variables with higher lags (lag 10) of those variables (the series are quarterly).
Is it OK to use such high lags? The length of the series is about 120. What's the risk of using high lags? It seems uncommon to me, at least.
Best Answer
It's not "wrong," but it's probably a bad idea. One problem is that you don't have lags for your first ten observations, so you can't use those in your analysis, effectively making your data set smaller.
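To make the lost observations concrete, here is a minimal sketch using simulated data. The helper `lagged_design` and the random series are hypothetical, not taken from the question; the only figures borrowed from it are the series length of about 120 and the lag of 10.

```python
import numpy as np

def lagged_design(y, x, lags):
    """Align y with the requested lags of x, dropping rows that have no lagged value."""
    max_lag = max(lags)
    n = len(y)
    # Column for lag k holds x[t - k] for t = max_lag, ..., n - 1
    cols = [x[max_lag - k : n - k] for k in lags]
    return y[max_lag:], np.column_stack(cols)

rng = np.random.default_rng(0)
x = rng.normal(size=120)  # stand-in explanatory series (simulated)
y = rng.normal(size=120)  # stand-in response series (simulated)

y_trim, X = lagged_design(y, x, lags=[1, 4, 10])
n_usable = len(y_trim)
print(n_usable, X.shape)
```

With 120 quarters and a maximum lag of 10, only 110 rows remain for estimation, whichever other lags are included.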
There are certain lags that make sense intuitively: last period probably affects this period, and this quarter is probably related to the same quarter a year earlier because of seasonal patterns. For you, lags of one and four would make sense. Having something from two years back influence what happens today would be surprising, and 2.5 years (10 quarters) seems stranger still.
I would chalk up a significant lag at quarter 10 to chance, rather than a good model. Including this lag can lead to overfitting. If you overfit, you will have trouble with out-of-sample forecasting. As a test on this, you might run the model with the 10 quarter lag on the first and second halves of your data to see if you still get a significant/similar result.
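The split-sample check could be sketched roughly as follows. This is an illustration on simulated data, not the asker's model: the data-generating process (y depends only on lag 1 of x), the helper names, and the use of plain OLS standard errors instead of Newey-West ones are all assumptions made to keep the example short and dependency-free.

```python
import numpy as np

def lagged_design(y, x, lags):
    """Align y with the requested lags of x, dropping rows without lagged values."""
    max_lag = max(lags)
    n = len(y)
    cols = [x[max_lag - k : n - k] for k in lags]
    return y[max_lag:], np.column_stack(cols)

def ols(y, X):
    """OLS with intercept; returns coefficients and classical (non-robust) t-statistics."""
    Xc = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    resid = y - Xc @ beta
    dof = len(y) - Xc.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(Xc.T @ Xc)
    return beta, beta / np.sqrt(np.diag(cov))

# Simulated example: y truly depends on lag 1 of x only (assumption for illustration)
rng = np.random.default_rng(1)
n = 120
x = rng.normal(size=n)
y = np.empty(n)
y[0] = rng.normal()
y[1:] = 0.8 * x[:-1] + rng.normal(size=n - 1)

y_trim, X = lagged_design(y, x, lags=[1, 10])
half = len(y_trim) // 2

# Fit the same specification separately on each half of the sample
results = [ols(y_trim[:half], X[:half]), ols(y_trim[half:], X[half:])]
for name, (beta, tstat) in zip(["first half", "second half"], results):
    print(name, "coef:", np.round(beta, 3), "t:", np.round(tstat, 2))
```

If the lag-10 coefficient changes sign or loses significance between the two halves, that is a hint the full-sample result was a chance fit. With autocorrelated residuals the t-statistics should be computed from a Newey-West covariance matrix, as in the question, rather than from the classical formula used here.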
Lastly, unless intuition supports it, I don't like to include one lag, then skip a bunch, then include another. For example, including lags 1 and 4 makes sense intuitively, so that's fine, but using lags 1, 4, and 10 just seems strange.
Time series analysis is as much art as science, so it does take some playing around.