Solved – Is it necessary to detrend and decycle time-series data when using machine learning methods

For example:

I want to forecast future values of a time series based on previous values of multiple time series using an ANN and/or SVM. Inputs will be lagged values from each time series, and the outputs will be one-step-ahead forecasts (forecasts at longer horizons will be made by "rolling" the predictions forward, feeding previous predictions back in as inputs).
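To make the setup concrete, here is a minimal sketch of that scheme in Python/NumPy. The series names are made up, and a plain least-squares linear model stands in for the ANN or SVM; the point is only the mechanics of lagged features and rolling forecasts forward:

```python
import numpy as np

def make_lagged(series_list, n_lags):
    """Stack the last n_lags values of every series into one feature row;
    the target is the next value of the first series (one-step-ahead)."""
    T = len(series_list[0])
    X, y = [], []
    for t in range(n_lags, T):
        row = []
        for s in series_list:
            row.extend(s[t - n_lags:t])
        X.append(row)
        y.append(series_list[0][t])
    return np.array(X), np.array(y)

# toy data: two hypothetical random-walk series
rng = np.random.default_rng(0)
a = np.cumsum(rng.normal(size=100))
b = np.cumsum(rng.normal(size=100))
X, y = make_lagged([a, b], n_lags=3)

# fit a simple linear model (stand-in for an SVM/ANN) with a bias term
w, *_ = np.linalg.lstsq(np.c_[X, np.ones(len(X))], y, rcond=None)

# "roll" forecasts forward: feed each prediction back in as a lagged input
hist_a, hist_b = list(a[-3:]), list(b[-3:])
preds = []
for _ in range(5):
    feats = np.r_[hist_a[-3:], hist_b[-3:], 1.0]
    p = feats @ w
    preds.append(p)
    hist_a.append(p)           # target series rolled forward with its prediction
    hist_b.append(hist_b[-1])  # naive hold for the other series
```

Note that at longer horizons the model is consuming its own predictions, so errors compound; that is part of why the preprocessing discussed below matters.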

Now, shouldn't SVMs and ANNs be able to learn trends and cycles? Would they not be able to learn things like "all else being equal, the output of this series should be 2x the previous output"? Or, if I provide a categorical variable for the month, "since it's January, divide the prediction I would have made by 2"?

Would attempting to decycle and detrend the data result in imposing more bias than necessary?

With machine learning algorithms it is often beneficial to use feature scaling or normalisation to help the algorithm converge quickly during training and to prevent one set of features from dominating another. Take, for example, the problem of predicting stock prices. If you include high-priced stocks such as Apple or Microsoft along with some penny stocks, the high-valued features you extract from the Apple and Microsoft prices will overwhelm those you extract from the penny stocks; you won't be training on an apples-to-apples basis (no pun intended!), and the resulting trained model might not generalise very well.
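A quick sketch of what that standardisation looks like, using made-up prices for a high-priced stock and a penny stock:

```python
import numpy as np

# hypothetical prices: a high-priced stock and a penny stock
big = np.array([150.0, 152.0, 149.5, 153.0])
penny = np.array([0.50, 0.52, 0.49, 0.53])

def zscore(x):
    """Standardise a series: subtract its mean, divide by its std."""
    return (x - x.mean()) / x.std()

big_z, penny_z = zscore(big), zscore(penny)
# After standardisation both series occupy the same approximate range
# (mean 0, standard deviation 1), so neither dominates during training.
```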

However, imho "attempting to decycle and detrend the data" would be a very good thing to do. Extracting the various cyclic and trend components, then normalising them by subtracting their respective means and dividing by their standard deviations, would place the data for all the time series into the same approximate range. You would then be training on like-for-like data which, when rescaled by reversing the normalisation, would likely generalise much better for predictive purposes.
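As a sketch of that detrend-normalise-invert round trip (a simple least-squares line stands in for a more careful trend extraction, on a synthetic trend-plus-cycle series):

```python
import numpy as np

# synthetic series: linear trend + 12-step cycle + noise
t = np.arange(120)
rng = np.random.default_rng(1)
series = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(scale=1.0, size=t.size)

# extract the trend with a least-squares line fit (a simple stand-in
# for a more careful decomposition), leaving the cyclic residual
coef = np.polyfit(t, series, 1)
trend = np.polyval(coef, t)
resid = series - trend

# normalise the residual: subtract its mean, divide by its std
mu, sigma = resid.mean(), resid.std()
resid_z = (resid - mu) / sigma   # this is what you would train on

# reverse the transform to put model output back on the original scale
reconstructed = resid_z * sigma + mu + trend
```

The reconstruction step is exact here because it simply inverts the normalisation and adds the trend back; in practice you would apply it to the model's predictions rather than to the training residuals.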

Furthermore, for any given time series the trend may swamp the cyclic component, so you might end up training on trend-only data, which almost certainly won't perform well on cyclic time series, and vice versa. By separating out the two components, training on each with a separate SVM or NN, and then recombining the two predictions, you might end up with a more accurate and more readily generalisable algorithm.
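A toy sketch of that two-model recombination, on a noiseless synthetic series. Here a fitted line plays the trend model and a seasonal-naive rule plays the cycle model; in practice each would be its own trained SVM or NN:

```python
import numpy as np

# synthetic series: linear trend + 12-step cycle, no noise for clarity
t = np.arange(120)
series = 0.3 * t + 5 * np.sin(2 * np.pi * t / 12)

# component 1: trend, forecast by extrapolating a fitted line one step ahead
coef = np.polyfit(t, series, 1)
trend_pred = np.polyval(coef, 120)

# component 2: cycle, forecast by a seasonal-naive rule on the detrended data
# (use the value from one full 12-step cycle earlier)
resid = series - np.polyval(coef, t)
cycle_pred = resid[120 - 12]

# recombine the two component forecasts
forecast = trend_pred + cycle_pred
```

Each component model only ever sees data on its own scale, so the trend cannot swamp the cycle (or vice versa) during training.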
