Consider a scenario where I have many time series and need to produce forecasts for all of them. I created a ts object from the data. The data may contain outliers, but I am not sure.
I always pass the ts object to the tsclean() function of the forecast package before building an ARIMA model from it. Is this the right approach?
If not, please suggest an alternative.
Outliers can occur in patches, which makes them level shifts. Outliers can occur systematically, say every June. Outliers can often be inliers: in 1,9,1,9,1,9,1,9,5,9 the "5" is an inlier, a value that is unremarkable in size but inconsistent with the pattern around it. To detect the four kinds of latent deterministic structure, in either a univariate or a multivariate setting, one has to be concerned with pulses, level/step shifts, seasonal pulses (e.g. a June effect that starts at year 7), and local time trends.
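To make the inlier example and the four kinds of structure concrete, here is a minimal base R sketch. The threshold of 2, the 120 monthly observations, and the start point `t0` are assumptions chosen purely for illustration, and the phase-median comparison is just one simple way to expose the inlier, not a method taken from any package.

```r
# Inlier: "5" sits close to the overall mean, so a global z-score does not flag it
x <- c(1, 9, 1, 9, 1, 9, 1, 9, 5, 9)
z <- (x - mean(x)) / sd(x)
global_flags <- which(abs(z) > 2)        # threshold 2 is an arbitrary choice

# Compare each point with the median of its own phase in the period-2 pattern
phase <- (seq_along(x) - 1) %% 2
expected <- ave(x, phase, FUN = median)
pattern_flags <- which(abs(x - expected) > 2)

# The four deterministic regressors for n monthly observations
n <- 120    # hypothetical: 10 years of monthly data
t0 <- 73    # hypothetical: the effect starts in January of year 7
idx <- seq_len(n)
month <- (idx - 1) %% 12 + 1

pulse          <- as.numeric(idx == t0)               # one-time blip
level_shift    <- as.numeric(idx >= t0)               # permanent step
seasonal_pulse <- as.numeric(idx >= t0 & month == 6)  # every June from year 7 on
local_trend    <- pmax(0, idx - t0 + 1)               # ramp starting at t0

xreg <- cbind(pulse, level_shift, seasonal_pulse, local_trend)
```

Columns like these could be passed through the `xreg` argument of `arima()` once the timing of each effect is known; finding that timing is the hard part.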
The tsoutliers package (https://cran.r-project.org/web/packages/tsoutliers/tsoutliers.pdf) is quite good, but it requires an ARIMA model, which of course cannot be easily identified if outliers are present (chicken and egg comes to mind!). This is why you need a comprehensive, holistic approach that simultaneously identifies the outliers (all four kinds) and the ARIMA component. tsoutliers does not handle the seasonal pulse issue at all and would instead flag multiple separate pulses, which would not lead to a proper forecast of the seasonal dummy effect.
Given that I wanted to restrict myself to free software, I would use tsoutliers, manually provide some alternative ARIMA models, and then compare the results to see which combination yields an error process that is free of ARIMA structure, free of outliers, and has constant error variance.
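A rough sketch of that comparison, assuming the tsoutliers package is installed. The built-in `Nile` series is only a stand-in for your data, and the candidate orders, the Ljung-Box lag of 10, and the half-sample variance ratio are all illustrative choices rather than recommendations.

```r
library(tsoutliers)  # provides tso()

y <- Nile  # built-in annual series, used here only as a stand-in

# Hypothetical candidate (p, d, q) orders supplied by hand
candidates <- list(c(0, 0, 0), c(1, 0, 0), c(0, 1, 1))

check_candidate <- function(ord) {
  label <- paste(ord, collapse = ",")
  # Detect additive outliers, level shifts and temporary changes for a fixed order
  fit <- tryCatch(
    tso(y, types = c("AO", "LS", "TC"), tsmethod = "arima",
        args.tsmethod = list(order = ord)),
    error = function(e) NULL
  )
  if (is.null(fit)) {
    return(data.frame(order = label, n_outliers = NA_integer_,
                      ljung_p = NA_real_, var_ratio = NA_real_))
  }
  res <- as.numeric(fit$fit$residuals)
  half <- floor(length(res) / 2)
  data.frame(
    order = label,
    n_outliers = NROW(fit$outliers),
    # A large p-value suggests no remaining autocorrelation in the residuals
    ljung_p = Box.test(res, lag = 10, type = "Ljung-Box")$p.value,
    # Crude constant-variance check: first-half vs second-half residual variance
    var_ratio = var(res[1:half]) / var(res[(half + 1):length(res)])
  )
}

summary_table <- do.call(rbind, lapply(candidates, check_candidate))
print(summary_table)
```

A candidate whose residuals show a high Ljung-Box p-value and a variance ratio near 1 is the kind of "combo" to prefer. Because seasonal pulses are not among the types requested here, a series with a recurring June effect would still need that regressor added by hand.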