Solved – R : Parallelising auto.arima()

I am parallelising auto.arima() to forecast 500 route time series. Initially I set the parallel option in auto.arima() to TRUE and the number of cores to 10. That did not reduce the execution time; it just increased the amount of computation to 10 times more, so the run took longer. Then I tried something like this:

    modelsPetrol <- list()
    registerDoParallel(cores=28)
    start=Sys.time()
    foreach(i=1:10, .packages=c("forecast"), .combine=cbind) %dopar% {
      filename <- paste("Model_P",temp[[i]],".rda",sep="")
      t <- matrix(nrow = 750,ncol=10)
      y <- ts(Train_Petrol[,temp[[i]]], frequency = 6)
      t <- fourier(ts(Train_Petrol[,temp[[i]]], frequency=311.50), K=5)
      modelsPetrol[i] <- auto.arima(y,xreg=cbind(t,holi,wday,schl,weekn),approximation=FALSE,trace=FALSE,stepwise=TRUE,lambda = TRUE)
      save(modelsPetrol, file=filename)
      print(modelsPetrol)
      t<-NULL
      y<-NULL
    }

The code runs fine and takes less time, but I am struggling to inspect the models: for example, summary(modelsPetrol[[1]]) gives a "subscript out of bounds" error.
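The error follows from how foreach evaluates its body: the assignment to modelsPetrol happens inside the worker (or at least inside the loop's own evaluation environment), so the list in the calling session is never filled. Here is a minimal sketch of the difference, using a toy computation instead of auto.arima(). The names squares_bad and squares_ok are made up for illustration.

```r
library(doParallel)  # also attaches foreach and parallel

registerDoParallel(cores = 2)

# Side-effect assignment: the loop body only changes its own copy of the list
squares_bad <- list()
ignored <- foreach(i = 1:3) %dopar% {
  squares_bad[[i]] <- i^2
}
length(squares_bad)   # 0: the list in the calling session is untouched

# Return the value from the body and let foreach collect the results
squares_ok <- foreach(i = 1:3) %dopar% {
  i^2
}
squares_ok[[3]]       # 9

stopImplicitCluster()
```

So the value to keep is whatever foreach itself returns, not a list modified inside the loop.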

Finally cracked it. The trick is to return the fitted model from the loop body and keep the list that foreach returns. Sharing a chunk of code that runs auto.arima() for 500 routes using parallel multi-core processing:

    library(doParallel)  # also attaches foreach and parallel

    # Fit one route's series. wday and weekn are regressors taken from the
    # calling environment (they are not defined in this snippet).
    parallel.arima <- function(data) {
      library(forecast)
      y <- ts(data, frequency = 6)
      t <- fourier(ts(data, frequency = 311.50), K = 5)
      fit <- auto.arima(y, xreg = cbind(t, wday, weekn),
                        approximation = FALSE, trace = FALSE, stepwise = TRUE)
      return(fit)
    }

    registerDoParallel(cores = 20)

    start <- Sys.time()
    # foreach returns a list with one fitted model per route
    models <- foreach(i = 1:length(temp)) %dopar% {
      parallel.arima(Train_Petrol[, temp[[i]]])
    }
    end <- Sys.time()

Works well..!!
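Once the list comes back from foreach, the individual models can be inspected in the usual way. This is a small self-contained sketch, assuming simulated data in place of Train_Petrol and no external regressors. The route names R1 to R3 and the objects toy and fits are hypothetical.

```r
library(forecast)
library(doParallel)  # also attaches foreach and parallel

set.seed(1)
# Toy stand-in for Train_Petrol: 3 hypothetical routes, 120 observations each
toy <- data.frame(R1 = cumsum(rnorm(120)),
                  R2 = cumsum(rnorm(120)),
                  R3 = cumsum(rnorm(120)))
routes <- names(toy)

registerDoParallel(cores = 2)
start <- Sys.time()
fits <- foreach(r = routes, .packages = "forecast") %dopar% {
  auto.arima(ts(toy[[r]], frequency = 6))
}
end <- Sys.time()
stopImplicitCluster()

names(fits) <- routes                 # index the models by route name
print(end - start)                    # elapsed wall-clock time
summary(fits[["R1"]])                 # no "subscript out of bounds" here
fc <- forecast(fits[["R1"]], h = 6)   # 6-step-ahead forecast for one route
```

With regressors passed through xreg, as in the code above, forecast() would also need future values of those regressors.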
