I have 3 time series and I would like to determine if there are any common patterns in the data. Zuur mentioned in one of his books that a reasonably simple approach to test this is based on repeated loess smoothing. Zuur states that to visualize the general patterns of the time series, we fit a loess smoother through each series and then plot them all on the same graph. This generally captures the long term variations, to capture the short term variations (which I am interested in) we have to apply loess smoothing again, but now on the residuals.
Below, is an example data set where we have hourly values of ozone concentrations for 3 countries, I would like to perform what Zuur has explained, but to this example data set. Could anyone offer some suggestions? Please note that my knowledge of R and statistics in general is low, I apologize if this is a simplistic question.
require(plyr) require(lattice) TopFolder <- list("http://www.nilu.no/projects/ccc/onlinedata/ozone/CZ03_2009.dat" ,"http://www.nilu.no/projects/ccc/onlinedata/ozone/CY02_2009.dat" ,"http://www.nilu.no/projects/ccc/onlinedata/ozone/BE35_2009.dat" ) ## create variable for the data data = ldply(TopFolder, header = TRUE, read.table, sep = "", skip = 3) ## define Ozone levels Ozone <- data$Value Ozone[Ozone==-999] <- NA Ozone <- data.frame(Ozone) ## define Datetime - need to concatenate arrays DateTime <- paste(data$Date,data$Hour, sep = " ") Date <- as.POSIXct(DateTime, format = "%d.%m.%Y %H:%M") ## define countries Countries <- c("Country1","Country2","Country3") Country <- data.frame(Country = rep(Countries, each = 8760)) ## bind together Dat <- cbind(Ozone, Country = Country) Dat <- transform(Dat, Date=Date, Doy = as.numeric(format(Date,format = "%j")), Tod = as.numeric(format(Date,format = "%H")), DecTime = rep(seq(1,365, length = 8760),by = 3)) ## Select data for a month during the summer NewDat <- subset(Dat, as.Date(Dat$Date) >= '2009-07-01 00:00:00' & as.Date(Dat$Date) <= '2009-08-01 00:00:00') ## plot the data xyplot(Ozone~DecTime | Country, data = NewDat, type = "l", col = 1, strip = function(bg = 'white',...)strip.default(bg = 'white',...))
So, here I have imported the data and taken a subset (for one month), and then plotted the values. I would now like to apply loess smoothing to represent the short term variation i.e. hourly.
Best Answer
Try this:
PlotData <- lapply(Countries,function(country) { fit1 <- loess(Ozone~DecTime ,data = NewDat[NewDat$Country==country,],na.action=na.exclude) fit2 <- loess(residuals(fit1)~DecTime ,data = NewDat[NewDat$Country==country,],na.action=na.exclude) data.frame(DecTime=NewDat[NewDat$Country==country,"DecTime"],SmoothResiduals=predict(fit2),Country=country) }) PlotData <- do.call('rbind',PlotData) xyplot(SmoothResiduals~DecTime | Country, data = PlotData, type = "l", col = 1, strip = function(bg = 'white',...)strip.default(bg = 'white',...))
Explanation: lapply
is used to iterate over the countries, applies a function and combines the function values in a list. The anonymous function does a loess
fit (note my use of na.action=na.exclude
to deal with NA
values), does another loess
fit on the residuals and returns a data.frame with the smoothed residuals. Subsequently, I combine all data.frames and create the plot.
Obviously you might want to adjust the parameters of the loess
fits.