I have a large set of time series (100k, each 3 observations) whose lengths vary by about 10% on average. Each of them covers a time interval of the same length, but the sampling rate varies, which is why the time series vectors have different lengths. I would like to normalize the length of each pair using linear interpolation, such that the first and last observations in each time series remain the same. Is there an R function for this kind of manipulation?
Example in R:
input <- 0:4
# should be rescaled/resized into:
output <- c(0, .444, .888, 1.333, 1.777, 2.222, 2.666, 3.111, 3.555, 4)
Best Answer
From the question and example I'll make a couple of assumptions. First, that each set of input values is at equally spaced intervals over a unit time interval. Second, that the output is to be at equally spaced intervals over a unit time interval. If the actual time span of the input and output differs from unity, the x values can easily be rescaled to the actual time. So the R code would look like:
# define the standardized x values of the output
output_x_vals <- seq(0, 1, length.out = 10)

# compute the interpolated values; this would be done for each input time series
interp_output <- approx(x = seq(0, 1, length.out = length(input)), y = input, xout = output_x_vals)
For your example, interp_output is
interp_output
$x
[1] 0.0000000 0.1111111 0.2222222 0.3333333 0.4444444 0.5555556 0.6666667 0.7777778 0.8888889 1.0000000

$y
[1] 0.0000000 0.4444444 0.8888889 1.3333333 1.7777778 2.2222222 2.6666667 3.1111111 3.5555556 4.0000000
where $x are the output x interpolation points and $y are the interpolated values.
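Since the interpolation has to be repeated for every series, it may help to wrap it in a small function. The following is only a sketch under the same assumptions as above (equally spaced samples over a common interval). The names resample_series, series_list and resampled are made up for illustration, and the toy data is not from the question.

```r
# Hypothetical helper: resample one equally spaced series to n_out points.
resample_series <- function(y, n_out) {
  # Place the input samples at equally spaced positions on [0, 1].
  x_in <- seq(0, 1, length.out = length(y))
  # Interpolate linearly at n_out equally spaced positions on [0, 1];
  # approx() returns a list, and $y holds the interpolated values.
  approx(x = x_in, y = y, xout = seq(0, 1, length.out = n_out))$y
}

# Toy list of series with different lengths (illustrative data only).
series_list <- list(0:4, c(2, 5, 3, 8, 1, 4), c(10, 20, 30))

# Resample every series to 10 points; each column of the result is one series.
resampled <- vapply(series_list, resample_series, numeric(10), n_out = 10)
```

Because both the input and output grids start at 0 and end at 1, the first and last rows of resampled hold the original first and last observations of each series. With 100k series, storing the result as a matrix like this should be cheaper than keeping a list of approx() outputs, though that is a design choice rather than a requirement.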