# Solved – Problems with modeling a cumulative dependent variable

I am building a .NET program. One of its functions is to provide a predictive model for a vehicle's life-to-date maintenance costs: what is the cumulative cost (Y) for a vehicle at a specific year (X)? I decided to use a 2nd-degree polynomial least-squares fit, and for the most part it does a good job. Sometimes, though, the curve peaks and starts trending downward, which doesn't make sense for life-to-date cost since it's a cumulative value: the cost at year X can never be less than the cost at year X-1.

This negative trend happens when the increase in cost from, say, year 2 to year 3 is smaller than the increase from year 1 to year 2. Some sample data that gives me a negative trend:

(1,328.76)
(2,1133.12)
(3,1366.07)
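
To make the problem concrete, here is a minimal sketch in Python (not the .NET code from my program) that fits a 2nd-degree least-squares polynomial to the three points above. The helper names `polyfit_least_squares` and `evaluate` are made up for this illustration. With only three points the quadratic passes through them exactly, and its peak lands at roughly year 2.9, so the curve is already heading down before the last data point.

```python
def polyfit_least_squares(xs, ys, degree):
    """Return coefficients [c0, c1, ...] of the least-squares polynomial
    c0 + c1*x + c2*x^2 + ... (hypothetical helper for this sketch)."""
    n = degree + 1
    # Normal equations (V^T V) c = V^T y, where V is the Vandermonde matrix.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]

    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


years = [1, 2, 3]
costs = [328.76, 1133.12, 1366.07]

coeffs = polyfit_least_squares(years, costs, 2)
c0, c1, c2 = coeffs

# A quadratic with c2 < 0 peaks where its derivative c1 + 2*c2*x is zero.
peak_year = -c1 / (2 * c2)
year4_cost = evaluate(coeffs, 4)

print("coefficients:", coeffs)
print("peak at year:", peak_year)
print("predicted cumulative cost at year 4:", year4_cost)
```

The prediction for year 4 comes out below the observed year 3 total, which is exactly the behaviour that can't happen with a cumulative cost.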

My solution for now is to check for a negative trend and, if it's found, use a linear best fit instead, but I feel like that's a messy fix. I've thought about implementing some sort of minimum value for the change from year to year, essentially turning the curve into a straight line at a certain X value, but that seems complicated to implement. Does anyone see a better way of doing this or a better model to use? I'm not very knowledgeable about statistics, so go easy on me :-p
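
For reference, the minimum-change idea can be sketched roughly as follows (again in Python rather than .NET). This is only an illustration under assumptions: the function name `clamped_quadratic` and the `MIN_SLOPE` value of 100 per year are invented for the example, and the coefficients are those of the quadratic passing through the three sample points. The predictor follows the quadratic until its slope falls to the minimum, then continues as a straight line with that slope.

```python
def clamped_quadratic(coeffs, min_slope):
    """Return a predictor that follows c0 + c1*x + c2*x^2 until the slope
    drops to min_slope, then continues linearly (hypothetical helper)."""
    c0, c1, c2 = coeffs

    def quadratic(x):
        return c0 + c1 * x + c2 * x * x

    if c2 >= 0:
        # The slope does not decrease with x, so the curve never turns
        # downward later on; this sketch leaves it unclamped.
        return quadratic

    # Slope of the quadratic is c1 + 2*c2*x; solve for slope == min_slope.
    x_switch = (min_slope - c1) / (2 * c2)
    y_switch = quadratic(x_switch)

    def predict(x):
        if x <= x_switch:
            return quadratic(x)
        # Past the switch point, grow at the minimum rate.
        return y_switch + min_slope * (x - x_switch)

    return predict


# Quadratic through (1, 328.76), (2, 1133.12), (3, 1366.07).
COEFFS = (-1047.01, 1661.475, -285.705)
MIN_SLOPE = 100.0  # assumed minimum yearly cost increase, purely illustrative

predict = clamped_quadratic(COEFFS, MIN_SLOPE)
forecast = [predict(year) for year in range(1, 9)]
for year, cost in enumerate(forecast, start=1):
    print(year, round(cost, 2))
```

One caveat with this sketch: for the sample data the slope drops below the assumed minimum before year 3, so the clamped curve no longer passes exactly through the last observed point. Choosing the minimum slope is a judgment call the sketch does not answer.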

Edit

Each vehicle has a varying amount of data depending on how long it's been in service, with a soft max at 15 years. The last data point for each vehicle is for the most recent year (2011 in this case), and we are really only interested in extrapolating 5 years beyond that point. As we use the model from year to year, we will get more data for the vehicles, which requires the model to be updated. That's why I chose the polynomial least-squares fit: it's easy to run the new data back through that function and get a new equation.
