I'm trying to forecast with fbprophet. The inputs are all positive, but the predictions come back negative, and I'm confused. I read the quick start, and my understanding was that if the inputs are all positive then the predictions will most likely be positive too, and that the predictions follow the shape of the input, e.g. if an input is 0.86 then the output would be something like 0.81. Why is this happening? Did I do something wrong? If so, what should I do?
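    from fbprophet import Prophet

    # Prophet expects a date column named 'ds' and a value column named 'y'
    covid_pr = covid[['date', 'acc_confirmed']].copy()
    covid_pr.rename(columns={'date': 'ds', 'acc_confirmed': 'y'}, inplace=True)

    prt = Prophet()
    prt.fit(covid_pr)

    # Extend the frame 30 periods (days by default) past the last observation
    future_prt = prt.make_future_dataframe(30)
    forecast = prt.predict(future_prt)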
My input data
                ds   y
    0   2020-03-02   2
    1   2020-03-03   2
    2   2020-03-04   2
    3   2020-03-05   2
    4   2020-03-06   4
    5   2020-03-07   4
    6   2020-03-08   6
    7   2020-03-09  19
    8   2020-03-10  27
    9   2020-03-11  34
    10  2020-03-12  34
    ... (continues for thousands of rows)
Prediction of fbprophet
               yhat   yhat_lower   yhat_upper
    0   -261.572541  -499.409024    -4.741004
    1   -208.490561  -446.503629    41.371788
    2   -255.114682  -500.580393    -7.825269
    3   -208.963597  -481.238870    33.707433
    4   -146.566250  -394.337188    96.726382
    5    -92.445918  -354.914790   150.409867
    6    -38.341696  -293.639411   204.964963
    7     83.483534  -158.412231   332.263619
    8    136.565514   -95.174934   370.980615
    9     89.941393  -153.219255   349.129866
    10   136.097121   -95.508485   404.666524
    ... (continues for thousands of rows)
My general input plot
Fbprophet plot
Best Answer
Your fit is more or less a linear trend line (the red line), with some additional terms that depend on the day of the week and the time of day (the last two plots).
So what the fit does is try to draw a (more or less) straight line (the blue line in the plot) as close as possible to the data (the black points).
Since the data doesn't look at all like a straight line, you are not going to get a very good fit, and in this case it results in unphysical, unrealistic negative values.
Also, the model does not disallow negative values, so it is not strange to get them, especially when you are fitting values that are close to zero.
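To make this concrete, here is a minimal sketch that fits an ordinary least-squares straight line to the first eleven values from the question. This is a simplification: Prophet's trend is piecewise linear with seasonal terms on top, not a single line, and `fit_line` is a hypothetical helper, not part of fbprophet. It does reproduce the same effect, though.

```python
# First eleven 'y' values from the question, indexed by day 0..10.
y = [2, 2, 2, 2, 4, 4, 6, 19, 27, 34, 34]
x = list(range(len(y)))


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs.

    Hypothetical helper for illustration; this is not Prophet's trend model.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]

# Compare each observation with the value the straight line assigns to it.
for xi, yi, fi in zip(x, y, fitted):
    print(f"day {xi:2d}  observed {yi:3d}  fitted {fi:7.2f}")
```

Every observed value is positive, yet the fitted line starts below zero: a straight line through curved, fast-growing data has to undershoot the flat early part.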
There are several ways to keep your model from producing negative values. First of all, you should fit the right model. In this case you might already get a step further by fitting the logarithm of the observed values. At least you will not get negative values anymore, but be aware that for this particular case this is still too simplistic if forecasting is your goal (if this is COVID data, a linear fit to the logarithm of the cases is an incorrect model).
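A rough sketch of the log-transform idea follows, using a plain least-squares line as a stand-in for the Prophet fit (an assumption, made so the example runs without fbprophet installed). With Prophet itself the same pattern would presumably be to fit on the logarithm of `y` and then apply the exponential to `yhat`, `yhat_lower` and `yhat_upper`. The sketch assumes every value is strictly positive; data containing zeros would need something like log(1 + y) instead.

```python
import math

# First eleven 'y' values from the question, indexed by day 0..10.
y = [2, 2, 2, 2, 4, 4, 6, 19, 27, 34, 34]
x = list(range(len(y)))

# Work in log space: requires strictly positive observations.
log_y = [math.log(v) for v in y]

# Least-squares straight line through (x, log_y); stand-in for the real model.
n = len(x)
mean_x = sum(x) / n
mean_log = sum(log_y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (li - mean_log) for xi, li in zip(x, log_y))
slope = sxy / sxx
intercept = mean_log - slope * mean_x


def predict(day):
    """Back-transform the log-space line; exp() can never return a negative."""
    return math.exp(intercept + slope * day)


fitted = [predict(xi) for xi in x]
# Extrapolate 30 days past the last observation.
future = [predict(day) for day in range(len(y), len(y) + 30)]

print(f"smallest fitted value: {min(fitted):.2f}")
print(f"value 30 days ahead:   {future[-1]:.2f}")
```

The back-transformed values cannot be negative, but the 30-day extrapolation grows exponentially without bound, which is why this is still too simplistic for real forecasting.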