# How to estimate a model with both linear and exponential parameters

I have a theoretical growth function that can be perturbed by events, and I'd like to estimate the growth parameters as well as the perturbation, and the rate of falloff after that perturbation.

I'm thinking of using a logistic function to model the effect of the event and the falloff of that effect (if any).

To ground this, $x$ is time, and $t$ is the time the event occurs. Before time $t$, or if the event never occurs, we have a simple linear regression. After the event occurs, I model the contribution of the event, with its magnitude controlled by $\beta_2$ and its rate of falloff by $\beta_3$.

$$y_i = \begin{cases} \beta_0 + \beta_1 x_i + \epsilon_i, & x_i < t \\[4pt] \beta_0 + \beta_1 x_i + 2\beta_2 \dfrac{1}{1 + e^{\beta_3 (x_i - t)}} + \epsilon_i, & x_i > t \end{cases}$$

(edited to add the error term)

Here's a Desmos graph if it helps.

I'm really not sure how to estimate parameters for this model in any of the stats packages I'm familiar with in R. Do I need to turn to Bayesian methods?


To supplement Alexis' fine answer, here's an example to get you started (it assumes fairly 'nice' data).

If your data are less nice, you may need to investigate some of the options for controlling the fitting, supply better starting values (least median of squares fitting for the first two parameters might be better, for example, or estimating a lower quartile line, say), or try one of the other fitting algorithms. The model is partially linear: if you fix b3, the model is linear in the remaining parameters, so you can take advantage of that.
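As a sketch of that last point (the synthetic data and the b3 grid below are made up for illustration, not part of the original answer): for any fixed b3 the remaining parameters enter linearly, so a plain `lm()` fit profiles them out, and a one-dimensional search over b3 does the rest. R's `nls` can also exploit this structure directly via `algorithm = "plinear"`.

```r
# Profile over b3: for each candidate b3 the model is linear in
# b0, b1, b2, so lm() gives the other parameters for free.
# Synthetic data, roughly matching the model in the question:
set.seed(1)
t = 2.4
x = seq(0.5, 10, length.out = 30)
y = 5 + 0.6*x + (x > t)*2*2/(1 + exp(0.3*(x - t))) + rnorm(30, 0, 0.3)

b3.grid = seq(0.05, 1, by = 0.05)
rss = sapply(b3.grid, function(b3) {
  z = (x > t)*2/(1 + exp(b3*(x - t)))   # regressor carrying b2
  sum(resid(lm(y ~ x + z))^2)           # RSS with b0, b1, b2 profiled out
})
b3.hat = b3.grid[which.min(rss)]        # crude profile estimate of b3
```

A value like `b3.hat` can then serve as a starting value (or a sanity check) for the full nonlinear fit below.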

```r
# set up some data
Desdat = data.frame(
  x = c(0.76, 0.98, 1.24, 1.47, 2.14, 2.4, 2.56, 3.52, 3.98, 4.27,
        4.48, 4.69, 5.01, 5.32, 5.51, 5.69, 6.69, 6.97, 7.19, 7.52,
        7.73, 7.99, 8.17, 8.51, 8.74, 9.08, 9.41, 9.64, 9.83, 10.23),
  y = c(5.415, 5.737, 5.31, 6.039, 6.238, 6.05, 8.766, 8.834,
        8.731, 9.195, 9.704, 9.024, 9.427, 9.243, 10.173, 9.527,
        9.751, 10.767, 10.196, 10.142, 10.351, 10.651, 10.769,
        11.344, 11.222, 10.852, 11.484, 10.93, 11.696, 12.031)
)
t = 2.4
```

And now to find some values to start the algorithm off. If you know something about the data, you may well be able to get much better starting estimates.

```r
# Get some very rough starting estimates:
# first, a robust linear fit should get reasonably close to b0 and b1
library("MASS")
coef = rlm(y ~ x, Desdat)$coefficients
b0s = coef[1]
b1s = coef[2]
# next, a linear fit after t, with most weight near t, should get near b2 and -b3
xs  = Desdat$x[Desdat$x > t]
ys  = (Desdat$y[Desdat$x > t] - b0s - b1s*xs)/2
xs  = xs - t
xsr = diff(range(xs))
ws  = (1 - (xs/xsr))^2
coef = rlm(ys ~ xs, weights = ws)$coefficients
b2s = coef[1]
b3s = -coef[2]
```

(Those starting values for b2 and b3 may not always work well. With a little thought, it is easy to improve all of these, but this isn't intended as a treatise on good starting values.)
Now, fitting the model:

```r
Desfit = nls(y ~ b0 + b1*x + (x > t)*2*b2/(1 + exp(b3*(x - t))),
             data = Desdat,
             start = list(b0 = b0s, b1 = b1s, b2 = b2s, b3 = b3s))
summary(Desfit)
```

which produces:

```
Formula: y ~ b0 + b1 * x + (x > t) * 2 * b2/(1 + exp(b3 * (x - t)))

Parameters:
   Estimate Std. Error t value Pr(>|t|)    
b0  4.84735    0.16066  30.171  < 2e-16 ***
b1  0.63797    0.05219  12.225 2.77e-12 ***
b2  2.18062    0.24045   9.069 1.56e-09 ***
b3  0.29291    0.12021   2.437    0.022 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3121 on 26 degrees of freedom

Number of iterations to convergence: 7
Achieved convergence tolerance: 2.209e-06
```

All the parameter estimates except b1 were within 2 standard errors of the values from which they were generated, and the residual standard error was fairly close to $\sigma$.

Here's the fit:
*(figure: scatterplot of the data with the fitted curve; the linear component is shown as a dotted line)*

which was generated with:

```r
plot(y ~ x, Desdat)
with(Desdat, lines(fitted(Desfit)[x <= t] ~ x[x <= t], col = 3, lwd = 2))
with(Desdat, lines(fitted(Desfit)[x > t] ~ x[x > t], col = 3, lwd = 2))
abline(coefficients(Desfit)[1:2], lty = 3, col = 3)
```

Note that in some cases there can be strong dependence between parameters in the model (curved ridges in the likelihood in parameter space), which may cause you difficulty in some situations.
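One way to check for that dependence is to look at the estimated correlation matrix of the parameter estimates. Here is a self-contained sketch using synthetic data generated from the same functional form (the parameter values are assumptions chosen for illustration, not the Desdat fit above):

```r
# Generate data from the model, refit, then inspect the correlations
# between parameter estimates; off-diagonal entries near +/-1 flag the
# kind of ridge in the likelihood described above.
set.seed(2)
t = 2.4
x = seq(0.5, 10, length.out = 30)
y = 5 + 0.6*x + (x > t)*2*2/(1 + exp(0.3*(x - t))) + rnorm(30, 0, 0.3)
fit = nls(y ~ b0 + b1*x + (x > t)*2*b2/(1 + exp(b3*(x - t))),
          start = list(b0 = 5, b1 = 0.6, b2 = 2, b3 = 0.3))
round(cov2cor(vcov(fit)), 2)
```

The same information is available from `summary(fit, correlation = TRUE)`.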
