# Solved – Nonlinear regression: best transformation when getting very different parameter estimates

Disclaimer: Statistics is not my strong suit, so I apologize if my question is nonsense. I'm a beginner, but I really want to understand this.

My question is: why do I get such widely different parameter estimates when using different transformations of my data in a nonlinear regression?

I'm trying to do a nonlinear regression and to estimate the uncertainty of the fit (a confidence interval) using linear approximation. From my understanding, the more linear the shape of the nonlinear function, the more accurate the confidence interval from the linear approximation will be. I therefore want to transform the data to make the model as linear as possible. The errors in \$y\$ can be assumed to be log-normal. My data are monotonic and assumed to follow a power function in most cases:

\$\$ y = a (x - x_0)^b \$\$

where \$y\$ is river discharge, \$x\$ is an arbitrary water level in the river, and \$x_0\$ is the water level where discharge \$y\$ is 0. Taking logs gives a nice linear form (note the intercept is \$\log(a)\$):
\$\$ \log(y) = \log(a) + b \log(x - x_0). \$\$
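To check my own understanding of the linearization (toy numbers of my own, not the real data): with \$x_0\$ fixed, a plain `lm()` fit on the log scale should recover \$b\$ as the slope and \$\log(a)\$ as the intercept.

```r
# Toy check: build an exact power law and confirm the log-log fit is linear.
# a, b, x0 here are made-up values, not estimates from my data.
a <- 0.005; b <- 2.7; x0 <- 5
x <- seq(20, 70, by = 5)
y <- a * (x - x0)^b

fit <- lm(log(y) ~ log(x - x0))
coef(fit)          # intercept = log(a), slope = b (up to rounding)
exp(coef(fit)[1])  # back-transforms the intercept to a
```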

I need to estimate the parameters \$a\$, \$b\$ and \$x_0\$ simultaneously, so I use nonlinear regression. I also have some data that follow quadratic functions, so I would like to set up (and understand) a general nonlinear method.

I use R and `nlsLM()` from `minpack.lm` to carry out the nonlinear regression.
Here is some example code:

```r
library(minpack.lm)

xdata <- c(19, 21, 24, 25, 29, 34, 35, 40, 40, 46, 48, 48, 52, 56, 57, 65, 65, 68)
ydata <- c(10, 11, 14, 20, 24, 50, 42, 96, 89, 134, 135, 161, 171, 218, 261, 371, 347, 393)
df <- data.frame(x = xdata, y = ydata)

# weights for the case of no transformation (relative error assumed the same for all y)
W <- 1 / ydata

# NLS regression, no transformation
nlsmodel1 <- nlsLM(y ~ a * (x - x0)^b, data = df, start = list(a = 0.1, b = 2.5, x0 = 0))

# log transformed
nlsmodel2 <- nlsLM(log(y) ~ a + b * log(x - x0), data = df, start = list(a = 0.1, b = 2.5, x0 = 0))
```

Console output:

```r
> coef(nlsmodel1)
          a           b          x0
0.005158377 2.719693093 4.896772931
> coef(nlsmodel2)
        a         b        x0
-8.683758  3.445699 -4.139127
> exp(-8.683758)
[1] 0.0001693136
```
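One thing I tried while writing this up (base R only, so it may not be the right approach): since the log model is linear once \$x_0\$ is fixed, I can profile \$x_0\$ over a grid and read \$a\$ and \$b\$ off an `lm()` fit at each grid point. The grid bounds are my own guess; \$x_0\$ has to stay below \$\min(x)\$ so the log is defined.

```r
xdata <- c(19, 21, 24, 25, 29, 34, 35, 40, 40, 46, 48, 48, 52, 56, 57, 65, 65, 68)
ydata <- c(10, 11, 14, 20, 24, 50, 42, 96, 89, 134, 135, 161, 171, 218, 261, 371, 347, 393)

# Profile x0: for each candidate value (kept below min(x) = 19), the log-log
# model is linear, so fit it with lm() and record the residual sum of squares.
x0_grid <- seq(-10, 18.9, by = 0.1)
rss <- sapply(x0_grid, function(x0) {
  fit <- lm(log(ydata) ~ log(xdata - x0))
  sum(resid(fit)^2)
})

best_x0 <- x0_grid[which.min(rss)]
best_fit <- lm(log(ydata) ~ log(xdata - best_x0))
c(x0 = best_x0,
  a  = exp(unname(coef(best_fit)[1])),  # back-transform the intercept
  b  = unname(coef(best_fit)[2]))
```

This gives essentially the same answer as `nlsmodel2` (it minimizes the same log-scale sum of squares), which makes me think the difference between my two fits is about the error scale, not the optimizer.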

I understand that the weights are very important and could account for some of the difference here, but surely not this much? My judgement of the two parameter sets is that `nlsmodel1` performs "better", and that the `b` coefficient from `nlsmodel2` is too high. `nlsmodel2` does a poor job at the upper end of the data, with large residuals there. But why are the fits so different? I feel like I'm doing something very silly here, but I am unable to see the error. I have tried some other transformations, for example transforming only the LHS as `log(y)`, but the problem remains.

I'd appreciate any tips that can help me improve, and not least understand, the transformed fit.

Cheers

Related post #1 and post #2
