I wrote MATLAB code for an $R^{2}$ (R-squared) test, but it is not working as expected.

I want to test how well the Weibull distribution fits my raw data, so I want to compute $R^{2}$ (R-squared). My code is below, but the $R^{2}$ values I obtain are negative. I also tried the Gamma and Rayleigh distributions, but the code did not run at all with those.

Could you please verify this code:

    sample = N;
    h1 = histfit(sample,30,'weibull');
    xdata1 = get(h1(2), 'XData');
    ydata1 = get(h1(2), 'YData');
    f = fittype('weibull');
    [c2, gof] = fit(xdata1',ydata1',f)

Output

c2 =

    General model Weibull:
    c2(x) = a*b*x^(b-1)*exp(-a*x^b)
    Coefficients (with 95% confidence bounds):
      a = 6.787e-05  (-0.3653, 0.3654)
      b = 9.961  (-5429, 5449)

gof =

    sse: 4.9879e+07
    rsquare: -1.6634
    dfe: 98
    adjrsquare: -1.6906
    rmse: 713.4243


#### Best Answer

It is pretty clear that the estimation procedure fails here. No half-reasonable fit results in a negative $R^2$.
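For reference, this is the definition behind that statement, with the numbers from the `gof` output in the question plugged in. The total sum of squares $SST$ is back-computed here from the reported `sse` and `rsquare`; MATLAB does not print it.

$$R^2 = 1 - \frac{SSE}{SST}, \qquad SST = \frac{SSE}{1 - R^2} \approx \frac{4.9879 \times 10^{7}}{2.6634} \approx 1.873 \times 10^{7}$$

So $R^2 < 0$ exactly when $SSE > SST$: the fitted curve has about 2.66 times the squared error of a horizontal line at the mean of `ydata1`.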

Based on your last comment, and with *quite a bit* of guesswork, your `ydata1` roughly corresponds to points on the curve one would get from binning 5000 points into 30 bins. That means you are almost surely looking at values of magnitude 200 or more; a standard *fit-a-distro* algorithm will not work on that, because such a scale is very unlikely for density values. Scale your data (divide by 1000 or something similar) and try again. Alternatively, look into providing *reasonable* starting values to your algorithm. See the following example, where I use as data the absolute values of a sample from a simple $N(0,1)$:

    rng(1234); % Fix your random seed
    sample = abs(randn(5000,1));
    h1 = histfit(sample,30,'weibull');
    xdata1 = get(h1(2), 'XData');
    ydata1 = get(h1(2), 'YData');
    f = fittype('weibull');
    [c, gof] = fit(xdata1',ydata1',f);
    [c_scaled, gof_scaled] = fit(xdata1',ydata1'/1000,f);

    >> gof.rsquare
    ans = -0.7902
    >> gof_scaled.rsquare
    ans = 0.8287

For the unscaled data the estimation is horrible; I get a negative $R^2$ similar to yours. For the scaled data, though, the estimation is quite reasonable (well, as reasonable as fitting a Weibull distribution to an $|N(0,1)|$ sample can be). Try to appreciate what you are trying to do not only conceptually but also computationally. Estimation algorithms aren't black boxes, and a trust-region algorithm can only take you so far. Always (really *ALWAYS*) plot your data so you have a sense of what you are trying to do.
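The starting-values alternative mentioned above could be sketched roughly as follows. This is only a sketch under two assumptions of mine, not something tested on your data: it rescales the curve by the area under it (via `trapz`) instead of an arbitrary 1000, and it takes the starting point from `wblfit` on the raw sample, converted to the `a`, `b` parameterization that `fittype('weibull')` uses. The names `ynorm`, `p0`, `a0`, `b0`, `c_n` and `gof_n` are made up for illustration.

```matlab
% Sketch only: ynorm, p0, a0, b0, c_n, gof_n are illustrative names.
rng(1234);                                % same seed as in the example above
sample = abs(randn(5000,1));
h1 = histfit(sample,30,'weibull');
xdata1 = get(h1(2), 'XData');
ydata1 = get(h1(2), 'YData');

% Assumption: rescale the curve so it integrates to 1, like a density.
ynorm = ydata1 / trapz(xdata1, ydata1);

% wblfit returns [A B] (scale, shape) for (B/A)*(x/A)^(B-1)*exp(-(x/A)^B).
% The Curve Fitting Toolbox model is a*b*x^(b-1)*exp(-a*x^b),
% so b = B and a = A^(-B).
p0 = wblfit(sample);
a0 = p0(1)^(-p0(2));
b0 = p0(2);

f = fittype('weibull');
[c_n, gof_n] = fit(xdata1', ynorm', f, 'StartPoint', [a0 b0]);
disp(gof_n.rsquare)
```

Keep in mind that `h1(2)` is the curve `histfit` has already fitted, so an $R^2$ computed this way describes how well `fit` reproduces that curve, not how well a Weibull describes the raw sample.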