I'm working with an organization that is using JMP in their analysis, and I can't tell from the description in JMP's help files if the test for goodness of fit in their logistic regression is the Hosmer-Lemeshow test. If it makes a difference, my data set has only one predictor variable, so we aren't considering complex models.

> The next questions that JMP addresses are whether there is enough information using the variables in the current model or whether more complex terms need to be added. The Lack of Fit test, sometimes called a Goodness of Fit test, provides this information. It calculates a pure-error negative log-likelihood by constructing categories for every combination of the regressor values in the data (Saturated line in the Lack Of Fit table), and it tests whether this log-likelihood is significantly better than the Fitted model.
>
> The Saturated degrees of freedom is m–1, where m is the number of unique populations. The Fitted degrees of freedom is the number of parameters not including the intercept. For the Ingots example, these are 18 and 2 DF, respectively. The Lack of Fit DF is the difference between the Saturated and Fitted models, in this case 18–2=16.
>
> The Lack of Fit table lists the negative log-likelihood for error due to Lack of Fit, error in a Saturated model (pure error), and the total error in the Fitted model. Chi-square statistics test for lack of fit.

(I'm an engineer, not a statistician, and the parts about the Saturated line are confusing me.)


#### Best Answer

That's the likelihood ratio goodness-of-fit test for contingency tables. The saturated model has a parameter for every cell ("combination of regressor values"), so it fits the data as well as possible, and you're testing to see if that's significantly better than your model. But you need a few counts in each cell for the test statistic (the deviance) to have roughly a chi-square distribution, so that ought to be mentioned somewhere. The approximation's not going to be much use when you have a continuous regressor, where nearly every observation ends up in a cell of its own.
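To make the Saturated line concrete, here is a minimal sketch of that calculation in plain Python. The data are made up (one predictor, five cells), and `fit_logistic` and `neg_loglik` are illustrative helpers, not JMP's implementation. It fits the one-predictor logistic model by Newton-Raphson, computes the negative log-likelihood of the fitted and saturated models, and forms the lack-of-fit chi-square and DF the way the quoted passage describes them.

```python
import math

# Hypothetical grouped data: one predictor, m = 5 unique populations (cells).
x = [1.0, 2.0, 3.0, 4.0, 5.0]   # regressor value in each cell
n = [20, 20, 20, 20, 20]        # number of trials in each cell
y = [2, 5, 9, 14, 18]           # number of events in each cell


def fit_logistic(x, y, n, iters=50):
    """Newton-Raphson fit of logit(p) = b0 + b1*x to grouped binomial data."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, ni in zip(x, y, n):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            r = yi - ni * p            # residual: observed minus expected events
            w = ni * p * (1.0 - p)     # binomial variance weight
            g0 += r
            g1 += r * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (X'WX) * delta = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def neg_loglik(y, n, p):
    """Binomial negative log-likelihood, dropping the constant binomial coefficients."""
    total = 0.0
    for yi, ni, pi in zip(y, n, p):
        if yi > 0:
            total -= yi * math.log(pi)
        if yi < ni:
            total -= (ni - yi) * math.log(1.0 - pi)
    return total


b0, b1 = fit_logistic(x, y, n)
p_fitted = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
p_saturated = [yi / ni for yi, ni in zip(y, n)]   # one parameter per cell

nll_fitted = neg_loglik(y, n, p_fitted)           # "Fitted" line
nll_saturated = neg_loglik(y, n, p_saturated)     # "Saturated" line (pure error)
nll_lack_of_fit = nll_fitted - nll_saturated      # "Lack Of Fit" line

chi2_stat = 2.0 * nll_lack_of_fit                 # the deviance
saturated_df = len(x) - 1                         # m - 1
fitted_df = 1                                     # parameters excluding the intercept
lack_of_fit_df = saturated_df - fitted_df

print("chi-square:", chi2_stat, "DF:", lack_of_fit_df)
```

A p-value would come from the upper tail of a chi-square distribution with `lack_of_fit_df` degrees of freedom (for instance `scipy.stats.chi2.sf`, if SciPy is available). If every row had its own distinct `x` and a single trial, `p_saturated` would be all 0s and 1s and `nll_saturated` would be exactly zero, which is the situation where the chi-square approximation stops being useful.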

If the point is to assess "whether there is enough information using the variables in the current model or whether more complex terms need to be added", a likelihood-ratio test between the current model & one with those more complex terms included will be more relevant anyway.
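As a rough sketch of that alternative, the snippet below compares a linear logistic model with one that adds a quadratic term, again on made-up grouped data. The choice of a quadratic term as the "more complex term", the centered predictor, and the helpers `solve` and `fit_nll` are all assumptions for illustration. With one extra parameter the test has 1 DF, and the chi-square upper tail for 1 DF can be written with `math.erfc`.

```python
import math

# Hypothetical grouped data with one (centered) predictor.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
n = [20, 20, 20, 20, 20]
y = [2, 5, 9, 14, 18]


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, k):
            f = M[r][c] / M[c][c]
            for j in range(c, k + 1):
                M[r][j] -= f * M[c][j]
    z = [0.0] * k
    for r in reversed(range(k)):
        z[r] = (M[r][k] - sum(M[r][j] * z[j] for j in range(r + 1, k))) / M[r][r]
    return z


def fit_nll(rows, y, n, iters=50):
    """Newton-Raphson logistic fit; returns the minimized negative log-likelihood."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xr, yi, ni in zip(rows, y, n):
            eta = sum(b * v for b, v in zip(beta, xr))
            p = 1.0 / (1.0 + math.exp(-eta))
            for a in range(k):
                grad[a] += (yi - ni * p) * xr[a]
                for c in range(k):
                    hess[a][c] += ni * p * (1.0 - p) * xr[a] * xr[c]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    nll = 0.0
    for xr, yi, ni in zip(rows, y, n):
        eta = sum(b * v for b, v in zip(beta, xr))
        nll -= yi * eta - ni * math.log(1.0 + math.exp(eta))
    return nll


rows_linear = [[1.0, xi] for xi in x]               # intercept + x
rows_quadratic = [[1.0, xi, xi * xi] for xi in x]   # intercept + x + x^2

nll_linear = fit_nll(rows_linear, y, n)
nll_quadratic = fit_nll(rows_quadratic, y, n)

lr_stat = 2.0 * (nll_linear - nll_quadratic)        # likelihood-ratio chi-square
lr_df = 1                                           # one extra parameter
# Upper tail of a chi-square with 1 DF: P(Z^2 > s) = erfc(sqrt(s / 2)).
p_value = math.erfc(math.sqrt(max(lr_stat, 0.0) / 2.0))

print("LR chi-square:", lr_stat, "DF:", lr_df, "p-value:", p_value)
```

Unlike the comparison against the saturated model, this test does not depend on having several counts per cell, because the two models differ by a fixed, small number of parameters.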