I am trying to evaluate a logistic regression with new data using R. I fitted the logistic regression on the old data and evaluated it using chi^2 tests etc., i.e.
model <- glm(outcome ~ input + 0, data = training.data, family = binomial())
(Note that I do need the intercept fixed at 0.) I then computed the predicted probabilities for the new data:
predicted.probabilities <- predict(model, test.data, type = "response")
How can I now perform the same sort of tests with these new probabilities, and what are the best tests to use here?
IMO, the following two tests are a good start:
1) A linear regression of the test outcomes on the predicted probabilities. If your predictions are correct on average ("calibrated"), the fitted line should be close to the 45° line, i.e. an intercept near 0 and a slope near 1.
2) Two separate histograms of the predicted probabilities: one for the test cases with y = 1, and one for the test cases with y = 0. If your model's predictions are sharp, the two histograms should differ a lot, with the former concentrated on high predicted probabilities and the latter on low ones.
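Both checks can be sketched in a few lines of R. This is only a rough illustration: the simulated data, the coefficient 1.5, and the names calibration and calibration.coefs are assumptions made up for the example; model, test.data and predicted.probabilities follow the question. It also assumes the observed outcomes are stored in test.data$outcome.

```r
# Simulated stand-in data (an assumption for illustration, not the asker's data).
set.seed(1)
n <- 2000
training.data <- data.frame(input = rnorm(n))
training.data$outcome <- rbinom(n, 1, plogis(1.5 * training.data$input))
test.data <- data.frame(input = rnorm(n))
test.data$outcome <- rbinom(n, 1, plogis(1.5 * test.data$input))

# Model and out-of-sample predictions, as in the question.
model <- glm(outcome ~ input + 0, data = training.data, family = binomial())
predicted.probabilities <- predict(model, test.data, type = "response")

# 1) Calibration: linear regression of test outcomes on predicted probabilities.
#    An intercept near 0 and a slope near 1 are consistent with calibration.
calibration <- lm(test.data$outcome ~ predicted.probabilities)
calibration.coefs <- coef(calibration)
print(calibration.coefs)

# 2) Sharpness: one histogram per observed outcome, on a common set of bins.
bins <- seq(0, 1, by = 0.05)
op <- par(mfrow = c(1, 2))
hist(predicted.probabilities[test.data$outcome == 1], breaks = bins,
     main = "Test cases with y = 1", xlab = "Predicted probability")
hist(predicted.probabilities[test.data$outcome == 0], breaks = bins,
     main = "Test cases with y = 0", xlab = "Predicted probability")
par(op)
```

With real data, you would drop the simulation block and use your own training.data and test.data. Using the same bins for both histograms makes them easier to compare by eye.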
For more stuff along these lines, see Chapter 3 of