The $R^2$ of a model measures how well the model fits the data and is a measure of the shared variation between two (or more) variables. Its analogue for logistic regression is the pseudo-$R^2$. A pseudo-$R^2$ is sometimes presented alongside the area under the receiver operating characteristic (ROC) curve as a measure of a model's predictive accuracy.
I'm curious as to whether there is any straightforward relationship between these two metrics. Does a model with a higher pseudo-$R^2$ necessarily have a larger AUC ROC? Are there any situations where a model can have a low pseudo-$R^2$ but a high AUC ROC? It seems intuitive that the two measures are necessarily correlated, but I've been wrong many times in the past.
Best Answer
The AUC is scale independent. It is based solely on ranks. If you multiply all the probabilities output by your logistic regression by the same factor $\lambda \in (0,1]$, the AUC will remain the same. Note that as $\lambda \rightarrow 0$ the pseudo-$R^2$ will decrease (possibly becoming negative).
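To make the second claim concrete, here is a sketch using McFadden's pseudo-$R^2$. That choice is an assumption on my part, since the question does not say which variant is meant. Write $\hat p_i$ for the fitted probabilities, $\ell_0 < 0$ for the log-likelihood of the intercept-only model, and $\ell(\lambda)$ for the log-likelihood of the scaled predictions:

$$
R^2_{\text{McF}}(\lambda) = 1 - \frac{\ell(\lambda)}{\ell_0},
\qquad
\ell(\lambda) = \sum_{i:\,y_i = 1} \log\left(\lambda \hat p_i\right) + \sum_{i:\,y_i = 0} \log\left(1 - \lambda \hat p_i\right).
$$

As $\lambda \rightarrow 0$, the first sum goes to $-\infty$ (provided there is at least one positive case) while the second goes to $0$, so $\ell(\lambda) \rightarrow -\infty$. Since $\ell_0$ is fixed and negative, $R^2_{\text{McF}}(\lambda) \rightarrow -\infty$, whereas the ordering of the $\lambda \hat p_i$, and hence the AUC, does not change.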
So you can have a low pseudo-$R^2$ but a large AUC.
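If you want to check this numerically, the following is a minimal sketch on synthetic data, not a definitive implementation. It assumes McFadden's pseudo-$R^2$ (one of several variants), and the helper names `auc`, `log_lik` and `mcfadden_r2` are made up for this illustration. The "model" probabilities are simply drawn at random and the labels are sampled from them, so the unscaled predictions are well calibrated by construction.

```python
import math
import random


def auc(y, p):
    """Rank-based AUC: chance that a random positive outscores a random negative (ties count 1/2)."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def log_lik(y, p):
    """Bernoulli log-likelihood of labels y under predicted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1 - pi) for yi, pi in zip(y, p))


def mcfadden_r2(y, p):
    """McFadden pseudo-R^2: 1 - LL(model) / LL(null), null model = overall base rate."""
    base_rate = sum(y) / len(y)
    ll_null = log_lik(y, [base_rate] * len(y))
    return 1 - log_lik(y, p) / ll_null


random.seed(0)
# Synthetic, hypothetical data: labels are sampled from p, so p is calibrated.
p = [random.uniform(0.05, 0.95) for _ in range(500)]
y = [1 if random.random() < pi else 0 for pi in p]

results = {}
for lam in (1.0, 0.5, 0.1, 0.01):
    scaled = [lam * pi for pi in p]  # same ranking, shrunken probabilities
    results[lam] = (auc(y, scaled), mcfadden_r2(y, scaled))
    print(f"lambda={lam:<5} AUC={results[lam][0]:.4f}  pseudo-R2={results[lam][1]:.3f}")
```

The AUC column should stay constant across all values of `lam`, because only the ordering of the scores matters, while the pseudo-$R^2$ column should fall as `lam` shrinks and eventually go negative.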