Solved – what is this measure called? MSE divided by the variance of the dependent variable

When evaluating a regression model with cross-validation, I thought that a meaningful measure would be the MSE divided by the MSE of the null model, which consists of always predicting the mean:

$\frac{\hat E[(y-\hat{y})^2]}{\hat E[(y-\bar{y})^2]}$.

This is 1 if the model does not add anything, 0 if the prediction is perfect, and can even be greater than 1 if the model is actively harmful. To make it more interpretable, I can flip it around:

$1 - \frac{\hat E[(y-\hat{y})^2]}{\hat E[(y-\bar{y})^2]}$.

This can even be negative if the prediction is worse than the null model.
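The flipped measure is simple to compute; a minimal sketch in NumPy (the function name `q_squared` and the toy data are my own; in actual cross-validation, `y_pred` would be the out-of-fold predictions and the mean would ideally come from the training folds):

```python
import numpy as np

def q_squared(y_true, y_pred):
    """1 - MSE(model) / MSE(null model that always predicts the mean).

    Equals 1 for perfect prediction, 0 when the model is no better than
    the mean, and is negative when the model is actively harmful.
    """
    mse_model = np.mean((y_true - y_pred) ** 2)
    mse_null = np.mean((y_true - np.mean(y_true)) ** 2)
    return 1.0 - mse_model / mse_null

y = np.array([1.0, 2.0, 3.0, 4.0])
assert q_squared(y, y) == 1.0                      # perfect prediction
assert q_squared(y, np.full(4, y.mean())) == 0.0   # the null model itself
assert q_squared(y, y + 10) < 0                    # worse than the mean
```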

I have seen people use

$1 - \frac{\mathrm{var}(y-\hat{y})}{\mathrm{var}(y)}$

and call it explained variance, but this seems too generous to the model, as it does not penalize additive or multiplicative biases.
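The difference is easy to demonstrate with an additively biased predictor: the constant offset cancels in the residual variance but stays in the MSE. A sketch (synthetic data of my own construction, using the `q_squared`-style formula from above inline):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(size=1000)
y_hat = y + 5.0  # otherwise perfect predictions with a constant additive bias

# Explained variance: the constant offset drops out of var(y - y_hat),
# so the bias goes completely unpunished.
explained_var = 1.0 - np.var(y - y_hat) / np.var(y)

# MSE-based measure: the squared bias stays in the numerator.
q2 = 1.0 - np.mean((y - y_hat) ** 2) / np.mean((y - y.mean()) ** 2)

assert explained_var > 0.999  # looks essentially perfect despite the offset
assert q2 < 0                 # heavily penalized, worse than the null model
```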

What is the measure I have above called? Is there a reason why it is not used, or have I just missed the relevant examples?

Thanks to Mario Figueiredo, who commented on my blog, I found out the answer: this measure is called $R^2_{CV}$ or $Q^2$.

Reference: Nguyen T. Quan, "The Prediction Sum of Squares as a General Measure for Regression Diagnostics," Journal of Business & Economic Statistics, Vol. 6, No. 4 (Oct. 1988), pp. 501–504.
