# Solved – How to use the PRESS statistic for model selection

I am confused about how to use the PRESS statistic to compare models. I understand that the PRESS statistic is calculated by summing the squares of the residuals:

$$\text{PRESS} = \sum_{i=1}^n (y_i - \hat{y}_{i, -i})^2$$

where the residual is the difference between the observed and predicted value for the $$i$$-th data point, with the prediction coming from a model trained on the data with the $$i$$-th data point removed. My confusion is that a new regression equation (hence a new model) is estimated each time a data point is dropped, so $$n$$ different models are trained in the process of calculating PRESS, and the final PRESS statistic is not tied to a single model. In that case, how can you use the PRESS statistic to compare two different models? How do you calculate a PRESS statistic for a given regression model? I think I am making a basic mistake somewhere here, but I am not sure where my reasoning is off. Thanks for any help.


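PRESS is a property of a model specification (a choice of predictors and functional form), not of any single set of fitted coefficients. You calculate it for a model trained on $$n$$ values to get an idea of its out-of-sample performance, by leaving out one sample at a time, refitting the same specification on the remaining $$n-1$$ values, and predicting the held-out point. So while you do end up fitting $$n$$ models to determine the statistic, they are only a device for estimating prediction error; the model you eventually use is the original one trained on all $$n$$ values. To compare two candidate models, compute PRESS for each on the same data and prefer the one with the smaller value.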
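As a rough illustration of that procedure, here is a minimal Python sketch, assuming ordinary least squares and NumPy. The data are synthetic, and the names `press_loo` and `press_hat` are made up for this example. The first function refits the model $$n$$ times exactly as in the definition; the second uses the standard OLS identity that the leave-one-out residual equals $$e_i / (1 - h_{ii})$$, where $$e_i$$ is the ordinary residual and $$h_{ii}$$ the leverage, so a single fit on all $$n$$ points is enough.

```python
import numpy as np

def press_loo(X, y):
    """PRESS by explicit leave-one-out: refit OLS n times, one row held out each time."""
    n = len(y)
    total = 0.0
    for i in range(n):
        keep = np.arange(n) != i                      # mask that drops row i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        total += (y[i] - X[i] @ beta) ** 2            # squared error on the held-out point
    return float(total)

def press_hat(X, y):
    """Same quantity from one fit on all n rows, using the leverages h_ii (OLS only)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta                              # ordinary residuals e_i
    h = np.diag(X @ np.linalg.pinv(X))                # diagonal of the hat matrix
    return float(np.sum((resid / (1.0 - h)) ** 2))

# Synthetic data (an assumption for illustration): a noisy straight line.
rng = np.random.default_rng(0)
n = 40
x = rng.uniform(-2, 2, size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)

# Two candidate specifications evaluated on the same data.
X_linear = np.column_stack([np.ones(n), x])           # intercept + x
X_poly = np.column_stack([x**k for k in range(6)])    # degree-5 polynomial

press_linear = press_loo(X_linear, y)
press_poly = press_loo(X_poly, y)
print("PRESS, linear model:    ", press_linear)
print("PRESS, degree-5 model:  ", press_poly)
print("PRESS, linear, shortcut:", press_hat(X_linear, y))
```

Each specification gets one PRESS value, and those values are what you compare. Whichever specification is chosen is then fit once on all $$n$$ points for actual use.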