I'm trying to use k-fold cross-validation for model selection for a mixed-effects model (fitted with the `lme` function).
But, what exactly do I use as the score for each fold? Presumably I don't just fit each candidate model to the validation subset, calculating new coefficients based on the new data. If I understand correctly, I'm supposed to score the models according to how well a model with coefficients calculated using the training data fits the validation data.
But how does one calculate AIC, BIC, logLik, adjusted $R^2$, etc. on an artificial model that gets its coefficients from one source and its data from another? With so many people advocating cross-validation, I thought there would be more information and code available for calculating the scores by which models will be compared. I can't be the first one trying to cross-validate `lme` fits in R, yet I see absolutely nothing about what to use as the score… how does everyone else do this? What am I overlooking?
Best Answer
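I've mostly seen cross-validation used in a machine-learning context, where one thinks in terms of a loss function that one is trying to minimize. The natural loss function associated with linear models is mean squared error. On a fixed data set it ranks models the same way $R^2$ does, because $R^2 = 1 - \mathrm{SSE}/\mathrm{SST}$ and the total sum of squares $\mathrm{SST}$ does not depend on the model. Calculating this for test data is very simple: fit the model on the training folds, predict the held-out fold, and average the squared prediction errors.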
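As a rough sketch of that scoring loop for `lme` fits (not a definitive recipe), the code below uses the `Orthodont` data that ships with nlme. The two candidate formulas, the random-intercept structure, the choice of 5 folds, and the row-wise fold assignment are all assumptions made for illustration. Predictions use `level = 0`, i.e. fixed effects only; whether population-level or subject-level predictions are the right thing to score depends on whether you want to predict for new or already-seen groups.

```r
library(nlme)  # provides lme() and the Orthodont example data

set.seed(1)
dat <- as.data.frame(Orthodont)  # columns: distance, age, Subject, Sex
k <- 5

# Randomly assign every row to one of k folds (assumption: row-wise folds;
# splitting by Subject instead would test prediction for unseen subjects).
fold <- sample(rep(seq_len(k), length.out = nrow(dat)))

# One row per fold, one column per candidate model
fold_mse <- matrix(NA_real_, nrow = k, ncol = 2,
                   dimnames = list(NULL, c("age_only", "age_sex")))

for (i in seq_len(k)) {
  train <- dat[fold != i, ]
  test  <- dat[fold == i, ]

  # Coefficients come from the training rows only
  fit_a <- lme(distance ~ age, random = ~ 1 | Subject,
               data = train, method = "ML")
  fit_b <- lme(distance ~ age + Sex, random = ~ 1 | Subject,
               data = train, method = "ML")

  # Population-level (fixed-effects-only) predictions for the held-out rows
  pred_a <- predict(fit_a, newdata = test, level = 0)
  pred_b <- predict(fit_b, newdata = test, level = 0)

  fold_mse[i, "age_only"] <- mean((test$distance - pred_a)^2)
  fold_mse[i, "age_sex"]  <- mean((test$distance - pred_b)^2)
}

# Cross-validated score per candidate: lower is better
cv_scores <- colMeans(fold_mse)
print(cv_scores)
```

The candidate with the smallest cross-validated mean squared error would be preferred under this criterion.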
You could also use other criteria (mean absolute error, rank correlation, etc.). However, since the linear model is fitted by minimizing squared error (equivalently, maximizing $R^2$ on the training data), it might be advisable in that case to try a different model that directly optimizes whatever criterion you chose (e.g. median regression, a special case of quantile regression, for the mean absolute error).
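For illustration, these alternative scores can be written as small base-R helpers that take held-out observations and predictions. The function names and the toy vectors below are made up for this sketch. Note that rank correlation is a score where higher is better, unlike the two error measures.

```r
# Hypothetical helper names; each takes observed values and predictions
mse_score <- function(y_obs, y_hat) mean((y_obs - y_hat)^2)
mae_score <- function(y_obs, y_hat) mean(abs(y_obs - y_hat))
rank_cor_score <- function(y_obs, y_hat) cor(y_obs, y_hat, method = "spearman")

# Toy held-out data (invented numbers, only to show the calls)
y_obs <- c(1, 2, 3, 4)
y_hat <- c(1.5, 1.8, 3.5, 5)

mse_val <- mse_score(y_obs, y_hat)       # mean squared error
mae_val <- mae_score(y_obs, y_hat)       # mean absolute error
rho_val <- rank_cor_score(y_obs, y_hat)  # Spearman rank correlation
```

Any of these can replace the squared-error line inside a cross-validation loop without changing the rest of the procedure.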