Could somebody provide a nice code example showing how best to implement an outer cross-validation loop using the caret package in R? The package provides a convenient trainControl() function to adjust the inner cross-validation. However, I would like to embed this in multiple outer cross-validation cycles to get a more stable estimate of the prediction performance of the fitted models.
Best Answer
Inner and outer CV (nested CV) are used to select between classifiers, not to get a more stable estimate of prediction performance. To get a more stable estimate, use repeated CV. For example, to perform 5-fold CV repeated 10 times, use:
    trainControl(method = "repeatedcv",
                 number = 5,
                 ## repeated ten times
                 repeats = 10)
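To show where that control object goes, here is a minimal sketch of a complete call. The iris data set and method = "knn" are stand-ins chosen only for illustration (they are not from the original question), and the seed is arbitrary.

```r
library(caret)

set.seed(1)  # arbitrary seed so the resampling is reproducible

# 5-fold CV repeated 10 times: 50 resamples in total
ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 10)

# Assumption: iris and k-nearest neighbours are illustrative placeholders
fit <- train(Species ~ ., data = iris, method = "knn", trControl = ctrl)

# Per-resample performance for the selected tuning value
head(fit$resample)

# Performance averaged over all resamples, one row per candidate tuning value
fit$results
```

The spread of Accuracy across the rows of fit$resample gives a feel for how stable the estimate is.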
But if what you really want is nested CV (for example, to select between a random forest and an SVM), then as far as I know you have to do the outer CV explicitly. What I did for an outer 5-fold, inner 10-fold CV was:
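    ntrain <- length(ytrain)
    # Outer 5-fold split: row indices of each outer training set
    train.ext <- createFolds(ytrain, k = 5, returnTrain = TRUE)
    # The outer test sets are the rows left out of each training set
    test.ext <- lapply(train.ext, function(x) (1:ntrain)[-x])

    for (i in 1:5) {
      # Inner 10-fold CV sees only the outer training rows (note the comma: row subset)
      model <- train(Class ~ ., data = training[train.ext[[i]], ],
                     trControl = trainControl(method = "cv", number = 10),
                     ...
      ...
    }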
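The loop above is abbreviated, so here is one way it could be filled in, as a rough sketch rather than the original code. It assumes the iris data set, two illustrative candidate methods ("knn" and "rpart", the latter needing the rpart package), and plain accuracy as the outer score. The names outer_train, inner_ctrl and outer_acc are made up for this example.

```r
library(caret)

set.seed(1)  # arbitrary seed for reproducible folds
y <- iris$Species

# Outer 5-fold split; returnTrain = TRUE gives the training row indices
outer_train <- createFolds(y, k = 5, returnTrain = TRUE)

# Inner 10-fold CV, used by train() for tuning
inner_ctrl <- trainControl(method = "cv", number = 10)

# Assumption: illustrative candidates, not the models from the answer
methods <- c("knn", "rpart")

outer_acc <- matrix(NA_real_,
                    nrow = length(outer_train), ncol = length(methods),
                    dimnames = list(names(outer_train), methods))

for (i in seq_along(outer_train)) {
  train_set <- iris[outer_train[[i]], ]
  test_set  <- iris[-outer_train[[i]], ]  # rows held out of the outer fold

  for (m in methods) {
    # Tuning happens inside train(), on the outer training rows only
    fit <- train(Species ~ ., data = train_set, method = m,
                 trControl = inner_ctrl)

    # Score the tuned model on the untouched outer test rows
    pred <- predict(fit, newdata = test_set)
    outer_acc[i, m] <- mean(pred == test_set$Species)
  }
}

outer_acc           # one accuracy per outer fold and method
colMeans(outer_acc) # average outer accuracy per method
```

Because the outer test rows never take part in tuning, the column means compare the candidates without the optimistic bias of the inner CV scores.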