# Solved – the difference between different kinds of cross-validation methods

I am using MATLAB and found one of its functions, `crossvalind`, which provides several methods for cross-validation, including some that I wasn't able to find on this site.

``[TRAIN,TEST] = crossvalind('LeaveMOut',N,M), where M is an integer, returns logical index vectors for cross-validation of N observations by randomly selecting M of the observations to hold out for the evaluation set. M defaults to 1 when omitted. Using LeaveMOut cross-validation within a loop does not guarantee disjointed evaluation sets. Use K-fold instead. ``

It seems that this kind of cross-validation can be used in a loop, but it cannot guarantee disjoint evaluation sets. I think it has the advantage that its partition can be redrawn on every iteration of the loop, rather than being fixed before the loop as in k-fold cross-validation. This means that more 'dynamics' could be tested with this kind of cross-validation. But I am not sure about leave-m-out cross-validation: could anyone state the difference between leave-m-out cross-validation and k-fold cross-validation, what their advantages and disadvantages are, and when we should choose one rather than the other?

To @a.desantos

In k-fold cross-validation, over the whole set of loops each sample is in the test set exactly once, but in leave-M-out cross-validation a sample can appear in more than one test set.
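The distinction above can be checked with a small sketch. It is written in Python with only the standard library, not MATLAB, and it does not call `crossvalind`; the helper names (`kfold_test_sets`, `leave_m_out_test_sets`, `count_appearances`) are hypothetical and only mimic the behavior described in the documentation excerpt, assuming n = 40, k = 10, and m = 4.

```python
import random
from collections import Counter

def kfold_test_sets(n, k, rng):
    """Shuffle the indices once, then split them into k disjoint test sets."""
    indices = list(range(n))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]

def leave_m_out_test_sets(n, m, repeats, rng):
    """Draw a fresh random test set of size m on every iteration."""
    return [rng.sample(range(n), m) for _ in range(repeats)]

def count_appearances(test_sets, n):
    """Count how often each of the n samples lands in a test set."""
    counts = Counter(i for test in test_sets for i in test)
    return [counts.get(i, 0) for i in range(n)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n, k, m = 40, 10, 4

kfold_counts = count_appearances(kfold_test_sets(n, k, rng), n)
lmo_counts = count_appearances(leave_m_out_test_sets(n, m, k, rng), n)

# k-fold: every sample is tested exactly once.
print(set(kfold_counts))
# Leave-m-out in a loop: counts may be 0, 1, 2, ... depending on the draws.
print(sorted(set(lmo_counts)))
```

With independent draws, some samples may never be tested while others are tested several times, which is what the warning about non-disjoint evaluation sets refers to.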

Edit 1

Some new questions inspired by @a.desantos: which m and n should I consider? Given that 10-fold cross-validation is conventionally considered a good choice, I would choose m = 4 when n = 40, so that each test set holds one tenth of the data; the number of combinations C(40,4) is 91,390. Should I generate all 91,390 different combinations, run 91,390 tests, and then average the performance of the specific learning algorithm?
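For reference, the count quoted above can be reproduced with a short Python sketch (standard library only). The variable names are illustrative, and enumerating the splits lazily is just one possible way to avoid holding all of them in memory, not a recommendation on how many to evaluate.

```python
import itertools
import math

n, m = 40, 4

# Number of distinct test sets of size m drawn from n observations: C(n, m).
n_splits = math.comb(n, m)
print(n_splits)  # 91390

# The splits can be generated lazily, one at a time, instead of all at once.
first_three = list(itertools.islice(itertools.combinations(range(n), m), 3))
print(first_three)  # [(0, 1, 2, 3), (0, 1, 2, 4), (0, 1, 2, 5)]
```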

To @Frank Harrell

Edit 2

You mean that in 10-fold cross-validation the train-test routine is run 10 times, because the data is split into ten disjoint parts, and that this produces one average test error, E1, which is not reliable on its own? And that I then need to repeat this whole 10-fold routine about 5-10 times, with a new split each time, and average all the E1s to get another test error, E2, which is the final average error and is more reliable than E1. Is that right?
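The procedure I am describing could be sketched as follows, in Python with only the standard library. Everything here is an assumption for illustration: the data are synthetic, the "model" is a placeholder that predicts the training mean, and `kfold_error` is a hypothetical helper, not a function from any library.

```python
import random
from statistics import mean

def kfold_error(y, k, rng):
    """One k-fold run: return E1, the mean test error over the k folds."""
    indices = list(range(len(y)))
    rng.shuffle(indices)  # a new random partition on every call
    folds = [indices[i::k] for i in range(k)]
    fold_errors = []
    for test in folds:
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        # Placeholder "model": predict the mean of the training targets.
        prediction = mean(y[i] for i in train)
        fold_errors.append(mean((y[i] - prediction) ** 2 for i in test))
    return mean(fold_errors)

rng = random.Random(0)
y = [rng.gauss(0.0, 1.0) for _ in range(40)]  # synthetic targets, n = 40

repeats = 10  # repeat the whole 10-fold routine with fresh partitions
e1_values = [kfold_error(y, 10, rng) for _ in range(repeats)]
e2 = mean(e1_values)  # E2: average of the per-repetition errors E1

print(e1_values)
print(e2)
```

Each entry of `e1_values` comes from a different random partition, so the spread of those values gives a rough idea of how much a single E1 depends on the split.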
