Doing "leave one out cross validation" with a regression task is easy: you can calculate the MSE (mean squared error) even on a single sample and average the results. But what about a classification task?
Calculating the F1 score, AUC, etc. on a single sample is not possible. So is leave-one-out cross-validation possible for classification tasks?
You could just remember the prediction from each single leave-one-out step, build one confusion matrix after all cross-validation steps are done, and calculate the score from that. Is this how it is (or can be) done?
My second question: when I do leave-one-out cross-validation, I think early stopping is not possible (I can't decide when to stop based on a single sample). Is there a solution to this dilemma?
Supplement: early stopping is a method to avoid overfitting when training neural networks or gradient-boosted trees, for example. You stop training as soon as overfitting sets in, where overfitting is measured on a validation set.
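To make the supplement concrete, here is a minimal sketch of early stopping on a held-out validation set, assuming scikit-learn; `GradientBoostingClassifier` carves off a validation fraction internally and stops boosting when its score stops improving. The dataset and parameter values are illustrative, not part of the original question:

```python
# Early stopping sketch (assumes scikit-learn). GradientBoostingClassifier
# holds out validation_fraction of the training data and stops adding trees
# once the validation score has not improved for n_iter_no_change rounds.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=500, random_state=0)

clf = GradientBoostingClassifier(
    n_estimators=500,         # upper bound on boosting rounds
    validation_fraction=0.2,  # internal held-out validation split
    n_iter_no_change=10,      # patience: stop after 10 stagnant rounds
    random_state=0,
)
clf.fit(X, y)
print(clf.n_estimators_)  # number of rounds actually trained
```

The point of the dilemma above: this mechanism needs a validation set of more than one sample, which a single left-out observation cannot provide.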
Your second approach is the right one. In LOOCV (just as in k-fold CV) you predict the class / probability for each observation and save it. Then, at the end, once you have the predictions for all observations from all folds, you build your confusion matrix on the whole data and compute your metrics from it.
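The pooling described above can be sketched as follows, assuming scikit-learn; the dataset and classifier are only placeholders:

```python
# Pooled LOOCV evaluation sketch (assumes scikit-learn): collect one
# prediction per observation across all folds, then score them together.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut
from sklearn.metrics import confusion_matrix, f1_score

X, y = load_iris(return_X_y=True)
preds = np.empty_like(y)

for train_idx, test_idx in LeaveOneOut().split(X):
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])
    preds[test_idx] = model.predict(X[test_idx])  # one prediction per fold

# One confusion matrix and F1 score over all pooled predictions
cm = confusion_matrix(y, preds)
f1 = f1_score(y, preds, average="macro")
print(cm)
print(f1)
```

For probability-based metrics such as AUC, you would store `model.predict_proba(...)` per fold instead and pass the pooled scores to the metric at the end.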
As for your second question, I do not understand. Why would you want to stop early?