I read somewhere that the test set must not be used as a validation set. Why?
The validation set is only evaluated while the model parameters are fixed; learning happens only through backprop on the training batches.
So why can't I use the test data as validation data?
I presume you already understand why performance on the training set isn't representative of the actual performance of the trained model: overfitting. The parameters you learn during training are optimized to the training set. If you're not careful, you can over-optimize the parameters, leading to a model that's really, really good on the training set, but doesn't generalize to completely unseen real-world data.
The thing is, in practice the "parameters" of the model aren't the only thing you need to specify for a learning problem. You also have hyperparameters. Now, those hyperparameters might be an explicit part of the model fitting (like the learning rate), but you can also view other choices as "hyperparameters": do you choose an SVM or a neural network? If you implement early stopping, at what point do you stop?
Just like overfitting of the parameters on the training set, you can overfit the hyperparameters to the validation set. As soon as you use the results of the method on the validation set to inform how you do modeling, you now have the chance of overfitting to the training+validation set combo. Perhaps an SVM happens to do better on this particular validation set than it would in the general case.
That's the main reason people separate out the validation and test sets. If you use a set during your model fitting – even at the "hmm, that method doesn't do so well, maybe I should try …" level – the results you get on that set will not be fully indicative of the general results you'll obtain on completely new data. That's why you hold out a fraction of the data till the very end, past the point where you're making any decisions on what to do.
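To make the selection effect concrete, here is a minimal sketch in plain Python. It is a toy setup I'm assuming for illustration, not anything from a real pipeline: every "candidate model" is just a coin flipper whose true accuracy is 50%, and the function name `selection_experiment` and its parameters are made up. We pick the candidate with the best validation accuracy, then look at how that same candidate does on a separate test set.

```python
import random


def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def selection_experiment(n_candidates=200, n_val=50, n_test=5000, seed=0):
    """Pick the best of many useless models by validation accuracy.

    Hypothetical toy setup: labels are random bits and every candidate
    predicts random bits, so no candidate is truly better than 50%.
    Returns (validation accuracy of the winner, test accuracy of the winner).
    """
    rng = random.Random(seed)
    val_labels = [rng.randint(0, 1) for _ in range(n_val)]
    test_labels = [rng.randint(0, 1) for _ in range(n_test)]

    best_val_acc, best_test_acc = -1.0, None
    for _ in range(n_candidates):
        # Each candidate stands in for one hyperparameter setting or model choice.
        val_preds = [rng.randint(0, 1) for _ in range(n_val)]
        test_preds = [rng.randint(0, 1) for _ in range(n_test)]

        val_acc = accuracy(val_preds, val_labels)
        if val_acc > best_val_acc:
            # Selection uses the validation score only; the test score is
            # recorded for the winner but never influences the choice.
            best_val_acc = val_acc
            best_test_acc = accuracy(test_preds, test_labels)

    return best_val_acc, best_test_acc


val_acc, test_acc = selection_experiment()
print(f"winner's validation accuracy: {val_acc:.3f}")
print(f"winner's test accuracy:       {test_acc:.3f}")
```

The winner's validation accuracy should come out well above 50% purely because it was the maximum over many noisy scores, while its test accuracy stays near 50%. If you had selected on the test set instead, the inflated number is the one you would end up reporting.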