Solved – Test accuracy is lower than the validation accuracy in classification

I separated my data set into three parts: training, validation, and testing. I performed k-fold cross-validation using the validation set, then tested the true performance of the predictive model on the test set.

However, the classification accuracy on the test set (unseen data) is quite low – about 15% lower than the validation accuracy. Is this normal? Thanks.

It is natural. Cross-validation almost always yields optimistic error estimates: because the validation data is used (directly or indirectly) during model selection, the model partially overfits to it, while the test set remains truly unseen. That said, a 15% gap is quite large. If your sample size is big enough that the gap cannot be explained by random variation, I would suspect that your classification model is overcomplicated and is overfitting.
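To illustrate the point above, here is a minimal sketch using scikit-learn. The dataset, model, and hyperparameter grid are illustrative assumptions, not the poster's actual setup. Hyperparameters are chosen by k-fold cross-validation on the training portion, so the best CV score is an optimistic estimate; the held-out test score is the honest one.

```python
# Sketch: why the best cross-validation accuracy tends to exceed
# the accuracy measured on a truly held-out test set.
# Dataset and model are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (stand-in for the real data set).
X, y = make_classification(n_samples=300, n_features=20,
                           n_informative=5, random_state=0)

# Hold out a test set that is never touched during model selection.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Select the tree depth by 5-fold cross-validation on the training data.
search = GridSearchCV(DecisionTreeClassifier(random_state=0),
                      {"max_depth": [2, 4, 8, None]}, cv=5)
search.fit(X_train, y_train)

# best_score_ is the best mean CV accuracy found during the search –
# it is biased upward because the search picked the winner on these folds.
cv_acc = search.best_score_

# Accuracy on the untouched test set: the honest performance estimate.
test_acc = search.score(X_test, y_test)

print(f"Best CV accuracy: {cv_acc:.3f}")
print(f"Test accuracy:    {test_acc:.3f}")
```

If the test accuracy trails the CV accuracy by a wide margin even at large sample sizes, shrinking the model (here, capping `max_depth`) is the usual first remedy.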
