This may be a weird question because I don't fully understand hyperparameter tuning yet. Currently I'm using sklearn's GridSearchCV to tune the parameters of a RandomForestClassifier like this:
    gs = GridSearchCV(RandomForestClassifier(n_estimators=100, random_state=42),
                      param_grid={'max_depth': range(5, 25, 4),
                                  'min_samples_leaf': range(5, 40, 5),
                                  'criterion': ['entropy', 'gini']},
                      scoring=scoring, cv=3, refit='Accuracy', n_jobs=-1)
    gs.fit(X_Distances, Y)
    results = gs.cv_results_
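Since refit='Accuracy' only works with multi-metric scoring, the scoring argument here is presumably a dict whose keys include 'Accuracy'. A minimal sketch of what that could look like (the exact metrics are an assumption, not taken from the post):

    # hypothetical multi-metric scoring; the key 'Accuracy' must match refit='Accuracy'
    scoring = {'Accuracy': 'accuracy', 'F1_macro': 'f1_macro'}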
After that I check the gs object for best_params_ and best_score_. Now I'm using best_params_ to instantiate a RandomForestClassifier and running stratified cross-validation again to record metrics and print a confusion matrix:
    # skf is a StratifiedKFold instance; `score` is assumed to be
    # precision_recall_fscore_support from sklearn.metrics
    rf = RandomForestClassifier(n_estimators=1000, min_samples_leaf=7, max_depth=18,
                                criterion='entropy', random_state=42)
    metrics = {'accuracy': [], 'precision': [], 'recall': [], 'fscore': [], 'support': []}
    counter = 0
    print('################################################### RandomForest ###################################################')
    for train_index, test_index in skf.split(X_Distances, Y):
        X_train, X_test = X_Distances[train_index], X_Distances[test_index]
        y_train, y_test = Y[train_index], Y[test_index]
        rf.fit(X_train, y_train)
        y_pred = rf.predict(X_test)
        precision, recall, fscore, support = np.round(score(y_test, y_pred), 2)
        metrics['accuracy'].append(round(accuracy_score(y_test, y_pred), 2))
        metrics['precision'].append(precision)
        metrics['recall'].append(recall)
        metrics['fscore'].append(fscore)
        metrics['support'].append(support)
        print(classification_report(y_test, y_pred))
        matrix = confusion_matrix(y_test, y_pred)
        methods.saveConfusionMatrix(matrix, ('confusion_matrix_randomforest_distances_' + str(counter) + '.png'))
        counter = counter + 1
    meanAcc = round(np.mean(np.asarray(metrics['accuracy'])), 2) * 100
    print('meanAcc: ', meanAcc)
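As an aside, the same per-fold numbers could also be collected with sklearn's cross_validate instead of a manual loop. This is only a sketch assuming X_Distances, Y and the skf splitter from above; the macro-averaged metrics are my own choice, not from the original code:

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_validate

    rf = RandomForestClassifier(n_estimators=1000, min_samples_leaf=7, max_depth=18,
                                criterion='entropy', random_state=42)
    # cross_validate fits a fresh clone per fold and returns one score per fold and metric
    cv_res = cross_validate(rf, X_Distances, Y, cv=skf,
                            scoring=['accuracy', 'precision_macro', 'recall_macro', 'f1_macro'])
    print('mean accuracy:', cv_res['test_accuracy'].mean())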
Is this a reasonable approach, or am I getting something completely wrong?
EDIT:
I just tested the following:
    gs = GridSearchCV(RandomForestClassifier(n_estimators=100, random_state=42),
                      param_grid={'max_depth': range(5, 25, 4),
                                  'min_samples_leaf': range(5, 40, 5),
                                  'criterion': ['entropy', 'gini']},
                      scoring=scoring, cv=3, refit='Accuracy', n_jobs=-1)
    gs.fit(X_Distances, Y)
This yields best_score_ = 0.5362903225806451 at best_index_ = 28. When I check the accuracies of the 3 folds at index 28 I get:
- split0: 0.5185929648241207
- split1: 0.526686807653575
- split2: 0.5637651821862348
This leads to the mean test accuracy 0.5362903225806451, with best_params_: {'criterion': 'entropy', 'max_depth': 21, 'min_samples_leaf': 5}.
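For completeness, those per-split numbers come straight out of gs.cv_results_. With multi-metric scoring and refit='Accuracy' the relevant keys look roughly like this (the '_Accuracy' suffix assumes the scoring dict uses the name 'Accuracy'):

    best = gs.best_index_
    results = gs.cv_results_

    # per-fold test accuracy of the best parameter combination
    for split in range(3):
        print(results['split%d_test_Accuracy' % split][best])

    print(results['mean_test_Accuracy'][best])   # what best_score_ reports
    print(gs.best_params_)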
Now I run this code, which uses the best_params_ found above with a stratified 3-fold split (the same scheme GridSearchCV uses):
    rf = RandomForestClassifier(n_estimators=100, min_samples_leaf=5, max_depth=21,
                                criterion='entropy', random_state=42)
    metrics = {'accuracy': [], 'precision': [], 'recall': [], 'fscore': [], 'support': []}
    counter = 0
    print('################################################### RandomForest_Gini ###################################################')
    for train_index, test_index in skf.split(X_Distances, Y):
        X_train, X_test = X_Distances[train_index], X_Distances[test_index]
        y_train, y_test = Y[train_index], Y[test_index]
        rf.fit(X_train, y_train)
        y_pred = rf.predict(X_test)
        precision, recall, fscore, support = np.round(score(y_test, y_pred))
        metrics['accuracy'].append(accuracy_score(y_test, y_pred))
        metrics['precision'].append(precision)
        metrics['recall'].append(recall)
        metrics['fscore'].append(fscore)
        metrics['support'].append(support)
        print(classification_report(y_test, y_pred))
        matrix = confusion_matrix(y_test, y_pred)
        methods.saveConfusionMatrix(matrix, ('confusion_matrix_randomforest_distances_' + str(counter) + '.png'))
        counter = counter + 1
    meanAcc = np.mean(np.asarray(metrics['accuracy']))
    print('meanAcc: ', meanAcc)
The metrics dictionary yields exactly the same per-fold accuracies (split0: 0.5185929648241207, split1: 0.526686807653575, split2: 0.5637651821862348). However, the mean is slightly different: 0.5363483182213101, presumably because older GridSearchCV versions weight the mean test score by fold size (the iid behaviour), while np.mean is an unweighted average. With this approach I get the actual predictions of the best_estimator_ found by GridSearchCV, so I can plot a confusion matrix for each fold to analyse. The production model would then be trained on the whole data set.
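If the goal is just out-of-fold predictions for confusion matrices, cross_val_predict is a shorter route. A minimal sketch, again assuming X_Distances, Y and skf from above; note it returns one pooled set of predictions rather than per-fold matrices:

    from sklearn.model_selection import cross_val_predict
    from sklearn.metrics import confusion_matrix

    rf = RandomForestClassifier(n_estimators=100, min_samples_leaf=5, max_depth=21,
                                criterion='entropy', random_state=42)
    # every sample is predicted by the fold model that did not train on it
    y_oof = cross_val_predict(rf, X_Distances, Y, cv=skf)
    print(confusion_matrix(Y, y_oof))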
Best Answer
GridSearchCV uses cross-validation, so if you take the best parameters you should be able to reproduce the best result. Just be careful to set aside your test data and use it only at the end; holding out 20-30 % of the data as a test set is the usual choice.
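A rough sketch of that workflow (the 25 % split size is just an example, and the single 'accuracy' scorer stands in for the multi-metric dict from the question):

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV, train_test_split

    # hold out a test set that the grid search never sees
    X_train, X_test, y_train, y_test = train_test_split(
        X_Distances, Y, test_size=0.25, stratify=Y, random_state=42)

    gs = GridSearchCV(RandomForestClassifier(n_estimators=100, random_state=42),
                      param_grid={'max_depth': range(5, 25, 4),
                                  'min_samples_leaf': range(5, 40, 5),
                                  'criterion': ['entropy', 'gini']},
                      scoring='accuracy', cv=3, n_jobs=-1)
    gs.fit(X_train, y_train)

    # evaluate the refit best estimator exactly once, on unseen data
    print('test accuracy:', gs.score(X_test, y_test))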