I'm facing a problem with the results of a multi-class random forest model.

I want to use (a) the predictions of the model and (b) the class probabilities of those predictions for further work.

I ran a cross-validation, grouped by a variable that I dropped immediately afterwards, and trained a multi-class model using the following code:

```r
folds5 <- groupKFold(feature_data$hh_id, k = 5)

# remove group variable
feature_data <- feature_data[, !names(feature_data) == "hh_id"]

fitControl <- trainControl(method = "cv",
                           number = 5,
                           index = folds5,
                           sampling = "down",
                           savePred = T)

set.seed(1)
rf_mod <- train(class ~ ., feature_data,
                method = "rf",
                norm.votes = T,
                #predict.all = FALSE,
                type = "Classification",
                metric = "Accuracy",
                ntree = 500,
                trControl = fitControl)
```

My result is an accuracy of approx. 40%, which is reasonable for this case. This is the confusion matrix:

```
Confusion Matrix and Statistics

          Reference
Prediction   1   2   3   4   5
         1 245 399  61  57  37
         2 171 962 162 206  91
         3  50 456 131 130  51
         4  36 352  95 395 167
         5  67 182  42 263 152

Overall Statistics
    Accuracy : 0.38
```

My first thought was to use the function `predict(..., type = "prob")` to get the probabilities.

With this, the accuracy goes up to 80%. I suspect these results are too optimistic, because the same data was also used for training.

```r
predict_rf_model <- predict(rf_mod)
caret::confusionMatrix(predict_rf_model, feature_data$class)
```

```
          Reference
Prediction    1    2    3    4    5
         1  558  190    0   13    0
         2    8 1658    0   45    0
         3    1  221  491   54    2
         4    1  185    0  886    1
         5    1   97    0   53  495

Overall Statistics
    Accuracy : 0.8242
      95% CI : (0.8133, 0.8347)
```

This means I cannot use `predict()` to get the class probabilities.

I then went looking for fields inside my model `rf_mod` and found some promising ones:

`rf_mod$pred`

stores the predictions for all held-out samples, provided you set `savePredictions` in `trainControl()`. That gives me all predicted classes, which is nice. There is also a field

`rf_mod$finalModel$votes`

which stores the class probabilities (5 classes):

```
> rf_mod$finalModel$votes
            1           2           3           4           5
1 0.521505376 0.021505376 0.010752688 0.064516129 0.381720430
2 0.865979381 0.072164948 0.020618557 0.005154639 0.036082474
3 0.873626374 0.054945055 0.038461538 0.016483516 0.016483516
...
```

At first I thought this was what I needed, but `finalModel` has the same (or a very similar) confusion matrix as the `predict()` call above, with the same inflated(?) results.
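For comparison, the honest cross-validated predictions live in `rf_mod$pred`. A minimal sketch, assuming the default `rf` tuning over `mtry` (with `savePredictions` set to keep all tuning results, each sample appears once per candidate `mtry`, so filter to the winning value first):

```r
# Sketch: keep only rows for the best tuning parameter, so each
# observation appears exactly once with its out-of-fold prediction.
best_mtry <- rf_mod$bestTune$mtry
oof <- rf_mod$pred[rf_mod$pred$mtry == best_mtry, ]

# Cross-validated confusion matrix (honest, unlike predict(rf_mod)
# on the training data)
caret::confusionMatrix(oof$pred, oof$obs)
```

This reproduces the ~40% cross-validated accuracy rather than the ~82% resubstitution accuracy.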

Where can I get class probabilities like those in `rf_mod$finalModel$votes`?

There might be another parameter to get the probabilities that I am simply not seeing.

Any other solution for getting class probabilities with grouped cross-validation is also appreciated.

For your interest: in the next step I want to combine the classifier results by `hh_id`, and information about the probabilities could improve those results.
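As a sketch of that combination step (hypothetical: it assumes a data frame `oof_preds` holding the out-of-fold predictions with `hh_id` re-attached, and probability columns named `X1`..`X5`): average the class probabilities within each household and take the class with the highest mean probability.

```r
# Hypothetical sketch: average out-of-fold class probabilities per
# household, then pick the class with the highest mean probability.
prob_cols <- c("X1", "X2", "X3", "X4", "X5")  # assumed column names

agg <- aggregate(oof_preds[, prob_cols],
                 by = list(hh_id = oof_preds$hh_id),
                 FUN = mean)

# max.col() returns the index of the largest value in each row
agg$combined_class <- prob_cols[max.col(agg[, prob_cols])]
```

Averaging probabilities (soft voting) uses more information than a majority vote over the hard class labels, which is exactly why the probabilities are worth recovering.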

Thank you in advance!


#### Best Answer

In addition to `savePredictions`, you should set `classProbs = TRUE`.

https://rdrr.io/cran/caret/man/trainControl.html

https://stackoverflow.com/q/36750272/10495893
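A sketch of the adjusted call, reusing the variable names from the question. Note that `classProbs = TRUE` requires the class levels to be valid R variable names, so numeric labels like `1`..`5` need `make.names()` first (they become `X1`..`X5`); with that set, `rf_mod$pred` gains one probability column per class:

```r
# Class levels must be valid R names for classProbs to work
feature_data$class <- factor(make.names(feature_data$class))

fitControl <- trainControl(method = "cv",
                           number = 5,
                           index = folds5,
                           sampling = "down",
                           savePredictions = "final",  # keep only best-tune rows
                           classProbs = TRUE)          # add per-class probabilities

set.seed(1)
rf_mod <- train(class ~ ., data = feature_data,
                method = "rf",
                metric = "Accuracy",
                ntree = 500,
                trControl = fitControl)

# Out-of-fold class probabilities, one column per class (X1 .. X5),
# alongside pred, obs, rowIndex and Resample
head(rf_mod$pred)
```

These are held-out probabilities from the grouped cross-validation, so they avoid the resubstitution optimism seen with `predict(rf_mod)`.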
