Solved – Combining randomForests in R, why are the err.rate, mse and rsq components NULL

From ?combine in randomForest:

The confusion, err.rate, mse and rsq components (as well as the
corresponding components in the test component, if exist) of the
combined object will be NULL.

Why are the err.rate, mse and rsq components of the combined object NULL? Is there an efficient way to recalculate these metrics?

I ask because I would like to find a way to use the OOB resampling method with the parRF model in caret.

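Best Answer

Because, at least with the default settings, the information about which observations were in-bag for each tree is discarded, so out-of-bag metrics cannot be computed for the combined forest.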

To reproduce them, train each partial model with keep.inbag=TRUE, then predict the training set with predict(model, trainSet, predict.all=TRUE) and restrict the per-tree predictions to the out-of-bag observations using the in-bag matrix in model$inbag.
Do this for every forest you want to merge, bind the results together, and use them to calculate the metrics.

(I'll try to extend this to actual code later)
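
The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the answerer's own code: the iris data, the number of forests, ntree = 50, and the names forests, oob_votes, all_votes and oob_pred are all made up for the example. Ties in the majority vote are broken by taking the first class, which may differ from how randomForest itself breaks ties.

```r
library(randomForest)

set.seed(42)
train <- iris                 # assumed example data (classification)
y <- train$Species

# Train the partial forests, keeping the in-bag record of every tree.
forests <- lapply(1:4, function(i) {
  randomForest(Species ~ ., data = train, ntree = 50, keep.inbag = TRUE)
})

# Per-tree predictions on the training set, with in-bag cells masked out.
oob_votes <- lapply(forests, function(rf) {
  # n x ntree matrix of predicted class labels, one column per tree
  votes <- predict(rf, train, predict.all = TRUE)$individual
  # inbag is n x ntree; any value > 0 means the row was used to grow that tree
  votes[rf$inbag > 0] <- NA
  votes
})

# Bind the masked predictions of all forests column-wise.
all_votes <- do.call(cbind, oob_votes)

# Majority vote over the out-of-bag trees of each observation.
oob_pred <- apply(all_votes, 1, function(v) {
  v <- v[!is.na(v)]
  if (length(v) == 0) NA_character_ else names(which.max(table(v)))
})
oob_pred <- factor(oob_pred, levels = levels(y))

# OOB error rate and confusion table for the pooled trees.
oob_error <- mean(oob_pred != y, na.rm = TRUE)
confusion <- table(observed = y, predicted = oob_pred)
```

For a regression forest the individual predictions are numeric, so the masked matrix could instead be averaged with rowMeans(all_votes, na.rm = TRUE) to get out-of-bag predictions, from which a mean squared error follows.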

