As I've been learning about tree-based classification algorithms, I've found that single-tree algorithms are 'greedy', meaning they choose the split that minimizes entropy (equivalently, maximizes information gain) at each stage in isolation. Because they only look one step ahead, these algorithms can fail to minimize total entropy. Here's an example:

In this case, the picture above represents the 'ideal split': every square is homogeneous. However, a decision tree algorithm might instead make the following split:

This is because the algorithm splits the data in half along the x-axis to maximize the immediate information gain, without considering the information gain available from future splits.
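The failure mode can be made concrete with a small sketch. The XOR-style data below is a hypothetical stand-in for the pictured example: every single axis-aligned split yields zero information gain, so a greedy tree has no signal telling it which first cut leads to the ideal second-level splits.

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(data, axis, threshold):
    """Information gain of a single axis-aligned split."""
    labels = [y for _, y in data]
    left = [y for x, y in data if x[axis] < threshold]
    right = [y for x, y in data if x[axis] >= threshold]
    n = len(labels)
    child_entropy = sum(len(side) / n * entropy(side)
                        for side in (left, right) if side)
    return entropy(labels) - child_entropy

# Hypothetical XOR-style data: the classes form a checkerboard,
# so no single horizontal or vertical cut separates them at all.
data = [((0, 0), 'A'), ((0, 1), 'B'), ((1, 0), 'B'), ((1, 1), 'A')]

# Every candidate first split has zero gain; the greedy criterion
# cannot distinguish the cut that leads to the ideal final partition.
for axis in (0, 1):
    print(f"axis {axis}: gain = {info_gain(data, axis, 0.5)}")  # → 0.0
```

This is only an illustration of the one-step-lookahead problem, not the exact data from the figures above.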

Now, my question is as follows:

I'm trying to understand the random forest algorithm, and I was wondering if someone could explain how it would function in this situation. How would a random forest be able to find this 'optimal split'?


#### Best Answer

In the fitting process of each tree inside the forest, greedy decisions are still made. However, in the aggregation step (bagging, often described as voting), i.e. when the predictions from all the trees are combined into a single prediction, each tree gets an equal vote. So in short, the random forest algorithm is greedy in the same sense as the CART algorithm. What differs is that a random forest draws a sample *with replacement* of the observations in the data for each tree, so each tree is fitted to slightly different data. Given that CART is a *non-stable* algorithm (small changes in the input data may lead to drastically different trees), this random sampling can alleviate the drawbacks of the greedy CART fitting process.
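The two mechanisms described above can be sketched in a few lines. This is a minimal illustration, not a real forest: the function names are invented, and a real implementation would fit a CART tree to each bootstrap sample rather than just drawing the sample.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw n observations *with replacement* (the bagging step).
    Some observations repeat and others are left out, so each
    tree in the forest sees slightly different training data."""
    return [rng.choice(data) for _ in data]

def majority_vote(predictions):
    """Combine per-tree predictions: each tree gets an equal vote."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)
observations = list(range(10))

# Each tree would be fitted greedily on its own bootstrap sample.
sample = bootstrap_sample(observations, rng)
print(sample)

# At prediction time the (hypothetical) trees' outputs are pooled.
print(majority_vote(['A', 'B', 'A']))  # → 'A'
```

Because CART is unstable, the slightly different bootstrap samples produce meaningfully different trees, and averaging their votes smooths out the individual greedy mistakes.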

### Similar Posts:

- Solved – CART vs. Random Forests: How do random forests prevent greediness
- Solved – Types of Trees in Random Forest
- Solved – Complexity parameter zero in decision trees
- Solved – What are advantages of random forests vs using bagging with other classifiers