Solved – Meaning of `max_depth` in GradientBoostingClassifier in scikit-learn

When I use the GradientBoostingClassifier from scikit-learn, I find that there is a parameter max_depth to set, which controls the maximum depth of the regression tree. May I know what exactly that parameter does? If max_depth=3, does it mean that a regression tree will stop growing once it reaches a depth of 3? In that case, is this parameter basically used to control the complexity of the regression tree?

You are right. max_depth bounds the maximum depth of each individual regression tree in the gradient boosting ensemble (note that gradient boosting builds an ensemble of boosted regression trees, not a random forest): a tree stops splitting once it reaches that depth, so the parameter directly controls the complexity of each tree. The default value (3) is usually a reasonable starting point.
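As a quick check of this behaviour, you can fit a GradientBoostingClassifier and inspect the depth of every fitted tree via the estimators_ attribute. This is just a sketch; the toy dataset and parameter values are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# toy binary classification data (illustrative)
rng = np.random.RandomState(0)
X = rng.rand(200, 3)
y = (X.sum(axis=1) > 1.5).astype(int)

clf = GradientBoostingClassifier(n_estimators=50, max_depth=3, random_state=0)
clf.fit(X, y)

# For binary classification each boosting stage holds one regression tree;
# no fitted tree is ever deeper than max_depth (some may be shallower).
depths = [est[0].tree_.max_depth for est in clf.estimators_]
print(max(depths))  # never exceeds 3
```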

To see what the decision trees constructed this way look like, you can use something like this:

from io import StringIO

import numpy as np
import pydot
from sklearn import tree

# generate a toy training sample
training_points = np.random.rand(20, 3)
training_values = np.sum(training_points, axis=1) > 0.8 * np.random.rand(20,)

# fit a decision tree with bounded depth
decision_tree = tree.DecisionTreeClassifier(max_depth=3)
model = decision_tree.fit(training_points, training_values)

# export the tree as a PDF via graphviz/pydot
# (graph_from_dot_data returns a list in recent pydot versions)
dot_data = StringIO()
tree.export_graphviz(model, out_file=dot_data)
graph = pydot.graph_from_dot_data(dot_data.getvalue())[0]
graph.write_pdf("decision_tree.pdf")
