Using the following R code I obtain a decision tree using the agaricus dataset:
library(xgboost)
data(agaricus.train, package='xgboost')
bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label,
               max_depth = 3, eta = 1, nthread = 2, nrounds = 2,
               objective = "binary:logistic")
# plot all the trees
xgb.plot.tree(model = bst)
# plot only the first tree and display the node ID:
xgb.plot.tree(model = bst, trees = 0, show_node_id = TRUE)
I want to understand the "value" output of the tree more clearly (the third line in each oval-shaped node). Here we can see that tree 0, leaf 7 gives a value of 1.90174532 (that is the first terminal node in the image). I want to know whether this value is the same as the log-odds score. That is, all observations that follow the upper path of the decision tree obtain a log-odds score of 1.90174532. Then, in the next decision tree, the observations fall into different leaves depending on each observation's characteristics and obtain a "new" value. We then sum these values across all trees to obtain a final log-odds score, which can be converted to a predicted probability using the logistic function.
Is my intuition correct? Does value = log-odds?
( https://rdrr.io/cran/xgboost/man/xgb.plot.tree.html )
Best Answer
The "value" is the contribution of a leaf to the logit (the log-odds). The logit for a sample is the sum of the "value" of every leaf the sample lands in. Because XGBoost is an ensemble, a sample terminates in exactly one leaf per tree; gradient boosted ensembles sum the predictions of all trees.
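In symbols, the summation can be sketched as follows. The notation here is an assumption for illustration and is not taken from the package: $f_t(x)$ is the "value" of the leaf that sample $x$ reaches in tree $t$, $T$ is the number of trees, and $b$ is a constant offset on the logit scale (zero when `base_score` is 0.5).

$$
z(x) = b + \sum_{t=1}^{T} f_t(x), \qquad p(x) = \frac{1}{1 + e^{-z(x)}}
$$

Under this reading, a single leaf value such as 1.90174532 equals the full log-odds $z(x)$ only when the ensemble has one tree and $b = 0$; otherwise it is one term of the sum.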
More information about gradient boosted trees generally and XGBoost specifically can be found in https://arxiv.org/abs/1603.02754 or in the XGBoost documentation.
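As a rough numeric illustration of the summation described in the answer, the sketch below adds the tree 0 leaf value from the question to a second leaf value and applies the logistic function. The tree 1 value of 0.8 is hypothetical (it is not read from the fitted model), and the code assumes `base_score = 0.5`, so the offset on the logit scale is zero; releases that estimate the base score from the labels would need that offset added.

```r
# Leaf value for tree 0, read off the plot in the question.
value_tree0 <- 1.90174532
# Hypothetical leaf value for the same observation in tree 1 (illustrative only).
value_tree1 <- 0.8

# Assumption: base_score = 0.5, whose logit is 0, so the offset vanishes.
base_margin <- qlogis(0.5)

# The log-odds is the offset plus the sum of the per-tree leaf values.
log_odds <- base_margin + sum(c(value_tree0, value_tree1))

# The logistic function maps the log-odds to a probability.
prob <- plogis(log_odds)

print(c(log_odds = log_odds, prob = prob))
```

On a fitted model, `predict(bst, agaricus.train$data, outputmargin = TRUE)` should return predictions on this untransformed log-odds scale, which can be compared against a manual sum like the one above.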