I am studying the CatBoost paper https://arxiv.org/pdf/1706.09516.pdf (in particular **Function** *BuildTree* on page 16), and noticed that it does not mention regularization.

In particular, split selection is based on minimizing the loss of a new candidate tree, measured by the cosine distance between the previous iteration's gradients and the tree outputs. I don't see a "lambda" parameter that penalizes new splits.

However, the CatBoost package has the parameter `l2_leaf_reg`, described as the "Coefficient at the L2 regularization term of the cost function". How does that parameter work?


#### Best Answer

The value of the parameter is added to the `Leaf denominator` for each leaf at every step. Since it is added to the denominator, the higher `l2_leaf_reg` is, the smaller the value the leaf will obtain.
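A minimal sketch of the idea, assuming the standard gradient-boosting leaf-value formula (sum of pseudo-residuals over leaf count); the exact CatBoost internals may differ, but the regularization term enters the denominator the same way:

```python
# Hypothetical illustration: an L2 term added to the leaf denominator
# shrinks the leaf value toward zero.

def leaf_value(gradients, l2_leaf_reg):
    # value = sum(g_i) / (n + l2_leaf_reg)
    return sum(gradients) / (len(gradients) + l2_leaf_reg)

grads = [0.8, 1.2, 1.0, 1.0]  # pseudo-residuals falling into one leaf

print(leaf_value(grads, 0.0))  # plain mean of the gradients: 1.0
print(leaf_value(grads, 3.0))  # same sum, larger denominator: shrunk value
```

With `l2_leaf_reg = 0` the leaf outputs the plain mean; any positive value pulls the output toward zero, and more strongly for leaves with few samples.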

This is quite intuitive when you think about how L2 regularization is used in a typical linear regression setting.
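The linear-regression analogy can be made concrete with ridge regression, whose closed-form solution `w = (XᵀX + λI)⁻¹ Xᵀy` likewise adds the penalty to the "denominator" and shrinks the coefficients as λ grows (a generic sketch, not CatBoost code):

```python
import numpy as np

# Ridge (L2-regularized) linear regression: larger lambda shrinks the
# coefficient vector toward zero, just as l2_leaf_reg shrinks leaf values.

def ridge(X, y, lam):
    d = X.shape[1]
    # Solve (X^T X + lam * I) w = X^T y
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=50)

w_small = ridge(X, y, 0.01)
w_large = ridge(X, y, 100.0)
print(np.linalg.norm(w_small), np.linalg.norm(w_large))  # norm drops as lambda grows
```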