A colleague in my office told me today, "Tree models aren't good because they get caught by extreme observations."

A search here turned up this thread, which basically supports the claim.

Which leads me to the question: under what conditions can a CART model be robust, and how is that shown?

#### Best Answer

No, not in their present forms. The problem is that convex loss functions cannot be made robust to contamination by outliers. This has been well known since the 1970s but keeps being rediscovered periodically; see, for instance, this paper for one recent rediscovery:

http://www.cs.columbia.edu/~rocco/Public/mlj9.pdf

Now, in the case of regression trees, one can exploit the fact that CART works on marginals (or, alternatively, univariate projections): one can think of a version of CART in which the standard deviation (s.d.) criterion is replaced by a more robust counterpart (the MAD or, better yet, the Qn estimator).
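To make the idea concrete, here is a minimal sketch of a single-feature split search in which the scale estimator is a plug-in argument. It is not CART's actual implementation: the function names (`mad`, `best_split`), the size-weighted sum of child scales used as the cost, and the toy data are all assumptions made for illustration. The MAD is scaled by 1.4826, the usual consistency factor under normality.

```python
import numpy as np

def mad(y):
    """Median absolute deviation, scaled by 1.4826 (consistency at the normal)."""
    y = np.asarray(y, dtype=float)
    return 1.4826 * np.median(np.abs(y - np.median(y)))

def best_split(x, y, scale=np.std, min_leaf=1):
    """Hypothetical helper: threshold on x minimizing the size-weighted
    average of the children's scale estimates (an assumed criterion)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    x, y = x[order], y[order]
    n = len(y)
    best_cost, best_thr = np.inf, None
    for i in range(min_leaf, n - min_leaf + 1):
        if x[i] == x[i - 1]:
            continue  # cannot split between identical x values
        cost = (i * scale(y[:i]) + (n - i) * scale(y[i:])) / n
        if cost < best_cost:
            best_cost, best_thr = cost, 0.5 * (x[i - 1] + x[i])
    return best_thr, best_cost

# Toy data: a clean step between x=4 and x=5, plus one "y" outlier at x=0.
x = np.arange(10)
y = np.array([1000, 2, 3, 4, 5, 11, 12, 13, 14, 15], dtype=float)

thr_sd, _ = best_split(x, y, scale=np.std)   # 0.5: isolates the outlier
thr_mad, _ = best_split(x, y, scale=mad)     # 4.5: recovers the step
print(thr_sd, thr_mad)
```

On this toy example the s.d.-based search spends its split on isolating the single extreme response, while the MAD-based search finds the actual change point. Note that the outlier here is in "y" only; nothing in this sketch protects against unusual values of `x`.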

#### Edit

I recently came across an older paper implementing the approach suggested above (using a robust M-estimator of scale instead of the MAD). This will impart robustness to "y" outliers to CART/RFs (but *not* to outliers located in the design space, which will affect the estimates of the model's hyper-parameters). See:

Galimberti, G., Pillati, M., & Soffritti, G. (2007). Robust regression trees based on M-estimators. Statistica, LXVII, 173–190.