One approach to dealing with an unbalanced dataset is to choose a model that can handle this type of data well, such as a decision tree. But why can a decision tree handle an unbalanced dataset well?
Decision trees do not always handle unbalanced data well.
If there is a relatively obvious partition of the sample space that contains a high proportion of minority-class instances, a decision tree can probably find it, but that is far from a certainty. For example, if the minority class is strongly associated with multiple features interacting with each other, it is rather demanding for a tree to recognise the pattern. Even if it does, the result will probably be a rather deep and unstable tree that is prone to over-fitting. Pruning the tree will not immediately solve the problem, because pruning directly affects the tree's ability to utilise those interactions.
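To make the pruning point concrete, here is a minimal sketch on synthetic data, assuming scikit-learn is available. The data-generating rule (a minority class that appears only when both features exceed 0.8, roughly 4% of points), the helper `make_data`, and the use of `max_depth` as a crude stand-in for pruning are all assumptions made for illustration. The pattern is also deliberately noise-free and easy, so this shows only the "pruning removes the interaction" part of the argument, not how hard real interactions are to find.

```python
import numpy as np
from sklearn.metrics import recall_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def make_data(n):
    """Hypothetical toy data: minority class only where both features are large."""
    X = rng.uniform(0.0, 1.0, size=(n, 2))
    y = ((X[:, 0] > 0.8) & (X[:, 1] > 0.8)).astype(int)  # about 4% positives
    return X, y

X_train, y_train = make_data(5000)
X_test, y_test = make_data(2000)

# A depth-1 tree can split on one feature only, so it cannot use the interaction.
shallow = DecisionTreeClassifier(max_depth=1, min_samples_leaf=50, random_state=0)
shallow.fit(X_train, y_train)

# A depth-2 tree can split on one feature and then on the other.
deeper = DecisionTreeClassifier(max_depth=2, min_samples_leaf=50, random_state=0)
deeper.fit(X_train, y_train)

recall_shallow = recall_score(y_test, shallow.predict(X_test))
recall_deeper = recall_score(y_test, deeper.predict(X_test))

print(f"minority fraction in training data: {y_train.mean():.3f}")
print(f"minority recall, depth 1: {recall_shallow:.3f}")
print(f"minority recall, depth 2: {recall_deeper:.3f}")
```

In the depth-1 tree no leaf is dominated by the minority class, so it predicts the majority class everywhere. With noisier data or higher-order interactions, the depth needed to isolate the minority region grows, and so does the risk of over-fitting.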
Generalisations of decision tree algorithms, such as random forests and gradient boosting machines, offer a much better alternative in terms of stability, usually without sacrificing predictive performance. Similarly, a GAM with an interaction spline is another potentially viable alternative.
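One way to check this on your own data is to compare a single tree against the ensembles under stratified cross-validation, looking at both the mean score and its spread across folds. The sketch below assumes scikit-learn and uses a synthetic 95/5 dataset from `make_classification`. The dataset parameters, the choice of average precision as the metric, and the model settings are illustrative assumptions, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic unbalanced problem: roughly 95% majority, 5% minority (assumed setup).
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=5,
    n_redundant=0,
    weights=[0.95, 0.05],
    random_state=0,
)

models = {
    "single tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}

# Stratified folds keep the minority proportion similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    # Average precision summarises the precision-recall curve, which is
    # more informative than accuracy when one class is rare.
    scores = cross_val_score(model, X, y, cv=cv, scoring="average_precision")
    results[name] = (scores.mean(), scores.std())
    print(f"{name:>17}: mean AP = {scores.mean():.3f}, std = {scores.std():.3f}")
```

The standard deviation across folds is only a rough proxy for stability. Repeating the comparison with different seeds, or with repeated cross-validation, gives a more reliable picture.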