# Solved – Why is the penalty term added instead of subtracted from the loss term in regularization

Why is the penalty term $R(f)$ added to a general loss function in regularization instead of subtracted?

For example,
$$
\operatorname{argmin}_f \sum L(\theta, \hat{\theta}) + \lambda R(f)
$$


Let me start with the concept of regularization. Regularization is a means to avoid high variance in a model (also known as overfitting). High variance means that the model is following the noise and errors in the data rather than the underlying pattern; in other words, the model is too flexible. Since the idea is to control complexity, we want to penalize the model for overfitting.
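To make this concrete, here is a minimal sketch (the data and the value of `lam` are illustrative assumptions, not from the original post) showing that an L2 penalty shrinks the fitted weights of a linear model, i.e. it reduces complexity:

```python
import numpy as np

# Hypothetical data: 20 samples, 5 features, a known linear signal plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
y = X @ np.array([1.0, -2.0, 3.0, 0.5, -1.5]) + rng.normal(scale=0.1, size=20)

def fit(X, y, lam):
    # Closed-form solution of argmin_w ||Xw - y||^2 + lam * ||w||^2
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

w_plain = fit(X, y, lam=0.0)   # ordinary least squares
w_ridge = fit(X, y, lam=10.0)  # with L2 penalty added to the cost

# Adding the penalty makes large weights costly, so the solution shrinks.
print(np.linalg.norm(w_ridge) < np.linalg.norm(w_plain))  # True
```

Because the penalty is *added*, any growth in the weights raises the cost, and the minimizer trades a little training error for smaller, less overfit parameters.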

The parameters of a model are chosen by minimizing its cost function: the best model has the minimum cost. Let me take the example of linear regression.

Cost function and parameter update (theta) of a linear model without regularization:

$$
J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2
$$

$$
\theta_j := \theta_j - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}
$$

Cost function and parameter update (theta) of a linear model with regularization:

$$
J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^{2}\right]
$$

$$
\theta_j := \theta_j\left(1 - \alpha\frac{\lambda}{m}\right) - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}
$$

So, by adding the penalty, the parameters are penalized for overfitting: large parameters increase the cost, and minimizing the cost drives them toward zero. This is why the term is added rather than subtracted: if we subtracted $\lambda R(f)$, the optimizer could make the cost arbitrarily low simply by making the model ever more complex, rewarding overfitting instead of punishing it. (The penalty does appear with a minus sign in the gradient-descent update, where the regularization term's derivative is subtracted from the parameter.)
