When I add L2 regularization to my neural network, there is a point during training where the error starts to increase after having come very close to 0. This happens because, once the gradient $\nabla_w E$ gets close to 0, the dominant term in the weight update becomes $\lambda w$, which pushes the weights toward 0 and increases the error. No one seems to point this out when discussing regularization, so I'm a bit confused. What am I missing?

PS: I think early stopping could be a solution, but is it the right one? And what would you do when there is no validation set to detect the point where the error stops decreasing and starts increasing?
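The effect described above can be reproduced in a minimal sketch. This is a hypothetical 1-D example (the loss, learning rate, and $\lambda$ are illustrative, not from the question): gradient descent with an L2 term settles where $\nabla_w E + \lambda w = 0$, which is not the unregularized minimum, so the training error cannot reach 0.

```python
# Minimal 1-D illustration (assumed example, not the asker's network):
# loss E(w) = 0.5 * (w - 2)^2, whose unregularized minimum is w = 2, E = 0.
eta, lam = 0.1, 0.5   # learning rate and L2 strength (illustrative values)
w = 0.0
for _ in range(200):
    grad_E = w - 2.0                  # dE/dw for the quadratic loss
    w -= eta * (grad_E + lam * w)     # L2 adds the lambda * w decay term

# The update converges where grad_E + lam*w = 0, i.e. w = 2/(1 + lam) = 4/3,
# so the training error E(w) = 2/9 stays strictly above 0.
print(w, 0.5 * (w - 2.0) ** 2)
```

Once `grad_E` is small, the `lam * w` term keeps pulling `w` below 2, which is exactly the trade-off the question observes.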


#### Best Answer

Adding any regularization (including L2) will **increase the error** on the **training** set. This is exactly the point of regularization: we increase the bias and reduce the variance of the model. If we regularize well, the **test** error will be reduced as a result.
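This bias–variance trade-off is easiest to see in ridge regression, where the L2-regularized solution has a closed form. The sketch below (an assumed synthetic example, not from the answer) shows training MSE rising monotonically with the penalty $\lambda$, which is expected behaviour rather than a bug:

```python
import numpy as np

# Synthetic illustration: training MSE is non-decreasing in the L2 penalty,
# because the lambda = 0 solution already minimizes training MSE.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

def train_mse(lam):
    # closed-form ridge solution: (X^T X + lam * I)^-1 X^T y
    w = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
    return np.mean((X @ w - y) ** 2)

errors = [train_mse(lam) for lam in (0.0, 1.0, 10.0)]
print(errors)  # each value >= the previous one
```

Whether the *test* error improves depends on how well $\lambda$ is tuned, which is why a validation set (or cross-validation) is normally used to choose it.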

Here are some related topics:

- What problem do shrinkage methods solve?
- How to know if a learning curve from SVM model suffers from bias or variance?

### Similar Posts:

- Solved – Error increase on L2 regularization in an NN
- Solved – Mathematical/Algorithmic definition for overfitting
- Solved – L1 and L2 regularization showing increased MSE with added vars (that eventually decreases)