Is it true that:

Suppose you perform linear regression with $L_2$ regularization and use cross-validation to select the value of the regularization parameter $\lambda$ on two datasets drawn from the same distribution: $D_1$ of 500 examples and $D_2$ of 50,000 examples. The value of $\lambda$ found by cross-validation will likely be higher on $D_2$ than on $D_1$.


#### Best Answer

Regularization constrains the parameter space of a model that would otherwise overfit the sample. The optimal amount of regularization depends on the model complexity relative to the sample size $n$: the smaller the sample or the more complex the model, the more prone the model is to overfitting. Cross-validation should select a model that is regularized just enough that it no longer overfits. Hence, the optimal value $\lambda_{\text{CV}}$ grows roughly with the ratio $\frac{p}{n}$, where $p$ is the number of parameters.

Because of this, if the same model is fit on $D_1$ ($n=500$) and $D_2$ ($n=50{,}000$), then the optimal value of $\lambda$ will almost certainly be lower for $D_2$ than for $D_1$, because $p$ is constant and $\frac{p}{500} > \frac{p}{50{,}000}$. So the statement in the question is most likely false.

Put differently, the relative model complexity is lower when you have more observations, so $D_2$ requires less regularization (if any) to combat overfitting.
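The argument above can be checked with a small simulation. The following is only a sketch under stated assumptions, not part of the original question: the data are synthetic (Gaussian features, a fixed weight vector `w_true`, unit noise), $p = 100$, the $\lambda$ grid is arbitrary, and the ridge objective is written in mean-loss form, $\frac{1}{n}\lVert y - Xw\rVert^2 + \lambda\lVert w\rVert^2$. The helper names `simulate` and `cv_select_lambda` are made up for illustration.

```python
import numpy as np


def simulate(n, w_true, noise_sd, rng):
    """Draw n examples from a fixed linear model with Gaussian noise (synthetic data)."""
    X = rng.standard_normal((n, len(w_true)))
    y = X @ w_true + noise_sd * rng.standard_normal(n)
    return X, y


def cv_select_lambda(X, y, lambdas, k=5, seed=0):
    """Return the lambda with the lowest k-fold CV mean squared error.

    Assumed objective: (1/n) * ||y - X w||^2 + lam * ||w||^2.
    """
    n, p = X.shape
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), k)
    cv_mse = np.zeros(len(lambdas))
    for held_out in folds:
        mask = np.ones(n, dtype=bool)
        mask[held_out] = False
        X_tr, y_tr = X[mask], y[mask]
        X_te, y_te = X[held_out], y[held_out]
        # Precompute the normalized Gram matrix once per fold.
        gram = X_tr.T @ X_tr / len(y_tr)
        xty = X_tr.T @ y_tr / len(y_tr)
        for j, lam in enumerate(lambdas):
            # Closed-form ridge solution for this lambda.
            w = np.linalg.solve(gram + lam * np.eye(p), xty)
            cv_mse[j] += np.mean((y_te - X_te @ w) ** 2) / k
    return float(lambdas[int(np.argmin(cv_mse))])


rng = np.random.default_rng(42)
p = 100                                        # assumed number of parameters
w_true = rng.standard_normal(p) / np.sqrt(p)   # assumed true weights
lambdas = np.logspace(-5, 1, 25)               # assumed search grid

X1, y1 = simulate(500, w_true, 1.0, rng)       # D_1
X2, y2 = simulate(50_000, w_true, 1.0, rng)    # D_2

lam_d1 = cv_select_lambda(X1, y1, lambdas)
lam_d2 = cv_select_lambda(X2, y2, lambdas)
print(f"lambda chosen on D_1 (n=500):    {lam_d1:.2e}")
print(f"lambda chosen on D_2 (n=50,000): {lam_d2:.2e}")
```

Under these assumptions one would expect the value printed for $D_2$ to be the smaller of the two, although cross-validation is noisy and a single run proves nothing. The parameterization also matters: if the penalty is added to the unnormalized sum of squared errors (as in scikit-learn's `Ridge`, whose `alpha` multiplies $\lVert w\rVert^2$ next to $\lVert y - Xw\rVert^2$), the selected value corresponds to $n\lambda$ and need not shrink as $n$ grows, even though the effective amount of shrinkage still does.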