When running ridge regression, how do you interpret coefficients that end up larger than their corresponding coefficients under least squares (for certain values of $\lambda$)? Isn't ridge regression supposed to monotonically shrink coefficients?
On a related note, how does one interpret a coefficient whose sign changes during ridge regression (i.e., its curve crosses from negative to positive on a ridge trace plot)?
Best Answer
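Ridge regression shrinks the coefficient vector as a whole, not each coefficient separately. As $\lambda$ increases from zero, the norm $\|\hat\beta(\lambda)\|_2$ decreases monotonically, but the optimization is free to redistribute weight among the individual coefficients, which allows both increases in value and sign changes. This typically happens when predictors are correlated: a coefficient that least squares pushed to a large or negative value to offset a correlated neighbor relaxes once the penalty pulls both back. Have a look at Ryan Tibshirani's ridge regression charts (PDF), which illustrate both of your questions (charts 17, 19).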
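To see this behavior concretely, here is a minimal sketch using the closed-form ridge solution $(X^\top X + \lambda I)^{-1} X^\top y$ with no intercept. Everything in it is made up for illustration and is not taken from the linked slides: the helper `ridge_coefficients`, the two predictors with correlation 0.9, the assumed least-squares solution `(1.2, -0.2)`, and the grid of $\lambda$ values.

```python
import numpy as np


def ridge_coefficients(X, y, lam):
    """Closed-form ridge solution (X'X + lam*I)^{-1} X'y, no intercept."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


# Hypothetical design: two unit-norm predictors with correlation 0.9.
rho = 0.9
gram = np.array([[1.0, rho], [rho, 1.0]])
X = np.linalg.cholesky(gram).T      # chosen so that X'X equals gram

# Assumed least-squares solution; the response is noiseless,
# so lam = 0 recovers these values.
beta_ols = np.array([1.2, -0.2])
y = X @ beta_ols

lambdas = [0.0, 0.1, 0.5, 1.0, 5.0]
path = np.array([ridge_coefficients(X, y, lam) for lam in lambdas])
norms = np.linalg.norm(path, axis=1)

for lam, beta, norm in zip(lambdas, path, norms):
    print(f"lambda={lam:4.1f}  beta={np.round(beta, 3)}  ||beta||={norm:.3f}")
```

In this toy setup the second coefficient starts at -0.2 under least squares, turns positive for small $\lambda$, and at $\lambda = 0.5$ is larger in magnitude than its least-squares value, while the printed norm keeps decreasing down the table.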