I'm working my way through the ISLR book and I just finished exercise 9c/d in Chapter 6. For some reason the output shows the intercept twice, with no coefficient for one of the two intercept rows. Have I done anything wrong in the code below that would cause this?
```r
library(ISLR)
index <- 1:nrow(College)
index.train <- sample(index, length(index)/2)
College.train <- College[index.train, ]
College.test  <- College[-index.train, ]

X.train = model.matrix(Apps ~ ., College.train)
Y.train = College.train$Apps
X.test  = model.matrix(Apps ~ ., College.test)

library(glmnet)
cv.ridge      = cv.glmnet(X.train, Y.train, alpha = 0)
bestlam.ridge = cv.ridge$lambda.min
reg.ridge     = glmnet(X.train, Y.train, alpha = 0, lambda = bestlam.ridge)
pred.ridge    = predict(reg.ridge, s = bestlam.ridge, newx = X.test)
ridge.err     = mean((College.test$Apps - pred.ridge)^2)
coef.ridge    = predict(reg.ridge, type = "coefficients", s = bestlam.ridge)
coef.ridge
```

```
19 x 1 sparse Matrix of class "dgCMatrix"
                        1
(Intercept) -1.442501e+03
(Intercept)  .
PrivateYes  -6.945229e+02
Accept       7.520311e-01
Enroll       7.103728e-01
Top10perc    2.584124e+01
Top25perc    3.603232e-01
F.Undergrad  1.191518e-01
P.Undergrad -6.457211e-03
Outstate     5.294306e-03
Room.Board   2.198127e-01
Books       -1.002810e-01
Personal    -5.784296e-02
PhD         -2.515247e+00
Terminal    -4.409074e+00
S.F.Ratio   -1.364356e-01
perc.alumni -1.685828e+01
Expend       8.291585e-02
Grad.Rate    1.412741e+01
```
Best Answer
Yes. In R's formula notation the intercept is implied unless you explicitly suppress it, which you did not, so your model matrix contains a column of ones labelled (Intercept). On top of that, glmnet fits its own constant term, because its intercept argument defaults to TRUE. As a result the constant term is included twice: glmnet's own intercept gets an estimate, while the redundant column of ones from the model matrix is left with no coefficient (the "." in your output).
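As a rough illustration (a minimal sketch using mtcars rather than your College data), you can reproduce the doubled intercept by passing glmnet a model matrix that still contains its (Intercept) column:

```r
library(glmnet)

# The model matrix already carries a column of ones labelled "(Intercept)"
X <- model.matrix(mpg ~ disp + hp, data = mtcars)
colnames(X)   # "(Intercept)" "disp" "hp"

# glmnet adds its own intercept on top (intercept = TRUE by default),
# so the constant term shows up twice in the coefficient table;
# the constant column of ones is left at zero and printed as "."
fit <- glmnet(X, mtcars$mpg, alpha = 0, lambda = 1)
coef(fit)
```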
The code below compares how you computed the model matrix:
```r
> m1 <- model.matrix( ~ disp + hp + drat, data = mtcars)
> head(m1)
                  (Intercept) disp  hp drat
Mazda RX4                   1  160 110 3.90
Mazda RX4 Wag               1  160 110 3.90
Datsun 710                  1  108  93 3.85
Hornet 4 Drive              1  258 110 3.08
Hornet Sportabout           1  360 175 3.15
Valiant                     1  225 105 2.76
```
and how one explicitly excludes the intercept, by adding - 1 to the formula (or + 0, if you prefer):
```r
> m2 <- model.matrix( ~ disp + hp + drat - 1, data = mtcars)
> head(m2)
                  disp  hp drat
Mazda RX4          160 110 3.90
Mazda RX4 Wag      160 110 3.90
Datsun 710         108  93 3.85
Hornet 4 Drive     258 110 3.08
Hornet Sportabout  360 175 3.15
Valiant            225 105 2.76
```
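If you would rather use + 0, a quick check (a small sketch reusing m2 from above) shows both spellings produce the same matrix:

```r
# "+ 0" drops the intercept from the formula just like "- 1"
m3 <- model.matrix( ~ disp + hp + drat + 0, data = mtcars)
all.equal(m2, m3)
#> [1] TRUE
```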
The other default of note is the standardize argument, which also defaults to TRUE. You normally want the design matrix, minus the constant term (intercept), to be standardised column by column so that the coefficients of the variables are comparable in size; you don't want the intercept included in that standardisation, yet you generally do want an intercept in the model; and you normally don't want the intercept subjected to shrinkage either. The natural way to satisfy all of these requirements is to not supply the intercept in X and let glmnet handle it.
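Applied to your code, that just means dropping the (Intercept) column from the model matrix before handing it to glmnet and letting glmnet supply its own, unpenalised intercept. A sketch (object names follow your code; the exact numbers will depend on your random split):

```r
# Drop the "(Intercept)" column; glmnet adds its own, unpenalised intercept
X.train <- model.matrix(Apps ~ ., College.train)[, -1]
X.test  <- model.matrix(Apps ~ ., College.test)[, -1]

cv.ridge      <- cv.glmnet(X.train, Y.train, alpha = 0)
bestlam.ridge <- cv.ridge$lambda.min
reg.ridge     <- glmnet(X.train, Y.train, alpha = 0, lambda = bestlam.ridge)
coef.ridge    <- predict(reg.ridge, type = "coefficients", s = bestlam.ridge)
coef.ridge    # "(Intercept)" now appears only once
```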