# Solved – R: What does train() do when it calculates ridge regression?

I am running ridge regression on the Boston housing dataset. There are many write-ups online showing how to do ridge regression.

I will write up the two methods and then pose my questions.

Initialize with the dataset

```r
library('mlbench')
data(BostonHousing)
```

First method: According to the Stanford Open course on statistics

```r
library('glmnet')
library('dplyr')

# initialize the predictor matrix for glmnet (drop the response, medv)
z <- setdiff(colnames(BostonHousing), 'medv')
x <- BostonHousing %>% select(one_of(z)) %>% data.matrix()

# fit ridge regression over a whole path of lambda values
fit <- glmnet(x, BostonHousing$medv, alpha = 0)

# use 10-fold cross-validation to choose the lambda with the lowest MSE
cv_fit <- cv.glmnet(x, BostonHousing$medv, alpha = 0)

# extract the ridge regression fit
fit <- cv_fit$glmnet.fit

# cross-validated MSE at the best lambda
min(cv_fit$cvm)
```
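For reference, the lambda that `cv.glmnet` settles on can be read off the fitted object directly (a self-contained sketch of the same fit as above; `lambda.min` and `lambda.1se` are standard fields of a `cv.glmnet` object):

```r
library('mlbench')
library('glmnet')
data(BostonHousing)

# predictor matrix: all columns except the response, medv
x <- data.matrix(BostonHousing[, setdiff(colnames(BostonHousing), 'medv')])

# 10-fold cross-validation over the ridge path
cv_fit <- cv.glmnet(x, BostonHousing$medv, alpha = 0)

# lambda giving the minimum cross-validated MSE
cv_fit$lambda.min

# the more conservative choice, one standard error away
cv_fit$lambda.1se
```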

Second method: According to this tutorial

```r
library('caret')

# take a stratified random sample of half of the data
split <- createDataPartition(y = BostonHousing$medv, p = 0.5, list = FALSE)

# create training and test sets
train <- BostonHousing[split,]
test <- BostonHousing[-split,]

# fit ridge regression on the training set with a fixed lambda
ridge <- train(medv ~ ., data = train, method = 'ridge',
               lambda = 4, preProcess = c('scale', 'center'))

# use the model to predict values of the test set
ridge.pred <- predict(ridge, test)

# MSE for the test error
mean((ridge.pred - test$medv)^2)

# select lambda by 10-fold cross-validation over a grid
fitControl <- trainControl(method = "cv", number = 10)
lambdaGrid <- expand.grid(lambda = 10^seq(10, -2, length = 100))

# do ridge regression with the best lambda
ridge <- train(medv ~ ., data = train, method = 'ridge',
               trControl = fitControl,
               # tuneGrid = lambdaGrid,
               preProcess = c('center', 'scale'))

# predict the test set using the model from the training set
ridge.pred <- predict(ridge, test)

# test RMSE
sqrt(mean((ridge.pred - test$medv)^2))
```
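The lambda that `train()` ends up choosing, and the resampling results it based that choice on, can be inspected on the fitted object via the standard `bestTune` and `results` fields (a self-contained sketch; fitting on the full data here rather than the split, purely for illustration):

```r
library('mlbench')
library('caret')
data(BostonHousing)

set.seed(1)
fitControl <- trainControl(method = 'cv', number = 10)

# ridge regression tuned by 10-fold cross-validation
ridge <- train(medv ~ ., data = BostonHousing, method = 'ridge',
               trControl = fitControl,
               preProcess = c('center', 'scale'))

ridge$bestTune   # the lambda train() selected by resampling
ridge$results    # RMSE / Rsquared / MAE for each candidate lambda
```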

I have a few questions; I hope that's alright.

1- Assuming I use the first method, can I estimate the test error of the ridge model with k-fold cross-validation?

It only gives me the training error, and I'd like to approximate the test error.
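Something along these lines is what I have in mind (a sketch, not a settled approach: hold out half the data, let `cv.glmnet` pick lambda on the training half only, then score the held-out half at that lambda):

```r
library('mlbench')
library('glmnet')
data(BostonHousing)

set.seed(1)
n <- nrow(BostonHousing)
train_idx <- sample(n, n / 2)

# predictor matrix without the response, medv
x <- data.matrix(BostonHousing[, setdiff(colnames(BostonHousing), 'medv')])
y <- BostonHousing$medv

# choose lambda by 10-fold cross-validation on the training half only
cv_fit <- cv.glmnet(x[train_idx, ], y[train_idx], alpha = 0)

# estimate the test error on the held-out half at the chosen lambda
pred <- predict(cv_fit, x[-train_idx, ], s = 'lambda.min')
test_mse <- mean((pred - y[-train_idx])^2)
test_mse
```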

2- The second approach uses a validation set. Is that desirable in situations with small sample sizes?

The BostonHousing data has 506 rows and 14 variables.

3- Here is the output from the second method:

```
> ridge
Ridge Regression

254 samples
 10 predictor

Pre-processing: centered (10), scaled (10)
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 230, 229, 228, 229, 229, 229, ...
Resampling results across tuning parameters:

  lambda  RMSE       Rsquared   MAE      
  0e+00   0.5963179  0.6835195  0.4131819
  1e-04   0.5963073  0.6835296  0.4131761
  1e-01   0.5920124  0.6891727  0.4120725

RMSE was used to select the optimal model using the smallest value.
The final value used for the model was lambda = 0.1.
```

Why is ridge regression using resampling here? And how did `train()` arrive at a lambda of `0.1` when the first method got a lambda of `0.0501`?
