I have a matrix (x) containing 55 samples (rows) and 10000 independent variables (columns). The observations are binary, healthy or ill {0,1} (y). I want to perform leave-one-out cross-validation and determine the Area Under the Curve (AUC) for each of the variables. To do so, I need the nfolds parameter to be equal to the number of observations (i.e., 55). Am I right?
result=cv.glmnet(x,y,nfolds=55,type.measure="auc",family="binomial")
And I'm getting these warnings:
Warning messages:
1: Too few (< 10) observations per fold for type.measure='auc' in cv.lognet; changed to type.measure='deviance'. Alternatively, use smaller value for nfolds
2: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
What am I doing wrong?
I want to get LOO-AUCs for each variable.
I'll really appreciate any help. Thank you
Best Answer
The package documentation describes nfolds as follows: "number of folds - default is 10. Although nfolds can be as large as the sample size (leave-one-out CV), it is not recommended for large datasets. Smallest value allowable is nfolds=3".
So you can indeed set nfolds equal to the sample size to perform leave-one-out CV.
However, the problem you are facing, as the first warning indicates, is that glmnet needs at least 10 observations per fold to calculate the AUC, because the AUC is based on ranking the test cases. With leave-one-out CV each fold holds a single observation, so cv.glmnet falls back to type.measure='deviance'.
Think about it: if a fold contains only one test case, how are you supposed to rank just that one case?
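To make the ranking argument concrete, here is a minimal base R sketch of a rank-based (Mann-Whitney) AUC. The helper name auc_rank is my own and is not part of glmnet; I am also not claiming this is how glmnet computes its AUC internally. It only illustrates why one held-out case cannot yield an AUC: there is no positive/negative pair to compare.

```r
# Hypothetical helper (not a glmnet function): AUC as the fraction of
# (positive, negative) pairs in which the positive case gets the higher score.
auc_rank <- function(score, label) {
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  # With no positives or no negatives there is no pair to rank.
  if (n_pos == 0 || n_neg == 0) return(NA_real_)
  r <- rank(score)  # tied scores receive average ranks
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Four test cases: 3 of the 4 positive/negative pairs are ordered correctly.
auc_rank(c(0.1, 0.4, 0.35, 0.8), c(0, 0, 1, 1))  # 0.75

# A single held-out case, as in a leave-one-out fold: undefined.
auc_rank(0.7, 1)  # NA
```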
This is only an issue because of the performance measure (AUC) you have chosen. Measures that do not require ranking, i.e., those that can be calculated from a single test case, such as mean squared error, will not trigger this warning.
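If you still want a leave-one-out AUC per variable, one possible workaround is to pool the 55 held-out predictions and compute a single AUC from them, instead of one AUC per fold. The sketch below does this with a plain univariate logistic regression (glm) per column, which is my assumption about what "AUC for each variable" means; note that cv.glmnet reports its measure per lambda value, not per variable. The data are simulated and the names auc_rank and loo_auc are hypothetical helpers, not glmnet functions.

```r
set.seed(1)
n <- 55
p <- 5                                   # small p for illustration; the question has 10000
x <- matrix(rnorm(n * p), n, p)          # simulated predictors
y <- rbinom(n, 1, plogis(1.5 * x[, 1]))  # simulated labels; only column 1 carries signal

# Hypothetical helper: rank-based (Mann-Whitney) AUC.
auc_rank <- function(score, label) {
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  if (n_pos == 0 || n_neg == 0) return(NA_real_)
  r <- rank(score)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical helper: pooled leave-one-out AUC for one predictor column.
loo_auc <- function(xj, y) {
  pred <- numeric(length(y))
  for (i in seq_along(y)) {
    # Fit on all observations except i ...
    train <- data.frame(y = y[-i], xj = xj[-i])
    fit <- glm(y ~ xj, family = binomial, data = train)
    # ... and predict the held-out observation i.
    pred[i] <- predict(fit, newdata = data.frame(xj = xj[i]), type = "response")
  }
  # One AUC over all pooled held-out predictions.
  auc_rank(pred, y)
}

aucs <- apply(x, 2, loo_auc, y = y)  # one pooled LOO AUC per column
aucs
```

Two caveats. Pooling predictions from 55 different fits is a choice, and a pooled leave-one-out AUC can behave oddly for uninformative variables, so treat the numbers as rough. For the penalized model itself, cv.glmnet has a keep argument (keep=TRUE) that returns the held-out fitted values for every lambda, which could be pooled in the same way.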