Solved – Calculate LOO-AUC values using glmnet

I have a matrix (x) containing 55 samples (rows) and 10000 independent variables (columns). The observations are binary, healthy or ill {0,1} (y). I want to perform leave-one-out cross-validation and determine the area under the curve (AUC) for each of the variables. To do so, I need the nfolds parameter to be equal to the number of observations (i.e., 55). Am I right?

result=cv.glmnet(x,y,nfolds=55,type.measure="auc",family="binomial") 

And I'm getting these warnings:

Warning messages:
1: Too few (< 10) observations per fold for type.measure='auc' in cv.lognet; changed to type.measure='deviance'. Alternatively, use smaller value for nfolds
2: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold

What am I doing wrong?

I want to get LOO-AUCs for each variable.

I'd really appreciate any help. Thank you.

The glmnet documentation describes nfolds as follows:

"number of folds - default is 10. Although nfolds can be as large as the sample size (leave-one-out CV), it is not recommended for large datasets. Smallest value allowable is nfolds=3"

So you can indeed set nfolds equal to the sample size to perform leave-one-out CV.

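However, the problem you are facing, as the warning message indicates, is that to calculate the AUC (which needs a way to rank your test cases) glmnet requires at least 10 observations per fold. With leave-one-out CV each fold contains a single observation, so glmnet switches to type.measure='deviance' instead.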

Think about it: if there is only one test case, how are you supposed to rank it?
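To make the ranking argument concrete, here is a minimal base-R sketch of a rank-based (Mann-Whitney) AUC. The helper name auc_rank and the toy scores are made up for illustration and are not part of glmnet. It shows that the AUC is undefined when the test set holds a single case, because there is no positive/negative pair to compare.

```r
# Hypothetical helper (not a glmnet function): rank-based AUC.
# score: numeric predictions; label: 0/1 class labels.
auc_rank <- function(score, label) {
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  # Without at least one case of each class there is nothing to rank.
  if (n_pos == 0 || n_neg == 0) return(NA_real_)
  r <- rank(score)  # average ranks are used for ties
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Four test cases: three of the four positive/negative pairs are ordered correctly.
auc_four <- auc_rank(c(0.1, 0.4, 0.35, 0.8), c(0, 0, 1, 1))  # 0.75

# A single held-out case, as in each leave-one-out fold: AUC is undefined.
auc_one <- auc_rank(0.7, 1)  # NA
```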

This is only an issue because of the performance measure (AUC) you have chosen. Measures that do not require ranking, i.e., those that can be computed from a single test case, such as mean squared error, will not trigger this warning.
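If an AUC is still wanted under leave-one-out CV, one possible workaround (a sketch, not something glmnet does for you) is to keep the held-out predictions and compute a single AUC over all of them pooled together. The example below assumes the keep = TRUE argument of cv.glmnet, which returns the held-out fits in fit.preval. The data are simulated stand-ins for the real 55 x 10000 matrix, and auc_rank is a hypothetical helper. Note that pooling predictions from models fitted on different training sets is itself an assumption and can give a biased estimate.

```r
library(glmnet)

set.seed(1)
n <- 55
p <- 200                                  # smaller than 10000 to keep the sketch fast
x <- matrix(rnorm(n * p), n, p)           # simulated predictors (illustration only)
y <- rbinom(n, 1, 0.5)                    # simulated 0/1 outcome
x[y == 1, 1:5] <- x[y == 1, 1:5] + 1      # inject some signal into five columns

# Leave-one-out CV scored by deviance; keep = TRUE retains held-out predictions.
cvfit <- cv.glmnet(x, y, family = "binomial", nfolds = n,
                   type.measure = "deviance", grouped = FALSE, keep = TRUE)

preval <- cvfit$fit.preval                # one row per sample, one column per lambda

# Hypothetical helper (not part of glmnet): rank-based AUC, NA if undefined.
auc_rank <- function(score, label) {
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  if (n_pos == 0 || n_neg == 0) return(NA_real_)
  r <- rank(score)                        # NA scores propagate to an NA result
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# One pooled AUC per lambda, each computed over all n held-out predictions.
loo_auc <- apply(preval, 2, auc_rank, label = y)
best_auc <- max(loo_auc, na.rm = TRUE)
```

This yields one AUC per value of lambda for the fitted model, not one AUC per variable.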
