I'd like to generate ROC curves in R, but I'm confused about what input to give either ROCR or pROC.
I have 5000 data-points for which I know the true classification (1 or 0), and a continuous prediction score for each.
Should I use the 5000 classifications and prediction scores as input? Or am I supposed to manually apply a cutoff to the scores to turn them into binary predictions? Do I have to do that for multiple cutoff thresholds and supply them all to the roc function(s), or is it all done for me?
Thanks for your help!
The ROC curve relates the false positive rate, #FP/(#FP + #TN), to the true positive rate, #TP/(#TP + #FN). The false positive rate goes on the x-axis and the true positive rate on the y-axis.
As you said, one way to do this would be to calculate the above quantities for many values of your threshold, and plot these values against each other.
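To make the threshold sweep concrete, here is a minimal sketch in base R. The data are simulated purely for illustration (`labels`, `scores`, and the simulation parameters are assumptions, not your data), and a point is called positive when its score is at or above the threshold.

```r
# Simulated stand-in for 5000 labelled points (an assumption for illustration)
set.seed(42)
n <- 5000
labels <- rbinom(n, 1, 0.5)          # true classes, 0 or 1
scores <- rnorm(n, mean = labels)    # positives tend to score higher

# Candidate thresholds: every distinct score, plus Inf and -Inf so the
# curve runs from (0, 0) to (1, 1)
thresholds <- c(Inf, sort(unique(scores), decreasing = TRUE), -Inf)

# For each threshold, build the confusion counts and the two rates
roc_points <- t(sapply(thresholds, function(th) {
  pred <- scores >= th               # binary prediction at this cutoff
  tp <- sum(pred & labels == 1)
  fp <- sum(pred & labels == 0)
  fn <- sum(!pred & labels == 1)
  tn <- sum(!pred & labels == 0)
  c(fpr = fp / (fp + tn), tpr = tp / (tp + fn))
}))

# False positive rate on the x-axis, true positive rate on the y-axis
plot(roc_points[, "fpr"], roc_points[, "tpr"], type = "l",
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)                # reference line for a random classifier
```

In practice you should not need to do this by hand: as far as I know, both packages take the raw labels and continuous scores and sweep the thresholds internally, e.g. `roc(response, predictor)` in pROC, or `prediction(predictions, labels)` followed by `performance(pred, "tpr", "fpr")` in ROCR.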