Solved – Time complexity of leave-one-out cross validation for nonparametric regression

From Artificial Intelligence: A Modern Approach: Most nonparametric models have the advantage that it is easy to do leave-one-out cross-validation without having to recompute everything. With a k-nearest-neighbors model, for instance, when given a test example (x, y) we retrieve the k nearest neighbors once, compute the per-example loss L(y, h(x)) from them, and record … Read more
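The trick the book describes can be sketched in a few lines: build the neighbor index once, then for each training point query k+1 neighbors and discard the point itself, which is exactly what leaving that point out would give. A minimal sketch with scikit-learn (synthetic data, k=5 chosen only for illustration):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.neighbors import NearestNeighbors

# Leave-one-out CV for k-NN regression without refitting N times:
# fit the neighbor index once, then for each point query k+1 neighbors
# and drop the point itself (its own nearest neighbor at distance 0).
X, y = make_regression(n_samples=200, n_features=3, noise=0.5, random_state=0)
k = 5

nn = NearestNeighbors(n_neighbors=k + 1).fit(X)  # fit once on all data
_, idx = nn.kneighbors(X)                        # first column is the point itself
neighbor_idx = idx[:, 1:]                        # the k "true" neighbors
loo_pred = y[neighbor_idx].mean(axis=1)          # k-NN regression estimate
loo_mse = np.mean((y - loo_pred) ** 2)           # averaged per-example squared loss
print(f"LOO-CV MSE with k={k}: {loo_mse:.3f}")
```

This costs one index build plus N neighbor queries, rather than N full refits, which is the point the excerpt is making.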

Solved – Multi-Class probabilities of Random Forest inside caret Model

I'm facing a problem with the results of a multi-class random forest model. I want to use a) the predictions of the model and b) the class probabilities of those predictions for further work. I did a cross-validation, grouped by a variable I dismissed right afterwards, and trained a multi-class model using the following code: … Read more
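The question itself uses caret in R, but the general pattern of extracting both the hard predictions and the per-class probabilities can be sketched in scikit-learn (an analogue, not the asker's code): for a random forest, the hard prediction is the argmax of the probability row, so the two outputs agree by construction.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Get a) hard class predictions and b) per-class probabilities from the
# same fitted forest, then verify each prediction is the argmax of its row.
X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
pred = rf.predict(X_te)         # a) hard predictions
proba = rf.predict_proba(X_te)  # b) probabilities, one column per class
agree = (rf.classes_[proba.argmax(axis=1)] == pred).all()
print("predictions match argmax of probabilities:", agree)
```

In caret the equivalent is calling `predict(model, newdata, type = "prob")` alongside the default class predictions, provided the model was trained with `classProbs = TRUE` in `trainControl`.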

Solved – Cross validation for uneven groups using cv.glmnet

I am new to bioinformatics and machine learning. I am trying to predict a disease outcome, using cv.glmnet to choose the best lambda for the prediction. The problem I have is that the outcome groups are uneven (30 samples for outcome 0 and 14 samples for outcome 1). Therefore, in a 10-fold CV (even in a … Read more
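With only 14 minority-class samples, plain 10-fold CV can produce folds with no class-1 samples at all; stratified folds avoid this by preserving the 30:14 ratio inside every fold (in cv.glmnet this is done by passing a stratified `foldid`). A hedged scikit-learn analogue of the same setup, with sample counts and fold number chosen to mirror the question:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for the 30-vs-14 dataset. Stratified folds keep both
# classes present in every fold; 5 folds leaves ~3 minority samples per fold.
X, y = make_classification(n_samples=44, n_features=10,
                           weights=[30 / 44], random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for _, test_idx in cv.split(X, y):
    print(np.bincount(y[test_idx], minlength=2))  # both classes in each fold

# L1-penalized logistic regression with the penalty chosen by stratified CV,
# roughly what cv.glmnet does when given a stratified foldid.
model = LogisticRegressionCV(Cs=10, cv=cv, penalty="l1",
                             solver="liblinear", scoring="roc_auc").fit(X, y)
print("chosen C:", model.C_[0])
```

Reducing the number of folds (here 5 instead of 10) is the other common fix, since it guarantees more minority samples per held-out fold.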

Solved – Question about performing k-fold CV with caret

I have read the help manual of caret carefully: see A Short Introduction to the caret Package. In its example, I found it splits the data with createDataPartition before model training. library(caret) library(mlbench) data(Sonar) set.seed(107) inTrain <- createDataPartition(y = Sonar$Class, p = .75, list = FALSE) str(inTrain) training <- Sonar[inTrain,] testing <- Sonar[-inTrain,] And … Read more
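The workflow the caret vignette shows (hold out a test set first, then run k-fold CV on the training portion only) translates directly to scikit-learn. A sketch of that translation (a different dataset and classifier, chosen only for illustration):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# createDataPartition ~ a stratified 75/25 train/test split;
# trainControl(method = "cv") ~ k-fold CV run on the *training* part only.
# The held-out test set is touched once, at the very end.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.75,
                                          stratify=y, random_state=107)

knn = KNeighborsClassifier(n_neighbors=5)
cv_scores = cross_val_score(knn, X_tr, y_tr, cv=10)  # model-selection signal
test_acc = knn.fit(X_tr, y_tr).score(X_te, y_te)     # final unbiased estimate
print(f"CV accuracy: {cv_scores.mean():.3f}, test accuracy: {test_acc:.3f}")
```

The point of splitting before CV is that the CV folds guide model choice while the untouched test set gives one final, independent performance estimate.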

Solved – KNN classifier + cross validation

How can I find the mean and standard deviation of the error rate or accuracy of a k-fold cross-validation for a k-nearest-neighbour classification model? Best Answer: The mean and standard deviation of your metrics are calculated across the results of all cross-validation (CV) partitions. So, if you have 10 CV partitions with … Read more
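The answer's recipe is short in code: collect the per-fold accuracies and summarize them. A minimal sketch with scikit-learn (iris and k=5 are arbitrary choices for illustration); note that since error rate is 1 - accuracy, its standard deviation is identical to the accuracy's.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# 10-fold CV for a k-NN classifier: one accuracy per fold, then summarize
# across folds with the mean and standard deviation.
X, y = load_iris(return_X_y=True)
scores = cross_val_score(KNeighborsClassifier(n_neighbors=5), X, y, cv=10)

print("per-fold accuracy:", np.round(scores, 3))
print(f"accuracy: mean = {scores.mean():.3f}, std = {scores.std():.3f}")
print(f"error rate: mean = {1 - scores.mean():.3f}, std = {scores.std():.3f}")
```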

Solved – Why does the model consistently perform worse in cross-validation

Okay, so I run this model manually and get around 80-90% accuracy: mlp = MLPClassifier(hidden_layer_sizes=(50, 50), activation="logistic", max_iter=500) mlp.out_activation_ = "logistic" mlp.fit(X_train, Y_train) predictions = mlp.predict(X_test) print(confusion_matrix(Y_test, predictions)) print(classification_report(Y_test, predictions)) Then I do some 10-fold cross-validation: print(cross_val_score(mlp, X_test, Y_test, scoring='accuracy', cv=10)) And I get accuracy stats something like the following for each fold: … Read more
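One likely contributor, visible in the quoted call itself: `cross_val_score(mlp, X_test, Y_test, cv=10)` refits the network on nine tenths of the small *test* set in every fold, so each CV model trains on far less data than the manual run did. A sketch of the more comparable setup (cross-validate on the training data; the dataset, scaling step, and seed here are illustrative assumptions, not the asker's):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Cross-validate on the training portion, so each fold's model sees roughly
# as much data as the manually fitted one. Scaling inside a pipeline keeps
# the per-fold preprocessing leak-free.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

mlp = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(50, 50),
                                  activation="logistic", max_iter=500,
                                  random_state=0))
scores = cross_val_score(mlp, X_tr, y_tr, cv=10)  # CV on training data
print(f"10-fold CV accuracy on training data: {scores.mean():.3f}")
```

Also note that assigning `mlp.out_activation_` before fitting has no effect: trailing-underscore attributes are set by `fit`, and the output activation is determined by the target type.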
