Solved – How to concretely interpret the misclassification error

I'm reading about CART classification with rpart in R. After fitting the tree, we are supposed to compute the misclassification error.
Given that
y is the column that stores the classes,
x holds the predictor columns,
and fit = rpart(y ~ ., x),
how can we interpret the value W = sum(Y == predict(fit, x, type = "class")) / length(Y)?

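As written, W compares the actual classes with the predicted ones, so it is the proportion of observations classified correctly (the accuracy on the data used for prediction), and the misclassification error is 1 - W. It is not the proportion of fitted values assigned to a certain class.

Below is an example where the response is a binary variable (H or L). The proportion of fitted values assigned to each class would instead be length(fit.val[fit.val=="H"])/length(df$y) or length(fit.val[fit.val=="L"])/length(df$y), which is a different quantity from W.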

Finally, classification results are normally assessed with a confusion matrix. As shown in cm, the diagonal elements are correct classifications, while the off-diagonal elements are errors, whether false positives or false negatives. The mean misclassification error is therefore 1 minus the proportion classified correctly: 1 - sum(diag(cm))/sum(cm).

    library(rpart)

    set.seed(1237)
    # factor() keeps the response categorical whatever the stringsAsFactors default is
    df <- data.frame(y = factor(sample(c("H", "L"), 100, replace = TRUE)),
                     x = rnorm(100))
    fit <- rpart(y ~ x, data = df)

    # fitted values
    fit.val <- predict(fit, type = "class")

    # proportion of fitted values classified as H or L
    length(fit.val[fit.val == "H"]) / length(df$y)
    # [1] 0.51
    length(fit.val[fit.val == "L"]) / length(df$y)
    # [1] 0.49

    # confusion table
    cm <- table(actual = df$y, fitted = fit.val)
    cm
    #         fitted
    # actual  H  L
    #      H 36 11
    #      L 15 38

    # mean misclassification error
    mmce <- 1 - (sum(diag(cm)) / sum(cm))
    mmce
    # [1] 0.26
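To tie the confusion table back to the W in the question, here is a minimal sketch in base R that does not need rpart. It rebuilds label vectors from the four counts reported above (36, 11, 15, 38) instead of refitting the tree. The names `actual`, `fitted` and `class_err` are made up for illustration, and the per-class error rate is an extra summary that the example above does not compute.

```r
# Rebuild actual and fitted labels from the confusion-table counts above
# (rows = actual, columns = fitted): H/H = 36, H/L = 11, L/H = 15, L/L = 38.
actual <- factor(rep(c("H", "H", "L", "L"), times = c(36, 11, 15, 38)))
fitted <- factor(rep(c("H", "L", "H", "L"), times = c(36, 11, 15, 38)))

# W from the question: proportion of observations classified correctly
W <- sum(actual == fitted) / length(actual)

# Confusion table and the misclassification error derived from it
cm <- table(actual = actual, fitted = fitted)
mmce <- 1 - sum(diag(cm)) / sum(cm)

# Per-class error rate: share of each actual class that was misclassified
class_err <- 1 - diag(cm) / rowSums(cm)

print(W)                       # [1] 0.74
print(mmce)                    # [1] 0.26
print(all.equal(1 - W, mmce))  # [1] TRUE
print(round(class_err, 3))
```

On these counts W and mmce sum to 1, which is the sense in which W is the accuracy and 1 - W the misclassification error. Both are measured on the same data the tree was fitted to, so they are likely to look more favourable than they would on new data.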
