I'm reading about CART classification with rpart in R, and after fitting the tree we are supposed to compute the misclassification error,

given that

y is the column that stores the classes,

and x holds the predictor columns

and `fit=rpart(y~.,x)`

How can we interpret this value: `W=sum(Y==predict(fit,x,type="class"))/length(Y)`?


#### Best Answer

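As written, and assuming `Y` is the same vector as the class column y, the last formula counts the observations whose predicted class equals the observed class and divides by the number of observations. That is the proportion of correct classifications (the accuracy) on the data used for fitting, so `1 - W` is the misclassification error.

It should not be confused with the proportion of fitted values assigned to a certain class. In the example below the response is a binary variable (H or L), and that proportion would be `length(fit.val[fit.val=="H"])/length(df$y)` or `length(fit.val[fit.val=="L"])/length(df$y)`.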
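To make the reading of `W` concrete, here is a minimal sketch. It assumes the `kyphosis` data set that ships with rpart as a stand-in for the asker's data (the names `x`, `Y` and `W` follow the question; `acc` and `err` are illustrative names only).

```r
library(rpart)

# Assumption: kyphosis (bundled with rpart) plays the role of the asker's data;
# its Kyphosis column is the class column.
x <- kyphosis
Y <- x$Kyphosis

fit  <- rpart(Kyphosis ~ ., data = x)
pred <- predict(fit, x, type = "class")

# W exactly as in the question: share of rows where predicted == observed class
W <- sum(Y == pred) / length(Y)

# The same quantity written more compactly
acc <- mean(pred == Y)

# Misclassification error on the training data
err <- 1 - W

W
err
```

Because the predictions are made on the same rows the tree was fitted to, `err` here is a training error and will usually be optimistic compared with the error on new data.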

Finally, classification results are normally assessed with a confusion matrix. As shown in `cm`, the diagonal elements are correct classifications, while the off-diagonal elements are errors, whether false positives or false negatives. The mean misclassification error is therefore 1 minus the proportion of correct classifications: `1 - (sum(diag(cm))/sum(cm))`

```r
library(rpart)
set.seed(1237)
df <- data.frame(y = sample(c("H","L"), 100, replace = T), x = rnorm(100))
fit <- rpart(y ~ x, data = df)

# fitted values
fit.val <- predict(fit, type = "class")

# proportion that classified as H or L
length(fit.val[fit.val=="H"])/length(df$y)
# [1] 0.51
length(fit.val[fit.val=="L"])/length(df$y)
# [1] 0.49

# confusion table
cm <- table(actual = df$y, fitted = fit.val)
cm
#       fitted
# actual  H  L
#      H 36 11
#      L 15 38

# mean misclassification error
mmce <- 1 - (sum(diag(cm))/sum(cm))
mmce
# [1] 0.26
```
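The confusion-matrix calculation can also be wrapped in a small helper that reports the error per class as well. This is only a sketch: `misclass_summary` is a hypothetical function name, not part of rpart, and the `kyphosis` data set bundled with rpart is assumed as example data.

```r
library(rpart)

# Hypothetical helper (not an rpart function): overall and per-class
# misclassification error from observed and predicted labels.
misclass_summary <- function(actual, predicted) {
  # Use a shared set of levels so the table is square and diag() lines up,
  # even if some class is never predicted.
  lv <- union(levels(factor(actual)), levels(factor(predicted)))
  cm <- table(actual = factor(actual, levels = lv),
              fitted = factor(predicted, levels = lv))
  list(
    cm        = cm,
    overall   = 1 - sum(diag(cm)) / sum(cm),   # same formula as mmce
    per_class = 1 - diag(cm) / rowSums(cm)     # error rate within each actual class
  )
}

# Assumption: kyphosis stands in for real data
fit  <- rpart(Kyphosis ~ ., data = kyphosis)
pred <- predict(fit, type = "class")
res  <- misclass_summary(kyphosis$Kyphosis, pred)

res$cm
res$overall
res$per_class
```

The per-class rates are worth a look when the classes are unbalanced, since a low overall error can hide a high error on the rarer class.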
