I have built a CART model, but I want to understand how to predict on, and validate the results against, the validation data.
    library(rpart)       # rpart()
    library(rpart.plot)  # prp()

    # Select a 70% training sample; the remaining 30% is the test set
    d <- sort(sample(nrow(red_data), nrow(red_data) * .7))
    train_data <- red_data[d, ]
    test_data <- red_data[-d, ]
    nrow(train_data)  # 20133
    nrow(test_data)   # 8629

    # Assign the fitted tree to cfit so the calls below can use it
    cfit <- rpart(formula = red_status ~ VALUE_CLASS + GAP_1 + cs_rr1_2,
                  data = train_data, method = "class",
                  control = rpart.control(minsplit = 6))
    print(cfit)
    summary(cfit)
    prp(cfit, type = 2, extra = 106, nn = TRUE, fallen.leaves = TRUE)
This is what I have done.
Could someone help me with this?
Since we don't have access to your data, I am guessing that pages 39-40 of the manual will help you. I'll use the iris data set since it's readily available (and in the documentation):
    library(rpart)

    data(iris)
    d <- sort(sample(nrow(iris), nrow(iris) * .7))
    (iris.rpart <- rpart(Species ~ ., data = iris, subset = d, method = "class"))
    # n= 105
    #
    # node), split, n, loss, yval, (yprob)
    #       * denotes terminal node
    #
    # 1) root 105 68 setosa (0.3523810 0.3333333 0.3142857)
    #   2) Petal.Length< 2.45 37 0 setosa (1.0000000 0.0000000 0.0000000) *
    #   3) Petal.Length>=2.45 68 33 versicolor (0.0000000 0.5147059 0.4852941)
    #     6) Petal.Width< 1.75 40 5 versicolor (0.0000000 0.8750000 0.1250000) *
    #     7) Petal.Width>=1.75 28 0 virginica (0.0000000 0.0000000 1.0000000) *

    # Validating against the test set (the 30% of records held out) shows
    # that the model predicted 44 of 45 correctly. Not bad.
    table(predict(iris.rpart, iris[-d, ], type = "class"), iris[-d, "Species"])
    #              setosa versicolor virginica
    #   setosa         15          0         0
    #   versicolor      0         15         1
    #   virginica       0          0        14

    ?predict.rpart
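The "44 of 45" figure can also be computed rather than read off the table. Here is a minimal sketch of that, again on iris. The seed, the explicit train/test data frames, and the variable names (`train_idx`, `fit`, `conf_mat`, and so on) are my own choices for illustration, so the exact counts will differ from the output shown above.

```r
library(rpart)

set.seed(42)  # assumption: fixed seed so the split is reproducible
train_idx <- sort(sample(nrow(iris), round(nrow(iris) * 0.7)))
train_data <- iris[train_idx, ]
test_data <- iris[-train_idx, ]

# Fit the classification tree on the training rows only
fit <- rpart(Species ~ ., data = train_data, method = "class")

# Predicted class labels for the held-out rows
pred_class <- predict(fit, newdata = test_data, type = "class")

# Confusion matrix: rows are predictions, columns are observed classes
conf_mat <- table(predicted = pred_class, actual = test_data$Species)
print(conf_mat)

# Overall accuracy = correct predictions (the diagonal) / all test rows
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(accuracy)

# Class probabilities instead of labels, one column per class
pred_prob <- predict(fit, newdata = test_data, type = "prob")
head(pred_prob)
```

With the objects from the question, the equivalent call would presumably be `predict(cfit, newdata = test_data, type = "class")`, tabulated against `test_data$red_status`, provided the result of `rpart()` has been assigned to `cfit`.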
This could be a comment, but...
In all seriousness, probably the best help anyone can give you is "read the manual." I could also point you to the rattle package: it is a fairly simple point-and-click interface to the rpart package, along with several other modeling tools.
By the way: as noted on the 2nd page of the manual, CART is a trademarked name. "Recursive Partitioning" is a generic name, so you should probably describe your model accordingly.
I hope this helps.