How do you make a prediction from a decision tree using the confusionMatrix function in RStudio?
A confusion matrix is used to evaluate the predictions of a classification model such as a decision tree; the predictions themselves come from predict(). It summarizes the number of correct and incorrect predictions, broken down by class. In other words, it shows where the classification model gets confused when making predictions.
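As a small illustration of that idea, here is a minimal sketch that builds a confusion matrix by hand with base R's table(). The two vectors are made-up values for illustration only and do not come from the Titanic data.

```r
# Made-up actual and predicted labels (illustration only)
actual    <- factor(c("Yes", "No", "Yes", "Yes", "No", "No"),
                    levels = c("No", "Yes"))
predicted <- factor(c("Yes", "No", "No", "Yes", "Yes", "No"),
                    levels = c("No", "Yes"))

# Rows = actual class, columns = predicted class
cm <- table(actual = actual, predicted = predicted)
print(cm)
```

The diagonal cells count the correct predictions (4 here) and the off-diagonal cells count the mistakes (2 here).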
Here I use one of the most popular data sets, “Titanic”, as an example. I have built a decision tree model with the “rpart” package. The aim of the tree is to predict whether a person would have survived, based on the other variables. We then compare the predictions with the actual values in the “Survived” variable of the “Titanic” data set.
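One detail about this data set is worth checking before modelling. The built-in Titanic object is a contingency table, so as.data.frame(Titanic) returns 32 rows, one per combination of Class, Sex, Age and Survived, with the passenger counts in a Freq column. A formula such as Survived ~ . then treats Freq as a predictor. The sketch below is one possible alternative, not part of the original answer: it expands the counts to one row per passenger before fitting. The names passengers, tree_full, pred_full and cm_full are hypothetical.

```r
library(rpart)

titanic <- as.data.frame(Titanic)   # 32 rows with a Freq column of counts

# Repeat each row Freq times, then drop Freq (hypothetical name: passengers)
passengers <- titanic[rep(seq_len(nrow(titanic)), titanic$Freq),
                      c("Class", "Sex", "Age", "Survived")]

# Fit and evaluate on the per-passenger data
tree_full <- rpart(Survived ~ ., data = passengers, method = "class")
pred_full <- predict(tree_full, passengers, type = "class")
cm_full   <- table(actual = passengers$Survived, pred = pred_full)
print(cm_full)
print(sum(diag(cm_full)) / sum(cm_full))  # accuracy on the training data
```

Because the model is scored on the same rows it was trained on, this accuracy is likely to be optimistic compared with a held-out test set.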
Please follow the comments in the code below. It makes the predictions, builds the confusion matrix, and then works out the accuracy.
Code:
# Installing rpart package
install.packages("rpart")
# Loading the package
library(rpart)
# Set random seed
set.seed(1)
# Loading Titanic data
data("Titanic")
# Change data from table to data frame
titanic <- as.data.frame(Titanic)
# Check structure of data
str(titanic)
# Using rpart to build the decision tree classification model
tree <- rpart(Survived ~ ., data = titanic, method = "class")
# Using predict() to make predictions and storing the result in pred
pred <- predict(tree, titanic, type = "class")
# table() is used here to build the confusion matrix
# (rows = actual values, columns = predicted values)
table(titanic$Survived, pred)
# Accuracy calculation ("No" is treated as the positive class)
#        pred
#        No        Yes
# No     13 (TP)   3 (FN)
# Yes    5 (FP)    11 (TN)
# Accuracy = (TP + TN) / (TP + FN + FP + TN)
# Accuracy = (13 + 11) / (13 + 3 + 5 + 11)
#          = 24 / 32 = 0.75 = 75%
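The same arithmetic can be done in code instead of by hand. The sketch below re-enters the counts from the comments above as a matrix and computes the accuracy from its diagonal. The last part is optional and is my assumption about how the confusionMatrix() function from the caret package could be applied here: it runs only if caret is installed, and the matrix is transposed because caret, as far as I know, expects predictions in rows and actual (reference) values in columns.

```r
# Counts copied from the comments above: rows = actual, columns = predicted
cm <- matrix(c(13, 5, 3, 11), nrow = 2,
             dimnames = list(actual = c("No", "Yes"),
                             pred   = c("No", "Yes")))

# Correct predictions sit on the diagonal
accuracy <- sum(diag(cm)) / sum(cm)
print(accuracy)  # 0.75

# Optional: caret's confusionMatrix(), only if the package is installed.
# Assumption: caret wants predictions in rows and reference (actual)
# values in columns, so the matrix is transposed first.
if (requireNamespace("caret", quietly = TRUE)) {
  print(caret::confusionMatrix(as.table(t(cm)), positive = "No"))
}
```

With the fitted tree at hand, the equivalent call on the factors themselves would be caret::confusionMatrix(data = pred, reference = titanic$Survived), which also reports accuracy alongside other statistics.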