To import the Auto data set into RStudio:
library("ISLR")
data(Auto)
View(Auto)
Then provide the necessary code for the following:
a. Use the vehicle name to name the rows, and then remove the variable "name" from the data set, since it is not of use for modelling.
b. Split the data into a training set and a test set.
c. Fit a regression tree to the training set. Report the training error obtained.
d. Plot the tree.
e. How many terminal nodes are in the tree? Select one of the terminal nodes and interpret the information relevant to that node.
f. Report the test error obtained.
g. Use cross-validation to determine the optimal level of tree complexity. Plot the pruned tree. How many terminal nodes are in the pruned tree?
h. Compare the training and test errors of the pruned tree with those of the full tree.
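
One possible way to approach steps a to h is sketched below. The question leaves several choices open, so the following are assumptions rather than requirements: the tree package is used for fitting, mpg is taken as the response, the split is 70/30 with an arbitrary seed, and "error" is measured as mean squared error. The helper mse() is a hypothetical name introduced only for this sketch.

```r
library(ISLR)  # provides the Auto data set
library(tree)  # assumption: the tree package is used for regression trees

data(Auto)

# a. Name the rows by vehicle, then drop the name column.
#    Some vehicle names repeat, so make.unique() adds suffixes to keep
#    the row names distinct.
rownames(Auto) <- make.unique(as.character(Auto$name))
Auto$name <- NULL

# b. Split into training and test sets (assumed 70/30 split, arbitrary seed).
set.seed(1)
train_idx <- sample(nrow(Auto), floor(0.7 * nrow(Auto)))
train <- Auto[train_idx, ]
test  <- Auto[-train_idx, ]

# Hypothetical helper: mean squared error of a fitted tree on a data frame.
mse <- function(model, data) {
  mean((data$mpg - predict(model, newdata = data))^2)
}

# c. Fit a regression tree (assumed response: mpg) and report training error.
full_tree <- tree(mpg ~ ., data = train)
train_mse_full <- mse(full_tree, train)

# d. Plot the tree.
plot(full_tree)
text(full_tree, pretty = 0)

# e. Count terminal nodes; print(full_tree) lists each node's split,
#    number of observations, deviance and predicted mpg for interpretation.
n_leaves_full <- sum(full_tree$frame$var == "<leaf>")
print(full_tree)

# f. Test error of the full tree.
test_mse_full <- mse(full_tree, test)

# g. Cross-validation to choose the tree size, then prune.
cv_out <- cv.tree(full_tree)
plot(cv_out$size, cv_out$dev, type = "b",
     xlab = "Number of terminal nodes", ylab = "CV deviance")

# Ignore the single-node tree so the pruned tree can still be plotted.
candidates <- cv_out$size > 1
best_size  <- cv_out$size[candidates][which.min(cv_out$dev[candidates])]

pruned_tree <- prune.tree(full_tree, best = best_size)
plot(pruned_tree)
text(pruned_tree, pretty = 0)
n_leaves_pruned <- sum(pruned_tree$frame$var == "<leaf>")

# h. Compare training and test errors of the full and pruned trees.
results <- data.frame(
  tree      = c("full", "pruned"),
  leaves    = c(n_leaves_full, n_leaves_pruned),
  train_mse = c(train_mse_full, mse(pruned_tree, train)),
  test_mse  = c(test_mse_full, mse(pruned_tree, test))
)
print(results)
```

The numbers in results depend on the seed and the split proportion, so they are not reproduced here. If cross-validation selects the full size, the pruned tree is identical to the full tree and the two rows match.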