Question

To import the Auto dataset into RStudio:

library("ISLR")
data(Auto)
View(Auto)

Then provide the necessary code for the following:

a. Use the vehicle name to name the rows, then remove the name variable
from the data set, since it is not useful for modelling.
b. Split the data into a training set and a test set.
c. Fit a regression tree to the training set. Report the training error obtained.
d. Plot the tree.
e. How many terminal nodes are in the tree? Select one of the terminal nodes
and interpret the information relevant to that node.
f. Report the test error obtained.
g. Use cross-validation in order to determine the optimal level of tree complexity.
Plot the pruned tree. How many terminal nodes are in the pruned tree?
h. Compare the training error and test error of the pruned tree and full tree.
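
One possible way to work through parts a to h is sketched below. It is not an official solution and rests on several assumptions the question does not state: the response is taken to be mpg, the split is a random 50/50 split with an arbitrary seed, the tree is fitted with the tree package (which must be installed alongside ISLR), and "error" is measured as mean squared error. The helper mse() and all object names are illustrative only.

```r
# Sketch only: assumes the ISLR and tree packages are installed.
library(ISLR)
library(tree)
data(Auto)

# a. Use the vehicle names as row names, then drop the name column.
#    Some names repeat in Auto, so make.unique() adds suffixes to duplicates.
auto <- Auto
rownames(auto) <- make.unique(as.character(auto$name))
auto$name <- NULL

# b. Random 50/50 split (assumption); the seed is fixed for reproducibility.
set.seed(1)
train_idx <- sample(nrow(auto), nrow(auto) %/% 2)
train <- auto[train_idx, ]
test  <- auto[-train_idx, ]

# Hypothetical helper: mean squared error of a fitted tree on a data set.
mse <- function(model, data) {
  mean((data$mpg - predict(model, newdata = data))^2)
}

# c. Fit a regression tree for mpg (assumed response) and get the training MSE.
fit <- tree(mpg ~ ., data = train)
summary(fit)
train_mse <- mse(fit, train)

# d. Plot the tree with split labels.
plot(fit)
text(fit, pretty = 0)

# e. Count terminal nodes. Printing the tree lists, for each node, the split,
#    number of observations, deviance, and predicted mpg; leaves end with *.
n_leaves <- sum(fit$frame$var == "<leaf>")
print(fit)

# f. Test MSE of the full tree.
test_mse <- mse(fit, test)

# g. Cross-validate over tree sizes and prune to the size with lowest deviance.
#    Size 1 (root only) is excluded so the pruned tree can still be plotted.
cv_fit <- cv.tree(fit)
plot(cv_fit$size, cv_fit$dev, type = "b",
     xlab = "Terminal nodes", ylab = "CV deviance")
ok <- cv_fit$size > 1
best_size <- cv_fit$size[ok][which.min(cv_fit$dev[ok])]
pruned <- prune.tree(fit, best = best_size)
plot(pruned)
text(pruned, pretty = 0)
n_leaves_pruned <- sum(pruned$frame$var == "<leaf>")

# h. Compare training and test MSE for the full and pruned trees.
errors <- data.frame(
  tree      = c("full", "pruned"),
  leaves    = c(n_leaves, n_leaves_pruned),
  train_mse = c(train_mse, mse(pruned, train)),
  test_mse  = c(test_mse, mse(pruned, test))
)
print(errors)
```

The numbers depend on the seed and the split proportion, so they will change if either is altered. If cross-validation selects the full tree size, the pruned and full trees coincide and the two rows of the comparison table will match.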
