Question

To import the Carseats dataset into RStudio:

library("ISLR")

data(Carseats)

View(Carseats)

Then, provide the necessary code for the following:

a. Split the data into a training set and a test set.

b. Fit a linear model using least squares on the training set to predict Sales using the entire collection of predictors.

Report Cp, BIC, R², and RSS for this model.

c. Use the fitted model to predict responses for the test data and report the test error (RSS) obtained.

d. Compare the performance of (i) best subset selection; (ii) forward subset selection; (iii) backward subset selection.

For each method:

Use the training set to select the best model for each number of predictors.

Plot Cp, BIC, adjusted R², and RSS for all of the models at once.

Select one of the models as your final model. Report the model and the training error (RSS) for your chosen model.

Use your chosen model to fit the test data and report the test error (RSS) obtained.

Solutions

Expert Solution

Answer:

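R code for the problem is given below. Part b uses lm() for least squares, and part d uses regsubsets() from the leaps package for subset selection.

library(ISLR)
library(leaps) # regsubsets() for part d

part a

smp_size <- floor(0.5 * nrow(Carseats))

## set the seed to make your partition reproducible
set.seed(123)
train_ind <- sample(seq_len(nrow(Carseats)), size = smp_size)

train <- Carseats[train_ind, ]
test <- Carseats[-train_ind, ]

part b

fit <- lm(Sales ~ ., data = train) # least squares on all predictors
summary(fit) # coefficients and R2

n <- nrow(train)
d <- length(coef(fit)) - 1 # predictors after dummy coding, intercept excluded
rss <- sum(resid(fit)^2) # training RSS
sigma2 <- rss / (n - d - 1) # error variance estimate from the full model
cp <- (rss + 2 * d * sigma2) / n # Cp as defined in ISLR
bic <- BIC(fit) # likelihood-based BIC; regsubsets() reports BIC on a different scale
r2 <- summary(fit)$r.squared
c(Cp = cp, BIC = bic, R2 = r2, RSS = rss)

part c

pred <- predict(fit, newdata = test)
sum((test$Sales - pred)^2) # test RSS

part d

# RSS of the k-predictor model from a regsubsets fit, evaluated on data frame df
rss_on <- function(rs, k, df) {
  b <- coef(rs, id = k)
  X <- model.matrix(Sales ~ ., data = df)[, names(b), drop = FALSE]
  sum((df$Sales - drop(X %*% b))^2)
}

methods <- c(best = "exhaustive", forward = "forward", backward = "backward")
par(mfrow = c(3, 4)) # one row of four plots per method
for (m in names(methods)) {
  rs <- regsubsets(Sales ~ ., data = train, nvmax = d, method = methods[[m]])
  s <- summary(rs)
  plot(s$cp, type = "b", xlab = "Number of predictors", ylab = "Cp", main = m)
  plot(s$bic, type = "b", xlab = "Number of predictors", ylab = "BIC", main = m)
  plot(s$adjr2, type = "b", xlab = "Number of predictors", ylab = "Adjusted R2", main = m)
  plot(s$rss, type = "b", xlab = "Number of predictors", ylab = "RSS", main = m)
  k <- which.min(s$bic) # final model: lowest BIC is one reasonable choice
  print(coef(rs, id = k)) # the chosen model
  cat(m, "training RSS:", s$rss[k], "test RSS:", rss_on(rs, k, test), "\n")
}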
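
To make explicit what forward subset selection in part d of the question does, here is a minimal base R sketch. It is an illustration under stated assumptions, not the packaged implementation: the helper name forward_select is hypothetical, each dummy-coded column is treated as a separate candidate, and the split repeats the 50/50 partition with seed 123. At each step it adds the column that lowers the training RSS the most.

```r
library(ISLR)

# Hypothetical helper: forward stepwise selection by training RSS.
forward_select <- function(X, y) {
  chosen <- character(0)
  remaining <- colnames(X)
  rss_path <- numeric(0)
  while (length(remaining) > 0) {
    # RSS of the current model plus each remaining candidate column
    rss_try <- sapply(remaining, function(v) {
      f <- lm.fit(cbind(1, X[, c(chosen, v), drop = FALSE]), y)
      sum(f$residuals^2)
    })
    best <- names(which.min(rss_try))
    chosen <- c(chosen, best)
    remaining <- setdiff(remaining, best)
    rss_path <- c(rss_path, min(rss_try))
  }
  data.frame(step = seq_along(chosen), added = chosen, rss = rss_path)
}

# Same 50/50 split as in part a (assumed seed 123)
set.seed(123)
train_ind <- sample(seq_len(nrow(Carseats)), size = floor(0.5 * nrow(Carseats)))
train <- Carseats[train_ind, ]

# Dummy-code the factors and drop the intercept column
X <- model.matrix(Sales ~ ., data = train)[, -1]
path <- forward_select(X, train$Sales)
print(path)
```

The rss column can only go down as predictors are added, which is why training RSS alone cannot pick the model size and criteria such as Cp, BIC, or adjusted R² are needed.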

