Using the ToyotaCorolla file, what appear to be the 3-4 most important car specifications for predicting the car's price?
Use the following R code to fit the model:

    # Read the data (change the path to match your file's location)
    a <- read.csv("C:\\Users\\Vibhor\\Desktop\\ToyotaCorolla.csv")
    attach(a)

    # Keep the price and the candidate predictor columns
    corolla <- a[c("Price", "Age_08_04", "KM", "HP", "cc", "Doors", "Gears", "Quarterly_Tax", "Weight")]
    View(corolla)
    summary(corolla)

    # Regress Price on all remaining columns
    model <- lm(Price ~ ., data = corolla)
    summary(model)
Output of summary(model):

    ##
    ## Call:
    ## lm(formula = Price ~ ., data = corolla)
    ##
    ## Residuals:
    ##     Min      1Q  Median      3Q     Max
    ## -9366.4  -793.3   -21.3   799.7  6444.0
    ##
    ## Coefficients:
    ##                 Estimate Std. Error t value Pr(>|t|)
    ## (Intercept)   -5.573e+03  1.411e+03  -3.949 8.24e-05 ***
    ## Age_08_04     -1.217e+02  2.616e+00 -46.512  < 2e-16 ***
    ## KM            -2.082e-02  1.252e-03 -16.622  < 2e-16 ***
    ## HP             3.168e+01  2.818e+00  11.241  < 2e-16 ***
    ## cc            -1.211e-01  9.009e-02  -1.344  0.17909
    ## Doors         -1.617e+00  4.001e+01  -0.040  0.96777
    ## Gears          5.943e+02  1.971e+02   3.016  0.00261 **
    ## Quarterly_Tax  3.949e+00  1.310e+00   3.015  0.00262 **
    ## Weight         1.696e+01  1.068e+00  15.880  < 2e-16 ***
    ## ---
    ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    ##
    ## Residual standard error: 1342 on 1427 degrees of freedom
    ## Multiple R-squared:  0.8638, Adjusted R-squared:  0.863
    ## F-statistic:  1131 on 8 and 1427 DF,  p-value: < 2.2e-16
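The significant variables can be identified from their p-values (the rows marked with stars): Age_08_04, KM, HP, Gears, Quarterly_Tax, and Weight. cc and Doors are not significant at the 0.05 level.
Ranked by the absolute size of the t value, the four strongest predictors are Age_08_04 (-46.5), KM (-16.6), Weight (15.9), and HP (11.2), so these appear to be the most important specifications for predicting price.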
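
To narrow a list like this down to the 3-4 strongest predictors, one option is to sort the t values by magnitude. The sketch below is not part of the original answer. It copies the t values by hand from the summary output above so that it runs without the CSV file, and the names t_values, ranked, and top4 are made up for illustration. With the fitted model available, coef(summary(model))[, "t value"] should give the same column. Note that a large absolute t value indicates strong evidence of an effect, which is only one of several ways to judge importance.

```r
# t values copied by hand from the summary(model) output (intercept omitted)
t_values <- c(
  Age_08_04     = -46.512,
  KM            = -16.622,
  HP            =  11.241,
  cc            =  -1.344,
  Doors         =  -0.040,
  Gears         =   3.016,
  Quarterly_Tax =   3.015,
  Weight        =  15.880
)

# Sort by absolute t value, largest first
ranked <- t_values[order(abs(t_values), decreasing = TRUE)]

# Names of the four predictors with the largest |t|
top4 <- names(ranked)[1:4]

print(ranked)
print(top4)
```

Here top4 contains Age_08_04, KM, Weight, and HP, in that order.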