Question

We will use the dataset **Auto{ISLR}** to develop a binomial classification model to predict the likelihood of automobiles having high gas mileage.

code in R:

First, load the **{ISLR}** library. Since the dataset has no dummy variable that classifies vehicles as high or low gas mileage, use the quantitative miles-per-gallon variable **mpg** to create a binary variable called **mpg.hi** that flags vehicles whose **mpg** is above the **median mpg**. Calculate the **median mpg** value with the `median()` function and store the result in an object named **med.mpg**. Then use the `ifelse()` function to create a column named **mpg.hi** in the **Auto** dataset, with a value of 1 if `mpg>med.mpg` and 0 otherwise.

For a quick visual inspection, display the **med.mpg** value, then use `cbind()` to display a 2-column data frame with the first 20 values of **mpg** and **mpg.hi**. Please label the columns as shown below. Don't answer this, but quickly verify that your **mpg.hi** variable was created correctly.
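The steps described in the question could be sketched as follows. This is a minimal sketch, assuming a data frame with a numeric `mpg` column; the helper name `add_mpg_hi` and the toy values are hypothetical and are used only so that the block runs without the ISLR package.

```r
# Hypothetical helper: adds a binary column mpg.hi
# (1 if mpg is strictly above the median mpg, 0 otherwise).
add_mpg_hi <- function(df) {
  med.mpg <- median(df$mpg)
  df$mpg.hi <- ifelse(df$mpg > med.mpg, 1, 0)
  list(data = df, med.mpg = med.mpg)
}

# Toy data (made-up values, NOT the Auto dataset).
toy <- data.frame(mpg = c(18, 15, 31, 26, 17, 24))
res <- add_mpg_hi(toy)

# Display the median used as the cutoff.
res$med.mpg

# Two-column view of (up to) the first 20 rows, with labeled columns.
check <- cbind(mpg = res$data$mpg, mpg.hi = res$data$mpg.hi)
head(check, 20)

# With ISLR installed, the same steps on the real data would be:
# library(ISLR); res <- add_mpg_hi(Auto); Auto <- res$data
```

Returning the median alongside the data keeps the cutoff available for display, which is what the quick visual check asks for.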

Solutions

Expert Solution

> library(ISLR)
> head(Auto)
  mpg cylinders displacement horsepower weight acceleration year origin
1  18         8          307        130   3504         12.0   70      1
2  15         8          350        165   3693         11.5   70      1
3  18         8          318        150   3436         11.0   70      1
4  16         8          304        150   3433         12.0   70      1
5  17         8          302        140   3449         10.5   70      1
6  15         8          429        198   4341         10.0   70      1
                       name bin_mpg mpg_hi
1 chevrolet chevelle malibu       0      0
2         buick skylark 320       0      0
3        plymouth satellite       0      0
4             amc rebel sst       0      0
5               ford torino       0      0
6          ford galaxie 500       0      0
> med_mpg = median(Auto$mpg)
> med_mpg
[1] 22.75
> mpg_hi = ifelse(Auto$mpg > med_mpg, "mpg_hi", "mpg_low")
> Auto$mpg_hi = ifelse(Auto$mpg > med_mpg, 1, 0)
> head(data.frame(Auto$mpg,Auto$mpg_hi))
  Auto.mpg Auto.mpg_hi
1       18           0
2       15           0
3       18           0
4       16           0
5       17           0
6       15           0
> summary(glm(mpg_hi ~ weight + year + cylinders+horsepower+ displacement+ acceleration,family = "binomial", data = Auto))

Call:
glm(formula = mpg_hi ~ weight + year + cylinders + horsepower +
displacement + acceleration, family = "binomial", data = Auto)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.1999  -0.1126   0.0115   0.2249   3.3019

Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)  -15.828787   5.653426  -2.800  0.00511 **
weight        -0.003986   0.001085  -3.673  0.00024 ***
year           0.414204   0.072700   5.697 1.22e-08 ***
cylinders     -0.015009   0.405220  -0.037  0.97045
horsepower    -0.035608   0.023543  -1.512  0.13042
displacement  -0.006745   0.009961  -0.677  0.49831
acceleration   0.007983   0.141357   0.056  0.95497
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 543.43 on 391 degrees of freedom
Residual deviance: 159.34 on 385 degrees of freedom
AIC: 173.34

Number of Fisher Scoring iterations: 8
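
One way to read the logistic coefficients is on the odds scale, by exponentiating them. The sketch below copies three estimates from the summary output; the object names `est` and `odds_ratio` are illustrative assumptions, not part of the original session.

```r
# Coefficient estimates copied from the summary output (log-odds scale).
est <- c(weight = -0.003986, year = 0.414204, horsepower = -0.035608)

# Exponentiating a log-odds coefficient gives an odds ratio per one-unit increase.
odds_ratio <- exp(est)
round(odds_ratio, 3)

# With a fitted model object, e.g. fit <- glm(...), the equivalent is exp(coef(fit)).
```

Under this reading, and holding the other predictors fixed, each additional model year multiplies the estimated odds of high mpg by roughly 1.5, while each extra pound of weight lowers them slightly.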

> fit1 = glm(as.factor(mpg_hi) ~ weight + year + cylinders+horsepower+ displacement+ acceleration,family = "binomial", data = Auto)
> pred = predict(fit1,type="response")
> pred_mpg_hi = ifelse(pred>=0.5,1,0)
> table(mpg_hi,pred_mpg_hi)
         pred_mpg_hi
mpg_hi      0   1
  mpg_hi   16 180
  mpg_low 173  23
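
The table can be summarized as an overall accuracy. The following is a minimal sketch that rebuilds the table from the counts shown; the names `conf`, `correct`, and `accuracy` are illustrative assumptions. Because `predict()` was run on the same data used to fit the model, this is a training accuracy, not an out-of-sample estimate.

```r
# Counts copied from the table: rows = actual class, columns = predicted class.
conf <- matrix(c(16, 180,
                 173, 23),
               nrow = 2, byrow = TRUE,
               dimnames = list(actual = c("mpg_hi", "mpg_low"),
                               predicted = c("0", "1")))

# Correct predictions: actual high predicted as 1, actual low predicted as 0.
correct <- conf["mpg_hi", "1"] + conf["mpg_low", "0"]

# Share of all vehicles classified correctly at the 0.5 cutoff.
accuracy <- correct / sum(conf)
accuracy
```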

