The remaining questions in this workbook assignment are based on the following scenario.
Information was collected from a reputable internet site for an
investigation into the fuel economy of various makes of car and
whether these varied over time. Data was collected for fifteen
different models of vehicle from twelve different manufacturers.
After collection, the observed vehicle models were classified as
sedan, sports or 4WD.
The data for sedans is provided in the file sedan_economy.csv, also
available in the same folder as this assignment. The variables in
this file are:
decade: the decade in which the car was produced (1991-2000,
2001-2010, 2011-present);
economy: fuel economy (L/100km);
cylinders: the number of cylinders in the engine;
capacity: the engine capacity (L);
fuel: the fuel type (petrol or diesel);
transmission: the transmission of the car (manual or
automatic).
   | decade    | economy | cylinders | capacity | fuel   | transmission
1  | 1991-2000 | 7.2     | 4         | 1.3      | petrol | manual
2  | 2001-2010 | 6.8     | 4         | 1.6      | petrol | manual
3  | 2011+     | 5.7     | 4         | 1.5      | petrol | manual
4  | 2011+     | 4.9     | 3         | 1.0      | petrol | manual
5  | 1991-2000 | 7.2     | 4         | 1.8      | petrol | manual
6  | 2001-2010 | 7.5     | 4         | 2.0      | petrol | manual
7  | 2001-2010 | 8.4     | 4         | 2.0      | petrol | automatic
8  | 2011+     | 6.4     | 4         | 2.0      | petrol | manual
9  | 2011+     | 6.1     | 4         | 2.0      | petrol | automatic
10 | 1991-2000 | 7.1     | 4         | 1.8      | petrol | manual
11 | 1991-2000 | 7.6     | 4         | 1.8      | petrol | automatic
12 | 2001-2010 | 7.1     | 4         | 1.8      | petrol | manual
13 | 2001-2010 | 7.5     | 4         | 1.8      | petrol | automatic
14 | 2011+     | 7.2     | 4         | 1.8      | petrol | manual
15 | 2011+     | 6.7     | 4         | 1.8      | petrol | automatic
16 | 1991-2000 | 7.2     | 6         | 2.8      | petrol | manual
17 | 2001-2010 | 7.7     | 4         | 2.0      | petrol | manual
18 | 2001-2010 | 7.4     | 4         | 2.0      | petrol | automatic
19 | 2011+     | 6.4     | 4         | 2.0      | petrol | manual
20 | 2011+     | 6.8     | 4         | 2.0      | petrol | automatic
21 | 2011+     | 3.6     | 4         | 1.6      | diesel | manual
22 | 1991-2000 | 6.0     | 4         | 1.5      | petrol | manual
23 | 1991-2000 | 6.1     | 4         | 1.5      | petrol | automatic
24 | 2001-2010 | 7.3     | 4         | 1.6      | petrol | manual
25 | 2001-2010 | 7.5     | 4         | 1.6      | petrol | automatic
26 | 2001-2010 | 4.7     | 4         | 2.0      | diesel | manual
27 | 2011+     | 6.7     | 4         | 1.6      | petrol | manual
28 | 2011+     | 7.0     | 4         | 1.6      | petrol | automatic
29 | 2011+     | 4.3     | 4         | 1.8      | diesel | manual
30 | 1991-2000 | 9.2     | 6         | 4.0      | petrol | manual
31 | 2001-2010 | 10.8    | 6         | 4.0      | petrol | manual
32 | 2001-2010 | 10.3    | 6         | 4.0      | petrol | automatic
33 | 2011+     | 11.3    | 6         | 4.0      | petrol | manual
34 | 2011+     | 10.1    | 6         | 4.0      | petrol | automatic
35 | 1991-2000 | 9.2     | 6         | 3.0      | petrol | manual
36 | 1991-2000 | 9.0     | 6         | 3.0      | petrol | automatic
37 | 2001-2010 | 11.7    | 6         | 3.5      | petrol | manual
38 | 2011+     | 10.3    | 6         | 3.7      | petrol | manual
39 | 2011+     | 10.4    | 6         | 3.7      | petrol | automatic
40 | 1991-2000 | 10.5    | 8         | 5.0      | petrol | manual
41 | 1991-2000 | 11.3    | 8         | 5.0      | petrol | automatic
42 | 2001-2010 | 14.3    | 8         | 6.0      | petrol | manual
43 | 2001-2010 | 14.2    | 8         | 6.0      | petrol | automatic
44 | 2011+     | 11.6    | 8         | 6.0      | petrol | manual
45 | 2011+     | 11.7    | 8         | 6.0      | petrol | automatic
46 | 1991-2000 | 9.8     | 8         | 5.5      | petrol | automatic
47 | 2001-2010 | 14.1    | 8         | 6.2      | petrol | automatic
48 | 2011+     | 9.8     | 8         | 5.5      | petrol | automatic
The sample averages of fuel economy for manual and automatic
sedans in this sample are 7.854 L/100km and 9.09 L/100km,
respectively. Based on this, one of the investigators has suggested
that it may be possible to predict the transmission type of a sedan
from its fuel economy, using a formal statistical analysis.
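The two sample averages quoted above can be checked directly from the table. The sketch below is only an illustration: it types the economy column in by hand instead of reading sedan_economy.csv, and the names economy, auto_rows, transmission and group_means are made up for this example.

```r
# Fuel economy (L/100km) for the 48 sedans, in the row order of the table.
economy <- c(7.2, 6.8, 5.7, 4.9, 7.2, 7.5, 8.4, 6.4, 6.1, 7.1, 7.6, 7.1,
             7.5, 7.2, 6.7, 7.2, 7.7, 7.4, 6.4, 6.8, 3.6, 6.0, 6.1, 7.3,
             7.5, 4.7, 6.7, 7.0, 4.3, 9.2, 10.8, 10.3, 11.3, 10.1, 9.2, 9.0,
             11.7, 10.3, 10.4, 10.5, 11.3, 14.3, 14.2, 11.6, 11.7, 9.8, 14.1, 9.8)

# Row numbers of the automatic sedans; every other row is a manual.
auto_rows <- c(7, 9, 11, 13, 15, 18, 20, 23, 25, 28, 32, 34, 36, 39,
               41, 43, 45, 46, 47, 48)
transmission <- rep("manual", length(economy))
transmission[auto_rows] <- "automatic"

# Mean fuel economy within each transmission group.
group_means <- tapply(economy, transmission, mean)
print(round(group_means, 3))
```

With the file loaded as a data frame, the same check would be tapply(dat$economy, dat$transmission, mean).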
(a) Using R, create a new variable isauto that takes the value 1 if
the transmission type is automatic, or a 0 if the transmission type
is manual. You should show the command you used to do this.
(b) You should then fit an appropriate statistical model that
describes the probability of a randomly chosen car having an
automatic transmission, where the fuel economy is a predictor. You
should ensure you show the command you used to do this, and provide
the coefficient/summary table.
(c) Is fuel economy a significant predictor of the probability of a
randomly chosen car having an automatic transmission?
(d) Your friend is looking to purchase a sedan with a manual
transmission from an online seller, who has not included the
transmission type of the car in the advertisement description.
However, the seller has stated in the advert that his 60L tank
gives a range of 730km. Whether or not you found fuel economy to be
a significant predictor in (c), calculate the average fuel economy
and hence determine the model estimate of the probability of the
car having a manual transmission.
Input:
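# Read the sedan data into a data frame called dat
dat <- read.csv("sedan_economy.csv", sep = ",")
# Lookup table: automatic -> 1, manual -> 0
lookup <- c("automatic" = 1, "manual" = 0)
# New 0/1 column, found by looking up each car's transmission type
dat$new_transmission <- lookup[dat$transmission]
dat
# Linear model of the 0/1 transmission indicator on fuel economy
model <- lm(new_transmission ~ economy, data = dat)
summary(model)
# Car from the advert: 730 km on a 60 L tank (730/60 is in km/L)
econ <- data.frame(
  economy = c(730/60)
)
predict(model, newdata = econ)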
OUTPUT:
a)
We read the data into R using the read.csv function.
Next, we created a named vector lookup where "automatic"=1, "manual"=0.
We added a new column to the data frame using
dat$new_transmission<-lookup[dat$transmission]
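Part (a) asks for the variable to be called isauto. A minimal sketch of one common alternative to the lookup vector is ifelse, shown here on a small stand-in data frame (toy is a made-up name holding rows 1, 7 and 9 of the table) so that it runs without the CSV file:

```r
# Three rows from the table, standing in for the full sedan_economy.csv data.
toy <- data.frame(
  economy = c(7.2, 8.4, 6.1),
  transmission = c("manual", "automatic", "automatic")
)

# 1 if the car is automatic, 0 if it is manual.
# The comparison works whether transmission is stored as character or factor.
toy$isauto <- ifelse(toy$transmission == "automatic", 1, 0)
print(toy)
```

On the full data the same line would read dat$isauto <- ifelse(dat$transmission == "automatic", 1, 0).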
b)
We fitted a model for new_transmission with the economy as a predictor
model<-lm(new_transmission~economy,data = dat)
c)
The summary function gave us a summary of the model.
The p-value is greater than 0.05, so fuel economy is not a significant predictor at the 5% level. The adjusted R-squared is 0.038,
i.e. only 3.8% of the variance in the response variable is explained by the model.
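Because the response is a 0/1 indicator, a logistic regression is the model usually suggested for a question like (b); it is a different technique from the lm fit used above, which can return fitted values outside the range 0 to 1. The following is a hedged sketch only: the data are typed in from the table, and the names sedans, logit_model, coef_table and p_value are assumptions made for this example. No output is shown because it has not been run against the original file.

```r
# Fuel economy (L/100km) for the 48 sedans, in the row order of the table.
economy <- c(7.2, 6.8, 5.7, 4.9, 7.2, 7.5, 8.4, 6.4, 6.1, 7.1, 7.6, 7.1,
             7.5, 7.2, 6.7, 7.2, 7.7, 7.4, 6.4, 6.8, 3.6, 6.0, 6.1, 7.3,
             7.5, 4.7, 6.7, 7.0, 4.3, 9.2, 10.8, 10.3, 11.3, 10.1, 9.2, 9.0,
             11.7, 10.3, 10.4, 10.5, 11.3, 14.3, 14.2, 11.6, 11.7, 9.8, 14.1, 9.8)
# Row numbers of the automatic sedans; every other row is a manual.
auto_rows <- c(7, 9, 11, 13, 15, 18, 20, 23, 25, 28, 32, 34, 36, 39,
               41, 43, 45, 46, 47, 48)
sedans <- data.frame(
  economy = economy,
  isauto = as.integer(seq_along(economy) %in% auto_rows)
)

# Logistic regression: log-odds of an automatic as a linear function of economy.
logit_model <- glm(isauto ~ economy, family = binomial, data = sedans)

# Coefficient table: estimate, standard error, z value and p-value per term.
coef_table <- summary(logit_model)$coefficients
print(coef_table)

# p-value of the Wald test for the economy slope, used to answer part (c).
p_value <- coef_table["economy", "Pr(>|z|)"]
cat("p-value for economy:", p_value, "\n")
```

Comparing p_value with 0.05 gives the answer to (c) under this model.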
d)
We created a data frame containing the fuel economy of the car whose transmission we want to predict.
> econ=data.frame(
+ economy=c(730/60)
+ )
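Next, we predict the transmission value for a car with economy = 730/60, which is about 12.17. This figure is in km/L, whereas the model was fitted on economy in L/100km; the equivalent input is 100*60/730, about 8.22 L/100km, so the output below applies to the value as entered.
Because new_transmission is 1 for automatic, the fitted value estimates the probability of an automatic; the probability of a manual is 1 minus the fitted value.
> predict(model,newdata=econ)
1
0.5964291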
> predict(model,newdata=econ,interval = "confidence")
fit lwr upr
1 0.5964291 0.3396301 0.8532281
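For part (d), the advert figures first have to be converted to the units the model uses (L/100km), and the question asks for the probability of a manual, which is the complement of the probability of an automatic. The sketch below shows both steps with a logistic model. This is an assumption about the intended approach, not the method used in the output above, and sedans, logit_model, advert_economy, p_auto and p_manual are made-up names. The data are typed in from the table so the block runs without the CSV file.

```r
# Fuel economy (L/100km) for the 48 sedans, in the row order of the table.
economy <- c(7.2, 6.8, 5.7, 4.9, 7.2, 7.5, 8.4, 6.4, 6.1, 7.1, 7.6, 7.1,
             7.5, 7.2, 6.7, 7.2, 7.7, 7.4, 6.4, 6.8, 3.6, 6.0, 6.1, 7.3,
             7.5, 4.7, 6.7, 7.0, 4.3, 9.2, 10.8, 10.3, 11.3, 10.1, 9.2, 9.0,
             11.7, 10.3, 10.4, 10.5, 11.3, 14.3, 14.2, 11.6, 11.7, 9.8, 14.1, 9.8)
# Row numbers of the automatic sedans; every other row is a manual.
auto_rows <- c(7, 9, 11, 13, 15, 18, 20, 23, 25, 28, 32, 34, 36, 39,
               41, 43, 45, 46, 47, 48)
sedans <- data.frame(
  economy = economy,
  isauto = as.integer(seq_along(economy) %in% auto_rows)
)
logit_model <- glm(isauto ~ economy, family = binomial, data = sedans)

# 60 L used over 730 km, rescaled to litres per 100 km.
advert_economy <- 100 * 60 / 730
cat("Advert fuel economy (L/100km):", round(advert_economy, 2), "\n")

# type = "response" returns a probability rather than log-odds.
p_auto <- predict(logit_model,
                  newdata = data.frame(economy = advert_economy),
                  type = "response")

# The model estimates P(automatic); the friend wants P(manual).
p_manual <- 1 - p_auto
cat("Estimated probability of a manual:", p_manual, "\n")
```

The conversion gives about 8.22 L/100km, which lies inside the range of economy values in the sample, so the prediction is not an extrapolation.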