Question

An American consumer organization is currently examining the relationship among several variables, including gasoline mileage as measured by miles per gallon, the horsepower of the car’s engine, and the weight of the car (in pounds). A sample of 50 recent car models was selected and the results recorded (provided in auto.xlxs).

(a) Using the sample data attached, calculate the sample mean and standard deviation for the variables:
1. Miles per gallon (MPG)
2. Horsepower
3. Weight (in pounds)

(b) Is there any evidence of skewness in the data sets? Which data set displays the greatest skewness?

(c) Using the sample data on MPG, calculate the sample proportion of vehicles whose fuel economy exceeds 37 mpg and its corresponding standard deviation.

(d) Indicate all possible relationships between the variables and comment on them. (You may use a scatter diagram or Karl Pearson’s correlation coefficient.)

(e) Using the complete data set and the simple ordinary least squares regression formulae, develop two models to explain the behavior of gasoline mileage (miles per gallon) as a function of:
(i) Horsepower
(ii) Weight

(f) Which model best describes the behavior of gasoline mileage? Explain your reasons.

MPG   Horsepower   Weight
43.1   48   1985
19.9   110   3365
19.2   105   3535
17.7   165   3445
18.1   139   3205
20.3   103   2830
21.5   115   3245
16.9   155   4360
15.5   142   4054
18.5   150   3940
27.2   71   3190
41.5   76   2144
46.6   65   2110
23.7   100   2420
27.2   84   2490
39.1   58   1755
28   88   2605
24   92   2865
20.2   139   3570
20.5   95   3155
28   90   2678
34.7   63   2215
36.1   66   1800
35.7   80   1915
20.2   85   2965
23.9   90   3420
29.9   65   2380
30.4   67   3250
36   74   1980
22.6   110   2800
36.4   67   2950
27.5   95   2560
33.7   75   2210
44.6   67   1850
32.9   100   2615
38   67   1965
24.2   120   2930
38.1   60   1968
39.4   70   2070
25.4   116   2900
31.3   75   2542
34.1   68   1985
34   88   2395
31   82   2720
27.4   80   2670
22.3   88   2890
28   79   2625
17.6   85   3465
34.4   65   3465
20.6   105   3380

Solutions

Expert Solution

I used R software to solve this question.

R code and output:

> d=read.table('mpg.txt',header=TRUE)
> head(d)
MPG Horsepower Weight
1 43.1 48 1985
2 19.9 110 3365
3 19.2 105 3535
4 17.7 165 3445
5 18.1 139 3205
6 20.3 103 2830
> attach(d)

Que.a
> mean(MPG)
[1] 28.542
> sd(MPG)
[1] 8.171431
> mean(Horsepower)
[1] 90.84
> sd(Horsepower)
[1] 27.25867
> mean(Weight)
[1] 2756.52
> sd(Weight)
[1] 635.051

For variable MPG:

Mean = 28.542 and sd = 8.1714

For variable Horsepower:

Mean = 90.84 and sd = 27.2587

For variable Weight:

Mean = 2756.52 and sd = 635.051

Que.b

> par(mfrow=c(2,2))
> hist(MPG)
> hist(Horsepower)
> hist(Weight)

We check the skewness of each variable using its histogram. The histogram for Horsepower is the most asymmetric of the three, so Horsepower displays the greatest skewness.
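
As a numerical cross-check on the histograms, the moment coefficient of skewness can be computed directly. The sketch below is not part of the original solution: the helper name `skewness_coef` is an assumption (base R has no built-in skewness function), and the Horsepower values are copied from the table in the question. A positive value suggests a longer right tail, a negative value a longer left tail, and a value near zero a roughly symmetric distribution.

```r
# Moment coefficient of skewness: g1 = m3 / m2^(3/2),
# where m2 and m3 are the second and third central moments.
# skewness_coef is a hypothetical helper, not a base R function.
skewness_coef <- function(x) {
  dev <- x - mean(x)
  mean(dev^3) / mean(dev^2)^1.5
}

# Horsepower column copied from the question's data table.
Horsepower <- c(48, 110, 105, 165, 139, 103, 115, 155, 142, 150,
                71, 76, 65, 100, 84, 58, 88, 92, 139, 95,
                90, 63, 66, 80, 85, 90, 65, 67, 74, 110,
                67, 95, 75, 67, 100, 67, 120, 60, 70, 116,
                75, 68, 88, 82, 80, 88, 79, 85, 65, 105)

skewness_coef(Horsepower)
```

With the data frame `d` loaded as in the solution, `sapply(d, skewness_coef)` would give the coefficient for all three variables, so they can be compared by absolute value.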

Que.c

> mpg=subset(MPG, (MPG>37));mpg
[1] 43.1 41.5 46.6 39.1 44.6 38.0 38.1 39.4
> length(mpg)
[1] 8
> prop=8/50
> prop
[1] 0.16
> sd(mpg)
[1] 3.203569

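The sample proportion of vehicles whose fuel economy exceeds 37 mpg is p = 8/50 = 0.16.

The value 3.2036 returned by sd(mpg) above is the standard deviation of the eight MPG readings above 37, not of the proportion. The standard deviation of the sample proportion is sqrt(p(1 - p)/n) = sqrt(0.16 × 0.84 / 50) ≈ 0.0518.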
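
The standard deviation of a sample proportion comes from the binomial formula sqrt(p(1 - p)/n). It is a different quantity from `sd()` applied to the selected MPG readings, which measures the spread of those readings rather than of the proportion. A minimal sketch, assuming the MPG values are copied from the table in the question; the names `p_hat` and `se_p` are illustrative:

```r
# MPG column copied from the question's data table.
MPG <- c(43.1, 19.9, 19.2, 17.7, 18.1, 20.3, 21.5, 16.9, 15.5, 18.5,
         27.2, 41.5, 46.6, 23.7, 27.2, 39.1, 28, 24, 20.2, 20.5,
         28, 34.7, 36.1, 35.7, 20.2, 23.9, 29.9, 30.4, 36, 22.6,
         36.4, 27.5, 33.7, 44.6, 32.9, 38, 24.2, 38.1, 39.4, 25.4,
         31.3, 34.1, 34, 31, 27.4, 22.3, 28, 17.6, 34.4, 20.6)

n     <- length(MPG)                    # sample size (50)
p_hat <- mean(MPG > 37)                 # share of cars above 37 mpg
se_p  <- sqrt(p_hat * (1 - p_hat) / n)  # sd of the sample proportion

p_hat   # 0.16
se_p    # about 0.0518
```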

Que.d

pairs(d)

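1. There is a negative correlation between MPG and Horsepower, and also between MPG and Weight: more powerful and heavier cars tend to achieve fewer miles per gallon.

2. There is a positive correlation between Horsepower and Weight: heavier cars tend to have more powerful engines.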
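
The scatter-plot reading can be backed up with Karl Pearson's correlation coefficient, which the question also allows. The sketch below is an addition rather than part of the original solution; it rebuilds the data from the table in the question (the data frame name `autos` is an assumption) and uses `cor()`, which computes Pearson correlations by default. In a simple regression, the squared correlation equals the R-squared value, so the squared MPG correlations should agree with the Multiple R-squared figures reported in Que.e.

```r
# Columns copied from the question's data table.
MPG <- c(43.1, 19.9, 19.2, 17.7, 18.1, 20.3, 21.5, 16.9, 15.5, 18.5,
         27.2, 41.5, 46.6, 23.7, 27.2, 39.1, 28, 24, 20.2, 20.5,
         28, 34.7, 36.1, 35.7, 20.2, 23.9, 29.9, 30.4, 36, 22.6,
         36.4, 27.5, 33.7, 44.6, 32.9, 38, 24.2, 38.1, 39.4, 25.4,
         31.3, 34.1, 34, 31, 27.4, 22.3, 28, 17.6, 34.4, 20.6)
Horsepower <- c(48, 110, 105, 165, 139, 103, 115, 155, 142, 150,
                71, 76, 65, 100, 84, 58, 88, 92, 139, 95,
                90, 63, 66, 80, 85, 90, 65, 67, 74, 110,
                67, 95, 75, 67, 100, 67, 120, 60, 70, 116,
                75, 68, 88, 82, 80, 88, 79, 85, 65, 105)
Weight <- c(1985, 3365, 3535, 3445, 3205, 2830, 3245, 4360, 4054, 3940,
            3190, 2144, 2110, 2420, 2490, 1755, 2605, 2865, 3570, 3155,
            2678, 2215, 1800, 1915, 2965, 3420, 2380, 3250, 1980, 2800,
            2950, 2560, 2210, 1850, 2615, 1965, 2930, 1968, 2070, 2900,
            2542, 1985, 2395, 2720, 2670, 2890, 2625, 3465, 3465, 3380)

autos <- data.frame(MPG, Horsepower, Weight)

# Pearson correlation matrix for all three pairs of variables.
r <- cor(autos)
round(r, 3)

# Squared correlations with MPG, for comparison with the regression R-squared.
round(r["MPG", c("Horsepower", "Weight")]^2, 4)
```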

Que.e

> model=lm(MPG~Horsepower)
> summary(model)

Call:
lm(formula = MPG ~ Horsepower)

Residuals:
Min 1Q Median 3Q Max
-12.3218 -3.7569 -0.1532 3.3686 11.9527

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 50.00521 2.52351 19.816 < 2e-16 ***
Horsepower -0.23627 0.02663 -8.873 1.09e-11 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5.081 on 48 degrees of freedom
Multiple R-squared: 0.6212, Adjusted R-squared: 0.6133
F-statistic: 78.72 on 1 and 48 DF, p-value: 1.093e-11

> model2=lm(MPG~Weight)
> summary(model2)

Call:
lm(formula = MPG ~ Weight)

Residuals:
Min 1Q Median 3Q Max
-8.414 -2.636 -1.202 2.317 13.377

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 57.79725 2.96897 19.47 < 2e-16 ***
Weight -0.01061 0.00105 -10.11 1.79e-13 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.668 on 48 degrees of freedom
Multiple R-squared: 0.6803, Adjusted R-squared: 0.6736
F-statistic: 102.1 on 1 and 48 DF, p-value: 1.787e-13

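The fitted models are:

Model 1: MPG = 50.00521 - 0.23627 × Horsepower

Model 2: MPG = 57.79725 - 0.01061 × Weight

The p-values of the F statistics for both models are far below 0.05, so both models are useful for predicting MPG.

Que.f

The second model (Weight as predictor) has the higher adjusted R-squared (0.6736 versus 0.6133) and the smaller residual standard error (4.668 versus 5.081).

Hence the second model best describes the behavior of gasoline mileage.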
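
The question asks for the simple ordinary least squares formulae, which `lm()` applies internally. The sketch below spells them out: slope b1 = Sxy/Sxx, intercept b0 = ybar - b1 × xbar, and R-squared = Sxy^2/(Sxx × Syy). The helper `ols_simple` is a hypothetical name, and the toy vectors are made-up numbers chosen so the answer is known in advance; they are not taken from the sample.

```r
# ols_simple is a hypothetical helper implementing the textbook formulae
# for a regression of y on a single predictor x.
ols_simple <- function(x, y) {
  sxx <- sum((x - mean(x))^2)                # variation in x
  syy <- sum((y - mean(y))^2)                # variation in y
  sxy <- sum((x - mean(x)) * (y - mean(y)))  # co-variation of x and y
  b1  <- sxy / sxx                           # slope
  b0  <- mean(y) - b1 * mean(x)              # intercept
  c(intercept = b0, slope = b1, r_squared = sxy^2 / (sxx * syy))
}

# Toy illustration (made-up numbers): y = 1 + 2x exactly.
x <- c(1, 2, 3, 4, 5)
y <- 1 + 2 * x

fit <- ols_simple(x, y)
fit

# Cross-check against lm() on the same toy data.
coef(lm(y ~ x))
```

With the data frame `d` from the solution, `ols_simple(d$Horsepower, d$MPG)` and `ols_simple(d$Weight, d$MPG)` should reproduce the `lm()` estimates and Multiple R-squared values reported above.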

