Brand Tar Nicotine CO
American_Filter 16 1.2 15
Benson_&_Hedges 16 1.2 15
Camel 16 1 17
Capri 9 0.8 6
Carlton 1 0.1 1
Cartier_Vendome 8 0.8 8
Chelsea 10 0.8 10
GPC_Approved 16 1 17
Hi-Lite 14 1 13
Kent 13 1 13
Lucky_Strike 13 1.1 13
Malibu 15 1.2 15
Marlboro 16 1.2 15
Merit 9 0.7 11
Newport_Stripe 11 0.9 15
Now 2 0.2 3
Old_Gold 18 1.4 18
Pall_Mall 15 1.2 15
Players 13 1.1 12
Raleigh 15 1 16
Richland 17 1.3 16
Rite 9 0.8 10
Silva_Thins 12 1 10
Tareyton 14 1 17
Triumph 5 0.5 7
True 6 0.6 7
Vantage 8 0.7 11
Viceroy 18 1.4 15
Winston 16 1.1 18
a) Find the regression equation that expresses the response variable (y) of nicotine amount in terms of the predictor variable (x) of the tar amount.
b) Find the regression equation that expresses the response variable (y) of nicotine amount in terms of the predictor variable (x) of the carbon monoxide amount.
c) Find the regression equation that expresses the response variable (y) of nicotine amount in terms of predictor variables (x) of tar amount and carbon monoxide amount.
d) For the regression equations found in parts (a), (b), and (c), which is the best equation for predicting the nicotine amount? Justify your answer.
e) Is the best regression equation identified in part (d) a good equation for predicting the nicotine amount? Why or why not?
a) Here the response variable is the nicotine amount (y) and the predictor is the tar amount (x). The regression equation is given by
y = 0.154030 + 0.065052*x
b) Here the response variable is the nicotine amount (y) and the predictor is the carbon monoxide amount (x). The regression equation is given by
y = 0.191639 + 0.060564*x
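The two single-predictor fits in parts (a) and (b) can be checked with a short sketch. This is not the attached R code; it is a minimal Python illustration using only the standard library, with the data transcribed from the table above. The names `tar`, `nicotine`, `co`, and `simple_fit` are chosen here for illustration.

```python
# Data transcribed from the table, in row order (29 brands).
tar = [16, 16, 16, 9, 1, 8, 10, 16, 14, 13, 13, 15, 16, 9, 11,
       2, 18, 15, 13, 15, 17, 9, 12, 14, 5, 6, 8, 18, 16]
nicotine = [1.2, 1.2, 1.0, 0.8, 0.1, 0.8, 0.8, 1.0, 1.0, 1.0, 1.1, 1.2,
            1.2, 0.7, 0.9, 0.2, 1.4, 1.2, 1.1, 1.0, 1.3, 0.8, 1.0, 1.0,
            0.5, 0.6, 0.7, 1.4, 1.1]
co = [15, 15, 17, 6, 1, 8, 10, 17, 13, 13, 13, 15, 15, 11, 15,
      3, 18, 15, 12, 16, 16, 10, 10, 17, 7, 7, 11, 15, 18]


def simple_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Centered sums of squares and cross-products.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


b0_tar, b1_tar = simple_fit(tar, nicotine)  # part (a)
b0_co, b1_co = simple_fit(co, nicotine)     # part (b)
print(f"(a) nicotine = {b0_tar:.6f} + {b1_tar:.6f} * tar")
print(f"(b) nicotine = {b0_co:.6f} + {b1_co:.6f} * co")
```

The printed coefficients should agree with the equations in parts (a) and (b) up to rounding.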
c) Here the response variable is the nicotine amount (y) and the predictors are the tar amount (x1) and the carbon monoxide amount (x2). The regression equation is given by
y = 0.181645 + 0.081837*x1 - 0.018642*x2
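For the two-predictor fit in part (c), the least-squares coefficients have a closed form in terms of centered sums of squares and cross-products. The sketch below is one way to compute them in plain Python, assuming the data are transcribed correctly from the table. The function name `fit_two_predictors` is hypothetical and is not taken from the attached R code.

```python
# Data transcribed from the table, in row order (29 brands).
tar = [16, 16, 16, 9, 1, 8, 10, 16, 14, 13, 13, 15, 16, 9, 11,
       2, 18, 15, 13, 15, 17, 9, 12, 14, 5, 6, 8, 18, 16]
nicotine = [1.2, 1.2, 1.0, 0.8, 0.1, 0.8, 0.8, 1.0, 1.0, 1.0, 1.1, 1.2,
            1.2, 0.7, 0.9, 0.2, 1.4, 1.2, 1.1, 1.0, 1.3, 0.8, 1.0, 1.0,
            0.5, 0.6, 0.7, 1.4, 1.1]
co = [15, 15, 17, 6, 1, 8, 10, 17, 13, 13, 13, 15, 15, 11, 15,
      3, 18, 15, 12, 16, 16, 10, 10, 17, 7, 7, 11, 15, 18]


def fit_two_predictors(x1, x2, y):
    """Return (b0, b1, b2) for the least-squares fit y = b0 + b1*x1 + b2*x2."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Centered sums of squares and cross-products.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    # Solve the 2x2 normal equations for the slopes (Cramer's rule).
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


b0, b1, b2 = fit_two_predictors(tar, co, nicotine)
print(f"(c) nicotine = {b0:.6f} + {b1:.6f} * tar + {b2:.6f} * co")
```

The determinant `det` shrinks toward zero as the two predictors become more strongly correlated, which is why the slopes become sensitive to small changes in the data in that case.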
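d) The best equation for predicting the nicotine amount is the one in part (c): its coefficient of determination (multiple R-squared) is 0.9333, the highest of the three regressions. Because R-squared can never decrease when a predictor is added, the adjusted R-squared is the fairer comparison, and it also favors (c), at about 0.928 versus about 0.921 for (a).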
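The comparison in part (d) can be reproduced by computing R-squared and adjusted R-squared for each of the three models. The following is a minimal sketch in plain Python, not the attached R code. The helpers `ols` and `r_squared` are hypothetical names, and the fit uses the normal equations solved by Gauss-Jordan elimination, which is adequate for a problem this small.

```python
# Data transcribed from the table, in row order (29 brands).
tar = [16, 16, 16, 9, 1, 8, 10, 16, 14, 13, 13, 15, 16, 9, 11,
       2, 18, 15, 13, 15, 17, 9, 12, 14, 5, 6, 8, 18, 16]
nicotine = [1.2, 1.2, 1.0, 0.8, 0.1, 0.8, 0.8, 1.0, 1.0, 1.0, 1.1, 1.2,
            1.2, 0.7, 0.9, 0.2, 1.4, 1.2, 1.1, 1.0, 1.3, 0.8, 1.0, 1.0,
            0.5, 0.6, 0.7, 1.4, 1.1]
co = [15, 15, 17, 6, 1, 8, 10, 17, 13, 13, 13, 15, 15, 11, 15,
      3, 18, 15, 12, 16, 16, 10, 10, 17, 7, 7, 11, 15, 18]


def ols(columns, y):
    """Least-squares coefficients [b0, b1, ...] from the normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    return [A[r][p] / A[r][r] for r in range(p)]


def r_squared(columns, y):
    """Return (R-squared, adjusted R-squared) for y regressed on columns."""
    coef = ols(columns, y)
    n, k = len(y), len(columns)
    fitted = [coef[0] + sum(b * col[i] for b, col in zip(coef[1:], columns))
              for i in range(n)]
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - sse / sst
    adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return r2, adj


r2_a, adj_a = r_squared([tar], nicotine)
r2_b, adj_b = r_squared([co], nicotine)
r2_c, adj_c = r_squared([tar, co], nicotine)
for label, r2, adj in [("a", r2_a, adj_a), ("b", r2_b, adj_b), ("c", r2_c, adj_c)]:
    print(f"({label}) R^2 = {r2:.4f}, adjusted R^2 = {adj:.4f}")
```

Adjusted R-squared penalizes the extra predictor in (c), so it is the value to compare when the models have different numbers of predictors.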
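e) The best regression equation identified in part (d) may not be a reliable choice for predicting the nicotine amount, because the two predictors (tar amount and carbon monoxide amount) are highly correlated with each other, i.e., multicollinearity is present in the data set. One symptom is that the carbon monoxide coefficient is negative in part (c) even though carbon monoxide alone has a positive slope in part (b).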
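The multicollinearity mentioned in part (e) can be quantified with the correlation between the two predictors and the corresponding variance inflation factor (VIF). The sketch below assumes the usual two-predictor shortcut, where the VIF equals 1 / (1 - r^2) with r the correlation between tar and carbon monoxide. The names `pearson_r` and `vif` are illustrative and are not part of the original answer.

```python
from math import sqrt

# Predictor columns transcribed from the table, in row order (29 brands).
tar = [16, 16, 16, 9, 1, 8, 10, 16, 14, 13, 13, 15, 16, 9, 11,
       2, 18, 15, 13, 15, 17, 9, 12, 14, 5, 6, 8, 18, 16]
co = [15, 15, 17, 6, 1, 8, 10, 17, 13, 13, 13, 15, 15, 11, 15,
      3, 18, 15, 12, 16, 16, 10, 10, 17, 7, 7, 11, 15, 18]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


r = pearson_r(tar, co)
# With exactly two predictors, each predictor's VIF is 1 / (1 - r^2).
vif = 1 / (1 - r ** 2)
print(f"correlation(tar, co) = {r:.4f}")
print(f"VIF for each predictor = {vif:.2f}")
```

A common rule of thumb treats a VIF above 5 or 10 as a sign of problematic collinearity; which threshold to use is a judgment call rather than a fixed rule.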
The R code is attached.