Question

The cigarette data set (partially given below) presents data on tar, nicotine, weight (in grams) and carbon monoxide contents (in milligrams) for a sample of 25 (filter) brands of cigarettes tested in a recent year.

Tar (x1)   Nicotine (x2)   Weight (x3)   Carbon Monoxide (y)
14.1       0.86            0.9853        13.6
.          .               .             .
.          .               .             .
12.0       0.82            1.1184        14.9

Question 1

Answer the following for the variables Carbon Monoxide (response variable) and Tar (predictor variable).

a. Fit the regression line. Report the parameter estimates (the estimates of the intercept and slope).

b. Examine the residual plots and comment on the fit of the model. Are there any fit issues? Are there any outliers (use Cook’s D > 1 as a threshold)? If so, identify the observation numbers and delete the observation and repeat part a.

c. Is Tar useful (use α = 0.05) in predicting Carbon Monoxide? Why?

d. What percentage of the variation in Carbon Monoxide is explained by Tar? Is that high or low?

e. What is the predicted value for Carbon Monoxide when Tar is 10? Give a 95% prediction interval for this estimate.

Solutions

Expert Solution

Question 1

> B12 <- read.csv("C:/Users/pcc/Desktop/B12.csv")
> View(B12)
> B12
   Carbon.Monoxide  tar
1             13.6 14.1
2             16.6 16.0
3             23.5 29.8
4             10.2  8.0
5              5.4  4.1
6             15.0 15.0
7              9.0  8.8
8             12.3 12.4
9             16.3 16.6
10            15.4 14.9
11            13.0 13.7
12            14.4 15.1
13            10.0  7.8
14            10.2 11.4
15             9.5  9.0
16             1.5  1.0
17            18.5 17.0
18            12.6 12.8
19            17.5 15.8
20             4.9  4.5
21            15.9 14.5
22             8.5  7.3
23            10.6  8.6
24            13.9 15.2
25            14.9 12.0
> fit <- lm(Carbon.Monoxide ~ tar, data = B12)
> fit
 
Call:
lm(formula = Carbon.Monoxide ~ tar, data = B12)
 
Coefficients:
(Intercept)          tar  
      2.743        0.801
> summary(fit)
 
Call:
lm(formula = Carbon.Monoxide ~ tar, data = B12)
 
Residuals:
    Min      1Q  Median      3Q     Max 
-3.1124 -0.7167 -0.3754  1.0091  2.5450 
 
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.74328    0.67521   4.063 0.000481 ***
tar          0.80098    0.05032  15.918 6.55e-14 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 
Residual standard error: 1.397 on 23 degrees of freedom
Multiple R-squared:  0.9168,   Adjusted R-squared:  0.9132 
F-statistic: 253.4 on 1 and 23 DF,  p-value: 6.552e-14
 
> ## Part a: estimated intercept = 2.74328 ##
> ## Estimated slope = 0.80098 ##
> ## Fitted line: Carbon.Monoxide = 2.74328 + 0.80098 * tar ##
>  
> ## Residual plot ##
> plot(fit)

(The diagnostic plots produced by plot(fit) are not reproduced here.)

In the residual plot we observe a non-linear pattern.

The normal Q-Q plot indicates that the normality assumption holds.

The residual points are randomly spread, which means the assumption of equal variance is satisfied.
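Part b also asks for a Cook's D check, which the transcript above does not show. The following is a minimal sketch of how it could be done, assuming the 25 rows printed above are typed in directly; the names `flagged`, `B12_trim` and `fit_trim` are illustrative and not part of the original solution. A hand calculation from the same data suggests that only observation 3 (tar = 29.8) exceeds the threshold, with a Cook's D of roughly 3.5.

```r
# Data transcribed from the B12 printout above (25 brands).
tar <- c(14.1, 16.0, 29.8, 8.0, 4.1, 15.0, 8.8, 12.4, 16.6, 14.9, 13.7, 15.1, 7.8,
         11.4, 9.0, 1.0, 17.0, 12.8, 15.8, 4.5, 14.5, 7.3, 8.6, 15.2, 12.0)
co  <- c(13.6, 16.6, 23.5, 10.2, 5.4, 15.0, 9.0, 12.3, 16.3, 15.4, 13.0, 14.4, 10.0,
         10.2, 9.5, 1.5, 18.5, 12.6, 17.5, 4.9, 15.9, 8.5, 10.6, 13.9, 14.9)
B12 <- data.frame(Carbon.Monoxide = co, tar = tar)

fit <- lm(Carbon.Monoxide ~ tar, data = B12)

# Cook's distance for every observation; flag those above the threshold of 1.
d <- cooks.distance(fit)
flagged <- which(d > 1)
print(round(d[flagged], 3))

# Repeat part a without the flagged observations (if any were found).
B12_trim <- if (length(flagged) > 0) B12[-flagged, ] else B12
fit_trim <- lm(Carbon.Monoxide ~ tar, data = B12_trim)
print(coef(fit_trim))
```

Dropping a point only because its Cook's D is large is what the exercise asks for; in practice the observation would usually be investigated before being deleted.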

## Part c: the p-value for tar is 6.55e-14, far below α = 0.05, so tar is highly significant for predicting carbon monoxide.
 The t test checks whether the independent variable has a significant effect on the dependent variable (H0: slope = 0). ##
## Part d: the R-squared value is 0.9168, so 91.68% of the variation in carbon monoxide is explained by tar, which is very high. ##
> 
> ## prediction for Carbon monoxide when tar is 10 ##
> n=data.frame(tar=10)
 
> predict(fit,n,interval = "confidence")
 
       fit      lwr      upr
1 10.75304 10.13083 11.37524
> 
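> ## So the predicted value of carbon monoxide when tar = 10 is 10.75304 ##
> ## The interval (10.13083, 11.37524) is the 95% confidence interval for the mean response. The 95% prediction interval that part e asks for requires interval = "prediction" and is wider. ##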
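Because part e asks for a prediction interval rather than a confidence interval for the mean, the call can be repeated with `interval = "prediction"`. This is a sketch that assumes the same 25 observations typed in by hand; `new_obs` and `pi95` are illustrative names. A hand calculation from the summary output (residual standard error 1.397 on 23 degrees of freedom) gives an interval of approximately (7.80, 13.71).

```r
# Data transcribed from the B12 printout above (25 brands).
tar <- c(14.1, 16.0, 29.8, 8.0, 4.1, 15.0, 8.8, 12.4, 16.6, 14.9, 13.7, 15.1, 7.8,
         11.4, 9.0, 1.0, 17.0, 12.8, 15.8, 4.5, 14.5, 7.3, 8.6, 15.2, 12.0)
co  <- c(13.6, 16.6, 23.5, 10.2, 5.4, 15.0, 9.0, 12.3, 16.3, 15.4, 13.0, 14.4, 10.0,
         10.2, 9.5, 1.5, 18.5, 12.6, 17.5, 4.9, 15.9, 8.5, 10.6, 13.9, 14.9)
B12 <- data.frame(Carbon.Monoxide = co, tar = tar)

fit <- lm(Carbon.Monoxide ~ tar, data = B12)

# 95% prediction interval for a single new cigarette brand with tar = 10.
new_obs <- data.frame(tar = 10)
pi95 <- predict(fit, new_obs, interval = "prediction", level = 0.95)
print(pi95)
```

The point prediction is unchanged; only the interval widens, because a prediction interval also accounts for the scatter of an individual observation around the regression line.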

Expert Solution

a) Scatter plot (figure not reproduced). Computation table, with T = Tar and C = Carbon Monoxide:

Tar Carbon T*T T*C C*C
14.1 13.6 198.81 191.76 184.96
16 16.6 256 265.6 275.56
29.8 23.5 888.04 700.3 552.25
8 10.2 64 81.6 104.04
4.1 5.4 16.81 22.14 29.16
15 15 225 225 225
8.8 9 77.44 79.2 81
12.4 12.3 153.76 152.52 151.29
16.6 16.3 275.56 270.58 265.69
14.9 15.4 222.01 229.46 237.16
13.7 13 187.69 178.1 169
15.1 14.4 228.01 217.44 207.36
7.8 10 60.84 78 100
11.4 10.2 129.96 116.28 104.04
9 9.5 81 85.5 90.25
1 1.5 1 1.5 2.25
17 18.5 289 314.5 342.25
12.8 12.6 163.84 161.28 158.76
15.8 17.5 249.64 276.5 306.25
4.5 4.9 20.25 22.05 24.01
14.5 15.9 210.25 230.55 252.81
7.3 8.5 53.29 62.05 72.25
8.6 10.6 73.96 91.16 112.36
15.2 13.9 231.04 211.28 193.21
12 14.9 144 178.8 222.01
Mean 12.216 12.528
Sum 305.4 313.2 4501.2 4443.15 4462.92
n 25

Y=a+bX
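The formulas for a and b did not survive in this answer. Assuming the standard least-squares estimates, with x = Tar and y = Carbon Monoxide, the Sum and Mean rows of the table give:

```latex
\begin{align*}
S_{xx} &= \sum x^2 - \frac{(\sum x)^2}{n} = 4501.2 - \frac{305.4^2}{25} = 770.4336 \\
S_{xy} &= \sum xy - \frac{(\sum x)(\sum y)}{n} = 4443.15 - \frac{(305.4)(313.2)}{25} = 617.0988 \\
b &= \frac{S_{xy}}{S_{xx}} = \frac{617.0988}{770.4336} \approx 0.8010 \\
a &= \bar{y} - b\,\bar{x} = 12.528 - (0.8010)(12.216) \approx 2.743
\end{align*}
```

So the fitted line is approximately Y = 2.743 + 0.801X, which agrees with the lm() output in the first solution.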

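c) Correlation coefficient to test prediction accuracy:

r = 0.9575 (95.75%), a strong positive linear relationship, so Tar is a strong predictor.

d) Percentage of variation:

R-squared is the percentage of variation in Carbon Monoxide explained by Tar:

R-squared = 0.916778 = 91.6778%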
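The working behind the correlation coefficient is not shown. As a sketch, assuming the usual Pearson formula and the standard t test for a correlation (the test is an addition here, since part c asks for a decision at α = 0.05), the Sum row of the n = 25 table gives:

```latex
\begin{align*}
S_{xx} &= 4501.2 - \frac{305.4^2}{25} = 770.4336, \qquad
S_{yy} = 4462.92 - \frac{313.2^2}{25} = 539.1504, \qquad
S_{xy} = 4443.15 - \frac{(305.4)(313.2)}{25} = 617.0988 \\
r &= \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}} = \frac{617.0988}{\sqrt{(770.4336)(539.1504)}} \approx 0.9575 \\
t &= \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{0.9575\sqrt{23}}{\sqrt{1-0.9168}} \approx 15.92
\end{align*}
```

Since 15.92 is far larger than the critical value t(0.025, 23) ≈ 2.069, the slope is significantly different from zero at α = 0.05, which matches the t value of 15.918 reported by summary(fit) in the first solution.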

b) Remove the outlier (observation 3, Tar = 29.8) and repeat part a:

Tar Carbon T*T T*C C*C
14.1 13.6 198.81 191.76 184.96
16 16.6 256 265.6 275.56
8 10.2 64 81.6 104.04
4.1 5.4 16.81 22.14 29.16
15 15 225 225 225
8.8 9 77.44 79.2 81
12.4 12.3 153.76 152.52 151.29
16.6 16.3 275.56 270.58 265.69
14.9 15.4 222.01 229.46 237.16
13.7 13 187.69 178.1 169
15.1 14.4 228.01 217.44 207.36
7.8 10 60.84 78 100
11.4 10.2 129.96 116.28 104.04
9 9.5 81 85.5 90.25
1 1.5 1 1.5 2.25
17 18.5 289 314.5 342.25
12.8 12.6 163.84 161.28 158.76
15.8 17.5 249.64 276.5 306.25
4.5 4.9 20.25 22.05 24.01
14.5 15.9 210.25 230.55 252.81
7.3 8.5 53.29 62.05 72.25
8.6 10.6 73.96 91.16 112.36
15.2 13.9 231.04 211.28 193.21
12 14.9 144 178.8 222.01
Mean 11.4833333 12.07083
Sum 275.6 289.7 3613.16 3742.85 3910.67
n 24

Y=a+bX

Using the same formulas as in part a:

Regression equation (recomputed from the sums above): Y = 1.413 + 0.928X

Correlation coefficient: r = 0.966158

R-squared value = 0.933462 = 93.3462%
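The working for the refitted line is not shown in the answer. Assuming the same least-squares formulas, the Sum and Mean rows of the n = 24 table give:

```latex
\begin{align*}
S_{xx} &= 3613.16 - \frac{275.6^2}{24} \approx 448.3533 \\
S_{xy} &= 3742.85 - \frac{(275.6)(289.7)}{24} \approx 416.1283 \\
S_{yy} &= 3910.67 - \frac{289.7^2}{24} \approx 413.7496 \\
b &= \frac{S_{xy}}{S_{xx}} \approx 0.9281, \qquad
a = \bar{y} - b\,\bar{x} = 12.0708 - (0.9281)(11.4833) \approx 1.413 \\
r &= \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}} \approx 0.9662
\end{align*}
```

Removing the Tar = 29.8 observation raises the slope from about 0.801 to about 0.928 and lowers the intercept from about 2.743 to about 1.413, which shows how much that single high-leverage point pulled on the original line.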

