A -year study conducted by the American Heart Association provided data on how age, blood pressure, and smoking relate to the risk of strokes. Data from a portion of this study follow. Risk is interpreted as the probability (times 100) that a person will have a stroke over the next 10-year period. For the smoker variable, 1 indicates a smoker and 0 indicates a nonsmoker.
Risk | Age | Blood Pressure | Smoker
10 | 85 | 150 | 1
30 | 68 | 207 | 1
11 | 64 | 104 | 0
61 | 86 | 127 | 0
39 | 70 | 139 | 0
49 | 83 | 155 | 0
7 | 68 | 179 | 1
36 | 84 | 176 | 0
41 | 57 | 169 | 0
25 | 66 | 161 | 1
39 | 69 | 122 | 0
37 | 90 | 101 | 0
26 | 89 | 124 | 0
63 | 81 | 118 | 0
35 | 84 | 181 | 0
33 | 89 | 176 | 0
30 | 66 | 156 | 1
34 | 66 | 164 | 1
15 | 74 | 210 | 1
32 | 76 | 160 | 1
a. Develop an estimated regression equation that can be used to predict the risk of stroke given the age and blood pressure level. Enter negative value as negative number. Use Table 4 in Appendix B.
The regression equation is (to 4 decimals):
Risk = _______ + ________ age + ________ blood pressure
Analysis of Variance
b. Consider adding two independent variables to the model developed in part (a), one for the interaction between age and blood pressure level and the other for whether the person is a smoker. Develop an estimated regression equation using these four independent variables. Enter negative value as negative number. Use Table 4 in Appendix B.
The regression equation is (to 4 decimals):
Risk = ______ + ______ age + _______ blood pressure
Analysis of Variance
c. At a 0.05 level of significance, test to see whether the addition of the interaction term and the smoker variable contributes significantly to the estimated regression equation developed in part (a). Use Table 4 in Appendix B.
What is the value of the F test statistic? _______ (to 2 decimals)
P-value is (select one: lower than 0.01, between 0.01 and 0.025, between 0.025 and 0.05, between 0.05 and 0.10, greater than 0.10), so the addition of the two independent variables (select one: is not, is) statistically significant.
a. Develop an estimated regression equation that can be used to predict the risk of stroke given the age and blood pressure level. Enter negative value as negative number. Use Table 4 in Appendix B.
The regression equation is (to 4 decimals):
Risk = 27.8343 + 0.3062 age + (-0.1194) blood pressure
ANOVA
Source | df | SS | MS | F
Regression | 2 | 530.8051 | 265.4026 | 1.2136
Residual | 17 | 3717.7449 | 218.6909 |
Total | 19 | 4248.5500 | |
Regression Analysis
Regression Statistics
Multiple R | 0.3535
R Square | 0.1249
Adjusted R Square | 0.0220
Standard Error | 14.7882
Observations | 20
ANOVA
Source | df | SS | MS | F | Significance F
Regression | 2 | 530.8051 | 265.4026 | 1.2136 | 0.3216
Residual | 17 | 3717.7449 | 218.6909 | |
Total | 19 | 4248.5500 | | |
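The summary figures above all follow from the sums of squares and degrees of freedom in the ANOVA table. The sketch below recomputes them as a consistency check. The variable names are illustrative, and the p-value line relies on a closed form for the F distribution's upper tail that holds only when the numerator degrees of freedom equal 2, so it is not a general-purpose routine.

```python
# Sums of squares and degrees of freedom copied from the part (a) ANOVA table.
ssr, sse = 530.8051, 3717.7449
df_reg, df_res = 2, 17

sst = ssr + sse                      # total sum of squares
msr = ssr / df_reg                   # mean square due to regression
mse = sse / df_res                   # mean square error
f_stat = msr / mse                   # overall F statistic

r_squared = ssr / sst
# (n - 1) / (n - p - 1) equals (df_reg + df_res) / df_res.
adj_r_squared = 1 - (1 - r_squared) * (df_reg + df_res) / df_res
std_error = mse ** 0.5               # standard error of the estimate

# Upper-tail p-value. ASSUMPTION: this closed form is valid only for
# numerator df = 2; use a statistics library for other cases.
significance_f = (1 + df_reg * f_stat / df_res) ** (-df_res / 2)

print(round(f_stat, 4), round(r_squared, 4), round(adj_r_squared, 4))
print(round(std_error, 4), round(significance_f, 4))
```

The printed values should agree with the table to within rounding of the inputs.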
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95%
Intercept | 27.8343 | 34.2284 | 0.8132 | 0.4274 | -44.3814 | 100.0499
Age | 0.3062 | 0.3434 | 0.8917 | 0.3850 | -0.4183 | 1.0307
Blood Pressure | -0.1194 | 0.1121 | -1.0649 | 0.3018 | -0.3559 | 0.1171
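The part (a) coefficients can be reproduced from the data table without a spreadsheet. The following is a minimal sketch, not the method used to produce the output above: it solves the normal equations with a small hand-written Gaussian elimination so that it needs no third-party library. The helper names `solve` and `fit_ols` are hypothetical, and for larger or ill-conditioned problems a library least-squares routine would be a safer choice.

```python
# Rows are (risk, age, blood_pressure) from the data table; smoker is unused here.
data = [
    (10, 85, 150), (30, 68, 207), (11, 64, 104), (61, 86, 127),
    (39, 70, 139), (49, 83, 155), (7, 68, 179), (36, 84, 176),
    (41, 57, 169), (25, 66, 161), (39, 69, 122), (37, 90, 101),
    (26, 89, 124), (63, 81, 118), (35, 84, 181), (33, 89, 176),
    (30, 66, 156), (34, 66, 164), (15, 74, 210), (32, 76, 160),
]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                     # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(xs, ys):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(xs[0])
    xtx = [[sum(row[i] * row[j] for row in xs) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)

ys = [risk for risk, _, _ in data]
xs = [[1.0, age, bp] for _, age, bp in data]         # leading 1.0 is the intercept
coef = fit_ols(xs, ys)

fitted = [sum(b * x for b, x in zip(coef, row)) for row in xs]
sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
mean_y = sum(ys) / len(ys)
sst = sum((y - mean_y) ** 2 for y in ys)

print([round(b, 4) for b in coef])                   # intercept, age, blood pressure
print(round(sst - sse, 4), round(sse, 4), round(sst, 4))   # SSR, SSE, SST
```

Appending an `age * bp` column and the smoker indicator to each row of `xs` would give the four-variable design used in part (b).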
b. Consider adding two independent variables to the model developed in part (a), one for the interaction between age and blood pressure level and the other for whether the person is a smoker. Develop an estimated regression equation using these four independent variables. Enter negative value as negative number. Use Table 4 in Appendix B.
The regression equation is (to 4 decimals):
Risk = -143.4202 + 2.3621 age + 1.2060 blood pressure + (-0.0156) age*Bp + (-18.7861) smoker
ANOVA
Source | df | SS | MS | F
Regression | 4 | 1642.1534 | 410.5384 | 2.3627
Residual | 15 | 2606.3966 | 173.7598 |
Total | 19 | 4248.5500 | |
Regression Analysis
Regression Statistics
Multiple R | 0.6217
R Square | 0.3865
Adjusted R Square | 0.2229
Standard Error | 13.1818
Observations | 20
ANOVA
Source | df | SS | MS | F | Significance F
Regression | 4 | 1642.1534 | 410.5384 | 2.3627 | 0.0999
Residual | 15 | 2606.3966 | 173.7598 | |
Total | 19 | 4248.5500 | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95%
Intercept | -143.4202 | 126.7083 | -1.1319 | 0.2755 | -413.4925 | 126.6521
Age | 2.3621 | 1.6233 | 1.4551 | 0.1662 | -1.0979 | 5.8221
Blood Pressure | 1.2060 | 0.8502 | 1.4185 | 0.1765 | -0.6062 | 3.0183
age*Bp | -0.0156 | 0.0109 | -1.4328 | 0.1724 | -0.0388 | 0.0076
Smoker | -18.7861 | 7.8138 | -2.4042 | 0.0296 | -35.4409 | -2.1313
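With an interaction term in the model, the effect of age on predicted risk is no longer a single number: it depends on the blood pressure level. The sketch below evaluates the four-variable equation and the implied age slope. The function names `predict_risk` and `age_slope` are illustrative, and because the coefficients are the 4-decimal values reported above, predictions will differ slightly from those of the unrounded fit.

```python
# Coefficients copied from the part (b) output (rounded to 4 decimals).
B0, B_AGE, B_BP, B_INTERACT, B_SMOKER = -143.4202, 2.3621, 1.2060, -0.0156, -18.7861

def predict_risk(age, blood_pressure, smoker):
    """Predicted risk from the four-variable estimated equation."""
    return (B0 + B_AGE * age + B_BP * blood_pressure
            + B_INTERACT * age * blood_pressure + B_SMOKER * smoker)

def age_slope(blood_pressure):
    """Change in predicted risk per extra year of age at a fixed blood pressure."""
    return B_AGE + B_INTERACT * blood_pressure

# Second row of the data table: age 68, blood pressure 207, smoker, observed risk 30.
print(round(predict_risk(68, 207, 1), 2))

# The age slope changes sign as blood pressure rises; 120 and 180 are
# arbitrary illustrative levels, not values singled out by the exercise.
print(round(age_slope(120), 4), round(age_slope(180), 4))
```

Given the large standard errors in the coefficient table, these slopes are descriptive only; none of the age or blood pressure terms is individually significant at the 0.05 level.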
c. At a 0.05 level of significance, test to see whether the addition of the interaction term and the smoker variable contribute significantly to the estimated regression equation developed in part (a). Use Table 4 in Appendix B.
Source | df | SS | MS | F
Regression (part a) | 2 | 530.8051 | |
Regression (part b) | 4 | 1642.153403 | 410.538 |
Regression added by the 2 new variables | 2 | 1111.3483 | 555.674 | 3.1979
Residual | 15 | 2606.396597 | 173.76 |
Total | 19 | 4248.55 | |
What is the value of the F test statistic? 3.20 (to 2 decimals)
P-value is between 0.05 and 0.10. Because the p-value exceeds 0.05, the addition of the two independent variables is not statistically significant.
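The F statistic in part (c) compares the residual sum of squares of the two-variable model with that of the four-variable model. The sketch below carries out that arithmetic using the values from the two ANOVA tables. The variable names are illustrative, and the p-value uses a closed form for the F distribution's upper tail that is valid only because exactly two variables were added (numerator degrees of freedom of 2).

```python
# Residual sums of squares and residual df from the two ANOVA tables.
sse_reduced, df_reduced = 3717.7449, 17   # part (a): age, blood pressure
sse_full, df_full = 2606.3966, 15         # part (b): adds interaction and smoker

added = df_reduced - df_full              # number of added variables (2)

# Partial F: drop in SSE per added variable, divided by the full model's MSE.
f_stat = ((sse_reduced - sse_full) / added) / (sse_full / df_full)

# Upper-tail p-value. ASSUMPTION: closed form valid only when `added` is 2.
p_value = (1 + added * f_stat / df_full) ** (-df_full / 2)

print(round(f_stat, 2), round(p_value, 4))
print("significant at 0.05:", p_value < 0.05)
```

The computed p-value falls between 0.05 and 0.10, consistent with the table-based bracket selected above.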