A recent 10-year study by a research team at the Great Falls Medical School assessed how age, systolic blood pressure, and smoking relate to the risk of strokes. Assume that the following data are from a portion of this study. Risk is interpreted as the probability (times 100) that the patient will have a stroke over the next 10-year period. For the smoking variable, define a dummy variable with 1 indicating a smoker and 0 indicating a nonsmoker.
Risk | Age | Systolic Blood Pressure | Smoker |
12 | 57 | 152 | No |
24 | 67 | 163 | No |
13 | 58 | 155 | No |
56 | 86 | 177 | Yes |
28 | 59 | 196 | No |
51 | 76 | 189 | Yes |
18 | 56 | 155 | Yes |
31 | 78 | 120 | No |
37 | 80 | 135 | Yes |
15 | 78 | 98 | No |
22 | 71 | 152 | No |
36 | 70 | 173 | Yes |
15 | 67 | 135 | Yes |
48 | 77 | 209 | Yes |
15 | 60 | 199 | No |
36 | 82 | 119 | Yes |
8 | 66 | 166 | No |
34 | 80 | 125 | Yes |
3 | 62 | 117 | No |
37 | 59 | 207 | Yes |
(a) | Develop an estimated multiple regression equation that relates risk of a stroke to the person's age, systolic blood pressure, and whether the person is a smoker. |
Let x1 represent the person's age. | |
Let x2 represent the person's systolic blood pressure. | |
Let x3 represent whether the person is a smoker. | |
If required, round your answers to three decimal places. For subtractive or negative numbers use a minus sign even if there is a + sign before the blank. (Example: -300) | |
ŷ = ___ + ___ x1 + ___ x2 + ___ x3 | |
(b) | Is smoking a significant factor in the risk of a stroke? Explain. Use a 0.05 level of significance. |
The input in the box below will not be graded, but may be reviewed and considered by your instructor. | |
(c) | What is the probability of a stroke over the next 10 years for Art Speen, a 67-year-old smoker who has a systolic blood pressure of 176? |
If required, round your answer to two decimal places. Do not round intermediate calculations. | |
What action might the physician recommend for this patient? | |
The input in the box below will not be graded, but may be reviewed and considered by your instructor. | |
(d) | An insurance company will only sell its Select policy to people for whom the probability of a stroke in the next ten years is less than .01. If a smoker with a systolic blood pressure of 230 applies for a Select policy, under what condition will the company sell him the policy if it adheres to this standard? |
24 | |
SOLUTION
a. The following Excel output provides the estimated multiple linear regression equation that relates risk of a stroke to the person’s age (x1), blood pressure (x2), and whether the person is a smoker (x3).
The estimated multiple linear regression equation is ŷ = -91.7595 + 1.0767x1 + 0.2518x2 + 8.7399x3.
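The fit does not require Excel. As a minimal sketch, the coefficients can be recovered from the data table by solving the normal equations (X'X)b = X'y in plain Python. The helper names `solve` and `fit_ols` are illustrative, not part of the original solution, and the smoker column is coded 1 for Yes and 0 for No as the problem specifies.

```python
# Data transcribed from the table in the problem statement:
# (risk, age, systolic blood pressure, smoker dummy: 1 = Yes, 0 = No)
DATA = [
    (12, 57, 152, 0), (24, 67, 163, 0), (13, 58, 155, 0), (56, 86, 177, 1),
    (28, 59, 196, 0), (51, 76, 189, 1), (18, 56, 155, 1), (31, 78, 120, 0),
    (37, 80, 135, 1), (15, 78, 98, 0), (22, 71, 152, 0), (36, 70, 173, 1),
    (15, 67, 135, 1), (48, 77, 209, 1), (15, 60, 199, 0), (36, 82, 119, 1),
    (8, 66, 166, 0), (34, 80, 125, 1), (3, 62, 117, 0), (37, 59, 207, 1),
]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy; inputs untouched
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows):
    """Return [b0, b1, b2, b3] for risk regressed on age, pressure, and smoker."""
    X = [[1.0, age, bp, smoker] for _, age, bp, smoker in rows]
    y = [risk for risk, _, _, _ in rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


b0, b1, b2, b3 = fit_ols(DATA)
print(f"y_hat = {b0:.4f} + {b1:.4f} x1 + {b2:.4f} x2 + {b3:.4f} x3")
```

Solving the normal equations directly is adequate for a problem this small; a statistics package would normally be preferred because it also reports standard errors and p-values.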
b. Before testing any hypotheses about this regression model, we again check the conditions necessary for valid inference in regression. Excel plots of the residuals and each of the independent variables follow.
None of these scatter charts provides strong evidence of a violation of the conditions, so we will proceed with our inference.
Next we check for evidence of multicollinearity. First note that by using Excel we can determine that the correlation coefficient r for age and blood pressure is -0.3090, which indicates that multicollinearity between the two quantitative variables is not a concern.
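The correlation between age and blood pressure can be checked by hand from the data table. The sketch below computes the sample Pearson correlation coefficient from its definition; the function name `pearson_r` is illustrative.

```python
from math import sqrt

# Age and systolic blood pressure columns, in the row order of the data table.
age = [57, 67, 58, 86, 59, 76, 56, 78, 80, 78,
       71, 70, 67, 77, 60, 82, 66, 80, 62, 59]
pressure = [152, 163, 155, 177, 196, 189, 155, 120, 135, 98,
            152, 173, 135, 209, 199, 119, 166, 125, 117, 207]


def pearson_r(x, y):
    """Sample Pearson correlation: S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


r = pearson_r(age, pressure)
print(f"r(age, pressure) = {r:.4f}")
```

A value this far from -1 or +1 indicates only a weak linear relationship between the two predictors, which is the basis for the conclusion in the text.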
Now we rerun this regression after removing the smoker dummy variable (x3) from our model and compare the parameter estimates and associated p-values for each of the remaining independent variables to the parameter estimates and associated p-values for the original model.
When making these comparisons we observe that these values do not change substantially when the smoker dummy variable is introduced into or removed from the model and conclude that the smoker dummy variable does not create a problem with multicollinearity.
Our results suggest that multicollinearity is not an issue for this regression model. We will therefore proceed with our inferences.
The p-value for the test of the hypothesis that β3 = 0 is 0.0102. Because this p-value is less than the 0.05 level of significance, we reject the hypothesis that β3 = 0 and conclude that there is a difference between smokers and nonsmokers in the risk of a stroke. We estimate that, holding age and blood pressure constant, a smoker's risk of stroke is 8.7399 percentage points higher than a nonsmoker's.
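The test on β3 can also be sketched without Excel. The code below refits the model, estimates the standard error of b3 from the residual mean square and the matching diagonal entry of (X'X)^-1, and compares the t statistic with the two-tailed 0.05 critical value for 16 degrees of freedom (2.120, from a standard t table). It uses a critical value rather than a p-value because the Python standard library has no t distribution; the helper `solve` is illustrative.

```python
from math import sqrt

# (risk, age, systolic blood pressure, smoker dummy) from the data table
DATA = [
    (12, 57, 152, 0), (24, 67, 163, 0), (13, 58, 155, 0), (56, 86, 177, 1),
    (28, 59, 196, 0), (51, 76, 189, 1), (18, 56, 155, 1), (31, 78, 120, 0),
    (37, 80, 135, 1), (15, 78, 98, 0), (22, 71, 152, 0), (36, 70, 173, 1),
    (15, 67, 135, 1), (48, 77, 209, 1), (15, 60, 199, 0), (36, 82, 119, 1),
    (8, 66, 166, 0), (34, 80, 125, 1), (3, 62, 117, 0), (37, 59, 207, 1),
]


def solve(a, b):
    """Gaussian elimination with partial pivoting on a copy of the inputs."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


X = [[1.0, age, bp, smoker] for _, age, bp, smoker in DATA]
y = [float(risk) for risk, *_ in DATA]
n, k = len(X), len(X[0])
xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
beta = solve(xtx, xty)

# Residual mean square with n - k = 16 degrees of freedom.
residuals = [yi - sum(b * v for b, v in zip(beta, row)) for row, yi in zip(X, y)]
mse = sum(e * e for e in residuals) / (n - k)

# The last column of (X'X)^-1 solves (X'X) z = e3; z[3] scales Var(b3).
var_b3 = mse * solve(xtx, [0.0, 0.0, 0.0, 1.0])[3]
t_stat = beta[3] / sqrt(var_b3)

T_CRIT = 2.120  # two-tailed critical value, alpha = 0.05, 16 df
print(f"b3 = {beta[3]:.4f}, t = {t_stat:.3f}, reject H0: {abs(t_stat) > T_CRIT}")
```

Rejecting when |t| exceeds the critical value is equivalent to the p-value falling below 0.05, so this check should reach the same decision as the p-value comparison in the text.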
c. For a patient with the profile of Art Speen (a 68-year-old smoker who has blood pressure of 175), the predicted risk of a stroke is:
ŷ = -91.7595 + 1.0767(68) + 0.2518(175) + 8.7399(1) ≈ 34.26
or a probability of approximately .34.
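The prediction is a direct substitution into the fitted equation. In the sketch below the coefficients are taken as given, rounded to four decimals from a least-squares fit of the table data, so the result may differ from an unrounded calculation in the last digit. The function name `predicted_risk` is illustrative.

```python
# Coefficients of the fitted equation, rounded to four decimals (assumed as given).
B0, B1, B2, B3 = -91.7595, 1.0767, 0.2518, 8.7399


def predicted_risk(age, pressure, smoker):
    """Predicted 10-year stroke risk (probability times 100)."""
    return B0 + B1 * age + B2 * pressure + B3 * (1 if smoker else 0)


# Profile used in the solution text: 68-year-old smoker, blood pressure 175.
risk = predicted_risk(68, 175, smoker=True)
print(f"predicted risk = {risk:.2f}, probability = {risk / 100:.2f}")
```

Dividing by 100 converts the risk score back to a probability, since risk is defined as the probability times 100.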
d. Other factors that could be included in the model as independent variables include family history of stroke, weight/obesity, and gender.