Choose or collect a data file and use regression theory to set up a model. Calculate the parameter values, obtain the regression model, and analyze the meaning of the model, including R-squared, the F-test, and the t-tests, and explain the relationship between the dependent variable and the independent variables. The analysis should present the charts, the correlation, and the regression output table. A multiple regression model is preferred, ideally one that includes dummy variables.
ANSWER:
Collected data: a sample of 20 observations used to compare smokers and non-smokers
Dependent variable = Risk
Independent variables = Age, Pressure, Smoker
Categorical (dummy) variable = Smoker, coded Yes = 1 and No = 0 with the following Excel formula:
=IF(E4="Yes",1,0)
Risk | Age | Pressure | Smoker | Risk(Y) | Age (X1) | Pressure(X2) | Smoker (X3)
12 | 57 | 152 | No | 12 | 57 | 152 | 0
24 | 67 | 163 | No | 24 | 67 | 163 | 0
13 | 58 | 155 | No | 13 | 58 | 155 | 0
56 | 86 | 177 | Yes | 56 | 86 | 177 | 1
28 | 59 | 196 | No | 28 | 59 | 196 | 0
51 | 76 | 189 | Yes | 51 | 76 | 189 | 1
18 | 56 | 155 | Yes | 18 | 56 | 155 | 1
31 | 78 | 120 | No | 31 | 78 | 120 | 0
37 | 80 | 135 | Yes | 37 | 80 | 135 | 1
15 | 78 | 98 | No | 15 | 78 | 98 | 0
22 | 71 | 152 | No | 22 | 71 | 152 | 0
36 | 70 | 173 | Yes | 36 | 70 | 173 | 1
15 | 67 | 135 | Yes | 15 | 67 | 135 | 1
48 | 77 | 209 | Yes | 48 | 77 | 209 | 1
15 | 60 | 199 | No | 15 | 60 | 199 | 0
36 | 82 | 119 | Yes | 36 | 82 | 119 | 1
8 | 66 | 166 | No | 8 | 66 | 166 | 0
34 | 80 | 125 | Yes | 34 | 80 | 125 | 1
3 | 62 | 117 | No | 3 | 62 | 117 | 0
37 | 59 | 207 | Yes | 37 | 59 | 207 | 1
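The Yes/No coding in the last column can also be reproduced outside Excel. The following is a minimal Python sketch, assuming the labels are typed in by hand from the Smoker column above; the variable names are illustrative and not part of the original spreadsheet.

```python
# Smoker labels copied from the data table, in row order.
smoker_labels = [
    "No", "No", "No", "Yes", "No", "Yes", "Yes", "No", "Yes", "No",
    "No", "Yes", "Yes", "Yes", "No", "Yes", "No", "Yes", "No", "Yes",
]

# Same rule as the Excel formula =IF(E4="Yes",1,0): Yes -> 1, anything else -> 0.
smoker_dummy = [1 if label == "Yes" else 0 for label in smoker_labels]

smokers = sum(smoker_dummy)
print(smoker_dummy)
print("smokers:", smokers, "non-smokers:", len(smoker_dummy) - smokers)
```

With No as the reference category (coded 0), the Smoker coefficient in the regression measures the difference in predicted Risk for smokers relative to non-smokers.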
Regression Output:
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.934605
R Square | 0.873487
Adjusted R Square | 0.849766
Standard Error | 5.756575
Observations | 20
ANOVA
Source | df | SS | MS | F | Significance F
Regression | 3 | 3660.74 | 1220.247 | 36.82301 | 2.06E-07
Residual | 16 | 530.2104 | 33.13815
Total | 19 | 4190.95
Variable | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95%
Intercept | -91.7595 | 15.22276 | -6.02778 | 1.76E-05 | -124.03 | -59.4887
Age (X1) | 1.076741 | 0.165964 | 6.487814 | 7.49E-06 | 0.724914 | 1.428568
Pressure(X2) | 0.251813 | 0.045226 | 5.567951 | 4.24E-05 | 0.15594 | 0.347687
Smoker (X3) | 8.739871 | 3.000815 | 2.912499 | 0.010174 | 2.378427 | 15.10132
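The coefficients above came from Excel's regression tool. As a cross-check, the same least-squares fit can be sketched in plain Python by solving the normal equations (X'X)b = X'y. This is only an illustration: the helper names `solve` and `fit_ols` are made up for this example, and the data lists are copied by hand from the table in this answer.

```python
# Data copied from the table in this answer, in row order.
risk = [12, 24, 13, 56, 28, 51, 18, 31, 37, 15, 22, 36, 15, 48, 15, 36, 8, 34, 3, 37]
age = [57, 67, 58, 86, 59, 76, 56, 78, 80, 78, 71, 70, 67, 77, 60, 82, 66, 80, 62, 59]
pressure = [152, 163, 155, 177, 196, 189, 155, 120, 135, 98,
            152, 173, 135, 209, 199, 119, 166, 125, 117, 207]
smoker = [0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(y, *columns):
    """Return [intercept, b1, b2, ...] from the normal equations (X'X) b = X'y."""
    rows = [[1.0, *values] for values in zip(*columns)]  # leading 1.0 is the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


coefficients = fit_ols(risk, age, pressure, smoker)
for name, value in zip(["Intercept", "Age", "Pressure", "Smoker"], coefficients):
    print(f"{name}: {value:.4f}")
```

The printed values should agree closely with the Coefficients column of the output table. For larger or badly scaled data sets, a library least-squares routine would be a safer choice than hand-rolled normal equations.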
RESIDUAL OUTPUT (columns 1-4) | PROBABILITY OUTPUT (columns 5-6, Risk(Y) sorted in ascending order)
Observation | Predicted Risk(Y) | Residuals | Standard Residuals | Percentile | Risk(Y)
1 | 7.89039 | 4.10961 | 0.777953 | 2.5 | 3
2 | 21.42775 | 2.572252 | 0.48693 | 7.5 | 8
3 | 9.722571 | 3.277429 | 0.62042 | 12.5 | 12
4 | 54.15109 | 1.848912 | 0.350001 | 17.5 | 13
5 | 21.12366 | 6.876335 | 1.301696 | 22.5 | 15
6 | 46.40544 | 4.594561 | 0.869754 | 27.5 | 15
7 | 16.30896 | 1.69104 | 0.320115 | 32.5 | 15
8 | 22.44392 | 8.556079 | 1.619674 | 37.5 | 18
9 | 37.11448 | -0.11448 | -0.02167 | 42.5 | 22
10 | 16.90402 | -1.90402 | -0.36043 | 47.5 | 24
11 | 22.96476 | -0.96476 | -0.18263 | 52.5 | 28
12 | 35.91598 | 0.084023 | 0.015906 | 57.5 | 31
13 | 23.11684 | -8.11684 | -1.53653 | 62.5 | 34
14 | 52.51845 | -4.51845 | -0.85535 | 67.5 | 36
15 | 22.95585 | -7.95585 | -1.50605 | 72.5 | 36
16 | 35.23894 | 0.761058 | 0.144069 | 77.5 | 37
17 | 21.10645 | -13.1064 | -2.48106 | 82.5 | 37
18 | 34.59634 | -0.59634 | -0.11289 | 87.5 | 48
19 | 4.460623 | -1.46062 | -0.2765 | 92.5 | 51
20 | 32.63348 | 4.366516 | 0.826585 | 97.5 | 56
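The residual and probability columns can be rebuilt from the data and the reported coefficients. The sketch below is an illustration in Python, not Excel's own procedure. It assumes that the Standard Residuals column divides each residual by the sample standard deviation of the residuals (n - 1 in the denominator) and that the percentiles are 100(i - 0.5)/n; both assumptions agree closely with the printed values.

```python
import math

# Data copied from the table in this answer, in row order.
risk = [12, 24, 13, 56, 28, 51, 18, 31, 37, 15, 22, 36, 15, 48, 15, 36, 8, 34, 3, 37]
age = [57, 67, 58, 86, 59, 76, 56, 78, 80, 78, 71, 70, 67, 77, 60, 82, 66, 80, 62, 59]
pressure = [152, 163, 155, 177, 196, 189, 155, 120, 135, 98,
            152, 173, 135, 209, 199, 119, 166, 125, 117, 207]
smoker = [0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1]

# Coefficients as reported in the regression output table.
b0, b1, b2, b3 = -91.7595, 1.076741, 0.251813, 8.739871

predicted = [b0 + b1 * a + b2 * p + b3 * s for a, p, s in zip(age, pressure, smoker)]
residuals = [y - y_hat for y, y_hat in zip(risk, predicted)]  # observed minus predicted

n = len(risk)
mean_residual = sum(residuals) / n
# Assumption: scale by the sample standard deviation of the residuals (n - 1 denominator).
residual_sd = math.sqrt(sum((r - mean_residual) ** 2 for r in residuals) / (n - 1))
standard_residuals = [r / residual_sd for r in residuals]

# Probability output: the i-th smallest Risk is paired with percentile 100 * (i - 0.5) / n.
percentiles = [100 * (i - 0.5) / n for i in range(1, n + 1)]
sorted_risk = sorted(risk)

for i in range(3):
    print(i + 1, round(predicted[i], 3), round(residuals[i], 3), round(standard_residuals[i], 3))
```

Observation 17 has the largest residual in absolute value (about -13.1, a standard residual near -2.5), so it is the point the model fits least well.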
Regression Model:
Predicted Risk = -91.760 + 1.077 × Age + 0.252 × Pressure + 8.740 × Smoker (coefficients rounded to three decimals)
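To see how the model is used, here is a small Python sketch that evaluates it for one person. The function name `predict_risk` is made up for this example, and the coefficients are the unrounded values from the regression output table.

```python
def predict_risk(age, pressure, smoker):
    """Predicted Risk from the fitted model; smoker is 1 for Yes and 0 for No."""
    return -91.7595 + 1.076741 * age + 0.251813 * pressure + 8.739871 * smoker


# Hypothetical person with Age 57 and Pressure 152 (the values in the first data row).
as_non_smoker = predict_risk(57, 152, 0)
as_smoker = predict_risk(57, 152, 1)

print(round(as_non_smoker, 2), round(as_smoker, 2), round(as_smoker - as_non_smoker, 2))
```

The gap between the two predictions equals the Smoker coefficient (about 8.74): holding Age and Pressure fixed, a smoker's predicted Risk is about 8.74 points higher than a non-smoker's.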
Interpretations:
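2) All three independent variables (Age, Pressure, and Smoker) are statistically significant at the 5% level because each has a p-value below 0.05 (7.49E-06, 4.24E-05, and 0.010174, respectively). The overall F-test agrees: F = 36.82 with Significance F = 2.06E-07, so the model as a whole is significant.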
3) The correlation coefficient r always lies between -1 and 1 inclusive. The R-squared value, denoted R², is the square of the correlation. It measures the proportion of variation in the dependent variable that can be attributed to the independent variables. Here Multiple R = 0.93 and R-squared = 0.87, so the model explains about 87% of the variation in Risk. This indicates a strong positive linear association.
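The test statistics behind these interpretations can be recomputed from the numbers in the output table. The Python sketch below is a rough check rather than a full analysis: it uses t = coefficient / standard error, R² = SS(regression) / SS(total), and F = MS(regression) / MS(residual). The critical value 2.1199 is the tabulated two-sided 5% point of Student's t with 16 degrees of freedom, hard-coded here as an assumption rather than computed.

```python
import math

# Values copied from the regression output table.
n, k = 20, 3  # observations, predictors
ss_regression, ss_residual, ss_total = 3660.74, 530.2104, 4190.95
estimates = {  # name: (coefficient, standard error)
    "Intercept": (-91.7595, 15.22276),
    "Age": (1.076741, 0.165964),
    "Pressure": (0.251813, 0.045226),
    "Smoker": (8.739871, 3.000815),
}

# Tabulated two-sided 5% critical value of Student's t with n - k - 1 = 16 df.
T_CRIT = 2.1199

t_stats = {name: coef / se for name, (coef, se) in estimates.items()}
significant = {name: abs(t) > T_CRIT for name, t in t_stats.items()}
conf_int = {
    name: (coef - T_CRIT * se, coef + T_CRIT * se)
    for name, (coef, se) in estimates.items()
}

r_squared = ss_regression / ss_total
multiple_r = math.sqrt(r_squared)
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
f_stat = (ss_regression / k) / (ss_residual / (n - k - 1))

for name in estimates:
    low, high = conf_int[name]
    print(f"{name}: t = {t_stats[name]:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
print(f"R^2 = {r_squared:.4f}, adjusted R^2 = {adj_r_squared:.4f}, F = {f_stat:.2f}")
```

Every |t| exceeds the critical value and none of the 95% intervals contains zero, which matches the p-values below 0.05 in the output table.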
Graphs:
1) Residual Plots:
2) Line Fit Plot:
3) Normal Probability Plot: