Suppose that a health researcher wants to determine whether the distance between a patient’s residence and source of care and the number of disability days significantly predict the number of visits to a local hospital.
Visits | Distance | Disability days |
7 | 1 | 8 |
7 | 1 | 9 |
2 | 10 | 1 |
4 | 4 | 3 |
5 | 2 | 5 |
3 | 9 | 2 |
2 | 10 | 1 |
9 | 2 | 10 |
6 | 3 | 7 |
2 | 11 | 2 |
4 | 2 | 3 |
7 | 2 | 8 |
3 | 14 | 2 |
2 | 16 | 1 |
8 | 2 | 9 |
4 | 4 | 2 |
6 | 2 | 5 |
a. Determine the multiple regression equation for the data and interpret the regression coefficients.
b. Provide an interpretation of the coefficient of multiple determination, R².
c. At the 5% significance level, determine if the number of disability days and the distance between the patient’s residence and source of care significantly predict the number of visits.
a. Determine the multiple regression equation for the data and interpret the regression coefficients.
First, arrange the data in Excel as follows:
Visits | Distance | Disability days |
7 | 1 | 8 |
7 | 1 | 9 |
2 | 10 | 1 |
4 | 4 | 3 |
5 | 2 | 5 |
3 | 9 | 2 |
2 | 10 | 1 |
9 | 2 | 10 |
6 | 3 | 7 |
2 | 11 | 2 |
4 | 2 | 3 |
7 | 2 | 8 |
3 | 14 | 2 |
2 | 16 | 1 |
8 | 2 | 9 |
4 | 4 | 2 |
6 | 2 | 5 |
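Next, fit the multiple regression with Excel's Data Analysis tool (Data > Data Analysis > Regression), using Visits as the Y range and Distance and Disability days as the X range.
The output of the regression is: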
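SUMMARY OUTPUT

Regression Statistics
Multiple R | 0.980 |
R Square | 0.960 |
Adjusted R Square | 0.954 |
Standard Error | 0.496 |
Observations | 17 |

ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 2 | 81.615 | 40.808 | 165.911 | 0.000 |
Residual | 14 | 3.443 | 0.246 | | |
Total | 16 | 85.059 | | | |

Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
Intercept | 2.469 | 0.469 | 5.262 | 0.000 | 1.463 | 3.475 |
Distance | -0.081 | 0.038 | -2.104 | 0.054 | -0.163 | 0.002 |
Disability days | 0.599 | 0.059 | 10.229 | 0.000 | 0.473 | 0.724 |

From the Coefficients column, the estimated regression equation is:
Visits = 2.469 - 0.081 (Distance) + 0.599 (Disability days)
Interpretation: holding disability days constant, each additional unit of distance is associated with about 0.081 fewer visits on average. Holding distance constant, each additional disability day is associated with about 0.599 more visits on average. The intercept, 2.469, is the predicted number of visits when both predictors are zero.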
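As a cross-check outside Excel, the coefficients in the output above can be reproduced with a short Python sketch that solves the least-squares normal equations directly. It assumes the 17 rows are transcribed correctly from the data table; `DATA` and `fit_two_predictors` are illustrative names and not part of the original solution.

```python
# Observations as (visits, distance, disability_days), copied from the data table.
DATA = [
    (7, 1, 8), (7, 1, 9), (2, 10, 1), (4, 4, 3), (5, 2, 5), (3, 9, 2),
    (2, 10, 1), (9, 2, 10), (6, 3, 7), (2, 11, 2), (4, 2, 3), (7, 2, 8),
    (3, 14, 2), (2, 16, 1), (8, 2, 9), (4, 4, 2), (6, 2, 5),
]


def fit_two_predictors(rows):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2; returns (b0, b1, b2)."""
    n = len(rows)
    y_bar = sum(r[0] for r in rows) / n
    x1_bar = sum(r[1] for r in rows) / n
    x2_bar = sum(r[2] for r in rows) / n

    # Centered sums of squares and cross-products.
    s11 = sum((r[1] - x1_bar) ** 2 for r in rows)
    s22 = sum((r[2] - x2_bar) ** 2 for r in rows)
    s12 = sum((r[1] - x1_bar) * (r[2] - x2_bar) for r in rows)
    s1y = sum((r[1] - x1_bar) * (r[0] - y_bar) for r in rows)
    s2y = sum((r[2] - x2_bar) * (r[0] - y_bar) for r in rows)

    # Solve the 2x2 normal equations for the two slopes (Cramer's rule).
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det

    # The fitted plane passes through the point of means.
    b0 = y_bar - b1 * x1_bar - b2 * x2_bar
    return b0, b1, b2


b0, b1, b2 = fit_two_predictors(DATA)
print(f"Visits = {b0:.3f} + ({b1:.3f})*Distance + ({b2:.3f})*DisabilityDays")
```

The printed coefficients should agree with the Intercept, Distance, and Disability days rows of the Excel output to three decimal places.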
b. Provide an interpretation of the coefficient of multiple determination, R².
From the output, R² = 0.960, i.e., 96%.
About 96% of the variability in the number of visits is explained by the regression on distance and disability days.
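The R² figure can be recovered from the ANOVA table as the regression sum of squares divided by the total sum of squares. The sketch below uses the rounded values printed in the Excel output, so it only matches to about three decimals; the variable names are illustrative.

```python
# Values copied from the ANOVA table in the Excel output (rounded to 3 decimals).
ss_regression = 81.615
ss_total = 85.059
n_obs = 17
n_predictors = 2

# R^2: share of the total variation in Visits explained by the model.
r_squared = ss_regression / ss_total

# Adjusted R^2 penalizes for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

print(f"R^2 = {r_squared:.3f}, adjusted R^2 = {adj_r_squared:.3f}")
```

The adjusted value is slightly lower than R² because it accounts for the two predictors relative to the 17 observations.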
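c. At the 5% significance level, determine if the number of disability days and the distance between the patient’s residence and source of care significantly predict the number of visits.
From the output:
Overall model: F = 165.911 with Significance F reported as 0.000, which is less than 0.05, so distance and disability days together significantly predict the number of visits.
Distance: the P-value is 0.054, which is greater than 0.05, so distance is not a significant predictor of the number of visits once disability days is in the model.
Disability days: the P-value is reported as 0.000, which is less than 0.05, so disability days significantly predicts the number of visits.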
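The same conclusions can be checked by recomputing the F and t statistics from the raw data and comparing them with critical values. This is a minimal sketch, not the method used in the original answer: the critical values `F_CRIT` and `T_CRIT` are assumptions taken from standard F and t tables (2 and 14 degrees of freedom, and 14 degrees of freedom two-sided), and `significance_stats` is an illustrative name.

```python
import math

# Observations as (visits, distance, disability_days), copied from the data table.
DATA = [
    (7, 1, 8), (7, 1, 9), (2, 10, 1), (4, 4, 3), (5, 2, 5), (3, 9, 2),
    (2, 10, 1), (9, 2, 10), (6, 3, 7), (2, 11, 2), (4, 2, 3), (7, 2, 8),
    (3, 14, 2), (2, 16, 1), (8, 2, 9), (4, 4, 2), (6, 2, 5),
]

# Assumed critical values at alpha = 0.05, from standard tables (not from the source).
F_CRIT = 3.74   # F distribution, 2 and 14 degrees of freedom
T_CRIT = 2.145  # t distribution, 14 degrees of freedom, two-sided


def significance_stats(rows):
    """Return (F, t_distance, t_disability) for y = b0 + b1*x1 + b2*x2."""
    n = len(rows)
    y_bar = sum(r[0] for r in rows) / n
    x1_bar = sum(r[1] for r in rows) / n
    x2_bar = sum(r[2] for r in rows) / n

    # Centered sums of squares and cross-products.
    s11 = sum((r[1] - x1_bar) ** 2 for r in rows)
    s22 = sum((r[2] - x2_bar) ** 2 for r in rows)
    s12 = sum((r[1] - x1_bar) * (r[2] - x2_bar) for r in rows)
    s1y = sum((r[1] - x1_bar) * (r[0] - y_bar) for r in rows)
    s2y = sum((r[2] - x2_bar) * (r[0] - y_bar) for r in rows)
    det = s11 * s22 - s12 ** 2

    # Least-squares slopes.
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det

    # ANOVA decomposition: total = regression + residual.
    ss_total = sum((r[0] - y_bar) ** 2 for r in rows)
    ss_regression = b1 * s1y + b2 * s2y
    ss_residual = ss_total - ss_regression
    mse = ss_residual / (n - 3)  # residual df = n - (2 predictors + intercept)

    f_stat = (ss_regression / 2) / mse
    t1 = b1 / math.sqrt(mse * s22 / det)  # standard error of the Distance slope
    t2 = b2 / math.sqrt(mse * s11 / det)  # standard error of the Disability days slope
    return f_stat, t1, t2


f_stat, t_distance, t_disability = significance_stats(DATA)
print("Overall model significant:", f_stat > F_CRIT)
print("Distance significant:", abs(t_distance) > T_CRIT)
print("Disability days significant:", abs(t_disability) > T_CRIT)
```

Comparing a statistic with its critical value is equivalent to comparing the P-value with 0.05, so this check should lead to the same decisions as reading the P-value column of the Excel output.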