Construct a regression model for predicting total charges from length of stay for DRG 105.
a. State the null and alternative hypotheses and alpha level.
b. Prepare a scatter diagram with the regression line for the two variables.
c. What are the r and r²? What is the importance of the r and r² results?
d. What is the regression equation?
e. What are your conclusions?
No.      Gender  Age  LOS  Charges    Payor
1        Female  47   20   $91,683    Medicaid
2        Female  75   43   $93,708    Medicare
3        Female  84   7    $21,446    Medicare
4        Female  50   13   $37,797    Medicare
5        Male    77   14   $54,364    Medicare
6        Male    57   4    $17,626    Medicare
7        Male    73   4    $12,832    Medicare
8        Female  56   1    $36,153    Medicaid
9        Male    69   1    $14,907    Medicaid
10       Female  81   23   $104,148   Medicare
11       Male    21   5    $21,423    Medicaid
12       Female  37   5    $24,971    Medicaid
13       Female  69   4    $17,022    Medicare
14       Female  89   17   $50,652    Medicare
15       Male    28   35   $186,496   Medicaid
16       Male    47   6    $24,441    Medicaid
17       Male    87   11   $35,349    Medicare
18       Female  85   5    $22,155    Medicare
19       Male    56   5    $24,455    Managed Care
20       Male    45   11   $36,401    Medicaid
21       Male    82   6    $25,783    Medicare
22       Female  65   10   $37,055    Managed Care
23       Male    67   4    $19,236    Medicare
24       Male    59   23   $60,132    Other
25       Female  67   7    $35,777    Medicare
26       Male    53   4    $19,972    Managed Care
27       Male    71   7    $25,409    Medicare
28       Female  79   6    $281,140   Medicare
29       Male    63   1    $41,283    Medicaid
30       Male    53   19   $71,439    Medicaid
31       Female  75   9    $33,735    Medicare
32       Female  68   9    $37,830    Gov Mngd Care
33       Male    37   4    $22,311    Medicaid
Total N  33      33   33   33         33
Answer:-
I used R to solve this problem.
R code and output:
> d=read.table('data.csv',header=T,sep=',')
> head(d)
  LOS  Charges
1  20  $91,683
2  43  $93,708
3   7  $21,446
4  13  $37,797
5  14  $54,364
6   4  $17,626
> attach(d)
The following objects are masked from d (pos = 3):
Charges, LOS
> Charge=as.numeric(Charges)
> fit=lm(Charge~LOS)
> summary(fit)
Call:
lm(formula = Charge ~ LOS)
Residuals:
    Min      1Q  Median      3Q     Max
-20.850  -5.925   1.690   7.152  13.614
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  13.0014     2.3517   5.529 4.72e-06 ***
LOS           0.3847     0.1675   2.297   0.0285 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 9.082 on 31 degrees of freedom
Multiple R-squared: 0.1454, Adjusted R-squared: 0.1179
F-statistic: 5.275 on 1 and 31 DF, p-value: 0.02855
> plot(LOS, Charge)
> abline(fit)
> cor(Charge,LOS)
[1] 0.3813442
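One caution about the transcript: the Charges column holds text such as "$91,683", so as.numeric(Charges) cannot return dollar amounts. On a factor it returns the level codes, and on character strings it returns NA. The sketch below is one possible way to get a dollar-scale fit. It assumes the currency symbol and thousands separator are the only non-numeric characters; the names codes, charge_usd and fit_usd are chosen here for illustration and do not appear in the original session.

```r
# LOS and Charges copied from the data table in the question.
LOS <- c(20, 43, 7, 13, 14, 4, 4, 1, 1, 23, 5, 5, 4, 17, 35, 6, 11,
         5, 5, 11, 6, 10, 4, 23, 7, 4, 7, 6, 1, 19, 9, 9, 4)
Charges <- c("$91,683", "$93,708", "$21,446", "$37,797", "$54,364", "$17,626",
             "$12,832", "$36,153", "$14,907", "$104,148", "$21,423", "$24,971",
             "$17,022", "$50,652", "$186,496", "$24,441", "$35,349", "$22,155",
             "$24,455", "$36,401", "$25,783", "$37,055", "$19,236", "$60,132",
             "$35,777", "$19,972", "$25,409", "$281,140", "$41,283", "$71,439",
             "$33,735", "$37,830", "$22,311")

# What as.numeric() yields when Charges is a factor:
# the level codes 1..33, not dollar amounts.
codes <- as.numeric(factor(Charges))

# Strip "$" and "," first, then convert, to obtain dollars.
charge_usd <- as.numeric(gsub("[$,]", "", Charges))

# Refit on the dollar scale (assumed variable names, see above).
fit_usd <- lm(charge_usd ~ LOS)
summary(fit_usd)
```

The coefficients, r, r² and p-value from this refit are in dollar units and will generally differ from the values reported in the transcript.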
a. State the null and alternative hypotheses and alpha level.
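Hypotheses:
H0: β1 = 0. The slope for LOS is zero, i.e. the fitted regression model is not statistically significant.
H1: β1 ≠ 0. The slope for LOS is not zero, i.e. the fitted regression model is statistically significant.
Alpha level: α = 0.05.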
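b. Prepare a scatter diagram with the regression line for the two variables.
Scatter plot: [figure not reproduced: Charge plotted against LOS with the fitted regression line, drawn with plot(LOS, Charge) and abline(fit)]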
c. What are the r and r²? What is the importance of the r and r² results?
Correlation coefficient = r = 0.3813442
Coefficient of determination = r² = 0.1454
The correlation coefficient r measures the strength and direction of the linear relationship between the two variables; r ≈ 0.38 indicates a weak-to-moderate positive relationship. The coefficient of determination r² is the proportion of the variation in the response that is explained by the predictor; here LOS explains about 14.5% of the variation in the Charge variable used in the fit.
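As a quick consistency check, in simple linear regression r² is the square of r, and the adjusted R² follows from r² and the sample size. The sketch below uses only numbers reported in the R output; the variable names are chosen here for illustration.

```r
r <- 0.3813442   # cor(Charge, LOS) from the R output
n <- 33          # number of cases in the data table

# Coefficient of determination: the square of the correlation coefficient.
r_squared <- r^2

# Adjusted R-squared for one predictor: 1 - (1 - r^2) * (n - 1) / (n - 2).
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / (n - 2)

print(round(c(r_squared = r_squared, adj_r_squared = adj_r_squared), 4))
```

Both values agree with the "Multiple R-squared" (0.1454) and "Adjusted R-squared" (0.1179) lines of the summary output.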
d. What is the regression equation?
Regression equation:
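Charge = 13.0014 + 0.3847 × LOS
Here Charge is the variable created by as.numeric(Charges), not total charges in dollars. Because Charges holds text such as "$91,683", that call returns the factor level codes 1 to 33. This shows in the fit: the predicted value at the mean LOS (343/33 ≈ 10.39) is 13.0014 + 0.3847 × 10.39 ≈ 17.0, which is the mean of the codes 1 to 33 and far below any charge in the table. A dollar-scale equation requires removing the "$" and "," from Charges before fitting.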
e. What are your conclusions?
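Since the p-value of the t test for the significance of the slope is 0.0285, which is less than α = 0.05, we reject the null hypothesis and conclude that the fitted regression model is statistically significant. The relationship is weak, however: with r² = 0.1454, LOS explains only about 14.5% of the variation in the Charge variable as it was coded in the fit.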
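The decision can be reproduced from the reported slope estimate and its standard error. This is a minimal sketch that takes its inputs from the summary(fit) output rather than refitting the model; the variable names are chosen here for illustration.

```r
# Values copied from the summary(fit) output.
estimate <- 0.3847   # slope estimate for LOS
std_err  <- 0.1675   # standard error of the slope
df       <- 31       # residual degrees of freedom (n - 2 = 33 - 2)
alpha    <- 0.05     # significance level

# t statistic for H0: slope = 0, and its two-sided p-value.
t_stat  <- estimate / std_err
p_value <- 2 * pt(-abs(t_stat), df)

# Decision rule: reject H0 when the p-value is below alpha.
reject_h0 <- p_value < alpha

print(c(t = t_stat, p = p_value))
print(reject_h0)
```

Because the inputs are rounded to four decimals, the computed t statistic and p-value match the reported 2.297 and 0.0285 only up to rounding.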