Question

In: Statistics and Probability

Construct a regression model for predicting total charges from length of stay for DRG 105. a....

Construct a regression model for predicting total charges from length of stay for DRG 105.

a. State the null and alternative hypotheses and alpha level.

b. Prepare a scatter diagram with the regression line for the two variables.

c. What are the r and r2? What is the importance of the r and r2 results?

d. What is the regression equation?

e. What are your conclusions?

Gender Age LOS Charges Payor 1 Female 47 20 $91,683 Medicaid 2 Female 75 43 $93,708 Medicare 3 Female 84 7 $21,446 Medicare 4 Female 50 13 $37,797 Medicare 5 Male 77 14 $54,364 Medicare 6 Male 57 4 $17,626 Medicare 7 Male 73 4 $12,832 Medicare 8 Female 56 1 $36,153 Medicaid 9 Male 69 1 $14,907 Medicaid 10 Female 81 23 $104,148 Medicare 11 Male 21 5 $21,423 Medicaid 12 Female 37 5 $24,971 Medicaid 13 Female 69 4 $17,022 Medicare 14 Female 89 17 $50,652 Medicare 15 Male 28 35 $186,496 Medicaid 16 Male 47 6 $24,441 Medicaid 17 Male 87 11 $35,349 Medicare 18 Female 85 5 $22,155 Medicare 19 Male 56 5 $24,455 Managed Care 20 Male 45 11 $36,401 Medicaid 21 Male 82 6 $25,783 Medicare 22 Female 65 10 $37,055 Managed Care 23 Male 67 4 $19,236 Medicare 24 Male 59 23 $60,132 Other 25 Female 67 7 $35,777 Medicare 26 Male 53 4 $19,972 Managed Care 27 Male 71 7 $25,409 Medicare 28 Female 79 6 $281,140 Medicare 29 Male 63 1 $41,283 Medicaid 30 Male 53 19 $71,439 Medicaid 31 Female 75 9 $33,735 Medicare 32 Female 68 9 $37,830 Gov Mngd Care 33 Male 37 4 $22,311 Medicaid Total N 33 33 33 33 33

Solutions

Expert Solution

Answer:-

Given That:-

Construct a regression model for predicting total charges from length of stay for DRG 105.

Given,

I used R software to solve this question.

R codes and output:

> d=read.table('data.csv',header=T,sep=',')

> head(d)

LOS Charges

1 20 $91,683

2 43 $93,708

3 7 $21,446

4 13 $37,797

5 14 $54,364

6 4 $17,626

> attach(d)

The following objects are masked from d (pos = 3):

Charges, LOS

> Charge=as.numeric(Charges)

> fit=lm(Charge~LOS)

> summary(fit)

Call:

lm(formula = Charge ~ LOS)

Residuals:

Min 1Q Median 3Q Max

-20.850 -5.925 1.690 7.152 13.614

Coefficients:

Estimate Std. Error t value Pr(>|t|)   

(Intercept) 13.0014 2.3517 5.529 4.72e-06 ***

LOS 0.3847 0.1675 2.297 0.0285 *

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 9.082 on 31 degrees of freedom

Multiple R-squared: 0.1454, Adjusted R-squared: 0.1179

F-statistic: 5.275 on 1 and 31 DF, p-value: 0.02855

> plot(LOS, Charge)

> abline(fit)

> cor(Charge,LOS)

[1] 0.3813442

a. State the null and alternative hypotheses and alpha level.

Hypothesis:

The fitted regression model is not statistically significant.

The fitted regression model is statistically significant.


b. Prepare a scatter diagram with the regression line for the two variables.

Scatter plot:

c. What are the r and r2? What is the importance of the r and r2 results?

Correlation coefficient = r = 0.3813442

Coefficient of determination = r2 = 0.1454

Correlation coefficient (r) gives the strength of linear relationship between two variables whereas Coefficient of determination tells information about how much variation in one variable is explained by the other variable.

d. What is the regression equation?

Regression equation:

Total charges = 13.0014 + 0.3847 LOS

e. What are your conclusions?

Since p-value for t test ( for testing significance of slope ) is 0.0285, which is less than 0.05, hence we reject null hypothesis and conclude that fitted regression model is statistically significant.

Thank you for your supporting.Please upvote my answer...


Related Solutions

Construct a regression model for predicting total charges from length of stay for DRG 105. a....
Construct a regression model for predicting total charges from length of stay for DRG 105. a. State the null and alternative hypotheses and alpha level. b. Prepare a scatter diagram with the regression line for the two variables. c. What are the r and r2? What is the importance of the r and r2 results? d. What is the regression equation? e. What are your conclusions?      Gender     Age     LOS     Charges    Payor 1    Female     47        20 $91,683    Medicaid...
Suppose a statistician built a multiple regression model for predicting the total number of runs scored...
Suppose a statistician built a multiple regression model for predicting the total number of runs scored by a baseball team during a season. Using data for n=200 samples, the results below were obtained. Complete parts a through d. Ind. Var. β estimate Standard Error Ind. Var.. β estimate Standard Error Intercept 3.88 17.03 Doubles (X3) 0.74 0.04 Walks (X1) 0.37 0.05 Triples (X4) 1.17 0.23 Singles (X2) 0.51 0.05 Home Runs (X5) 1.44 0.04 a. Write the least squares prediction...
This question presents regression output from a model predicting life expectancy from gross national product. The...
This question presents regression output from a model predicting life expectancy from gross national product. The model output is also provided below: Variable Estimate Standard Error T Value Pr (> |t| ) Intercept 69.4 0.54 126.7 0.000 Gross National Product 0.000323 0.00004 8.06 0.000 (a) Write out the regression equation (b) Interpret the coefficient and the slope (c) What are the hypotheses for evaluating if gross national product has any impact on life expectancy? (d) State the conclusion of the...
A least-squares simple linear regression model was fit predicting duration (in minutes) of a dive from...
A least-squares simple linear regression model was fit predicting duration (in minutes) of a dive from depth of the dive (in meters) from a sample of 45 penguins' diving depths and times. Calculate the F-statistic for the regression by filling in the ANOVA table. SS df MS F-statistic Regression Residual 1628.4056 Total 367385.9237
In this project, you will collect data from real world to construct a multiple regression model....
In this project, you will collect data from real world to construct a multiple regression model. The resulting model will be used for a prediction purpose. For example, suppose you are interested in “sales price of houses”. In a multiple regression model, this is called a “response variable”. There are many important factors that affect the prices of houses. Those factors include size (square feet), number of bedrooms, number of baths, age of the house, distance to a major grocery...
Stay Length of Stay (LOS) Total Costs 1 3 $2,613.91 2 10 $8,769.03 3 2 $2,448.60...
Stay Length of Stay (LOS) Total Costs 1 3 $2,613.91 2 10 $8,769.03 3 2 $2,448.60 4 3 $2,568.70 5 3 $1,936.19 6 5 $7,230.71 7 5 $5,342.61 8 3 $4,108.13 9 1 $1,596.91 10 2 $4,061.28 11 2 $1,761.53 12 5 $4,779.19 13 1 $2,078.30 14 3 $4,713.61 15 4 $3,946.68 16 2 $2,902.74 17 1 $1,438.85 18 1 $820.21 19 1 $3,309.41 20 6 $5,476.33 INTERCEPT Now that you have determined the answer, it is time to provide...
The estimated regression equation for predicting the number of speeding tickets from a driver’s age is...
The estimated regression equation for predicting the number of speeding tickets from a driver’s age is given as Y = 5 – 0.06X. For every year older an individual gets, the estimated number of speeding tickets Need to know which is the answer: Increases by 5 tickets Decreases by 5 tickets Decreases by 0.06 tickets
The linear regression equation for predicting poverty (%) from the high school graduation rate (%) is...
The linear regression equation for predicting poverty (%) from the high school graduation rate (%) is as follows: ˆ y = 29 -0.2*x High school graduation rate for North Carolina is 22% and the poverty rate is 29.5%. Find the residual for this observation (round your answer to one decimal place)
a. Construct a scatterplot of the data and tell why a linear regression model is appropriate....
a. Construct a scatterplot of the data and tell why a linear regression model is appropriate. (Include this graph in your report.)   b. Run the linear regression procedure on StatCrunch and include the output in your report. c. Give the regression equation using the correct notation. d. Give the Coefficient of Determination AND interpret it.   e. Check the assumptions of the model by constructing each of the following plots and commenting on what they suggest in terms of the assumptions....
RPI would like to develop a multiple regression model for predicting graduate student Grade Point Averages....
RPI would like to develop a multiple regression model for predicting graduate student Grade Point Averages. The initial data from 30 grad students are in the file GPA.sav. The file contains the following variables: GPA (graduate grade point averages), GREQ (score on the quantitative section of the Graduate Record Exam, a commonly used entrance exam for graduate programs), GREV (score on the verbal section of the GRE), MAT (score on the Miller Analogies Test, another graduate entrance exam), and AR,...
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT