Question

Construct a regression model for predicting total charges from length of stay for DRG 105.

a. State the null and alternative hypotheses and alpha level.

b. Prepare a scatter diagram with the regression line for the two variables.

c. What are r and r²? What is the importance of the r and r² results?

d. What is the regression equation?

e. What are your conclusions?

         Gender   Age   LOS   Charges     Payor
 1       Female    47    20   $91,683     Medicaid
 2       Female    75    43   $93,708     Medicare
 3       Female    84     7   $21,446     Medicare
 4       Female    50    13   $37,797     Medicare
 5       Male      77    14   $54,364     Medicare
 6       Male      57     4   $17,626     Medicare
 7       Male      73     4   $12,832     Medicare
 8       Female    56     1   $36,153     Medicaid
 9       Male      69     1   $14,907     Medicaid
10       Female    81    23   $104,148    Medicare
11       Male      21     5   $21,423     Medicaid
12       Female    37     5   $24,971     Medicaid
13       Female    69     4   $17,022     Medicare
14       Female    89    17   $50,652     Medicare
15       Male      28    35   $186,496    Medicaid
16       Male      47     6   $24,441     Medicaid
17       Male      87    11   $35,349     Medicare
18       Female    85     5   $22,155     Medicare
19       Male      56     5   $24,455     Managed Care
20       Male      45    11   $36,401     Medicaid
21       Male      82     6   $25,783     Medicare
22       Female    65    10   $37,055     Managed Care
23       Male      67     4   $19,236     Medicare
24       Male      59    23   $60,132     Other
25       Female    67     7   $35,777     Medicare
26       Male      53     4   $19,972     Managed Care
27       Male      71     7   $25,409     Medicare
28       Female    79     6   $281,140    Medicare
29       Male      63     1   $41,283     Medicaid
30       Male      53    19   $71,439     Medicaid
31       Female    75     9   $33,735     Medicare
32       Female    68     9   $37,830     Gov Mngd Care
33       Male      37     4   $22,311     Medicaid
Total N  33        33    33   33          33

Solutions

Expert Solution

I used R software to solve this question.

R code and output:

> d=read.table('data.csv',header=T,sep=',')
> head(d)
  LOS Charges
1  20 $91,683
2  43 $93,708
3   7 $21,446
4  13 $37,797
5  14 $54,364
6   4 $17,626
> attach(d)
The following objects are masked from d (pos = 3):

    Charges, LOS

> Charge=as.numeric(Charges)
> fit=lm(Charge~LOS)
> summary(fit)

Call:
lm(formula = Charge ~ LOS)

Residuals:
    Min      1Q  Median      3Q     Max
-20.850  -5.925   1.690   7.152  13.614

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  13.0014     2.3517   5.529 4.72e-06 ***
LOS           0.3847     0.1675   2.297   0.0285 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 9.082 on 31 degrees of freedom
Multiple R-squared:  0.1454, Adjusted R-squared:  0.1179
F-statistic: 5.275 on 1 and 31 DF,  p-value: 0.02855

> plot(LOS, Charge)
> abline(fit)
> cor(Charge,LOS)
[1] 0.3813442
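The coefficients in the transcript are on an odd scale for charges in the tens of thousands of dollars (intercept about 13, residuals within about 21 units). A likely explanation is that Charges was read in as a factor of strings such as "$91,683", so as.numeric(Charges) returned the factor level codes 1 to 33 instead of dollar amounts. The sketch below is not part of the original solution. It retypes the data from the question's table and uses illustrative names (charges_raw, codes, charge_usd, fit_codes, fit_usd) to compare a fit on the level codes with a fit on the parsed dollar values.

```r
# Length of stay (days) and charges exactly as printed in the question's table.
los <- c(20, 43, 7, 13, 14, 4, 4, 1, 1, 23, 5, 5, 4, 17, 35, 6, 11,
         5, 5, 11, 6, 10, 4, 23, 7, 4, 7, 6, 1, 19, 9, 9, 4)
charges_raw <- c(
  "$91,683", "$93,708", "$21,446", "$37,797", "$54,364", "$17,626",
  "$12,832", "$36,153", "$14,907", "$104,148", "$21,423", "$24,971",
  "$17,022", "$50,652", "$186,496", "$24,441", "$35,349", "$22,155",
  "$24,455", "$36,401", "$25,783", "$37,055", "$19,236", "$60,132",
  "$35,777", "$19,972", "$25,409", "$281,140", "$41,283", "$71,439",
  "$33,735", "$37,830", "$22,311")

# as.numeric() on a factor returns the internal level codes (1..33 here),
# i.e. the alphabetical rank of each string, not its dollar value.
codes <- as.numeric(factor(charges_raw))
fit_codes <- lm(codes ~ los)

# Strip "$" and "," before converting, so the response is in dollars.
charge_usd <- as.numeric(gsub("[$,]", "", charges_raw))
fit_usd <- lm(charge_usd ~ los)

print(coef(fit_codes))             # about 13.0014 and 0.3847, as in the transcript
print(coef(fit_usd))               # intercept and slope in dollars
print(cor(charge_usd, los))        # r on the dollar scale
print(summary(fit_usd)$r.squared)  # r-squared on the dollar scale
```

If the fit on the level codes reproduces the transcript's coefficients, the answers to parts c to e should be read from fit_usd rather than from the printed output.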

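Que.a

Hypotheses:

Null hypothesis (H0): the slope for LOS is zero, i.e. the fitted regression model is not statistically significant.

Alternative hypothesis (H1): the slope for LOS is not zero, i.e. the fitted regression model is statistically significant.

Alpha level: α = 0.05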

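Que.b

Scatter plot: Charge (vertical axis) against LOS (horizontal axis) with the fitted regression line, as drawn by plot(LOS, Charge) and abline(fit).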

Que.c

Correlation coefficient = r = 0.3813442

Coefficient of determination = r² = 0.1454

The correlation coefficient r measures the strength and direction of the linear relationship between the two variables; here it is positive and fairly weak. The coefficient of determination r² is the proportion of the variation in the response that is explained by the predictor: about 14.5% of the variation in the fitted Charge variable is explained by LOS.
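In simple linear regression the "Multiple R-squared" in the output is the square of the correlation coefficient, and the adjusted value follows from it and the sample size. The short check below is an illustrative sketch that uses only the printed r = 0.3813442 and n = 33; the variable names are chosen here, not taken from the solution.

```r
r <- 0.3813442   # correlation reported by cor(Charge, LOS)
n <- 33          # number of observations
k <- 1           # number of predictors (LOS only)

r2 <- r^2
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(round(r2, 4))      # 0.1454, the Multiple R-squared in the output
print(round(adj_r2, 4))  # 0.1179, the Adjusted R-squared in the output
```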

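Que.d

Regression equation (from the output above):

Charge = 13.0014 + 0.3847 × LOS

Note that Charge here is not in dollars. The Charges column holds strings such as "$91,683", and as.numeric(Charges) on such a column, read in as a factor, returns the factor level codes 1 to 33 rather than the dollar amounts. That is why the fitted value at the mean LOS of 10.39 is 17, the mean of the integers 1 to 33, and why the residuals lie between -20.85 and 13.61. The r, r², and p-value in the output are computed from the same codes. Refitting by least squares on the dollar amounts in the table gives approximately

Total charges ≈ 21,505 + 2,710 × LOS (in dollars)

so each additional day of stay is associated with roughly $2,710 more in total charges.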

Que.e

Since the p-value for the t test of the slope is 0.0285, which is less than α = 0.05, we reject the null hypothesis and conclude that the fitted regression model is statistically significant.
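The t value and p-value for the slope can be re-derived from the printed estimate, standard error, and residual degrees of freedom. The sketch below uses the rounded figures from the output, so the results agree with the printed ones only up to rounding; the variable names are illustrative.

```r
estimate  <- 0.3847   # slope estimate for LOS from the output
std_error <- 0.1675   # its standard error
df        <- 31       # residual degrees of freedom (n - 2)
alpha     <- 0.05

t_value <- estimate / std_error
p_value <- 2 * pt(-abs(t_value), df)   # two-sided p-value

print(round(t_value, 3))   # 2.297, the t value in the output
print(p_value)             # close to the printed 0.0285
print(p_value < alpha)     # TRUE, so H0 is rejected at the 5% level
```

With a single predictor, the F-statistic in the output is the square of this t value, which is why the two p-values agree.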

