Question

Data from n = 113 hospitals in the United States are used to assess factors related to the likelihood that a hospital patient acquires an infection while hospitalized. The variables are y = infection risk, x1 = average length of patient stay, x2 = average patient age, and x3 = a measure of how many x-rays are given in the hospital. The Minitab output is as follows:

Regression Analysis: InfctRsk versus Stay, Age, Xray

Analysis of Variance

Source       DF   Adj SS   Adj MS  F-Value  P-Value
Regression    3   73.099   24.366    20.70    0.000
  Stay        1   31.684   31.684    26.92    0.000
  Age         1    1.126    1.126     0.96    0.330
  Xray        1   13.719   13.719    11.66    0.001
Error       109  128.281    1.177
Total       112  201.380

Model Summary

      S    R-sq  R-sq(adj)  R-sq(pred)
1.08484  36.30%     34.55%      30.64%

Coefficients

Term         Coef  SE Coef  T-Value  P-Value   VIF
Constant     1.00     1.31     0.76    0.448
Stay       0.3082   0.0594     5.19    0.000  1.23
Age       -0.0230   0.0235    -0.98    0.330  1.05
Xray      0.01966  0.00576     3.41    0.001  1.18

Regression Equation

InfctRsk = 1.00 + 0.3082 Stay - 0.0230 Age + 0.01966 Xray

  1. Set up the hypothesis test to decide whether there is a connection between the infection risk and the group of predictors.
  2. Then, decide, using tests, which of the predictors and constant should be in the final relationship.
  3. Give the value of the coefficient of determination and tell what it means.

Solutions

Expert Solution

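The regression model being estimated is

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$

where $\beta_0$ is the intercept, $\beta_1, \beta_2, \beta_3$ are the slope coefficients for Stay, Age and Xray, and $\varepsilon$ is a random error.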

  1. Set up the hypothesis test to decide whether there is a connection between the infection risk and the group of predictors.

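The hypotheses are

$H_0: \beta_1 = \beta_2 = \beta_3 = 0$ (all three slope coefficients are zero)
$H_a:$ at least one of $\beta_1, \beta_2, \beta_3$ is not zero

The test statistic for this test has an F distribution with 3 and 109 degrees of freedom, and it is obtained from the ANOVA table.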

Source       DF   Adj SS   Adj MS  F-Value  P-Value
Regression    3   73.099   24.366    20.70    0.000
  Stay        1   31.684   31.684    26.92    0.000
  Age         1    1.126    1.126     0.96    0.330
  Xray        1   13.719   13.719    11.66    0.001
Error       109  128.281    1.177
Total       112  201.380

The test statistic is

F = 20.70

The p-value is reported as 0.000, that is, p < 0.001.

We reject the null hypothesis if the p-value is less than the significance level.

Here, the p-value is less than the significance level of 0.05, so we reject the null hypothesis.

We conclude that, at the 5% level of significance, there is a connection between the infection risk and the group of predictors.
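As a quick check, the F statistic can be recomputed from the sums of squares in the ANOVA table. The sketch below uses only figures printed in the Minitab output; the variable names are illustrative, and the p-value is taken as reported rather than recomputed.

```python
# Figures copied from the Minitab ANOVA table.
ss_regression, df_regression = 73.099, 3
ss_error, df_error = 128.281, 109

# A mean square is a sum of squares divided by its degrees of freedom.
ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error

# Overall F statistic for H0: beta1 = beta2 = beta3 = 0.
f_value = ms_regression / ms_error

alpha = 0.05              # significance level used in the solution
p_value_reported = 0.000  # as printed by Minitab, i.e. below 0.001
reject_h0 = p_value_reported < alpha

print(f"MS Regression = {ms_regression:.3f}")
print(f"MS Error      = {ms_error:.3f}")
print(f"F             = {f_value:.2f}")
print(f"Reject H0 at alpha = {alpha}: {reject_h0}")
```

The recomputed mean squares and F value should agree with the Adj MS and F-Value columns up to rounding.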

  2. Then, decide, using tests, which of the predictors and constant should be in the final relationship.

For each slope we test the following hypotheses:

$H_0: \beta_j = 0$ versus $H_a: \beta_j \neq 0$, for $j = 1, 2, 3$

Test whether the variable Stay is significant in the relationship

The test statistic for this two-tailed test is in the Coefficients table.

Coefficients

Term         Coef  SE Coef  T-Value  P-Value   VIF
Constant     1.00     1.31     0.76    0.448
Stay       0.3082   0.0594     5.19    0.000  1.23
Age       -0.0230   0.0235    -0.98    0.330  1.05
Xray      0.01966  0.00576     3.41    0.001  1.18

The test statistic is t = 5.19 and the p-value is reported as 0.000 (p < 0.001).

We reject the null hypothesis if the p-value is less than the significance level.

Here, the p-value is less than the significance level of 0.05, so we reject the null hypothesis.

We conclude that, at the 5% level of significance, there is a connection between the infection risk and the predictor Stay.

Test whether the variable Age is significant in the relationship

The test statistic for this two-tailed test is in the Coefficients table.

The test statistic is t = -0.98 and the p-value is 0.330.

We reject the null hypothesis if the p-value is less than the significance level.

Here, the p-value is 0.330, which is not less than the significance level of 0.05, so we do not reject the null hypothesis.

We conclude that, at the 5% level of significance, there is not sufficient evidence of a connection between the infection risk and the predictor Age once Stay and Xray are in the model.

Test whether the variable Xray is significant in the relationship

The test statistic for this two-tailed test is in the Coefficients table.

The test statistic is t = 3.41 and the p-value is 0.001.

We reject the null hypothesis if the p-value is less than the significance level.

Here, the p-value is 0.001, which is less than the significance level of 0.05, so we reject the null hypothesis.

We conclude that, at the 5% level of significance, there is a connection between the infection risk and the predictor Xray.
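The individual t-tests can be checked the same way, since each T-Value is the coefficient divided by its standard error. This is a minimal sketch using the printed Coef, SE Coef and P-Value entries; the dictionary layout and the helper name `t_statistic` are illustrative choices, not part of the Minitab output.

```python
# (coefficient, standard error, reported p-value) from the Coefficients table.
coefficients = {
    "Constant": (1.00, 1.31, 0.448),
    "Stay": (0.3082, 0.0594, 0.000),
    "Age": (-0.0230, 0.0235, 0.330),
    "Xray": (0.01966, 0.00576, 0.001),
}

alpha = 0.05  # significance level used in the solution


def t_statistic(coef, se):
    """t statistic for H0: coefficient = 0."""
    return coef / se


results = {}
for term, (coef, se, p_value) in coefficients.items():
    t = t_statistic(coef, se)
    significant = p_value < alpha  # decision based on the reported p-value
    results[term] = (round(t, 2), significant)
    print(f"{term:<8} t = {t:6.2f}  p = {p_value:.3f}  significant: {significant}")

# Predictors (excluding the constant) that are significant at the 5% level.
significant_predictors = [
    term for term, (_, sig) in results.items() if sig and term != "Constant"
]
print("Significant predictors:", significant_predictors)
```

The recomputed t values should match the T-Value column up to rounding of the printed coefficients and standard errors.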

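Age is therefore dropped from the model, while Stay and Xray are retained in the final relationship. The constant is also not significantly different from zero (t = 0.76, p-value = 0.448), but the intercept is conventionally kept in a regression model, so it is retained.

The new model to be estimated is

$y = \beta_0 + \beta_1 x_1 + \beta_3 x_3 + \varepsilon$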

  3. Give the value of the coefficient of determination and tell what it means.

The coefficient of determination is the R-sq value in the Model Summary table.

Model Summary

      S    R-sq  R-sq(adj)  R-sq(pred)
1.08484  36.30%     34.55%      30.64%

Ans: The value of the coefficient of determination is 0.3630

This indicates that 36.30% of the variation in infection risk is explained by the model (or is explained by the predictor variables).
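The reported R-sq can be reproduced from the ANOVA table as the regression sum of squares divided by the total sum of squares. The sketch below also recomputes R-sq(adj) with the usual degrees-of-freedom adjustment; the variable names are illustrative, and all inputs are figures printed in the Minitab output.

```python
# Sums of squares from the ANOVA table.
ss_regression = 73.099
ss_error = 128.281
ss_total = 201.380

n = 113  # number of hospitals
k = 3    # number of predictors (Stay, Age, Xray)

# The regression and error sums of squares add up to the total.
assert abs(ss_regression + ss_error - ss_total) < 1e-6

# Coefficient of determination: share of total variation explained.
r_sq = ss_regression / ss_total

# Adjusted R-sq penalises the number of predictors in the model.
r_sq_adj = 1 - (ss_error / (n - k - 1)) / (ss_total / (n - 1))

print(f"R-sq      = {r_sq:.2%}")
print(f"R-sq(adj) = {r_sq_adj:.2%}")
```

Both values should agree with the Model Summary table up to rounding of the printed sums of squares.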

