Question

Build a simple linear regression for (1) all 50 states, (2) Eastern Time zone states, (3) Central Time zone states, (4) Mountain Time zone states, and (5) Pacific, Alaska, and Hawaii Time zone states. Compare your results in all five parts and state your judgements. You may use charts and tables in the comparison. Your answers should have values for the coefficient of determination, AOV table, significance levels, residual plots, and the regression fit with their interpretations.

Data source: Kaiser Family Foundation, 4/20/2020, 5:38 PM (ET = Eastern Time, CT = Central Time, MT = Mountain Time, PT = Pacific Time). Some states have a mix of two different time zones, which I ignored here.

| State | Time zone | X = Number of COVID-19 cases | Y = Deaths from COVID-19 |
|---|---|---|---|
| Alabama | CT | 5,041 | 169 |
| Alaska | PT | 321 | 9 |
| Arizona | MT | 5,068 | 191 |
| Arkansas | CT | 1,923 | 41 |
| California | PT | 33,404 | 1,205 |
| Colorado | MT | 9,730 | 420 |
| Connecticut | ET | 19,830 | 1,331 |
| Delaware | ET | 2,745 | 72 |
| District of Columbia | ET | 2,927 | 105 |
| Florida | ET | 26,660 | 789 |
| Georgia | ET | 18,947 | 733 |
| Hawaii | PT | 580 | 10 |
| Idaho | MT | 1,672 | 45 |
| Illinois | CT | 31,513 | 1,349 |
| Indiana | ET | 11,686 | 569 |
| Iowa | CT | 3,159 | 79 |
| Kansas | CT | 2,043 | 101 |
| Kentucky | ET | 2,960 | 148 |
| Louisiana | CT | 24,523 | 1,328 |
| Maine | ET | 875 | 35 |
| Maryland | ET | 13,684 | 465 |
| Massachusetts | ET | 38,077 | 1,706 |
| Michigan | ET | 32,000 | 2,468 |
| Minnesota | CT | 2,470 | 143 |
| Mississippi | CT | 4,512 | 169 |
| Missouri | CT | 5,889 | 200 |
| Montana | MT | 433 | 10 |
| Nebraska | CT | 1,511 | 28 |
| Nevada | PT | 3,830 | 159 |
| New Hampshire | ET | 1,390 | 41 |
| New Jersey | ET | 88,722 | 4,496 |
| New Mexico | MT | 1,845 | 55 |
| New York | ET | 252,595 | 18,611 |
| North Carolina | ET | 6,842 | 202 |
| North Dakota | CT | 627 | 9 |
| Ohio | ET | 12,919 | 509 |
| Oklahoma | CT | 2,680 | 143 |
| Oregon | PT | 1,957 | 75 |
| Pennsylvania | ET | 33,914 | 1,348 |
| Rhode Island | ET | 5,090 | 155 |
| South Carolina | ET | 4,446 | 123 |
| South Dakota | CT | 1,685 | 7 |
| Tennessee | ET | 7,238 | 152 |
| Texas | CT | 19,751 | 507 |
| Utah | MT | 3,213 | 27 |
| Vermont | ET | 816 | 38 |
| Virginia | ET | 8,984 | 300 |
| Washington | PT | 12,111 | 643 |
| West Virginia | ET | 902 | 24 |
| Wisconsin | CT | 4,499 | 230 |
| Wyoming | MT | 313 | 2 |

Solution:

[Excel output: regression table]

1.

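From the Excel output, the fitted regression model is

Y (Deaths from COVID-19) = B0 (intercept) + B1 * X (Number of COVID-19 cases)

Y (Deaths from COVID-19) = -162.097 + 0.071 * (Number of COVID-19 cases)

The slope of 0.071 means that each additional reported case is associated with about 0.071 additional deaths on average, i.e. roughly 71 deaths per 1,000 cases.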
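The same kind of fit can be reproduced outside Excel. The following is a minimal sketch, assuming the ordinary least squares formulas for one predictor (slope = Sxy / Sxx, intercept = mean(Y) - slope * mean(X)). It uses only the seven Mountain Time rows of the data table to keep the example short, so its coefficients will differ from the all-states model above. The function name `fit_line` is illustrative, not something from the original answer.

```python
# Mountain Time rows from the data table: state -> (cases, deaths)
mountain = {
    "Arizona": (5068, 191),
    "Colorado": (9730, 420),
    "Idaho": (1672, 45),
    "Montana": (433, 10),
    "New Mexico": (1845, 55),
    "Utah": (3213, 27),
    "Wyoming": (313, 2),
}


def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


cases = [c for c, _ in mountain.values()]
deaths = [d for _, d in mountain.values()]
b0, b1 = fit_line(cases, deaths)
print(f"Deaths = {b0:.3f} + {b1:.4f} * Cases")
```

Swapping in the rows for another time zone (or all rows) gives the corresponding model for the other parts of the question.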

2.

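In the output above, the p-value (reported as 0.000) is less than 0.05, so we reject the null hypothesis that the slope is zero. This means the relationship between the number of cases and the number of deaths is statistically significant.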
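The p-value in the regression output comes from the F test in the AOV (ANOVA) table. As a hedged sketch of how that table is built, the code below splits the total sum of squares into regression and error parts for the Mountain Time subset and compares the F statistic with a critical value. The critical value 6.61 for F(1, 5) at the 0.05 level is hard-coded from a standard F table, which is an assumption of this sketch; Excel reports the exact p-value instead.

```python
# Mountain Time rows from the data table: (cases, deaths)
rows = [(5068, 191), (9730, 420), (1672, 45), (433, 10),
        (1845, 55), (3213, 27), (313, 2)]
xs = [x for x, _ in rows]
ys = [y for _, y in rows]
n = len(rows)

# Least squares fit
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in rows)
      / sum((x - x_bar) ** 2 for x in xs))
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in xs]

# Sums of squares: SST = SSR + SSE
sst = sum((y - y_bar) ** 2 for y in ys)
ssr = sum((f - y_bar) ** 2 for f in fitted)
sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))

# Degrees of freedom and mean squares for simple regression
df_reg, df_err = 1, n - 2
msr = ssr / df_reg
mse = sse / df_err
f_stat = msr / mse

F_CRIT = 6.61  # assumed: table value for F(1, 5) at alpha = 0.05

print(f"{'Source':<12}{'df':>4}{'SS':>14}{'MS':>14}{'F':>10}")
print(f"{'Regression':<12}{df_reg:>4}{ssr:>14.2f}{msr:>14.2f}{f_stat:>10.2f}")
print(f"{'Residual':<12}{df_err:>4}{sse:>14.2f}{mse:>14.2f}")
print(f"{'Total':<12}{n - 1:>4}{sst:>14.2f}")
print("Reject H0 (slope = 0)" if f_stat > F_CRIT else "Fail to reject H0")
```

In simple linear regression the F test and the t test on the slope are equivalent (F equals t squared), so either can be reported as the significance test.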

3.

Here the value of the coefficient of determination (R-squared) is 0.92.

R-squared is a statistical measure of how close the data are to the fitted regression line. It is also known as the coefficient of determination, or the coefficient of multiple determination for multiple regression.

R-squared is always between 0 and 100%:

  • 0% indicates that the model explains none of the variability of the response data around its mean.
  • 100% indicates that the model explains all the variability of the response data around its mean.

Here the value of R-squared is close to 100%, which means the line fits the data well: about 92% of the variability in deaths is explained by the number of cases.
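To make the definition concrete, here is a small sketch that computes R-squared as 1 - SSE/SST for the Mountain Time subset of the data table. It also checks the identity, which holds for simple linear regression, that R-squared equals the square of the sample correlation between X and Y. The variable names are illustrative.

```python
import math

# Mountain Time rows from the data table: (cases, deaths)
rows = [(5068, 191), (9730, 420), (1672, 45), (433, 10),
        (1845, 55), (3213, 27), (313, 2)]
xs = [x for x, _ in rows]
ys = [y for _, y in rows]
n = len(rows)

x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)  # this is SST
sxy = sum((x - x_bar) * (y - y_bar) for x, y in rows)

# Least squares line and its error sum of squares
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in rows)

r_squared = 1 - sse / syy           # share of variability explained
r = sxy / math.sqrt(sxx * syy)      # sample correlation coefficient

print(f"R-squared = {r_squared:.4f}")
print(f"r^2       = {r ** 2:.4f}")
```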

4.

Residual plot:

In this plot each point is one state, where the prediction made by the model is on the x-axis and the residual (the prediction error) is on the y-axis. The vertical distance from the line at 0 shows how far off the prediction was for that state.

Since

Residual = Observed – Predicted

positive values for the residual (on the y-axis) mean the prediction was too low, and negative values mean the prediction was too high; 0 means the guess was exactly correct.
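The sign convention can be illustrated with a short sketch that computes Residual = Observed - Predicted for each Mountain Time state in the data table. The fit uses the usual least squares formulas on that subset only, so the numbers are not those of the all-states model; the dictionary name `residuals` is illustrative.

```python
# Mountain Time rows from the data table: state -> (cases, deaths)
mountain = {
    "Arizona": (5068, 191),
    "Colorado": (9730, 420),
    "Idaho": (1672, 45),
    "Montana": (433, 10),
    "New Mexico": (1845, 55),
    "Utah": (3213, 27),
    "Wyoming": (313, 2),
}

xs = [c for c, _ in mountain.values()]
ys = [d for _, d in mountain.values()]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
      / sum((x - x_bar) ** 2 for x in xs))
b0 = y_bar - b1 * x_bar

# Residual = Observed - Predicted, one value per state
residuals = {}
for state, (cases, deaths) in mountain.items():
    predicted = b0 + b1 * cases
    residuals[state] = deaths - predicted

for state, res in residuals.items():
    verdict = "prediction too low" if res > 0 else "prediction too high"
    print(f"{state:<12}{res:>10.1f}  {verdict}")
```

Plotting these residuals against the predicted values gives the residual plot; with an intercept in the model, the residuals always sum to zero, so a good plot shows points scattered evenly above and below the zero line with no pattern.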

In a simple model like this, with only two variables, you can get a sense of how accurate the model is just by plotting deaths from COVID-19 against the number of COVID-19 cases. Here the regression fits the data closely.

5.

Line Fit Plot:

The R-squared value of 0.92 indicates that the line fits the data well.

In the line fit plot most of the points lie close to the line, which means the line is a good fit for these data.

