Question

Part 1. Consider the dataset below. You will perform a series of regressions and data transformations. Be sure to keep a record of all your computer results. First, please perform a simple linear regression. Predict Y if X = 40. To avoid rounding errors in ALL your calculations, please perform your calculations on your spreadsheet referencing data from your regression output.

X Y
54 6
42 16
28 33
38 18
25 41
70 3
48 10
41 14
20 45
52 9
65 5

14.0363

14.1891

17.2164

21.5627

None of the above

Part 2

Please perform a polynomial regression of Y against X and X-squared. What is the coefficient of the curvature term?

92.8725

-2.7222

0.0208

0.9851

None of the above

Part 3.

With other OLS regression conditions satisfied, can we utilize the estimated equation of this model to predict Y?

Yes. The regression is statistically significant and the coefficient of determination is reasonably high.

No. Although the regression is significant, there is a high likelihood that multicollinearity is a problem due to the inclusion of the X-squared term.

No. The regression model is nonlinear and cannot therefore be utilized to make a forecast.

No. The standard error of the regression is too high, indicating that the unexplained variation (error term) exhibits heteroscedasticity.

None of the above

Part 4.

Based on the parameter estimates of the quadratic model, predict Y if X = 40.

14.0363

14.1891

17.2164

21.5627

19.1523

Part 5.

Perform a logarithmic transformation of only Y on the original dataset. That is, ln(Y) = B0 + B1(X) + e. Then predict Y if X = 40.

21.5627

17.2164

2.7885

Solutions

Expert Solution

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.932145801
R Square            0.868895795
Adjusted R Square   0.854328661
Standard Error      5.64255772
Observations        11

ANOVA
             df   SS            MS            F             Significance F
Regression   1    1899.090245   1899.090245   59.64768355   2.92936E-05
Residual     9    286.5461186   31.83845762
Total        10   2185.636364

             Coefficients   Standard Error   t Stat        P-value       Lower 95%
Intercept    56.15733314    5.203079582      10.79309518   1.88916E-06   44.3871494
X            -0.8648668     0.111983087      -7.72319128   2.92936E-05   -1.118190142

part 1)

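y^ = 56.1573 - 0.8649 * x (coefficients rounded to four decimals)

At x = 40, using the unrounded coefficients from the regression output:

y^ = 56.15733314 - 0.8648668 * 40

= 21.5627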
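The Part 1 result can also be checked outside a spreadsheet. The following is a minimal sketch, assuming NumPy is available; the variable names are illustrative and not part of the original solution. It fits the line by ordinary least squares with `np.polyfit` and predicts at X = 40 from the unrounded coefficients.

```python
import numpy as np

# Data from the question (X is the predictor, Y the response).
x = np.array([54, 42, 28, 38, 25, 70, 48, 41, 20, 52, 65], dtype=float)
y = np.array([6, 16, 33, 18, 41, 3, 10, 14, 45, 9, 5], dtype=float)

# Degree-1 least-squares fit; polyfit returns the highest power first.
slope, intercept = np.polyfit(x, y, 1)

# Predict with the full-precision coefficients to avoid rounding error.
y_hat_40 = intercept + slope * 40

print(f"intercept = {intercept:.4f}, slope = {slope:.4f}")
print(f"predicted Y at X = 40: {y_hat_40:.4f}")
```

The printed intercept and slope should agree with the Coefficients column of the regression output above.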

part 2)

Y X x^2
6 54 2916
16 42 1764
33 28 784
18 38 1444
41 25 625
3 70 4900
10 48 2304
14 41 1681
45 20 400
9 52 2704
5 65 4225

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.994003876
R Square            0.988043706
Adjusted R Square   0.985054633
Standard Error      1.807349945
Observations        11

ANOVA
             df   SS            MS         F          Significance F
Regression   2    2159.504253   1079.752   330.5518   2.04E-08
Residual     8    26.1321106    3.266514
Total        10   2185.636364

             Coefficients   Standard Error   t Stat     P-value    Lower 95%
Intercept    92.87251293    4.436918394      20.93176   2.85E-08   82.64096
X            -2.722212931   0.211088773      -12.8961   1.24E-06   -3.20898
x^2          0.020770253    0.002326226      8.928735   1.96E-05   0.015406

coefficient of the curvature term = 0.0208
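As a cross-check on the curvature term, here is a hedged sketch of the same quadratic fit in Python, assuming NumPy; the names `b0`, `b1`, and `b2` are illustrative labels for the intercept, the X coefficient, and the x^2 coefficient.

```python
import numpy as np

# Data from the question.
x = np.array([54, 42, 28, 38, 25, 70, 48, 41, 20, 52, 65], dtype=float)
y = np.array([6, 16, 33, 18, 41, 3, 10, 14, 45, 9, 5], dtype=float)

# Degree-2 least-squares fit of Y on X and X squared.
# polyfit returns coefficients ordered [x^2 term, x term, intercept].
b2, b1, b0 = np.polyfit(x, y, 2)

print(f"intercept       b0 = {b0:.4f}")
print(f"linear term     b1 = {b1:.4f}")
print(f"curvature term  b2 = {b2:.4f}")
```

The curvature term is `b2`, the coefficient on x^2; the other two values in the answer list (92.8725 and -2.7222) are the intercept and the linear coefficient.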

part 3)

A) Yes. The regression is statistically significant (F = 330.5518, Significance F = 2.04E-08) and the coefficient of determination is reasonably high (R Square = 0.9880).
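The "Yes" answer rests on the overall F statistic and R Square of the quadratic model. A minimal sketch of how those two figures follow from the sums of squares is given below, assuming NumPy; it solves the least-squares problem with `np.linalg.lstsq` and does not compute the p-value, which is read from the spreadsheet output.

```python
import numpy as np

# Data from the question.
x = np.array([54, 42, 28, 38, 25, 70, 48, 41, 20, 52, 65], dtype=float)
y = np.array([6, 16, 33, 18, 41, 3, 10, 14, 45, 9, 5], dtype=float)

# Design matrix: intercept column, X, and X squared.
A = np.column_stack([np.ones_like(x), x, x**2])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

# Residual and total sums of squares.
resid = y - A @ beta
ss_res = float(resid @ resid)
ss_tot = float(((y - y.mean()) ** 2).sum())

# Coefficient of determination and overall F statistic.
n, k = len(y), 2  # k = number of predictors (X and X squared)
r_squared = 1 - ss_res / ss_tot
f_stat = ((ss_tot - ss_res) / k) / (ss_res / (n - k - 1))

print(f"R Square = {r_squared:.4f}, F = {f_stat:.2f}")
```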

part 4)

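y^ = 92.87251293 - 2.722212931 * 40 + 0.020770253 * 40^2

= 92.87251293 - 108.88851724 + 33.2324048

= 17.2164 (17.21640086 at full spreadsheet precision)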

part 5)

ln(Y) Y X
1.791759 6 54
2.772589 16 42
3.496508 33 28
2.890372 18 38
3.713572 41 25
1.098612 3 70
2.302585 10 48
2.639057 14 41
3.806662 45 20
2.197225 9 52
1.609438 5 65

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.991254363
R Square            0.982585213
Adjusted R Square   0.980650236
Standard Error      0.122438958
Observations        11

ANOVA
             df   SS            MS         F          Significance F
Regression   1    7.612614045   7.612614   507.8022   3.16E-09
Residual     9    0.134921686   0.014991
Total        10   7.747535731

             Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept    4.978748604    0.112902636      44.09772   7.92E-12   4.723345    5.234152    4.723345      5.234152
X            -0.054757465   0.002429943      -22.5345   3.16E-09   -0.06025    -0.04926    -0.06025      -0.04926

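ln y^ = 4.9787 - 0.0548 * x (coefficients rounded to four decimals)

At x = 40, using the unrounded coefficients:

ln y^ = 4.978748604 - 0.054757465 * 40 = 2.7885

This is the predicted value of ln(Y), not of Y. Exponentiating gives the prediction on the original scale:

y^ = exp(2.7885) = 16.25580413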
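The log-linear prediction can be reproduced with the short sketch below, assuming NumPy; the variable names are illustrative. It regresses ln(Y) on X, predicts on the log scale at X = 40, and back-transforms with `np.exp`. This is the simple back-transform used in the solution above; no retransformation bias adjustment is applied.

```python
import numpy as np

# Data from the question.
x = np.array([54, 42, 28, 38, 25, 70, 48, 41, 20, 52, 65], dtype=float)
y = np.array([6, 16, 33, 18, 41, 3, 10, 14, 45, 9, 5], dtype=float)

# Fit ln(Y) = B0 + B1 * X by least squares (highest power first).
slope, intercept = np.polyfit(x, np.log(y), 1)

# Prediction on the log scale, then back on the original scale.
log_y_hat_40 = intercept + slope * 40
y_hat_40 = np.exp(log_y_hat_40)

print(f"predicted ln(Y) at X = 40: {log_y_hat_40:.4f}")
print(f"predicted Y at X = 40:     {y_hat_40:.4f}")
```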
