The manager of an amusement park would like to be able to predict daily attendance in order to develop more accurate plans about how much food to order and how many ride operators to hire. After some consideration, he decided that the following three factors are critical:
Yesterday’s attendance
Weekday or weekend
Predicted weather
He then took a random sample of 36 days. For each day, he recorded the attendance, the previous day’s attendance, day of the week, and weather forecast (mostly sunny, rain, cloudy). The first independent variable is interval, but the other two are nominal.
a. Create the three indicator variables you need.
b. Conduct a regression analysis.
c. Is this model valid? Explain.
d. Can we conclude that weather is a factor in determining attendance?
e. Do these results provide sufficient evidence that weekend attendance is, on average, larger than weekday attendance?
f. Do these results provide sufficient evidence that mostly sunny attendance is, on average, larger than cloudy attendance?
Attendance | Yest Att | day of the week | weather forecast |
7882 | 8876 | 2 | 1 |
6115 | 7203 | 2 | 3 |
5351 | 4370 | 2 | 3 |
8546 | 7192 | 1 | 1 |
6055 | 6835 | 2 | 3 |
7367 | 5469 | 2 | 1 |
7871 | 8207 | 2 | 1 |
5377 | 7026 | 2 | 3 |
5259 | 5592 | 2 | 1 |
4915 | 3190 | 2 | 3 |
6538 | 7012 | 2 | 3 |
6607 | 5434 | 2 | 3 |
5118 | 3764 | 2 | 3 |
6077 | 7575 | 2 | 3 |
4475 | 6047 | 2 | 3 |
3771 | 4430 | 2 | 3 |
6106 | 5697 | 2 | 3 |
7017 | 3928 | 1 | 2 |
5718 | 5552 | 2 | 3 |
5966 | 3142 | 1 | 2 |
8160 | 8648 | 1 | 2 |
4717 | 3397 | 2 | 3 |
7783 | 7655 | 2 | 3 |
5124 | 5920 | 2 | 3 |
7495 | 7831 | 1 | 2 |
5848 | 6355 | 2 | 3 |
5166 | 3529 | 2 | 3 |
4487 | 4220 | 2 | 3 |
7320 | 7526 | 2 | 1 |
6925 | 4083 | 1 | 1 |
8133 | 6382 | 1 | 1 |
7929 | 6459 | 2 | 3 |
7291 | 3432 | 1 | 2 |
5419 | 8077 | 2 | 3 |
3634 | 3353 | 2 | 3 |
6859 | 3803 | 1 | 2 |
Day of the week codes: 1 = weekend, 2 = weekday
Weather forecast codes: 1 = mostly sunny, 2 = rain, 3 = cloudy
A.
Since "day of the week" can take two values (1 and 2), one indicator variable is enough, e.g.
A = 0 if weekend
A = 1 if weekday
Since "weather forecast" can take three values (1, 2 and 3), two indicator variables are needed, e.g.
B1 = 1 if mostly sunny; B1 = 0 otherwise.
B2 = 1 if rain; B2 = 0 otherwise.
If both B1 and B2 are 0, the day is cloudy, so cloudy is the baseline category.
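The coding above can be sketched in Python. This is only an illustration: the function names `encode_day` and `encode_weather` are made up for this example, and the input codes are the ones listed in the data legend (1 = weekend, 2 = weekday; 1 = mostly sunny, 2 = rain, 3 = cloudy).

```python
def encode_day(day_code):
    """Return A: 0 for weekend (code 1), 1 for weekday (code 2)."""
    if day_code not in (1, 2):
        raise ValueError(f"unexpected day code: {day_code}")
    return 0 if day_code == 1 else 1


def encode_weather(weather_code):
    """Return (B1, B2); cloudy (code 3) is the baseline and maps to (0, 0)."""
    if weather_code not in (1, 2, 3):
        raise ValueError(f"unexpected weather code: {weather_code}")
    return (int(weather_code == 1), int(weather_code == 2))


# First three data rows as (day code, weather code)
sample = [(2, 1), (2, 3), (2, 3)]
encoded = [(encode_day(d), *encode_weather(w)) for d, w in sample]
print(encoded)  # each tuple is (A, B1, B2)
```

Only two weather indicators are created because a third one would be fully determined by the other two.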
B.
Summary of regression analysis:
SUMMARY OUTPUT

Regression Statistics
Statistic | Value |
Multiple R | 0.824568883 |
R Square | 0.679913843 |
Adjusted R Square | 0.638612404 |
Standard Error | 790.6786272 |
Observations | 36 |

ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 4 | 41166949.31 | 10291737.33 | 16.4622311 | 2.47624E-07 |
Residual | 31 | 19380353.44 | 625172.6915 | | |
Total | 35 | 60547302.75 | | | |

Coefficients
Term | Coefficient | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
Intercept | 4619.366527 | 695.8111859 | 6.638821882 | 2.02115E-07 | 3200.250257 | 6038.482797 |
Yest Att | 0.383998499 | 0.080450863 | 4.773081158 | 4.11184E-05 | 0.219917882 | 0.548079116 |
A | -1207.558126 | 586.0985572 | -2.060332876 | 0.047844232 | -2402.914014 | -12.20223744 |
B1 | 988.54631 | 410.8190614 | 2.406281506 | 0.022266169 | 150.6753102 | 1826.41731 |
B2 | 541.7985097 | 685.57383 | 0.790284702 | 0.435365659 | -856.438535 | 1940.035555 |
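For readers who want to refit the model themselves, here is a minimal pure-Python sketch of ordinary least squares via the normal equations. It is not necessarily how the summary above was produced (that layout looks like spreadsheet output). The helper names `design_row`, `solve` and `ols` are made up for this example, and the indicator coding assumed is the one from part A.

```python
# Rows copied from the data table: (attendance, yesterday, day code, weather code)
DATA = [
    (7882, 8876, 2, 1), (6115, 7203, 2, 3), (5351, 4370, 2, 3), (8546, 7192, 1, 1),
    (6055, 6835, 2, 3), (7367, 5469, 2, 1), (7871, 8207, 2, 1), (5377, 7026, 2, 3),
    (5259, 5592, 2, 1), (4915, 3190, 2, 3), (6538, 7012, 2, 3), (6607, 5434, 2, 3),
    (5118, 3764, 2, 3), (6077, 7575, 2, 3), (4475, 6047, 2, 3), (3771, 4430, 2, 3),
    (6106, 5697, 2, 3), (7017, 3928, 1, 2), (5718, 5552, 2, 3), (5966, 3142, 1, 2),
    (8160, 8648, 1, 2), (4717, 3397, 2, 3), (7783, 7655, 2, 3), (5124, 5920, 2, 3),
    (7495, 7831, 1, 2), (5848, 6355, 2, 3), (5166, 3529, 2, 3), (4487, 4220, 2, 3),
    (7320, 7526, 2, 1), (6925, 4083, 1, 1), (8133, 6382, 1, 1), (7929, 6459, 2, 3),
    (7291, 3432, 1, 2), (5419, 8077, 2, 3), (3634, 3353, 2, 3), (6859, 3803, 1, 2),
]


def design_row(yest, day, weather):
    # Columns: intercept, Yest Att, A (1 = weekday), B1 (1 = mostly sunny), B2 (1 = rain)
    return [1.0, float(yest), float(day == 2), float(weather == 1), float(weather == 2)]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(data):
    """Return least-squares coefficients by solving (X'X) b = X'y."""
    rows = [design_row(yest, day, weather) for _, yest, day, weather in data]
    target = [float(record[0]) for record in data]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    return solve(xtx, xty)


coefficients = ols(DATA)
fitted = [sum(b * x for b, x in zip(coefficients, design_row(y, d, w))) for _, y, d, w in DATA]
residuals = [record[0] - f for record, f in zip(DATA, fitted)]
for name, value in zip(["Intercept", "Yest Att", "A", "B1", "B2"], coefficients):
    print(f"{name}: {value:.4f}")
```

If the same coding was used, the printed values should be comparable to the Coefficient column above. Standard errors and p-values are not computed in this sketch.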
C.
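The significance of the F-statistic (i.e. its p-value) = 2.476 × 10^-7 < 0.05 (taking α = 0.05).
Thus the model is significant overall, i.e. the model is valid. R Square = 0.68, so the model explains about 68% of the variation in attendance.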
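As a check on the output, the F-statistic, R Square, Adjusted R Square and Standard Error can all be recomputed from the sums of squares and degrees of freedom in the ANOVA table. The sketch below does only that arithmetic. The function name `anova_summary` is made up for this example.

```python
import math

# Values copied from the ANOVA table
SS_REGRESSION = 41166949.31
SS_RESIDUAL = 19380353.44
DF_REGRESSION = 4
DF_RESIDUAL = 31


def anova_summary(ssr, sse, df_reg, df_res):
    """Recompute the headline regression statistics from the ANOVA sums of squares."""
    sst = ssr + sse
    msr = ssr / df_reg          # mean square for regression
    mse = sse / df_res          # mean square for error
    r2 = ssr / sst
    return {
        "F": msr / mse,
        "R2": r2,
        # (df_reg + df_res) equals n - 1, and df_res equals n - k - 1
        "adj_R2": 1 - (1 - r2) * (df_reg + df_res) / df_res,
        "std_error": math.sqrt(mse),
    }


stats = anova_summary(SS_REGRESSION, SS_RESIDUAL, DF_REGRESSION, DF_RESIDUAL)
for key, value in stats.items():
    print(f"{key}: {value:.6f}")
```

The p-value for F needs the F distribution with 4 and 31 degrees of freedom, which this sketch leaves to the regression output.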
D.
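Weather enters the model through B1 and B2, each measured against the cloudy baseline.
The p-value for B1 = 0.022 < 0.05, so attendance on mostly sunny days differs significantly from attendance on cloudy days.
The p-value for B2 = 0.435 > 0.05, so attendance on rainy days is not significantly different from attendance on cloudy days.
Since at least one of the weather indicators is significant, we can conclude that weather is a factor in determining attendance.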
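The t statistics behind these p-values are just coefficient divided by standard error, and a 95% interval that excludes zero tells the same story as a p-value below 0.05. A small sketch using the numbers from the regression output (the helper names `t_stat` and `excludes_zero` are made up for this example):

```python
# (coefficient, standard error, lower 95%, upper 95%) copied from the regression output
WEATHER_TERMS = {
    "B1": (988.54631, 410.8190614, 150.6753102, 1826.41731),
    "B2": (541.7985097, 685.57383, -856.438535, 1940.035555),
}


def t_stat(coef, se):
    """t statistic for H0: coefficient = 0."""
    return coef / se


def excludes_zero(lower, upper):
    """True when the confidence interval lies entirely on one side of zero."""
    return lower > 0 or upper < 0


for name, (coef, se, lower, upper) in WEATHER_TERMS.items():
    print(name, round(t_stat(coef, se), 3), excludes_zero(lower, upper))
```

The interval for B1 lies entirely above zero, while the interval for B2 contains zero.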
E.
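Regression Equation:
Attendance = 4619.366 + 0.384*Yest. Att. - 1207.558*A + 988.546*B1 + 541.799*B2
For the weekend:
A = 0
Attendance = 4619.366 + 0.384*Yest. Att. + 988.546*B1 + 541.799*B2
For the weekday:
A = 1
Attendance = 4619.366 + 0.384*Yest. Att. + 988.546*B1 + 541.799*B2 - 1207.558
Thus, with all other factors held constant, predicted weekend attendance is higher than weekday attendance by about 1208 on average.
To test this, take H0: coefficient of A = 0 against H1: coefficient of A < 0. From the regression output, t = -2.06 and the one-tailed p-value = 0.0478/2 = 0.0239 < 0.05, so there is sufficient evidence that weekend attendance is, on average, larger than weekday attendance.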
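The weekend versus weekday comparison can be reproduced by plugging values into the fitted equation. The sketch below uses the unrounded coefficients from the regression output. The `predict` function and the value 6000 for yesterday's attendance are illustrative assumptions, not part of the original problem.

```python
# Coefficients copied from the regression output
COEF = {
    "intercept": 4619.366527,
    "yest": 0.383998499,
    "A": -1207.558126,
    "B1": 988.54631,
    "B2": 541.7985097,
}


def predict(yest_att, a, b1, b2):
    """Predicted attendance from the fitted regression equation."""
    return (COEF["intercept"] + COEF["yest"] * yest_att
            + COEF["A"] * a + COEF["B1"] * b1 + COEF["B2"] * b2)


yest = 6000  # hypothetical value for yesterday's attendance
weekend = predict(yest, a=0, b1=0, b2=0)
weekday = predict(yest, a=1, b1=0, b2=0)
gap = weekend - weekday
print(f"weekend - weekday = {gap:.3f}")
```

The gap does not depend on the chosen value of yesterday's attendance or on the weather, because those terms cancel in the difference.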
F.
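Attendance = 4619.366 + 0.384*Yest. Att. - 1207.558*A + 988.546*B1 + 541.799*B2
For mostly sunny:
B1 = 1
B2 = 0
Attendance = 4619.366 + 0.384*Yest. Att. - 1207.558*A + 988.546
For cloudy:
B1 = 0 and B2 = 0
Attendance = 4619.366 + 0.384*Yest. Att. - 1207.558*A
Thus, with all other factors held constant, predicted attendance on a mostly sunny day is higher than on a cloudy day by about 989 on average.
To test this, take H0: coefficient of B1 = 0 against H1: coefficient of B1 > 0. From the regression output, t = 2.41 and the one-tailed p-value = 0.0223/2 = 0.0111 < 0.05, so there is sufficient evidence that mostly sunny attendance is, on average, larger than cloudy attendance.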
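Parts E and F are directional questions, while the P-value column in the regression output is two-tailed. A common way to convert is to halve the two-tailed p-value when the sign of the estimate agrees with the alternative hypothesis. A minimal sketch of that rule, where the function name `one_tailed_p` is made up for this example:

```python
def one_tailed_p(two_tailed_p, estimate, alternative):
    """Convert a two-tailed t-test p-value to a one-tailed one.

    alternative is "greater" (H1: coefficient > 0) or "less" (H1: coefficient < 0).
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    agrees = estimate > 0 if alternative == "greater" else estimate < 0
    half = two_tailed_p / 2
    # If the estimate points the wrong way, the one-tailed p-value is large
    return half if agrees else 1 - half


# Part E: H1 is coefficient of A < 0 (weekday lower than weekend)
p_weekend = one_tailed_p(0.047844232, -1207.558126, "less")
# Part F: H1 is coefficient of B1 > 0 (mostly sunny higher than cloudy)
p_sunny = one_tailed_p(0.022266169, 988.54631, "greater")
print(round(p_weekend, 4), round(p_sunny, 4))
```

Both values are below 0.05, which supports the directional conclusions in parts E and F.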