[Concert Nation] Concert Nation, Inc. is a nationwide promoter of
rock concerts. The president of the company wants to develop a model
to estimate the revenue of a major concert event at large venues
(such as Ford Field or Madison Square Garden) for planning marketing
strategies. The company has collected revenue data for 32 recent
large concert events. For each concert, it has also recorded the
attendance, the number of concession stands in the venue, and the
Billboard chart ranking of the artist in the week of the event.
This data is available in “Tickets”. There are two potential models
that could explain the revenue. The two competing models are:
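Model A: $\text{Revenue} = \beta_0 + \beta_1\,\text{Attendance} + \beta_2\,\text{Concessions} + \beta_3\,\text{Billboard} + \varepsilon$
Model B: $\text{Revenue} = \beta_0 + \beta_1\,\text{Attendance} + \beta_2\,\text{Billboard} + \varepsilon$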
Run regression on both models. Use only the regression outputs
of the two models and the original data
to answer questions 1 to 7 below.
1. [1 pt] Let’s consider Model A first. What does the result of
the F-test indicate?
(a) The p-value of F-test is 100.83. Thus, the model does not
significantly explain the revenue.
(b) The p-value of F-test is close to zero. Thus, all independent
variables in the regression model are
statistically significant.
(c) The p-value of F-test is close to zero. This indicates that at
least some independent variables in the
regression model significantly explain the revenue.
(d) This indicates weak evidence of a linear relationship, because
the p-value is very low.
2. [1 pt] If we use model A for prediction, what is the point
estimate for the revenue of a concert that has
attendance of 50,000 people, 5 concession stands, and the song
ranked in no. 15 in the Billboard ranking?
(a) $3.145 M
(b) $2.851 M
(c) $3.252 M
(d) $340K
3. [1 pt] What is an approximate 95% prediction interval for the
concert listed in the previous question?
(a) [$2.757M, $3.533M]
(b) [$2.463M, $3.239M]
(c) [$2.368M, $3.922M]
(d) [$2.074M, $3.628M]
4. [1 pt] Which of the following statements is correct?
(a) The estimated slope for the attendance is only $59.2. This
means that, when keeping everything
else the same, the revenue does not depend much on the
attendance.
(b) The t-statistic associated with the slope for the attendance
variable is 16.9. This means that there is
too much noise to determine if the slope is definitely
positive.
(c) The p-value for the concession variable is 0.933. This means
that the number of concession stands
is not a statistically significant variable to determine the
revenue.
(d) The p-value for the concession variable is 0.933. This means
that the number of concession stands
is a statistically significant variable to determine the
revenue.
5. [1 pt] Is it appropriate to use model A as a final model to
estimate the revenue of a concert?
(a) Yes. All independent variables are statistically
significant.
(b) Yes, because the analysis indicates a linear relationship
between revenue and attendance.
(c) No, because not all independent variables are statistically
important. Thus, revision is necessary.
(d) No, because some of the slopes were negative. Thus, revision is
necessary.
6. [1 pt] Now, consider model B. According to model B, what is a
point estimate for a concert that has
attendance of 50000 people, 5 concession stands, and the song
ranked in no. 15 in the Billboard ranking?
(a) $3.147M
(b) $2.839M
(c) $7.139M
(d) $13.637M
7. [1 pt] Based on the regression outputs, which model would you
consider more suitable for predicting the
revenue: Model A or Model B?
(a) Model A is more suitable, because it has a higher R², lower
standard error of the estimate
(s_e), and lower F-test p-value.
(b) Model A is more suitable because the fraction of SST accounted
for by the residuals is higher than
for model B.
(c) Model B is more suitable, because, while both models have
similar R² and F-test p-value, model B
has lower standard error of the estimate (s_e) and all independent
variables are statistically
significant.
(d) Model B is more suitable, because the slope coefficient is
larger in magnitude.
Attendance | # of concessions | Billboard Charts | Concert Revenue |
30650 | 8 | 56 | 1531762 |
80997 | 1 | 87 | 4047180 |
93686 | 8 | 24 | 5805972 |
44405 | 4 | 99 | 2516538 |
77767 | 4 | 39 | 4197208 |
95780 | 7 | 35 | 6226065 |
82701 | 7 | 86 | 4123048 |
50165 | 8 | 29 | 3465110 |
50619 | 5 | 93 | 2843474 |
36259 | 7 | 86 | 1866318 |
52013 | 5 | 35 | 2670798 |
97447 | 7 | 71 | 5756817 |
69982 | 7 | 97 | 3681670 |
31789 | 10 | 72 | 2072149 |
39787 | 6 | 89 | 1964361 |
63596 | 5 | 65 | 3150802 |
73159 | 5 | 41 | 5064323 |
51172 | 8 | 1 | 2901564 |
54187 | 9 | 17 | 3170058 |
56681 | 7 | 1 | 3316764 |
78466 | 7 | 86 | 3825369 |
65132 | 8 | 86 | 2983563 |
52866 | 4 | 8 | 3091641 |
39536 | 2 | 20 | 3068049 |
32541 | 1 | 53 | 1796727 |
36441 | 1 | 60 | 2011990 |
74987 | 6 | 58 | 4389931 |
33791 | 8 | 81 | 1545359 |
64961 | 6 | 94 | 3792136 |
61429 | 3 | 86 | 2695672 |
68178 | 4 | 50 | 4147528 |
85701 | 5 | 52 | 5335423 |
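The question asks to run the regression on both models. As a rough sketch of what that involves, the snippet below fits both models by ordinary least squares using only the Python standard library. The rows are copied from the table above; the names `DATA`, `solve`, and `ols` are hypothetical helpers introduced for illustration, and solving the normal equations directly is a simplifying assumption (a statistics package would normally be used instead). If the table was transcribed correctly, the coefficients should come out close to the regression outputs reported for each model.

```python
# Rows: (attendance, concessions, billboard rank, revenue), copied from the table.
DATA = [
    (30650, 8, 56, 1531762), (80997, 1, 87, 4047180),
    (93686, 8, 24, 5805972), (44405, 4, 99, 2516538),
    (77767, 4, 39, 4197208), (95780, 7, 35, 6226065),
    (82701, 7, 86, 4123048), (50165, 8, 29, 3465110),
    (50619, 5, 93, 2843474), (36259, 7, 86, 1866318),
    (52013, 5, 35, 2670798), (97447, 7, 71, 5756817),
    (69982, 7, 97, 3681670), (31789, 10, 72, 2072149),
    (39787, 6, 89, 1964361), (63596, 5, 65, 3150802),
    (73159, 5, 41, 5064323), (51172, 8, 1, 2901564),
    (54187, 9, 17, 3170058), (56681, 7, 1, 3316764),
    (78466, 7, 86, 3825369), (65132, 8, 86, 2983563),
    (52866, 4, 8, 3091641), (39536, 2, 20, 3068049),
    (32541, 1, 53, 1796727), (36441, 1, 60, 2011990),
    (74987, 6, 58, 4389931), (33791, 8, 81, 1545359),
    (64961, 6, 94, 3792136), (61429, 3, 86, 2695672),
    (68178, 4, 50, 4147528), (85701, 5, 52, 5335423),
]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, columns):
    """Fit revenue on the chosen predictor columns plus an intercept.

    Returns (coefficients, R^2, standard error of the estimate).
    """
    X = [[1.0] + [float(r[c]) for c in columns] for r in rows]
    y = [float(r[3]) for r in rows]
    p = len(X[0])  # number of coefficients, intercept included
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    residuals = [yi - sum(b * xi for b, xi in zip(beta, x)) for x, yi in zip(X, y)]
    sse = sum(e * e for e in residuals)
    y_mean = sum(y) / len(y)
    sst = sum((yi - y_mean) ** 2 for yi in y)
    s_e = (sse / (len(y) - p)) ** 0.5
    return beta, 1.0 - sse / sst, s_e


beta_a, r2_a, se_a = ols(DATA, [0, 1, 2])  # Model A: attendance, concessions, billboard
beta_b, r2_b, se_b = ols(DATA, [0, 2])     # Model B: attendance, billboard

print("Model A coefficients:", [round(b, 3) for b in beta_a])
print("Model B coefficients:", [round(b, 3) for b in beta_b])
print(f"R^2: A = {r2_a:.3f}, B = {r2_b:.3f}")
print(f"s_e: A = {se_a:,.0f}, B = {se_b:,.0f}")
```

This sketch does not compute t-statistics, p-values, or prediction intervals; those are read from the regression outputs.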
Model A: $\text{Revenue} = \beta_0 + \beta_1\,\text{Attendance} + \beta_2\,\text{Concessions} + \beta_3\,\text{Billboard} + \varepsilon$
Regression Analysis
R² | 0.915 |
Adjusted R² | 0.906 | n | 32 |
R | 0.957 | k | 3 |
Std. Error of Estimate | 388380.722 | Dep. Var. | Concert Revenue |
Regression output
variables | coefficients | std. error | t (df=28) | p-value | 95% CI lower | 95% CI upper |
Intercept | a = 294,318.169 | | | | | |
Attendance | b1 = 59.162 | 3.491 | 16.947 | 2.98E-16 | 52.011 | 66.313 |
# of concessions | b2 = 2,506.230 | 29,519.840 | 0.085 | .9329 | -57,962.420 | 62,974.881 |
Billboard Charts | b3 = -7,980.320 | 2,311.354 | -3.453 | .0018 | -12,714.914 | -3,245.726 |
ANOVA table
Source | SS | df | MS | F | p-value |
Regression | 45,625,013,698,720.700 | 3 | 15,208,337,899,573.600 | 100.82 | 4.09E-15 |
Residual | 4,223,508,396,681.690 | 28 | 150,839,585,595.775 | | |
Total | 49,848,522,095,402.400 | 31 | | | |
Predicted values for: Concert Revenue
Attendance | # of concessions | Billboard Charts | Predicted | 95% CI lower | 95% CI upper | 95% PI lower | 95% PI upper | Leverage |
50,000 | 5 | 15 | 3,145,256.45 | 2,881,523.75 | 3,408,989.14 | 2,307,119.48 | 3,983,393.42 | 0.110 |
Model B: $\text{Revenue} = \beta_0 + \beta_1\,\text{Attendance} + \beta_2\,\text{Billboard} + \varepsilon$
Regression Analysis
R² | 0.915 |
Adjusted R² | 0.909 | n | 32 |
R | 0.957 | k | 2 |
Std. Error of Estimate | 381674.877 | Dep. Var. | Concert Revenue |
Regression output
variables | coefficients | std. error | t (df=29) | p-value | 95% CI lower | 95% CI upper |
Intercept | a = 308,223.220 | | | | | |
Attendance | b1 = 59.181 | 3.424 | 17.284 | 8.23E-17 | 52.178 | 66.184 |
Billboard Charts | b2 = -7,992.387 | 2,267.147 | -3.525 | .0014 | -12,629.224 | -3,355.550 |
ANOVA table
Source | SS | df | MS | F | p-value |
Regression | 45,623,926,448,881.800 | 2 | 22,811,963,224,440.900 | 156.59 | 2.87E-16 |
Residual | 4,224,595,646,520.690 | 29 | 145,675,711,948.989 | | |
Total | 49,848,522,095,402.400 | 31 | | | |
Predicted values for: Concert Revenue
Attendance | Billboard Charts | Predicted | 95% CI lower | 95% CI upper | 95% PI lower | 95% PI upper | Leverage |
50,000 | 15 | 3,147,385.74 | 2,893,565.95 | 3,401,205.53 | 2,326,544.23 | 3,968,227.25 | 0.106 |
1. [1 pt] Let’s consider Model A first. What does the result of
the F-test indicate?
(c) The p-value of F-test is close to zero. This indicates
that at least some independent variables in the regression model
significantly explain the revenue.
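To see where the F-test figures come from, the short check below recomputes the F statistic and R² from the Model A ANOVA table. The variable names are illustrative only. Note that 100.82 is the F statistic, not the p-value; the p-value reported in the output is 4.09E-15.

```python
# Values copied from the Model A ANOVA table.
ss_regression = 45_625_013_698_720.700
ss_residual = 4_223_508_396_681.690
df_regression = 3   # k, the number of independent variables
df_residual = 28    # n - k - 1 = 32 - 3 - 1

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F is the ratio of explained to unexplained variance per degree of freedom.
f_statistic = ms_regression / ms_residual
# R^2 is the share of total variation explained by the regression.
r_squared = ss_regression / (ss_regression + ss_residual)

print(f"F = {f_statistic:.2f}")
print(f"R^2 = {r_squared:.3f}")
```

A large F with a p-value near zero only says that the independent variables jointly explain the revenue; it does not say each one is individually significant, which is why option (b) is wrong.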
2. [1 pt] If we use model A for prediction, what is the point
estimate for the revenue of a concert that has attendance of 50,000
people, 5 concession stands, and the song ranked in no. 15 in the
Billboard ranking?
(a) $3.145 M
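The point estimate is obtained by plugging the concert's values into the fitted Model A equation. A minimal sketch, with the coefficients copied from the regression output and a hypothetical helper name `predict_model_a`:

```python
# Coefficients copied from the Model A regression output.
COEF_A = {
    "intercept": 294_318.169,
    "attendance": 59.162,
    "concessions": 2_506.230,
    "billboard": -7_980.320,
}


def predict_model_a(attendance, concessions, billboard):
    """Point estimate of concert revenue (in dollars) under Model A."""
    return (
        COEF_A["intercept"]
        + COEF_A["attendance"] * attendance
        + COEF_A["concessions"] * concessions
        + COEF_A["billboard"] * billboard
    )


estimate_a = predict_model_a(attendance=50_000, concessions=5, billboard=15)
print(f"${estimate_a:,.0f}")
```

The result differs from the 3,145,256.45 in the output by a few dollars because the printed coefficients are rounded; either way it rounds to $3.145 M.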
3. [1 pt] What is an approximate 95% prediction interval for the
concert listed in the previous question?
(c) [$2.368M, $3.922M]
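The word "approximate" points to the rule of thumb of taking the point estimate plus or minus two standard errors of the estimate. The sketch below applies that rule to the Model A output; the variable names are illustrative.

```python
# Values copied from the Model A output.
predicted = 3_145_256.45          # point estimate for the concert
std_error_estimate = 388_380.722  # standard error of the estimate

# Approximate 95% prediction interval: estimate +/- 2 standard errors.
margin = 2 * std_error_estimate
lower = predicted - margin
upper = predicted + margin

print(f"[${lower / 1e6:.3f}M, ${upper / 1e6:.3f}M]")
```

The exact 95% prediction interval in the output, [2,307,119.48, 3,983,393.42], is slightly wider because it uses the t critical value for 28 degrees of freedom and accounts for the leverage of the new point.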
4. [1 pt] Which of the following statements is correct?
(c) The p-value for the concession variable is 0.933. This
means that the number of concession stands is not a statistically
significant variable to determine the revenue.
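Each t-statistic in the output is the coefficient divided by its standard error. The check below recomputes them for Model A and compares each against 2.048, the two-sided 5% critical value of the t distribution with 28 degrees of freedom. The dictionary layout and names are assumptions made for illustration.

```python
# (coefficient, standard error) pairs copied from the Model A regression output.
MODEL_A = {
    "attendance": (59.162, 3.491),
    "concessions": (2_506.230, 29_519.840),
    "billboard": (-7_980.320, 2_311.354),
}

T_CRITICAL = 2.048  # two-sided 5% critical value, t distribution with df = 28

t_stats = {name: coef / se for name, (coef, se) in MODEL_A.items()}
significant = {name: abs(t) > T_CRITICAL for name, t in t_stats.items()}

for name, t in t_stats.items():
    verdict = "significant" if significant[name] else "not significant"
    print(f"{name}: t = {t:.3f} ({verdict} at the 5% level)")
```

Only the concession variable falls short of the critical value, matching its p-value of 0.933. The small attendance slope of about $59 is per additional attendee, so it is far from negligible at venues holding tens of thousands of people.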
5. [1 pt] Is it appropriate to use model A as a final model to
estimate the revenue of a concert?
(c) No, because not all independent variables are
statistically important. Thus, revision is
necessary.
6. [1 pt] Now, consider model B. According to model B, what is a
point estimate for a concert that has
attendance of 50000 people, 5 concession stands, and the song
ranked in no. 15 in the Billboard ranking?
(a) $3.147M
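Model B has no concession term, so the 5 concession stands given in the question play no role in the estimate. A brief sketch with the Model B coefficients copied from the output (the function name `predict_model_b` is hypothetical):

```python
# Coefficients copied from the Model B regression output.
INTERCEPT_B = 308_223.220
SLOPE_ATTENDANCE_B = 59.181
SLOPE_BILLBOARD_B = -7_992.387


def predict_model_b(attendance, billboard):
    """Point estimate of concert revenue (in dollars) under Model B."""
    return INTERCEPT_B + SLOPE_ATTENDANCE_B * attendance + SLOPE_BILLBOARD_B * billboard


# The number of concession stands is deliberately not an input here.
estimate_b = predict_model_b(attendance=50_000, billboard=15)
print(f"${estimate_b:,.0f}")
```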
7. [1 pt] Based on the regression outputs, which model would you
consider more suitable for predicting the revenue: Model A or
Model B?
(c) Model B is more suitable, because, while both models
have similar R² and F-test
p-value, model B has lower standard error of the estimate
(s_e) and all independent
variables are statistically
significant.
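The comparison can be checked from the two ANOVA tables. Dropping the concession variable barely changes the residual sum of squares but frees one degree of freedom, so the standard error of the estimate falls and adjusted R² rises. The helper name `fit_summary` below is hypothetical.

```python
SS_TOTAL = 49_848_522_095_402.400  # total sum of squares, same for both models
N = 32                             # number of concerts


def fit_summary(ss_residual, k):
    """Return (R^2, adjusted R^2, standard error of the estimate) for k predictors."""
    df_residual = N - k - 1
    ms_residual = ss_residual / df_residual
    r_squared = 1 - ss_residual / SS_TOTAL
    adjusted_r_squared = 1 - ms_residual / (SS_TOTAL / (N - 1))
    return r_squared, adjusted_r_squared, ms_residual ** 0.5


# Residual sums of squares copied from the two ANOVA tables.
r2_a, adj_r2_a, se_a = fit_summary(4_223_508_396_681.690, k=3)
r2_b, adj_r2_b, se_b = fit_summary(4_224_595_646_520.690, k=2)

print(f"Model A: R^2 = {r2_a:.3f}, adj. R^2 = {adj_r2_a:.3f}, s_e = {se_a:,.0f}")
print(f"Model B: R^2 = {r2_b:.3f}, adj. R^2 = {adj_r2_b:.3f}, s_e = {se_b:,.0f}")
```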