A large sports supplier has many stores located worldwide. A regression model is to be constructed to predict the annual revenue of a particular store based on the population of the city or town where the store is located, the annual expenditure on promotion for the store, and the distance of the store from the center of the city.
Data has been collected on 30 randomly selected stores.
a) Find the multiple regression equation using all three explanatory variables. Assume that X1 is population, X2 is annual promotional expenditure and X3 is distance to city center. Give your answers to 3 decimal places.
ŷ = ___ + ___ population + ___ promo. expenditure + ___ dist. to city
b) At a level of significance of 0.05, the result of the F test for this model is that the null hypothesis (is / is not) rejected.
For parts c) and d), using the data, separately calculate the correlations between the response variable and each of the three explanatory variables.
c) The explanatory variable that is most correlated with annual revenue is:
population
promotional expenditure
distance to city
d) The explanatory variable that is least correlated with annual revenue is:
population
promotional expenditure
distance to city
e) The value of R² for this model, to 2 decimal places, is equal to ___
f) The value of se for this model, to 3 decimal places, is equal to ___
g) Construct a new multiple regression model by removing the variable distance to city center. Give your answers to 3 decimal places.
The new regression model equation is:
ŷ = ___ + ___ population + ___ promo. expenditure
h) In the new model compared to the previous one, the value of R² (to 2 decimal places) is:
increased
decreased
unchanged
i) In the new model compared to the previous one, the value of se (to 3 decimal places) is:
increased
decreased
unchanged
ANSWER:
The Excel output for the multiple regression with all three explanatory variables is:
SUMMARY OUTPUT

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.991553 |
| R Square | 0.983177 |
| Adjusted R Square | 0.981236 |
| Standard Error | 30.27639 |
| Observations | 30 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 3 | 1392884 | 464294.5 | 506.507 | 3.59E-23 |
| Residual | 26 | 23833.15 | 916.6597 | | |
| Total | 29 | 1416717 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | -11.5673 | 22.9015 | -0.50509 | 0.617751 | -58.642 | 35.5074 |
| Population | 0.805069 | 0.023907 | 33.675 | 5.64E-23 | 0.755927 | 0.85421 |
| Promotional exp | 0.494499 | 0.140416 | 3.521672 | 0.001605 | 0.20587 | 0.783128 |
| Dist to city | -0.33873 | 1.148964 | -0.29481 | 0.770481 | -2.70046 | 2.023005 |
a)
The coefficients for x1, x2 and x3 are
b1 = 0.805, b2 = 0.494, b3 = -0.339
and the intercept is a = -11.567.
Hence the regression equation is
ŷ = -11.567 + 0.805 x1 + 0.494 x2 - 0.339 x3
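As a rough illustration of how the fitted equation from part a) is used, the sketch below plugs the rounded coefficients from the Excel output into a small function. The function name and the input values are hypothetical, and the units of each variable are not stated in the problem, so the number it produces is only illustrative.

```python
# Coefficients rounded to 3 decimal places, taken from the Excel output for part a).
INTERCEPT = -11.567
B_POPULATION = 0.805   # b1, coefficient of x1 (population)
B_PROMO = 0.494        # b2, coefficient of x2 (promotional expenditure)
B_DISTANCE = -0.339    # b3, coefficient of x3 (distance to city center)


def predict_revenue(population, promo, distance):
    """Return the predicted annual revenue y-hat for one store."""
    return (INTERCEPT
            + B_POPULATION * population
            + B_PROMO * promo
            + B_DISTANCE * distance)


# Hypothetical inputs, chosen only to show the call; they are not from the data set.
example_prediction = predict_revenue(population=100.0, promo=50.0, distance=10.0)
print(round(example_prediction, 3))
```

With all three inputs set to zero the function returns the intercept, which is a quick way to check that the coefficients were copied correctly.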
b)
The calculated value of F = 506.507
p-value = 3.59E-23 ≈ 0.0000
Here the p-value < α (0.05).
Hence we reject the null hypothesis.
At a level of significance of 0.05, the result of the F test for this model is that the null hypothesis is rejected.
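The F statistic in part b) can be reproduced from the ANOVA sums of squares as the ratio of the regression mean square to the residual mean square. The sketch below is a minimal check using the values printed in the Excel output. The p-value is copied from that output rather than recomputed, and the variable names are illustrative.

```python
# Values copied from the ANOVA table of the three-variable model.
ss_regression = 1392884.0
ss_residual = 23833.15
df_regression = 3        # number of explanatory variables
df_residual = 26         # n - k - 1 = 30 - 3 - 1

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F statistic for the overall significance of the model.
f_stat = ms_regression / ms_residual

# Decision rule: reject H0 when the p-value is below alpha.
alpha = 0.05
p_value = 3.59e-23       # "Significance F" as reported by Excel
reject_null = p_value < alpha

print(round(f_stat, 3), reject_null)
```

Because the sums of squares in the output are rounded, the recomputed F can differ from the printed 506.507 in the later decimal places.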
Using the Excel formula =CORREL(data 1, data 2) for each pair, we get the correlations:
correlation between y and x1 is r1 = 0.99
correlation between y and x2 is r2 = 0.43
correlation between y and x3 is r3 = -0.24
c)
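The explanatory variable that is most correlated with annual revenue is: population (|r1| = 0.99 is the largest correlation in absolute value).
d)
The explanatory variable that is least correlated with annual revenue is: distance to city (|r3| = 0.24 is the smallest correlation in absolute value).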
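For readers working outside Excel, the sketch below shows one way to compute the Pearson correlation that CORREL returns and then rank the three reported correlations by absolute value, which is the comparison behind parts c) and d). The store data is not reproduced here, so the function is demonstrated on a small made-up sample. The ranking uses the rounded correlations reported in this answer.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up sample (not the store data): y is exactly 2 * x, so r should be 1.
toy_r = pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])

# Correlations with annual revenue as reported in this answer.
reported = {
    "population": 0.99,
    "promotional expenditure": 0.43,
    "distance to city": -0.24,
}

# Strength of correlation is judged by absolute value, so the sign is ignored.
most_correlated = max(reported, key=lambda name: abs(reported[name]))
least_correlated = min(reported, key=lambda name: abs(reported[name]))

print(most_correlated, "/", least_correlated)
```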
e)
R² = 0.983177, which is 0.98 to 2 decimal places.
f)
se = standard error = 30.276
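Both values in parts e) and f) follow directly from the ANOVA table: R² is the regression sum of squares divided by the total sum of squares, and se is the square root of the residual mean square. The sketch below recomputes them from the printed sums of squares. The variable names are illustrative.

```python
import math

# Values copied from the ANOVA table of the three-variable model.
ss_regression = 1392884.0
ss_residual = 23833.15
ss_total = 1416717.0
n_obs = 30
n_predictors = 3

# Coefficient of determination: share of total variation explained by the model.
r_squared = ss_regression / ss_total

# Standard error of the estimate: sqrt(SSE / (n - k - 1)).
std_error = math.sqrt(ss_residual / (n_obs - n_predictors - 1))

print(round(r_squared, 2), round(std_error, 3))
```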
g)
The Excel output for the new regression model, without distance to city, is:
Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.991525 |
| R Square | 0.983121 |
| Adjusted R Square | 0.981871 |
| Standard Error | 29.76004 |
| Observations | 30 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 2 | 1392804 | 696402 | 786.3084 | 1.17E-24 |
| Residual | 27 | 23912.82 | 885.6601 | | |
| Total | 29 | 1416717 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | -15.8435 | 17.42022 | -0.90949 | 0.37114 | -51.5868 | 19.89984 |
| Population | 0.807002 | 0.022598 | 35.7107 | 2.89E-24 | 0.760634 | 0.85337 |
| Promotional exp | 0.489327 | 0.13694 | 3.5733 | 0.001352 | 0.20835 | 0.770304 |
Here a = -15.844, b1 = 0.807, b2 = 0.489, so the new regression equation is
ŷ = -15.844 + 0.807 x1 + 0.489 x2
h)
Here R² = 0.983121, which is 0.98 to 2 decimal places, the same rounded value as in the previous model.
In the new model compared to the previous one, the value of R² (to 2 decimal places) is: unchanged
i)
se = 29.760, compared with 30.276 in the previous model.
In the new model compared to the previous one, the value of se (to 3 decimal places) is: decreased
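It may seem odd that se falls while R² barely moves. Dropping distance to city raises the residual sum of squares only slightly (23833.15 to 23912.82) but frees one residual degree of freedom (26 to 27), so the residual mean square, and hence se, goes down. The sketch below puts the two models side by side using the sums of squares from the Excel outputs. The helper name `fit_summary` is hypothetical.

```python
import math


def fit_summary(sse, sst, n, k):
    """Summary statistics for a regression with k predictors and n observations."""
    df_resid = n - k - 1
    r2 = 1 - sse / sst
    # Adjusted R^2 penalises extra predictors through the degrees of freedom.
    adj_r2 = 1 - (sse / df_resid) / (sst / (n - 1))
    se = math.sqrt(sse / df_resid)
    return {"r2": r2, "adj_r2": adj_r2, "se": se}


# Sums of squares copied from the two Excel ANOVA tables.
full = fit_summary(sse=23833.15, sst=1416717.0, n=30, k=3)
reduced = fit_summary(sse=23912.82, sst=1416717.0, n=30, k=2)

for name, model in (("full", full), ("reduced", reduced)):
    print(name, round(model["r2"], 2), round(model["adj_r2"], 6), round(model["se"], 3))
```

The adjusted R² also rises slightly for the two-variable model, in line with the Adjusted R Square values in the two outputs (0.981236 and 0.981871).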