A retail outlet for air conditioners believes that its weekly sales are dependent upon the average temperature during the week. The outlet picks, at random, 12 weeks during the year and records the number of air conditioners that are sold and the average temperature that week. The data is below.
Temp (in deg.) Sales (# of conditioners)
72 3
77 4
82 7
43 1
31 0
28 0
81 8
83 5
76 5
60 4
50 4
55 5
a. If you had nothing other than the sales data and were asked to predict the number of air conditioners you would sell in a week, what is your prediction? (4 pts)
b. Use EXCEL and produce a scatterplot of this particular data set. Label all relevant information. COMPUTER DELIVERABLE (10 pts)
c. Based upon a visual analysis of your plot, what type of directional relationship do you anticipate? Is it consistent with conventional wisdom? (6 pts)
d. Calculate SSXX, SSYY, and SSXY. (12 pts)
e. Using your answers from part (d), is your anticipation in part (c) substantiated? Be specific as to your evidence. (4 pts)
f. What is the estimated OLS line appropriate for this situation? (8 pts)
g. Interpret the estimated slope coefficient. (5 pts)
h. Predict the number of air conditioners sold when the temperature is 76 degrees. (4 pts)
i. If there is an observation in that data set with that temperature, what is the residual for that observation? (4 pts)
j. Construct the ANOVA applicable to this regression situation. (10 pts)
k. Calculate the coefficient of determination. (5 pts)
l. Interpret the coefficient of determination. (5 pts)
m. Calculate the correlation coefficient. (5 pts)
n. Test, using the correlation coefficient, whether or not a positive relationship exists between sales and temperatures. Use alpha = 0.01. (16 pts)
o. Test, using the slope coefficient, whether or not a positive relationship exists between sales and temperatures. Use alpha = 0.01. (16 pts)
a.
With only the sales data available, the best prediction is the mean (average) of the 12 sampled weekly sales values: Y-bar = 46 / 12 = 3.83, or about 4 air conditioners per week.
b.
The scatter plot is obtained in Excel by following these steps:
Step 1: Enter the data values in Excel, with temperature in one column and sales in the next.
Step 2: Select all the values, then choose INSERT > Recommended Charts > XY Scatter > OK. Label the horizontal axis as temperature (in degrees) and the vertical axis as sales (number of air conditioners).
c.
The plot suggests a positive association between sales and temperature: as the temperature increases, sales also tend to increase.
Yes, this is consistent with conventional wisdom, because sales of air conditioners rise in hot weather.
d.
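From the data values (n = 12),
 | Temp (in deg.), X | Sales (# of conditioners), Y | X^2 | Y^2 | XY |
 | 72 | 3 | 5184 | 9 | 216 |
 | 77 | 4 | 5929 | 16 | 308 |
 | 82 | 7 | 6724 | 49 | 574 |
 | 43 | 1 | 1849 | 1 | 43 |
 | 31 | 0 | 961 | 0 | 0 |
 | 28 | 0 | 784 | 0 | 0 |
 | 81 | 8 | 6561 | 64 | 648 |
 | 83 | 5 | 6889 | 25 | 415 |
 | 76 | 5 | 5776 | 25 | 380 |
 | 60 | 4 | 3600 | 16 | 240 |
 | 50 | 4 | 2500 | 16 | 200 |
 | 55 | 5 | 3025 | 25 | 275 |
Total | 738 | 46 | 49782 | 246 | 3299 |
Using the column totals,
SSXX = Sum(X^2) - (Sum X)^2 / n = 49782 - (738)^2 / 12 = 49782 - 45387 = 4395
SSYY = Sum(Y^2) - (Sum Y)^2 / n = 246 - (46)^2 / 12 = 246 - 176.3333 = 69.6667
SSXY = Sum(XY) - (Sum X)(Sum Y) / n = 3299 - (738)(46) / 12 = 3299 - 2829 = 470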
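As a cross-check on the column totals, the three sums of squares can be recomputed with a short script. This is only a sketch; the variable names (temps, sales, ss_xx, and so on) are illustrative choices, not part of the original solution.

```python
# Data from the problem statement
temps = [72, 77, 82, 43, 31, 28, 81, 83, 76, 60, 50, 55]  # X
sales = [3, 4, 7, 1, 0, 0, 8, 5, 5, 4, 4, 5]              # Y
n = len(temps)

# Column totals, matching the Total row of the table
sum_x = sum(temps)
sum_y = sum(sales)
sum_x2 = sum(x * x for x in temps)
sum_y2 = sum(y * y for y in sales)
sum_xy = sum(x * y for x, y in zip(temps, sales))

# Computational formulas for the sums of squares
ss_xx = sum_x2 - sum_x ** 2 / n
ss_yy = sum_y2 - sum_y ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n

print(f"SSXX = {ss_xx:.4f}, SSYY = {ss_yy:.4f}, SSXY = {ss_xy:.4f}")
```

The printed values can be compared directly with the totals 738, 46, 49782, 246, and 3299 in the table.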
e.
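The sign of the relationship is determined by SSXY. From the totals in part (d), SSXY = 3299 - (738)(46) / 12 = 470, SSXX = 4395, and SSYY = 69.6667, so the correlation is
r = SSXY / sqrt(SSXX * SSYY) = 470 / sqrt(4395 * 69.6667) = 0.8494
This is positive, so there is a positive correlation between the two variables, which substantiates the anticipation in part (c).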
To answer the remaining parts, a regression analysis is performed in Excel by following these steps:
Step 1: Enter the data values in Excel.
Step 2: Choose DATA > Data Analysis > Regression > OK.
Step 3: Set Input Y Range to the 'Y' column and Input X Range to the 'X' column, then click OK.
Excel then produces the regression output summary used in the parts below.
f.
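The slope is b1 = SSXY / SSXX = 470 / 4395 = 0.1069, and the intercept is b0 = Y-bar - b1 * X-bar = 3.8333 - 0.1069(61.5) = -2.7435. The regression equation is,
Predicted Sales = -2.7435 + 0.1069 * Temp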
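The least-squares estimates can also be reproduced outside Excel. The sketch below uses the standard formulas b1 = SSXY / SSXX and b0 = Y-bar - b1 * X-bar; the variable names are illustrative.

```python
# Data from the problem statement
temps = [72, 77, 82, 43, 31, 28, 81, 83, 76, 60, 50, 55]  # X
sales = [3, 4, 7, 1, 0, 0, 8, 5, 5, 4, 4, 5]              # Y
n = len(temps)

x_bar = sum(temps) / n
y_bar = sum(sales) / n

# Sums of squares about the means (definitional form)
ss_xx = sum((x - x_bar) ** 2 for x in temps)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(temps, sales))

# OLS slope and intercept
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

print(f"Predicted Sales = {b0:.4f} + {b1:.4f} * Temp")
```

The slope should agree with the coefficient 0.106939704 reported in the Excel regression output.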
g.
For each one-degree increase in the average weekly temperature, weekly sales are estimated to increase by 0.1069 air conditioners, on average.
h.
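For Temperature = 76,
Predicted Sales = -2.7435 + 0.1069(76) = 5.38, using the unrounded coefficients b0 = -2.743458 and b1 = 0.106940.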
i.
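From the data, the observed sales for temperature = 76 are 5. The residual is observed minus predicted: 5 - 5.384 = -0.384, so the fitted line overpredicts sales for that week by about 0.38 units.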
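The prediction at 76 degrees and the corresponding residual can be checked with a few lines of code. This is a minimal sketch; predict is a hypothetical helper name introduced here for illustration.

```python
# Data from the problem statement
temps = [72, 77, 82, 43, 31, 28, 81, 83, 76, 60, 50, 55]  # X
sales = [3, 4, 7, 1, 0, 0, 8, 5, 5, 4, 4, 5]              # Y
n = len(temps)

x_bar = sum(temps) / n
y_bar = sum(sales) / n
ss_xx = sum((x - x_bar) ** 2 for x in temps)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(temps, sales))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar


def predict(temp):
    """Predicted weekly sales from the fitted OLS line (illustrative helper)."""
    return b0 + b1 * temp


y_hat = predict(76)
observed = sales[temps.index(76)]  # the week with an average temperature of 76
residual = observed - y_hat        # residual = observed - predicted

print(f"predicted = {y_hat:.4f}, observed = {observed}, residual = {residual:.4f}")
```

A negative residual means the observed value lies below the fitted line.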
j.
From the regression output summary, the ANOVA table is,
ANOVA
 | df | SS | MS | F | Significance F |
Regression | 1 | 50.26166 | 50.26166 | 25.90139 | 0.000471 |
Residual | 10 | 19.40501 | 1.940501 | | |
Total | 11 | 69.66667 | | | |
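The entries of the ANOVA table follow from the sums of squares: SS Regression = SSXY^2 / SSXX, SS Total = SSYY, and SS Residual is their difference. The sketch below recomputes them; the variable names are illustrative, and the Significance F column is not recomputed here.

```python
# Data from the problem statement
temps = [72, 77, 82, 43, 31, 28, 81, 83, 76, 60, 50, 55]  # X
sales = [3, 4, 7, 1, 0, 0, 8, 5, 5, 4, 4, 5]              # Y
n = len(temps)

x_bar = sum(temps) / n
y_bar = sum(sales) / n
ss_xx = sum((x - x_bar) ** 2 for x in temps)
ss_yy = sum((y - y_bar) ** 2 for y in sales)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(temps, sales))

# Partition of the total variation
ss_total = ss_yy
ss_reg = ss_xy ** 2 / ss_xx
ss_res = ss_total - ss_reg

# Degrees of freedom for simple linear regression
df_reg, df_res = 1, n - 2

ms_reg = ss_reg / df_reg
ms_res = ss_res / df_res
f_stat = ms_reg / ms_res
r_squared = ss_reg / ss_total  # coefficient of determination

print(f"Regression | {df_reg} | {ss_reg:.5f} | {ms_reg:.5f} | {f_stat:.5f}")
print(f"Residual   | {df_res} | {ss_res:.5f} | {ms_res:.6f}")
print(f"Total      | {df_reg + df_res} | {ss_total:.5f}")
print(f"R^2 = {r_squared:.4f}")
```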
k.
From the regression output summary, the R square value is,
R^2 = SS Regression / SS Total = 50.26166 / 69.66667 = 0.7215
l.
The R square value states that the regression model explains about 72.15% of the variation in weekly sales through its linear relationship with temperature.
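m.
The correlation coefficient is the square root of R^2, taken with the sign of the slope: r = +sqrt(0.7215) = 0.8494.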
n.
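The hypothesis test for the significance of the correlation coefficient is performed in the following steps,
Step 1: The null and alternative hypotheses are,
H0: rho = 0 (no linear relationship)
Ha: rho > 0 (positive linear relationship)
Step 2: The t-statistic is obtained using the formula,
t = r * sqrt(n - 2) / sqrt(1 - r^2) = 0.8494 * sqrt(10) / sqrt(1 - 0.7215) = 5.089
Step 3: The significance level is alpha = 0.01.
The one-tailed P-value for the t-statistic with degrees of freedom = n - 2 = 10 is about 0.0002, which is half of the two-tailed Significance F = 0.000471 in the regression output. Equivalently, t = 5.089 exceeds the critical value t(0.01, 10) = 2.764 from the t distribution table.
Step 4:
Since the P-value is less than 0.01, the null hypothesis is rejected at the 1% significance level. Hence there is a significant positive correlation between sales and temperature.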
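The test statistic for the correlation can be reproduced as follows. This is a sketch under one stated assumption: the critical value 2.764 for a one-tailed test at alpha = 0.01 with 10 degrees of freedom is typed in from a standard t table rather than computed.

```python
import math

# Data from the problem statement
temps = [72, 77, 82, 43, 31, 28, 81, 83, 76, 60, 50, 55]  # X
sales = [3, 4, 7, 1, 0, 0, 8, 5, 5, 4, 4, 5]              # Y
n = len(temps)

x_bar = sum(temps) / n
y_bar = sum(sales) / n
ss_xx = sum((x - x_bar) ** 2 for x in temps)
ss_yy = sum((y - y_bar) ** 2 for y in sales)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(temps, sales))

# Sample correlation coefficient
r = ss_xy / math.sqrt(ss_xx * ss_yy)

# t-statistic for H0: rho = 0 versus Ha: rho > 0
df = n - 2
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

# Assumption: one-tailed critical value t(0.01, 10) taken from a t table
T_CRITICAL = 2.764
reject_h0 = t_stat > T_CRITICAL

print(f"r = {r:.4f}, t = {t_stat:.4f}, reject H0: {reject_h0}")
```

In simple linear regression this t-statistic equals the t-statistic for the slope, so parts (n) and (o) lead to the same conclusion.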
o.
The hypothesis is tested by calculating t-value and corresponding p-value for the estimated slope in regression model as shown below,
Null Hypothesis: H0: beta1 = 0 (the slope is zero)
Alternate Hypothesis: Ha: beta1 > 0 (the slope is positive)
From, the regression output summary,
 | Coefficients | Standard Error | t Stat |
Temp (in deg.), X | 0.106939704 | 0.021012 | 5.089341 |
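The t-statistic is t = 0.106939704 / 0.021012 = 5.089. The P-value for the one-tailed test is obtained from the t distribution with degrees of freedom = n - 2 = 10, and it is compared with the significance level 0.01. It is about 0.0002, which is half of the two-tailed Significance F = 0.000471 in the ANOVA table.
Since the P-value is less than 0.01, the null hypothesis is rejected at the 1% significance level. Hence there is a significant positive slope between sales and temperature.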
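The slope test can be checked end to end with the standard library alone. The sketch below computes the standard error of the slope as sqrt(MSE / SSXX) and approximates the one-tailed P-value by integrating the t density with Simpson's rule. The helpers t_pdf and upper_tail_p are hypothetical names written for this illustration, not library functions.

```python
import math

# Data from the problem statement
temps = [72, 77, 82, 43, 31, 28, 81, 83, 76, 60, 50, 55]  # X
sales = [3, 4, 7, 1, 0, 0, 8, 5, 5, 4, 4, 5]              # Y
n = len(temps)

x_bar = sum(temps) / n
y_bar = sum(sales) / n
ss_xx = sum((x - x_bar) ** 2 for x in temps)
ss_yy = sum((y - y_bar) ** 2 for y in sales)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(temps, sales))

b1 = ss_xy / ss_xx
df = n - 2                            # residual degrees of freedom
mse = (ss_yy - ss_xy ** 2 / ss_xx) / df
se_b1 = math.sqrt(mse / ss_xx)        # standard error of the slope
t_stat = b1 / se_b1


def t_pdf(x, dof):
    """Density of Student's t distribution with dof degrees of freedom."""
    c = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return c * (1 + x * x / dof) ** (-(dof + 1) / 2)


def upper_tail_p(t, dof, steps=2000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t]; steps must be even."""
    h = t / steps
    total = t_pdf(0.0, dof) + t_pdf(t, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    return 0.5 - total * h / 3


p_one_tailed = upper_tail_p(t_stat, df)
alpha = 0.01

print(f"SE(b1) = {se_b1:.6f}, t = {t_stat:.6f}, one-tailed p = {p_one_tailed:.6f}")
print("Reject H0" if p_one_tailed < alpha else "Fail to reject H0")
```

The numerical integration is used only to avoid a dependency on a statistics package; where one is available, its t-distribution survival function would give the same P-value more directly.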