1. FIND THE CORRELATION COEFFICIENT OF THE FOLLOWING DATA AND COMMENT ON ITS DIRECTION AND STRENGTH.
HIGH TEMP(X) = 55 58 64 68 70 75 80 85
CANS SOLD(Y) = 340 335 410 450 460 610 735 780
2. IF THE EQUATION ABOVE IS A GOOD FIT, PREDICT THE NUMBER OF CANS SOLD WHEN THE HIGH TEMP(X) IS 77.
3. FIND THE RESIDUALS FOR TEMP(X) = 58 AND 70.
4. HOW MUCH OF THE VARIATION IN CANS SOLD(Y) IS EXPLAINED BY KNOWING HIGH TEMP(X)?
1.
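The correlation coefficient (r) is given by:
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} \qquad (1)$$
Step 1:
The calculation table is as follows: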
Sr.no. | HIGH TEMP(X) | CANS SOLD (Y) | XY | X^2 | Y^2 |
1 | 55 | 340 | 18700 | 3025 | 115600 |
2 | 58 | 335 | 19430 | 3364 | 112225 |
3 | 64 | 410 | 26240 | 4096 | 168100 |
4 | 68 | 450 | 30600 | 4624 | 202500 |
5 | 70 | 460 | 32200 | 4900 | 211600 |
6 | 75 | 610 | 45750 | 5625 | 372100 |
7 | 80 | 735 | 58800 | 6400 | 540225 |
8 | 85 | 780 | 66300 | 7225 | 608400 |
Total | 555 | 4120 | 298020 | 39259 | 2330750 |
Mean | 69.375 | 515.00 | | | |
n | 8 | 8 | | | |
Hence, we have the following: .............(2)
n= | 8 |
Sum-x = | 555 |
Sum-y = | 4120 |
Sum-x^2 = | 39259 |
Sum-xy = | 298020 |
x-bar = | 69.38 |
y-bar = | 515.00 |
Sum-y^2 = | 2330750 |
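The column totals in (2) can be cross-checked with a few lines of Python. This is only a sketch for verification; the variable names (`temps`, `cans`, and so on) are illustrative and do not come from the original answer.

```python
# Data from the question
temps = [55, 58, 64, 68, 70, 75, 80, 85]          # HIGH TEMP (X)
cans = [340, 335, 410, 450, 460, 610, 735, 780]   # CANS SOLD (Y)

n = len(temps)
sum_x = sum(temps)
sum_y = sum(cans)
sum_xy = sum(x * y for x, y in zip(temps, cans))  # sum of the XY column
sum_x2 = sum(x * x for x in temps)                # sum of the X^2 column
sum_y2 = sum(y * y for y in cans)                 # sum of the Y^2 column
x_bar = sum_x / n
y_bar = sum_y / n

print(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print(x_bar, y_bar)
```

The printed totals should match the "Total" and "Mean" rows of the calculation table.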
Substituting the values from (2) into (1), we get
$$r = \frac{97560}{\sqrt{6047 \times 1671600}} = 0.9704$$
So the correlation coefficient is r = 0.9704.
Comment: The direction is positive and the strength is very high (r is close to 1).
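As a check on the hand calculation, the sums-based formula for r can be written as a small function. This is a minimal sketch in plain Python; the function name `pearson_r` is an assumption made for illustration, not part of the original answer.

```python
import math

def pearson_r(x, y):
    """Pearson correlation from raw sums (the computational formula)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

temps = [55, 58, 64, 68, 70, 75, 80, 85]
cans = [340, 335, 410, 450, 460, 610, 735, 780]

r = pearson_r(temps, cans)
print(round(r, 4))  # 0.9704
```

The numerator evaluates to 97560 and the two bracketed terms to 6047 and 1671600, the same intermediate values used in the hand calculation.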
#####
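2. The regression equation is:
y = a + b X ................(3)
where a and b are given by: ........(4)
$$b = \frac{\sum xy - n\,\bar{x}\,\bar{y}}{\sum x^2 - n\,\bar{x}^2}, \qquad a = \bar{y} - b\,\bar{x}$$
Substituting the values from (2) into (4), we get: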
$$b = \frac{12195}{755.875} = 16.1336$$
$$a = \bar{y} - b\,\bar{x} = 515.00 - 1119.2699 = -604.2699$$
Hence, the regression equation (3) becomes:
y = -604.2699 + 16.1336 x ...................(5)
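The slope and intercept can be reproduced with a short least-squares sketch in plain Python. The function name `least_squares` is hypothetical and chosen only for this illustration; it uses the usual formulas b = (Σxy − n·x̄·ȳ) / (Σx² − n·x̄²) and a = ȳ − b·x̄.

```python
def least_squares(x, y):
    """Return (intercept a, slope b) of the simple linear regression y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar  # 12195 here
    s_xx = sum(xi * xi for xi in x) - n * x_bar ** 2                 # 755.875 here
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b

temps = [55, 58, 64, 68, 70, 75, 80, 85]
cans = [340, 335, 410, 450, 460, 610, 735, 780]

a, b = least_squares(temps, cans)
print(round(a, 4), round(b, 4))  # -604.2699 16.1336
```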
To check the goodness of fit, we ran an Excel regression and obtained the following output:
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.9704 |
R Square | 0.9416 |
Adjusted R Square | 0.9319 |
Standard Error | 45.0934 |
Observations | 8 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 196749.50 | 196749.50 | 96.7580 | 0.0001 |
Residual | 6 | 12200.50 | 2033.42 | | |
Total | 7 | 208950.00 | | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
Intercept | -604.2699 | 114.8981 | -5.2592 | 0.0019 | -885.4154 | -323.1243 |
HIGH TEMP(X) | 16.1336 | 1.6402 | 9.8366 | 0.0001 | 12.1203 | 20.1470 |
From the above output, we can say the fit is good, for the following reasons:
i) R-square = 0.9416 (very high)
ii) The p-values of the F statistic and of both t statistics are all < 0.05.
Now, to predict the number of cans sold when the HIGH TEMP(X) is 77, we put x = 77 in (5):
y = -604.2699 + 16.1336*(77) = 638.0173 ≈ 638 cans
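The prediction step is a direct evaluation of equation (5). A minimal sketch, using the rounded coefficients as they appear in (5); the names `INTERCEPT`, `SLOPE`, and `predict_cans` are illustrative assumptions.

```python
# Rounded coefficients taken from equation (5)
INTERCEPT = -604.2699
SLOPE = 16.1336

def predict_cans(high_temp):
    """Predicted cans sold for a given high temperature, per equation (5)."""
    return INTERCEPT + SLOPE * high_temp

y_hat = predict_cans(77)
print(round(y_hat, 4))  # 638.0173
print(round(y_hat))     # 638
```

Note that 77 lies inside the observed temperature range (55 to 85), so this is an interpolation rather than an extrapolation.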
####################
3.
Finding the residuals for TEMP(X) = 58 and 70:
Put x = 58 in (5); we get
y = -604.2699 + 16.1336*(58) = 331.48 ≈ 331
Residual = (Y_actual - Y_predict) = (335 - 331.48) = 3.52 ≈ 4
#############
Put x = 70 in (5); we get
y = -604.2699 + 16.1336*(70) = 525.08 ≈ 525
Residual = (Y_actual - Y_predict) = (460 - 525.08) = -65.08 ≈ -65
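Both residuals can be checked in one pass. This sketch again assumes the rounded coefficients of equation (5); the names `observed` and `residual` are illustrative, not from the original answer.

```python
# Rounded coefficients taken from equation (5)
INTERCEPT = -604.2699
SLOPE = 16.1336

# Observed cans sold at the two temperatures of interest (from the data)
observed = {58: 335, 70: 460}

def residual(high_temp, actual):
    """Residual = actual value minus the value predicted by equation (5)."""
    predicted = INTERCEPT + SLOPE * high_temp
    return actual - predicted

res_58 = residual(58, observed[58])
res_70 = residual(70, observed[70])
print(round(res_58, 2))  # 3.52
print(round(res_70, 2))  # -65.08
```

A positive residual means the line under-predicts the observed value (x = 58), and a negative residual means it over-predicts (x = 70).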
#######
4.
Knowing HIGH TEMP(X) explains 94.16% of the variation in CANS SOLD(Y)
(R-square = 0.9416 from the Excel output).
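The R-square value can also be recovered without Excel by splitting the total variation into explained and unexplained parts. The sketch below fits the line by least squares and computes R² = 1 − SSE/SST; all variable names are illustrative assumptions.

```python
temps = [55, 58, 64, 68, 70, 75, 80, 85]
cans = [340, 335, 410, 450, 460, 610, 735, 780]

n = len(temps)
x_bar = sum(temps) / n
y_bar = sum(cans) / n

# Least-squares slope and intercept
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(temps, cans))
s_xx = sum((x - x_bar) ** 2 for x in temps)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Total and residual sums of squares
sst = sum((y - y_bar) ** 2 for y in cans)
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(temps, cans))
ssr = sst - sse  # variation explained by the regression

r_squared = 1 - sse / sst
print(round(sst, 2), round(ssr, 2), round(sse, 2))
print(round(r_squared, 4))  # 0.9416
```

The three sums of squares should agree with the Total, Regression, and Residual rows of the ANOVA table (208950.00, 196749.50, and 12200.50).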
### End of answers
Note: Since the question did not specify a method, Excel was used.