*2.25. Refer to Airfreight breakage Problem 1.21. a. Set up the ANOVA table. Which elements are additive? b. Conduct an F test to decide whether or not there is a linear association between the number of times a carton is transferred and the number of broken ampules; control the α risk at .05. State the alternatives, decision rule, and conclusion. c. Obtain the t* statistic for the test in part (b) and demonstrate numerically its equivalence to the F* statistic obtained in part (b). d. Calculate R² and r. What proportion of the variation in Y is accounted for by introducing X into the regression model?
| i | X | Y | 
| 1 | 1 | 16 | 
| 2 | 0 | 9 | 
| 3 | 2 | 17 | 
| 4 | 0 | 12 | 
| 5 | 3 | 22 | 
| 6 | 1 | 13 | 
| 7 | 0 | 8 | 
| 8 | 1 | 15 | 
| 9 | 2 | 19 | 
| 10 | 0 | 11 | 
| SUM | 10 | 142 | 
| AVG | 1 | 14.2 | 
ANSWER:
| x | y | (x-x̅)² | (y-ȳ)² | (x-x̅)(y-ȳ) | 
| 1 | 16 | 0.0000 | 3.2400 | 0.0000 | 
| 0 | 9 | 1.0000 | 27.0400 | 5.2000 | 
| 2 | 17 | 1.0000 | 7.8400 | 2.8000 | 
| 0 | 12 | 1.0000 | 4.8400 | 2.2000 | 
| 3 | 22 | 4.0000 | 60.8400 | 15.6000 | 
| 1 | 13 | 0.0000 | 1.4400 | 0.0000 | 
| 0 | 8 | 1.0000 | 38.4400 | 6.2000 | 
| 1 | 15 | 0.0000 | 0.6400 | 0.0000 | 
| 2 | 19 | 1.0000 | 23.0400 | 4.8000 | 
| 0 | 11 | 1.0000 | 10.2400 | 3.2000 | 
| Σx = 10 | Σy = 142 | SSxx = 10.0000 | SSyy = 177.6000 | SSxy = 40.0000 | 
| x̅ = 1.00 | ȳ = 14.20 | | | | 
Sample size: n = 10
Means: x̅ = Σx/n = 1.000, ȳ = Σy/n = 14.200

SSxx = Σ(x-x̅)² = 10.000
SSyy = Σ(y-ȳ)² = 177.600
SSxy = Σ(x-x̅)(y-ȳ) = 40.000

SSE = (SSxx·SSyy - SSxy²)/SSxx = (10 × 177.6 - 40²)/10 = 17.600
SST = SSyy = 177.600
SSR = SST - SSE = 160.000
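The sums of squares above can be checked with a short script. This is a minimal sketch in plain Python using the ten (X, Y) pairs from the problem; the variable names (`ss_xx`, `ss_yy`, `ss_xy`, `sse`, `ssr`) are chosen here for illustration and are not part of the original solution.

```python
# Data from the problem: x = number of transfers, y = number of broken ampules.
x = [1, 0, 2, 0, 3, 1, 0, 1, 2, 0]
y = [16, 9, 17, 12, 22, 13, 8, 15, 19, 11]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Corrected sums of squares and cross products.
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Error and regression sums of squares, using the same formulas as the text.
sse = (ss_xx * ss_yy - ss_xy ** 2) / ss_xx
sst = ss_yy
ssr = sst - sse

print(f"SSxx={ss_xx:.3f}  SSyy={ss_yy:.3f}  SSxy={ss_xy:.3f}")
print(f"SSE={sse:.3f}  SST={sst:.3f}  SSR={ssr:.3f}")
```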
a)
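ANOVA table
| source of variation | SS | df | MS | F-stat | p-value | 
| regression | 160.000 | 1 | 160.000 | 72.727 | < 0.0001 | 
| error | 17.600 | 8 | 2.200 | | | 
| total | 177.600 | 9 | | | | 
The sums of squares and the degrees of freedom are additive: SST = SSR + SSE (177.6 = 160.0 + 17.6) and 9 = 1 + 8. The mean squares are not additive.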
b)
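H0: β1 = 0 (no linear association between the number of transfers and the number of broken ampules)
Ha: β1 ≠ 0 (there is a linear association)
F critical value (α = 0.05; df = 1, 8) = 5.318
Decision rule: if F* ≤ 5.318, conclude H0; if F* > 5.318, conclude Ha.
F* = MSR/MSE = 160.000/2.200 = 72.727
Since F* = 72.727 > 5.318, reject H0.
Hence, at the 0.05 level, we conclude that there is a linear association between the number of times a carton is transferred and the number of broken ampules.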
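The ANOVA table and the F test can be reproduced as follows. This is a sketch under stated assumptions: `anova_simple_regression` is a hypothetical helper written for this example, and the critical value 5.318 is copied from the solution text rather than computed, so that only the standard library is needed.

```python
def anova_simple_regression(x, y):
    """Return the ANOVA quantities for a simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = ss_xy ** 2 / ss_xx      # regression sum of squares
    sse = sst - ssr               # error sum of squares
    msr = ssr / 1                 # regression has 1 degree of freedom
    mse = sse / (n - 2)           # error has n - 2 degrees of freedom
    return {"SSR": ssr, "SSE": sse, "SST": sst,
            "df": (1, n - 2, n - 1),
            "MSR": msr, "MSE": mse, "F": msr / mse}


x = [1, 0, 2, 0, 3, 1, 0, 1, 2, 0]
y = [16, 9, 17, 12, 22, 13, 8, 15, 19, 11]

F_CRIT = 5.318  # F(0.95; 1, 8), taken from the solution text (not computed here)

table = anova_simple_regression(x, y)
reject_h0 = table["F"] > F_CRIT

print(f"F* = {table['F']:.3f}, critical value = {F_CRIT}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

If SciPy is available, the critical value could instead be obtained from its F distribution; it is hard-coded here to keep the example dependency-free.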
c)
Estimated slope: b1 = SSxy/SSxx = 40.0/10.0 = 4.0000
Standard error of the estimate: s = √(SSE/(n-2)) = √(17.6/8) = 1.4832
Estimated standard error of the slope: s(b1) = s/√SSxx = 1.4832/√10.00 = 0.4690

t* = b1/s(b1) = 4.0000/0.4690 = 8.5280
(t*)² = 8.5280² = 72.727 = F*

So the two statistics are equivalent: F* = (t*)². The critical values agree in the same way: [t(0.975; 8)]² = 2.306² = 5.318 = F(0.95; 1, 8).
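The numerical equivalence between the t and F statistics can be checked directly. The following is a minimal sketch in plain Python; the names `b1`, `se_b1`, `t_stat`, and `f_stat` are chosen for this example.

```python
import math

x = [1, 0, 2, 0, 3, 1, 0, 1, 2, 0]
y = [16, 9, 17, 12, 22, 13, 8, 15, 19, 11]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = ss_xy / ss_xx                # estimated slope
ssr = b1 ** 2 * ss_xx             # regression sum of squares
mse = (ss_yy - ssr) / (n - 2)     # mean squared error
se_b1 = math.sqrt(mse / ss_xx)    # estimated standard error of the slope

t_stat = b1 / se_b1
f_stat = ssr / mse                # MSR / MSE, with MSR = SSR / 1

print(f"t* = {t_stat:.4f}, (t*)^2 = {t_stat ** 2:.3f}, F* = {f_stat:.3f}")
```

The identity holds algebraically, since (b1/s(b1))² = b1²·SSxx/MSE = SSR/MSE, so the agreement is not a coincidence of this data set.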
d)
Correlation coefficient: r = SSxy/√(SSxx·SSyy) = 40/√(10 × 177.6) = 0.949 (positive, since the slope b1 is positive)

R² = SSxy²/(SSxx·SSyy) = SSR/SST = 160/177.6 = 0.9009
About 90.09% of the variation in Y is accounted for by introducing X into the regression model.
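As a final check, r and R² can be computed from the same sums of squares. This is a sketch in plain Python with illustrative variable names; it also confirms that R² equals r² for simple linear regression.

```python
import math

x = [1, 0, 2, 0, 3, 1, 0, 1, 2, 0]
y = [16, 9, 17, 12, 22, 13, 8, 15, 19, 11]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Correlation coefficient; its sign follows the sign of SSxy (and of the slope).
r = ss_xy / math.sqrt(ss_xx * ss_yy)

# Coefficient of determination as SSR / SST.
ssr = ss_xy ** 2 / ss_xx
r_squared = ssr / ss_yy

print(f"r = {r:.3f}, R^2 = {r_squared:.4f}")
```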