Create an ANOVA problem and a regression problem; show all steps and results.
1)
ANOVA Problem
Research Question: Do teenagers in the three age groups spend equal time playing video games on weekends?
Null Hypothesis
H0: µGroup 1 = µGroup 2 = µGroup 3 (teenagers in all three age groups spend equal time, on average, playing video games on weekends)
Alternate Hypothesis
Ha: Not all means are equal
Data Given
| | Group 1 (10-12 Yrs Old) | Group 2 (13-15 Yrs Old) | Group 3 (16-18 Yrs Old) | Overall |
|---|---|---|---|---|
| Mean | 2 | 5 | 6 | 4.38 |
| Variance | 10 | 25 | 20 | |
| Std Dev | 3.16 | 5.00 | 4.47 | |
| Sample Size | 39 | 43 | 41 | 123 |
Overall Mean = (2*39+5*43+6*41)/(39+43+41) = 4.38
alpha = 0.05
Degrees of Freedom (a = number of groups, N = total sample size):
dfBetween = a – 1 = 3 – 1 = 2
dfWithin = N – a = 123 – 3 = 120
dfTotal = N – 1 = 123 – 1 = 122
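The overall mean and the degrees of freedom can be checked with a few lines of Python. This is only a sketch; the variable names are illustrative and the numbers are copied from the data table.

```python
# Summary statistics copied from the data table (names are illustrative).
group_means = [2, 5, 6]
group_sizes = [39, 43, 41]

n_total = sum(group_sizes)    # N
n_groups = len(group_sizes)   # a

# The overall mean is the size-weighted average of the group means.
overall_mean = sum(m * n for m, n in zip(group_means, group_sizes)) / n_total

df_between = n_groups - 1
df_within = n_total - n_groups
df_total = n_total - 1

print(round(overall_mean, 2), df_between, df_within, df_total)
```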
Critical Value:
F(dfBetween, dfWithin) = F(2, 120) = 3.07 at alpha = 0.05
Decision Rule:
If F is greater than 3.07, reject the null hypothesis
Test Statistics:
Between-Group Sum of Squares (SSBetween) = ∑ n(Group Average – Overall Average)² = 39*(2 – 4.38)² + 43*(5 – 4.38)² + 41*(6 – 4.38)² = 345.04
Within-Group Sum of Squares (SSWithin, also written SSE) = ∑ (n – 1)*Variance = 38*10 + 42*25 + 40*20 = 2,230 (treating the given variances as sample variances)
Pooled Standard Deviation (sp) = (SSWithin / dfWithin)^0.5 = (2,230/120)^0.5 = 4.31
SSTotal = SSBetween + SSWithin = 345.04 + 2,230 = 2,575.04
MS = SS/df
MSBetween = 345.04/2 = 172.52
MSWithin = 2,230/120 = 18.583
F = MSBetween / MSWithin
Hence,
F = 172.52/18.583 = 9.28
| | SS | df | MS | F |
|---|---|---|---|---|
| Between | 345.04 | 2 | 172.52 | 9.28 |
| Within | 2,230.00 | 120 | 18.58 | |
| Total | 2,575.04 | 122 | | |
Result:
Our F = 9.28 is greater than the critical value, so we reject the null hypothesis
Conclusion:
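Not all means are equal: the average weekend time spent playing video games differs for at least one of the three age groups.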
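As a cross-check, the F statistic can be recomputed from the summary statistics alone. The sketch below assumes the given variances are sample variances (divisor n – 1), so the within-group sum of squares is ∑ (n – 1)*Variance. The function name `anova_from_summary` is illustrative and does not come from any library.

```python
def anova_from_summary(means, variances, sizes):
    """One-way ANOVA from group means, sample variances, and sizes."""
    n_total = sum(sizes)
    n_groups = len(sizes)
    grand_mean = sum(m * n for m, n in zip(means, sizes)) / n_total

    # Between-group SS: squared distance of each group mean from the grand mean,
    # weighted by group size.
    ss_between = sum(n * (m - grand_mean) ** 2 for m, n in zip(means, sizes))
    # Within-group SS, assuming each variance was computed with divisor n - 1.
    ss_within = sum((n - 1) * v for v, n in zip(variances, sizes))

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f": ms_between / ms_within,
    }


result = anova_from_summary(means=[2, 5, 6], variances=[10, 25, 20], sizes=[39, 43, 41])
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Because the grand mean is not rounded to 4.38 before it is used, the printed values can differ from a hand calculation in the last decimal place.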
2)
Regression Problem:
Suppose we want to predict ice cream consumption from temperature (in degrees Fahrenheit).
Data Set
| temp | icecream |
|---|---|
| 41 | 0.386 |
| 56 | 0.374 |
| 63 | 0.393 |
| 68 | 0.425 |
| 69 | 0.406 |
| 65 | 0.344 |
| 61 | 0.327 |
| 47 | 0.288 |
| 32 | 0.269 |
| 24 | 0.256 |
| 28 | 0.286 |
| 26 | 0.298 |
| 32 | 0.329 |
| 40 | 0.318 |
| 55 | 0.381 |
| 63 | 0.381 |
| 72 | 0.47 |
| 72 | 0.443 |
| 67 | 0.386 |
| 60 | 0.342 |
| 44 | 0.319 |
| 40 | 0.307 |
| 32 | 0.284 |
| 27 | 0.326 |
| 28 | 0.309 |
| 33 | 0.359 |
| 41 | 0.376 |
| 52 | 0.416 |
| 64 | 0.437 |
| 71 | 0.548 |
Solution:
Dependent Variable: Icecream
Independent Variable: Temperature
We expect the relationship to be positive: as the temperature increases, people will consume more ice cream.
Steps for Regression in Excel
Regression Output
| Regression Statistics | |
|---|---|
| Multiple R | 0.7756 |
| R Square | 0.6016 |
| Adjusted R Square | 0.5874 |
| Standard Error | 0.0423 |
| Observations | 30 |

ANOVA
| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 0.08 | 0.08 | 42.28 | 0.00 |
| Residual | 28 | 0.05 | 0.00 | | |
| Total | 29 | 0.13 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 0.21 | 0.02 | 8.37 | 0.00 | 0.16 | 0.26 |
| temp | 0.003 | 0.00 | 6.50 | 0.00 | 0.00 | 0.00 |
Since Significance F (the p-value of the overall F test) is less than 0.05, we reject the null hypothesis that the slope is zero, i.e. the above model is significant.
Also, the p-value of the temp coefficient is less than 0.05, hence temperature is significant in explaining the variation in the dependent variable.
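The F and t statistics in the output are linked to R Square, which gives a quick consistency check. The sketch below uses only R Square and the number of observations from the output. It relies on the fact that, with a single predictor, F = t². The variable names are illustrative.

```python
import math

r_squared = 0.6016   # "R Square" from the regression output
n_obs = 30           # "Observations"
n_predictors = 1     # temp is the only independent variable

df_residual = n_obs - n_predictors - 1

# F compares explained variation per predictor with unexplained variation per residual df.
f_stat = (r_squared / n_predictors) / ((1 - r_squared) / df_residual)

# With exactly one predictor, the slope's |t| equals the square root of F.
t_stat = math.sqrt(f_stat)

print(round(f_stat, 2), round(t_stat, 2))
```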
Regression Equation
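Icecream = 0.21 + 0.003 × temp
Each 1 degree Fahrenheit increase in temperature is associated with an increase of about 0.003 in ice cream consumption.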
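The coefficients can also be reproduced outside Excel with the ordinary least squares formulas: slope = Sxy / Sxx and intercept = mean(y) – slope × mean(x). The sketch below is plain Python using the 30 observations from the data set. The helper names `fit_line` and `predict` are illustrative, and the 70-degree prediction is just an example input.

```python
temps = [41, 56, 63, 68, 69, 65, 61, 47, 32, 24, 28, 26, 32, 40, 55,
         63, 72, 72, 67, 60, 44, 40, 32, 27, 28, 33, 41, 52, 64, 71]
icecream = [0.386, 0.374, 0.393, 0.425, 0.406, 0.344, 0.327, 0.288, 0.269, 0.256,
            0.286, 0.298, 0.329, 0.318, 0.381, 0.381, 0.470, 0.443, 0.386, 0.342,
            0.319, 0.307, 0.284, 0.326, 0.309, 0.359, 0.376, 0.416, 0.437, 0.548]


def fit_line(x, y):
    """Return (intercept, slope) of the least squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(temps, icecream)
print(f"icecream = {intercept:.2f} + {slope:.3f} * temp")


def predict(temp):
    """Predicted consumption at a given temperature, using the fitted line."""
    return intercept + slope * temp


print(round(predict(70), 3))  # example: predicted consumption at 70 degrees Fahrenheit
```

Predictions are only meaningful within the observed temperature range of 24 to 72 degrees.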