Create an ANOVA problem and a regression problem, and show all steps and results.
1)
ANOVA Problem
Research Question: Do teenagers in the three age groups spend equal time playing video games on weekends?
Null Hypothesis
$H_0: \mu_1 = \mu_2 = \mu_3$ (teenagers in all three groups spend equal time, on average, playing video games on weekends)
Alternate Hypothesis
$H_a$: Not all group means are equal
Data Given
| | Group 1 (10-12 Yrs Old) | Group 2 (13-15 Yrs Old) | Group 3 (16-18 Yrs Old) | Overall |
|---|---|---|---|---|
| Mean | 2 | 5 | 6 | 4.38 |
| Variance | 10 | 25 | 20 | |
| Std Dev | 3.16 | 5.00 | 4.47 | |
| Sample Size | 39 | 43 | 41 | 123 |
Overall Mean = (2*39+5*43+6*41)/(39+43+41) = 4.38
alpha = 0.05
Degrees of Freedom (a = number of groups, N = total sample size):
dfBetween = a - 1 = 3 - 1 = 2
dfWithin = N - a = 123 - 3 = 120
dfTotal = N - 1 = 123 - 1 = 122
Critical Value:
F critical (dfBetween, dfWithin) = F(2, 120) at alpha = 0.05 = 3.07
Decision Rule:
If F is greater than 3.07, reject the null hypothesis
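The critical value is normally read from an F table, but when dfBetween is exactly 2 the upper-tail probability of the F distribution has a closed form, so it can be checked without a table. The sketch below relies on that special case only; the function names are illustrative and not part of the original solution.

```python
def f_upper_tail_df1_2(x: float, df_within: int) -> float:
    """P(F > x) for an F distribution with 2 numerator df (closed form)."""
    return (1.0 + 2.0 * x / df_within) ** (-df_within / 2.0)


def f_critical_df1_2(alpha: float, df_within: int) -> float:
    """Value c with P(F > c) = alpha; valid only when dfBetween == 2."""
    return (df_within / 2.0) * (alpha ** (-2.0 / df_within) - 1.0)


alpha = 0.05
df_within = 120
f_crit = f_critical_df1_2(alpha, df_within)

print(f"F critical (2, {df_within}) at alpha = {alpha}: {f_crit:.2f}")
print(f"Check, upper-tail probability at that value: {f_upper_tail_df1_2(f_crit, df_within):.4f}")
```

For any other dfBetween this shortcut does not apply, and a table or a statistics library would be needed.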
Test Statistics:
Between-group sum of squares: $SS_{Between} = \sum_i n_i(\bar{x}_i - \bar{x})^2 = 39(2 - 4.38)^2 + 43(5 - 4.38)^2 + 41(6 - 4.38)^2 = 345.04$
Within-group sum of squares (SSE), from the group sample variances: $SS_{Within} = \sum_i (n_i - 1)s_i^2 = 38(10) + 42(25) + 40(20) = 2230$
Pooled standard deviation: $s_p = \sqrt{SS_{Within}/(N - a)} = \sqrt{2230/120} = 4.31$
SSBetween = 345.04
SSWithin = 2230.00
SSTotal = SSBetween + SSWithin = 2575.04
MS = SS/df
MSBetween = 345.04/2 = 172.52
MSWithin = 2230/120 = 18.58
F = MSBetween / MSWithin
Hence,
F = 172.52/18.58 = 9.28
| | SS | df | MS | F |
|---|---|---|---|---|
| Between | 345.04 | 2 | 172.52 | 9.28 |
| Within | 2,230.00 | 120 | 18.58 | |
| Total | 2,575.04 | 122 | | |
Result:
Our F = 9.28 is greater than the critical value in the decision rule, so we reject the null hypothesis.
Conclusion:
Not all means are equal: the average time spent playing video games on weekends differs for at least one of the three age groups.
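The one-way ANOVA table can be rebuilt from the summary statistics alone. The sketch below assumes the variances in the data table are sample variances (divisor n - 1), so each group contributes (n - 1) × variance to the within-group sum of squares; the variable names are illustrative.

```python
# Summary statistics from the data table: (mean, sample variance, sample size)
groups = {
    "Group 1": (2.0, 10.0, 39),
    "Group 2": (5.0, 25.0, 43),
    "Group 3": (6.0, 20.0, 41),
}

n_total = sum(n for _, _, n in groups.values())
grand_mean = sum(mean * n for mean, _, n in groups.values()) / n_total

# Between groups: squared deviations of group means from the grand mean,
# weighted by group size
ss_between = sum(n * (mean - grand_mean) ** 2 for mean, _, n in groups.values())

# Within groups: assumes each variance is a sample variance (n - 1 divisor)
ss_within = sum((n - 1) * var for _, var, n in groups.values())

df_between = len(groups) - 1
df_within = n_total - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(f"Grand mean = {grand_mean:.2f}")
print(f"Between: SS = {ss_between:.2f}, df = {df_between}, MS = {ms_between:.2f}")
print(f"Within:  SS = {ss_within:.2f}, df = {df_within}, MS = {ms_within:.2f}")
print(f"F = {f_stat:.2f}")
```

Working from summary statistics keeps the calculation short, but it cannot check the ANOVA assumptions (normality, equal variances); that would require the raw observations.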
2)
Regression Problem:
Suppose we want to predict ice cream consumption from temperature (in degrees Fahrenheit).
Data Set
| temp | icecream |
|---|---|
| 41 | 0.386 |
| 56 | 0.374 |
| 63 | 0.393 |
| 68 | 0.425 |
| 69 | 0.406 |
| 65 | 0.344 |
| 61 | 0.327 |
| 47 | 0.288 |
| 32 | 0.269 |
| 24 | 0.256 |
| 28 | 0.286 |
| 26 | 0.298 |
| 32 | 0.329 |
| 40 | 0.318 |
| 55 | 0.381 |
| 63 | 0.381 |
| 72 | 0.47 |
| 72 | 0.443 |
| 67 | 0.386 |
| 60 | 0.342 |
| 44 | 0.319 |
| 40 | 0.307 |
| 32 | 0.284 |
| 27 | 0.326 |
| 28 | 0.309 |
| 33 | 0.359 |
| 41 | 0.376 |
| 52 | 0.416 |
| 64 | 0.437 |
| 71 | 0.548 |
Solution:
Dependent Variable: Icecream
Independent Variable: Temperature
We expect the relationship to be positive: as the temperature increases, people consume more ice cream.
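Steps for Regression in Excel: with the Analysis ToolPak add-in enabled, choose Data > Data Analysis > Regression, set the icecream column as the Input Y Range and the temp column as the Input X Range, and run it.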
Regression Output
| Regression Statistics | |
|---|---|
| Multiple R | 0.7756 |
| R Square | 0.6016 |
| Adjusted R Square | 0.5874 |
| Standard Error | 0.0423 |
| Observations | 30 |

| ANOVA | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 0.08 | 0.08 | 42.28 | 0.00 |
| Residual | 28 | 0.05 | 0.00 | | |
| Total | 29 | 0.13 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 0.21 | 0.02 | 8.37 | 0.00 | 0.16 | 0.26 |
| temp | 0.003 | 0.00 | 6.50 | 0.00 | 0.00 | 0.00 |
Since Significance F is less than 0.05, we reject the null hypothesis that the slope is zero, i.e., the model as a whole is significant.
Also, the p-value of the temp coefficient is less than 0.05, so temperature is significant in explaining the variation in ice cream consumption. R Square = 0.6016 means temperature explains about 60% of that variation.
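The same quantities can be computed outside Excel with the ordinary least-squares formulas for one predictor. The following is a minimal sketch in plain Python using the 30 observations from the data set; the variable names are illustrative, and it computes only the point estimates and test statistics, not the p-values or confidence intervals.

```python
import math

temp = [41, 56, 63, 68, 69, 65, 61, 47, 32, 24, 28, 26, 32, 40, 55,
        63, 72, 72, 67, 60, 44, 40, 32, 27, 28, 33, 41, 52, 64, 71]
icecream = [0.386, 0.374, 0.393, 0.425, 0.406, 0.344, 0.327, 0.288, 0.269, 0.256,
            0.286, 0.298, 0.329, 0.318, 0.381, 0.381, 0.47, 0.443, 0.386, 0.342,
            0.319, 0.307, 0.284, 0.326, 0.309, 0.359, 0.376, 0.416, 0.437, 0.548]

n = len(temp)
mean_x = sum(temp) / n
mean_y = sum(icecream) / n

# Sums of squares and cross-products about the means
sxx = sum((x - mean_x) ** 2 for x in temp)
syy = sum((y - mean_y) ** 2 for y in icecream)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(temp, icecream))

# Least-squares estimates
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# ANOVA decomposition: total = regression + residual
ss_regression = slope * sxy
ss_residual = syy - ss_regression
r_squared = ss_regression / syy
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)
std_error = math.sqrt(ss_residual / (n - 2))       # residual standard error
f_stat = ss_regression / (ss_residual / (n - 2))   # 1 and n - 2 df
t_slope = slope / (std_error / math.sqrt(sxx))

print(f"Intercept = {intercept:.4f}, slope = {slope:.5f}")
print(f"R Square = {r_squared:.4f}, Adjusted R Square = {adj_r_squared:.4f}")
print(f"Standard Error = {std_error:.4f}, F = {f_stat:.2f}, t (slope) = {t_slope:.2f}")
```

With a single predictor, F equals the square of the slope's t statistic, which is why the two tests lead to the same conclusion here.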
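Regression Equation
Icecream = 0.21 + 0.003 × temp
Each additional degree Fahrenheit is associated with an increase of about 0.003 in predicted ice cream consumption.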
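As a usage example, the fitted equation can be wrapped in a small function to produce predictions. This is only a sketch: `predict_icecream` is a hypothetical helper name, it uses the rounded coefficients shown above, and predictions are only reasonable for temperatures inside the observed range of 24 to 72 degrees Fahrenheit.

```python
def predict_icecream(temp_f: float) -> float:
    """Predicted ice cream consumption from the rounded fitted equation."""
    return 0.21 + 0.003 * temp_f


# Temperatures chosen inside the observed range of the data set (24 to 72 F)
for temp_f in (30, 50, 70):
    print(f"temp = {temp_f} F -> predicted icecream = {predict_icecream(temp_f):.3f}")
```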