We now want to investigate whether the average occupancy rate in May differs across the three regions.
2.1 State the null and alternative hypotheses for the above research question.
2.2 Conduct a Levene test for the homogeneity of the variances at the 10% level using the absolute deviations from the median. Make sure you state both the null and alternative hypotheses and the conclusions of your test.
2.3 Test the null hypothesis in 1.1 at the 10% significance level.
2.4 What can you conclude from the above test in 2.3? Explain the importance of the results in 2.2 for the procedure you performed in 2.3.
region id 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3
OR_MAY |
60 |
86 |
93 |
89 |
74 |
81 |
83 |
71 |
90 |
83 |
77 |
82 |
90 |
81 |
20 |
87 |
48 |
60 |
45 |
80 |
65 |
60 |
75 |
15 |
16 |
97 |
74 |
62 |
40 |
82 |
24 |
49 |
16 |
60 |
42 |
68 |
55 |
75 |
35 |
0 |
40 |
40 |
10 |
83 |
50 |
77 |
81 |
37 |
27 |
49 |
53 |
60 |
80 |
58 |
64 |
65 |
68 |
75 |
55 |
60 |
56 |
10 |
85 |
4 |
24 |
85 |
75 |
44 |
45 |
0 |
34 |
35 |
70 |
65 |
15 |
40 |
10 |
10 |
35 |
50 |
2 |
0 |
3 |
30 |
15 |
83 |
91 |
85 |
80 |
50 |
79 |
92 |
87 |
84 |
65 |
86 |
62 |
70 |
87 |
87 |
50 |
61 |
59 |
77 |
46 |
81 |
48 |
15 |
80 |
52 |
90 |
90 |
75 |
20 |
10 |
30 |
53 |
52 |
90 |
53 |
48 |
84 |
90 |
35 |
25 |
35 |
10 |
10 |
60 |
70 |
3 |
10 |
10 |
75 |
10 |
Answer -
Answer for 2.1)
We want to test whether the average occupancy rate in May differs across the three regions, i.e. we test
Null Hypothesis H0: µ1 = µ2 = µ3 (the average occupancy rate is the same in all three regions) V/S
Alternative Hypothesis H1: µi ≠ µj for at least one pair (i,j) (the average occupancy rate of at least one region differs from the others)
Answer for 2.2) (Using R software)
Here we conduct a Levene test for the homogeneity of the variances at the 10% level, using the absolute deviations from the group medians.
Null Hypothesis H0: σ1² = σ2² = σ3² (the variance is the same in all three regions) V/S
Alternative Hypothesis H1: σi² ≠ σj² for at least one pair (i,j) (the variance of at least one region differs from the others)
Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   2  0.4049 0.6679
      132
Here the p-value = 0.6679 = 66.79%, which is greater than 10%, hence we fail to reject the null hypothesis at the 10% level of significance
and conclude that there is no evidence that the variances of the three regions differ.
R-Code is -
rm(list=ls())
library(car)
region_id=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,
3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,
3,3,3,3,3,3,3)
OR_MAY=c(60,86,93,89,74,81,83,71,90,83,77,82,90,81,20,87,48,60,45,
80,65,60,75,15,16,97,74,62,40,82,24,49,16,60,42,68,55,75,35,0,40,40,
10,83,50,77,81,37,27,49,53,60,80,58,64,65,68,75,55,60,56,10,85,4,24,
85,75,44,45,0,34,35,70,65,15,40,10,10,35,50,2,0,3,
30,15,83,91,85,80,50,79,92,87,84,65,86,62,70,87,
87,50,61,59,77,46,81,48,15,80,52,90,90,75,20,10,
30,53,52,90,53,48,84,90,35,25,35,10,10,60,70,3,
10,10,75,10)
# Levene test on absolute deviations from the group medians (center = median)
test=leveneTest(OR_MAY, as.factor(region_id), center = median)
test
Output of R Code is-
> test
Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   2  0.4049 0.6679
      132
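To make "absolute deviations from the median" concrete, here is a minimal sketch in base R, without the car package. It assumes the data are ordered as in the question (45 observations per region); the names region, z and levene_by_hand are introduced here for illustration only. The idea is that the Levene statistic with center = median is the F statistic of a one-way ANOVA run on the absolute deviations of each observation from its own group median.

```r
# Data as listed in the question: 45 observations per region, in order.
region <- factor(rep(1:3, each = 45))
OR_MAY <- c(60,86,93,89,74,81,83,71,90,83,77,82,90,81,20,87,48,60,45,
            80,65,60,75,15,16,97,74,62,40,82,24,49,16,60,42,68,55,75,35,0,40,40,
            10,83,50,77,81,37,27,49,53,60,80,58,64,65,68,75,55,60,56,10,85,4,24,
            85,75,44,45,0,34,35,70,65,15,40,10,10,35,50,2,0,3,
            30,15,83,91,85,80,50,79,92,87,84,65,86,62,70,87,
            87,50,61,59,77,46,81,48,15,80,52,90,90,75,20,10,
            30,53,52,90,53,48,84,90,35,25,35,10,10,60,70,3,
            10,10,75,10)
stopifnot(length(OR_MAY) == length(region))

# Absolute deviation of each observation from the median of its own region.
z <- abs(OR_MAY - ave(OR_MAY, region, FUN = median))

# One-way ANOVA on the deviations; its F statistic is the Levene statistic.
levene_by_hand <- anova(lm(z ~ region))
print(levene_by_hand)
```

The degrees of freedom and F value printed here can be compared with the leveneTest output above as a check that region was treated as a grouping factor.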
NOTE: Question 2.3 refers to the null hypothesis in 1.1, which is not given. The answer below assumes it means the null hypothesis stated in 2.1.
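Answer for 2.3 and 2.4) (Using R software)
Here we perform a one-way analysis of variance at the 10% significance level.
The ANOVA table first obtained is
Source of variation   Df  Sum Sq  Mean Sq  F value  Pr(>F)
region_id              1     476    476.1    0.616   0.434
Residuals            133  102780    772.8
This table has only 1 degree of freedom for region_id, which shows that region_id was entered as a numeric variable. It therefore tests a linear trend in the region codes, not the equality of three means. With 3 groups and 135 observations the one-way ANOVA must have 2 and 132 degrees of freedom, so region_id has to be entered as a factor, as.factor(region_id).
Recomputing from the data above (group means 60.29, 47.56 and 55.69) gives approximately
Source of variation   Df  Sum Sq  Mean Sq  F value  Pr(>F)
region_id              2    3742     1871     2.48   0.088
Residuals            132   99500      754
Here the p-value is about 0.088 = 8.8%, which is less than 10%, hence we reject the null hypothesis
H0: µ1 = µ2 = µ3 (the average occupancy rate is the same in all three regions)
at the 10% level and conclude that the average occupancy rate in May is not the same in all three regions. Note that the result is borderline: at the 5% level H0 would not be rejected.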
Levene's test checks whether k samples have equal variances. Equality of variances across samples is called homogeneity of variance.
Some statistical tests, including the analysis of variance, assume that the variances are equal across groups. The Levene test in 2.2 is used to verify that assumption: since it did not reject equal variances (p-value = 0.6679), the standard ANOVA F test in 2.3 is appropriate for these data.
In short, 2.3 compares the group means, while 2.2 compares the group variances, which is a precondition for the procedure in 2.3.
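The procedure in 2.3 and its dependence on 2.2 can be sketched in base R as follows. This is an illustrative sketch, not the code used for the table above: it assumes the data are ordered as in the question (45 observations per region), and the names region, fit, anova_table and welch are introduced here. The grouping variable is built as a factor so that the test has 2 and 132 degrees of freedom. Welch's test via oneway.test(var.equal = FALSE) is included only as a comparison that does not rely on equal variances.

```r
# Data as listed in the question: 45 observations per region, in order.
region <- factor(rep(1:3, each = 45))
OR_MAY <- c(60,86,93,89,74,81,83,71,90,83,77,82,90,81,20,87,48,60,45,
            80,65,60,75,15,16,97,74,62,40,82,24,49,16,60,42,68,55,75,35,0,40,40,
            10,83,50,77,81,37,27,49,53,60,80,58,64,65,68,75,55,60,56,10,85,4,24,
            85,75,44,45,0,34,35,70,65,15,40,10,10,35,50,2,0,3,
            30,15,83,91,85,80,50,79,92,87,84,65,86,62,70,87,
            87,50,61,59,77,46,81,48,15,80,52,90,90,75,20,10,
            30,53,52,90,53,48,84,90,35,25,35,10,10,60,70,3,
            10,10,75,10)

# Group means, for interpreting the direction of any difference.
print(tapply(OR_MAY, region, mean))

# Classical one-way ANOVA; relies on the equal-variance assumption checked in 2.2.
fit <- aov(OR_MAY ~ region)
anova_table <- summary(fit)[[1]]
print(anova_table)

# Welch's one-way test, which does not assume equal variances (comparison only).
welch <- oneway.test(OR_MAY ~ region, var.equal = FALSE)
print(welch)
```

If the Levene test in 2.2 had rejected equal variances, the Welch result would be the safer one to report. Here the classical F test is the one that answers 2.3.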