Hermit crabs live in shells, but they don't grow the shells themselves. They find an abandoned snail shell and make it their home. They inhabit shells of different snail species, but it's not clear if the hermit crabs choose shells of different species randomly or if they have a preference for certain snail species.
We can't just collect a bunch of hermit crabs and then conclude that the snail shell with the highest prevalence is the most preferred, because it might be the case that that snail species is simply more common. Instead, it makes more sense to collect a bunch of shells, and determine if they are occupied or unoccupied by a crab. If crabs have no preference, then the ratio of occupied to unoccupied should be the same across snail species.
For three snail species in the area, shells were collected and it was recorded whether they were inhabited by a hermit crab. The data:
Species | Occupied | Vacant |
Species 1 | 47 | 42 |
Species 2 | 10 | 41 |
Species 3 | 125 | 49 |
The question of interest is: do hermit crabs care about the species of shell they inhabit?
Step 2: State the null hypothesis.
Step 3: State the alternative hypothesis.
Step 4: What is the correct level of alpha?
Step 5: Which statistical test are you using?
Step 6: What is the value of the test statistic?
Step 6 continued: How many degrees of freedom in this test?
Step 7: What is the critical value for the test statistic?
Step 8: How does the test statistic compare to the critical value?
Step 9: Based on this comparison, do you accept or reject your null hypothesis?
Step 10: What do you conclude from this analysis?
Step 2: The null hypothesis is
H0: The ratio of occupied to unoccupied shells is the same for all three species.
(In symbols, H0: $p_1 = p_2 = p_3$, where $p_i$ is the proportion of occupied shells for species $i$.)
Step 3: The alternative hypothesis is
Ha: The ratio of occupied to unoccupied shells differs for at least two of the species.
(In symbols, Ha: $p_i \neq p_j$ for at least one pair $i \neq j$.)
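To see what the hypotheses are comparing, the sample proportions of occupied shells can be computed directly from the data. This is only a rough sketch; the variable names (`observed`, `proportions`, `pooled`) are made up for illustration.

```python
# Observed counts from the problem: species -> (occupied, vacant)
observed = {
    "Species 1": (47, 42),
    "Species 2": (10, 41),
    "Species 3": (125, 49),
}

# Sample proportion of occupied shells for each species
proportions = {
    species: occupied / (occupied + vacant)
    for species, (occupied, vacant) in observed.items()
}

# Pooled proportion: the estimate of the common value p1 = p2 = p3 under H0
total_occupied = sum(occ for occ, _ in observed.values())
total_shells = sum(occ + vac for occ, vac in observed.values())
pooled = total_occupied / total_shells

for species, p in proportions.items():
    print(f"{species}: {p:.3f}")
print(f"Pooled: {pooled:.3f}")
```

The three sample proportions come out around 0.53, 0.20 and 0.72 against a pooled value of about 0.58. The test then asks whether differences this large could plausibly arise by chance.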
Step 4: Level of significance, $\alpha = 0.05$
Step 5: We are using the chi-square test (as we are comparing proportions across more than two groups).
Step 6:
Test statistic: $\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$
where $O_i$: observed frequency
$E_i$: expected frequency
The contingency table of observed frequencies
Species | Occupied | Vacant | Total |
Species 1 | 47 | 42 | 89 |
Species 2 | 10 | 41 | 51 |
Species 3 | 125 | 49 | 174 |
Total | 182 | 132 | 314 |
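Expected frequency calculation
Expected frequency = (row total * column total) / grand total
E(47) = 89*182/314 = 51.59
E(42) = 89 - 51.59 = 37.41
and similarly for the remaining cells.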
The contingency table with expected frequencies
Species | Occupied | Vacant | Total |
Species 1 | 47 (51.59) | 42 (37.41) | 89 |
Species 2 | 10 (29.56) | 41 (21.44) | 51 |
Species 3 | 125 (100.85) | 49 (73.15) | 174 |
Total | 182 | 132 | 314 |
(The expected frequencies are in parentheses.)
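The expected frequencies can be reproduced from the row and column totals. The snippet below is a minimal sketch in plain Python; the names `observed`, `row_totals`, `col_totals` and `expected` are illustrative only.

```python
# Observed counts: rows are species 1-3, columns are (occupied, vacant)
observed = [
    [47, 42],
    [10, 41],
    [125, 49],
]

row_totals = [sum(row) for row in observed]         # [89, 51, 174]
col_totals = [sum(col) for col in zip(*observed)]   # [182, 132]
grand_total = sum(row_totals)                       # 314

# Expected count for each cell = row total * column total / grand total
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

for row in expected:
    print([round(e, 2) for e in row])
```

Rounded to two decimals, this gives the same values as the table above.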
Calculation of chi-square
Oi | Ei | (Oi-Ei)^2/Ei |
47 | 51.59 | 0.4084 |
42 | 37.41 | 0.5632 |
10 | 29.56 | 12.9429 |
41 | 21.44 | 17.8449 |
125 | 100.85 | 5.7831 |
49 | 73.15 | 7.9730 |
Total | | 45.5154 |
Thus
$\chi^2 = 45.51$
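As a check on the hand calculation, the statistic can be recomputed by summing (O - E)^2 / E over all six cells. This sketch uses unrounded expected counts, so individual terms may differ slightly in the last decimals from the hand-computed ones; the variable names are illustrative.

```python
# Observed counts: rows are species 1-3, columns are (occupied, vacant)
observed = [
    [47, 42],
    [10, 41],
    [125, 49],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum (O - E)^2 / E over every cell, with E = row total * column total / N
chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        chi_square += (o - e) ** 2 / e

print(round(chi_square, 2))
```

The result agrees with the value of about 45.51 obtained by hand.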
Step 6 continued
degrees of freedom = (3-1)*(2-1) = 2
Note: degrees of freedom = (number of rows - 1)*(number of columns - 1); the table has 3 rows (species) and 2 columns (occupied, vacant).
Step 7:
Critical value of chi-square at $\alpha = 0.05$ with 2 df:
$\chi^2_{0.05,\,2} = 5.99$ (from the chi-square table)
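The table value can be checked without a statistics library, because a chi-square distribution with exactly 2 degrees of freedom has the upper-tail probability P(X > x) = exp(-x/2). The sketch below relies on that special case, so it does not generalise to other degrees of freedom; in practice a function such as `scipy.stats.chi2.ppf` would normally be used instead.

```python
import math

n_rows, n_cols = 3, 2                 # 3 species, 2 outcomes (occupied, vacant)
df = (n_rows - 1) * (n_cols - 1)      # degrees of freedom
alpha = 0.05                          # significance level used in this problem

# Shortcut valid ONLY for df == 2: solve exp(-x / 2) = alpha for x
if df != 2:
    raise ValueError("closed-form critical value only holds for df = 2")
critical_value = -2 * math.log(alpha)

print(df, round(critical_value, 2))
```

This gives 2 degrees of freedom and a critical value of about 5.99, in agreement with the table.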
Step 8:
The calculated value $\chi^2 = 45.51$ is greater than the critical value 5.99.
Step 9:
Since the calculated value of $\chi^2$ exceeds 5.99,
we reject the null hypothesis.
Step 10:
At the 5% level of significance, there is sufficient evidence to conclude that hermit crabs have a preference for certain snail species.
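The same decision can be reached through a p-value instead of a critical value. The sketch below again uses the special case that, for 2 degrees of freedom, P(X > x) = exp(-x/2). It takes the statistic 45.51 from the worked solution as given, and `reject_null` is just an illustrative name.

```python
import math

chi_square = 45.51   # test statistic from the worked solution (df = 2)
alpha = 0.05         # significance level

# Upper-tail probability of a chi-square variable with 2 df (special case only)
p_value = math.exp(-chi_square / 2)

reject_null = p_value < alpha
print(f"p-value = {p_value:.2e}, reject H0: {reject_null}")
```

The p-value is far below 0.05, which is consistent with rejecting the null hypothesis.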