The University of California, Berkeley (Cal) and Stanford University are athletic archrivals in the Pacific 10 conference. Stanford fans claim Stanford's basketball team is better than Cal's team; Cal fans challenge this assertion. In 2004, Stanford University's basketball team went nearly undefeated within the Pac 10. Stanford's record, and those of Cal and the other eight teams in the conference, are listed in the accompanying table. In all, there were 89 games played among the Pac 10 teams in the season. Stanford won 17 of the 18 games it played; Cal won 9 of 18. We would like to use these data to test the Stanford fans' claim that Stanford's team is better than Cal's. That is, we would like to determine whether the difference between the two teams' performance reasonably could be attributed to chance, if the Stanford and Cal teams in fact have equal skill.
To test the hypothesis, we shall make a number of simplifying assumptions. First of all, we shall ignore the fact that some of the games were played between Stanford and Cal: we shall pretend that all the games were played against other teams in the conference. One strong version of the hypothesis that the two teams have equal skill is that the outcomes of the games would have been the same had the two teams swapped schedules. That is, suppose that when Washington played Stanford on a particular day, Stanford won. Under this strong hypothesis, had Washington played Cal that day instead of Stanford, Cal would have won.
A weaker version of the hypothesis is that the outcome of Stanford's games is determined by independent draws from a 0-1 box that has a fraction pS of tickets labeled "1" (Stanford wins the game if the ticket drawn is labeled "1"), that the outcome of Berkeley's games is determined similarly, by independent draws from a 0-1 box with a fraction pC of tickets labeled "1," and that pS = pC. This model has some shortcomings. (For instance, when Berkeley and Stanford play each other, the independence assumption breaks down, and the fraction of tickets labeled "1" would need to be 50%. Also, it seems unreasonable to think that the chance of winning does not depend on the opponent. We could refine the model, but that would require knowing more details about who played whom, and the outcome.)
Nonetheless, this model does shed some light on how surprising the records would be if the teams were, in some sense, equally skilled. This box model version allows us to use Fisher's Exact test for independent samples, considering "treatment" to be playing against Stanford, and "control" to be playing against Cal, and conditioning on the total number of wins by both teams (26).
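The conditioning described above can be sketched numerically. Given 26 total wins among the 36 games, the number of wins that fall in Stanford's 18 games follows a hypergeometric distribution under the null hypothesis. The function name and argument names below are illustrative assumptions, not part of the original problem, and this is only a minimal sketch of the one-sided Fisher's exact calculation.

```python
from math import comb

def hypergeom_upper_tail(total_games, total_wins, sample_games, observed_wins):
    """P(X >= observed_wins) when sample_games are drawn without replacement
    from total_games, of which total_wins are wins."""
    denominator = comb(total_games, sample_games)
    most_possible = min(sample_games, total_wins)
    numerator = sum(
        comb(total_wins, k) * comb(total_games - total_wins, sample_games - k)
        for k in range(observed_wins, most_possible + 1)
    )
    return numerator / denominator

# 36 games in all, 26 wins in all, Stanford played 18 and won 17.
p_one_sided = hypergeom_upper_tail(36, 26, 18, 17)
print(round(p_one_sided, 4))
```

Because both teams played 18 games, the two tails are mirror images, so doubling the one-sided value is one reasonable two-sided version.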
Q1) On the assumption that the null hypothesis is true, the bootstrap estimate of the standard error of the sample percentage of games won by Stanford is ?
Q2) On the assumption that the null hypothesis is true, the bootstrap estimate of the standard error of the sample percentage of games won by Cal is ?
Q3) The approximate P-value for z test against the two-sided alternative that the Stanford and Berkeley teams have different skills is ?
Note: The z-score for the difference in sample percentages is 2.97685.
We are given:
Stanford won 17 of the 18 games it played.
Sample size, n1 = 18
Games won, x1 = 17
Let p1 be the sample proportion of games won by Stanford:
p1 = x1/n1 = 17/18 ≈ 0.9444
Cal won 9 of the 18 games it played.
Sample size, n2 = 18
Games won, x2 = 9
Let p2 be the sample proportion of games won by Cal:
p2 = x2/n2 = 9/18 = 0.5
Claim: The difference between the two teams' performance reasonably could be attributed to chance, if the Stanford and Cal teams in fact have equal skill.
The hypothesis for testing is:
The null hypothesis: H0: p1 = p2
There is no difference between Stanford's team and Cal's team performance
The alternative hypothesis: Ha: p1 ≠ p2
There is a difference between Stanford's team and Cal's team performance
Q1) On the assumption that the null hypothesis is true, the bootstrap estimate of the standard error of the sample percentage of games won by Stanford is ?
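Under the null hypothesis the two teams have the same chance of winning, so the bootstrap estimate uses the pooled sample proportion p̂ rather than each team's own sample proportion:
p̂ = (x1 + x2)/(n1 + n2) = (17 + 9)/(18 + 18) = 26/36 ≈ 0.7222
Standard error of the sample percentage of games won by Stanford = √[p̂(1 - p̂)/n1]
= √[0.7222 (1 - 0.7222)/18]
= √0.01115
= 0.1056
Therefore, the standard error of the sample percentage of games won by Stanford is about 0.1056 (10.56%).
Q2) On the assumption that the null hypothesis is true, the bootstrap estimate of the standard error of the sample percentage of games won by Cal is ?
The standard error of the sample percentage of games won by Cal = √[p̂(1 - p̂)/n2]
= √[0.7222 (1 - 0.7222)/18]
= √0.01115
= 0.1056
Therefore, the standard error of the sample percentage of games won by Cal is also about 0.1056 (10.56%), because both teams played 18 games.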
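As a rough check on a standard error computed under the null hypothesis, the box model can be simulated. The sketch below assumes the common box has a fraction 26/36 of tickets labeled "1" (the pooled sample proportion), draws 18 tickets with replacement many times, and reports the spread of the simulated sample percentages. The function name, the number of replications, and the seed are arbitrary choices made for illustration.

```python
import random
from statistics import pstdev

def simulated_se(p_box=26 / 36, n_games=18, reps=20000, seed=0):
    """Estimate the SE of the sample percentage of wins by simulation:
    each game is an independent draw from a 0-1 box with fraction p_box of 1s."""
    rng = random.Random(seed)
    percentages = []
    for _ in range(reps):
        wins = sum(rng.random() < p_box for _ in range(n_games))
        percentages.append(wins / n_games)
    # Spread of the simulated sample percentages across replications.
    return pstdev(percentages)

se_sim = simulated_se()
print(round(se_sim, 3))
```

The simulated value should land close to √[p̂(1 - p̂)/18] with p̂ = 26/36, up to simulation noise.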
Q3) The approximate P-value for z test against the two-sided alternative that the Stanford and Berkeley teams have different skills is ?
The pooled sample proportion is
p̂ = (x1 + x2)/(n1 + n2)
p̂ = (17 + 9)/(18 + 18)
p̂ ≈ 0.7222
The test statistic, z = (p1 - p2)/√[p̂(1 - p̂)(1/n1 + 1/n2)]
z = (0.9444 - 0.5)/√[0.7222 (1 - 0.7222)(1/18 + 1/18)]
z = 0.44444/√0.02229
z = 0.44444/0.14930
z ≈ 2.9768
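The two-sided p-value is 2 × P(Z > 2.9768), which a z-table gives as 0.002913, or about 0.3%. A P-value this small is strong evidence against the null hypothesis that the two teams have equal skill.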
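The z test can be reproduced with the Python standard library as a check on the arithmetic. This is a minimal sketch that assumes the pooled proportion is used for the standard error under the null hypothesis; the variable names are illustrative. The normal tail area comes from `math.erfc`, using the identity 2 × P(Z > z) = erfc(z/√2).

```python
from math import sqrt, erfc

n1, x1 = 18, 17   # Stanford: games played, games won
n2, x2 = 18, 9    # Cal: games played, games won

p1 = x1 / n1
p2 = x2 / n2
pooled = (x1 + x2) / (n1 + n2)   # common win fraction under the null

# Standard errors of each sample percentage under the null hypothesis.
se1 = sqrt(pooled * (1 - pooled) / n1)
se2 = sqrt(pooled * (1 - pooled) / n2)

# The samples are treated as independent, so the variances add.
se_diff = sqrt(se1**2 + se2**2)
z = (p1 - p2) / se_diff

# Two-sided normal tail area.
p_two_sided = erfc(abs(z) / sqrt(2))

print(round(se1, 4), round(se2, 4), round(z, 4), round(p_two_sided, 4))
```

Carrying full precision through the calculation avoids the drift that appears when intermediate values such as 0.94 and 0.72 are rounded to two decimals.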