A baseball fan wanted to know if there is a difference between the number of games played in a World Series when the American League won the series versus when the National League won the series. From 1922 to 2012, the population standard deviation of games played in World Series ultimately won by the American League was 1.14, and the population standard deviation of games played in World Series ultimately won by the National League was 1.11. In 19 randomly selected World Series won by the American League, the mean number of games played was 5.84. The mean number of games played in 17 randomly selected World Series won by the National League was 5.57. Conduct a hypothesis test. (Use α = 0.05. For subscripts, let 1 = American League and 2 = National League.)
NOTE: If you are using a Student's t-distribution for the problem, including for paired data, you may assume that the underlying population is normally distributed. (In general, you must first prove that assumption, though.)
Part (a)
State the null hypothesis.
H0: μ1 > μ2
H0: μ1 ≥ μ2
H0: μ1 = μ2
H0: μ1 < μ2
H0: μ1 ≤ μ2
Part (b)
State the alternative hypothesis.
Ha: μ1 ≠ μ2
Ha: μ1 > μ2
Ha: μ1 ≤ μ2
Ha: μ1 < μ2
Ha: μ1 = μ2
Part (c)
In words, state what your random variable X̄1 − X̄2 represents.
X̄1 − X̄2 represents the difference between the number of World Series won by the American League and the National League.
X̄1 − X̄2 represents the mean difference between the number of games played in World Series won by the American League and those won by the National League.
X̄1 − X̄2 represents the difference between the mean number of games played in World Series won by the American League and those won by the National League.
X̄1 − X̄2 represents the difference between the mean number of World Series won by the American League and the National League.
Part (d)
State the distribution to use for the test. (Round your answers to three decimal places.)
X̄1 − X̄2 ~
Part (e)
What is the test statistic? (If using the z distribution, round your answer to two decimal places, and if using the t distribution, round your answer to three decimal places.)
[z or t] =
Part (f)
What is the p-value? (Round your answer to four decimal places.)
Explain what the p-value means for this problem.
If H0 is true, then there is a chance equal to the p-value that the mean number of games played in World Series won by the American League is at least 0.27 less than or 0.27 more than the mean number of games played in World Series won by the National League.
If H0 is false, then there is a chance equal to the p-value that the mean number of games played in World Series won by the American League is at least 0.27 less than or 0.27 more than the mean number of games played in World Series won by the National League.
If H0 is false, then there is a chance equal to the p-value that the difference between the mean number of games played in World Series won by the American League and those won by the National League is at most 0.27.
If H0 is true, then there is a chance equal to the p-value that the difference between the mean number of games played in World Series won by the American League and those won by the National League is at most 0.27.
Part (g)
Sketch a picture of this situation. Label and scale the horizontal axis and shade the region(s) corresponding to the p-value.
Part (h)
Indicate the correct decision ("reject" or "do not reject" the null hypothesis), the reason for it, and write an appropriate conclusion.
(i) Alpha (Enter an exact number as an integer, fraction, or decimal.)
(ii) Decision:
reject the null hypothesis
do not reject the null hypothesis
(iii) Reason for decision:
Since p-value < α, we reject the null hypothesis.
Since p-value > α, we do not reject the null hypothesis.
Since p-value > α, we reject the null hypothesis.
Since p-value < α, we do not reject the null hypothesis.
(iv) Conclusion:
There is sufficient evidence to show that the mean number of games played in World Series won by the American League is different from the mean number of games played in World Series won by the National League.
There is not sufficient evidence to show that the mean number of games played in World Series won by the American League is different from the mean number of games played in World Series won by the National League.
Part (i)
Explain how you determined which distribution to use.
The standard normal distribution was used because the samples are independent and the population standard deviation is known.
The t-distribution was used because the samples are independent and the population standard deviation is not known.
The standard normal distribution was used because the samples involve the difference in proportions.
The t-distribution was used because the samples are dependent.
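HYPOTHESIS TEST-
Let the random variables $X_1$ and $X_2$ denote the number of games played in a World Series won by the American League and by the National League, respectively.
The data come from two independent groups (series won by the American League and series won by the National League), and the problem gives the population standard deviations, $\sigma_1 = 1.14$ and $\sigma_2 = 1.11$. So, we perform a two-sample z-test for the difference between two means.
We test the null hypothesis
$H_0: \mu_1 = \mu_2$
against the alternative hypothesis
$H_a: \mu_1 \neq \mu_2$
Our test statistic is given by
$z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
Here,
First sample size $n_1 = 19$
Second sample size $n_2 = 17$
Sample mean of first sample $\bar{x}_1 = 5.84$
Sample mean of second sample $\bar{x}_2 = 5.57$
Standard error of the difference is given by
$\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} = \sqrt{\dfrac{1.14^2}{19} + \dfrac{1.11^2}{17}} = \sqrt{0.0684 + 0.0725} \approx 0.375$
Test statistic
$z = \dfrac{5.84 - 5.57}{0.3753} \approx 0.72$
p-value
$P(|Z| \geq 0.7194) \approx 0.4719$
[Using R-code 'pnorm(-0.719357)+1-pnorm(0.719357)']
Level of significance $\alpha = 0.05$
We reject our null hypothesis if p-value $< \alpha$.
Here, we observe that p-value $= 0.4719 > 0.05 = \alpha$.
So, we cannot reject our null hypothesis.
Hence, based on the given data, there is not sufficient evidence that the mean number of games played in a World Series won by the American League differs from the mean number played in a World Series won by the National League.
ANSWERS-
(a)
Null hypothesis is $H_0: \mu_1 = \mu_2$
(b)
Alternative hypothesis is $H_a: \mu_1 \neq \mu_2$
(c)
$\bar{X}_1 - \bar{X}_2$ represents the difference between the mean number of games played in World Series won by the American League and those won by the National League.
(d)
$\bar{X}_1 - \bar{X}_2 \sim N(0,\ 0.375)$
(e)
Test statistic is $z \approx 0.72$
(f)
p-value $\approx 0.4719$
Interpretation of p-value is as follows.
If $H_0$ is true, then there is a chance equal to the p-value that the mean number of games played in World Series won by the American League is at least 0.27 less than or 0.27 more than the mean number of games played in World Series won by the National League.
(g)
The sketch is a normal curve for $\bar{X}_1 - \bar{X}_2$ centered at 0, with both tails shaded: the region to the left of −0.27 and the region to the right of 0.27. The total shaded area equals the p-value.
(h)
(i) Alpha: 0.05
(ii) Decision: do not reject the null hypothesis.
(iii) Reason for decision: Since p-value > α, we do not reject the null hypothesis.
(iv) Conclusion: There is not sufficient evidence to show that the mean number of games played in World Series won by the American League is different from the mean number of games played in World Series won by the National League.
(i)
The standard normal distribution was used because the samples are independent and the population standard deviation is known.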
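Since the problem statement gives both population standard deviations, the z statistic and two-sided p-value can be checked numerically. The following is a minimal Python sketch using only the standard library; the function name `two_sample_z_test` and its parameter names are illustrative assumptions, not part of the original problem or answer.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean1, mean2, sigma1, sigma2, n1, n2):
    """Two-sided z-test of H0: mu1 = mu2 with known population SDs.

    Hypothetical helper written for this example only.
    """
    # Standard error of the difference between the two sample means.
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    # Test statistic under H0 (hypothesized difference of 0).
    z = (mean1 - mean2) / standard_error
    # Two-sided p-value: area in both tails beyond |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return standard_error, z, p_value


# Values taken from the problem statement (1 = American League, 2 = National League).
se, z, p = two_sample_z_test(5.84, 5.57, 1.14, 1.11, 19, 17)

alpha = 0.05
decision = "reject H0" if p < alpha else "do not reject H0"

print(f"standard error = {se:.3f}")
print(f"z = {z:.2f}")
print(f"p-value = {p:.4f}")
print(decision)
```

The p-value is computed from the unrounded z, so it may differ in the fourth decimal place from a value read off a table at z = 0.72.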