American League baseball teams play their games with a designated hitter rule, meaning that pitchers do not bat. The league believes that replacing the pitcher, traditionally a weak hitter, with another player in the batting order produces more runs and generally more interest among the fans. (I, being a non-sports fan, assume the National League does NOT use a designated hitter…someone correct me if this is not correct!) The average numbers of home runs hit per game for the 2011 season in the American and National Leagues are found in the file Baseball.
Baseball file:
American League | National League
1.500 | 1.354
1.267 | 1.314
1.230 | 1.160
1.186 | 1.110
1.144 | 1.095
1.060 | 1.062
1.037 | 0.987
0.987 | 0.950
0.913 | 0.948
0.903 | 0.941
0.880 | 0.919
0.789 | 0.862
0.786 | 0.799
0.708 | 0.774
      | 0.735
      | 0.596
a) Obtain boxplots of the two data sets. Be sure to display them on the same plot. Are both data sets normally distributed? Does the spread of the data look roughly the same in each group? In other words, can we use the pooled t-test legitimately?
b) State the null and alternate hypothesis we would use to test whether there is a significant difference in the number of home runs hit per game between the American and National Leagues.
c) Run the two sample t-test two ways. The un-pooled test and the pooled test. What is the p-value for each test?
d) Do both tests reach the same conclusion? Use a 5% level of significance. Did the American League’s use of a designated hitter make any difference to the number of home runs hit per game?
a)
Following are the box plots of the data:
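Both box plots are approximately symmetric and show approximately the same variation. Because the box plots are symmetric with no obvious outliers, both data sets appear to be roughly normally distributed, and because the spreads are similar, the equal-variance assumption looks reasonable. So it seems that the pooled t-test can be used legitimately.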
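Since the plot itself is not shown here, the numbers behind the two box plots can be checked with a short sketch like the one below. It assumes the Baseball file splits into 14 American League values and 16 National League values (the last two values, 0.735 and 0.596, are taken as National League); the function name `five_number_summary` is just an illustrative helper, and the quartile method is one of several conventions, so other software may give slightly different Q1 and Q3.

```python
import statistics

# Assumed split of the Baseball file: 14 American League, 16 National League values.
american = [1.500, 1.267, 1.230, 1.186, 1.144, 1.060, 1.037,
            0.987, 0.913, 0.903, 0.880, 0.789, 0.786, 0.708]
national = [1.354, 1.314, 1.160, 1.110, 1.095, 1.062, 0.987, 0.950,
            0.948, 0.941, 0.919, 0.862, 0.799, 0.774, 0.735, 0.596]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a box plot is drawn from."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


for name, data in (("American", american), ("National", national)):
    summary = [round(v, 3) for v in five_number_summary(data)]
    # Comparing mean to median gives a rough symmetry check;
    # comparing the two standard deviations gives a rough equal-spread check.
    print(name, summary,
          "mean =", round(statistics.mean(data), 3),
          "sd =", round(statistics.stdev(data), 3))
```

If the two standard deviations come out close to each other and each mean is close to its median, that supports the reading of the box plots given above.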
b)
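Let population 1 be the number of home runs hit per game by American League teams, and
population 2 be the number of home runs hit per game by National League teams.
The hypotheses are:
H0: μ1 = μ2 (there is no difference in the mean number of home runs per game between the two leagues)
Ha: μ1 ≠ μ2 (the mean number of home runs per game differs between the two leagues)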
c)
Un-pooled t test:
The p-value is 0.5064
Since the p-value is greater than 0.05, we fail to reject the null hypothesis.
Pooled t test:
The p-value is 0.5064
Since the p-value is greater than 0.05, we fail to reject the null hypothesis.
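For anyone who wants to reproduce the two tests without a statistics package, here is a minimal sketch using only the Python standard library. It assumes the same 14/16 split of the Baseball file (the last two values, 0.735 and 0.596, taken as National League). The helper names `two_sample_t` and `t_two_sided_p` are made up for this example, and the p-value is obtained by numerically integrating the t density rather than by a library routine, so treat it as an approximation.

```python
import math
import statistics

# Assumed split of the Baseball file: 14 American League, 16 National League values.
american = [1.500, 1.267, 1.230, 1.186, 1.144, 1.060, 1.037,
            0.987, 0.913, 0.903, 0.880, 0.789, 0.786, 0.708]
national = [1.354, 1.314, 1.160, 1.110, 1.095, 1.062, 0.987, 0.950,
            0.948, 0.941, 0.919, 0.862, 0.799, 0.774, 0.735, 0.596]


def t_two_sided_p(t, df, steps=2000):
    """Approximate two-sided p-value: 1 - P(-|t| < T < |t|) via Simpson's rule."""
    const = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    const /= math.sqrt(df * math.pi)

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central


def two_sample_t(x, y, pooled):
    """Return (t, df, p) for the pooled or un-pooled (Welch) two-sample t-test."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    diff = statistics.mean(x) - statistics.mean(y)
    if pooled:
        # One common variance estimate, df = n1 + n2 - 2.
        sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        # Separate variances, Welch-Satterthwaite degrees of freedom.
        a, b = v1 / n1, v2 / n2
        se = math.sqrt(a + b)
        df = (a + b) ** 2 / (a * a / (n1 - 1) + b * b / (n2 - 1))
    t = diff / se
    return t, df, t_two_sided_p(t, df)


for label, pooled in (("Un-pooled", False), ("Pooled", True)):
    t, df, p = two_sample_t(american, national, pooled)
    print(f"{label}: t = {t:.4f}, df = {df:.2f}, p = {p:.4f}")
```

Because the two sample sizes and the two sample variances are not identical, the pooled and un-pooled tests will generally give slightly different t statistics and p-values, even when they lead to the same decision. If SciPy is available, `scipy.stats.ttest_ind(american, national, equal_var=True)` and the same call with `equal_var=False` should give the pooled and un-pooled results directly.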
d)
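Yes, both tests reach the same conclusion: at the 5% level of significance, we fail to reject the null hypothesis.
There is not enough evidence to conclude that the American League’s use of a designated hitter made any difference to the number of home runs hit per game.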