a. Use an appropriate test to evaluate whether position in the batting order and bat speed contribute to the number of home runs (HR) hit per season. Keep in mind that we would like to be able to make inferences about the influence of bat speed across the entire range observed within Major League Baseball, not just the random bat speeds that we tested.
b. Make a single plot of the mean responses at each level of both factors and evaluate.
1) State all relevant hypotheses (nulls and alternatives).
2) State which test was used and why you used it.
3) State conclusions after completing your analyses.
batting order slot | bat speed | HR per season |
Top | Fast | 24 |
Middle | Fast | 41 |
Bottom | Fast | 21 |
Top | Fast | 15 |
Middle | Fast | 34 |
Bottom | Fast | 18 |
Top | Fast | 12 |
Middle | Fast | 29 |
Bottom | Fast | 14 |
Top | Slow | 13 |
Middle | Slow | 24 |
Bottom | Slow | 8 |
Top | Slow | 11 |
Middle | Slow | 27 |
Bottom | Slow | 12 |
Top | Slow | 6 |
Middle | Slow | 18 |
Bottom | Slow | 9 |
> data1=read.csv(file.choose(),header=T)
> names(data1)
[1] "batting.order.slot" "bat.speed" "HR.per.season"
> attach(data1)
The following objects are masked from data1 (pos = 4):
bat.speed, batting.order.slot, HR.per.season
b)
> library(lattice)
> xyplot(HR.per.season~batting.order.slot|bat.speed,pch=16)
> by(HR.per.season,list(batting.order.slot,bat.speed),FUN = mean)
: Bottom
: Fast
[1] 17.66667
---------------------------------------------------------------
: Middle
: Fast
[1] 34.66667
---------------------------------------------------------------
: Top
: Fast
[1] 17
---------------------------------------------------------------
: Bottom
: Slow
[1] 9.666667
---------------------------------------------------------------
: Middle
: Slow
[1] 23
---------------------------------------------------------------
: Top
: Slow
[1] 10
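Part b asks for a single plot of the mean response at each level of both factors. The xyplot call above draws the raw observations in separate panels, so the following is a minimal sketch of one way to put the six cell means on a single plot, using base R's interaction.plot. It assumes the data are typed in directly from the table in the question rather than read from the CSV file; the object name cell_means is introduced here for illustration only.

```r
# Data entered from the table in the question (assumed to match the CSV file).
data1 <- data.frame(
  batting.order.slot = factor(rep(c("Top", "Middle", "Bottom"), times = 6),
                              levels = c("Top", "Middle", "Bottom")),
  bat.speed = factor(rep(c("Fast", "Slow"), each = 9)),
  HR.per.season = c(24, 41, 21, 15, 34, 18, 12, 29, 14,
                    13, 24, 8, 11, 27, 12, 6, 18, 9)
)

# Cell means: one value per batting order slot x bat speed combination.
cell_means <- with(data1, tapply(HR.per.season,
                                 list(batting.order.slot, bat.speed), mean))
print(round(cell_means, 2))

# Single plot: batting order slot on the x-axis, one line per bat speed.
with(data1, interaction.plot(x.factor = batting.order.slot,
                             trace.factor = bat.speed,
                             response = HR.per.season,
                             fun = mean, type = "b", pch = 16,
                             xlab = "Batting order slot",
                             ylab = "Mean HR per season",
                             trace.label = "Bat speed"))
```

When reading such a plot, roughly parallel lines suggest little interaction between the two factors. From the means listed above, the Fast minus Slow gap is 7 for Top, 11.67 for Middle, and 8 for Bottom.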
a) 1. Null hypotheses -> H0A : $\mu_{1\cdot} = \mu_{2\cdot} = \mu_{3\cdot}$
H0B : $\mu_{\cdot 1} = \mu_{\cdot 2}$
Alternative hypotheses -> H1A : not H0A
H1B : not H0B,
where $\mu_{i\cdot}$ = mean HR per season for batting order slot i, i = 1, 2, 3 [1=Top, 2=Middle, 3=Bottom]
and $\mu_{\cdot j}$ = mean HR per season for bat speed j, j = 1, 2 [1=Fast, 2=Slow]
2. A two-way ANOVA with replication was used because there are two factors, batting order slot and bat speed, and there is more than one observation for each combination of the two factors.
3.
> model=aov(HR.per.season~batting.order.slot+bat.speed)
> summary(model)
                   Df Sum Sq Mean Sq F value   Pr(>F)
batting.order.slot  2  930.3   465.2   24.11 2.92e-05 ***
bat.speed           1  355.6   355.6   18.43 0.000744 ***
Residuals          14  270.1    19.3
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
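Each F value in the table is the factor's mean square divided by the residual mean square, and each p-value is the upper tail of the corresponding F distribution. The short check below is a sketch that recomputes these from the rounded sums of squares printed in the table, so small rounding differences from R's own output are expected. The variable names (ss, dof, ms, and so on) are introduced here for illustration.

```r
# Values copied from the printed ANOVA table (rounded as shown there).
ss  <- c(slot = 930.3, speed = 355.6, resid = 270.1)
dof <- c(slot = 2, speed = 1, resid = 14)

# Mean square = sum of squares / degrees of freedom.
ms <- ss / dof

# F ratio = factor mean square / residual mean square.
F_slot  <- ms[["slot"]]  / ms[["resid"]]
F_speed <- ms[["speed"]] / ms[["resid"]]

# p-value = upper-tail probability of the F distribution.
p_slot  <- pf(F_slot,  dof[["slot"]],  dof[["resid"]], lower.tail = FALSE)
p_speed <- pf(F_speed, dof[["speed"]], dof[["resid"]], lower.tail = FALSE)

print(round(c(F_slot = F_slot, F_speed = F_speed), 2))
print(signif(c(p_slot = p_slot, p_speed = p_speed), 3))
```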
Since the p-values for both factors are very small (2.92e-05 for batting order slot and 0.000744 for bat speed, both well below 0.05), we reject both null hypotheses and conclude that both batting order slot and bat speed significantly affect HR per season.
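The model above is additive and treats both factors as fixed. The question, however, describes the tested bat speeds as a random sample from the range seen in Major League Baseball, and a design with replication also allows an interaction term to be estimated. The following is a hedged sketch, not part of the original analysis, of how the interaction could be checked and how batting order slot could be tested against the interaction mean square, which is a common choice when the second factor is random. It assumes the data are typed in from the table in the question; the names full, tab, F_slot_mixed, and p_slot_mixed are introduced here for illustration.

```r
# Data entered from the table in the question (assumed to match the CSV file).
data1 <- data.frame(
  batting.order.slot = factor(rep(c("Top", "Middle", "Bottom"), times = 6)),
  bat.speed = factor(rep(c("Fast", "Slow"), each = 9)),
  HR.per.season = c(24, 41, 21, 15, 34, 18, 12, 29, 14,
                    13, 24, 8, 11, 27, 12, 6, 18, 9)
)

# Two-way ANOVA including the batting order slot x bat speed interaction.
full <- aov(HR.per.season ~ batting.order.slot * bat.speed, data = data1)
tab <- summary(full)[[1]]
print(tab)

# Rows of tab, in order: 1 = batting.order.slot, 2 = bat.speed,
# 3 = interaction, 4 = Residuals.
ms  <- tab[["Mean Sq"]]
dof <- tab[["Df"]]

# Assumption: bat speed is random, so the fixed factor (batting order slot)
# is tested against the interaction mean square instead of the residual.
F_slot_mixed <- ms[1] / ms[3]
p_slot_mixed <- pf(F_slot_mixed, dof[1], dof[3], lower.tail = FALSE)
print(c(F = F_slot_mixed, p = p_slot_mixed))
```

With only two bat speed levels, the interaction has 2 degrees of freedom, so a test that uses it as the denominator has little power. That limitation is worth stating alongside any conclusion drawn from it.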