The two rows of data below come from recorded weights of young mice raised on two different diets, labeled S1 and S2. Use these two datasets to address the following questions.
S1 | 5.85 | 6.85 | 7.16 | 5.43 | 5.03 | 6.48 | 3.89 | 5.44 | 6.88 | 5.37 |
S2 | 4.52 | 5.29 | 5.74 | 5.48 | 3.74 | 4.61 | 4.00 | 4.67 | 4.87 | 5.12 |
a) Imagine you want to conduct a conventional parametric test of H0: μ1 = μ2 versus H1: μ1 ≠ μ2. What test would you use, what assumption(s) is(are) required for the test you will use, show how you examine the assumption(s), and clearly state your conclusions from your examination of the assumption(s).
b) Test H0: μ1 = μ2, and clearly state your conclusion.
c) If the true difference in the mean values is 2.0, what is the power of the test given that α = 0.05?
d) If the true difference in the mean values is 1.0 and α = 0.05, what sample size is required to detect a difference in means of 1.0 if the power of the test must be at least 0.9?
e) Repeat the test of H0: μ1 = μ2 using a nonparametric test and state your conclusion.
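(a)
The conventional parametric test for comparing two independent means is the unpaired two-sample t-test. When the null hypothesis states that there is no difference between the two population means (i.e., d = 0), the null and alternative hypotheses are often stated in the following form.
H0: μ1 = μ2
Ha: μ1 ≠ μ2
The test rests on the following assumptions.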
Assumptions
1. The first assumption made regarding t-tests concerns the scale of measurement. The assumption for a t-test is that the scale of measurement applied to the data collected follows a continuous or ordinal scale.
2. The second assumption made is that of a simple random sample, that the data is collected from a representative, randomly selected portion of the total population.
3. The third assumption is that the data, when plotted, follow a normal (bell-shaped) distribution. When a normal distribution is assumed, one can specify a level of probability (the alpha level, or level of significance) as the criterion for rejecting the null hypothesis. In most cases, a 5% level is used.
4. The two samples are independent. There is no relationship between the individuals in one sample as compared to the other (as there is in the paired t-test).
5. The final assumption is homogeneity of variance. Homogeneous, or equal, variance exists when the standard deviations of the samples are approximately equal.
Note that the unpaired two-sample t-test can be used only under certain conditions:
· when the two groups of samples (A and B) being compared are normally distributed. This can be checked using the Shapiro-Wilk test.
· and when the variances of the two groups are equal. This can be checked using the F-test.
We will use RStudio to check the assumptions of normality and equal variances.
> s1<-c(5.85,6.85,7.16,5.43,5.03,6.48,3.89,5.44,6.88,5.37)
> s2<-c(4.52,5.29,5.74,5.48,3.74,4.61,4.00,4.67,4.87,5.12)
> data<-data.frame(s1,s2)
> data
     s1   s2
1  5.85 4.52
2  6.85 5.29
3  7.16 5.74
4  5.43 5.48
5  5.03 3.74
6  6.48 4.61
7  3.89 4.00
8  5.44 4.67
9  6.88 4.87
10 5.37 5.12
> #Shapiro-Wilk normality test for s1 and s2
> shapiro.test(data$s1)
Shapiro-Wilk normality test
data: data$s1
W = 0.9379, p-value = 0.5299
> shapiro.test(data$s2)
Shapiro-Wilk normality test
data: data$s2
W = 0.97454, p-value = 0.9294
From the output, both p-values are greater than the significance level 0.05, implying that neither distribution differs significantly from the normal distribution. In other words, we can assume normality.
We'll use the F-test to test for homogeneity of variances. This can be performed with the function var.test() as follows:
> var.test(data$s1,data$s2)
F test to compare two variances
data: data$s1 and data$s2
F = 2.581, num df = 9, denom df = 9, p-value = 0.174
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.6410917 10.3912172
sample estimates:
ratio of variances
2.581031
The p-value of the F-test is p = 0.174, which is greater than the significance level alpha = 0.05. In conclusion, there is no significant difference between the variances of the two sets of data. Therefore, we can use the classic t-test, which assumes equality of the two variances.
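To make the F-test above less of a black box, the following is a minimal sketch that recomputes the statistic by hand as the ratio of the two sample variances. The variable names (f_ratio, p_value) are illustrative, and the two-sided p-value is formed by doubling the smaller tail of the F distribution, which is assumed here to match what var.test() reports.

```r
# Data from the question
s1 <- c(5.85, 6.85, 7.16, 5.43, 5.03, 6.48, 3.89, 5.44, 6.88, 5.37)
s2 <- c(4.52, 5.29, 5.74, 5.48, 3.74, 4.61, 4.00, 4.67, 4.87, 5.12)

# F statistic: ratio of the sample variances
f_ratio <- var(s1) / var(s2)

# Degrees of freedom for numerator and denominator
df1 <- length(s1) - 1
df2 <- length(s2) - 1

# Two-sided p-value: twice the smaller tail probability
lower_tail <- pf(f_ratio, df1, df2)
p_value <- 2 * min(lower_tail, 1 - lower_tail)

print(c(F = f_ratio, p.value = p_value))
```

The printed values can be compared against the F and p-value lines of the var.test() output.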
(b)
We will use the function t.test() with var.equal=TRUE, since the equal-variance assumption was not rejected.
> t.test(data$s1,data$s2,mu=0,var.equal=TRUE)
Two Sample t-test
data: data$s1 and data$s2
t = 2.7365, df = 18, p-value = 0.01356
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.2401591 1.8278409
sample estimates:
mean of x mean of y
5.838 4.804
Since the p-value (0.01356) is less than 0.05, we reject the null hypothesis and conclude that the two means are not equal.
(c)
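As a check on the t.test() output, the pooled-variance t statistic can be reproduced step by step. This is a sketch under the equal-variance assumption used above; the names sp2, se, t_stat and p_value are illustrative and do not come from the original answer.

```r
# Data from the question
s1 <- c(5.85, 6.85, 7.16, 5.43, 5.03, 6.48, 3.89, 5.44, 6.88, 5.37)
s2 <- c(4.52, 5.29, 5.74, 5.48, 3.74, 4.61, 4.00, 4.67, 4.87, 5.12)

n1 <- length(s1)
n2 <- length(s2)
df <- n1 + n2 - 2

# Pooled variance: weighted average of the two sample variances
sp2 <- ((n1 - 1) * var(s1) + (n2 - 1) * var(s2)) / df

# Standard error of the difference in means
se <- sqrt(sp2 * (1 / n1 + 1 / n2))

# t statistic and two-sided p-value
t_stat <- (mean(s1) - mean(s2)) / se
p_value <- 2 * pt(-abs(t_stat), df = df)

print(c(t = t_stat, df = df, p.value = p_value))
```

The values should agree with the t, df and p-value lines reported by t.test().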
> power.t.test(n=10,delta=2,sig.level=0.05,
type="two.sample")
Two-sample t test power calculation
n = 10
delta = 2
sd = 1
sig.level = 0.05
power = 0.988179
alternative = two.sided
NOTE: n is number in *each* group
Hence the power of the test is 0.988179. Note that no sd was supplied, so power.t.test() used its default of sd = 1.
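The power calculation above relies on the default sd = 1. One possible refinement, shown here as a sketch and not as part of the original answer, is to plug in the pooled standard deviation estimated from the two samples. Using the sample estimate as if it were the true standard deviation is itself an assumption, and the names sp, pw_default and pw_pooled are illustrative.

```r
# Data from the question
s1 <- c(5.85, 6.85, 7.16, 5.43, 5.03, 6.48, 3.89, 5.44, 6.88, 5.37)
s2 <- c(4.52, 5.29, 5.74, 5.48, 3.74, 4.61, 4.00, 4.67, 4.87, 5.12)

n1 <- length(s1)
n2 <- length(s2)

# Pooled standard deviation from the two samples
sp <- sqrt(((n1 - 1) * var(s1) + (n2 - 1) * var(s2)) / (n1 + n2 - 2))

# Power with the default sd = 1 (as in the answer above)
pw_default <- power.t.test(n = 10, delta = 2, sig.level = 0.05,
                           type = "two.sample")

# Power with the pooled sd estimated from the data (an assumption)
pw_pooled <- power.t.test(n = 10, delta = 2, sd = sp, sig.level = 0.05,
                          type = "two.sample")

print(c(pooled_sd = sp,
        power_default = pw_default$power,
        power_pooled = pw_pooled$power))
```

Because the pooled standard deviation is below 1 for these data, the standardized effect is larger and the estimated power is higher than with the default.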
(d)
> power.t.test(delta=1,sig.level=0.05,power=0.9,
type="two.sample")
Two-sample t test power calculation
n = 22.0211
delta = 1
sd = 1
sig.level = 0.05
power = 0.9
alternative = two.sided
NOTE: n is number in *each* group
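Hence n = 22.0211. Since the sample size must be a whole number, this is rounded up to 23 mice in each group (again using the default sd = 1).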
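(e)
Part (e) is not worked in the answer above. A minimal sketch of one common nonparametric alternative to the two-sample t-test is the Wilcoxon rank-sum (Mann-Whitney) test, available in R as wilcox.test(). The choice of this particular test is an assumption; the name res is illustrative.

```r
# Data from the question
s1 <- c(5.85, 6.85, 7.16, 5.43, 5.03, 6.48, 3.89, 5.44, 6.88, 5.37)
s2 <- c(4.52, 5.29, 5.74, 5.48, 3.74, 4.61, 4.00, 4.67, 4.87, 5.12)

# Two-sided Wilcoxon rank-sum test for two independent samples.
# With small samples and no tied values, R computes an exact p-value.
res <- wilcox.test(s1, s2, alternative = "two.sided")

print(res)
```

If the reported p-value is below 0.05, the null hypothesis of no location difference between the two diets is rejected, in line with the t-test conclusion in part (b).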