Question


The two rows of data below come from recorded weights of young mice raised on two different diets, labeled S1 and S2. Use these two datasets to address the following questions.

S1 5.85 6.85 7.16 5.43 5.03 6.48 3.89 5.44 6.88 5.37
S2 4.52 5.29 5.74 5.48 3.74 4.61 4.00 4.67 4.87 5.12

a) Imagine you want to conduct a conventional parametric test of H0: μ1 = μ2 versus H1: μ1 not equal to μ2. What test
would you use, what assumption(s) is(are) required for the test you will use, show how you examine the
assumption(s), and clearly state your conclusions from your examination of the assumption(s).
b) Test H0: μ1 = μ2, and clearly state your conclusion.
c) If the true difference in the mean values is 2.0, what is the power of the test given that α = 0.05?
d) If the true difference in the mean values is 1.0 and α = 0.05, what sample size is required to detect a
difference in means of 1.0 if the power of the test must be at least 0.9?
e) Repeat the test of H0: μ1 = μ2 using a nonparametric test and state your conclusion.

Solutions

Expert Solution

(a) For this part we will use the two-sample t-test for a difference of means.

When the null hypothesis states that there is no difference between the two population means (i.e., the hypothesized difference is d = 0), the null and alternative hypotheses are often stated in the following form.

Ho: μ1 = μ2

Ha: μ1 ≠ μ2

The test relies on the following assumptions.

Assumptions

1. The first assumption concerns the scale of measurement: the data must be measured on a continuous or ordinal scale.

2. The second assumption made is that of a simple random sample, that the data is collected from a representative, randomly selected portion of the total population.

3. The third assumption is that the data in each group are approximately normally distributed, so that a plot of the data gives a bell-shaped curve. Under normality, one can fix a significance level (alpha) in advance as the criterion for rejecting the null hypothesis. In most cases, a 5% level is used.

4. The two samples are independent. There is no relationship between the individuals in one sample as compared to the other (as there is in the paired t-test).

5. The final assumption is homogeneity of variance. Homogeneous, or equal, variance exists when the standard deviations of the two samples are approximately equal.

Note that the unpaired two-sample t-test can be used only under certain conditions:

· when the two groups being compared (here S1 and S2) are normally distributed. This can be checked using the Shapiro-Wilk test.

· and when the variances of the two groups are equal. This can be checked using an F-test.

We will use RStudio to check the assumptions of normality and equal variances.

s1<-c(5.85,6.85,7.16,5.43,5.03,6.48,3.89,5.44,6.88,5.37)

s2<-c(4.52,5.29,5.74,5.48,3.74,4.61,4.00,4.67,4.87,5.12)

> data<-data.frame(s1,s2)

> data    

s1   s2

1 5.85 4.52

2 6.85 5.29

3 7.16 5.74

4 5.43 5.48

5 5.03 3.74

6 6.48 4.61

7 3.89 4.00

8 5.44 4.67

9 6.88 4.87

10 5.37 5.12

> # Shapiro-Wilk normality test for s1 and s2

> shapiro.test(data$s1)

Shapiro-Wilk normality test

data: data$s1

W = 0.9379, p-value = 0.5299

> shapiro.test(data$s2)

Shapiro-Wilk normality test

data: data$s2

W = 0.97454, p-value = 0.9294

From the output, both p-values are greater than the significance level 0.05, implying that the distributions of the data are not significantly different from the normal distribution. In other words, we can assume normality.
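With only 10 observations per group, the Shapiro-Wilk test has limited power to detect departures from normality, so a visual check is a common complement. The following is a minimal sketch, not part of the original solution, that draws normal Q-Q plots for both samples with base R graphics; the plot titles are arbitrary choices.

```r
# Data from the question
s1 <- c(5.85, 6.85, 7.16, 5.43, 5.03, 6.48, 3.89, 5.44, 6.88, 5.37)
s2 <- c(4.52, 5.29, 5.74, 5.48, 3.74, 4.61, 4.00, 4.67, 4.87, 5.12)

# Two panels side by side; restore the previous layout afterwards
op <- par(mfrow = c(1, 2))

# Points lying close to the reference line are consistent with normality
qqnorm(s1, main = "Normal Q-Q plot: S1")
qqline(s1)
qqnorm(s2, main = "Normal Q-Q plot: S2")
qqline(s2)

par(op)
```

Marked curvature or isolated points far from the line would be a reason to prefer a nonparametric test.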

We’ll use an F-test to test for homogeneity of variances. This can be performed with the function var.test() as follows:

> var.test(data$s1,data$s2)

F test to compare two variances

data: data$s1 and data$s2

F = 2.581, num df = 9, denom df = 9, p-value = 0.174

alternative hypothesis: true ratio of variances is not equal to 1

95 percent confidence interval:

  0.6410917 10.3912172

sample estimates:

ratio of variances

          2.581031

The p-value of the F-test is p = 0.174, which is greater than the significance level alpha = 0.05. In conclusion, there is no significant difference between the variances of the two sets of data. Therefore, we can use the classic t-test, which assumes equality of the two variances.
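To make the F-test above less of a black box, the statistic and its two-sided p-value can be reproduced by hand. This is a minimal sketch; the variable names (f_stat, p_upper, p_value) are illustrative and do not come from the original solution.

```r
# Data from the question
s1 <- c(5.85, 6.85, 7.16, 5.43, 5.03, 6.48, 3.89, 5.44, 6.88, 5.37)
s2 <- c(4.52, 5.29, 5.74, 5.48, 3.74, 4.61, 4.00, 4.67, 4.87, 5.12)

# F statistic: ratio of the two sample variances
f_stat <- var(s1) / var(s2)

# Degrees of freedom for numerator and denominator
df1 <- length(s1) - 1
df2 <- length(s2) - 1

# Two-sided p-value: twice the smaller of the two tail probabilities
p_upper <- pf(f_stat, df1, df2, lower.tail = FALSE)
p_value <- 2 * min(p_upper, 1 - p_upper)

f_stat
p_value
```

The values should agree with the var.test() output shown above.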

(b)

We will use the function t.test():

> t.test(data$s1,data$s2,mu=0,var.equal=TRUE)

Two Sample t-test

data: data$s1 and data$s2

t = 2.7365, df = 18, p-value = 0.01356

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

0.2401591 1.8278409

sample estimates:

mean of x mean of y

    5.838     4.804

Since the p-value (0.01356) is less than 0.05, we reject the null hypothesis and conclude that the two means are not equal.
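The t statistic reported by t.test() can be checked with the pooled-variance formula t = (mean1 - mean2) / sqrt(sp2 * (1/n1 + 1/n2)), where sp2 is the pooled variance. The sketch below carries out that arithmetic; the variable names are illustrative.

```r
# Data from the question
s1 <- c(5.85, 6.85, 7.16, 5.43, 5.03, 6.48, 3.89, 5.44, 6.88, 5.37)
s2 <- c(4.52, 5.29, 5.74, 5.48, 3.74, 4.61, 4.00, 4.67, 4.87, 5.12)

n1 <- length(s1)
n2 <- length(s2)

# Pooled variance: weighted average of the two sample variances
sp2 <- ((n1 - 1) * var(s1) + (n2 - 1) * var(s2)) / (n1 + n2 - 2)

# Standard error of the difference in means
se <- sqrt(sp2 * (1 / n1 + 1 / n2))

# Test statistic and degrees of freedom
t_stat <- (mean(s1) - mean(s2)) / se
df <- n1 + n2 - 2

# Two-sided p-value from the t distribution
p_value <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)

t_stat
p_value
```

The results should match t = 2.7365, df = 18 and p-value = 0.01356 from the output above.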

(c)

> power.t.test(n=10, delta=2, sig.level=0.05, type="two.sample")

     Two-sample t test power calculation

              n = 10

          delta = 2

             sd = 1

      sig.level = 0.05

          power = 0.988179

    alternative = two.sided

NOTE: n is number in *each* group

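Hence the power of the test is 0.988179. Note that this calculation uses sd = 1, the power.t.test() default, rather than a standard deviation estimated from the data.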
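The output above reports sd = 1 because no standard deviation was supplied. As a hedged alternative, the pooled sample standard deviation could be passed through the sd argument; the same consideration applies to the sample-size calculation in part (d). The sketch below assumes the pooled estimate is an acceptable stand-in for the unknown population standard deviation, which is a modelling choice and not something the original solution states.

```r
# Data from the question
s1 <- c(5.85, 6.85, 7.16, 5.43, 5.03, 6.48, 3.89, 5.44, 6.88, 5.37)
s2 <- c(4.52, 5.29, 5.74, 5.48, 3.74, 4.61, 4.00, 4.67, 4.87, 5.12)

# Pooled standard deviation (equal group sizes, so a simple average of variances)
sp <- sqrt((var(s1) + var(s2)) / 2)

# Part (c): power to detect a true difference of 2.0 with n = 10 per group
pw <- power.t.test(n = 10, delta = 2, sd = sp,
                   sig.level = 0.05, type = "two.sample")

# Part (d): sample size per group for a difference of 1.0 at power 0.9
ss <- power.t.test(delta = 1, sd = sp, sig.level = 0.05,
                   power = 0.9, type = "two.sample")

# Sample sizes must be whole numbers, so round up
n_per_group <- ceiling(ss$n)

pw$power
n_per_group
```

Because the pooled standard deviation here is smaller than 1, this version gives a higher power and a smaller required sample size than the default calculation.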

(d)

> power.t.test(delta=1, sig.level=0.05, power=0.9, type="two.sample")

     Two-sample t test power calculation

              n = 22.0211

          delta = 1

             sd = 1

      sig.level = 0.05

          power = 0.9

    alternative = two.sided

NOTE: n is number in *each* group

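Hence n = 22.0211 per group under the default sd = 1. Since the sample size must be a whole number, round up to 23 mice per group.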
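Part (e) of the question is not addressed in the solution above. One common nonparametric counterpart to the two-sample t-test is the Wilcoxon rank-sum (Mann-Whitney) test, available in base R as wilcox.test(). The following is a minimal sketch under that choice of test, not the original author's working.

```r
# Data from the question
s1 <- c(5.85, 6.85, 7.16, 5.43, 5.03, 6.48, 3.89, 5.44, 6.88, 5.37)
s2 <- c(4.52, 5.29, 5.74, 5.48, 3.74, 4.61, 4.00, 4.67, 4.87, 5.12)

# Two-sided Wilcoxon rank-sum test of H0: the two distributions are identical
# (no ties and small samples, so R computes an exact p-value)
res <- wilcox.test(s1, s2, alternative = "two.sided")

res$statistic  # W: number of (s1, s2) pairs in which the s1 value is larger
res$p.value    # compare with alpha = 0.05
```

For these data W = 81, and the two-sided p-value is below 0.05, so the nonparametric test points to the same conclusion as the t-test in part (b): the two diets differ in typical weight.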

