Week 5 Extra Credit Assignment
Collect two sets of data from two different related/similar
populations (for example: Male & female).
1) Test a claim about the population mean for one of the
sets.
2) Test a claim about the population standard deviation for one of
the sets.
3) Test a claim about both population standard
deviations.
4) Test a claim about both population means (for example the
mean for the one population equals the mean of the other
population).
You are responsible for coming up with the claims and
picking your alpha.
Suppose we have two datasets of human body temperature, summarized as follows.
Dataset 1: number of data points $n_1 = 101$, sample mean $\bar{x}_1 = 97.89$ degrees, sample standard deviation $s_1 = 0.73$ degrees.
Dataset 2: number of data points $n_2 = 101$, sample mean $\bar{x}_2 = 97.53$ degrees, sample standard deviation $s_2 = 0.69$ degrees.
1)
Let us test a claim about the population mean of the first dataset ($\mu$): whether it is 98.6 degrees or less than that.
For this we use a one-tailed (left-tailed) test.
First we form the hypotheses. Null hypothesis ($H_0$): $\mu = 98.6$ degrees,
and alternative hypothesis ($H_a$): $\mu < 98.6$ degrees.
Now that the hypotheses are formed, we calculate the test statistic $t$:
$t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
standard error $= s/\sqrt{n} = 0.73/\sqrt{101} = 0.0726$
$t = (97.89 - 98.6)/0.0726 = -9.77$
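The arithmetic for the standard error and the test statistic can be checked with a few lines of Python. This is only a sketch using the summary statistics quoted for dataset 1; the variable names are illustrative and not part of the assignment.

```python
import math

# Summary statistics for dataset 1, taken from the text.
n = 101
sample_mean = 97.89
sample_sd = 0.73
mu_0 = 98.6  # population mean assumed under the null hypothesis

# Standard error of the mean and the one-sample t statistic.
standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - mu_0) / standard_error

print(f"standard error = {standard_error:.4f}")
print(f"t = {t_stat:.2f}")
```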
Now we calculate the p-value, assuming the null hypothesis is true.
For this we look up the area below $t = -9.77$ in a t-distribution table with $df = n - 1 = 100$ degrees of freedom.
A statistic this extreme lies far beyond the tabulated critical values, so the p-value is < 0.001, which is well below the significance level $\alpha = 0.05$ (95% confidence level).
So we reject the null hypothesis and accept the alternative hypothesis that the population mean is less than 98.6 degrees.
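A printed table cannot show how small this tail area actually is. As a rough check, the area can be approximated by integrating the Student t density numerically. The sketch below uses only the standard library; the helper names `t_pdf` and `simpson`, the step count, and the upper integration limit of 60 are assumptions made for illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def simpson(f, a, b, steps=20000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

t_stat = -9.77
df = 100
alpha = 0.05

# By symmetry, P(T <= -9.77) = P(T >= 9.77). The density beyond 60 is
# negligible for df = 100, so the integral is truncated there (assumption).
p_value = simpson(lambda x: t_pdf(x, df), abs(t_stat), 60.0)

print(f"p-value = {p_value:.2e}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Integrating the tail directly, rather than subtracting a large area from 0.5, avoids losing the tiny result to floating-point cancellation.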
2)
To test a claim about a standard deviation, the test statistic follows a chi-square distribution with $n - 1$ degrees of freedom: $\chi^2 = \dfrac{(n-1)s^2}{\sigma_0^2}$
Let the claim be that the population standard deviation of the first dataset is at least 0.7 degrees.
So, null hypothesis ($H_0$): $\sigma = 0.7$
Alternative hypothesis ($H_a$): $\sigma < 0.7$
Significance level: $\alpha = 0.05$
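Test statistic: $\chi^2 = (101-1) \times 0.73^2/0.7^2 = 108.76$
Because $H_a$ is left-tailed, the p-value is the area to the left of 108.76 under the chi-square distribution with 100 degrees of freedom. The chi-square table gives a right-tail area of 0.258, so the p-value is $1 - 0.258 = 0.742$.
Since the p-value > 0.05, we fail to reject the null hypothesis. So there is not enough evidence to reject the claim that the standard deviation is at least 0.7 degrees.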
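The chi-square statistic and its tail areas can be reproduced without a table. The following is a minimal sketch that integrates the chi-square density numerically with the standard library; the helpers `chi2_pdf` and `simpson` are illustrative names, and the density formula as written assumes more than 2 degrees of freedom. Both tail areas are printed; with $H_a$: $\sigma < 0.7$ the left tail is the one that plays the role of the p-value.

```python
import math

def chi2_pdf(x, df):
    """Chi-square density with df degrees of freedom (assumes df > 2)."""
    if x <= 0:
        return 0.0
    log_pdf = ((df / 2 - 1) * math.log(x) - x / 2
               - (df / 2) * math.log(2) - math.lgamma(df / 2))
    return math.exp(log_pdf)

def simpson(f, a, b, steps=20000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

n = 101
sample_sd = 0.73
sigma_0 = 0.7  # standard deviation assumed under the null hypothesis
alpha = 0.05

chi2_stat = (n - 1) * sample_sd**2 / sigma_0**2
left_tail = simpson(lambda x: chi2_pdf(x, n - 1), 0.0, chi2_stat)
right_tail = 1.0 - left_tail

print(f"chi-square = {chi2_stat:.2f}")
print(f"left-tail area = {left_tail:.3f}, right-tail area = {right_tail:.3f}")
print("reject H0" if left_tail < alpha else "fail to reject H0")
```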
3)
Let the claim be that both populations have the same standard deviation, so $\sigma_1^2/\sigma_2^2 = 1$.
Sample variances are related to the chi-square distribution, and the ratio of two sample variances is related to the F-distribution.
Test statistic: $F = s_1^2/s_2^2$
If the first sample has the smaller variance, it is a left-tailed test; otherwise it is a right-tailed test.
Significance level: $\alpha = 0.05$
$F = 0.73^2/0.69^2 = 1.1193$
Since the first sample has the larger variance, it is a right-tailed test,
so, from the F-distribution table with (100, 100) degrees of freedom, p-value = 0.287.
Since the p-value > 0.05, we fail to reject the null hypothesis. So there is not enough evidence to reject the claim that both populations have the same standard deviation.
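The F ratio and its right-tail area can be checked in the same way. This sketch integrates the F density numerically using only the standard library; `f_pdf` and `simpson` are hypothetical helper names, and the density as written assumes more than 2 numerator degrees of freedom.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom (assumes d1 > 2)."""
    if x <= 0:
        return 0.0
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    log_pdf = ((d1 / 2) * math.log(d1 / d2) + (d1 / 2 - 1) * math.log(x)
               - ((d1 + d2) / 2) * math.log1p(d1 * x / d2) - log_beta)
    return math.exp(log_pdf)

def simpson(f, a, b, steps=20000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

n1, sd1 = 101, 0.73
n2, sd2 = 101, 0.69
alpha = 0.05

f_stat = sd1**2 / sd2**2
# Right-tail area: one minus the area from 0 up to the observed F ratio.
right_tail = 1.0 - simpson(lambda x: f_pdf(x, n1 - 1, n2 - 1), 0.0, f_stat)

print(f"F = {f_stat:.4f}")
print(f"right-tail p-value = {right_tail:.3f}")
print("reject H0" if right_tail < alpha else "fail to reject H0")
```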
4)
Let the claim be that both populations have the same mean.
So, null hypothesis ($H_0$): $\mu_1 = \mu_2$
Alternative hypothesis ($H_a$): $\mu_1 \neq \mu_2$
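Test statistic: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p\sqrt{1/n_1 + 1/n_2}}$ with $n_1 + n_2 - 2$ degrees of freedom, where $d_0$ is the difference between the population means under the null hypothesis, so $d_0 = 0$.
$s_p^2 = \dfrac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$ is called the pooled variance.
So $s_p^2 = (100 \times 0.73^2 + 100 \times 0.69^2)/(101+101-2) = 0.5045$, and $s_p = \sqrt{0.5045} = 0.7103$.
With $\sqrt{1/101 + 1/101} = 0.1407$, $t = ((97.89 - 97.53) - 0)/(0.7103 \times 0.1407) = 0.36/0.0999 = 3.60$
With 200 degrees of freedom the t-distribution is close to the normal distribution, so from the normal distribution table for a two-tailed test, p-value < 0.001.
Since the p-value is below $\alpha = 0.05$, the null hypothesis is rejected. So the population means of the two datasets are not equal.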
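The pooled two-sample calculation is easy to get wrong by hand, because the pooled standard deviation $s_p$ is the square root of the pooled variance 0.5045, not 0.5045 itself. The sketch below recomputes each step from the summary statistics and uses the standard library's `statistics.NormalDist` for the two-tailed p-value. The normal distribution stands in for the t-distribution with 200 degrees of freedom, which is an approximation, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Summary statistics for the two datasets, taken from the text.
n1, mean1, sd1 = 101, 97.89, 0.73
n2, mean2, sd2 = 101, 97.53, 0.69
d0 = 0.0  # difference in population means under the null hypothesis

# Pooled variance, then its square root (the pooled standard deviation).
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
pooled_sd = math.sqrt(pooled_var)

standard_error = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
t_stat = ((mean1 - mean2) - d0) / standard_error

# Two-tailed p-value using the normal approximation to t with 200 df.
p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"pooled variance = {pooled_var:.4f}, pooled sd = {pooled_sd:.4f}")
print(f"t = {t_stat:.2f}, approximate two-tailed p-value = {p_value:.5f}")
```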