Assume that we would like to test at significance level 0.01 whether there is enough evidence to claim that average height of children by the end of age three in families with low-socioeconomic status is less than the general average height for this age, which is 94 cm. Assume that height measurements by the end of age three follow a normal distribution with standard deviation 6 cm.
For this study, assume that 9 children (from families with low socioeconomic status) will be chosen to test the null hypothesis paired with the appropriate alternative hypothesis. Assume that this test is run twice, each time with a different sample data set. For the first sample the mean was measured as 92.0 cm, and the second sample mean was 94.2 cm. State the definition of p-value, and using this definition compute the p-value for each outcome, i.e. the p-value of 92.0 and the p-value of 94.2. Which of the two values provides stronger evidence in support of the claim about the height of children born to families with low socioeconomic status?
Objective: To test whether there is enough evidence to claim that average height of children by the end of age three in families with low-socioeconomic status is less than the general average height for this age, which is 94 cm.
a. Let $\mu$ denote the average height (in cm) of children by the end of age three in families with low socioeconomic status. The null and alternative hypotheses can be expressed as follows:
$H_0: \mu = 94$ vs. $H_1: \mu < 94$, at the 1% level of significance.
Since the problem states that the population standard deviation is known, the appropriate test for the above hypotheses is a one-sample Z test.
But before running this test, we must ensure that the data satisfies the assumptions of this test:
- The data is continuous
- The observations are from a simple random sample
- The data is normally distributed
- The population standard deviation is known
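Assuming that all the assumptions are satisfied, the test statistic is given by:
$Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
where $\bar{x}$ is the sample mean, $\mu_0 = 94$ is the mean under the null hypothesis, $\sigma = 6$ is the population standard deviation, and $n = 9$ is the sample size. The critical region for a left-tailed test is given by:
$Z < -z_{\alpha}$
From the standard normal table, looking for a left-tail area of 0.01:
$z_{0.01} \approx 2.33$
We reject the null hypothesis if Z < -2.33.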
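The table lookup for the critical value can also be checked numerically. The following is a minimal sketch using Python's standard library; the variable names `alpha` and `z_crit` are illustrative and not part of the original solution.

```python
from statistics import NormalDist

alpha = 0.01  # significance level from the problem statement

# Left-tailed critical value: the z with area alpha to its left under N(0, 1).
z_crit = NormalDist().inv_cdf(alpha)

print(f"Reject H0 if Z < {z_crit:.2f}")
```

The value agrees with the table value of -2.33 once rounded to two decimals.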
For Sample 1:
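Given: $\bar{x} = 92.0$, $\mu_0 = 94$, $\sigma = 6$, $n = 9$.
Substituting the given values,
$Z = \frac{92.0 - 94}{6 / \sqrt{9}} = \frac{-2}{2} = -1$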
Since Z = -1 > -2.33 does not lie in the rejection region, we fail to reject the null hypothesis.
By definition, the p-value is the probability, computed assuming the null hypothesis is true, of obtaining a result at least as extreme as the one actually observed, in the direction of the alternative hypothesis. For a left-tailed test, this is the probability that the test statistic is less than or equal to its observed value.
Here, for a left-tailed critical region, the p-value of the test is $P(Z \le -1)$. Since the normal table gives left-tail probabilities, the p-value can be read off directly:
P-value = $P(Z \le -1)$ = 0.15866 > 0.01
For Sample 2:
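Given: $\bar{x} = 94.2$, $\mu_0 = 94$, $\sigma = 6$, $n = 9$.
Substituting the given values,
$Z = \frac{94.2 - 94}{6 / \sqrt{9}} = \frac{0.2}{2} = 0.1$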
Since Z = 0.1 > -2.33 does not lie in the rejection region, we fail to reject the null hypothesis. To obtain the p-value of the test, we look for the probability $P(Z \le 0.1)$.
P-value = $P(Z \le 0.1)$ = 0.53983 > 0.01
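Both calculations follow the same recipe, so they can be reproduced with one small helper. This is a sketch under the problem's stated assumptions (normal population, known standard deviation of 6 cm, n = 9); the function name `left_tailed_z_test` is hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def left_tailed_z_test(sample_mean, mu0=94.0, sigma=6.0, n=9):
    """Return (z, p_value) for H0: mu = mu0 versus H1: mu < mu0."""
    # Standardize the sample mean using the known population sigma.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # For a left-tailed test the p-value is the left-tail area P(Z <= z).
    p_value = NormalDist().cdf(z)
    return z, p_value

for xbar in (92.0, 94.2):
    z, p = left_tailed_z_test(xbar)
    print(f"mean={xbar}: z={z:.2f}, p-value={p:.5f}, reject at 0.01: {p < 0.01}")
```

Running it should reproduce the p-values 0.15866 and 0.53983 obtained from the table, with neither below 0.01.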
We find that neither sample provides sufficient evidence in support of the claim about the height of children born to families with low socioeconomic status; i.e. neither result is significant at the 0.01 level. However, if we were to compare the two anyway, the first sample ($\bar{x} = 92.0$, p-value = 0.16) provides stronger evidence in support of the claim than the second sample ($\bar{x} = 94.2$, p-value = 0.54), because a smaller p-value means the observed result is less compatible with the null hypothesis.
In sample 1, a sample mean as low as or lower than the one observed would occur by chance about 16% of the time even if the true mean height were 94 cm; in sample 2, it would occur almost 54% of the time.
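This frequency interpretation can be illustrated with a small simulation. The sketch below assumes the null hypothesis is true (heights drawn from a normal distribution with mean 94 cm and standard deviation 6 cm), repeatedly draws samples of 9 children, and counts how often the sample mean falls at or below each observed value. The function name, trial count, and seed are arbitrary choices for illustration.

```python
import random

def fraction_at_or_below(threshold, mu0=94.0, sigma=6.0, n=9,
                         trials=20000, seed=0):
    """Estimate P(sample mean <= threshold) when H0 (mu = mu0) is true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw n heights under H0 and average them.
        sample_mean = sum(rng.gauss(mu0, sigma) for _ in range(n)) / n
        if sample_mean <= threshold:
            hits += 1
    return hits / trials

for xbar in (92.0, 94.2):
    print(f"P(mean <= {xbar}) under H0 is roughly {fraction_at_or_below(xbar):.3f}")
```

The simulated fractions should land close to the exact p-values of about 0.16 and 0.54, up to Monte Carlo error.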