The National Center of Education Statistics conducted a survey of high school seniors, collecting test data on reading, writing, and several other subjects. Here we examine a simple random sample of 200 students from this survey. Side-by-side box plots of reading and writing scores as well as a histogram of the differences in scores are shown below.
(a) Is there a clear difference in the average reading and writing scores?
(b) Are the reading and writing scores of each student independent of each other?
(c) Create hypotheses appropriate for the following research question: is there an evident difference in the average scores of students in the reading and writing exam?
(d) Check the conditions required to complete this test.
(e) The average observed difference in scores is $\bar{x}_{read-write} = 0.545$, and the standard deviation of the differences is 8.887 points. Do these data provide convincing evidence of a difference between the average scores on the two exams?
(f) What type of error might we have made? Explain what the error means in the context of the application.
(g) Based on the results of this hypothesis test, would you expect a confidence interval for the average difference between the reading and writing scores to include 0? Explain your reasoning.
Paired t-test: A paired t-test compares the means of two measurements taken on the same units, for example the same group measured at two different times. Because both measurements come from the same units, the two samples are dependent.
When two treatments or measurements are compared on the same observational unit rather than on two separate groups, the design is called a paired design. In a matched-pairs design, each observational unit in the study receives both treatments, and the treatments are assigned to the units at random.
Assumptions:
• Dependent variable is measured on a continuous scale.
• Each score in one sample is paired with a particular score in the other sample.
• The differences between the paired observations are approximately normally distributed.
Rejection rule:
If $t > t_{\alpha/2,\,n-1}$, then reject the null hypothesis $H_0$.
If $t < -t_{\alpha/2,\,n-1}$, then reject the null hypothesis $H_0$.
Confidence interval: A confidence interval is a range of values that is expected to contain the population parameter at a given confidence level. In other words, it is an interval estimate of the population parameter, calculated from the data using a point estimate and a chosen confidence level.
The confidence level is the long-run proportion of such intervals that would contain the population parameter if the sampling were repeated. It is usually denoted by $1 - \alpha$, and its value is chosen by the researcher. Some of the most common confidence levels are 90%, 95%, and 99%.
The margin of error quantifies the sampling error in a study: it is the largest amount by which the point estimate is expected to differ from the true population value at the given confidence level.
P-value: The p-value is the probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one observed.
Type I error: Rejecting the null hypothesis when it is actually true is called a Type I error. The probability of a Type I error is the level of significance, denoted $\alpha$.
Type II error: Failing to reject the null hypothesis when the alternative is true is called a Type II error. The probability of a Type II error is denoted $\beta$.
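The two error types can be illustrated with a small simulation. This is only a sketch under stated assumptions: the paired differences are drawn from a normal distribution with standard deviation 8.887, the sample size is 200, the critical value 1.972 is a t-table value for 199 degrees of freedom at the 5% level, and the "true" difference of 2.0 points is a hypothetical value chosen for illustration. The function names are illustrative, not from the original solution.

```python
import math
import random

T_CRIT = 1.972  # assumed table value: two-sided 5% critical value, df = 199


def rejects_h0(diffs):
    """Return True if a two-sided paired t-test rejects H0: mean difference = 0."""
    n = len(diffs)
    mean = math.fsum(diffs) / n
    var = math.fsum((d - mean) ** 2 for d in diffs) / (n - 1)
    t_stat = mean / math.sqrt(var / n)
    return abs(t_stat) > T_CRIT


def rejection_rate(true_mean, sd=8.887, n=200, trials=2000, seed=0):
    """Fraction of simulated samples in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = sum(
        rejects_h0([rng.gauss(true_mean, sd) for _ in range(n)])
        for _ in range(trials)
    )
    return rejections / trials


# H0 is true here, so every rejection is a Type I error (rate should be near 0.05).
type_1_rate = rejection_rate(true_mean=0.0)

# H0 is false here (hypothetical true difference of 2.0 points),
# so every failure to reject is a Type II error.
type_2_rate = 1 - rejection_rate(true_mean=2.0)

print(f"Estimated Type I error rate:  {type_1_rate:.3f}")
print(f"Estimated Type II error rate: {type_2_rate:.3f}")
```

The Type I rate is controlled by the chosen significance level, while the Type II rate depends on how large the true difference is relative to the standard error.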
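The formula for the paired t-test statistic, under the null hypothesis that the population mean difference is zero, is given below:
$$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$
where $\bar{d}$ denotes the sample mean of the differences, $s_d$ denotes the sample standard deviation of the differences, and $n$ is the number of pairs.
Degrees of freedom: $df = n - 1$
The formula for the confidence interval for the mean difference is
$$\bar{d} \pm t_{\alpha/2,\,n-1} \cdot \frac{s_d}{\sqrt{n}}$$
where $\bar{d}$ is the sample mean of the differences, $s_d$ is the standard deviation of the differences, and $n$ is the sample size.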
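The paired t statistic, its degrees of freedom, and a two-sided p-value can be sketched from summary statistics as follows. This is a minimal illustration, not the method used in the original solution: it assumes a null mean difference of 0 and approximates the t-distribution tail area with Simpson's rule using only the standard library (a statistics library's t distribution would normally be used instead). The function names are hypothetical; the input values are those given in part (e).

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|), using Simpson's rule on [0, |t_stat|]."""
    b = abs(t_stat)
    if b == 0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 <= T <= |t_stat|)
    return 1 - 2 * central_area


def paired_t_from_summary(mean_diff, sd_diff, n):
    """Return (t statistic, degrees of freedom, two-sided p-value)."""
    standard_error = sd_diff / math.sqrt(n)
    t_stat = mean_diff / standard_error
    df = n - 1
    return t_stat, df, two_sided_p_value(t_stat, df)


# Summary values from part (e): mean difference 0.545, SD 8.887, n = 200.
t_stat, df, p_value = paired_t_from_summary(0.545, 8.887, 200)
print(f"t = {t_stat:.3f}, df = {df}, p-value = {p_value:.3f}")
```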
Rejection rule using p-value:
If p-value $\le \alpha$, then reject the null hypothesis.
If p-value $> \alpha$, then do not reject the null hypothesis.
Rejection rule based on confidence interval:
• If the confidence interval contains the value zero, then the null hypothesis is not rejected.
• If the confidence interval does not contain the value zero, then the null hypothesis is rejected.
(1.a)
From the side-by-side box plots, the distributions of reading and writing scores largely overlap and have similar centers, so there is no clear difference in the average scores. This suggests an average difference close to zero.
(2.b)
The reading score and the writing score are both taken from each student. That is, each student is measured on both reading and writing. This indicates that the reading and writing scores are not independent of each other.
(3.c)
The hypotheses are stated below:
Let $\mu_d$ be the population mean difference between the reading and writing scores of students (reading minus writing).
Null hypothesis: $H_0: \mu_d = 0$
Alternative hypothesis: $H_a: \mu_d \neq 0$
(4.d)
In the given study, the reading and writing scores of the same students are compared.
• The reading and writing scores are measured on the same students. This implies that the samples are dependent.
• Each student's reading score is paired with that student's writing score.
• The students are a simple random sample of 200, so the differences for different students are independent of one another.
• The differences between the reading and writing scores are approximately normal, because the histogram of the differences (read - write) is roughly symmetric.
(5.e)
Instructions to find the test statistic and p-value by using MINITAB:
1. Choose Stat > Basic Statistics > Paired t.
2. Choose Summarized data.
3. Enter Sample size as 200, Mean as 0.545, Standard deviation as 8.887.
4. Choose Options.
5. In Confidence level, enter 95.
6. In Alternative, select not equal.
7. Click OK.
Following the above instructions produces the MINITAB output.
From the MINITAB output, the value of the test statistic is 0.87 and the p-value is 0.387.
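As a check on the software output, the test statistic can also be worked by hand from the summary values in part (e), assuming the usual paired t formula with a null mean difference of 0:

$$SE = \frac{s_d}{\sqrt{n}} = \frac{8.887}{\sqrt{200}} \approx 0.6284, \qquad t = \frac{\bar{d} - 0}{SE} = \frac{0.545}{0.6284} \approx 0.867, \qquad df = 200 - 1 = 199$$

This agrees with the reported test statistic of 0.87 after rounding.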
The conclusion is stated below:
Use the significance level $\alpha = 0.05$.
The p-value is 0.387.
That is, p-value $(0.387) > \alpha\ (0.05)$.
By the rejection rule, do not reject the null hypothesis.
Therefore, the data do not provide convincing evidence of a difference between the average scores of students on the reading and writing exams.
(6.f.1)
From part 5.e, the null hypothesis is not rejected.
Since the null hypothesis is not rejected, it is possible that a false null hypothesis was not rejected. The error that can occur in this situation is a Type II error.
(6.f.2)
The result of the study indicates that the null hypothesis is not rejected. That is, no significant difference was found between the average reading and writing scores.
If a Type II error was made, the test found no significant difference between the average reading and writing scores when there actually is a difference between them.
(7.g)
The null hypothesis was not rejected at the 5% significance level. By the rejection rule based on the confidence interval, the corresponding 95% confidence interval for the mean difference is therefore expected to contain the value zero.
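The interval itself can be sketched from the summary values in part (e). This is an illustrative calculation, not part of the original solution: the critical value 1.972 is an assumed t-table value for a two-sided 95% interval with 199 degrees of freedom, and the function name is hypothetical.

```python
import math

# Summary values from part (e).
MEAN_DIFF = 0.545
SD_DIFF = 8.887
N = 200
# Assumed table value for t_{0.025, 199} (two-sided 95% interval).
T_CRIT = 1.972


def paired_confidence_interval(mean_diff, sd_diff, n, t_crit):
    """Return (lower, upper) for mean_diff +/- t_crit * sd_diff / sqrt(n)."""
    margin = t_crit * sd_diff / math.sqrt(n)
    return mean_diff - margin, mean_diff + margin


lower, upper = paired_confidence_interval(MEAN_DIFF, SD_DIFF, N, T_CRIT)
contains_zero = lower <= 0 <= upper
print(f"95% CI: ({lower:.3f}, {upper:.3f}); contains 0: {contains_zero}")
```

Under these assumptions the interval runs from roughly -0.69 to 1.78, which includes 0 and is consistent with not rejecting the null hypothesis.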
Ans: Part 1.a: Thus, no, there is no clear difference in the average reading and writing scores.
Part 2.b: Thus, no, the reading and writing scores of each student are not independent of each other.
Part 3.c: Thus, the null hypothesis is $H_0: \mu_d = 0$ and the alternative hypothesis is $H_a: \mu_d \neq 0$, where $\mu_d$ is the population mean difference between the reading and writing scores.
Part 4.d: Yes, the conditions are satisfied to complete the test.
Part 5.e: Thus, do not reject the null hypothesis: the data do not provide convincing evidence of a difference between the average scores of students on the reading and writing exams.
Part 6.f.1: Thus, it is possible that a Type II error was made.
Part 6.f.2: In the context of the study, the error would be concluding that there is no difference in the scores when there actually is a difference.
Part 7.g: Based on the results of the hypothesis test, a confidence interval for the average difference between the reading and writing scores would be expected to include 0.