25) (AM#10) Calculate the 95% confidence interval for the
difference between the mean Anger-In score for the athletes and
non-athletes. What can you conclude? (relevant section)
Learning Objectives
1. State the assumptions for computing a confidence interval on
the difference between means
2. Compute a confidence interval on the difference between
means
3. Format data for computer analysis
It is much more common for a researcher to be interested in
the difference between means than in the specific values of the
means themselves. We take as an example the data from the "Animal
Research" case study. In this experiment, students rated (on a
7-point scale) whether they thought animal research is wrong. The
sample sizes, means, and variances are shown separately for males
and females in Table 1.
Table 1. Means and Variances in Animal Research study.
Condition n Mean Variance
Females 17 5.353 2.743
Males 17 3.882 2.985
As you can see, the females rated animal research as more
wrong than did the males. This sample difference between the female
mean of 5.35 and the male mean of 3.88 is 1.47. However, the gender
difference in this particular sample is not very important. What is
important is the difference in the population. The difference in
sample means is used to estimate the difference in population
means. The accuracy of the estimate is revealed by a confidence
interval.
In order to construct a confidence interval, we are going to
make three assumptions:
1. The two populations have the same variance. This assumption is
called the assumption of homogeneity of variance.
2. The populations are normally distributed.
3. Each value is sampled independently from each other
value.
The consequences of violating these assumptions are discussed
in a later section. For now, suffice it to say that
small-to-moderate violations of assumptions 1 and 2 do not make
much difference.
A confidence interval on the difference between means is
computed using the following formula:
Lower Limit = $M_1 - M_2 - (t_{CL})(s_{M_1 - M_2})$
Upper Limit = $M_1 - M_2 + (t_{CL})(s_{M_1 - M_2})$
where $M_1 - M_2$ is the difference between sample means, $t_{CL}$ is
the t for the desired level of confidence, and $s_{M_1 - M_2}$ is the
estimated standard error of the difference between sample means. The
meanings of these terms will be made clearer as the calculations are
demonstrated.
We continue to use the data from the "Animal Research" case
study and will compute a confidence interval on the difference
between the mean score of the females and the mean score of the
males. For this calculation, we will assume that the variances in
each of the two populations are equal.
The first step is to compute the estimate of the standard
error of the difference between means ($s_{M_1 - M_2}$). Recall from
the relevant section in the chapter on sampling distributions that the
formula for the standard error of the difference in means in the
population is:
$\sigma_{M_1 - M_2} = \sqrt{\frac{\sigma^2}{n} + \frac{\sigma^2}{n}} = \sqrt{\frac{2\sigma^2}{n}}$
In order to estimate this quantity, we estimate $\sigma^2$ and use
that estimate in place of $\sigma^2$. Since we are assuming the
population variances are the same, we estimate this variance by
averaging our two sample variances. Thus, our estimate of variance is
computed using the following formula:
$MSE = \frac{s_1^2 + s_2^2}{2}$
where MSE is our estimate of $\sigma^2$. In this example,
MSE = (2.743 + 2.985)/2 = 2.864.
Note that MSE stands for "mean square error" and is the mean
squared deviation of each score from its group's mean.
Since n (the number of scores in each condition) is 17,
$s_{M_1 - M_2} = \sqrt{\frac{2\,MSE}{n}} = \sqrt{\frac{(2)(2.864)}{17}} = 0.5805$.
The next step is to find the t to use for the confidence
interval (tCL). To calculate tCL, we need to know the degrees of
freedom. The degrees of freedom is the number of independent
estimates of variance on which MSE is based. This is equal to (n1 -
1) + (n2 - 1) where n1 is the sample size of the first group and n2
is the sample size of the second group. For this example, n1= n2 =
17. When n1= n2, it is conventional to use "n" to refer to the
sample size of each group. Therefore, the degrees of freedom is 16
+ 16 = 32.
Online: Calculator: Find t for confidence interval
From either the above calculator or a t table, you can find
that the t for a 95% confidence interval for 32 df is 2.037.
We now have all the components needed to compute the
confidence interval. First, we know the difference between
means:
M1 - M2 = 5.353 - 3.882 = 1.471
We know the standard error of the difference between means
is
$s_{M_1 - M_2} = 0.5805$
and that the t for the 95% confidence interval with 32 df
is
$t_{CL} = 2.037$
Therefore, the 95% confidence interval is
Lower Limit = 1.471 - (2.037)(0.5805) = 0.29
Upper Limit = 1.471 + (2.037)(0.5805) = 2.65
We can write the confidence interval as:
0.29 ≤ μf - μm ≤ 2.65
where μf is the population mean for females and μm is the
population mean for males. This analysis provides evidence that the
mean for females is higher than the mean for males, and that the
difference between means in the population is likely to be between
0.29 and 2.65.
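The steps of this calculation can be sketched in Python. This is a minimal illustration under the equal-variance, equal-n assumptions stated above; the function name `ci_equal_n` is hypothetical, and the critical value 2.037 is copied from the text rather than computed.

```python
import math

def ci_equal_n(m1, m2, var1, var2, n, t_cl):
    """Confidence interval on M1 - M2 for two groups of equal size n.

    Assumes homogeneity of variance, so the two sample variances
    are averaged to give MSE.
    """
    mse = (var1 + var2) / 2        # estimate of the common variance
    se = math.sqrt(2 * mse / n)    # estimated standard error of M1 - M2
    diff = m1 - m2                 # difference between sample means
    return diff - t_cl * se, diff + t_cl * se

# Summary statistics from Table 1 (females first, then males).
lower, upper = ci_equal_n(5.353, 3.882, 2.743, 2.985, n=17, t_cl=2.037)
print(round(lower, 2), round(upper, 2))  # 0.29 2.65
```

Passing the critical value in as an argument keeps the sketch free of third-party dependencies; a statistics library could supply it instead of a t table.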
Formatting Data for Computer Analysis
Most computer programs that compute t tests require your data
to be in a specific form. Consider the data in Table 2.
Table 2. Example Data.
Group 1 Group 2
3 5
4 6
5 7
Here there are two groups, each with three observations. To
format these data for a computer program, you normally have to use
two variables: the first specifies the group the subject is in and
the second is the score itself. For the data in Table 2, the
reformatted data look as follows:
Table 3. Reformatted Data.
G Y
1 3
1 4
1 5
2 5
2 6
2 7
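The reshaping from Table 2 to Table 3 can be sketched in a few lines of Python. The variable names `wide` and `long_rows` are illustrative only and are not tied to any particular statistics program.

```python
# Table 2 in "wide" form: one list of scores per group.
wide = {1: [3, 4, 5], 2: [5, 6, 7]}

# One (G, Y) row per observation, as in Table 3.
long_rows = [(g, y) for g, scores in wide.items() for y in scores]

print("G Y")
for g, y in long_rows:
    print(g, y)
```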
To use Analysis Lab to do the calculations, you would copy the
data and then
1. Click the "Enter/Edit User Data" button. (You may be warned
that for security reasons you must use the keyboard shortcut for
pasting data.)
2. Paste your data.
3. Click "Accept Data."
4. Set the Dependent Variable to Y.
5. Set the Grouping Variable to G.
6. Click the t-test confidence interval button.
The 95% confidence interval on the difference between means
extends from -4.267 to 0.267.
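This interval can be checked by hand, or with a short Python sketch that works from the reformatted (G, Y) rows. It is not a description of how Analysis Lab computes the result. The critical value 2.776 (95% confidence, 4 df) is an assumed t-table value that does not appear in the text above.

```python
import math
from statistics import mean, variance

# Reformatted data: (group, score) pairs.
rows = [(1, 3), (1, 4), (1, 5), (2, 5), (2, 6), (2, 7)]
group1 = [y for g, y in rows if g == 1]
group2 = [y for g, y in rows if g == 2]

n = len(group1)                                    # equal sample sizes
mse = (variance(group1) + variance(group2)) / 2    # average of sample variances
se = math.sqrt(2 * mse / n)                        # standard error of M1 - M2
t_cl = 2.776                                       # assumed table value, 4 df
diff = mean(group1) - mean(group2)                 # 4 - 6 = -2

ci = (diff - t_cl * se, diff + t_cl * se)
print(round(ci[0], 3), round(ci[1], 3))  # -4.267 0.267
```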
Computations for Unequal Sample Sizes (optional)
The calculations are somewhat more complicated when the sample
sizes are not equal. One consideration is that MSE, the estimate of
variance, counts the sample with the larger sample size more than
the sample with the smaller sample size. Computationally this is
done by computing the sum of squares error (SSE) as follows:
$SSE = \sum (X - M_1)^2 + \sum (X - M_2)^2$
where $M_1$ is the mean for group 1 and $M_2$ is the mean for group
2. Consider the following small example:
Table 4. Example Data.
Group 1 Group 2
3 2
4 4
5
M1 = 4 and M2 = 3.
$SSE = (3-4)^2 + (4-4)^2 + (5-4)^2 + (2-3)^2 + (4-3)^2 = 4$
Then, MSE is computed by: MSE = SSE/df
where the degrees of freedom (df) is computed as before:
df = (n1 -1) + (n2 -1) = (3-1) + (2-1) = 3.
MSE = SSE/df = 4/3 = 1.333.
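The formula
$s_{M_1 - M_2} = \sqrt{\frac{2\,MSE}{n}}$
is replaced by
$s_{M_1 - M_2} = \sqrt{\frac{2\,MSE}{n_h}}$
where $n_h$ is the harmonic mean of the sample sizes and is
computed as follows:
$n_h = \frac{2}{\frac{1}{n_1} + \frac{1}{n_2}} = \frac{2}{\frac{1}{3} + \frac{1}{2}} = 2.4$
and
$s_{M_1 - M_2} = \sqrt{\frac{(2)(1.333)}{2.4}} = 1.054$.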
tCL for 3 df and the 0.05 level = 3.182.
Therefore the 95% confidence interval is
Lower Limit = 1 - (3.182)(1.054)= -2.35
Upper Limit = 1 + (3.182)(1.054)= 4.35
We can write the confidence interval as:
-2.35 ≤ μ1 - μ2 ≤ 4.35
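The unequal-n procedure can be sketched as follows. This is a minimal illustration under the same assumptions as before; the function name `ci_unequal_n` is hypothetical, and the critical value 3.182 is copied from the text rather than computed.

```python
import math
from statistics import mean

def ci_unequal_n(group1, group2, t_cl):
    """Confidence interval on M1 - M2 when sample sizes may differ."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = mean(group1), mean(group2)
    # Sum of squared deviations of each score from its own group's mean.
    sse = sum((x - m1) ** 2 for x in group1) + sum((x - m2) ** 2 for x in group2)
    df = (n1 - 1) + (n2 - 1)
    mse = sse / df
    n_h = 2 / (1 / n1 + 1 / n2)      # harmonic mean of the sample sizes
    se = math.sqrt(2 * mse / n_h)    # estimated standard error of M1 - M2
    diff = m1 - m2
    return diff - t_cl * se, diff + t_cl * se

# Data from Table 4.
lower, upper = ci_unequal_n([3, 4, 5], [2, 4], t_cl=3.182)
print(round(lower, 2), round(upper, 2))  # -2.35 4.35
```

When the two groups are the same size, the harmonic mean equals n, so the same function reproduces the equal-n calculation.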
24)(AM#6c) Is there a difference in how much males and females
use aggressive behavior to improve an angry mood? For the
"Anger-Out" scores, compute a 99% confidence interval on the
difference between gender means.