(Computational) Applied Statistics
Problem 1: A noted medical researcher has suggested that a heart
attack is less likely to occur among adults who actively
participate in athletics. A random sample of 300 adults is
obtained. Of that total, 100 are found to be athletically active.
Within this group, 10 suffered heart attacks; among the 200
athletically inactive adults, 26 had suffered heart
attacks.
a) Test the hypothesis that the proportion of adults who are active
and suffered heart attacks is different from the proportion of
adults who are not active and suffered heart attacks.
b) Construct a 95% confidence interval for the difference between
the proportions of all active and inactive adults who suffered
heart attacks. What can you conclude and why?
Problem 2: The data below refer to aluminum contents
in soil at two different locations. A summary of the data is provided
in the table below. You may assume that the data are normally
distributed.
Location   n   Sample mean   Sample standard deviation
1          5   2935          235.7657
2          4   2637          741.8416
a) Give a 95% confidence interval for the mean aluminum contents at
location 1.
b) Give a 95% confidence interval for the mean aluminum contents at
location 2.
What are your conclusions from the confidence intervals in (a) and
(b), and why?
c) At α = 0.05, test H0: µ1 = µ2 versus H1: µ1 ≠ µ2.
d) Give the approximate p-value for the test in (c).
Problem 3: Nine students were randomly selected who had taken the
TOEFL test twice. A researcher would like to test the claim that
students who take the TOEFL test a second time score higher than
their first test.
Student              A    B    C    D    E    F    G    H    I
First TOEFL score    480  510  530  540  550  560  600  620  660
Second TOEFL score   460  500  530  520  580  580  560  640  690
Test the claim using a level of significance of 0.05 and construct
a 90% confidence interval for µd .
Problem 4: According to reported figures, the average price of a used
car nationally is $8,000 with a standard deviation of $4,500. A
student at An-Najah National University wants to purchase a used car
and wishes to find out if the average used car price in Nablus is
less than the national average. The student collected figures on a
random sample of 81 used car sales at dealerships across Nablus.
The sample mean price was $7,100.
a) State the null and alternative hypotheses, compute the test
statistic, find the p-value and what is your conclusion? Use α =
0.05.
b) If the actual mean of the prices is $7500, find the probability
of type II error.
c) What value of n is necessary to ensure that β = 0.10 when α =
0.05 and the actual mean is $7500?
Good Luck
Dr. Ali Barakat
Problem 1:
(a) Let p1 = true proportion of adults who are active and suffered heart attacks
p2 = true proportion of adults who are not active and suffered heart attacks
We test H0: p1 = p2 versus H1: p1 ≠ p2.
Minitab output:
Test and CI for Two Proportions
Sample X N Sample p
1 10 100 0.100000
2 26 200 0.130000
Difference = p (1) - p (2)
Estimate for difference: -0.03
95% CI for difference: (-0.105031, 0.0450310)
Test for difference = 0 (vs not = 0): Z = -0.75 P-Value = 0.451
Fisher's exact test: P-Value = 0.572
Test statistic: Z = -0.75
P-Value = 0.451
Since the P-value (0.451) > 0.05, we fail to reject H0 at the 5% level of significance and conclude that there is insufficient evidence that the proportion of adults who are active and suffered heart attacks is different from the proportion of adults who are not active and suffered heart attacks.
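As a cross-check on the Minitab figures, the test statistic can be recomputed by hand. The sketch below assumes the pooled standard error under H0, which appears consistent with the reported Z = -0.75; the function name two_proportion_z is illustrative and not part of the original answer.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 = p2 using the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion estimated under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * NormalDist().cdf(-abs(z))  # both tails of the standard normal
    return z, p_value

# 10 of 100 active adults and 26 of 200 inactive adults had heart attacks
z, p_value = two_proportion_z(10, 100, 26, 200)
print(f"Z = {z:.2f}, P-value = {p_value:.3f}")
```

The printed values should agree with the Minitab output up to rounding.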
(b) 95% confidence interval for the difference between the proportions of all active and inactive adults who suffered heart attacks: (-0.1050, 0.0450).
Since the confidence interval contains zero, we reach the same conclusion as in part (a): the data do not show a difference between the two proportions.
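The interval can be reproduced with the usual large-sample formula. This is a minimal sketch assuming the unpooled standard error, which matches the Minitab limits; the name diff_proportion_ci is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportion_ci(x1, n1, x2, n2, level=0.95):
    """Large-sample confidence interval for p1 - p2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se

lower, upper = diff_proportion_ci(10, 100, 26, 200)
print(f"95% CI for p1 - p2: ({lower:.4f}, {upper:.4f})")
```

Because the lower limit is negative and the upper limit is positive, zero lies inside the interval.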
Problem 2:
(a) Minitab output:
N Mean StDev SE Mean 95% CI
5 2935 235.7657 105 (2642, 3228)
95% confidence interval for the mean aluminum contents at location 1: (2642, 3228).
(b) Minitab output:
N Mean StDev SE Mean 95% CI
4 2637 741.8416 371 (1457, 3817)
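95% confidence interval for the mean aluminum contents at location 2: (1457, 3817).
Conclusion: the interval for location 1 lies entirely inside the much wider interval for location 2, so the two intervals give no indication that the mean aluminum contents differ. The location 2 interval is wider because that sample has a larger standard deviation and a smaller size.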
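Both one-sample intervals follow from mean ± t × s/√n. The sketch below rebuilds them from the summary statistics; the critical values are copied from a standard t table (an assumption of this example, not computed here), and the names T_CRIT_95 and mean_ci_95 are illustrative.

```python
from math import sqrt

# Two-sided 95% critical values t(0.975, df), taken from a standard t table
T_CRIT_95 = {3: 3.182, 4: 2.776}

def mean_ci_95(n, mean, sd):
    """95% t-interval for a mean from summary statistics (df = n - 1)."""
    se = sd / sqrt(n)               # standard error of the mean
    margin = T_CRIT_95[n - 1] * se  # half-width of the interval
    return mean - margin, mean + margin

ci_loc1 = mean_ci_95(5, 2935, 235.7657)
ci_loc2 = mean_ci_95(4, 2637, 741.8416)
print("Location 1: (%.0f, %.0f)" % ci_loc1)
print("Location 2: (%.0f, %.0f)" % ci_loc2)
```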
(c)
Test and CI for Two Variances
Method
Null hypothesis Sigma(1) / Sigma(2) = 1
Alternative hypothesis Sigma(1) / Sigma(2) not = 1
Significance level Alpha = 0.05
Statistics
Sample N StDev Variance
1 5 235.766 55585.465
2 4 741.842 550328.959
Ratio of standard deviations = 0.318
Ratio of variances = 0.101
Test
Method DF1 DF2 Statistic P-Value
F Test (normal) 4 3 0.10 0.051
Since the P-value (0.051) > 0.05, we fail to reject the hypothesis of equal population variances, so the pooled two-sample t-test is used.
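The F statistic is simply the ratio of the two sample variances. The sketch below recomputes it and approximates the two-sided p-value by numerically integrating the F density with Simpson's rule. It assumes the two-sided p-value is twice the smaller tail area, which is consistent with the reported 0.051; the helper names f_pdf and simpson are illustrative.

```python
from math import gamma

def f_pdf(x, d1, d2):
    """Density of the F(d1, d2) distribution at x >= 0 (finite at 0 when d1 >= 2)."""
    beta = gamma(d1 / 2) * gamma(d2 / 2) / gamma((d1 + d2) / 2)
    return ((d1 / d2) ** (d1 / 2) * x ** (d1 / 2 - 1)
            * (1 + d1 * x / d2) ** (-(d1 + d2) / 2)) / beta

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

s1, n1 = 235.7657, 5
s2, n2 = 741.8416, 4
f_stat = s1 ** 2 / s2 ** 2     # ratio of sample variances
d1, d2 = n1 - 1, n2 - 1        # numerator and denominator degrees of freedom
lower_tail = simpson(lambda x: f_pdf(x, d1, d2), 0.0, f_stat)
p_value = 2 * min(lower_tail, 1 - lower_tail)  # assumed two-sided convention
print(f"F = {f_stat:.3f}, P-value = {p_value:.3f}")
```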
Two-Sample T-Test and CI
Sample N Mean StDev SE Mean
1 5 2935 236 105
2 4 2637 742 371
Difference = mu (1) - mu (2)
Estimate for difference: 298
95% CI for difference: (-523, 1119)
T-Test of difference = 0 (vs not =): T-Value = 0.86 P-Value = 0.419 DF = 7
Both use Pooled StDev = 517.3185
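(d) The approximate p-value is 0.419. Since the p-value > 0.05, we fail to reject H0 in (c): there is insufficient evidence that the mean aluminum contents differ between the two locations.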
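The pooled standard deviation, the t statistic and its p-value can be rebuilt from the summary statistics. This is a rough sketch: the p-value is obtained by integrating the t density with Simpson's rule rather than from a library routine, and the helper names t_pdf and simpson are illustrative.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

n1, m1, s1 = 5, 2935, 235.7657
n2, m2, s2 = 4, 2637, 741.8416
df = n1 + n2 - 2
# Pooled standard deviation under the equal-variance assumption
sp = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
t_stat = (m1 - m2) / (sp * sqrt(1 / n1 + 1 / n2))
# Two-sided p-value: 1 minus the central area between -|t| and |t|
p_value = 1 - 2 * simpson(lambda x: t_pdf(x, df), 0.0, abs(t_stat))
print(f"sp = {sp:.4f}, T = {t_stat:.2f}, DF = {df}, P-value = {p_value:.3f}")
```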
Problem 3:
Minitab output:
Paired T for First TOEFL score - Second TOEFL score
N Mean StDev SE Mean
First TOEFL score 9 561.1 56.4 18.8
Second TOEFL score 9 562.2 70.8 23.6
Difference 9 -1.11 25.22 8.41
95% upper bound for mean difference: 14.52
T-Test of mean difference = 0 (vs < 0): T-Value = -0.13 P-Value = 0.449
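Here µd is the mean of (first score - second score), so the claim corresponds to H1: µd < 0. Since the P-value (0.449) > 0.05, we fail to reject H0: there is insufficient evidence to support the claim that students who take the TOEFL test a second time score higher than on their first test.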
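The paired test reduces to a one-sample t-test on the differences (first score minus second score). The sketch below recomputes it from the raw scores in the problem statement; the lower-tail p-value is approximated by Simpson integration of the t density, and the helper names t_pdf and simpson are illustrative.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

first = [480, 510, 530, 540, 550, 560, 600, 620, 660]
second = [460, 500, 530, 520, 580, 580, 560, 640, 690]
diffs = [a - b for a, b in zip(first, second)]  # first minus second

n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)                 # sample standard deviation of the differences
t_stat = d_bar / (s_d / sqrt(n))
# Lower-tail p-value P(T <= t) for H1: mu_d < 0, using symmetry about 0
area = simpson(lambda x: t_pdf(x, n - 1), 0.0, abs(t_stat))
p_value = 0.5 - area if t_stat < 0 else 0.5 + area
print(f"mean = {d_bar:.2f}, sd = {s_d:.2f}, T = {t_stat:.2f}, P-value = {p_value:.3f}")
```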
Minitab output:
Paired T-Test and CI: First TOEFL score, Second TOEFL score
Paired T for First TOEFL score - Second TOEFL score
N Mean StDev SE Mean
First TOEFL score 9 561.1 56.4 18.8
Second TOEFL score 9 562.2 70.8 23.6
Difference 9 -1.11 25.22 8.41
90% CI for mean difference: (-16.74, 14.52)
T-Test of mean difference = 0 (vs not = 0): T-Value = -0.13 P-Value = 0.898
90% confidence interval for µd: (-16.74, 14.52). The interval contains zero, which agrees with the test result.
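The 90% interval is the mean difference ± t × s_d/√n with 8 degrees of freedom. In the sketch below, the critical value t(0.95, 8) = 1.8595 is copied from a standard t table (an assumption of this example), and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

first = [480, 510, 530, 540, 550, 560, 600, 620, 660]
second = [460, 500, 530, 520, 580, 580, 560, 640, 690]
diffs = [a - b for a, b in zip(first, second)]  # first minus second

T_CRIT_90_DF8 = 1.8595  # t(0.95, 8) from a standard t table (assumed value)

n = len(diffs)
se = stdev(diffs) / sqrt(n)   # standard error of the mean difference
margin = T_CRIT_90_DF8 * se
lower, upper = mean(diffs) - margin, mean(diffs) + margin
print(f"90% CI for the mean difference: ({lower:.3f}, {upper:.3f})")
```

The limits should match the Minitab interval to two decimal places.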