1. A random sample of 36 skeletal remains from females was taken from data stored in the Forensic Anthropology Data Bank (FDB) at the University of Tennessee. The femur lengths (right leg) in millimeters are recorded below.
432 432 435 460 432 440
448 449 434 443 525 451
448 443 450 467 436 423
475 435 433 438 453 438
435 413 439 442 507 424
468 419 434 483 448 514
a. Determine the sample mean and standard deviation.
b. Since the sample size is large, we can use the sample standard deviation s in place of σ in calculations of confidence intervals.
c. Before doing any calculations, think about a 90%, 95% and 99% confidence interval for µ, the mean femur bone length for women. Which of these intervals would be the widest? Which would be the narrowest? Explain how you know without calculating the confidence intervals.
d. Calculate 90%, 95%, and 99% confidence intervals for µ, the mean femur bone length for adult females. Do your results confirm your answer to (c)?
e. Redo the 95% confidence interval using the 68-95-99.7 Rule. Comment on the difference between this and the answer you got in part d.
f. How much can a single outlier affect a confidence interval? Suppose that the first observation of 432 millimeters had been mistakenly entered as 4.32 millimeters.
(i) Make a boxplot of the modified data set to show that this short femur length is an outlier.
(ii) Recalculate the 95% confidence interval based on the modified data. How much did the outlier affect the confidence interval?
Comment: Always look at your data before calculating confidence intervals. Outliers can greatly affect your results.
a)
∑x = 16146
∑x² = 7263870
n = 36
Mean, x̅ = ∑x/n = 16146/36 = 448.5
Standard deviation, s = √[(∑x² - (∑x)²/n)/(n-1)] = √[(7263870 - (16146)²/36)/(36-1)] = 25.2920
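The sums, mean, and standard deviation in (a) can be checked with a short script. This is only a sketch using Python's standard library; the variable name femur_mm is illustrative, and statistics.stdev is used because it applies the same n-1 denominator as the formula above.

```python
from statistics import mean, stdev

# The 36 femur lengths (mm) from the problem statement, in the order given.
femur_mm = [
    432, 432, 435, 460, 432, 440,
    448, 449, 434, 443, 525, 451,
    448, 443, 450, 467, 436, 423,
    475, 435, 433, 438, 453, 438,
    435, 413, 439, 442, 507, 424,
    468, 419, 434, 483, 448, 514,
]

n = len(femur_mm)
sum_x = sum(femur_mm)
sum_x2 = sum(x * x for x in femur_mm)
x_bar = mean(femur_mm)
s = stdev(femur_mm)  # sample standard deviation (n - 1 in the denominator)

print(f"n = {n}, sum x = {sum_x}, sum x^2 = {sum_x2}")
print(f"mean = {x_bar:.1f}, s = {s:.4f}")
```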
b)
Yes. Since the sample size is large (n = 36), the sample standard deviation s can be used in place of σ.
c)
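Widest: the 99% confidence interval.
Narrowest: the 90% confidence interval.
As the confidence level increases, the critical value z_c increases, so the margin of error z_c*s/√n and the width of the interval increase as well. The mean, s, and n are the same for all three intervals.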
d)
90% Confidence interval:
At α = 0.1, two-tailed critical value z_c = ABS(NORM.S.INV(0.1/2)) = 1.645
Lower Bound = x̅ - z_c*s/√n = 448.5 - 1.645 * 25.292/√36 = 441.566
Upper Bound = x̅ + z_c*s/√n = 448.5 + 1.645 * 25.292/√36 = 455.434
441.566 < µ < 455.434
--
95% Confidence interval:
At α = 0.05, two-tailed critical value z_c = ABS(NORM.S.INV(0.05/2)) = 1.960
Lower Bound = x̅ - z_c*s/√n = 448.5 - 1.96 * 25.292/√36 = 440.238
Upper Bound = x̅ + z_c*s/√n = 448.5 + 1.96 * 25.292/√36 = 456.762
440.238 < µ < 456.762
--
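99% Confidence interval:
At α = 0.01, two-tailed critical value z_c = ABS(NORM.S.INV(0.01/2)) = 2.576
Lower Bound = x̅ - z_c*s/√n = 448.5 - 2.576 * 25.292/√36 = 437.642
Upper Bound = x̅ + z_c*s/√n = 448.5 + 2.576 * 25.292/√36 = 459.358
(bounds computed with the unrounded critical value 2.5758)
437.642 < µ < 459.358
The widths are 13.868 mm (90%), 16.524 mm (95%), and 21.716 mm (99%), which confirms the answer to (c): the 99% interval is the widest and the 90% interval is the narrowest.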
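The three intervals in (d) can be reproduced with a few lines of code. The sketch below assumes Python's standard-library statistics.NormalDist as a stand-in for the spreadsheet function NORM.S.INV; the helper name z_interval is illustrative and not part of the original solution.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean, s, n, level):
    """Two-sided z interval for the mean, with s used in place of sigma."""
    alpha = 1 - level
    z_c = NormalDist().inv_cdf(1 - alpha / 2)  # same value as ABS(NORM.S.INV(alpha/2))
    margin = z_c * s / sqrt(n)
    return mean - margin, mean + margin


# Summary statistics from part (a).
for level in (0.90, 0.95, 0.99):
    lo, hi = z_interval(448.5, 25.292, 36, level)
    print(f"{level:.0%}: {lo:.3f} < mu < {hi:.3f}  (width {hi - lo:.3f})")
```

Because the script uses unrounded critical values, the last digit can differ slightly from a hand calculation that rounds z_c first.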
e)
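The empirical rule states that for a normal distribution, about 95% of the values fall within two standard deviations of the mean.
For a confidence interval for µ, the relevant distribution is the sampling distribution of x̅, whose standard deviation is s/√n, so the rule replaces z_c = 1.960 with 2:
(x̅ - 2*s/√n, x̅ + 2*s/√n)
(448.5 - 2*25.292/√36, 448.5 + 2*25.292/√36)
(440.069, 456.931)
This is slightly wider than the interval in part d, (440.238, 456.762), because 2 is a rounded-up version of 1.960.
Note that x̅ ± 2*s = (397.916, 499.084) describes where about 95% of individual femur lengths fall; it is not a confidence interval for µ.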
f) i)
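No plot is reproduced here, but the usual boxplot rule can be checked numerically: a point is drawn as an outlier when it lies more than 1.5 interquartile ranges beyond the quartiles. The sketch below assumes that 1.5*IQR convention and the default quartile method of Python's statistics.quantiles; other software may compute quartiles slightly differently, and the variable names are illustrative.

```python
from statistics import quantiles

# Modified data set: the first observation 432 is mistyped as 4.32.
femur_mm = [
    4.32, 432, 435, 460, 432, 440,
    448, 449, 434, 443, 525, 451,
    448, 443, 450, 467, 436, 423,
    475, 435, 433, 438, 453, 438,
    435, 413, 439, 442, 507, 424,
    468, 419, 434, 483, 448, 514,
]

q1, _median, q3 = quantiles(femur_mm, n=4)  # quartile cut points
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Points outside the fences are the ones a boxplot would draw individually.
outliers = sorted(x for x in femur_mm if x < lower_fence or x > upper_fence)

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
print(f"fences: ({lower_fence}, {upper_fence})")
print("flagged as outliers:", outliers)
```

The mistyped value 4.32 falls far below the lower fence. Depending on the quartile convention, a few of the longest femurs may also be flagged on the upper side, but none is nearly as extreme.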
ii)
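∑x = 15718.32
∑x² = 7077264.6624
n = 36
Mean, x̅ = ∑x/n = 15718.32/36 = 436.6200
Standard deviation, s = √[(∑x² - (∑x)²/n)/(n-1)] = √[(7077264.6624 - (15718.32)²/36)/(36-1)] = 78.2545
95% Confidence interval:
At α = 0.05, two-tailed critical value z_c = ABS(NORM.S.INV(0.05/2)) = 1.960
Lower Bound = x̅ - z_c*s/√n = 436.62 - 1.96 * 78.2545/√36 = 411.057
Upper Bound = x̅ + z_c*s/√n = 436.62 + 1.96 * 78.2545/√36 = 462.183
411.057 < µ < 462.183
The single outlier lowers the sample mean from 448.5 to 436.62 and inflates s from 25.292 to 78.2545. As a result the 95% interval widens from 16.524 mm to 51.126 mm, about 3.1 times as wide, and its center shifts down by 11.88 mm.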
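To see the effect of the outlier side by side, the 95% interval can be computed for both versions of the data. This is a sketch under the same large-sample z-interval assumption used throughout; the helper name ci95 and the list names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def ci95(data):
    """Large-sample 95% z interval for the mean, with s in place of sigma."""
    z_c = NormalDist().inv_cdf(0.975)
    margin = z_c * stdev(data) / sqrt(len(data))
    centre = mean(data)
    return centre - margin, centre + margin


original = [
    432, 432, 435, 460, 432, 440,
    448, 449, 434, 443, 525, 451,
    448, 443, 450, 467, 436, 423,
    475, 435, 433, 438, 453, 438,
    435, 413, 439, 442, 507, 424,
    468, 419, 434, 483, 448, 514,
]
# Same data with the first observation mistyped as 4.32.
modified = [4.32] + original[1:]

lo_o, hi_o = ci95(original)
lo_m, hi_m = ci95(modified)
print(f"original: {lo_o:.3f} < mu < {hi_o:.3f}  (width {hi_o - lo_o:.3f})")
print(f"modified: {lo_m:.3f} < mu < {hi_m:.3f}  (width {hi_m - lo_m:.3f})")
print(f"width ratio: {(hi_m - lo_m) / (hi_o - lo_o):.2f}")
```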