Recall in our discussion of the normal distribution the research study that examined the blood vitamin D levels of the entire US population of landscape gardeners. The intent of this large-scale and comprehensive study was to characterize fully this population of landscapers as normally distributed with a corresponding population mean and standard deviation, which were determined from the data collection of the entire population.
Suppose you are now in a different reality in which this study never took place, though you are still interested in studying the average vitamin D levels of US landscapers. In other words, the underlying population mean and standard deviation are now unknown to you. Furthermore, you would like to collect data from US office workers to examine the difference between the average vitamin D levels of landscapers and office workers, which will reflect any occupational sun exposure differences as measured by blood vitamin D levels. You obtain research funding to sample at random 45 landscapers and 31 office workers, collect blood samples, and send these samples to your collaborating lab to quantify the amount of vitamin D in both groups' blood. After you anxiously wait for your colleagues to complete their lab quantification protocol, they email you the vitamin D level data shown in the following tables.
What is the estimated 95% confidence interval (CI) of the average difference in blood vitamin D levels between US landscapers and office workers in ng/mL? Assign groups 1 and 2 to be landscapers and office workers, respectively.
Please note the following: 1) in practice, you as the analyst decide how to assign groups 1 and 2 and subsequently interpret the results appropriately in the context of the data, though for the purposes of this exercise the groups are assigned for you; 2) you might calculate a CI that is different from any of the multiple choice options listed below due to rounding differences, therefore select the closest match; 3) ensure you use either the large or small sample CI formula as appropriate; and 4) you may copy and paste the data into Excel to facilitate analysis.
Select one:
a. -2.63 to 0.75 ng/mL
b. -3.30 to 0.61 ng/mL
c. -3.46 to 0.59 ng/mL
d. -2.99 to 0.65 ng/mL
Sample 1 (landscapers):
mean of sample 1, x̅1 = 48.465
standard deviation of sample 1, s1 = 3.9746
size of sample 1, n1 = 45
Sample 2 (office workers):
mean of sample 2, x̅2 = 49.633
standard deviation of sample 2, s2 = 3.9699
size of sample 2, n2 = 31
α = 0.05
Degrees of freedom, DF = n1 + n2 - 2 = 74
t-critical value = t(α/2, DF) = 1.9925 (Excel formula: =T.INV.2T(α, DF))
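For readers without Excel, the critical value can be checked with a short Python sketch that uses only the standard library. The helper names (t_pdf, t_cdf, t_critical) are hypothetical, and the approach (Simpson's rule for the CDF plus bisection for the quantile) is an illustrative choice, not how Excel computes it.

```python
import math

def t_pdf(x, df):
    # Density of Student's t distribution with df degrees of freedom.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    # P(T <= x) for x >= 0: 0.5 plus the Simpson's-rule integral on [0, x].
    # steps must be even for Simpson's rule.
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(alpha, df):
    # Two-sided critical value: solve t_cdf(t) = 1 - alpha/2 by bisection.
    target = 1 - alpha / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# alpha = 0.05 and DF = 45 + 31 - 2 = 74, as in the solution above.
t_crit = t_critical(0.05, 45 + 31 - 2)
print(round(t_crit, 4))
```

The printed value should agree with the 1.9925 quoted in the solution to about four decimal places.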
pooled std dev, Sp = √([(n1 - 1)s1² + (n2 - 1)s2²]/(n1 + n2 - 2)) = 3.9727
std error, SE = Sp*√(1/n1 + 1/n2) = 0.9273
margin of error, E = t*SE = 1.9925 * 0.9273 = 1.8476
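The pooled standard deviation, standard error, and margin of error can be reproduced with a minimal Python sketch. The inputs are the summary statistics from the solution, and the t-critical value 1.9925 is taken as given rather than recomputed; the variable names are illustrative.

```python
import math

# Summary statistics copied from the worked solution.
n1, s1 = 45, 3.9746   # landscapers
n2, s2 = 31, 3.9699   # office workers
t_crit = 1.9925       # t-critical value quoted in the solution for DF = 74

df = n1 + n2 - 2

# Pooled standard deviation: weighted average of the two sample variances.
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)

# Standard error of the difference in means under the pooled-variance assumption.
se = sp * math.sqrt(1 / n1 + 1 / n2)

# Margin of error for the 95% confidence interval.
margin = t_crit * se

print(f"Sp = {sp:.4f}, SE = {se:.4f}, E = {margin:.4f}")
```

Carrying four decimal places through each step avoids the small drift that appears when SE is rounded to 0.93 before multiplying.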
difference of means = x̅1 - x̅2 = 48.465 - 49.633 = -1.168
The 95% confidence interval is:
Interval Lower Limit = (x̅1 - x̅2) - E = -1.168 - 1.8476 = -3.016
Interval Upper Limit = (x̅1 - x̅2) + E = -1.168 + 1.8476 = 0.680
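So the pooled-t 95% CI is approximately -3.02 to 0.68 ng/mL, and the closest option is
d. -2.99 to 0.65 ng/mL
If both samples are treated as large (n ≥ 30, a common rule of thumb), the large-sample formula with z = 1.96 and unpooled standard error √(s1²/n1 + s2²/n2) = 0.9271 gives about -2.99 to 0.65 ng/mL, which matches option d directly.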
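To see how the choice between the small-sample and large-sample formulas affects the answer, here is a short Python sketch that computes both intervals from the summary statistics and picks the nearest multiple-choice option. The closeness measure (sum of absolute endpoint differences) and the helper names are assumptions made for illustration; the t-critical value is the 1.9925 quoted in the solution.

```python
import math
from statistics import NormalDist

# Summary statistics from the worked solution.
n1, mean1, s1 = 45, 48.465, 3.9746   # group 1: landscapers
n2, mean2, s2 = 31, 49.633, 3.9699   # group 2: office workers
diff = mean1 - mean2

def interval(center, crit, se):
    # Symmetric confidence interval: center +/- crit * se.
    return (center - crit * se, center + crit * se)

# Small-sample approach: pooled variance with the quoted t-critical value.
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
se_pooled = sp * math.sqrt(1 / n1 + 1 / n2)
ci_t = interval(diff, 1.9925, se_pooled)

# Large-sample approach: unpooled standard error with the normal critical value.
se_unpooled = math.sqrt(s1**2 / n1 + s2**2 / n2)
z_crit = NormalDist().inv_cdf(0.975)
ci_z = interval(diff, z_crit, se_unpooled)

# Multiple-choice options from the question, in ng/mL.
options = {
    "a": (-2.63, 0.75),
    "b": (-3.30, 0.61),
    "c": (-3.46, 0.59),
    "d": (-2.99, 0.65),
}

def closest(ci):
    # Assumed closeness measure: total absolute distance between endpoints.
    return min(options,
               key=lambda k: abs(options[k][0] - ci[0]) + abs(options[k][1] - ci[1]))

print("pooled t:", tuple(round(v, 3) for v in ci_t), "->", closest(ci_t))
print("large-sample z:", tuple(round(v, 3) for v in ci_z), "->", closest(ci_z))
```

Both approaches point to the same option here, which is consistent with the note in the question that rounding or formula differences may shift the computed interval slightly.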