Textbook: Essentials of Biostatistics in Public Health, 3rd ed., by Lisa Sullivan
Instructor-provided question: An SBB instructor wants to determine whether there is a correlation between students' incoming GPA and their overall score on the ASCP exam. Several potential confounders were also considered, including years of clinical practice in blood bank and the gap between completing the program and taking the exam. Perform the appropriate analysis on the data below and interpret the results. Do SBB ASCP scores correlate with students' GPAs? Do confounders exist? Justify your response.
SBB ASCP Score | GPA | BB Experience (years) | Gap (months) |
698 | 3.98 | 3 | 0.5 |
540 | 3.12 | 8 | 0.25 |
572 | 3.24 | 2 | 2 |
440 | 3.46 | 10 | 3 |
401 | 3.71 | 7 | 12 |
380 | 2.95 | 13 | 5 |
660 | 3.55 | 15 | 0.5 |
398 | 2.62 | 3 | 4 |
498 | 3.16 | 2 | 2 |
557 | 3.32 | 20 | 0.5 |
613 | 3.48 | 2 | 1 |
474 | 3.21 | 5 | 3 |
487 | 2.89 | 7 | 2 |
522 | 3.81 | 2 | 3 |
501 | 3.65 | 9 | 2 |
422 | 2.86 | 5 | 1 |
475 | 2.98 | 8 | 1 |
555 | 3.71 | 10 | 1 |
514 | 2.98 | 17 | 0.5 |
439 | 2.62 | 2 | 2 |
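One way to start on the SBB question is to compute Pearson's r between the ASCP score and each of the other columns, and between GPA and each candidate confounder. The sketch below is only an illustration and is not taken from the textbook: the helper `pearson_r` and the variable names are made up here, and a proper confounder assessment would usually go on to a multiple regression that adjusts for experience and gap.

```python
from math import sqrt

# Data transcribed from the table above:
# (ASCP score, GPA, blood bank experience in years, gap in months)
rows = [
    (698, 3.98, 3, 0.5), (540, 3.12, 8, 0.25), (572, 3.24, 2, 2), (440, 3.46, 10, 3),
    (401, 3.71, 7, 12), (380, 2.95, 13, 5), (660, 3.55, 15, 0.5), (398, 2.62, 3, 4),
    (498, 3.16, 2, 2), (557, 3.32, 20, 0.5), (613, 3.48, 2, 1), (474, 3.21, 5, 3),
    (487, 2.89, 7, 2), (522, 3.81, 2, 3), (501, 3.65, 9, 2), (422, 2.86, 5, 1),
    (475, 2.98, 8, 1), (555, 3.71, 10, 1), (514, 2.98, 17, 0.5), (439, 2.62, 2, 2),
]

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient (hypothetical helper)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

score, gpa, experience, gap = (list(col) for col in zip(*rows))

r_gpa = pearson_r(gpa, score)            # main association of interest
r_exp = pearson_r(experience, score)     # candidate confounder vs outcome
r_gap = pearson_r(gap, score)            # candidate confounder vs outcome
r_gpa_exp = pearson_r(gpa, experience)   # candidate confounder vs exposure
r_gpa_gap = pearson_r(gpa, gap)          # candidate confounder vs exposure

for label, r in [("GPA vs score", r_gpa), ("experience vs score", r_exp),
                 ("gap vs score", r_gap), ("GPA vs experience", r_gpa_exp),
                 ("GPA vs gap", r_gpa_gap)]:
    print(f"{label}: r = {r:.3f}")
```

A variable is usually treated as a potential confounder only if it is associated with both GPA and the exam score, which is why the sketch prints both sets of correlations.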
In: Math
The accompanying data are the percentage of babies born prematurely in a particular year for the 50 U.S. states and the District of Columbia (DC).
State | Premature Percent | State | Premature Percent | State | Premature Percent |
Alabama | 12.3 | Kentucky | 11.3 | North Dakota | 9.0 |
Alaska | 9.1 | Louisiana | 12.9 | Ohio | 10.9 |
Arizona | 9.6 | Maine | 9.0 | Oklahoma | 10.9 |
Arkansas | 10.6 | Maryland | 10.7 | Oregon | 8.3 |
California | 8.9 | Massachusetts | 9.2 | Pennsylvania | 10.0 |
Colorado | 9.0 | Michigan | 10.4 | Rhode Island | 9.2 |
Connecticut | 9.8 | Minnesota | 9.3 | South Carolina | 11.4 |
Delaware | 9.9 | Mississippi | 13.5 | South Dakota | 9.1 |
DC | 10.2 | Missouri | 10.4 | Tennessee | 11.4 |
Florida | 10.5 | Montana | 9.9 | Texas | 11.0 |
Georgia | 11.4 | Nebraska | 9.7 | Utah | 9.7 |
Hawaii | 10.6 | Nevada | 10.7 | Vermont | 8.5 |
Idaho | 8.8 | New Hampshire | 8.8 | Virginia | 9.8 |
Illinois | 10.7 | New Jersey | 10.2 | Washington | 8.7 |
Indiana | 10.3 | New Mexico | 9.8 | West Virginia | 11.4 |
Iowa | 9.9 | New York | 9.5 | Wisconsin | 9.8 |
Kansas | 9.3 | North Carolina | 10.3 | Wyoming | 11.8 |
(a) The smallest value in the data set is 8.3 (Oregon), and the largest value is 13.5 (Mississippi). Are these values outliers? Explain.
Any observations smaller than ____% or larger than ____% are considered outliers. Therefore, Oregon's data value (8.3%) ____ an outlier and Mississippi's data value (13.5%) ____ an outlier.
(b) Construct a boxplot for this data set. Comment on the interesting features of the plot.
The boxplot shows ____ and the distribution is ____. The minimum value is ____%, the lower quartile is ____%, the median is ____%, the upper quartile is ____%, and the maximum value is ____%.
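The outlier check in part (a) is normally done with the 1.5 × IQR rule. The following sketch assumes that rule and uses Python's `statistics.quantiles` with its default "exclusive" method; other textbooks and software compute quartiles slightly differently, so the quartiles printed here may not match a particular course's convention.

```python
from statistics import quantiles

# Premature birth percentages for the 50 states and DC, as listed in the problem
premature = [
    12.3, 11.3, 9.0, 9.1, 12.9, 10.9, 9.6, 9.0, 10.9, 10.6, 10.7, 8.3,
    8.9, 9.2, 10.0, 9.0, 10.4, 9.2, 9.8, 9.3, 11.4, 9.9, 13.5, 9.1,
    10.2, 10.4, 11.4, 10.5, 9.9, 11.0, 11.4, 9.7, 9.7, 10.6, 10.7, 8.5,
    8.8, 8.8, 9.8, 10.7, 10.2, 8.7, 10.3, 9.8, 11.4, 9.9, 9.5, 9.8,
    9.3, 10.3, 11.8,
]

# Quartiles (default method="exclusive"; an assumption about the convention)
q1, median, q3 = quantiles(premature, n=4)
iqr = q3 - q1

# 1.5 * IQR fences: values beyond these are flagged as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = sorted(v for v in premature if v < lower_fence or v > upper_fence)

print(f"min={min(premature)}, Q1={q1}, median={median}, Q3={q3}, max={max(premature)}")
print(f"fences: ({lower_fence:.3f}, {upper_fence:.3f}); outliers: {outliers}")
```

The five-number summary printed on the first line is what the boxplot in part (b) is drawn from.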
In: Math
15. A 2010 Pew poll found that 50% of adults aged 25-29 had access to only a cell phone, while 40% had access to both a cell phone and a landline and 10% had access to only a landline. We wish to conduct a Goodness of Fit Test to see if the results of a poll of a random sample of 130 adults aged 25-29 are significantly different from the 2010 poll results.
The random sample of 130 adults found that 69 had access to only a cell phone, 39 had access to only a landline, and 22 had access to both types of phones.
Which of the following are the correct null and alternative hypothesis for this test?
A. Ho: p = 0.50 Ha: p ≠ 0.50
B. Ho: The distribution is the same as in 2010. Ha: The distribution is different than in 2010.
C. Ho: The distribution is different than in 2010. Ha: The distribution is the same as in 2010.
D. Ho: pcell = 0.33, pboth = 0.33, pland = 0.33 Ha: pcell = 0.50, pboth = 0.40, pland = 0.10
16. A 2010 Pew poll found that 50% of adults aged 25-29 had access to only a cell phone, while 40% had access to both a cell phone and a landline and 10% had access to only a landline. We wish to conduct a Goodness of Fit Test to see if the results of a poll of a random sample of 130 adults aged 25-29 are significantly different from the 2010 poll results. The table below presents the observed results.
 | Cell only | Landline Only | Both |
Observed | 69 | 39 | 22 |
Expected | | | |
Compute the expected count for the "BOTH" cell.
17. A 2010 Pew poll found that 50% of adults aged 25-29 had access to only a cell phone, while 40% had access to both a cell phone and a landline and 10% had access to only a landline. We wish to conduct a Goodness of Fit Test to see if the results of a poll of a random sample of 130 adults aged 25-29 are significantly different from the 2010 poll results.
The random sample of 130 adults found that 69 had access to only a cell phone, 39 had access to only a landline, and 22 had access to both types of phones.
The chi-square statistic was calculated as 9.73 and has a p-value of 0.0077. Using α = 0.01, what can you conclude?
A. There is enough evidence that the distribution is different than in 2010.
B. There is enough evidence that the distribution is the same as in 2010.
C. There is not enough evidence that the distribution is different than in 2010.
D. There is not enough evidence the distribution is the same as in 2010.
18. According to the CDC, 2.8% of high school students currently use electronic cigarettes. A high school counselor is concerned that the use of e-cigs at her school is higher.
A test of the hypotheses Ho: p = 0.028, Ha: p > 0.028 is conducted using a random sample of 50 students at this school, and the null hypothesis is rejected. What conclusion can the counselor make?
A. There is not enough evidence that the percent of students using e-cigs is higher than 2.8%.
B. There is not enough evidence that the percent of students using e-cigs is 2.8%.
C. There is enough evidence that the percent of students using e-cigs is higher than 2.8%.
D. There is enough evidence that the percent of students using e-cigs is 2.8%.
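For the goodness-of-fit questions (15 to 17), the expected counts and the chi-square statistic can be sketched as below. Two assumptions are made loudly here. First, the stated statistic of 9.73 is reproduced only when the count of 39 is matched with the "both" category (expected 52) and the count of 22 with "landline only" (expected 13), which is the opposite of the labels in the problem text; the sketch adopts that matching. Second, the p-value uses the closed form exp(-x/2), which is the chi-square upper-tail probability only when there are exactly 2 degrees of freedom.

```python
from math import exp

n = 130
pct_2010 = {"cell only": 50, "both": 40, "landline only": 10}

# Expected counts under Ho: sample size times the 2010 proportion
expected = {k: n * pct / 100 for k, pct in pct_2010.items()}

# ASSUMPTION: 39 is paired with "both" and 22 with "landline only",
# because that pairing reproduces the stated statistic of 9.73.
observed = {"cell only": 69, "both": 39, "landline only": 22}

chi_sq = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in expected)
df = len(expected) - 1

# Upper-tail p-value; exp(-x/2) is exact for df == 2 only
p_value = exp(-chi_sq / 2)

alpha = 0.01
reject_null = p_value < alpha

print(f"expected = {expected}")
print(f"chi-square = {chi_sq:.2f}, df = {df}, p-value = {p_value:.4f}")
print("reject Ho" if reject_null else "fail to reject Ho")
```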
In: Math
If a random sample of 16 homes south of Center Street in Provo has a mean selling price of $145,450 and a standard deviation of $4525, and a random sample of 28 homes north of Center Street has a mean selling price of $148,900 and a standard deviation of $5625, can you conclude that there is a significant difference between the selling price of homes in these two areas of Provo at the 0.05 level? Assume normality.
(a) (i) Find t. (Give your answer correct to two decimal places.)
(ii) Find the p-value. (Give your answer correct to four decimal places.)
(b) State the appropriate conclusion.
Reject the null hypothesis, there is not significant evidence of a difference in means.
Reject the null hypothesis, there is significant evidence of a difference in means.
Fail to reject the null hypothesis, there is significant evidence of a difference in means.
Fail to reject the null hypothesis, there is not significant evidence of a difference in means.
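A minimal sketch of the test statistic for the Provo question, assuming the unpooled (unequal variances) two-sample t statistic; a course that pools the variances would get a different value. The p-value needs the t distribution, which the Python standard library does not provide, so the sketch only reports the statistic and two common choices of degrees of freedom to use with a t table or statistical software.

```python
from math import sqrt

# Summary statistics from the problem statement
n_south, mean_south, sd_south = 16, 145_450, 4_525
n_north, mean_north, sd_north = 28, 148_900, 5_625

# Per-sample variance of the mean
v_south = sd_south ** 2 / n_south
v_north = sd_north ** 2 / n_north

# Unpooled two-sample t statistic (ASSUMPTION: variances are not pooled)
std_error = sqrt(v_south + v_north)
t_stat = (mean_south - mean_north) / std_error

# Two df conventions: Welch-Satterthwaite, and the conservative min(n1, n2) - 1
df_welch = (v_south + v_north) ** 2 / (
    v_south ** 2 / (n_south - 1) + v_north ** 2 / (n_north - 1)
)
df_conservative = min(n_south, n_north) - 1

print(f"t = {t_stat:.2f}")
print(f"Welch df = {df_welch:.1f}, conservative df = {df_conservative}")
```

The two-sided p-value is then twice the tail area beyond |t| for whichever degrees of freedom the course prescribes.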
In: Math
11. Determine the p-value given the stated hypotheses and the test statistic value (z). Ho: μ = 300, H1: μ < 300; z = -2.13. Answer to 4 decimal places.
12. Determine the p-value given the stated hypotheses and the test statistic value (z). Ho: μ = 120, H1: μ ≠ 120; z = 1.92. Answer to 4 decimal places.
13. Some people claim that the physical demands on dancers are such that dancers tend to be shorter than the typical person. Nationally, adult heights are normally distributed with an average of 64.5 inches. A random sample of 20 adult dancers has an average height of 63.275 inches and a standard deviation of 2.17 inches. Is there evidence that dancers are shorter?
Ho: μ = 64.5 versus Ha: μ < 64.5
Compute the test statistic for this test. (Round your answer to 2 decimal places.)
14. The mean caffeine content per cup of regular coffee served at a certain coffee shop is supposed to be 100mg. A test is made of Ho: μ=100 versus Ha: μ≠100. A sample of 35 cups does not provide enough evidence that the mean caffeine content is different from 100mg. Which type of error is possible in this situation?
A. Type 2 error
B. Type 1 error
C. Impossible to tell
D. Either type is possible
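The computations behind questions 11 to 13 can be sketched with the standard library's `NormalDist`. Note that exact normal probabilities can differ in the last digit from answers built from a z table rounded to four decimals.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Question 11: left-tailed test, p-value = P(Z < z)
p_left = std_normal.cdf(-2.13)

# Question 12: two-tailed test, p-value = 2 * P(Z > |z|)
p_two_sided = 2 * (1 - std_normal.cdf(abs(1.92)))

# Question 13: one-sample t statistic, t = (x_bar - mu0) / (s / sqrt(n))
x_bar, mu0, s, n = 63.275, 64.5, 2.17, 20
t_dancers = (x_bar - mu0) / (s / sqrt(n))

print(f"Q11 p-value = {p_left:.4f}")
print(f"Q12 p-value = {p_two_sided:.4f}")
print(f"Q13 t = {t_dancers:.2f} with {n - 1} degrees of freedom")
```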
In: Math
11. You measure the weight of 37 turtles and find they have a mean weight of 52 ounces. Assume the population standard deviation is 10.9 ounces. Based on this, what is the maximal margin of error associated with a 92% confidence interval for the true population mean turtle weight? Give your answer as a decimal, to two places.
12. In a survey, 18 people were asked how much they spent on their child's last birthday gift. The results were roughly shaped as a normal curve with a mean of $45 and standard deviation of $8. Find the margin of error at an 80% confidence level.
13. In a survey, 20 people were asked how much they spent on their child's last birthday gift. The results were roughly shaped as a normal curve with a mean of $34 and standard deviation of $12. Construct a confidence interval at an 80% confidence level. Give your answers to one decimal place.
.................. ± ..............
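All three questions use the same formula, margin of error = critical value × standard deviation / √n. In the sketch below, question 11 uses a z critical value because the population standard deviation is given. For questions 12 and 13 the sketch assumes a t critical value with n − 1 degrees of freedom, read from a standard t table (1.333 for 17 df and 1.328 for 19 df at 80% confidence); these table lookups are assumptions added here, and a course that uses z for these problems would get slightly different answers.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(critical_value, std_dev, n):
    """Margin of error for a mean: critical value times the standard error."""
    return critical_value * std_dev / sqrt(n)

# Question 11: sigma known, so use the z critical value for 92% confidence
z_92 = NormalDist().inv_cdf(1 - (1 - 0.92) / 2)
me_turtles = margin_of_error(z_92, 10.9, 37)

# Questions 12 and 13: ASSUMPTION, t critical values taken from a t table
t_80_df17 = 1.333  # 80% confidence, 17 degrees of freedom
t_80_df19 = 1.328  # 80% confidence, 19 degrees of freedom
me_q12 = margin_of_error(t_80_df17, 8, 18)
me_q13 = margin_of_error(t_80_df19, 12, 20)

print(f"Q11 margin of error = {me_turtles:.2f}")
print(f"Q12 margin of error = {me_q12:.2f}")
print(f"Q13 interval = 34 ± {me_q13:.1f}")
```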
In: Math
A used car dealer wants to develop a regression equation that determines mileage as a function of the age of a car in years. He collects the data shown below for the 12 cars he has on his lot.
Age | Mileage |
6 | 53808 |
7 | 82838 |
11 | 115903 |
6 | 54903 |
8 | 77564 |
10 | 95911 |
4 | 40686 |
12 | 126675 |
15 | 167636 |
14 | 128798 |
10 | 96589 |
5 | 35049 |
a) What is the slope of the regression equation? Give your answer to two decimal places.
b) What is the value of the correlation coefficient? Give your answer to two decimal places.
c) A 4 year old car is delivered to his lot with 160000 miles. Manually enter these values in the data table above and rerun the regression analysis. What is the value of the slope? Give your answer to two decimal places.
d) Including the additional car, what is the value of the correlation coefficient? Give your answer to two decimal places.
e) Did the additional car strengthen or weaken the linear relationship between age and mileage?
It strengthened the linear relationship.
It weakened the linear relationship.
Cannot be determined.
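The regression can be rerun with and without the extra car using the usual least-squares formulas, slope = Sxy / Sxx and r = Sxy / √(Sxx · Syy). The helper name `slope_and_r` below is made up for this sketch.

```python
from math import sqrt

# (age in years, mileage) for the 12 cars on the lot
cars = [
    (6, 53808), (7, 82838), (11, 115903), (6, 54903), (8, 77564), (10, 95911),
    (4, 40686), (12, 126675), (15, 167636), (14, 128798), (10, 96589), (5, 35049),
]

def slope_and_r(points):
    """Least-squares slope and Pearson correlation for (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx, sxy / sqrt(sxx * syy)

slope_before, r_before = slope_and_r(cars)

# Part c: add the 4 year old car with 160000 miles and rerun
slope_after, r_after = slope_and_r(cars + [(4, 160000)])

print(f"original 12 cars: slope = {slope_before:.2f}, r = {r_before:.2f}")
print(f"with extra car:   slope = {slope_after:.2f}, r = {r_after:.2f}")
```

Comparing the absolute value of r before and after answers part e.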
In: Math
ID of Respondent | # of Friends who Bully | Respondent was a Bully Victim (0 = No; 1 = Yes) | Gender (0 = Female; 1 = Male) | # of Times Respondent Bullied Others |
1 | 2 | 1 | 1 | 5 |
2 | 4 | 1 | 0 | 2 |
3 | 3 | 0 | 1 | 8 |
4 | 2 | 0 | 0 | 4 |
5 | 6 | 1 | 1 | 6 |
6 | 3 | 0 | 0 | 2 |
7 | 7 | 1 | 1 | 7 |
8 | 4 | 0 | 0 | 0 |
9 | 2 | 1 | 1 | 1 |
10 | 7 | 1 | 1 | 8 |
1. What is the proportion of males who bullied others? What is the proportion of females who bullied others? Which gender (male or female) possessed a deeper involvement in bullying others?
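The proportions can be tallied directly from the respondent table above. The sketch assumes that "bullied others" means a respondent reported at least one incident (a count greater than zero); it also reports the mean number of incidents per gender as one possible reading of "deeper involvement".

```python
# Rows transcribed from the respondent table above:
# (id, friends_who_bully, was_victim, is_male, times_bullied_others)
respondents = [
    (1, 2, 1, 1, 5), (2, 4, 1, 0, 2), (3, 3, 0, 1, 8), (4, 2, 0, 0, 4),
    (5, 6, 1, 1, 6), (6, 3, 0, 0, 2), (7, 7, 1, 1, 7), (8, 4, 0, 0, 0),
    (9, 2, 1, 1, 1), (10, 7, 1, 1, 8),
]

def bullying_summary(rows):
    """Return (proportion with at least one incident, mean number of incidents)."""
    times = [r[4] for r in rows]
    # ASSUMPTION: "bullied others" means times_bullied_others > 0
    return sum(t > 0 for t in times) / len(times), sum(times) / len(times)

males = [r for r in respondents if r[3] == 1]
females = [r for r in respondents if r[3] == 0]

male_prop, male_mean = bullying_summary(males)
female_prop, female_mean = bullying_summary(females)

print(f"males:   {male_prop:.2f} bullied others, mean incidents = {male_mean:.2f}")
print(f"females: {female_prop:.2f} bullied others, mean incidents = {female_mean:.2f}")
```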
In: Math
Kroger is in the process of designing a new store to be located in a plaza under development in Mason. They intend to use their Symmes Township store as a model, but they are concerned that the customer base in Mason might have different needs and expectations. One area of concern is in Service Meats. Grocery shoppers in Symmes Township expect a Service Meat counter, and the department has been quite profitable. Kroger would like to know if the expectation of having a Service Meat counter will be the same in Mason as it is in Symmes Township. They survey residents of both areas, and among the questions is, "Do you buy meat from the Service Meat counter on a regular (weekly) basis?" In city a, 505 out of 780 respondents said YES. In city b, 325 out of 620 respondents said YES. Using α = 0.05, test the claim that the percentage of grocery shoppers who use Service Meats is the same in these two areas of the city.
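A sketch of the two-proportion z test for this claim, assuming the standard pooled standard error under Ho: p_a = p_b and a two-sided alternative.

```python
from math import sqrt
from statistics import NormalDist

# Survey counts from the problem statement
yes_a, n_a = 505, 780
yes_b, n_b = 325, 620

p_a = yes_a / n_a
p_b = yes_b / n_b

# Pooled proportion, used because Ho states the two proportions are equal
p_pooled = (yes_a + yes_b) / (n_a + n_b)
std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n_a + 1 / n_b))

z = (p_a - p_b) / std_error
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided

alpha = 0.05
reject_null = p_value < alpha

print(f"p_a = {p_a:.4f}, p_b = {p_b:.4f}, z = {z:.2f}, p-value = {p_value:.6f}")
print("reject Ho" if reject_null else "fail to reject Ho")
```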
In: Math
Suppose you are interested in buying a new Toyota Corolla. You are standing on the sales lot looking at a model with different options. The list price is on the vehicle. As a salesperson approaches, you wonder what the dealer invoice price is for this model with its options. The following data are based on a random selection of Toyota Corollas of different models and options. Let x be the list price (in thousands of dollars) and let y be the dealer invoice (in thousands of dollars) for the given vehicle.
x | 12.9 | 13.0 | 12.8 | 13.6 | 13.4 | 14.2 |
y | 11.6 | 11.9 | 11.5 | 12.2 | 12.0 | 12.8 |
(a) Verify that Σx = 79.9, Σy = 72, Σx² = 1065.41, Σy² = 865.1, Σxy = 960.02, and r ≈ 0.980.
Σx | = |
Σy | = |
Σx² | = |
Σy² | = |
Σxy | = |
r | = |
(b) Use a 1% level of significance to test the claim that ρ > 0. (Use 2 decimal places.)
t | = |
critical t | = |
Conclusion
Reject the null hypothesis, there is sufficient evidence that ρ > 0.
Reject the null hypothesis, there is insufficient evidence that ρ > 0.
Fail to reject the null hypothesis, there is insufficient evidence that ρ > 0.
Fail to reject the null hypothesis, there is sufficient evidence that ρ > 0.
(c) Verify that Se ≈ 0.1039, a ≈ 0.464, and b ≈ 0.866.
Se | = |
a | = |
b | = |
(d) Find the predicted dealer invoice when the list price is x = 13.6 (thousand dollars). (Use 2 decimal places.)
(e) Find a 90% confidence interval for y when x = 13.6 (thousand dollars). (Use 2 decimal places.)
lower limit | = |
upper limit | = |
(f) Use a 1% level of significance to test the claim that β > 0. (Use 2 decimal places.)
t | = |
critical t | = |
Conclusion
Reject the null hypothesis, there is sufficient evidence that β > 0.
Reject the null hypothesis, there is insufficient evidence that β > 0.
Fail to reject the null hypothesis, there is insufficient evidence that β > 0.
Fail to reject the null hypothesis, there is sufficient evidence that β > 0.
(g) Find a 90% confidence interval for β and interpret its meaning. (Use 2 decimal places.)
lower limit | = |
upper limit | = |
Interpretation
For every $1,000 increase in list price, the dealer price increases by an amount that falls within the confidence interval.
For every $1,000 increase in list price, the dealer price increases by an amount that falls outside the confidence interval.
For every $1,000 increase in list price, the dealer price decreases by an amount that falls within the confidence interval.
For every $1,000 increase in list price, the dealer price decreases by an amount that falls outside the confidence interval.
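The verification steps in parts (a), (c) and (d) can be sketched with the usual computational formulas. Variable names such as `ss_x` and `ss_xy` are illustrative. The critical t values and the confidence intervals in parts (b) and (e) to (g) need a t table with n − 2 = 4 degrees of freedom and are not computed here.

```python
from math import sqrt

x = [12.9, 13.0, 12.8, 13.6, 13.4, 14.2]  # list price, thousands of dollars
y = [11.6, 11.9, 11.5, 12.2, 12.0, 12.8]  # dealer invoice, thousands of dollars
n = len(x)

# Part (a): raw sums
sum_x, sum_y = sum(x), sum(y)
sum_x2 = sum(v * v for v in x)
sum_y2 = sum(v * v for v in y)
sum_xy = sum(a * b for a, b in zip(x, y))

# Corrected sums of squares and cross products
ss_x = sum_x2 - sum_x ** 2 / n
ss_y = sum_y2 - sum_y ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n

r = ss_xy / sqrt(ss_x * ss_y)

# Part (c): least-squares line y_hat = a + b*x and standard error of estimate
b = ss_xy / ss_x
a = sum_y / n - b * sum_x / n
s_e = sqrt((ss_y - b * ss_xy) / (n - 2))

# Part (d): predicted dealer invoice at x = 13.6
y_hat = a + b * 13.6

# Sample test statistic for Ho: rho = 0 (compare with a t table, df = n - 2)
t_r = r * sqrt(n - 2) / sqrt(1 - r ** 2)

print(f"sums: {sum_x:.1f}, {sum_y:.1f}, {sum_x2:.2f}, {sum_y2:.2f}, {sum_xy:.2f}")
print(f"r = {r:.3f}, a = {a:.3f}, b = {b:.3f}, Se = {s_e:.4f}")
print(f"y_hat(13.6) = {y_hat:.2f}, t = {t_r:.2f}")
```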
In: Math
ID of Respondent | # of Friends who Bully | Respondent was a Bully Victim (0 = No; 1 = Yes) | Gender (0 = Female; 1 = Male) | # of Times Respondent Bullied Others |
1 | 2 | 1 | 1 | 5 |
2 | 4 | 1 | 0 | 2 |
3 | 3 | 0 | 1 | 8 |
4 | 2 | 0 | 0 | 4 |
5 | 6 | 1 | 1 | 6 |
6 | 3 | 0 | 0 | 2 |
7 | 7 | 1 | 1 | 7 |
8 | 4 | 0 | 0 | 0 |
9 | 2 | 1 | 1 | 1 |
10 | 7 | 1 | 1 | 8 |
If you were to draw one individual from this sample of 10 individuals, which set of characteristics do you believe would be the most likely to be drawn (1 = Bully Victim and Bully Offender; 2 = Bully Victim but not Bully Offender; 3 = Bully Offender but not Bully Victim; or 4 = Non Bully Victim and Non Bully Offender)? Why?
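Since each of the 10 respondents is equally likely to be drawn, the most likely category is simply the one with the most members. The sketch below counts them, assuming a respondent is a "Bully Offender" when the number of times they bullied others is greater than zero.

```python
from collections import Counter

# (was_victim, times_bullied_others) for respondents 1 to 10, from the table above
sample = [(1, 5), (1, 2), (0, 8), (0, 4), (1, 6), (0, 2), (1, 7), (0, 0), (1, 1), (1, 8)]

def category(was_victim, times_bullied):
    """Map a respondent to categories 1-4 as numbered in the question."""
    offender = times_bullied > 0  # ASSUMPTION: any incident counts as offending
    if was_victim and offender:
        return 1  # victim and offender
    if was_victim:
        return 2  # victim only
    if offender:
        return 3  # offender only
    return 4      # neither

counts = Counter(category(v, t) for v, t in sample)
probabilities = {c: counts[c] / len(sample) for c in (1, 2, 3, 4)}
most_likely = max(probabilities, key=probabilities.get)

print(probabilities)
print(f"most likely category: {most_likely}")
```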
In: Math
The following table shows ceremonial ranking and type of pottery sherd for a random sample of 434 sherds at an archaeological location.
Ceremonial Ranking | Cooking Jar Sherds | Decorated Jar Sherds (Noncooking) | Row Total |
A | 91 | 44 | 135 |
B | 88 | 57 | 145 |
C | 80 | 74 | 154 |
Column Total | 259 | 175 | 434 |
Use a chi-square test to determine if ceremonial ranking and pottery type are independent at the 0.05 level of significance.
(a) What is the level of significance?
.05
State the null and alternate hypotheses.
H0: Ceremonial ranking and pottery type are not independent. H1: Ceremonial ranking and pottery type are not independent.
H0: Ceremonial ranking and pottery type are independent. H1: Ceremonial ranking and pottery type are not independent.
H0: Ceremonial ranking and pottery type are not independent. H1: Ceremonial ranking and pottery type are independent.
H0: Ceremonial ranking and pottery type are independent. H1: Ceremonial ranking and pottery type are independent.
(b) Find the value of the chi-square statistic for the sample. (Round the expected frequencies to at least three decimal places. Round the test statistic to three decimal places.)
Are all the expected frequencies greater than 5?
What sampling distribution will you use?
chi-square
normal
Student's t
binomial
uniform
What are the degrees of freedom?
(c) Find or estimate the P-value of the sample test statistic. (Round your answer to three decimal places.)
p-value > 0.100
0.050 < p-value < 0.100
0.025 < p-value < 0.050
0.010 < p-value < 0.025
0.005 < p-value < 0.010
p-value < 0.005
(d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis of independence?
Since the P-value > α, we fail to reject the null hypothesis.
Since the P-value > α, we reject the null hypothesis.
Since the P-value ≤ α, we reject the null hypothesis.
Since the P-value ≤ α, we fail to reject the null hypothesis.
(e) Interpret your conclusion in the context of the application.
At the 5% level of significance, there is sufficient evidence to conclude that ceremonial ranking and pottery type are not independent.
At the 5% level of significance, there is insufficient evidence to conclude that ceremonial ranking and pottery type are not independent.
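The test of independence follows the usual recipe: expected count = (row total × column total) / grand total, then sum (O − E)² / E over all cells. The sketch below applies it to the sherd table. The p-value line uses exp(-x/2), which is the chi-square upper-tail probability only for 2 degrees of freedom; that happens to be the case for this 3 × 2 table.

```python
from math import exp

# Observed counts: ranking -> (cooking jar sherds, decorated jar sherds)
observed = {"A": (91, 44), "B": (88, 57), "C": (80, 74)}

row_totals = {rank: sum(cells) for rank, cells in observed.items()}
col_totals = [sum(cells[j] for cells in observed.values()) for j in range(2)]
grand_total = sum(row_totals.values())

# Expected count under independence for every cell
expected = {
    rank: tuple(row_totals[rank] * col_totals[j] / grand_total for j in range(2))
    for rank in observed
}

chi_sq = sum(
    (observed[rank][j] - expected[rank][j]) ** 2 / expected[rank][j]
    for rank in observed
    for j in range(2)
)
df = (len(observed) - 1) * (len(col_totals) - 1)

# Upper-tail p-value; this closed form is valid for df == 2 only
p_value = exp(-chi_sq / 2)

all_expected_above_5 = all(e > 5 for cells in expected.values() for e in cells)

print(f"chi-square = {chi_sq:.3f}, df = {df}, p-value = {p_value:.3f}")
print(f"all expected frequencies > 5: {all_expected_above_5}")
```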
In: Math
Let x = age in years of a rural Quebec woman at the time of her first marriage. In the year 1941, the population variance of x was approximately σ² = 5.1. Suppose a recent study of age at first marriage for a random sample of 51 women in rural Quebec gave a sample variance s² = 2.4. Use a 5% level of significance to test the claim that the current variance is less than 5.1. Find a 90% confidence interval for the population variance.
(a) What is the level of significance?
.05
State the null and alternate hypotheses.
Ho: σ² = 5.1; H1: σ² ≠ 5.1
Ho: σ² = 5.1; H1: σ² > 5.1
Ho: σ² = 5.1; H1: σ² < 5.1
Ho: σ² < 5.1; H1: σ² = 5.1
(b) Find the value of the chi-square statistic for the sample. (Round your answer to two decimal places.)
What are the degrees of freedom?
What assumptions are you making about the original distribution?
We assume a normal population distribution.
We assume an exponential population distribution.
We assume a binomial population distribution.
We assume a uniform population distribution.
(c) Find or estimate the P-value of the sample test statistic.
P-value > 0.100
0.050 < P-value < 0.100
0.025 < P-value < 0.050
0.010 < P-value < 0.025
0.005 < P-value < 0.010
P-value < 0.005
(d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis?
Since the P-value > α, we fail to reject the null hypothesis.
Since the P-value > α, we reject the null hypothesis.
Since the P-value ≤ α, we reject the null hypothesis.
Since the P-value ≤ α, we fail to reject the null hypothesis.
(e) Interpret your conclusion in the context of the application.
At the 5% level of significance, there is insufficient evidence to conclude that the variance of age at first marriage is less than 5.1.
At the 5% level of significance, there is sufficient evidence to conclude that the variance of age at first marriage is less than 5.1.
(f) Find the requested confidence interval for the population variance. (Round your answers to two decimal places.)
lower limit | |
upper limit | |
Interpret the results in the context of the application.
We are 90% confident that σ² lies outside this interval.
We are 90% confident that σ² lies above this interval.
We are 90% confident that σ² lies within this interval.
We are 90% confident that σ² lies below this interval.
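The test statistic for a single variance is χ² = (n − 1)s² / σ², and the confidence interval divides (n − 1)s² by the upper and lower chi-square critical values. The Python standard library has no chi-square quantile function, so the sketch below hard-codes the two critical values for 50 degrees of freedom as read from a standard chi-square table; those two numbers are table lookups added here, not values given in the problem.

```python
n = 51
sample_var = 2.4   # s squared from the recent study
null_var = 5.1     # sigma squared from 1941

df = n - 1

# Part (b): chi-square test statistic
chi_sq = df * sample_var / null_var

# Part (f): 90% confidence interval for the population variance.
# ASSUMPTION: critical values for df = 50 taken from a chi-square table
# (right-tail areas 0.05 and 0.95 respectively).
chi2_right_05 = 67.505
chi2_right_95 = 34.764

ci_lower = df * sample_var / chi2_right_05
ci_upper = df * sample_var / chi2_right_95

print(f"chi-square = {chi_sq:.2f} with {df} degrees of freedom")
print(f"90% CI for the variance: ({ci_lower:.2f}, {ci_upper:.2f})")
```

Because the claim is that the variance is less than 5.1, the P-value in part (c) is the left-tail area below the computed statistic.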
In: Math