1) Background: Data on infant
mortality (measured as deaths per 1,000 under 1 year old)
were collected by the United Nations Educational, Scientific, and
Cultural Organization (UNESCO) for the 1990 Demographic Year Book
and are provided below. Perform an analysis of variance procedure to
determine whether or not the differences in the mortality rates of
infants in African, Asian and Middle Eastern countries are
statistically significant.
Directions: Perform an analysis of variance (ANOVA) to determine if the differences in the infant mortality rates in African, Asian and Middle Eastern countries are statistically significant.
Data
Africa | Asia | MiddleEast |
---|---|---|
75 | 181.6 | 13 |
138 | 118 | 110.1 |
69 | 132 | 68 |
74 | 34 | 10.7 |
48.4 | 5.1 | 42 |
138 | 90 | 11.6 |
103 | 76 | 46 |
142 | 23 | 39 |
90 | 26 | 70 |
74 | 72 | 71 |
83 | 129 | 28 |
129 | 105.7 | |
82 | 49 | |
141 | 11.5 | |
136 | 19.4 | |
105 | 27 | |
155 | 64 | |
131 | ||
71 | ||
109 | ||
118 | ||
53 | ||
103 | ||
107 | ||
84 | ||
82 | ||
66 |
Source | S.S. | df | M.S. | F |
---|---|---|---|---|
Treatment | | | | |
Error | | | | |
Total | | | | |
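One way the blank ANOVA table could be filled in is sketched here in plain Python. The three lists are transcribed from the data table above; the function name `one_way_anova` and the dictionary keys are illustrative choices, not part of the original question. The sketch assumes the usual one-way ANOVA decomposition (total sum of squares = treatment + error).

```python
from statistics import mean

# Data transcribed from the table above (deaths per 1,000 under 1 year old).
africa = [75, 138, 69, 74, 48.4, 138, 103, 142, 90, 74, 83, 129, 82, 141,
          136, 105, 155, 131, 71, 109, 118, 53, 103, 107, 84, 82, 66]
asia = [181.6, 118, 132, 34, 5.1, 90, 76, 23, 26, 72, 129, 105.7, 49,
        11.5, 19.4, 27, 64]
middle_east = [13, 110.1, 68, 10.7, 42, 11.6, 46, 39, 70, 71, 28]


def one_way_anova(groups):
    """Return the entries of a one-way ANOVA table (hypothetical helper)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    # Between-group (treatment) and within-group (error) sums of squares.
    ss_treatment = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)

    df_treatment = len(groups) - 1
    df_error = len(all_values) - len(groups)

    ms_treatment = ss_treatment / df_treatment
    ms_error = ss_error / df_error

    return {
        "ss_treatment": ss_treatment,
        "ss_error": ss_error,
        "ss_total": ss_total,
        "df_treatment": df_treatment,
        "df_error": df_error,
        "df_total": df_treatment + df_error,
        "ms_treatment": ms_treatment,
        "ms_error": ms_error,
        "f": ms_treatment / ms_error,
    }


table = one_way_anova([africa, asia, middle_east])
for name, value in table.items():
    print(f"{name}: {value:.3f}")
```

The F value is then compared with the critical value of an F distribution on (df_treatment, df_error) degrees of freedom. If SciPy is available, `scipy.stats.f_oneway` should return the same F statistic together with a p-value.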
2)
For which of the following would it be most appropriate to use an ANOVA to analyze the data?
In: Statistics and Probability
The following three independent random samples are obtained from
three normally distributed populations with equal variance. The
dependent variable is starting hourly wage, and the groups are the
types of position (internship, co-op, work study). We are testing
the claim that the starting salaries for new college graduates are
different depending on the position at α = 0.2, given the
following data.
Group 1: Internship | Group 2: Co-op | Group 3: Work Study |
---|---|---|
10 | 11.25 | 16 |
14.75 | 13 | 14 |
10.5 | 13.5 | 14 |
9.5 | 17.75 | 13 |
14.75 | 8.5 | 16.5 |
14 | 10 | 16 |
15 | 14 | 13.5 |
11 | 14.25 | 12 |
12.75 | 12.5 | 15.75 |
11.25 | 13.25 | 16.25 |
In: Statistics and Probability
PLEASE EXPLAIN. THANK YOU!
Data Activity
1. A random sample of 36 skeletal remains from females was taken from data stored in the Forensic Anthropology Data Bank (FDB) at the University of Tennessee. The femur lengths (right leg) in millimeters are recorded below.
432 |
432 |
435 |
460 |
432 |
440 |
448 |
449 |
434 |
443 |
525 |
451 |
448 |
443 |
450 |
467 |
436 |
423 |
475 |
435 |
433 |
438 |
453 |
438 |
435 |
413 |
439 |
442 |
507 |
424 |
468 |
419 |
434 |
483 |
448 |
514 |
b. Since the sample size is large, we can use the sample standard deviation s in place of σ in calculations of confidence intervals.
c. Before doing any calculations, think about 90%, 95% and 99% confidence intervals for µ, the mean femur bone length for women. Which of these intervals would be the widest? Which would be the narrowest? Explain how you know without calculating the confidence intervals.
d. Calculate 90%, 95%, and 99% confidence intervals for µ, the mean femur bone length for adult females. Do your results confirm your answer to (c)?
e. Redo the 95% confidence interval using the 68-95-99.7 Rule.
Comment on the difference between this and the answer you got in
part d.
f. How much can a single outlier affect a confidence interval? Suppose that the first observation of 432 millimeters had been mistakenly entered as 4.32 millimeters.
(i) Make a boxplot of the modified data set to show that this short femur length is an outlier.
(ii) Recalculate the 95% confidence interval based on the modified data. How much did the outlier affect the confidence interval?
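A minimal sketch of parts (b) through (d), assuming the large-sample z-interval x̄ ± z·s/√n with s standing in for σ as part (b) suggests. The list is transcribed from the data above; the names `femur_mm` and `z_interval` are illustrative only.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Femur lengths (mm) transcribed from the data above.
femur_mm = [432, 432, 435, 460, 432, 440, 448, 449, 434, 443, 525, 451,
            448, 443, 450, 467, 436, 423, 475, 435, 433, 438, 453, 438,
            435, 413, 439, 442, 507, 424, 468, 419, 434, 483, 448, 514]

n = len(femur_mm)
x_bar = mean(femur_mm)
s = stdev(femur_mm)          # sample standard deviation (n - 1 denominator)
standard_error = s / sqrt(n)


def z_interval(confidence):
    """Two-sided z-interval for the mean at the given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return (x_bar - z * standard_error, x_bar + z * standard_error)


intervals = {level: z_interval(level) for level in (0.90, 0.95, 0.99)}
for level, (low, high) in intervals.items():
    print(f"{level:.0%} CI: ({low:.2f}, {high:.2f}), width {high - low:.2f}")
```

Because only the multiplier z changes between levels, the widths must grow with the confidence level, which is the point of part (c). Part (e) amounts to replacing z with 2 for the 95% interval, and part (f) can be explored by changing the first entry of the list to 4.32 and rerunning.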
In: Statistics and Probability
A tire manufacturer produces tires that have a mean life of at least 30000 miles when the production process is working properly. The operations manager stops the production process if there is evidence that the mean tire life is below 30000 miles. The testable hypotheses in this situation are H0: μ = 30000 vs. HA: μ < 30000.
1. Identify the consequences of making a Type I error.
A. The manager does not stop production when it is necessary.
B. The manager does not stop production when it is not necessary.
C. The manager stops production when it is not necessary.
D. The manager stops production when it is necessary.
2. Identify the consequences of making a Type II error.
A. The manager does not stop production when it is not necessary.
B. The manager stops production when it is not necessary.
C. The manager stops production when it is necessary.
D. The manager does not stop production when it is necessary.
To monitor the production process, the operations manager takes a random sample of 15 tires each week and subjects them to destructive testing. They calculate the mean life of the tires in the sample, and if it is less than 28500, they will stop production and recalibrate the machines. They know based on past experience that the standard deviation of the tire life is 2000 miles.
3. What is the probability that the manager will make a Type I error using this decision rule? Round your answer to four decimal places.
4. Using this decision rule, what is the power of the test if the actual mean life of the tires is 28600 miles? That is, what is the probability they will reject H0 when the actual average life of the tires is 28600 miles? Round your answer to four decimal places.
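Questions 3 and 4 both reduce to one normal probability, P(x̄ < 28500), evaluated under two different true means. The sketch below assumes the sample mean is normally distributed with standard deviation σ/√n, and that the Type I error rate is evaluated at the boundary value μ = 30000. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma = 2000      # known standard deviation of tire life (miles)
n = 15            # tires tested each week
cutoff = 28500    # production stops if the sample mean falls below this

standard_error = sigma / sqrt(n)

# Type I error: stopping production although the true mean is 30000.
type_i_error = NormalDist(mu=30000, sigma=standard_error).cdf(cutoff)

# Power: stopping production when the true mean is actually 28600.
power = NormalDist(mu=28600, sigma=standard_error).cdf(cutoff)

print(f"P(Type I error) = {type_i_error:.4f}")
print(f"Power at mu = 28600: {power:.4f}")
```

The same cutoff produces a very small Type I error rate and only moderate power here, because 28600 sits just above the cutoff of 28500.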
In: Statistics and Probability
Because of the relatively high interest rates, most consumers
attempt to pay off their credit card bills promptly. However, this
is not always possible. An analysis of the amount of interest paid
monthly by a bank’s Visa cardholders reveals that the amount is
normally distributed with a mean of $27 and a standard deviation of
$7.
(a) What proportion of the bank’s Visa cardholders pay more than
$30 in interest?
(b) What proportion of the bank’s Visa cardholders pay more than
$40 in interest?
(c) What proportion of the bank’s Visa cardholders pay less than
$15 interest?
(d) What interest payment is exceeded by only 20% of the bank’s
Visa cardholders?
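Parts (a) through (d) can be checked with the standard library's `statistics.NormalDist`, assuming the stated model of a normal distribution with mean $27 and standard deviation $7. The variable names are illustrative.

```python
from statistics import NormalDist

interest = NormalDist(mu=27, sigma=7)

p_more_than_30 = 1 - interest.cdf(30)   # (a)
p_more_than_40 = 1 - interest.cdf(40)   # (b)
p_less_than_15 = interest.cdf(15)       # (c)

# (d) The payment exceeded by only 20% of cardholders is the 80th percentile.
exceeded_by_20_percent = interest.inv_cdf(0.80)

print(f"(a) {p_more_than_30:.4f}")
print(f"(b) {p_more_than_40:.4f}")
print(f"(c) {p_less_than_15:.4f}")
print(f"(d) ${exceeded_by_20_percent:.2f}")
```

Answers read from a printed z-table may differ in the third or fourth decimal place because table lookups round z to two decimals.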
In: Statistics and Probability
Anecdotal evidence has suggested that a specific type of oral contraceptive pill puts women at greater risk for blood clots. Researchers decide to examine this scientifically by starting a prospective cohort study. They enroll women between the ages of 15 and 45 who are using this type of oral contraceptive pill as well as similar women who are not using this contraceptive pill. At baseline none of the women had ever had a blood clot. Then they follow these study participants for 5 years, following up with them once a year to determine if they suffered from a blood clot. At the end of 5 years the researchers report the following information: Out of a total of 6000 women that were taking the oral contraceptive of interest, 575 had reported blood clots. Of the 7000 women not taking the oral contraceptive of interest, 250 reported a blood clot.
A. Create an appropriate 2x2 table for this data. (Fill out the chart)
| Blood Clot | No Blood Clot |
Oral Contraceptive (Exposure) | | |
No Oral Contraceptive (No Exposure) | | |
B. Calculate the relative risk of having a blood clot for women taking the oral contraceptive pill in question compared to those not taking the contraceptive pill in question (show steps).
C. Assume that this RR is significant. What does this RR mean (be specific using the context of this study)?
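For parts A and B, the 2x2 counts and the relative risk follow directly from the four numbers given in the study description. This is a small sketch with illustrative variable names; relative risk is taken as the risk in the exposed group divided by the risk in the unexposed group.

```python
# Counts taken from the study description above.
clots_exposed, n_exposed = 575, 6000
clots_unexposed, n_unexposed = 250, 7000

# 2x2 table: (blood clot, no blood clot) for each exposure group.
two_by_two = {
    "Oral Contraceptive (Exposure)": (clots_exposed, n_exposed - clots_exposed),
    "No Oral Contraceptive (No Exposure)": (clots_unexposed,
                                            n_unexposed - clots_unexposed),
}

risk_exposed = clots_exposed / n_exposed
risk_unexposed = clots_unexposed / n_unexposed
relative_risk = risk_exposed / risk_unexposed

for group, (clot, no_clot) in two_by_two.items():
    print(f"{group}: {clot} clots, {no_clot} without")
print(f"Risk (exposed)   = {risk_exposed:.4f}")
print(f"Risk (unexposed) = {risk_unexposed:.4f}")
print(f"Relative risk    = {relative_risk:.2f}")
```

A relative risk above 1 means the five-year risk of a blood clot was higher among the women taking the pill than among those who were not, by that multiplicative factor.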
In: Statistics and Probability
A chef is opening a new restaurant in Portland, Oregon, and wants to know how many vegan meals he should prepare. 300 people will attend opening night. He knows that about 4% of people in Portland are vegan, and the approximate population of Portland is 600,000. You do NOT need to check CLT here.
A. What is the probability that less than 4.5% of the people attending opening night will be vegans?
B. What is the probability that more than 3.5% of the people attending opening night will be vegans?
C. What is the probability that between 5.5% and 6% of the people attending opening night will be vegans?
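A sketch of the three probabilities, assuming the sample proportion p̂ is approximately normal with mean p = 0.04 and standard error √(p(1 − p)/n) for n = 300, which is the approximation the question says does not need checking. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p = 0.04   # proportion of vegans in Portland
n = 300    # people attending opening night

standard_error = sqrt(p * (1 - p) / n)
p_hat = NormalDist(mu=p, sigma=standard_error)

prob_a = p_hat.cdf(0.045)                      # A: less than 4.5%
prob_b = 1 - p_hat.cdf(0.035)                  # B: more than 3.5%
prob_c = p_hat.cdf(0.06) - p_hat.cdf(0.055)    # C: between 5.5% and 6%

print(f"A: {prob_a:.4f}")
print(f"B: {prob_b:.4f}")
print(f"C: {prob_c:.4f}")
```

A and B come out equal because 4.5% and 3.5% are the same distance from 4% on opposite sides of the mean.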
In: Statistics and Probability
Jan | 523.9 |
Feb | 510.23 |
March | 545.65 |
April | 514.23 |
May | 550.71 |
June | 505.34 |
July | 525.34 |
August | 600 |
Sept. | 569.42 |
October | 500.2 |
Novem. | 533.12 |
December | 507.11 |
What percentage of your bills are within two standard deviations of the mean?
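One way to check this is to compute the mean and standard deviation of the twelve bills and count how many fall inside mean ± 2 standard deviations. The sketch assumes the sample standard deviation (n − 1 denominator) is intended; the list is transcribed from the table above and the variable names are illustrative.

```python
from statistics import mean, stdev

# Monthly bills, January through December, transcribed from the table above.
bills = [523.9, 510.23, 545.65, 514.23, 550.71, 505.34,
         525.34, 600, 569.42, 500.2, 533.12, 507.11]

average = mean(bills)
sd = stdev(bills)   # sample standard deviation (assumed; n - 1 denominator)

low, high = average - 2 * sd, average + 2 * sd
within = [b for b in bills if low <= b <= high]
percent_within = 100 * len(within) / len(bills)

print(f"mean = {average:.2f}, sd = {sd:.2f}")
print(f"interval = ({low:.2f}, {high:.2f})")
print(f"{len(within)} of {len(bills)} bills ({percent_within:.1f}%) are within 2 sd")
```

Using the population standard deviation (`statistics.pstdev`) narrows the interval slightly, so it is worth confirming which definition the course expects.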
In: Statistics and Probability
1. Identify which test the scenario will use: independent samples t test, paired samples t test, chi square test for goodness of fit, chi square test for independence or chi square test for homogeneity.
A. A researcher goes to the parking lot at a large grocery chain and observes whether each person is male or female and whether they return the cart to the correct spot before leaving (yes or no).
B. Amber Sanchez, a statistics student, collected data on the prices of the same items at the Navy commissary on the naval base in Ventura County, California, and a nearby Kmart. The items were matched for content, manufacturer, and size and were priced separately.
C. A random survey of automobiles parked in the student lot and the staff lot at a large university classified the brands as either domestic or foreign.
D. Surfers and statistics students Rex Robinson and Sandy Hudson collected data on the number of days on which surfers surfed in the last month for 30 random longboard users and 30 random shortboard users. Test the hypothesis that the mean days surfed for all longboarders is larger than the mean days surfed for all shortboarders (because longboards can go out in many different surfing conditions).
E. Suppose you have a random sample of students attending a public university in Nevada and want to determine whether the racial distribution of students is different from the racial distribution in the state as a whole.
F. Compare the weekday and weekend/holiday hours of sleep. Each pair of numbers is from one randomly selected person.
G. Students observe the number of office hours posted for a random sample of tenured and a random sample of untenured professors.
H. Based on a random sample of students at a university, you wish to determine if there is an association between whether or not a student is a transfer student and whether he or she belongs to an on-campus club.
In: Statistics and Probability
For this problem, carry at least four digits after the decimal
in your calculations. Answers may vary slightly due to
rounding.
The National Council of Small Businesses is interested in the
proportion of small businesses that declared Chapter 11 bankruptcy
last year. Since there are so many small businesses, the National
Council intends to estimate the proportion from a random sample.
Let p be the proportion of small businesses that declared
Chapter 11 bankruptcy last year.
(a) If no preliminary sample is taken to estimate p,
how large a sample is necessary to be 99% sure that a point
estimate p̂ will be within a distance of 0.11 from
p? (Round your answer up to the nearest whole
number.)
_________small businesses
(b) In a preliminary random sample of 30 small businesses, it was
found that six had declared Chapter 11 bankruptcy. How many
more small businesses should be included in the sample to
be 99% sure that a point estimate p̂ will be within a
distance of 0.110 from p? (Round your answer up to the
nearest whole number.)
____________more small businesses
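Both parts use the sample-size formula n = p̂(1 − p̂)(z/E)², with p̂ = 0.5 as the conservative choice when no preliminary estimate exists. The sketch below assumes that formula and takes z from the exact normal quantile; the function name `sample_size` is illustrative.

```python
from math import ceil
from statistics import NormalDist


def sample_size(margin, confidence, p_hat=0.5):
    """Smallest n so that p-hat is within `margin` of p at the given confidence.

    p_hat = 0.5 is the conservative default when no preliminary estimate exists.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(p_hat * (1 - p_hat) * (z / margin) ** 2)


# (a) No preliminary sample.
n_no_preliminary = sample_size(margin=0.11, confidence=0.99)

# (b) Preliminary sample: 6 of 30 businesses declared bankruptcy.
preliminary_n = 30
n_with_preliminary = sample_size(margin=0.11, confidence=0.99, p_hat=6 / 30)
additional_needed = max(n_with_preliminary - preliminary_n, 0)

print(f"(a) sample size needed: {n_no_preliminary}")
print(f"(b) total needed: {n_with_preliminary}, additional: {additional_needed}")
```

Textbooks that round the critical value to z = 2.58 can land one unit higher in part (b), so the expected answer depends on which z the course uses.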
In: Statistics and Probability
SAS questions
What options would you specify to direct a proc print statement to print only observations 5 through 10?
How do you get the last word of a text string? Write a statement to show your answer.
Describe the difference between using proc means and the mean function to compute a mean.
Describe the role of the input and infile statements in a data step that reads an external data file.
Write the SAS code required to write the SAS data set dogs_data (assume it is in the work library) to an Excel file called dogsdatafile.xlsx in a worksheet called Dogs Data.
In: Statistics and Probability
The production of a nationally marketed detergent results in certain workers receiving prolonged exposures to a Bacillus subtilis enzyme. Nineteen workers were tested to determine the effects of those exposures, if any, on various respiratory functions. One such function, airflow rate, is measured by computing the ratio of a person’s forced expiratory volume (FEV) to his or her vital capacity (VC). (Vital capacity is the maximum volume of air a person can exhale after taking as deep a breath as possible; FEV is the maximum volume of air a person can exhale in one second.) In persons with no lung dysfunction, the “norm” for FEV/VC ratios is 0.80. Assume that the FEV/VC ratios are known to be normally distributed. For the 19 workers in the study, the mean FEV/VC ratio was 0.766 with standard deviation 0.0859.
a. Based on your result, is it believable that exposure to Bacillus subtilis enzyme has no effect on the FEV/VC ratio? Please compute the p-value for the test in two different ways: (i) assuming σ = 0.09 and (ii) assuming σ is unknown.
b. Based on this data, is it believable that σ = 0.09? Conduct a hypothesis test to answer this question.
In: Statistics and Probability
Wait-Times (Raw Data, Software Required):
There are three registers at the local grocery store. I suspect the
mean wait-times for the registers are different. The sample data is
depicted below. It gives the wait-times in minutes.
Register 1 | Register 2 | Register 3 |
2.0 | 1.8 | 2.1 |
2.0 | 2.0 | 2.1 |
1.1 | 2.2 | 1.8 |
2.0 | 1.9 | 1.5 |
1.0 | 1.8 | 1.4 |
2.0 | 2.1 | 1.4 |
1.0 | 2.2 | 2.0 |
1.5 | 1.7 | 1.9 |
The Test: Complete the steps in testing the claim that there is a difference in mean wait-times between the registers.
(a) What is the null hypothesis for this test?
H0: μ1 ≠ μ2 ≠ μ3.
H0: At least one of the population means is different from the others.
H0: μ1 = μ2 = μ3.
H0: μ2 > μ3 > μ1.
(b) What is the alternate hypothesis for this test?
H1: μ2 > μ3 > μ1.
H1: μ1 = μ2 = μ3.
H1: At least one of the population means is different from the others.
H1: μ1 ≠ μ2 ≠ μ3.
(c) Use software to get the P-value of the test statistic (F).
Round to 4 decimal places unless your
software automatically rounds to 3 decimal places.
P-value =
(d) What is the conclusion regarding the null hypothesis at the
0.01 significance level?
reject H0
fail to reject H0
(e) Choose the appropriate concluding statement.
We have proven that all of the mean wait-times are the same.
There is sufficient evidence to conclude that the mean wait-times are different.
There is not enough evidence to conclude that the mean wait-times are different.
(f) Does your conclusion change at the 0.10 significance level?
Yes
No
In: Statistics and Probability
Grades and AM/PM Section of Stats: There were two large sections of statistics this term at State College, an 8:00 (AM) section and a 1:30 (PM) section. The final grades for both sections are depicted in the contingency table below.
Observed Frequencies: Oi's
| A | B | C | D | F | Totals |
AM | 6 | 11 | 19 | 20 | 15 | 71 |
PM | 19 | 19 | 19 | 13 | 7 | 77 |
Totals | 25 | 30 | 38 | 33 | 22 | 148 |
The Test: Test for a significant dependent
relationship between grades and the section of the course. Conduct
this test at the 0.05 significance level.
(a) What is the test statistic? Round your answer to 3 decimal places.
χ2 =
(b) What is the conclusion regarding the null hypothesis?
reject H0
fail to reject H0
(c) Choose the appropriate concluding statement.
We have proven that grades and section of the course are independent.
The evidence suggests that there is a significant dependent relationship between grades and the section of the course.
There is not enough evidence to conclude that there is a significant dependent relationship between grades and the section of the course.
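The test statistic in part (a) is the usual chi-square sum of (O − E)²/E over all cells, with each expected count E computed as (row total × column total) / grand total. Here is a minimal sketch in plain Python using the observed counts from the table above. The critical value 9.488 is the standard table value for 4 degrees of freedom at the 0.05 level; the variable names are illustrative.

```python
# Observed grade counts (A, B, C, D, F) from the contingency table above.
observed = {
    "AM": [6, 11, 19, 20, 15],
    "PM": [19, 19, 19, 13, 7],
}

rows = list(observed.values())
row_totals = [sum(row) for row in rows]
col_totals = [sum(col) for col in zip(*rows)]
grand_total = sum(row_totals)

# Sum (O - E)^2 / E over every cell, with E = row total * column total / N.
chi_square = 0.0
for row, row_total in zip(rows, row_totals):
    for obs, col_total in zip(row, col_totals):
        expected = row_total * col_total / grand_total
        chi_square += (obs - expected) ** 2 / expected

degrees_of_freedom = (len(rows) - 1) * (len(col_totals) - 1)

# Table value for chi-square with 4 degrees of freedom at alpha = 0.05.
CRITICAL_VALUE_05_DF4 = 9.488
reject_h0 = chi_square > CRITICAL_VALUE_05_DF4

print(f"chi-square = {chi_square:.3f}, df = {degrees_of_freedom}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

If SciPy is available, `scipy.stats.chi2_contingency` applied to the same 2x5 table should report the same statistic along with a p-value.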
In: Statistics and Probability