The distribution of the number of eggs laid by a certain species of hen during their breeding period has a mean of 36 eggs with a standard deviation of 18.3. Suppose a group of researchers randomly samples 47 hens of this species, counts the number of eggs laid during their breeding period, and records the sample mean. They repeat this 1,000 times, and build a distribution of sample means.
A) What is this distribution called?
B) Would you expect the shape of this distribution to be symmetric, right skewed, or left skewed? Explain your reasoning.
- Left skewed, because the population distribution is left skewed.
- Left skewed, because according to the central limit theorem this distribution is approximately normal.
- Left skewed, because the population standard deviation is smaller than the population mean.
- Symmetric, because the population distribution is symmetric.
- Symmetric, because according to the central limit theorem this distribution is approximately normal.
- Symmetric, because the population standard deviation is smaller than the population mean.
- Right skewed, because the population distribution is right skewed.
- Right skewed, because according to the central limit theorem this distribution is approximately normal.
- Right skewed, because the population standard deviation is smaller than the population mean.
C) Calculate the standard deviation of this distribution (i.e. the standard error).
D) Suppose the researchers' budget is reduced and they are only able to collect random samples of 10 hens. The sample mean of the number of eggs is recorded, and we repeat this 1,000 times, and build a new distribution of sample means. What would be the standard error of this new distribution?
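Parts C and D both rest on the formula for the standard error of a sample mean, sigma / sqrt(n). A minimal R sketch of that arithmetic follows; the helper name `se` is hypothetical and is not part of the question.

```r
# Standard error of the sample mean: sigma / sqrt(n)
sigma <- 18.3                        # population standard deviation (from the problem)
se <- function(n) sigma / sqrt(n)    # hypothetical helper, introduced for illustration

se_47 <- se(47)   # samples of 47 hens (part C)
se_10 <- se(10)   # samples of 10 hens (part D)

print(round(c(n47 = se_47, n10 = se_10), 3))
```

The smaller sample gives the larger standard error, because n appears in the denominator.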
In: Math
A study examined the average pay for men and women entering the workforce as doctors for 21 different positions.
(a) If each gender was equally paid, then we would expect about half of those positions to have men paid more than women and women would be paid more than men in the other half of positions. Write appropriate hypotheses to test this scenario.
(b) Men were, on average, paid more in 19 of those 21 positions. Complete a hypothesis test using your hypotheses from part (a).
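One way to approach part (b) is an exact binomial (sign-type) test, where under the null hypothesis each position is equally likely to favor either gender. The sketch below assumes a two-sided alternative; whether the test should be one- or two-sided depends on the hypotheses written in part (a).

```r
# Exact binomial test of H0: p = 0.5, where p is the proportion of
# positions in which men are paid more than women
n_positions <- 21
men_higher  <- 19

# Assumption: two-sided alternative (H1: p != 0.5)
res <- binom.test(men_higher, n_positions, p = 0.5, alternative = "two.sided")

print(res$p.value)
```

A small p-value here would indicate that 19 of 21 is far from what equal pay would produce by chance.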
In: Math
As part of a study on transportation safety, the U.S. Department of Transportation collected data on the number of fatal accidents per 1000 licenses and the percentage of licensed drivers under the age of 21 in a sample of 42 cities. Data collected over a one-year period follow. These data are contained in the file named “Safety.csv”.
1- Find the sample mean and standard deviation for each variable. Round your answers to the nearest thousandth.
2- Use the function lm() in R to run a simple linear regression model on the data provided. Use the function summary() in R to generate the regression output. Use the function aov() in R to generate the corresponding ANOVA table. You ought to be able to determine which is the dependent variable and which is the independent variable in this SLR model.
Please copy your R code and the result and paste them here.
3- Write down the estimated regression function below and provide a practical interpretation of the coefficient of the independent variable.
4- Please find a 95% confidence interval for the coefficient of the independent variable and provide a practical interpretation of this interval.
5- At the 5% level of significance, is there a significant relationship between the two variables? Why or why not?
6- What is the value of the coefficient of determination for this simple linear regression model? Provide a brief interpretation of this value.
7- Use the information from the ANOVA table to compute the standard error of estimate, a.k.a. the residual standard error. This value must match the residual standard error in the regression summary.
8- What is the point estimate of the expected number of fatal accidents per 1000 licenses if 10% of the drivers in a city are under age 21?
9- Suppose we want to develop a 95% confidence interval for the average number of fatal accidents per 1000 licenses for all the cities with 10% of drivers under age 21. What is the estimate of the standard deviation for this confidence interval?
10- Suppose we want to develop a 95% confidence interval for the average number of fatal accidents per 1000 licenses for all the cities with 10% of drivers under age 21. Compute the t value and the margin of error needed for this confidence interval.
Please copy your R code and the result and paste them here.
11- Provide a 95% confidence interval for the average number of fatal accidents per 1000 licenses for all the cities with 10% of drivers under age 21 and a practical interpretation of this confidence interval.
12- Suppose we want to develop a 95% prediction interval for the number of fatal accidents per 1000 licenses for a city with 10% of drivers under age 21. What is the estimate of the standard deviation for this prediction interval?
13- Suppose we want to develop a 95% prediction interval for the number of fatal accidents per 1000 licenses for a city with 10% of drivers under age 21. Compute the margin of error needed for this prediction interval.
14- Provide a 95% prediction interval for the number of fatal accidents per 1000 licenses for a city with 10% of drivers under age 21 and a practical interpretation of this prediction interval.
| Percent Under 21 | Fatal Accidents per 1000 |
| 13 | 2.962 |
| 12 | 0.708 |
| 8 | 0.885 |
| 12 | 1.652 |
| 11 | 2.091 |
| 17 | 2.627 |
| 18 | 3.83 |
| 8 | 0.368 |
| 13 | 1.142 |
| 8 | 0.645 |
| 9 | 1.028 |
| 16 | 2.801 |
| 12 | 1.405 |
| 9 | 1.433 |
| 10 | 0.039 |
| 9 | 0.338 |
| 11 | 1.849 |
| 12 | 2.246 |
| 14 | 2.855 |
| 14 | 2.352 |
| 11 | 1.294 |
| 17 | 4.1 |
| 8 | 2.19 |
| 16 | 3.623 |
| 15 | 2.623 |
| 9 | 0.835 |
| 8 | 0.82 |
| 14 | 2.89 |
| 8 | 1.267 |
| 15 | 3.224 |
| 10 | 1.014 |
| 10 | 0.493 |
| 14 | 1.443 |
| 18 | 3.614 |
| 10 | 1.926 |
| 14 | 1.643 |
| 16 | 2.943 |
| 12 | 1.913 |
| 15 | 2.814 |
| 13 | 2.634 |
| 9 | 0.926 |
| 17 | 3.256 |
P.S. I do appreciate your help, but please do not simply copy and paste an irrelevant answer. Thanks.
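A possible starting point for the regression steps is sketched below. It assumes fatal accidents per 1000 licenses is the dependent variable, and it types the table in directly instead of reading "Safety.csv", whose column names are not shown. The names `pct_under21` and `fatal_per_1000` are therefore hypothetical.

```r
# Data typed in from the table above (column names are assumptions)
safety <- data.frame(
  pct_under21 = c(13, 12, 8, 12, 11, 17, 18, 8, 13, 8, 9, 16, 12, 9,
                  10, 9, 11, 12, 14, 14, 11, 17, 8, 16, 15, 9, 8, 14,
                  8, 15, 10, 10, 14, 18, 10, 14, 16, 12, 15, 13, 9, 17),
  fatal_per_1000 = c(2.962, 0.708, 0.885, 1.652, 2.091, 2.627, 3.83,
                     0.368, 1.142, 0.645, 1.028, 2.801, 1.405, 1.433,
                     0.039, 0.338, 1.849, 2.246, 2.855, 2.352, 1.294,
                     4.1, 2.19, 3.623, 2.623, 0.835, 0.82, 2.89,
                     1.267, 3.224, 1.014, 0.493, 1.443, 3.614, 1.926,
                     1.643, 2.943, 1.913, 2.814, 2.634, 0.926, 3.256)
)

# Step 1: sample mean and standard deviation of each variable
print(round(sapply(safety, mean), 3))
print(round(sapply(safety, sd), 3))

# Step 2: simple linear regression, summary output, and ANOVA table
fit <- lm(fatal_per_1000 ~ pct_under21, data = safety)
print(summary(fit))
print(summary(aov(fatal_per_1000 ~ pct_under21, data = safety)))

# Step 4: 95% confidence interval for the coefficients
print(confint(fit, level = 0.95))

# Step 7: residual standard error recovered from the ANOVA mean square error
rse_from_anova <- sqrt(anova(fit)["Residuals", "Mean Sq"])

# Steps 8-14: point estimate and intervals for a city with 10% of drivers under 21
new_city <- data.frame(pct_under21 = 10)
mean_ci  <- predict(fit, new_city, interval = "confidence", level = 0.95)
pred_int <- predict(fit, new_city, interval = "prediction", level = 0.95)
print(mean_ci)
print(pred_int)
```

The prediction interval is wider than the confidence interval for the mean because it also accounts for the variability of an individual city around the regression line.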
In: Math
One of the major measures of the quality of service provided by any organization is the speed with which it responds to customer complaints. A large family-held department store selling furniture and flooring, including carpet, had undergone a major expansion in the past several years. In particular, the flooring department had expanded from 2 installation crews to an installation supervisor, a measurer, and 15 installation crews. The store had the business objective of improving its response to complaints. The variable of interest was defined as the number of days between when the complaint was made and when it was resolved. Data were collected from 40 complaints that were made in the past year (furniture2.xlsx).
(a) The installation supervisor claims that the mean number of days between the receipt of a complaint and the resolution of the complaint is 30 days. To test the claim, state the null and alternative hypotheses (for a two-tail test).
(b) To conduct a two-tail t test based on the hypotheses in (a), identify the rejection regions (both tails) at the 99% confidence level (0.01 level of significance).
(c) Using the given data, compute the test statistic for the t test.
(d) At the 0.01 level of significance, should the claim be rejected (i.e., is the mean number of days different from 30)? In the critical-value approach, what is your conclusion based on (b) and (c)? Explain.
(e) Using the test statistic in (c), determine the p-value for the t test.
(f) In the p-value approach, what is your conclusion based on (e)? Explain.
| Days |
| 65 |
| 43 |
| 35 |
| 137 |
| 31 |
| 27 |
| 152 |
| 22 |
| 123 |
| 81 |
| 74 |
| 27 |
| 11 |
| 19 |
| 126 |
| 110 |
| 110 |
| 29 |
| 61 |
| 35 |
| 94 |
| 31 |
| 26 |
| 5 |
| 12 |
| 4 |
| 165 |
| 32 |
| 29 |
| 28 |
| 29 |
| 26 |
| 25 |
| 1 |
| 14 |
| 13 |
| 13 |
| 10 |
| 5 |
| 27 |
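The critical-value and p-value approaches can both be sketched in R with `qt()` and `t.test()`. The data below are typed in from the table above instead of being read from furniture2.xlsx; the variable name `days` is an assumption.

```r
# Days to resolve each of the 40 complaints (typed in from the table)
days <- c(65, 43, 35, 137, 31, 27, 152, 22, 123, 81,
          74, 27, 11, 19, 126, 110, 110, 29, 61, 35,
          94, 31, 26, 5, 12, 4, 165, 32, 29, 28,
          29, 26, 25, 1, 14, 13, 13, 10, 5, 27)

alpha <- 0.01
n     <- length(days)

# Two-tail critical value: reject H0 if |t| exceeds t_crit
t_crit <- qt(1 - alpha / 2, df = n - 1)

# One-sample t test of H0: mu = 30 against H1: mu != 30
res <- t.test(days, mu = 30, alternative = "two.sided", conf.level = 1 - alpha)

print(c(t_stat = unname(res$statistic), t_crit = t_crit, p_value = res$p.value))
```

Comparing `abs(t_stat)` with `t_crit` and comparing `p_value` with `alpha` always lead to the same decision, so parts (d) and (f) should agree.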
In: Math
What are the conditions and assumptions necessary to construct a confidence interval for the mean and, separately, for the variance?
In: Math
Below is a list of the profit or (loss) of 40 companies.
- Calculate the population parameters for all 40 companies' profit or loss
- Select a sample of 10
- Calculate the point estimators of the sample
- Compare the results (Point estimators with Population Parameters)
$ In Millions
$9,862 $19,710
$44,940 $48,351
$10,558 $5,070
$6,662 $3,033
$29,450 ($3,864)
$7,602 $364.5
$9,195 $1,288
$2,679 $30,101
$1,907 ($5,786)
$4,078 $24,441
$2,463 $12,662
$8,630 $18,232
$4,517.4 $22,183
$8,197 $5,106
$3,842.8 $21,204
$4,065 $22,714
$2,997 $9,609
$246.5 $4,286
$3,577 $2,736
$2,421.9 $1,982
In: Math
Question 3 This question is testing your understanding of some important concepts about hypothesis testing and confidence intervals. For each part below, answer the question (1 mark) and then succinctly explain your reasoning (1 mark).
(a) We are performing a one-sample t test at the 5% level of significance where the hypotheses are H0: µ = 5 vs. H1: µ ≠ 5. The number of observations is 15. State the critical value.
(b) We are performing a hypothesis test and we conclude that we reject H0 at the 5% level of significance. Will we reject the same H0 (with the same H1) at the 10% level of significance?
(c) Suppose we are performing a hypothesis test and we conclude that we cannot reject H0 at the 5% level of significance. Can we reject the same H0 (with the same H1) at the 10% level of significance?
(d) Suppose we are performing a two-sample proportion test at the 5% level of significance where the hypotheses are H0: p1 − p2 = 0 vs. H1: p1 − p2 ≠ 0. The calculated p-value is 0.00268. Do we reject H0?
(e) Based on the data, we obtain (1.85, 1.95) as the 95% confidence interval for the true mean. Can we reject H0: µ = 0 against H1: µ ≠ 0 at the 5% level of significance?
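The numerical checks behind parts (a), (d), and (e) can be sketched in R as follows. The variable names are assumptions introduced for illustration; the reasoning for parts (b) and (c) is conceptual and is not computed here.

```r
alpha <- 0.05

# Part (a): two-sided critical value for a one-sample t test with n = 15
n      <- 15
t_crit <- qt(1 - alpha / 2, df = n - 1)
print(round(t_crit, 3))

# Part (d): reject H0 when the p-value is below the significance level
p_value  <- 0.00268
reject_d <- p_value < alpha

# Part (e): a two-sided test at level alpha rejects H0: mu = 0
# exactly when 0 lies outside the (1 - alpha) confidence interval
ci       <- c(1.85, 1.95)
reject_e <- !(ci[1] <= 0 && 0 <= ci[2])

print(c(reject_d = reject_d, reject_e = reject_e))
```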
In: Math
Problem 1.
(a) The response and factor columns can be defined in R as follows; use this code to solve the problem.
y <- c(2, 3, 10, 12, 8, 4, 11, 8)  # response: scores
a <- c("Heart", "Heart", "Soul", "Soul", "Heart", "Heart", "Soul", "Soul")  # factor A
b <- c("D", "D", "D", "D", "R", "R", "R", "R")  # factor B (group variable)
(b) Find the overall mean, row means, column means, and each cell mean for the table given in problem 1.
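One way to get the means in part (b) is with `tapply()`. Since the table itself is not shown, which factor forms the rows and which forms the columns is an assumption; the sketch below simply reports means by factor A, by factor B, and by each A-by-B cell.

```r
y <- c(2, 3, 10, 12, 8, 4, 11, 8)                                           # response: scores
a <- c("Heart", "Heart", "Soul", "Soul", "Heart", "Heart", "Soul", "Soul")  # factor A
b <- c("D", "D", "D", "D", "R", "R", "R", "R")                              # factor B

overall_mean <- mean(y)                       # grand mean of all scores
a_means      <- tapply(y, a, mean)            # means by level of factor A
b_means      <- tapply(y, b, mean)            # means by level of factor B
cell_means   <- tapply(y, list(a, b), mean)   # mean of each A-by-B cell

print(overall_mean)
print(a_means)
print(b_means)
print(cell_means)
```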
In: Math
A data set includes 103 body temperatures of healthy adult humans having a mean of 98.5°F and a standard deviation of 0.61°F. Construct a 99% confidence interval estimate of the mean body temperature of all healthy humans. What does the sample suggest about the use of 98.6°F as the mean body temperature?
What is the confidence interval estimate of the population mean μ?
___ °F < μ < ___ °F
(Round to three decimal places as needed.)
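Because only the sample standard deviation is given, a t-based interval is a natural choice here. The R sketch below computes it from the summary statistics; the variable names are assumptions.

```r
n    <- 103     # sample size
xbar <- 98.5    # sample mean (degrees F)
s    <- 0.61    # sample standard deviation (degrees F)
conf <- 0.99    # confidence level

# t critical value with n - 1 degrees of freedom
t_crit <- qt(1 - (1 - conf) / 2, df = n - 1)

# Margin of error and interval endpoints
moe <- t_crit * s / sqrt(n)
ci  <- c(lower = xbar - moe, upper = xbar + moe)
print(round(ci, 3))

# Does the interval contain the commonly cited 98.6 degrees F?
covers_986 <- ci[["lower"]] <= 98.6 && 98.6 <= ci[["upper"]]
print(covers_986)
```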
In: Math
Statistical tests that compare two means, or a sample mean to a population mean, have many organizational applications. For example, Human Resources may want to track entrance exam scores of their new hires. This would be an example of a two-mean comparison. In terms of the recent election, Gallup may take a sample and compare it to a population of candidate votes (sample mean compared to a population mean).
Think of an example in your organization of either one of these tests and discuss its application as well as the risk of type 1 or 2 errors.
In: Math
2. What is the sample size, n, for a 95% confidence interval on the mean, if we know that the process’ standard error is 3.2 units, and we want to allow at most 1.0 units for our error?
3. Let’s say that you just randomly pulled 32 widgets from your production line and you determined that you need a sample size of 46 widgets. However, you get delayed in being able to pull another bunch of widgets from the line until the start of the next day. How many widgets should you now pull for your analysis?
4. What is the sample size, n, for a 98% confidence interval on the mean, if we know that the process’ standard error is 3.2 units, and we want to allow at most 0.5 units for our error?
5. What is the sample size, n, for a 95% confidence interval on the mean, if we know that the process’ standard error is 3.2 units, and we want to allow at most 0.5 units for our error?
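Questions 2, 4, and 5 can be approached with the usual z-based sample-size formula, n = (z * sigma / E)^2, rounded up. The sketch below assumes the "standard error" quoted in the questions plays the role of the process standard deviation sigma; the function name `sample_size` is hypothetical.

```r
# Smallest n such that the z-based margin of error is at most `margin`
# Assumption: sigma is the process standard deviation
sample_size <- function(sigma, margin, conf) {
  z <- qnorm(1 - (1 - conf) / 2)      # two-sided z critical value
  ceiling((z * sigma / margin)^2)     # always round up to a whole unit
}

n_q2 <- sample_size(sigma = 3.2, margin = 1.0, conf = 0.95)
n_q4 <- sample_size(sigma = 3.2, margin = 0.5, conf = 0.98)
n_q5 <- sample_size(sigma = 3.2, margin = 0.5, conf = 0.95)

print(c(q2 = n_q2, q4 = n_q4, q5 = n_q5))
```

Halving the allowed error roughly quadruples the required sample size, since the margin enters the formula squared.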
In: Math
According to published reports, practice under fatigued conditions distorts mechanisms that govern performance. An experiment was conducted using 15 college males, who were trained to make a continuous horizontal right-to-left arm movement from a microswitch to a barrier, knocking over the barrier coincident with the arrival of a clock sweephand to the 6 o’clock position. The absolute value of the difference between the times, in milliseconds, that it took to knock over the barrier and the time for the sweephand to reach the 6 o’clock position (500 msec) was recorded. Each participant performed the task five times under prefatigue and postfatigue conditions, and the sums of the absolute differences for the five performances were recorded. The data can be found in the folder of this question.
a) (0.5 point) Read the data into R using the read.csv function. Note: Show your code but not the result/output.
b) (0.5 point) An increase in the mean absolute time difference when the task is performed under postfatigue conditions would support the claim that practice under fatigued conditions distorts mechanisms that govern performance. Assuming the populations to be normally distributed, write the two hypotheses of interest to test this claim.
c) (1 point) Use a suitable test in R to test your hypotheses in (b). Show your code and output, and use α = 0.05.
d) (1 point) Interpret your finding in (c).
data:
Prefatigue,Postfatigue
159,92
93,60
66,216
99,227
34,224
90,92
149,93
59,178
143,135
118,117
75,154
67,220
110,144
58,165
86,101
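Since each participant was measured under both conditions, a paired t test with a one-sided alternative is one reasonable choice for part (c). The sketch below types the data in directly so it runs on its own; in practice the data frame would come from `read.csv()` applied to the course file, whose name is not given here.

```r
# Data typed in from the listing above (would normally be read with read.csv)
fatigue <- data.frame(
  Prefatigue  = c(159, 93, 66, 99, 34, 90, 149, 59, 143, 118, 75, 67, 110, 58, 86),
  Postfatigue = c(92, 60, 216, 227, 224, 92, 93, 178, 135, 117, 154, 220, 144, 165, 101)
)

# H0: mean(post - pre) = 0   vs   H1: mean(post - pre) > 0
# Assumption: paired design, differences approximately normal
res <- t.test(fatigue$Postfatigue, fatigue$Prefatigue,
              paired = TRUE, alternative = "greater")
print(res)

alpha  <- 0.05
reject <- res$p.value < alpha   # decision at the 5% level
print(reject)
```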
In: Math
A machine that puts corn flakes into boxes is adjusted to put an average of 15.1 ounces into each box, with standard deviation of 0.23 ounce. If a random sample of 15 boxes gave a sample standard deviation of 0.35 ounce, do these data support the claim that the variance has increased and the machine needs to be brought back into adjustment? (Use a 0.01 level of significance.)
(i) Give the value of the level of significance.
State the null and alternate hypotheses.
H0: σ² < 0.0529; H1: σ² = 0.0529
H0: σ² = 0.0529; H1: σ² ≠ 0.0529
H0: σ² = 0.0529; H1: σ² < 0.0529
H0: σ² = 0.0529; H1: σ² > 0.0529
(ii) Find the sample test statistic. (Round your answer to two decimal places.)
(iii) Find or estimate the P-value of the sample test statistic.
P-value > 0.100
0.050 < P-value < 0.100
0.025 < P-value < 0.050
0.010 < P-value < 0.025
0.005 < P-value < 0.010
P-value < 0.005
(iv) Conclude the test.
Since the P-value ≥ α, we fail to reject the null hypothesis.
Since the P-value < α, we reject the null hypothesis.
Since the P-value < α, we fail to reject the null hypothesis.
Since the P-value ≥ α, we reject the null hypothesis.
(v) Interpret the conclusion in the context of the application.
At the 1% level of significance, there is sufficient evidence to conclude that the variance has increased and the machine needs to be adjusted.
At the 1% level of significance, there is insufficient evidence to conclude that the variance has increased and the machine needs to be adjusted.
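The test statistic for a single variance is chi-square = (n − 1) s² / σ0², compared with a chi-square distribution on n − 1 degrees of freedom. A minimal R sketch of parts (ii) and (iii), assuming a right-tailed test and a normally distributed population, with variable names chosen for illustration:

```r
n      <- 15      # number of boxes sampled
sigma0 <- 0.23    # standard deviation the machine is adjusted to (ounces)
s      <- 0.35    # sample standard deviation (ounces)
alpha  <- 0.01    # level of significance

# Chi-square statistic for a test about one variance
chi_sq <- (n - 1) * s^2 / sigma0^2

# Right-tail P-value, since the claim is that the variance has increased
p_value <- pchisq(chi_sq, df = n - 1, lower.tail = FALSE)

print(round(chi_sq, 2))
print(p_value)
print(p_value < alpha)   # TRUE means reject H0 at the 1% level
```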
In: Math
Results from the National Health Interview Survey show that among the U.S. adult population, 45.9% do not meet physical activity guidelines, 3.5% meet only strength activity, 29.0% meet only aerobic activity, and 21.6% meet both strength and aerobic activity. We sampled 4475 adults from Ohio and the results were as follows: 50.0% do not meet physical activity guidelines, 5.0% meet only strength activity, 35.0% meet only aerobic activity, and 10.0% meet both strength and aerobic activity. Conduct an appropriate hypothesis test to determine if the distribution in physical activity among Ohioans is similar to the U.S. population. Interpret your results.
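A chi-square goodness-of-fit test is one appropriate choice here, with the U.S. percentages as the hypothesized distribution. The sketch below computes the statistic by hand from the reported percentages; the category labels are assumptions, and the observed counts are reconstructed as n times each Ohio percentage, so they need not be whole numbers.

```r
n <- 4475   # Ohio sample size

# Hypothesized (U.S.) and observed (Ohio) proportions, same category order
p_us   <- c(none = 0.459, strength_only = 0.035, aerobic_only = 0.290, both = 0.216)
p_ohio <- c(none = 0.500, strength_only = 0.050, aerobic_only = 0.350, both = 0.100)

observed <- n * p_ohio   # reconstructed counts (not reported directly)
expected <- n * p_us     # counts expected if Ohio matched the U.S.

# Goodness-of-fit statistic and its P-value on k - 1 degrees of freedom
chi_sq  <- sum((observed - expected)^2 / expected)
p_value <- pchisq(chi_sq, df = length(p_us) - 1, lower.tail = FALSE)

print(round(chi_sq, 2))
print(p_value)
```

With a sample this large, every expected count is well above 5, so the chi-square approximation should be reasonable.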
In: Math