Questions
Let Y be a random variable that represents the number of infants in a group of...

Let Y be a random variable that represents the number of infants in a group of 4,000 who die from asthma in a year. In the United States, the probability that a child dies from an acute asthma attack in a year is 0.0035.

a. What is the mean number of infants who would die in a group of this size?

b. What is the probability that at most four infants out of 4,000 die from asthma in a year?

c. What is the probability that between 6 and 10 infants out of 4,000 die from asthma in a year?
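
A minimal sketch of parts (a) to (c), assuming Y follows a Binomial(n = 4000, p = 0.0035) distribution with independent outcomes, and reading "between 6 and 10" as inclusive of both endpoints. The names `binom_pmf`, `p_at_most_4`, and `p_6_to_10` are illustrative and not part of the question.

```python
from math import comb

n = 4000      # group size given in the question
p = 0.0035    # yearly probability of death given in the question


def binom_pmf(k: int) -> float:
    """P(Y = k) for Y ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


mean_deaths = n * p                                     # part (a): E[Y] = n * p
p_at_most_4 = sum(binom_pmf(k) for k in range(0, 5))    # part (b): P(Y <= 4)
p_6_to_10 = sum(binom_pmf(k) for k in range(6, 11))     # part (c): P(6 <= Y <= 10), assumed inclusive

print(f"mean = {mean_deaths:.2f}")
print(f"P(Y <= 4) = {p_at_most_4:.4f}")
print(f"P(6 <= Y <= 10) = {p_6_to_10:.4f}")
```

Because n is large and p is small, a Poisson approximation with rate n * p should give very similar values; the exact binomial sum is used here only because it needs no extra assumptions.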

In: Math

2. What is the consequence of using unconditional logistic regression to analyze the data collected from...

2. What is the consequence of using unconditional logistic regression to analyze the data collected from a 1:M matched case-control study?

In: Math

Rework problem 28 from section 3.4 of your text, involving the drawing of two balls from...

Rework problem 28 from section 3.4 of your text, involving the drawing of two balls from a box of colored balls. Assume the box contains 12 balls: 6 red, 3 blue, and 3 yellow. A ball is drawn and its color noted. If the ball is yellow, it is replaced; otherwise, it is not. A second ball is then drawn and its color is noted.

What is the probability that the first ball was yellow, given that the second was red?
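
One way to check this is Bayes' rule over the three possible first draws. The sketch below uses exact fractions; the helper name `p_second_red` is hypothetical, and the replacement rule (yellow replaced, other colors not) is taken from the question.

```python
from fractions import Fraction

# Box contents from the question.
counts = {"red": 6, "blue": 3, "yellow": 3}
total = sum(counts.values())


def p_second_red(first: str) -> Fraction:
    """P(second ball is red | color of the first ball)."""
    if first == "yellow":
        # A yellow ball is replaced, so the box is unchanged.
        return Fraction(counts["red"], total)
    # Red or blue is not replaced: one ball fewer, and one red fewer if red was drawn.
    reds_left = counts["red"] - (1 if first == "red" else 0)
    return Fraction(reds_left, total - 1)


# Joint probabilities P(first = c and second = red) for each first color c.
joint = {c: Fraction(k, total) * p_second_red(c) for c, k in counts.items()}

p_red_second = sum(joint.values())                 # law of total probability
posterior_yellow = joint["yellow"] / p_red_second  # Bayes' rule

print(p_red_second, posterior_yellow)
```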

In: Math

The following table consists of one student athlete's time (in minutes) to swim 2000 yards and...

The following table consists of one student athlete's time (in minutes) to swim 2000 yards and the student's heart rate (beats per minute) after swimming on a random sample of 10 days.

Swim Time Heart Rate
34.13 144
35.74 152
34.73 124
34.03 140
34.14 152
35.73 146
36.16 128
35.56 136
35.39 144
35.58 148

Which point from the data has the largest residual? (x, y) = ____

Explain what the residual means in context. Is this point an outlier? An influential point? Explain. (Round your answers to two decimal places.)

The residual means that when the swim time is ____, the observed heart rate is about ____ beats less than the predicted rate. When this point is removed, it has a ____ effect on the regression line, so it ____ influential. The point ____ an outlier, because the residual is ____ twice the standard deviation.
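
A rough sketch of how the residuals could be computed, assuming an ordinary least-squares line of heart rate on swim time and using the residual standard deviation with n - 2 degrees of freedom for the "twice the standard deviation" check. Variable names are illustrative.

```python
# Data from the question: swim time (minutes) and heart rate (beats per minute).
swim_time = [34.13, 35.74, 34.73, 34.03, 34.14, 35.73, 36.16, 35.56, 35.39, 35.58]
heart_rate = [144, 152, 124, 140, 152, 146, 128, 136, 144, 148]

n = len(swim_time)
mean_x = sum(swim_time) / n
mean_y = sum(heart_rate) / n

# Least-squares slope and intercept.
sxx = sum((x - mean_x) ** 2 for x in swim_time)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(swim_time, heart_rate))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual = observed minus predicted heart rate.
residuals = [y - (intercept + slope * x) for x, y in zip(swim_time, heart_rate)]

# Point with the largest residual in absolute value.
idx = max(range(n), key=lambda i: abs(residuals[i]))
largest_point = (swim_time[idx], heart_rate[idx])

# Residual standard deviation (assumed definition: sqrt(SSE / (n - 2))).
resid_sd = (sum(r ** 2 for r in residuals) / (n - 2)) ** 0.5
is_outlier = abs(residuals[idx]) > 2 * resid_sd

print(largest_point, round(residuals[idx], 2), round(resid_sd, 2), is_outlier)
```

To judge influence, the same fit can be repeated with the flagged point removed and the two slopes compared.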

In: Math

When would you use FPC? Please list an example. Remember, if we have a large, but...

When would you use FPC? Please list an example.

Remember, if we have a large, but finite population and we are taking a small sample of it, we need to use the FPC.

For example: let's say 1,000 students took this class in the last 3 years in all ways, i.e. on-line, in a traditional classroom during a normal semester, and during the summer terms. Let's say that since Finance majors have a lot more math in many of their courses, I would expect them to have higher grades. Let's say that there were 100 Finance majors that took the course. Should I use the FPC? Yes. The sample is small relative to the population.

Can you think of an example?
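
For the numbers in the example above (N = 1,000 students, n = 100 Finance majors), the correction factor can be sketched as follows. The function name `corrected_standard_error` is hypothetical, and the formula assumes simple random sampling without replacement. A common rule of thumb applies the correction when the sample exceeds about 5% of the population; here the sampling fraction is 10%.

```python
from math import sqrt

N = 1000   # population size from the example
n = 100    # sample size from the example

sampling_fraction = n / N
fpc = sqrt((N - n) / (N - 1))   # finite population correction factor


def corrected_standard_error(sd: float, n: int, N: int) -> float:
    """Standard error of the mean with the FPC applied."""
    return sd / sqrt(n) * sqrt((N - n) / (N - 1))


print(f"sampling fraction = {sampling_fraction:.2f}, FPC = {fpc:.4f}")
```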

In: Math

1. A photocopier company claims that the average time it takes its technicians to service its...

1. A photocopier company claims that the average time it takes its technicians to service its brand of photocopiers onsite is two hours. To test this claim, 30 service times are recorded. The sample mean of the service times was 2.4 hours and the sample standard deviation was 0.5 hours. For µ denoting the mean service time, the hypotheses to be tested are: H0 : µ = 2 versus H1 : µ ≠ 2.

(a) Using the fact that t_{29, 0.975} = 2.045, calculate a 95% confidence interval for the mean service time assuming that service time is normally distributed.

(b) Does your interval provide evidence that the company’s claim is false? That is, do you reject H0? Explain.

(c) Write a simple statement that summarises your findings.
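
A short sketch of the interval in question 1, using only the figures given there (n = 30, sample mean 2.4, sample standard deviation 0.5, and the supplied critical value 2.045). The variable names are illustrative.

```python
from math import sqrt

n = 30          # number of recorded service times
xbar = 2.4      # sample mean (hours)
s = 0.5         # sample standard deviation (hours)
t_crit = 2.045  # critical value supplied in part (a)
mu0 = 2.0       # value claimed under H0

margin = t_crit * s / sqrt(n)
ci = (xbar - margin, xbar + margin)

# H0 is rejected at the 5% level when the claimed value lies outside the interval.
reject_h0 = not (ci[0] <= mu0 <= ci[1])

print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f}), reject H0: {reject_h0}")
```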

2. A quality control inspector is interested in comparing two concreting companies with respect to overall quality of their work. One measure of quality is whether or not the concrete that has been poured is at least 15 centimeters (cm) thick. The inspector has recorded details for jobs from both companies and is now interested in whether the companies differ with respect to this measure of quality. Let p1 denote the true proportion of times that Company 1 will fail to pour concrete at least 15 cm thick and similarly let p2 denote the proportion for Company 2. The data collected for Company 1 shows that out of 120 jobs measured, 14 of these were less than 15 cm thick. For Company 2, 21 jobs out of 85 resulted in concrete less than 15 cm thick.

(a) Carry out a hypothesis test comparing the two proportions by using the R function prop.test. From the R output find and report the following: i. The estimates to p1 and p2. ii. The approximate 95% confidence interval for p1 − p2. iii. The p-value for the test comparing p1 and p2.

(b) Using the p-value you reported above, can you reject that the proportions are equal at the α = 0.05 significance level? Explain.

(c) Does your confidence interval suggest that one company performs better than the other with respect to this measure of quality? If so, which company performs better and why? If not then clearly explain why this is the case.

(d) Provide a simple statement that summarises the findings you have reported above.
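
The question asks for R's prop.test, so the following is only a cross-check sketch in Python: a pooled two-proportion z-test and an unpooled normal-approximation interval, both without a continuity correction. R's prop.test applies a continuity correction by default, so its p-value and interval will differ slightly from these numbers.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 14, 120   # Company 1: jobs under 15 cm, jobs measured
x2, n2 = 21, 85    # Company 2: jobs under 15 cm, jobs measured

p1_hat = x1 / n1
p2_hat = x2 / n2
diff = p1_hat - p2_hat

# Pooled standard error for the test of H0: p1 = p2.
pooled = (x1 + x2) / (n1 + n2)
se_pooled = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = diff / se_pooled
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Unpooled standard error for the 95% confidence interval for p1 - p2.
se_unpooled = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
z_crit = NormalDist().inv_cdf(0.975)
ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

print(f"p1 = {p1_hat:.4f}, p2 = {p2_hat:.4f}, z = {z:.3f}, p = {p_value:.4f}")
print(f"95% CI for p1 - p2: ({ci[0]:.4f}, {ci[1]:.4f})")
```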

3. Suppose that the scientists are concerned with the growth of the orange trees in their area. Load the data called ‘Orange’ (the Excel copy is saved in the datasets folder in LMS) in R. Throughout we will assume that the data is stored in R as the data frame Orange. This dataset consists of three variables: Soil (soil enzymes level), Age (days since 31/12/1968), and Circumference (diameter of the trunk of the tree).

(a) To visualize any linear relationship between the dependent (response) variable Circumference and independent (predictor) variables Soil and Age, create two separate (one for each predictor) scatter plots with a smooth line, utilizing the ‘scatter.smooth’ command.

(b) In order to check for outliers, produce three BoxPlots of the variables Age, Soil and Circumference, by dividing the graph area into three columns, using the command ‘mfrow’.

(c) Execute the following command to obtain least squares estimates and associated output in R.

lm.model <- lm(Circumference ~ Age + Soil, data = Orange)
summary(lm.model)

Provide a copy of your results displayed by summary(lm.model).

(d) Create plots of the residuals versus fits and the Q-Q plot of the standardised residuals. Do you think the residuals versus fits plot or the Q-Q plot of the residuals suggest that there are any linear regression model violations that we need to be concerned with? Justify your answer with reference to both of the plots.

NOTE: Regardless of your answer to (d), for the remainder of this question please assume that there are no linear regression model violations.

(e) Does the R output suggest that the regression model fits the data well? Explain.

(f) What are the estimates of the coefficients for the Age and Soil explanatory variables? Interpret these estimated coefficients.

(g) Let β1 denote the true coefficient for the Age explanatory variable and consider the hypotheses H0 : β1 = 0 versus H1 : β1 ≠ 0. Do you reject the null hypothesis at the α = 0.05 significance level? Explain.

(h) Repeat (g), but this time for the coefficient for the Soil explanatory variable. You may denote this coefficient as β2.

(i) Using the fact that t_{32, 0.975} = 2.037, construct a 95% confidence interval for β1.

(j) For a soil enzymes level of 2 and age of 500, what is the estimated circumference measurement from your model?

(k) For a soil enzymes level of 2 and age of 500 that you considered above, provide a 95% confidence interval and 95% prediction interval for the circumference of the tree. Provide a justification as to why these intervals are different (i.e. what are these intervals used for?).

In: Math

There are a number of steps involved in testing hypotheses. The mechanics of testing a hypothesis...

There are a number of steps involved in testing hypotheses. The mechanics of testing a hypothesis are not difficult, but the underlying logic requires quite a bit more explanation. In this activity, you will start by working through the steps of hypothesis testing to give you some experience with the process. Then, you will work through the steps for that same problem in far more detail. Finally, there is some logic that is inherent in the process of hypothesis testing that is not obvious in the steps but is vitally important to your understanding. The last part of this activity explains that logic.

Let’s start with an example: Suppose you are an educational researcher who wants to increase the science test scores of high school students. Based on tremendous amounts of previous research, you know that the national average test score for all senior high school students in the United States is 50 with a standard deviation of 20. In other words, 50 and 20 are the known population parameters for the mean and the standard deviation, respectively (i.e., µ = 50, σ = 20). You also know that this population of test scores has a normal shape.

You and your research team take a sample of 16 high school seniors (N = 16), help them for 2 months, and then give them the national science test to determine if their test scores after help were higher than the national average science test score of 50 (i.e., µ = 50). After the help, the mean test score of the 16 student sample was M = 61 (SD = 21). Now you need to determine if the difference between the sample’s mean of 61 and the population mean of 50 is likely to be due to sampling error or if the help improved the sample’s science test score. Because you are only interested in adopting the help program if it increases science test scores, you use a one-tailed hypothesis test. You also chose a .05 alpha value (i.e., α = .05) as the decision criterion for your significance test. If the sample mean has a chance probability less than .05, you will conclude that the help program improved students’ test scores.

2. Write H0 next to the symbolic notation for the null hypothesis and H1 next to the research hypothesis.

______µhelp > 50
______µhelp < 50
______µhelp ≥ 50
______µhelp ≤ 50
______µhelp > 61
______µhelp < 61
______µhelp ≥ 61
______µhelp ≤ 61

3. Write H0 next to the verbal description of the null hypothesis and H1 next to the research hypothesis.

_____The population of students who receive help will have a mean science test score that is equal to 50.
_____The population of students who receive help will have a mean science test score that is greater than 50.
_____The population of students who receive help will not have a mean science test score that is greater than 50.
_____The population of students who receive help will have a mean science test score that is less than 50.
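
The mechanics of the test described in the example can be sketched as a one-tailed z-test, assuming the known population standard deviation σ = 20 is used (rather than the sample SD of 21) because the population parameters are stated as known. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 50        # population mean under the null hypothesis
sigma = 20     # known population standard deviation
n = 16         # sample size
m = 61         # sample mean after the help program
alpha = 0.05   # one-tailed decision criterion

standard_error = sigma / sqrt(n)
z = (m - mu) / standard_error

# One-tailed test in the direction of higher scores.
z_crit = NormalDist().inv_cdf(1 - alpha)
p_value = 1 - NormalDist().cdf(z)
reject_h0 = z > z_crit

print(f"z = {z:.2f}, critical z = {z_crit:.3f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```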

In: Math

Assume that when adults with smartphones are randomly selected, 43% use them in meetings or classes....

Assume that when adults with smartphones are randomly selected, 43% use them in meetings or classes. If 10 adult smartphone users are randomly selected, find the probability that fewer than 5 of them use their smartphones in meetings or classes.

The probability is ?
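
A minimal sketch, assuming the number of users X follows a Binomial(10, 0.43) distribution, so that "fewer than 5" means X = 0, 1, 2, 3, or 4. The name `p_fewer_than_5` is illustrative.

```python
from math import comb

n = 10     # smartphone users selected
p = 0.43   # probability a user uses the phone in meetings or classes

# P(X < 5) = P(X = 0) + ... + P(X = 4)
p_fewer_than_5 = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(5))

print(f"P(X < 5) = {p_fewer_than_5:.4f}")
```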

In: Math

a) The maximum daily water level at an embankment with height 6m can be described with...

a) The maximum daily water level at an embankment with height 6 m can be described by a random variable with mean 2 m and variance 1 m².

What is an upper bound on the probability that the embankment will be flooded on a given day?

b) For the stochastic variable  Z = 4X − 5Y − 5, it is given that E(X) = 5, E(Y ) = 3, Var(X) = 16 and Var(Y ) = 9 while Cov(X, Y ) = 8.

Find Var(Z)
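
A sketch of both parts under stated assumptions. For a), only the mean and variance are given, so a distribution-free bound is assumed to be intended: Chebyshev's inequality, with the one-sided (Cantelli) version shown for comparison. For b), the standard rule Var(aX + bY + c) = a²Var(X) + b²Var(Y) + 2ab·Cov(X, Y) is used.

```python
from fractions import Fraction

# a) Flooding means the level reaches 6 m, i.e. at least 4 m above the mean of 2 m.
mean_level = 2
variance_level = 1
height = 6
deviation = height - mean_level

# Chebyshev: P(X >= 6) <= P(|X - 2| >= 4) <= variance / deviation^2
chebyshev_bound = Fraction(variance_level, deviation**2)
# Cantelli (one-sided): P(X - 2 >= 4) <= variance / (variance + deviation^2)
cantelli_bound = Fraction(variance_level, variance_level + deviation**2)

# b) Z = 4X - 5Y - 5; the constant -5 does not affect the variance.
a, b = 4, -5
var_x, var_y, cov_xy = 16, 9, 8
var_z = a**2 * var_x + b**2 * var_y + 2 * a * b * cov_xy

print(chebyshev_bound, cantelli_bound, var_z)
```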

In: Math

The x-bar and R values for 20 samples of size five are shown in Table 10E.6....

The x-bar and R values for 20 samples of size five are shown in Table 10E.6. Specifications on this product have been established as 0.550   +/- 0.02.

Table 10.E.8

Sample No. X Bar R
1 0.549 0.0025
2 0.548 0.0021
3 0.548 0.0023
4 0.551 0.0029
5 0.553 0.0018
6 0.552 0.0017
7 0.550 0.0020
8 0.551 0.0024
9 0.553 0.0022
10 0.556 0.0028
11 0.547 0.0020
12 0.545 0.0030
13 0.549 0.0031
14 0.552 0.0022
15 0.550 0.0023
16 0.548 0.0021
17 0.556 0.0019
18 0.546 0.0018
19 0.550 0.0021
20 0.551 0.022

(a) Construct a modified control chart with α=0.0013, assuming that if the true process fraction nonconforming is as large as 1%, the process is unacceptable.

(b) Suppose that if the true process fraction nonconforming is as large as 1%, we would like an acceptance control chart to detect this out-of-control condition with probability 0.90. Construct this acceptance control chart, and compare it to the chart obtained in part (a).

In: Math

You are a cognitive psychologist who has developed a new treatment for depression. Currently, the most...

You are a cognitive psychologist who has developed a new treatment for depression. Currently, the most effective treatments for depression are behavioral treatments and pharmacological treatments (treatments involving medication). In order to validate your new treatment, you need to show that it is at least as effective as behavioral treatment and medication. You conduct a clinical trial to determine the effectiveness of your treatment by randomly assigning 60 individuals with depression to four treatment groups: a Control group (n = 15), a Cognitive Therapy group (n = 15), a Behavioral Therapy group (n = 15), and a Pharmacological Therapy group (n = 15). After 6 weeks of treatment, you assess depression using the Beck Depression Inventory (BDI; range: 0 – 63 with 0 indicating no depression and 63 indicating severe depression). Use the 6 steps of hypothesis testing and SPSS to determine whether the groups differ in terms of scores on the BDI. Use the SPSS #2 to conduct the statistical test, record your answers on the answer sheet and attach SPSS output.

Control   Pharmacological Therapy   Behavioral Therapy   Cognitive Therapy
74   79   44   55
70   89   68   48
83   89   71   64
66   75   35   40
90   67   62   30
97   74   65   31
104   90   70   63
100   81   65   30
75   65   45   64
92   68   41   69
89   85   46   45
70   71   68   39
83   69   69   31
95   67   39   66
108   70   42   36

a) What is the appropriate statistical test to answer the question? (0.5 pt)

b) Step 1: What is your prediction regarding the results of the statistical test? (0.5 pt)

c) Step 2: Set up hypotheses (2 pts)

H0 (1 pt):
H1 (1 pt):   

d) Step 3: Set criteria for decision (2 pts)

Critical value (1 pt):
Decision Rule (1 pt):   

e) Step 5: Report Results (2 pts) – Must include test statistic (0.5 pts), degrees of freedom (0.5 pts), p-value (0.5 pts), and appropriate measure of effect size (0.5 pts)

f) Post-hoc comparisons (1 pt) – Conduct Tukey’s HSD and report differences between groups by using “>”, “<”, or “=” to demonstrate which group’s mean is greater than the other (The first two have been filled out as a guide).

Post-hoc Comparisons

Control   =   Pharmacological
Control   <   Behavioral
Control   ___   Cognitive
Behavioral   ___   Pharmacological
Behavioral   ___   Pharmacological
Cognitive   ___   Pharmacological

g) Step 6: Interpret the results of the statistical test in terms of the research question (1 pt)

SPSS DATA

Control   74
Control   70
Control   83
Control   66
Control   90
Control   97
Control   104
Control   100
Control   75
Control   92
Control   89
Control   70
Control   83
Control   95
Control   108
Pharmacological Therapy   79
Pharmacological Therapy   89
Pharmacological Therapy   89
Pharmacological Therapy   75
Pharmacological Therapy   67
Pharmacological Therapy   74
Pharmacological Therapy   90
Pharmacological Therapy   81
Pharmacological Therapy   65
Pharmacological Therapy   68
Pharmacological Therapy   85
Pharmacological Therapy   71
Pharmacological Therapy   69
Pharmacological Therapy   67
Pharmacological Therapy   70
Behavioral Therapy   44
Behavioral Therapy   68
Behavioral Therapy   71
Behavioral Therapy   35
Behavioral Therapy   62
Behavioral Therapy   65
Behavioral Therapy   70
Behavioral Therapy   65
Behavioral Therapy   45
Behavioral Therapy   41
Behavioral Therapy   46
Behavioral Therapy   68
Behavioral Therapy   69
Behavioral Therapy   39
Behavioral Therapy   42
Cognitive Therapy   55
Cognitive Therapy   48
Cognitive Therapy   64
Cognitive Therapy   40
Cognitive Therapy   30
Cognitive Therapy   31
Cognitive Therapy   63
Cognitive Therapy   30
Cognitive Therapy   64
Cognitive Therapy   69
Cognitive Therapy   45
Cognitive Therapy   39
Cognitive Therapy   31
Cognitive Therapy   66
Cognitive Therapy   36
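
The assignment asks for SPSS, so the following is only a cross-check sketch of the one-way ANOVA arithmetic in Python, using the scores from the SPSS DATA listing above. It computes the F statistic, its degrees of freedom, and eta squared as one possible effect size; the p-value and Tukey's HSD need the F and studentized range distributions, which are not in the Python standard library and are left to SPSS.

```python
from statistics import mean

# BDI scores copied from the SPSS DATA listing.
groups = {
    "Control": [74, 70, 83, 66, 90, 97, 104, 100, 75, 92, 89, 70, 83, 95, 108],
    "Pharmacological": [79, 89, 89, 75, 67, 74, 90, 81, 65, 68, 85, 71, 69, 67, 70],
    "Behavioral": [44, 68, 71, 35, 62, 65, 70, 65, 45, 41, 46, 68, 69, 39, 42],
    "Cognitive": [55, 48, 64, 40, 30, 31, 63, 30, 64, 69, 45, 39, 31, 66, 36],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)
group_means = {name: mean(scores) for name, scores in groups.items()}

# Between-groups and within-groups sums of squares.
ss_between = sum(
    len(scores) * (group_means[name] - grand_mean) ** 2
    for name, scores in groups.items()
)
ss_within = sum(
    (x - group_means[name]) ** 2
    for name, scores in groups.items()
    for x in scores
)

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)
eta_squared = ss_between / (ss_between + ss_within)

print(group_means)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}, eta squared = {eta_squared:.3f}")
```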

In: Math

In statistics, is there a golden rule or a procedure that can be applied given a...

In statistics, is there a golden rule or a procedure that can be applied to a given problem to help me identify when I should apply the central limit theorem vs., for example, binomial probability or a sample size estimate?

In: Math

The distribution of wait times for a movie to load on Hulu is normally distributed with...

The distribution of wait times for a movie to load on Hulu is normally distributed with a standard deviation of 4.4 seconds. What changes the mean is how long it takes Windows to give the Hulu player priority in the process queue. I want to know the average on my old computer, so I randomly sample it 19 times, and I get an average of 668 seconds. Find an 87% confidence interval for the average time my old computer takes to play a movie on Hulu.
Use 2 decimal places.
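
Since the population standard deviation is given and the wait times are normal, a z-based interval seems appropriate. A minimal sketch, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

sigma = 4.4         # known population standard deviation (seconds)
n = 19              # number of sampled load times
xbar = 668          # sample mean (seconds)
confidence = 0.87

# Two-sided critical value: the upper tail area is (1 - 0.87) / 2.
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z_crit * sigma / sqrt(n)
ci = (xbar - margin, xbar + margin)

print(f"z = {z_crit:.3f}, 87% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```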

In: Math

When one company buys another company, it is not unusual that some workers are terminated. The...

When one company buys another company, it is not unusual that some workers are terminated. The severance benefits offered to the laid-off workers are often the subject of dispute. Suppose that the Laurier Company recently bought the Western Company and subsequently terminated 20 of Western’s employees. As part of the buyout agreement, it was promised that the severance packages offered to the former Western employees would be equivalent to those offered to Laurier employees who had been terminated in the past year. Thirty-six-year-old Bill Smith, a Western employee for the past 10 years, earning $32,000 per year, was one of those let go. His severance package included an offer of 5 weeks’ severance pay. Bill complained that this offer was less than that offered to Laurier’s employees when they were laid off, in contravention of the buyout agreement. A statistician was called in to settle the dispute. The statistician was told that severance is determined by three factors: age, length of service with the company, and pay. To determine how generous the severance package had been, a random sample of 50 Laurier ex-employees was taken. For each, the following variables were recorded: number of weeks of severance pay, age of employee, number of years with the company, and annual pay (in thousands of dollars).

A. Determine the regression equation.

B. Comment on how well the model fits the data.

***USE EXCEL, or xlstat***

Weeks SP Age Years Pay
13 37 16 46
13 53 19 48
11 36 8 35
14 44 16 33
3 28 4 40
10 43 9 31
4 29 3 33
7 31 2 43
12 45 15 40
7 44 15 32
8 42 13 42
11 41 10 38
9 32 5 25
10 45 13 36
18 48 19 40
17 52 20 34
13 42 11 33
14 42 19 38
5 27 2 25
11 50 15 36
10 46 14 36
8 28 6 22
15 44 16 32
7 40 6 27
9 37 8 37
11 44 12 35
10 33 13 32
8 41 14 42
5 33 7 37
6 27 4 35
14 39 12 36
12 50 17 30
10 43 11 29
14 49 14 29
12 48 17 36
12 41 17 37
8 39 8 36
12 49 16 28
10 37 10 35
11 37 13 37
15 44 19 33
5 31 6 37
8 42 9 36
11 40 11 32
15 35 15 30
11 46 13 40
6 25 5 33
6 40 7 33
13 40 14 48
9 38 10 37

In: Math

Robert Altoff is vice president of engineering for a manufacturer of household washing machines. As part...

Robert Altoff is vice president of engineering for a manufacturer of household washing machines. As part of a new product development project, he wishes to determine the optimal length of time for the washing cycle. Included in the project is a study of the relationship between the detergent used (four brands) and the length of the washing cycle (18, 20, 22, or 24 minutes). In order to run the experiment, 32 standard household laundry loads (having equal amounts of dirt and the same total weights) are randomly assigned to the 16 detergent–washing cycle combinations. The results (in pounds of dirt removed) are shown below.

Cycle Time (min)
Detergent Brand 18 20 22 24
A 0.14 0.13 0.19 0.17
A 0.13 0.11 0.18 0.20
B 0.15 0.16 0.19 0.21
B 0.12 0.14 0.17 0.18
C 0.17 0.17 0.19 0.20
C 0.19 0.16 0.20 0.22
D 0.11 0.12 0.18 0.15
D 0.14 0.15 0.17 0.19

Complete an ANOVA table. Use the 0.05 significance level. (Do not round your intermediate calculations. Enter your SS, MS, p to 3 decimal places and F to 2 decimal places.) Choose the right option.
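
A sketch of the two-way ANOVA arithmetic with replication, assuming the flattened data list two rows of four cycle-time values per detergent brand (two loads per cell). The p-values need the F distribution, which is not in the Python standard library, so only SS, df, MS, and F are computed here.

```python
from statistics import mean

# Assumed layout: for each brand, two replicate rows over cycle times 18, 20, 22, 24.
data = {
    "A": [[0.14, 0.13, 0.19, 0.17], [0.13, 0.11, 0.18, 0.20]],
    "B": [[0.15, 0.16, 0.19, 0.21], [0.12, 0.14, 0.17, 0.18]],
    "C": [[0.17, 0.17, 0.19, 0.20], [0.19, 0.16, 0.20, 0.22]],
    "D": [[0.11, 0.12, 0.18, 0.15], [0.14, 0.15, 0.17, 0.19]],
}
brands = list(data)
a, b, r = len(brands), 4, 2   # brands, cycle times, replicates per cell

# Observations grouped by (brand, cycle-time index).
cells = {(br, j): [rep[j] for rep in data[br]] for br in brands for j in range(b)}
all_values = [x for vals in cells.values() for x in vals]
grand = mean(all_values)

brand_means = {br: mean(x for j in range(b) for x in cells[(br, j)]) for br in brands}
time_means = [mean(x for br in brands for x in cells[(br, j)]) for j in range(b)]
cell_means = {key: mean(vals) for key, vals in cells.items()}

ss_brand = b * r * sum((m - grand) ** 2 for m in brand_means.values())
ss_time = a * r * sum((m - grand) ** 2 for m in time_means)
ss_cells = r * sum((m - grand) ** 2 for m in cell_means.values())
ss_inter = ss_cells - ss_brand - ss_time
ss_error = sum((x - cell_means[key]) ** 2 for key, vals in cells.items() for x in vals)
ss_total = sum((x - grand) ** 2 for x in all_values)

df_brand, df_time = a - 1, b - 1
df_inter = df_brand * df_time
df_error = a * b * (r - 1)

ms_error = ss_error / df_error
f_brand = (ss_brand / df_brand) / ms_error
f_time = (ss_time / df_time) / ms_error
f_inter = (ss_inter / df_inter) / ms_error

for name, ss, df, f in [
    ("Brand", ss_brand, df_brand, f_brand),
    ("Cycle time", ss_time, df_time, f_time),
    ("Interaction", ss_inter, df_inter, f_inter),
]:
    print(f"{name}: SS = {ss:.3f}, df = {df}, MS = {ss / df:.3f}, F = {f:.2f}")
print(f"Error: SS = {ss_error:.3f}, df = {df_error}, MS = {ms_error:.3f}")
print(f"Total: SS = {ss_total:.3f}, df = {a * b * r - 1}")
```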

In: Math