A probability of 1 is the same as a probability of 100%
The difference between interval and ordinal data is that interval data has a natural zero.
If you are doing a study and the population is Americans, the easiest type of study to run would be a simple random sample.
If your population is 80% female and your sample is 60% male, there is undercoverage bias.
To calculate a mean in Excel, we type in "=MEAN".
In: Statistics and Probability
A random sample of companies in electric utilities (I), financial services (II), and food processing (III) gave the following information regarding annual profits per employee (units in thousands of dollars).
I | II | III |
49.5 | 55.1 | 39.1 |
43.5 | 25.3 | 37.2 |
32.9 | 41.1 | 10.9 |
27.9 | 29.5 | 32.6 |
38.1 | 39.4 | 15.5 |
36.2 | 42.6 | |
20.7 | | |
Shall we reject or not reject the claim that there is no difference in population mean annual profits per employee in each of the three types of companies? Use a 1% level of significance.
(b) Find SSTOT, SSBET, and SSW and check that SSTOT = SSBET + SSW. (Use 3 decimal places.)
SSTOT | = | |
SSBET | = | |
SSW | = |
Find d.f.BET, d.f.W, MSBET, and MSW. (Use 3 decimal places for MSBET and MSW.)
dfBET | = | |
dfW | = | |
MSBET | = | |
MSW | = |
Find the value of the sample F statistic. (Use 3 decimal places.)
What are the degrees of freedom?
(numerator)
(denominator)
(f) Make a summary table for your ANOVA test.
Source of Variation | Sum of Squares | Degrees of Freedom | MS | F Ratio | P Value | Test Decision |
Between groups | NA | NA. | ||||
Within groups | ||||||
Total |
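The quantities requested in this ANOVA question can be sketched in plain Python. This is a minimal illustration, not the textbook's or any package's procedure: the variable names are hypothetical, and the P-value is left out because it needs an F distribution that the standard library does not provide.

```python
# One-way ANOVA sums of squares for the three samples in the table above.
groups = {
    "I": [49.5, 43.5, 32.9, 27.9, 38.1, 36.2, 20.7],
    "II": [55.1, 25.3, 41.1, 29.5, 39.4, 42.6],
    "III": [39.1, 37.2, 10.9, 32.6, 15.5],
}

all_values = [x for g in groups.values() for x in g]
n_total = len(all_values)
grand_mean = sum(all_values) / n_total

# Total, between-group, and within-group sums of squares.
ss_tot = sum((x - grand_mean) ** 2 for x in all_values)
ss_bet = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
ss_w = sum((x - sum(g) / len(g)) ** 2 for g in groups.values() for x in g)

# Degrees of freedom, mean squares, and the sample F statistic.
df_bet = len(groups) - 1
df_w = n_total - len(groups)
ms_bet = ss_bet / df_bet
ms_w = ss_w / df_w
f_stat = ms_bet / ms_w

print(f"SSTOT={ss_tot:.3f} SSBET={ss_bet:.3f} SSW={ss_w:.3f}")
print(f"dfBET={df_bet} dfW={df_w} MSBET={ms_bet:.3f} MSW={ms_w:.3f} F={f_stat:.3f}")
```

The printed F value would then be compared with the critical F value for (2, 15) degrees of freedom at the 1% level.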
In: Statistics and Probability
Twenty-one daily responses of stack loss (y) (the amount of ammonia escaping) were measured with air flow x1, temperature x2, and acid concentration x3.
Using stepwise method, find the best regression model by Minitab and explain the results obtained at each step. [3 marks]
y | x1 | x2 | x3 |
42 | 80 | 27 | 89 |
37 | 80 | 27 | 88 |
37 | 75 | 25 | 90 |
28 | 62 | 24 | 87 |
18 | 62 | 22 | 87 |
18 | 62 | 23 | 87 |
19 | 62 | 24 | 93 |
20 | 62 | 24 | 93 |
15 | 58 | 23 | 87 |
14 | 58 | 18 | 80 |
14 | 58 | 18 | 89 |
13 | 58 | 17 | 88 |
11 | 58 | 18 | 82 |
12 | 58 | 19 | 93 |
8 | 50 | 18 | 89 |
7 | 50 | 18 | 86 |
8 | 50 | 19 | 72 |
8 | 50 | 19 | 79 |
9 | 50 | 20 | 80 |
15 | 56 | 20 | 82 |
15 | 70 | 20 | 91 |
In: Statistics and Probability
2. To raise awareness of its capabilities, FedEx developed a sales promotion that was sent to selected offices. To assess the possible benefit of the promotion, FedEx pulled the shipping records for a random sample of 50 offices that received the promotion and another random sample of 75 that did not and collected data on the number of mailings. They want to see if those who received the sales promotions shipped more mailings. The complete set of results is provided below (promotions columns).
a. State the null and alternate hypotheses.
b. Run the test. Paste the test output and state your decision (Minitab: Stat - Paired T-Test and CI).
c. What is the best estimate for the population difference in means for the number of mailings between offices with the promotion and offices without the promotion? (Be 90% confident in your estimate for the confidence interval.)
d. Interpret the confidence interval in part c.
e. What is the margin of error associated with the 90% confidence interval?
Promotion Mailings
Promotions_NO 15
Promotions_NO 49
Promotions_NO 42
Promotions_NO 22
Promotions_NO 26
Promotions_NO 35
Promotions_NO 38
Promotions_NO 13
Promotions_NO 35
Promotions_NO 14
Promotions_NO 5
Promotions_NO 64
Promotions_NO 27
Promotions_NO 57
Promotions_NO 50
Promotions_NO 43
Promotions_NO 32
Promotions_NO 39
Promotions_NO 13
Promotions_NO 19
Promotions_NO 47
Promotions_NO 45
Promotions_NO 38
Promotions_NO 59
Promotions_NO 35
Promotions_NO 8
Promotions_NO 10
Promotions_NO 58
Promotions_NO 44
Promotions_NO 9
Promotions_NO 10
Promotions_NO 0
Promotions_NO 42
Promotions_NO 37
Promotions_NO 23
Promotions_NO 12
Promotions_NO 54
Promotions_NO 41
Promotions_NO 36
Promotions_NO 43
Promotions_NO 45
Promotions_NO 18
Promotions_NO 65
Promotions_NO 10
Promotions_NO 17
Promotions_NO 59
Promotions_NO 26
Promotions_NO 18
Promotions_NO 8
Promotions_NO 14
Promotions_NO 74
Promotions_NO 29
Promotions_NO 60
Promotions_NO 19
Promotions_NO 30
Promotions_NO 29
Promotions_NO 12
Promotions_NO 0
Promotions_NO 20
Promotions_NO 31
Promotions_NO 13
Promotions_NO 5
Promotions_NO 7
Promotions_NO 42
Promotions_NO 36
Promotions_NO 9
Promotions_NO 23
Promotions_NO 70
Promotions_NO 28
Promotions_NO 25
Promotions_NO 26
Promotions_NO 24
Promotions_NO 50
Promotions_NO 7
Promotions_NO 0
Promotions_YES 38
Promotions_YES 74
Promotions_YES 18
Promotions_YES 65
Promotions_YES 60
Promotions_YES 51
Promotions_YES 71
Promotions_YES 47
Promotions_YES 29
Promotions_YES 39
Promotions_YES 45
Promotions_YES 36
Promotions_YES 57
Promotions_YES 36
Promotions_YES 12
Promotions_YES 20
Promotions_YES 23
Promotions_YES 79
Promotions_YES 16
Promotions_YES 4
Promotions_YES 62
Promotions_YES 37
Promotions_YES 2
Promotions_YES 23
Promotions_YES 6
Promotions_YES 10
Promotions_YES 28
Promotions_YES 65
Promotions_YES 25
Promotions_YES 86
Promotions_YES 27
Promotions_YES 58
Promotions_YES 33
Promotions_YES 54
Promotions_YES 40
Promotions_YES 92
Promotions_YES 71
Promotions_YES 0
Promotions_YES 77
Promotions_YES 60
Promotions_YES 56
Promotions_YES 38
Promotions_YES 16
Promotions_YES 89
Promotions_YES 62
Promotions_YES 9
Promotions_YES 42
Promotions_YES 73
Promotions_YES 49
Promotions_YES 14
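The two samples here are independent and of different sizes (50 and 75 offices), so the observations cannot be matched into pairs. One way to compare the means is an unpooled (Welch) two-sample t statistic. The sketch below is an assumption about how such a test could be coded by hand, not Minitab's output; `welch_t` is a hypothetical helper, and the short lists are toy values rather than the FedEx data.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for the unpooled two-sample t statistic, a minus b."""
    # Each sample's variance divided by its size: the squared standard errors.
    va = variance(sample_a) / len(sample_a)
    vb = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (
        va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Toy illustration only; substitute the Promotions_YES and Promotions_NO columns.
t_toy, df_toy = welch_t([1, 2, 3], [2, 4, 6])
print(t_toy, df_toy)
```

For the question as stated, the promotion sample would be passed first so that a positive t supports "shipped more mailings".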
In: Statistics and Probability
Bighorn sheep are beautiful wild animals found throughout the western United States. Let x be the age of a bighorn sheep (in years), and let y be the mortality rate (percent that die) for this age group. For example, x = 1, y = 14 means that 14% of the bighorn sheep between 1 and 2 years old died. A random sample of Arizona bighorn sheep gave the following information:
x | 1 | 2 | 3 | 4 | 5 |
y | 12.8 | 20.9 | 14.4 | 19.6 | 20.0 |
Σx = 15; Σy = 87.7; Σx² = 55; Σy² = 1592.17; Σxy = 276.2
(a) Draw a scatter diagram.
(b) Find the equation of the least-squares line. (Round your answers to two decimal places.)
ŷ = ___ + ___ x
(c) Find r. Find the coefficient of determination r². (Round your answers to three decimal places.)
r = ___
r² = ___
Explain what these measures mean in the context of the problem.
The correlation coefficient r measures the strength of the linear relationship between a bighorn sheep's age and the mortality rate. The coefficient of determination r² measures the explained variation in mortality rate by the corresponding variation in age of a bighorn sheep.
The coefficient of determination r measures the strength of the linear relationship between a bighorn sheep's age and the mortality rate. The correlation coefficient r² measures the explained variation in mortality rate by the corresponding variation in age of a bighorn sheep.
Both the correlation coefficient r and coefficient of determination r² measure the strength of the linear relationship between a bighorn sheep's age and the mortality rate.
The correlation coefficient r² measures the strength of the linear relationship between a bighorn sheep's age and the mortality rate. The coefficient of determination r measures the explained variation in mortality rate by the corresponding variation in age of a bighorn sheep.
(d) Test the claim that the population correlation coefficient is positive at the 1% level of significance. (Round your test statistic to three decimal places.)
t =
Find or estimate the P-value of the test statistic.
P-value > 0.250
0.125 < P-value < 0.250
0.100 < P-value < 0.125
0.075 < P-value < 0.100
0.050 < P-value < 0.075
0.025 < P-value < 0.050
0.010 < P-value < 0.025
0.005 < P-value < 0.010
0.0005 < P-value < 0.005
P-value < 0.0005
Conclusion
Reject the null hypothesis, there is sufficient evidence that ρ > 0.
Reject the null hypothesis, there is insufficient evidence that ρ > 0.
Fail to reject the null hypothesis, there is sufficient evidence that ρ > 0.
Fail to reject the null hypothesis, there is insufficient evidence that ρ > 0.
(e) Given the result from part (c), is it practical to find estimates of y for a given x value based on the least-squares line model? Explain.
Given the lack of significance of r, prediction from the least-squares model might be misleading.
Given the significance of r, prediction from the least-squares model is practical.
Given the significance of r, prediction from the least-squares model might be misleading.
Given the lack of significance of r, prediction from the least-squares model is practical.
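The least-squares line, r, r², and the t statistic for testing ρ > 0 can all be computed from the five sums given in the bighorn sheep question. The following is a minimal sketch using the usual computational formulas; the variable names are hypothetical.

```python
import math

# Sums quoted in the question (n = 5 age groups).
n = 5
sum_x, sum_y = 15, 87.7
sum_x2, sum_y2, sum_xy = 55, 1592.17, 276.2

# Corrected sums of squares and cross products.
ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n
ss_yy = sum_y2 - sum_y ** 2 / n

# Least-squares line y-hat = intercept + slope * x.
slope = ss_xy / ss_xx
intercept = sum_y / n - slope * sum_x / n

# Correlation coefficient and coefficient of determination.
r = ss_xy / math.sqrt(ss_xx * ss_yy)
r_squared = r ** 2

# t statistic for H0: rho = 0 against H1: rho > 0, with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)

print(f"y-hat = {intercept:.2f} + {slope:.2f}x")
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}, t = {t_stat:.3f}, df = {n - 2}")
```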
In: Statistics and Probability
Starting salaries of 110 college graduates who have taken a statistics course have a mean of $42,647. Suppose the distribution of this population is approximately normal and has a standard deviation of $10,972.
Using a 75% confidence level, find both of the following:
(NOTE: Do not use commas or dollar signs in your answers.)
(a) The margin of error:
(b) The confidence interval for the mean μ: ___ < μ < ___
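Because the population standard deviation is given, a z-based interval is a natural reading of the starting-salary question. The sketch below assumes that reading and uses `statistics.NormalDist` from the Python standard library for the critical value; the variable names are hypothetical.

```python
import math
from statistics import NormalDist

n, x_bar, sigma = 110, 42647, 10972
confidence = 0.75

# Two-sided critical value: the upper tail holds (1 - confidence) / 2.
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error and interval endpoints for the mean.
margin = z_crit * sigma / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"z = {z_crit:.4f}, margin of error = {margin:.2f}")
print(f"{lower:.2f} < mu < {upper:.2f}")
```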
In: Statistics and Probability
JenStar tracks their daily profits and has found that the distribution of profits is approximately normal with a mean of $16,900.00 and a standard deviation of about $650.00. Using this information, answer the following questions.
For full marks your answer should be accurate to at least three decimal places.
Compute the probability that tomorrow's profit will be
a) between $18,050.50 and $18,603.00
b) less than $17,920.50 or greater than $18,271.50
c) greater than $17,062.50
d) less than $16,835.00 or greater than $18,083.00
e) less than $17,985.50
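Each part of the JenStar question reduces to differences of normal cumulative probabilities. A minimal sketch with the standard library's `NormalDist` follows; the helper names are hypothetical.

```python
from statistics import NormalDist

# Daily profit model stated in the question.
profit = NormalDist(mu=16900, sigma=650)


def prob_between(low, high):
    """P(low < X < high)."""
    return profit.cdf(high) - profit.cdf(low)


def prob_outside(low, high):
    """P(X < low or X > high), assuming low < high."""
    return profit.cdf(low) + (1 - profit.cdf(high))


p_a = prob_between(18050.50, 18603.00)
p_b = prob_outside(17920.50, 18271.50)
p_c = 1 - profit.cdf(17062.50)
p_d = prob_outside(16835.00, 18083.00)
p_e = profit.cdf(17985.50)

for label, p in zip("abcde", (p_a, p_b, p_c, p_d, p_e)):
    print(f"{label}) {p:.4f}")
```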
In: Statistics and Probability
a. For a sample of n = 4 scores, conduct a single sample t-test to evaluate the significance of the treatment effect and calculate Cohen’s d to measure the size of the treatment effect. Use a two-tailed test with α = .05. Show the sampling distribution. (2 pts)
b. For a sample of n = 16 scores, conduct a single sample t-test to evaluate the significance of the treatment effect and calculate Cohen’s d to measure the size of the treatment effect. Use a two-tailed test with α = .05. Show the sampling distribution. (2 pts)
c. Using symbols, write up your results. Describe how increasing the size of the sample affects the likelihood of rejecting the null hypothesis and the measure of effect size. (1 pt)
In: Statistics and Probability
To investigate the fluid mechanics of swimming, twenty swimmers each swam a specified distance in a water-filled pool and in a pool where the water was thickened with food grade guar gum to create a syrup-like consistency. Velocity, in meters per second, was recorded and the results are given in the table below.
Swimmer | Water velocity (m/s) | Guar syrup velocity (m/s) |
1 | 0.90 | 0.93 |
2 | 0.92 | 0.97 |
3 | 1.00 | 0.95 |
4 | 1.10 | 1.14 |
5 | 1.20 | 1.23 |
6 | 1.25 | 1.23 |
7 | 1.25 | 1.27 |
8 | 1.30 | 1.30 |
9 | 1.35 | 1.34 |
10 | 1.40 | 1.42 |
11 | 1.40 | 1.44 |
12 | 1.50 | 1.53 |
13 | 1.65 | 1.59 |
14 | 1.70 | 1.70 |
15 | 1.75 | 1.80 |
16 | 1.80 | 1.77 |
17 | 1.80 | 1.84 |
18 | 1.85 | 1.86 |
19 | 1.90 | 1.89 |
20 | 1.95 | 1.95 |
The researchers concluded that swimming in guar syrup does not change mean swimming speed. Are the given data consistent with this conclusion? Carry out a hypothesis test using a 0.01 significance level. (Use μd = μwater − μguar syrup.)
State the appropriate null and alternative hypotheses.
H0: μd = 0; Ha: μd > 0
H0: μd ≠ 0; Ha: μd = 0
H0: μd = 0; Ha: μd < 0
H0: μd < 0; Ha: μd = 0
H0: μd = 0; Ha: μd ≠ 0
Find the test statistic and P-value. (Round your test statistic to one decimal place and your P-value to three decimal places.)
t=
P-value=
State the conclusion in the problem context.
We reject H0. The data do not provide convincing evidence that swimming in guar syrup changes mean swimming speed.
We fail to reject H0. The data do not provide convincing evidence that swimming in guar syrup changes mean swimming speed.
We fail to reject H0. The data provide convincing evidence that swimming in guar syrup changes mean swimming speed.
We reject H0. The data provide convincing evidence that swimming in guar syrup changes mean swimming speed.
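Since each swimmer is measured in both pools, the swimming data call for a paired t statistic on the differences d = water − guar syrup. The sketch below computes the statistic and its degrees of freedom. The two-sided P-value would still have to come from a t table or statistics package, because the Python standard library has no t distribution.

```python
import math
from statistics import mean, stdev

water = [0.90, 0.92, 1.00, 1.10, 1.20, 1.25, 1.25, 1.30, 1.35, 1.40,
         1.40, 1.50, 1.65, 1.70, 1.75, 1.80, 1.80, 1.85, 1.90, 1.95]
guar = [0.93, 0.97, 0.95, 1.14, 1.23, 1.23, 1.27, 1.30, 1.34, 1.42,
        1.44, 1.53, 1.59, 1.70, 1.80, 1.77, 1.84, 1.86, 1.89, 1.95]

# Per-swimmer differences, water minus guar syrup.
diffs = [w - g for w, g in zip(water, guar)]

d_bar = mean(diffs)          # sample mean difference
s_d = stdev(diffs)           # sample standard deviation of the differences
n = len(diffs)

# Paired t statistic for H0: mu_d = 0, with n - 1 degrees of freedom.
t_stat = d_bar / (s_d / math.sqrt(n))
df = n - 1

print(f"d_bar = {d_bar:.4f}, s_d = {s_d:.4f}, t = {t_stat:.2f}, df = {df}")
```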
In: Statistics and Probability
The null and alternate hypotheses are:
H0 : μ1 = μ2
H1 : μ1 ≠ μ2
A random sample of 10 observations from one population revealed a sample mean of 23 and a sample standard deviation of 3.5. A random sample of 4 observations from another population revealed a sample mean of 27 and a sample standard deviation of 3.6.
At the 0.01 significance level, is there a difference between the population means?
State the decision rule. (Negative values should be indicated by a minus sign. Round your answers to 3 decimal places.)
Compute the pooled estimate of the population variance. (Round your answer to 3 decimal places.)
Compute the test statistic. (Negative value should be indicated by a minus sign. Round your answer to 3 decimal places.)
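The pooled variance and test statistic for this two-sample question follow directly from the summary statistics. This is a minimal sketch assuming equal population variances, which the request for a pooled estimate implies. The critical values for the decision rule come from a t table with n1 + n2 − 2 degrees of freedom and are not computed here.

```python
import math

# Summary statistics from the question.
n1, mean1, s1 = 10, 23, 3.5
n2, mean2, s2 = 4, 27, 3.6

# Degrees of freedom for the pooled test.
df = n1 + n2 - 2

# Pooled estimate of the common population variance.
pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df

# Pooled two-sample t statistic for H0: mu1 = mu2.
t_stat = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

print(f"df = {df}, pooled variance = {pooled_var:.3f}, t = {t_stat:.3f}")
```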
In: Statistics and Probability
Be sure to clearly state the hypotheses in the hypothesis tests and state the conclusions in terms of the problem. Use ?=.?? for all tests.
The following table presents shear strength (in kN/mm) and weld diameters (in mm) for a sample of spot welds.
Diameter Strength
4.2 51
4.4 54
4.6 69
4.8 81
5.0 75
5.2 79
5.4 89
5.6 101
5.8 98
6.0 102
1. Construct a scatterplot of strength (y) versus diameter (x). Does it appear as though a linear model is appropriate? Explain.
2. Compute the correlation coefficient between x and y.
3. Compute the least-squares line for predicting shear strength from weld diameter.
4. Compute the fitted value and residual for each point. (This can be done with a simple option selection in Minitab.)
5. Predict the strength for a diameter of 5.5 mm.
6. Can the least-squares line be used to predict the strength for a diameter of 8 mm? If so, predict the strength. If not, explain why not.
7. For what diameter would you predict a strength of 95 kN/mm?
8. Compute the coefficient of determination and explain what it represents.
9. Compute a 90% confidence interval for the mean shear strength of welds with diameters of 5.1 mm.
10. Compute a 99% prediction interval for the shear strength of a particular weld with diameter 5.1 mm.
11. Construct two residual plots (residuals versus the fitted y values and a normal probability plot of the residuals) and discuss what they tell you about the fit of the model and whether the model assumptions are satisfied.
Please solve using Minitab and show the steps.
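The question asks for Minitab, but the correlation, least-squares line, fitted values, and residuals for the weld data can be cross-checked by hand. The sketch below is such a check in plain Python, not Minitab output; `predict` is a hypothetical helper, and the confidence and prediction intervals are not attempted.

```python
import math

diameter = [4.2, 4.4, 4.6, 4.8, 5.0, 5.2, 5.4, 5.6, 5.8, 6.0]
strength = [51, 54, 69, 81, 75, 79, 89, 101, 98, 102]

n = len(diameter)
x_bar = sum(diameter) / n
y_bar = sum(strength) / n

# Corrected sums of squares and cross products.
ss_xx = sum((x - x_bar) ** 2 for x in diameter)
ss_yy = sum((y - y_bar) ** 2 for y in strength)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(diameter, strength))

slope = ss_xy / ss_xx
intercept = y_bar - slope * x_bar
r = ss_xy / math.sqrt(ss_xx * ss_yy)


def predict(d):
    """Fitted strength for diameter d; only meaningful for d within 4.2 to 6.0 mm."""
    return intercept + slope * d


# Fitted values and residuals for each observed weld.
fitted = [predict(x) for x in diameter]
residuals = [y - f for y, f in zip(strength, fitted)]

print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
print(f"strength-hat = {intercept:.3f} + {slope:.3f} * diameter")
print(f"predicted strength at 5.5 mm: {predict(5.5):.2f}")
```

A diameter of 8 mm lies well outside the observed 4.2 to 6.0 mm range, so `predict(8)` would be an extrapolation.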
In: Statistics and Probability
3) A random sample of 1001 Americans aged 15 or older revealed that the amount of time spent eating or drinking per day is 1.22 hours, with a standard deviation of 0.65 hours.
a) Suppose a histogram of time spent eating and drinking were right-skewed. Use this result to explain why a large sample size is needed in order to construct a confidence interval for the mean time spent eating and drinking each day. (2 points)
b) There are over 215 million Americans aged 15 and older. Explain why this fact, and the fact that our sample was a true random sample, satisfies the requirements for constructing a confidence interval. (2 points)
c) Construct 90%, 95% and 99% confidence intervals for the mean amount of time Americans aged 15 and older spend eating and drinking per day. Be sure to show all work (or, if using StatCrunch, paste your output into this document) and provide a proper interpretation for each interval.
In: Statistics and Probability
An administrator wanted to study the utilization of long-distance telephone service by a department. One variable of interest (let’s call it X) is the length, in minutes, of long-distance calls made during one month. There were 38 calls that resulted in a connection. The length of calls, already ordered from smallest to largest, are presented in the following table.
1.6 | 1.7 | 1.8 | 1.8 | 1.9 | 2.1 | 2.5 | 3.0 | 3.0 | 4.4 |
4.5 | 4.5 | 5.9 | 7.1 | 7.4 | 7.5 | 7.7 | 8.6 | 9.3 | 9.5 |
12.7 | 15.3 | 15.5 | 15.9 | 15.9 | 16.1 | 16.5 | 17.3 | 17.5 | 19.0 |
19.4 | 22.5 | 23.5 | 24.0 | 31.7 | 32.8 | 43.5 | 53.3 |
Which one of the following statements is not true?
The 75th percentile (Q3) is 17.5 minutes.
The 50th percentile (Q2) is 9.4 minutes.
The 25th percentile (Q1) is 4.4 minutes.
Q3 - Q2 > Q2 - Q1
Average X > Median X.
X distribution is positively skewed.
The percentile rank of 5.9 minutes is 13.
Range of X is 51.7 minutes.
IQR (Inter-Quartile Range) is 13.1 minutes.
There are 2 outliers in X distribution.
Q4: (This continues Q3: 2 marks) Which one of the following cannot be used to describe the distribution of X?
A Histogram.
A Stemplot.
Skewness and Kurtosis.
Mean and SD (Standard Deviation).
The 5-number Summary.
The coefficient of determination.
The coefficient of relative variation (CRV).
The 1.5 IQR Rule.
The Deciles.
A Boxplot.
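Several of the statements about the call lengths can be checked numerically. The sketch below assumes the median-of-halves convention for quartiles, where Q1 and Q3 are the medians of the lower and upper 19 values. Other percentile conventions give slightly different quartiles, so treat that choice as an assumption rather than the course's definition.

```python
from statistics import mean, median

# The 38 ordered call lengths (minutes) from the table above.
calls = [1.6, 1.7, 1.8, 1.8, 1.9, 2.1, 2.5, 3.0, 3.0, 4.4,
         4.5, 4.5, 5.9, 7.1, 7.4, 7.5, 7.7, 8.6, 9.3, 9.5,
         12.7, 15.3, 15.5, 15.9, 15.9, 16.1, 16.5, 17.3, 17.5, 19.0,
         19.4, 22.5, 23.5, 24.0, 31.7, 32.8, 43.5, 53.3]

half = len(calls) // 2
q1 = median(calls[:half])     # median of the lower half
q2 = median(calls)            # overall median
q3 = median(calls[-half:])    # median of the upper half

iqr = q3 - q1
data_range = calls[-1] - calls[0]

# 1.5 IQR rule for flagging outliers.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in calls if x < low_fence or x > high_fence]

print(f"Q1={q1}, Q2={q2:.1f}, Q3={q3}, IQR={iqr:.1f}, range={data_range:.1f}")
print(f"mean={mean(calls):.2f}, outliers={outliers}")
```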
In: Statistics and Probability
A random variable X has the following pmf:
X | -1 | 0 | 1 |
P[X] | 0.25 | 0.5 | 0.25 |
Define Y = X² and W = Y + 2.
Which one of the following statements is not true?
V[Y] = 0.25.
E[XY] = 0.
E[X³] = 0.
E[X+2] = 2.
E[Y+2] = 2.5.
E[W+2] = 4.5.
V[X+2] = 0.5.
V[W+2] = 0.25.
P[W=1] = 0.5
X and W are not independent.
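Each statement about X, Y, and W can be checked by summing over the three support points of the pmf. The sketch below is one way to do that; `expect` and `var` are hypothetical helpers that take a function of X.

```python
# pmf of X as given: P[X=-1]=0.25, P[X=0]=0.5, P[X=1]=0.25.
pmf_x = {-1: 0.25, 0: 0.5, 1: 0.25}


def expect(f):
    """E[f(X)] as a sum over the support of X."""
    return sum(p * f(x) for x, p in pmf_x.items())


def var(f):
    """V[f(X)] = E[f(X)^2] - (E[f(X)])^2."""
    return expect(lambda x: f(x) ** 2) - expect(f) ** 2


def y(x):
    return x ** 2        # Y = X squared


def w(x):
    return y(x) + 2      # W = Y + 2, so W only takes the values 2 and 3


v_y = var(y)
e_xy = expect(lambda x: x * y(x))
e_x_plus_2 = expect(lambda x: x + 2)
e_w_plus_2 = expect(lambda x: w(x) + 2)
v_x_plus_2 = var(lambda x: x + 2)
p_w_eq_1 = sum(p for x, p in pmf_x.items() if w(x) == 1)

print(v_y, e_xy, e_x_plus_2, e_w_plus_2, v_x_plus_2, p_w_eq_1)
```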
In: Statistics and Probability