Below are two judges’ rankings of bands at a marching band competition. Calculate the covariance, the coefficient of correlation, the least squares equation, and the coefficient of determination. Also, use your equation to calculate Judge 2’s rank of the bands if Judge 1 ranks them a 4.
Judge 1 (x) | Judge 2 (y) |
---|---|
1 | 6 |
3 | 4 |
4 | 5 |
7 | 1 |
5 | 3 |
6 | 2 |
2 | 7 |
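One way to check the hand calculation is a short standard-library sketch. It assumes the sample covariance (dividing by n - 1) is wanted; the variable names are illustrative only.

```python
from math import sqrt

# Rankings from the problem statement.
judge1 = [1, 3, 4, 7, 5, 6, 2]  # x
judge2 = [6, 4, 5, 1, 3, 2, 7]  # y

n = len(judge1)
mean_x = sum(judge1) / n
mean_y = sum(judge2) / n

# Sums of squares and cross-products about the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(judge1, judge2))
sxx = sum((x - mean_x) ** 2 for x in judge1)
syy = sum((y - mean_y) ** 2 for y in judge2)

covariance = sxy / (n - 1)           # assumption: sample covariance
r = sxy / sqrt(sxx * syy)            # coefficient of correlation
slope = sxy / sxx                    # least squares slope
intercept = mean_y - slope * mean_x  # least squares intercept
r_squared = r ** 2                   # coefficient of determination
predicted = intercept + slope * 4    # Judge 2's predicted rank when x = 4

print(covariance, r, slope, intercept, r_squared, predicted)
```

Because x = 4 is the mean of Judge 1's ranks, the prediction equals the mean of Judge 2's ranks.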
In: Statistics and Probability
The following data were obtained in a study of the relationship between diastolic blood pressure (Y) and age (X) for boys 5 to 13 years old.
X | 5 | 8 | 11 | 7 | 13 | 12 | 12 | 6 |
Y | 63 | 67 | 74 | 64 | 75 | 69 | 90 | 60 |
(a) Fit a simple linear regression model to the data and plot the residuals against the fitted values.
(b) Omit case 7 and refit the model. Plot the residuals versus the fitted values and compare to what you got in (a).
(c) Using the fitted model from (b), obtain a 99% prediction interval for a new Y observation at X = 12. Does observation y7 fall outside this prediction interval? What is the significance of this?
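The refit without case 7 and the 99% prediction interval can be sketched as follows. This is a minimal standard-library sketch: the helper `fit_line` is hypothetical, the critical value 4.032 is taken from a t table (upper tail 0.005, 5 degrees of freedom), and the residual plots are left out.

```python
from math import sqrt

ages = [5, 8, 11, 7, 13, 12, 12, 6]            # X
pressures = [63, 67, 74, 64, 75, 69, 90, 60]   # Y

def fit_line(xs, ys):
    """Least-squares fit; returns intercept, slope, Sxx, mean of x, residual SD."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, sxx, mean_x, sqrt(sse / (n - 2))

# Drop case 7 (index 6) and refit on the remaining seven cases.
xs = ages[:6] + ages[7:]
ys = pressures[:6] + pressures[7:]
b0, b1, sxx, mean_x, s = fit_line(xs, ys)

T_CRIT = 4.032  # assumption: t table value, upper tail 0.005, df = 7 - 2 = 5
x_new = 12
y_hat = b0 + b1 * x_new
half_width = T_CRIT * s * sqrt(1 + 1 / len(xs) + (x_new - mean_x) ** 2 / sxx)
lower, upper = y_hat - half_width, y_hat + half_width

# Does the omitted observation y7 = 90 fall outside the interval?
y7_outside = not (lower <= pressures[6] <= upper)
print(round(lower, 2), round(upper, 2), y7_outside)
```

If y7 falls outside the interval, it is inconsistent with the model fitted to the other cases, which suggests that case 7 is an outlier.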
In: Statistics and Probability
An investigator compares the durability of two different compounds used in the manufacture of a certain automobile brake lining. A sample of 256 brakes using Compound 1 yields an average brake life of 49,386 miles. A sample of 298 brakes using Compound 2 yields an average brake life of 47,480 miles. Assume that the population standard deviation for Compound 1 is 1649 miles, while the population standard deviation for Compound 2 is 3911 miles. Determine the 95% confidence interval for the true difference between average lifetimes for brakes using Compound 1 and brakes using Compound 2.
Step 1 of 3 : Find the point estimate for the true difference between the population means.
Step 2 of 3: Calculate the margin of error of a confidence interval for the difference between the two population means. Round your answer to six decimal places.
Step 3 of 3: Construct the 95% confidence interval. Round your answers to the nearest whole number.
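The three steps can be sketched with the standard library. The sketch assumes the 95% level given in the problem statement and uses a z interval because both population standard deviations are known; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement.
n1, mean1, sigma1 = 256, 49386, 1649   # Compound 1
n2, mean2, sigma2 = 298, 47480, 3911   # Compound 2
confidence = 0.95                      # assumption: level from the problem statement

# Step 1: point estimate of the difference in means.
point_estimate = mean1 - mean2

# Step 2: margin of error with known population standard deviations.
standard_error = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z_crit * standard_error

# Step 3: confidence interval.
lower, upper = point_estimate - margin, point_estimate + margin
print(point_estimate, round(margin, 6), round(lower), round(upper))
```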
In: Statistics and Probability
According to national data, 20.7% of burglaries are cleared with arrests. A new detective is assigned to six different burglaries. What is the probability that at least one of them is cleared with an arrest?
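A minimal sketch of the complement rule for this question, assuming the six burglaries are cleared independently of one another (an assumption the problem does not state explicitly):

```python
p_clear = 0.207   # probability a single burglary is cleared with an arrest
n_cases = 6

# P(at least one cleared) = 1 - P(none cleared), assuming independence.
p_none = (1 - p_clear) ** n_cases
p_at_least_one = 1 - p_none
print(round(p_at_least_one, 4))
```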
Consider 4 species of plants in Point of Rocks Park where there are 160 White Pines, 80 Hackberries, 300 Pin Oaks, and 60 Tulip Poplars. What is the probability, and explain how you go about finding the probability, of randomly picking a Pin Oak out of this park.
The probability is ____. To get the probability, divide ____ by ____.
Dr. Baum is analyzing the distribution of two genera of trees, Acer and Quercus. In the forest you are currently studying with her, there are 11 species in the genus Acer, while there are 87 species of the genus Quercus. How many possible combinations, consisting of one member from each genus, are possible?
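Both counting questions above reduce to one line of arithmetic each. A small sketch, assuming every tree is equally likely to be picked and that any Acer species can be paired with any Quercus species:

```python
from fractions import Fraction

# Tree counts from the Point of Rocks Park question.
trees = {"White Pine": 160, "Hackberry": 80, "Pin Oak": 300, "Tulip Poplar": 60}
total_trees = sum(trees.values())

# Probability of a Pin Oak: Pin Oak count divided by the total number of trees.
p_pin_oak = Fraction(trees["Pin Oak"], total_trees)

# Multiplication rule: one species from each genus.
acer_species, quercus_species = 11, 87
pairs = acer_species * quercus_species

print(p_pin_oak, pairs)
```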
In: Statistics and Probability
Randomly selected students participated in an experiment to test their ability to determine when one minute (or sixty seconds) has passed. Forty students yielded a sample mean of 57.5 seconds. Assuming that σ = 9.2 seconds, construct and interpret a 95% confidence interval estimate of the population mean of all students. What is the 95% confidence interval for the population mean μ? Based on the result, is it likely that the students' estimates have a mean that is reasonably close to sixty seconds?
A. Yes, because the confidence interval does not include sixty seconds.
B. Yes, because the confidence interval includes sixty seconds.
C. No, because the confidence interval includes sixty seconds.
D. No, because the confidence interval does not include sixty seconds.
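Since σ is given, a z interval applies. A minimal standard-library sketch (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, sigma = 40, 57.5, 9.2
confidence = 0.95

z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z_crit * sigma / sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin

# Check whether sixty seconds lies inside the interval.
contains_sixty = lower <= 60 <= upper
print(round(lower, 2), round(upper, 2), contains_sixty)
```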
In: Statistics and Probability
Let the following sample of 8 observations be drawn from a
normal population with unknown mean and standard deviation: 16, 21,
12, 23, 25, 26, 22, 17. [You may find it useful to
reference the t table.]
a. Calculate the sample mean and the sample
standard deviation. (Round intermediate calculations to at
least 4 decimal places. Round "Sample mean" to 3 decimal places and
"Sample standard deviation" to 2 decimal places.)
b. Construct the 80% confidence interval for the population mean. (Round "t" value to 3 decimal places and final answers to 2 decimal places.)
c. Construct the 90% confidence interval for the population mean. (Round "t" value to 3 decimal places and final answers to 2 decimal places.)
d. What happens to the margin of error as the confidence level increases from 80% to 90%?
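Parts a through d can be checked with a short sketch. The critical values 1.415 and 1.895 are assumptions read from a t table for 7 degrees of freedom (upper-tail areas 0.10 and 0.05), since the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

sample = [16, 21, 12, 23, 25, 26, 22, 17]
n = len(sample)

sample_mean = mean(sample)
sample_sd = stdev(sample)            # sample standard deviation (n - 1)
standard_error = sample_sd / sqrt(n)

# Assumption: t table values for df = n - 1 = 7.
T_80 = 1.415   # upper-tail area 0.10
T_90 = 1.895   # upper-tail area 0.05

margin_80 = T_80 * standard_error
margin_90 = T_90 * standard_error
interval_80 = (sample_mean - margin_80, sample_mean + margin_80)
interval_90 = (sample_mean - margin_90, sample_mean + margin_90)

print(sample_mean, round(sample_sd, 2))
print([round(v, 2) for v in interval_80], [round(v, 2) for v in interval_90])
```

The 90% margin is larger than the 80% margin because the critical value grows with the confidence level while the standard error stays the same.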
In: Statistics and Probability
A marketing company based out of New York City is doing well and is looking to expand internationally. The CEO and VP of Operations decide to enlist the help of a consulting firm that you work for, to help collect data and analyze market trends.
You work for Mercer Human Resources. The Mercer Human Resource Consulting website (www.mercer.com) lists prices of certain items in selected cities around the world. They also report an overall cost-of-living index for each city compared to the costs of hundreds of items in New York City (NYC). For example, London at 88.33 is 11.67% less expensive than NYC.
More specifically, if you choose to explore the website further you will find a lot of fun and interesting data. You can explore the website more on your own after the course concludes.
Down below, you will find the 2018 data for 17 cities in the data set Cost of Living. Included are the 2018 cost of living index, cost of a 3-bedroom apartment (per month), price of monthly transportation pass, price of a mid-range bottle of wine, price of a loaf of bread (1 lb.), the price of a gallon of milk and price for a 12 oz. cup of black coffee. All prices are in U.S. dollars.
You use this information to run a Multiple Linear Regression to predict Cost of living, along with calculating various descriptive statistics. This is given in the Excel output (that is, the MLR has already been calculated; your task is to interpret the data). Based on this information, in which city should you open a second office? You must justify your answer. If you want to recommend 2 or 3 different cities and rank them based on the data and your findings, this is fine as well. This should be ¾ to 1 page, no more than 1 single-spaced page in length, using 12-point Times New Roman font. You do not need to do any calculations, but you do need to pick a city in which to open a second location and justify your answer based upon the provided results of the Multiple Linear Regression. Think of this assignment as the first page of a much longer report, known as an Executive Summary, that essentially summarizes your findings briefly and at a high level. This needs to be written up neatly and professionally. This would be something you would present at a board meeting in a corporate environment.
What is an Executive Summary?
To help you make this decision here are some things to consider:
Based on the MLR output, what variable(s) is/are significant?
From the significant predictors, review the mean, median, min, max, Q1 and Q3 values. It might be a good idea to compare these values to what the New York value is for that variable. Remember New York is the baseline as that is where headquarters are located.
Based on the descriptive statistics, for the significant predictors, what city/cities has the best potential? What city or cities fall above or below the median and/or the mean? What city or cities are in the upper 3rd quartile? Or the bottom quartile? These are some things to consider, not necessarily questions you need to answer in your Executive Summary, but they are questions to help guide you along in your analysis.
City | Cost of Living Index | Rent (in City Centre) | Monthly Public Trans Pass | Loaf of Bread | Milk | Bottle of Wine (mid-range) | Coffee |
Mumbai | 31.74 | $1,642.68 | $7.66 | $0.41 | $2.93 | $10.73 | $1.63 |
Prague | 50.95 | $1,240.48 | $25.01 | $0.92 | $3.14 | $5.46 | $2.17 |
Warsaw | 45.45 | $1,060.06 | $30.09 | $0.69 | $2.68 | $6.84 | $1.98 |
Athens | 63.06 | $569.12 | $35.31 | $0.80 | $5.35 | $8.24 | $2.88 |
Rome | 78.19 | $2,354.10 | $41.20 | $1.38 | $6.82 | $7.06 | $1.51 |
Seoul | 83.45 | $2,370.81 | $50.53 | $2.44 | $7.90 | $17.57 | $1.79 |
Brussels | 82.2 | $1,734.75 | $57.68 | $1.66 | $4.17 | $8.24 | $1.51 |
Madrid | 66.75 | $1,795.10 | $64.27 | $1.04 | $3.63 | $5.89 | $1.58 |
Vancouver | 74.06 | $2,937.27 | $74.28 | $2.28 | $7.12 | $14.38 | $1.47 |
Paris | 89.94 | $2,701.61 | $85.92 | $1.56 | $4.68 | $8.24 | $1.51 |
Tokyo | 92.94 | $2,197.03 | $88.77 | $1.77 | $6.46 | $17.75 | $1.49 |
Berlin | 71.65 | $1,695.77 | $95.34 | $1.24 | $3.52 | $5.89 | $1.71 |
Amsterdam | 85.9 | $2,823.28 | $105.93 | $1.33 | $4.34 | $7.06 | $1.71 |
New York | 100 | $5,877.45 | $121.00 | $2.93 | $3.98 | $15.00 | $0.84 |
Sydney | 90.78 | $3,777.72 | $124.55 | $1.94 | $4.43 | $14.01 | $2.26 |
Dublin | 87.93 | $3,025.83 | $144.78 | $1.37 | $4.31 | $14.12 | $2.06 |
London | 88.33 | $4,069.99 | $173.81 | $1.23 | $4.63 | $10.53 | $1.90 |
mean | 75.49 | $2,463.12 | $78.01 | $1.47 | $4.71 | $10.41 | $1.76 |
median | 82.2 | $2,354.10 | $74.28 | $1.37 | $4.34 | $8.24 | $1.71 |
min | 31.74 | $569.12 | $7.66 | $0.41 | $2.68 | $5.46 | $0.84 |
max | 100 | $5,877.45 | $173.81 | $2.93 | $7.90 | $17.75 | $2.88 |
Q1 | 66.75 | $1,695.77 | $41.20 | $1.04 | $3.63 | $7.06 | $1.51 |
Q3 | 88.33 | $2,937.27 | $105.93 | $1.77 | $5.35 | $14.12 | $1.98 |
New York | 100 | $5,877.45 | $121.00 | $2.93 | $3.98 | $15.00 | $0.84 |
In: Statistics and Probability
In a packing plant, one of the machines packs jars into a box. A sales rep for a packing machine manufacturer comes into the plant saying that a new machine he is selling will pack the jars faster than the old machine. To test this claim, each machine is timed for how long it takes to pack 10 cartons of jars at randomly chosen times. Given a 95% confidence interval of (-13.84, -6.68) for the true difference in average times to pack the jars (old machine - new machine), what can you conclude from this interval?
In: Statistics and Probability
3) A political poll estimates that 48% of Democrat voters will vote for Elizabeth Warren in the presidential primaries. They provide a "margin of error" of 3 percentage points, which leads to a 95% confidence interval of 45% to 51%.
a) Explain what "95% confidence level" means.
b) Why is a confidence interval needed here? What does it tell us
that the "48%" does not tell us?
c) Which of the following is the best interpretation of the
confidence interval?
i) 95% of the samples had between 45% and 51% of respondents
supporting Warren.
ii) There's a 95% chance that between 45% and 51% of those sampled
support Warren.
iii) We are 95% confident that between 45% and 51% of all voters
support Warren.
iv) We are 95% confident that between 45% and 51% of those sampled
support Warren.
In: Statistics and Probability
6) I took a random sample of people from the 2000 U.S.
Census and recorded the incomes. The sample size was n = 30 (with
replacement). The average of the sample was $23606 and the standard
deviation of the sample was $24757. Answer these questions using
the statistics that I found from my sample.
a) Find a 95% confidence interval for the mean income of the
population.
b) What assumptions must you make in order for the 95% confidence
level to be correct?
c) Your confidence interval is of the form (LB, UB) where LB and UB
are both numbers. Suppose I'm going to take a new sample of size
30. Can I conclude that the probability the average of my sample is
between LB and UB is 95%? (where LB and UB are the values you found
in your interval.) Why or why not?
d) A politician claims that the mean income of US residents (in
2000) was $25,000. Test this claim with an appropriate hypothesis
test using a significance level of 5%.
e) For part (d), what's the smallest significance level you could
have used so that the result would be that you would reject the
null hypothesis?
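Parts (a) and (d) can be sketched as below. The critical value 2.045 is an assumption taken from a t table (upper-tail area 0.025, 29 degrees of freedom), and the sketch treats the sample mean as approximately normal even though incomes are strongly skewed. The exact p-value needed for part (e) requires a t distribution function, which is not computed here.

```python
from math import sqrt

n, sample_mean, sample_sd = 30, 23606, 24757
claimed_mean = 25000

standard_error = sample_sd / sqrt(n)

# Part (a): 95% t interval.
T_CRIT = 2.045  # assumption: t table value, upper tail 0.025, df = 29
margin = T_CRIT * standard_error
lower, upper = sample_mean - margin, sample_mean + margin

# Part (d): two-sided t test of H0: mu = 25000 at the 5% level.
t_stat = (sample_mean - claimed_mean) / standard_error
reject_null = abs(t_stat) > T_CRIT

print(round(lower, 2), round(upper, 2), round(t_stat, 4), reject_null)
```

The test decision agrees with the interval: the claimed mean of $25,000 lies inside the 95% interval, so the two-sided test at the 5% level does not reject it.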
In: Statistics and Probability
1. T or F If a population is normally distributed, the sampling distribution of sample means will
also always be normally distributed
2. If a population is not normally distributed when will the sampling distribution of sample means
be guaranteed to have a normal distribution?
3. T or F For a given population and a given sample size there is only one sampling distribution of
the sample mean that will be generated.
4. The following 2 terms are NOT exactly the same. Define each including how they are similar and
how they are different
a.Sampling Error:
b. Standard Error:
5. T or F Another name for the standard deviation of the sampling distribution of sample means is
the standard error.
6. What are the 2 reasons that samples with larger sizes tend to have sample means closer to the
true population value?
7. A population is normally distributed with a mean of 50 and a standard deviation of 20. For
samples of size 25, what is the probability of randomly sampling and finding a sample mean of 54 or
more? Assume population is normally distributed. Show work for partial credit. Solve to a final
addition or subtraction step. Circle your final answer.
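Question 7 can be checked with the standard library's `NormalDist`. A minimal sketch:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50, 20, 25

# Standard error of the sample mean.
standard_error = sigma / sqrt(n)

# z-score for a sample mean of 54, and the upper-tail probability.
z = (54 - mu) / standard_error
p_at_least_54 = 1 - NormalDist(mu, standard_error).cdf(54)

print(standard_error, z, round(p_at_least_54, 4))
```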
In: Statistics and Probability
5) Explain the differences between the population
distribution, the distribution of a sample, and a sampling
distribution. Describe each of these in a single context of your
choosing.
In: Statistics and Probability
In a study of jury behavior, two samples of participants were provided details about a trial in which the defendant was obviously guilty. Although group 2 received the same details as group 1, the second group was also told that some evidence had been withheld from the jury by the judge. Later the participants were asked to recommend a jail sentence. The length of term suggested by each participant is presented here. Is there a significant difference between the two groups in their responses? Test at α = .001.
Group 1 | Group 2 |
---|---|
4 | 3 |
4 | 7 |
3 | 8 |
2 | 5 |
5 | 4 |
1 | 7 |
1 | 7 |
4 | 8 |
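One possible approach is an independent-samples t test with a pooled variance. The sketch below assumes the two groups are independent with equal population variances, and it takes the two-tailed critical value 4.140 (α = .001, 14 degrees of freedom) from a t table.

```python
from math import sqrt
from statistics import mean

group1 = [4, 4, 3, 2, 5, 1, 1, 4]
group2 = [3, 7, 8, 5, 4, 7, 7, 8]

n1, n2 = len(group1), len(group2)
m1, m2 = mean(group1), mean(group2)

# Sums of squared deviations within each group.
ss1 = sum((x - m1) ** 2 for x in group1)
ss2 = sum((x - m2) ** 2 for x in group2)

# Pooled variance and standard error of the difference in means.
pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (m1 - m2) / standard_error

T_CRIT = 4.140  # assumption: t table, two-tailed alpha = .001, df = 14
significant = abs(t_stat) > T_CRIT

print(m1, m2, round(t_stat, 3), significant)
```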
In: Statistics and Probability
4) Educators introduce a new math curriculum and measure 45 students' scores before and at the end of the academic year using an exam widely believed to accurately measure students' understanding. The average score at the start of the year was 65%, and after was 75%.
a) In words, what are the null and alternative hypotheses
that the educators would most likely be interested in
testing?
b) What does a hypothesis test tell us that we can't learn from
simply noting that students improved 10 percentage points?
c) Suppose the p-value from this hypothesis test was 0.54. Explain
how to interpret this value in this context.
In: Statistics and Probability
The following time series shows the sales of a particular product over the past 12 months.
Month | Sales |
---|---|
1 | 105 |
2 | 135 |
3 | 120 |
4 | 105 |
5 | 90 |
6 | 120 |
7 | 145 |
8 | 140 |
9 | 100 |
10 | 80 |
11 | 100 |
12 | 110 |
Use α = 0.5 to compute the exponential smoothing forecasts for the time series. (Round your answers to two decimal places.)
Month t | Time Series Value Yt | Forecast Ft |
---|---|---|
1 | 105 | |
2 | 135 | |
3 | 120 | |
4 | 105 | |
5 | 90 | |
6 | 120 | |
7 | 145 | |
8 | 140 | |
9 | 100 | |
10 | 80 | |
11 | 100 | |
12 | 110 |
Use a smoothing constant of α = 0.7 to compute the exponential smoothing forecasts. (Round your answers to two decimal places.)
Month t | Time Series Value Yt | Forecast Ft |
---|---|---|
1 | 105 | |
2 | 135 | |
3 | 120 | |
4 | 105 | |
5 | 90 | |
6 | 120 | |
7 | 145 | |
8 | 140 | |
9 | 100 | |
10 | 80 | |
11 | 100 | |
12 | 110 |
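Both forecast tables can be filled in with one small function. The sketch assumes the common textbook convention that the forecast for month 2 equals the actual value for month 1, with F(t+1) = α·Y(t) + (1 − α)·F(t) afterwards; the function name is hypothetical.

```python
sales = [105, 135, 120, 105, 90, 120, 145, 140, 100, 80, 100, 110]

def exponential_smoothing_forecasts(values, alpha):
    """Return forecasts for periods 2..n (assumes F2 = Y1)."""
    forecasts = [values[0]]
    # Each later forecast blends the previous actual with the previous forecast.
    for actual in values[1:-1]:
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts

forecasts_05 = exponential_smoothing_forecasts(sales, 0.5)
forecasts_07 = exponential_smoothing_forecasts(sales, 0.7)

# Print month, actual value, and both forecasts for months 2 through 12.
for month, (actual, f05, f07) in enumerate(
    zip(sales[1:], forecasts_05, forecasts_07), start=2
):
    print(month, actual, round(f05, 2), round(f07, 2))
```

Month 1 has no forecast under this convention, so the first row of each table stays blank.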
Week | Sales (1,000s of gallons) |
---|---|
1 | 17 |
2 | 22 |
3 | 19 |
4 | 24 |
5 | 19 |
6 | 16 |
7 | 21 |
8 | 19 |
9 | 23 |
10 | 20 |
11 | 16 |
12 | 22 |
(a) Compute four-week and five-week moving averages for the time series.
Week | Time Series Value | 4-Week Moving Average Forecast | 5-Week Moving Average Forecast |
---|---|---|---|
1 | 17 | ||
2 | 22 | ||
3 | 19 | ||
4 | 24 | ||
5 | 19 | ||
6 | 16 | ||
7 | 21 | ||
8 | 19 | ||
9 | 23 | ||
10 | 20 | ||
11 | 16 | ||
12 | 22 |
Compute the MSE for the four-week moving average forecasts. (Round your answer to two decimal places.)
Compute the MSE for the five-week moving average forecasts. (Round your answer to two decimal places.)
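The moving-average forecasts and their MSE values can be sketched as follows. The helper names are hypothetical, and each forecast uses only the weeks before the one being forecast.

```python
weekly_sales = [17, 22, 19, 24, 19, 16, 21, 19, 23, 20, 16, 22]

def moving_average_forecasts(values, window):
    """Forecast each period from the mean of the preceding `window` values."""
    return [sum(values[t - window:t]) / window for t in range(window, len(values))]

def mean_squared_error(values, forecasts, window):
    """MSE over the periods that actually have a forecast."""
    errors = [actual - f for actual, f in zip(values[window:], forecasts)]
    return sum(e * e for e in errors) / len(errors)

forecasts_4 = moving_average_forecasts(weekly_sales, 4)   # weeks 5..12
forecasts_5 = moving_average_forecasts(weekly_sales, 5)   # weeks 6..12

mse_4 = mean_squared_error(weekly_sales, forecasts_4, 4)
mse_5 = mean_squared_error(weekly_sales, forecasts_5, 5)

print(forecasts_4, round(mse_4, 2))
print(forecasts_5, round(mse_5, 2))
```

Note that the two MSE values are averaged over different numbers of weeks (eight and seven), which matters when comparing them.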
In: Statistics and Probability