In: Statistics and Probability
In a population of Siberian flying squirrels in western Finland, assume that the number of pups born to each female over her lifetime has mean μ = 3.66 and standard deviation σ = 2.9598. The distribution of squirrel pups born is non‑normal because it takes only whole, non‑negative values.
Determine the mean number of pups, x̄, such that in 70% of all random samples of such squirrels of size n = 40, the mean number of pups born to females in the sample is less than x̄.
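One possible way to set this up, sketched under the assumption that the central limit theorem is a reasonable approximation at a sample size of 40: the sample mean is then roughly normal with the population mean and a standard error equal to the standard deviation divided by the square root of the sample size, and the requested value is the 70th percentile of that distribution. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 3.66        # population mean number of pups
sigma = 2.9598   # population standard deviation
n = 40           # sample size

# Approximate sampling distribution of the sample mean (CLT assumption).
sampling_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

# 70th percentile: about 70% of sample means fall below this value.
x_bar_70 = sampling_dist.inv_cdf(0.70)
print(f"70th percentile of the sample mean: {x_bar_70:.4f}")
```

The normal approximation is applied to the sample mean, not to the pup counts themselves, which is why the non-normal population does not block this approach.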
In: Statistics and Probability
Explain when you use the χ²-test for independence/homogeneity and when you use the χ²-test for goodness-of-fit. Why do you not compute a confidence interval for these two tests? (You should give at least two reasons.)
In: Statistics and Probability
A statistical program is recommended.
The Consumer Reports Restaurant Customer Satisfaction Survey is based upon 148,599 visits to full-service restaurant chains. Assume the following data are representative of the results reported. The variable type indicates whether the restaurant is an Italian restaurant or a seafood/steakhouse. Price indicates the average amount paid per person for dinner and drinks, minus the tip. Score reflects diners' overall satisfaction, with higher values indicating greater overall satisfaction. A score of 80 can be interpreted as very satisfied. (Let x1 represent average meal price, x2 represent type of restaurant, and y represent overall customer satisfaction.)
Restaurant | Type | Price ($) | Score |
---|---|---|---|
Bertucci's | Italian | 16 | 77 |
Black Angus Steakhouse | Seafood/Steakhouse | 24 | 79 |
Bonefish Grill | Seafood/Steakhouse | 26 | 85 |
Bravo! Cucina Italiana | Italian | 18 | 84 |
Buca di Beppo | Italian | 17 | 81 |
Bugaboo Creek Steak House | Seafood/Steakhouse | 18 | 77 |
Carrabba's Italian Grill | Italian | 23 | 86 |
Charlie Brown's Steakhouse | Seafood/Steakhouse | 17 | 75 |
Il Fornaio | Italian | 28 | 83 |
Joe's Crab Shack | Seafood/Steakhouse | 15 | 71 |
Johnny Carino's | Italian | 17 | 81 |
Lone Star Steakhouse & Saloon | Seafood/Steakhouse | 17 | 76 |
Longhorn Steakhouse | Seafood/Steakhouse | 19 | 81 |
Maggiano's Little Italy | Italian | 22 | 83 |
McGrath's Fish House | Seafood/Steakhouse | 16 | 81 |
Olive Garden | Italian | 19 | 81 |
Outback Steakhouse | Seafood/Steakhouse | 20 | 80 |
Red Lobster | Seafood/Steakhouse | 18 | 78 |
Romano's Macaroni Grill | Italian | 18 | 82 |
The Old Spaghetti Factory | Italian | 12 | 79 |
Uno Chicago Grill | Italian | 16 | 76 |
Develop the estimated regression equation to show how overall customer satisfaction is related to the average meal price and the type of restaurant. (Use the dummy variable developed in part (c). Round your numerical values to two decimal places.)
ŷ =
Find the value of the test statistic. (Round your answer to two decimal places.)
Find the p-value. (Round your answer to three decimal places.)
p-value =
(f)
Predict the Consumer Reports customer satisfaction score for a seafood/steakhouse that has an average meal price of $25. (Round your answer to two decimal places.)
How much would the predicted score have changed for an Italian restaurant? (Round your answer to two decimal places.)
The predicted satisfaction score increases by ____ points for Italian restaurants.
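A minimal sketch of how the estimated regression equation and the two predictions might be computed by ordinary least squares. The dummy coding is an assumption, since part (c) is not shown here: x2 = 1 for Italian and x2 = 0 for seafood/steakhouse. The helper names `solve` and `predict` are illustrative, and the test statistic and p-value parts are not covered.

```python
# (price, is_italian, score); is_italian = 1 for Italian, 0 for seafood/steakhouse.
# This coding is an assumption, because part (c) is not reproduced in the question.
data = [
    (16, 1, 77), (24, 0, 79), (26, 0, 85), (18, 1, 84), (17, 1, 81),
    (18, 0, 77), (23, 1, 86), (17, 0, 75), (28, 1, 83), (15, 0, 71),
    (17, 1, 81), (17, 0, 76), (19, 0, 81), (22, 1, 83), (16, 0, 81),
    (19, 1, 81), (20, 0, 80), (18, 0, 78), (18, 1, 82), (12, 1, 79),
    (16, 1, 76),
]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Design matrix rows: intercept, price (x1), dummy (x2).
rows = [(1.0, price, italian) for price, italian, _ in data]
y = [score for _, _, score in data]

# Normal equations (X'X) b = X'y.
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
b0, b1, b2 = solve(xtx, xty)


def predict(price, italian):
    """Predicted satisfaction score from the fitted equation."""
    return b0 + b1 * price + b2 * italian


print(f"y_hat = {b0:.2f} {b1:+.2f}*x1 {b2:+.2f}*x2")
print(f"Seafood/steakhouse at $25: {predict(25, 0):.2f}")
print(f"Italian at $25:            {predict(25, 1):.2f}")
```

Under this coding, the difference between the two predictions at the same price equals the dummy coefficient b2.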
In: Statistics and Probability
A national soccer team scores 1 goal per game on average. Let the data from the past 20 games be given by 1,2,2,1,1,1,1,2,1,4,0,2,3,0,2,1,0,3,4,1. Using this information, please answer the following questions.
(a) (1 point) What is an appropriate statistical model for the number of goals scored in a given game if modeled by a random variable X taking values in {0, 1, 2, ...}?
(b) (1 point) Formulate an appropriate two-sided testing problem for the model's parameter in (a).
(c) (3 points) Use the likelihood ratio test statistic to test the null hypothesis in (b).
(d) (2+3 points) Make a decision on the null hypothesis in (b) by means of (i) an appropriate confidence interval and (ii) an appropriate p-value.
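A sketch of one way parts (c) and (d) could be worked, assuming a Poisson model with rate λ, the hypotheses H0: λ = 1 against H1: λ ≠ 1, the asymptotic chi-square approximation with 1 degree of freedom, and a 5% level (the question does not state a level). The confidence interval shown is a Wald interval, which is only one of several reasonable choices.

```python
from math import log, sqrt
from statistics import NormalDist

goals = [1, 2, 2, 1, 1, 1, 1, 2, 1, 4, 0, 2, 3, 0, 2, 1, 0, 3, 4, 1]
lambda_0 = 1.0                  # H0: lambda = 1 (the stated long-run average)
n = len(goals)
lambda_hat = sum(goals) / n     # the Poisson MLE is the sample mean

# -2 log likelihood ratio for the Poisson model; the factorial terms cancel.
lr_stat = 2 * (n * (lambda_0 - lambda_hat)
               + sum(goals) * log(lambda_hat / lambda_0))

# Asymptotic chi-square(1) p-value: P(chi2_1 > t) = 2 * (1 - Phi(sqrt(t))).
p_value = 2 * (1 - NormalDist().cdf(sqrt(lr_stat)))

# Wald 95% confidence interval for lambda (assumed level).
z = NormalDist().inv_cdf(0.975)
half_width = z * sqrt(lambda_hat / n)
ci = (lambda_hat - half_width, lambda_hat + half_width)

print(f"lambda_hat = {lambda_hat:.2f}, LR statistic = {lr_stat:.3f}")
print(f"p-value = {p_value:.4f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

With only 20 games, the asymptotic approximations are rough; an exact Poisson test on the total number of goals would be a natural cross-check.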
In: Statistics and Probability
Each of three supermarket chains in the Denver area claims to have the lowest overall prices. As part of an investigative study on supermarket advertising, a local television station conducted a study by randomly selecting nine grocery items. Then, on the same day, an intern was sent to each of the three stores to purchase the nine items. From the receipts, the following data were recorded. At the 0.010 significance level, is there a difference in the mean price for the nine items between the three supermarkets?
Item | Super's ($) | Ralph's ($) | Lowblaw's ($) |
---|---|---|---|
1 | 1.87 | 3.10 | 1.87 |
2 | 1.07 | 2.46 | 2.46 |
3 | 1.14 | 1.23 | 1.37 |
4 | 1.10 | 1.29 | 1.29 |
5 | 1.25 | 2.46 | 1.25 |
6 | 3.54 | 1.72 | 2.40 |
7 | 1.25 | 1.25 | 2.40 |
8 | 1.80 | 1.87 | 2.10 |
9 | 3.10 | 2.50 | 2.30 |
A. State the null hypothesis and the alternate hypothesis.
For Treatment (Stores): Null hypothesis
choices:
a. H0: μ1 ≠ μ2 ≠ μ3
b. H0: μ1 = μ2 = μ3
B. Alternate hypothesis
choices:
a. H1: There is no difference in the store means.
b. H1: There is a difference in the store means.
C. For blocks (Items):
choices:
a. H0: μ1 = μ2 = ... = μ9
b. H0: μ1 ≠ μ2 ≠ ... ≠ μ9
D. Alternate hypothesis
choices:
a. H1: There is no difference in the item means.
b. H1: There is a difference in the item means.
E. What is the decision rule for both? (Round your answers to 2 decimal places.)
For stores | Reject H0 if F > ? |
For items | Reject H0 if F > ? |
F. Complete an ANOVA table. (Round your SS, MS to 3 decimal places, and F to 2 decimal places.)
source | SS | df | MS | F |
Stores | ? | ? | ? | ? |
Items | ? | ? | ? | ? |
Error | ? | ? | ? | |
Total | ? |
G. What is your decision regarding the null hypothesis? The decision for the F value (Stores) at 0.010 significance is:
choices:
a. Do not reject H0
b. Reject H0
H. The decision for the F value (Items) at 0.010 significance is:
choices:
a. Reject H0
b. Do not reject H0
I. Is there a difference in the item means and in the store means?
There is (a difference / no difference) in the store means. There is (a difference / no difference) in the item means.
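The sums of squares for the ANOVA table can be sketched as a randomized block design, with stores as treatments and items as blocks. This only computes the table entries; the critical F values for the decision rule would still come from an F table or a statistics package at the 0.010 level with (2, 16) and (8, 16) degrees of freedom. Variable names are illustrative.

```python
# Rows are items 1-9; columns are Super's, Ralph's, Lowblaw's.
prices = [
    [1.87, 3.10, 1.87],
    [1.07, 2.46, 2.46],
    [1.14, 1.23, 1.37],
    [1.10, 1.29, 1.29],
    [1.25, 2.46, 1.25],
    [3.54, 1.72, 2.40],
    [1.25, 1.25, 2.40],
    [1.80, 1.87, 2.10],
    [3.10, 2.50, 2.30],
]
b = len(prices)       # number of blocks (items)
k = len(prices[0])    # number of treatments (stores)

grand = sum(map(sum, prices)) / (b * k)
store_means = [sum(row[j] for row in prices) / b for j in range(k)]
item_means = [sum(row) / k for row in prices]

ss_total = sum((x - grand) ** 2 for row in prices for x in row)
ss_stores = b * sum((m - grand) ** 2 for m in store_means)
ss_items = k * sum((m - grand) ** 2 for m in item_means)
# Error sum of squares from the residuals of the additive model.
ss_error = sum(
    (prices[i][j] - item_means[i] - store_means[j] + grand) ** 2
    for i in range(b) for j in range(k)
)

df_stores, df_items = k - 1, b - 1
df_error = df_stores * df_items
ms_stores = ss_stores / df_stores
ms_items = ss_items / df_items
ms_error = ss_error / df_error

f_stores = ms_stores / ms_error
f_items = ms_items / ms_error

print(f"Stores: SS={ss_stores:.3f} df={df_stores} MS={ms_stores:.3f} F={f_stores:.2f}")
print(f"Items:  SS={ss_items:.3f} df={df_items} MS={ms_items:.3f} F={f_items:.2f}")
print(f"Error:  SS={ss_error:.3f} df={df_error} MS={ms_error:.3f}")
print(f"Total:  SS={ss_total:.3f} df={b * k - 1}")
```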
In: Statistics and Probability
Suppose you are interested in finding out how many people in the population have Covid19. How would you attempt to do that? Explain carefully. What are some challenges that you might encounter and how might you overcome them?
In: Statistics and Probability
A pet food company has a business objective of expanding its product line beyond its current kidney and shrimp-based cat foods. The company developed two new products, one based on chicken liver and the other based on salmon. The company conducted an experiment to compare the two new products with its two existing ones, as well as a generic beef-based product sold at a supermarket chain. For the experiment, a sample of 50 cats from the population at a local animal shelter was selected. Ten cats were randomly assigned to each of the five products being tested. Each of the cats was then presented with 3 ounces of the selected food in a dish at feeding time. The researchers defined the variable to be measured as the number of ounces of food that the cat consumed within a 10-minute time interval that began when the filled dish was presented. The results are summarized in the dataset (CatFood.xlsx).
CatFood.xlsx data (ounces eaten):
Kidney | Shrimp | Chicken Liver | Salmon | Beef |
---|---|---|---|---|
2.37 | 2.26 | 2.29 | 1.79 | 2.09 |
2.62 | 2.69 | 2.23 | 2.33 | 1.87 |
2.31 | 2.25 | 2.41 | 1.96 | 1.67 |
2.47 | 2.45 | 2.68 | 2.05 | 1.64 |
2.59 | 2.34 | 2.25 | 2.26 | 2.16 |
2.62 | 2.37 | 2.17 | 2.24 | 1.75 |
2.34 | 2.22 | 2.37 | 1.96 | 1.18 |
2.47 | 2.56 | 2.26 | 1.58 | 1.92 |
2.45 | 2.36 | 2.45 | 2.18 | 1.32 |
2.32 | 2.59 | 2.57 | 1.93 | 1.94 |
(a) To test whether there is evidence of a difference in the mean amount of food eaten among the various products, fill in all the values in the following one-way ANOVA summary table.
Source of Variation | SS | df | MS | F |
---|---|---|---|---|
Among Groups | | | | |
Within Groups | | | | - |
Total | | | - | - |
(b) At the 0.01 level of significance, is there evidence of a difference in the mean amount of food eaten among the various products? Test the hypothesis using the F test based on the result of (a).
(c) At the 0.01 level of significance, determine which products appear to differ significantly in the mean amount of food eaten using the Tukey-Kramer method.
(d) At the 0.01 level of significance, is there evidence of a difference in the variation in the amount of food eaten among the various products? Test the homogeneity of the variances using Levene's test.
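A sketch of how the one-way ANOVA summary table might be filled in from the data. It covers only the sums of squares, degrees of freedom, mean squares, and F statistic; the critical value for the F test, the Tukey-Kramer comparisons, and Levene's test are not covered and would need a statistics package or tables. Variable names are illustrative.

```python
groups = {
    "Kidney": [2.37, 2.62, 2.31, 2.47, 2.59, 2.62, 2.34, 2.47, 2.45, 2.32],
    "Shrimp": [2.26, 2.69, 2.25, 2.45, 2.34, 2.37, 2.22, 2.56, 2.36, 2.59],
    "Chicken Liver": [2.29, 2.23, 2.41, 2.68, 2.25, 2.17, 2.37, 2.26, 2.45, 2.57],
    "Salmon": [1.79, 2.33, 1.96, 2.05, 2.26, 2.24, 1.96, 1.58, 2.18, 1.93],
    "Beef": [2.09, 1.87, 1.67, 1.64, 2.16, 1.75, 1.18, 1.92, 1.32, 1.94],
}

all_values = [x for values in groups.values() for x in values]
n_total = len(all_values)
c = len(groups)
grand_mean = sum(all_values) / n_total

# Among-groups: weighted squared distance of each group mean from the grand mean.
ss_among = sum(
    len(v) * (sum(v) / len(v) - grand_mean) ** 2 for v in groups.values()
)
# Within-groups: squared distance of each observation from its own group mean.
ss_within = sum(
    (x - sum(v) / len(v)) ** 2 for v in groups.values() for x in v
)
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

df_among = c - 1
df_within = n_total - c
ms_among = ss_among / df_among
ms_within = ss_within / df_within
f_stat = ms_among / ms_within

print(f"Among Groups:  SS={ss_among:.4f} df={df_among} MS={ms_among:.4f} F={f_stat:.2f}")
print(f"Within Groups: SS={ss_within:.4f} df={df_within} MS={ms_within:.4f}")
print(f"Total:         SS={ss_total:.4f} df={n_total - 1}")
```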
In: Statistics and Probability
2. (9 pts) It is difficult to determine a person’s body fat percentage accurately without immersing him or her in water. Researchers hoping to find ways to make a good estimate immersed 20 male subjects, then measured their weights shown in the table.
(2 pts) Determine the linear correlation coefficient.
(4 pts) Find the least-squares regression line.
(2 pts) Interpret the slope and y-intercept if appropriate.
(1 pt) Predict the body fat percentage if the weight is 190 lb.
Weight (lb) | Body Fat (%) |
---|---|
175 | 6 |
181 | 21 |
200 | 15 |
159 | 6 |
196 | 22 |
192 | 31 |
205 | 32 |
173 | 21 |
187 | 25 |
188 | 30 |
188 | 10 |
240 | 20 |
175 | 22 |
168 | 9 |
246 | 38 |
160 | 10 |
215 | 27 |
159 | 12 |
146 | 10 |
219 | 28 |
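The correlation coefficient, least-squares line, and prediction can be sketched directly from the usual sums of squares and cross-products. This is a plain-Python illustration with illustrative variable names; interpreting the slope and intercept is left to the reader.

```python
from math import sqrt

weight = [175, 181, 200, 159, 196, 192, 205, 173, 187, 188,
          188, 240, 175, 168, 246, 160, 215, 159, 146, 219]
body_fat = [6, 21, 15, 6, 22, 31, 32, 21, 25, 30,
            10, 20, 22, 9, 38, 10, 27, 12, 10, 28]

n = len(weight)
mean_x = sum(weight) / n
mean_y = sum(body_fat) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in weight)
syy = sum((y - mean_y) ** 2 for y in body_fat)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weight, body_fat))

r = sxy / sqrt(sxx * syy)                # linear correlation coefficient
slope = sxy / sxx                        # least-squares slope
intercept = mean_y - slope * mean_x      # least-squares intercept
pred_190 = intercept + slope * 190       # predicted body fat (%) at 190 lb

print(f"r = {r:.4f}")
print(f"y_hat = {intercept:.4f} + {slope:.4f} * weight")
print(f"Predicted body fat at 190 lb: {pred_190:.2f}%")
```

Because a weight of 0 lb is far outside the observed range, the intercept is unlikely to have a meaningful practical interpretation here.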
In: Statistics and Probability
Scientific Reasoning with Philosophy and Statistics
We learned about what is commonly referred to as classical or frequentist statistics and discussed Bayesian statistics in the Romero paper, the Aeon article, and the Papineau article. These three papers provided reasons for favoring Bayesian statistics over frequentist statistics. Give a brief summary of some of those reasons from these discussions.
In: Statistics and Probability
An unknown distribution has a mean of 82 and a standard deviation of 11.2. Samples of size n = 35 are drawn randomly from the population. Find the probability that the sample mean is between 81.2 and 83.6.
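A minimal sketch, assuming the central limit theorem applies at n = 35 so that the sample mean is approximately normal with mean 82 and standard error 11.2/√35. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 82, 11.2, 35

# Approximate sampling distribution of the sample mean (CLT assumption).
sampling_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

# P(81.2 < sample mean < 83.6)
prob = sampling_dist.cdf(83.6) - sampling_dist.cdf(81.2)
print(f"P(81.2 < x_bar < 83.6) = {prob:.4f}")
```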
In: Statistics and Probability
We are interested in seeing if there are any meaningful differences in the willingness of certain tapestry segments to buy organic products. Top Tier reports purchasing an average of 3.8 organic items per week per family with a standard deviation of 2, while Pleasantville reports purchasing 2.3 organic items per week per family with a standard deviation of 0.5. The sample size for Top Tier is 50 and the sample size for Pleasantville is 100. Test the hypothesis at a 95% confidence level. Explain.
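One way this comparison could be sketched is a two-sided, two-sample z test on the difference in means, assuming independent samples, that the reported standard deviations are sample standard deviations, and that both samples are large enough for the normal approximation. A Welch t test would be a reasonable alternative. The 5% significance level corresponds to the stated 95% confidence level.

```python
from math import sqrt
from statistics import NormalDist

mean_1, sd_1, n_1 = 3.8, 2.0, 50     # Top Tier
mean_2, sd_2, n_2 = 2.3, 0.5, 100    # Pleasantville

# Standard error of the difference in means (unpooled variances).
se = sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
z = (mean_1 - mean_2) / se

# Two-sided p-value and critical value at the 5% level.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
z_crit = NormalDist().inv_cdf(0.975)
reject_h0 = abs(z) > z_crit

print(f"z = {z:.2f}, critical value = {z_crit:.2f}, p-value = {p_value:.2g}")
print("Reject H0 of equal means" if reject_h0 else "Do not reject H0 of equal means")
```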
In: Statistics and Probability
Suppose that the sitting back-to-knee length for a group of adults has a normal distribution with a mean of μ = 23.4 in. and a standard deviation of σ = 1.1 in. These data are often used in the design of different seats, including aircraft seats, train seats, theater seats, and classroom seats. Instead of using 0.05 for identifying significant values, use the criteria that a value x is significantly high if P(x or greater) ≤ 0.01 and a value is significantly low if P(x or less) ≤ 0.01. Find the back-to-knee lengths separating significant values from those that are not significant. Using these criteria, is a back-to-knee length of 25.7 in. significantly high?
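The two cutoffs are the 1st and 99th percentiles of the stated normal distribution, which can be sketched with the standard library. Variable names are illustrative.

```python
from statistics import NormalDist

dist = NormalDist(mu=23.4, sigma=1.1)

# Values at or above high_cutoff satisfy P(x or greater) <= 0.01;
# values at or below low_cutoff satisfy P(x or less) <= 0.01.
high_cutoff = dist.inv_cdf(0.99)
low_cutoff = dist.inv_cdf(0.01)

x = 25.7
is_significantly_high = x >= high_cutoff

print(f"Low cutoff:  {low_cutoff:.2f} in.")
print(f"High cutoff: {high_cutoff:.2f} in.")
print(f"{x} in. significantly high? {is_significantly_high}")
```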
In: Statistics and Probability
R Problem Set:
#Work with the built-in dataset "cars"
View(cars)
This will show you the dataset on 2 variables speed and distance.
?cars
This will explain what the variables mean.
#Q1) Describe the dataset. What are the main findings?
#Q2) Design a relevant question to model using linear regression.
#Q3) Run the regression and report the std error, t-stat, p value and f stat.
#Q4) Is this a valid regression? Is the normality assumption justified? Show clearly.
#Q5) Are there any outliers in the data? How do the observations influence the results?
#Q6) Write a 3-5 line summary of what you conclude from this exercise.
Please submit a clear pdf document showing all your results. Also, attach a copy of your code.
In: Statistics and Probability
Why might the exponential family of distributions be important? In other words, if a real-world application turned out to have a distribution that belongs to this family, what interesting results or convenient truths might that bring to the table?
In: Statistics and Probability