Your insurance company has coverage for three types of cars. The annual cost for each type of car can be modeled using a Gaussian (Normal) distribution, with the following parameters: (Discussions allowed!)
Use the Random Number Generator to simulate a column of 1000 values for each of the three cases. Example: for Car Type 1, use Number of variables=1, Number of random numbers=1000, Distribution=Normal, Mean=520 and Standard deviation=110, and leave Random Seed empty.
Next: either use sorting to construct the appropriate histogram, or use the rule of thumb, to answer the questions:
13. What is the approximate probability that Car Type 1 has an annual cost of less than $550?
14. Which of the three types of cars is most likely to cost more than $1000?
15. For which of the three types do we have the highest average cost?
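For question 13, the simulation can be cross-checked against the exact normal probability. The sketch below is one possible way to do that in Python rather than in a spreadsheet; it uses only the Car Type 1 parameters quoted in the example (mean 520, standard deviation 110), and the fixed seed is an assumption made purely so the run is repeatable.

```python
import random
from statistics import NormalDist

# Parameters for Car Type 1 as given in the example above.
MEAN, SD = 520, 110
THRESHOLD = 550

# Exact probability from the normal CDF.
exact = NormalDist(MEAN, SD).cdf(THRESHOLD)

# Simulated estimate: 1000 draws, mirroring the spreadsheet exercise.
rng = random.Random(1)  # fixed seed only so the sketch is repeatable
draws = [rng.gauss(MEAN, SD) for _ in range(1000)]
simulated = sum(d < THRESHOLD for d in draws) / len(draws)

print(f"exact P(cost < {THRESHOLD}) = {exact:.4f}")
print(f"simulated estimate          = {simulated:.4f}")
```

The simulated share should land close to the exact value, with the gap shrinking as the number of draws grows.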
In: Statistics and Probability
You are developing a simple linear regression analysis model. The simple correlation coefficient between y and x is -0.72. What do you know must be true about b1, the least squares estimator of β1? Why?
In a multiple linear regression analysis with k = 3, you conclude from the t test associated with β1 that β1 = 0. When you do the F test, will you reject or fail to reject the null hypothesis? Why?
In a simple linear regression analysis, when x = -1.2, the 95% confidence interval for the mean of y is (42, 58). Give an interval (there are multiple answers that can be correct) that could be the 95% prediction interval for y when x = -1.2. Explain why your answer works.
You have quarterly time series data and you want to develop a model that will allow you to make forecasts for the next four quarters. Should you use a moving average model, a simple exponential smoothing model, or a regression model? Why?
In a regression analysis, you get r^2 = 0.40. What does this tell you about the mathematical relationship of SSE and SSR? How do you know?
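For the r^2 question, the relationship follows from the standard decomposition SST = SSR + SSE together with r^2 = SSR / SST. A small sketch of the arithmetic, with variable names chosen here for illustration:

```python
# r^2 = SSR / SST and SST = SSR + SSE, so SSE / SST = 1 - r^2.
r_squared = 0.40

ssr_share = r_squared          # fraction of SST explained by the regression
sse_share = 1 - r_squared      # fraction of SST left in the residuals
ratio = ssr_share / sse_share  # SSR / SSE

print(f"SSR = {ssr_share:.2f} * SST, SSE = {sse_share:.2f} * SST")
print(f"SSR / SSE = {ratio:.4f}")
```

So SSR is two thirds of SSE, whatever the actual size of SST.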
In: Statistics and Probability
Two investment strategies, one risky and one prudent, are compared. Seven of the funds following the first strategy have an average return of 15% with a standard deviation of 12%. Fifteen of the funds that follow the second strategy have an average return of 12% with a standard deviation of 9%. What are the calculated and critical test statistics, and what is the test result? (alpha = 5%; consider a one-sided test that one of the variances is higher.)
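A one-sided comparison of two variances is usually done with an F ratio of the sample variances. The sketch below works from the summary figures in the question; the critical value is not computed but taken from a standard F table, which is an assumption flagged in the code.

```python
# Summary statistics from the question.
n1, s1 = 7, 12.0    # risky strategy: number of funds, std dev of return (%)
n2, s2 = 15, 9.0    # prudent strategy

# Larger sample variance goes in the numerator for the one-sided test.
f_stat = s1**2 / s2**2
df_num, df_den = n1 - 1, n2 - 1

# Assumption: critical value read from a standard F table,
# upper 5% point for (6, 14) degrees of freedom, rounded to 2 decimals.
F_CRIT = 2.85

reject = f_stat > F_CRIT
print(f"F = {f_stat:.4f} on ({df_num}, {df_den}) df; critical = {F_CRIT}")
print("reject H0" if reject else "fail to reject H0")
```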
In: Statistics and Probability
Collinearity
Discuss the problems that result when collinearity is present in a regression analysis.
How can you detect collinearity?
What remedial measures are available when collinearity is detected?
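One common way to detect collinearity is the variance inflation factor (VIF). With exactly two predictors it reduces to 1 / (1 - r^2), where r is the correlation between them. The sketch below illustrates that special case only; the data are hypothetical and invented for this example, and the often-quoted threshold of about 10 is a rule of thumb rather than a formal test.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical predictors, invented for illustration only.
x1 = [1, 2, 3, 4, 5]
x2_collinear = [2, 4.1, 5.9, 8.2, 9.9]   # almost exactly 2 * x1
x2_unrelated = [1, -1, 0, -1, 1]         # zero sample correlation with x1

vif_high = vif_two_predictors(x1, x2_collinear)
vif_low = vif_two_predictors(x1, x2_unrelated)
print(f"near-collinear VIF = {vif_high:.1f}, unrelated VIF = {vif_low:.1f}")
```

With more than two predictors, r^2 is replaced by the R^2 from regressing each predictor on all the others.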
In: Statistics and Probability
A consumer buying cooperative tested the effective heating area of 20 different electric space heaters with different wattages. Here are the results.
Heater | Wattage | Area
1 | 1,250 | 365
2 | 1,750 | 455
3 | 2,000 | 545
4 | 1,800 | 455
5 | 1,500 | 355
6 | 1,250 | 354
7 | 300 | 65
8 | 1,000 | 35
9 | 600 | 45
10 | 1,000 | 55
11 | 300 | 65
12 | 2,250 | 545
13 | 300 | 55
14 | 950 | 55
15 | 300 | 25
16 | 400 | 35
17 | 800 | 55
18 | 240 | 65
19 | 1,800 | 455
20 | 300 | 25
a. Compute the correlation between the wattage and heating area. Is there a direct or an indirect relationship? (Negative value should be indicated by a minus sign. Round your answer to 3 decimal places.)
The correlation of Wattage and Area is _______.
b. Conduct a test of hypothesis to determine if it is reasonable that the coefficient is greater than zero. Use the 0.05 significance level. (Negative value should be indicated by a minus sign. Round intermediate calculations and final answer to 3 decimal places.)
H0: ρ ≤ 0; H1: ρ > 0 Reject H0 if t > 1.7341
t=______
c. Develop the regression equation for effective heating based on wattage. (Negative value should be indicated by a minus sign. Round your answers to 3 decimal places.)
The regression equation is Area=______+______ Wattage
d. Which heater looks like the “best buy” based on the size of the residual? (Negative value should be indicated by a minus sign. Round residual value to 2 decimal places.)
The fifth heater is the "best buy." It heats an area that is ________ square feet larger than estimated by the regression equation.
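Parts a through d can all be derived from the same three sums of squares and cross-products. The following is a minimal sketch using the wattage and area values exactly as listed in the table; the variable names are illustrative, and the residual ranking it prints depends entirely on the data as transcribed here.

```python
from math import sqrt

# Wattage and heated area for the 20 heaters in the table above.
wattage = [1250, 1750, 2000, 1800, 1500, 1250, 300, 1000, 600, 1000,
           300, 2250, 300, 950, 300, 400, 800, 240, 1800, 300]
area = [365, 455, 545, 455, 355, 354, 65, 35, 45, 55,
        65, 545, 55, 55, 25, 35, 55, 65, 455, 25]

n = len(wattage)
mean_w, mean_a = sum(wattage) / n, sum(area) / n
sxx = sum((w - mean_w) ** 2 for w in wattage)
syy = sum((a - mean_a) ** 2 for a in area)
sxy = sum((w - mean_w) * (a - mean_a) for w, a in zip(wattage, area))

r = sxy / sqrt(sxx * syy)                    # part a: correlation
t = r * sqrt(n - 2) / sqrt(1 - r ** 2)       # part b: t statistic, df = n - 2
slope = sxy / sxx                            # part c: regression slope
intercept = mean_a - slope * mean_w          # part c: regression intercept
residuals = [a - (intercept + slope * w)     # part d: actual minus fitted
             for w, a in zip(wattage, area)]

best = max(range(n), key=lambda i: residuals[i])
print(f"r = {r:.3f}, t = {t:.3f}")
print(f"Area = {intercept:.3f} + {slope:.3f} Wattage")
print(f"largest residual: heater {best + 1}, {residuals[best]:.2f}")
```

The t statistic is compared with the stated rejection point of 1.7341 for part b.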
In: Statistics and Probability
A psychologist conducted an experiment analysing the relationship between student scores in an exam and the amount of attention they paid in class. The latter was measured using a type of brain monitor. The psychologist believed that scores would increase by 1 for every two-unit increase in attention. The data are listed in the Excel spreadsheet.
Estimate a linear regression between the score (Y) and the measure of attention(X).
(a) Write out the equation for Y in the form Y = α + βX, but with the estimated coefficients. Show the estimated standard errors in parentheses below the coefficients. What is the R² of the regression? Calculate a 99 percent confidence interval for β. [5 pts]
(b) What are the mean and the estimated standard deviation of the estimated residuals? [2 pts]
Hint: the first answer is definitional and the second answer is easily seen from the output.
(c) Test the hypothesis that there is no relationship between the variables at the 90 percent significance level. [3 pts]
(d) Test the hypothesis that the coefficient β=0.5 at the 99% significance level. [3 pts]
(e) The Psychologist concluded from the experiment that test scores increase significantly if students pay attention in class. In one word, how would you describe the results of this experiment based on the data you have? [2 pts]
DATA:
Regression data for Psychology Experiment
Attention | Score
18 | 80
35 | 90
86 | 80
22 | 50
72 | 76
102 | 74
86 | 75
30 | 80
35 | 85
94 | 82
16 | 80
42 | 41
50 | 50
96 | 96
60 | 80
106 | 70
80 | 65
14 | 14
11 | 14
80 | 85
12 | 14
37 | 43
26 | 80
86 | 70
5 | 20
17 | 20
35 | 80
76 | 68
50 | 70
15 | 16
90 | 86
96 | 80
7 | 16
10 | 14
35 | 65
88 | 88
20 | 32
22 | 70
50 | 65
22 | 62
35 | 50
64 | 92
68 | 84
13 | 15
102 | 102
86 | 85
18 | 24
78 | 64
98 | 78
70 | 80
60 | 70
98 | 98
9 | 14
50 | 90
104 | 72
35 | 45
60 | 60
74 | 72
88 | 88
80 | 95
22 | 58
8 | 14
86 | 110
60 | 75
92 | 84
60 | 100
80 | 75
86 | 95
16 | 18
86 | 90
35 | 75
35 | 60
80 | 60
80 | 70
104 | 104
80 | 100
60 | 90
86 | 100
62 | 96
60 | 65
39 | 41
50 | 80
50 | 75
6 | 18
60 | 95
22 | 54
21 | 40
100 | 100
94 | 94
80 | 90
48 | 41
106 | 106
50 | 43
46 | 41
90 | 90
60 | 85
92 | 92
22 | 80
35 | 70
66 | 88
80 | 60
50 | 60
80 | 80
100 | 76
50 | 45
86 | 65
19 | 28
50 | 85
22 | 75
86 | 105
In: Statistics and Probability
Consider the following regression results based on 30 observations.
y = 238.33 – 0.95x1 + 7.13x2 + 4.76x3; SSE = 3,439
y = 209.56 – 1.03x1 + 5.24(x2 + x3); SSE = 3,559
a. Formulate the hypotheses to determine whether the influences of x2 and x3 differ in explaining y.
b. Calculate the value of the test statistic.
c. At the 5% significance level, find the critical value(s).
d. What is your conclusion to the test?
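Testing whether x2 and x3 have the same coefficient is a restricted versus unrestricted comparison, commonly carried out with a partial F statistic built from the two SSE values. The sketch below assumes that approach; the critical value is taken from a standard F table rather than computed.

```python
n = 30
sse_unrestricted = 3439   # model with separate x2 and x3 coefficients
sse_restricted = 3559     # model forcing one shared coefficient on x2 + x3
k_unrestricted = 3        # predictors in the unrestricted model
num_restrictions = 1      # H0: beta2 = beta3

df_den = n - k_unrestricted - 1
f_stat = ((sse_restricted - sse_unrestricted) / num_restrictions) / (
    sse_unrestricted / df_den)

# Assumption: upper 5% point of F(1, 26) from a standard table, rounded.
F_CRIT = 4.23

reject = f_stat > F_CRIT
print(f"F = {f_stat:.4f} on ({num_restrictions}, {df_den}) df")
print("reject H0" if reject else "fail to reject H0")
```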
In: Statistics and Probability
The Body Mass Index (BMI) is a value calculated based on the weight and the height of an individual. In a small European city, a survey was conducted one year ago to review the BMI of the citizens. In a sample of 200 citizens, the mean BMI was 23.3 kg/m² and the standard deviation was 1.5 kg/m². It is reasonable to assume that BMI is normally distributed.
(a) Find the point estimate of the population mean BMI one year ago.
(b) Calculate the sampling error at 90% confidence.
(c) Construct a 90% confidence interval estimate of the population mean BMI one year ago.
This city launched a healthy exercise program to reduce citizens’ BMI after last year’s survey. Suppose the program effectively reduces the BMI of each citizen by 2.5%.
(d) Construct a 98% confidence interval estimate of the population mean BMI after the healthy exercise program. (Hint: find the updated sample mean and sample standard deviation of the BMI of the sample with 200 citizens selected last year)
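The hint in part (d) relies on the fact that scaling every observation by the same factor scales both the sample mean and the sample standard deviation by that factor. The sketch below uses a normal (z) interval, which is an assumption justified here by the large sample of 200; a t interval would be slightly wider.

```python
from math import sqrt
from statistics import NormalDist

n, mean, sd = 200, 23.3, 1.5

def z_interval(mean, sd, n, confidence):
    """Two-sided normal-approximation CI for a mean: (margin, low, high)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sd / sqrt(n)
    return margin, mean - margin, mean + margin

# Parts (b) and (c): sampling error and interval at 90% confidence.
margin90, lo90, hi90 = z_interval(mean, sd, n, 0.90)

# Part (d): a 2.5% reduction for every citizen scales mean and sd by 0.975.
scale = 1 - 0.025
margin98, lo98, hi98 = z_interval(mean * scale, sd * scale, n, 0.98)

print(f"90%: margin {margin90:.4f}, interval ({lo90:.4f}, {hi90:.4f})")
print(f"98% after program: ({lo98:.4f}, {hi98:.4f})")
```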
In: Statistics and Probability
1. What assumptions must be validated to use t-distribution methods for a single sample mean (confidence interval or hypothesis test)?
2. What assumptions must be validated to use t-distribution methods for two independent sample means (confidence interval or hypothesis test)?
In: Statistics and Probability
According to Nielsen Media Research, the average number of hours of TV viewing per household per week in the United States is 50.4 hours.
1(a) Suppose the population standard deviation is 11.8 hours and a random sample of 42 U.S. households is taken. What is the probability that the sample mean TV viewing time is between 47.5 and 52 hours?
1(b) Suppose the population mean and sample size are still 50.4 hours and 42, respectively, but the population standard deviation is unknown. If 72% of all sample means are greater than 49 hours, what is the value of the unknown population standard deviation?
1(c) What is the result of part (a) if the sample consists of only 5 households? Explain.
The average age of online consumers ten years ago was 23.3 years. As older individuals gain confidence with the Internet, it is believed that the average age has increased. We would like to test this belief.
2(a) Write the appropriate hypotheses to be tested.
2(b) The online shoppers in our sample consisted of 40 individuals and had an average age of 24.2 years, with a standard deviation of 5.3 years. What is the test statistic and p-value for the hypotheses being tested in part (a)? (Remark: Report the p-value using the statistical table, NOT an Excel function.)
2(c) What is the practical implication of the conclusion of the hypothesis test at i. the 5% level of significance, and ii. the 10% level of significance?
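The numeric parts of both problems rest on the sampling distribution of the mean, with standard error sigma / sqrt(n). A minimal sketch of 1(a), 1(b) and the test statistic in 2(b) follows; the p-value is left out because the question asks for it to be read from a table.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

# Problem 1(a): P(47.5 < sample mean < 52) with known sigma.
mu, sigma, n = 50.4, 11.8, 42
sampling_dist = NormalDist(mu, sigma / sqrt(n))
p_between = sampling_dist.cdf(52) - sampling_dist.cdf(47.5)

# Problem 1(b): P(sample mean > 49) = 0.72 means 49 sits at the 28th
# percentile, so (49 - mu) / (sigma / sqrt(n)) equals that z value.
z_28 = std_normal.inv_cdf(1 - 0.72)
sigma_unknown = (49 - mu) * sqrt(n) / z_28

# Problem 2(b): one-sample t statistic, df = 40 - 1.
t_stat = (24.2 - 23.3) / (5.3 / sqrt(40))

print(f"1(a) probability = {p_between:.4f}")
print(f"1(b) sigma       = {sigma_unknown:.2f}")
print(f"2(b) t statistic = {t_stat:.3f}")
```

For 1(c), the same calculation with n = 5 is only valid if household viewing time itself is normally distributed, since the sample is too small to lean on the central limit theorem.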
In: Statistics and Probability
This is an SPSS question:
Demographic data for 1000 individuals is contained in the file demographic.sav. Use SPSS to conduct a test to determine if the four levels of education are equally likely using α = 0.05.
Question 2: Using the data from question 1, conduct a test in SPSS to determine if there is a relationship between level of education and employment using α = 0.05.
I don't want someone to solve this for me; I just need to know what the null and alternative hypotheses would be for these.
In: Statistics and Probability
According to the market research firm NPD Group, Americans ate an average of 211 meals in 2001. The following data show the number of meals eaten in restaurants as determined from a random sample of Americans in 2014.
212 128 197 344 143 79 180 313 57 200 161 320 90 224 266 284 231 322 200 173
a. Using α = 0.05, test the hypothesis that the number of meals eaten at restaurants by Americans has not changed since 2001.
b. Estimate the p-value for this test using Table 5 in Appendix A.
c. Determine the precise p-value for this test using Excel.
d. Use PHStat to validate these results.
e. What assumptions need to be made to perform this analysis?
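With the population standard deviation unknown and a sample of 20, this is typically handled as a two-tailed one-sample t test. The sketch below computes the statistic from the listed data; the critical value is an assumption taken from a standard t table instead of being computed.

```python
from math import sqrt
from statistics import mean, stdev

meals = [212, 128, 197, 344, 143, 79, 180, 313, 57, 200,
         161, 320, 90, 224, 266, 284, 231, 322, 200, 173]
MU_0 = 211  # hypothesized mean, the 2001 figure

n = len(meals)
xbar = mean(meals)
s = stdev(meals)                       # sample std dev (n - 1 divisor)
t_stat = (xbar - MU_0) / (s / sqrt(n))

# Assumption: two-tailed critical value t(0.025, 19) from a standard t table.
T_CRIT = 2.093

reject = abs(t_stat) > T_CRIT
print(f"mean = {xbar:.1f}, s = {s:.2f}, t = {t_stat:.3f} (df = {n - 1})")
print("reject H0" if reject else "fail to reject H0")
```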
In: Statistics and Probability
Test A Score
Respondent | X | Y | x - x̄ | (x - x̄)² | y - ȳ | (y - ȳ)² | (x - x̄)(y - ȳ)
Jones | 77 | 51 | 24.8 | 615.04 | 7.8 | 60.84 | 193.44
Dunn | 66 | 59 | 13.8 | 190.44 | 15.8 | 249.64 | 218.04
Dean | 41 | 44 | -11.2 | 125.44 | 0.8 | 0.64 | -8.96
Hampton | 47 | 34 | -5.2 | 27.04 | -9.2 | 84.64 | 47.84
Nichols | 32 | 28 | -22.2 | 492.84 | -15.2 | 231.04 | 337.44
Test B Score
Respondent | X | Y | x - x̄ | (x - x̄)² | y - ȳ | (y - ȳ)² | (x - x̄)(y - ȳ)
Jones | 50 | 51 | -3.4 | 11.56 | 7.4 | 54.76 | -25.16
Dunn | 38 | 69 | -15.4 | 237.16 | 25.40 | 645.16 | -391.16
Dean | 58 | 34 | 4.6 | 21.16 | -9.6 | 92.16 | -44.16
Hampton | 49 | 36 | -4.4 | 19.36 | -7.6 | 57.76 | 33.44
Nichols | 72 | 28 | 18.6 | 345.96 | -15.6 | 243.36 | -290.16
a. Calculate Pearson’s Product-Moment Correlation Coefficient (r) for Test A. Show your work. (6 points)
b. Based on the correlation coefficient which you calculated, in two words how would you describe the relationship between the two variables in Test A? (4 points)
c. Calculate Pearson’s Product-Moment Correlation Coefficient (r) for Test B. Show your work. (6 points)
d. Based on the correlation coefficient which you calculated, in two words how would you describe the relationship between the two variables in Test B? (4 points)
e. Which test would you select? (2 points) Why? (4 points)
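Pearson's r is the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. The sketch below recomputes it from the raw X and Y scores for each test. Because it works from the raw scores, the Test A figure may differ slightly from one built directly from the printed deviation columns.

```python
from math import sqrt

def pearson_r(xs, ys):
    """r = sum((x - xbar)(y - ybar)) / sqrt(sum((x - xbar)^2) * sum((y - ybar)^2))."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Raw scores in respondent order: Jones, Dunn, Dean, Hampton, Nichols.
test_a_x = [77, 66, 41, 47, 32]
test_a_y = [51, 59, 44, 34, 28]
test_b_x = [50, 38, 58, 49, 72]
test_b_y = [51, 69, 34, 36, 28]

r_a = pearson_r(test_a_x, test_a_y)
r_b = pearson_r(test_b_x, test_b_y)
print(f"Test A: r = {r_a:.3f}")
print(f"Test B: r = {r_b:.3f}")
```

The sign of r gives the direction of the relationship and its magnitude gives the strength, which is what parts b and d ask for.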
In: Statistics and Probability
Suppose that GE is trying to prevent Maytag from entering the market for high efficiency clothes dryers. Even though high efficiency dryers are more costly to produce, they are also more profitable as they command sufficiently higher prices from consumers. The following payoff table shows the annual profits for GE and Maytag for the advertising spending and entry decisions that they are facing.
MAYTAG | GE: Advertising = $12m | GE: Advertising = $0.7m
Stay Out | $0, $30m | $0, $35m
Enter | $1m, $20m | $12m, $15
Based on this information, can GE successfully prevent Maytag from entering this market by increasing its advertising levels? What is the equilibrium outcome in this game?
Suppose that an analyst at GE is convinced that just a little bit more advertising by GE, say another $2m, would be sufficient to deter enough customers from buying Maytag, thus, yield less than $0 profits for Maytag in the event it enters. Suppose that spending an extra $2m on advertising by GE will reduce its expected profits by $1.5 m, regardless of whether Maytag enters or stays out. Would this additional spending on advertising achieve the effect of deterring Maytag from entering? Should GE pursue this option?
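The first question can be checked mechanically by looking for cells where neither firm gains from switching its own choice. The sketch below makes several assumptions that the table does not state outright: each cell is read as (Maytag profit, GE profit), the "$15" entry is read as $15m, and the game is treated as simultaneous-move. The function name `pure_nash` is made up for this example.

```python
# Payoffs in $ millions as (Maytag, GE). The ordering, and reading the
# "$15" entry as 15 million, are assumptions about the table.
payoffs = {
    ("Stay Out", "Advertising = $12m"): (0, 30),
    ("Stay Out", "Advertising = $0.7m"): (0, 35),
    ("Enter", "Advertising = $12m"): (1, 20),
    ("Enter", "Advertising = $0.7m"): (12, 15),
}
maytag_moves = ["Stay Out", "Enter"]
ge_moves = ["Advertising = $12m", "Advertising = $0.7m"]

def pure_nash(payoffs, rows, cols):
    """Return every cell where neither player gains by deviating alone."""
    equilibria = []
    for r in rows:
        for c in cols:
            maytag_pay, ge_pay = payoffs[(r, c)]
            maytag_ok = all(payoffs[(r2, c)][0] <= maytag_pay for r2 in rows)
            ge_ok = all(payoffs[(r, c2)][1] <= ge_pay for c2 in cols)
            if maytag_ok and ge_ok:
                equilibria.append((r, c))
    return equilibria

equilibria = pure_nash(payoffs, maytag_moves, ge_moves)
print(equilibria)
```

Under these assumptions, Maytag earns more by entering at either advertising level. The second question then turns on whether pushing Maytag's entry payoff below $0 is worth the $1.5m hit to GE's own profit.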
In: Statistics and Probability