1. Consider a multinomial experiment with n = 307 and k = 4. The null hypothesis to be tested is H0: p1 = p2 = p3 = p4 = 0.25. The observed frequencies resulting from the experiment are: (You may find it useful to reference the appropriate table: chi-square table or F table)
Category | 1 | 2 | 3 | 4 |
Frequency | 85 | 58 | 89 | 75 |
a. Choose the appropriate alternative hypothesis.
Not all population proportions are equal to 0.25.
All population proportions differ from 0.25.
b-1. Calculate the value of the test statistic. (Round intermediate calculations to at least 4 decimal places and final answer to 3 decimal places.)
b-2. Find the p-value.
p-value < 0.01
c. At the 10% significance level, what is the conclusion to the hypothesis test?
Reject H0 since the p-value is less than the significance level.
Reject H0 since the p-value is greater than the significance level.
Do not reject H0 since the p-value is greater than the significance level.
Do not reject H0 since the p-value is less than the significance level.
2. An analyst is trying to determine whether the prices of certain stocks on the NASDAQ are independent of the industry to which they belong. She examines four industries and classifies the stock prices in these industries into one of three categories (high-priced, average-priced, low-priced).
Stock Price | Industry I | Industry II | Industry III | Industry IV |
High | 18 | 12 | 24 | 24 |
Average | 19 | 18 | 20 | 25 |
Low | 8 | 8 | 9 | 12 |
a. Choose the competing hypotheses to determine whether stock price depends on the industry.
H0: Stock price is independent of the industry; HA: Stock price is dependent on the industry.
H0: Stock price is dependent on the industry; HA: Stock price is independent of the industry.
b-1. Calculate the value of the test statistic. (Round intermediate calculations to at least 4 decimal places and final answer to 3 decimal places.)
b-2. Find the p-value.
p-value < 0.01
0.01 ≤ p-value < 0.025
c. At a 1% significance level, what can the analyst conclude?
Do not reject H0; there is not enough evidence to support the claim that the stock price is dependent on the industry.
Reject H0; there is enough evidence to support the claim that the stock price is dependent on the industry.
Reject H0; there is not enough evidence to support the claim that the stock price is dependent on the industry.
Do not reject H0; there is enough evidence to support the claim that the stock price is dependent on the industry.
3. Using 20 observations, the multiple regression model y = β0 + β1x1 + β2x2 + ε was estimated. A portion of the regression results is shown in the accompanying table:
 | df | SS | MS | F | Significance F |
Regression | 2 | 2.10E+12 | 1.12E+12 | 63.503 | 1.30E-08 |
Residual | 17 | 3.10E+11 | 1.77E+10 | | |
Total | 19 | 2.43E+12 | | | |
 | Coefficients | Standard Error | t Stat | p-value | Lower 95% | Upper 95% |
Intercept | −988,484 | 130,933 | −7.550 | 0.000 | −1,264,728 | −712,240 |
x1 | 28,503 | 32,372 | 0.880 | 0.391 | −39,796 | 96,802 |
x2 | 29,494 | 33,046 | 0.893 | 0.385 | −40,227 | 99,215 |
a. At the 5% significance level, are the explanatory variables jointly significant?
No, since the p-value of the appropriate test is less than 0.05.
Yes, since the p-value of the appropriate test is more than 0.05.
Yes, since the p-value of the appropriate test is less than 0.05.
No, since the p-value of the appropriate test is more than 0.05.
b. At the 5% significance level, is each explanatory variable individually significant?
Yes, since both p-values of the appropriate test are less than 0.05.
Yes, since both p-values of the appropriate test are more than 0.05.
No, since both p-values of the appropriate test are not less than 0.05.
No, since both p-values of the appropriate test are not more than 0.05.
c. What is the likely problem with this model?
Multicollinearity since the standard errors are biased.
Multicollinearity since the explanatory variables are individually and jointly significant.
Multicollinearity since the explanatory variables are individually significant but jointly insignificant.
Multicollinearity since the explanatory variables are individually insignificant but jointly significant.
4. The following table lists a portion of Major League Baseball’s (MLB’s) leading pitchers, each pitcher’s salary (in $ millions), and earned run average (ERA) for 2008.
Pitcher | Salary | ERA |
J. Santana | 17.0 | 2.31 |
C. Lee | 3.0 | 2.39 |
T. Lincecum | 0.3 | 2.42 |
C. Sabathia | 10.0 | 2.20 |
R. Halladay | 10.0 | 2.39 |
J. Peavy | 5.4 | 2.15 |
D. Matsuzaka | 7.8 | 2.43 |
R. Dempster | 7.1 | 2.32 |
B. Sheets | 11.7 | 3.04 |
C. Hamels | 0.2 | 3.00 |
a-1. Estimate the model: Salary = β0 + β1ERA + ε. (Negative values should be indicated by a minus sign. Enter your answers, in millions, rounded to 2 decimal places.)
a-2. Interpret the coefficient of ERA.
For a one-unit increase in ERA, predicted salary decreases by $2.89 million.
For a one-unit increase in ERA, predicted salary increases by $2.89 million.
For a one-unit increase in ERA, predicted salary decreases by $11.48 million.
For a one-unit increase in ERA, predicted salary increases by $11.48 million.
b. Use the estimated model to predict salary for each player, given his ERA. For example, use the sample regression equation to predict the salary for J. Santana with ERA = 2.31. (Round coefficient estimates to at least 4 decimal places and final answers, in millions, to 2 decimal places.)
c. Derive the corresponding residuals. (Negative values should be indicated by a minus sign. Round coefficient estimates to at least 4 decimal places and final answers, in millions, to 2 decimal places.)
1.
Category | Observed Frequency (O) | Proportion, p | Expected Frequency (E) | (O-E)²/E |
1 | 85 | 0.25 | 307 * 0.25 = 76.75 | (85 - 76.75)²/76.75 = 0.8868 |
2 | 58 | 0.25 | 307 * 0.25 = 76.75 | (58 - 76.75)²/76.75 = 4.5806 |
3 | 89 | 0.25 | 307 * 0.25 = 76.75 | (89 - 76.75)²/76.75 = 1.9552 |
4 | 75 | 0.25 | 307 * 0.25 = 76.75 | (75 - 76.75)²/76.75 = 0.0399 |
Total | 307 | 1.00 | 307 | 7.4625 |
a. Appropriate alternative hypothesis:
Not all population proportions are equal to 0.25.
b-1.
Test statistic:
χ² = ∑ ((O-E)²/E) = 7.463
b-2. df = k - 1 = 4 - 1 = 3 (k is the number of categories, not the sample size n = 307)
p-value = CHISQ.DIST.RT(7.4625, 3) = 0.0585
0.05 ≤ p-value < 0.10
c. Conclusion to the hypothesis test:
Reject H0 since the p-value is less than the significance level.
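The goodness-of-fit calculation above can be checked with a short script. This is only a sketch using the Python standard library; the helper name `chi2_sf_df3` is made up for illustration, and it relies on the closed-form right-tail probability that holds specifically for 3 degrees of freedom (it is a stand-in for Excel's CHISQ.DIST.RT, not a general replacement).

```python
import math

def chi2_sf_df3(x):
    """Right-tail probability P(X > x) for a chi-square variable with df = 3.

    Hypothetical helper: uses the closed form valid only for 3 degrees of freedom.
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

observed = [85, 58, 89, 75]            # observed frequencies from the question
p0 = [0.25, 0.25, 0.25, 0.25]          # proportions under H0

n = sum(observed)                      # 307
expected = [n * p for p in p0]         # 76.75 in every category

# Test statistic: sum of (O - E)^2 / E over the k = 4 categories
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                 # k - 1 = 3
p_value = chi2_sf_df3(chi2)

alpha = 0.10
print(f"chi-square = {chi2:.3f}, df = {df}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```

If the script agrees with the table, the statistic rounds to 7.463 and the p-value falls between 0.05 and 0.10, which leads to rejecting H0 at the 10% level.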
-----------
2.
Observed Frequencies
Stock Price | I | II | III | IV | Total |
High | 18 | 12 | 24 | 24 | 78 |
Average | 19 | 18 | 20 | 25 | 82 |
Low | 8 | 8 | 9 | 12 | 37 |
Total | 45 | 38 | 53 | 61 | 197 |
Expected Frequencies
Stock Price | I | II | III | IV | Total |
High | 45 * 78 / 197 = 17.8173 | 38 * 78 / 197 = 15.0457 | 53 * 78 / 197 = 20.9848 | 61 * 78 / 197 = 24.1523 | 78 |
Average | 45 * 82 / 197 = 18.731 | 38 * 82 / 197 = 15.8173 | 53 * 82 / 197 = 22.0609 | 61 * 82 / 197 = 25.3909 | 82 |
Low | 45 * 37 / 197 = 8.4518 | 38 * 37 / 197 = 7.1371 | 53 * 37 / 197 = 9.9543 | 61 * 37 / 197 = 11.4569 | 37 |
Total | 45 | 38 | 53 | 61 | 197 |
(fo-fe)²/fe
Stock Price | I | II | III | IV |
High | (18 - 17.8173)²/17.8173 = 0.0019 | (12 - 15.0457)²/15.0457 = 0.6165 | (24 - 20.9848)²/20.9848 = 0.4332 | (24 - 24.1523)²/24.1523 = 0.001 | |
Average | (19 - 18.731)²/18.731 = 0.0039 | (18 - 15.8173)²/15.8173 = 0.3012 | (20 - 22.0609)²/22.0609 = 0.1925 | (25 - 25.3909)²/25.3909 = 0.006 | |
Low | (8 - 8.4518)²/8.4518 = 0.0241 | (8 - 7.1371)²/7.1371 = 0.1043 | (9 - 9.9543)²/9.9543 = 0.0915 | (12 - 11.4569)²/11.4569 = 0.0257 |
a. Competing hypotheses:
H0: Stock price is independent of the industry; HA: Stock price is dependent on the industry.
b-1.
Test statistic:
χ² = ∑ ((fo-fe)²/fe) = 1.802
b-2. df = (r-1)(c-1) = (3-1)(4-1) = 6
p-value = CHISQ.DIST.RT(1.802, 6) = 0.937
p-value > 0.10
c. Conclusion:
Do not reject H0 since the p-value (0.937) is greater than the 1% significance level; there is not enough evidence to support the claim that the stock price is dependent on the industry.
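The independence test above can be reproduced along the same lines. Again this is a sketch with the standard library only; `chi2_sf_even_df` is a hypothetical helper that uses the closed-form right tail available when the degrees of freedom are even (here df = 6), standing in for CHISQ.DIST.RT.

```python
import math

def chi2_sf_even_df(x, df):
    """Right-tail probability P(X > x) for a chi-square variable with even df.

    Hypothetical helper: the finite-sum formula below is valid only for even df.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

# Observed counts: rows = High, Average, Low; columns = industries I to IV
observed = [
    [18, 12, 24, 24],
    [19, 18, 20, 25],
    [8, 8, 9, 12],
]

row_totals = [sum(row) for row in observed]        # 78, 82, 37
col_totals = [sum(col) for col in zip(*observed)]  # 45, 38, 53, 61
n = sum(row_totals)                                # 197

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n      # expected count under independence
        chi2 += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)  # (r - 1)(c - 1) = 6
p_value = chi2_sf_even_df(chi2, df)

alpha = 0.01
print(f"chi-square = {chi2:.3f}, df = {df}, p-value = {p_value:.3f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```

Working from unrounded expected counts may move the statistic in the fourth decimal place relative to the table, but the conclusion at the 1% level is unaffected.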
--------
For the remaining answers, please repost the questions. Thank you.