Research into the relationship between hours of study and grades shows widely different conclusions. A recent survey of graduates who wrote the Graduate Management Admissions Test (GMAT) had the following results.
Hours Studied (Midpoint) | Average Score |
40 | 220 |
50 | 310 |
65 | 350 |
75 | 440 |
85 | 560 |
105 | 670 |
95 | 700 |
a) Run the regression analysis in Excel on this data. Include your output with your answer. (Note: You may calculate by hand if you prefer).
b) What is the regression equation for this relationship?
c) Use the regression equation to predict the average score for each category of hours studied.
d) Plot the original data and the regression line on a scattergram. (You may use Excel).
e) How accurate is this regression at predicting GMAT scores based on hours studied? Explain.
f) Use the t statistic to determine whether the Correlation Coefficient is “significant” at the 95% confidence level.
Answer
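(a) We have run the regression analysis on the given data for Hours Studied (X) and Average Score (Y). The output is given below. Note that the output reports 6 observations, not 7: the first data row (40, 220) appears to have been read as a header, so the fitted coefficients are based on the remaining six points only.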
X | Y |
40 | 220 |
50 | 310 |
65 | 350 |
75 | 440 |
85 | 560 |
105 | 670 |
95 | 700 |
Regression Statistics | |
Multiple R | 0.95970248 |
R Square | 0.92102886 |
Adjusted R Square | 0.90128607 |
Standard Error | 51.5401955 |
Observations | 6 |
Estimates of the intercept and the regression coefficient, with their standard errors and 95% confidence intervals:
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
Intercept | -114.948454 | 93.1729456 | -1.23371 | 0.284846 | -373.638 | 143.74112 |
X | 7.83092784 | 1.14651783 | 6.830184 | 0.002403 | 4.647684 | 11.014172 |
The P-value for X is 0.002403, which is less than the 0.05 level of significance. This means that X (Hours Studied) is significant in explaining the variance in Y (Average Score).
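The figures in the output can be cross-checked without Excel. The following is a minimal sketch in plain Python, assuming the fit was run on the six points from (50, 310) onward, which is what "Observations | 6" suggests. The function name `fit_line` and the dictionary keys are illustrative choices, not part of the original answer.

```python
import math

# Six (hours, score) pairs that the Excel output appears to be based on.
# Assumption: the first row (40, 220) was left out, matching "Observations | 6".
hours = [50, 65, 75, 85, 105, 95]
scores = [310, 350, 440, 560, 670, 700]


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x, with summary statistics."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sums of squares and cross-products about the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual sum of squares, R squared, and the regression standard error.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - sse / syy
    std_error = math.sqrt(sse / (n - 2))

    # Standard error of the slope and its t statistic (n - 2 degrees of freedom).
    slope_se = std_error / math.sqrt(sxx)
    t_stat = slope / slope_se

    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
        "std_error": std_error,
        "slope_se": slope_se,
        "t_stat": t_stat,
    }


fit = fit_line(hours, scores)
for name, value in fit.items():
    print(f"{name}: {value:.6f}")
```

Passing all seven pairs to `fit_line` instead would show how much the coefficients change when the (40, 220) row is included.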
(b) The regression equation, using the coefficients from the output in (a), is given by
Predicted Y = -114.9485 + 7.8309 X
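For readers calculating by hand, the coefficients in the output can be recovered from the usual least-squares formulas. The working below is a sketch that assumes the six points from (50, 310) to (95, 700) used in the output; the symbols $\bar{x}$, $\bar{y}$, $S_{xx}$, $S_{xy}$, $b_0$ and $b_1$ are notation introduced here, not taken from the original answer.

```latex
\begin{align*}
\bar{x} &= \frac{475}{6} \approx 79.16667, \qquad \bar{y} = \frac{3030}{6} = 505 \\
S_{xx} &= \sum_i (x_i - \bar{x})^2 = 39625 - \frac{475^2}{6} \approx 2020.8333 \\
S_{xy} &= \sum_i (x_i - \bar{x})(y_i - \bar{y}) = 255700 - \frac{475 \times 3030}{6} = 15825 \\
b_1 &= \frac{S_{xy}}{S_{xx}} = \frac{15825}{2020.8333} \approx 7.83093 \\
b_0 &= \bar{y} - b_1 \bar{x} \approx 505 - 7.83093 \times 79.16667 \approx -114.9485 \\
\hat{y} &= b_0 + b_1 x \approx -114.9485 + 7.8309\,x
\end{align*}
```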
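(c) The predicted values of Y from the regression equation in (b) are listed below. Observations 1 to 6 correspond to X = 50, 65, 75, 85, 105 and 95, in that order; X = 40 is not included in this output.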
Observation | Predicted Y |
1 | 276.5979 |
2 | 394.0619 |
3 | 472.3711 |
4 | 550.6804 |
5 | 707.2990 |
6 | 628.9897 |
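The predictions can be reproduced by substituting each hours value into the fitted line. This is a small sketch that takes the intercept and slope from the coefficient table in (a); `predict_score` is a hypothetical helper name. It also evaluates X = 40, which the question lists as a category but the table above does not cover.

```python
# Coefficients copied from the regression output in part (a).
INTERCEPT = -114.948454
SLOPE = 7.83092784


def predict_score(hours_studied):
    """Predicted average score from the fitted line: intercept + slope * hours."""
    return INTERCEPT + SLOPE * hours_studied


# All seven categories from the question, including X = 40.
hours_categories = [40, 50, 65, 75, 85, 105, 95]
predictions = {h: predict_score(h) for h in hours_categories}

for h, y_hat in predictions.items():
    print(f"{h:>3} hours -> predicted score {y_hat:.4f}")
```

Because X = 40 was not among the points used in the fit, its prediction is a mild extrapolation below the fitted range.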
(d) Scatter plot of the original data with the fitted regression line (figure not reproduced here).