A guidance counselor wanted to see if she could predict students' GPAs in college. She collected the high school GPA, SAT score, and whether the student was involved in an extracurricular activity in high school for eight students, and found out their GPA in their first year of college.
HS GPA | SAT | EXTRA | COLLEGE GPA |
3.5 | 1230 | no | 3.4 |
2.8 | 1070 | yes | 3.0 |
2.5 | 1090 | no | 2.7 |
3.7 | 1280 | no | 3.6 |
3.2 | 1130 | yes | 3.5 |
3.1 | 1300 | yes | 3.3 |
3.9 | 1290 | no | 3.6 |
2.6 | 1020 | yes | 2.9 |
a.) Provide an equation that can be used to predict College GPA based on the 3 independent variables given above.
b.) Explain what the coefficient of the “Extra” variable means.
c.) What is the p-value of the overall model? Is the overall model significant at the α = 0.05 level?
d.) List out and explain what the p-values for the individual variables mean.
e.) Were there any variables shown in part d that were not significant? If so, remove them and give a new regression model. Is it better than in part a? Worse? Why? If it is better, use this new one for the remaining problems. Use the overall p-value, R², individual variable p-values, etc. to support your case.
f.) Suppose we wanted to make a prediction for the College GPA for a student who had a 3.2 high school GPA, scored a 1280 on her SATs, and participated in band. Use whichever model you decided was the best in part e. Also give a 95% prediction interval for your prediction for College GPA from the Excel output of whichever model you chose.
a) Regression Analysis: COLLEGE GPA versus HS GPA, SAT, EXTRA_yes
The regression equation is
COLLEGE GPA = 0.805 + 0.719 HS GPA + 0.000062 SAT + 0.197 EXTRA_yes
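The fit in part a can be reproduced outside Minitab. The following is a minimal sketch in plain Python, assuming EXTRA is coded as 1 for "yes" and 0 for "no"; the helper `solve` and the variable names are illustrative and not part of the original answer. It builds the normal equations (X'X) b = X'y from the table and solves them by Gaussian elimination.

```python
# Data from the table (assumption: EXTRA coded as 1 = yes, 0 = no).
hs_gpa = [3.5, 2.8, 2.5, 3.7, 3.2, 3.1, 3.9, 2.6]
sat = [1230, 1070, 1090, 1280, 1130, 1300, 1290, 1020]
extra = [0, 1, 0, 0, 1, 1, 0, 1]
college_gpa = [3.4, 3.0, 2.7, 3.6, 3.5, 3.3, 3.6, 2.9]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


# Design matrix with a leading column of ones for the intercept.
X = [[1.0, g, s, e] for g, s, e in zip(hs_gpa, sat, extra)]
k = len(X[0])

# Normal equations: (X'X) b = X'y.
xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
xty = [sum(row[i] * y for row, y in zip(X, college_gpa)) for i in range(k)]

b0, b_hs, b_sat, b_extra = solve(xtx, xty)
print(f"COLLEGE GPA = {b0:.3f} + {b_hs:.3f} HS GPA "
      f"+ {b_sat:.6f} SAT + {b_extra:.3f} EXTRA_yes")
```

The printed coefficients should agree with the Minitab equation up to rounding.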
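b) EXTRA_yes is an indicator variable: it is 1 (YES) if the student was involved in an extracurricular activity in high school and 0 (NO) if not. Its coefficient, 0.197, means that, holding HS GPA and SAT fixed, a student who participated is predicted to have a college GPA 0.197 points higher than a student who did not.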
c)
The p-value of the regression is 0.004 < α = 0.05, so we conclude that the overall model is significant at the 0.05 level.
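d) Each p-value tests the null hypothesis that the variable's coefficient is zero, given the other variables in the model.
P-value of HS GPA = 0.005
P-value of SAT = 0.919
P-value of EXTRA_yes = 0.064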
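The individual p-values come from t-tests on each coefficient with n - k = 8 - 4 = 4 residual degrees of freedom. The following is a rough sketch of that calculation in plain Python, assuming EXTRA is coded 1/0; the helpers `invert` and `t_cdf_df4` are illustrative names, and `t_cdf_df4` uses a closed-form t CDF that holds only for exactly 4 degrees of freedom.

```python
import math

# Data from the table (assumption: EXTRA coded as 1 = yes, 0 = no).
hs_gpa = [3.5, 2.8, 2.5, 3.7, 3.2, 3.1, 3.9, 2.6]
sat = [1230, 1070, 1090, 1280, 1130, 1300, 1290, 1020]
extra = [0, 1, 0, 0, 1, 1, 0, 1]
college_gpa = [3.4, 3.0, 2.7, 3.6, 3.5, 3.3, 3.6, 2.9]


def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]


def t_cdf_df4(t):
    """Student's t CDF in closed form; valid ONLY for 4 degrees of freedom."""
    u = 1.0 + t * t / 4.0
    return 0.5 + 0.375 * (t / math.sqrt(u)) * (1.0 - (t * t / u) / 12.0)


X = [[1.0, g, s, e] for g, s, e in zip(hs_gpa, sat, extra)]
n, k = len(X), len(X[0])
xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
xty = [sum(row[i] * y for row, y in zip(X, college_gpa)) for i in range(k)]

xtx_inv = invert(xtx)
beta = [sum(xtx_inv[i][j] * xty[j] for j in range(k)) for i in range(k)]

# Residual variance s^2 = SSE / (n - k), with n - k = 4 here.
residuals = [y - sum(b * x for b, x in zip(beta, row))
             for row, y in zip(X, college_gpa)]
s2 = sum(r * r for r in residuals) / (n - k)

names = ["Constant", "HS GPA", "SAT", "EXTRA_yes"]
p_values = {}
for i, name in enumerate(names):
    se = math.sqrt(s2 * xtx_inv[i][i])      # standard error of coefficient i
    t_stat = beta[i] / se
    p_values[name] = 2.0 * (1.0 - t_cdf_df4(abs(t_stat)))  # two-sided
    print(f"{name:10s} t = {t_stat:7.3f}  p = {p_values[name]:.3f}")
```

For any other sample size or number of predictors the degrees of freedom change, so a general t CDF from a statistics library would be needed in place of `t_cdf_df4`.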
e) Only one independent variable, HS GPA, is significant, since its p-value < α = 0.05.
The remaining two variables, SAT and EXTRA_yes, are not significant, since their p-values > α = 0.05.
From the MINITAB output for the full model, R-square = 0.956 and the standard error of residuals S = 0.094925.
Now we eliminate the SAT and EXTRA_yes variables and fit a new regression of COLLEGE GPA on HS GPA alone.
The p-value of the new regression is 0.001 < α = 0.05, so the reduced model is also significant overall.
However, its R-square = 0.885 is lower and its S = 0.125495 is higher than in the previous model (R-square = 0.956, S = 0.094925).
Thus the previous model, with all three variables, is the better of the two, and it is used for part f.
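The comparison can be checked with a short sketch that refits the HS GPA-only model from the table using the simple-regression formulas. The full-model R-square of 0.956 is taken from the Minitab output quoted in the answer. Adjusted R-square is an extra criterion that the original answer does not use; it is included here only as an assumption-labeled illustration, because it penalizes extra predictors.

```python
import math

hs_gpa = [3.5, 2.8, 2.5, 3.7, 3.2, 3.1, 3.9, 2.6]
college_gpa = [3.4, 3.0, 2.7, 3.6, 3.5, 3.3, 3.6, 2.9]

n = len(hs_gpa)
mean_x = sum(hs_gpa) / n
mean_y = sum(college_gpa) / n

# Centered sums of squares and cross-products.
sxx = sum((x - mean_x) ** 2 for x in hs_gpa)
syy = sum((y - mean_y) ** 2 for y in college_gpa)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hs_gpa, college_gpa))

slope = sxy / sxx
intercept = mean_y - slope * mean_x
r2_reduced = sxy ** 2 / (sxx * syy)
sse = syy - slope * sxy
s_reduced = math.sqrt(sse / (n - 2))   # residual standard error, 1 predictor


def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-square: penalizes R-square for the number of predictors."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


r2_full = 0.956  # reported in the Minitab output for the three-variable model
adj_reduced = adjusted_r2(r2_reduced, n, 1)
adj_full = adjusted_r2(r2_full, n, 3)

print(f"Reduced model: COLLEGE GPA = {intercept:.3f} + {slope:.3f} HS GPA")
print(f"Reduced: R-sq = {r2_reduced:.3f}, S = {s_reduced:.6f}, "
      f"adj R-sq = {adj_reduced:.3f}")
print(f"Full:    R-sq = {r2_full:.3f}, adj R-sq = {adj_full:.3f}")
```

On this data the three-variable model has the higher adjusted R-square as well, which agrees with the conclusion above.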
f)
Participating in band counts as an extracurricular activity, so EXTRA_yes = 1. Using the regression equation from part a,
COLLEGE GPA = 0.805 + 0.719 (3.2) + 0.000062 (1280) + 0.197 (1) = 3.3820
The 95% prediction interval for this predicted College GPA is (3.0399, 3.7241).
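As a quick arithmetic check, the point prediction can be recomputed from the rounded coefficients and compared with the reported interval. This sketch assumes band is coded as EXTRA_yes = 1; the variable names are illustrative. Because the coefficients are rounded, the result can differ from Minitab's 3.3820 in the fourth decimal place.

```python
# Rounded coefficients from the regression equation in part a.
b0, b_hs, b_sat, b_extra = 0.805, 0.719, 0.000062, 0.197

# Student profile: HS GPA 3.2, SAT 1280, band (assumed EXTRA_yes = 1).
prediction = b0 + b_hs * 3.2 + b_sat * 1280 + b_extra * 1

# 95% prediction interval reported from the software output.
lower, upper = 3.0399, 3.7241
midpoint = (lower + upper) / 2      # should sit at the point prediction
half_width = (upper - lower) / 2    # margin on either side of the prediction

print(f"prediction = {prediction:.4f}")
print(f"interval midpoint = {midpoint:.4f}, half-width = {half_width:.4f}")
```

The interval is centered on the point prediction, as expected for a prediction interval from a linear regression.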