A business statistics professor at a college would like to develop a regression model to predict the final exam scores for students based on their current GPAs, the number of hours they studied for the exam, the number of times they were absent during the semester, and their genders. Use the accompanying data to complete parts a through c below.
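Score | GPA | Hours | Absences | Gender |
68 | 2.55 | 3.00 | 0 | 0 |
69 | 2.22 | 4.00 | 3 | 0 |
70 | 2.60 | 2.50 | 1 | 0 |
71 | 3.09 | 0.50 | 0 | 1 |
74 | 3.08 | 6.00 | 4 | 1 |
77 | 2.80 | 3.50 | 6 | 0 |
77 | 3.34 | 1.50 | 0 | 0 |
78 | 2.98 | 3.00 | 3 | 1 |
78 | 2.99 | 2.00 | 3 | 1 |
79 | 2.81 | 2.50 | 2 | 1 |
79 | 2.80 | 4.50 | 0 | 1 |
82 | 3.47 | 7.00 | 1 | 0 |
83 | 3.19 | 3.00 | 1 | 0 |
84 | 3.10 | 3.00 | 4 | 0 |
84 | 3.18 | 5.50 | 0 | 1 |
84 | 2.98 | 2.00 | 0 | 0 |
84 | 2.71 | 4.00 | 1 | 1 |
85 | 3.20 | 4.50 | 3 | 0 |
85 | 3.75 | 2.00 | 0 | 1 |
85 | 3.57 | 3.50 | 2 | 0 |
86 | 2.87 | 6.00 | 1 | 1 |
86 | 3.09 | 6.50 | 1 | 0 |
86 | 3.20 | 5.00 | 3 | 1 |
87 | 3.89 | 7.50 | 4 | 0 |
87 | 3.55 | 4.00 | 0 | 1 |
89 | 3.31 | 6.50 | 1 | 1 |
89 | 3.66 | 5.00 | 0 | 1 |
90 | 2.90 | 3.50 | 1 | 0 |
90 | 3.41 | 6.00 | 1 | 0 |
91 | 3.28 | 4.50 | 2 | 0 |
91 | 3.74 | 7.00 | 0 | 0 |
92 | 3.91 | 6.00 | 2 | 1 |
92 | 3.95 | 5.00 | 0 | 0 |
92 | 3.54 | 6.50 | 1 | 0 |
93 | 3.02 | 4.00 | 2 | 1 |
94 | 3.26 | 6.50 | 0 | 1 |
98 | 2.88 | 3.50 | 0 | 0 |
99 | 3.78 | 5.00 | 1 | 0 |
100 | 3.46 | 6.50 | 1 | 1 |
100 | 3.00 | 7.00 | 0 | 0 |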
a. Use technology to check for the presence of multicollinearity.
Find the variance inflation factor (VIF) for each independent variable. Use x1=GPA, x2=Hours, x3=Absences, and x4=Gender, where x4=1 if the gender is male, and x4=0 otherwise.
Independent Variable | VIF |
GPA (x1) | (_) |
Hours (x2) | (_) |
Absences (x3) | (_) |
Gender (x4) | (_) |
Is there multicollinearity?
A. Yes, because at least one of the VIFs is greater than or equal to 5.0.
B. No, because at least one of the VIFs is greater than or equal to 5.0.
C. No, because none of the VIFs are greater than or equal to 5.0.
D. Yes, because none of the VIFs are greater than or equal to 5.0.
b. If multicollinearity is present, take the necessary steps to eliminate it.
What is necessary to eliminate any multicollinearity? Select all that apply.
A. Remove the independent variable x1 (GPA).
B. Remove the independent variable x2 (Hours).
C. Remove the independent variable x3 (Absences).
D. Remove the independent variable x4 (Gender).
E. Since no multicollinearity is present, no steps are necessary.
c. Perform a general stepwise regression using α = 0.10 for the p-value to enter and to remove independent variables from the regression model.
What is the resulting regression equation? Use x1=GPA, x2=Hours, x3=Absences, and x4=Gender, where x4=1 if the gender is male, and x4=0 otherwise. Note that the coefficient is 0 for any variable that was removed or is not significant.
y=(_)+(_)x1+(_)x2+(_)x3+(_)x4
Regression Analysis: Score versus GPA, Hours, Absences, Gender
Analysis of Variance
Source | DF | Adj SS | Adj MS | F-Value | P-Value |
Regression | 4 | 1274.25 | 318.562 | 7.97 | 0.000 |
GPA | 1 | 292.24 | 292.242 | 7.32 | 0.010 |
Hours | 1 | 362.46 | 362.457 | 9.07 | 0.005 |
Absences | 1 | 108.85 | 108.851 | 2.72 | 0.108 |
Gender | 1 | 3.06 | 3.055 | 0.08 | 0.784 |
Error | 35 | 1398.15 | 39.947 | | |
Total | 39 | 2672.40 | | | |
Model Summary
S | R-sq | R-sq(adj) | R-sq(pred) |
6.32038 | 47.68% | 41.70% | 31.30% |
Coefficients
Term | Coef | SE Coef | T-Value | P-Value | VIF |
Constant | 54.62 | 8.55 | 6.39 | 0.000 | |
GPA | 7.52 | 2.78 | 2.70 | 0.010 | 1.22 |
Hours | 1.860 | 0.618 | 3.01 | 0.005 | 1.19 |
Absences | -1.150 | 0.697 | -1.65 | 0.108 | 1.04 |
Gender | -0.56 | 2.01 | -0.28 | 0.784 | 1.00 |
Regression Equation
Score = 54.62 + 7.52 GPA + 1.860 Hours - 1.150 Absences - 0.56 Gender
Fits and Diagnostics for Unusual Observations
Obs | Score | Fit | Resid | Std Resid | |
37 | 98.00 | 82.79 | 15.21 | 2.53 | R |
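As a quick check on the regression equation, the fitted value and residual for observation 37 can be recomputed by hand. The sketch below plugs that student's values from the data table (Score 98, GPA 2.88, Hours 3.50, Absences 0, Gender 0) into the rounded coefficients reported in the output; the variable names are illustrative only.

```python
# Rounded coefficients as reported in the Coefficients table.
coef = {"Constant": 54.62, "GPA": 7.52, "Hours": 1.860,
        "Absences": -1.150, "Gender": -0.56}

# Observation 37 from the data table.
obs37 = {"Score": 98, "GPA": 2.88, "Hours": 3.50, "Absences": 0, "Gender": 0}

predictors = ("GPA", "Hours", "Absences", "Gender")

# Fitted value: intercept plus coefficient times value for each predictor.
fitted = coef["Constant"] + sum(coef[k] * obs37[k] for k in predictors)

# Residual: observed score minus fitted score.
resid = obs37["Score"] - fitted

print(f"Fit = {fitted:.2f}, Resid = {resid:.2f}")
```

Dividing the residual by S = 6.32038 gives roughly 2.41; the reported standardized residual of 2.53 is somewhat larger because it also adjusts for the observation's leverage.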
a) C. No, because none of the VIFs are greater than or equal to 5.0.
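The VIF check in part a can also be reproduced outside Minitab. The sketch below is one possible approach, assuming NumPy is available: it uses the fact that the VIF of each predictor, 1/(1 - R²) from regressing that predictor on the others, equals the corresponding diagonal entry of the inverse of the predictors' correlation matrix. The array `X` is transcribed from the data table (columns GPA, Hours, Absences, Gender), and the variable names are illustrative. If the transcription is right, the printed values should be close to the VIF column in the Coefficients table.

```python
import numpy as np

# Predictor columns transcribed from the data table: GPA, Hours, Absences, Gender.
X = np.array([
    [2.55, 3.0, 0, 0], [2.22, 4.0, 3, 0], [2.60, 2.5, 1, 0], [3.09, 0.5, 0, 1],
    [3.08, 6.0, 4, 1], [2.80, 3.5, 6, 0], [3.34, 1.5, 0, 0], [2.98, 3.0, 3, 1],
    [2.99, 2.0, 3, 1], [2.81, 2.5, 2, 1], [2.80, 4.5, 0, 1], [3.47, 7.0, 1, 0],
    [3.19, 3.0, 1, 0], [3.10, 3.0, 4, 0], [3.18, 5.5, 0, 1], [2.98, 2.0, 0, 0],
    [2.71, 4.0, 1, 1], [3.20, 4.5, 3, 0], [3.75, 2.0, 0, 1], [3.57, 3.5, 2, 0],
    [2.87, 6.0, 1, 1], [3.09, 6.5, 1, 0], [3.20, 5.0, 3, 1], [3.89, 7.5, 4, 0],
    [3.55, 4.0, 0, 1], [3.31, 6.5, 1, 1], [3.66, 5.0, 0, 1], [2.90, 3.5, 1, 0],
    [3.41, 6.0, 1, 0], [3.28, 4.5, 2, 0], [3.74, 7.0, 0, 0], [3.91, 6.0, 2, 1],
    [3.95, 5.0, 0, 0], [3.54, 6.5, 1, 0], [3.02, 4.0, 2, 1], [3.26, 6.5, 0, 1],
    [2.88, 3.5, 0, 0], [3.78, 5.0, 1, 0], [3.46, 6.5, 1, 1], [3.00, 7.0, 0, 0],
])
names = ["GPA", "Hours", "Absences", "Gender"]

# Correlation matrix of the predictors (columns are variables).
corr = np.corrcoef(X, rowvar=False)

# VIF_j is the j-th diagonal element of the inverse correlation matrix.
vif = np.diag(np.linalg.inv(corr))

for name, v in zip(names, vif):
    print(f"{name:9s} VIF = {v:.2f}")

# Decision rule used in the question: any VIF >= 5.0 signals multicollinearity.
multicollinearity = bool((vif >= 5.0).any())
print("Multicollinearity present:", multicollinearity)
```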
b) E. Since no multicollinearity is present, no steps are necessary.
c)
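For part c, a stepwise procedure with α = 0.10 to enter and to remove can be sketched as follows. This is a minimal sketch assuming NumPy and SciPy, not the Minitab procedure that produced the output above: at each pass it adds the candidate with the smallest p-value if that p-value is below α, then drops the least significant included variable if its p-value exceeds α. The helper `fit` and all variable names are illustrative, and a given package's entry and removal rules may differ in detail, so the result should be checked against the software used in the course.

```python
import numpy as np
from scipy import stats

# Columns transcribed from the data table: Score, GPA, Hours, Absences, Gender.
data = np.array([
    [68, 2.55, 3.0, 0, 0], [69, 2.22, 4.0, 3, 0], [70, 2.60, 2.5, 1, 0], [71, 3.09, 0.5, 0, 1],
    [74, 3.08, 6.0, 4, 1], [77, 2.80, 3.5, 6, 0], [77, 3.34, 1.5, 0, 0], [78, 2.98, 3.0, 3, 1],
    [78, 2.99, 2.0, 3, 1], [79, 2.81, 2.5, 2, 1], [79, 2.80, 4.5, 0, 1], [82, 3.47, 7.0, 1, 0],
    [83, 3.19, 3.0, 1, 0], [84, 3.10, 3.0, 4, 0], [84, 3.18, 5.5, 0, 1], [84, 2.98, 2.0, 0, 0],
    [84, 2.71, 4.0, 1, 1], [85, 3.20, 4.5, 3, 0], [85, 3.75, 2.0, 0, 1], [85, 3.57, 3.5, 2, 0],
    [86, 2.87, 6.0, 1, 1], [86, 3.09, 6.5, 1, 0], [86, 3.20, 5.0, 3, 1], [87, 3.89, 7.5, 4, 0],
    [87, 3.55, 4.0, 0, 1], [89, 3.31, 6.5, 1, 1], [89, 3.66, 5.0, 0, 1], [90, 2.90, 3.5, 1, 0],
    [90, 3.41, 6.0, 1, 0], [91, 3.28, 4.5, 2, 0], [91, 3.74, 7.0, 0, 0], [92, 3.91, 6.0, 2, 1],
    [92, 3.95, 5.0, 0, 0], [92, 3.54, 6.5, 1, 0], [93, 3.02, 4.0, 2, 1], [94, 3.26, 6.5, 0, 1],
    [98, 2.88, 3.5, 0, 0], [99, 3.78, 5.0, 1, 0], [100, 3.46, 6.5, 1, 1], [100, 3.00, 7.0, 0, 0],
])
y, X = data[:, 0], data[:, 1:]
ALPHA = 0.10  # p-value threshold to enter and to remove


def fit(cols):
    """OLS of y on an intercept plus X[:, cols]; returns (coefficients, two-sided p-values)."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = len(y) - A.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(A.T @ A)
    t = beta / np.sqrt(np.diag(cov))
    return beta, 2 * stats.t.sf(np.abs(t), dof)


selected = []            # column indices currently in the model, in order of entry
for _ in range(20):      # iteration cap guards against cycling
    changed = False
    # Entry step: try each excluded variable, keep the one with the smallest p-value.
    candidates = [j for j in range(X.shape[1]) if j not in selected]
    if candidates:
        pvals = {j: fit(selected + [j])[1][-1] for j in candidates}
        best = min(pvals, key=pvals.get)
        if pvals[best] < ALPHA:
            selected.append(best)
            changed = True
    # Removal step: drop the least significant included variable if p > ALPHA.
    if selected:
        p = fit(selected)[1][1:]  # skip the intercept's p-value
        worst = int(np.argmax(p))
        if p[worst] > ALPHA:
            del selected[worst]
            changed = True
    if not changed:
        break

# Final equation: coefficient 0 for any variable left out of the model.
beta, _ = fit(selected)
coef = np.zeros(X.shape[1])
for j, b in zip(selected, beta[1:]):
    coef[j] = b
terms = " + ".join(f"({c:.3f})x{j + 1}" for j, c in enumerate(coef))
print(f"y = ({beta[0]:.3f}) + {terms}")
```

Refitting after each entry and removal matters here: once a variable is dropped, the remaining coefficients change, so the final equation should not reuse the coefficients from the full four-variable output.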