Question

A business statistics professor at a college would like to develop a regression model to predict the final exam scores for students based on their current GPAs, the number of hours they studied for the exam, the number of times they were absent during the semester, and their genders. Use the accompanying data to complete parts a through c below.

Score   GPA   Hours   Absences   Gender
68   2.55   3.00   0   0
69   2.22   4.00   3   0
70   2.60   2.50   1   0
71   3.09   0.50   0   1
74   3.08   6.00   4   1
77   2.80   3.50   6   0
77   3.34   1.50   0   0
78   2.98   3.00   3   1
78   2.99   2.00   3   1
79   2.81   2.50   2   1
79   2.80   4.50   0   1
82   3.47   7.00   1   0
83   3.19   3.00   1   0
84   3.10   3.00   4   0
84   3.18   5.50   0   1
84   2.98   2.00   0   0
84   2.71   4.00   1   1
85   3.20   4.50   3   0
85   3.75   2.00   0   1
85   3.57   3.50   2   0
86   2.87   6.00   1   1
86   3.09   6.50   1   0
86   3.20   5.00   3   1
87   3.89   7.50   4   0
87   3.55   4.00   0   1
89   3.31   6.50   1   1
89   3.66   5.00   0   1
90   2.90   3.50   1   0
90   3.41   6.00   1   0
91   3.28   4.50   2   0
91   3.74   7.00   0   0
92   3.91   6.00   2   1
92   3.95   5.00   0   0
92   3.54   6.50   1   0
93   3.02   4.00   2   1
94   3.26   6.50   0   1
98   2.88   3.50   0   0
99   3.78   5.00   1   0
100   3.46   6.50   1   1
100   3.00   7.00   0   0

a. Use technology to check for the presence of multicollinearity.

Find the variance inflation factor (VIF) for each independent variable. Use x1=GPA, x2=Hours, x3=Absences, and x4=Gender, where x4=1 if the gender is male, and x4=0 otherwise.

Independent Variable   VIF
GPA (x1)   (_)
Hours (x2)   (_)
Absences (x3)   (_)
Gender (x4)   (_)

Is there multicollinearity?

A. Yes, because at least one of the VIFs is greater than or equal to 5.0.

B. No, because at least one of the VIFs is greater than or equal to 5.0.

C. No, because none of the VIFs are greater than or equal to 5.0.

D. Yes, because none of the VIFs are greater than or equal to 5.0.

b. If multicollinearity is present, take the necessary steps to eliminate it.

What is necessary to eliminate any multicollinearity? Select all that apply.

A. Remove the independent variable x1 (GPA).

B. Remove the independent variable x2 (Hours).

C. Remove the independent variable x3 (Absences).

D. Remove the independent variable x4 (Gender).

E. Since no multicollinearity is present, no steps are necessary.

c. Perform a general stepwise regression using α = 0.10 for the p-value to enter and to remove independent variables from the regression model.

What is the resulting regression equation? Use x1=GPA, x2=Hours, x3=Absences, and x4=Gender, where x4=1 if the gender is male, and x4=0 otherwise. Note that the coefficient is 0 for any variable that was removed or is not significant.

y=(_)+(_)x1+(_)x2+(_)x3+(_)x4

Solutions

Expert Solution

Regression Analysis: Score versus GPA, Hours, Absences, Gender

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value
Regression 4 1274.25 318.562 7.97 0.000
GPA 1 292.24 292.242 7.32 0.010
Hours 1 362.46 362.457 9.07 0.005
Absences 1 108.85 108.851 2.72 0.108
Gender 1 3.06 3.055 0.08 0.784
Error 35 1398.15 39.947
Total 39 2672.40

Model Summary

S R-sq R-sq(adj) R-sq(pred)
6.32038 47.68% 41.70% 31.30%
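The Model Summary figures follow directly from the Analysis of Variance table. As a quick arithmetic check (a sketch only; the variable names are illustrative, and the inputs are the rounded sums of squares printed above), the F-value, S, R-sq and R-sq(adj) can be recomputed like this:

```python
import math

# Sums of squares and degrees of freedom copied from the ANOVA table above.
ss_reg, ss_err, ss_tot = 1274.25, 1398.15, 2672.40
df_reg, df_err, df_tot = 4, 35, 39

ms_reg = ss_reg / df_reg                      # mean square for regression
ms_err = ss_err / df_err                      # mean square error
f_value = ms_reg / ms_err                     # overall F statistic, rounds to 7.97
s = math.sqrt(ms_err)                         # standard error of the estimate
r_sq = ss_reg / ss_tot                        # share of variation explained
r_sq_adj = 1 - ms_err / (ss_tot / df_tot)     # penalised for the 4 predictors

print(f"F = {f_value:.2f}, S = {s:.4f}")
print(f"R-sq = {100 * r_sq:.2f}%, R-sq(adj) = {100 * r_sq_adj:.2f}%")
```

R-sq(pred) is not recomputed here because it requires the raw data (leave-one-out residuals), not just the ANOVA totals.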

Coefficients

Term Coef SE Coef T-Value P-Value VIF
Constant 54.62 8.55 6.39 0.000
GPA 7.52 2.78 2.70 0.010 1.22
Hours 1.860 0.618 3.01 0.005 1.19
Absences -1.150 0.697 -1.65 0.108 1.04
Gender -0.56 2.01 -0.28 0.784 1.00
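The VIF column can be cross-checked without a statistics package. The sketch below assumes the standard identity that the VIF of predictor j equals the j-th diagonal entry of the inverse of the predictors' correlation matrix (equivalently 1 / (1 - R_j^2), where R_j^2 comes from regressing x_j on the other predictors). The helper names `corr` and `invert` are hypothetical, and the data are typed in from the table in the question; the printed values can be compared with the VIF column above.

```python
import math

# Data from the problem statement: Score GPA Hours Absences Gender (rows separated by ';').
RAW = """
68 2.55 3.00 0 0; 69 2.22 4.00 3 0; 70 2.60 2.50 1 0; 71 3.09 0.50 0 1
74 3.08 6.00 4 1; 77 2.80 3.50 6 0; 77 3.34 1.50 0 0; 78 2.98 3.00 3 1
78 2.99 2.00 3 1; 79 2.81 2.50 2 1; 79 2.80 4.50 0 1; 82 3.47 7.00 1 0
83 3.19 3.00 1 0; 84 3.10 3.00 4 0; 84 3.18 5.50 0 1; 84 2.98 2.00 0 0
84 2.71 4.00 1 1; 85 3.20 4.50 3 0; 85 3.75 2.00 0 1; 85 3.57 3.50 2 0
86 2.87 6.00 1 1; 86 3.09 6.50 1 0; 86 3.20 5.00 3 1; 87 3.89 7.50 4 0
87 3.55 4.00 0 1; 89 3.31 6.50 1 1; 89 3.66 5.00 0 1; 90 2.90 3.50 1 0
90 3.41 6.00 1 0; 91 3.28 4.50 2 0; 91 3.74 7.00 0 0; 92 3.91 6.00 2 1
92 3.95 5.00 0 0; 92 3.54 6.50 1 0; 93 3.02 4.00 2 1; 94 3.26 6.50 0 1
98 2.88 3.50 0 0; 99 3.78 5.00 1 0; 100 3.46 6.50 1 1; 100 3.00 7.00 0 0
"""
rows = [[float(v) for v in r.split()] for r in RAW.replace("\n", ";").split(";") if r.strip()]
NAMES = ["GPA", "Hours", "Absences", "Gender"]
cols = [[r[j + 1] for r in rows] for j in range(4)]  # predictor columns only

def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def invert(a):
    """Gauss-Jordan inverse of a small square matrix with partial pivoting."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        piv = m[c][c]
        m[c] = [v / piv for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]

R = [[corr(ci, cj) for cj in cols] for ci in cols]   # 4x4 correlation matrix
R_inv = invert(R)
vif = {name: R_inv[j][j] for j, name in enumerate(NAMES)}

for name, value in vif.items():
    print(f"{name}: VIF = {value:.2f}")
print("Multicollinearity flagged:", any(v >= 5.0 for v in vif.values()))
```

The last line applies the same rule used in the answer choices: multicollinearity is flagged only if at least one VIF is 5.0 or more.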

Regression Equation

Score = 54.62 + 7.52 GPA + 1.860 Hours - 1.150 Absences - 0.56 Gender

Fits and Diagnostics for Unusual Observations

Obs Score Fit Resid Std Resid
37 98.00 82.79 15.21 2.53 R
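The fit and residual listed for observation 37 can be reproduced from the regression equation. The short check below plugs in that student's values from the data table (Score 98, GPA 2.88, 3.50 hours, 0 absences, Gender 0). It uses the rounded coefficients printed in the output, so tiny rounding differences are possible, and the variable names are illustrative. The R flag is Minitab's marker for a large standardized residual; that standardized value is not recomputed here because it also depends on the observation's leverage.

```python
# Rounded coefficients from the regression equation in the output.
b0, b_gpa, b_hours, b_abs, b_gender = 54.62, 7.52, 1.860, -1.150, -0.56

# Observation 37 from the data table.
score, gpa, hours, absences, gender = 98, 2.88, 3.50, 0, 0

fit = b0 + b_gpa * gpa + b_hours * hours + b_abs * absences + b_gender * gender
resid = score - fit  # observed minus fitted

print(f"Fit = {fit:.2f}, Resid = {resid:.2f}")
```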

a) C. No, because none of the VIFs are greater than or equal to 5.0.

b) E. Since no multicollinearity is present, no steps are necessary.

c)
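One way to carry out the stepwise procedure in part c is sketched below in plain Python. This is a minimal sketch under stated assumptions, not Minitab's implementation: each round adds the candidate with the smallest p-value if it is below 0.10, then drops the least significant selected variable if its p-value exceeds 0.10, and stops when nothing changes. The helpers `t_two_sided_p`, `invert`, `fit` and `stepwise` are hypothetical names, and the t-test p-value is obtained by Simpson integration of the t density. Any variable that is not selected gets coefficient 0 in the requested equation.

```python
import math

# Data from the problem statement: Score GPA Hours Absences Gender (rows separated by ';').
RAW = """
68 2.55 3.00 0 0; 69 2.22 4.00 3 0; 70 2.60 2.50 1 0; 71 3.09 0.50 0 1
74 3.08 6.00 4 1; 77 2.80 3.50 6 0; 77 3.34 1.50 0 0; 78 2.98 3.00 3 1
78 2.99 2.00 3 1; 79 2.81 2.50 2 1; 79 2.80 4.50 0 1; 82 3.47 7.00 1 0
83 3.19 3.00 1 0; 84 3.10 3.00 4 0; 84 3.18 5.50 0 1; 84 2.98 2.00 0 0
84 2.71 4.00 1 1; 85 3.20 4.50 3 0; 85 3.75 2.00 0 1; 85 3.57 3.50 2 0
86 2.87 6.00 1 1; 86 3.09 6.50 1 0; 86 3.20 5.00 3 1; 87 3.89 7.50 4 0
87 3.55 4.00 0 1; 89 3.31 6.50 1 1; 89 3.66 5.00 0 1; 90 2.90 3.50 1 0
90 3.41 6.00 1 0; 91 3.28 4.50 2 0; 91 3.74 7.00 0 0; 92 3.91 6.00 2 1
92 3.95 5.00 0 0; 92 3.54 6.50 1 0; 93 3.02 4.00 2 1; 94 3.26 6.50 0 1
98 2.88 3.50 0 0; 99 3.78 5.00 1 0; 100 3.46 6.50 1 1; 100 3.00 7.00 0 0
"""
rows = [[float(v) for v in r.split()] for r in RAW.replace("\n", ";").split(";") if r.strip()]
NAMES = ["GPA", "Hours", "Absences", "Gender"]
y = [r[0] for r in rows]
data = {name: [r[j + 1] for r in rows] for j, name in enumerate(NAMES)}

def t_two_sided_p(t, df, n=2000):
    """Two-sided p-value of a t statistic via Simpson's rule on the t density."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    f = lambda u: c * (1 + u * u / df) ** (-(df + 1) / 2)
    a = abs(t)
    h = a / n
    s = f(0) + f(a) + sum((4 if i % 2 else 2) * f(i * h) for i in range(1, n))
    return max(0.0, 1 - 2 * s * h / 3)

def invert(a):
    """Gauss-Jordan inverse of a small square matrix with partial pivoting."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        piv = m[c][c]
        m[c] = [v / piv for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]

def fit(cols, y):
    """OLS with intercept; returns (coefficients, p-values), index 0 = intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in cols] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    resid = [y[i] - sum(X[i][a] * beta[a] for a in range(k)) for i in range(n)]
    df = n - k
    mse = sum(r * r for r in resid) / df
    pvals = [t_two_sided_p(beta[a] / math.sqrt(mse * inv[a][a]), df) for a in range(k)]
    return beta, pvals

def stepwise(data, y, alpha=0.10, max_rounds=20):
    """General stepwise selection with the same alpha to enter and to remove."""
    selected = []
    for _ in range(max_rounds):  # guard against cycling
        changed = False
        # Forward step: try each remaining variable, keep the most significant one.
        trials = []
        for name in data:
            if name not in selected:
                _, p = fit([data[v] for v in selected + [name]], y)
                trials.append((p[-1], name))
        if trials and min(trials)[0] < alpha:
            selected.append(min(trials)[1])
            changed = True
        # Backward step: drop the least significant selected variable if p > alpha.
        if selected:
            _, p = fit([data[v] for v in selected], y)
            worst = max(range(len(selected)), key=lambda j: p[j + 1])
            if p[worst + 1] > alpha:
                del selected[worst]
                changed = True
        if not changed:
            break
    return selected

selected = stepwise(data, y)
beta, _ = fit([data[v] for v in selected], y)
coef = {name: 0.0 for name in NAMES}      # removed variables keep coefficient 0
coef.update(zip(selected, beta[1:]))
print("Selected:", selected)
print("Intercept:", round(beta[0], 3), {k: round(v, 3) for k, v in coef.items()})
```

The printed intercept and coefficients fill the blanks in y=(_)+(_)x1+(_)x2+(_)x3+(_)x4, in the order GPA, Hours, Absences, Gender. Software with a different entry or removal rule may select a different set of variables.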

