SAT | Income | GPA |
1651 | 47000 | 2.79 |
1581 | 34000 | 2.97 |
1790 | 90000 | 3.48 |
1626 | 60000 | 2.5 |
1754 | 113000 | 2.92 |
1754 | 71000 | 3.76 |
1706 | 105000 | 2.8 |
1765 | 59000 | 3.26 |
1786 | 50000 | 3.89 |
1686 | 27000 | 3.67 |
1790 | 107000 | 3.31 |
1707 | 109000 | 3.16 |
1804 | 81000 | 3.73 |
1712 | 62000 | 3.21 |
1607 | 72000 | 2.8 |
1738 | 63000 | 3.7 |
1790 | 55000 | 3.86 |
1796 | 64000 | 3.91 |
1547 | 47000 | 2.63 |
1692 | 89000 | 2.98 |
1711 | 42000 | 3.45 |
1689 | 70000 | 3.06 |
1740 | 118000 | 2.88 |
1940 | 113000 | 3.96 |
Use the SAT data file
The data file gives you a list of SAT scores, test-takers’ family income, and students’ GPA. For part a, you must run three different regression models to try to predict SAT score – Model 1 has income as the independent variable; Model 2 has GPA as the independent variable; Model 3 has both income and GPA as independent variables. Run the models and place the results either in separate tabs within your spreadsheet, or in different sections of the same spreadsheet. Underneath the Excel results, display:
The equation for the model
The adjusted R2 of the model
Also on the spreadsheet, answer part b (using the R2, what is the best-fitting model?) and part c (use the “best” model to predict the SAT scores using the mean of the explanatory variables).
Summarize an article that reports on a current issue regarding the SAT. It could be related to this problem – about whether SAT score is related to income and/or GPA. Or, it could be about the relevance of the SAT – how many schools still use it, do schools use an alternative to SAT, etc. Or, it could be about the new “adversity” score that is reportedly going to be added to the SAT. Or, it could be any article that you find regarding SAT.
1)
Model 1:
SUMMARY OUTPUT
Regression Statistics | |
Multiple R | 0.470211 |
R Square | 0.221098 |
Adjusted R Square | 0.185693 |
Standard Error | 76.22174 |
Observations | 24 |
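R Square is 0.221098, so Model 1 explains about 22.11% of the variability of the SAT scores around their mean, not all of it. The 0.470211 figure is Multiple R (the correlation between observed and predicted SAT), not the share of variance explained. The adjusted R2 of the model is 0.185693.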
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 36281.27 | 36281.27 | 6.24489 | 0.020413 |
Residual | 22 | 127814.6 | 5809.753 | | |
Total | 23 | 164095.8 | | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
Intercept | 1616.363 | 45.57701 | 35.46443 | 6.57E-21 | 1521.842 | 1710.884 | 1521.842 | 1710.884 |
Income | 0.00147 | 0.000588 | 2.498978 | 0.020413 | 0.00025 | 0.00269 | 0.00025 | 0.00269 |
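Regression model 1:
SAT = 1616.363 + 0.00147 * Income
The intercept, 1616.363, is the predicted SAT score when income is 0.
The slope means that predicted SAT increases by 0.00147 points for each one-unit increase in income, or about 1.47 points per 1,000 of income. The slope is statistically significant at the 5% level (p = 0.020413).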
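As a cross-check on the Excel output, the Model 1 fit can be sketched in plain Python. This is a minimal sketch, assuming the 24 rows of the data table above are the full data set; the helper name `simple_ols` is hypothetical and not part of the original answer.

```python
# Data transcribed from the SAT and Income columns of the table in the question.
sat = [1651, 1581, 1790, 1626, 1754, 1754, 1706, 1765, 1786, 1686, 1790, 1707,
       1804, 1712, 1607, 1738, 1790, 1796, 1547, 1692, 1711, 1689, 1740, 1940]
income = [47000, 34000, 90000, 60000, 113000, 71000, 105000, 59000, 50000,
          27000, 107000, 109000, 81000, 62000, 72000, 63000, 55000, 64000,
          47000, 89000, 42000, 70000, 118000, 113000]


def simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x (hypothetical helper).

    Returns (intercept, slope, r2, adj_r2).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = 1 - ss_resid / ss_total
    # One predictor, so the residual degrees of freedom are n - 2.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)
    return intercept, slope, r2, adj_r2


intercept, slope, r2, adj_r2 = simple_ols(income, sat)
print(f"SAT = {intercept:.3f} + {slope:.6f} * Income")
print(f"R^2 = {r2:.6f}, adjusted R^2 = {adj_r2:.6f}")
```

Calling `simple_ols(gpa, sat)` with the GPA column would give the Model 2 fit in the same way.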
2)
Model 2:
SUMMARY OUTPUT
Regression Statistics | |
Multiple R | 0.755055 |
R Square | 0.570108 |
Adjusted R Square | 0.550567 |
Standard Error | 56.62619 |
Observations | 24 |
R Square is 0.570108, so Model 2 explains about 57.01% of the variability of the SAT scores around their mean, not all of it. The adjusted R2 of the model is 0.550567.
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 93552.29 | 93552.29 | 29.1756 | 2E-05 |
Residual | 22 | 70543.55 | 3206.525 | | |
Total | 23 | 164095.8 | | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
Intercept | 1259.638 | 86.63659 | 14.53933 | 9.19E-13 | 1079.964 | 1439.311 | 1079.964 | 1439.311 |
GPA | 141.468 | 26.19076 | 5.401444 | 2E-05 | 87.15163 | 195.7843 | 87.15163 | 195.7843 |
SAT = 1259.638 + 141.468 * GPA
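The intercept, 1259.638, is the predicted SAT score when GPA is 0. That value lies far outside the observed GPA range (2.5 to 3.96), so it should not be interpreted on its own.
The slope means that predicted SAT increases by 141.468 points for each one-point increase in GPA. The slope is statistically significant (p = 2E-05).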
3)
Model 3:
SUMMARY OUTPUT
Regression Statistics | |
Multiple R | 0.930005 |
R Square | 0.864909 |
Adjusted R Square | 0.852043 |
Standard Error | 32.4902 |
Observations | 24 |
R Square is 0.864909, so Model 3 explains about 86.49% of the variability of the SAT scores around their mean, not all of it. The adjusted R2 of the model is 0.852043.
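The adjusted R2 reported by Excel can be reproduced from R Square, the number of observations (n = 24) and the number of predictors (k = 2 for Model 3). The worked step below uses the standard adjusted R2 formula; the symbols n and k are notation introduced here, not taken from the Excel output.

```latex
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}
          = 1 - (1 - 0.864909)\,\frac{24 - 1}{24 - 2 - 1}
          = 1 - 0.135091 \times \frac{23}{21}
          \approx 0.852043
```

Because the penalty grows with k, adjusted R2 is the fairer figure for comparing Model 3 against the one-predictor models.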
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 2 | 141928 | 70963.98 | 67.22536 | 7.44E-10 |
Residual | 21 | 22167.88 | 1055.613 | | |
Total | 23 | 164095.8 | | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
Intercept | 1104.258 | 54.75239 | 20.16821 | 3.17E-15 | 990.3941 | 1218.122 | 990.3941 | 1218.122 |
Income | 0.001705 | 0.000252 | 6.76957 | 1.07E-06 | 0.001181 | 0.002228 | 0.001181 | 0.002228 |
GPA | 150.992 | 15.09309 | 10.00404 | 1.92E-09 | 119.6042 | 182.3798 | 119.6042 | 182.3798 |
SAT = 1104.258 + 0.001705 * Income + 150.992 * GPA
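The intercept, 1104.258, is the predicted SAT score when both income and GPA are 0, which lies outside the observed data.
Holding GPA constant, predicted SAT increases by 0.001705 points for each one-unit increase in income (about 1.7 points per 1,000 of income). Holding income constant, predicted SAT increases by 150.992 points for each one-point increase in GPA. Both slopes are statistically significant (p = 1.07E-06 and p = 1.92E-09).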
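Parts b and c follow from the three outputs above. The sketch below, which assumes the data table in the question is the full data set, refits Model 3 in plain Python, picks the model with the highest adjusted R2 among the values reported by Excel, and predicts SAT at the means of the explanatory variables. The function name `fit_two_predictors` and the dictionary labels are hypothetical.

```python
# Data transcribed from the table in the question.
sat = [1651, 1581, 1790, 1626, 1754, 1754, 1706, 1765, 1786, 1686, 1790, 1707,
       1804, 1712, 1607, 1738, 1790, 1796, 1547, 1692, 1711, 1689, 1740, 1940]
income = [47000, 34000, 90000, 60000, 113000, 71000, 105000, 59000, 50000,
          27000, 107000, 109000, 81000, 62000, 72000, 63000, 55000, 64000,
          47000, 89000, 42000, 70000, 118000, 113000]
gpa = [2.79, 2.97, 3.48, 2.5, 2.92, 3.76, 2.8, 3.26, 3.89, 3.67, 3.31, 3.16,
       3.73, 3.21, 2.8, 3.7, 3.86, 3.91, 2.63, 2.98, 3.45, 3.06, 2.88, 3.96]


def fit_two_predictors(x1, x2, y):
    """OLS for y = b0 + b1*x1 + b2*x2 via the centered 2x2 normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 * s12          # Cramer's rule on the 2x2 system
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    ss_total = sum(c * c for c in dy)
    ss_resid = sum((c - b1 * a - b2 * b) ** 2 for a, b, c in zip(d1, d2, dy))
    r2 = 1 - ss_resid / ss_total
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 3)   # two predictors: n - k - 1 = n - 3
    return b0, b1, b2, r2, adj_r2


# Part b: adjusted R2 values as reported in the three Excel outputs.
reported_adj_r2 = {
    "Model 1 (Income)": 0.185693,
    "Model 2 (GPA)": 0.550567,
    "Model 3 (Income + GPA)": 0.852043,
}
best = max(reported_adj_r2, key=reported_adj_r2.get)

# Part c: predict SAT at the mean of each explanatory variable.
b0, b1, b2, r2, adj_r2 = fit_two_predictors(income, gpa, sat)
mean_income = sum(income) / len(income)
mean_gpa = sum(gpa) / len(gpa)
predicted_sat = b0 + b1 * mean_income + b2 * mean_gpa

print(f"Best-fitting model: {best}")
print(f"SAT = {b0:.3f} + {b1:.6f} * Income + {b2:.3f} * GPA")
print(f"Predicted SAT at the means: {predicted_sat:.3f}")
```

Model 3 has the highest adjusted R2, so it is the best-fitting of the three. A least-squares line with an intercept always passes through the point of means, so the prediction at mean income and mean GPA equals the mean SAT score, about 1723.4.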
Descriptive statistics
Statistic | SAT | Income | GPA |
Mean | 1723.417 | 72833.33 | 3.278333 |
Standard Error | 17.24167 | 5515.678 | 0.092024 |
Median | 1725 | 67000 | 3.235 |
Mode | 1790 | 47000 | 2.8 |
Standard Deviation | 84.46657 | 27021.19 | 0.450822 |
Sample Variance | 7134.601 | 7.3E+08 | 0.203241 |
Kurtosis | 1.041193 | -1.03933 | -1.31756 |
Skewness | 0.073531 | 0.259968 | 0.046499 |
Range | 393 | 91000 | 1.46 |
Minimum | 1547 | 27000 | 2.5 |
Maximum | 1940 | 118000 | 3.96 |
Sum | 41362 | 1748000 | 78.68 |
Count | 24 | 24 | 24 |
Confidence Level(95.0%) | 35.6671 | 11410.05 | 0.190365 |
The mean SAT score is 1723.417 and the median is 1725, so the scores are centered close to that value, with a standard deviation of 84.46657. There are 24 observations, with a minimum of 1547 and a maximum of 1940.
The mean income is 72833.33 and the median is 67000, with a standard deviation of 27021.19. There are 24 observations, with a minimum of 27000 and a maximum of 118000.
The mean GPA is 3.278333 and the median is 3.235, with a standard deviation of 0.450822. There are 24 observations, with a minimum of 2.5 and a maximum of 3.96.
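The main descriptive statistics can also be reproduced with Python's standard `statistics` module. This is a minimal sketch assuming the data table in the question; the `columns` and `summary` names are illustrative only.

```python
import statistics

# Data transcribed from the table in the question.
columns = {
    "SAT": [1651, 1581, 1790, 1626, 1754, 1754, 1706, 1765, 1786, 1686, 1790,
            1707, 1804, 1712, 1607, 1738, 1790, 1796, 1547, 1692, 1711, 1689,
            1740, 1940],
    "Income": [47000, 34000, 90000, 60000, 113000, 71000, 105000, 59000, 50000,
               27000, 107000, 109000, 81000, 62000, 72000, 63000, 55000, 64000,
               47000, 89000, 42000, 70000, 118000, 113000],
    "GPA": [2.79, 2.97, 3.48, 2.5, 2.92, 3.76, 2.8, 3.26, 3.89, 3.67, 3.31,
            3.16, 3.73, 3.21, 2.8, 3.7, 3.86, 3.91, 2.63, 2.98, 3.45, 3.06,
            2.88, 3.96],
}

summary = {}
for name, values in columns.items():
    summary[name] = {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "max": max(values),
        "count": len(values),
    }

for name, s in summary.items():
    print(f"{name}: mean={s['mean']:.3f}, median={s['median']}, "
          f"stdev={s['stdev']:.3f}, min={s['min']}, max={s['max']}, n={s['count']}")
```

`statistics.stdev` uses the n - 1 denominator, which matches the "Standard Deviation" row of Excel's Descriptive Statistics tool.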