In: Statistics and Probability
A sample of nine public universities and nine private universities was taken. The total cost for the year (including room and board) and the median SAT score (maximum total is 2400) at each school were recorded. It was felt that schools with higher median SAT scores would have a better reputation and would charge more tuition as a result of that. The data are in the following table. Run the regression one time without dummy variable and one time with dummy variable
a) Write the predicted regression equation and highlight it for both models
b) High light the r2 and explain it for both models
c) Highlight Significance F and explain it for both models
d) Highlight the p-value and discuss if independent variable is significant or not for both models
e) Discuss the sign of the co-efficient for both models
f) Are private schools more expensive than public schools when SAT scores are taken into consideration?
g) Discuss how accurate you believe these results are using information related to the regression models. ( not for this session).
Hint: Dummy variable gets the value of 0 for public universities and1 for private universities.
UNIVERSITY |
Total Cost ($) |
Median SAT |
Dummy |
University 1 |
21700 |
1990 |
Public |
University 2 |
15600 |
1620 |
Public |
University 3 |
16900 |
1810 |
Public |
University 4 |
15400 |
1540 |
Public |
University 5 |
23100 |
1540 |
Public |
University 6 |
21400 |
1600 |
Public |
University 7 |
16500 |
1560 |
Public |
University 8 |
23500 |
1890 |
Public |
University 9 |
20200 |
1620 |
Public |
University 10 |
30400 |
1630 |
Private |
University 11 |
41500 |
1840 |
Private |
University 12 |
36100 |
1980 |
Private |
University 13 |
42100 |
1930 |
Private |
University 14 |
27100 |
2130 |
Private |
University 15 |
34800 |
2010 |
Private |
University 16 |
32100 |
1590 |
Private |
University 17 |
31800 |
1720 |
Private |
University 18 |
32100 |
1770 |
Private |
please show screenshots and what formulas used in excel/the labels thank you
without dummy variable
SUMMARY OUTPUT | ||||||
Regression Statistics | ||||||
Multiple R | 0.4701 | |||||
R Square | 0.2210 | |||||
Adjusted R Square | 0.1723 | |||||
Standard Error | 7887.3489 | |||||
Observations | 18.0000 | |||||
ANOVA | ||||||
df | SS | MS | F | Significance F | ||
Regression | 1.0000 | 282345071.1862 | 282345071.1862 | 4.5386 | 0.0490 | |
Residual | 16.0000 | 995364373.2583 | 62210273.3286 | |||
Total | 17.0000 | 1277709444.4444 | ||||
Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | |
Intercept | -11364.6773 | 18008.0215 | -0.6311 | 0.5369 | -49539.9774 | 26810.6228 |
Median SAT | 21.6199 | 10.1483 | 2.1304 | 0.0490 | 0.1064 | 43.1334 |
With Dummy variable
for dummy
use =IF(D2="Private",1,0) to assign D = 1 if private
SUMMARY OUTPUT | ||||||
Regression Statistics | ||||||
Multiple R | 0.8870 | |||||
R Square | 0.7867 | |||||
Adjusted R Square | 0.7583 | |||||
Standard Error | 4262.2253 | |||||
Observations | 18.0000 | |||||
ANOVA | ||||||
df | SS | MS | F | Significance F | ||
Regression | 2.0000 | 1005210980.9696 | 502605490.4848 | 27.6665 | 0.0000 | |
Residual | 15.0000 | 272498463.4748 | 18166564.2317 | |||
Total | 17.0000 | 1277709444.4444 | ||||
Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | |
Intercept | 10988.5314 | 10356.4312 | 1.0610 | 0.3055 | -11085.6792 | 33062.7421 |
D | 14065.7907 | 2229.8296 | 6.3080 | 0.0000 | 9313.0214 | 18818.5600 |
Median SAT | 4.9705 | 6.0861 | 0.8167 | 0.4269 | -8.0017 | 17.9428 |
without dummy
y^ = -11364.6773 21.6199* Median SAT
with dummy variable
y^ = 10988.5314 + 14065.7907 * D + 4.9705 * Median SAT
b)
model 1 - without dummy
model 2 - with dummy
R^2 = 0.2210 for model 1
R^2 = 0.7867 for model 2
c)
significance F for model 1 = 0.0490
significance F for model 2= 0.0000
model 2 is very significant as significance level << 0.01
model is significant at 0.05 level but not at 0.01 level
d)
p-value in model 1 is 0.049 , hence significant at 0.05 level
p-value for D is 0.0000, hence significant
p-value for Median SAT in model 2 is 0.4269 , hence not significant