x | y | sex |
1.37 | 55.29 | 0 |
1.94 | 57.26 | 0 |
3.44 | 66.92 | 1 |
3.59 | 69.05 | 0 |
4.18 | 70.63 | 1 |
y and x are continuous variables, while sex is a categorical variable coded 0 = male and 1 = female.
f. Write down the estimated regression and calculate R-squared.
The objective of the following analysis is to estimate the regression model between variable y and variable male.
a. On turning the variable Sex into a binary variable male, the result is:
x | y | sex | male |
1.37 | 55.29 | 0 | 1 |
1.94 | 57.26 | 0 | 1 |
3.44 | 66.92 | 1 | 0 |
3.59 | 69.05 | 0 | 1 |
4.18 | 70.63 | 1 | 0 |
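The recoding in part a can be sketched in a few lines of Python. This is a minimal illustration, assuming the coding stated above (sex = 0 for male, 1 for female), so that male is simply 1 - sex. The list names are chosen here for illustration.

```python
# sex column from the table above; coded 0 = male, 1 = female.
sex = [0, 0, 1, 0, 1]

# male is the complement of sex: 1 for males, 0 for females.
male = [1 - s for s in sex]

print(male)
```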
b. The direction of the relationship between y and male is negative. The mean of y for the three male observations (60.53) is lower than the mean of y for the two female observations (68.775), and the female observations include the highest value of y, so y tends to be lower when male = 1.
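One way to check the sign of the relationship is to compare the mean of y across the two groups. The sketch below uses only the five observations from the table; the variable names are illustrative.

```python
from statistics import mean

y = [55.29, 57.26, 66.92, 69.05, 70.63]
male = [1, 1, 0, 1, 0]

# Split y by group and compare the group means.
y_male = [v for v, m in zip(y, male) if m == 1]
y_female = [v for v, m in zip(y, male) if m == 0]

mean_male = mean(y_male)
mean_female = mean(y_female)
diff = mean_male - mean_female  # negative means y is lower for males

print(round(mean_male, 4), round(mean_female, 4), round(diff, 4))
```

A negative difference is consistent with the negative direction stated in part b.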
c. The population model that looks into the relationship between y and male is:
y = b1 + b2*male + u
d. On estimating the OLS regression between y and male in excel, the result is:
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.639784995 |
R Square | 0.40932484 |
Adjusted R Square | 0.21243312 |
Standard Error | 6.261600346 |
Observations | 5 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 81.51008333 | 81.51008333 | 2.07893374 | 0.24501494 |
Residual | 3 | 117.6229167 | 39.20763889 | | |
Total | 4 | 199.133 | | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 99.0% | Upper 99.0% |
Intercept | 68.775 | 4.427620066 | 15.53317561 | 0.000579759 | 54.68433688 | 82.86566312 | 42.91367274 | 94.63632726 |
male | -8.241666667 | 5.716032926 | -1.441850803 | 0.24501494 | -26.43263453 | 9.949301199 | -41.6284966 | 25.14516326 |
Thus, the intercept coefficient is 68.775 and the slope coefficient is -8.241666667.
The intercept coefficient shows that the expected value of y is 68.775 when male = 0, that is, for a female.
The slope coefficient shows that the expected value of y is 8.241666667 lower for a male than for a female.
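The Excel coefficients can be cross-checked with the textbook closed-form OLS formulas for a single regressor. The following is a minimal sketch in plain Python, not the Excel procedure itself; the variable names are illustrative.

```python
y = [55.29, 57.26, 66.92, 69.05, 70.63]
male = [1, 1, 0, 1, 0]
n = len(y)

x_bar = sum(male) / n
y_bar = sum(y) / n

# Sums of cross-products and squares in deviation form.
s_xy = sum((x - x_bar) * (v - y_bar) for x, v in zip(male, y))
s_xx = sum((x - x_bar) ** 2 for x in male)

b2 = s_xy / s_xx          # slope on male
b1 = y_bar - b2 * x_bar   # intercept

print(round(b1, 6), round(b2, 6))
```

With a single dummy regressor, the intercept equals the mean of y for the group coded 0 (females) and the slope equals the difference between the male and female means.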
e. The standard error of the intercept is 4.4276 and the standard error of the slope coefficient is 5.7160.
It shall be noted that the Excel output gives 4.42762006 for the standard error of the intercept, not 3.61.
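The standard errors can be checked from the residual variance using the usual simple-regression formulas. This is a sketch under the classical OLS assumptions, with n - 2 = 3 residual degrees of freedom; the variable names are illustrative.

```python
from math import sqrt

y = [55.29, 57.26, 66.92, 69.05, 70.63]
male = [1, 1, 0, 1, 0]
n = len(y)

x_bar = sum(male) / n
y_bar = sum(y) / n
s_xx = sum((x - x_bar) ** 2 for x in male)
b2 = sum((x - x_bar) * (v - y_bar) for x, v in zip(male, y)) / s_xx
b1 = y_bar - b2 * x_bar

# Residual sum of squares and residual variance (n - 2 degrees of freedom).
rss = sum((v - (b1 + b2 * x)) ** 2 for x, v in zip(male, y))
s2 = rss / (n - 2)

se_b1 = sqrt(s2 * (1 / n + x_bar ** 2 / s_xx))  # intercept
se_b2 = sqrt(s2 / s_xx)                          # slope

print(round(se_b1, 4), round(se_b2, 4))
```

The computed intercept standard error should agree with the Excel value of about 4.4276 rather than 3.61.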
The 99% confidence interval for intercept coefficient as obtained in excel is 42.91367274 (Lower Confidence Limit) and 94.63632726 (Upper Confidence Limit)
The 99% confidence interval for slope coefficient as obtained in excel is -41.6284966 (Lower Confidence Limit) and 25.14516326 (Upper Confidence Limit)
It shall be noted that the coefficient on male is the slope coefficient. Since 10 lies inside its 99% confidence interval (-41.6285, 25.1452), the null hypothesis that the coefficient on male equals 10 cannot be rejected at the 1% significance level. The coefficient on male is therefore not statistically different from 10.
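The 99% interval and the test against 10 can be reproduced from the slope and its standard error. The sketch below takes both from the Excel output and assumes a two-sided critical value of about 5.8409 for the t distribution with 3 degrees of freedom, read from a standard t table rather than computed.

```python
b2 = -8.241666667      # slope from the Excel output
se_b2 = 5.716032926    # its standard error from the Excel output
t_crit = 5.8409        # assumed: t table, 3 df, 1% two-sided
hypothesized = 10.0    # value tested in the text

# 99% confidence interval for the slope.
lower = b2 - t_crit * se_b2
upper = b2 + t_crit * se_b2

# Equivalent t test of H0: b2 = 10.
t_stat = (b2 - hypothesized) / se_b2
reject = abs(t_stat) > t_crit

print(round(lower, 3), round(upper, 3), round(t_stat, 3), reject)
```

The interval check and the t test are two views of the same decision: the null is rejected at the 1% level only when the hypothesized value falls outside the 99% interval.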
f. The estimated regression model is:
y_estimated = 68.775 - 8.241666667*male
The R-Squared is 0.40932484
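R-squared can be verified as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below plugs the estimated coefficients from part f into the fitted equation; the variable names are illustrative.

```python
y = [55.29, 57.26, 66.92, 69.05, 70.63]
male = [1, 1, 0, 1, 0]

b1 = 68.775           # estimated intercept
b2 = -8.241666667     # estimated slope on male

y_bar = sum(y) / len(y)
fitted = [b1 + b2 * m for m in male]

rss = sum((v - f) ** 2 for v, f in zip(y, fitted))  # residual sum of squares
tss = sum((v - y_bar) ** 2 for v in y)              # total sum of squares

r_squared = 1 - rss / tss
print(round(r_squared, 6))
```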