I posted this under economics because the question is from my econometrics review; I can change it to statistics if needed.
Question 1. Consider the following sample of 6 observations measuring students' test scores (score) and average hours spent studying each week (study).
Table 1
Score | Study (hours per week)
95 | 18
85 | 11
93 | 15
80 | 7
100 | 16
89 | 10
Suppose that you are interested in estimating the following model:
score = β0 + β1study + u.    (1)
(i) Use the method of OLS to estimate β0 and β1. Show your work.
(ii) What are the predicted test scores if a student spends 8 hours and 20 hours per week studying, respectively? Comment on your predictions.
(iii) Find the sum of squared residuals, the standard error of the residuals, and the standard error of β̂1.
(iv) Calculate and interpret R^2.
(v) Suppose that you include in (1) another regressor, sleep, which is average hours spent sleeping per week. All the 6 students in your survey reported that they sleep 8 hours. Can you apply OLS to estimate the effect of sleep time on test scores? Explain.
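1. (i) The required quantities, with x = study, y = score and N = 6, are as follows.
$\bar{x} = 77/6 \approx 12.833$, $\bar{y} = 542/6 \approx 90.333$, $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) \approx 134.333$, $S_{xx} = \sum (x_i - \bar{x})^2 \approx 86.833$.
For the regression $\widehat{\text{score}} = \hat{\beta}_0 + \hat{\beta}_1\,\text{study}$, the slope estimate is $\hat{\beta}_1 = S_{xy}/S_{xx} = 134.333/86.833 \approx 1.547$. The intercept estimate is $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 90.333 - 1.547 \times 12.833 \approx 70.48$.
The regression equation thus obtained is $\widehat{\text{score}} = 70.48 + 1.547\,\text{study}$.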
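The estimates in part (i) can be checked numerically. The following is a minimal sketch in plain Python that applies the usual simple-regression formulas to the data in Table 1; the function name `ols_simple` is only an illustrative choice.

```python
# Data from Table 1: weekly study hours and test scores.
study = [18, 11, 15, 7, 16, 10]
score = [95, 85, 93, 80, 100, 89]


def ols_simple(x, y):
    """Return (intercept, slope) for the simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares in deviation-from-mean form.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = ols_simple(study, score)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")  # prints: b0 = 70.4798, b1 = 1.5470
```

Carrying more decimals than the hand calculation avoids small rounding differences in the later parts.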
(ii) For study = 8, the predicted score is $70.48 + 1.547 \times 8 \approx 82.86$. That is, a student who studies 8 hours per week is predicted to score about 82.86 on average. This prediction is admissible because it is a within-sample forecast: 8 lies within the observed range of the study variable (7 to 18 hours).
For study = 20, the predicted score is $70.48 + 1.547 \times 20 \approx 101.42$. That is, a student who studies 20 hours per week is predicted to score about 101.42 on average, which seems an impossible outcome since the score can be at most 100. This prediction is not admissible because it is an out-of-sample forecast: 20 lies outside the observed range of the study variable, whose maximum is 18.
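The two predictions in part (ii), and whether each falls inside the observed range of study, can be reproduced with a short sketch. It refits the line from the Table 1 data so that it runs on its own; the helper name `predict` is hypothetical.

```python
# Data from Table 1: weekly study hours and test scores.
study = [18, 11, 15, 7, 16, 10]
score = [95, 85, 93, 80, 100, 89]

n = len(study)
x_bar = sum(study) / n
y_bar = sum(score) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(study, score))
s_xx = sum((x - x_bar) ** 2 for x in study)
b1 = s_xy / s_xx           # slope estimate
b0 = y_bar - b1 * x_bar    # intercept estimate


def predict(hours):
    """Predicted score from the fitted line for a given number of study hours."""
    return b0 + b1 * hours


for hours in (8, 20):
    # A value inside [min, max] of the sample is interpolation; outside is extrapolation.
    in_range = min(study) <= hours <= max(study)
    label = "within observed range" if in_range else "outside observed range"
    print(f"study = {hours}: predicted score = {predict(hours):.2f} ({label})")
```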
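(iii) With x = study and y = score, the residual sum of squares (RSS) is $RSS = \sum \hat{u}_i^2 = \sum (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2 = S_{yy} - \hat{\beta}_1 S_{xy} \approx 259.333 - 207.817 = 51.516$, where $S_{yy} = \sum (y_i - \bar{y})^2 \approx 259.333$ and $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) \approx 134.333$.
The residual standard error is $\hat{\sigma} = \sqrt{RSS/(N-2)}$, where N is the number of observations and N-2 is the degrees of freedom, since two parameters are estimated, including the intercept. It is $\sqrt{51.516/4} \approx 3.589$.
The standard error of $\hat{\beta}_1$ is $se(\hat{\beta}_1) = \hat{\sigma}/\sqrt{S_{xx}} = 3.589/\sqrt{86.833} \approx 0.385$, where $S_{xx} = \sum (x_i - \bar{x})^2 \approx 86.833$.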
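As a check on part (iii), the sketch below computes the residuals directly from the fitted line and then the RSS, the residual standard error with N-2 degrees of freedom, and the standard error of the slope. It assumes the usual homoskedastic formula se = σ̂ / sqrt(Sxx), and the variable names are illustrative.

```python
import math

# Data from Table 1: weekly study hours and test scores.
study = [18, 11, 15, 7, 16, 10]
score = [95, 85, 93, 80, 100, 89]

n = len(study)
x_bar = sum(study) / n
y_bar = sum(score) / n
s_xx = sum((x - x_bar) ** 2 for x in study)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(study, score))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residuals are observed scores minus fitted scores.
residuals = [y - (b0 + b1 * x) for x, y in zip(study, score)]
rss = sum(e ** 2 for e in residuals)

# Two parameters (intercept and slope) are estimated, so df = n - 2.
sigma_hat = math.sqrt(rss / (n - 2))
se_b1 = sigma_hat / math.sqrt(s_xx)

print(f"RSS = {rss:.3f}, sigma_hat = {sigma_hat:.3f}, se(b1) = {se_b1:.3f}")
```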
(iv) Since this is a two-variable regression, R^2 equals the square of the correlation coefficient between score and study. Using the more general formula, $R^2 = 1 - RSS/TSS$. We have RSS ≈ 51.516 from part (iii), and the total sum of squares (TSS) is $TSS = \sum (y_i - \bar{y})^2 \approx 259.333$. Hence $R^2 = 1 - 51.516/259.333 \approx 0.8014$, or 80.14%.
This means that variation in the explanatory variable explains about 80.14% of the variation in the dependent variable, i.e., variation in study explains about 80.14% of the sample variation in scores.
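The two routes to R^2 mentioned in part (iv), the general formula 1 - RSS/TSS and the squared correlation coefficient, can be compared numerically. The following is a small self-contained sketch using the Table 1 data; it is a check, not part of the required hand calculation.

```python
import math

# Data from Table 1: weekly study hours and test scores.
study = [18, 11, 15, 7, 16, 10]
score = [95, 85, 93, 80, 100, 89]

n = len(study)
x_bar = sum(study) / n
y_bar = sum(score) / n
s_xx = sum((x - x_bar) ** 2 for x in study)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(study, score))
tss = sum((y - y_bar) ** 2 for y in score)   # total sum of squares

b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
rss = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(study, score))

r2_general = 1 - rss / tss                   # general formula
r = s_xy / math.sqrt(s_xx * tss)             # sample correlation of study and score
r2_from_corr = r ** 2                        # valid for simple regression only

print(f"R^2 (1 - RSS/TSS) = {r2_general:.4f}")
print(f"R^2 (corr squared) = {r2_from_corr:.4f}")
```

Both lines should agree up to floating-point error, which is the equivalence the answer relies on for a regression with a single regressor.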
(v) OLS can estimate a coefficient only if the explanatory variable varies across the sample. Here it cannot be applied to sleep, because every observation of sleep equals 8 hours, so there is no variation. An explanatory variable needs variation of its own in order to explain variation in the outcome, and that is absent here. Equivalently, sleep is an exact multiple of the constant term, so it is perfectly collinear with the intercept. Moreover, estimating its coefficient requires a nonzero $\sum (\text{sleep}_i - \overline{\text{sleep}})^2$, which in this case is 0, and hence the estimate is not defined.
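The argument in part (v) can be illustrated with a short sketch. It builds the cross-product matrix X'X for the regressors (constant, study, sleep) and shows that its determinant is zero when sleep is 8 for everyone, so the normal equations have no unique solution. The helper `det3` is a hypothetical name for a plain 3x3 determinant.

```python
# Data from Table 1, plus the constant sleep variable from part (v).
study = [18, 11, 15, 7, 16, 10]
sleep = [8] * 6          # every student reports 8 hours
ones = [1] * 6           # column for the intercept

# X'X: entry (i, j) is the sum of products of column i and column j.
columns = [ones, study, sleep]
xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Sum of squared deviations of sleep from its mean: zero means no variation.
sleep_bar = sum(sleep) / len(sleep)
s_sleep = sum((s - sleep_bar) ** 2 for s in sleep)

det = det3(xtx)
print(f"sum of squared deviations of sleep = {s_sleep}")
print(f"det(X'X) = {det}")
```

A zero determinant means X'X cannot be inverted, which is the matrix form of saying that sleep is perfectly collinear with the intercept.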