Please do this task in R.
a) Revise the simulation shown in the lecture with the aim of constructing the empirical sampling distribution of beta_hat, based on 5000 trials.
b) According to the lecture, the mean of that histogram is supposed to be approximately equal to the true slope. Is it? Show code.
c) According to the lecture, the standard deviation of that histogram is supposed to be approximately equal to sigma_eps/sqrt(Sxx). Is it? Show code.
d) According to the lecture, the distribution of beta_hat is supposed to be normal with certain parameters. Use qqnorm() and abline() to confirm that it is normal.
Not sure if this helps or not:
n = 10
n.trial = 64
x = c(1:n)
y_true = 10 + 2*x
sigma_eps = 15
par(mfrow=c(8,8),mar=c(0,0,0,0))
set.seed(123)
for(trial in 1:n.trial){
y_obs = y_true + rnorm(n,0,sigma_eps)
lm.1 = lm(y_obs ~ x)
plot(x, y_obs)
abline(10,2, col=2)
abline(lm.1, col=4)
}
a) To run the simulation 5000 times, change the value of the n.trial variable to 5000. The for loop also needs to change: drop the plotting calls (and the 8-by-8 par(mfrow) layout) and instead save the fitted slope from each trial, for example coef(lm.1)[2], into a vector beta.
This will store all the slopes in the beta vector.
After this, use the hist() function on beta to plot the empirical sampling distribution of beta_hat.
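The changes described in part a) could look like the following sketch. The setup is copied from the lecture code; storing the slope with coef(lm.1)[2] and the hist() arguments (breaks, main, xlab) are my assumptions, not necessarily what the lecture or the original answer used.

```r
# Sketch of the revised simulation for part a); setup copied from the lecture code.
n = 10
n.trial = 5000                 # was 64 in the lecture
x = c(1:n)
y_true = 10 + 2*x
sigma_eps = 15

beta = numeric(n.trial)        # storage for the fitted slopes
set.seed(123)
for(trial in 1:n.trial){
  y_obs = y_true + rnorm(n, 0, sigma_eps)
  lm.1 = lm(y_obs ~ x)
  beta[trial] = coef(lm.1)[2]  # assumption: second coefficient is the slope on x
}

# Empirical sampling distribution of beta_hat
hist(beta, breaks = 50, main = "Sampling distribution of beta_hat", xlab = "beta_hat")
```

If the lecture code was run earlier in the same session, reset the plot layout first with par(mfrow=c(1,1), mar=c(5,4,4,2)+0.1), otherwise the histogram is drawn into the 8-by-8 grid with no margins.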
b) Use the mean() function on beta to get the mean of this histogram, which in this case is 1.969032. This is quite close to the true slope of 2.
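c) Use the sd() function on beta to get the standard deviation of this histogram, which in this case is 1.642933. The theoretical value is sigma_eps/sqrt(Sxx) = 15/sqrt(82.5) ≈ 1.6514, where Sxx = sum((x - mean(x))^2) = 82.5 for x = 1, ..., 10,
so the obtained value is quite close to the theoretical value.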
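Since parts b) and c) ask to show code, here is a minimal self-contained sketch of both comparisons. The helper sim_slope() and the variable names mean_beta, sd_beta and sd_theory are hypothetical names introduced for illustration. The exact numbers printed depend on the seed and on how the random draws are consumed, so they need not match the values quoted above digit for digit.

```r
# Setup copied from the lecture code, with 5000 trials
n = 10
n.trial = 5000
x = c(1:n)
y_true = 10 + 2*x
sigma_eps = 15

# Hypothetical helper: simulate one data set and return its fitted slope
sim_slope = function(){
  y_obs = y_true + rnorm(n, 0, sigma_eps)
  unname(coef(lm(y_obs ~ x))[2])
}
set.seed(123)
beta = replicate(n.trial, sim_slope())

# b) empirical mean of beta_hat vs. the true slope (2)
mean_beta = mean(beta)
cat("mean(beta) =", mean_beta, "  true slope = 2\n")

# c) empirical sd of beta_hat vs. sigma_eps/sqrt(Sxx)
Sxx = sum((x - mean(x))^2)        # 82.5 for x = 1, ..., 10
sd_theory = sigma_eps/sqrt(Sxx)   # about 1.6514
sd_beta = sd(beta)
cat("sd(beta) =", sd_beta, "  sigma_eps/sqrt(Sxx) =", sd_theory, "\n")
```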
d) Apply qqnorm() to beta and add a reference line with abline().
Observe that the points of the Q-Q plot fall almost exactly on the reference line. Also observe that the middle line drawn with abline() on the histogram lies at its center, and the other two lines lie standard deviations away from the mean on either side, with almost all of the data points between them. Hence, we can say that the slope estimates are approximately normally distributed.
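One way to produce plots like those described in part d) is sketched below. It rests on two assumptions of mine: the Q-Q reference line is drawn with intercept 2 (the true slope) and slope sigma_eps/sqrt(Sxx), because a normal sample's quantiles satisfy quantile = mean + sd * z; and the outer histogram lines are placed at plus and minus 2 standard deviations, since the answer does not say how many were used. sim_slope() is again a hypothetical helper.

```r
# Setup copied from the lecture code, with 5000 trials
n = 10
n.trial = 5000
x = c(1:n)
y_true = 10 + 2*x
sigma_eps = 15

# Hypothetical helper: simulate one data set and return its fitted slope
sim_slope = function(){
  y_obs = y_true + rnorm(n, 0, sigma_eps)
  unname(coef(lm(y_obs ~ x))[2])
}
set.seed(123)
beta = replicate(n.trial, sim_slope())

Sxx = sum((x - mean(x))^2)
sd_theory = sigma_eps/sqrt(Sxx)

# Q-Q plot against the standard normal; qqnorm() returns the plotted points
qq = qqnorm(beta, main = "Normal Q-Q plot of beta_hat")
# Reference line for N(2, sd_theory^2): sample quantile = 2 + sd_theory * z
abline(a = 2, b = sd_theory, col = 2)

# Histogram with vertical lines at the mean and (assumed) +/- 2 sd
hist(beta, breaks = 50, main = "beta_hat", xlab = "beta_hat")
abline(v = mean(beta), col = 2)
abline(v = mean(beta) + c(-2, 2)*sd(beta), col = 4)
```

If the points follow the red line closely over the whole range, that supports both normality and the stated parameters, not just normality with some unspecified mean and standard deviation.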