Question

Perform all the steps for ALL PARTS (a - d) in R code and show and explain the results. Thank you.

Problem 2.23 Consider the simple linear regression model y = 50 + 10x + ε, where ε is NID(0, 16), i.e. normally and independently distributed with mean 0 and variance 16. Suppose that n = 20 pairs of observations are used to fit this model. Generate 500 samples of 20 observations, drawing one observation at each level of x = 0.5, 1, 1.5, 2, ..., 10 for each sample [i.e. x runs from 0.5 to 10 in steps of 0.5, giving n = 20 observations; a for loop will probably be needed to repeat this 500 times].

# (a) For EACH sample, compute the least-squares regression estimates of the slope, (Beta1) and intercept, (Beta0). Construct HISTOGRAMS of the sample values of Beta0hat and Beta1hat. Discuss the shape of these histograms.

# (b) For each sample, compute an estimate of E(y|x=5). Construct a histogram of the estimates you obtained. Discuss the shape of the histogram.

# (c) For each sample, compute a 95% CI (i.e. alpha = 0.05; alpha/2 = 0.025) on the slope, Beta1. How many of these intervals contain the true value, Beta1 = 10? Is this what you would expect? Why or why not?

# (d) For each estimate of E(y|x=5) in part (b), compute the 95% CI. How many of these intervals contain the true value of E(y|x=5) = 100 (i.e. 50 + 10(5), from the true model)? Is this what you would expect?

Solutions

Expert Solution

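ii) Estimate of E(y|x=5)

The estimated model is E(y|x) = beta0hat + beta1hat*x, so for each sample the estimate of E(y|x=5) is beta0hat + beta1hat*5. The code below also produces the histograms of beta0hat and beta1hat asked for in part (a).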

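R code:

n <- 20
x1 <- seq(0.5, 10, 0.5)
x0 <- rep(1, 20)
x <- as.matrix(data.frame(x0, x1))        ## design matrix: column of 1s and the x levels
beta <- as.matrix(c(50, 10))              ## true intercept and slope
result <- matrix(0, nrow = 500, ncol = 2) ## one row (beta0hat, beta1hat) per sample
for (i in 1:500)
{
  y <- drop(x %*% beta) + rnorm(20, 0, 4) ## var=16 and sd=4; drop() makes y a plain vector
  paraest <- lm(y ~ x1)$coefficients
  result[i, ] <- paraest
}
hist(result[, 1], main = "Histogram of beta0hat")
hist(result[, 2], main = "Histogram of beta1hat")

### computation of E(y|x=5)
est <- result %*% as.matrix(c(1, 5))      ## beta0hat + 5*beta1hat for each sample
hist(est, main = "Histogram of estimated E(y|x=5)")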
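On the shape of the histograms: because each estimate is a linear combination of the normal errors, all three histograms should look roughly bell-shaped and be centred near the true values 50, 10 and 100. A minimal sketch of how this could be checked numerically is given here. It re-runs the simulation and compares the simulated standard deviations with the textbook formulas, which give about 1.86 for beta0hat, 0.31 for beta1hat and 0.90 for the estimate of E(y|x=5). The names sims, est5, sd_theory and sd_sim, and the call to set.seed, are assumptions added for illustration and are not part of the original answer.

```r
# Sketch: compare the simulated spread of the estimates with theory.
set.seed(1)                          # assumption: seed added only for reproducibility
x1    <- seq(0.5, 10, by = 0.5)
n     <- length(x1)
sigma <- 4                           # sd of the errors (variance 16)
Sxx   <- sum((x1 - mean(x1))^2)      # corrected sum of squares of x

# Theoretical standard deviations of the three estimators
sd_theory <- c(
  beta0hat = sigma * sqrt(1 / n + mean(x1)^2 / Sxx),
  beta1hat = sigma / sqrt(Sxx),
  yhat5    = sigma * sqrt(1 / n + (5 - mean(x1))^2 / Sxx)
)

# 500 simulated samples; each row of sims holds (beta0hat, beta1hat)
sims <- t(replicate(500, {
  y <- 50 + 10 * x1 + rnorm(n, mean = 0, sd = sigma)
  coef(lm(y ~ x1))
}))
est5 <- drop(sims %*% c(1, 5))       # beta0hat + 5 * beta1hat for each sample

sd_sim <- c(beta0hat = sd(sims[, 1]),
            beta1hat = sd(sims[, 2]),
            yhat5    = sd(est5))

# Side-by-side comparison, then the simulated means
print(round(rbind(theory = sd_theory, simulated = sd_sim), 3))
print(round(c(beta0hat = mean(sims[, 1]),
              beta1hat = mean(sims[, 2]),
              yhat5    = mean(est5)), 3))
```

If the simulated means and standard deviations sit close to the theoretical ones, that supports describing the histograms as approximately normal and centred at the true parameter values.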

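iii) 95% CI for beta1

n <- 20
x1 <- seq(0.5, 10, 0.5)
x0 <- rep(1, 20)
x <- as.matrix(data.frame(x0, x1))
beta <- as.matrix(c(50, 10))
ci <- matrix(0, nrow = 500, ncol = 2)     ## lower and upper limit for each sample
z <- numeric(500)                         ## 1 if the interval contains 10, else 0
for (i in 1:500)
{
  y <- drop(x %*% beta) + rnorm(20, 0, 4) ## var=16 and sd=4; drop() makes y a plain vector
  paraest <- lm(y ~ x1)
  ci[i, ] <- confint(paraest, "x1", level = 0.95)
  z[i] <- ifelse(ci[i, 1] < 10 && ci[i, 2] > 10, 1, 0)  ## both limits must bracket 10
}

sum(z)    ## number of intervals that contain beta1 = 10
mean(z)   ## proportion of intervals that contain beta1 = 10

Answer: The run reported here gave 0.972, but that figure was produced by a condition that checked only the lower limit (ci[i,1] < 10), which estimates a one-sided coverage of about 0.975. With both limits checked, as above, the expected coverage probability is 0.95, i.e. roughly 475 of the 500 intervals should contain 10. This is what we would expect from a 95% CI; the exact count varies from run to run because no seed is fixed.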
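Part (d) is not worked in the answer above. A minimal sketch of one way to do it is given here, assuming that predict() with interval = "confidence" is an acceptable way to get the 95% CI for the mean response at x = 5. The names new_x, true_mean, ci_mean and covered, and the call to set.seed, are assumptions added for illustration.

```r
# Sketch for part (d): 95% CI for E(y | x = 5) in each of 500 samples.
set.seed(2)                          # assumption: seed added only for reproducibility
x1        <- seq(0.5, 10, by = 0.5)
n         <- length(x1)
new_x     <- data.frame(x1 = 5)      # point at which the mean response is estimated
true_mean <- 50 + 10 * 5             # true E(y | x = 5) = 100

# Each row of ci_mean holds the lower and upper confidence limits for one sample
ci_mean <- t(replicate(500, {
  y   <- 50 + 10 * x1 + rnorm(n, mean = 0, sd = 4)
  fit <- lm(y ~ x1)
  predict(fit, newdata = new_x, interval = "confidence",
          level = 0.95)[1, c("lwr", "upr")]
}))

# An interval covers the truth only if both limits bracket 100
covered <- ci_mean[, "lwr"] < true_mean & ci_mean[, "upr"] > true_mean

sum(covered)    # number of intervals containing 100
mean(covered)   # proportion of intervals containing 100
```

As with the slope, about 95% of the intervals (around 475 of 500) should contain the true mean of 100, since each is a 95% confidence interval; the exact count depends on the random draws.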

