Q3. (This question is based in R.) Use the simulation ("X = rnorm(1000, mean = 10, sd = 2)", "Y = rnorm(1000, mean = 5, sd = 3)") to estimate the distribution of X+Y and create confidence intervals.

A) Form a set of Xs and Ys by repeating the individual experiment B = 2000 times, where each experiment has n = 1000 samples. You may want to write a for loop and create two matrices, "sample_X" and "sample_Y", to save those values.

B) Calculate the mean of X+Y for each experiment, save the means to a vector of length B, and plot a histogram of these means.

C) Now that we have a simulated sampling distribution of the mean of X+Y, calculate the 95% confidence interval for the mean of X+Y (this can be done empirically).

D) In the above example, the sample size n and the number of experiments B were fixed. Next, we want to change B and n and see how the confidence interval changes. Write a function that creates confidence intervals for any B and n.

E) Suppose the sample size n varies (100, 200, 300, ..., 1000) with B fixed at 2000, and the number of experiments B varies (1000, 2000, ..., 10000) with n fixed at 500. Plot your confidence intervals to compare the effect of changing the sample size n with the effect of changing the number of simulation replications B. What do you conclude? (Hint: see the function errbar() in the Hmisc package for the plot: library(Hmisc).)

[Figures: confidence intervals with n fixed and B varying; with B fixed and n varying]
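One possible way to approach parts A to E is sketched below. It is a minimal sketch, not a reference solution: the helper names sim_means(), sim_ci() and plot_ci() and the seed are assumptions made for illustration. It fills the matrices with a single rnorm() call each instead of the for loop the question suggests, and it draws the intervals with base-graphics segments() rather than Hmisc::errbar(), so that it runs without extra packages. The "empirical" interval is taken to mean the 2.5% and 97.5% quantiles of the B simulated means.

```r
# Assumption: the seed is arbitrary and only makes the run reproducible.
set.seed(1)

# Parts A and B: B experiments of n draws each; one column per experiment.
# Returns the vector of B sample means of X + Y.
sim_means <- function(B, n) {
  sample_X <- matrix(rnorm(n * B, mean = 10, sd = 2), nrow = n, ncol = B)
  sample_Y <- matrix(rnorm(n * B, mean = 5, sd = 3), nrow = n, ncol = B)
  colMeans(sample_X + sample_Y)
}

# Parts C and D: empirical confidence interval for the mean of X + Y,
# taken as quantiles of the simulated means, for any B and n.
sim_ci <- function(B, n, level = 0.95) {
  alpha <- 1 - level
  q <- quantile(sim_means(B, n), probs = c(alpha / 2, 1 - alpha / 2),
                names = FALSE)
  setNames(q, c("lower", "upper"))
}

# Fixed setting from parts A to C: B = 2000, n = 1000.
means <- sim_means(B = 2000, n = 1000)
hist(means, breaks = 40, main = "Simulated means of X + Y",
     xlab = "mean of X + Y")
ci <- setNames(quantile(means, probs = c(0.025, 0.975), names = FALSE),
               c("lower", "upper"))
print(ci)

# Part E: vary n with B fixed at 2000, then vary B with n fixed at 500.
n_grid <- seq(100, 1000, by = 100)
B_grid <- seq(1000, 10000, by = 1000)
ci_n <- t(sapply(n_grid, function(n) sim_ci(B = 2000, n = n)))
ci_B <- t(sapply(B_grid, function(B) sim_ci(B = B, n = 500)))

# Draw each interval as a vertical segment around its midpoint.
# The dashed line marks the true mean, 10 + 5 = 15.
plot_ci <- function(x, ci_mat, xlab, main) {
  plot(x, rowMeans(ci_mat), ylim = range(ci_mat), pch = 19,
       xlab = xlab, ylab = "95% CI for mean of X + Y", main = main)
  segments(x, ci_mat[, "lower"], x, ci_mat[, "upper"])
  abline(h = 15, lty = 2)
}

par(mfrow = c(1, 2))
plot_ci(n_grid, ci_n, xlab = "n", main = "fix B, n varies")
plot_ci(B_grid, ci_B, xlab = "B", main = "fix n, B varies")
```

Under these assumptions, the interval should narrow as n grows, because the standard error of the mean of X+Y is sqrt((4 + 9) / n). Changing B should leave the width roughly unchanged and only make the estimated endpoints more stable from run to run.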