In: Statistics and Probability
Obtain results for standard error of the mean, confidence intervals and CLT convergence for random samples drawn from exponential distribution.
Sample Size: 1,2,4,8,16,32,64,128
a. Produce plot of standard error of the mean for exponential distribution (using rexp). Remember that population variance for exponential distribution with rate λ is 1/ λ^2
b. Demonstrate the calculation of confidence interval for a mean of a random sample drawn from exponential distribution – both using confint and mean, sd, qt – make sure the two approaches produce the same answer.
c. Generate plots illustrating CLT convergence to normality for the sum of i.i.d. variables distributed according to the exponential distribution (do not fall into a trap: in contrast to Bernoulli distribution, we do not have a ready-to-use analytical expression for a sum of n exponential random variables, so you will have to sample explicitly and sum up the samples drawn).
(a) R CODE TO PRODUCE PLOT OF STANDARD ERROR OF MEAN OF EXPONENTIAL DISTRIBUTION USING "rexp":
SIMULATION:
means<-vector("numeric")
variance<-vector("numeric")
Std.error<-sqrt(variance)
for (i in 1:nsimulations) #nsimulations- number of simulations you prefer
{
sample=rexp(sample size,rate=lambda)
means=c(means,mean(sample))
variances=c(variances,variance(sample))
Std_error=c(Std_error,sqrt((variances)/n^2))
}
CODE FOR PLOT OF STANDARD ERROR:
hist(Std_error, break=50, main="Histogram of standard error", xlab="standard error of sample")
abline(v=Std.error.theory,lwd=3,col="red") #Stderror.theory is theoretical standard error
abline(v=mean(Std.error),lwd=3,col="blue",lty=2)
legend("topright", c("Theoretical Standard Error","Average Standard error"), bty="n", lwd=3,
col=c("red","blue"), lty=c(1,2))
(b) CONFIDENCE INTERVAL:
USING sd:
simdata=rexp(sample size, rate=lamda)
matrixdata=matrix(simdata,nrow=,ncol=)
means.exp=apply(matrixdata,1,mean)
Mean=mean(means.exp)
se=sd(mean.exp)/sqrt(n)
lower=mean(means.exp) - (1.96*se)
upper=mean(means.exp) + (1.96*se)
c(lower,upper)
USING qt:
means=c()
for (i in 1:nevents) #nevents- no of sample means
{
m=mean(rexp(sample size, rate=lamba))
means=append(means,m)
}
center=mean(means)
sdev=sd(means)
error=qt(0.975, df=length(means)-1)*sdev/sqrt(length(means)
lower=center-error
upper=center+error
CI=c(lower,upper)
(c) GENERATING PLOTS ILLUSTRATING CENTRAL LIMIT THEOREM (CLT):
CLT states that when certain conditions are satisfied, sample means are approximately normally distributed, nomatter what the underlying distribution is.
set.seed(1234)
X=rexp(1000,rate=lambda) # 1000 is number of sample means
hist(X, probability=TRUE, main="Figure 1-1000 Exponential distribution with rate(lambda)",col="grey", xlab="sample means",breaks=100)
#CODE FOR GENERATING DISTRIBUTION OF SAMPLE MEANS
set.seed(112)
mns=NULL
for ( i in 1:1000)
{
mns=c(mns, mean(rexp(sample size, lambda))
hist(mns,probability=TRUE, main="Figure 2-1000 sample means of exponential sample size=preferred samplesize", xlab="Means of sample means", breaks=100, col="grey")
abline(v=mean(mns),col="blue",lty=2,lwd=4)
abline(v=population mean,col="red",lty=2,lwd=3)
lines(density(mns), lwd=3, col="black")
text(3.8,0.45, pos=3.8, "Red line is the \n location of \n the theoretical mean",cex=0.8,col="red")
text(5.5,0.58, pos=4, "Blue line is the \n location of \n the calculated mean",cex=0.8,col="blue")
#CODE FOR TEST OF NORMALITY
qqnorm(mns,main="Figure 3-Normal Q-Q plot, sample size n= ");
qqline(mns,col=2)
For random sample of size n, the following three corollaries hold:
#CODE FOR CALCULATING MEAN AND VARIANCE OF SAMPLE MEAN:
X_n=mean(mns) #Sample mean
v=var(mns) #Sample variance