R-question
Set the random number generation seed to the value 1234. Draw a sample of size 11 from Exp(λ = 0.097) and find the mean get-better time in this sample. Repeat this process for a total of 10000 get-better averages and store these values in the variable gbas. Typically, we would make a histogram of the values, but R has a nice function that will draw a smooth representation of the histogram: plot(density(gbas)). Plot this. We now know that the normal model is not the best fit for this sampling distribution. To convince yourself of this, figure out how to draw a normal distribution (in red) atop the existing plot. You should be able to figure out what the mean and spread of the normal distribution would be if H0 is assumed to be true. Include your code and a sketch of the graph.
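The mean of Exp(0.097) is 1/λ = 1/0.097 ≈ 10.31, and
the standard deviation is also 1/λ = 1/0.097 ≈ 10.31.
The mean of the sampling distribution of the mean is 1/λ ≈ 10.31, and
the standard error of the mean for a sample of size n = 11 is σ/√n = 10.31/√11 ≈ 3.11.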
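Using the central limit theorem, we know that the sampling distribution of the mean for a sample of size n tends towards a normal distribution as the sample size increases. But for n = 11, a normal distribution with mean 1/λ ≈ 10.31 and standard deviation 1/(λ√11) ≈ 3.11 is not yet a good fit, because the exponential distribution is strongly right-skewed and 11 observations are too few to average that skew away.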
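As a side note, the amount of skew left at n = 11 can be worked out exactly. The derivation below is a sketch that relies on a standard result not stated in the answer above: the sum of n independent Exp(λ) variables follows a Gamma distribution with shape n and rate λ, so the sample mean is Gamma with shape n and rate nλ.

```latex
\begin{align*}
X_1,\dots,X_n \overset{\text{iid}}{\sim} \mathrm{Exp}(\lambda)
  &\;\Longrightarrow\; \sum_{i=1}^{n} X_i \sim \mathrm{Gamma}(\text{shape}=n,\ \text{rate}=\lambda) \\
  &\;\Longrightarrow\; \bar{X} \sim \mathrm{Gamma}(\text{shape}=n,\ \text{rate}=n\lambda), \\
E[\bar{X}] &= \frac{n}{n\lambda} = \frac{1}{\lambda} \approx 10.31, \\
\mathrm{SD}(\bar{X}) &= \frac{\sqrt{n}}{n\lambda} = \frac{1}{\lambda\sqrt{n}} \approx 3.11, \\
\mathrm{skewness}(\bar{X}) &= \frac{2}{\sqrt{n}} = \frac{2}{\sqrt{11}} \approx 0.60 .
\end{align*}
```

The mean and standard error agree with the values used for the normal curve, but a normal distribution has skewness 0, so a skewness of about 0.60 is what the normal model misses.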
The R code (all statements starting with # are comments)
#set the random number generation seed, as the question asks
set.seed(1234)
#set the sample size
n<-11
#set the total number of samples
r<-10000
#set the value of lambda
lambda<-0.097
#Draw n*r random numbers from Exp(lambda)
x<-rexp(n*r,lambda)
#transform x to a matrix of n rows and r columns
x<-matrix(x,nrow=n,ncol=r)
#get r averages for samples of size n and store in gbas (gbas has r averages)
gbas<-apply(x,2,mean)
#plot a smooth density estimate (a smoothed histogram) of the sample means
plot(density(gbas),xlab="Sample Mean",main="Density of sample means")
#hypothesized mean (mean of Exp(0.097))
mu<-1/lambda
#standard deviation of Exp(0.097)
sigma<-1/lambda
#hypothesized standard error of mean
se<-sigma/sqrt(n)
#fit a normal pdf on to the plot
#evaluate the normal pdf over the range of the sample means (gbas), not the raw draws in x
xfit<-seq(min(gbas),max(gbas),length=1000)
lines(xfit,dnorm(xfit,mean=mu,sd=se),col="red",lty=2)
legend("topright",c("sample means","normal pdf"),col=c("black","red"),lty=c(1,2))
#this produces a plot with the density of the sample means (solid black) and the normal pdf (dashed red)
We can see that the normal model is not the best fit for this sampling distribution.
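If you want a numeric check to go with the picture, something like the sketch below may help. It is not part of the original answer: it reruns the same simulation, then computes the sample skewness and the proportion of sample means lying more than two standard errors from the hypothesized mean on each side. The variable names skew, upper_sim, lower_sim and tail_norm are just illustrative choices.

```r
# A minimal sketch, assuming the same setup as the answer above
set.seed(1234)
n <- 11          # sample size
r <- 10000       # number of samples
lambda <- 0.097  # rate of the exponential distribution

# r sample means, each computed from n draws of Exp(lambda)
gbas <- colMeans(matrix(rexp(n * r, rate = lambda), nrow = n, ncol = r))

mu <- 1 / lambda    # hypothesized mean
se <- mu / sqrt(n)  # hypothesized standard error (sigma = 1/lambda)

# sample skewness of the simulated means; a normal distribution has skewness 0
z <- (gbas - mean(gbas)) / sd(gbas)
skew <- mean(z^3)

# share of simulated means beyond 2 standard errors in each tail,
# next to the share a normal model predicts for either tail
upper_sim <- mean(gbas > mu + 2 * se)
lower_sim <- mean(gbas < mu - 2 * se)
tail_norm <- pnorm(-2)

print(c(skewness = skew, upper = upper_sim, lower = lower_sim, normal_tail = tail_norm))
```

Under a normal model both tail proportions would be close to pnorm(-2) and the skewness close to 0. A clearly positive skewness together with an upper tail heavier than the lower one would back up what the plot suggests.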