Applying the Central Limit Theorem in R
We can design an experiment to see how the Central Limit Theorem applies practically. Execute the following commands in R:
> mu <- 100
> sigma <- 10
> n <- 5
> numSims <- 500
> xbar <- rep(0,numSims)
mu is the mean and sigma is the standard deviation of the normal distribution we will use. n is the size of each sample, numSims is the number of samples to draw, and xbar is a vector of length numSims that will hold the sample means.
> for (i in 1:numSims)
{xbar[i]=mean(rnorm(n,mean=mu,sd=sigma))}
> hist(xbar,prob=TRUE,breaks=12,xlim=c(70,130),ylim=c(0,0.1))
What is the shape of the histogram that you drew and what distribution does it represent?
How would you expect the graph to change if you changed the values of mu, sigma, n, and numSims?
What do you expect the change to be if you changed the distribution that’s being used? What if instead of rnorm, we used rpois with lambda = mu? Would you expect the graph to change much, given what you know about the Central Limit Theorem?
The R code for generating 500 samples of size n = 5 and plotting the histogram of their means is given below.
mu <- 100
sigma <- 10
n <- 5
numSims <- 500
xbar <- rep(0,numSims)
for (i in 1:numSims)
{
xbar[i]=mean(rnorm(n,mean=mu,sd=sigma))
}
plot(1:1)
dev.new()
hist(xbar,prob=TRUE,breaks=12,col="sky blue", main="Histogram of Means", xlim=c(70,130),ylim=c(0,0.1))
The histogram is bell shaped. It represents the sampling distribution of the mean, which is normal with mean mu = 100 and standard deviation sigma/sqrt(n) = 10/sqrt(5), about 4.47.
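The claim about the mean and spread can be checked directly. The following is a minimal sketch, not part of the original exercise: the seed is an assumption added for reproducibility, replicate() stands in for the explicit loop, and se is a name introduced here for the theoretical standard deviation of the sample mean.

```r
# Sketch: compare the simulated sample means with the theoretical normal curve.
set.seed(1)  # assumed seed, only so that the run is reproducible
mu <- 100
sigma <- 10
n <- 5
numSims <- 500

# Draw numSims samples of size n and keep the mean of each one
xbar <- replicate(numSims, mean(rnorm(n, mean = mu, sd = sigma)))

se <- sigma / sqrt(n)  # theoretical standard deviation of the sample mean
print(c(mean = mean(xbar), sd = sd(xbar), theoretical_sd = se))

# Histogram of the means with the N(mu, se) density drawn on top
hist(xbar, prob = TRUE, breaks = 12, xlim = c(70, 130), ylim = c(0, 0.1),
     main = "Histogram of Means")
curve(dnorm(x, mean = mu, sd = se), add = TRUE, lwd = 2)
```

If the histogram follows the overlaid curve closely, the simulated means behave as the theory predicts.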
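If you change mu, the histogram shifts to the right when mu increases and to the left when it decreases.
If you reduce sigma, the histogram becomes more concentrated around mu. If you increase sigma, the spread of the histogram increases.
If you increase n, the histogram becomes narrower, because the standard deviation of the sample mean is sigma/sqrt(n). For a non-normal parent distribution, a larger n would also bring the shape closer to a true normal distribution.
Increasing numSims does not change the center or the spread of the histogram. It only makes the histogram smoother, because more sample means are plotted.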
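The effect of the sample size can be sketched numerically as follows. This is an illustration under assumptions: the seed, the helper sim_sd, and the sample sizes 5, 20 and 80 are hypothetical choices made here and are not part of the original exercise.

```r
# Sketch: how the spread of the sample means changes with the sample size n.
set.seed(2)  # assumed seed, only so that the run is reproducible
mu <- 100
sigma <- 10
numSims <- 500

# Hypothetical helper: standard deviation of numSims sample means for a given n
sim_sd <- function(n) {
  xbar <- replicate(numSims, mean(rnorm(n, mean = mu, sd = sigma)))
  sd(xbar)
}

sizes <- c(5, 20, 80)  # illustrative sample sizes
results <- data.frame(
  n = sizes,
  simulated_sd = sapply(sizes, sim_sd),
  theoretical_sd = sigma / sqrt(sizes)
)
print(results)
```

Each fourfold increase in n should roughly halve the simulated standard deviation, in line with sigma/sqrt(n).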
If you change the distribution to Poisson with lambda = mu, the R code is as follows (note that numSims is set to 5000 here).
mu <- 100
sigma <- 10
n <- 5
numSims <- 5000
xbar <- rep(0,numSims)
for (i in 1:numSims)
{
xbar[i]=mean(rpois(n,lambda=mu))
}
plot(1:1)
dev.new()
hist(xbar,prob=TRUE,breaks=12,col="sky blue", main="Histogram of Means", xlim=c(70,130),ylim=c(0,0.1))
[Figure: histogram of the Poisson sample means]
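There is not much change in the graph, which is what the CLT predicts. According to the CLT, for any distribution with finite variance, the distribution of the sample means is approximately normal when the sample size is large enough. In this case the match is especially close because a Poisson distribution with lambda = 100 has standard deviation sqrt(100) = 10, the same as sigma, and is itself already close to normal.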
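A minimal sketch for checking the Poisson case numerically is given here. It assumes a fixed seed (not in the original), uses replicate() in place of the explicit loop, and introduces the names xbar_pois and se_pois for illustration.

```r
# Sketch: sample means of Poisson data compared with the matching normal curve.
set.seed(3)  # assumed seed, only so that the run is reproducible
mu <- 100
n <- 5
numSims <- 5000

# Draw numSims Poisson samples of size n and keep the mean of each one
xbar_pois <- replicate(numSims, mean(rpois(n, lambda = mu)))

# A Poisson variance equals lambda, so the sample mean has sd sqrt(lambda / n)
se_pois <- sqrt(mu / n)
print(c(mean = mean(xbar_pois), sd = sd(xbar_pois), theoretical_sd = se_pois))

hist(xbar_pois, prob = TRUE, breaks = 12, xlim = c(70, 130), ylim = c(0, 0.1),
     main = "Histogram of Means (Poisson)")
curve(dnorm(x, mean = mu, sd = se_pois), add = TRUE, lwd = 2)
```

The theoretical standard deviation here equals the value for the normal case, 10/sqrt(5), which is consistent with the two histograms looking alike.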