According to a Statistics Canada study from 2008, about 29% of women aged 18 to 24 engage in heavy drinking. (It is 47% for males.) The standard definition of heavy drinking is consuming 5 or more drinks on one occasion, 12 or more times over the past year. Assuming this to be true, if we choose a woman in this age group at random there is a probability of 0.29 of getting a heavy drinker. We can use software to simulate choosing random samples of women. (In most software, the key phrase to look for is “Binomial.” This is the technical term for independent trials with “Yes/No” outcomes. Our outcomes here are “Heavy drinker” and “Not.”)
a. Simulate 100 draws of 20 women and make a histogram of the percents who are heavy drinkers. Describe the shape, centre, and spread of this distribution.
b. Simulate 100 draws of 500 women and make a histogram of the percents who are heavy drinkers. Describe the shape, centre, and spread of this distribution.
c. In what ways are the distributions in parts (a) and (b) similar? In what ways do they differ? (Be sure to mention shape, location and spread.)
Here I use R to do the simulation.
The probability that a randomly chosen woman is a heavy drinker is 0.29. Each woman is simulated as an independent binomial trial coded 0 or 1, where 0 means "no" (not a heavy drinker) and 1 means "yes" (heavy drinker).
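The 0/1 coding described above is what the "Binomial" keyword in the question refers to. A minimal sketch, assuming R's built-in rbinom() is acceptable in place of drawing each 0/1 outcome explicitly (the names n_draws, n_women, counts, and props are illustrative and are not used in the answer itself):

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
p <- 0.29                        # given probability of a heavy drinker
n_draws <- 100                   # number of simulated samples
n_women <- 20                    # women per sample

# rbinom() returns the number of "1" outcomes (heavy drinkers) in each sample
counts <- rbinom(n_draws, size = n_women, prob = p)
props <- counts / n_women        # convert counts to sample proportions

length(props)                    # one proportion per simulated sample
range(props)                     # smallest and largest simulated proportion
```

Setting n_women to 500 gives the larger samples asked for in part (b).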
a. Simulate 100 draws of 20 women, then calculate the proportion of heavy drinkers in each draw, using the given probability p = 0.29.
p <- 0.29 ## given probability
w1 <- rep(0, 100) ## vector of 100 zeros to hold the sample proportions
for (i in 1:100)
{
a <- sample(0:1, 20, replace = TRUE, prob = c(1 - p, p)) ## draw one sample of 20 women
w1[i] <- sum(a) / 20 ## proportion of heavy drinkers in this sample
}
hist(w1, prob = TRUE, main = "Histogram of heavy-drinker women (n = 20)", xlab = "Proportion of heavy-drinker women", ylab = "Density") ## histogram on the density scale
Comment: The distribution is slightly positively skewed. The mode is near p = 0.3, and the sample proportions range from about 0.1 to 0.5, because beyond 0.5 the height of the histogram is almost 0. These properties are expected for a binomial distribution with moderate n (= 20) and the given p.
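The shape described in this comment can be compared with the theoretical skewness of a binomial distribution. This is a small sketch that assumes the standard formula (1 - 2p) / sqrt(n p (1 - p)); the name skew is illustrative:

```r
p <- 0.29                        # given probability of a heavy drinker
n <- 20                          # women per sample in part (a)

# Theoretical skewness of a Binomial(n, p) count; dividing the count by n
# to get a proportion rescales the values but leaves the skewness unchanged
skew <- (1 - 2 * p) / sqrt(n * p * (1 - p))
round(skew, 2)
```

A small positive value here means a mild right tail, so a histogram from only 100 simulated samples may look close to symmetric.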
b. Simulate 100 draws of 500 women, then calculate the proportion of heavy drinkers in each draw, using the given probability p = 0.29.
w2 <- rep(0, 100) ## vector of 100 zeros to hold the sample proportions
for (i in 1:100)
{
a <- sample(0:1, 500, replace = TRUE, prob = c(0.71, 0.29)) ## draw one sample of 500 women
w2[i] <- sum(a) / 500 ## proportion of heavy drinkers in this sample
}
hist(w2, prob = TRUE, main = "Histogram of heavy-drinker women (n = 500)", xlab = "Proportion of heavy-drinker women", ylab = "Density") ## histogram on the density scale
Comment: The distribution is now approximately symmetric about p = 0.29, and its spread is much smaller. This is expected, since the binomial distribution becomes more symmetric as n increases. The mode is near p = 0.29, and the sample proportions range from about 0.24 to 0.33.
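The spreads read off the two histograms can be checked against theory, which also helps with the comparison asked for in part (c). This is a sketch, assuming the usual result that a sample proportion from n independent trials has mean p and standard deviation sqrt(p (1 - p) / n); the names n_sizes, se, lower, and upper are illustrative:

```r
p <- 0.29                        # given probability of a heavy drinker
n_sizes <- c(20, 500)            # sample sizes from parts (a) and (b)

# Theoretical standard deviation of the sample proportion for each n
se <- sqrt(p * (1 - p) / n_sizes)

# Rough range covering most sample proportions: p plus or minus 2 sd
lower <- p - 2 * se
upper <- p + 2 * se

round(data.frame(n = n_sizes, se = se, lower = lower, upper = upper), 3)
```

Under these assumptions the range is about 0.09 to 0.49 for n = 20 and about 0.25 to 0.33 for n = 500, consistent with the ranges seen in the histograms. Both distributions are centred near 0.29, while the n = 500 distribution is about five times narrower (sqrt(500 / 20) = 5) and closer to symmetric.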