In this Task you will investigate the sampling distribution and model for the proportion of heads that may show up when a coin is tossed repeatedly. Toss the coins if you want, but it's much easier and faster to do a simulation!
1. Set up your calculator or a random number generator to simulate tossing a coin 25 times. (The easiest way to do this is to generate 0's and 1's with equal probability, with 1 representing heads. By adding up all the 0's and 1's you can effectively count the number of heads. Dividing that count by the number of tosses will give you p̂, the sample proportion of heads.)
2. Run 20 trials, record all the sample proportions, and make a histogram of the results.
3. Repeat your simulation, this time tossing the coin 100 times. Again make a histogram of twenty sample proportions.
4. Compare your two distributions of the proportions of heads observed in your simulations.
5. What should have happened? Describe the sampling model for 100 tosses.
6. Compare the actual distribution of your twenty sample proportions for 100 tosses to what the sampling model predicts.
7. Describe how your results might differ if you had 1000 trials of the simulation instead of only 20.
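Since no preference is mentioned, I will use Excel. Please let me know in the comments below if you need any specific software:
1. I set up the Excel cells as =IF(RAND()>0.5,1,0), so each cell is 1 (heads) or 0 (tails) with equal probability. Summing the 25 cells of a trial gives its number of heads. The results look like this: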
2. The pivot table (frequency table) of the number of heads across the 20 trials of 25 tosses:
Heads in 25 tosses (Sum) | Number of trials |
10 | 1 |
12 | 3 |
13 | 7 |
14 | 3 |
15 | 6 |
Grand Total | 20 |
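For readers who would rather not use Excel, the same 25-toss experiment can be sketched in Python. This is an illustrative sketch, not the spreadsheet used above: the function name `simulate_head_counts` and the seed are assumptions chosen for the example, and a different seed gives different counts.

```python
import random
from collections import Counter

def simulate_head_counts(n_tosses, n_trials, seed=None):
    """Return the number of heads in each of n_trials runs of n_tosses fair tosses."""
    rng = random.Random(seed)
    # Each toss is 0 (tails) or 1 (heads) with equal probability;
    # summing the tosses of one trial counts its heads.
    return [sum(rng.randint(0, 1) for _ in range(n_tosses))
            for _ in range(n_trials)]

# Hypothetical seed, used only so the run is repeatable.
counts = simulate_head_counts(n_tosses=25, n_trials=20, seed=1)
proportions = [c / 25 for c in counts]  # sample proportions p-hat

# Frequency table with a crude text histogram (one star per trial).
for heads, freq in sorted(Counter(counts).items()):
    print(f"{heads:>3} heads  p_hat={heads / 25:.2f}  {'*' * freq}")
```

Calling `simulate_head_counts(100, 20)` gives the data for step 3 in the same way.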
The histogram looks like:
3. The experiment is repeated with 100 tosses per trial, again for 20 trials:
The pivot table:
Heads in 100 tosses (Sum) | Number of trials |
39 | 1 |
40 | 2 |
41 | 1 |
42 | 1 |
43 | 1 |
45 | 1 |
47 | 1 |
48 | 2 |
50 | 5 |
51 | 1 |
52 | 1 |
53 | 2 |
57 | 1 |
The histogram is:
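4. As the number of tosses per sample increases from 25 to 100, the distribution looks more symmetric around its mean. The two tables that follow give the fraction of the 20 trials that produced each head count, first for 25 tosses and then for 100 tosses: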
Heads in 25 tosses (Sum) | Fraction of the 20 trials |
10 | 0.05 |
12 | 0.15 |
13 | 0.35 |
14 | 0.15 |
15 | 0.3 |
Heads in 100 tosses (Sum) | Fraction of the 20 trials |
39 | 0.05 |
40 | 0.1 |
41 | 0.05 |
42 | 0.05 |
43 | 0.05 |
45 | 0.05 |
47 | 0.05 |
48 | 0.1 |
50 | 0.25 |
51 | 0.05 |
52 | 0.05 |
53 | 0.1 |
57 | 0.05 |
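To put numbers on the comparison in step 4, the recorded head counts can be turned into sample proportions and summarized by their mean and standard deviation. The sketch below is an addition for illustration; it uses only the frequencies listed in the tables, and the names `freq_25`, `freq_100`, and `sample_proportions` are made up for the example.

```python
from statistics import mean, stdev

# Frequency tables copied from the answer: {number of heads: number of trials}
freq_25 = {10: 1, 12: 3, 13: 7, 14: 3, 15: 6}
freq_100 = {39: 1, 40: 2, 41: 1, 42: 1, 43: 1, 45: 1, 47: 1,
            48: 2, 50: 5, 51: 1, 52: 1, 53: 2, 57: 1}

def sample_proportions(freq, n_tosses):
    """Expand a frequency table into one sample proportion per trial."""
    return [heads / n_tosses for heads, k in freq.items() for _ in range(k)]

p_25 = sample_proportions(freq_25, 25)
p_100 = sample_proportions(freq_100, 100)

for label, p in (("25 tosses", p_25), ("100 tosses", p_100)):
    print(f"{label}: mean={mean(p):.4f}  sd={stdev(p):.4f}  "
          f"range={min(p):.2f}..{max(p):.2f}")
```

With only 20 trials per experiment these summaries are noisy, so they should be read as a rough check rather than a precise estimate.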
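5. Sampling model for 100 tosses:
The number of heads in 100 tosses is distributed as Binomial(100, 0.5), with mean 100 × 0.5 = 50 and standard deviation sqrt(100 × 0.5 × 0.5) = 5.
For large n the binomial is well approximated by a normal distribution, so the sample proportion p̂ = (number of heads)/100 is approximately normal with mean 0.5 and standard deviation sqrt(0.5 × 0.5 / 100) = 0.05.
The Binomial(100, 0.5) distribution is also symmetric around its mean. As n increases, the variance of the proportion of heads, p(1 − p)/n, decreases.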
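The sampling model can be checked numerically. The sketch below is an addition, not part of the Excel work; it computes the standard deviation of the sample proportion, sqrt(p(1 − p)/n), for a fair coin and a few exact Binomial(100, 0.5) probabilities. The helper names `p_hat_sd` and `binom_pmf` are hypothetical.

```python
from math import comb, sqrt

def p_hat_sd(n, p=0.5):
    """Standard deviation of the sample proportion for n independent tosses."""
    return sqrt(p * (1 - p) / n)

def binom_pmf(k, n, p=0.5):
    """Exact probability of k heads in n tosses."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

for n in (25, 100):
    print(f"n={n}: SD of p_hat = {p_hat_sd(n):.3f}")

# Probability of exactly 50 heads, and of landing within 5 or 10 heads of 50.
p_exactly_50 = binom_pmf(50, 100)
p_within_5 = sum(binom_pmf(k, 100) for k in range(45, 56))
p_within_10 = sum(binom_pmf(k, 100) for k in range(40, 61))
print(f"P(X = 50) = {p_exactly_50:.4f}")
print(f"P(45 <= X <= 55) = {p_within_5:.4f}")
print(f"P(40 <= X <= 60) = {p_within_10:.4f}")
```

The exact probabilities come out a little above the 68% and 95% of the normal rule of thumb because the intervals include both endpoints of a discrete distribution.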
6. The theoretical sampling distribution of the number of heads in 100 tosses has its highest mass at 50, and the probability mass falls off sharply as we move away from 50.
In the experiment, many head counts appear with a frequency of only 1, so the histogram is ragged. Even so, the observations cluster around the mean n/2 = 50, and observations far from n/2 become relatively less likely as the number of tosses increases.
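One rough way to make the comparison in step 6 concrete is to count how many of the twenty recorded results fall within one and two standard deviations of 50 (that is, within 5 and 10 heads) and set that against the roughly 68% and 95% a normal model suggests. The sketch below is an illustrative addition using the frequencies from the 100-toss table; the variable names are made up for the example.

```python
# Frequencies from the 100-toss pivot table: {number of heads: number of trials}
freq_100 = {39: 1, 40: 2, 41: 1, 42: 1, 43: 1, 45: 1, 47: 1,
            48: 2, 50: 5, 51: 1, 52: 1, 53: 2, 57: 1}

n_trials = sum(freq_100.values())
mean_heads, sd_heads = 50, 5  # Binomial(100, 0.5): n*p and sqrt(n*p*(1-p))

def trials_within(k_sd):
    """Number of trials whose head count is within k_sd standard deviations of 50."""
    return sum(count for heads, count in freq_100.items()
               if abs(heads - mean_heads) <= k_sd * sd_heads)

within_1 = trials_within(1)
within_2 = trials_within(2)
print(f"within 1 SD: {within_1}/{n_trials} = {within_1 / n_trials:.0%}")
print(f"within 2 SD: {within_2}/{n_trials} = {within_2 / n_trials:.0%}")
```

For the recorded data this gives 13 of 20 trials (65%) within one standard deviation and 19 of 20 (95%) within two, which is in line with the model given how few trials there are.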
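7. With 1000 trials of the 100-toss simulation instead of 20, the center and spread of the sample proportions would stay about the same (mean near 0.5, standard deviation near 0.05), but the histogram would be much smoother and closer to the bell shape the sampling model predicts. A few values farther from 0.5 would also show up simply because there are more trials. If instead each trial had 1000 tosses, the number of heads would concentrate around 500 with standard deviation sqrt(1000 × 0.5 × 0.5) ≈ 15.8, so about 95% of trials would fall between roughly 468 and 532.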
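The effect of running more trials can be seen by simulating both cases side by side. This is a hedged sketch rather than the method used in the answer: the seed and the helper name `simulate_proportions` are assumptions, and exact values change with the seed.

```python
import random
from statistics import mean, stdev

def simulate_proportions(n_tosses, n_trials, rng):
    """Return one sample proportion of heads per trial of n_tosses fair tosses."""
    return [sum(rng.randint(0, 1) for _ in range(n_tosses)) / n_tosses
            for _ in range(n_trials)]

rng = random.Random(2024)  # hypothetical seed, only for repeatability
few = simulate_proportions(100, 20, rng)     # 20 trials, as in the task
many = simulate_proportions(100, 1000, rng)  # 1000 trials

for label, props in (("20 trials", few), ("1000 trials", many)):
    print(f"{label}: mean={mean(props):.4f}  sd={stdev(props):.4f}  "
          f"min={min(props):.2f}  max={max(props):.2f}")
```

The mean and standard deviation from 1000 trials should sit close to the model values 0.5 and 0.05, while the 20-trial summaries can wander noticeably from run to run.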