Formula: =IF(RAND()>0.5,T.INV(RAND(),10)-2,T.INV(RAND(),10)+2)
observation | sample 1 |
1 | 1.37700278 |
2 | 1.827378045 |
3 | 3.479013387 |
4 | 1.382604626 |
5 | 2.572039451 |
6 | 2.38234939 |
7 | 0.240414349 |
8 | -1.347432349 |
9 | 2.85777933 |
10 | -3.379978992 |
11 | -2.746482213 |
12 | 1.886442756 |
13 | -1.947527669 |
14 | 1.540754548 |
15 | -0.233174876 |
16 | -1.104079702 |
17 | -1.226712691 |
18 | 3.300631732 |
19 | 0.940368484 |
20 | -1.845113569 |
21 | -1.250733918 |
22 | -1.392547733 |
23 | 2.478557615 |
24 | 0.823135564 |
25 | 1.630991977 |
sample mean | 0.489827213 |
Use the Excel spreadsheet to simulate 1000 samples of size 25 by copying cells C3 through C27 and pasting them into rows 3 through 27 of the adjacent columns. For each sample, calculate the sample mean; row 28 then holds a sample of sample means. If you copy these and paste them into the column “sample of sample means” (starting with cell C31, using Paste Special with the “values” and “transpose” options), the histogram counts will be produced automatically. Plug these counts into a bar chart to get a histogram. Submit only your histogram. Then create a second histogram using only observations 1 through 8 of each sample (instead of all 25 observations). How do the two histograms differ?
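The spreadsheet procedure above can also be sketched in Python. This is only an illustration, not the assignment's required method: the names `draw_observation`, `simulate_means`, and `histogram_counts` are made up here, the bin range of -3 to 3 in steps of 0.5 is an assumption (the spreadsheet's actual bins are not given), and the t variate is generated as a normal divided by a scaled chi-square root rather than through an inverse CDF as `T.INV(RAND(),10)` does. The two approaches should give the same distribution.

```python
import math
import random
import statistics

DF = 10       # degrees of freedom, as in T.INV(RAND(), 10)
SHIFT = 2.0   # the formula subtracts or adds 2 with equal probability


def draw_observation(rng):
    """One draw with the same distribution as the spreadsheet formula (assumed)."""
    # t with DF degrees of freedom = Z / sqrt(chi2 / DF), where
    # chi2 with DF degrees of freedom = Gamma(shape=DF/2, scale=2).
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(DF / 2, 2.0)
    t = z / math.sqrt(chi2 / DF)
    return t - SHIFT if rng.random() > 0.5 else t + SHIFT


def simulate_means(n_samples=1000, sample_size=25, first_k=8, seed=0):
    """Return (means of all observations, means of the first first_k) per sample."""
    rng = random.Random(seed)
    means_all, means_first = [], []
    for _ in range(n_samples):
        sample = [draw_observation(rng) for _ in range(sample_size)]
        means_all.append(statistics.fmean(sample))
        means_first.append(statistics.fmean(sample[:first_k]))
    return means_all, means_first


def histogram_counts(values, lo=-3.0, hi=3.0, width=0.5):
    """Count values per bin of the given width; out-of-range values go in the end bins."""
    n_bins = int(round((hi - lo) / width))
    counts = [0] * n_bins
    for v in values:
        i = int((v - lo) // width)
        counts[min(max(i, 0), n_bins - 1)] += 1
    return counts


if __name__ == "__main__":
    means_25, means_8 = simulate_means()
    print("n=25 counts:", histogram_counts(means_25))
    print("n=8  counts:", histogram_counts(means_8))
```

Printing the two lists of counts side by side gives the same information as the two bar charts; the exact counts depend on the random seed.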
The image on the left is the histogram using observations 1 through 8 of each sample, while the image on the right is the histogram using all 25 observations.
Since the data are completely random, the two histograms do not match bar for bar; in both, the sample means are spread between about -1.5 and 1. In every sample (columns 1 to 1000), observations 1 through 8 are simply a smaller random sample (n = 8) from the same population that produced all 25 observations. By the central limit theorem, the mean of the sampling distribution of the sample mean is almost equal to the population mean, whatever the sample size. So both histograms should be centered at about the same value, which is 0 for this formula because the -2 and +2 shifts are equally likely. What changes with sample size is the spread: the standard error of the mean is σ/√n, so means based on 8 observations should vary more (a wider, flatter histogram) than means based on all 25 observations (a narrower, more bell-shaped histogram).
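To put rough numbers on how sample size affects the spread of the sample means, the sketch below computes the theoretical standard error for n = 8 and n = 25. It assumes the two RAND() calls in the formula are independent, so the population variance is the variance of a t distribution with 10 degrees of freedom, 10/(10-2), plus the variance of the ±2 shift, 2². The function names are illustrative only.

```python
import math

DF = 10       # degrees of freedom in T.INV(RAND(), 10)
SHIFT = 2.0   # size of the +/- shift in the formula


def population_sd(df=DF, shift=SHIFT):
    """Standard deviation of one observation, assuming an independent +/-shift."""
    # Var(t_df) = df / (df - 2) for df > 2; a fair +/-shift adds shift**2.
    return math.sqrt(df / (df - 2) + shift ** 2)


def standard_error(n, sd=None):
    """Standard error of the mean of n independent observations."""
    sd = population_sd() if sd is None else sd
    return sd / math.sqrt(n)


for n in (8, 25):
    print(f"n={n}: standard error of the mean ~ {standard_error(n):.3f}")
# prints:
# n=8: standard error of the mean ~ 0.810
# n=25: standard error of the mean ~ 0.458
```

Under these assumptions the n = 8 sample means are about 1.77 times (√(25/8)) as spread out as the n = 25 sample means, while both are centered at 0.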