5. This problem illustrates an interesting variation of simple random sampling.
a. Open a blank spreadsheet and use the RAND() function to create a column of 1000 random numbers. Don’t freeze them. This is actually a simple random sample from the uniform distribution between 0 and 1. Use the COUNTIF function to count the number of values between 0 and 0.1, between 0.1 and 0.2, and so on. Each such interval should contain about 1/10 of all values. Do they? (Keep pressing the F9 key to see how the results change.)
b. Repeat part a, generating a second column of random numbers, but now generate the first 100 as uniform between 0 and 0.1, the next 100 as uniform between 0.1 and 0.2, and so on, up to 0.9 to 1. (Hint: For example, to create a random number uniformly distributed between 0.5 and 0.6, use the formula =0.5+0.1*RAND(). Do you see why?) Again, use COUNTIF to find the number of the 1000 values in each of the intervals, although there shouldn’t be any surprises this time. Why might this type of random sampling be preferable to the random sampling in part a? (Note: The sampling in part a is called Monte Carlo sampling, whereas the sampling in part b is basically Latin Hypercube sampling, the form of sampling we advocate in Chapters 15 and 16 on simulation.)
Sample 1 is generated below. The formula and output are shown.
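For readers working outside a spreadsheet, part a can be sketched in Python. This is only an illustration, not the spreadsheet solution itself: the function name monte_carlo_counts and the seed argument are assumptions made for the example, rng.random() stands in for RAND(), and the counting loop stands in for COUNTIF.

```python
import random

def monte_carlo_counts(n=1000, bins=10, seed=None):
    """Draw n uniform(0, 1) values and count how many fall in each of
    `bins` equal-width intervals (hypothetical helper for illustration)."""
    rng = random.Random(seed)          # seed=None gives fresh values, like pressing F9
    values = [rng.random() for _ in range(n)]
    counts = [0] * bins
    for v in values:
        # int(v * bins) is the interval index; min() guards the top edge
        counts[min(int(v * bins), bins - 1)] += 1
    return values, counts

values, counts = monte_carlo_counts(seed=1)
print(counts)        # each count should be near 100, but rarely exactly 100
```

Calling the function again with a different seed (or with seed=None) plays the role of pressing F9: the ten counts move around 100 from one sample to the next.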
Sample 2 is generated across 10 columns, one per interval, so that the formula used for each interval is easy to see. You need to stack the 10 columns into a single column.
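The same idea for part b can be sketched in Python, again as an illustration under assumptions rather than the spreadsheet solution: the name stratified_counts and its parameters are hypothetical. Each value is built the way the hint suggests, as the lower end of the interval plus 0.1 times a uniform random number, for example 0.5 + 0.1 * u for the interval from 0.5 to 0.6.

```python
import random

def stratified_counts(per_stratum=100, strata=10, seed=None):
    """Draw `per_stratum` uniform values inside each of `strata` equal-width
    intervals of (0, 1), stacked into one list (hypothetical helper)."""
    rng = random.Random(seed)
    width = 1.0 / strata
    values = []
    for k in range(strata):
        low = k * width                # lower end of interval k, e.g. 0.5
        # same form as the spreadsheet hint: =0.5+0.1*RAND()
        values.extend(low + width * rng.random() for _ in range(per_stratum))
    counts = [0] * strata
    for v in values:
        counts[min(int(v * strata), strata - 1)] += 1
    return values, counts

strat_values, strat_counts = stratified_counts(seed=1)
print(strat_counts)  # every interval holds 100 values by construction
```

Changing the seed changes the individual values but not the counts, which is why there are no surprises this time.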
Why might this type of random sampling be preferable to the random sampling in part a?
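This is called a stratified random sample: every group (here, each interval of width 0.1) appears in the sample in exactly its correct proportion. Each interval is guaranteed to contain 100 of the 1000 values, so the sample always covers the whole range from 0 to 1 evenly instead of over- or under-representing some intervals by chance, as can happen in part a.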
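One way to see the practical benefit is to compare how much the sample mean varies from sample to sample under the two schemes. The sketch below is an assumption-laden illustration, not part of the original answer: the function names, the seed, and the choice of 200 repetitions are all made up for the example.

```python
import random
import statistics

def mc_mean(rng, n=1000):
    """Mean of n plain uniform(0, 1) draws, as in part a."""
    return sum(rng.random() for _ in range(n)) / n

def stratified_mean(rng, per_stratum=100, strata=10):
    """Mean of a sample with per_stratum draws in each interval, as in part b."""
    width = 1.0 / strata
    total = 0.0
    for k in range(strata):
        total += sum(k * width + width * rng.random() for _ in range(per_stratum))
    return total / (per_stratum * strata)

rng = random.Random(2024)   # arbitrary seed, chosen only for repeatability
reps = 200                  # number of repeated samples (an assumption)

mc_means = [mc_mean(rng) for _ in range(reps)]
strat_means = [stratified_mean(rng) for _ in range(reps)]

mc_sd = statistics.stdev(mc_means)
strat_sd = statistics.stdev(strat_means)
print(f"sd of sample mean, part a: {mc_sd:.5f}")
print(f"sd of sample mean, part b: {strat_sd:.5f}")
```

Both schemes give sample means centered on 0.5, but the means from part b should cluster much more tightly around it. For the same 1000 values, the stratified sample gives a more precise estimate.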