Please do these questions in Excel!
Generate a set of 1000 pairs of standard uniform random values U1 and U2. Then perform the following algorithm for each of these 1000 pairs. Let the output of this algorithm be denoted by Y.
Step 1: Generate random values X1 = −LN(U1) and X2 = −LN(U2).
Step 2: Calculate D = (X1 − 1)^2/2. If X2 ≥ D, then generate a random number U. If U > 0.5, accept X1 as Y (that is, let Y = X1); otherwise, if U ≤ 0.5, accept −X1 as Y (that is, let Y = −X1). If X2 < D, no result is obtained, and the algorithm returns to Step 1. This means that the algorithm skips the pair U1 and U2 for which X2 < D without generating any result and moves to the next pair U1 and U2.
After repeating the above algorithm 1000 times, a number N of the Y values will be generated. Obviously N ≤ 1000, since there will be instances when a pair U1 and U2 does not generate any result, and consequently that pair is wasted. Investigate the probability distribution of Y by doing the following:
1. Create a relative frequency histogram of Y.
2. Select a probability distribution that, in your judgement, is the best fit for Y.
3. Support your assertion above by creating a probability plot for Y.
4. Support your assertion above by performing a Chi-squared test of best fit with a 0.05 level of significance.
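As a cross-check on the spreadsheet, the algorithm described above can be sketched in Python. This is a minimal sketch, not the Excel solution itself: it assumes the acceptance test compares the second exponential value with (x1 − 1)^2/2 and that the sign is chosen with a 0.5 threshold. The function name generate_y and its seed argument are hypothetical, introduced only for illustration.

```python
import math
import random


def generate_y(n_pairs=1000, seed=None):
    """Run the two-step algorithm on n_pairs uniform pairs; return the accepted Y values."""
    rng = random.Random(seed)
    ys = []
    for _ in range(n_pairs):
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        u1 = 1.0 - rng.random()
        u2 = 1.0 - rng.random()
        # Step 1: turn the uniform values into exponential values.
        x1 = -math.log(u1)
        x2 = -math.log(u2)
        # Step 2: acceptance test (constants 1 and 2 are assumed, see the lead-in).
        d = (x1 - 1.0) ** 2 / 2.0
        if x2 >= d:
            # Accepted pair: choose the sign of Y with a third uniform value.
            u = rng.random()
            ys.append(x1 if u > 0.5 else -x1)
        # Otherwise the pair is skipped ("No Result" in the sheet).
    return ys


ys = generate_y(1000, seed=1)
print(len(ys), "values accepted out of 1000 pairs")
```

The number of accepted values N changes from run to run. Under the assumptions above, the theoretical acceptance rate is sqrt(pi / (2e)), roughly 76% of the pairs.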
Steps:
1. Open an Excel workbook to generate the observations.
The following image shows the required columns with the respective formulas used to create them.
Note that the "=RAND()" function generates new values each time the sheet is recalculated, so it is advisable to copy the generated values and paste them back as values (Paste Special > Values) after generating them for the first time.
The "I" values in the formula below correspond to the respective row number of the 1000 test cases.
2. Now we need to separate the accepted values of the random variable Y into another sheet. This can be done by copying the Y values into another sheet and excluding the rows that say "No Result".
3. To determine the relative frequencies, we first need to choose the bins (subgroups) for the Y values. This is a matter of judgement; a quick line plot will show the data range. The next step is to count the values that fall within each bin and divide each count by the total number of Y values, N. This can be done using Excel functions.
4. The histogram can be inserted using the "Insert" tab at the top.
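The binning in step 3 can also be sketched outside Excel. The following is a minimal Python sketch under stated assumptions: the bin range of −4 to 4 and the bin width of 0.5 are illustrative choices, not the values used in the sheet, and relative_frequencies is a hypothetical helper name.

```python
import math


def relative_frequencies(values, lower=-4.0, upper=4.0, width=0.5):
    """Return (bin_start, bin_end, relative_frequency) for equal-width bins."""
    if not values:
        raise ValueError("values must not be empty")
    n_bins = int(round((upper - lower) / width))
    counts = [0] * n_bins
    for v in values:
        # Bins are [start, end); Excel's FREQUENCY includes the upper edge instead.
        k = math.floor((v - lower) / width)
        # Values outside [lower, upper) are counted in the nearest end bin.
        k = min(max(k, 0), n_bins - 1)
        counts[k] += 1
    total = len(values)
    return [
        (lower + k * width, lower + (k + 1) * width, count / total)
        for k, count in enumerate(counts)
    ]


# Hypothetical sample values, for illustration only.
sample = [-1.3, -0.4, -0.2, 0.1, 0.3, 0.6, 1.2]
for start, end, freq in relative_frequencies(sample):
    if freq > 0:
        print(f"[{start:5.1f}, {end:5.1f}): {freq:.3f}")
```

Because each count is divided by the total, the relative frequencies sum to 1, which is a quick sanity check on the spreadsheet column as well.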
The formulas used are as follows:
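2. From the above it is apparent that the distribution of Y very closely resembles a standard normal distribution. This is expected: the algorithm is an acceptance-rejection scheme that draws the magnitude of a standard normal variable from an exponential proposal and then attaches a random sign.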
3. The probability plot of "Y" is as follows:
(Here "i" and "j" denote the starting and ending rows of the data.)
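The coordinates of a normal probability plot can be sketched in Python as follows. This is a minimal sketch, not the spreadsheet's formula: the plotting position (rank − 0.5)/n is an assumed convention (others exist), and normal_probability_points is a hypothetical helper name. If Y is approximately standard normal, the points should fall close to the line y = z.

```python
from statistics import NormalDist


def normal_probability_points(values):
    """Pair each sorted value with its theoretical standard normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for rank, y in enumerate(ordered, start=1):
        # Assumed plotting position; keeps p strictly between 0 and 1.
        p = (rank - 0.5) / n
        z = std_normal.inv_cdf(p)
        points.append((z, y))
    return points


# Hypothetical sample values, for illustration only.
sample_y = [0.5, -1.0, 0.0, 1.4, -0.3]
for z, y in normal_probability_points(sample_y):
    print(f"z = {z:7.3f}   y = {y:6.2f}")
```

Plotting y against z as a scatter chart gives the probability plot. Systematic curvature away from a straight line would suggest that the normal distribution is not a good fit.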