In: Math
Describe the algorithm to generate random numbers from an arbitrary discrete distribution with finite number of outcomes.
Ans.
One of the best algorithms for sampling from a discrete distribution is the alias method.
The alias method (efficiently) precomputes a two-dimensional data structure to partition a rectangle into areas proportional to the probabilities.seebin fig1
In this schematic from the referenced site, a rectangle of unit height has been partitioned into four kinds of regions--as differentiated by color--in the proportions 1/21/2, 1/31/3, 1/121/12, and 1/121/12, in order to sample repeatedly from a discrete distribution with these probabilities. The vertical strips have a constant (unit) width. Each is divided into just one or two pieces. The identities of the pieces and the locations of the vertical divisions are stored in tables accessible via the column index.
The table can be sampled in two simple steps (one for each coordinate) requiring generating just two independent uniform values and O(1)O(1) calculation. This improves on the O(log(n))O(log(n)) computation needed to invert the discrete CDF as described in other replies here.