Mersenne Twister(MT) Pseudo Random Number Generator (PRNG) is one of the common PRNG. With this post I am sharing a C++ AMP implementation of Mersenne Twister based on a pseudorandom number generator algorithm developed by Makoto Matsumoto and Takuji Nishimura. Like other PRNG, MT is iterative and is hard to parallelize a single twister update step among several execution threads. To solve this problem and to enable efficient implementation of Mersenne twister on parallel architectures, Makoto Matsumoto and Takuji Nishimura developed an offline library for dynamic creation of MT parameters.
Below is the explanation of my C++ AMP sample ported from the above reference.
Here the Mersenne Twister parameters, generated by an offline tool, are loaded from .dat file. These parameters are copied to a vector and the generate_rand_on_amp function is called to generate a bunch of floating point random numbers. Eventually the random numbers are dumped to a text file.
This function is called from the main with the twister parameters and an output container. These inputs and output are encapsulated using array objects and dispatched to run on the accelerator. If needed (controlled by the apply_transform parameter) a Box-Muller transform is applied to the generated random numbers. Before returning, the calculated data is copied to host memory.
The core logic is best described in the research paper mentioned above. I will briefly mention the mapping of key parameters (in italics) which should help you understand the relationship of the code to the paper. A Mersenne twister is completely defined by 11 parameters, only the bit-vector parameters vary on a per-thread basis for the same period and seed : matrix_a - the lower row of matrix A ; mask_b, mask_c - tempering masks of x •T transformation. The rest of the (integer) parameters are shared among all the threads, and are inline in the source.
Uniformly distributed random number sequences, produced by Mersenne twisters or any other RNG, can be transformed into normal distribution with the Box-Muller transformation, which is implemented as a separate kernel. box_muller_kernel is called from generate_rand_on_amp. In this function, each thread picks two samples for transformation and the returned data is saved to the GPU memory. box_muller_transform function is the core of the Box-Muller transform, given two samples from the uniform distribution in the interval (0, 1] maps to normally distributed samples with standard deviation 1.
Please download the attached project of the Mersenne Twister RNG that we discussed here and run it on your hardware, and try to understand what the code does and to learn from it. You will need, as always, Visual Studio 11.
As with all the samples we post, they are meant to help you get started with C++ AMP. Once you are up and running, you may be able to come up with a better C++ AMP implementation of one of the algorithms. If you do, please share it with us.