Efficient parallel random number generation is a basic requirement of Monte Carlo simulation.

So far, I have implemented two generators:

- L'Ecuyer's mrg32k3a pseudo-random generator, with uniform, exponential, Normal and gamma output distributions
- Sobol's quasi-random generator (starting from a sequential implementation by Joe and Kuo) with uniform, exponential and Normal output distributions -- an early version of this is the basis for the Sobol example in NVIDIA's CUDA SDK.
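
To make the first generator concrete, here is a minimal sequential sketch of the mrg32k3a recurrence in C++, using 64-bit integer arithmetic. It is not the GPU implementation described above. The struct name, the single-integer seeding, and the use of inversion for the exponential output are assumptions made for illustration; the moduli and multipliers are the ones published by L'Ecuyer.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Moduli and multipliers of L'Ecuyer's mrg32k3a.
constexpr std::int64_t kM1   = 4294967087LL;   // 2^32 - 209
constexpr std::int64_t kM2   = 4294944443LL;   // 2^32 - 22853
constexpr std::int64_t kA12  = 1403580LL;
constexpr std::int64_t kA13n = 810728LL;
constexpr std::int64_t kA21  = 527612LL;
constexpr std::int64_t kA23n = 1370589LL;

// Hypothetical sequential wrapper around the two combined recurrences.
struct Mrg32k3a {
    std::int64_t s1[3];   // component 1: x_{n-3}, x_{n-2}, x_{n-1}
    std::int64_t s2[3];   // component 2: y_{n-3}, y_{n-2}, y_{n-1}

    // Assumption: all six state words are set to the same seed value.
    explicit Mrg32k3a(std::int64_t seed = 12345) {
        assert(seed > 0 && seed < kM2);   // keeps both states valid and non-zero
        for (int i = 0; i < 3; ++i) { s1[i] = seed; s2[i] = seed; }
    }

    // Uniform variate strictly inside (0,1).
    double uniform() {
        // x_n = (a12 * x_{n-2} - a13n * x_{n-3}) mod m1
        std::int64_t p1 = (kA12 * s1[1] - kA13n * s1[0]) % kM1;
        if (p1 < 0) p1 += kM1;
        s1[0] = s1[1]; s1[1] = s1[2]; s1[2] = p1;

        // y_n = (a21 * y_{n-1} - a23n * y_{n-3}) mod m2
        std::int64_t p2 = (kA21 * s2[2] - kA23n * s2[0]) % kM2;
        if (p2 < 0) p2 += kM2;
        s2[0] = s2[1]; s2[1] = s2[2]; s2[2] = p2;

        // Combine the two components; z lies in 1..m1.
        std::int64_t z = p1 - p2;
        if (z <= 0) z += kM1;
        return static_cast<double>(z) / static_cast<double>(kM1 + 1);
    }

    // Exponential variate with unit rate, obtained here by inversion.
    double exponential() { return -std::log(uniform()); }
};
```

A parallel version would additionally need each thread to start from a different, non-overlapping point in the sequence; that skip-ahead step is not shown in this sketch.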

I worked with NAG to incorporate these into a new set of numerical routines for GPUs, which is available free to academics who sign a collaborative agreement.

In my initial work with GPUs, a visiting student and I implemented a LIBOR market model. An updated version of this code uses the mrg32k3a random number generator described above, and includes a comparison between the output of the CPU and GPU code to show that the results are identical to machine precision:

- libor.cu, main code
- libor_kernels.cu, CUDA kernel code
- libor_gold.cpp, CPU code for comparison
- params.h, header file
- mrg32k3a_gold.cpp, random number generator
- Makefile following CUDA SDK style
- more information on the LIBOR testcase
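
A CPU/GPU comparison of this kind can be sketched as a check on the largest relative difference between two result arrays. The function names and the tolerance of a few units of machine epsilon below are assumptions for illustration, and are not taken from the files listed above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Largest relative difference between a reference array (e.g. CPU results)
// and a test array (e.g. results copied back from the GPU).
double max_rel_diff(const std::vector<double>& ref,
                    const std::vector<double>& test) {
    assert(ref.size() == test.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        double scale = std::max(std::fabs(ref[i]), std::fabs(test[i]));
        if (scale == 0.0) continue;   // both entries are exactly zero
        worst = std::max(worst, std::fabs(ref[i] - test[i]) / scale);
    }
    return worst;
}

// Hypothetical acceptance test: agreement to within `units` machine epsilons.
bool agrees_to_machine_precision(const std::vector<double>& ref,
                                 const std::vector<double>& test,
                                 double units = 8.0) {
    return max_rel_diff(ref, test)
           <= units * std::numeric_limits<double>::epsilon();
}
```

For single precision results the same check would use `float` arrays and the `float` machine epsilon.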

Current NVIDIA GPUs have double precision support, but it is 2-4 times slower than single precision. Similarly, when using SSE vectorisation on Intel CPUs, double precision is twice as slow as single precision.

Many in the finance sector consider single precision to be inadequate, but my own view is that it is perfectly adequate for Monte Carlo applications except when computing sensitivities ("Greeks") by finite difference perturbation ("bumping").
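
The exception for bumping comes from cancellation: a finite difference subtracts two nearly equal values and divides by a small bump h, so roundoff of relative size ε in each value becomes an error of order ε/h in the sensitivity. The sketch below illustrates this with a central difference carried out entirely in one precision. The function name is hypothetical, and `exp(x)` is only a stand-in for a Monte Carlo value computed with the same random numbers at both bumped inputs.

```cpp
#include <cassert>
#include <cmath>

// Central-difference ("bumped") estimate of d/dx exp(x), with every
// operation performed in precision T (float or double).
template <typename T>
T bumped_derivative(T x, T h) {
    assert(h > T(0));
    T up   = std::exp(x + h);   // value at the up-bumped input
    T down = std::exp(x - h);   // value at the down-bumped input
    return (up - down) / (T(2) * h);
}
```

Calling `bumped_derivative<float>(1.0f, 1e-4f)` and `bumped_derivative<double>(1.0, 1e-4)` and comparing each with `std::exp(1.0)` typically shows a much larger error in single precision, since ε is roughly 6e-8 for `float` against roughly 1e-16 for `double`.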

It is also important to use either a double precision accumulator or some form of binary tree summation to minimise the accumulation of roundoff error when averaging the payoffs from a very large number of paths. The links below give a test implementation of a binary tree summation, and a reference on the error analysis of related methods:

- bin_tree_sum.h, header file
- bin_tree_sum_test.cpp, test code
- Nick Higham SISC paper
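
For orientation, the three accumulation strategies mentioned above can be sketched as follows. This is an illustrative recursive version and not the contents of bin_tree_sum.h; all function names are assumptions. In a naive single precision sum the worst-case roundoff bound grows in proportion to the number of terms n, whereas for pairwise (binary tree) summation it grows in proportion to log2(n).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive running sum kept in single precision.
float naive_sum(const std::vector<float>& x) {
    float s = 0.0f;
    for (float v : x) s += v;
    return s;
}

// Single precision inputs, double precision accumulator.
double double_accumulator_sum(const std::vector<float>& x) {
    double s = 0.0;
    for (float v : x) s += static_cast<double>(v);
    return s;
}

// Binary tree (pairwise) summation in single precision:
// split the range in half, sum each half recursively, add the two results.
float tree_sum(const float* x, std::size_t n) {
    assert(x != nullptr || n == 0);
    if (n == 0) return 0.0f;
    if (n == 1) return x[0];
    std::size_t half = n / 2;
    return tree_sum(x, half) + tree_sum(x + half, n - half);
}

float tree_sum(const std::vector<float>& x) {
    return tree_sum(x.data(), x.size());
}
```

Summing around a million copies of `0.1f` is a simple way to see the difference: the naive sum drifts visibly from the true total, while the tree sum and the double precision accumulator stay close to it.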

- Oxford-Man Institute of Quantitative Finance
- NVIDIA through a CUDA Fellowship Award
- EPSRC, through funding for the Many-core and Reconfigurable Supercomputing Network
- Microsoft
- CRL, now part of TCS (Tata Consultancy Services)