Course on CUDA Programming, November 30 -- December 15, 2023, at AIMS South Africa

This is a 2.5-week course on developing parallel applications that run on NVIDIA GPUs. It assumes only some proficiency with C and basic C++ programming; no prior experience of parallel computing is required.

The aim is that by the end of the course you will be able to write relatively simple parallel programs, and will feel confident to continue learning to use CUDA through studying the code samples provided by NVIDIA on GitHub.

CUDA Programming references

As preliminary reading, please read chapters 1 and 2 of the NVIDIA CUDA C Programming Guide, which is available both as a PDF and as online HTML.

There is a lot of other information available online. Some of it may be useful, but you do not need to read most of it.




We will be working on Linux servers in Google Cloud. Before starting the practicals, please read these notes on using Google Cloud, and have a look at the online user documentation.

The practicals all use these header files (helper_cuda.h, helper_string.h) which came originally from the CUDA SDK. They provide routines for error-checking and initialisation.

Practical 1

Application: a trivial "hello world" example

CUDA aspects: launching a kernel, copying data to/from the graphics card, error checking and printing from kernel code

Note: the instructions explain how files can be copied from my user account, so there's no need to download from here.

Practical 2

Application: Monte Carlo simulation using NVIDIA's CURAND library for random number generation

CUDA aspects: constant memory, random number generation, kernel timing, minimising device memory bandwidth requirements

Practical 3

Application: 3D Laplace finite difference solver

CUDA aspects: thread block size optimisation, multi-dimensional memory layout
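
To make "multi-dimensional memory layout" concrete, here is a minimal serial C++ sketch of one sweep of a 7-point finite difference stencil on a 3D grid stored in a flat array. It is not the practical's code: the Jacobi-style update, the layout with i varying fastest, and the names idx and jacobi_sweep are all assumptions made for illustration. A serial version like this can serve as a reference when checking GPU results.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Flatten (i, j, k) into a 1D index with i varying fastest.
// This layout is an assumption for illustration.
inline std::size_t idx(int i, int j, int k, int nx, int ny) {
    return static_cast<std::size_t>(i)
         + static_cast<std::size_t>(nx) * (static_cast<std::size_t>(j)
         + static_cast<std::size_t>(ny) * static_cast<std::size_t>(k));
}

// One Jacobi-style sweep: every interior point becomes the average of its
// six neighbours; boundary points are copied unchanged.
void jacobi_sweep(const std::vector<float>& u_old, std::vector<float>& u_new,
                  int nx, int ny, int nz) {
    const std::size_t sx = 1;                                  // stride in i
    const std::size_t sy = static_cast<std::size_t>(nx);       // stride in j
    const std::size_t sz = sy * static_cast<std::size_t>(ny);  // stride in k
    assert(u_old.size() == sz * static_cast<std::size_t>(nz));
    assert(u_new.size() == u_old.size());

    for (int k = 0; k < nz; ++k) {
        for (int j = 0; j < ny; ++j) {
            for (int i = 0; i < nx; ++i) {
                const std::size_t id = idx(i, j, k, nx, ny);
                const bool boundary = i == 0 || i == nx - 1 ||
                                      j == 0 || j == ny - 1 ||
                                      k == 0 || k == nz - 1;
                if (boundary) {
                    u_new[id] = u_old[id];
                } else {
                    u_new[id] = (u_old[id - sx] + u_old[id + sx] +
                                 u_old[id - sy] + u_old[id + sy] +
                                 u_old[id - sz] + u_old[id + sz]) / 6.0f;
                }
            }
        }
    }
}
```

In a GPU version each (i, j, k) point would typically be handled by one thread, and having neighbouring threads access neighbouring values of i is what makes the "i fastest" layout attractive.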

Practical 4

Application: reduction

CUDA aspects: dynamic shared memory, thread synchronisation
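
The access pattern behind a tree reduction can be sketched in serial C++ as follows. This is an illustrative emulation, not the practical's kernel: the function name tree_reduce is hypothetical, the scratch vector stands in for shared memory, and the inner loop over tid stands in for the threads of one block. It assumes the number of elements is a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum the elements by repeatedly folding the upper half onto the lower half.
// `temp` is taken by value so it acts as a scratch copy, playing the role
// that shared memory would play on a GPU.
float tree_reduce(std::vector<float> temp) {
    const std::size_t n = temp.size();
    assert(n > 0 && (n & (n - 1)) == 0);  // size must be a power of two

    for (std::size_t d = n / 2; d > 0; d /= 2) {
        // On a GPU, each value of tid would be a separate thread, and a
        // barrier would be needed before moving on to the next value of d.
        for (std::size_t tid = 0; tid < d; ++tid) {
            temp[tid] += temp[tid + d];
        }
    }
    return temp[0];
}
```

Each pass halves the number of active elements, so n values are summed in log2(n) passes. The need for a barrier between passes is why thread synchronisation appears alongside shared memory in this practical.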

Practical 5

Application: using the CUBLAS and CUFFT libraries

Practical 6

Application: revisiting the simple "hello world" example

CUDA aspects: using g++ for the main code, building libraries, using templates

Practical 7

Application: tri-diagonal equations
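
For orientation, a tri-diagonal system can be solved serially with the Thomas algorithm, sketched below in C++. This is a standard serial method and is not necessarily the approach used in the practical; the function name thomas_solve and the array conventions are assumptions for illustration. The sketch does no pivoting, so it assumes a well-conditioned system (for example, a diagonally dominant one).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] for i = 0..n-1.
// a[0] and c[n-1] are ignored. No pivoting is performed.
std::vector<double> thomas_solve(const std::vector<double>& a,
                                 const std::vector<double>& b,
                                 const std::vector<double>& c,
                                 std::vector<double> d) {
    const std::size_t n = b.size();
    assert(n > 0 && a.size() == n && c.size() == n && d.size() == n);

    std::vector<double> cp(n);  // modified upper-diagonal coefficients

    // Forward elimination: remove the sub-diagonal.
    cp[0] = c[0] / b[0];
    d[0] = d[0] / b[0];
    for (std::size_t i = 1; i < n; ++i) {
        const double m = b[i] - a[i] * cp[i - 1];
        cp[i] = c[i] / m;
        d[i] = (d[i] - a[i] * d[i - 1]) / m;
    }

    // Back substitution: d now holds the solution from the last row upwards.
    for (std::size_t i = n - 1; i-- > 0;) {
        d[i] -= cp[i] * d[i + 1];
    }
    return d;
}
```

Both loops carry a dependence from one row to the next, so this form is inherently sequential within a single system. That dependence is what a parallel implementation has to work around.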

Practical 8

Application: scan operation and recurrence equations
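
The link between scans and recurrence equations can be sketched in serial C++. The code below emulates a step-doubling (Hillis-Steele style) inclusive scan and then uses it to solve a linear recurrence x[n] = a[n]*x[n-1] + b[n] by scanning over affine maps. It is an illustration only, not the practical's code: the names inclusive_scan_steps, Affine, compose and solve_recurrence are hypothetical, and the choice of scan variant is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Inclusive scan by step doubling: at step d, element i (i >= d) is combined
// with element i - d. `op` must be associative; the earlier element is
// passed as the left argument.
template <typename T, typename Op>
std::vector<T> inclusive_scan_steps(std::vector<T> x, Op op) {
    const std::size_t n = x.size();
    for (std::size_t d = 1; d < n; d *= 2) {
        const std::vector<T> prev = x;  // read old values, write new ones
        // On a GPU, each i would be a thread, with a barrier between steps.
        for (std::size_t i = d; i < n; ++i) {
            x[i] = op(prev[i - d], prev[i]);
        }
    }
    return x;
}

// Represents the map x -> a*x + b.
struct Affine {
    double a;
    double b;
};

// Composition: apply `first`, then `second`.
inline Affine compose(const Affine& first, const Affine& second) {
    return {second.a * first.a, second.a * first.b + second.b};
}

// Solve x[n] = a[n]*x[n-1] + b[n] for n = 0..N-1, where x0 plays the role
// of x[-1].
std::vector<double> solve_recurrence(const std::vector<double>& a,
                                     const std::vector<double>& b,
                                     double x0) {
    assert(a.size() == b.size());
    std::vector<Affine> f(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        f[i] = {a[i], b[i]};
    }
    const std::vector<Affine> s = inclusive_scan_steps(f, compose);
    std::vector<double> x(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        x[i] = s[i].a * x0 + s[i].b;  // apply the accumulated map to x0
    }
    return x;
}
```

Because composing affine maps is associative, the recurrence reduces to a scan, and any parallel scan algorithm can then be applied to it.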

Ideas for presentation topics

