Stephen Jones is a principal software engineer in the CUDA group at NVIDIA, working on making the CUDA language and programming model span the needs of parallel programming from high-performance computing to artificial intelligence. Prior to NVIDIA he led the Simulation & Analytics group at SpaceX, where he worked on various projects including large-scale simulation of combustion processes in rocket engines. His background is in computational fluid mechanics and plasma physics, but he has worked in diverse industries including networking, CAD/CAM, and scientific computing.
Optimizing your code can be one of the most challenging tasks in GPU programming, but also one of the most rewarding: the performance difference between an initial version and well-tuned code can be a factor of 10 or more. Some optimizations are straightforward, while others require care and a deep understanding of how the code executes. A particular focus will be on optimizing the CPU part of your code, which is frequently overlooked even though it is often easier to tune and just as effective. Sometimes the biggest obstacle is simply knowing what to look for, so we'll cover a range of techniques that everyone from beginners to CUDA ninjas might not have thought of before.