Course Overview
In this one-day, instructor-led CUDA course in Washington, DC Metro, Tysons Corner, VA, Columbia, MD or Live Online, students will learn the fundamental tools and techniques for running GPU-accelerated Python applications using CUDA® GPUs and the Numba compiler. Participants will work though dozens of hands-on coding exercises and, at the end of the training, implement a new workflow to accelerate a fully functional linear algebra program originally designed for CPUs, observing impressive performance gains. After taking this course, learners will be able to:
- GPU-accelerate NumPy ufuncs with a few lines of code.
- Configure code parallelization using the CUDA thread hierarchy.
- Write custom CUDA device kernels for maximum performance and flexibility.
- Use memory coalescing and on-device shared memory to increase CUDA kernel bandwidth.
Schedule
Currently, there are no public classes scheduled. Please contact a LEXX LIVETraining Consultant to discuss hosting a private class at 301-258-8200.
Program Level
Beginner
Prerequisites
All learners are expected to have:
- Basic Python competency, including familiarity with variable types, loops, conditional statements, functions, and array manipulations
- NumPy competency, including the use of ndarrays and ufuncs
- No previous knowledge of CUDA programming is required
Course Outline
Module 1: Introduction
Module 2: Introduction to CUDA Python with Numba
- Begin working with the Numba compiler and CUDA programming in Python.
- Use Numba decorators to GPU-accelerate numerical Python functions.
- Optimize host-to-device and device-to-host memory transfers.
• Module 3: Custom CUDA Kernels in Python with Numba
- Learn CUDA’s parallel thread hierarchy and how to extend parallel program possibilities.
- Launch massively parallel custom CUDA kernels on the GPU.
- Utilize CUDA atomic operations to avoid race conditions during parallel execution.
• Module 4: Multidimensional Grids, and Shared Memory for CUDA Python with Numba
- Learn multidimensional grid creation and how to work in parallel on 2D matrices.
- Leverage on-device shared memory to promote memory coalescing while reshaping 2D matrices.
• Module 5: Final Review
LEXX Live is registered with the National Association of State Boards of Accountancy (NASBA) as a sponsor of continuing professional education on the National Registry of CPE Sponsors. State boards of accountancy have final authority on the acceptance of individual courses for CPE credit. Complaints re-garding registered sponsors may be submitted to the National Registry of CPE Sponsors through its web site: www.nasbaregistry.org
![[GSA LOGO]](https://live.lexx.com/wp-content/themes/lexx-live/assets/images/gsa-logo-black.png)
