A new era for Phoenix TS.

Click here to learn more.
AI Training

Fundamentals of Accelerated Computing with CUDA Python Training

Course Overview

In this one-day, instructor-led CUDA course in Washington, DC Metro, Tysons Corner, VA, Columbia, MD or Live Online, students will learn the fundamental tools and techniques for running GPU-accelerated Python applications using CUDA® GPUs and the Numba compiler. Participants will work though dozens of hands-on coding exercises and, at the end of the training, implement a new workflow to accelerate a fully functional linear algebra program originally designed for CPUs, observing impressive performance gains. After taking this course, learners will be able to:

  • GPU-accelerate NumPy ufuncs with a few lines of code.
  • Configure code parallelization using the CUDA thread hierarchy.
  • Write custom CUDA device kernels for maximum performance and flexibility.
  • Use memory coalescing and on-device shared memory to increase CUDA kernel bandwidth.

Schedule

Currently, there are no public classes scheduled. Please contact a LEXX LIVETraining Consultant to discuss hosting a private class at 301-258-8200.

Program Level

Beginner

Prerequisites

All learners are expected to have:

  • Basic Python competency, including familiarity with variable types, loops, conditional statements, functions, and array manipulations
  • NumPy competency, including the use of ndarrays and ufuncs
  • No previous knowledge of CUDA programming is required

Course Outline

Module 1: Introduction

Module 2: Introduction to CUDA Python with Numba

  • Begin working with the Numba compiler and CUDA programming in Python.
  • Use Numba decorators to GPU-accelerate numerical Python functions.
  • Optimize host-to-device and device-to-host memory transfers.

• Module 3: Custom CUDA Kernels in Python with Numba

  • Learn CUDA’s parallel thread hierarchy and how to extend parallel program possibilities.
  • Launch massively parallel custom CUDA kernels on the GPU.
  • Utilize CUDA atomic operations to avoid race conditions during parallel execution.

• Module 4: Multidimensional Grids, and Shared Memory for CUDA Python with Numba

  • Learn multidimensional grid creation and how to work in parallel on 2D matrices.
  • Leverage on-device shared memory to promote memory coalescing while reshaping 2D matrices.

• Module 5: Final Review

 

LEXX Live is registered with the National Association of State Boards of Accountancy (NASBA) as a sponsor of continuing professional education on the National Registry of CPE Sponsors. State boards of accountancy have final authority on the acceptance of individual courses for CPE credit. Complaints re-garding registered sponsors may be submitted to the National Registry of CPE Sponsors through its web site: www.nasbaregistry.org

Download Course Brochure

Enter your information below to download this brochure!

Name