Workshop: Fundamentals of Accelerated Computing with CUDA C/C++

Europe/Ljubljana
MS Teams

Domen Verber, Jani Dugonik
Description

Learn how to accelerate and optimize existing C/C++ CPU-only applications using the most essential CUDA tools and techniques. You'll also learn an iterative style of CUDA development that will allow you to ship accelerated applications fast.

This workshop teaches the fundamental tools and techniques for accelerating C/C++ applications to run on massively parallel GPUs with CUDA®. You’ll learn how to write code, configure code parallelization with CUDA, optimize memory migration between the CPU and GPU accelerator, and implement the workflow that you’ve learned on a new task—accelerating a fully functional, but CPU-only, particle simulator for observable massive performance gains. At the end of the workshop, you’ll have access to additional resources to create new GPU-accelerated applications on your own. 

At the end of the workshop, participants can obtain an official certificate from the NVIDIA Deep Learning Institute.


Workflow: The workshop takes place remotely, in a browser, on AWS cloud infrastructure.

Difficulty: Basic 

Language: English

Prerequisite knowledge: Basic C/C++ competency, including familiarity with variable types, loops, conditional statements, functions, and array manipulations. No previous knowledge of CUDA programming is assumed.

Target audience: HPC developers using CUDA in the network or cloud.

Skills to be gained: At the conclusion of the workshop, you’ll have an understanding of the fundamental tools and techniques for GPU-accelerating C/C++ applications with CUDA and be able to:


– Write code to be executed by a GPU accelerator 
– Expose and express data and instruction-level parallelism in C/C++ applications using CUDA 
– Utilize CUDA-managed memory and optimize memory migration using asynchronous prefetching 
– Leverage command-line and visual profilers to guide your work 
– Utilize concurrent streams for instruction-level parallelism 
– Write GPU-accelerated CUDA C/C++ applications, or refactor existing CPU-only applications, using a profile-driven approach 
 

Maximum number of participants: 30

Virtual location: MS Teams

Lecturers:

Name: Domen Verber
Description: Domen Verber is an assistant professor at the Faculty of Electrical Engineering and Computer Science of the University of Maribor (UM FERI), an NVIDIA Deep Learning Institute ambassador for the University of Maribor, and the university's HPC specialist. He has worked on HPC and artificial intelligence for more than 25 years.
domen.verber@um.si, deep.learning@um.si

 

Name: Jani Dugonik
Description: Jani Dugonik is an academic researcher at the Faculty of Electrical Engineering and Computer Science of the University of Maribor (UM FERI). He has worked in the fields of natural language processing and evolutionary algorithms for more than 10 years.
jani.dugonik@um.si

Agenda
    • 09:00–09:30
      Introduction: Meet the instructors and get familiar with your GPU-accelerated interactive JupyterLab environment.
      Conveners: Domen Verber, Jani Dugonik
    • 09:30–11:30
      Accelerating Applications with CUDA C/C++: Learn the essential syntax and concepts for writing GPU-enabled C/C++ applications with CUDA: write, compile, and run GPU code; control the parallel thread hierarchy; allocate and free memory for the GPU.
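To give a flavor of what this session covers, here is a minimal sketch (not workshop material) of the pattern involved: a kernel launched across a grid of thread blocks, operating on memory the GPU can access.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// A kernel: a function executed in parallel by many GPU threads.
__global__ void doubleElements(int *a, int n)
{
    // Derive a globally unique index from the thread hierarchy.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)              // guard against surplus threads in the last block
        a[i] *= 2;
}

int main()
{
    const int N = 1024;
    int *a;
    cudaMallocManaged(&a, N * sizeof(int));  // memory accessible to CPU and GPU
    for (int i = 0; i < N; ++i) a[i] = i;

    // Launch configuration: <<<number of blocks, threads per block>>>.
    int threads = 256;
    int blocks  = (N + threads - 1) / threads;
    doubleElements<<<blocks, threads>>>(a, N);
    cudaDeviceSynchronize();                 // wait for the kernel to finish

    printf("a[10] = %d\n", a[10]);           // prints 20
    cudaFree(a);
    return 0;
}
```

Such a program is compiled with NVIDIA's `nvcc` compiler, e.g. `nvcc -o app app.cu`, and run like any other executable.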
    • 11:30–12:30
      Lunch break
    • 12:30–14:30
      Managing Accelerated Application Memory with CUDA C/C++: Learn to use the command-line profiler and CUDA managed memory, focusing on observation-driven application improvements and a deep understanding of managed memory behavior: profile CUDA code with the command-line profiler; go deep on unified memory; optimize unified memory management.
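As an illustrative sketch of the memory-management techniques named above (not workshop material): managed memory migrates between CPU and GPU on demand, and asynchronous prefetching moves it ahead of time to avoid page faults.

```cuda
#include <cuda_runtime.h>

__global__ void initTo(float *a, float v, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) a[i] = v;
}

int main()
{
    const int N = 1 << 20;
    const size_t size = N * sizeof(float);
    float *a;
    cudaMallocManaged(&a, size);   // unified memory: pages migrate on demand

    int device;
    cudaGetDevice(&device);

    // Prefetch to the GPU before the kernel runs, so the kernel does not
    // stall on page faults while pages migrate.
    cudaMemPrefetchAsync(a, size, device);
    initTo<<<(N + 255) / 256, 256>>>(a, 3.0f, N);

    // Prefetch back to the CPU before host code touches the data.
    cudaMemPrefetchAsync(a, size, cudaCpuDeviceId);
    cudaDeviceSynchronize();

    cudaFree(a);
    return 0;
}
```

The effect of prefetching shows up in profiler output as fewer unified-memory page faults; depending on the toolkit version, the command-line profiler used may be `nvprof` or the newer Nsight Systems CLI (e.g. `nsys profile --stats=true ./app`).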
    • 14:30–14:45
      Coffee break
    • 14:45–16:45
      Asynchronous Streaming and Visual Profiling for Accelerated Applications with CUDA C/C++: Identify opportunities for improved memory management and instruction-level parallelism: profile CUDA code with the NVIDIA Visual Profiler; use concurrent CUDA streams.
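As a rough sketch of the streaming concept from this session (not workshop material): kernels launched into different non-default streams are allowed to execute concurrently, exposing additional parallelism.

```cuda
#include <cuda_runtime.h>

__global__ void scale(float *a, float s, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) a[i] *= s;
}

int main()
{
    const int N = 1 << 20;
    float *a, *b;
    cudaMallocManaged(&a, N * sizeof(float));
    cudaMallocManaged(&b, N * sizeof(float));

    // Work issued to distinct non-default streams may overlap on the GPU.
    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    // Fourth launch parameter selects the stream.
    scale<<<(N + 255) / 256, 256, 0, s1>>>(a, 2.0f, N);
    scale<<<(N + 255) / 256, 256, 0, s2>>>(b, 3.0f, N);  // may run concurrently

    cudaDeviceSynchronize();   // wait for work in all streams
    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(a);
    cudaFree(b);
    return 0;
}
```

Whether the two kernels actually overlap depends on resource availability, which is exactly the kind of behavior a visual timeline profiler makes observable.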
    • 16:45–17:00
      Final Review: Complete the assessment to earn a certificate, review key learnings, wrap up questions, and take the workshop survey.
      Conveners: Domen Verber, Jani Dugonik