Beyond CUDA: GPU Accelerated Python for Machine Learning on Cross-Vendor Graphics Cards Made Simple

By Medium - 2021-01-18

Description

A practical deep dive into GPU Accelerated Python ML in cross-vendor graphics cards (AMD, Qualcomm, NVIDIA & friends) using Vulkan Kompute

Summary

  • Machine learning algorithms, together with many other advanced data processing paradigms, map extremely well to the parallel architecture that GPU computing offers.
  • Kompute is the Python GPGPU framework we will use in this tutorial to build GPU-accelerated machine learning algorithms.
  • This is what lets us know which index of the parallel execution loop we are currently running, which we extract with i = index.x. We select the x component because the execution index is defined as a vec3, so there are execution indices index.x, index.y, and index.z.
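The per-invocation index described above can be sketched as a minimal GLSL compute shader; the buffer names and the doubling operation are illustrative, not taken from the article, and gl_GlobalInvocationID is the built-in vec3-style execution index whose x component is read:

```glsl
#version 450

layout (local_size_x = 1) in;

// Illustrative input/output buffers bound by the host application
layout (set = 0, binding = 0) buffer bufIn  { float in_data[]; };
layout (set = 0, binding = 1) buffer bufOut { float out_data[]; };

void main() {
    // gl_GlobalInvocationID is a uvec3; dispatching along x only means
    // the current parallel-loop index is its x component
    uint index = gl_GlobalInvocationID.x;
    out_data[index] = in_data[index] * 2.0;
}
```

When the workload is dispatched over two or three dimensions, the y and z components of the same built-in provide the remaining execution indices.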
  • In this case, these Tensors use device-only memory for processing efficiency, so the mapping is performed with a staging Tensor inside the operation (which is reused across operations for efficiency).


Topics

  1. Backend (0.37)
  2. Machine_Learning (0.34)
  3. NLP (0.14)
