Description
A practical deep dive into GPU-accelerated Python ML on cross-vendor graphics cards (AMD, Qualcomm, NVIDIA & friends) using Vulkan Kompute
Summary
- Machine learning algorithms, together with many other advanced data processing paradigms, fit incredibly well with the parallel architecture that GPU computing offers.
- Kompute is the Python GPGPU framework that we will use in this tutorial to build GPU-accelerated machine learning algorithms.
- This is what tells us which index of the parallel execution loop we are currently running, which we extract with i = index.x. We select x because the execution index is defined as a vec3, with one execution index for each of the components index.x, index.y and index.z.
- In this case, these Tensors use device-only memory for processing efficiency, so the mapping is performed through a staging Tensor inside the operation (which is reused across operations for efficiency).
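
The execution index mentioned in the summary can be illustrated with a small CPU-only emulation. This is a minimal sketch in plain Python, not Kompute's API: `dispatch` and `multiply_shader` are hypothetical names introduced only for illustration, and the loop runs sequentially where a GPU would run the invocations in parallel. It shows how each invocation receives a three-component index and reads only the x component when the data is one-dimensional.

```python
from itertools import product


def dispatch(shader, workgroups, *buffers):
    """Emulate a compute dispatch: call `shader` once per execution index.

    Hypothetical helper for illustration only. A real GPU would run these
    invocations in parallel; here they run one after another.
    """
    size_x, size_y, size_z = workgroups
    for z, y, x in product(range(size_z), range(size_y), range(size_x)):
        shader((x, y, z), *buffers)


def multiply_shader(index, a, b, out):
    """Emulated shader body: element-wise multiplication of two buffers."""
    # The data is one-dimensional, so only the x component of the
    # three-component execution index is needed (i = index.x).
    i = index[0]
    out[i] = a[i] * b[i]


a = [2.0, 4.0, 6.0]
b = [1.0, 2.0, 3.0]
out = [0.0] * len(a)

# One invocation per element along x; the y and z dimensions have size 1.
dispatch(multiply_shader, (len(a), 1, 1), a, b, out)
print(out)  # [2.0, 8.0, 18.0]
```

With a two-dimensional workload such as an image, the same pattern would read the y component as well to pick a row and a column.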