[2/100] Timeslicing, MPS, MIG? HAMi! The Missing Piece in GPU Virtualization on K8s
Watch now: https://youtu.be/ffKTAsm0AzA
Duration: 5:59
For: Kubernetes users interested in GPU virtualization, AI infrastructure, and advanced scheduling.
In this animated video, I dive into the limitations of native Kubernetes GPU support — such as the inability to share a GPU between Pods or to allocate fractional GPU resources like 40% of a GPU's compute or 10GiB of its memory. I also cover the trade-offs of existing solutions like Timeslicing, MPS, and MIG.
Then I introduce HAMi, a Kubernetes-native GPU virtualization solution that supports flexible compute/memory slicing, GPU model binding, NUMA/NVLink awareness, and more — all without changing your application code.
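To make the "fractional request" idea concrete, here is a rough sketch of what a Pod spec looks like under HAMi. The resource names follow HAMi's extended-resource conventions (`nvidia.com/gpumem` in MiB, `nvidia.com/gpucores` as a percentage), but treat the exact keys and units as assumptions to verify against your installed HAMi version:

```yaml
# Hypothetical Pod requesting a fractional GPU slice via HAMi.
apiVersion: v1
kind: Pod
metadata:
  name: fractional-gpu-demo
spec:
  containers:
  - name: cuda-app
    image: nvidia/cuda:12.4.0-base-ubuntu22.04
    resources:
      limits:
        nvidia.com/gpu: 1          # one virtual GPU slice
        nvidia.com/gpumem: 10240   # ~10GiB of GPU memory (MiB)
        nvidia.com/gpucores: 40    # ~40% of the GPU's compute
```

The application inside the container needs no changes — HAMi enforces the memory and compute caps transparently at the driver-API level.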
---
[1/100] Good software comes with best practices built-in — NVIDIA GPU Operator
Watch now: https://youtu.be/fuvaFGQzITc
Duration: 3:23
For: Kubernetes users deploying GPU workloads, and engineers interested in Operator patterns, system validation, and cluster consistency.
This animated explainer shows how NVIDIA GPU Operator simplifies the painful manual steps of enabling GPUs on Kubernetes — installing drivers, configuring container runtimes, deploying plugins, etc. It standardizes these processes using Kubernetes-native CRDs, state machines, and validation logic.
I break down its internal architecture (like ClusterPolicy, NodeFeature, and the lifecycle validators) to show how it delivers consistent and automated GPU enablement across heterogeneous nodes.
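For context on what the ClusterPolicy CRD looks like, here is an illustrative excerpt. The component toggles shown (`driver`, `toolkit`, `devicePlugin`) reflect the Operator's per-component structure, but check the GPU Operator docs for the full schema of your release:

```yaml
# Hypothetical ClusterPolicy excerpt — fields are illustrative.
apiVersion: nvidia.com/v1
kind: ClusterPolicy
metadata:
  name: cluster-policy
spec:
  driver:
    enabled: true       # Operator installs the NVIDIA driver via DaemonSet
  toolkit:
    enabled: true       # configures the NVIDIA container runtime
  devicePlugin:
    enabled: true       # exposes GPUs as schedulable resources
```

A single resource like this lets the Operator drive the whole enablement pipeline — drivers, runtime, plugin, validation — identically on every GPU node.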
---
The voiceover is in Chinese, but all on-screen animation text is in English, and full English subtitles are available.
I made both of these videos to explain complex GPU infrastructure concepts in an approachable, visual way.
Let me know what you think, and I’d love any suggestions for improvement or future topics!