[2/100] Timeslicing, MPS, MIG? HAMi! The Missing Piece in GPU Virtualization on K8s
Watch now: https://youtu.be/ffKTAsm0AzA
Duration: 5:59
For: Kubernetes users interested in GPU virtualization, AI infrastructure, and advanced scheduling.
In this animated video, I dive into the limitations of native Kubernetes GPU support — such as the inability to share a GPU between Pods or to allocate fractional GPU resources like 40% of a GPU's compute or 10GiB of its memory. I also cover the trade-offs of existing solutions like Timeslicing, MPS, and MIG.
Then I introduce HAMi, a Kubernetes-native GPU virtualization solution that supports flexible compute/memory slicing, GPU model binding, NUMA/NVLink awareness, and more — all without changing your application code.
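To make the "fractional request" idea concrete, here is a rough sketch of what a Pod spec looks like under HAMi. The resource names follow HAMi's extended-resource conventions (`nvidia.com/gpumem` in MiB, `nvidia.com/gpucores` as a percentage), but treat the exact keys and units as assumptions to verify against your installed HAMi version:

```yaml
# Hypothetical Pod requesting a fractional GPU slice via HAMi.
apiVersion: v1
kind: Pod
metadata:
  name: fractional-gpu-demo
spec:
  containers:
  - name: cuda-app
    image: nvidia/cuda:12.4.0-base-ubuntu22.04
    resources:
      limits:
        nvidia.com/gpu: 1          # one virtual GPU slice
        nvidia.com/gpumem: 10240   # ~10GiB of GPU memory (MiB)
        nvidia.com/gpucores: 40    # ~40% of the GPU's compute
```

The application inside the container needs no changes — HAMi enforces the memory and compute caps transparently at the driver-API level.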
---
[1/100] Good software comes with best practices built-in — NVIDIA GPU Operator
Watch now: https://youtu.be/fuvaFGQzITc
Duration: 3:23
For: Kubernetes users deploying GPU workloads, and engineers interested in Operator patterns, system validation, and cluster consistency.
This animated explainer shows how NVIDIA GPU Operator simplifies the painful manual steps of enabling GPUs on Kubernetes — installing drivers, configuring container runtimes, deploying plugins, etc. It standardizes these processes using Kubernetes-native CRDs, state machines, and validation logic.
I break down its internal architecture (like ClusterPolicy, NodeFeature, and the lifecycle validators) to show how it delivers consistent and automated GPU enablement across heterogeneous nodes.
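For context on what the ClusterPolicy CRD looks like, here is an illustrative excerpt. The component toggles shown (`driver`, `toolkit`, `devicePlugin`) reflect the Operator's per-component structure, but check the GPU Operator docs for the full schema of your release:

```yaml
# Hypothetical ClusterPolicy excerpt — fields are illustrative.
apiVersion: nvidia.com/v1
kind: ClusterPolicy
metadata:
  name: cluster-policy
spec:
  driver:
    enabled: true       # Operator installs the NVIDIA driver via DaemonSet
  toolkit:
    enabled: true       # configures the NVIDIA container runtime
  devicePlugin:
    enabled: true       # exposes GPUs as schedulable resources
```

A single resource like this lets the Operator drive the whole enablement pipeline — drivers, runtime, plugin, validation — identically on every GPU node.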
---
The voiceover is in Chinese, but all on-screen animation text is in English, and full English subtitles are available.
I made both of these videos to explain complex GPU infrastructure concepts in an approachable, visual way.
Let me know what you think, and I’d love any suggestions for improvement or future topics!