Layers All the Way Down: The Untold Story of Shader Compilation

81 points by birdculture · 42 comments · 5/19/2025, 2:49:25 AM · moonside.games

Comments (42)

raphlinus · 9h ago
We tried something like this with piet-gpu-hal. One problem is that spirv-cross is lossy, though the gaps in target-language support are closing. For example, a device-scoped barrier is just dropped on the floor before Metal 3.2. Atomics are also a source of friction.

But the main problem is not the shader language itself, but the binding model. It's a pretty big mess, and things change as you go in the direction of bindless (descriptor indexing). There are a few approaches to this, certainly reinventing WebGPU is one. Another intriguing approach is blade[1] by Dzmitry Malyshau.

I wish the authors well, but this is a hard road, especially if the goal is to enable more advanced features including compute.

[1]: https://github.com/kvark/blade

pavlov · 2h ago
I’d very much like to read about Blade, but it seems like they have literally no documentation in text form, not even a basic introduction. Every link on the GitHub page goes to YouTube.

Project authors, please don’t do this. It’s impossible to get a two-minute overview from a video. Browsing through tutorials and documentation is much more efficient.

If you really have never written anything about the project except conference slides, then at least put up that deck in addition to the YouTube link. Clicking through slides is not great, but it’s still a better browsing experience than seeking at random in a video.

pjmlp · 9h ago
As expected, they don't touch the issue of how shaders work in PlayStation (LibGNM, LibGNMX) and Switch (NVN).
genocidicbunny · 5h ago
I really do wish that Sony made even more info about GNM and GNMX public. I was only starting to learn it when I got laid off and lost my access. I may or may not still have some older docs that found their way into my box as I was leaving on the last day, but if any did, they're definitely incomplete. I spent most of my time working on non-graphics parts of the project, so the time I got to spend digging into the graphics system of the PS5 was pretty limited.
gmueckl · 7h ago
These systems are highly proprietary and I am reasonably certain that stating anything about them publicly would break some NDAs.
pjmlp · 6h ago
As someone who still has a Nintendo Developer Portal account, who held SCEE content back when the London Soho office (aka Team Soho) used to have a developer site, and who owns a PS2Linux, there is plenty of material that can be discussed publicly without breaking NDAs.
flohofwoe · 5h ago
Console-specific information also isn't all that interesting anymore these days, since game consoles have switched to off-the-shelf GPU designs with only minor modifications.
genocidicbunny · 2h ago
Even the current generation of consoles still have some interesting stuff going on. The 'core' of the console is fairly off the shelf, but they do still have modifications specific to the console that you won't find elsewhere. As far as GPU stuff goes, they tend to provide somewhat lower-level access to the hardware that you would normally not get with consumer stuff.
pjmlp · 4h ago
Yeah, but they still use their own proprietary APIs.
pornel · 47m ago
I really like the WebGPU API. That's the API where the major players, including Apple and Microsoft, are forced to collaborate. It has real-world implementations on all major platforms.

With the wgpu and Google Dawn implementations, the API isn't actually tied to the Web and can be used in native applications.

pjmlp · 33m ago
The only reason I like WebGL and WebGPU is that they are the only 3D APIs where major players take managed language runtimes into consideration, because they can't do otherwise.

Had WebAssembly already been there, without Web APIs being forced to go through JavaScript, they would most likely be C APIs, with everyone and their dog writing bindings instead.

Now, it is still pretty much a Chrome-only API, and only available across macOS, Android and Windows.

Safari and Firefox have it as a preview, and who knows when it will ever be stable at a scale that doesn't require "Works best on Chrome" banners.

Support on GNU/Linux, even from Chrome, is pretty much not there, at least for something to use in production.

And then we have the whole drama that, after 15 years, there are still no usable developer tools in browsers for 3D debugging: one is forced to guess which rendering calls come from the browser and which from the application, fall back on GPU printf debugging, or maintain a native version that can be plugged into RenderDoc or similar.

riggsdk · 8h ago
BGFX (https://github.com/bkaradzic/bgfx) uses a different approach. You basically write your shader in a GLSL-like language, but it's all just (either very clever or very horrible) macro expansions that handle all the platform differences. With that you get a very performant backend for OpenGL, WebGL, Vulkan, Metal, Direct3D 11 and 12, and PlayStation. Its shader compiler program basically does minimal source-level transformations before handing it over to the platform's own shader compiler if available.
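To illustrate the idea (these are made-up macro names and substitutions, not bgfx's actual ones), the per-platform textual expansion can be sketched in a few lines:

```rust
// Illustrative sketch of macro-expansion-based shader portability: one
// "portable" source string, with platform differences handled by textual
// substitution before the platform's own compiler ever sees it.
// The VEC4/MUL names are invented for this example.
fn expand_for_platform(src: &str, is_d3d: bool) -> String {
    let (vec4, mul) = if is_d3d {
        ("float4", "mul(a, b)") // HLSL spelling
    } else {
        ("vec4", "(a * b)") // GLSL spelling
    };
    src.replace("VEC4", vec4).replace("MUL(a, b)", mul)
}

fn main() {
    let portable = "VEC4 pos = MUL(a, b);";
    println!("{}", expand_for_platform(portable, true));  // float4 pos = mul(a, b);
    println!("{}", expand_for_platform(portable, false)); // vec4 pos = (a * b);
}
```

The real thing handles varyings, bindings and many more constructs, but the principle is the same: the "compiler" is mostly a source-to-source rewriter.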
shmerl · 9h ago
> Rendering, by comparison, is a huge can of worms. Every platform has their own unique support matrix.

GPU APIs landscape is stuck in the dumb NIH mentality of the '90s because stuck up lock-in proponents refuse to support Vulkan on their walled gardens.

The only recent somewhat positive development about that was MS deciding to use SPIR-V and eventually ditch DXIL. A small step in the right direction, but not really enough yet.

alexk101 · 8h ago
I have been playing around with slang, which is supposed to be more cross-platform. They have a neural rendering slant, and I have yet to fully test on all platforms, but I think it's a welcome move to consolidate all these APIs. https://shader-slang.org/
socalgal2 · 6h ago
https://compute.toys/view/1948 recently added Slang support


raphlinus · 8h ago
Yup, I think slang is the future. Anyone on this thread willing to fund a Rust implementation?
pjmlp · 27m ago
It is according to Khronos anyway, for those that aren't already deeply invested in HLSL.

Khronos has been quite vocal that there is no further development on GLSL; they see that as a community effort, and they only provide SPIR-V.

This is how vendor-specific tooling eventually wins out. They kind of got lucky that AMD decided to offer Mantle as the basis for Vulkan, LunarG is doing the SDK, and now NVidia contributed slang; otherwise they would still be arguing about OpenGL vNext.

flohofwoe · 5h ago
AFAIK a very large part of Slang is massive 3rd-party libraries written in C++; the Slang-specific Rust code would just be a very thin layer on top of millions(?) of lines of C++ code that has grown over decades and is maintained elsewhere.

(fwiw I've been considering writing the custom parts of the Sokol shader compiler in Zig instead of C++, but that's just a couple of thousand lines of glue code on top of massive C++ libraries (SPIRVTools, SPIRVCross, glslang and Tint), and those C++ APIs are terrible to work with from non-C++ languages.)

As far as developer friction for integration into asset workflows goes, that's exactly where I would prefer Zig over Rust (but a simple build.zig already goes most of the way without porting any code to Zig).

gmueckl · 7h ago
Why? Unless you need to embed the slang transpiler (unlikely), the language it is written in literally doesn't matter.
raphlinus · 6h ago
This is a longer and deeper conversation, but I think on topic for the original article, so I'll go into it a bit. The tl;dr is developer friction.

By all means if you're doing a game (or another app with similar build requirements), figure out a shader precompilation pipeline so you're able to compile down to the lowest portable IR for each target, and ship that in your app bundle. Slang is meant for that, and this pipeline will almost certainly contain other tools written in C++ or even without source available (DXC, the Apple shader compiler tools, etc).

There are two main use cases where we want different pieces of shaders to come from different sources of truth, and link them together downstream. One is integrating samplers for (vello_hybrid) sparse strip textures so those can be combined with user paint sources in the user's 2D or 3D app. The other is that we're trying to make the renderer more modular so we have separate libraries for color space conversion and image filters (blur etc). To get maximal performance, you don't want to write out the blur result to a full-resolution texture, but rather have a function that can sample from an intermediate result. See [1] for more context and discussion of that point.

Stitching together these separate pieces of shader is a major potential source of developer friction. There is a happy path in the Rust ecosystem, albeit with some compromises, which is to fully embrace WGSL as the source of truth. The pieces can be combined with string-pasting, though we're looking at WESL as a more systematic approach. With WGSL, you can either do all your shader compilation at runtime (using wgpu for native), or do a build.rs script invoking naga to precompile. See [2] for the main PR that implements the latter in vello_hybrid. In the former case, you can even have hot reloading of shaders; implemented in Vello main but not (yet) vello_hybrid.

To get the same quality of developer experience with Slang, you'd need an implementation in Rust. I think this would be a good thing for Slang.

I've consistently underestimated the importance of developer friction in the past. As a contrast, we're also doing a CPU-only version of Vello now, and it's absolutely night and day, both for development velocity and attracting users. I think it's possible the GPU world gets better, but at the moment it's quite painful. I personally believe doing a Rust implementation of the Slang compiler would be an important step in the right direction, and is worth funding. Whether the rest of the world agrees with me, we'll see.

[1]: https://xi.zulipchat.com/#narrow/channel/197075-vello/topic/...

[2]: https://github.com/linebender/vello/pull/1011

coffeeaddict1 · 4h ago
> The pieces can be combined with string-pasting, though we're looking at WESL as a more systematic approach.

> To get the same quality of developer experience with Slang, you'd need an implementation in Rust. I think this would be a good thing for Slang.

WESL has the opposite problem: it doesn't have a C++ implementation. IMO, the graphics world will largely remain C++-friendly for the foreseeable future, so if an effort like WESL wants to succeed, they will need to provide a C++ implementation (even more so than the need for Slang to provide a Rust one).

raphlinus · 2h ago
You're probably right about this. In the short to medium term, I expect that the Rust and C++ sub-ecosystems will be making different sets of choices. I don't know of any major C++ game or game-adjacent project adopting, say, Dawn for their RHI (render hardware interface) to buy into WebGPU. In the longer term, I expect the ecosystems to start blending together more, especially as C++/Rust interop improves (it's pretty janky now).
gmueckl · 4h ago
Long story short: you want to compose shaders at runtime and need a compilation pipeline for that. So what you really need is a C interface to the slang transpiler that is callable from Rust.

Rewriting the whole slang pipeline in Rust is a fool's errand.

shmerl · 7h ago
I think Rust itself as a language for GPU programming is another interesting alternative:

https://github.com/Rust-GPU/rust-gpu/

flohofwoe · 5h ago
I would agree if Vulkan were a good 3D API, but it turned out to be the worst of the modern 3D APIs because it repeats the same main mistake as GL:

There is no design vision; instead it's a cobbled-together mess of ad-hoc vendor extensions that are eventually promoted to 'core'.

socalgal2 · 6h ago
Vulkan is by far the worst of all the modern graphics APIs. It's crap. It's also not even trying to be portable, expecting you to query a million things and then adapt to the platform.

I for one am glad Vulkan is not "it"

Also, it's not even portable on Windows. Sure, if you're making a game, you can expect your gamer audience to have it installed. But if you're making an app and expect businesses to use it, you'll find neither OpenGL nor Vulkan works on most business-class machines via Remote Desktop. The point being, it's not portable by design nor in actuality.

viraptor · 1h ago
Why would remote desktop make a difference for the API? I thought MS killed RemoteFX without any replacement? As in, everyone including DirectX is in the same bad state...
pjmlp · 18m ago
DirectX works just fine in RDP.

How do you think those Azure VMs for game developers work?

https://learn.microsoft.com/en-us/azure/virtual-desktop/grap...

voidUpdate · 6h ago
If you need to compile shaders at runtime, then why do I have to wait 10 mins for unreal to compile 30,000 shaders whenever it feels like it?
flohofwoe · 5h ago
Because the input bytecode blobs that are passed into 3D APIs are only a GPU-vendor- and render-pipeline-agnostic bytecode format (e.g. SPIR-V or DXBC); this is not what actually runs on the GPU. There's a second compile step inside the driver, which compiles into a proprietary "machine code" format specialized for a specific pipeline object. Normally that driver-internal compile step is quite fast, but that doesn't matter if there are tens of thousands of shader variants.
mvdtnz · 7h ago
This is a very interesting article; it was fun to learn what shaders are and why I have to wait for them to compile every time I play Call of Duty.

What I'd like to know is why game developers can't provide a service / daemon that will perform shader compilation when shaders or drivers change so I don't need to waste 25 minutes of my precious gaming time when I load CoD and see they need compiling. My computer has been sitting there unused all week. Steam was smart enough to download game updates in that time, why isn't your game smart enough to recompile my shaders on an idle machine?

shaggie76 · 2h ago
One alternative that many games choose is to do it on-demand, which is felt as micro-stutters while you play, but that is a poor choice for a competitive game like CoD.

We take a slightly different approach: we don't do any compilation up-front in the launcher, but do as much as possible on the level loading screen. It's not perfect, though: due to the way some legacy code works we can't always anticipate all permutations ahead of time, so you get the occasional micro-stutter as missing shaders are sent to the driver.

You can get away with being lazier with modern drivers because they will cache the compiled result for you (if you don't cache the pipeline state object yourself in DirectX 12) but on older DirectX drivers for Intel IGPs there wasn't a cache at all so the first 30 seconds after loading into a level would be very busy.
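The caching idea can be sketched like this (illustrative only: a real driver keys on far more pipeline state and stores actual backend machine code, and the names here are invented):

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Sketch of a driver-style shader cache: key the compiled blob by a hash
// of (shader source, pipeline state) and only pay for the expensive
// backend compile on a cache miss.
struct PipelineCache {
    compiled: HashMap<u64, Vec<u8>>,
    compiles: u32, // how many real compiles we paid for
}

impl PipelineCache {
    fn new() -> Self {
        Self { compiled: HashMap::new(), compiles: 0 }
    }

    fn get_or_compile(&mut self, source: &str, pipeline_state: &str) -> &[u8] {
        let mut h = DefaultHasher::new();
        (source, pipeline_state).hash(&mut h);
        let key = h.finish();
        if !self.compiled.contains_key(&key) {
            self.compiles += 1;
            // Stand-in for the driver's backend compile step.
            self.compiled.insert(key, source.as_bytes().to_vec());
        }
        &self.compiled[&key]
    }
}

fn main() {
    let mut cache = PipelineCache::new();
    cache.get_or_compile("fs_main", "opaque");
    cache.get_or_compile("fs_main", "opaque");  // hit: no second compile
    cache.get_or_compile("fs_main", "blended"); // new pipeline state: new blob
    println!("compiles: {}", cache.compiles); // compiles: 2
}
```

This is also why the same shader recompiles when the pipeline state changes: each (shader, state) pair is a distinct cache entry.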

flohofwoe · 5h ago
A better approach is to drastically reduce the number of individual shaders. Having tens of thousands of shader variants is an unfortunate side effect of treating shaders as asset data, and building them with visual noodle-graph tools by artists. No game actually needs tens of thousands of shaders.
tsukikage · 3h ago
GPUs suck at things like data-driven branches. What looks like one shader at a high level ends up creating many separate compiled blobs, because you really want some of the parameters baked in at compile time to avoid the performance tanking, and this means you need to compile a version of the shader for every combination of values those parameters can take.
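A back-of-the-envelope sketch of that combinatorial blow-up (the feature count is illustrative, not from any particular engine):

```rust
// With k boolean features baked in at compile time, a single
// "logical" shader becomes 2^k distinct compiled variants.
fn variant_count(num_bool_features: u32) -> u64 {
    1u64 << num_bool_features
}

fn main() {
    // 15 compile-time toggles (fog, skinning, shadows, ...) already
    // means 32768 blobs, in the ballpark of the "30,000 shaders"
    // figure mentioned upthread.
    println!("{}", variant_count(15)); // 32768
}
```
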
viraptor · 1h ago
Sometimes it pays off to have an ubershader with all the options though https://dolphin-emu.org/blog/2017/07/30/ubershaders/
tubs · 2h ago
Uniform branches are pretty much free.

The main issue is that GPR allocation is static and worst-case, so on the majority of hardware you hose your occupancy.

Negitivefrags · 7h ago
Steam actually shares compiled shaders between users with the same hardware / driver version, but only for Vulkan I believe.
ThatPlayer · 2h ago
The shared compiled shaders are only for the Steam Deck I believe. Other hardware will require a background compile or will compile at launch.
Firehawke · 1h ago
Nope, it's been available in the Steam client beta since 2016, and somewhere around 2017 it went into the mainline client.
gmueckl · 7h ago
Imagine the outcry if every game came with a background process that just sits there in the background.

Also - and this may sound a little bonkers - some renderers are so complex and flexible that the actual set of required shaders is only discovered when it tries to render a frame and the full set of possible shader configuration permutations is so large that compiling them all in advance is pointless or borderline infeasible.

mvdtnz · 7h ago
Lots of games come with background processes.
gmueckl · 4h ago
Can you name an example? I have a large library of games and can't name one that does.