I can't tell from your GLSL whether a lot of the intermediate product sums would have been contracted into FMAs. That would probably be a non-trivial effect, particularly for your large-anisotropy cases.
The Heckbert paper also describes the basic theory, but you would want to supplement with some of the offline rendering work that followed it. OpenImageIO (OIIO) is pretty widely used, and has gone through several iterations of bug fixing like https://github.com/AcademySoftwareFoundation/OpenImageIO/pul...
But for your purposes, you probably just need to find all the magic epsilons and sign checks to make it match.
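For what it's worth, a minimal GLSL sketch (4.00+) of pinning the contraction behaviour down — the determinant example and function names are mine, not from the article. `precise` forbids the compiler from fusing, while fma() forces a single rounding:

    float detDefault(vec2 dx, vec2 dy) {
        // The compiler is free to contract this into an FMA, or not.
        return dx.x * dy.y - dx.y * dy.x;
    }

    float detNoFma(vec2 dx, vec2 dy) {
        // 'precise' forbids contraction, so both products round separately.
        precise float d = dx.x * dy.y - dx.y * dy.x;
        return d;
    }

    float detForcedFma(vec2 dx, vec2 dy) {
        // fma() guarantees a fused multiply-add with a single rounding.
        return fma(dx.x, dy.y, -(dx.y * dy.x));
    }

Comparing results against a precise/fma() pair like this would at least tell you whether contraction is the culprit.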
flexagoon · 4h ago
Btw, in case you're not aware, the article is somewhat unreadable on mobile devices: the code blocks can't be scrolled horizontally, so half of the code just doesn't fit on the screen. Also, the long LaTeX formula overflows the screen and causes the entire page to move horizontally.
sebastianmestre · 2h ago
FYI: you can scroll the code blocks if you zoom out until there is no more horizontal scroll on the page.
Still sucky, but at least you can read the code.
AshleysBrain · 6h ago
Perfect blog post for HN IMO - any blog title involving "in too much detail" will probably do well! Great job with the post, the visualizations are fantastic.
DDoSQc · 5h ago
This is great! Would've been really useful a couple months ago when I was refactoring Lavapipe's texture filtering. I worked off the Vulkan spec, which doesn't mention the elliptical transformation. I did notice that the spec says:
> The minimum and maximum scale factors (ρmin, ρmax) should be the minor and major axes of this ellipse.
Where "should" probably means some transformation can be applied (would be "must" otherwise).
Now I'm tempted to implement your visualizations so I can compare my work to your hardware references, and spend more hours getting it closer to actual hardware.
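If it helps, the spec's "should" maps onto a fairly small computation. A sketch (my own code, not from the article): the squared semi-axes of the ellipse are the eigenvalues of JᵀJ, where J is the 2x2 Jacobian of texel coordinates with respect to pixel coordinates, so ρmax/ρmin fall out of the quadratic formula:

    // dUVdx/dUVdy: texel-space derivatives, e.g. dFdx(uv) * textureSize.
    // Returns (rhoMin, rhoMax): the minor/major semi-axes of the ellipse.
    vec2 ellipseAxes(vec2 dUVdx, vec2 dUVdy) {
        float t   = dot(dUVdx, dUVdx) + dot(dUVdy, dUVdy); // trace of J^T J
        float det = dUVdx.x * dUVdy.y - dUVdx.y * dUVdy.x; // determinant of J
        float d   = sqrt(max(t * t - 4.0 * det * det, 0.0));
        float rhoMax = sqrt(0.5 * (t + d));
        float rhoMin = sqrt(max(0.5 * (t - d), 0.0));
        return vec2(rhoMin, rhoMax);
    }

An implementation that instead takes ρx = length(dUVdx) and ρy = length(dUVdy), which is roughly what the spec's non-normative formulas suggest, agrees with this only when the two derivative vectors are orthogonal — presumably exactly the wiggle room that "should" leaves open.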
hmage · 6h ago
I have a hunch nvidia's mipmapping algorithm changes if you open nvidia control panel and change texture filtering to "high performance" vs "high quality"
PaulDavisThe1st · 9h ago
Totally fantastic article. I don't do work that overlaps with this at all, but even after 37+ years as a C++ programmer, I found this enlightening, engaging and informative. Thank you very much.
TonyTrapp · 5h ago
Great article! If you think it has too much detail, you probably selected the wrong mipmap level for it ;)
Agentlien · 9h ago
This was a wonderful article! I love this kind of exploration.
lloeki · 6h ago
I for one liked the article! Great visualisations.
There's a bit of nostalgia ;) Brought me back to the days when GL display lists were the fancy thing to do and any kind of non-fixed-function shader was but a wild dream.
ImHereToVote · 8h ago
Nvidia has a quite blocky MIP selection. Did an Nvidia engineer decide that consumers don't notice, and fixed functioned the hell out of it?
ImHereToVote · 8h ago
This is very relevant to what I'm doing. I'm trying to reproduce the MIP pipeline to get anti-aliased procedural details in the fragment shader, specifically by converting high-frequency detail into roughness.
sebastianmestre · 2h ago
A while back I read a paper about downsampling normal maps and converting the lost detail into roughness.
I can try to find it if you want.
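Not to guess at the exact paper, but the usual trick in that family (Toksvig-style normal map filtering, if that's the one) maps the length of the mip-averaged normal to extra roughness. A rough sketch with made-up names, in case it's a useful starting point:

    // Assumes uNormalMap stores averaged (unnormalized) normals per mip.
    uniform sampler2D uNormalMap;

    float roughnessFromNormalMip(vec2 uv, float baseRoughness) {
        vec3 avgN = texture(uNormalMap, uv).xyz * 2.0 - 1.0;
        float len = clamp(length(avgN), 1e-4, 1.0);
        // A shorter averaged normal means more high-frequency detail was
        // folded into this mip level, i.e. larger normal variance.
        float variance = (1.0 - len) / len;
        // One common convention: accumulate variance as alpha^2.
        float alpha2 = baseRoughness * baseRoughness + variance;
        return sqrt(min(alpha2, 1.0));
    }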
kajkojednojajko · 4h ago
Insane deep-dive! Framing texture sampling as "Ideally, we’d like to integrate over the projection of the screen pixel onto the texture" was enlightening for me. I particularly enjoyed the explanation of anisotropic filtering because it always seemed like magic to me, and in the context of aligning ellipses on textures it just makes sense :D
gitroom · 3h ago
Pretty cool seeing someone dig this deep - I always wish I understood these graphics tricks better
aeonik · 2d ago
"You couldn’t implement these functions yourself - they are magic intrinsics which are implemented in hardware"
But why?
pema99 · 1d ago
There simply isn't another way to access registers belonging to one 'thread' from another thread without using an intrinsic, and you need that to compute finite differences. For a long time, the only option was ddx()/ddy(). Now we also have wave intrinsics, which you couldn't implement yourself either.
Sharlin · 2h ago
You need to access the neighboring pixels (fragments) in a quad to compute d_dx and d_dy, but quads are an implementation detail not exposed to the programmer.
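To make that concrete: with the newer subgroup quad intrinsics you can at least express what the fixed-function derivative does, because the quad is finally visible. A Vulkan-GLSL sketch (mirrors dFdxCoarse-style behaviour; not guaranteed bit-identical to the hardware path):

    #extension GL_KHR_shader_subgroup_basic : enable
    #extension GL_KHR_shader_subgroup_quad : enable

    float myDFdx(float v) {
        // Swap values with the horizontal neighbor in the 2x2 quad.
        float other = subgroupQuadSwapHorizontal(v);
        // Quad layout: invocations 0/2 are the left column, 1/3 the right.
        bool rightColumn = (gl_SubgroupInvocationID & 1u) == 1u;
        return rightColumn ? (v - other) : (other - v);
    }

Before those intrinsics existed, there was genuinely no way to write this yourself.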
llm_nerd · 2h ago
This isn't my specialty, and ultimately it doesn't matter to the core point of this good submission about how the GPU chooses the mipmap level to use. However, the article gives the impression that we pre-calculate mipmap levels to fix distant aliasing, when the problem it demonstrates could be solved with trivial texture filtering alone.
Mipmaps are a performance optimization[1]. You could just use a 4096x4096 brick texture across your entire game, and then use texture filtering to make it look good both close and far, but that means that rendering a distant wall polygon that might fill just a few pixels of the viewport needs to filter and apply a 16.7 million texel texture, redoing the filtering again and again and evicting everything else from caches just for that one texture. If instead it can apply a 32x32 pre-filtered texture to loads of distant objects, there are obviously massive performance ramifications. Which is why mipmaps are used, letting massive textures be used for those cases where the detail is valuable, without destroying performance when it's just some distant object.
And of course modern engines now do the same thing with geometry, where ideally there is a hierarchy of differing levels of detail, and the engine chooses the vertex-heavy mesh when the object fills the scene and the tiny, super-optimized one when it's just a few pixels.
[1] As one additional note, all major graphics platforms can automatically generate mipmaps for textures...but only if the root is uncompressed. Modern texture compression is hugely compute bound and yields major VRAM savings so almost all games pre-compute the mipmapping and then do the onerous compression in advance.
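To put rough numbers on the trade (my arithmetic, not from the article): each mip level is a quarter the size of the one above it, so the whole chain costs

    1 + 1/4 + 1/16 + ... = 4/3 ≈ 1.33x

the base texture's memory. A 4096x4096 RGBA8 texture grows from 64 MiB to about 85 MiB, and in exchange that distant wall samples a few texels from a 32x32 level instead of filtering thousands of texels from the full-resolution image.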