This looks a lot like some line anti aliasing I had to hack together many years ago when a customer started complaining loudly about the lack of hardware support for it. I think I had something like a week to put together three different alternatives for them to pick from, and this was the winner. It looked the best by far.
Years later my boss was telling me how satisfied he was that he could throw any problem in my general direction and it would be gone in no time. There is nothing like the risk of losing his work permit to motivate a young guy to work himself down to a crisp, all for peanuts.
Const-me · 2h ago
Good article, but I believe it lacks information about what specifically these magical dFdx, dFdy, and fwidth = abs(dFdx) + abs(dFdy) functions are computing.
The following stackexchange answer addresses that question rather well: https://gamedev.stackexchange.com/a/130933/3355 As you see, dFdx and dFdy are not exactly derivatives; they are discrete screen-space approximations of those derivatives. They are very cheap to compute due to the peculiar execution model of pixel shaders on GPU hardware.
mananaysiempre · 2h ago
If you’ve ever sampled a texture in a shader, then you know what those are, so it’s probably fair to include them in the prerequisites for the article. But yes, those are intended to be approximate screen-space derivatives of whatever quantity you plug into them, and (I believe) on basically any hardware the approximation in question is a single-sided first difference, because the particular fragment (single-pixel contribution) you’re computing always exists in a 2×2 group of screen-space neighbours executing in lockstep.
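For concreteness, a minimal fragment-shader sketch of the usual pattern: evaluate the SDF, estimate its per-pixel rate of change with fwidth, and turn the distance into coverage. The circle SDF and uniform names here are just illustrative placeholders, not from the article.

  // Minimal sketch: anti-aliased circle via an SDF and fwidth().
  // "uCenter"/"uRadius" and the circle SDF are illustrative placeholders.
  #version 330 core
  out vec4 fragColor;
  uniform vec2 uCenter;
  uniform float uRadius;

  float sdCircle(vec2 p, vec2 c, float r) {
      return length(p - c) - r;              // negative inside, positive outside
  }

  void main() {
      float d = sdCircle(gl_FragCoord.xy, uCenter, uRadius);
      // fwidth(d) = abs(dFdx(d)) + abs(dFdy(d)): roughly how much d changes
      // across one pixel, estimated from the 2x2 quad of neighbouring fragments.
      float w = fwidth(d);
      // One-pixel-wide coverage ramp centred on the zero-distance edge.
      float alpha = clamp(0.5 - d / w, 0.0, 1.0);
      fragColor = vec4(vec3(1.0), alpha);
  }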
NohatCoder · 4h ago
Reminds me that I found an alternative way of sampling an SDF:
First take a sample in each corner of the pixel to be rendered (s1 s2 s3 s4), then compute:
It is a good approximation, and it keeps on working no matter how you scale and stretch the field.
Relative to the standard method it is expensive to calculate. But for a modern GPU it is still a very light workload to do this once per screen pixel.
shiandow · 4h ago
Technically that only requires calculating one extra row and column of pixels.
It is indeed scale invariant, but I think you can do better: you should have enough to make it invariant to any linear transformation. The calculation will be more complex, but that is nothing compared to evaluating the SDF.
NohatCoder · 3h ago
I do believe that it is already invariant to linear transformations the way you want, i.e. we can evaluate the corners of an arbitrary parallelogram instead of a square and get a similar coverage estimate.
shiandow · 1h ago
Similar maybe, but surely it can't be the same? Just pick some function like f(x,y) = x-1 and start rotating it around your centre pixel: the average (s1+s2+s3+s4) will stay the same (since it's a linear function), but there's no way those absolute values will remain constant.
You should be pretty close though. For a linear function you can just calculate the distance to the 0 line, which is invariant to any linear transformation that leaves that line where it is (which is what you want). This is just the function value divided by the norm of the gradient, both of which you can estimate from those 4 points. This gives something like:
dx = (s2 - s1 + s4 - s3)
dy = (s3 - s1 + s4 - s2)
f = (s1+s2+s3+s4)/4
dist = f / sqrt(dx*dx + dy*dy)
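As a rough single-pass GLSL sketch of that idea (the circle used for sdf is only a stand-in, and the final clamp mapping dist to coverage is my own guess at a sensible half-pixel ramp):

  // Sketch of the corner-sampling estimate above. "sdf" is only a stand-in
  // (a 100 px circle) for whatever distance field is actually being rendered.
  #version 330 core
  out vec4 fragColor;

  float sdf(vec2 p) {
      return length(p - vec2(200.0, 150.0)) - 100.0;
  }

  float coverageAt(vec2 p) {
      // Sample the field at the four corners of the pixel centred on p.
      float s1 = sdf(p + vec2(-0.5, -0.5));
      float s2 = sdf(p + vec2( 0.5, -0.5));
      float s3 = sdf(p + vec2(-0.5,  0.5));
      float s4 = sdf(p + vec2( 0.5,  0.5));
      // Gradient and mean value estimated from the corners, as in the formula above.
      float dx = (s2 - s1 + s4 - s3);
      float dy = (s3 - s1 + s4 - s2);
      float f  = (s1 + s2 + s3 + s4) / 4.0;
      // Approximate distance to the zero line; the half-pixel ramp in the clamp
      // is an assumption on top of the comment, not part of it.
      float dist = f / sqrt(dx * dx + dy * dy);
      return clamp(0.5 - dist, 0.0, 1.0);
  }

  void main() {
      fragColor = vec4(vec3(coverageAt(gl_FragCoord.xy)), 1.0);
  }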
brookman64k · 4h ago
Would that be done in two passes?
1. Render the image shifted by 0.5 pixels in both directions (plus one additional row & column).
2. Apply above formula to each pixel (4 reads, 1 write).
ralferoo · 3h ago
That'd be one way of doing it.
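As a sketch of what pass 2 might look like, assuming pass 1 wrote the half-pixel-shifted field into a texture (the name uField is made up):

  // Sketch of pass 2: 4 reads of the half-pixel-shifted field, 1 write.
  // "uField" is an assumed texture name holding the pass-1 samples.
  #version 330 core
  uniform sampler2D uField;
  out vec4 fragColor;

  void main() {
      ivec2 p = ivec2(gl_FragCoord.xy);
      float s1 = texelFetch(uField, p,               0).r;
      float s2 = texelFetch(uField, p + ivec2(1, 0), 0).r;
      float s3 = texelFetch(uField, p + ivec2(0, 1), 0).r;
      float s4 = texelFetch(uField, p + ivec2(1, 1), 0).r;
      // Corner-based coverage estimate from the comments above.
      float dx = (s2 - s1 + s4 - s3);
      float dy = (s3 - s1 + s4 - s2);
      float f  = (s1 + s2 + s3 + s4) / 4.0;
      float dist = f / sqrt(dx * dx + dy * dy);
      fragColor = vec4(vec3(clamp(0.5 - dist, 0.0, 1.0)), 1.0);
  }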
You don't technically need 4 reads per pixel either; for instance, you can process a 7x7 group with a 64-thread group. Each thread does 1 read, then fetches the other 3 values from its neighbours and calculates the average. Then the 7x7 subset of the 8x8 threads writes its values out.
You could integrate this into the first pass too, but then there would be duplication on the overlapped areas of each block. Depending on the complexity of the first pass, it still might be more efficient to do that than an extra pass.
Knowing that it's only the edges that are shared between threads, you could expand the work of each thread to cover multiple pixels, so that each thread group covers more pixels and reduces the number of pixels sampled multiple times. How much you do this by depends on register pressure; it's probably not worth doing more than 4 pixels per thread, but YMMV.
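Roughly, a compute-shader sketch of that layout (image names and bindings are made up): each 8x8 group loads 8x8 corner samples into shared memory, and its inner 7x7 threads each combine 4 of them and write one output pixel.

  // Sketch: 8x8 thread group, 7x7 output tile. Image names/bindings are made up.
  #version 430
  layout(local_size_x = 8, local_size_y = 8) in;
  layout(r32f,  binding = 0) uniform readonly  image2D uField;   // half-pixel-shifted samples
  layout(rgba8, binding = 1) uniform writeonly image2D uOutput;

  shared float corners[8][8];

  void main() {
      ivec2 local = ivec2(gl_LocalInvocationID.xy);
      // Each group covers a 7x7 tile of output pixels; corner samples overlap by one.
      ivec2 tile = ivec2(gl_WorkGroupID.xy) * 7;
      // One read per thread into shared memory.
      corners[local.y][local.x] = imageLoad(uField, tile + local).r;
      memoryBarrierShared();
      barrier();
      // The inner 7x7 threads pick up their 3 neighbours and write one pixel each.
      if (local.x < 7 && local.y < 7) {
          float s1 = corners[local.y    ][local.x    ];
          float s2 = corners[local.y    ][local.x + 1];
          float s3 = corners[local.y + 1][local.x    ];
          float s4 = corners[local.y + 1][local.x + 1];
          float dx = (s2 - s1 + s4 - s3);
          float dy = (s3 - s1 + s4 - s2);
          float f  = (s1 + s2 + s3 + s4) / 4.0;
          float dist = f / sqrt(dx * dx + dy * dy);
          imageStore(uOutput, tile + local, vec4(vec3(clamp(0.5 - dist, 0.0, 1.0)), 1.0));
      }
  }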
NohatCoder · 3h ago
You certainly could imagine doing that, but as long as the initial evaluation is fairly cheap (say a texture lookup), I don't see the extra pass being worth it.
mxfh · 5h ago
The minute black area on the inner part of the sector getting perceptually boosted with the same ramp width as the outer area is effectively how an outline on a shape would behave, not two shapes with no stroke width. I would expect the output brightness to scale with the volume/depth under a pixel in the 3D visualization.
Is this intentional? To me this is an opinionated (i.e. artistic-preference), feature-preserving method, not a perfect one.
Btw the common visualization has a source and an author:
https://iquilezles.org/articles/distfunctions2d/ https://www.shadertoy.com/playlist/MXdSRf
> The minute black area on the inner part of the sector
I'm not grasping what you're referring to here.
mxfh · 2h ago
That Pac-Man “mouth” is collapsing to a constant-width line about halfway in for the last four examples for me.
There's some weird mid-length discontinuity in the edge direction for me, not just perceptually.
Maybe I’m misunderstanding something here or have a different idea of what the goal of that exercise is, but I would expect some pixels to turn near white near the center, towards that gap sector.
WithinReason · 3h ago
Instead of OKLAB isn't it simpler to just use a linear color space and only do gamma correction at the very end?
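For reference, a small sketch of what that would look like, with illustrative colours: do the coverage blend in linear RGB and apply the (piecewise) sRGB encoding only at the end.

  // Sketch: blend coverage in linear RGB, encode to sRGB only at the end.
  // Foreground/background colours are illustrative.
  #version 330 core
  in float vAlpha;            // coverage from the SDF, assumed to arrive here
  out vec4 fragColor;

  vec3 linearToSrgb(vec3 c) {
      // Standard piecewise sRGB transfer function (not a plain pow(x, 1/2.2)).
      return mix(12.92 * c,
                 1.055 * pow(c, vec3(1.0 / 2.4)) - 0.055,
                 step(vec3(0.0031308), c));
  }

  void main() {
      vec3 fgLinear = vec3(1.0);                         // white, linear
      vec3 bgLinear = vec3(0.0);                         // black, linear
      vec3 outLinear = mix(bgLinear, fgLinear, vAlpha);  // blend in linear space
      fragColor = vec4(linearToSrgb(outLinear), 1.0);    // gamma-encode once, at the end
  }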
badlibrarian · 2h ago
Simpler and worse in this application.
jeremyscanvic · 2h ago
Really interesting write-up! I'm not very familiar with signed distance functions but aliasing is a major part of my PhD and this is really insightful to me!
talkingtab · 4h ago
A very good example of SDF thinking, using signed distance fields in shaders. Both shaders and SDF are new to me and very interesting. Another example of what is being done is MSDF here: https://github.com/Chlumsky/msdfgen.
mxfh · 42m ago
That's what I was wondering. For sharp, narrow corners like that Pac-Man mouth center, and in font rendering, a composite/multichannel approach is probably the better one in any situation where there is potential for self-intersection of the distance field in concave regions.
https://lambdacube3d.wordpress.com/2014/11/12/playing-around...