Visualising how close random GUIDs come to being the same

70 nugzbunny 20 8/16/2025, 10:11:09 PM guidsmash.com ↗

Comments (20)

twiss · 2h ago
> The chances of generating two GUIDs that are the same is astronomically small.

> The odds are 1 in 2^122; that’s approximately 1 in 5,000,000,000,000,000,000,000,000,000,000,000,000.

This is true if you only generate two GUIDs, but if you generate very many GUIDs, the chance that any two of them are identical increases. E.g. if you generate 2^61 GUIDs, you have roughly a 40% chance of a collision (the 50% mark is near 2^61.2), due to the birthday paradox.

2^61 is still a very large number of course, but much more feasible to reach than 2^122 when doing a collision attack. This is the reason that cryptographic hashes are typically 256 bits or more (to make the cost of collision attacks >= 2^128).
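The birthday arithmetic in this comment can be checked with a short Python sketch. It assumes the 122 random bits of a UUIDv4 are uniform and independent, and it uses the usual approximation p ≈ 1 - exp(-n(n-1)/(2N)) rather than the exact product. The function name `collision_probability` is made up for illustration.

```python
import math


def collision_probability(n: float, bits: int = 122) -> float:
    """Approximate chance of at least one collision among n values
    drawn uniformly at random from 2**bits possibilities."""
    space = 2.0 ** bits
    pairs_over_space = n * (n - 1.0) / (2.0 * space)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-pairs_over_space)


p_pair = collision_probability(2)          # a single pair of GUIDs
p_2_61 = collision_probability(2.0 ** 61)  # 2^61 GUIDs
print(f"two GUIDs:  {p_pair:.3e}")
print(f"2^61 GUIDs: {p_2_61:.3f}")
```

Under this approximation a single pair reduces to the quoted 1 in 2^122, while 2^61 values give a collision chance of about 39%.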

Retr0id · 2h ago
2^61 isn't even that large, well within the compute budget of mere mortals.
vlovich123 · 51m ago
Depends on what “isn’t even that large” means. A modern 6 GHz machine would probably need 12 years of 24/7 operation to count that high. To me that seems like a lot.
dgrin91 · 45m ago
Yeah, but a nation state server farm can probably cut that down to minutes because their budget can buy a lot of processors. You only need a few hundred to really shrink it down to manageable numbers. And it turns out that nation states aren't the only ones that have this budget.
8organicbits · 27m ago
What's the threat here?

It's trivial to force a collision. Here's the same UUID twice:

6e197264-d14b-44df-af98-39aac5681791

6e197264-d14b-44df-af98-39aac5681791

Typically, you don't care about UUIDs that aren't in your system and you generate those yourself to avoid maliciously generated collisions. Your system can't handle 2^61 IDs. It doesn't have the processing power, storage, or bandwidth for that to happen. Not to mention traditional rate limiting.

PaulHoule · 53m ago
I think you might have trouble if you tried to assign one to every iron atom in an iron filing.
NoahZuniga · 2h ago
* not the birthday paradox, but the birthday bound.
8organicbits · 3h ago
Note that this only considers UUIDv4, the random UUID. Other forms can generate UUIDs that are much closer together. For UUIDv7, UUIDs generated within the same millisecond will have identical 48-bit prefixes (or up to 60 bits when the monotonic counter from section 6.2 is used).

https://www.rfc-editor.org/rfc/rfc9562.html#monotonicity_cou...
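To make the shared prefix concrete, here is a minimal Python sketch that assembles UUIDv7-style values by hand from the RFC 9562 field layout (48-bit millisecond timestamp, 4-bit version, 12-bit rand_a, 2-bit variant, 62-bit rand_b). The helper `uuid7_at` and the fixed timestamp are illustrative assumptions, not a library API, and no monotonic counter is implemented.

```python
import secrets
import uuid


def uuid7_at(unix_ms: int) -> uuid.UUID:
    """Build a UUIDv7-style value for a given Unix timestamp in ms.

    Hypothetical helper for illustration; both random fields are
    filled with fresh random bits on every call.
    """
    value = (unix_ms & ((1 << 48) - 1)) << 80  # unix_ts_ms: top 48 bits
    value |= 0x7 << 76                         # version field = 7
    value |= secrets.randbits(12) << 64        # rand_a: 12 random bits
    value |= 0b10 << 62                        # variant field = 0b10
    value |= secrets.randbits(62)              # rand_b: 62 random bits
    return uuid.UUID(int=value)


same_ms = 1_755_382_269_000  # arbitrary example timestamp (assumption)
a = uuid7_at(same_ms)
b = uuid7_at(same_ms)
print(a)
print(b)
print("shared 48-bit prefix:", a.hex[:12] == b.hex[:12])
```

Two values built for the same millisecond agree on the first 12 hex digits and differ only in the 74 random bits that follow.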

e1g · 2h ago
You need to be generating >100M of them within the same millisecond before even remembering that collisions can theoretically happen.
charcircuit · 2h ago
>You

The entire universe. Else it's not universally unique.

8organicbits · 56m ago
I like UUIDv7s as database IDs since they sort chronologically, are unique, and are efficient to generate. My system chooses the UUIDs; I don't allow externally generated IDs in. If I did, then an attacker could easily force a collision. As such, I only care about how fast I create IDs. This is a common pattern.

If your system does need to worry about UUIDv7s generated by the rest of the universe, you likely also need to worry about maliciously created IDs, software bugs, clocks that reset to the Unix epoch, etc. I worry about those more than a bona fide collision.

webstrand · 2h ago
This is the chance that, given a specific GUID, you'll find a collision for it: utterly minuscule. However, the birthday paradox controls here: if you generate 2^62.60 GUIDs, the chance that you've generated a collision is around 99%. Still enormously unlikely, but way smaller than 2^122.

At a rate of comparing 400,000 guids per second, you have a 99% chance of seeing a collision within the next 553,750 years.
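These figures can be reproduced with a rough Python sketch. It inverts the birthday approximation to find how many random 122-bit values are needed for a target collision probability, then converts that count to years at a fixed rate. The function names and the 365.25-day year are assumptions made for the example.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year (assumption)


def draws_for_probability(p: float, bits: int = 122) -> float:
    """Approximate number of uniform random values needed for a
    collision probability p in a space of 2**bits values."""
    space = 2.0 ** bits
    # Inverse of p = 1 - exp(-n^2 / (2N)); -log1p(-p) is -ln(1 - p).
    return math.sqrt(2.0 * space * -math.log1p(-p))


def years_to_generate(count: float, per_second: float) -> float:
    """Years needed to produce `count` values at a steady rate."""
    return count / per_second / SECONDS_PER_YEAR


n_99 = draws_for_probability(0.99)
years_99 = years_to_generate(n_99, 400_000)
print(f"GUIDs for 99%: 2^{math.log2(n_99):.2f}")
print(f"years at 400,000 per second: {years_99:,.0f}")
```

The result is about 2^62.6 GUIDs and a little over 550,000 years, in line with the numbers in the comment.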

jonathrg · 2h ago
You would need a little more memory to see/detect that collision.
RS-232 · 1h ago
UUID > GUID.

Microsoft’s GUID standard is garbage.

lionkor · 1h ago
Oh, why?
w-ll · 58m ago
not OP but i already have fields for time ts and what model it is. i want my uuids random.
kaoD · 28m ago
I think the current Microsoft GUID is just UUIDv7.

https://learn.microsoft.com/en-us/dotnet/api/system.guid?vie...

I don't think there's a "Microsoft standard"; they just use different versions of UUID in different products over time. No idea why they call it GUID instead of UUID, but it's easier to say out loud so I'm not against it.

v7 has a timestamp indeed, but isn't the time making it more collision resistant? You'd have to generate tons of UUIDv7s in the same millisecond, while v4 is more likely to collide due to not being time-constrained and the birthday paradox.

I think both have their uses though. You might need pure random if you want your UUID not to convey any time information and you're not generating tons of them (e.g. a random user id).

What do you mean "model"? Are you referring to UUIDv1 which has time and MAC address?

Zambyte · 11m ago
> isn't the time making it more collision resistant?

That seems to depend a whole lot on the pattern your application generates UUIDs in. If you're generating a consistent distribution over time, sure. If you generate a whole lot in bursts, collision seems to be way more likely.

amingilani · 3h ago
Instead of picking a target UUID and evaluating new UUIDs against it, a better experiment would be finding duplicates in all the UUIDs you have generated.

This plays nicely with the birthday paradox.
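A scaled-down version of that experiment is easy to sketch in Python. Since a full 122-bit collision will not show up in practice, this example keeps only a short prefix of each UUIDv4 and counts draws until any prefix repeats. The function name and the 32-bit prefix are assumptions chosen so the run finishes quickly; the set of seen values is the extra memory that detecting a collision requires.

```python
import uuid


def draws_until_repeat(prefix_bits: int = 32) -> int:
    """Generate UUIDv4s until two share the same leading prefix_bits.

    Returns how many UUIDs were generated, including the repeat.
    The first 48 bits of a UUIDv4 are all random, so prefixes up to
    that length behave like uniform random values.
    """
    if not 1 <= prefix_bits <= 48:
        raise ValueError("prefix_bits must be between 1 and 48")
    seen = set()
    while True:
        prefix = uuid.uuid4().int >> (128 - prefix_bits)
        if prefix in seen:
            return len(seen) + 1
        seen.add(prefix)


draws = draws_until_repeat(32)
print(f"first repeated 32-bit prefix after {draws:,} UUIDs")
```

With a 32-bit prefix the first repeat typically arrives after tens of thousands of draws, far fewer than the 2^32 possible prefixes, which is the birthday effect the thread describes.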

nesk_ · 3h ago
Nice experiment. Is the code available somewhere?