The part that was most surprising to me was how much the performance of serializing floating-point numbers has improved, even just in the past decade [1].
Roundtripping IEEE floating point values via conversion to decimal UTF-8 strings and back is a ridiculously fragile process, too, not just slow.
The difference between which values are precisely representable in binary and which are precisely representable in decimal means small errors can creep in.
gugagore · 1h ago
You don't have to precisely represent the float in decimal. You just have to have each float have a unique decimal representation, which you can guarantee if you include enough digits: 9 for 32-bit floats, and 17 for 64-bit floats.
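The digit counts above are easy to sanity-check in a REPL. A minimal sketch, assuming IEEE doubles (which is what JS numbers are):

```javascript
// Sketch: printing a double with 17 significant digits is always enough
// for an exact round-trip (9 digits suffice for a 32-bit float).
const x = 0.1 + 0.2;             // a binary value near 0.3, not equal to it
const s = x.toPrecision(17);     // "0.30000000000000004"
const back = Number(s);
console.log(s, back === x);      // the parse recovers the exact same double
```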
And you need to trust that whoever is generating the JSON you’re consuming, or will consume the JSON you generate, is using a library which agrees about what those representations round to.
jk-jeon · 1h ago
Note that the consumer side doesn't really have a lot of ambiguity. You just read the number, compute its precise value as written, and round it to the closest binary representation with banker's rounding. You do anything other than this only under very special circumstances. Virtually all ambiguity lies on the producer side, which can be cleared up by using any of the formatting algorithms with a roundtripping guarantee.
EDIT:
If you're talking about decimal->binary->decimal round-tripping, it's a completely different story though.
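A quick illustration of the asymmetry, sketched in JS (any decimal with more significant digits than a double can hold loses its original text, even though the parsed value itself round-trips fine afterwards):

```javascript
// binary -> decimal -> binary round-trips; decimal -> binary -> decimal
// does not in general, since most decimals have no exact double.
const s = "0.123456789012345678901";  // more digits than a double can hold
const d = Number(s);
console.log(String(d) === s);         // false: the original text is lost
console.log(Number(String(d)) === d); // true: the value itself survives
```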
kccqzy · 39m ago
JSON itself doesn't mandate that IEEE754 numbers be used.
jameshart · 35m ago
Indeed - you could be serializing to or from JSON where the in-memory representation you're aiming for is actually a floating point decimal. JSON doesn't care.
jk-jeon · 1h ago
A way to achieve perfect round-tripping was proposed back in 1990 by Steele and White (and likely they are not the first ones who came up with a similar idea). I guess their proposal wasn't very popular until the 2000s, compared to more classical `printf`-like rounding methods, but it seems many languages and platforms these days do provide such round-tripping formatting algorithms as the default option. So, I guess nowadays roundtripping isn't that hard, unless people do something sophisticated without really understanding what they're doing.
kccqzy · 46m ago
Interesting! I didn't know about Steele and White's 1990 method. I did, however, remember Burger and Dybvig's method from 1996.
lifthrasiir · 1h ago
I do think the OP was worrying about such people. Now a performant and correctly rounded JSON library is reasonably common, but it was not the case a decade ago (I think).
kccqzy · 45m ago
Most languages in use (such as Python) have solved this problem ages ago. Take any floating point value other than NaN, convert it to string and convert the string back. It will compare exactly equal. Not only that, they are able to produce the shortest string representation.
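JavaScript is one of those languages: `Number#toString` is specified to produce the shortest decimal that round-trips. A quick check over a few awkward values:

```javascript
// JS's Number#toString already produces the shortest round-tripping
// decimal, so parse(print(x)) recovers x exactly for every finite x.
const samples = [0.1, 1 / 3, 2 ** 1023, 5e-324, 1e21];
for (const x of samples) {
  console.log(String(x), Number(String(x)) === x); // true every time
}
```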
jameshart · 37m ago
Maybe 'ridiculously fragile' is the wrong word. Perhaps 'needlessly' fragile would be better.
The point is that it takes application of algorithms that need to be provably correctly implemented on both ends of any JSON serialization/deserialization. And if one implementation can roundtrip its own floating point values, that's great - but JSON is an interop format, so does it roundtrip if you send it to another system and back?
It's just an unnecessary layer of complexity that binary floating point serializers do not have to worry about.
hinkley · 7h ago
JSON encoding is a huge impediment to interprocess communication in NodeJS.
Sooner or later it seems like everyone gets the idea of reducing event loop stalls in their NodeJS code by trying to offload it to another thread, only to discover they’ve tripled the CPU load in the main thread.
I’ve seen people stringify arrays one entry at a time. Sounds like maybe they are doing that internally now.
If anything I would encourage the V8 team to go farther with this. Can you avoid bailing out for subsets of data? What about the CString issue? Does this bring faststr back from the dead?
jcdavis · 4h ago
Based off of my first ever forays into node performance analysis last year, JSON.stringify was one of the biggest impediments to just about everything around performant node services. The fact that everyone uses stringify for dict keys, the fact that apollo/express just serializes the entire response into a string instead of incrementally streaming it back (I think there are some possible workarounds for this, but they seemed very hacky).
As someone who has come from a JVM/go background, I was kinda shocked how amateur hour it felt tbh.
hinkley · 3h ago
> Based off of my first ever forays into node performance analysis last year, JSON.stringify was one of the biggest impediments to just about everything around performant node services
Just so. It is, or at least can be, the plurality of the sequential part of any Amdahl's Law calculation for Nodejs.
I'm curious if any of the 'side effect free' commentary in this post is about moving parts of the JSON calculation off of the event loop. That would certainly be very interesting if true.
However for concurrency reasons I suspect it could never be fully off. The best you could likely do is have multiple threads converting the object while the event loop remains blocked. Not entirely unlike concurrent marking in the JVM.
MehdiHK · 3h ago
> JSON.stringify was one of the biggest impediments to just about everything around performant node services
That's what I experienced too. But I think the deeper problem is Node's cooperative multitasking model. A preemptive multitasking model (like Go's) wouldn't block the whole event loop (other concurrent tasks) while serializing a large response (often the case with GraphQL, but possible with any other API too). Yeah, it does kinda feel like amateur hour.
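The stall is easy to demonstrate. In this sketch, a 0 ms timer cannot fire until a synchronous `JSON.stringify` of a large object finishes, because nothing preempts it:

```javascript
// A synchronous JSON.stringify runs to completion before any queued
// timer or I/O callback gets a turn on the event loop.
const big = Array.from({ length: 100_000 }, (_, i) => ({
  i,
  pad: "x".repeat(20),
}));

let timerFired = false;
setTimeout(() => { timerFired = true; }, 0);

const json = JSON.stringify(big); // the event loop is blocked here
console.log(timerFired);          // false: the 0 ms timer had to wait
console.log(json.length > 0);     // true
```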
dmit · 2h ago
Node is the biggest impediment to performant Node services. The entire value proposition is "What if you could hire people who write code in the most popular programming language in the world?" Well, guess what
teaearlgraycold · 47m ago
I'd say the value prop is you can share code (and with TS, types as well) between your web front end and back end.
hinkley · 1h ago
Nodejs will never be as bad as VB was.
nijave · 1h ago
Same problem in Python. It'd be nice to have good/efficient IPC primitives with higher level APIs on top for common patterns
brundolf · 3h ago
Yeah. I think I've only ever found one situation where offloading work to a worker saved more time than was lost through serializing/deserializing. Doing heavy work often means working with a huge set of data- which means the cost of passing that data via messages scales with the benefits of parallelizing the work.
hinkley · 3h ago
I think the clues are all there in the MDN docs for web workers. Having a worker act as a forward proxy for services; you send it a URL, it decides if it needs to make a network request, it cooks down the response for you and sends you the condensed result.
Most tasks take more memory in the middle than at the beginning and end. And if you're sharing memory between processes that can only communicate by setting bytes, then the memory at the beginning and end represents the communication overhead. The latency.
But this is also why things like p-limit work: they pause an array of arbitrary tasks during the induction phase, before the data expands into a complex state that has to be retained in memory concurrently with all of its peers. By partially linearizing, you put a clamp on peak memory usage that Promise.all(arr.map(...)) does not, not just fix the thundering herd.
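A self-contained sketch of that clamping idea (not p-limit's actual code): only `concurrency` tasks are ever started at once, so only that many are past their cheap queued state at any moment.

```javascript
// Toy concurrency limiter: at most `concurrency` tasks run at a time;
// the rest sit in a queue as un-started thunks, keeping peak memory low.
function limit(concurrency) {
  let active = 0;
  const queue = [];
  const next = () => {
    if (active >= concurrency || queue.length === 0) return;
    active++;
    const { fn, resolve, reject } = queue.shift();
    fn().then(resolve, reject).finally(() => {
      active--;
      next(); // a finished task frees a slot for the next queued one
    });
  };
  return (fn) =>
    new Promise((resolve, reject) => {
      queue.push({ fn, resolve, reject });
      next();
    });
}

// Usage sketch (names hypothetical):
//   const run = limit(4);
//   await Promise.all(items.map((item) => run(() => expand(item))));
```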
dwattttt · 3h ago
Now to just write the processing code in something that compiles to WebAssembly, and you can start copying and sending ArrayBuffers to your workers!
Or I guess you can do it without the WebAssembly step.
ot · 2h ago
The SWAR escaping algorithm [1] is very similar to the one I implemented in Folly JSON a few years ago [2]. The latter works on 8 byte words instead of 4 bytes, and it also returns the position of the first byte that needs escaping, so that the fast path does not add noticeable overhead on escape-heavy strings.
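For the curious, the core of such a SWAR check can be sketched in a few lines. This is the classic bit-twiddling formulation (not V8's or Folly's actual code), testing a 32-bit word of four bytes at once for anything JSON must escape:

```javascript
// Detect, in one word-sized pass, whether any of 4 packed bytes is
// '"' (0x22), '\\' (0x5C), or a control character (< 0x20).
const hasZeroByte = (x) =>
  (((x - 0x01010101) & ~x & 0x80808080) >>> 0) !== 0;
const hasByte = (x, n) => hasZeroByte((x ^ (n * 0x01010101)) >>> 0);
// Standard "hasless" trick; valid for thresholds n <= 0x80.
const hasByteLess = (x, n) =>
  (((x - n * 0x01010101) & ~x & 0x80808080) >>> 0) !== 0;

function wordNeedsEscape(word) {
  return hasByte(word, 0x22) || hasByte(word, 0x5c) || hasByteLess(word, 0x20);
}

console.log(wordNeedsEscape(0x61626364)); // "abcd" -> false, fast path
console.log(wordNeedsEscape(0x61620a64)); // contains \n -> true
```

The real implementations do the same thing on 8-byte words in native code, and (as in the Folly version) extract the index of the first offending byte so the fast path can resume right after it.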
I don't think v8 gets enough praise. It is fucking insane how fast javascript can be these days
andyferris · 2h ago
Yeah, it is quite impressive!
It's a real example of "you can solve just about anything with a billion dollars" though :)
I'd prefer JavaScript kept evolving (think "strict", but "stricter", "stricter still", ...) to a simpler and easier to compile/JIT language.
fngjdflmdflg · 1h ago
I want JS with sound types. It's interesting how sound types can't be added to JS because runtime checks would be too expensive, but then so much of what makes JS slow is having to check types all the time anyway, and the only way to speed it up is to retroactively infer the types. I want types plus a "use typechecked" that tells the VM I already did some agreed upon level of compile-time checks and now it only needs to do true runtime checks that can't be done at compile time.
teaearlgraycold · 46m ago
The most likely path forward on that would be a Typescript AOT compiler, maybe with some limitations on the code you write.
fngjdflmdflg · 38m ago
Compiled to what, wasm?
ayaros · 1h ago
Yes, this is what I want too. Give me "stricter" mode.
shivawu · 1h ago
On the other hand, I consider v8 the most extremely optimized runtime in a weird way, in that there are like 100 people on the planet who understand how it works, while the rest of us are left going "why my JS not fast".
MutedEstate45 · 4h ago
I really like seeing the segmented buffer approach. It's basically the rope data structure trick I used to hand-roll in userland with libraries like fast-json-stringify, now native and way cleaner. Have you run into the bailout conditions much? Any replacer, space, or custom .toJSON() kicks you back to the slow path?
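(`toJSON` has to bail out because it runs arbitrary user code mid-serialization; `Date` is the everyday case where it fires without you writing one.)

```javascript
// Date defines toJSON (returning toISOString()), so stringify invokes
// user-visible code here; per the post, that means no fast path.
const payload = { when: new Date(0), n: 1 };
console.log(JSON.stringify(payload));
// {"when":"1970-01-01T00:00:00.000Z","n":1}
```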
lifthrasiir · 1h ago
As usual, advancement in double-to-string algorithms is driven by JSON (this time Dragonbox).
taeric · 3h ago
I confess that I'm at a bit of a loss to know what sort of side effects would be common when serializing something? Is there an obvious class of reasons for this that I'm just accidentally ignoring right off?
vinkelhake · 3h ago
A simple example is `toJSON`. If an object defines that method, it'll get invoked automatically by JSON.stringify and it could have arbitrary side effects.
I think it's less about side effects being common when serializing, just that their fast path avoids anything that could have side effects (like toJSON).
The article touches briefly on this.
kevingadd · 2h ago
Calling a property getter can have side effects, so if you serialize an object with a getter you have to be very cautious to make sure nothing weird happens underneath you during serialization.
People have exploited this sort of side effect to get bug bounties before via type confusion attacks, iirc.
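A small sketch of the hazard: a getter that mutates the object while `JSON.stringify` is iterating over it, so a property that existed when serialization started is gone by the time it's visited.

```javascript
// stringify snapshots the key list up front, but each property read can
// observably change the object before later keys are visited.
const sneaky = {
  a: 1,
  get b() {
    delete sneaky.c; // side effect fired mid-serialization
    return 2;
  },
  c: 3,
};
const out = JSON.stringify(sneaky);
console.log(out); // "c" has vanished from the output
```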
pyrolistical · 2h ago
> Optimizing the underlying temporary buffer
So array list instead of array?
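Roughly, yes. A toy sketch of the shape of the idea (nothing like V8's actual implementation): append into fixed-size segments and join once at the end, so growing never reallocates and copies what's already written.

```javascript
// Toy segmented buffer: growth adds a new segment instead of copying
// all previously written output into a bigger contiguous buffer.
class SegmentedBuffer {
  constructor(segmentSize = 4096) {
    this.segmentSize = segmentSize;
    this.segments = [];   // sealed segments
    this.current = [];    // pieces of the segment being filled
    this.currentLen = 0;
  }
  append(str) {
    this.current.push(str);
    this.currentLen += str.length;
    if (this.currentLen >= this.segmentSize) {
      this.segments.push(this.current.join(""));
      this.current = [];
      this.currentLen = 0;
    }
  }
  toString() {
    // One final join instead of repeated copy-on-grow.
    return this.segments.join("") + this.current.join("");
  }
}
```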
greatgib · 1h ago
An important question that wasn't addressed is whether the general path gets slower because of the extra check needed to decide whether the fast path can be used.
stephenlf · 50m ago
Yo nice
t1234s · 1h ago
speed is always good!
iouser · 3h ago
Did you run any tests/regressions against the security problems that are common with parsers? Seems like the solution might be at risk of creating CVEs later
rpearl · 42m ago
...Do you think that v8 doesn't have tests for what might be one of the most executed userspace codepaths in the world?
[1] https://github.com/jk-jeon/dragonbox?tab=readme-ov-file#perf...
https://randomascii.wordpress.com/2012/02/11/they-sure-look-...
[1] https://source.chromium.org/chromium/_/chromium/v8/v8/+/5cbc...
[2] https://github.com/facebook/folly/commit/2f0cabfb48b8a8df84f...