Parsing Protobuf Like Never Before

27 ibobev 5 7/17/2025, 10:09:48 AM mcyoung.xyz ↗

Comments (5)

skybrian · 4h ago
This is excellent: an in-depth description showing how the Go internals make writing fast interpreters difficult, by someone who is far more determined than I ever was to make it fast anyway.

I’ve assumed that writing fast interpreters wasn’t a use case the Go team cared much about, but if it makes protobuf parsing faster, maybe it will get some attention, and some of these low-level tricks will no longer be necessary?

irq-1 · 5h ago
mdhb · 5h ago
I’d really love to see more work bringing the best parts of protobuf to a standardised serialization format like CBOR.

I’d make the same argument for moving gRPC-web to something like WHATWG streams and/or WebTransport.

There are a lot of really cool and important learnings in both, but they’re also tied up in weird tooling and assumptions. Let’s rebase on IETF and W3C standards.

youngtaff · 32m ago
Would be good to see support for encoding / decoding CBOR exposed as a browser API - they currently use CBOR internally for WebAuthn so I’d hope it’s not too hard

UncleEntity · 5h ago
> In other words, a UPB parser is actually configuration for an interpreter VM, which executes Protobuf messages as its bytecode.

This is kind of confusing: is the VM crafted at runtime to parse a single protobuf message type and only that message type? The Second Futamura Projection, I suppose...

Or the VM is designed specifically around generic protobuf messages and it can parse any random message but only if it's a protobuf message?
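
One way to picture the quoted sentence is a single generic loop whose only per-type input is a lookup table, with the wire bytes themselves choosing which branch runs. Below is a minimal Go sketch of that reading. It is not UPB's or the article's actual code: `MessageTable`, `FieldEntry`, `FieldKind`, and `Parse` are hypothetical names, and only wire types 0 (varint) and 2 (length-delimited) are handled.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// FieldKind is a hypothetical tag saying how a field's payload is stored.
type FieldKind int

const (
	KindVarint FieldKind = iota // expects wire type 0
	KindBytes                   // expects wire type 2
)

// FieldEntry describes one known field of a message type.
type FieldEntry struct {
	Name string
	Kind FieldKind
}

// MessageTable is the per-type "configuration": field number -> entry.
// Supporting a new message type means building a new table, not new code.
type MessageTable map[uint64]FieldEntry

var errTruncated = errors.New("truncated input")

// Parse is the one generic interpreter loop. Each tag in buf acts like an
// opcode: it selects the branch to run, and the table says where the value goes.
func Parse(table MessageTable, buf []byte) (map[string]any, error) {
	out := make(map[string]any)
	for len(buf) > 0 {
		tag, n := binary.Uvarint(buf)
		if n <= 0 {
			return nil, errTruncated
		}
		buf = buf[n:]
		num, wire := tag>>3, tag&7

		switch wire {
		case 0: // varint payload
			v, n := binary.Uvarint(buf)
			if n <= 0 {
				return nil, errTruncated
			}
			buf = buf[n:]
			if f, ok := table[num]; ok && f.Kind == KindVarint {
				out[f.Name] = v
			}
		case 2: // length-delimited payload
			l, n := binary.Uvarint(buf)
			if n <= 0 || uint64(len(buf)-n) < l {
				return nil, errTruncated
			}
			payload := buf[n : n+int(l)]
			buf = buf[n+int(l):]
			if f, ok := table[num]; ok && f.Kind == KindBytes {
				out[f.Name] = string(payload)
			}
		default:
			return nil, fmt.Errorf("unsupported wire type %d", wire)
		}
	}
	return out, nil
}
```

Under this sketch, calling `Parse` with a table of `{1: id (varint), 2: name (bytes)}` on the bytes `08 96 01 12 02 68 69` would fill in `id` and `name`. The same `Parse` function serves any message type as long as it is handed that type's table, and fields missing from the table are skipped.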

I've been working on the design of a similar system, but for general binary parsing (think bison/yacc for binary data), and hadn't even considered doing data over a specialized VM vs. bytecode+data over a general VM. Honestly, since it's designed around 'maximum laziness' (it just parses/verifies and creates metadata over the input, so you only pay for decoding bytes you actually use) and I/O overhead is way greater than the VM dispatch cost, trying this out is probably one of those "premature optimization is the root of all evil" cases, but it's intriguing nonetheless.
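
The "maximum laziness" idea can be sketched in Go roughly as follows: one pass checks the framing and records where each payload sits, and values are decoded only when someone asks for them. This is an illustration under assumptions, not the commenter's design. It borrows protobuf-style tag/length framing as a stand-in for a general binary format, keeps only the last occurrence of each field number, and `Span`, `Index`, `Uint`, and `Bytes` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"errors"
)

var errBadInput = errors.New("malformed or truncated input")

// Span is metadata over the input: where a payload lives, not its value.
type Span struct {
	Wire       uint64 // wire type seen in the tag
	Start, End int    // payload occupies buf[Start:End]
}

// Index walks buf once, verifying framing and recording one Span per
// field number. No payload is decoded or copied here.
func Index(buf []byte) (map[uint64]Span, error) {
	idx := make(map[uint64]Span)
	pos := 0
	for pos < len(buf) {
		tag, n := binary.Uvarint(buf[pos:])
		if n <= 0 {
			return nil, errBadInput
		}
		pos += n
		num, wire := tag>>3, tag&7
		start := pos

		switch wire {
		case 0: // varint: measure its length, discard the value
			_, n := binary.Uvarint(buf[pos:])
			if n <= 0 {
				return nil, errBadInput
			}
			pos += n
		case 2: // length-delimited: skip over the payload
			l, n := binary.Uvarint(buf[pos:])
			if n <= 0 || l > uint64(len(buf)-pos-n) {
				return nil, errBadInput
			}
			start = pos + n
			pos = start + int(l)
		default:
			return nil, errBadInput
		}
		idx[num] = Span{Wire: wire, Start: start, End: pos}
	}
	return idx, nil
}

// Uint decodes a varint field on demand.
func Uint(buf []byte, s Span) (uint64, bool) {
	if s.Wire != 0 {
		return 0, false
	}
	v, n := binary.Uvarint(buf[s.Start:s.End])
	return v, n > 0
}

// Bytes returns a view of a length-delimited field without copying.
func Bytes(buf []byte, s Span) ([]byte, bool) {
	if s.Wire != 2 {
		return nil, false
	}
	return buf[s.Start:s.End], true
}
```

The trade-off in this sketch is that `Index` still touches every tag, so the saving comes only from skipping payload decoding and allocation for fields nobody reads.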