This is excellent: an in-depth description showing how the Go internals make writing fast interpreters difficult, by someone who is far more determined than I ever was to make it fast anyway.
I’ve assumed that writing fast interpreters wasn’t a use case the Go team cared much about, but if it makes protobuf parsing faster, maybe it will get some attention, and some of these low-level tricks will no longer be necessary?
I’d really love to see more work bringing the best parts of protobuf to a standardised serialization format like CBOR.
I’d make the same argument for moving gRPC-web to something like WHATWG streams and/or WebTransport.
There are a lot of really cool and important learnings in both, but they’re also so tied up in weird tooling and assumptions. Let’s rebase on IETF and W3C standards.
youngtaff · 32m ago
Would be good to see support for encoding/decoding CBOR exposed as a browser API - they currently use CBOR internally for WebAuthn, so I’d hope it’s not too hard.
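For what it’s worth, CBOR round-tripping is already only a few lines outside the browser. A sketch in Go using the fxamacker/cbor library (my choice of library; the WebAuthn-flavoured struct is hypothetical):

```go
// Sketch: CBOR encode/decode round trip with github.com/fxamacker/cbor/v2.
// The Credential struct is a made-up example, not a real WebAuthn type.
package main

import (
	"fmt"

	"github.com/fxamacker/cbor/v2"
)

type Credential struct {
	ID   []byte `cbor:"id"`
	Type string `cbor:"type"`
}

func main() {
	// Encode a struct to CBOR bytes.
	b, err := cbor.Marshal(Credential{ID: []byte{1, 2, 3}, Type: "public-key"})
	if err != nil {
		panic(err)
	}
	// Decode the bytes back into a struct.
	var c Credential
	if err := cbor.Unmarshal(b, &c); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", c)
}
```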
UncleEntity · 5h ago
> In other words, a UPB parser is actually configuration for an interpreter VM, which executes Protobuf messages as its bytecode.
This is kind of confusing: is the VM crafted at runtime to parse a single protobuf message type, and only that message type? The Second Futamura Projection, I suppose...
Or is the VM designed around generic protobuf messages, so it can parse any random message, as long as it's a protobuf message?
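My best guess at the shape the quoted sentence describes, as a minimal Go sketch: the interpreter loop is generic, and each message type's schema is compiled into a dispatch table that the loop treats as its "program". All names and the toy wire handling here are made up, not upb's actual code:

```go
// Sketch of a table-driven parser VM: the loop is generic, and each
// message type supplies a per-field op table ("bytecode") keyed by tag.
package main

import "fmt"

type op struct {
	fieldNum int
	decode   func(buf []byte) (consumed int, val any)
}

// program is the per-message-type configuration: one op per wire tag.
type program map[byte]op

// run is the generic interpreter: the wire bytes drive dispatch into the table.
func run(p program, buf []byte) {
	for i := 0; i < len(buf); {
		tag := buf[i]
		i++
		o, ok := p[tag]
		if !ok {
			return // unknown field: a real parser would skip it; the sketch stops
		}
		n, v := o.decode(buf[i:])
		i += n
		fmt.Printf("field %d = %v\n", o.fieldNum, v)
	}
}

func main() {
	// Hypothetical schema: field 1 is a single-byte varint.
	// 0x08 is the protobuf tag for field 1, wire type 0.
	prog := program{
		0x08: {fieldNum: 1, decode: func(b []byte) (int, any) { return 1, int(b[0]) }},
	}
	run(prog, []byte{0x08, 0x2A}) // prints: field 1 = 42
}
```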
I've been working on the design of a similar system, but for general binary parsing (think bison/yacc for binary data), and I hadn't even considered data over a specialized VM vs. bytecode+data over a general VM. Honestly, since my design is built around 'maximum laziness' (it just parses/verifies and creates metadata over the input, so you only pay for decoding the bytes you actually use) and I/O overhead is way greater than VM dispatch, trying this out is probably one of those "premature optimization is the root of all evil" cases, but it's intriguing nonetheless.
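For concreteness, here's roughly the lazy shape I mean, as a Go sketch over a toy (id, len, bytes...) record format; every name in it is made up:

```go
// Sketch of "maximum laziness": one pass verifies the input and records
// where each field lives; decoding happens only when a field is accessed.
package main

import "fmt"

type span struct{ off, len int }

// index maps a (hypothetical) field id to its location in the raw input.
type index map[int]span

// scan is the parse/verify pass over a toy format of (id, len, bytes...) records.
func scan(buf []byte) (index, error) {
	idx := index{}
	for i := 0; i < len(buf); {
		if i+2 > len(buf) {
			return nil, fmt.Errorf("truncated header at %d", i)
		}
		id, n := int(buf[i]), int(buf[i+1])
		i += 2
		if i+n > len(buf) {
			return nil, fmt.Errorf("truncated field %d", id)
		}
		idx[id] = span{off: i, len: n}
		i += n
	}
	return idx, nil
}

// get is the lazy part: you pay for field id's bytes only when you ask for them.
func get(buf []byte, idx index, id int) ([]byte, bool) {
	s, ok := idx[id]
	if !ok {
		return nil, false
	}
	return buf[s.off : s.off+s.len], true
}

func main() {
	raw := []byte{1, 5, 'h', 'e', 'l', 'l', 'o', 2, 3, 'f', 'o', 'o'}
	idx, err := scan(raw)
	if err != nil {
		panic(err)
	}
	if v, ok := get(raw, idx, 2); ok {
		fmt.Printf("field 2 = %q\n", v) // only field 2's bytes are touched
	}
}
```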