It seems like the author has never worked on an interface that uses realtime UI syncing. The points about cumbersome and mundane issues when working with REST UIs are not wrong, but the challenges of working with realtime synced UIs are substantially more difficult. And I'm not talking about conflict resolution, which can be abstracted by many existing solutions. I'm talking about the intricacies of UI behaviors that are specific to each application, when suddenly every interactive element can potentially be interacted with by multiple parties simultaneously.
For most applications, the status quo whereby a little bit of recent state is maintained locally, and periodically pushed to a centralized store, seems to provide the best overall balance of complexity and functionality. One exception is that most applications do a poor job of actually maintaining the local state (e.g. most forms clear when you refresh the page, which is not hard to fix).
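The "forms clear on refresh" complaint is cheap to fix: mirror recent local state into persistent storage and restore it on load. A minimal sketch (an in-memory stub stands in for `localStorage` so it runs anywhere; the form key is made up):

```javascript
// Minimal sketch of the "don't lose form state on refresh" fix:
// mirror recent local state into persistent storage, restore on load.
const storage =
  typeof localStorage !== "undefined"
    ? localStorage
    : (() => {
        // in-memory stand-in for environments without localStorage
        const m = new Map();
        return {
          getItem: (k) => (m.has(k) ? m.get(k) : null),
          setItem: (k, v) => m.set(k, String(v)),
        };
      })();

const KEY = "draft:contact-form"; // hypothetical form identifier

function saveDraft(fields) {
  storage.setItem(KEY, JSON.stringify(fields));
}

function restoreDraft() {
  const raw = storage.getItem(KEY);
  return raw ? JSON.parse(raw) : {};
}

// On every input event you'd call saveDraft(...); on page load,
// restoreDraft() repopulates the fields before the user notices.
saveDraft({ name: "Ada", message: "hello" });
console.log(restoreDraft().name); // "Ada"
```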
jemmyw · 7h ago
While I superficially agree with the point, the article author states that they haven't tried the alternatives they suggest. Implementing CRDTs or OT can be a complex endeavour, especially if you're retrofitting them onto an existing system that already has a REST or similar API. Use case is ever important: don't spend your all-important time implementing complex state sync if it turns out your customers don't do the kind of collaboration that requires it.
qudat · 6h ago
State synchronization is a bit of a hot topic for front end development since most of the “middle end” deals squarely with solving that problem.
I also agree that the way many APIs are built pushes complexity to the FE: https://bower.sh/dogma-of-restful-api
However, virtually every company I’ve joined has needed to serve multiple clients. In those cases REST is the status quo. Some people have opted for graphql but even that makes some uneasy because tuning gql is non trivial. CRDTs are completely foreign to most organizations unless they deal specifically with multiplayer modes.
So while it sounds nice to ditch REST it’s just not realistic for most orgs. This is why I’ve been developing a side effect and state sync system that can easily interface with REST, websockets, graphql, etc because it’s built off of structured concurrency.
https://starfx.bower.sh/
https://bower.sh/why-structured-concurrency
What part of this implementation is ReST? I see a CRUD API that uses HTTP verbs. State transfer implies multiple steps at least, where I call the initial API which then returns several other API endpoints "transferring the state", typically in a HATEOAS style - perhaps that part isn't required.
https://en.wikipedia.org/wiki/REST

> ... although this term is more commonly associated with the design of HTTP-based APIs and what are widely considered best practices regarding the "verbs" (HTTP methods) a resource responds to, while having little to do with REST as originally formulated ...
Ah, I see, the industry has taken to just calling HTTP ReST for no apparent reason.
As far as not being sure about CRDTs, these protocols were made to overcome the obvious and terrible shortcomings of CRUD APIs (lack of commutativity and idempotency). Who ever wants to see "the data on the server has changed, do you want to overwrite the data on the server or overwrite your local copy"? If you're not doing some sort of event sourcing for state management (ES, OT, CRDT) you're probably doing it wrong, or doing it for a simple project.
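Those two properties can be made concrete with a toy last-writer-wins register (a sketch, not any particular library's API): its merge is commutative and idempotent, so replicas can exchange states in any order, any number of times, and still converge.

```javascript
// Toy last-writer-wins (LWW) register: a minimal state-based CRDT.
// Each write is stamped with a (timestamp, nodeId) pair; merge keeps
// the entry with the highest stamp, breaking ties by nodeId.
function write(value, timestamp, nodeId) {
  return { value, timestamp, nodeId };
}

function merge(a, b) {
  if (a.timestamp !== b.timestamp) {
    return a.timestamp > b.timestamp ? a : b;
  }
  return a.nodeId > b.nodeId ? a : b; // deterministic tie-break
}

// Two replicas write concurrently...
const fromAlice = write("draft v2", 1002, "alice");
const fromBob = write("draft v3", 1005, "bob");

// ...and converge regardless of delivery order or duplication:
const ab = merge(fromAlice, fromBob); // commutative:
const ba = merge(fromBob, fromAlice); //   merge(a, b) === merge(b, a)
const again = merge(ab, fromBob);     // idempotent: re-delivery is harmless

console.log(ab.value, ab === ba, again === ab); // "draft v3" true true
```

Because merges commute and absorb duplicates, there is no "overwrite theirs or mine?" dialog: both sides apply both writes and end up identical.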
WhitneyLand · 6h ago
Not sure why you think that’s not rest. Because Roy Fielding wrote a paper that most people don’t care about?
Language isn’t controlled by a single authority or the first person to use a term. Words mean what people use them to mean. You can complain that it’s not the ‘proper’ definition, but if the majority of people use it a certain way, that becomes the definition in practice.
edit: Fielding not Crawford (thx ackfoobar)
ackfoobar · 5h ago
Similarly Java falls outside of Alan Kay's definition of OOP. But unlike Roy Fielding's ReST, I don't see many people bringing that up.
tptacek · 2h ago
I don't understand this objection. Whether or not the APIs the post describes are according-to-Royle REST, is it your claim that REST is in fact a good approach to state sync? Because that's what the post is really about.
juliusdavies · 1h ago
Damn that’s a fine coinage. According-to-Royle.
827a · 3h ago
It's more fun at parties to just choose to believe that many of these APIs companies deploy are REST APIs, where the word "REST" is just a word, doesn't stand for anything, and is entirely unrelated to whatever that Roy guy was talking about all those years ago.
0xbadcafebee · 2h ago
The problem is the frontend web application model is not great design. People are designing distributed systems when all they really want is to provide data to the user, or provide a user interface (which provides data from the user back to the system). You can do both those things without being stateful on the frontend.
There are multiple network protocols for synchronizing state, and they are annoying as hell. Necessary in some very specific/limited situations, but not for a broadband mobile or web application. Even when we were on 33.6k modems we didn't need it.
But then again if we were talking about native apps, we'd still be using HTTP APIs to transfer state data, because developers can't be arsed to do anything they're not used to. (Unless you show them something more technically complex, yet easier to use/think about; shit's catnip for coders)
cgio · 6h ago
REST as the author finds out is not for state synchronisation. CRDTs are also not about state synchronisation, only so in an eventual sense, so similar to REST, you can make it work, but you are twisting the approach with assumptions and sticky tape. State synchronisation is a “solved” problem, in the form of distributed data stores, or in some more relaxed sense, database replication, CDC etc. Not an expert, so this is just an opinion even if it sounds assertive.
No comments yet
sansseriff · 7h ago
Curious to hear other people's opinions on this. I arrived at a similar conclusion after making web apps for research control software in my PhD. I went through a Yjs tutorial and looked into integrating it with fastapi websockets. But this seems like a pretty unusual thing; there just aren't enough people doing this.
Nice user-friendly libraries and tutorials don't exist for smoothing the transition from REST to CRDTs, should your app need that.
coolhand2120 · 6h ago
If you're hesitating to use YJS, take the word of someone who took the plunge: it's totally worth it on the other side.
I've written several apps with it now. Very easy to use and quite robust to failures. It's a bit of a mental load to take on at first, but it's totally worth it for the problems it solves out of the box. I've tried other things too, from rolling my own ES stack to OT and more.
Lately I've got it running on AWS API Gateway V2 over websockets to lambdas + DynamoDB with a small army of daily users. The only expensive part is the event audit logs I keep due to my inherent mistrust of all computers.
techno-beetle · 3h ago
A few questions regarding CRDT usage, from someone who lightly tested out automerge/autosurgeon in Rust late last year but hasn't used it or any other CRDT for an actual project.
1. Do you use the CRDT document as the source of truth or as just synchronization with a database as the source of truth. If the document is the source of truth, do you keep data in it or copy the data into some other format that's easier to query?
2. How do you handle changes to the schema of the CRDT documents?
In my testing I had a `version` field at the top level of the documents, and a function to migrate forward between versions when a document is loaded, but I'm not sure how to handle the case where different clients are running different versions concurrently, as opposed to all clients updating at the same time.
I had read some articles that alluded to allowing the previous versions to still change the state and then, seemingly, translate it as needed on newer versions, but they seemed to hand-wave away any details of what that would actually look like to implement.
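For what it's worth, the "version field plus forward migrations on load" scheme described in question 2 usually looks something like this (versions and field names here are made up for illustration):

```javascript
// Sketch of forward migration on load: a document carries a top-level
// `version`, and each step upgrades it by exactly one version until it
// reaches the current one.
const CURRENT_VERSION = 3;

const migrations = {
  // v1 -> v2: `title` was renamed to `name`
  1: (doc) => ({ version: 2, name: doc.title, items: doc.items }),
  // v2 -> v3: items became objects instead of bare strings
  2: (doc) => ({
    version: 3,
    name: doc.name,
    items: doc.items.map((text) => ({ text, done: false })),
  }),
};

function migrate(doc) {
  let d = doc;
  while (d.version < CURRENT_VERSION) {
    const step = migrations[d.version];
    if (!step) throw new Error(`no migration from v${d.version}`);
    d = step(d);
  }
  return d;
}

const old = { version: 1, title: "groceries", items: ["milk"] };
console.log(migrate(old));
// { version: 3, name: "groceries", items: [{ text: "milk", done: false }] }
```

Note this only covers the everyone-upgrades case; the mixed-version concurrent-clients case the question asks about is exactly what this scheme does not address.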
3. How granular do you go with documents in the spectrum of "one per user" to "one per object"?
coolhand2120 · 2h ago
> 1. Do you use the CRDT document as the source of truth or as just synchronization with a database as the source of truth. If the document is the source of truth, do you keep data in it or copy the data into some other format that's easier to query?
The DB is the source of truth (DynamoDB, spanned binary chunking) and holds the YJS binary document, which is also the source of truth, so I guess the doc? I keep a number of copies of this document in different states because of my aforementioned distrust of computers: 1) each edit event is recorded as S3 JSON, 2) the per-edit binary document is kept in DynamoDB, 3) a per-edit serialized JSON of the document content goes in S3. This trail of breadcrumbs keeps my anxiety down and has in the past helped recreate documents when "something bad" happened. Something bad was always _my_ poor implementation of YJS causing the document to either grow too big or start throwing warnings - both of which are catastrophic IMO if you're maintaining documents for 3rd parties.

The document is kept in sync with a vector state exchange lambda that loads the "latest" document from Dynamo, compares the client vector state to the server's "latest" in the DB, and responds with a delta. All of this is binary, which is a bit unnerving. YJS provides ways to dip into that data stream, but it's equally unnerving to unbox the complex CRDT schema in the guts of the lib.

When writing data I use DynamoDB's conditional writes, where conflicting writes (based on a version number) go into a retry mode that at worst costs a few extra ms. This ensures that "latest" never loses an event under concurrent overwrites. I rely on this and the CRDT's commutativity to make sure no writes are ever lost. Since the lib is "local first" all this interaction is transparent to the user.
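The conditional-write-plus-retry pattern described here is ordinary optimistic concurrency; stripped of AWS specifics it can be sketched like this (an in-memory Map stands in for DynamoDB, and this is not the AWS SDK API):

```javascript
// Optimistic concurrency sketch: a write only succeeds if the stored
// version still matches the version the writer read. On conflict,
// re-read, re-apply, retry.
const table = new Map(); // key -> { version, updates: [...] }

function conditionalPut(key, expectedVersion, item) {
  const current = table.get(key);
  const currentVersion = current ? current.version : 0;
  if (currentVersion !== expectedVersion) {
    return false; // condition failed: someone wrote in between
  }
  table.set(key, item);
  return true;
}

function appendUpdate(key, update) {
  for (;;) {
    const current = table.get(key) || { version: 0, updates: [] };
    const ok = conditionalPut(key, current.version, {
      version: current.version + 1,
      updates: [...current.updates, update],
    });
    if (ok) return;
    // else: lost the race; the loop re-reads the newer version and retries
  }
}

appendUpdate("doc-1", "edit-a");
appendUpdate("doc-1", "edit-b");
console.log(table.get("doc-1")); // { version: 2, updates: ["edit-a", "edit-b"] }
```

The retry never drops an update, it only re-bases it on the latest version, which is why "latest" can't lose events under concurrent writers.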
> 2. How do you handle changes to the schema of the CRDT documents?
I defined a new hypermedia specification, basically HTML as JSON: {id, parentId, type, props, events, acl, childIds}. I make a flat map and build a hierarchy for my editor, then flatten it back out to save it, saving the child IDs to maintain order when rebuilding the linked list from the ymap object (hashmap).
This makes it so the core schema never changes, only the type (aka tag) changes and the props/events def from that type. This all is defined in a swagger doc which allows for this type of schema definition. This is reused at runtime for schema validation.
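The flat-map-to-hierarchy round trip described above can be sketched as follows (node shape taken from the comment; the concrete types and ids are made up):

```javascript
// Sketch of the flat-map <-> hierarchy round trip. Nodes are stored
// flat (as in a Y.Map keyed by id); childIds keeps sibling order
// explicit so the tree rebuilds deterministically.
function buildTree(flat, rootId) {
  const node = flat[rootId];
  return {
    ...node,
    children: node.childIds.map((id) => buildTree(flat, id)),
  };
}

function flatten(tree, out = {}) {
  const { children, ...node } = tree;
  out[node.id] = node; // node already carries childIds for ordering
  children.forEach((child) => flatten(child, out));
  return out;
}

const flat = {
  root: { id: "root", parentId: null, type: "Page", childIds: ["a", "b"] },
  a: { id: "a", parentId: "root", type: "TextV1", childIds: [] },
  b: { id: "b", parentId: "root", type: "TextV2", childIds: [] },
};

const tree = buildTree(flat, "root");
console.log(tree.children.map((c) => c.id)); // ["a", "b"]
```

Because order lives in childIds rather than in map iteration order, flattening and rebuilding is lossless.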
To introduce changes I introduce new types (aka tags), so if I had "type: ThingV1" now I have "type: ThingV2" with a new contract. This also helps with the downstream artifacts from the programs: the devices that get the schema can ignore types they don't implement and use the ones they do, and we can put them both on the same response, whose core endpoint should always be v1 because the core schema never changes (thanks W3C for the idea).
> 3. How granular do you go with documents in the spectrum of "one per user" to "one per object"?
It depends on the requirements of the project, but in all cases the documentId is the partition key for DynamoDB, allowing proper scaling. The content of the document is many per user, sometimes hundreds of documents per user. There is absolutely no query surface for these documents; you can only look them up by direct ID. There is another system that keeps track of the document directory, and I would add a query surface of some sort there via event projection if I needed to query my documents - which I fortunately do not.
So far so good! The only thing that is pricey is the breadcrumbs, which I'll start tuning to store less as I'm probably copying the same data 5x or more.
4b11b4 · 5h ago
Electric SQL and their various packages are looking pretty good. Can't comment on actually using them though
yakshaving_jgt · 7h ago
The article told us we should stop using REST for state sync, yet it didn’t really tell us what to do instead.
toomim · 3h ago
I work on the referenced Braid project (https://braid.org), and the author is right on— HTTP needs to evolve from a state transfer protocol to a state synchronization protocol.
What he doesn't seem to realize is (a) how doable this is, and (b) how beneficial it is once you solve it.
First, the Braid extensions to HTTP [1] achieve this already! They transform HTTP from state transfer to state synchronization with four simple, backwards-compatible extensions to HTTP, which work in today's browsers and servers, with a simple polyfill library [2].
You can use this today, with or without any CRDT or OT algorithms. In fact, 99% of apps don't need a fancy CRDT — they just need the subscriptions feature. This is really cool: instead of GET responding just once, a Braid-enhanced GET will keep the response open, and send you new updates each time the resource changes, until you're done with it.
This gives you realtime synchronized HTTP without a Websocket. No CRDT or OT needed.
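The consumption side of that subscription model has a simple shape, independent of the actual Braid wire format (assumed here, purely for illustration, to be newline-delimited JSON patches): the client folds each update arriving on the still-open response into its local copy.

```javascript
// Generic sketch of subscribing to a resource: instead of one response
// body, the client receives a stream of updates and folds each one
// into local state. This is NOT the Braid wire format, just the shape
// of the consumption loop.
function applyUpdate(state, update) {
  // Each update here replaces top-level fields; real protocols carry
  // richer patches (ranges, versions), but the fold looks the same.
  return { ...state, ...update };
}

function consume(chunks) {
  let state = {};
  for (const line of chunks.join("").split("\n").filter(Boolean)) {
    state = applyUpdate(state, JSON.parse(line));
  }
  return state;
}

// Three updates trickling in over a kept-open response:
const stream = [
  '{"title":"Draft"}\n',
  '{"status":"review"}\n{"title":"Final"}\n',
];
console.log(consume(stream)); // { title: "Final", status: "review" }
```

The point of the single-writer case is that this fold needs no merge logic at all: updates arrive in server order, so applying them in sequence is already consistent.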
You only need a CRDT or OT algorithm if multiple writers mutate the same state and your app needs to guarantee perfect consistency. And if this is needed—no problem! We've got libraries for that, which use the other 3 independent Braid extensions—essentially adding a few more headers to carry over the CRDT/OT metadata. They work with any CRDT or OT.
Second, I don't think anyone yet realizes how big of a deal this is, so let me explain. The lack of realtime synchronization in today's HTTP is what drives developers to move their state synchronization traffic to a websocket, with some random pub/sub protocol that meets the needs of their app — but is totally incompatible with other developers, other websites, and other tools. This means that every website uses standard HTTP to serve its pages, but has a non-standard protocol for its internal state. As a result, websites openly link to each other's pages, on the open web — but the internal state of websites has become a walled garden. This basic limitation in the architecture of HTTP has led to the centralization of data silos in websites. It's simply too difficult for a peer-to-peer community of web developers to build apps that re-use each others' internal state, in the same way that they can link to each other's pages, because they lack the infrastructure.
We've been building the tools for this new style of peer-to-peer synchronous web of state. I just presented on it a few weeks back here: https://braid.org/meeting-107
Cheers to the author for correctly identifying this problem. And onward to the glorious future of the synchronous web of state!
[1] https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-b...
[2] https://github.com/braid-org/braid-http?tab=readme-ov-file
Mailing list: https://groups.google.com/forum/#!forum/braid-http
Discord: https://discord.gg/nvPQN7FgDX
My email: toomim@gmail.com
And you can show up to our open biweekly meetings on zoom: https://braid.org
healthydyd · 3h ago
This looks pretty amazing. I am definitely going to try it out.
3cats-in-a-coat · 4h ago
"State transfer is not state synchronization" is such an obscure point to make.
State sync over REST can be rather trivial by keeping resources immutable (hence also cacheable), and delivering a root resource during the initial endpoint contact. I.e. like an index page linking to static assets. Which is all very familiar to REST design.
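The pattern being described is: only the root resource is mutable, everything it links to is content-addressed and cacheable forever, so a sync is just "re-fetch the root, pull the entries whose address changed." A sketch (the URL scheme is made up):

```javascript
// Immutable-resources sync sketch: the mutable root manifest maps
// names to content-addressed URLs. Diffing two manifests tells the
// client exactly which resources to fetch; everything else is a
// cache hit because immutable URLs never change meaning.
function resourcesToFetch(oldRoot, newRoot) {
  return Object.entries(newRoot)
    .filter(([name, url]) => oldRoot[name] !== url)
    .map(([, url]) => url);
}

const rootV1 = {
  profile: "/r/profile@sha-aaa",
  settings: "/r/settings@sha-bbb",
};
const rootV2 = {
  profile: "/r/profile@sha-ccc", // changed -> must fetch
  settings: "/r/settings@sha-bbb", // unchanged -> cache hit
};

console.log(resourcesToFetch(rootV1, rootV2)); // ["/r/profile@sha-ccc"]
```

This is essentially how an index page linking to fingerprinted static assets already works, which is why it feels familiar to REST design.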
AtlasBarfed · 7h ago
This is right smack in the territory of the CAP theorem, which a lot of distributed databases using highly reliable networks can now somewhat functionally ignore.
But in the land of highly variable client reliability, you've got to think about whether you are CP or AP. You HAVE to be partition tolerant because clients are so unreliable.
gxs · 6h ago
Maybe if the headline said "Stop using REST for state synchronization in these scenarios", I would have been on board
Dismissing something outright is usually the wrong answer, and for the same reason I did read the article
Using REST for synchronization isn’t some huge transgression and is fine in a lot of scenarios, silly in others.
Keeping record updates that a user makes in one system in sync with a single record in another system that uses those updates is fine
Trying to sync giant databases in bulk on the other hand might not be