Context Engineering – can't call it engineering if we can't predict it breaking [video] (youtube.com)

Hi Hacker News! We're Jan & Wilco from ReJot (https://rejot.dev). With ReJot we're building a framework that turns the write-ahead log (WAL) of your database into an asynchronous communication channel for your services. ReJot enables application developers to define how the database tables they own should be replicated to other databases. It's something we wish we'd had at our previous job at a large fintech.

There is a gap between building internal (REST) APIs and adopting event streaming (Kafka) to share data between services.

Internal APIs start to break down when you have more than a couple services communicating. Their synchronous nature makes them brittle in a distributed system: failures cascade and latency adds up. Companies operating internal APIs at scale often face challenges like managing implicit schemas and versioning. They also need to write significant amounts of code to implement features like circuit breakers and internal load balancing.

Event streaming addresses these issues by using asynchronous communication, but it also introduces significant drawbacks. Kafka is known for its operational complexity and high cost. Engineers must manage outbox tables, outbox processors, and consumers, which makes the system more difficult to understand and maintain.

ReJot is the middle-ground solution: it re-uses a database system's write-ahead log as an asynchronous communication channel. The WAL is well suited to double as an outbox, as CDC systems like Debezium have shown. ReJot is a lightweight addition to existing infrastructure, and even re-uses existing (relational) database systems to store messages temporarily before sending them to the destination (sink) databases.

We're developer-focused, as opposed to infrastructure-focused. Much like how developers define the database table schemas they use, we enable developers to say how their data should be published to others in the distributed system. This is done through something we call "Public Schemas", which consist of a schema and a (SQL) query. When an item in the underlying table changes, the query is executed to produce an object conforming to the schema. This data is then forwarded through ReJot, ready to be consumed by a different service using a "Consumer Schema". This is again a simple (SQL) query, one that contains an INSERT statement. All of this is defined from within the codebase of the application, much like how ORMs or query builders work.
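The flow described above can be sketched roughly as follows. This is not ReJot's actual API: the `PublicSchema` and `ConsumerSchema` classes, their fields, and the `publish`/`consume` helpers are hypothetical names made up for illustration, and SQLite stands in for the source and destination databases so the example is runnable.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class PublicSchema:
    # Hypothetical shape: a named, versioned schema plus the query that fills it.
    name: str
    version: int
    fields: tuple  # names of the published fields, in SELECT order
    query: str     # SELECT against the source DB, parameterized by primary key


@dataclass(frozen=True)
class ConsumerSchema:
    # Hypothetical shape: which public schema to read, and how to store it.
    public_schema: str
    query: str     # INSERT against the destination DB, using named parameters


def publish(source, schema, primary_key):
    # Re-run the public query for the changed row and shape it into an object.
    row = source.execute(schema.query, (primary_key,)).fetchone()
    return None if row is None else dict(zip(schema.fields, row))


def consume(dest, schema, obj):
    # Apply the consumer's INSERT with the published object as parameters.
    with dest:
        dest.execute(schema.query, obj)


# Only id and email are published; other columns of the source table stay private.
ACCOUNT_PUBLIC = PublicSchema(
    name="public-account",
    version=1,
    fields=("id", "email"),
    query="SELECT id, email FROM accounts WHERE id = ?",
)

ACCOUNT_CONSUMER = ConsumerSchema(
    public_schema="public-account",
    query=(
        "INSERT INTO account_copy (id, email) VALUES (:id, :email) "
        "ON CONFLICT (id) DO UPDATE SET email = excluded.email"
    ),
)
```

In this sketch the public query is what encapsulates the source table: the owning team can rename or add internal columns freely as long as the query keeps producing the same object. The upsert in the consumer query is a choice made here so that repeated changes to the same row stay idempotent.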

In short, ReJot re-uses your database in two ways: by consuming the WAL, and by using queries to encapsulate and integrate data. This makes ReJot a good middle ground between the brittleness of synchronous communication and the complexity of event streaming.

Excited to hear what you think!

Comments (4)

raoulritter · 53d ago

I'm thinking that with all these agent-to-agent frameworks, this could potentially work for that. If you send off one agent, you want it to stay up to date and sync / talk with the others. Could your solution work for something like A2A by Google, or similar, to improve synchronization across the different agents doing their tasks and prevent them from landing in a loop?

WilcoKruijer · 53d ago

I'm not too familiar with how people store the state of AI agents, but I do think there's some opportunity to use ReJot for this use case. Hooking up an agent to ReJot and giving it access to all available Public Schemas could be an interesting way of letting an agent explore and use the data in a distributed system.

jasonthorsness · 53d ago

If the consumers stall, doesn't the WAL have to grow in unbounded fashion? Does it place any backpressure on the writers?

WilcoKruijer · 53d ago

You're right. Since we don't want to put too much pressure on the source database, we do save the (transformed) WAL items in an intermediary database (we call this the event store), so the source can clear its WAL.

This does mean the intermediary database can grow in an unbounded fashion. The use case really determines if this is fine or not. Since our focus right now is on (micro)service communication, we think this is fine in most cases, as the throughput usually is not gigantic.

Since the event store is just a Postgres database, it's easy to set up partitions to only retain data for a certain amount of time. On the near-term roadmap we also have backfill support, which will make it easier to work with shorter retention windows.
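As a rough illustration of time-based retention with Postgres partitions, the sketch below generates the DDL a daily maintenance job might run. It assumes a hypothetical parent table (here called `events`) that was created with `PARTITION BY RANGE` on a timestamp column; the table name, the daily granularity, and the helper functions are all assumptions, not the actual event store layout.

```python
from datetime import date, timedelta


def partition_name(table, day):
    # e.g. events_2025_01_10 (naming convention chosen for this sketch)
    return f"{table}_{day:%Y_%m_%d}"


def create_partition_sql(table, day):
    # One partition per day, covering [day, day + 1).
    next_day = day + timedelta(days=1)
    return (
        f"CREATE TABLE IF NOT EXISTS {partition_name(table, day)} "
        f"PARTITION OF {table} "
        f"FOR VALUES FROM ('{day.isoformat()}') TO ('{next_day.isoformat()}');"
    )


def drop_partition_sql(table, day):
    # Dropping a whole partition is cheap compared to DELETE on a large table.
    return f"DROP TABLE IF EXISTS {partition_name(table, day)};"


def retention_statements(table, today, retention_days, lookahead_days=1):
    # Make sure today's and upcoming partitions exist, then drop the
    # partition that has just aged out of the retention window.
    statements = [
        create_partition_sql(table, today + timedelta(days=offset))
        for offset in range(lookahead_days + 1)
    ]
    expired = today - timedelta(days=retention_days)
    statements.append(drop_partition_sql(table, expired))
    return statements
```

Run once a day, this keeps the most recent `retention_days` daily partitions. A job that can miss runs would need to drop every partition older than the cutoff rather than a single one.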