My experience using Apache Pulsar to solve PostgreSQL multi-tenant pain
Now that I've been using Pulsar for quite some time, I feel I can share some notes on my experience replacing Postgres-based streaming solutions with Pulsar, and hopefully learn from your opinions and insights.
----
What I liked about Pulsar:
1. Tenant isolation is solid, auto load balancing works well: So far we haven't seen a chatty tenant affect the others. We use the same cluster to ingest data for all our customers (one cluster per region: one in the US, one in the EU). Multi-tenancy, together with cluster auto-scaling, allowed us to contain costs.
2. No more single points of failure (data replicated across bookies): Data is now replicated across at least two bookies. This made us much more resilient to data loss.
3. Maintenance is easier: There is no single-master constraint anymore, which simplified a lot of the infra maintenance (imagine having to move a Postgres pod to a different EC2 node; it could lead to downtime).
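For anyone new to Pulsar, the tenant isolation in point 1 starts with how topics are addressed: `persistent://tenant/namespace/topic`. Below is a minimal sketch of that naming scheme, assuming one Pulsar tenant per customer. The helper `topic_for` and the names `customer-a`, `ingest` and `events` are hypothetical, picked for illustration only, and the slash check is a simplification of this sketch rather than Pulsar's full validation rules.

```python
def topic_for(tenant: str, namespace: str, topic: str, persistent: bool = True) -> str:
    """Build a fully qualified Pulsar topic name for one tenant.

    Hypothetical helper: it only assembles the string, it does not
    talk to a broker or create anything.
    """
    for part in (tenant, namespace, topic):
        # Simplification: reject empty parts and slashes so the three
        # components stay unambiguous in the resulting name.
        if not part or "/" in part:
            raise ValueError(f"invalid topic component: {part!r}")
    domain = "persistent" if persistent else "non-persistent"
    return f"{domain}://{tenant}/{namespace}/{topic}"


# Two customers get separate tenants, so their topics never collide
# and policies can be set per tenant or per namespace.
print(topic_for("customer-a", "ingest", "events"))
print(topic_for("customer-b", "ingest", "events"))
```

Because every customer's topics live under their own tenant prefix, quotas and other policies can be attached at the tenant or namespace level rather than per topic.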
----
What's painful about Pulsar:
1. StreamNative licensing costs were significant
2. Network costs considerably increased with multi-AZ + replication
3. Learning curve was steeper than expected, and debugging was more complex
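To give a feel for why point 2 adds up, here is a rough back-of-the-envelope sketch. It assumes the broker sends one copy of every entry to each bookie in the write quorum, and (a crude assumption) that each copy lands in a uniformly random AZ. The function names and the example numbers are hypothetical, not measurements from our cluster, and the model ignores producer and consumer traffic, geo-replication, and rack-aware placement, which deliberately spreads copies across AZs.

```python
def bookie_write_gb(ingest_gb: float, write_quorum: int) -> float:
    """Data the brokers send to bookies: one copy per bookie in the write quorum."""
    if write_quorum < 1:
        raise ValueError("write_quorum must be at least 1")
    return ingest_gb * write_quorum


def cross_az_gb(ingest_gb: float, write_quorum: int, num_azs: int) -> float:
    """Rough share of bookie writes that cross an AZ boundary.

    Assumption: each copy lands in a uniformly random AZ, so a copy
    leaves the broker's AZ with probability (num_azs - 1) / num_azs.
    """
    if num_azs < 1:
        raise ValueError("num_azs must be at least 1")
    return bookie_write_gb(ingest_gb, write_quorum) * (num_azs - 1) / num_azs


# Hypothetical example: 100 GB ingested, 2 copies, 3 AZs.
print(bookie_write_gb(100, 2))
print(round(cross_az_gb(100, 2, 3), 1))
```

Even in this simplified model, replicated writes alone move more data than was ingested, and most of it crosses an AZ boundary once there are three AZs.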
----
Would love to hear your experience with Postgres/Pulsar, and any opinions or insights on the approach/challenges. I hope this discussion helps others in the community; feel free to ask me anything.