Show HN: OnPair – String compression with fast random access (Rust, C++)

Posted by gargiulof on 8/19/2025, 3:20:26 PM (github.com)
I’ve been working on a compression algorithm for fast random access to individual strings in large collections.

The problem came up while working with large in-memory database columns (emails, URLs, product titles, etc.), where low-latency point queries are essential. LZ77-based compressors perform poorly on short strings compressed individually, because each string offers too little history to find matches in. Block compression helps, but block size forces a trade-off: larger blocks improve the ratio, yet reading a single string means decompressing its whole block.

Some existing options:

- BPE: good ratios, but slow and memory-heavy

- FSST (discussed here: https://news.ycombinator.com/item?id=41489047): very fast, but weaker compression
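
Both options above are dictionary-based, which is what makes per-string access possible without blocks. As a rough illustration only, here is a toy Rust sketch of that general layout: one shared dictionary, a flat array of 16-bit token ids, and an offsets array marking where each string starts. This is not OnPair's algorithm or API. `Column`, `build`, and `get` are hypothetical names, the multi-byte dictionary entries are hand-picked instead of learned, and the encoder is a naive greedy longest-match scan.

```rust
/// Toy dictionary-coded string column (hypothetical type, not OnPair's API).
struct Column {
    dict: Vec<Vec<u8>>,  // token id -> byte sequence it expands to
    tokens: Vec<u16>,    // token ids of all strings, concatenated
    offsets: Vec<usize>, // string i occupies tokens[offsets[i]..offsets[i + 1]]
}

impl Column {
    /// Build a column from a fixed dictionary: 256 single-byte tokens plus
    /// caller-supplied multi-byte entries (assumed hand-picked here; a real
    /// compressor would learn them from the data).
    fn build(extra: &[&str], strings: &[&str]) -> Column {
        let mut dict: Vec<Vec<u8>> = (0..=255u8).map(|b| vec![b]).collect();
        dict.extend(extra.iter().map(|e| e.as_bytes().to_vec()));
        assert!(dict.len() <= u16::MAX as usize + 1, "ids must fit in u16");

        let mut tokens = Vec::new();
        let mut offsets = vec![0];
        for s in strings {
            let mut rest = s.as_bytes();
            while !rest.is_empty() {
                // Naive linear scan for the longest entry that prefixes `rest`.
                let (id, len) = dict
                    .iter()
                    .enumerate()
                    .filter(|(_, e)| rest.starts_with(e.as_slice()))
                    .map(|(i, e)| (i, e.len()))
                    .max_by_key(|&(_, len)| len)
                    .expect("single-byte tokens guarantee a match");
                tokens.push(id as u16);
                rest = &rest[len..];
            }
            offsets.push(tokens.len());
        }
        Column { dict, tokens, offsets }
    }

    /// Decode only string `i`: touch its token range and nothing else.
    fn get(&self, i: usize) -> Vec<u8> {
        let mut out = Vec::new();
        for &t in &self.tokens[self.offsets[i]..self.offsets[i + 1]] {
            out.extend_from_slice(&self.dict[t as usize]);
        }
        out
    }
}
```

With `Column::build(&["@example.com"], &["alice@example.com"])`, calling `get(0)` returns the original bytes after reading six tokens (five single-byte tokens plus one for the shared suffix). The cost of a point query depends only on the length of the requested string, not on its neighbours, which is the property block compression gives up.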

OnPair strikes a balance between the two (more details in the paper):

- Compression ratio: similar to BPE

- Compression speed: 100–200 MiB/s

- Decompression speed: 6–7 GiB/s

I’d love to hear your thoughts, whether it’s workloads you think this could help with, ideas for API improvements, or just general discussion. Always happy to chat here on HN or by email.

---

Resources:

- Paper: https://arxiv.org/pdf/2508.02280

- Rust: https://github.com/gargiulofrancesco/onpair_rs

- C++: https://github.com/gargiulofrancesco/onpair_cpp
