Show HN: Read-Through Cache for S3

Cachey (https://github.com/s2-streamstore/cachey) is an open source read-through cache for S3-compatible object storage.

It is written in Rust with a hybrid memory+disk cache powered by foyer [1], accessed over a simple HTTP API. It runs as a self-contained single-node binary – the idea is that you handle distribution yourself, leaning on client-side logic for key affinity and load balancing.
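
Since there is no built-in clustering, clients decide which node owns which key. Here is a minimal sketch of one way to do that, rendezvous hashing over a static node list (the node names are made up, and nothing here is prescribed by Cachey):

    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};

    /// Rendezvous hashing: every client independently scores each node
    /// against the object key and picks the highest score, so the same
    /// key always lands on the same Cachey node without coordination.
    fn pick_node<'a>(nodes: &[&'a str], object_key: &str) -> &'a str {
        *nodes
            .iter()
            .max_by_key(|node| {
                let mut h = DefaultHasher::new();
                (**node, object_key).hash(&mut h);
                h.finish()
            })
            .expect("at least one node")
    }

    fn main() {
        let nodes = ["cachey-1:9000", "cachey-2:9000", "cachey-3:9000"];
        // Removing a node only remaps the keys that node owned.
        println!("{}", pick_node(&nodes, "my-bucket/stream/segment-0042"));
    }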

If you are building something heavily reliant on object storage, the need for something like this is likely to come up! A bunch of companies have talked about their approaches to distributed caching atop S3 (such as ClickHouse [2], Turbopuffer [3], WarpStream [4], RisingWave [5]).

Why we built it:

Recent records in s2.dev are owned by a designated process for each stream, and we could return them for reads with minimal latency overhead once they were durable. However, this limited our scalability in terms of concurrent readers and throughput, and implied cross-zone network costs when the zones of the gateway and the stream-owning process did not align.

The source of durability was S3, so there was a path to slurping recently-written data straight from there (older data was already read directly) and taking advantage of free bandwidth. But even S3 has RPS limits [6], and most object storage is HDD-backed, so avoiding the latency overhead as much as possible is desirable.

Caching helps reduce S3 operation costs, improves the latency profile, and lifts the scalability ceiling. Now, regardless of whether records are recent or old, our reads always flow through Cachey.

Cachey internals:

- It borrows an idea from OS page caches by mapping every request into a page-aligned range read. This did call for requiring the typically-optional Range header, with an exact byte range. Standard tradeoffs around picking page sizes apply, and we fixed the page size at the high end of S3's recommendation (16 MB). If multiple pages are accessed, some limited intra-request concurrency is used. The sliced data is sent as a streaming response. (The first sketch below makes the page mapping concrete.)

- It coalesces concurrent requests to the same page (another thing an OS page cache does). This was easy since foyer provides a native 'fetch' API that takes a key and a thunk. (The second sketch below shows the single-flight idea in miniature.)

- It mitigates the high tail latency of object storage by maintaining latency statistics and firing a duplicate request when a configurable quantile is exceeded, taking whichever response arrives first. Jeff Dean discussed this technique in "The Tail at Scale" [7], and the S3 docs also suggest such an approach. (The third sketch below shows the hedging pattern.)
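
To make the page mapping concrete, a minimal sketch assuming 16 MiB pages (the constant and names are illustrative, not Cachey's actual internals):

    const PAGE_SIZE: u64 = 16 * 1024 * 1024; // high end of S3's recommended range size

    /// Map an inclusive byte range onto the pages that cover it, plus the
    /// offset to skip in the first page and the exact bytes to stream back.
    fn pages_for_range(start: u64, end: u64) -> (std::ops::RangeInclusive<u64>, u64, u64) {
        let first_page = start / PAGE_SIZE;
        let last_page = end / PAGE_SIZE;
        let skip = start % PAGE_SIZE; // bytes to drop from the first page
        let take = end - start + 1;   // exact length the client asked for
        (first_page..=last_page, skip, take)
    }

    fn main() {
        // Bytes 10_000_000..=40_000_000 span pages 0..=2: drop the first
        // 10_000_000 bytes of page 0 and stream 30_000_001 bytes in total.
        let (pages, skip, take) = pages_for_range(10_000_000, 40_000_000);
        println!("pages {:?}, skip {}, take {}", pages, skip, take);
    }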
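
The coalescing itself comes for free with foyer's fetch, but the single-flight mechanism is worth seeing standalone. A toy sketch, assuming tokio and the futures crate (this is not foyer's implementation, just the shape of the idea):

    use std::collections::HashMap;
    use std::sync::atomic::{AtomicUsize, Ordering};
    use std::sync::{Arc, Mutex};
    use std::time::Duration;

    use futures::future::{BoxFuture, FutureExt, Shared};

    /// Toy single-flight map: concurrent callers for the same page share
    /// one in-flight future instead of each issuing its own S3 GET.
    #[derive(Clone, Default)]
    struct SingleFlight {
        inflight: Arc<Mutex<HashMap<u64, Shared<BoxFuture<'static, Arc<Vec<u8>>>>>>>,
    }

    impl SingleFlight {
        async fn fetch<F>(&self, page: u64, load: F) -> Arc<Vec<u8>>
        where
            F: FnOnce() -> BoxFuture<'static, Arc<Vec<u8>>>,
        {
            let fut = {
                let mut map = self.inflight.lock().unwrap();
                map.entry(page).or_insert_with(|| load().shared()).clone()
            };
            let bytes = fut.await;
            self.inflight.lock().unwrap().remove(&page);
            bytes
        }
    }

    static LOADS: AtomicUsize = AtomicUsize::new(0);

    #[tokio::main]
    async fn main() {
        let sf = SingleFlight::default();
        let load = || {
            async {
                LOADS.fetch_add(1, Ordering::SeqCst); // count "S3 GETs"
                tokio::time::sleep(Duration::from_millis(10)).await;
                Arc::new(vec![1u8; 4])
            }
            .boxed()
        };
        // Two concurrent reads of page 7 result in a single load.
        let (a, b) = tokio::join!(sf.fetch(7, load), sf.fetch(7, load));
        assert!(Arc::ptr_eq(&a, &b));
        println!("loads issued: {}", LOADS.load(Ordering::SeqCst)); // 1
    }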
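
And the hedged request pattern in miniature, again assuming tokio (the live quantile tracking is elided, and get_page stands in for the real object-storage read):

    use std::time::Duration;
    use tokio::time::sleep;

    /// Race the primary read against a duplicate that only starts once
    /// the configured latency quantile has elapsed; take whichever
    /// response arrives first and drop the loser.
    async fn hedged_get(page: u64, hedge_after: Duration) -> Vec<u8> {
        tokio::select! {
            bytes = get_page(page) => bytes,
            bytes = async {
                sleep(hedge_after).await; // usually the primary wins before this fires
                get_page(page).await
            } => bytes,
        }
    }

    async fn get_page(_page: u64) -> Vec<u8> {
        // Placeholder for the real S3 range read.
        sleep(Duration::from_millis(50)).await;
        vec![0u8; 16]
    }

    #[tokio::main]
    async fn main() {
        // In Cachey the threshold comes from live latency stats (e.g. a
        // high quantile); here it is just a constant.
        let bytes = hedged_get(42, Duration::from_millis(30)).await;
        println!("got {} bytes", bytes.len());
    }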

A more niche thing Cachey lets you do is specify more than one bucket an object may live in, and it will attempt up to two, prioritizing the client's preference blended with its own knowledge of recent operational stats. This is something we rely on: we offer regional durability with low latency by ensuring recently-written data reaches a quorum of zonal S3 Express buckets, so the desired range may not exist in any arbitrary one of them. This capability may also end up being reused for multi-region durability in the future.
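
A rough sketch of that fallback read path, with the preference/stats ranking assumed to be already folded into the bucket ordering (names and error handling are illustrative):

    /// Attempt the read against up to two buckets, in preference order.
    /// Cachey blends the client's stated preference with recent
    /// operational stats; here the list is assumed pre-sorted.
    async fn read_with_fallback(
        buckets: &[&str],
        key: &str,
        range: (u64, u64),
    ) -> Option<Vec<u8>> {
        for bucket in buckets.iter().take(2) {
            if let Some(bytes) = try_range_get(bucket, key, range).await {
                return Some(bytes);
            }
        }
        None
    }

    async fn try_range_get(bucket: &str, key: &str, (start, end): (u64, u64)) -> Option<Vec<u8>> {
        // Placeholder: issue a GET for `key` on `bucket` with
        // `Range: bytes={start}-{end}`, returning None on 404/416.
        let _ = (bucket, key, start, end);
        None
    }

    #[tokio::main]
    async fn main() {
        let buckets = ["express-zone-a", "express-zone-b"];
        match read_with_fallback(&buckets, "stream/segment-0042", (0, 1023)).await {
            Some(bytes) => println!("read {} bytes", bytes.len()),
            None => println!("range not found in either bucket"),
        }
    }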

I'd love to hear your feedback and suggestions! Hopefully other projects will also find Cachey to be a useful part of their stack.

[1] https://foyer.rs
[2] https://clickhouse.com/blog/building-a-distributed-cache-for...
[3] https://turbopuffer.com/docs/architecture
[4] https://www.warpstream.com/blog/minimizing-s3-api-costs-with...
[5] https://risingwave.com/blog/risingwave-elastic-disk-cache
[6] https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimi...
[7] https://cacm.acm.org/research/the-tail-at-scale/#body-7
