Not trying to be snarky, just curious -- How is this different from TurboPuffer and other serverless, object storage backed vector DBs?
ge96 · 1h ago
M is minutes
HarHarVeryFunny · 1h ago
I was starting to think this was impressive, if not impossible. 1B vectors in 48 MB of storage => < 1 bit per vector.
Maybe not impossible using shared/lossy storage if they were sparsely scattered over a large space ?
But anyways - minutes. Thanks.
Edit: Gemini suggested that this sort of (lossy) storage size could be achieved using "Product Quantization" (sub vectors, clustering, cluster indices), giving an example of 256 dimensional vectors being stored at an average of 6 bits per vector, with ANN being one application that might use this.
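The idea described there (split each vector into sub-vectors, cluster each sub-space, store only the cluster indices) can be sketched roughly as follows. This is a minimal illustrative sketch, not any particular library's implementation; the function names, the toy k-means loop, and the parameters (4 sub-spaces, 16 centroids each, i.e. 16 bits per 16-dim vector) are all assumptions chosen for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_codebooks(data, n_sub, k, iters=10):
    """Tiny k-means per sub-space; returns a list of (k, sub_dim) codebooks."""
    dim = data.shape[1]
    sub_dim = dim // n_sub
    codebooks = []
    for s in range(n_sub):
        sub = data[:, s * sub_dim:(s + 1) * sub_dim]
        centers = sub[rng.choice(len(sub), k, replace=False)].copy()
        for _ in range(iters):
            # assign each sub-vector to its nearest centroid
            dists = ((sub[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
            assign = dists.argmin(1)
            for c in range(k):
                pts = sub[assign == c]
                if len(pts):
                    centers[c] = pts.mean(0)
        codebooks.append(centers)
    return codebooks

def encode(vec, codebooks):
    """Store one small integer (cluster index) per sub-vector."""
    sub_dim = len(vec) // len(codebooks)
    return [int(((cb - vec[s * sub_dim:(s + 1) * sub_dim]) ** 2).sum(1).argmin())
            for s, cb in enumerate(codebooks)]

def decode(codes, codebooks):
    """Lossy reconstruction: concatenate the chosen centroids."""
    return np.concatenate([cb[c] for c, cb in zip(codes, codebooks)])

data = rng.normal(size=(1000, 16)).astype(np.float32)
cbs = train_codebooks(data, n_sub=4, k=16)  # 4 codes x 4 bits = 16 bits/vector
codes = encode(data[0], cbs)
approx = decode(codes, cbs)
```

In a real ANN system the stored codes are compared against a query via per-sub-space lookup tables rather than decoded, but the storage math is the same: each vector collapses to a handful of small integers, which is how per-vector sizes can drop well below the raw float representation.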
stevemk14ebr · 1h ago
Thank you, the title needs to be edited.
ikanade · 1h ago
Legend
l5870uoo9y · 1h ago
Thankfully not months.
softwaredoug · 1h ago
Oh, the horrors of search indexing I've seen... including weeks / months to rebuild an index.
ashvardanian · 1h ago
Very curious about the hardware setup used for this benchmark!
OutOfHere · 1h ago
Proprietary closed-source lock-in. Nothing to see here.
HEmanZ · 36m ago
What do you think an alternative is for someone who:
1. Has a technical system they think could be worth a fortune to large enterprises, containing at least a few novel insights to the industry.
2. Knows that competitors and open source alternatives could copy/implement these in a year or so if the product starts off open source.
3. Has to put food on the table and doesn’t want to give massive corporations extremely valuable software for free.
Open source has its place, but it is IMO one of the ways to give monopolies massive value for free. There are plenty of open source alternatives around for vector DBs. Do we (developers) need to give everything away to the rich?
CuriouslyC · 1h ago
Seriously. The amount of lift a SaaS product has to give me before I'll even bother evaluating it is insane, and there's a near-zero chance I'll use it in my core.
stronglikedan · 58m ago
Nothing for you to see here. Surely you just aren't their target customer.
OutOfHere · 39m ago
So who is? Who really needs to index 1 billion new vectors every 48 minutes, or perhaps equivalently 1 million new vectors every 3 seconds?