Counting at Scale

prvt · 14h ago
Last weekend, I dove into CMU’s [15-445/645](https://15445.courses.cs.cmu.edu/) database course and got hit with a deceptively simple problem: count the number of unique users visiting a website per day. Easy, right? Just throw user IDs into an `unordered_set` and return its size: classic LeetCode.
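For reference, the exact-count version is only a few lines; here's a minimal sketch, assuming 64-bit user IDs:

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Exact distinct count: memory grows linearly with the number of unique IDs.
uint64_t count_unique(const std::vector<uint64_t>& user_ids) {
    std::unordered_set<uint64_t> seen(user_ids.begin(), user_ids.end());
    return seen.size();
}
```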

But what happens when you’re at Facebook scale? Tracking a billion unique users means burning through gigabytes of memory just to count: a billion 64-bit IDs alone is roughly 8 GB before you even account for hash-table overhead. And in the real world, users stream in constantly rather than sitting in a neat, static list. Storing every ID? Not happening.

I explored practical workarounds (like “last seen” timestamps and full table scans), but they’re either inefficient or put massive strain on your DB. Then the assignment introduced HyperLogLog: a probabilistic algorithm that estimates cardinality to within about 2% for billions of users, using just 1.5 KB of memory.
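Roughly, HLL hashes each ID, uses a few leading bits of the hash to pick one of m registers, and keeps the maximum number of leading zeros seen in the rest of the hash; the estimate then comes from a harmonic mean over the registers. Here's a toy sketch of that core loop (the register count b = 11, the splitmix64-style mixer, C++20's `std::countl_zero`, and the omission of the small/large-range corrections are my simplifications, not the course's code):

```cpp
#include <bit>
#include <cmath>
#include <cstdint>
#include <vector>

// Toy HyperLogLog: m = 2^b registers, each storing the largest
// "number of leading zeros + 1" seen among hashes routed to it.
struct HyperLogLog {
    static constexpr int b = 11;        // 2048 one-byte registers (~2 KB);
    static constexpr int m = 1 << b;    // packing them into 6 bits each gets close to 1.5 KB
    std::vector<uint8_t> reg = std::vector<uint8_t>(m, 0);

    static uint64_t mix(uint64_t x) {   // splitmix64-style mixer standing in for a real hash
        x += 0x9E3779B97F4A7C15ULL;
        x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
        x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
        return x ^ (x >> 31);
    }

    void add(uint64_t id) {
        uint64_t h = mix(id);
        uint64_t idx = h >> (64 - b);                  // top b bits pick a register
        uint64_t rest = (h << b) | (1ULL << (b - 1));  // guard bit keeps countl_zero bounded
        uint8_t rho = static_cast<uint8_t>(std::countl_zero(rest) + 1);
        if (rho > reg[idx]) reg[idx] = rho;
    }

    double estimate() const {           // raw estimate; the full algorithm adds range corrections
        double sum = 0.0;
        for (uint8_t r : reg) sum += std::pow(2.0, -static_cast<double>(r));
        double alpha = 0.7213 / (1.0 + 1.079 / m);
        return alpha * m * m / sum;
    }
};
```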

The magic? Pure mathematics. It’s also distributable (sketches from different shards merge losslessly; see the sketch below), and it powers real-world systems like Redis and Google Analytics. I break down how it works (with illustrations!) in my deep dive; check it out.
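The distributable part comes down to a register-wise max: each shard keeps its own registers, and the merged registers are exactly what a single counter would have produced over the union of the streams. A sketch, assuming both sides use the same register count:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Union of two HLL sketches with identical register counts: take the
// element-wise max. The result equals the sketch of the combined stream.
std::vector<uint8_t> merge_registers(const std::vector<uint8_t>& a,
                                     const std::vector<uint8_t>& b) {
    std::vector<uint8_t> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = std::max(a[i], b[i]);
    return out;
}
```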

Curious to hear from HN: who’s using HyperLogLog in production? Have you run into accuracy issues, and if so, how did you handle them?