
We've gotten rate-limited out of the blue on clustered development servers over the past 3 years; this last time was on servers we set up 830 days ago, before we knew that getting rate-limited/banned on DNS servers was even possible. The worst thing about the last incident was that we entered a death spiral: failing DNS resolution started a logging job, which failed (because it needed DNS resolution to reach the log server), which then started a job about the failing DNS resolution. You get the gist.

Of course, this is an issue of engineering and code, not only a rate-limiting issue.
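As an illustration of the engineering side, here is a minimal sketch of one way to keep a failure report from re-triggering itself. It is not what we run; `FailureReporter`, `send`, and the 60-second cooldown are hypothetical names and values chosen for the example. The idea is to record the attempt before sending and to swallow the network error, so a log call that fails for the same DNS reason cannot start another job.

```python
import time


class FailureReporter:
    """Reports a failure at most once per cooldown window per key."""

    def __init__(self, send, cooldown=60.0, clock=time.monotonic):
        self._send = send          # hypothetical callable that ships a report
        self._cooldown = cooldown  # seconds to stay quiet after an attempt
        self._clock = clock
        self._last_attempt = {}    # key -> time of the last report attempt

    def report(self, key, message):
        now = self._clock()
        last = self._last_attempt.get(key)
        if last is not None and now - last < self._cooldown:
            return False  # suppressed: still inside the cooldown window
        # Record the attempt before sending, so a failing send cannot re-trigger.
        self._last_attempt[key] = now
        try:
            self._send(message)
        except OSError:
            # The log server is unreachable (possibly the same DNS failure;
            # socket.gaierror is a subclass of OSError). Swallow the error
            # instead of spawning another failure job.
            return False
        return True
```

With this shape, a DNS outage produces at most one report attempt per key per window, whether or not the log server can be reached.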

However, many developers rely on upstream DNS resolution to "Just Work" when you add it to a server, which has been the case with Google's DNS servers for the 15+ years that I've been a sysop. I'm just hoping that this time, this will get SOME attention, because either you want dev-ops to use Cloudflare DNS on servers or you don't - and if you don't - there should be an official warning that this WILL happen, you WILL get rate-limited eventually.

Comments (8)

gertop · 33d ago

> However, many developers rely and depend on root DNS resolution to "Just Work" when you add it to a server

As a sysop you're probably aware that neither Google nor Cloudflare are DNS root servers.

Using actual root servers through your own resolver would have avoided this issue. Bind doesn't even need any config for that use case.

Bender · 32d ago

Adding to this, data-center servers should at the very least use a proper set of caching DNS servers at the edge of their network, and those should talk to the root servers so as not to add to the global abuse of the anycast clusters. I've seen some companies go as far as running Unbound on each and every server to improve the retry and caching mechanisms, to great success. Unbound can also raise the minimum TTL, e.g. raising every TTL to at least 30 seconds if it was lower, since some applications get quite abusive if they use really low TTLs and make a request for every action. The excessive retries can compound really fast, especially when applications and systems are not properly configured, which is often the case. If you're not sure what I mean, run a UDP capture at the edge of your network; you may find that for each request an application makes there are as many as 12 DNS requests. It adds up very fast.

    App1 Unbound -> Data-center edge Unbound instances [1-4] -> Root DNS  Anycast clusters
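To make the minimum-TTL idea concrete, here is a small Python sketch of a cache that clamps short TTLs up to a floor. This is not Unbound's implementation (in Unbound the equivalent knob is the `cache-min-ttl` option); `MinTTLCache` and the injected `resolve` callable, which is assumed to return `(addresses, ttl_seconds)`, are hypothetical names for illustration.

```python
import time


class MinTTLCache:
    """Caches answers and never honours a TTL below min_ttl seconds."""

    def __init__(self, resolve, min_ttl=30.0, clock=time.monotonic):
        self._resolve = resolve  # hypothetical: name -> (addresses, ttl_seconds)
        self._min_ttl = min_ttl
        self._clock = clock
        self._entries = {}       # name -> (addresses, expires_at)

    def lookup(self, name):
        now = self._clock()
        hit = self._entries.get(name)
        if hit is not None and now < hit[1]:
            return hit[0]  # served from cache, no upstream query
        addresses, ttl = self._resolve(name)
        # Clamp very low TTLs up to the floor so chatty apps stop re-querying.
        effective_ttl = max(ttl, self._min_ttl)
        self._entries[name] = (addresses, now + effective_ttl)
        return addresses
```

An application that looks up a 1-second-TTL name on every action would then hit the upstream once per 30 seconds instead of once per second.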

Unbound can be configured to either pick the fastest resolvers or cycle through all of them, retrying a failed one in the background so it can be re-selected once it starts resolving again. This avoids a lot of the outages otherwise known as "It's always DNS".
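A rough sketch of the cycle-and-retry behaviour, for anyone who wants to see the shape of it. This is not how Unbound does it internally: instead of a background retry, this version simply benches a failed upstream for a cooldown and tries it again afterwards. `UpstreamPool`, the injected `query` callable, and the 30-second `retry_after` are assumptions made for the example.

```python
import time


class UpstreamPool:
    """Round-robins over upstream resolvers, benching ones that fail."""

    def __init__(self, upstreams, query, retry_after=30.0, clock=time.monotonic):
        self._upstreams = list(upstreams)
        self._query = query        # hypothetical: (upstream, name) -> addresses
        self._retry_after = retry_after
        self._clock = clock
        self._down_until = {}      # upstream -> time it becomes eligible again
        self._next = 0             # round-robin cursor

    def resolve(self, name):
        now = self._clock()
        count = len(self._upstreams)
        for offset in range(count):
            index = (self._next + offset) % count
            upstream = self._upstreams[index]
            if now < self._down_until.get(upstream, 0.0):
                continue  # still benched after an earlier failure
            try:
                result = self._query(upstream, name)
            except OSError:
                # Bench this upstream and move on to the next one.
                self._down_until[upstream] = now + self._retry_after
                continue
            self._next = (index + 1) % count
            return result
        raise OSError("no upstream resolver answered")
```

A failed upstream costs one failed query, is then skipped until its cooldown expires, and is picked up again automatically once it answers.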

plagiat0r · 32d ago

But when setting up a full recursive resolver, you should avoid querying the root servers directly and instead mirror the root zone locally:

https://datatracker.ietf.org/doc/html/rfc8806

tmikaeld · 33d ago

Of course, it depends on the use-case; what I meant was "upstream DNS". I've edited.

phillipseamore · 33d ago

What kind of volume was this? I have a server that does some rather specific DNS monitoring resulting in millions of unique lookups with 1.1.1.1 a day.

tmikaeld · 33d ago

That's the frustrating part of this, the inconsistency: one day we're doing benchmarks, making thousands of lookups and adding/removing domains, and then during normal day-to-day operations we get blocked.

phillipseamore · 33d ago

Is this only DNS, or do you also have issues accessing CF networks? Do you own the subnet the server is on, or is it shared with others? I'm wondering if this is caused by other traffic from the subnet that also affects you.

tmikaeld · 33d ago

These are on spread-out external IPs (VPSs), so not within CF networks or any specific IP subnet. The common denominator is that at certain bursts of traffic, we get blocked.

If this had some kind of pattern we could avoid or improve, I wouldn't even bring it up.

Don't use Cloudflare's 1.1.1.1 on servers
