Show HN: Serverless platform for running voice AI agents
If you're running real-time voice AI agents in production, a few infrastructure questions come up fast:
- How do I size CPU/memory for unpredictable session lengths?
- What's the best way to autoscale without overprovisioning?
- How can I monitor and optimize performance across concurrent sessions?
A few months ago, we set out to build a serverless platform tailored to exactly these problems: think Vercel for stateful AI agents, with low-latency cold starts and seamless scaling. Along the way, we tackled some fun engineering challenges:
- Container isolation to sandbox workloads and prevent noisy-neighbor issues
- Minimizing container cold-start times so autoscaling can react quickly to demand spikes
- Custom load balancing that distributes sessions based on real-time resource utilization rather than round-robin, since session durations vary wildly
- Graceful draining during updates or scaling events
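To make the cold-start point above concrete, here is a minimal sketch of one common approach: keeping a warm pool of pre-started containers so a new session skips boot entirely. Everything here (the `WarmPool` class, the `boot` callable, the `target` size) is a hypothetical illustration, not our actual implementation; a real version would refill the pool asynchronously.

```python
from collections import deque

class WarmPool:
    """Keeps a buffer of pre-booted containers so new sessions
    skip the cold start; refills toward the target after each
    acquire (synchronously here, for simplicity)."""

    def __init__(self, boot, target=3):
        self._boot = boot                                  # callable that starts one container
        self._pool = deque(boot() for _ in range(target))  # pre-warm at startup

    def acquire(self):
        # Hand out a warm container if one exists; boot on demand otherwise.
        container = self._pool.popleft() if self._pool else self._boot()
        self._pool.append(self._boot())  # refill the buffer for the next session
        return container

# Usage: with ids standing in for containers, the first acquires
# return the pre-warmed ones, not freshly booted ones.
ids = iter(range(100))
pool = WarmPool(boot=lambda: next(ids), target=2)
print(pool.acquire())  # 0 — came from the warm pool
print(pool.acquire())  # 1 — also pre-warmed
```

The trade-off is paying for idle warm capacity in exchange for predictable session-start latency, which matters more for voice than for batch workloads.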
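The utilization-based balancing bullet can be sketched in a few lines, assuming each worker reports its current CPU and memory load. The `Worker` shape, field names, and blend weights below are illustrative assumptions, not our production scheduler.

```python
from dataclasses import dataclass

@dataclass
class Worker:
    name: str
    cpu: float  # current CPU utilization, 0.0-1.0 (assumed reported by the worker)
    mem: float  # current memory utilization, 0.0-1.0

def pick_worker(workers, cpu_weight=0.7, mem_weight=0.3):
    """Place a new session on the least-loaded worker.

    Load is a weighted blend of CPU and memory. Unlike round-robin,
    this stays balanced when long-lived sessions pile up on a few
    workers, since placement tracks actual utilization.
    """
    def load(w):
        return cpu_weight * w.cpu + mem_weight * w.mem
    return min(workers, key=load)

workers = [
    Worker("w1", cpu=0.85, mem=0.60),
    Worker("w2", cpu=0.30, mem=0.40),
    Worker("w3", cpu=0.55, mem=0.90),
]
print(pick_worker(workers).name)  # w2 — lowest blended utilization
```

With round-robin, a burst of hour-long sessions landing on one worker keeps receiving a 1/N share of new traffic; with utilization-aware placement, that worker is skipped until it frees up.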
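Graceful draining boils down to two rules: stop admitting new sessions, and wait (with a deadline) for in-flight ones to finish before the container is reaped. A minimal sketch under those assumptions; the `SessionTracker` name and API are hypothetical, and a real system would also deregister from the load balancer first.

```python
import threading

class SessionTracker:
    """Tracks live sessions so a worker can drain before shutdown."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self._draining = False
        self._idle = threading.Event()
        self._idle.set()  # no sessions yet, so we start idle

    def try_start(self):
        """Admit a session, unless the worker is draining."""
        with self._lock:
            if self._draining:
                return False  # new sessions are rejected during drain
            self._active += 1
            self._idle.clear()
            return True

    def finish(self):
        """Mark one session as complete."""
        with self._lock:
            self._active -= 1
            if self._active == 0:
                self._idle.set()

    def drain(self, timeout):
        """Stop accepting sessions, then wait up to `timeout` seconds
        for in-flight sessions to complete. Returns True if fully drained."""
        with self._lock:
            self._draining = True
            if self._active == 0:
                self._idle.set()
        return self._idle.wait(timeout)

# Usage: one session in flight finishes shortly after drain begins.
tracker = SessionTracker()
tracker.try_start()
threading.Timer(0.1, tracker.finish).start()
print(tracker.drain(timeout=2.0))  # True — drained before the deadline
```

The deadline matters for voice: sessions can run for many minutes, so a rolling update has to choose between waiting them out and cutting calls off, and that choice should be explicit.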
We've been dogfooding this internally and testing it with early users, and it's handling thousands of concurrent voice sessions with minimal latency spikes. If you're building AI agents (voice or otherwise) and wrestling with infra, we'd love your feedback: does this solve pain points you've hit? What's missing?