Open-source framework for real-time AI voice

Comments (1)

sagarkava · 10m ago

Hey

I’m Sagar, co-founder of VideoSDK.

I'm beyond excited to share what we've been building: VideoSDK Real-Time AI Agents. Today, voice is becoming the new UI.

We expect agents to feel human: to understand us, respond instantly, and work seamlessly across web, mobile, and even telephony. But to achieve this, developers have to stitch together STT, LLM, and TTS, glued with HTTP endpoints and a prayer.

Most often this results in agents that sound robotic, hallucinate, and fail in production environments without observability. So we built something to solve that.

Now, we are open sourcing it!

Here’s what it offers:

- Global WebRTC infra with <80ms latency
- Native turn detection, VAD, and noise suppression
- Modular pipelines for STT, LLM, TTS, avatars, and real-time model switching
- Built-in RAG + memory for grounding and hallucination resistance
- SDKs for web, mobile, Unity, IoT, and telephony, with no glue code needed
- Agent Cloud to scale infinitely with one-click deployments, or self-host with full control

Think of it like moving from a walkie-talkie to a modern network tower that handles thousands of calls.
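
To make the "modular pipeline" idea concrete, here is a rough sketch of an STT → LLM → TTS chain where any stage can be swapped while the agent is running. This is not the VideoSDK API: `VoicePipeline`, `fake_stt`, `echo_llm`, `shout_llm`, and `fake_tts` are all hypothetical names made up for illustration, and the stages are trivial stand-ins rather than real models.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures for illustration only (not the VideoSDK API).
STT = Callable[[bytes], str]   # audio in, transcript out
LLM = Callable[[str], str]     # transcript in, reply text out
TTS = Callable[[str], bytes]   # reply text in, audio out


@dataclass
class VoicePipeline:
    """Chains three interchangeable stages: speech-to-text, LLM, text-to-speech."""
    stt: STT
    llm: LLM
    tts: TTS

    def respond(self, audio: bytes) -> bytes:
        transcript = self.stt(audio)
        reply = self.llm(transcript)
        return self.tts(reply)


# Stand-in stages: they treat the "audio" as UTF-8 text so the sketch runs anywhere.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def echo_llm(prompt: str) -> str:
    return f"You said: {prompt}"


def shout_llm(prompt: str) -> str:
    return prompt.upper()


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(stt=fake_stt, llm=echo_llm, tts=fake_tts)
first = pipeline.respond(b"hello")

# Swap the LLM stage at runtime without rebuilding the rest of the pipeline.
pipeline.llm = shout_llm
second = pipeline.respond(b"hello")
```

The point of the design is that each stage is just a value behind a small interface, so switching models is an assignment rather than a rewrite of the glue between services.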

VideoSDK gives you the infrastructure to build voice agents that actually work in the real world, at scale.

I'd love your thoughts and questions! Happy to dive deep into architecture, use cases, or crazy edge cases you've been struggling with.
