Show HN: Boost Your Voice AI Agents with Open-Source TEN VAD

8 Jingyi0321 5 7/10/2025, 7:36:21 AM github.com ↗
Voice Activity Detection (VAD) is a crucial component of voice AI: it decides when a user is actually speaking, which makes interactions more natural and efficient. TEN VAD is an open-source VAD built to give your voice AI agents fast, human-like conversations. TEN VAD offers some key advantages:

ONNX Support: Deploy on virtually any platform or hardware architecture! This means greater flexibility and easier integration into your existing systems.
Superior Detection Accuracy: Experience noticeable improvements in voice detection, leading to fewer errors and more reliable performance.
Smaller & Faster: Enjoy a 32% reduction in Real-Time Factor (RTF) and an 86% size reduction compared to Silero VAD! This translates to lower resource consumption and faster processing.
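
For readers unfamiliar with the metric: RTF is processing time divided by the duration of the audio processed, so lower is better and anything below 1.0 runs faster than real time. Here is a minimal sketch of how it could be measured and compared. The `process` callable, the frame list, and the example numbers are hypothetical placeholders, not TEN VAD's API or measured results.

```python
import time


def real_time_factor(processing_seconds, audio_seconds):
    """RTF = time spent processing / duration of the audio processed."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(process, frames, frame_seconds):
    """Time a per-frame callable over a list of frames and return its RTF.

    `process` stands in for any VAD inference call (an assumption for
    illustration); `frame_seconds` is the audio duration of one frame.
    """
    start = time.perf_counter()
    for frame in frames:
        process(frame)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, len(frames) * frame_seconds)


def relative_reduction(baseline, candidate):
    """Fractional reduction of `candidate` relative to `baseline`."""
    return (baseline - candidate) / baseline


# Illustrative numbers only: an RTF of 0.017 against a baseline of 0.025
# is a reduction of about 32%.
reduction = relative_reduction(0.025, 0.017)
```

A "32% reduction in RTF" therefore means the same audio takes about 32% less compute time than the baseline on the same hardware.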

Get the code: https://github.com/ten-framework/ten-vad

Comments (5)

fm100 · 1d ago
What are the differences between TEN VAD and WebRTC VAD?
Jingyi0321 · 1d ago
In general, WebRTC VAD uses pitch information for VAD. Note that pitch appears only in voiced speech, not in unvoiced speech. Because of this, WebRTC VAD may fail to detect the start of a word and lose the unvoiced onset, which then results in, e.g., increased WER in an ASR system. On the other hand, for noise whose spectrum is similar to voiced speech, e.g. music, the WebRTC VAD pitch detection system may extract a non-zero pitch.

Our model combines fbank features with pitch information and can analyse the input pattern more deeply, so it performs better than WebRTC VAD.
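
To illustrate the voiced/unvoiced point, here is a rough sketch of a periodicity check based on normalized autocorrelation. It is a generic textbook technique, not the WebRTC VAD or TEN VAD implementation; the function name, sample rate, and lag range are assumptions chosen for the example. A periodic (voiced-like) frame scores close to 1, while a noise-like (unvoiced-like) frame scores low, which is why a pitch-only detector can miss unvoiced onsets.

```python
import math
import random


def periodicity_score(frame, min_lag=20, max_lag=200):
    """Return the highest normalized autocorrelation over a range of lags.

    At 8 kHz (assumed here), lags 20..200 cover pitches of 400 Hz down
    to 40 Hz. Values near 1 suggest a strongly periodic frame.
    """
    best = 0.0
    for lag in range(min_lag, min(max_lag, len(frame) - 1) + 1):
        head = frame[:-lag]
        tail = frame[lag:]
        energy = math.sqrt(sum(a * a for a in head) * sum(b * b for b in tail))
        if energy == 0.0:
            continue  # silent segment, no meaningful correlation
        corr = sum(a * b for a, b in zip(head, tail)) / energy
        best = max(best, corr)
    return best


SAMPLE_RATE = 8000  # Hz, assumed for this example
N = 400             # 50 ms frame

# Voiced-like frame: a pure 200 Hz tone.
voiced = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(N)]

# Unvoiced-like frame: white noise from a seeded generator.
rng = random.Random(0)
unvoiced = [rng.uniform(-1.0, 1.0) for _ in range(N)]

voiced_score = periodicity_score(voiced)
unvoiced_score = periodicity_score(unvoiced)
```

A detector that thresholds only on this kind of score would accept the tone and reject the noise frame, even though real unvoiced consonants such as "s" or "f" look much more like the noise frame.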

strassenbahn · 1d ago
It seems the performance is much better than Silero VAD, the existing SOTA VAD, and the size is much smaller. Good to see this new SOTA VAD model!
JuneWW · 1d ago
This looks really promising! VAD is such a critical piece of the puzzle for voice AI. Definitely going to check this out. Thanks for sharing!
rambo11 · 1d ago
Thanks for sharing this awesome VAD model. A high-performance, low-latency VAD is very helpful in conversational AI agents.