Show HN: FLM-Audio, our 7B spoken dialog chatbot with native full-duplexity

2 BAAIBeijing 0 9/19/2025, 8:14:44 AM
We’re excited to share RoboBrain-Audio (FLM-Audio), a new 7B spoken dialog chatbot with native full-duplexity. FLM-Audio achieves superior response quality and a better chat experience while requiring significantly less training data.

Key innovations:

Natural Monologue: FLM-Audio abandons word-level timestamps in favor of a “Natural Monologue” mechanism. This preserves the inherent advantages of LLMs in coherent generation and instruction following, and it effectively addresses the context-dependent pronunciation of certain words (especially numbers).

Dual Training Paradigm: Training spans two major stages and four sub-stages, simulating ASR, TTS, and interactive dialog tasks. The Post-Training stage equips the model with the basic abilities of “listening” and “speaking”. The Supervised Fine-Tuning (SFT) stage then shapes its dialog and full-duplex interaction capabilities.

Resource Links:
https://arxiv.org/abs/2509.02521
https://huggingface.co/CofeAI/FLM-Audio
GitHub - cofe-ai/flm-audio: FLM-Audio is a audio-language subversion of RoboEgo/FLM-Ego -- an omnimo

The model is now open source, and we look forward to your feedback.
