DeepSeek won the best paper award at ACL 2025

6 CalmStorm 1 8/1/2025, 7:48:06 PM arxiv.org

Comments (1)

CalmStorm · 1h ago
The paper is the first to build native sparse attention into the full training process, and it reports up to 11× inference speedup while maintaining model performance.