DeepSeek won the best paper award at ACL 2025

46 points by CalmStorm | 5 comments | 8/1/2025, 7:48:06 PM | arxiv.org ↗

Comments (5)

gnabgib · 1h ago
Title: Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

The awards page for ACL seems to disagree with this editorialized title: https://2025.aclweb.org/program/awards/

fourdnet · 45m ago
The ACL webpage has not been updated yet. Here are the announcement slides: https://cspaper.org/topic/116/record-breaking-acl-2025-crown...

pyuser583 · 18m ago
I'd say award for best title is a tie between: "Dehumanizing Machines: Mitigating Anthropomorphic Behaviors in Text Generation Systems"; "Finding Needles in Images: Can Multi-modal LLMs Locate Fine Details?"; and "Steering off Course: Reliability Challenges in Steering Language Models."

sabakhoj · 1h ago
> Despite being sparse, NSA surpasses Full Attention baseline on average across general benchmarks, long-context tasks, and reasoning evaluation.

Isn't it notable that the latency improvement came without a performance loss? I'm not super familiar with all the technical aspects, but that seems like it should be one of the main focuses of the paper.

CalmStorm · 5h ago
For the first time, it introduces natively trainable sparse attention into the full training process, achieving up to an 11× inference speedup while maintaining model performance.
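The core idea behind sparse attention, in one line: instead of every query attending to every key, each query attends only to a small set of key blocks chosen by a cheap relevance score. Below is a toy NumPy sketch of that generic block-selection idea; it is an illustration only, not the paper's actual NSA method (which combines compressed, selected, and sliding-window attention branches in a hardware-aligned kernel), and all names here are made up for the example.

```python
import numpy as np

def block_sparse_attention(q, k, v, block_size=4, top_k=1):
    """Toy block-sparse attention: each query attends only to the
    top_k key blocks ranked by a coarse (mean-pooled) block score.
    Illustrative sketch only -- not DeepSeek's NSA algorithm."""
    n, d = q.shape
    num_blocks = n // block_size
    # Coarse block representatives: mean of the keys in each block.
    k_blocks = k[: num_blocks * block_size].reshape(num_blocks, block_size, d).mean(axis=1)
    out = np.zeros_like(q)
    for i in range(n):
        # Cheap relevance score per block; keep only the top_k blocks.
        block_scores = k_blocks @ q[i]
        keep = np.argsort(block_scores)[-top_k:]
        idx = np.concatenate(
            [np.arange(b * block_size, (b + 1) * block_size) for b in keep]
        )
        # Ordinary softmax attention, restricted to the selected keys.
        scores = k[idx] @ q[i] / np.sqrt(d)
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        out[i] = weights @ v[idx]
    return out
```

With `top_k` blocks instead of all `n` keys, the per-query cost drops from O(n·d) to O(top_k·block_size·d), which is where the inference speedup comes from; the paper's contribution is making such a selection trainable end to end rather than applied post hoc.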