Why do LLMs attend to the first token?

2 adhi01 1 5/15/2025, 11:15:20 PM arxiv.org

Comments (1)

maytc · 3h ago
Curious if the authors had a chance to look at the Softpick paper? https://arxiv.org/abs/2504.20966