Theoretical Analysis of Positional Encodings in Transformer Models
26 points by PaulHoule | 6/27/2025, 10:07:11 PM | arxiv.org
Comments (2)
semiinfinitely · 4h ago
Kinda disappointing that RoPE, the most common positional encoding, is given about one sentence in this work and omitted from the analysis.
gsf_emergency_2 · 2h ago
Maybe it's because RoPE by itself does nothing for model capacity?
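For context on the capacity point above, here is a minimal NumPy sketch of RoPE (rotary positional embedding), using the standard frequency base from the original RoPE formulation rather than anything from this thread. RoPE applies a position-dependent 2D rotation to pairs of query/key dimensions; because a rotation is norm-preserving and introduces no learned parameters, it is one sense in which RoPE "adds nothing" to raw capacity, while still making relative positions visible to attention as phase differences in dot products.

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary positional embedding to x of shape (seq_len, dim).

    Each pair of dimensions (x1[i], x2[i]) at position m is rotated by
    the angle m * freqs[i], so the q.k dot product between positions m
    and n depends only on the relative offset n - m.
    """
    seq_len, dim = x.shape
    half = dim // 2
    # Per-pair frequencies; base=10000 is the conventional choice.
    freqs = base ** (-np.arange(half) / half)               # (half,)
    angles = np.arange(seq_len)[:, None] * freqs[None, :]   # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # A rotation per pair: norm-preserving, no trainable parameters.
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)
```

A quick way to see the relative-position property: rotate a constant vector at every position, then compare dot products between position pairs with the same offset, e.g. `rope(q)[2] @ rope(k)[5]` matches `rope(q)[1] @ rope(k)[4]`.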