Theoretical Analysis of Positional Encodings in Transformer Models

24 points by PaulHoule | 6/27/2025, 10:07:11 PM | arxiv.org

Comments (2)

semiinfinitely · 3h ago
Kinda disappointing that RoPE, the most common PE, gets about one sentence in this work and is omitted from the analysis.
gsf_emergency_2 · 55m ago
Maybe it's because RoPE by itself does nothing for model capacity?
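(For context on the capacity point: RoPE applies a fixed, parameter-free rotation to queries and keys, so attention scores depend only on relative position. A minimal sketch, assuming the standard Su et al.-style angle schedule; the function name, shapes, and half-split pairing here are illustrative assumptions, not taken from the paper.)

```python
import numpy as np

def rope(x, pos):
    # x: vector of even dimension d (a query or key row).
    # Each pair of dims (i, i + d/2) is rotated by a position-dependent
    # angle theta_i = pos / 10000^(2i/d); no learned parameters involved.
    d = x.shape[-1]
    half = d // 2
    theta = pos / (10000.0 ** (np.arange(half) / half))
    x1, x2 = x[..., :half], x[..., half:]
    return np.concatenate(
        [x1 * np.cos(theta) - x2 * np.sin(theta),
         x1 * np.sin(theta) + x2 * np.cos(theta)], axis=-1)

q = np.random.randn(8)
k = np.random.randn(8)

# Norm-preserving rotation: adds no capacity in the parameter-count sense.
print(np.allclose(np.linalg.norm(q), np.linalg.norm(rope(q, 5))))  # True

# Dot products depend only on the relative offset (3->7 vs 0->4).
print(np.isclose(rope(q, 3) @ rope(k, 7), rope(q, 0) @ rope(k, 4)))  # True
```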