Theoretical Analysis of Positional Encodings in Transformer Models arxiv.org 19 points by PaulHoule 6 hours ago
semiinfinitely 2 hours ago Kinda disappointing that rope- the most common pe- is given about one sentence in this work and omitted from the analysis.
Kinda disappointing that rope- the most common pe- is given about one sentence in this work and omitted from the analysis.