Refrag: Rethinking RAG Based Decoding

2 datadrivenangel 1 9/11/2025, 4:22:46 PM arxiv.org ↗

Comments (1)

datadrivenangel · 3h ago
Am I misunderstanding this, or is it basically just taking the RAG results, doing a vector search on them, and passing only some of them to the context window?

Also, why do these AI papers never report speedups in human time units?