Strengths and limitations of diffusion language models

27 rbanffy 1 5/22/2025, 10:10:09 AM seangoedecke.com ↗

Comments (1)

cubefox · 1h ago
That's a nice explanation. I wonder whether autoregressive and diffusion language models could be combined such that the model only denoises the (most recent) end of a sequence of text, like a paragraph, while the rest is unchangeable and allows for key-value caching.
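
A minimal sketch of what that could look like, under loud assumptions: `toy_denoise_step` is a hypothetical stand-in for a real model call, and the `frozen` list only stands in for a key-value cache (in a real transformer it would hold per-layer key/value tensors). The point is the control flow: only the newest block is ever rewritten, and earlier tokens are appended once and never touched again.

```python
import random

MASK = "<mask>"


def toy_denoise_step(frozen, block, rng):
    """Hypothetical model call: fill in one masked position of the active block.

    A real diffusion LM would predict tokens conditioned on the cached
    prefix; here the "prediction" is just a label for the absolute position.
    """
    masked = [i for i, tok in enumerate(block) if tok == MASK]
    i = rng.choice(masked)  # positions are unmasked in arbitrary order
    new_block = list(block)
    new_block[i] = f"tok{len(frozen) + i}"
    return new_block


def generate(num_blocks, block_size, seed=0):
    rng = random.Random(seed)
    frozen = []      # stands in for the KV cache of finished, unchangeable text
    model_calls = 0
    for _ in range(num_blocks):
        block = [MASK] * block_size      # only this block is denoised
        while MASK in block:
            block = toy_denoise_step(frozen, block, rng)
            model_calls += 1
        frozen.extend(block)             # frozen once, never revisited
    return frozen, model_calls


if __name__ == "__main__":
    tokens, calls = generate(num_blocks=2, block_size=3)
    print(tokens, calls)
```

Across blocks this behaves autoregressively (left to right, cacheable prefix), while inside a block tokens can be filled in any order, which is the diffusion-like part. Whether a trained model would keep quality under this split is not something the sketch says anything about.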