TransMLA: Multi-head latent attention is all you need
40 points by ocean_moist 5/13/2025, 3:29:47 AM 2 comments (arxiv.org)
Comments (2)
kavalg · 16m ago
My (possibly wrong) TLDR: TransMLA is a method to "compress" an already trained GQA model, with the additional option of fine-tuning it further. It should make inference faster.
olq_plo · 50m ago
Very cool idea. Can't wait for converted models on HF.