LNS-Madam: Low-Precision Training in Log Using Multiplicative Weight Update
2 nabla9 1 8/23/2025, 7:50:57 AM arxiv.org ↗
Comments (1)
nabla9 · 6h ago
The new DeepSeek V3.1 was trained using the UE8M0 FP8 scale data format. From the paper's abstract: compared to FP32 and FP8, LNS-Madam reduces energy consumption by over 90% and 55%, respectively.
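
For anyone unfamiliar with UE8M0: it is an unsigned 8-bit exponent with no mantissa bits, so every representable scale is a power of two, which is essentially a log-domain number. Here is a rough sketch of the encoding, assuming the OCP Microscaling E8M0 convention (bias 127, code 0xFF reserved for NaN). The function names and the round-up policy are my own choices for illustration, not anything taken from DeepSeek or the paper.

```python
import math

E8M0_BIAS = 127   # assumption: OCP Microscaling E8M0 exponent bias
E8M0_NAN = 0xFF   # assumption: all-ones code is reserved for NaN

def encode_ue8m0(scale: float) -> int:
    """Round a positive scale UP to a power of two and return its 8-bit code.

    Rounding up is an illustrative choice; other rounding policies exist.
    """
    if not (scale > 0.0) or math.isinf(scale):
        return E8M0_NAN  # zero, negatives, NaN and inf have no valid code
    # frexp gives scale = mantissa * 2**exp with 0.5 <= mantissa < 1
    mantissa, exp = math.frexp(scale)
    exponent = exp - 1 if mantissa == 0.5 else exp  # ceil(log2(scale))
    # Clamp to the valid code range 0..254 (255 is NaN)
    return max(0, min(exponent + E8M0_BIAS, 254))

def decode_ue8m0(code: int) -> float:
    """Return the power-of-two scale 2**(code - bias) for an 8-bit code."""
    if code == E8M0_NAN:
        return math.nan
    return math.ldexp(1.0, code - E8M0_BIAS)

if __name__ == "__main__":
    for s in (0.25, 1.0, 3.0):
        c = encode_ue8m0(s)
        print(s, "->", c, "->", decode_ue8m0(c))
```

Since the scale is a pure power of two, applying it to a float only shifts the exponent field, which is the same kind of multiply-becomes-add trick a logarithmic number system relies on.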