LNS-Madam: Low-Precision Training in Log Using Multiplicative Weight Update

2 nabla9 1 8/23/2025, 7:50:57 AM arxiv.org ↗

Comments (1)

nabla9 · 6h ago
The new DeepSeek V3.1 was trained using the UE8M0 FP8 scale data format. Per the paper, LNS-Madam reduces energy consumption by over 90% compared to FP32 and by 55% compared to FP8.
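
For context on the UE8M0 format mentioned in the comment: the name denotes an unsigned value with 8 exponent bits and 0 mantissa bits, so each code stands for a pure power of two, which is in effect a logarithmic representation. Below is a minimal sketch of that idea, assuming the E8M0 conventions of the OCP Microscaling (MX) format (bias 127, code 0xFF reserved for NaN). The function names `e8m0_encode` and `e8m0_decode` are hypothetical, and rounding to the nearest exponent is an illustrative choice, not necessarily what any particular training stack does.

```python
import math

E8M0_BIAS = 127   # exponent bias, assuming OCP MX E8M0 conventions
E8M0_NAN = 0xFF   # reserved NaN encoding under the same assumption


def e8m0_encode(scale: float) -> int:
    """Map a positive scale to a biased 8-bit exponent (hypothetical helper).

    The scale is rounded to the nearest power of two in the log domain;
    real implementations may round up instead to avoid overflow.
    """
    if not (scale > 0.0) or math.isinf(scale):
        return E8M0_NAN  # zero, negative, NaN, and inf have no E8M0 code
    exponent = round(math.log2(scale))
    exponent = max(-127, min(127, exponent))  # clamp to representable range
    return exponent + E8M0_BIAS


def e8m0_decode(code: int) -> float:
    """Return the power-of-two scale that an E8M0 code represents."""
    if code == E8M0_NAN:
        return math.nan
    return 2.0 ** (code - E8M0_BIAS)


def e8m0_multiply(a: int, b: int) -> int:
    """Multiply two scales by adding their exponents (no multiplier needed)."""
    if a == E8M0_NAN or b == E8M0_NAN:
        return E8M0_NAN
    exponent = (a - E8M0_BIAS) + (b - E8M0_BIAS)
    exponent = max(-127, min(127, exponent))  # saturate on overflow
    return exponent + E8M0_BIAS
```

Because the stored value is only an exponent, multiplying two scales reduces to an integer addition, which is the same property a logarithmic number system exploits more generally.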