Nvidia announces 4-bit training with NVFP4

1 point by opcode84 · 8/25/2025, 4:54:35 PM · developer.nvidia.com

Comments (1)

opcode84 · 2h ago
A version of the 12B Hybrid Mamba-Transformer model was initially trained in 8-bit precision (FP8), which previous studies have shown closely matches 16-bit precision, and which therefore served as our baseline for comparison. We then successfully trained the same 12B model from scratch using NVFP4, demonstrating that this new low-precision format can support full pretraining at trillion-token scale. The NVFP4 run exhibited stable convergence, without the training instabilities or divergence issues that typically plague ultra-low-precision training.
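
For context on what "4-bit" means here: NVFP4 is described by NVIDIA as a block-scaled 4-bit floating-point format, with E2M1 elements (2 exponent bits, 1 mantissa bit) and a shared scale per 16-element micro-block stored in FP8 (E4M3), plus an additional per-tensor scale in practice. The snippet below is a minimal NumPy sketch of that quantization idea, not NVIDIA's kernels; the function names (`quantize_nvfp4_block`, `dequantize`) are made up for illustration, and details such as the exact rounding mode and the per-tensor scale are omitted.

```python
import numpy as np

# Representable non-negative magnitudes of an E2M1 4-bit float
# (2 exponent bits, 1 mantissa bit): 0, 0.5, 1, 1.5, 2, 3, 4, 6.
E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_nvfp4_block(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Quantize one 16-element micro-block to E2M1 values plus a shared scale.

    Returns (q, scale) such that q * scale approximates x.
    (Illustrative sketch; not NVIDIA's implementation.)
    """
    assert x.size == 16, "NVFP4 uses 16-element micro-blocks"
    # Choose the scale so the largest magnitude in the block maps to the
    # top of the E2M1 range (6.0). In hardware this per-block scale is
    # itself stored as an FP8 (E4M3) value; it stays a Python float here.
    amax = float(np.abs(x).max())
    scale = amax / 6.0 if amax > 0 else 1.0
    scaled = x / scale
    # Round each element's magnitude to the nearest E2M1 grid point,
    # then restore its sign.
    mag = np.abs(scaled)
    idx = np.abs(mag[:, None] - E2M1_GRID).argmin(axis=1)
    q = np.sign(scaled) * E2M1_GRID[idx]
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q * scale

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    block = rng.standard_normal(16).astype(np.float32)
    q, s = quantize_nvfp4_block(block)
    err = float(np.abs(dequantize(q, s) - block).max())
    print(f"max abs reconstruction error: {err:.4f}")
```

The per-block scale is what makes 4 bits workable at all: with only eight representable magnitudes, a single tensor-wide scale would clip or flush most values, whereas re-scaling every 16 elements keeps the local dynamic range inside the E2M1 grid.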