Nvidia announces 4-bit training with NVFP4
2 points by opcode84 on 8/25/2025, 4:54:35 PM | 1 comment | developer.nvidia.com ↗
Comments (1)
opcode84 · 1d ago
A version of the 12B Hybrid Mamba-Transformer model was initially trained with 8-bit precision (FP8), which previous studies have shown closely matches 16-bit precision; it served as our baseline for comparison. We then successfully trained the same 12B model from scratch using NVFP4, demonstrating that this new low-precision format can support full pretraining at trillion-token scale. The NVFP4 run exhibited stable convergence, without the training instabilities or divergence issues that typically plague ultra-low-precision training.
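For a rough sense of what a blockwise 4-bit format involves, here is a minimal fake-quantization sketch in NumPy. It assumes 16-element blocks that each share one scale factor and snaps values to the FP4 E2M1 grid; the actual NVFP4 recipe described in the post additionally stores block scales in FP8 and applies a second per-tensor FP32 scale, neither of which is modeled here.

  import numpy as np

  # Magnitudes representable by an FP4 E2M1 value (sign stored separately)
  E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0], dtype=np.float32)

  def fake_quantize_fp4(x, block=16):
      """Round each block of x to the nearest FP4 level after scaling the
      block so its largest magnitude maps to 6.0 (the E2M1 maximum)."""
      x = np.asarray(x, dtype=np.float32)
      out = np.empty_like(x)
      for i in range(0, x.size, block):
          chunk = x[i:i + block]
          amax = np.abs(chunk).max()
          scale = amax / E2M1_GRID[-1] if amax > 0 else np.float32(1.0)
          scaled = chunk / scale
          # Snap each magnitude to the nearest representable FP4 level
          idx = np.abs(np.abs(scaled)[:, None] - E2M1_GRID[None, :]).argmin(axis=1)
          out[i:i + block] = np.sign(scaled) * E2M1_GRID[idx] * scale
      return out

  w = np.random.randn(64).astype(np.float32)
  print(np.abs(w - fake_quantize_fp4(w)).mean())  # mean quantization error

In real mixed-precision training, quantization like this is typically applied only to the matrix-multiply inputs, while master weights and optimizer state stay in higher precision, which is part of why convergence can remain stable at such low bit widths.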