Nvidia announces 4-bit training with NVFP4

1 point by opcode84 · 8/25/2025, 4:54:35 PM · developer.nvidia.com

Comments (1)

opcode84 · 2h ago
A version of the 12B Hybrid Mamba-Transformer model was initially trained in 8-bit precision (FP8), which previous studies have shown closely matches 16-bit precision, and which therefore served as our baseline for comparison. We then successfully trained the same 12B model from scratch using NVFP4, demonstrating that this new low-precision format can support full pretraining at trillion-token scale. The NVFP4 run exhibited stable convergence, without the training instabilities or divergence issues that typically plague ultra-low-precision training.
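
For context on what "4-bit" means here: NVFP4 is described by NVIDIA as a block-scaled 4-bit floating-point format, with E2M1 elements (2 exponent bits, 1 mantissa bit) and a shared scale per 16-element micro-block stored in FP8 (E4M3), plus an additional per-tensor scale in practice. The snippet below is a minimal NumPy sketch of that quantization idea, not NVIDIA's kernels; the function names (`quantize_nvfp4_block`, `dequantize`) are made up for illustration, and details such as the exact rounding mode and the per-tensor scale are omitted.

```python
import numpy as np

# Representable non-negative magnitudes of an E2M1 4-bit float
# (2 exponent bits, 1 mantissa bit): 0, 0.5, 1, 1.5, 2, 3, 4, 6.
E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_nvfp4_block(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Quantize one 16-element micro-block to E2M1 values plus a shared scale.

    Returns (q, scale) such that q * scale approximates x.
    (Illustrative sketch; not NVIDIA's implementation.)
    """
    assert x.size == 16, "NVFP4 uses 16-element micro-blocks"
    # Choose the scale so the largest magnitude in the block maps to the
    # top of the E2M1 range (6.0). In hardware this per-block scale is
    # itself stored as an FP8 (E4M3) value; it stays a Python float here.
    amax = float(np.abs(x).max())
    scale = amax / 6.0 if amax > 0 else 1.0
    scaled = x / scale
    # Round each element's magnitude to the nearest E2M1 grid point,
    # then restore its sign.
    mag = np.abs(scaled)
    idx = np.abs(mag[:, None] - E2M1_GRID).argmin(axis=1)
    q = np.sign(scaled) * E2M1_GRID[idx]
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q * scale

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    block = rng.standard_normal(16).astype(np.float32)
    q, s = quantize_nvfp4_block(block)
    err = float(np.abs(dequantize(q, s) - block).max())
    print(f"max abs reconstruction error: {err:.4f}")
```

The per-block scale is what makes 4 bits workable at all: with only eight representable magnitudes, a single tensor-wide scale would clip or flush most values, whereas re-scaling every 16 elements keeps the local dynamic range inside the E2M1 grid.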