96% TCO and energy savings: 65 racks of eight-way air-cooled HGX H100 versus 1 rack of liquid-cooled GB200 NVL72, at equivalent performance on GPT-MoE-1.8T real-time inference throughput.
Big if true. Energy and cooling costs can represent up to 30-40% of the total cost of setting up and running an AI data center.
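For scale, a back-of-the-envelope sketch of how a cooling-efficiency gap plus rack consolidation can compound into a large energy-cost saving. Every power, PUE, and price figure below is an illustrative assumption of mine, not a number from NVIDIA's claim:

```python
# All figures are illustrative assumptions, not NVIDIA's numbers.
H100_RACKS = 65          # air-cooled eight-way HGX H100 racks (from the claim)
GB200_RACKS = 1          # liquid-cooled GB200 NVL72 racks (from the claim)
H100_RACK_KW = 40.0      # assumed IT power per H100 rack (illustrative)
GB200_RACK_KW = 120.0    # assumed IT power for the denser NVL72 rack (illustrative)
PUE_AIR = 1.5            # assumed PUE for air cooling (illustrative)
PUE_LIQUID = 1.1         # assumed PUE for liquid cooling (illustrative)
PRICE_PER_KWH = 0.10     # assumed electricity price in USD (illustrative)
HOURS_PER_YEAR = 8760

def annual_energy_cost(racks, rack_kw, pue):
    """Yearly electricity cost for a fleet of racks at a given PUE."""
    return racks * rack_kw * pue * HOURS_PER_YEAR * PRICE_PER_KWH

h100_cost = annual_energy_cost(H100_RACKS, H100_RACK_KW, PUE_AIR)
gb200_cost = annual_energy_cost(GB200_RACKS, GB200_RACK_KW, PUE_LIQUID)
savings = 1 - gb200_cost / h100_cost

print(f"H100 fleet:      ${h100_cost:,.0f}/yr")
print(f"GB200 NVL72 rack: ${gb200_cost:,.0f}/yr")
print(f"Energy savings:   {savings:.0%}")
```

With these made-up inputs the energy saving lands in the mid-90s percent, so the headline number is at least arithmetically plausible if the performance-equivalence premise holds; the real question is whether 65:1 rack equivalence is fair.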
Sweepi · 24m ago
[+114% Attention acceleration]
Any idea how they got +50% FP4 from the same silicon? "Firmware" improvements?
Or did they find a way to disable the INT8 and FP64 units and reuse them, e.g. as overspill registers?
Any other ideas why INT8/FP64 is down 97% on the same chip? QA/certification issues?
In case you want to compare the complete specs: I would post them here, but since HN supports less formatting than early-2000s BB forums, check them here: https://www.forum-3dcenter.org/vbulletin/showpost.php?p=1380...