Hierarchical Reasoning Model outperforms LLMs at reasoning tasks

geox · 8/27/2025, 1:00:37 PM · livescience.com

Comments (1)

nabla9 · 1h ago
Reality check.

https://arcprize.org/blog/hrm-analysis

...we made some surprising findings that call into question the prevailing narrative around HRM:

1. The "hierarchical" architecture had minimal performance impact when compared to a similarly sized transformer.

2. However, the relatively under-documented "outer loop" refinement process drove substantial performance, especially at training time.

3. Cross-task transfer learning has limited benefits; most of the performance comes from memorizing solutions to the specific tasks used at evaluation time.

4. Pre-training task augmentation is critical, though only 300 augmentations are needed (not 1K augmentations as reported in the paper). Inference-time task augmentation had limited impact.

Findings 2 & 3 suggest that the paper's approach is fundamentally similar to Liao and Gu's "ARC-AGI without pretraining".
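
For readers unfamiliar with the "outer loop" mentioned in finding 2, here's a minimal sketch of that style of iterative refinement with deep supervision, written from the blog post's description rather than from the HRM code. The module, grid encoding, and 8-step budget below are illustrative assumptions, not details from the paper.

```python
# Sketch: outer-loop refinement with per-iteration supervision (illustrative only).
# The model repeatedly re-reads its own previous guess and is trained at every step.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyRefiner(nn.Module):
    """Stand-in model: maps (task input, previous guess) -> refined logits."""
    def __init__(self, vocab=10, dim=64):
        super().__init__()
        self.embed_in = nn.Embedding(vocab, dim)
        self.embed_prev = nn.Embedding(vocab, dim)
        self.body = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, vocab))

    def forward(self, grid, prev_guess):
        x = torch.cat([self.embed_in(grid), self.embed_prev(prev_guess)], dim=-1)
        return self.body(x)  # (batch, cells, vocab) logits

def outer_loop_train_step(model, grid, target, optimizer, max_steps=8):
    """One training step: the prediction from iteration t becomes the input to t+1."""
    guess = torch.zeros_like(grid)              # start from a blank guess
    total_loss = 0.0
    for _ in range(max_steps):
        logits = model(grid, guess)
        total_loss = total_loss + F.cross_entropy(   # supervise every iteration
            logits.reshape(-1, logits.size(-1)), target.reshape(-1)
        )
        guess = logits.argmax(dim=-1).detach()       # feed the refined guess back in
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()
    return total_loss.item()
```

Here `grid` and `target` would be integer-encoded puzzle grids of the same flattened shape; the point is only that the refinement loop, not the architecture inside `model`, is doing much of the work, which is what findings 1 and 2 claim.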