AI agents fail tasks 70% of the time
12 JTbane 4 8/12/2025, 3:01:07 PM arxiv.org ↗
Comments (4)
rogerkirkness · 17h ago
Agents went from 10% to 30% reliable this year, which is still a big deal.
thebigspacefuck · 9h ago
This is from Dec 2024, which feels like a while ago.
gavinray · 17h ago
So you ask it to try every task 3.33 times for guaranteed success?
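(For what it's worth, 1/0.3 ≈ 3.33 is the expected number of attempts until the first success, not a guarantee. A rough sketch of the arithmetic, assuming every attempt is independent and succeeds with probability 0.3. Independence is my assumption, not something the paper claims, and the function names are made up for illustration.)

```python
import math

def p_at_least_one_success(p: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries succeeds."""
    return 1.0 - (1.0 - p) ** attempts

def attempts_needed(p: float, target: float) -> int:
    """Smallest number of independent tries that reaches `target` overall success."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))

# Assumed per-attempt success rate of 30%, taken from the quoted headline figure.
for n in (1, 3, 4, 10):
    print(n, round(p_at_least_one_success(0.3, n), 3))

print(attempts_needed(0.3, 0.99))
```

Under that assumption, 3 tries gets you to about 66%, and you'd need 13 tries for 99%. It never reaches 100%, and if failures are correlated (the agent keeps failing the same task for the same reason), retries help even less.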
JTbane · 17h ago
"We test baseline agents powered by both closed API-based and open-weights language models (LMs), and find that the most competitive agent can complete 30% of tasks autonomously."