When LLMs Grow Hands and Feet, How to Design Our Agentic RL Systems?

3 points by amberjcjj · 1 comment · 9/5/2025, 8:32:24 PM · amberljc.github.io

Comments (1)

amberjcjj · 1h ago

Lately I’ve been building AI agents for scientific research. Beyond building better agent scaffolds, making AI agents truly useful requires LLMs to do more than just think: they need to use tools, run code, and interact with complex environments. That’s why we need agentic RL.

While working on this, I noticed that the underlying RL systems must evolve to support these new capabilities. So I wrote a blog post to capture my thoughts and lessons learned.

“When LLMs Grow Hands and Feet, How to Design Our Agentic RL Systems?”

TL;DR: The frontier of AI is moving from generating single responses to solving complex, multi-step problems through agents. Previous RL frameworks for LLMs aren’t built for this: they struggle with the heavy, heterogeneous resource demands that agents impose, such as isolated environments and tool interactions.
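As a rough illustration of that shift (a minimal sketch, not taken from the blog post), the Python below contrasts a single-response rollout with a multi-step agentic rollout. Every name here (`Trajectory`, `single_turn_rollout`, `agentic_rollout`, `toy_policy`, `calc`, the `final:` convention) is a hypothetical stand-in, and the policy is a scripted function rather than an LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Trajectory:
    # Each step is an (action, observation) pair.
    steps: List[Tuple[str, str]] = field(default_factory=list)
    reward: float = 0.0


def single_turn_rollout(policy: Callable[[str], str], prompt: str,
                        reward_fn: Callable[[str], float]) -> Trajectory:
    # Single-response setting: one generation, scored immediately.
    response = policy(prompt)
    return Trajectory(steps=[(response, "")], reward=reward_fn(response))


def agentic_rollout(policy: Callable[[str], str], prompt: str,
                    tools: Dict[str, Callable[[str], str]],
                    reward_fn: Callable[[str], float],
                    max_steps: int = 8) -> Trajectory:
    # Agentic setting: the policy alternates between acting and observing.
    # Each tool call may need its own resources (sandbox, network, ...),
    # which is where the extra system load comes from.
    traj = Trajectory()
    context = prompt
    for _ in range(max_steps):
        action = policy(context)
        if action.startswith("final:"):  # hypothetical stop convention
            traj.steps.append((action, ""))
            traj.reward = reward_fn(action[len("final:"):].strip())
            break
        name, _, arg = action.partition(" ")
        tool = tools.get(name)
        observation = tool(arg) if tool else f"unknown tool: {name}"
        traj.steps.append((action, observation))
        context += f"\n{action}\n{observation}"
    return traj


def toy_policy(context: str) -> str:
    # Scripted stand-in for an LLM: call the calculator once, then answer.
    if "result=" not in context:
        return "calc 6*7"
    return "final: " + context.rsplit("result=", 1)[1].strip()


def calc(expr: str) -> str:
    # Toy tool that only understands "a*b".
    a, b = expr.split("*")
    return f"result={int(a) * int(b)}"


def reward(answer: str) -> float:
    return 1.0 if answer == "42" else 0.0


traj = agentic_rollout(toy_policy, "What is 6*7?", {"calc": calc}, reward)
single = single_turn_rollout(lambda p: "42", "What is 6*7?", reward)
```

The reward in the agentic case only arrives after a variable number of tool calls, so rollout time and resource use differ per trajectory instead of being one uniform generation step.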

In the blog, I cover:

How RL for LLM-based agents differs from traditional RL for LLMs.

The critical system challenges when scaling agentic RL.

Emerging solutions that top labs and companies are using.

If you’re interested in agentic intelligence (LLMs that don’t just think but act), I go into the nuts and bolts of what it takes to make this work in practice.