When LLMs Grow Hands and Feet, How to Design Our Agentic RL Systems?

3 amberjcjj 1 9/5/2025, 8:32:24 PM amberljc.github.io

Comments (1)

amberjcjj · 1h ago
Lately I’ve been building AI agents for scientific research. Beyond building a better agent scaffold, making AI agents truly useful requires LLMs that do more than just think: they need to use tools, run code, and interact with complex environments. That’s why we need Agentic RL.

While working on this, I noticed that the underlying RL systems must evolve to support these new capabilities. So I wrote a blog post to capture my thoughts and lessons learned.

“When LLMs Grow Hands and Feet, How to Design Our Agentic RL Systems?”

TL;DR: The frontier of AI is moving from simple response generation to solving complex, multi-step problems through agents. Previous RL frameworks for LLMs aren’t built for this: they struggle with the heavy, heterogeneous resource demands of agents, such as isolated environments and tool interactions.

In the blog, I cover:

How RL for LLM-based agents differs from traditional RL for LLMs.

The critical system challenges that arise when scaling agentic RL.

The emerging solutions that top labs and companies are using.

If you’re interested in agentic intelligence (LLMs that don’t just think but act), the post goes into the nuts and bolts of what it takes to make this work in practice.