Show HN: I replicated GRPO and made it one-click runnable on HPC-AI.com

20 points by cheerGPU | 6/23/2025, 10:19:10 AM | hpc-ai.com | 6 comments
Hi HN,

I’m excited to share RUNRL JOB, our new one-click service for running Reinforcement Learning Fine-Tuning (RFT) workloads—think GRPO, PPO, or any custom reward-based tuning—directly on HPC-AI.com.

What It Is

Pre-wired RFT pipeline: dual-network configs, memory optimizations, logging, and reward modules are all set up for you.

Model support: demos with Qwen-3B and Qwen-1.5 work out of the box; drop in your own model if you like.

Cost & performance transparency: real-hardware benchmarks on 8× H100/H200, with live metrics in TensorBoard and built-in cost tracking.

Why It Matters

Memory-efficient GRPO: up to 40% memory savings vs. PPO, since GRPO needs no separate value network and no second backward pass for a critic.

Zero setup: no Dockerfiles, no dependency hell—just click “Start” and your training job spins up.

Accessible RLHF: lowers the barrier for researchers, students, and indie hackers to experiment at scale.
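
The memory claim above comes from GRPO's core design: instead of PPO's learned value baseline, it normalizes each sampled completion's reward against its own group's statistics. A minimal sketch of that group-relative advantage computation (illustrative only, not HPC-AI's actual implementation):

```python
import statistics

def grpo_advantages(group_rewards):
    """Group-relative advantages as used by GRPO: each completion's
    reward is normalized by the mean and std of its sampling group,
    so no learned value network (critic) is needed as a baseline."""
    mean = statistics.mean(group_rewards)
    std = statistics.pstdev(group_rewards) or 1.0  # fallback avoids div-by-zero on uniform groups
    return [(r - mean) / std for r in group_rewards]

# Example: 4 completions sampled for one prompt, scored by a reward model
advantages = grpo_advantages([1.0, 0.0, 0.5, 0.5])
```

Advantages within a group sum to zero, so the policy gradient pushes probability toward above-average completions and away from below-average ones without ever training a critic.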

How to Try

Visit the blog post: https://hpc-ai.com/blog/RUNRL_JOB_is_live_on_hpc-ai

Click “Launch GPU Instances” and choose H100 or H200.

Select the RUNRL JOB template and hit “Start Job”.

Monitor progress live in JupyterLab or via TensorBoard—zero extra setup.

Comments (6)

thisisaacc · 8h ago
It's interesting: most clouds can only provide SFT, not the latest RFT.
cheerGPU · 8h ago
Would love any feedback if you give it a try!
icemount · 8h ago
Is it free?
cheerGPU · 8h ago
Sign up today to claim $6 credit!
adamfly · 8h ago
You are the GOAT GPU Cloud
cheerGPU · 8h ago
If you're training big models or running GRPO at scale, we’re here to make it fast, affordable, and hassle-free. Let me know if you ever need a trial code or want to spin something up — HPC-AI.COM's got you covered!