Show HN: I replicated GRPO and made it one-click runnable on HPC-AI.com
I’m excited to share RUNRL JOB, our new one-click service for running Reinforcement Learning Fine-Tuning (RFT) workloads—think GRPO, PPO, or any custom reward-based tuning—directly on HPC-AI.com.
What It Is

Pre-wired RFT pipeline: dual-network configs, memory optimizations, logging, and reward modules are all set up for you (a sketch of a custom reward function follows this list).
Model support: demos with Qwen-3B and Qwen-1.5 out of the box; drop in your own model if you like.
Cost & performance transparency: real-hardware benchmarks on 8× H100/H200, with live metrics in TensorBoard and built-in cost tracking.
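For illustration, here is a minimal sketch of the kind of rule-based reward a custom reward module might compute. The function name math_reward and its signature are hypothetical, not the actual RUNRL JOB interface; it just scores a completion for answer format and correctness.

```python
# Hypothetical rule-based reward function. This illustrates the kind of logic a
# custom reward module might wrap; the signature is an assumption, not the
# actual RUNRL JOB interface.
import re


def math_reward(completion: str, reference_answer: str) -> float:
    """Score a completion: +0.1 for using a \\boxed{} answer, +1.0 if it is correct."""
    reward = 0.0
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match:
        reward += 0.1                                    # format bonus
        if match.group(1).strip() == reference_answer.strip():
            reward += 1.0                                # correctness bonus
    return reward


print(math_reward(r"The answer is \boxed{42}", "42"))    # 1.1
print(math_reward("The answer is 42", "42"))             # 0.0
```

Whatever interface the pipeline exposes, a scalar score like this is ultimately what the RL loop maximizes.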
Why It Matters

Memory-efficient GRPO: up to 40% memory savings vs. PPO, since there is no separate value network and no double backward pass (see the advantage-computation sketch after this list).
Zero setup: no Dockerfiles, no dependency hell—just click “Start” and your training job spins up.
Accessible RLHF: lowers the barrier for researchers, students, and indie hackers to experiment at scale.
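To make the memory argument concrete, here is a minimal sketch of GRPO's group-relative advantage estimate, assuming PyTorch; the function compute_group_advantages is illustrative and not part of RUNRL JOB. Each prompt's sampled responses are scored and normalized against their own group, so no learned value network (critic) is needed as a baseline.

```python
# Minimal sketch of GRPO's group-relative advantage (illustrative only, not the
# RUNRL JOB API). For each prompt, sample a group of responses, score them with
# the reward function, and normalize within the group -- no learned critic.
import torch


def compute_group_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """rewards: [num_prompts, group_size] scalar rewards for sampled responses."""
    mean = rewards.mean(dim=-1, keepdim=True)   # per-group baseline
    std = rewards.std(dim=-1, keepdim=True)     # per-group scale
    return (rewards - mean) / (std + eps)       # advantage per response


# Example: one prompt with a group of 4 sampled responses.
rewards = torch.tensor([[0.2, 0.9, 0.5, 0.1]])
print(compute_group_advantages(rewards))
```

Because the baseline is just the group's own statistics, there is no critic to store or update, which is where the savings over PPO come from.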
How to Try

Visit the blog post: https://hpc-ai.com/blog/RUNRL_JOB_is_live_on_hpc-ai
Click “Launch GPU Instances”, choose H100 or H200.
Select the RUNRL JOB template and hit “Start Job”.
Monitor progress live in JupyterLab or via TensorBoard, with zero extra setup (a notebook snippet for pulling up TensorBoard is sketched below).
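If you prefer to drive TensorBoard from inside the JupyterLab session, a notebook cell like the following works with the standard tensorboard Jupyter extension; the log directory is a placeholder, so point it at wherever your job writes its event files.

```python
# Run in a JupyterLab notebook cell. The log directory below is a placeholder --
# replace it with the path your training job uses for TensorBoard event files.
%load_ext tensorboard
%tensorboard --logdir ./logs
```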