HN Reader
OpenAI: support for Reinforcement Fine-tuning available to verified orgs
1
justanotheratom
1
5/8/2025, 9:00:31 PM
twitter.com ↗
Comments (1)
justanotheratom
· 7h ago
my question for anyone who knows:
Between SFT, DPO, and RFT: when should you use which? And can we mix and match, e.g., first SFT, then DPO?