OpenAI: support for Reinforcement Fine-tuning available to verified orgs

1 justanotheratom 1 5/8/2025, 9:00:31 PM twitter.com

Comments (1)

justanotheratom · 3h ago
my question for anyone who knows:

Between SFT, DPO, and RFT:
- when to use which?
- can we mix and match? e.g., first SFT, then DPO.