Live on Prem Studio: GPT-OSS Models

prem_studio · 8/14/2025, 6:56:38 AM · studio.premai.io

Comments (1)

prem_studio · 2h ago
Live on Prem Studio - We’ve added OpenAI’s gpt‑oss models (20 B & 120 B) to our inference lineup! OpenAI’s gpt‑oss launch on 5 Aug 2025 marks their first open‑weight LLM release since GPT‑2. According to their own benchmarks, the 120 B model rivals OpenAI o4‑mini in reasoning while running on a single 80 GB GPU, and the 20 B model delivers o3‑mini‑class performance on devices with just 16 GB of memory Both models use a Mixture‑of‑Experts architecture, so only 5.1 B or 3.6 B parameters are active per tokens, giving them powerful reasoning ability without massive hardware. What this means on Prem Studio: 1.Two sizes for different use cases: gpt‑oss‑120B for frontier‑level reasoning; gpt‑oss‑20B for edge‑friendly local deployment. 2.Privacy‑preserving inference: run these models entirely within your VPC or on‑prem—no data leaves your environment. 3.Enterprise performance at scale: our optimised stack lets you go from notebooks to production pipelines with low latency. 4.Custom fine‑tuning: we’re working on in‑platform fine‑tuning for gpt‑oss—coming soon to studio.premai.io.

At Prem, we believe local AI is graduating from hobbyist projects to mainstream enterprise adoption. With gpt‑oss, organisations get open‑weight freedom plus frontier‑level reasoning.

Try it now: studio.premai.io. Stay tuned for our upcoming deep‑dive report and fine‑tuning release!