GPT-OSS-20B at 10k tokens/second on a 4090? Sure

1 diggan 1 8/17/2025, 9:17:14 PM old.reddit.com

Comments (1)

diggan · 2h ago
From submitter:

> here's GPT-OSS-20b running 100 agents simultaneously at 131,076 token context, each agent with its own 131k context window, each hitting sub-100ms ttft, blowing everything out of the sky at 10k tokens/second.
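
For scale, here is a rough back-of-the-envelope sketch of what the quoted figures imply. It assumes the 10k tokens/second is the aggregate across all 100 agents, which the quote does not actually say; if it were per agent, the numbers below would not apply. The function names are made up for illustration.

```python
# Back-of-the-envelope arithmetic on the quoted numbers.
# ASSUMPTION (not stated in the quote): 10k tokens/second is the aggregate
# throughput across all agents, shared evenly between them.

def per_agent_throughput(aggregate_tok_per_s: float, num_agents: int) -> float:
    """Tokens/second each agent sees if the aggregate is split evenly."""
    return aggregate_tok_per_s / num_agents


def total_context_tokens(num_agents: int, context_per_agent: int) -> int:
    """Total tokens of context held if every agent fills its own window."""
    return num_agents * context_per_agent


AGENTS = 100            # "100 agents simultaneously"
CONTEXT = 131_076       # per-agent context, exactly as quoted
AGGREGATE = 10_000      # "10k tokens/second", assumed to be the aggregate

if __name__ == "__main__":
    print(per_agent_throughput(AGGREGATE, AGENTS))   # 100.0
    print(total_context_tokens(AGENTS, CONTEXT))     # 13107600
```

Under that assumption, each agent would see about 100 tokens/second, and fully populated windows would add up to roughly 13.1 million tokens of context in total. Whether all of that is resident on the 4090 at once is not something the quote specifies.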