GPT-5 is worse than GPT-4.1-mini for text and worse than Sonnet 4 for coding

5 hitradostava 7 8/10/2025, 10:16:49 AM
It seems that OpenAI have got the PR machine working amazingly. The Cursor CEO said it's the best, as did Simon Willison (https://simonwillison.net/2025/Aug/7/gpt-5/).

But I've found it terrible. For coding (in Cursor), it's slow, often fails on tool calls (no MCP, just stock Cursor tools), and stored some new application state in globalThis, something no model has attempted in over a year of very heavy Cursor / Claude Code use.
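
For anyone unfamiliar with the pattern, here is a minimal TypeScript sketch of what putting application state on globalThis looks like, next to a more conventional alternative. This is not the code GPT-5 generated; the names `__appState`, `incrementGlobal`, and `createCounterStore` are hypothetical and only illustrate the idea.

```typescript
// Anti-pattern (illustrative): state attached to the global object.
// Any code in the same runtime can read or overwrite it, and the cast
// to `any` hides the field from the type checker.
function incrementGlobal(): number {
  const g = globalThis as any;
  g.__appState = g.__appState ?? { count: 0 }; // hypothetical key name
  g.__appState.count += 1;
  return g.__appState.count;
}

// More conventional (illustrative): state owned by a closure and
// reachable only through a typed interface.
interface CounterStore {
  increment(): number;
  current(): number;
}

function createCounterStore(): CounterStore {
  let count = 0; // private to this store instance
  return {
    increment: () => ++count,
    current: () => count,
  };
}
```

The second form keeps each store isolated, so unrelated code cannot change the count by accident, which is why state on globalThis stands out in a review.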

For a summarization/insights API that I work on, it was way worse than gpt-4.1-mini. I tried both GPT-5 mini and full GPT-5, with different reasoning settings. It didn't follow instructions, and output was worse across all my evals, even after heavy prompt adjustment. I did a lot of sampling and the results were consistently bad.

Am I the only one? Has anyone seen actual real-world benefits of GPT-5 vs other models?

Comments (7)

canerdogan · 12m ago
GPT-5 isn’t really a brand-new model in the way people think. From what I’ve seen, the goal was more about reducing costs and unifying the interface than releasing a totally different architecture. Under the hood it is still routing to models we already know, just picking what it thinks will give the “best” result for the request.

That can be fine for a lot of general use cases, but if you’re working in specific domains like coding agents or high-precision summarization, that routing can actually make results worse compared to sticking with a model you know performs well for your workload.

8thcross · 12m ago
I tried it with cursor-agent, their CLI, and it generated better code than expected. YMMV. It was more thoughtful and strategic than the other frontier models.

tim_angus · 41m ago
And yet the media keeps using the term "exponential improvement"...

cranberryturkey · 1h ago
It solved a huge bug I've been struggling with.

hitradostava · 1h ago
Had Sonnet 4 not been able to?

cranberryturkey · 1h ago
No, it kept going in circles... spent like 3 weeks trying to fix it. Got access to GPT-5 yesterday and all major bugs are resolved.

revskill · 1h ago
Sure.