GPT5 is worse than 4.1-mini for text and worse than Sonnet 4 for coding

5 points by hitradostava · 7 comments · 8/10/2025, 10:16:49 AM

It seems that OpenAI have got the PR machine working amazingly. The Cursor CEO said it's the best, as did Simon Willison (https://simonwillison.net/2025/Aug/7/gpt-5/).

But I've found it terrible. For coding (in Cursor), it's slow, often fails with tool calls (no MCP, just stock Cursor tools), and it stored some new application state in globalThis, something that no model has ever attempted in over a year of very heavy Cursor / Claude Code use.
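For readers unfamiliar with the pattern, here is a minimal sketch of what keeping application state on globalThis looks like, next to a more conventional closure-scoped alternative. Every name in it (`__appState`, `incrementGlobal`, `createCounter`) is hypothetical and is not taken from the code GPT-5 actually generated.

```typescript
// Hypothetical illustration; none of these names come from the original post.

// The pattern described above: state attached to the global object,
// so any module in the process can read or overwrite it.
type GlobalWithState = typeof globalThis & { __appState?: { count: number } };

function incrementGlobal(): number {
  const g = globalThis as GlobalWithState;
  if (!g.__appState) {
    g.__appState = { count: 0 };
  }
  g.__appState.count += 1;
  return g.__appState.count;
}

// A more conventional alternative: state held in a closure and
// exposed only through a small, explicit API.
function createCounter() {
  let count = 0;
  return {
    increment(): number {
      count += 1;
      return count;
    },
    get value(): number {
      return count;
    },
  };
}

const counter = createCounter();
incrementGlobal();
incrementGlobal();
counter.increment();
```

The global version works, but the state has no owner: nothing stops unrelated code from mutating it, which is why it tends to stand out in review.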

For a summarization/insights API that I work on, it was way worse than gpt-4.1-mini. I tried both mini and full gpt5, with different reasoning settings. It didn't follow instructions, and output was worse across all my evals, even after heavy prompt adjustment. I did a lot of sampling and the results were objectively bad.
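A rough sketch of the kind of comparison described here: run each model over the same eval cases, sample several times per case, and compare mean scores. The two models are local stubs standing in for real API calls, and the scorer, case format, and all names are assumptions for illustration, not the poster's actual harness.

```typescript
// Hypothetical sketch: the models, scorer, and eval cases are stand-ins.
type Model = (prompt: string) => string;
type EvalCase = { prompt: string; mustInclude: string };

// Score one output: 1 if it contains the required text, otherwise 0.
function score(output: string, c: EvalCase): number {
  return output.includes(c.mustInclude) ? 1 : 0;
}

// Mean score for a model over all cases, sampling each case `samples` times.
function meanScore(model: Model, cases: EvalCase[], samples: number): number {
  let total = 0;
  for (const c of cases) {
    for (let i = 0; i < samples; i++) {
      total += score(model(c.prompt), c);
    }
  }
  return total / (cases.length * samples);
}

// Stub models in place of real API calls (assumption: real ones would be async).
const followsInstructions: Model = (p) => `Summary: ${p}`;
const ignoresInstructions: Model = (p) => p.toUpperCase();

const cases: EvalCase[] = [
  { prompt: "quarterly revenue rose", mustInclude: "Summary:" },
  { prompt: "churn fell in March", mustInclude: "Summary:" },
];

const results = {
  a: meanScore(followsInstructions, cases, 3),
  b: meanScore(ignoresInstructions, cases, 3),
};
```

With real models the outputs vary between samples, which is why each case is scored several times before the means are compared.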

Am I the only one? Has anyone seen actual real-world benefits of GPT-5 vs other models?

Comments (7)

canerdogan · 12m ago

GPT-5 isn’t really a brand-new model in the way people think. From what I’ve seen, the goal was more about reducing costs and unifying the interface than releasing a totally different architecture. Under the hood it is still routing to models we already know, just picking what it thinks will give the “best” result for the request.

That can be fine for a lot of general use cases, but if you’re working in specific domains like coding agents or high-precision summarization, that routing can actually make results worse compared to sticking with a model you know performs well for your workload.

8thcross · 12m ago

I tried it with cursor-agent, their CLI, and it generated better code than expected. YMMV. It was more thoughtful and strategic than the other frontier models.

tim_angus · 41m ago

And yet the media keeps using the term "exponential improvement"...

cranberryturkey · 1h ago

It solved a huge bug I've been struggling with.

hitradostava · 1h ago

Had Sonnet 4 not been able to?

cranberryturkey · 1h ago

No, it kept going in circles... I spent like 3 weeks trying to fix it. Got access to GPT-5 yesterday and all major bugs are resolved.

revskill · 1h ago

Sure.