GPT-5 on SWE-bench: Cost and performance deep-dive

4 lieret 3 8/8/2025, 4:29:14 PM mini-swe-agent.com ↗

Comments (3)

lieret · 3h ago
We evaluated the new GPT models with a minimal agent on SWE-bench Verified. GPT-5 scores 65%, mini 60%, nano 35%. Still behind Opus 4.1 (68%), on par with Sonnet 4 (65%). But a lot cheaper, especially mini!

Cost is tricky to compare with agents, because agents succeed fast but fail slowly. If an agent doesn't succeed, it just keeps trying until it either succeeds or hits a runtime limit. And that's (almost) what happens in practice.
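To make the "succeed fast, fail slowly" asymmetry concrete, here's a back-of-envelope sketch. All the numbers (steps per run, cost per step) are made-up placeholders, not measured values from the evaluation:

```python
# Hypothetical numbers for illustration only (not measured values):
# a successful run finishes in a few steps, while a failing run
# burns its whole step budget before the agent gives up.
steps_success = 30    # assumed average steps on solved instances
steps_fail = 100      # assumed step limit reached on unsolved instances
cost_per_step = 0.02  # assumed average $ per agent step

def expected_cost(solve_rate: float) -> float:
    """Average $ per instance: solved runs stop early,
    failed runs keep going until the step limit."""
    return (solve_rate * steps_success
            + (1 - solve_rate) * steps_fail) * cost_per_step

# A weaker model pays twice: it solves less AND spends more per run.
print(round(expected_cost(0.65), 2))  # → 1.09
print(round(expected_cost(0.35), 2))  # → 1.51
```

This is why average cost per instance penalizes weaker models disproportionately: failed runs dominate the bill.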

But even so, it's very clear that

1. GPT-5 is cheaper than Sonnet 4
2. GPT-5-mini is _incredibly_ cheap for what it provides (you only sacrifice some 5 percentage points, but end up paying maybe 1/5th of the total cost)
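One way to read that tradeoff is cost per resolved instance. The solve rates and the 1/5th cost ratio come from the comment above; the dollar figures themselves are made up for illustration:

```python
# Illustrative only: solve rates are from the benchmark numbers above,
# but the total_cost dollar amounts are assumed, keeping the
# "maybe 1/5th of the total cost" ratio mentioned in the comment.
gpt5 = {"solve_rate": 0.65, "total_cost": 100.0}
gpt5_mini = {"solve_rate": 0.60, "total_cost": 20.0}

n_instances = 500  # SWE-bench Verified has 500 instances

for name, m in [("gpt-5", gpt5), ("gpt-5-mini", gpt5_mini)]:
    solved = m["solve_rate"] * n_instances
    cost_per_solve = m["total_cost"] / solved
    print(name, round(cost_per_solve, 3))
# gpt-5      ≈ $0.308 per resolved instance
# gpt-5-mini ≈ $0.067 per resolved instance
```

Under these assumptions, mini resolves each instance at well under a quarter of the cost, which is the "incredibly cheap for what it provides" point in the list above.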

All of the code to reproduce our numbers is open-source, and there's a box at the bottom of the page with the exact command to run.

Also very happy to answer questions here!

techpineapple · 3h ago
I'm curious whether this might help with Cursor's lighting-money-on-fire problem?

https://pivot-to-ai.com/2025/07/09/cursor-tries-setting-less...

Is this enough of a price difference to make Cursor profitable?

lieret · 3h ago
I think gpt-5-mini should really help them. Judging by these benchmark scores, there probably wouldn't be a huge performance degradation from letting gpt-5-mini drive most of the workflow. Of course, users might still want to run with the latest and greatest (but even then, gpt-5 will be cheaper, I think).