I want everything local – Building my offline AI workspace

83 points by mkagenius | 8/8/2025, 6:19:05 PM | instavm.io

Comments (11)

tcdent · 9m ago
I'm constantly tempted by the idealism of this experience, but when you factor in the performance of the models you have access to, and the cost of running them on-demand in a cloud, it's really just a fun hobby instead of a viable strategy to benefit your life.

As the hardware continues to iterate at a rapid pace, anything you pick up second-hand will still depreciate at that pace, making any real investment in hardware unjustifiable.

Coupled with the dramatically inferior performance of the weights you would be running in a local environment, it's just not worth it.

I expect this will change in the future, and am excited to invest in a local inference stack when sufficiently capable weights become available. Until then, you're idling a relatively expensive, rapidly depreciating asset.
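To make that trade-off concrete, here is a rough back-of-envelope sketch. Every price and usage figure below is an illustrative assumption, not a quote from the comment or the article.

```python
# Back-of-envelope comparison of local hardware vs. on-demand cloud inference.
# Every number below is an illustrative assumption, not a real quote.

hardware_cost = 2000.00      # assumed up-front cost of a local inference box (USD)
resale_after_2yr = 800.00    # assumed resale value after two years of depreciation
monthly_power = 15.00        # assumed electricity cost for light home use (USD/month)

cloud_cost_per_mtok = 3.00   # assumed blended API price per million tokens (USD)
monthly_tokens_m = 20        # assumed monthly usage in millions of tokens

months = 24
local_total = (hardware_cost - resale_after_2yr) + monthly_power * months
cloud_total = cloud_cost_per_mtok * monthly_tokens_m * months

print(f"Local (2 yr, incl. depreciation): ${local_total:,.0f}")
print(f"Cloud (2 yr, pay-as-you-go):      ${cloud_total:,.0f}")
# With these assumptions the cloud comes out cheaper at low volume, which is
# the commenter's point: the local box sits mostly idle while it loses value.
```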

braooo · 2m ago
Running LLMs at home is a repeat of the mess we make with "run a K8s cluster at home" thinking.

You're not OpenAI or Google. Just use PyTorch, OpenCV, etc. to build the small models you need.

You don't even need Docker! You can share over a simple code-based HTTP router app and pre-shared certs with friends.

You're recreating the patterns required to manage a massive data center in 2-3 computers in your closet. That's insane.
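A minimal sketch of that "no Docker, just a small HTTPS endpoint with pre-shared certs" idea. The cert.pem/key.pem pair is assumed to be a self-signed certificate you generate and share out of band; predict() is a placeholder for whatever small PyTorch/OpenCV model you actually run.

```python
# Tiny local inference endpoint over HTTPS with a pre-shared, self-signed cert.
import json
import ssl
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(text: str) -> dict:
    # Placeholder for a real local model call (PyTorch, OpenCV, etc.).
    return {"input": text, "label": "demo"}

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(predict(payload.get("text", ""))).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    httpd = HTTPServer(("0.0.0.0", 8443), Handler)
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain("cert.pem", "key.pem")  # pre-shared, self-signed pair
    httpd.socket = ctx.wrap_socket(httpd.socket, server_side=True)
    httpd.serve_forever()
```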

shaky · 33m ago
This is something that I think about quite a bit and am grateful for this write-up. The amount of friction to get privacy today is astounding.

noelwelsh · 28m ago
It's the hardware more than the software that is the limiting factor at the moment, no? Hardware to run a good LLM locally starts around $2000 (e.g. Strix Halo / AI Max 395). I think a few Strix Halo iterations will make it considerably easier.
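The rule of thumb behind that price point is that the weights have to fit in (unified) memory. A rough estimate follows; the 1.2x overhead factor for KV cache and runtime is an assumption, and the numbers are approximate.

```python
# Rough memory estimate for running a model locally.
# The 1.2x overhead factor (KV cache, activations, runtime) is an assumption.
def approx_gb(params_billion: float, bits_per_weight: int, overhead: float = 1.2) -> float:
    bytes_weights = params_billion * 1e9 * bits_per_weight / 8
    return bytes_weights * overhead / 1e9

for params, bits in [(8, 4), (32, 4), (70, 4), (70, 8)]:
    print(f"{params}B @ {bits}-bit ~ {approx_gb(params, bits):.0f} GB")
# 8B  @ 4-bit ~  5 GB
# 32B @ 4-bit ~ 19 GB
# 70B @ 4-bit ~ 42 GB
# 70B @ 8-bit ~ 84 GB  -> why 96-128 GB unified-memory boxes are the target
```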
ramesh31 · 19m ago
>Hardware to run a good LLM locally starts around $2000 (e.g. Strix Halo / AI Max 395). I think a few Strix Halo iterations will make it considerably easier.

And "good" is still questionable. The thing that makes this stuff useful is when it works instantly like magic. Once you find yourself fiddling around with subpar results at slower speeds, essentially all of the value is gone. Local models have come a long way but there is still nothing even close to Claude levels when it comes to coding. I just tried taking the latest Qwen and GLM models for a spin through OpenRouter with Cline recently and they feel roughly on par with Claude 3.0. Benchmarks are one thing, but reality is a completely different story.

ahmedbaracat · 27m ago
Thanks for sharing. Note that the GitHub link at the end of the article is not working…

mkagenius · 19m ago
Thanks for the heads up. It's fixed now:

Coderunner-UI: https://github.com/instavm/coderunner-ui

Coderunner: https://github.com/instavm/coderunner

navbaker · 22m ago
Open WebUI is a great alternative for a chat interface. You can point it at an OpenAI-compatible API like vLLM or use the native Ollama integration, and it has cool features like being able to say something like “generate code for an HTML and JavaScript pong game” and have it display the running code inline with the chat for testing.
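The same idea works from plain Python against any OpenAI-compatible local endpoint. The base URL below is Ollama's usual default (vLLM typically serves on port 8000); the model name is an assumption for whatever you have pulled locally.

```python
# Query a local OpenAI-compatible endpoint (Ollama here) with the openai SDK.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="not-needed-locally",          # local servers usually ignore the key
)

resp = client.chat.completions.create(
    model="llama3.1",  # assumed local model name
    messages=[{"role": "user", "content": "Generate HTML/JS for a pong game."}],
)
print(resp.choices[0].message.content)
```
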
pyman · 11m ago
Mr Stallman? Richard, is that you?

dmezzetti · 20m ago
I built txtai with this philosophy in mind: https://github.com/neuml/txtai
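For readers who haven't seen it, a minimal local semantic-search sketch with txtai follows. The embedding model name and constructor style are assumptions; exact options vary between txtai versions, so treat this as illustrative rather than canonical.

```python
# Minimal local semantic-search sketch with txtai; runs entirely on your machine.
from txtai.embeddings import Embeddings

docs = [
    "Run LLMs and embeddings fully offline",
    "Cloud APIs trade privacy for convenience",
    "Strix Halo class hardware has lots of unified memory",
]

# Model path is an assumed example; any sentence-transformers model works.
embeddings = Embeddings({"path": "sentence-transformers/all-MiniLM-L6-v2"})
embeddings.index([(i, text, None) for i, text in enumerate(docs)])

# Returns (id, score) pairs for the best local match.
print(embeddings.search("private offline AI", 1))
```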