It's the hardware more than the software that is the limiting factor at the moment, no? Hardware to run a good LLM locally starts around $2000 (e.g. Strix Halo / AI Max 395). I think a few more Strix Halo iterations will make it considerably easier.
And "good" is still questionable. The thing that makes this stuff useful is when it works instantly like magic. Once you find yourself fiddling around with subpar results at slower speeds, essentially all of the value is gone. Local models have come a long way but there is still nothing even close to Claude levels when it comes to coding. Benchmarks are one thing, but reality is a completely different story.
navbaker · 7m ago
Open Web UI is a great alternative for a chat interface. You can point it at an OpenAI-compatible API like vLLM or use the native Ollama integration, and it has cool features like being able to say something like “generate code for an HTML and JavaScript pong game” and have it display the running code inline with the chat for testing.
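For what it's worth, here's a minimal sketch of what "pointing at an OpenAI-compatible API" looks like: vLLM's built-in server exposes /v1 endpoints on port 8000 by default, and Open Web UI talks to the same kind of base URL. The model name here is just a placeholder for whatever you're actually serving:

    # Minimal sketch: querying a local vLLM server through its
    # OpenAI-compatible endpoint (the same API style Open Web UI points at).
    from openai import OpenAI

    # vLLM's server defaults to http://localhost:8000/v1; no real key needed locally.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

    resp = client.chat.completions.create(
        model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model name
        messages=[{"role": "user",
                   "content": "Generate code for an HTML and JavaScript pong game."}],
    )
    print(resp.choices[0].message.content)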
shaky · 18m ago
This is something that I think about quite a bit and am grateful for this write-up. The amount of friction to get privacy today is astounding.
ahmedbaracat · 12m ago
Thanks for sharing. Note that the GitHub link at the end of the article is not working…
https://simonwillison.net/2025/Jul/29/space-invaders/