I want everything local – Building my offline AI workspace

72 points by mkagenius on 8/8/2025, 6:19:05 PM · instavm.io · 10 comments

Comments (10)

tcdent · 4m ago
I'm constantly tempted by the idealism of this experience, but when you factor in the realistic performance of the models you have access to, and the cost of running them on-demand in the cloud, it's really just a fun hobby rather than a viable strategy to benefit your life.

As the hardware continues to iterate at a rapid pace, anything you pick up second-hand will still depreciate at that pace, making any real investment in hardware hard to justify.

Coupled with the dramatically inferior performance of the weights you would be running in a local environment, it's just not worth it.

I expect this will change in the future, and am excited to invest in a local inference stack when competitive weights become available. Until then, you're idling a relatively expensive, rapidly depreciating asset.

noelwelsh · 23m ago
It's the hardware more than the software that is the limiting factor at the moment, no? Hardware to run a good LLM locally starts around $2000 (e.g. Strix Halo / AI Max 395). I think a few Strix Halo iterations will make it considerably easier.
ramesh31 · 14m ago
And "good" is still questionable. The thing that makes this stuff useful is when it works instantly like magic. Once you find yourself fiddling around with subpar results at slower speeds, essentially all of the value is gone. Local models have come a long way but there is still nothing even close to Claude levels when it comes to coding. I just tried taking the latest Qwen and GLM models for a spin through OpenRouter with Cline recently and they feel roughly on par with Claude 3. Benchmarks are one thing, but reality is a completely different story.
shaky · 28m ago
This is something that I think about quite a bit and am grateful for this write-up. The amount of friction to get privacy today is astounding.
ahmedbaracat · 22m ago
Thanks for sharing. Note that the GitHub link at the end of the article is not working…
mkagenius · 14m ago
Thanks for the heads up. It's fixed now:

Coderunner-UI: https://github.com/instavm/coderunner-ui

Coderunner: https://github.com/instavm/coderunner

navbaker · 18m ago
Open Web UI is a great alternative for a chat interface. You can point it at an OpenAI-compatible API like vLLM, or use the native Ollama integration. It has cool features like being able to say something like “generate code for an HTML and JavaScript pong game” and having it display the running code inline with the chat for testing.
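
To make the "point it at a local endpoint" part concrete, here is a minimal sketch of talking to a local model through an OpenAI-compatible API. It assumes Ollama is serving on its default port (11434) with a model already pulled; the model name and prompt are illustrative, and vLLM would work the same way on its own port:

```python
# Minimal sketch: chat with a local model via an OpenAI-compatible endpoint.
# Assumes `ollama pull llama3` has been run and Ollama is listening on 11434.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # local server, no cloud round-trip
    api_key="unused",  # local servers ignore the key, but the client requires one
)

response = client.chat.completions.create(
    model="llama3",  # whatever model name your local server exposes
    messages=[
        {"role": "user", "content": "Generate code for an HTML and JavaScript pong game."}
    ],
)
print(response.choices[0].message.content)
```

The same client code works against any frontend or server in this space that speaks the OpenAI API, which is what makes swapping between vLLM, Ollama, and hosted providers painless.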
pyman · 6m ago
Mr Stallman? Richard, is that you?
dmezzetti · 15m ago
I built TxtAI with this philosophy in mind: https://github.com/neuml/txtai
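
For a flavor of the library, a minimal local-only sketch (the model name and documents below are illustrative, and the exact API surface may vary by txtai version):

```python
# Minimal sketch of txtai's embeddings workflow: build a local semantic
# index and query it, with no external services involved.
from txtai import Embeddings

# Uses a local sentence-transformers model; downloaded once, then cached.
embeddings = Embeddings(path="sentence-transformers/all-MiniLM-L6-v2")

documents = [
    "Run language models entirely on your own hardware",
    "Cloud APIs send your prompts to a third party",
]
embeddings.index(documents)

# Returns (id, score) pairs ranked by semantic similarity.
print(embeddings.search("keep inference private", 1))
```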