Ask HN: What Does Your Self-Hosted LLM Stack Look Like in 2025?
17 anditherobot 6 6/5/2025, 1:00:20 AM
Back when web development was taking off, there was always a go-to stack: something like Postgres + Django + jQuery, or .NET + Bootstrap, SQLite. Over the years we ended up with proven tech and proven patterns like MVC, SPA, etc.
Now that local LLMs are gaining traction, I’m wondering what the equivalent stack looks like today.
Models, runtime, hardware, and other tools: something that could rival the Claudes, ChatGPTs, or Geminis.
Thanks
Tbh, for coding I just use the smaller ones like CodeQwen 7B. Way faster and good enough for autocomplete. I only fire up the big model when I actually need it to think.
The annoying part is keeping everything updated: a new model drops every week, and half of them don't work with whatever you're already running.
The models vary depending on the task. DeepSeek distilled has been a favorite for the past several months.
I use various smaller (~3B) models for simpler tasks.
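For anyone who wants to wire up the "small model by default, big model only when needed" pattern described in this thread, here is a minimal sketch of a task router. Everything in it is an assumption for illustration: the model tags, the task labels, and the `pick_model` helper are hypothetical, and it only chooses a model name rather than calling any particular runtime.

```python
from dataclasses import dataclass

# Hypothetical model tags; substitute whatever your runtime actually serves.
SMALL_MODEL = "small-3b"
CODE_MODEL = "codeqwen-7b"
LARGE_MODEL = "large-reasoning"


@dataclass(frozen=True)
class Route:
    model: str
    reason: str


def pick_model(task: str, needs_reasoning: bool = False) -> Route:
    """Pick a model tier for a task. Task labels are made up for this sketch."""
    # An explicit request for deeper reasoning always wins.
    if needs_reasoning:
        return Route(LARGE_MODEL, "explicit request for deeper reasoning")
    # Autocomplete is latency-sensitive, so use the small coding model.
    if task == "autocomplete":
        return Route(CODE_MODEL, "latency matters more than depth")
    # Simple, well-bounded tasks go to the smallest model.
    if task in {"classify", "extract", "summarize"}:
        return Route(SMALL_MODEL, "simple task, small model is enough")
    # Anything unrecognized falls back to the big model.
    return Route(LARGE_MODEL, "unknown task, fall back to the big model")
```

Calling `pick_model("autocomplete")` selects the coding model, while `pick_model("autocomplete", needs_reasoning=True)` escalates to the large one. Keeping the routing table in one place also makes the weekly model churn a bit less painful, since swapping a model is a one-line change.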