Ask HN: What's Your Useful Local LLM Stack?

1 Olshansky 0 7/15/2025, 3:17:16 PM

What I’m asking HN:

What does your actually useful local LLM stack look like?

I’m looking for something that provides you with real value — not just a sexy demo.

---

After a recent internet outage, I realized I need a local LLM setup as a backup — not just for experimentation and fun.

My daily (remote) LLM stack:

  - Claude Max ($100/mo): My go-to for pair programming. Heavy user of both the Claude web and desktop clients.

  - Windsurf Pro ($15/mo): Love the multi-line autocomplete and how it uses clipboard/context awareness.

  - ChatGPT Plus ($20/mo): My rubber duck, editor, and ideation partner. I use it for everything except code.

Here’s what I’ve cobbled together for my local stack so far:

Tools

  - Ollama: for running models locally

  - Aider: Claude-code-style CLI interface

  - VSCode w/ continue.dev extension: local chat & autocomplete

Models

  - Chat: llama3.1:latest

  - Autocomplete: Qwen2.5 Coder 1.5B

  - Coding/Editing: deepseek-coder-v2:16b

Things I’m not worried about:

  - CPU/Memory (running on an M1 MacBook)

  - Cost (within reason)

  - Data privacy / being trained on (not trying to start a philosophical debate here)

I am worried about:

  - Actual usefulness (i.e. “vibes”)

  - Ease of use (tools that fit with my muscle memory)

  - Correctness (not benchmarks)

  - Latency & speed

Right now: I’ve got it working. I could make a slick demo. But it’s not actually useful yet.

---

Who I am

  - CTO of a small startup (5 amazing engineers)

  - 20 years of coding (since I was 13)

  - Ex-big tech

Ask HN: Is it time to fork HN into AI/LLM and "Everything else/other?"

Ask HN: How did Soham Parekh get so many jobs?

Cyberpunk and Politics: Neon Dystopias, Power, and Resistance

Ask HN: How is my MacBook temp getting misread?

The German Works Council has blocked Amazon's performance reviews

Ask HN: How much of OpenAI code is written by AI?

Is Firebase Console Down

Cloudflare DNS Down in UK/EU

Cloudflare's 1.1.1.1 DNS server seems to be down

Tell HN: 1.1.1.1 Appears to Be Down

Telnyx launches automatic noise suppression for AI Voice Agents

Ask HN: How are you productively using Claude code?

Ask HN: How to find mentors while working remote?

Dyan – A Visual REST API Builder You Can Self-Host

Tell HN: Lobste.rs blocking the Brave browser

Ask HN: What's your favorite book you've read?

Ask HN: What is a physiically disabled person to do in this job market?

Ask HN: Battery life for graphical Linux VMs (or Asahi) on Apple Silicon laptops

Ask HN: How do you get first 10 customers?

Tell HN: I Lost Joy of Programming

Is making the rust compiler slow a billion dollar mistake?

Open-source STM32 autopilot for long-range fixed-wing UAVs (SmartNavX)

Ask HN: Are there any tools for tracking GPU prices over time?

Ask HN: Could the C64 startup screen have encouraged more users to learn BASIC?

Attended Windsurf's Build Night 18 hours before founders joined Google DeepMind

Ramanujan-Computing: Distributed Computing with Idle Smart Devices: Open-Source

Ask HN: Looking for a directory of PS1 command prompts. Like awesome lists

Ask HN: Worth leaving position over push to adopt vibe coding?

Co-founder exiting after pivot – what's a fair exit package?

Ask HN: What's Your Useful Local LLM Stack?

Comments (0)