Show HN: Local LLM Notepad – run a GPT-style model from a USB stick

18 davidye324 4 6/30/2025, 11:43:37 PM github.com ↗

What it is

A single 45 MB Windows .exe that embeds llama.cpp and a minimal Tk UI. Copy it (plus any .gguf model) to a flash drive, double-click it on any Windows PC, and you’re chatting with an LLM, with no admin rights, cloud, or network required.

Why I built it

Existing “local LLM” GUIs assume you can pip install, pass long CLI flags, or download GBs of extras.

I wanted something my less-technical colleagues could run during a client visit by literally plugging in a USB drive.

How it works

A PyInstaller one-file build bundles the Python runtime, llama_cpp_python, and the UI into a single PE.

On first launch, it memory-maps the .gguf; subsequent prompts stream at ~20 tok/s on an i7-10750H with gemma-3-1b-it-Q4_K_M.gguf (0.8 GB).

Tick-driven render loop keeps the UI responsive while llama.cpp crunches.
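One way a tick-driven loop like this can be arranged is sketched below. It is not the project's actual code: it assumes generation runs on a worker thread that pushes tokens onto a queue, and that the UI drains the queue on a timer via Tk's `after`. The names `fake_token_stream`, `start_generation`, `drain`, and `main` are hypothetical, and the fake stream stands in for a real llama.cpp streaming call.

```python
import queue
import threading
import time


def fake_token_stream(prompt):
    """Hypothetical stand-in for a streaming llama.cpp call: yields words slowly."""
    for word in f"echo: {prompt}".split():
        time.sleep(0.01)  # simulate per-token latency
        yield word + " "


def start_generation(prompt, out_q, stream=fake_token_stream):
    """Run the blocking generator on a worker thread and push tokens onto out_q."""
    def work():
        for tok in stream(prompt):
            out_q.put(tok)
        out_q.put(None)  # sentinel: generation finished

    t = threading.Thread(target=work, daemon=True)
    t.start()
    return t


def drain(out_q, append):
    """Called once per UI tick: hand every pending token to `append` without blocking.

    Returns True once the end-of-generation sentinel has been seen.
    """
    done = False
    while True:
        try:
            tok = out_q.get_nowait()
        except queue.Empty:
            return done
        if tok is None:
            done = True
        else:
            append(tok)


def main():
    """Minimal Tk window that polls the queue every 30 ms (assumed interval)."""
    import tkinter as tk

    root = tk.Tk()
    text = tk.Text(root, wrap="word")
    text.pack(fill="both", expand=True)

    q = queue.Queue()
    start_generation("hello from a USB stick", q)

    def tick():
        finished = drain(q, lambda tok: text.insert("end", tok))
        if not finished:
            root.after(30, tick)  # reschedule; the event loop is free between ticks

    tick()
    root.mainloop()
```

Calling `main()` opens the window. Because only the tick callback touches the widget, all Tk calls stay on the main thread, and the window keeps repainting while the worker is busy.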

A parser bold-underlines every token that originated in the prompt; Ctrl+click pops a “source viewer” to trace facts. (Helps spot hallucinations fast.)
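A rough sketch of the highlighting idea follows. It is an assumption about how such a parser could work, not the tool's implementation: it treats a "token" as a regex word, matches case-insensitively, and returns character spans of response words that also appear in the prompt. `prompt_spans` and `render_marked` are hypothetical names; the second one marks matches with `**` so the result is visible without a GUI.

```python
import re

WORD = re.compile(r"\w+")


def prompt_spans(prompt, response):
    """Return (start, end) character spans of response words that also occur in the prompt."""
    seen = {m.group().lower() for m in WORD.finditer(prompt)}
    return [m.span() for m in WORD.finditer(response) if m.group().lower() in seen]


def render_marked(response, spans):
    """Plain-text rendering: wrap each matched span in ** so matches stand out."""
    out, last = [], 0
    for start, end in spans:
        out.append(response[last:start])
        out.append("**" + response[start:end] + "**")
        last = end
    out.append(response[last:])
    return "".join(out)


prompt = "Invoice 4417 is due Friday"
response = "The invoice is due on Monday"
spans = prompt_spans(prompt, response)
print(render_marked(response, spans))
# prints: The **invoice** **is** **due** on Monday
```

In a Tk `Text` widget the same spans could be applied as a tag, for example `text.tag_add("src", f"1.0+{start}c", f"1.0+{end}c")`, with the tag configured as bold and underlined. In this example "Monday" stays unmarked because it never appeared in the prompt, which is the kind of cue that helps spot an invented detail.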

Comments (4)

gxonatano · 5h ago

> walk up to any computer

Windows users seem to think their OS is ubiquitous. But in fact for most hackers reading this site, using Windows is a huge step backwards in productivity and capability.

Zetaphor · 3h ago

Surely you're hinting at Linux, in which case this runs fine with WINE

ensocode · 2h ago

Interesting, will definitely try it. What can be expected? Which other models perform OK with this?

ge96 · 6h ago

Wonder if you can use/interface with those coral accelerator boards
