Quick demo - working VSCode + local AI in 30 seconds:

  curl -L https://github.com/Michael-A-Kuykendall/shimmy/releases/late...
  ./shimmy serve
  # Point VSCode/Cursor to localhost:11435

The technical achievement: Got it down to 5.1MB by stripping everything
except pure inference. Written in Rust, uses llama.cpp's engine.
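For the Rust-curious, most of the size win comes from the usual release-profile knobs for small binaries. A generic sketch (standard Cargo settings, not necessarily shimmy's exact profile):

  # Typical size-focused release build: optimize for size, enable LTO,
  # use a single codegen unit, strip symbols. All plain Cargo options.
  CARGO_PROFILE_RELEASE_OPT_LEVEL="z" \
  CARGO_PROFILE_RELEASE_LTO="true" \
  CARGO_PROFILE_RELEASE_CODEGEN_UNITS="1" \
  CARGO_PROFILE_RELEASE_STRIP="symbols" \
  cargo build --release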
One feature I'm excited about: You can use LoRA adapters directly without
converting them. Just point to your .gguf base model and .gguf LoRA -
it handles the merge at runtime. Makes iterating on fine-tuned models
much faster since there's no conversion step.
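For comparison, this is conceptually the same base-model-plus-adapter mechanism that llama.cpp itself exposes through its --lora flag - the flags below are llama-server's, not shimmy's, shown only to illustrate the idea:

  # llama.cpp's own server, loading a GGUF base model plus a GGUF LoRA adapter.
  llama-server -m base-model.gguf --lora my-adapter.gguf --port 8080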
Your data never leaves your machine. No telemetry. No accounts. Just a
tiny binary that makes GGUF models work with your AI coding tools.
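For a quick smoke test from the command line, something like this should work (a sketch: the route shown is the standard OpenAI-style /v1/chat/completions, and "my-model" is a placeholder for whatever model name gets picked up):

  # Plain OpenAI-style chat completion request against the local server.
  curl -s http://localhost:11435/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "my-model", "messages": [{"role": "user", "content": "Hello"}]}'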
Would love feedback on the auto-discovery feature - it finds your models
automatically so you don't need any configuration.
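One way to sanity-check what was discovered before pointing an editor at it (this assumes the standard OpenAI-style model-listing route is exposed):

  # List the models the server knows about.
  curl -s http://localhost:11435/v1/models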
What's your local LLM setup? Are you using LoRA adapters for anything specific?
carlos_rpn · 1d ago
You may have noticed already, but the link to the binary is throwing a 404.
MKuykendall · 1d ago
This should be fixed now!
stupidgeek314 · 18h ago
Windows Defender tripped on this for me, flagging it as the Bearfoos trojan. Most likely a false positive, but jfyi.
MKuykendall · 8h ago
Try cargo install, or add an exclusion for it intentionally - unsigned Rust binaries will trigger this.
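The from-source route is roughly this (assuming a Rust toolchain, and assuming the crate name is "shimmy"):

  # Build locally with cargo; a locally built binary usually avoids the
  # unsigned-binary heuristics. Crate name "shimmy" is assumed here.
  cargo install shimmy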
homarp · 1d ago
Nice, a rust tool wrapping llama.cpp
how does it differ from llama-server?
and from llama-swap?
MKuykendall · 1d ago
Shimmy is designed to be "invisible infrastructure" - the simplest possible way to get local inference working with your existing AI tools. llama-server gives you more control; llama-swap gives you multi-model management.
Key differences:
- Architecture: llama-swap = proxy + multiple servers, Shimmy = single server
- Resource usage: llama-swap runs multiple processes, Shimmy = one 50MB process
- Use case: llama-swap for managing many models, Shimmy for simplicity
MKuykendall · 1d ago
Shimmy is for when you want the absolute minimum footprint - CI/CD pipelines, quick local testing, or systems where you can't install 680MB of dependencies.
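A rough sketch of the CI case - the release URL is whatever you grab from the GitHub releases page (elided here as $SHIMMY_RELEASE_URL), and the request again assumes the OpenAI-style routes:

  # Download a release binary, start the server, run one request, shut down.
  curl -L -o shimmy "$SHIMMY_RELEASE_URL" && chmod +x shimmy
  ./shimmy serve &
  SERVER_PID=$!
  sleep 2   # give the server a moment to come up
  curl -s http://localhost:11435/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "my-model", "messages": [{"role": "user", "content": "smoke test"}]}'
  kill "$SERVER_PID"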
cat-turner · 10h ago
looks cool, ty! really great project, will try this out.