Show HN: Any-LLM – Lightweight router to access any LLM Provider
97 points by AMeckes | 58 comments | 7/22/2025, 5:40:40 PM | github.com
We built any-llm because we needed a lightweight router for LLM providers with minimal overhead. Switching between models is just a string change: update "openai/gpt-4" to "anthropic/claude-3" and you're done.
It uses official provider SDKs when available, which helps since providers handle their own compatibility updates. No proxy or gateway service needed either, so getting started is pretty straightforward - just pip install and import.
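Roughly, usage looks like this (a sketch only; the exact import, function name, and response shape may differ from what's in the repo):

    # Assumed any-llm-style API: OpenAI-style messages plus a "provider/model" string.
    # Import path, completion() signature, and response shape are assumptions here.
    from any_llm import completion

    messages = [{"role": "user", "content": "Summarize RFC 2616 in one sentence."}]

    # Switching providers is just a string change:
    response = completion(model="openai/gpt-4", messages=messages)
    # response = completion(model="anthropic/claude-3", messages=messages)

    print(response.choices[0].message.content)  # response shape assumed OpenAI-like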
Currently supports 20+ providers including OpenAI, Anthropic, Google, Mistral, and AWS Bedrock. Would love to hear what you think!
with no vested interest in litellm, i'll challenge you on this one. what compatibility issues have come up? (i expect text to have the fewest, and voice etc. probably more, but for text i've had no issues)
you -want- to reimplement interfaces because you have to normalize APIs. in fact, without looking at any-llm's code deeply, i question how you do ANY router without reimplementing interfaces. that's basically the whole job of the router.
You're absolutely right that any router reimplements interfaces for normalization. The difference is which layer we reimplement at. We use official SDKs where available to handle HTTP, auth, and retries, and reimplement only the normalization on top.
Bottom line is we both reimplement interfaces, just at different layers. Our bet on SDKs is mostly about maintenance preferences, not some fundamental flaw in LiteLLM's approach.
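To make the layering concrete, here's a rough sketch (not our actual code) of what normalizing on top of the official SDKs looks like; both clients below are the real official Python SDKs:

    # Sketch only: the official SDKs handle HTTP, auth, and retries,
    # and a thin wrapper maps both responses into one shape.
    from openai import OpenAI
    from anthropic import Anthropic

    def complete(model: str, prompt: str) -> str:
        """Route on a 'provider/model' string and return plain text either way."""
        provider, _, model_name = model.partition("/")
        messages = [{"role": "user", "content": prompt}]

        if provider == "openai":
            resp = OpenAI().chat.completions.create(model=model_name, messages=messages)
            return resp.choices[0].message.content      # OpenAI response shape
        elif provider == "anthropic":
            resp = Anthropic().messages.create(
                model=model_name, max_tokens=1024, messages=messages
            )
            return resp.content[0].text                  # Anthropic response shape
        raise ValueError(f"unknown provider: {provider}")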
but I can give you one example: litellm recently had an issue with handling deepseek reasoning. they broke the implementation, and for a while reasoning was missing from both sync and streaming responses.
> it reimplements provider interfaces rather than leveraging official SDKs, which can lead to compatibility issues and unexpected behavior modifications
Leveraging official SDKs also does not solve compatibility issues. any_llm would still need to maintain compatibility with those official SDKs. I don't think one way is clearly better than the other here.
I'd rather have a library that just used OpenAPI/REST than one that takes on a ton of dependencies.
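For comparison, the zero-dependency version is just one stdlib HTTP POST against an OpenAI-compatible /chat/completions endpoint (standard OpenAI REST shape):

    # The no-SDK approach: stdlib only, works against any provider exposing
    # the OpenAI-compatible /chat/completions REST endpoint.
    import json, os, urllib.request

    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps({
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": "Hello"}],
        }).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    print(body["choices"][0]["message"]["content"])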
I shipped a similar abstraction for llms a bit over a week ago:
https://github.com/omarkamali/borgllm
pip install borgllm
I focused on making it Langchain compatible so you could drop it in as a replacement. And it offers virtual providers for automatic fallback when you reach rate limits and so on.
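If you're not familiar with the LangChain side, the stock pattern it slots into looks roughly like this (this is plain LangChain, not borgllm's own API; see the README for that):

    # Plain-LangChain illustration of "LangChain-compatible with automatic fallback".
    from langchain_openai import ChatOpenAI
    from langchain_anthropic import ChatAnthropic

    primary = ChatOpenAI(model="gpt-4o")
    backup = ChatAnthropic(model="claude-3-5-sonnet-latest")

    # Any Runnable supports with_fallbacks(); if the primary call raises
    # (e.g. a rate-limit error), the backup model is tried instead.
    llm = primary.with_fallbacks([backup])

    print(llm.invoke("One-line summary of HTTP/2 vs HTTP/1.1").content)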
Why Python? Probably because most of the SDKs are Python, but something that could be ported across languages without requiring an interpreter would have been really amazing.
A truly universal solution would likely need to exist at a lower level of abstraction, completely decoupling the application's language from the model's runtime. It's a much harder problem to solve there, but it would be a huge step forward.
[0] https://github.com/vercel/ai
[1] https://github.com/ClickHouse/ai-sdk-cpp
[2] https://github.com/cactus-compute/cactus
Github: https://github.com/proxai/proxai
Website: https://proxai.co/
Seems like reputation parasitism.
We're very small compared to the Mozilla mothership, but moving quickly to support open source AI in any way we can.
On a second visit I noticed a link to mozilla.org in their footer.
Still doesn't ring official to me as a veteran Mozilla user (Netscape, MDN, Firefox), but ok, thanks for the explanation.
I only use it in development. Could you elaborate on why you don't recommend using it in production?
But provider switching is built into some of these - and the folks behind Envoy built https://github.com/katanemo/archgw - developers can use an OpenAI client to call any model, it offers preference-aligned intelligent routing to LLMs based on usage scenarios that developers can define, and it acts as an edge proxy too.
You're right that archgw handles routing at the infrastructure level, which is perfect for centralized control. any-llm simply gives you the option to handle routing in your application code when that makes sense (for example, premium users get Opus 4). We leave the architectural choice to you, whether that's adding a proxy, keeping routing in your app, using both, or just using any-llm directly.
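Concretely, the kind of in-app routing I mean looks roughly like this (sketch only; the completion() call mirrors the assumed API from the sketch further up, and the model names and User shape are just illustrative):

    # Application-level routing: the model choice is just ordinary code.
    from dataclasses import dataclass
    from any_llm import completion  # assumed import, as in the earlier sketch

    @dataclass
    class User:
        is_premium: bool

    def model_for(user: User) -> str:
        # e.g. premium users get the larger model
        return "anthropic/claude-opus-4" if user.is_premium else "openai/gpt-4o-mini"

    def answer(user: User, prompt: str):
        return completion(model=model_for(user),
                          messages=[{"role": "user", "content": prompt}])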
I've been looking for something a bit different, though, related to Ollama. I'd like a load-balancing reverse proxy that supports queuing requests to multiple Ollama servers and sending a request only when an Ollama server is up and idle (not processing). Does anything like this exist?
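Roughly what I have in mind (a sketch, assuming aiohttp; /api/chat and /api/tags are real Ollama endpoints, and "idle" is tracked here as a free per-backend slot, which is my assumption of the desired behavior):

    # Sketch of the desired proxy, not an existing tool: one slot per Ollama
    # backend (idle = slot free), health-check before dispatch, and requests
    # queue by polling until a backend frees up.
    import asyncio
    import aiohttp
    from aiohttp import web

    BACKENDS = ["http://ollama-1:11434", "http://ollama-2:11434"]
    locks = {url: asyncio.Lock() for url in BACKENDS}

    async def is_up(session: aiohttp.ClientSession, base: str) -> bool:
        try:
            async with session.get(f"{base}/api/tags",
                                   timeout=aiohttp.ClientTimeout(total=2)) as r:
                return r.status == 200
        except Exception:
            return False

    async def acquire_idle_backend(session):
        # Take the first backend that is free and reachable; otherwise wait.
        while True:
            for base, lock in locks.items():
                if lock.locked():
                    continue            # busy: currently processing a request
                await lock.acquire()
                if await is_up(session, base):
                    return base, lock
                lock.release()          # down: try the next one
            await asyncio.sleep(0.1)    # all busy or down; queue and retry

    async def chat(request: web.Request) -> web.Response:
        payload = await request.json()
        session = request.app["session"]
        base, lock = await acquire_idle_backend(session)
        try:
            async with session.post(f"{base}/api/chat", json=payload) as r:
                return web.json_response(await r.json(), status=r.status)
        finally:
            lock.release()

    async def make_app():
        app = web.Application()
        app["session"] = aiohttp.ClientSession()
        app.router.add_post("/api/chat", chat)
        return app

    if __name__ == "__main__":
        web.run_app(make_app(), port=8080)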
[1] https://github.com/Airbolt-AI/airbolt