Show HN: Local LLM AIME benchmarking tool
1 belluxx 0 5/31/2025, 12:11:21 PM github.com ↗
I made this simple tool to compare local LLMs. Any provider that supports OpenAI-like APIs can be used (LMStudio, Llama.cpp, Ollama) but you can also use Openrouter/OpenAI if you change the base URL accordingly.
In my opinion it is not particularly useful for comparing different models from different companies since some models are optimized heavily on math or even trained on AIME problems.
However it is really useful for testing different quantizations of the same model or the same quantization from different providers.
Let me know what you think about it!
Also check the README to see some examples of the results you will get from it.
No comments yet