Show HN: A tool to benchmark LLM APIs (OpenAI, Claude, local/self-hosted)
It runs a configurable number of test requests and reports two key metrics:
• First-token latency (ms): how long it takes for the first token to appear
• Output speed (tokens/sec): the sustained generation rate once output starts streaming
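For context, here is a minimal sketch of how these two numbers can be measured against an OpenAI-compatible streaming endpoint. This is not the project's actual code; the base URL, model name, and the rough chars-to-tokens estimate are illustrative assumptions:

    import time
    from openai import OpenAI

    # Hypothetical endpoint and key; any OpenAI-compatible base_url works the same way.
    client = OpenAI(base_url="https://api.example-proxy.com/v1", api_key="sk-...")

    start = time.perf_counter()
    first_token_at = None
    pieces = []

    stream = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name, swap for whatever the provider serves
        messages=[{"role": "user", "content": "Explain TCP slow start in one paragraph."}],
        stream=True,
    )
    for chunk in stream:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content or ""
        if delta and first_token_at is None:
            first_token_at = time.perf_counter()  # first visible token arrived
        pieces.append(delta)
    end = time.perf_counter()

    text = "".join(pieces)
    approx_tokens = max(1, len(text) // 4)  # crude ~4 chars/token estimate
    print(f"first-token latency: {(first_token_at - start) * 1000:.0f} ms")
    print(f"output speed: {approx_tokens / (end - first_token_at):.1f} tokens/sec")

The tool itself runs many such requests per provider and aggregates the results.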
Demo: https://llmapitest.com/
Code: https://github.com/qjr87/llm-api-test
The goal is to provide a simple, visual, and reproducible way to evaluate performance across different LLM providers, including the growing number of third-party “proxy” or “cheap LLM API” services.
It supports:
• OpenAI-compatible APIs (official and proxies)
• Claude (via Anthropic)
• Local endpoints (custom/self-hosted)
You can also self-host it with docker-compose. The config is clean; adding a new provider only requires a simple plugin-style addition (a rough sketch of the idea follows).
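To make the "plugin-style" point concrete, here is a hypothetical sketch of what such an addition could look like. The class names, fields, and registration dict are my illustration, not the repo's actual interface:

    from dataclasses import dataclass
    from typing import Iterator

    @dataclass
    class Provider:
        # Shape a provider plugin might take: a display name, a base URL,
        # and a streaming call the benchmark loop can time.
        name: str
        base_url: str

        def stream_chat(self, prompt: str, api_key: str) -> Iterator[str]:
            raise NotImplementedError

    class CheapProxyProvider(Provider):
        # Example third-party OpenAI-compatible proxy (hypothetical).
        def stream_chat(self, prompt: str, api_key: str) -> Iterator[str]:
            # An OpenAI-compatible proxy would reuse the generic streaming
            # request path; only the base_url and key differ.
            ...

    # Registering the new provider is then a one-line addition.
    PROVIDERS = {
        "cheap-proxy": CheapProxyProvider(name="cheap-proxy",
                                          base_url="https://api.cheap-llm.example/v1"),
    }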
Would love feedback, PRs, or even test reports for the APIs you're using. I'm especially interested in how some of the lesser-known services compare.