Show HN: I made a tiny, playable benchmark where LLMs compete head-to-head

2 points by yz-yu | 0 comments | 8/5/2025, 9:57:49 AM | llm-fighter.com
TL;DR: LLM Fighter is a small, open-source, playable benchmark for agentic behavior. You bring your own OpenAI-compatible API; the demo runs in the browser. It pits models against each other in head-to-head “battles” that stress tool use, planning, and efficiency, and shows step-by-step logs you can download.
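
If “OpenAI-compatible API” sounds vague, here is a rough TypeScript sketch (not the project’s actual code) of the kind of call such a setup relies on: a standard POST to /v1/chat/completions, sending the same prompt to two models. The base URL, key, and model names are placeholders you would supply yourself.

    // Minimal sketch: "OpenAI-compatible" here means a standard POST to
    // /v1/chat/completions. Base URL, key, and model names are placeholders.

    type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

    async function chat(
      baseUrl: string,
      apiKey: string,
      model: string,
      messages: ChatMessage[],
    ): Promise<string> {
      const res = await fetch(`${baseUrl}/v1/chat/completions`, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${apiKey}`,
        },
        body: JSON.stringify({ model, messages }),
      });
      if (!res.ok) throw new Error(`API error: ${res.status}`);
      const data = await res.json();
      return data.choices[0].message.content;
    }

    // Example: point two models at the same prompt and compare their replies.
    async function demo() {
      const baseUrl = "https://api.example.com"; // any OpenAI-compatible endpoint
      const apiKey = "sk-...";                   // your own key
      const prompt: ChatMessage[] = [
        { role: "user", content: "Plan your next move and explain it in one sentence." },
      ];
      const [a, b] = await Promise.all([
        chat(baseUrl, apiKey, "model-a", prompt),
        chat(baseUrl, apiKey, "model-b", prompt),
      ]);
      console.log({ a, b });
    }

    demo().catch(console.error);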

What it does well: give you a quick, honest feel for how agents act under the same rules. What it’s not: a formal academic benchmark or a single headline “score”. Why I built it: I wanted something you can play in minutes and still learn from.
