Show HN: Testronaut – AI-powered mission-based browser testing
I’ve been working on a project called *Testronaut*, an autonomous testing framework that combines AI reasoning with real browser automation. The idea is to let you define end-to-end tests as “missions” in plain English, then have an agent run them through a real browser using Playwright.
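To give a feel for the idea, here is a purely illustrative mission sketch. The file name, export shape, and wording are my own invention for this post, not the actual Testronaut API; the docs linked below have the real quickstart.

```ts
// missions/checkout.mission.ts — illustrative only; see the docs for the real format.
// A mission is just a plain-English description of what the agent should do and verify.
export const mission = `
  Go to https://demo.example.com,
  add the first product to the cart,
  proceed to checkout as a guest,
  and verify the order confirmation page shows a total greater than $0.
`;
```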
Why I built this: I’ve often found end-to-end tests to be fragile, time-consuming to maintain, and difficult to scale. Testronaut tries to reduce the maintenance burden by using AI to adapt tests to small UI changes, while still producing a deterministic report of what passed/failed.
How it works:
- Missions can be written as strings or functions.
- The agent uses GPT-4o with a set of tools (click, type, navigate, get_dom, etc.) to interact with the page; a rough sketch of that kind of loop is below. Support for other models is in the works.
- Browser control is handled by Playwright.
- Reports are generated in both JSON and HTML, with step-by-step breakdowns (including screenshots).
- It runs locally via a CLI (`npx testronaut`) and doesn’t require any hosted service. You will need to provide your own OpenAI API key, however.
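For anyone wondering what "GPT-4o with a set of tools driving Playwright" means in practice, here is a minimal sketch of that general tool-calling pattern using the OpenAI Node SDK and Playwright directly. This is not Testronaut's internal code; the tool names and prompts are assumptions for illustration.

```ts
// Illustrative agent loop: an LLM picks tools, Playwright executes them.
import OpenAI from "openai";
import { chromium } from "playwright";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Tool schemas the model can call. Names mirror the ones mentioned above.
const tools: OpenAI.Chat.Completions.ChatCompletionTool[] = [
  { type: "function", function: { name: "navigate", description: "Go to a URL",
      parameters: { type: "object", properties: { url: { type: "string" } }, required: ["url"] } } },
  { type: "function", function: { name: "click", description: "Click an element by CSS selector",
      parameters: { type: "object", properties: { selector: { type: "string" } }, required: ["selector"] } } },
  { type: "function", function: { name: "type", description: "Type text into an element",
      parameters: { type: "object", properties: { selector: { type: "string" }, text: { type: "string" } },
        required: ["selector", "text"] } } },
  { type: "function", function: { name: "get_dom", description: "Return the current page HTML",
      parameters: { type: "object", properties: {} } } },
];

async function runMission(mission: string) {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  const messages: OpenAI.Chat.Completions.ChatCompletionMessageParam[] = [
    { role: "system", content: "You are a browser-testing agent. Use the tools to complete the mission, then reply with PASS or FAIL and a short reason." },
    { role: "user", content: mission },
  ];

  // Cap the number of reasoning/action steps so a confused agent can't loop forever.
  for (let step = 0; step < 30; step++) {
    const res = await openai.chat.completions.create({ model: "gpt-4o", messages, tools });
    const msg = res.choices[0].message;
    messages.push(msg);

    // No tool calls means the agent is done and has given its verdict.
    if (!msg.tool_calls?.length) {
      console.log("Agent verdict:", msg.content);
      break;
    }

    // Execute each requested tool with Playwright and feed the result back.
    for (const call of msg.tool_calls) {
      if (call.type !== "function") continue;
      const args = JSON.parse(call.function.arguments || "{}");
      let result = "ok";
      if (call.function.name === "navigate") await page.goto(args.url);
      else if (call.function.name === "click") await page.click(args.selector);
      else if (call.function.name === "type") await page.fill(args.selector, args.text);
      else if (call.function.name === "get_dom") result = (await page.content()).slice(0, 20000);
      messages.push({ role: "tool", tool_call_id: call.id, content: result });
    }
  }

  await browser.close();
}

runMission("Go to https://example.com and verify the heading says Example Domain.");
```

The real runner adds the reporting layer on top of a loop like this (screenshots per step, JSON/HTML output), but the core control flow is the same model-proposes, browser-executes cycle.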
Current state:
- Early days: it works for simple flows and demo apps, but I’m still tuning reliability and efficiency.
- It installs with one command and comes with a sample mission.
- Open source on npm/GitHub.
Links:
- Docs & quickstart: https://docs.testronaut.app
- GitHub: https://github.com/mission-testronaut/testronaut-cli
- npm: https://www.npmjs.com/package/testronaut
I’d love feedback from the HN community on:
- Where this could be most useful (CI/CD? flaky-test replacement? exploratory testing?).
- What concerns you’d have about using an AI-driven test runner.
- Any “gotchas” I should watch out for in early adoption.
Thanks for taking a look!