Show HN: Index – New Open Source browser agent
We built Index, a new SOTA open-source browser agent.
It reached 92% on WebVoyager with Claude 3.7 (extended thinking). We used o1 as the judge and also manually double-checked the judge's verdicts.
At the core is the same old idea: run a simple JS script in the browser to identify interactable elements -> draw bounding boxes around them on a screenshot of the browser window -> feed it to the LLM.
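To make that pipeline concrete, here is a minimal sketch of the first step and of the text that could accompany the screenshot. It is not Index's actual detection script: the selector list, the `Box` fields, and the helper names (`label_boxes`, `describe_for_llm`, `collect_boxes`) are all assumptions for illustration. Drawing the boxes onto the screenshot is left out, since it would need an image library.

```python
from dataclasses import dataclass

# Hypothetical detection script: a deliberately naive selector list,
# not the one Index ships with.
FIND_INTERACTIVE_JS = """
() => Array.from(
  document.querySelectorAll('a, button, input, select, textarea, [role="button"]')
).map(el => {
  const r = el.getBoundingClientRect();
  return {
    tag: el.tagName.toLowerCase(),
    text: (el.innerText || el.value || '').trim().slice(0, 80),
    x: r.x, y: r.y, width: r.width, height: r.height,
  };
})
"""


@dataclass
class Box:
    index: int      # label that would be drawn on the screenshot
    tag: str
    text: str
    x: float
    y: float
    width: float
    height: float


def label_boxes(raw, viewport_width, viewport_height):
    """Keep elements with a visible box inside the viewport and number them."""
    boxes = []
    for el in raw:
        if el["width"] <= 0 or el["height"] <= 0:
            continue  # zero-sized elements cannot be clicked
        if el["x"] >= viewport_width or el["y"] >= viewport_height:
            continue  # entirely right of or below the viewport
        if el["x"] + el["width"] <= 0 or el["y"] + el["height"] <= 0:
            continue  # entirely left of or above the viewport
        boxes.append(Box(len(boxes), el["tag"], el["text"],
                         el["x"], el["y"], el["width"], el["height"]))
    return boxes


def describe_for_llm(boxes):
    """Text listing sent to the model alongside the annotated screenshot."""
    return "\n".join(f"[{b.index}] <{b.tag}> {b.text}" for b in boxes)


def collect_boxes(page, viewport_width, viewport_height):
    """`page` is assumed to be a Playwright sync-API Page object."""
    raw = page.evaluate(FIND_INTERACTIVE_JS)
    return label_boxes(raw, viewport_width, viewport_height)
```

The model can then answer with an index such as "click [3]", which maps back to a concrete box on the page.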
What made Index so good:
1. We essentially created browser agent observability. We patched Playwright to record the entire browser session while the agent operates, simultaneously tracing all agent steps and LLM calls. Then we synchronized everything in the UI, creating an unparalleled debugging experience. This allowed us to pinpoint exactly where the agent fails by seeing what it "sees" in session replay alongside execution traces.
2. Our detection script is simple but extremely good. It's carefully crafted via trial and error. We also employed CV and OCR.
3. The agent is very simple, literally just a while loop. All the power comes from a carefully crafted prompt and a ton of eval runs.
Index is a simple Python package. It also comes with a beautiful CLI.
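For readers who want to picture the "just a while loop" shape from point 3, here is a rough sketch. It is not Index's code: `run_agent`, the `observe`/`decide`/`act` callables, and the action dictionary format are hypothetical stand-ins for the screenshot step, the LLM call, and the browser action.

```python
def run_agent(task, observe, decide, act, max_steps=20):
    """Observe -> ask the model -> act, until the model says it is done.

    observe() returns the current page state (e.g. an annotated screenshot),
    decide(task, observation, history) returns an action dict from the LLM,
    act(action) performs it in the browser and returns an outcome.
    """
    history = []
    step = 0
    while step < max_steps:
        observation = observe()
        action = decide(task, observation, history)
        if action["type"] == "done":
            return action.get("result")
        outcome = act(action)
        history.append((action, outcome))  # context for the next LLM call
        step += 1
    return None  # step budget exhausted without finishing


if __name__ == "__main__":
    # Scripted stand-in for the LLM so the loop can run without a browser.
    scripted = iter([
        {"type": "click", "index": 3},
        {"type": "done", "result": "clicked the button"},
    ])
    print(run_agent(
        "press the button",
        observe=lambda: "<screenshot>",
        decide=lambda task, obs, hist: next(scripted),
        act=lambda action: "ok",
    ))
```

The `max_steps` cap is a design choice in this sketch so a confused model cannot loop forever.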
pip install lmnr-index
playwright install chromium
index run
We've recently added o4-mini, Gemini 2.5 Pro and Flash. Pro is extremely good and fast. Give it a try via CLI.
You can also use Index via a serverless API (https://docs.lmnr.ai/index-agent/api/getting-started).
Or via the chat UI - https://lmnr.ai/chat.
To learn more about browser agent observability and evals, check out our open-source repo (https://github.com/lmnr-ai/lmnr) and our docs (https://docs.lmnr.ai/tracing/browser-agent-observability).
pip install lmnr-index
playwright install chromium
index run
Also try experimenting with different models. So far, Gemini 2.5 Pro is the best in terms of quality/speed. Claude 3.7 is also pretty good.
https://simplify.jobs/install
Do the biggest companies not create the most value for the world?
Consider this. If the most successful companies are simply cheating customers, then most consumers are stupid, handing over their hard-earned money for bad deals and to be exploited.
But most people are not stupid, and most people highly value their money. So, they only buy something because they want what the seller is offering even more than their money. This means that companies create great value because they offer something that people really want.
A person always has a choice not to spend their money. Even if they need expensive healthcare, they can choose not to buy it. By buying the product, they want the service more than their money.
They might think that the price is too high, but prices are a function of market forces.
It doesn’t make sense to me that a person can say they feel exploited because they have voluntarily chosen to buy at a particular price. They probably want to pay less, and might feel that the consumer surplus is low, but they still value the service more than their money. That isn’t exploitation to me.
But let’s say your point is true. How do those players become entrenched? I’d say it’s from providing great value.
Can run with `uvx --from lmnr-index --python 3.12 index run`
I've written a handful of pretty hacky Python scripts that just pull down all of the HTML content from a page and toss it over to OpenAI. As you can imagine, these were all extremely simple tasks, e.g., "find out if there's a login button"
What's a good example of a complex task that Index is well-suited for? What's the threshold of minimal complexity where you guys are a really good fit?
- any task that requires UI interaction, button clicking, filter selection, form filling and so on. Just prompt it, it's surprisingly very robust and self-healing.
- complex long-running tasks that require extensive context - e.g. researching a topic and then creating a spreadsheet, creating a presentation on a topic and so on.
Essentially, any task that can be done within a browser environment and that previously required flaky, hardcoded, predefined scripts. Also, website testing is a great example.
For the CLI and custom models, you can clone the repo, then open cli.py and manually add your model there. I will work on proper support for custom models.
- SOTA on webvoyager
- browser agent observability
- fast and reliable
- CLI for easier interaction
- available as a serverless API
If it's abusive behavior you are worried about you should be able to detect and block it with rate limits or other tools that target the malicious behavior. If you can't distinguish between my usage and a regular browser then I'm not sure what moral ground you have to claim my usage is hurting you.
[0] "A robot is a program that automatically traverses the Web's hypertext structure by retrieving a document, and recursively retrieving all documents that are referenced." https://www.robotstxt.org/faq/what.html
If I wanted to use this to do my personal browsing for me, like checking for website updates on those where RSS does not exist, you shouldn't be able to stop me.
One thing I couldn't help but notice was the crazy amount of HTTP requests going on in the demo on the github readme page, and the video looks to be sped up.
I'm all for AI assisting but I wouldn't want to create even 1/10th of these HTTP requests, as a good netizen; unless I'm missing the point.
Here's a demo of the CLI: https://x.com/skull8888888888/status/1914728292193628330