Show HN: Blast – Fast, multi-threaded serving engine for web browsing AI agents
BLAST is a high-performance serving engine for browser-augmented LLMs, designed to make deploying web-browsing AI easy, fast, and cost-manageable.
The goal with BLAST is to ultimately achieve Google-search-level latencies for tasks that currently require a lot of typing and clicking around inside a browser. We're starting off with automatic parallelism, prefix caching, budgeting (memory and LLM cost), and an OpenAI-compatible API, but have a ton of ideas in the pipe!
Website & Docs: https://blastproject.org/ https://docs.blastproject.org/
MIT-Licensed Open-Source: https://github.com/stanford-mast/blast
Hope some folks here find this useful! Please let me know what you think in the comments or ping me on Discord.
— Caleb (PhD student @ Stanford CS)
Slightly broader question: Do you feel like there are any ethical considerations one should think about before using something like this?
Absolutely agree there are ethical considerations with web browsing AI in general. (And the whole general ongoing shift from using websites to using ChatGPT/Perplexity.)
People are already deploying tools like Anubis[1] or go-away[2] to cope with the insane load that bots put on their server infrastructure. This is an arms race. In the end, the users lose.
[1]: https://anubis.techaro.lol
[2]: https://git.gammaspectra.live/git/go-away
E.g. if I'm the developer of a workforce management app (e.g. https://WhenIWork.com) I could deploy BLAST to quickly provide automation for users of my app.
I think it would take more substantiation to claim this. Maybe 10 out of 1000 websites will get closed, but the users will be able to use AI tools to use the remaining 990. Not sure about you, but sounds like a win for users to me.
This may be true for the time being (or not), but it will surely change if/when more [websites] become aware of what is going on. The result will be that 10 out of 1000 websites remain open, and not the ones you actually want. The more pressure there is on the sites/servers, the more they will have to act just to stay online to begin with.
I'm personally not sure there are, but I'm curious to hear what those are for you :)
Unfortunately, it seems like browser-use tries to hide that it's controlled by an AI, and uses a typical browser user-agent: https://github.com/browser-use/browser-use/blob/d8c4d03d9ea9...
I'm guessing that, because of the number of flags, you could probably come up with a unique fingerprint for browser-use, based on available features, screen/canvas size and so on, that could be reused for blocking everyone using Blast/browser-use.
If calebhwin wanted to make Blast easier to identify, they could set a custom user-agent for browser-use that makes it clear it's Blast doing the browsing for the user.
BLAST can also be used to add automation to your own site/app FWIW.
I think it's cool that you're experimenting in this area, but I'm not a huge fan of this as an answer to a question about responsible/respectful web crawling. This stuff seems like it should be table stakes (even if you wanted to make it optional for the end user), but "yeah probably; learn the codebase, fork it, make changes, then we'll review it" really puts the onus onto the original poster.
I'm really not excited at all about the "scrape other people's data" use case for BLAST and if we can prevent it then awesome. I'm excited about BLAST automating science, legacy web apps, internal tools, adding AI automation to your own app, etc.
An ad blocker is the least a user can do.
I don't have an API for _that_.
One use-case for this is conversations: So for example if I invoke /chat/completions with [{"role": "user", "content": "Go to google.com"}] and later with [{"role": "user", "content": "Go to google.com"}, {"role": "user", "content": "Search for gorilla vs 100 human"}] then we cache the browser state from the first invocation so it can be quickly restored (or reuse the browser if not evicted).
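To make the conversation case above concrete, here is a minimal sketch of prefix caching keyed on the message list. It is not BLAST's actual implementation: `PrefixCache`, `prefix_key`, and the dictionary standing in for saved browser state are all hypothetical names chosen for illustration.

```python
import hashlib
import json


def prefix_key(messages):
    """Stable hash of a message list, used as a cache key (hypothetical helper)."""
    blob = json.dumps(messages, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


class PrefixCache:
    """Maps a conversation prefix to a saved browser state (illustrative only)."""

    def __init__(self):
        self._states = {}

    def store(self, messages, state):
        self._states[prefix_key(messages)] = state

    def longest_prefix(self, messages):
        """Return (matched_length, state) for the longest cached prefix."""
        for n in range(len(messages), 0, -1):
            state = self._states.get(prefix_key(messages[:n]))
            if state is not None:
                return n, state
        return 0, None


cache = PrefixCache()

# First invocation: run the task, then save the resulting state.
first = [{"role": "user", "content": "Go to google.com"}]
cache.store(first, {"url": "https://www.google.com/"})  # stand-in for real browser state

# Second invocation: same first message plus a new one.
second = first + [{"role": "user", "content": "Search for gorilla vs 100 human"}]
matched, state = cache.longest_prefix(second)
remaining = second[matched:]  # only these messages still need to be executed
```

Here the second call matches one cached message, so only the search step has to run against the restored state. A real system would also need eviction and some way to tell when a saved state has gone stale.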
Caching will get much more sophisticated in a future version, it's the piece we're most actively working on.
Though ultimately I think the web needs something better than MCP and we're actively working on that as well.
(I make AI agents as my day job, among many other things.)
Now the API is what may be throwing folks off. Right now it's an OpenAI-compatible API. We will implement MCP. But really the core thing is abstracting away optimizations required to efficiently run browser+LLM.
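For anyone unsure what "OpenAI-compatible" means in practice, here is a rough sketch of the request shape using only the standard library. The `/chat/completions` path comes from the example earlier in the thread; the base URL and the model name are placeholders I made up, so check the docs for the real values.

```python
import json
import urllib.request

# Assumption: the address where your server happens to be listening.
BASE_URL = "http://127.0.0.1:8000"


def build_chat_request(messages, model="placeholder-model"):
    """Build (but do not send) a chat-completions style POST request."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request([{"role": "user", "content": "Go to google.com"}])
# urllib.request.urlopen(req) would send it to a running server.
```

The point is that any client that already speaks this request format should only need its base URL changed.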
I read through the docs and want to try this. I couldn’t figure out what you were using under the covers for the actual webpage “use”. I did see: “What we’re not focusing on is building a better Browser-Use, Notte, Steel, or other vision LLM. Our focus is serving these systems in a way that is optimized under constraints”
Cool! That makes sense! But I was still curious what your default AI-driven browser use library was.
If I were to use your library right now on my MacBook, is it using “browser-use” under the covers by default? (I should poke around the source more. I just thought it might be helpful to ask here in case I misunderstand or in case others had the same question)
A queue? What else can you really do? Your server is at the mercy of OpenAI, so all you can do is queue up everyone's requests. I don't know how many parallel requests you can send out to OpenAI (infinite?), so that bottleneck is probably just dependent on your server stack (how many threads).
There's a lot of language being thrown out here, and I'm trying to see if we're using too much language to discuss basic concepts.
Now you are right that at some point you'll get throttled either by LLM rate limits or a set budget for browser memory usage or LLM cost. BLAST's scheduler is aware of these constraints and uses them to effectively map tasks to resources (resource=browser+LLM).
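As a toy illustration of what scheduling under those constraints could look like (this is not BLAST's scheduler), the sketch below admits a task only if it fits within a memory budget and a cost budget, and queues it otherwise. `Task`, `BudgetScheduler`, and the per-task estimates are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    est_memory_mb: int   # hypothetical estimate of browser memory
    est_cost_usd: float  # hypothetical estimate of LLM spend


class BudgetScheduler:
    """Starts tasks while they fit the budgets; queues the rest."""

    def __init__(self, max_memory_mb, max_cost_usd):
        self.max_memory_mb = max_memory_mb
        self.max_cost_usd = max_cost_usd
        self.memory_in_use = 0
        self.cost_spent = 0.0
        self.running = []
        self.waiting = []

    def _fits(self, task):
        return (self.memory_in_use + task.est_memory_mb <= self.max_memory_mb
                and self.cost_spent + task.est_cost_usd <= self.max_cost_usd)

    def _start(self, task):
        self.memory_in_use += task.est_memory_mb
        self.cost_spent += task.est_cost_usd
        self.running.append(task)

    def submit(self, task):
        if self._fits(task):
            self._start(task)
        else:
            self.waiting.append(task)

    def finish(self, task):
        self.running.remove(task)
        # Memory is released when a task ends; money already spent is not.
        self.memory_in_use -= task.est_memory_mb
        still_waiting = []
        for queued in self.waiting:
            if self._fits(queued):
                self._start(queued)
            else:
                still_waiting.append(queued)
        self.waiting = still_waiting


sched = BudgetScheduler(max_memory_mb=1000, max_cost_usd=1.0)
a = Task("a", est_memory_mb=600, est_cost_usd=0.10)
b = Task("b", est_memory_mb=600, est_cost_usd=0.10)
sched.submit(a)  # starts immediately
sched.submit(b)  # queued: 600 + 600 MB would exceed the 1000 MB budget
sched.finish(a)  # frees memory, so b is started
```

So it is still a queue at heart, but one where admission depends on resource budgets rather than just thread count.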
You wanna load test the local DOM rendering or what? Otherwise, whatever endpoint is serving the HTML, you configure your load tests to hit that, if anything. Although you'd just be doing the same testing your HTTP server is probably already doing before releases; usually you wanna load test your underlying APIs or similar instead.
https://blast.ncbi.nlm.nih.gov/Blast.cgi