Show HN: Nxtscape – an open-source agentic browser
-- Why bother building a new browser? For the first time since Netscape was released in 1994, it feels like we can reimagine browsers from scratch for the age of AI agents. The web browser of tomorrow might not look like what we have today.
We saw how tools like Cursor gave developers a 10x productivity boost, yet the browser—where everyone else spends their entire workday—hasn't fundamentally changed.
And honestly, we feel like we're constantly fighting the browser we use every day. It's not one big thing, but a series of small, constant frustrations. I'll have 70+ tabs open from three different projects and completely lose my train of thought. And simple stuff like reordering Tide Pods from Amazon or filling out forms shouldn't need our full attention anymore. AI can handle all of this, and that's exactly what we're building.
Here’s a demo of our early version https://dub.sh/nxtscape-demo
-- What makes us different We know others are exploring this space (Perplexity, Dia), but we want to build something open-source and community-driven. We're not a search or ads company, so we can focus on being privacy-first: Ollama integration, BYOK (Bring Your Own Keys), and a built-in ad blocker.
Btw we love what Brave started and stood for, but they've now spread themselves too thin across crypto, search, etc. We are laser-focused on one thing: making browsers work for YOU with AI. And unlike Arc (which we loved too but got abandoned), we're 100% open source. Fork us if you don't like our direction.
-- Our journey hacking a new browser To build this, we had to fork Chromium. Honestly, it feels like the only viable path today—we've seen others like Brave (which started with Electron) and Microsoft Edge learn this the hard way.
We also started by asking: why not just build an extension? But we realized we needed more control, similar to the reason Cursor forked VSCode. For example, Chrome has this thing called the Accessibility Tree - basically a cleaner, semantic version of the DOM that screen readers use. It's perfect for AI agents to understand pages, but you can't use it through extension APIs.
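As a rough sketch of what that unlocks, here's how you can read the accessibility tree over the Chrome DevTools Protocol today (using the chrome-remote-interface package against a --remote-debugging-port session; illustrative, not our actual code):

    // Sketch: reading Chromium's accessibility tree over the DevTools Protocol.
    // Assumes Chromium was started with --remote-debugging-port=9222 and that
    // the chrome-remote-interface npm package is installed.
    import CDP from "chrome-remote-interface";

    async function dumpAccessibilityTree(): Promise<void> {
      const client = await CDP({ port: 9222 });
      const { Accessibility, Page } = client;
      try {
        await Page.enable();
        await Accessibility.enable();
        // Returns the flattened accessibility tree for the current page:
        // roles, names, and values rather than raw DOM nodes.
        const { nodes } = await Accessibility.getFullAXTree();
        for (const node of nodes) {
          if (node.ignored) continue; // skip nodes hidden from assistive tech
          console.log(node.role?.value, "-", node.name?.value ?? "");
        }
      } finally {
        await client.close();
      }
    }

    dumpAccessibilityTree().catch(console.error);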
That said, working with the 15M-line C++ Chromium codebase has been an adventure. We've both worked on infra at Google and Meta, but Chromium is a different beast. Tools like Cursor's indexing completely break at this scale, so we've had to get really good with grep and vim. And the build times are brutal—even with our maxed-out M4 Max MacBook, a full build takes about 3 hours.
Full disclosure: we are still very early, but we have a working prototype on GitHub. It includes an early version of a "local Manus" style agent that can automate simple web tasks, plus an AI sidebar for questions, and other productivity features (grouping tabs, saving/resuming sessions, etc.).
Looking forward to any and all comments!
You can download the browser from our github page: https://github.com/nxtscape/nxtscape
some genuine feedback on a frustrating early experience:
- I ran the suggested "Group all my tabs by topic" in productivity agent mode. It worked great.
- I then asked it to remove all tab groups and reset things, but was told this:
- Tried agent mode and was told:
- Basically I was being sent back and forth. I went back to productivity mode and argued with it for a bit. The closest I could get to it removing all tab groups was creating a new tab group encompassing all tabs, but I couldn't get it to remove groups entirely. I'm guessing it might lack that API? Overall, it'd be nice if every browser-level action it took had an undo button. Or at least if it was smart enough/able to remove the tab groups it just created.
Will keep playing with it more.
edit1: one more weird issue: while running the chat interface on Chrome internal pages, it would randomly navigate me to google.com for some reason.
edit2: confirmed that agent mode lacks a tool to ungroup tabs, just a tool to create tab groups.
Bookmarks don't cut it anymore when you've got 25 years of them saved.
Falling down deep rabbit holes, because you landed on an attention-desperate website to check one single thing and immediately got distracted, could be reduced by running a bodyguard bot that filters the junk out. Those sites create deafening noise that you can squash by telling the bot to just let you know when somebody replies to your comment with something of substance that you might actually want to read.
If it truly works, I can imagine the digital equivalent of a personal assistant + tour manager + doorman + bodyguard + housekeeper + mechanic + etc, that could all be turned off and on with a switch.
Given that the browser is our main portal to the chaos that is the internet in 2025, this is not a bad idea! Really depends on the execution, but yeah.. I'm very curious to see how this project (and projects like it) go.
We spend 90%+ of our time in browsers, yet they're still basically dumb windows. Having an AI assistant that remembers what you visited, clips important articles (remember Evernote web clipper?), saves highlights and makes everything semantically searchable - all running locally - would be game-changing.
Everything stays in a local Postgres DB - your history, highlights, sessions. You can ask "what was that pricing comparison from last month?" or "find my highlights about browser automation" and it just works. Plus built-in self-control features to block distracting sites when you need to focus.
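As a rough sketch of what that could look like (pgvector and a local embedding function are assumptions here, not committed implementation details):

    // Sketch of a local, semantically searchable history store.
    // Assumes PostgreSQL with the pgvector extension and some local
    // embedding function `embed()` -- both are assumptions, not confirmed details.
    import { Client } from "pg";

    const db = new Client({ connectionString: "postgres://localhost/nxtscape" });

    async function setup(): Promise<void> {
      await db.connect();
      await db.query(`CREATE EXTENSION IF NOT EXISTS vector`);
      await db.query(`
        CREATE TABLE IF NOT EXISTS page_visits (
          id         BIGSERIAL PRIMARY KEY,
          url        TEXT NOT NULL,
          title      TEXT,
          highlight  TEXT,
          visited_at TIMESTAMPTZ DEFAULT now(),
          embedding  VECTOR(384)          -- embedding of title + highlight
        )`);
    }

    // "What was that pricing comparison from last month?"
    async function semanticSearch(query: string, embed: (s: string) => Promise<number[]>) {
      const vec = await embed(query);
      const { rows } = await db.query(
        `SELECT url, title, visited_at
           FROM page_visits
          WHERE visited_at > now() - INTERVAL '60 days'
          ORDER BY embedding <-> $1::vector
          LIMIT 5`,
        [JSON.stringify(vec)] // pgvector accepts '[0.1,0.2,...]' literals
      );
      return rows;
    }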
Beyond search and memory, the browser can actually help you work. AI that intelligently groups your tabs ("these 15 are all Chromium research"), automation for grunt work ("compare 2TB hard drive prices across these sites"), or even "summarize all new posts in my Discord servers" - all handled locally. The browser should help us manage internet chaos, not add to it.
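For the tab-grouping piece, here's roughly the shape of it using the stock chrome.tabs / chrome.tabGroups extension APIs (the LLM classification is stubbed out; this is a sketch, not our exact code):

    // Sketch: "group these tabs by topic" with standard Chromium extension APIs
    // (requires the "tabs" and "tabGroups" permissions in a MV3 extension context).
    // `classifyTabs` stands in for whatever LLM call assigns a topic per tab.
    async function groupTabsByTopic(
      classifyTabs: (titles: string[]) => Promise<string[]>
    ): Promise<void> {
      const tabs = await chrome.tabs.query({ currentWindow: true });
      const topics = await classifyTabs(tabs.map(t => t.title ?? t.url ?? ""));

      // Bucket tab ids per topic label returned by the model.
      const buckets = new Map<string, number[]>();
      tabs.forEach((tab, i) => {
        if (tab.id === undefined) return;
        const topic = topics[i] ?? "Misc";
        buckets.set(topic, [...(buckets.get(topic) ?? []), tab.id]);
      });

      for (const [topic, tabIds] of buckets) {
        const groupId = await chrome.tabs.group({ tabIds });     // Chrome 89+
        await chrome.tabGroups.update(groupId, { title: topic });
      }
    }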
Would love to hear what specific workflows are painful for you!
Is this a common and well-defined term that people use? I've never heard it.
It would appear to me from the context that it means something like "web browser with AI stuff tacked on".
By "agentic browser" we basically mean a browser with AI agents that can do web navigation tasks for you. So instead of you manually clicking around to reorder something on Amazon or fill out forms, the AI agent can actually navigate the site and do those tasks.
Does having access to Chromium internals give you any super powers over connecting over the Chrome Devtools Protocol?
A few ideas we're thinking of: integrating a small LLM, building an MCP store into the browser, building a more AI-friendly DOM, etc.
Even today, we use Chrome's accessibility tree (a better representation of the DOM for LLMs), which is not exposed via Chrome extension APIs.
You might consider the Accessibility Tree and its semantics. Plain divs are basically filtered out so you're left with interactive objects and some structural/layout cues.
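Puppeteer's accessibility snapshot is a quick way to see this filtering in action (a small sketch; its default "interesting only" mode drops plain wrapper divs):

    // Sketch: Puppeteer's accessibility snapshot, which by default keeps only
    // "interesting" nodes (interactive controls, headings, etc.) and drops
    // plain wrapper divs -- a compact page representation for an LLM.
    import puppeteer from "puppeteer";

    async function pageOutline(url: string): Promise<void> {
      const browser = await puppeteer.launch();
      const page = await browser.newPage();
      await page.goto(url);

      const snapshot = await page.accessibility.snapshot(); // interestingOnly: true by default
      const walk = (node: any, depth = 0): void => {
        if (!node) return;
        console.log(`${"  ".repeat(depth)}${node.role}: ${node.name ?? ""}`);
        (node.children ?? []).forEach((c: any) => walk(c, depth + 1));
      };
      walk(snapshot);

      await browser.close();
    }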
These fancy words carry an air of intellect and productivity. Putting them to use probably makes people feel like they're getting things done, and never lazy for it.
A complicated workflow may involve other tools. For example, the input to the LLM may produce something that tells it to set the user-agent to such-and-such a string:
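Purely as a hypothetical illustration (the tool name, JSON schema, and Puppeteer handler below are made up), that could look like:

    // Hypothetical example of the kind of tool call being described --
    // nothing here is from a specific product.
    import type { Page } from "puppeteer";

    const setUserAgentTool = {
      name: "set_user_agent",
      description: "Override the browser's User-Agent string before navigating.",
      parameters: {
        type: "object",
        properties: { userAgent: { type: "string" } },
        required: ["userAgent"],
      },
    };

    // A model response choosing that tool might look like:
    //   { "tool": "set_user_agent",
    //     "arguments": { "userAgent": "Mozilla/5.0 (X11; Linux x86_64) ..." } }

    async function handleToolCall(page: Page, call: { tool: string; arguments: any }) {
      if (call.tool === "set_user_agent") {
        await page.setUserAgent(call.arguments.userAgent); // real Puppeteer API
      }
    }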
Other tools could be clicking on things in the page, or even injecting custom JavaScript when a page loads. The tl;dr is that it's AI that makes decisions on its own.
On the other hand: this has the potential to be an absolute security Chernobyl. A browser is likely to be logged into all your sensitive accounts. An agent in your browser is probably going to be exposed to untrusted inputs from the internet by its very nature.
You have the potential for prompt injection to turn your life upside down in a matter of seconds. I like the concept but I wouldn't touch this thing with a ten foot pole unless everyone in the supply chain was PCI/SOC2/ISO 27001 certified, the whole supply chain has been vetted, and I have blood oaths about its security from third party analysts.
This is exactly why we're going local-first and open source. With cloud agents (like Manus.im), you're trusting a black box with your credentials. With local agents, you maintain control:
- Agents only run when you explicitly trigger them
- You see exactly what they're doing in real-time and can stop them
- You can run tasks in a separate Chrome user profile (see the sketch after this list)
- Most importantly: the code is open source, so you can audit exactly what's happening.
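As an illustration of the separate-profile point (the binary path below is hypothetical; --user-data-dir is a standard Chromium flag):

    // Sketch: run agent tasks in an isolated, throwaway Chromium profile so
    // they never touch your logged-in default profile.
    import { spawn } from "node:child_process";
    import { mkdtempSync } from "node:fs";
    import { tmpdir } from "node:os";
    import { join } from "node:path";

    function launchIsolatedProfile(): void {
      const profileDir = mkdtempSync(join(tmpdir(), "agent-profile-"));
      spawn(
        "/Applications/Nxtscape.app/Contents/MacOS/Nxtscape", // hypothetical binary path
        [`--user-data-dir=${profileDir}`, "--no-first-run"],
        { stdio: "inherit" }
      );
    }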
Regardless, you did not answer OP's point, which is that any potentially malicious site can prompt-inject you at any point and trigger an MCP or any other action before you see it and stop it. The whole point of an AI browser is, like a self-driving car, being able to de-focus and let it do its thing. If I have to nervously watch whether I'm getting hacked at any given second, then it's probably not a great product.
We have Linux next on our radar. What build do you want?
https://github.com/nxtscape/nxtscape/issues/5
Great product though.
Good luck, but in your place I would at least start with something that a specific ICP needs more. Many, many man-hours have been wasted by ambitious technical founders on taking down Chrome (many also starting from a Chromium fork itself), but none of them succeeded. We only have limited energy.
Definitely agree there is a good amount of competition here.
But we do think there is a gap in the market for an open-source, community-driven and privacy-first AI browser. (Something like Brave?)
Brave is a decent example, but their business model is actually complicated; it includes a lot of little stuff. And they don't have the unit cost of LLMs (I'm assuming at some point you will take on the burden of the LLMs, if not local).
Island browser and Chrome Enterprise have kind of validated the need for an enterprise version of the browser with a VPN and a DLP (data-loss-prevention) engine.
All the same, looks like y’all are having fun working on it, and maybe some unforeseen usecase will bubble up.
So your thesis is that an AI agent should decide what I pay attention to, rather than me?
What could possibly go wrong?
While reviewing the prompt's capabilities, I had an idea: a Greasemonkey/Userscript-style system, where users could inject custom JavaScript or prompts based on URLs, could be a powerful way to enhance website interactions.
For instance, consider a banking website with a cumbersome data export process that requires extra steps to make the data usable. Imagine being able to add a custom button to their UI (or define a custom MCP function) specifically for that URL, which could automatically pull and format the data into a more convenient format for plain text accounting.
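As a sketch of what that userscript could look like (the bank URL and selectors below are made up):

    // ==UserScript==
    // @name        Export transactions button
    // @match       https://bank.example.com/accounts/*
    // @grant       none
    // ==/UserScript==
    //
    // Hypothetical sketch: the URL and selectors are made up; the point is just
    // "inject a small script keyed to one site and add a custom button".
    (function () {
      const button = document.createElement("button");
      button.textContent = "Export as CSV";
      button.addEventListener("click", () => {
        const rows = Array.from(document.querySelectorAll("table.transactions tr"));
        const csv = rows
          .map(r =>
            Array.from(r.querySelectorAll("td,th"))
              .map(c => `"${(c.textContent ?? "").trim().replace(/"/g, '""')}"`)
              .join(",")
          )
          .join("\n");
        navigator.clipboard.writeText(csv); // drop the table on the clipboard
      });
      document.body.appendChild(button);
    })();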
I was a huge fan of Tampermonkey back in the day.
* Buying a sofa. You want to filter for sofas of a specific size with certain features, but marketing sites want to feed you a bunch of marketing slop for each sofa before giving you the details. This generalises to many domains.
* You have a few friends who are still stuck on Facebook; you want to be notified if they post anything and avoid other rubbish.
* The local neighborhood is stuck organising in a Facebook group or, even worse, Nextdoor. You want to see any new posts except for those couple of guys who are always posting the same thing.
* A government consultation website has been put up, but as a hurdle the consultation document has been combinatorially expanded to 763 pages by bureaucratic authoring techniques. You want to undo the combinatorial expansion so you can identify things you actually care about.
This jumped out to me as well. Even sites like Amazon lack per-item-cost sorting, which can be really helpful when buying in bulk. Historically we've seen people use scraping and data science to build sites like https://diskprices.com/ without using LLMs. If LLMs are useful for those types of tasks, perhaps we'll see a surge in similar sites instead of end users doing prompt engineering in their browser.
> You want to see any new posts except for those couple of guys who are always posting the same thing.
It looks like nextdoor supports blocking users, although other sites may not.
https://help.nextdoor.com/s/article/block-a-neighbor
Sort of like a backwards Perplexity search. (The LLM context comes from open tabs rather than the tool that brings you to those tabs.)
I built a tab manager extension a long time ago that people used but ran into the same problem- the concept of tab management runs deeper than just the tabs themselves.
I added a few features which I felt would be useful: an easy way to organise and group tabs, and a simple way to save and resume sessions with selective context.
What are your problems that you would like to see solved?
This would of course apply to not just open tabs but tabs I used to have open, where the LLM knows about my browsing history.
But I think I would want a non-chat interface for this. (of course at any time I could chat/ask a question as well)
Resist the call to open in a tab every link in this article, overcome the fear of losing something if all these tabs lagging behind are closed right now without further consideration.
But I wonder if it matters if the agent is mostly being used for "human" use cases and not scraping?
If any type of AI-based assistance is supposed to adhere to robots.txt, then would you also say that AI-based accessibility tools should refuse to work on pages blocked by robots.txt?
What coherent definition of robot excludes Chrome but includes this?
If your browser behaves, it's not going to be excluded in robots.txt.
If your browser doesn't behave, you should at least respect robots.txt.
If your browser doesn't behave, and you continue to ignore robots.txt, that's just... shitty.
No, it's common practice to allow Googlebot and deny all other crawlers by default [0].
This is within their rights when it comes to true scrapers, but it's part of why I'm very uncomfortable with the idea of applying robots.txt to what are clearly user agents. It sets a precedent where it's not inconceivable that we have websites curating allowlists of user agents like they already do for scrapers, which would be very bad for the web.
[0] As just one example: https://www.404media.co/google-is-the-only-search-engine-tha...
I am not sure I agree with an AI-aided browser, that will scrape sites and aggregate that information, being classified as "clearly" a user agent.
If this browser were to gain traction and ends up being abusive to the web, that's bad too.
Where do you draw the line of crawler vs. automated "user agent"? Is it a certain number of web requests per minute? How are you defining "true scraper"?
> A robot is a program that automatically traverses the Web's hypertext structure by retrieving a document, and recursively retrieving all documents that are referenced.
To me "recursive" is key—it transforms the traffic pattern from one that strongly resembles that of a human to one that touches every page on the site, breaks caching by visiting pages humans wouldn't typically, and produces not just a little bit more but orders of magnitude more traffic.
I was persuaded in another subthread that Nxtscape should respect robots.txt if a user issues a recursive request. I don't think it should if the request is "open these 5 subreddits and summarize the most popular links uploaded since yesterday", because the resulting traffic pattern is nearly identical to what I'd have done by hand (especially if the browser implements proper rate limiting, which I believe it should).
[0] https://www.robotstxt.org/faq/what.html
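As a sketch of that policy (rate-limit every fetch; consult robots.txt only when the task is genuinely recursive), with deliberately naive robots.txt parsing:

    // Sketch of the policy argued for above: rate-limit every agent fetch, and
    // consult robots.txt only when the task is crawl-like (recursive).
    const MIN_DELAY_MS = 1000;
    let lastFetch = 0;

    async function politeFetch(url: string, recursive: boolean): Promise<Response> {
      if (recursive && !(await allowedByRobots(url))) {
        throw new Error(`robots.txt disallows recursive fetching of ${url}`);
      }
      const wait = Math.max(0, lastFetch + MIN_DELAY_MS - Date.now());
      await new Promise(r => setTimeout(r, wait)); // ~1 request/second, like a human
      lastFetch = Date.now();
      return fetch(url);
    }

    async function allowedByRobots(url: string): Promise<boolean> {
      const { origin, pathname } = new URL(url);
      const robots = await fetch(`${origin}/robots.txt`).then(r => (r.ok ? r.text() : ""));
      // Naive check: any "Disallow:" prefix match under "User-agent: *" blocks the path.
      let applies = false;
      for (const line of robots.split("\n").map(l => l.trim())) {
        if (/^user-agent:/i.test(line)) applies = /\*\s*$/.test(line);
        else if (applies && /^disallow:/i.test(line)) {
          const prefix = line.split(":")[1]?.trim() ?? "";
          if (prefix && pathname.startsWith(prefix)) return false;
        }
      }
      return true;
    }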
Maybe some new standards, or user-configurable per-site permissions, might make it better?
I'm curious to see how this turns out.
Why? My user agent is configured to make things easier for me and allow me to access content that I wouldn't otherwise choose to access. Dark mode allows me to read late at night. Reader mode allows me to read content that would otherwise be unbearably cluttered. I can zoom in on small text to better see it.
Should my reader mode or dark mode or zoom feature have to respect robots.txt because otherwise they'd allow me to access content that I would otherwise have chosen to leave alone?
I know it's not completely true: reader mode can help you bypass the ads _after_ you've already had a peek at the cluttered version, and if you need to go to the next page or something like that you have to disable reader mode again, and so on. So it's a very granular form of ad blocking, while many AI use cases are about a human bypassing viewing the page at all. The other thing is that reader mode is not very popular, so it's not a significant threat.
*or other links on their websites, or informative banners, etc
AFAIK this is false, and this browser can do things like "summarize all the cooking recipes linked in this page" and therefore act exactly like a scraper (even if at smaller scale than most scrapers)
If tomorrow magically all phones and all computers had an ad-blocking browser installed (and set as the default browser), a big chunk of the economy would collapse. So while I can see the philosophical value of "What a user does with a page after it has entered their browser is their own prerogative", the pragmatic in me knows that if all users cared about that and enforced it, it would have grave repercussions for the livelihood of many.
> A robot is a program that automatically traverses the Web's hypertext structure by retrieving a document, and recursively retrieving all documents that are referenced.
There's nothing recursive about "summarize all the cooking recipes linked on this page". That's a single-level iterative loop.
I will grant that I should alter my original statement: if OP wanted to respect robots.txt when it receives a request that should be interpreted as an instruction to recursively fetch pages, then I'd think that's an appropriate use of robots.txt, because that's not materially different than implementing a web crawler by hand in code.
But that represents a tiny subset of the queries that will go through a tool like this and respecting robots.txt for non-recursive requests would lead to silly outcomes like the browser refusing to load reddit.com [0].
[0] https://www.reddit.com/robots.txt
What about reader mode that is auto-configured to turn on immediately on landing on specific domains? Is that a robot for the purposes of robots.txt?
https://addons.mozilla.org/en-US/firefox/addon/automatic-rea...
And also, just to confirm, I'm to understand that if I'm navigating the internet with an ad blocker then you believe that I should respect robots.txt because my user agent is now a robot by virtue of using an ad blocker?
Is that also true if I browse with a terminal-based browser that simply doesn't render JavaScript or images?
Autoconfig of reader mode and the like is so uncommon that it's not even on the radar of most websites. If it were, browser developers would probably try to create a solution that satisfies both parties, like putting the ads at the end, requiring them to be text-only, and other guidelines, but it's not popular. The same goes for terminal-based browsers; a lot of the most visited websites in the world don't even work without JS enabled.
On the other hand, this AI stuff seems to envision a larger userbase so it could become a concern and therefore the role of robots.txt or other anti-bot features could have some practical connotations.
I'm not asking if you believe ad blocking is ethical, I got that you don't. I'm asking if it turns my browser into a scraper that should be treated as such, which is an orthogonal question to the ethics of the tool in the first place.
I strongly disagree that user agents of the sort shown in the demo should count as robots. Robots.txt is designed for bots that produce tons of traffic to discourage them from hitting expensive endpoints (or to politely ask them to not scrape at all). I've responded to incidents caused by scraper traffic and this tool will never produce traffic in the same order of magnitude as a problematic scraper.
If we count this as a robot for the purposes of robots.txt we're heading down a path that will end the user agent freedom we've hitherto enjoyed. I cannot endorse that path.
For me the line is simple, and it's the one defined by robotstxt.org [0]: "A robot is a program that automatically traverses the Web's hypertext structure by retrieving a document, and recursively retrieving all documents that are referenced. ... Normal Web browsers are not robots, because they are operated by a human, and don't automatically retrieve referenced documents (other than inline images)."
If the user agent is acting on my instructions and accessing a specific and limited subset of the site that I asked it to, it's not a web scraper and should not be treated as such. The defining feature of a robot is amount of traffic produced, not what my user agent does with the information it pulls.
[0] https://www.robotstxt.org/faq/what.html
> A robot is a program that automatically traverses the Web's hypertext structure by retrieving a document, and recursively retrieving all documents that are referenced.
This is absolutely not what you are doing, which means what you have here is not a robot. What you have here is a user agent, so you don't need to pay attention to robots.txt.
If what you are doing here counted as robotic traffic, then so would:
* Speculative loading (algorithm guesses what you're going to load next and grabs it for you in advance for faster load times).
* Reader mode (algorithm transforms the website to strip out tons of content that you don't want and present you only with the minimum set of content you wanted to read).
* Terminal-based browsers (do not render images or JavaScript, thus bypassing advertising and according to some justifications leading them to be considered a robot because they bypass monetization).
The fact is that the web is designed to be navigated by a diverse array of different user agents that behave differently. I'd seriously consider imposing rate limits on how frequently your browser acts so you don't knock over a server—that's just good citizenship—but robots.txt is not designed for you and if we act like it is then a lot of dominoes will fall.
[0] https://www.robotstxt.org/faq/what.html
Website operators should not get a say in what kinds of user agents I used to access their sites. Terminal? Fine. Regular web browser? Okay. AI powered web browser? Who cares. The strength of the web lies in the fact that I can access it with many different kinds of tools depending on my use case, and we cannot sacrifice that strength on the altar of hatred of AI tools.
Down that road lies disaster, with the Play Integrity API being just the tip of the iceberg.
https://www.robotstxt.org/faq/what.html
There's a straw man here. If you want to reorder an item on Amazon: click on 'order history', scroll, and click buy. This is a well-optimized path already and it doesn't require your full attention. I suspect the agent approach takes more effort as you need to type and then monitor what the AI is doing.
A chat interface works for ChatGPT because most folks use it as a pseudo-search, but productivity tools are (broadly speaking) not generative, therefore shouldn't be using freeform inputs. I have many thoughts on fixing this, and it's a very hard problem, but simply slapping an LLM onto Chrome is just lazy. I don't mean to be overly negative, but it's kind of wild to see YC funding slop like this.
And that's exactly what this is: slop. There's no technical creativity here, this isn't a new product segment, it barely deserves the "hey bro, this might be a feature, not a product" startup 101 criticism. It's what ChatGPT would spit out if you asked it what a good startup idea would be in 2025. All we need to do, even if we were being as charitable as possible, is ask who's doing the heavy lifting here (hint: it's not in the Github repo).
Appreciate the AGPLv3 licence, kudos on that.
I get the general sentiment. But Cursor for sure has improved productivity by a huge multiplicative factor, especially for simpler stuff (like building a Chrome extension).
What is the tech around the thing that segments out DOM elements automatically and shows the visual representation? I think something like this would be great for automated UI testing agents.
To get the page content we parse the accessibility tree.
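For the visual segmentation part of your question, a common approach in browser agents (not necessarily exactly what we ship) is a "set of marks" overlay: number the interactive elements so the model can say "click element 7". A rough content-script sketch:

    // Sketch of a "set of marks" style overlay: enumerate interactive elements
    // and draw numbered boxes over them so a model (or a test agent) can refer
    // to them by index.
    function markInteractiveElements(): void {
      const selector = "a[href], button, input, select, textarea, [role=button], [onclick]";
      document.querySelectorAll<HTMLElement>(selector).forEach((el, i) => {
        const rect = el.getBoundingClientRect();
        if (rect.width === 0 || rect.height === 0) return; // skip invisible elements

        const badge = document.createElement("div");
        badge.textContent = String(i);
        Object.assign(badge.style, {
          position: "fixed",
          left: `${rect.left}px`,
          top: `${rect.top}px`,
          width: `${rect.width}px`,
          height: `${rect.height}px`,
          outline: "2px solid red",
          color: "red",
          fontSize: "12px",
          zIndex: "999999",
          pointerEvents: "none",
        });
        document.body.appendChild(badge);
      });
    }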
Also what's the business model?
The hype cycle business model never changes.
> what's the reason for no Linux/Windows?
Sorry, just lack of time. Also we use Sparkle for distributing updates, which is macOS-only.
> Also what's the business model?
We are considering an enterprise version of the browser for teams.
Instead of manually hunting across half a dozen different elements, then copy/paste and retype to put something into a format I want…
I can just get Dia to do it. In fact, I can create a shortcut to get it to do it the same way every single time. It’s the first time I’ve used something that actually feels like an extension of the web, instead of a new way to simply act on it at the surface level.
I think the obvious extension of that is agentic browsers. I can’t wait for this to get built to a standard where I can use it every day… But how well is it going to run on my 16GB M1 Pro?
Download from https://www.nxtscape.ai/ or our GitHub page.
Google, being a big one of those companies, would soon side with those companies and not with the users; it's been their modus operandi. Just recently some people got threats that if they don't stop using ad blockers on YouTube they will be banned from the platform.
edit: Just read about the accessibility thing, but that's thin. Is there any future use case that a browser can handle but an extension can't?
The only reason to use a browser over a Chrome extension is to bypass security features, for example, trusted events. If a user wants the browser window to go to full screen or play a video, a physical mouse click or key press is required. Moreover, some websites do not want to be automated, like the ChatGPT web console and Chase.com, which check whether the event was a trusted event before accepting a button click or key press. This means that a Chrome extension cannot automate voice commands inferred via audio-to-text. However, getting a trusted event only requires the user to press a button, any button, so a message or dialog prompt that says, "Press to go full screen," is all that is required. This can be done with a remote Bluetooth keyboard as well.
The way I see it, these limitations are in place for very, very good reasons and should not be bypassed. Moreover, there are much larger security issues with an agentic browser that sends the entire contents of a bank website, or health records from a hospital patient portal, to a third-party server. It is possible to run OpenAI's Whisper on WebGPU on a MacBook Pro M3, but most text-generation models over 300M parameters will cause it to heat up enough to cook a steak. There are even bigger issues with potential prompt-injection attacks from third-party websites that know agentic browsers are visiting their sites.
The first step in mitigating these security vulnerabilities is preventing the automation from doing anything a Chrome extension can't already do. The second is blacklisting, or opt-in only: not allowing the agents to read, and especially not to write (filling in a form is a write), on any webpage without explicit permission. I've started to use VS Code's Copilot for command-line actions and it works with permissions the same way, such as session-only access.
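For what it's worth, a session-scoped, opt-in gate of that sort could be as simple as something like this (the types and prompt function are purely illustrative, not any existing API):

    // Sketch of an opt-in, session-scoped permission gate for agent actions.
    type Action = "read" | "write"; // filling in a form counts as a write

    const sessionGrants = new Map<string, Set<Action>>(); // origin -> granted actions

    async function requirePermission(
      origin: string,
      action: Action,
      askUser: (msg: string) => Promise<boolean>
    ): Promise<void> {
      if (sessionGrants.get(origin)?.has(action)) return; // already granted this session
      const ok = await askUser(`Allow the agent to ${action} on ${origin} for this session?`);
      if (!ok) throw new Error(`User denied ${action} access to ${origin}`);
      const grants = sessionGrants.get(origin) ?? new Set<Action>();
      grants.add(action);
      sessionGrants.set(origin, grants);
    }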
I've already solved a lot of the problems associated with using a Chrome extension for agentic browser automation. I really would like to be having this conversation with people.
EDIT: I forgot the most important part. There are 3,500,000,000 Chrome users on Earth. Getting them to install a Chrome extension is much, much easier than getting them to install a new browser.
https://developer.chrome.com/docs/extensions/ai
Don't any of these fit the bill? Are they Gemini-locked and you want something else? I am not familiar with the Chrome API, so pardon my ignorance.
- Ship a small LLM along with the browser
- MCP store built in
Oh cool, will look into basic.tech to understand more.
Feel free to add new ones or upvote existing ones. We want to build what people want :)
Thank you! We have Ollama integration already; you can run models locally and use them for AI chat.
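For anyone curious what the local path looks like: Ollama exposes its default HTTP API on localhost:11434, so the integration can be as simple as this sketch (the model name is just whatever you've pulled locally, e.g. `ollama pull llama3.2`):

    // Minimal sketch of talking to a locally running Ollama server.
    async function askLocalModel(prompt: string): Promise<string> {
      const res = await fetch("http://localhost:11434/api/generate", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ model: "llama3.2", prompt, stream: false }),
      });
      const data = await res.json();
      return data.response; // the generated text; nothing leaves your machine
    }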
- https://tsdr.uspto.gov/#caseNumber=76017078&caseSearchType=U...
> PROVIDING MULTIPLE-USER ACCESS TO A GLOBAL COMPUTER INFORMATION NETWORK FOR THE TRANSFER AND DISSEMINATION OF A WIDE RANGE OF INFORMATION; ELECTRONIC TRANSMISSION OF DATA, IMAGES, AND DOCUMENTS VIA COMPUTER NETWORKS; [ELECTRONIC MAIL SERVICES; PROVIDING ON-LINE CHAT ROOMS FOR TRANSMISSION OF MESSAGES AMONG COMPUTER USERS CONCERNING A WIDE VARIETY OF FIELDS]
- https://tsdr.uspto.gov/#caseNumber=76017079&caseSearchType=U...
> PROVIDING INFORMATION IN THE FIELD OF COMPUTERS VIA A GLOBAL COMPUTER NETWORK; PROVIDING A WIDE RANGE OF GENERAL INTEREST INFORMATION VIA COMPUTER NETWORKS
- https://tsdr.uspto.gov/#caseNumber=74574057&caseSearchType=U...
> computer software for use in the transfer of information and the conduct of commercial transactions across local, national and world-wide information networks
Also the fact that it's AGPL means this project is very copyleft and not compatible with business models.
I'm not saying that there is no place for copyleft open source anymore, but when it's in a clearly commercial project that makes me question the utility of it being open source.
https://www.gnu.org/licenses/why-affero-gpl.html
This means that if this company is successful and sells me 1 license, in theory I can request the source code and spin up (in Dr. Evil's voice) one billion clones and not pay licenses for those.
With other forms of GPL you only have to release the source code if you release the software to the user.
Saying that such a behavior rules out all possible business models is like saying dictatorship is the only form of governance.
It was cute when the internet was cute but now it's just boring.
But not gonna lie, as a tiny startup, we don't have the marketing budget of Perplexity or Dia, so we picked a name and icon that at least hinted at "browser" right away. Definitely not trying to mislead anyone -- just needed something recognizable out of the gate.