Show HN: Nxtscape – an open-source agentic browser (github.com)

Hi HN - we're Nithin and Nikhil, twin brothers and founders of nxtscape.ai (YC S24). We're building Nxtscape ("next-scape") - an open-source, agentic browser for the AI era.

-- Why bother building a new browser? For the first time since Netscape was released in 1994, it feels like we can reimagine browsers from scratch for the age of AI agents. The web browser of tomorrow might not look like what we have today.

We saw how tools like Cursor gave developers a 10x productivity boost, yet the browser—where everyone else spends their entire workday—hasn't fundamentally changed.

And honestly, we feel like we're constantly fighting the browser we use every day. It's not one big thing, but a series of small, constant frustrations. I'll have 70+ tabs open from three different projects and completely lose my train of thought. And simple stuff like reordering tide pods from amazon or filling out forms shouldn't need our full attention anymore. AI can handle all of this, and that's exactly what we're building.

Here’s a demo of our early version https://dub.sh/nxtscape-demo

-- What makes us different We know others are exploring this space (Perplexity, Dia), but we want to build something open-source and community-driven. We're not a search or ads company, so we can focus on being privacy-first – Ollama integration, BYOK (Bring Your Own Keys), ad-blocker.

Btw we love what Brave started and stood for, but they've now spread themselves too thin across crypto, search, etc. We are laser-focused on one thing: making browsers work for YOU with AI. And unlike Arc (which we loved too but got abandoned), we're 100% open source. Fork us if you don't like our direction.

-- Our journey hacking a new browser To build this, we had to fork Chromium. Honestly, it feels like the only viable path today—we've seen others like Brave (started with electron) and Microsoft Edge learn this the hard way.

We also started with why not just build an extension. But realized we needed more control. Similar to the reason why Cursor forked VSCode. For example, Chrome has this thing called the Accessibility Tree - basically a cleaner, semantic version of the DOM that screen readers use. Perfect for AI agents to understand pages, but you can't use it through extension APIs.

That said, working with the 15M-line C++ chromium codebase has been an adventure. We've both worked on infra at Google and Meta, but Chromium is a different beast. Tools like Cursor's indexing completely break at this scale, so we've had to get really good with grep and vim. And the build times are brutal—even with our maxed-out M4 Max MacBook, a full build takes about 3 hours.

Full disclosure: we are still very early, but we have a working prototype on GitHub. It includes an early version of a "local Manus" style agent that can automate simple web tasks, plus an AI sidebar for questions, and other productivity features (grouping tabs, saving/resuming sessions, etc.).

Looking forward to any and all comments!

You can download the browser from our github page: https://github.com/nxtscape/nxtscape

Comments (95)

hannob · 3h ago

Okay, maybe this is a stupid question, but: what is an agentic browser? You seem to assume that everyone knows what that means.

Is this a common and well-defined term that people use? I've never heard it.

It would appear to me from the context that it means something like "web browser with AI stuff tackled on".

felarof · 2h ago

Thanks for asking - not a stupid question at all! I should have probably explained it at the top of my post.

By "agentic browser" we basically mean a browser with AI agents that can do web navigation tasks for you. So instead of you manually clicking around to reorder something on Amazon or fill out forms, the AI agent can actually navigate the site and do those tasks.

wild_egg · 2h ago

Not to pull a "why should I use Dropbox when I have rsync" but why should we use this over adding a Playwright MCP to Claude Desktop or similar?

Does having access to Chromium internals give you any super powers over connecting over the Chrome Devtools Protocol?

felarof · 2h ago

Yes, eventually we think there is more value of owning the entire stack than just be a MCP connector.

Few ideas we were thinking of: integrating a small LLM, building MCP store into browser, building a more AI friendly DOM, etc.

Even today, we use chrome's accessibility tree (a better representation of DOM for LLMs) which is not exposed via chrome extension APIs.

pickpuck · 1h ago

> building a more AI friendly DOM

You might consider the Accessibility Tree and its semantics. Plain divs are basically filtered out so you're left with interactive objects and some structural/layout cues.

shortrounddev2 · 1h ago

I would take the position of "why use this when I have eyes and hands and a brain?"

al_borland · 2h ago

I first heard the term agentic about a month ago. I went from never hearing it, to hearing it 3 or 4 times in 2 days... one of which was on an internal town hall where I work, where leadership was simply using it as if the whole world already knew what it meant, instead of literally being the first time it was ever mentioned.

The tl;dr is that it's AI that makes decisions on its own.

kordlessagain · 2h ago

Agents are LLM responses that are feed with tools, like calculate(expression). When it encounters a thing it needs to do to meet desired output, it will run the tool. That is defining a simple agentic workflow.

A complicated workflow may involve other tools. For example, the input to the LLM may produce something that tells it to set the user-agent to such and such as string:

  set_user_agent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36");

Other tools could be clicking on things in the page, or even injecting custom JavaScript when a page loads.

dvt · 10m ago

Suffers from the same problem as all other AI "workflows"--no one wants to fucking chat with a computer (Brave sort-of does this, and it's god-awful). A chat interface should only be used as a fallback if the agent is too dumb to figure out what I want.

A chat interface works for ChatGPT, because most folks use it as a pseudo-search, but productivity tools are (broadly speaking) not generative, therefore shouldn't be using freeform inputs. I have many thoughts on fixing this, and it's a very hard problem, but simply slapping an LLM on Chrome is just lazy. I don't mean to be overly negative, but it's kind of wild to see YC funding slop like this.

jklinger410 · 3h ago

What is with Mac users forking Chromium and then only making releases for Mac?

felarof · 3h ago

Haha, was easier to build and we were the first users :)

have linux next on our radar. What build do you want?

jtolly710 · 2h ago

.deb would be great to see next :)

shortrounddev2 · 1h ago

Windows

doublerabbit · 3h ago

FreeBSD, Haiku, Amiga

felarof · 2h ago

Sounds good! will look into getting linux build.

kerv · 40m ago

This is cool thanks for sharing

What is the tech around the thing that segments out DOM elements automatically and shows the visual representation. I think something like this would be great for automated UI testing agents?

felarof · 21m ago

Oh that's from browser use buildDOMTree.js. check it out on their github.

ellisd · 44m ago

Why obfuscate the LLM system prompt in your Github repo when it's going to be completely visible in the network inspector?

felarof · 25m ago

Not on purpose, just got compressed during production build through webpack. will get that fixed.

ellisd · 9m ago

Thank you for the clarification!

While reviewing the prompt's capabilities, I had an idea: implementing a Greasemonkey/Userscript-style system, where users could inject custom JavaScript or prompts based on URLs, could be a powerful way to enhance website interactions.

For instance, consider a banking website with a cumbersome data export process that requires extra steps to make the data usable. Imagine being able to add a custom button to their UI (or define a custom MCP function) specifically for that URL, which could automatically pull and format the data into a more convenient format for plain text accounting.

deepdarkforest · 44m ago

This is definitely a winners take all market. Kudos for giving it a shot, but imo browser projects are just too big for a team of 2/3. Plus, google has already demoed at IO the first hint at this. IMO you just cannot move fast enough to grab enough market share as a first/second mover before google just does it on chrome, and that's assuming you can outcompete with Dia in the first place. Even browser-use can do this also, and they have good distribution already.

Good luck, but in your place i would at least start with something that a certain ICP needs more. Many, many manhours have been wasted by ambitious technical founders on taking down Chrome. (many also starting from a chrome fork itself). But none of them succeeded. We only have limited energy

felarof · 6m ago

Thanks for the honest feedback!

Definitely agree there is good amount of competition here.

But we do think there is a gap in the market for open-source, community driven and privacy-first AI browser. (Something like Brave?)

awongh · 2h ago

I think LLMs could have a reasonable chance at solving tab-related workflows (keeping track of tabs or the idea/concept of tabs) - that is tracking and sorting lots of small related research ideas.

Sort of like a backwards perplexity search. (LLM context is from open tabs rather than the tool that brings you to those tabs)

I built a tab manager extension a long time ago that people used but ran into the same problem- the concept of tab management runs deeper than just the tabs themselves.

felarof · 2h ago

Yeah, I feel LLMs can finally solve the tab overload issue. I suffer from this constantly.

I added few features which I felt would be useful - easy way to organise and group tabs - simple way to save and resume sessions with selective context.

What are your problems that you would like to see solved?

awongh · 1h ago

I don't like the idea of letting the LLM run wild and categorize things directly, but in a tab-organizing view it would be useful to add more semantic sorting of the tabs- maybe it would enable something like multiple tab-view control panel: Show all the AI tabs. Show all the image diffusion tabs. Show all the LLM tabs. (so overlapping views of sets of tabs)

This would of course apply to not just open tabs but tabs I used to have open, where the LLM knows about my browsing history.

But I think I would want a non-chat interface for this. (of course at any time I could chat/ask a question as well)

psychoslave · 1h ago

I think a large part of it is us, as user, we lake the appropriate discipline.

Resist the call to open in a tab every link in this article, overcome the fear of losing something if all these tabs lagging behind are closed right now without further consideration.

ugh123 · 1h ago

That tab grouping capability is leaps and bounds better than the junk Google put into Chrome for "AI suggested" grouping.

felarof · 54m ago

+1, I use AI tab grouping feature in our browser quite a lot.

xena · 3h ago

Do you respect robots.txt?

felarof · 3h ago

No, not today.

But wonder if it matter if it the agent is mostly using it for "human" use cases and not scrapping?

qualeed · 48m ago

There's no reason not to respect it.

If your browser behaves, it's not going to be excluded in robots.txt.

If your browser doesn't behave, you should at least respect robots.txt.

If your browser doesn't behave, and you continue to ignore robots.txt, that's just... shitty.

xena · 3h ago

You should, because universities are starting to get legal involved due to mass scraping taking down their systems.

dotancohen · 3h ago

Yes it would matter. The AI might be I in your eyes, but it is still A.

mattigames · 3h ago

What do you mean? This AI cannot scrape multiple links automatically? Like "make a summary of all the recipes linked in this page" kind of stuff? If it can it definitely meets the definition of scraping.

grepexdev · 2h ago

I think what he means is it is not just generally crawling and scraping, and uses a more targeted approach. Equivalent to a user going to each of those sites, just more efficiently.

No comments yet

b0a04gl · 3h ago

so agents can control tabs, forms, clicks—like a real user would.so what about undo. if an agent clicks the wrong thing, how do you roll that back without reloading the world?

felarof · 3h ago

There is a big red button to always stop the agent.

deepdarkforest · 52m ago

you will run into the same problems the extension approaches have (Like nanobrowser etc). Which is if i have to supervise constantly for non reversible actions, then im no more efficient(actually less i would argue) than just doing the task myself. A human in the loop- pause just before a non reversible action asking for approval maybe? And the user can see all the current things that need approval. But a stop button works just too late imo

lxe · 3h ago

Before I dive into the source code... how do you pass the page content, and the locations of interactive components to the LLM? And how do you dispatch events to interact with the page? I just want to verify if it's ARIA tree like the others, or it's something else.

felarof · 3h ago

Today, we connect to chrome using CDP and use Puppeteer to send clicks and other operations. Also, using browser use DOM tree highlighting, which works great.

To get the page content we parse accessibility tree.

mahoro · 2h ago

This is great, I'd like to test! Is there any recommendations on which ollama models works best with this kind of tasks?

felarof · 2h ago

Qwen3 8B works pretty well. But for complex planning and navigation tasks, big models (GPT4.1, claude 3.7) are the still the best bet. We also let you use your own API keys for the big models.

mahoro · 2h ago

Thanks, eager to try :)

rodolphoarruda · 2h ago

This is the missing piece from Karpathy's keynote: the browser.

felarof · 2m ago

+1. Browser definitely seems like it's on the verge of getting reinvented, either by us or someone else.

OsrsNeedsf2P · 3h ago

Is this only for MacOS? If it's a Chromium fork, what's the reason for no Linux/Windows?

Also what's the business model?

felarof · 3h ago

Yes MacOS for now, but looking into getting Linux binary next.

> what's the reason for no Linux/Windows?

Sorry, just lack of time. Also we use Sparkle for distributing updates, which is MacOS only.

> Also what's the business model?

We are considering an enterprise version of the browser for teams.

gtsop · 4h ago

Are we still tossing around the 10x productivity boost? Please make this stop. I see first commit on April 28 so by 10x productivity its like you've been working on this for almost 2.5 years, and there is still a waiting list on the website.

Appreciate the agplv3 licence, kudos on that.

felarof · 3h ago

Thanks for the feedback.

I get the general sentiment. But cursor for sure has improved productivity by a huge multiplicative factor, especially for simpler stuff (like building chrome extension).

_fw · 3h ago

I’m using Dia a lot for work at the moment and frankly it’s a gamechanger. granted I’m not a developer but being able to interact with an LLM that has access to the page I’m on is extremely useful:

Instead of manually hunting across half a dozen different elements, then copy/paste and retype to put something into a format I want…

I can just get Dia do it. In fact, I can create a shortcut to get it to do it the same way every single time. It’s the first time I’ve used something that actually feels like an extension of the web, instead of a new way to simply act on it at the surface level.

I think the obvious extension of that is agentic browsers. I can’t wait for this to get built to a standard where I can use it every day… But how well is it going to run on my 16GB M1 Pro?

felarof · 3h ago

16GB M1 Pro is good enough to run our browser! You should give it a try!

Download form https://www.nxtscape.ai/ or our github page.

mattigames · 3h ago

If this workflow starts getting any traction this will quickly turn into a cat and mouse game, where companies do their best to make sure those AIs don't work on their websites to make sure humans and humans only watch their websites' ads, their links, their banners and so on.

Google being a big one of those companies would soon side with those companies and not with the users, it's been their modus operandi, just recently some people got threats that if they don't stop using ad blockers in YouTube they will ban them from the platform.

finolex · 2h ago

This is cool! Congrats on launch! How do you store user data? Do you write to device? Curious if there's a basic.tech x nxtscape collab possible here where you can store each user's info to their dedicated PDS

felarof · 2h ago

Thank you! Yeah all user data is just stored locally on device.

Oh cool, will look into basic.tech to understand more.

Babkock · 3h ago

Yeah that's just what we need, more AI shit, more slop slapped on top of Chromium.

anilgulecha · 4h ago

Very interesting approach. Why a browser, and not a fantastic chrome extension? Grouping tabs, summarizing, even taking open ended actions, seem very doable with permissions extensions have..

edit: Just read about the accessibility thing, but that's thin. Is there any usecase in the future that a browser can, but an extension can't?

esafak · 3h ago

It sounds like something that needs to be dealt with in Chromium rather than forked. I am sure lots of developers want such functionality, if it is missing. I found:

https://developer.chrome.com/docs/extensions/ai

Don't any of these fit the bill? Are they Gemini-locked and you want something else? I am not familiar with the Chrome API, so pardon my ignorance.

dataviz1000 · 3h ago

> Is there any usecase in the future that a browser can, but an extension can't?

The only reason to use a browser over a chrome extension is to bypass security features, for example, trusted events. If a user wants the browser window to go to full screen or play a video, a physical mouse click or key press is required. Moreover, some websites do not want to be automated like ChatGPT web console and Chase.com which checks if the event was a trusted event before accepting a button click or key press. This means that a Chrome extension can not automate voice commands inferred with audio to text. However, to get a trusted event only requires the user to press a button, any button, so message or dialog prompt that says, "Press to go full screen," is all that is required. This can be down with a remote bluetooth keyboard also.

The way I see it, these limitations are in place for very, very good reasons and should not be bypassed. Moreover, there are much larger security issues using a agentic browser which is sending entire contents of a bank website or health records in a hospital patient portal to a third party server. It is possible to run OpenAI's whisper on webgpu on a Macbook Pro M3 but most text generation models over 300M will cause it to heat up enough to cook a steak. There are even bigger issues with potential prompt injection attacks from third party websites that know agentic browsers are visiting their sites.

The first step in mitigating these security vulnerabilities is preventing the automation from doing anything a Chrome extension can't already do. The second is blacklisting or opt in only allowing the agents to read and especially to write (fill in form is a write) any webpage without explicit permission. I've started to use VSCode's copilot for command line action and it works with permissions the same way such as only session only access.

I've already solved a lot of the problems associated with using a Chrome extension for agentic browser automation. I really would like to be having this conversation with people.

EDIT: I forgot the most important part. There are 3,500,000,000 Chrome users on Earth. Getting them to install a Chrome extension is much, much easier than getting them to install a new browser.

felarof · 3h ago

Yeah accessibility is one such usecase, but in future we have few other ideaswhere having a fork makes it lot easier. Few ideas:

- Ship a small LLM along with browser - MCP store built in

afeigenbaum · 2h ago

What is on your roadmap?

felarof · 1h ago

Some ideas here - https://nxtscape.feedbear.com/roadmap

feel free to add new or upvote. Want to build what people want :)

revskill · 2h ago

When windows ?

felarof · 2h ago

Hopefully in a month or two. Sorry!

mdaniel · 5m ago

Relevant: https://news.ycombinator.com/item?id=44329787

thisislife2 · 3h ago

I've upvoted to encourage your initiative, but I personally will not support any "AI" software unless it 100% runs locally and supports old platforms and hardware. Otherwise it is nothing but another conduit to get access to, and suck all my personal data for surveillance capitalism.

felarof · 3h ago

> 100% runs locally

Thank you! We have ollama integration already, you can run models locally and use that for AI chat.

a2128 · 2h ago

What models are actually recommended, and how useful is the browser when using them? "We have Ollama integration" isn't very useful when there's no information about which models you should use, what works with them, what doesn't, and honestly it feels disingenuous when projects market themselves as 100% private and local and cloud-free and everything stays on your computer when the intended use case is clearly to put an OpenAI API key and send everything to OpenAI

zahirbmirza · 3h ago

Does it support MP4 playback?

felarof · 2h ago

Yes, it works.

johncole · 3h ago

This has been the most fun Show HN to read in a long time. <grabs popcorn>

Lammy · 3h ago

AOL still have active trademarks for “Netscape” which might trouble you here:

- https://tsdr.uspto.gov/#caseNumber=76017078&caseSearchType=U...

> PROVIDING MULTIPLE-USER ACCESS TO A GLOBAL COMPUTER INFORMATION NETWORK FOR THE TRANSFER AND DISSEMINATION OF A WIDE RANGE OF INFORMATION; ELECTRONIC TRANSMISSION OF DATA, IMAGES, AND DOCUMENTS VIA COMPUTER NETWORKS; [ELECTRONIC MAIL SERVICES; PROVIDING ON-LINE CHAT ROOMS FOR TRANSMISSION OF MESSAGES AMONG COMPUTER USERS CONCERNING A WIDE VARIETY OF FIELDS]

- https://tsdr.uspto.gov/#caseNumber=76017079&caseSearchType=U...

> PROVIDING INFORMATION IN THE FIELD OF COMPUTERS VIA A GLOBAL COMPUTER NETWORK; PROVIDING A WIDE RANGE OF GENERAL INTEREST INFORMATION VIA COMPUTER NETWORKS

- https://tsdr.uspto.gov/#caseNumber=74574057&caseSearchType=U...

> computer software for use in the transfer of information and the conduct of commercial transactions across local, national and world-wide information networks

xp84 · 1h ago

And in case someone is wondering if Yahoo can be said to have abandoned the mark by disuse, it seems like http://isp.netscape.com is still up, so they have their bases covered.

wongarsu · 3h ago

Name derived from Netscape (Firefox's great-grandfather), icon is a red fox, but based on Chrome? Was this originally designed as a Firefox fork or what happened there

mbreese · 2h ago

I can’t see how this project lasts with the current name/logo. As mentioned elsewhere, Netscape is still a trademark, and this is quite confusing between Netscape and Firefox.

ilaksh · 3h ago

Yeah. Regardless, it seems misleading to use that icon with a Chromium fork.

Also the fact that it's AGPL means this project is very copyleft and not compatible with business models.

I'm not saying that there is no place for copyleft open source anymore, but when it's in a clearly commercial project that makes me question the utility of it being open source.

bityard · 3h ago

Being copyleft doesn't mean it's not compatible with business models, it means it's not compatible with exploitative business models.

dotancohen · 3h ago

  > very copyleft and not compatible with business models.

Could you explain this for the rest of us? Thanks.

mattigames · 2h ago

The short answer is that it means that businesses need to publicly share whatever change they do to the code, and that alone is enough deterrent to use it.

abirch · 2h ago

"The GNU Affero General Public License is a modified version of the ordinary GNU GPL version 3. It has one added requirement: if you run a modified program on a server and let other users communicate with it there, your server must also allow them to download the source code corresponding to the modified version running there."

https://www.gnu.org/licenses/why-affero-gpl.html

This means that if this company is successful and sells me 1 license, in theory I can request the source code and spin up Dr Evil's voice 1 billion clones and not pay licenses for those.

With other forms of GPL you only have to release the source code if you release the software to the user.

psychoslave · 2h ago

A business that maintain its customer base captive through any kind of designed technical defect and asymmetrical information distribution is not striving for excellence in customer experience.

Saying that such a behavior encompasses all possible business models, it's like saying directorship is the only form of governance.

monkeywork · 1h ago

Name 3 succesful companies running under such restrictions?

josephcsible · 1h ago

Huh? It's a good thing that it's AGPL. That license explicitly allows commercial use, and only bans proprietary forks/modifications.

taylorius · 3h ago

I'll get voted down, but I hate that cute AI fox, and hope I never see it again.

cjaackie · 1h ago

Agree, but this is bike shedding, like there’s so much more to worry about for these fellas!

doublerabbit · 3h ago

I wish we could stop with the animal "furry" mascots for projects.

It was cute when the internet was cute but now it's just boring.

felarof · 3h ago

Haha, used gpt4o to generate it. What change do you want to see in that fox appearance? Any change should be a prompt away :)

al_borland · 3h ago

Why a fox? The browser is based on Chrome. Firefox basically owns having a fox as a mascot in the browser space. Why not pick something original? The fox confusing at best, but some may say misleading. Same goes for the name.

felarof · 3h ago

Thanks for the feedback. Honestly—we just reused the icon we had gotten professionally designed for the last idea we were working on (https://felafax.ai/).

But not gonna lie, as a tiny startup, we don’t have marketing budget of Perplexity or Dia, so we picked a name and icon that at least hinted at “browser” right away. Definitely not trying to mislead anyone -- just needed something recognizable out of the gate.

Sophira · 2h ago

You stated in the parent comment that you used GPT4o to generate it, but now you're saying you had a professionally made icon? I don't understand.

sevg · 2h ago

It doesn’t look like you reused that icon. It looks like you generated a new one with AI. So it could have been any animal (or not even an animal at all).

esskay · 3h ago

For starters, it shouldn't be using a fox. You know why.

rafram · 3h ago

The text is very off-center and the AI "vibe" is palpable. Hire a designer or at least take the time to add the text to a free SVG yourself.

ilaksh · 3h ago

It makes me question your honesty. If you want a fox logo then build it as a Firefox fork. If you do that I will trust you again.