Activeloop(YC S18)Is Hiring Senior Backend and AI Search Engineer(Mountain View) (careers.activeloop.ai)

1 points by davidbuniat 17d ago 0 comments

Morph (YC S23) Is Hiring a ML Engineer

1 points by bhaktatejas922 17d ago 0 comments

Spark AI (YC W24) Is Hiring a Full Stack Engineer in San Francisco (ycombinator.com)

Show HN: Nxtscape – an open-source agentic browser

190 felarof 147 6/20/2025, 4:35:55 PM github.com ↗

Hi HN - we're Nithin and Nikhil, twin brothers and founders of nxtscape.ai (YC S24). We're building Nxtscape ("next-scape") - an open-source, agentic browser for the AI era.

-- Why bother building a new browser? For the first time since Netscape was released in 1994, it feels like we can reimagine browsers from scratch for the age of AI agents. The web browser of tomorrow might not look like what we have today.

We saw how tools like Cursor gave developers a 10x productivity boost, yet the browser—where everyone else spends their entire workday—hasn't fundamentally changed.

And honestly, we feel like we're constantly fighting the browser we use every day. It's not one big thing, but a series of small, constant frustrations. I'll have 70+ tabs open from three different projects and completely lose my train of thought. And simple stuff like reordering tide pods from amazon or filling out forms shouldn't need our full attention anymore. AI can handle all of this, and that's exactly what we're building.

Here’s a demo of our early version https://dub.sh/nxtscape-demo

-- What makes us different We know others are exploring this space (Perplexity, Dia), but we want to build something open-source and community-driven. We're not a search or ads company, so we can focus on being privacy-first – Ollama integration, BYOK (Bring Your Own Keys), ad-blocker.

Btw we love what Brave started and stood for, but they've now spread themselves too thin across crypto, search, etc. We are laser-focused on one thing: making browsers work for YOU with AI. And unlike Arc (which we loved too but got abandoned), we're 100% open source. Fork us if you don't like our direction.

-- Our journey hacking a new browser To build this, we had to fork Chromium. Honestly, it feels like the only viable path today—we've seen others like Brave (started with electron) and Microsoft Edge learn this the hard way.

We also started with why not just build an extension. But realized we needed more control. Similar to the reason why Cursor forked VSCode. For example, Chrome has this thing called the Accessibility Tree - basically a cleaner, semantic version of the DOM that screen readers use. Perfect for AI agents to understand pages, but you can't use it through extension APIs.

That said, working with the 15M-line C++ chromium codebase has been an adventure. We've both worked on infra at Google and Meta, but Chromium is a different beast. Tools like Cursor's indexing completely break at this scale, so we've had to get really good with grep and vim. And the build times are brutal—even with our maxed-out M4 Max MacBook, a full build takes about 3 hours.

Full disclosure: we are still very early, but we have a working prototype on GitHub. It includes an early version of a "local Manus" style agent that can automate simple web tasks, plus an AI sidebar for questions, and other productivity features (grouping tabs, saving/resuming sessions, etc.).

Looking forward to any and all comments!

You can download the browser from our github page: https://github.com/nxtscape/nxtscape

Comments (147)

varenc · 6m ago

This is quite cool! Very excited for this concept.

some genuine feedback on a frustrating early experience:

- I ran the suggested "Group all my tabs by topic" in productivity agent mode. It worked great. - I then asked it to remove all tab groups and reset things, but was told this:

    This is a browser automation task. Please use **Agent Mode** for web interactions like clicking, filling forms, navigating sites, or extracting web content.

- Tried agent mode and was told:

    This is a productivity task. Please use **Chat Mode** for tab management, bookmarks, sessions, history, and content analysis.

- Basically was being sent back and forth. Went back to productivity mode and argued with it for a bit. The closest I could come to it removing all tabs groups was creating a new tab group encompassing all tabs, but couldn't get it to remove groups entirely. I'm guessing it might lack that API?

Overall, it'd be nice if every browser level action is took had an undo button. Or at least if it was smart enough/able to remove the tab groups it just created.

Will keep playing with it more.

edit1: one more weird issue: While running the chat interface on chrome internal pages, it would randomly browse me google.com for some reason.

edit2: confirmed that agent mode lacks a tool to ungroup tabs, just a tool to create tab groups.

kevinsync · 3h ago

IMO comments so far seem to be not seeing the forest for the trees -- I can imagine incredible value for myself in a browser that hooks into a local LLM, writes everything it sees to a local timestamped database (oversimplification), parses and summarizes everything you interact with (again, oversimplification -- this would be tunable and scriptable), exposes Puppeteer-like functionality that is both scriptable via code and prompt-to-generate-code, helps you map shit out, remember stuff, find forgotten things that are "on the tip of your [digital] tongue", learn what you're interested in (again, local), help proactively filter ads, spam, phishing, bullshit you don't want to see, etc, can be wound up and let go to tackle internet tasks autonomously for (and WITH) you (oversimplification), on and on and on.

Bookmarks don't cut it anymore when you've got 25 years of them saved.

Falling down deep rabbit holes because you landed on an attention-desperate website to check one single thing and immediately got distracted can be reduced by running a bodyguard bot to filter junk out. Those sites create deafening noise that you can squash by telling the bot to just let you know when somebody replies to your comment with something of substance that you might actually want to read.

If it truly works, I can imagine the digital equivalent of a personal assistant + tour manager + doorman + bodyguard + housekeeper + mechanic + etc, that could all be turned off and on with a switch.

Given that the browser is our main portal to the chaos that is internet in 2025, this is not a bad idea! Really depends on the execution, but yeah.. I'm very curious to see how this project (and projects like it) go.

felarof · 3h ago

Thank you so much for your honest feedback. I 100% agree - this is spot on! This is exactly the vision we had.

We spend 90%+ of our time in browsers, yet they're still basically dumb windows. Having an AI assistant that remembers what you visited, clips important articles (remember Evernote web clipper?), saves highlights and makes everything semantically searchable - all running locally - would be game-changing.

Everything stays in a local PostgresDB - your history, highlights, sessions. You can ask "what was that pricing comparison from last month?" or "find my highlights about browser automation" and it just works. Plus built-in self-control features to block distracting sites when you need to focus.

Beyond search and memory, the browser can actually help you work. AI that intelligently groups your tabs ("these 15 are all Chromium research"), automation for grunt work ("compare 2TB hard drive prices across these sites"), or even "summarize all new posts in my Discord servers" - all handled locally. The browser should help us manage internet chaos, not add to it.

Would love to hear what specific workflows are painful for you!

hannob · 7h ago

Okay, maybe this is a stupid question, but: what is an agentic browser? You seem to assume that everyone knows what that means.

Is this a common and well-defined term that people use? I've never heard it.

It would appear to me from the context that it means something like "web browser with AI stuff tackled on".

felarof · 7h ago

Thanks for asking - not a stupid question at all! I should have probably explained it at the top of my post.

By "agentic browser" we basically mean a browser with AI agents that can do web navigation tasks for you. So instead of you manually clicking around to reorder something on Amazon or fill out forms, the AI agent can actually navigate the site and do those tasks.

wild_egg · 7h ago

Not to pull a "why should I use Dropbox when I have rsync" but why should we use this over adding a Playwright MCP to Claude Desktop or similar?

Does having access to Chromium internals give you any super powers over connecting over the Chrome Devtools Protocol?

felarof · 6h ago

Yes, eventually we think there is more value of owning the entire stack than just be a MCP connector.

Few ideas we were thinking of: integrating a small LLM, building MCP store into browser, building a more AI friendly DOM, etc.

Even today, we use chrome's accessibility tree (a better representation of DOM for LLMs) which is not exposed via chrome extension APIs.

pickpuck · 5h ago

> building a more AI friendly DOM

You might consider the Accessibility Tree and its semantics. Plain divs are basically filtered out so you're left with interactive objects and some structural/layout cues.

shortrounddev2 · 5h ago

I would take the position of "why use this when I have eyes and hands and a brain?"

tolerance · 4h ago

My guess is that this is for impatient people; people who think that the prescribed use cases are somehow necessary for their "workflows"; people who subscribe to terms like "cognitive friction" within the context of these use cases; people who are...sort of lazy.

zahlman · 3h ago

...Why do these lazy people put so much effort into coming up with fancy words to justify that laziness?

tolerance · 2h ago

That's a really good question. Maybe it's because laziness is associated with a lack of intellect? And certain technologies, like AI and other software, are meant to augment our intellect.

These fancy words carry an intellectual/productive effect. When they're put to use it probably makes people feel like they're getting things done. And they never feel lazy because of this.

kordlessagain · 6h ago

Agents are LLM responses that are feed with tools, like calculate(expression). When it encounters a thing it needs to do to meet desired output, it will run the tool. That is defining a simple agentic workflow.

A complicated workflow may involve other tools. For example, the input to the LLM may produce something that tells it to set the user-agent to such and such as string:

  set_user_agent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36");

Other tools could be clicking on things in the page, or even injecting custom JavaScript when a page loads.

al_borland · 6h ago

I first heard the term agentic about a month ago. I went from never hearing it, to hearing it 3 or 4 times in 2 days... one of which was on an internal town hall where I work, where leadership was simply using it as if the whole world already knew what it meant, instead of literally being the first time it was ever mentioned.

The tl;dr is that it's AI that makes decisions on its own.

mullingitover · 3h ago

On the one hand: an agentic browser sounds like a cool idea. Being able to automate things with an agent on the client side is crazy powerful.

On the other hand: this has the potential to be an absolute security Chernobyl. A browser is likely to be logged into all your sensitive accounts. An agent in your browser is probably going to be exposed to untrusted inputs from the internet by its very nature.

You have the potential for prompt injection to turn your life upside down in a matter of seconds. I like the concept but I wouldn't touch this thing with a ten foot pole unless everyone in the supply chain was PCI/SOC2/ISO 27001 certified, the whole supply chain has been vetted, and I have blood oaths about its security from third party analysts.

felarof · 2h ago

Thanks for raising this - it's a critical concern and you're absolutely right to be cautious.

This is exactly why we're going local-first and open source. With cloud agents (like Manus.im), you're trusting a black box with your credentials. With local agents, you maintain control:

- Agents only run when you explicitly trigger them

- You see exactly what they're doing in real-time and can stop them

- You can run tasks in separate chrome user profile

- Most importantly: the code is open source, so you can audit exactly what's happening.

econ · 29m ago

Have an agent monitor what is going on and raise dialogs explaining why something is not okay, question the need for something, have email or sms confirmation, extra passwords or bluntly refuse to do the destructive task right now (ask me again in 36 hours) Then, when you have the blood oath and the certifications, it can continue to monitor as an extra layer.

adamoshadjivas · 2h ago

this sounds LLM generated

regardless, you did not answer OPs point, which is that any potentially malicious site can prompt inject you at any point, and trigger an MCP or any other action or whatever before you see them and stop them. The whole point of an AI browser is like self-driving car, being able to de-focus and let it do its thing. If i have to be nervous to watch if im getting hacked at any given second, then it's probably not a great product

felarof · 1h ago

I see, definitely agree that more work is needed in figuring out the right UX here. Probably open a shadow browser with OAuth sessions of only a small subset of sites required for the task?

jklinger410 · 8h ago

What is with Mac users forking Chromium and then only making releases for Mac?

felarof · 8h ago

Haha, was easier to build and we were the first users :)

have linux next on our radar. What build do you want?

ugh123 · 3h ago

I think it makes sense to move fast on a platform you understand and get the featureset solid. Worry about linux, deb packages later

jtolly710 · 6h ago

.deb would be great to see next :)

shortrounddev2 · 5h ago

Windows

doublerabbit · 7h ago

FreeBSD, Haiku, Amiga

felarof · 7h ago

Sounds good! will look into getting linux build.

nmstoker · 2h ago

Ah, didn't see this until now... I just opened an issue requesting this :)

https://github.com/nxtscape/nxtscape/issues/5

lecro · 2h ago

“nxtscape” gives the good old "SCSI" vibes. People just nailed “GPT”. Please consider something you can say in one breath.

Great product though.

deepdarkforest · 5h ago

This is definitely a winners take all market. Kudos for giving it a shot, but imo browser projects are just too big for a team of 2/3. Plus, google has already demoed at IO the first hint at this. IMO you just cannot move fast enough to grab enough market share as a first/second mover before google just does it on chrome, and that's assuming you can outcompete with Dia in the first place. Even browser-use can do this also, and they have good distribution already.

Good luck, but in your place i would at least start with something that a certain ICP needs more. Many, many manhours have been wasted by ambitious technical founders on taking down Chrome. (many also starting from a chrome fork itself). But none of them succeeded. We only have limited energy

felarof · 4h ago

Thanks for the honest feedback!

Definitely agree there is good amount of competition here.

But we do think there is a gap in the market for open-source, community driven and privacy-first AI browser. (Something like Brave?)

deepdarkforest · 2h ago

remember, gaps in the market sometimes exist for a reason. Forget AI. How many open source, community driven and privacy first browsers have made serious money?

Brave is a decent example but their business model is actually complicated, it includes a lot of little stuff. And they dont have the unit cost of LLMs (im assuming at some point you will take the burden of the llms, if not local)

felarof · 1h ago

Good point. Our thinking so far has been to build good open-source product and then offer enterprise version as paid.

Island browser, chrome enterprise have kinda of validated the need for enterprise version of browser with VPN and DLP engine (data-loss-prevention).

throwaway314155 · 2h ago

As long as you don't implement ANY cryptocurrency features or ad-replacement features. I realize you've gotta make money at some point but the current browser landscape is so damn depressing. Even Mozilla Firefox has lost the trust of some of its diminishing userbase.

pnw_throwaway · 31m ago

Feels like yet another solution in search of a problem.

All the same, looks like y’all are having fun working on it, and maybe some unforeseen usecase will bubble up.

zahlman · 3h ago

> And honestly, we feel like we're constantly fighting the browser we use every day. It's not one big thing, but a series of small, constant frustrations. I'll have 70+ tabs open from three different projects and completely lose my train of thought. And simple stuff like reordering tide pods from amazon or filling out forms shouldn't need our full attention anymore. AI can handle all of this, and that's exactly what we're building.

So your thesis is that an AI agent should decide what I pay attention to, rather than me?

What could possibly go wrong?

throwaway314155 · 2h ago

what a fresh new take

ellisd · 5h ago

Why obfuscate the LLM system prompt in your Github repo when it's going to be completely visible in the network inspector?

felarof · 4h ago

Not on purpose, just got compressed during production build through webpack. will get that fixed.

ellisd · 4h ago

Thank you for the clarification!

While reviewing the prompt's capabilities, I had an idea: implementing a Greasemonkey/Userscript-style system, where users could inject custom JavaScript or prompts based on URLs, could be a powerful way to enhance website interactions.

For instance, consider a banking website with a cumbersome data export process that requires extra steps to make the data usable. Imagine being able to add a custom button to their UI (or define a custom MCP function) specifically for that URL, which could automatically pull and format the data into a more convenient format for plain text accounting.

felarof · 4h ago

great idea & usecase! Will look into this.Thanks!

Was huge fan of Tampermonkey back in the days.

ajb · 3h ago

I think there is big value in something that is on the side of the user, in particular in processing user-hostile material. Here are some concrete use cases:

* Buying a sofa. You want to filter for sofas of a specific size, with certain features, marketing sites want to feed you a bunch of marketing slop for each sofa before giving you the details . This generalises to many domains.

* You have a few friends who are still stuck on Facebook, you want to be notified if they post anything and avoid other rubbish

* The local neighborhood is stuck organising in a Facebook group or even worse, nextdoor. You want to see any new posts except for those couple of guys who are always posting the same thing.

* A government consultation website has been put up, but as a hurdle the consultation document has been combinatorially expanded to 763 pages by bureaucratic authoring techniques. You want to undo the combinatorial expansion do you can identify things you actually care about.

8organicbits · 1h ago

> Buying a sofa. You want to filter for sofas of a specific size, with certain features

This jumped out to me as well. Even sites like Amazon lack per-item-cost sorting, which can be really helpful when buying in bulk. Historically we've seen people use scraping and data science to build sites like https://diskprices.com/; without using LLMs. If LLMs are useful for those types of tasks, perhaps we'll see a surge in similar sites instead of end users doing prompt engineering in their browser.

> You want to see any new posts except for those couple of guys who are always posting the same thing.

It looks like nextdoor supports blocking users, although other sites may not.

https://help.nextdoor.com/s/article/block-a-neighbor

felarof · 1h ago

spot on! These are great examples.

awongh · 6h ago

I think LLMs could have a reasonable chance at solving tab-related workflows (keeping track of tabs or the idea/concept of tabs) - that is tracking and sorting lots of small related research ideas.

Sort of like a backwards perplexity search. (LLM context is from open tabs rather than the tool that brings you to those tabs)

I built a tab manager extension a long time ago that people used but ran into the same problem- the concept of tab management runs deeper than just the tabs themselves.

felarof · 6h ago

Yeah, I feel LLMs can finally solve the tab overload issue. I suffer from this constantly.

I added few features which I felt would be useful - easy way to organise and group tabs - simple way to save and resume sessions with selective context.

What are your problems that you would like to see solved?

awongh · 6h ago

I don't like the idea of letting the LLM run wild and categorize things directly, but in a tab-organizing view it would be useful to add more semantic sorting of the tabs- maybe it would enable something like multiple tab-view control panel: Show all the AI tabs. Show all the image diffusion tabs. Show all the LLM tabs. (so overlapping views of sets of tabs)

This would of course apply to not just open tabs but tabs I used to have open, where the LLM knows about my browsing history.

But I think I would want a non-chat interface for this. (of course at any time I could chat/ask a question as well)

psychoslave · 6h ago

I think a large part of it is us, as user, we lake the appropriate discipline.

Resist the call to open in a tab every link in this article, overcome the fear of losing something if all these tabs lagging behind are closed right now without further consideration.

xena · 8h ago

Do you respect robots.txt?

felarof · 7h ago

No, not today.

But wonder if it matter if it the agent is mostly using it for "human" use cases and not scrapping?

dotancohen · 7h ago

Yes it would matter. The AI might be I in your eyes, but it is still A.

debazel · 2h ago

My understanding of this product is that this isn't an automated AI scraper, it's simply helping the user navigate pages they've already navigated to themselves.

If any type of AI based assistance is supposed to adhere to the robot.txt, then would you also say that AI based accessibility tools should refuse to work on pages blocked by robot.txt?

lolinder · 3h ago

So is Chrome. Very artificial. It's still not a robot for the purposes of robots.txt.

What coherent definition of robot excludes Chrome but includes this?

qualeed · 5h ago

There's no reason not to respect it.

If your browser behaves, it's not going to be excluded in robots.txt.

If your browser doesn't behave, you should at least respect robots.txt.

If your browser doesn't behave, and you continue to ignore robots.txt, that's just... shitty.

lolinder · 2h ago

> If your browser behaves, it's not going to be excluded in robots.txt.

No, it's common practice to allow Googlebot and deny all other crawlers by default [0].

This is within their rights when it comes to true scrapers, but it's part of why I'm very uncomfortable with the idea of applying robots.txt to what are clearly user agents. It sets a precedent where it's not inconceivable that we have websites curating allowlists of user agents like they already do for scrapers, which would be very bad for the web.

[0] As just one example: https://www.404media.co/google-is-the-only-search-engine-tha...

qualeed · 1h ago

>clearly user agents

I am not sure I agree with an AI-aided browser, that will scrape sites and aggregate that information, being classified as "clearly" a user agent.

If this browser were to gain traction and ends up being abusive to the web, that's bad too.

Where do you draw the line of crawler vs. automated "user agent"? Is it a certain number of web requests per minute? How are you defining "true scraper"?

lolinder · 1h ago

I draw the line where robotstxt.org (the semi-official home of robots.txt) draws the line [0]:

> A robot is a program that automatically traverses the Web's hypertext structure by retrieving a document, and recursively retrieving all documents that are referenced.

To me "recursive" is key—it transforms the traffic pattern from one that strongly resembles that of a human to one that touches every page on the site, breaks caching by visiting pages humans wouldn't typically, and produces not just a little bit more but orders of magnitude more traffic.

I was persuaded in another subthread that Nxtscape should respect robots.txt if a user issues a recursive request. I don't think it should if the request is "open these 5 subreddits and summarize the most popular links uploaded since yesterday", because the resulting traffic pattern is nearly identical to what I'd have done by hand (especially if the browser implements proper rate limiting, which I believe it should).

[0] https://www.robotstxt.org/faq/what.html

mattigames · 7h ago

What do you mean? This AI cannot scrape multiple links automatically? Like "make a summary of all the recipes linked in this page" kind of stuff? If it can it definitely meets the definition of scraping.

grepexdev · 6h ago

I think what he means is it is not just generally crawling and scraping, and uses a more targeted approach. Equivalent to a user going to each of those sites, just more efficiently.

vharish · 4h ago

I'm guessing that would ideally mean only reading the content the user would otherwise have gone through. I wonder if that's the case and if it's guaranteed.

Maybe some new standards and maybe a user configurable per site permissions may make it better?

I'm curious to see how this will turn out to be.

lolinder · 3h ago

> only reading the content the user would otherwise have gone through.

Why? My user agent is configured to make things easier for me and allow me to access content that I wouldn't otherwise choose to access. Dark mode allows me to read late at night. Reader mode allows me to read content that would otherwise be unbearably cluttered. I can zoom in on small text to better see it.

Should my reader mode or dark mode or zoom feature have to respect robots.txt because otherwise they'd allow me to access content that I would otherwise have chosen to leave alone?

mattigames · 2h ago

Yeah no, nothing of that helps you bypass the ads on their website*, but scraping and summarizing does, so its wildly different for monetization purposes, and in most cases that means the maintainability and survival of any given website.

I know its not completely true, I know reader mode can help you bypass the ads _after_ you already had a peek at the cluttered version, but if you need to go to the next page or something like that you need to disable reader-mode once and so on, so its a very granular ad-blocking while many AI use cases are about bypassing viewing it at all by a human; and the other thing is that reader mode is not very popular so its not a significant threat.

*or other links on their websites, or informative banners, etc

debazel · 2h ago

robots.txt is not there to protect your ad-based business model. It's meant for automated scrapers that recursively retrieve all pages on your website, which this browser is not doing at all. What a user does with a page after it has entered their browser is their own prerogative.

mattigames · 2h ago

>It's meant for automated scrapers that recursively retrieve all pages on your website, _which this browser is not doing at all_

AFAIK this is false, and this browser can do things like "summarize all the cooking recipes linked in this page" and therefore act exactly like a scraper (even if at smaller scale than most scrapers)

If tomorrow magically all phones and all computers had an ad-blocking browser installed -and set as the default browser- a big chunk of the economy would collapse, so while I can see the philosophical value of "What a user does with a page after it has entered their browser is their own prerogative", the pragmatic in me knows that if all users cared about that and enforced it it would have grave repercussions in the livelihood of many.

lolinder · 2h ago

https://www.robotstxt.org/faq/what.html

> A robot is a program that automatically traverses the Web's hypertext structure by retrieving a document, and recursively retrieving all documents that are referenced.

There's nothing recursive about "summarize all the cooking recipes linked on this page". That's a single-level iterative loop.

I will grant that I should alter my original statement: if OP wanted to respect robots.txt when it receives a request that should be interpreted as an instruction to recursively fetch pages, then I'd think that's an appropriate use of robots.txt, because that's not materially different than implementing a web crawler by hand in code.

But that represents a tiny subset of the queries that will go through a tool like this and respecting robots.txt for non-recursive requests would lead to silly outcomes like the browser refusing to load reddit.com [0].

[0] https://www.reddit.com/robots.txt

mattigames · 1h ago

The concept of robots.txt was created in a different time, when nobody envisioned that users would one day use commands written in plain English sentences to interact with websites (including interacting with multiple pages with such commands), so the discussion about if AI browsers should respect it or if they should not is senseless, and instead -if this kind of usage takes off- it would probably make more sense to have a new standard for such use cases, something like AI-browsers.txt to make clear the intent of blocking (or not) AI browsing capabilities.

lolinder · 1h ago

Alright, I think we can agree on that. I'll see you over in that new standardization discussion fighting fiercely for protections to make sure companies don't abuse it to compromise the open web.

lolinder · 2h ago

> I know its not completely true, I know read-mode can help you bypass the ads _after_ you already had a peek at the cluttered version

What about reader mode that is auto-configured to turn on immediately on landing on specific domains? Is that a robot for the purposes of robots.txt?

https://addons.mozilla.org/en-US/firefox/addon/automatic-rea...

And also, just to confirm, I'm to understand that if I'm navigating the internet with an ad blocker then you believe that I should respect robots.txt because my user agent is now a robot by virtue of using an ad blocker?

Is that also true if I browse with a terminal-based browser that simply doesn't render JavaScript or images?

mattigames · 2h ago

If you are using an ad-blocker by definition you are intentionally breaking the intended behavior by the creator of any given website (for personal gain), in that context any discussion about robots.txt or any other behavior that the creator expects is a moot point.

Autoconfig of reader mode and so on its so uncommon that is not even in the radar of most websites, if it was browser developers probably would try to create a solution that satisfies both parties, like putting the ads at the end and required to be text-only and other guidelines, but its not popular, same thing happens with terminal-based browsers, a lot of the most visited websites in the world don't even work without JS enabled.

On the other hand, this AI stuff seems to envision a larger userbase so it could become a concern and therefore the role of robots.txt or other anti-bot features could have some practical connotations.

lolinder · 2h ago

> If you are using an ad-blocker by definition you are intentionally breaking the intended behavior by the creator of any given website (for personal gain), in that context any discussion about robots.txt or any other behavior that the creator expects is a moot point.

I'm not asking if you believe ad blocking is ethical, I got that you don't. I'm asking if it turns my browser into a scraper that should be treated as such, which is an orthogonal question to the ethics of the tool in the first place.

I strongly disagree that user agents of the sort shown in the demo should count as robots. Robots.txt is designed for bots that produce tons of traffic to discourage them from hitting expensive endpoints (or to politely ask them to not scrape at all). I've responded to incidents caused by scraper traffic and this tool will never produce traffic in the same order of magnitude as a problematic scraper.

If we count this as a robot for the purposes of robots.txt we're heading down a path that will end the user agent freedom we've hitherto enjoyed. I cannot endorse that path.

For me the line is simple, and it's the one defined by robotstxt.org [0]: "A robot is a program that automatically traverses the Web's hypertext structure by retrieving a document, and recursively retrieving all documents that are referenced. ... Normal Web browsers are not robots, because they are operated by a human, and don't automatically retrieve referenced documents (other than inline images)."

If the user agent is acting on my instructions and accessing a specific and limited subset of the site that I asked it to, it's not a web scraper and should not be treated as such. The defining feature of a robot is amount of traffic produced, not what my user agent does with the information it pulls.

[0] https://www.robotstxt.org/faq/what.html

xena · 7h ago

You should, because universities are starting to get legal involved due to mass scraping taking down their systems.

No comments yet

lolinder · 2h ago

robotstxt.org [0] is pretty specific in what constitutes a robot for the purposes of robots.txt:

> A robot is a program that automatically traverses the Web's hypertext structure by retrieving a document, and recursively retrieving all documents that are referenced.

This is absolutely not what you are doing, which means what you have here is not a robot. What you have here is a user agent, so you don't need to pay attention to robots.txt.

If what you are doing here counted as robotic traffic, then so would:

* Speculative loading (algorithm guesses what you're going to load next and grabs it for you in advance for faster load times).

* Reader mode (algorithm transforms the website to strip out tons of content that you don't want and present you only with the minimum set of content you wanted to read).

* Terminal-based browsers (do not render images or JavaScript, thus bypassing advertising and according to some justifications leading them to be considered a robot because they bypass monetization).

The fact is that the web is designed to be navigated by a diverse array of different user agents that behave differently. I'd seriously consider imposing rate limits on how frequently your browser acts so you don't knock over a server—that's just good citizenship—but robots.txt is not designed for you and if we act like it is then a lot of dominoes will fall.

[0] https://www.robotstxt.org/faq/what.html

lolinder · 3h ago

This is a user agent and I would be incredibly frustrated if they respected robots.txt. Robots.txt was designed to encourage recursive web crawlers to be respectful. It's specifically not meant to exclude agents that are acting on users' direct requests.

Website operators should not get a say in what kinds of user agents I used to access their sites. Terminal? Fine. Regular web browser? Okay. AI powered web browser? Who cares. The strength of the web lies in the fact that I can access it with many different kinds of tools depending on my use case, and we cannot sacrifice that strength on the altar of hatred of AI tools.

Down that road lies disaster, with the Play Integrity API being just the tip of the iceberg.

https://www.robotstxt.org/faq/what.html

8organicbits · 2h ago

> simple stuff like reordering tide pods from amazon [..] shouldn't need our full attention anymore

There's a straw man here. If you want to reorder an item on Amazon: click on 'order history', scroll, and click buy. This is a well-optimized path already and it doesn't require your full attention. I suspect the agent approach takes more effort as you need to type and then monitor what the AI is doing.

felarof · 1h ago

Hmm, there are similar use-cases which I would much prefer to offload to AI rather than me spending time on it -- "compare prices of 2tb sandisk ssd hard drive across amazon, walmart and b&h"

8organicbits · 6m ago

That seems like a completely different thing, but I agree that would make a much better demo. Is that something nxtscape can do?

b0a04gl · 8h ago

so agents can control tabs, forms, clicks—like a real user would.so what about undo. if an agent clicks the wrong thing, how do you roll that back without reloading the world?

felarof · 8h ago

There is a big red button to always stop the agent.

deepdarkforest · 5h ago

you will run into the same problems the extension approaches have (Like nanobrowser etc). Which is if i have to supervise constantly for non reversible actions, then im no more efficient(actually less i would argue) than just doing the task myself. A human in the loop- pause just before a non reversible action asking for approval maybe? And the user can see all the current things that need approval. But a stop button works just too late imo

dvt · 4h ago

Suffers from the same problem as all other AI "workflows"--no one wants to fucking chat with a computer (Brave sort-of does this, and it's god-awful). A chat interface should only be used as a fallback if the agent is too dumb to figure out what I want.

A chat interface works for ChatGPT because most folks use it as a pseudo-search, but productivity tools are (broadly speaking) not generative, therefore shouldn't be using freeform inputs. I have many thoughts on fixing this, and it's a very hard problem, but simply slapping an LLM onto Chrome is just lazy. I don't mean to be overly negative, but it's kind of wild to see YC funding slop like this.

And that's exactly what this is: slop. There's no technical creativity here, this isn't a new product segment, it barely deserves the "hey bro, this might be a feature, not a product" startup 101 criticism. It's what ChatGPT would spit out if you asked it what a good startup idea would be in 2025. All we need to do, even if we were being as charitable as possible, is ask who's doing the heavy lifting here (hint: it's not in the Github repo).

brulard · 3h ago

What do you see as an alternative? This is exactly what people need. To give quick instructions and have agent work on tasks across webpages/webapps. You say you "have many thoughts on fixing this". Can you share a better vision?

dvt · 3h ago

Yes, it involves "recipes" for different "action types" based on a few things (past behavior, sensible "default" behavior, the context, etc.). For example, when looking at a storefront, the agent might want to give a few options (off the top of my head):

    - It knows you're low on milk, so it suggests buying milk
    - It can get a list of all deals, maybe even cross-checking with other storefronts
    - It knows your sister was looking for a planter, and it found a particularly cheap one, so it suggests texting it to her
    - Etc.
    - (You can of course still chat with the thing, if you so desire)

These recipes are difficult to come up with and hard to generalize (they also need to be categorized, and likely accurately picked from a [RAG?] database), but imo this is the future if AI agents, not yet another chatbox.

doctorpangloss · 4h ago

Like hundreds of millions of people want to chat with a computer. At least.

dvt · 3h ago

This is simply not true in the productivity space (which is the context here). People just want to get shit done.

gtsop · 8h ago

Are we still tossing around the 10x productivity boost? Please make this stop. I see first commit on April 28 so by 10x productivity its like you've been working on this for almost 2.5 years, and there is still a waiting list on the website.

Appreciate the agplv3 licence, kudos on that.

felarof · 8h ago

Thanks for the feedback.

I get the general sentiment. But cursor for sure has improved productivity by a huge multiplicative factor, especially for simpler stuff (like building chrome extension).

kerv · 5h ago

This is cool thanks for sharing

What is the tech around the thing that segments out DOM elements automatically and shows the visual representation. I think something like this would be great for automated UI testing agents?

felarof · 4h ago

Oh that's from browser use buildDOMTree.js. check it out on their github.

lxe · 8h ago

Before I dive into the source code... how do you pass the page content, and the locations of interactive components to the LLM? And how do you dispatch events to interact with the page? I just want to verify if it's ARIA tree like the others, or it's something else.

felarof · 8h ago

Today, we connect to chrome using CDP and use Puppeteer to send clicks and other operations. Also, using browser use DOM tree highlighting, which works great.

To get the page content we parse accessibility tree.

mahoro · 7h ago

This is great, I'd like to test! Is there any recommendations on which ollama models works best with this kind of tasks?

felarof · 7h ago

Qwen3 8B works pretty well. But for complex planning and navigation tasks, big models (GPT4.1, claude 3.7) are the still the best bet. We also let you use your own API keys for the big models.

mahoro · 6h ago

Thanks, eager to try :)

OsrsNeedsf2P · 7h ago

Is this only for MacOS? If it's a Chromium fork, what's the reason for no Linux/Windows?

Also what's the business model?

jacobsenscott · 2h ago

> Also what's the business model?

The hype cycle business model never changes.

felarof · 7h ago

Yes MacOS for now, but looking into getting Linux binary next.

> what's the reason for no Linux/Windows?

Sorry, just lack of time. Also we use Sparkle for distributing updates, which is MacOS only.

> Also what's the business model?

We are considering an enterprise version of the browser for teams.

_fw · 7h ago

I’m using Dia a lot for work at the moment and frankly it’s a gamechanger. granted I’m not a developer but being able to interact with an LLM that has access to the page I’m on is extremely useful:

Instead of manually hunting across half a dozen different elements, then copy/paste and retype to put something into a format I want…

I can just get Dia do it. In fact, I can create a shortcut to get it to do it the same way every single time. It’s the first time I’ve used something that actually feels like an extension of the web, instead of a new way to simply act on it at the surface level.

I think the obvious extension of that is agentic browsers. I can’t wait for this to get built to a standard where I can use it every day… But how well is it going to run on my 16GB M1 Pro?

felarof · 7h ago

16GB M1 Pro is good enough to run our browser! You should give it a try!

Download form https://www.nxtscape.ai/ or our github page.

mattigames · 7h ago

If this workflow starts getting any traction this will quickly turn into a cat and mouse game, where companies do their best to make sure those AIs don't work on their websites to make sure humans and humans only watch their websites' ads, their links, their banners and so on.

Google being a big one of those companies would soon side with those companies and not with the users, it's been their modus operandi, just recently some people got threats that if they don't stop using ad blockers in YouTube they will ban them from the platform.

anilgulecha · 8h ago

Very interesting approach. Why a browser, and not a fantastic chrome extension? Grouping tabs, summarizing, even taking open ended actions, seem very doable with permissions extensions have..

edit: Just read about the accessibility thing, but that's thin. Is there any usecase in the future that a browser can, but an extension can't?

dataviz1000 · 7h ago

> Is there any usecase in the future that a browser can, but an extension can't?

The only reason to use a browser over a chrome extension is to bypass security features, for example, trusted events. If a user wants the browser window to go to full screen or play a video, a physical mouse click or key press is required. Moreover, some websites do not want to be automated like ChatGPT web console and Chase.com which checks if the event was a trusted event before accepting a button click or key press. This means that a Chrome extension can not automate voice commands inferred with audio to text. However, to get a trusted event only requires the user to press a button, any button, so message or dialog prompt that says, "Press to go full screen," is all that is required. This can be down with a remote bluetooth keyboard also.

The way I see it, these limitations are in place for very, very good reasons and should not be bypassed. Moreover, there are much larger security issues using a agentic browser which is sending entire contents of a bank website or health records in a hospital patient portal to a third party server. It is possible to run OpenAI's whisper on webgpu on a Macbook Pro M3 but most text generation models over 300M will cause it to heat up enough to cook a steak. There are even bigger issues with potential prompt injection attacks from third party websites that know agentic browsers are visiting their sites.

The first step in mitigating these security vulnerabilities is preventing the automation from doing anything a Chrome extension can't already do. The second is blacklisting or opt in only allowing the agents to read and especially to write (fill in form is a write) any webpage without explicit permission. I've started to use VSCode's copilot for command line action and it works with permissions the same way such as only session only access.

I've already solved a lot of the problems associated with using a Chrome extension for agentic browser automation. I really would like to be having this conversation with people.

EDIT: I forgot the most important part. There are 3,500,000,000 Chrome users on Earth. Getting them to install a Chrome extension is much, much easier than getting them to install a new browser.

esafak · 8h ago

It sounds like something that needs to be dealt with in Chromium rather than forked. I am sure lots of developers want such functionality, if it is missing. I found:

https://developer.chrome.com/docs/extensions/ai

Don't any of these fit the bill? Are they Gemini-locked and you want something else? I am not familiar with the Chrome API, so pardon my ignorance.

felarof · 8h ago

Yeah accessibility is one such usecase, but in future we have few other ideaswhere having a fork makes it lot easier. Few ideas:

- Ship a small LLM along with browser - MCP store built in

zahirbmirza · 8h ago

Does it support MP4 playback?

felarof · 6h ago

Yes, it works.

zahirbmirza · 4h ago

cool, some chromium forks forget to add the support. Thanks

Babkock · 7h ago

Yeah that's just what we need, more AI shit, more slop slapped on top of Chromium.

finolex · 7h ago

This is cool! Congrats on launch! How do you store user data? Do you write to device? Curious if there's a basic.tech x nxtscape collab possible here where you can store each user's info to their dedicated PDS

felarof · 7h ago

Thank you! Yeah all user data is just stored locally on device.

Oh cool, will look into basic.tech to understand more.

ugh123 · 5h ago

That tab grouping capability is leaps and bounds better than the junk Google put into Chrome for "AI suggested" grouping.

felarof · 5h ago

+1, I use AI tab grouping feature in our browser quite a lot.

afeigenbaum · 6h ago

What is on your roadmap?

felarof · 6h ago

Some ideas here - https://nxtscape.feedbear.com/roadmap

feel free to add new or upvote. Want to build what people want :)

revskill · 7h ago

When windows ?

felarof · 6h ago

Hopefully in a month or two. Sorry!

mdaniel · 4h ago

Relevant: https://news.ycombinator.com/item?id=44329787

thisislife2 · 7h ago

I've upvoted to encourage your initiative, but I personally will not support any "AI" software unless it 100% runs locally and supports old platforms and hardware. Otherwise it is nothing but another conduit to get access to, and suck all my personal data for surveillance capitalism.

felarof · 7h ago

> 100% runs locally

Thank you! We have ollama integration already, you can run models locally and use that for AI chat.

a2128 · 7h ago

What models are actually recommended, and how useful is the browser when using them? "We have Ollama integration" isn't very useful when there's no information about which models you should use, what works with them, what doesn't, and honestly it feels disingenuous when projects market themselves as 100% private and local and cloud-free and everything stays on your computer when the intended use case is clearly to put an OpenAI API key and send everything to OpenAI

rodolphoarruda · 6h ago

This is the missing piece from Karpathy's keynote: the browser.

felarof · 4h ago

+1. Browser definitely seems like it's on the verge of getting reinvented, either by us or someone else.

johncole · 7h ago

This has been the most fun Show HN to read in a long time. <grabs popcorn>

Aldipower · 3h ago

The README has 300MB of GIF animations embedded. WTF :-D Was this AI generated?

Lammy · 7h ago

AOL still have active trademarks for “Netscape” which might trouble you here:

- https://tsdr.uspto.gov/#caseNumber=76017078&caseSearchType=U...

> PROVIDING MULTIPLE-USER ACCESS TO A GLOBAL COMPUTER INFORMATION NETWORK FOR THE TRANSFER AND DISSEMINATION OF A WIDE RANGE OF INFORMATION; ELECTRONIC TRANSMISSION OF DATA, IMAGES, AND DOCUMENTS VIA COMPUTER NETWORKS; [ELECTRONIC MAIL SERVICES; PROVIDING ON-LINE CHAT ROOMS FOR TRANSMISSION OF MESSAGES AMONG COMPUTER USERS CONCERNING A WIDE VARIETY OF FIELDS]

- https://tsdr.uspto.gov/#caseNumber=76017079&caseSearchType=U...

> PROVIDING INFORMATION IN THE FIELD OF COMPUTERS VIA A GLOBAL COMPUTER NETWORK; PROVIDING A WIDE RANGE OF GENERAL INTEREST INFORMATION VIA COMPUTER NETWORKS

- https://tsdr.uspto.gov/#caseNumber=74574057&caseSearchType=U...

> computer software for use in the transfer of information and the conduct of commercial transactions across local, national and world-wide information networks

xp84 · 5h ago

And in case someone is wondering if Yahoo can be said to have abandoned the mark by disuse, it seems like http://isp.netscape.com is still up, so they have their bases covered.

wongarsu · 8h ago

Name derived from Netscape (Firefox's great-grandfather), icon is a red fox, but based on Chrome? Was this originally designed as a Firefox fork or what happened there

mbreese · 6h ago

I can’t see how this project lasts with the current name/logo. As mentioned elsewhere, Netscape is still a trademark, and this is quite confusing between Netscape and Firefox.

ilaksh · 7h ago

Yeah. Regardless, it seems misleading to use that icon with a Chromium fork.

Also the fact that it's AGPL means this project is very copyleft and not compatible with business models.

I'm not saying that there is no place for copyleft open source anymore, but when it's in a clearly commercial project that makes me question the utility of it being open source.

bityard · 7h ago

Being copyleft doesn't mean it's not compatible with business models, it means it's not compatible with exploitative business models.

dotancohen · 7h ago

  > very copyleft and not compatible with business models.

Could you explain this for the rest of us? Thanks.

mattigames · 7h ago

The short answer is that it means that businesses need to publicly share whatever change they do to the code, and that alone is enough deterrent to use it.

abirch · 7h ago

"The GNU Affero General Public License is a modified version of the ordinary GNU GPL version 3. It has one added requirement: if you run a modified program on a server and let other users communicate with it there, your server must also allow them to download the source code corresponding to the modified version running there."

https://www.gnu.org/licenses/why-affero-gpl.html

This means that if this company is successful and sells me 1 license, in theory I can request the source code and spin up Dr Evil's voice 1 billion clones and not pay licenses for those.

With other forms of GPL you only have to release the source code if you release the software to the user.

psychoslave · 6h ago

A business that maintain its customer base captive through any kind of designed technical defect and asymmetrical information distribution is not striving for excellence in customer experience.

Saying that such a behavior encompasses all possible business models, it's like saying directorship is the only form of governance.

monkeywork · 6h ago

Name 3 succesful companies running under such restrictions?

josephcsible · 6h ago

Huh? It's a good thing that it's AGPL. That license explicitly allows commercial use, and only bans proprietary forks/modifications.

taylorius · 8h ago

I'll get voted down, but I hate that cute AI fox, and hope I never see it again.

cjaackie · 5h ago

Agree, but this is bike shedding, like there’s so much more to worry about for these fellas!

doublerabbit · 7h ago

I wish we could stop with the animal "furry" mascots for projects.

It was cute when the internet was cute but now it's just boring.

felarof · 7h ago

Haha, used gpt4o to generate it. What change do you want to see in that fox appearance? Any change should be a prompt away :)

al_borland · 7h ago

Why a fox? The browser is based on Chrome. Firefox basically owns having a fox as a mascot in the browser space. Why not pick something original? The fox confusing at best, but some may say misleading. Same goes for the name.

felarof · 7h ago

Thanks for the feedback. Honestly—we just reused the icon we had gotten professionally designed for the last idea we were working on (https://felafax.ai/).

But not gonna lie, as a tiny startup, we don’t have marketing budget of Perplexity or Dia, so we picked a name and icon that at least hinted at “browser” right away. Definitely not trying to mislead anyone -- just needed something recognizable out of the gate.

Sophira · 6h ago

You stated in the parent comment that you used GPT4o to generate it, but now you're saying you had a professionally made icon? I don't understand.

sevg · 6h ago

It doesn’t look like you reused that icon. It looks like you generated a new one with AI. So it could have been any animal (or not even an animal at all).

esskay · 7h ago

For starters, it shouldn't be using a fox. You know why.

rafram · 7h ago

The text is very off-center and the AI "vibe" is palpable. Hire a designer or at least take the time to add the text to a free SVG yourself.

ilaksh · 7h ago

It makes me question your honesty. If you want a fox logo then build it as a Firefox fork. If you do that I will trust you again.

Qfex (YC X25) – Back End Engineer for a 24/7 Stock Exchange (ycombinator.com)

Flowspace (YC S17) Is Hiring Software Engineers (flowspace.applytojob.com)

Attimet (YC F24) – Quant Trading Research Lab – Is Hiring Founding Engineer (ycombinator.com)

Jiga (YC W21) Is Hiring Software Engs to Make Life of Mech Engs Easier (workatastartup.com)

Foundry (YC F24) Hiring Early Engineer to Build Web Agent Infrastructure (ycombinator.com)

Blaze (YC S24) Is Hiring (ycombinator.com)

Infracost (YC W21) is hiring software engineers (GMT+2 to GMT-6) (infracost.io)

Solidroad (YC W25) Is Hiring (solidroad.com)

Kyber (YC W23) Is Hiring a Technical Account Manager (ycombinator.com)

Roundtable (YC S23) Is Hiring a President / CRO (ycombinator.com)

Roame (YC S23) Is Hiring (ycombinator.com)

GauntletAI (YC S17): All expenses paid AI training and guaranteed $200k+ job (gauntletai.com)

SchemeFlow (YC S24) Is Hiring an Engineer (London) to Speed Up Construction (ycombinator.com)

Shaped (YC W22) Is Hiring (ycombinator.com)

Spice Data (YC S19) is hiring a software engineer – back end (ycombinator.com)

Onlook (YC W25) Is Hiring an engineer in SF

OneText (YC W23) Is Hiring a DevOps/DBA Lead Engineer (jobs.ashbyhq.com)

Gander (YC F24) Is Hiring Founding Engineers and Interns (ycombinator.com)

Ziina (YC W21) the Series A fintech is hiring product engineers (ziina.notion.site)

Onyx (YC W24) – AI Assistants for Work Hiring Founding AE (ycombinator.com)

Great Question (YC W21) Is Hiring a Director of Customer Success (ycombinator.com)

Deepnote (YC S19) is hiring engineers to build an AI-powered data notebook (deepnote.com)

Converge (YC S23) Well-capitalized New York startup seeks product developers (runconverge.com)

CircuitHub (YC W12) is hiring full-stack robotics engineers (workatastartup.com)

AtoB (YC S20) – Stripe for Transportation – is hiring engineers (jobs.ashbyhq.com)

PromptArmor (YC W24) Is Hiring in San Francisco (ycombinator.com)

Depot (YC W23) is hiring an enterprise support engineer (UK/EU) (ycombinator.com)

Patched (YC S24) Is Hiring SWEs in Singapore (ycombinator.com)

Activeloop(YC S18)Is Hiring Senior Backend and AI Search Engineer(Mountain View) (careers.activeloop.ai)

Morph (YC S23) Is Hiring a ML Engineer

Spark AI (YC W24) Is Hiring a Full Stack Engineer in San Francisco (ycombinator.com)

Show HN: Nxtscape – an open-source agentic browser

Comments (147)