Launch HN: MindFort (YC X25) – AI agents for continuous pentesting
Here's a demo: https://www.loom.com/share/e56faa07d90b417db09bb4454dce8d5a
Security testing today is increasingly challenging. Traditional scanners generate 30-50% false positives, drowning engineering teams in noise. Manual penetration testing happens quarterly at best, costs tens of thousands per assessment, and takes weeks to complete. Meanwhile, teams are shipping code faster than ever with AI assistance, but security reviews have become an even bigger bottleneck.
All three of us encountered this problem from different angles. Brandon worked at ProjectDiscovery building the Nuclei scanner, then at NetSPI (one of the largest pen testing firms) building AI tools for testers. Sam was a senior engineer at Salesforce leading security for Tableau, where he dealt firsthand with juggling security findings and managing remediations. Akul did his master's on AI and security, co-authored papers on using LLMs for security attacks, and participated in red teams at OpenAI and Anthropic.
We all realized that AI agents were going to fundamentally change security testing, and that the wave of AI-generated code would need an equally powerful solution to keep it secure.
We've built AI agents that perform reconnaissance, exploit vulnerabilities, and suggest patches—similar to how a human penetration tester works. The key difference from traditional scanners is that our agents validate exploits in runtime environments before reporting them, reducing false positives.
We use multiple foundational models orchestrated together. The agents perform recon to understand the attack surface, then use that context to inform testing strategies. When they find potential vulnerabilities, they spin up isolated environments to validate exploitation. If successful, they analyze the codebase to generate contextual patches.
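Roughly, the loop looks like this (an illustrative sketch only; the names are stand-ins, not our actual agent code or API):

    # Illustrative sketch of the recon -> test -> validate loop; all names are stubs.
    from dataclasses import dataclass

    @dataclass
    class Finding:
        endpoint: str
        vuln_class: str
        validated: bool = False

    def recon(target: str) -> list[str]:
        # Stub: map the attack surface (crawl pages, enumerate APIs, auth flows).
        return [f"{target}/login", f"{target}/search"]

    def exploit_in_sandbox(endpoint: str, vuln_class: str) -> bool:
        # Stub: attempt a non-destructive proof of concept in an isolated environment.
        return False

    def run_assessment(target: str) -> list[Finding]:
        findings = []
        for endpoint in recon(target):
            for vuln_class in ("sqli", "xss", "idor"):  # context-informed test strategies
                if exploit_in_sandbox(endpoint, vuln_class):
                    findings.append(Finding(endpoint, vuln_class, validated=True))
        return findings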
What makes this different from existing tools?

- Validation through exploitation: we don't just pattern-match; we exploit vulnerabilities to prove they're real.

- Codebase integration: the agents understand your code structure to find complex logic bugs and suggest appropriate fixes.

- Continuous operation: instead of point-in-time assessments, we're constantly testing as your code evolves.

- Attack chain discovery: the agents can find multi-step vulnerabilities that require chaining different issues together (see the sketch after this list).
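On attack chains specifically, here's a hypothetical example of what a chained finding looks like. This is a textbook SSRF-to-cloud-credentials chain, not a real customer result:

    # Hypothetical illustration of a multi-step chain; each step's output feeds the next.
    from dataclasses import dataclass

    @dataclass
    class ChainStep:
        technique: str
        gives: str  # capability gained by this step

    attack_chain = [
        ChainStep("SSRF in image-fetch endpoint", "requests from the app's network position"),
        ChainStep("Read cloud metadata service", "temporary IAM credentials"),
        ChainStep("Call internal admin API with those credentials", "privileged data access"),
    ]

    # Each step alone might be rated low severity; the validated chain is critical.
    for i, step in enumerate(attack_chain, 1):
        print(f"step {i}: {step.technique} -> {step.gives}")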
We're currently in early access, working with initial partners to refine the platform. Our agents are already finding vulnerabilities that other tools miss and scoring well on penetration testing benchmarks.
Looking forward to your thoughts and comments!
A few questions:
On your site it says, "MindFort can assess 1 or 100,000 page web apps seamlessly. It can also scale dynamically as your applications grow."
Can you provide more color on what that really means? If I actually asked you to assess 100,000 pages, what would happen? Is it possible for my usage to block/brown-out another customer's usage?
I'm also curious what happens if the system does detect a vulnerability. Is there any chance the bot does something dangerous with, e.g., its newly discovered escalated privileges?
Thanks and good luck!
Regarding scale: we absolutely can assess at that scale, but it would require a large enterprise contract upfront, as we would need to secure the required capacity from our providers.
The system is designed to test exploitation safely, not to perform destructive testing. It will traverse as far as it can, but it won't break anything along the way.
Point it at a publicly available webapp? Run it locally against dev? Do I self-host it and continually run against staging as it's updated?
How do your agents decide a suspected issue is a validated vulnerability, and what measured false-positive/false-negative rates can you share?
How is customer code and data isolated and encrypted throughout reconnaissance, exploitation, and patch generation (e.g., single-tenant VPC, data-retention policy)?
Do the agents ever apply patches automatically, or is human review required—and how does the workflow integrate with CI/CD to prevent regressions?
Ty!
The agents home in on a potential vulnerability by looking at different signals during testing, then build a POC to validate it based on the context. We don't have any data to share publicly yet, but we are working on releasing benchmarks soon.
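For a sense of what "build a POC to validate" means in the simplest case, here's a hypothetical example for a suspected reflected XSS (the URL and parameter are made up). The finding only gets reported if the injected marker comes back unencoded:

    # Illustrative only: confirm a suspected reflected XSS with a harmless marker.
    import secrets
    import requests

    def validate_reflected_xss(url: str, param: str) -> bool:
        marker = secrets.token_hex(8)              # unique, so we only match our own input
        payload = f"<mfpoc>{marker}</mfpoc>"
        resp = requests.get(url, params={param: payload}, timeout=10)
        # Counted as validated only if the payload is reflected without encoding.
        return payload in resp.text

    print(validate_reflected_xss("https://staging.example.com/search", "q"))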
Everything runs in a private VPC, and data is encrypted in transit and at rest. We have zero-data-retention agreements with our vendors, and we offer single-tenant and private-cloud deployments for customers. We don't retain any customer code once we finish processing it, only the vulnerability data. We are also in the process of obtaining our SOC 2.
Patches are not auto-applied. We can either open a PR for human review or add the necessary changes to a Linear/Jira ticket. We have the ability to schedule assessments in our platform, and we are working on a way to integrate more deeply with CI/CD.
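As a rough example of the CI/CD direction (purely hypothetical; it assumes an exported findings.json from a scheduled assessment, and the field names are assumptions), a gate could fail the pipeline on validated high-severity findings:

    # Hypothetical CI gate over an exported findings.json; field names are assumptions.
    import json
    import sys

    BLOCKING = {"critical", "high"}

    def main(path: str = "findings.json") -> int:
        with open(path) as fh:
            findings = json.load(fh)
        blocking = [f for f in findings
                    if f.get("validated") and f.get("severity") in BLOCKING]
        for f in blocking:
            print(f"[BLOCK] {f['severity']}: {f['title']} ({f['endpoint']})")
        return 1 if blocking else 0

    if __name__ == "__main__":
        sys.exit(main())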
Even the booters market themselves as "legitimate stress testing tools for enterprise".
We want to solve the entire vulnerability lifecycle problem, not just find zero-days. MindFort covers detection, validation, and triage/scoring, all the way through patching the vulnerability. While we are starting with web apps, we plan to expand to the rest of the attack surface soon.
How about pen-testing a black box?
Is the list of potential vulnerabilities generated by matching publicly disclosed vulnerabilities against the framework versions in the target's software stack?
I am new to LLMs, or any ML for that matter. Congrats on your launch.
Great question. It is not required, but we recommend it. If you don't include the source code, it would be a black-box test; the agents won't know what the app looks like from the other side.
The agents identify vulns using known attack patterns, novel techniques, and threat intelligence.