Show HN: Xbow raised $117M to build AI hackers, I open-sourced it for free

79 points by ahmedallam2 | 13 comments | 8/18/2025, 8:43:13 PM | github.com ↗

Comments (13)

tptacek · 48m ago
This is a neat project; I don't know why you'd want to set it up with this comparison to Xbow. As someone who works (worked? I'm non-ironically still trying to figure out if I belong in this space post-AI!) in this space and knows some of the actors, I'm pretty sure there's more to Xbow than ~1800 lines of prompts. For instance, this is your RCE template prompt:

https://github.com/usestrix/strix/blob/main/strix/prompts/vu...

... and this is great, I'm not dunking, but pretty basic?

We just had the DARPA AIxCC results come in, and those systems are (1) open source, (2) presumably simpler/less polished than Xbow (some of the authors will be quick to tell you that they're doing PoC work, not product development), and (3) still more complicated than this.

Again, to be super clear: I think there's a huge amount of potential in building something like this up. Nessus was much simpler than ISS when it first shipped, but you'd rather be Nessus than an ISS scanner developer! I'm just: why set this bar for your project?

Best of luck with this!

thegeomaster · 43m ago
Seems heavily vibe coded, down to the Claude-generated README and a lot of the LLM prompts themselves (which I have found works very poorly compared to human-written prompts). While none of this is necessarily bad, it requires a higher burden of proof that it actually works beyond toy problems [0]. I think everyone would appreciate some examples of vulnerabilities it can find. The missing JWT check showcased in the screenshot would've probably been caught with ordinary AI code review, so to my eye that by itself is not persuasive.

Good luck!

[0]: Why I say this --- a 10kLOC piece of software that was mostly human-written would require a large amount of testing, even manual, to ensure that it works reliably at all. All this testing and experimentation would naturally force a certain depth of exploration of the approach, the LLM prompts, etc. across a variety of use cases. A mostly AI-written codebase of this size would've required much less testing to get it to "doesn't crash and runs reliably", so that depth is no longer a given.
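[A sketch of the bug class mentioned above, for readers unfamiliar with it. The missing-JWT-check issue is a handler that decodes a token's payload but never verifies its signature, so any client can forge claims. The function names and secret below are hypothetical, stdlib-only illustrations, not code from Strix or its screenshot:]

```python
import base64
import hashlib
import hmac
import json

SECRET = b"server-secret"  # hypothetical HS256 signing key


def b64url_decode(part: str) -> bytes:
    """Decode base64url with the padding JWTs strip off."""
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def get_user_vulnerable(token: str) -> str:
    """Vulnerable: reads the payload, never checks the signature,
    so a forged token with sub=admin is accepted as admin."""
    _header, payload, _sig = token.split(".")
    return json.loads(b64url_decode(payload))["sub"]


def get_user_checked(token: str) -> str:
    """Fixed: recompute the HMAC over header.payload and compare
    in constant time before trusting any claim."""
    header, payload, sig = token.split(".")
    expected = (
        base64.urlsafe_b64encode(
            hmac.new(SECRET, f"{header}.{payload}".encode(), hashlib.sha256).digest()
        )
        .rstrip(b"=")
        .decode()
    )
    if not hmac.compare_digest(expected, sig):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(payload))["sub"]
```

[The point being: this pattern is visible from a single diff, which is why plain AI code review tends to catch it too.]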

waihtis · 1h ago
The joke is that Xbow only works because they have close to 100 employees operating the software
_pdp_ · 1h ago
You are joking, but there was actually a very popular enterprise SAST tool that used to offer a "cloud" version of their software. It worked by having someone from their team manually download the zip file of your code, run it through their desktop software, and then upload the results back to make them visible in the web portal.
ericmcer · 1h ago
That's a totally valid and useful way to validate an idea. After a few months of manual labor they will have a good idea of how/what to build and if it is even worth building.
tptacek · 46m ago
It is if you can keep a baseline level of quality uniform across both your customers and each of your customers' projects. It's less OK if the human-assisted output is a loss leader you burn on the pilot project, the first couple of projects, or high-profile customers.

There's nothing fundamentally bad about having Oompa Loompas behind the scenes, as long as you're honest about the outcomes you can provide.

I agree, though: also a very sensible way to prioritize development work.

ai-christianson · 1h ago
Classic thing that doesn't scale.
0cf8612b2e1e · 1h ago
That seems like something that totally scales? Just requires some GUI automation (which can be quite finicky, so good to have a manual backup).
codys · 1h ago
Unless the lack of real-time (or consistent-time-to) results drives down interest in the cloud version, or, instead of driving down interest, makes it appear that people want something different from what they'd want if results arrived consistently or faster.

Still could be worth doing a bit of manual work like this, but it's worth being cautious about drawing conclusions from it.

Steeeve · 38m ago
There's a reason Amazon's Mechanical Turk exists.
tptacek · 55m ago
I know who you're talking about, but also: this is the joke about basically every hosted SAST and DAST tool. I call it the "Oompa Loompa" model of security products.
guhcampos · 1h ago
"XBOW is an AI-powered penetration testing platform that delivers human-level security testing at machine speed."

At least they're not lying right? It's just people using computers.

armanj · 1h ago
Took a while to notice it's xbow and not xbox