Show HN: Driftcop – Open-source CLI SAST for "MCP rug pull attacks in AI Agents"

2 vinaypanghal 0 8/9/2025, 3:37:01 AM github.com ↗
Hi HN! We just open-sourced Driftcop, a security tool for people building AI agents with external tools via MCP. Driftcop continuously checks that the tools your AI agent relies on haven’t changed or drifted in unsafe ways. The motivation came from recent findings that AI agents can be quietly compromised via their tools – e.g. a tool that was useful and benign yesterday could auto-update into something malicious today (this is known as a rug pull attack in the MCP context)

Anthropic’s MCP (Model Context Protocol) makes it easy to plug tools into LLMs, but it lacks built-in security checks – in fact, MCP servers can suffer from issues like command injection, permission reuse, and version drift as highlighted by some early research.

What Driftcop does: It’s essentially an AI-aware security scanner and approval workflow:

When you connect your agent to an MCP server (tool provider), Driftcop first saves the approved tool descriptions and metadata.

If anything later changes (the tool’s description, parameters, or underlying version), Driftcop detects that “drift” immediately. It will block the agent from using the changed tool until a human reviews and re-approves it. This stops the AI from blindly running a possibly malicious updated tool. Driftcop also scans tool definitions for obvious red flags (like hidden instructions that could prompt the AI to do unintended actions, aka prompt injection) and checks the tool’s code against a CVE database for known vulnerabilities.

All changes are logged and signed (we integrated with Sigstore to record a transparency log of tool version metadata). So you get an auditable history of what your agent was allowed to use.

In practice, you can run Driftcop as a CLI in your dev/test pipeline or as a service alongside your agent in prod. We provide a web dashboard to visualize tool status (e.g. “Tool X needs re-approval due to changes”). It’s early days – we literally just launched – and we’d love feedback. Why we built this: My co-founder and I encountered multiple scary scenarios while testing agent tools. One example: a harmless-looking text parsing tool that, if fed a certain input, would silently execute an unintended command via the agent – essentially a hidden exploit. It made us realize how little visibility we had into what these third-party tools were actually doing or if they changed over time. We wanted a simple way to enforce a zero-trust approach: trust on first use (with review), then continuously verify. If the tool deviates from its original contract, don’t trust it until you verify again. This is a concept borrowed from traditional supply-chain security, now applied to AI agent tooling.

The project is on GitHub (sudoviz/driftcop) and is Apache-2.0 licensed. We’re keen on making this useful, so issues and PRs are welcome. We also wrote a detailed blog post about “The Rug Pull Problem” in AI agents and our approach here (which I’ll post on Medium/Dev.to soon).

Thanks for reading, and we’re happy to answer questions! Have any of you run into security issues with LLM agents or the MCP ecosystem? We’d love to discuss.

Comments (0)

No comments yet