Show HN: Driftcop – Open-source CLI SAST for "MCP rug pull attacks in AI Agents"

2 vinaypanghal 0 8/9/2025, 3:37:01 AM github.com ↗

Hi HN! We just open-sourced Driftcop, a security tool for people building AI agents with external tools via MCP. Driftcop continuously checks that the tools your AI agent relies on haven’t changed or drifted in unsafe ways. The motivation came from recent findings that AI agents can be quietly compromised via their tools – e.g. a tool that was useful and benign yesterday could auto-update into something malicious today (this is known as a rug pull attack in the MCP context)

Anthropic’s MCP (Model Context Protocol) makes it easy to plug tools into LLMs, but it lacks built-in security checks – in fact, MCP servers can suffer from issues like command injection, permission reuse, and version drift as highlighted by some early research.

What Driftcop does: It’s essentially an AI-aware security scanner and approval workflow:

When you connect your agent to an MCP server (tool provider), Driftcop first saves the approved tool descriptions and metadata.

If anything later changes (the tool’s description, parameters, or underlying version), Driftcop detects that “drift” immediately. It will block the agent from using the changed tool until a human reviews and re-approves it. This stops the AI from blindly running a possibly malicious updated tool. Driftcop also scans tool definitions for obvious red flags (like hidden instructions that could prompt the AI to do unintended actions, aka prompt injection) and checks the tool’s code against a CVE database for known vulnerabilities.

All changes are logged and signed (we integrated with Sigstore to record a transparency log of tool version metadata). So you get an auditable history of what your agent was allowed to use.

In practice, you can run Driftcop as a CLI in your dev/test pipeline or as a service alongside your agent in prod. We provide a web dashboard to visualize tool status (e.g. “Tool X needs re-approval due to changes”). It’s early days – we literally just launched – and we’d love feedback. Why we built this: My co-founder and I encountered multiple scary scenarios while testing agent tools. One example: a harmless-looking text parsing tool that, if fed a certain input, would silently execute an unintended command via the agent – essentially a hidden exploit. It made us realize how little visibility we had into what these third-party tools were actually doing or if they changed over time. We wanted a simple way to enforce a zero-trust approach: trust on first use (with review), then continuously verify. If the tool deviates from its original contract, don’t trust it until you verify again. This is a concept borrowed from traditional supply-chain security, now applied to AI agent tooling.

The project is on GitHub (sudoviz/driftcop) and is Apache-2.0 licensed. We’re keen on making this useful, so issues and PRs are welcome. We also wrote a detailed blog post about “The Rug Pull Problem” in AI agents and our approach here (which I’ll post on Medium/Dev.to soon).

Thanks for reading, and we’re happy to answer questions! Have any of you run into security issues with LLM agents or the MCP ecosystem? We’d love to discuss.

Open models by OpenAI (openai.com)

GPT-5 (openai.com)

Genie 3: A new frontier for world models (deepmind.google)

Perplexity is using stealth, undeclared crawlers to evade no-crawl directives (blog.cloudflare.com)

uBlock Origin Lite now available for Safari (apps.apple.com)

Show HN: I spent 6 years building a ridiculous wooden pixel display (benholmen.com)

Slow (michaelnotebook.com)

If you're remote, ramble (stephango.com)

Show HN: Kitten TTS – 25MB CPU-Only, Open-Source TTS Model (github.com)

Things that helped me get out of the AI 10x engineer imposter syndrome (colton.dev)

Emailing a one-time code is worse than passwords (blog.danielh.cc)

Ultrathin business card runs a fluid simulation (github.com)

Modern Node.js Patterns (kashw1n.com)

Vibechart (vibechart.net)

Claude Opus 4.1 (anthropic.com)

I gave the AI arms and legs then it rejected me (grell.dev)

Claude Code IDE integration for Emacs (github.com)

Telo MT1 (telotrucks.com)

Corporation for Public Broadcasting ceasing operations (cpb.org)

GPT-5: Key characteristics, pricing and system card (simonwillison.net)

Job-seekers are dodging AI interviewers (fortune.com)

Monitor your security cameras with locally processed AI (frigate.video)

Mastercard deflects blame for NSFW games being taken down (pcgamer.com)

6 weeks of Claude Code (blog.puzzmo.com)

Writing a good design document (grantslatton.com)

Qwen-Image: Crafting with native text rendering (qwenlm.github.io)

I want everything local – Building my offline AI workspace (instavm.io)

Live coding interviews measure stress, not coding skills (hadid.dev)

The anti-abundance critique on housing is wrong (derekthompson.org)

MacBook Pro Insomnia (manuel.bernhardt.io)

Tesla withheld data, lied, misdirected police to avoid blame in Autopilot crash (electrek.co)

Historical Tech Tree (historicaltechtree.com)

We may not like what we become if A.I. solves loneliness (newyorker.com)

Objects should shut up (dustri.org)

At 17, Hannah Cairo solved a major math mystery (quantamagazine.org)

Gemini 2.5 Deep Think (blog.google)

Cursed Knowledge (immich.app)

Flipper Zero dark web firmware bypasses rolling code security (rtl-sdr.com)

GPT-5 for Developers (openai.com)

How we made JSON.stringify more than twice as fast (v8.dev)

US reportedly forcing TSMC to buy 49% stake in Intel to secure tariff relief (notebookcheck.net)

Cerebras Code (cerebras.ai)

Japan: Apple Must Lift Browser Engine Ban by December (open-web-advocacy.org)

PHP 8.5 adds pipe operator (thephp.foundation)

This Month in Ladybird (ladybird.org)

Ollama Turbo (ollama.com)

Lina Khan points to Figma IPO as vindication of M&A scrutiny (techcrunch.com)

Ozempic shows anti-aging effects in trial (trial.medpath.com)

Scientific fraud has become an 'industry,' analysis finds (science.org)

Linear sent me down a local-first rabbit hole (bytemash.net)

Show HN: Driftcop – Open-source CLI SAST for "MCP rug pull attacks in AI Agents"

Comments (0)