Show HN: RAG Firewall – retrieval-time guardrails for LangChain/LlamaIndex
talbuilds | 8/29/2025, 7:48:12 PM | github.com
RAG pipelines are great, but they can still retrieve "toxic" chunks:
– prompt injection attempts
– leaked API keys/secrets
– stale or conflicting content
– unapproved external URLs
We built an open-source "retrieval firewall" that scans chunks before they reach the LLM:
– denies injection attempts & leaked secrets
– flags/reranks PII, encoded blobs, untrusted URLs
– audit log (JSONL) of all decisions
– drop-in wrappers for LangChain and LlamaIndex retrievers
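To make the idea concrete, here's a minimal self-contained sketch of what retrieval-time scanning can look like. This is illustrative Python with toy regex heuristics, not the actual rag-firewall API — the pattern names, decision dict, and `filter_chunks` helper are all hypothetical:

```python
import re

# Toy heuristic patterns (illustrative only, not the library's actual rules)
INJECTION = re.compile(r"(?i)ignore (all )?previous instructions|disregard your system prompt")
SECRET = re.compile(r"\b(?:sk-[A-Za-z0-9]{20,}|AKIA[0-9A-Z]{16})\b")  # OpenAI-style / AWS key shapes

def scan_chunk(text: str) -> dict:
    """Decide what to do with a retrieved chunk before it reaches the LLM."""
    if SECRET.search(text):
        return {"action": "deny", "reason": "secret"}
    if INJECTION.search(text):
        return {"action": "deny", "reason": "prompt_injection"}
    return {"action": "allow", "reason": None}

def filter_chunks(chunks):
    """Scan every retrieved chunk; keep only the allowed ones."""
    decisions = [(c, scan_chunk(c)) for c in chunks]
    # In a real pipeline each decision would also be appended to a JSONL audit log
    return [c for c, d in decisions if d["action"] == "allow"]
```

The point is where this runs: between the retriever and the prompt, so a poisoned chunk is dropped before the model ever sees it.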
Install: pip install rag-firewall
Repo: https://github.com/taladari/rag-firewall
Curious whether others here handle retrieval-time risks, or rely only on ingest/output filtering. Would love feedback and red-team payloads.
– The firewall runs entirely client-side, so no data ever leaves your environment.
– It focuses on *retrieval-time* risks, not output moderation — so the LLM never sees poisoned chunks in the first place.
– Policies are YAML: you can choose to deny, allow, or just re-rank risky docs (based on recency, provenance, relevance).
– Overhead is low: scanners are regex/heuristic, so for ~5–20 chunks it adds only a few ms.
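For a sense of what a deny/allow/re-rank policy in this style might look like, here's a hypothetical YAML fragment — the field names are illustrative, not the actual schema (check the repo for that):

```yaml
# Hypothetical policy sketch — field names are illustrative
scanners:
  - name: secrets
    action: deny
  - name: prompt_injection
    action: deny
  - name: pii
    action: rerank      # down-weight instead of dropping
rerank:
  signals: [recency, provenance, relevance]
```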
I’d love feedback on two things in particular:
1. Do you think retrieval-time filtering belongs in the pipeline, or should it all be done at ingest/output?
2. If you’ve got prompt injection payloads or edge cases you use to test your own RAG stacks, I’d love to try them against this.
Thanks for taking a look — always happy to hear critique, especially from folks running LangChain/LlamaIndex in production.