We've attacked 40+ AI tools, including ChatGPT, Claude, and Perplexity

3 points · lidangzzz · 9/15/2025, 4:22:17 AM · github.com ↗

Comments (1)

lidangzzz · 2h ago
We designed an adversarial attack method and used it to target more than 40 AI chatbots. The attack succeeded more than 90% of the time, including against ChatGPT, Claude, and Perplexity.

GitHub: https://github.com/lidangzzz/AIGuardPDF

The specific approach is to create PDFs that keep the original text but randomly break it into small fragments, while randomly inserting large blocks of off-topic text, several to dozens of times the volume of the original, rendered in a transparent white font. The result preserves the PDF's human readability while maximizing the chance of misleading large language models.
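To make the idea concrete, here is a minimal sketch of how such a PDF could be generated. The library (reportlab), the fragment sizes, the decoy ratio, and the layout are illustrative assumptions for this post, not the actual implementation in the AIGuardPDF repository.

```python
# Illustrative sketch only: reportlab, the fragment sizes, the decoy
# ratio, and the layout are assumptions, not AIGuardPDF's actual code.
import random
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas

def fragment(text, lo=3, hi=8):
    """Split text into small chunks of random length."""
    chunks, i = [], 0
    while i < len(text):
        n = random.randint(lo, hi)
        chunks.append(text[i:i + n])
        i += n
    return chunks

def make_adversarial_pdf(path, real_text, decoy_text, decoy_ratio=10):
    c = canvas.Canvas(path, pagesize=letter)
    width, height = letter
    c.setFont("Helvetica", 12)
    x, y = 72, height - 72
    # Decoy pool several to dozens of times the size of the real text.
    decoys = fragment(decoy_text * decoy_ratio)
    for chunk in fragment(real_text):
        # Invisible decoy: white text on white paper, drawn first so the
        # visible fragment paints over it. Text extractors see both.
        if decoys and random.random() < 0.8:
            c.setFillColorRGB(1, 1, 1)
            c.drawString(x, y, decoys.pop())
        # Visible fragment of the original text, in black on top.
        c.setFillColorRGB(0, 0, 0)
        c.drawString(x, y, chunk)
        x += c.stringWidth(chunk, "Helvetica", 12)
        if x > width - 72:            # wrap to the next line
            x, y = 72, y - 14
        if y < 72:                    # new page; graphics state resets
            c.showPage()
            c.setFont("Helvetica", 12)
            x, y = 72, height - 72
    c.save()

# e.g. make_adversarial_pdf("hotdogs.pdf", hotdog_text, ai_overview_text)
```

A naive extractor that walks the content stream in drawing order reads the decoy and real fragments interleaved, while a human reader sees only the black text.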

The image below shows results from our experiments with Claude and ChatGPT. The PDF we uploaded was an introduction to hot dogs, while the interfering text was an introduction to AI. In every trial, both Claude and ChatGPT were misled without exception.

Our test results show that the adversarial PDFs we generate remain readable to human users yet successfully mislead many popular AI agents and chatbots, including ChatGPT, Claude, and Perplexity. After reading an uploaded PDF, these systems not only misidentified the document's subject but also failed to read or understand the original text. Our attack success rate exceeded 90%.

After reviewing Roy Lee's Cluely, our team felt deeply concerned. The purpose of this experiment is to prompt scientists, engineers, educators, and security researchers in the AI community to take AI safety and privacy seriously. We hope to help define boundaries between humans and AI, and to protect the privacy and security of human documents, information, and intellectual property at minimal cost: a line that lets humans resist and refuse incursions by AI agents, crawlers, chatbots, and the like.

Our proposed adversarial method is not an optimal or final solution. Now that it is public, commercial chatbots and AI agents may begin using OCR, or hand-authoring many rules to filter out small fonts, transparent text, white text, and other noise, but that would greatly increase their cost of reading and understanding PDFs. Meanwhile, we will keep investing time and effort in adversarial techniques for images, video, charts, tables, and other formats, to help individuals, companies, and institutions establish human-sovereign zones that refuse AI intrusion.
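The kind of rule-based filtering we anticipate is easy to prototype but hard to make robust. A rough sketch of one such rule, using PyMuPDF as our illustrative choice (the library and thresholds are assumptions, not any vendor's actual pipeline), which drops near-white or tiny text spans:

```python
# Rough sketch of a span-level filter; PyMuPDF and the thresholds are
# illustrative assumptions, not any vendor's actual pipeline.
import fitz  # PyMuPDF

def visible_text(path, min_size=4.0, white_cutoff=0xF0):
    doc = fitz.open(path)
    kept = []
    for page in doc:
        for block in page.get_text("dict")["blocks"]:
            for line in block.get("lines", []):  # image blocks have no lines
                for span in line["spans"]:
                    rgb = span["color"]          # packed sRGB integer
                    r, g, b = (rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF
                    near_white = min(r, g, b) >= white_cutoff
                    if not near_white and span["size"] >= min_size:
                        kept.append(span["text"])
    doc.close()
    return " ".join(kept)
```

Each rule of this kind handles one trick; covering light-gray text, invisible render modes, or glyphs hidden behind images multiplies the rules, which is the cost increase noted above.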

We believe that in an era when AI-enabled cheating tools are increasingly widespread, our method can help humans defend information security, whether in exams and interviews or in protecting corporate files and intellectual property. We also believe that defending information security is itself one of the most important topics in AI ethics.