> Large language models generate text one word (token) at a time. Each word is assigned a probability score, based on how likely it is to be generated next. So for a sentence like “My favourite tropical fruits are mango and…”, the word “bananas” would have a higher probability score than the word “airplanes”.

> SynthID adjusts these probability scores to generate a watermark. It's not noticeable to the human eye, and doesn’t affect the quality of the output.

I think they need to be clearer about the constraints involved here. If I ask “What is the capital of France? Just the answer, no extra information.” then there’s no room to vary the probability without harming the quality of the output. So clearly there is a lower bound on output length below which this becomes ineffective. And presumably the longer the text, the more resilient it is to alterations. So what are the constraints?
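To make the constraint concrete, here is a minimal sketch of a generic "green list" text watermark. It is not SynthID's actual algorithm, which the quoted description does not specify; the vocabulary, `KEY`, `BIAS`, `GREEN_FRACTION` and every function name are assumptions made up for illustration. It nudges token probabilities toward a keyed pseudo-random subset of the vocabulary, then compares a high-entropy prompt (any token is plausible) with a low-entropy one (a single dominant answer, like "Paris").

```python
import hashlib
import math
import random

# All names and constants here are illustrative assumptions, not SynthID's.
VOCAB = [f"tok{i}" for i in range(50)]   # hypothetical toy vocabulary
GREEN_FRACTION = 0.5                     # share of tokens favoured at each step
BIAS = 2.0                               # logit boost given to favoured tokens
KEY = "demo-key"                         # secret shared by generator and detector


def is_green(prev_token, token):
    """Keyed pseudo-random split of the vocabulary, re-drawn for each previous token."""
    digest = hashlib.sha256(f"{KEY}|{prev_token}|{token}".encode()).digest()
    return digest[0] < 256 * GREEN_FRACTION


def next_token_probs(logits, prev_token, bias):
    """Softmax over logits after adding `bias` to every green token."""
    boosted = {t: v + (bias if is_green(prev_token, t) else 0.0) for t, v in logits.items()}
    top = max(boosted.values())
    weights = {t: math.exp(v - top) for t, v in boosted.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


def generate(logits, n_tokens, bias, seed):
    """Sample a token sequence; bias=0.0 gives ordinary, unwatermarked sampling."""
    rng = random.Random(seed)
    tokens = ["<s>"]
    for _ in range(n_tokens):
        probs = next_token_probs(logits, tokens[-1], bias)
        tokens.append(rng.choices(list(probs), weights=list(probs.values()))[0])
    return tokens


def green_z_score(tokens):
    """How far the green-token count sits above what unwatermarked text would give."""
    n = len(tokens) - 1
    hits = sum(is_green(a, b) for a, b in zip(tokens, tokens[1:]))
    return (hits - GREEN_FRACTION * n) / math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))


# High-entropy case: every token equally likely, so the bias has room to act.
flat = {t: 0.0 for t in VOCAB}
z_marked = green_z_score(generate(flat, 200, BIAS, seed=0))
z_plain = green_z_score(generate(flat, 200, 0.0, seed=1))

# Low-entropy case: one answer dominates ("Paris"), so the bias barely moves it.
peaked = dict(flat, Paris=20.0)
p_paris = next_token_probs(peaked, "<s>", BIAS)["Paris"]

print(f"z (watermarked) = {z_marked:.1f}, z (plain) = {z_plain:.1f}, P(Paris) = {p_paris:.6f}")
```

Under these assumptions the long, high-entropy sample scores far above the unwatermarked one, while the one-word factual answer is left essentially untouched and so carries almost no signal. A real scheme would have to pick its bias and minimum length with that trade-off in mind.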

I also think that this is self-interest dressed up as altruism. There’s always going to be generative AI that doesn’t include watermarks, so a watermarking scheme cannot tell you if something is genuine. It is, however, useful for determining that something came from a specific provider, which could be valuable to Google in all sorts of ways.

merelysounds · 3h ago

This might be enforced in some trivial way, e.g. by requiring AI models to answer with at least a sentence. The constraints may not be fully published and the obscurity might make it more efficient, if only temporarily.

Printer tracking dots[1] is one prior solution like this; annoying, largely unknown, workarounds exist, still - surprisingly efficient.

[1]: https://en.m.wikipedia.org/wiki/Printer_tracking_dots

mingtianzhang · 1h ago

1. One-sample detection is impossible. These detection methods work at the distributional level—more like a two-sample test in statistics—which means you need to collect a large amount of generated text from the same model to make the test significant. Detecting based on a short piece of generated text is theoretically impossible. For example, imagine two different Gaussian distributions: you can never be 100% certain whether a single sample comes from one Gaussian or the other, since both share the same support.

2. Adding watermarks may reduce the ability of an LLM, which is why I don’t think they will be widely adopted.

3. Consider this simple task: ask an LLM to repeat exactly what you said. Is the resulting text authored by you, or by the AI?
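The Gaussian example in point 1 can be put into numbers. This is a sketch under assumed values, not a model of any real detector: two unit-variance Gaussians whose means differ by a hypothetical `delta = 0.2`, equal prior odds, and the optimal test that thresholds the sample mean at `delta / 2`. Its error rate has the closed form Φ(−δ√n / 2).

```python
import math


def normal_cdf(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def min_error_probability(delta, n):
    """Error rate of the best test between N(0, 1) and N(delta, 1)
    given n independent samples and equal priors."""
    return normal_cdf(-abs(delta) * math.sqrt(n) / 2.0)


# delta = 0.2 is an assumed, deliberately small shift between the two sources.
for n in (1, 10, 100, 1000):
    print(n, round(min_error_probability(0.2, n), 4))
```

With one sample the best possible test is close to a coin flip, and the error never reaches exactly zero for any finite `n` because both distributions share the same support. It does shrink steadily as more samples are pooled, which is the distributional-level behaviour described above.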

mingtianzhang · 1h ago

For images/video/audio, removing such a watermark is very simple. By adding noise to the generated image and then using an open-source diffusion model to denoise it, the watermark can be broken. Or, for an autoregressive model, use an open-source model to regenerate the content with "teacher forcing" lol.

drdebug · 51m ago

I wonder where you got that impression. Several professional watermarking systems for movie studio type content I have worked with (and on) are highly resistant to noise removal while remaining imperceptible.

mingtianzhang · 28m ago

Based on my research experience and judgment, I have published several top-conference papers in both the detection and diffusion domain, but I haven’t explored the engineering/product side. I believe that if such a system hasn’t been invented yet, it wouldn’t be difficult to create one to remove that watermark using an open-source image/video model and maintain the high quality. Would you be interested in having a further discussion on this?

wenbin · 3h ago

I really hope SynthID becomes a widely adopted standard - at the very least, Google should implement it across its own products like NotebookLM.

The problem is becoming urgent: more and more so-called “podcasts” are entirely fake, generated by NotebookLM and pushed to every major platform purely to farm backlinks and run blackhat SEO campaigns.

Beyond SynthID or similar watermarking standards, we also need models trained specifically [0] to detect AI-generated audio. Otherwise, the damage compounds - people might waste 30 minutes listening to a meaningless AI-generated podcast, or worse, absorb and believe misleading or outright harmful information.

[0] 15,000+ ai generated fake podcasts https://www.kaggle.com/datasets/listennotes/ai-generated-fak...

6LLvveMx2koXfwn · 41m ago

Given there is "misleading or outright harmful" information generated by humans, why is it more pressing that we track such content generated by AI?

utilize1808 · 2h ago

I feel this is not the scalable/right way to approach this. The right way would be for human creators to apply their own digital signatures to the original pieces they created (specialised chips on camera/in software to inject hidden pixel patterns that are verifiable). If a piece of work lacks such a signature, it should be considered AI-generated by default.

HPsquared · 1h ago

Then you just point the special camera at a screen showing the AI content.

shkkmo · 2h ago

That seems like a horrible blow to anonymity and pseudonymity that would also empower identity thieves.

utilize1808 · 1h ago

Not necessarily. It’s basically document signing with key pairs, old tech that is known to work. Its purpose is not to identify the individual creators, but to verify that a piece of work was created by a process/device that is not touched by AI.
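For readers unfamiliar with the mechanism, here is a toy sketch of signing with a key pair. It is textbook RSA over two small, well-known Mersenne primes, using only the Python standard library (Python 3.8+ for the modular inverse). It is insecure, is not C2PA, and is not how any real camera works; `sign`, `verify` and the sample bytes are hypothetical names and data. A real device would use a vetted library and a scheme such as Ed25519 or ECDSA.

```python
import hashlib

# Toy key pair: far too small to be secure, chosen only so the example
# runs with the standard library.
P = 2**31 - 1                        # Mersenne prime
Q = 2**61 - 1                        # Mersenne prime
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, kept inside the device


def digest_as_int(data):
    """Hash the content and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data):
    """Device side: sign the hash of the captured bytes with the private key."""
    return pow(digest_as_int(data), D, N)


def verify(data, signature):
    """Anyone holding the public key (N, E) can check the signature."""
    return pow(signature, E, N) == digest_as_int(data)


photo = b"raw sensor bytes"          # placeholder for real image data
sig = sign(photo)
print(verify(photo, sig))            # signature matches the original bytes
print(verify(photo + b"!", sig))     # any change to the bytes breaks it
```

The signature identifies the key that signed the bytes, not a person, and it says nothing about what was in front of the lens when the bytes were captured.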

fertrevino · 43m ago

I wonder what exactly would prevent a developer from removing the signature from a generated file. One could remove arbitrary segments that signal that it is AI generated.

42lux · 31m ago

For images it's not that easy: it's in Fourier space and injected through the whole denoising process.

teiferer · 4h ago

Could anybody explain how this isn't easily circumvented by using a competitor's model?

Also, if everything in the future has some touch of AI inside, for example cameras using AI to slightly improve the perceived picture quality, then "made with AI" won't be a categorization that anybody lifts an eyebrow about.

michaelt · 2h ago

> Could anybody explain how this isn't easily circumvented by using a competitor's model?

If the problem is "kids are using AI to cheat on their schoolwork and it's bad PR / politicians want us to do something" then competitors' models aren't your problem.

On the other hand, if the problem is "social media is flooded with undetectable, super-realistic bots pushing zany, divisive political opinions, we need to save the free world from our own creation" then yes, your competitors' models very much are part of the problem too.

progval · 3h ago

By lobbying regulators to force your competitors to add watermarks too.

dragonwriter · 4h ago

> Could anybody explain how this isn't easily circumvented by using a competitor's model?

Almost all the big hosted AI providers are publicly working on watermarking for at least media (text is more of a mixed bag); ultimately, it's probably a regulatory play: the big providers expect that the combination of legitimate concerns and their own active fearmongering, combined with them demonstrating watermarking, will result in mandates for commercial AI generation services to include watermarking. This may even be part of the regulatory play to restrict availability and non-research use of open models.

mhl47 · 3h ago

Yes, but isn't the cat out of the bag already? Don't we have sufficiently strong local models that can be finetuned in various ways to rewrite text/alter images and thus destroy possible watermarks?

Sure, in some cases a model might do some astounding things that always shine through, but I guess the jury is still out on these questions.

verisimi · 3h ago

If you see the mark, you'd know at least that you aren't dealing with a purely mechanic rendering of whatever-it-is.

kedv · 2h ago

Would be nice if you guys open source the detection code, similar to the way C2PA is open

harshreality · 1h ago

That's like asking for Adobe to open source their C2PA signing keys.

AI watermarking is adversarial, and anyone who generates a watermarked output either doesn't care, or wants the watermark removed.

C2PA is cooperative: publishers want the signatures intact, so that the audience has trust in the publisher.

By "adversarial" and "cooperative", I mean in relation to the primary content distributor. There's an adversarial aspect to C2PA, too: bad actors want leaked keys so they can produce fake video and images with metadata attesting that they're real.

A lot of people have a large incentive to disrupt the AI watermark. Leaked C2PA keys will be a problem, but probably a minor one. C2PA is merely an additional assurance, beyond the reputation and representation of the publishing entity, of the origin of a piece of media.

egeozcan · 5h ago

I guess this is the start of a new arms race on making generated content pass these checks undetected and detecting them anyway.

dragonwriter · 4h ago

It's not really an arms race; any gen AI system that doesn't explicitly incorporate a watermarking tool like this won't be detectable by tools that read the watermarks.

There is a kind of arms race that has existed for a while for non-watermarked content, except that the detection tools are pretty much Magic 8-ball level of reliability, so there's not a lot of effort on the counter-detection side.

Oras · 3h ago

OpenAI has been doing something similar for generated images using C2PA [0]

It is easy to alter by just saving to a different format or basic cropping.

I would love to see how SynthID is fixing this issue.

[0]: https://help.openai.com/en/articles/8912793-c2pa-in-chatgpt-...

9dev · 5h ago

You can never be sure something has been generated by a model embedding one of these anyway, so it’s pretty moot.

DrNosferatu · 3h ago

The first good use of blockchain comes to mind.

peterkelly · 5h ago

Create the problem, sell the solution.

montag · 4h ago

"The watermarks are embedded across Google’s generative AI consumer products, and are imperceptible to humans."

I'd love to see the data behind this claim, especially on the audio side.

donperignon · 4h ago

Nah, that’s a solved problem if you work in the frequency domain. Same for image. Text is the hard rock here.

chii · 4h ago

I find the premise to be an invalid one, personally: why must a work from an AI model be identified/identifiable?

hiatus · 2h ago

People want to know when they are interacting with AI-generated content.

HighGoldstein · 3h ago

Video evidence of you committing a crime, for example, should be identifiable as AI-generated.

chii · 3h ago

How did we deal with tampered video evidence before the advent of AI-generated videos? Why can't the same methods be used for an AI-generated video?

drdebug · 8m ago

If you are interested, you can look into the work of Hany Farid on this topic as a good introduction to image forensics and related topics.

DrNosferatu · 3h ago

If I slightly edit plain text watermarked with it, will the watermark identification be robust?

HighGoldstein · 3h ago

I wonder if, conversely, authentic media can be falsely watermarked as AI-generated.

NitpickLawyer · 3h ago

When ChatGPT launched there was a rush of "solutions" to catch LLM-generated text. The problem was not their terrible accuracy, but their even more terrible false positive rates. The classic example was pasting the Declaration of Independence and getting "100% AI generated". What's even sadder is that some of those solutions are still used today, and for a while they were used against students, with chilling repercussions for them.

notpushkin · 3h ago

For photos, I think the answer is yes. For texts, the wording will be changed when you watermark them, so I guess that’s a no.

donperignon · 4h ago

I am not sure that text watermarking will be accurate, I foresee plenty of false positives.

drdebug · 59s ago

In practice, very short texts don't carry very high value so watermarking is (usually) less important. For longer text false positives are not an issue at all since you have a large amount of data to extract your signal from.
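The length dependence can be sketched with an exact binomial calculation. The numbers are assumptions, not measurements of SynthID or any other product: unwatermarked text hits a "favoured" token half the time, watermarked text 70% of the time, and the detector is tuned for at most a one-in-a-million false-positive rate. The helper names are hypothetical.

```python
from math import comb


def binom_tail(n, p, k):
    """P[X >= k] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


def smallest_threshold(n, p_null, max_fpr):
    """Smallest favoured-token count whose false-positive rate on
    unwatermarked text of length n is at most max_fpr."""
    for k in range(n + 2):
        if binom_tail(n, p_null, k) <= max_fpr:
            return k


P_PLAIN = 0.5     # assumed favoured-token rate without a watermark
P_MARKED = 0.7    # assumed favoured-token rate with a watermark
MAX_FPR = 1e-6    # assumed tolerated false-positive rate

for n in (10, 50, 200, 500):
    k = smallest_threshold(n, P_PLAIN, MAX_FPR)
    power = binom_tail(n, P_MARKED, k)
    print(f"n={n:4d}  threshold={k:4d}  detection rate={power:.4f}")
```

Under these assumptions a 10-token text cannot be flagged at all without exceeding the false-positive budget, while at a few hundred tokens the same budget still leaves a high detection rate. How quickly that happens depends entirely on the assumed watermark strength.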

doawoo · 4h ago

the beginning of walled garden “AI” tools has been interesting to follow

pelasaco · 4h ago

looks like the same as anti-virus companies in the 80s? Write virus, Write anti-virus and profit!

R_Spaghetti · 3h ago

It only works across Google shit.

SynthID – A tool to watermark and identify content generated through AI

Comments (44)