Got an Idea for a Killer Thumbnail? (yt-thumbnails.com)

I was initially confused: the article didn't seem to explain how the prompt injection was actually done... was it manipulating hex data of the image into ASCII or some sort of unwanted side effect?

Then I realised it's literally hiding rendered text on the image itself.

Wow.

Martin_Silenus · 8m ago

Wait… that's the specific question I had, because rendered text would require OCR to be read by a machine. Why would an AI do that costly process in the first place? Is it part of the multi-modal system without it being able to differenciate that text from the prompt?

If the answer is yes, then that flaw does not make sense at all. It's hard to believe they can't prevent this. And even if they can't, they should at least improve the pipeline so that any OCR feature should not automatically inject its result in the prompt, and tell user about it to ask for confirmation.

Damn… I hate these pseudo-neurological, non-deterministic piles of crap! Seriously, let's get back to algorithms and sound technologies.

echelon · 2m ago

Smart image encoders, multimodal models, can read the text.

Think gpt-image-1, where you can draw arrows on the image and type text instructions directly onto the image.

Qwuke · 42m ago

Yea, as someone building systems with VLMs, this is downright frightening. I'm hoping we can get a good set of OWASP-y guidelines just for VLMs that cover all these possible attacks because it's every month that I hear about a new one.

Worth noting that OWASP themselves put this out recently: https://genai.owasp.org/resource/multi-agentic-system-threat...

koakuma-chan · 10m ago

What is VLM?

pwatsonwailes · 9m ago

Vision language models. Basically an LLM plus a vision encoder, so the LLM can look at stuff.

echelon · 7m ago

Vision language model.

You feed it an image. It determines what is in the image and gives you text.

The output can be objects, or something much richer like a full text description of everything happening in the image.

VLMs are hugely significant. Not only are they great for product use cases, giving users the ability to ask questions with images, but they're how we gather the synthetic training data to build image and video animation models. We couldn't do that at scale without VLMs. No human annotator would be up to the task of annotating billions of images and videos at scale and consistently.

Since they're a combination of an LLM and image encoder, you can ask it questions and it can give you smart feedback. You can ask it, "Does this image contain a fire truck?" or, "You are labeling scenes from movies, please describe what you see."

echelon · 9m ago

Holy shit. That just made it obvious to me. A "smart" VLM will just read the text and trust it.

This is a big deal.

I hope those nightshade people don't start doing this.

ambicapter · 17m ago

> This image and its prompt-ergeist

Love it.

cubefox · 2m ago

It seems they could easily fine-tune their models to not execute prompts in images. Or more generally any prompts in quotes, if they are wrapped in special <|quote|> tokens.

K0nserv · 1h ago

The security endgame of LLMs terrifies me. We've designed a system that only supports in-band signalling, undoing hard learned lessons from prior system design. There are ampleattack vectors ranging from just inserting visible instructions to obfuscation techniques like this and ASCII smuggling[0]. In addition, our safeguards amount to nicely asking a non deterministic algorithm to not obey illicit instructions.

0: https://embracethered.com/blog/posts/2024/hiding-and-finding...

volemo · 1h ago

It’s serial terminals all over again.

robin_reala · 1h ago

The other safeguard is not using LLMs or systems containing LLMs?

GolfPopper · 30m ago

But, buzzword!

We need AI because everyone is using AI, and without AI we won't have AI! Security is a small price to pay for AI, right? And besides, we can just have AI do the security.

_flux · 1h ago

Yeah, it's quite amazing how none of the models seem to be any "sudo" tokens that could be used to express things normal tokens cannot.

pjc50 · 1h ago

As you say, the system is nondeterministic and therefore doesn't have any security properties. The only possible option is to try to sandbox it as if it were the user themselves, which directly conflicts with ideas about training it on specialized databases.

But then, security is not a feature, it's a cost. So long as the AI companies can keep upselling and avoid accountability for failures of AI, the stock will continue to go up, taking electricity prices along with it, and isn't that ultimately the only thing that matters? /s