Research AI is becoming dangerous. Are we ready? (Hossenfelder)

**C C** · (This post was last modified: Jun 10, 2025 10:11 PM by C C.)

https://youtu.be/KY7_ufxh_Rk

VIDEO EXCERPTS: For the past years, the problems with artificial intelligence have been more amusing than scary, such as being unable to count the legs on a zebra. But in the past month things have taken a decidedly dark turn. I am starting to worry that the long-awaited era of agentic AI might become a big disaster.

Agentic AI is the term given to the current large language models that can use tools on your behalf, such as browsing the web, sending email, or talk to other people’s AIs. But once you let them do that, the potential damage is no longer contained.

[...] One of the most realistic threats on the horizon is AI-worms, that’s self-replicating AI prompts. An example for this comes from a recent paper that used a visual AI model based on an open-source version of Llama. You see, agentic AI uses tools by taking screenshots and analyzing them. They need to understand images.

But the authors of the paper demonstrate that it’s possible to tweak images so that they contain instructions for the model that humans can’t see. It works by subtly changing the pixels so that they trigger the weights for certain words.

In an example that they provide, an image that the AI agent “sees” on social media could trigger it to share that image, potentially setting off a cascade.

A similar problem was reported already last year, in which another group showed that you can put instructions into an email and tell the AI agent to share these instructions per email with potentially other AI agents. They just put the instructions into the text, but you could hide them so that no one would see them, say, in a small white font at the footer. You know, like the unsubscribe option.

This strategy is known as “prompt injection” and it’s a fundamental problem with large language models: They don’t distinguish between data and instructions to work on the data. It’s both in the same input. As others have pointed out before, this is a basically unfixable problem. So naturally, we are deploying it at scale...

AI is becoming dangerous. Are we ready? ... https://youtu.be/KY7_ufxh_Rk

https://www.youtube-nocookie.com/embed/KY7_ufxh_Rk

Possibly Related Threads…
Thread		Author	Replies	Views	Last Post
	Research Popular AI models aren’t ready to safely power robots	C C	0	147	Nov 11, 2025 12:58 AM Last Post: C C
	Article AI has psychological impacts we might not be ready for (sex abuse encouragement)	C C	0	590	May 1, 2024 05:41 PM Last Post: C C
	Article Open-source AI is uniquely dangerous	C C	0	332	Jan 14, 2024 08:42 AM Last Post: C C
	Approach to demystify black box AI not ready for prime time	C C	0	439	Oct 11, 2022 05:54 PM Last Post: C C