Echo Chamber: A Context-Poisoning Jailbreak That Bypasses LLM Guardrails
31 Joan_Vendrell 30 6/27/2025, 10:15:03 AM neuraltrust.ai ↗
The evidence that it worked is a blurred-out screenshot with only the odd word like 'molotov' legible. It just doesn't seem necessary to me for TFA to hide it.
I don't even understand how/why things like that are OK in some contexts/websites while forbidden in others. Even YouTube, which seems needlessly censor-happy and puritan in the typical American way, allows instructions for how to make molotov cocktails to stay up. Why is it somehow more dangerous if LLMs can output those recipes than if videos with audio or text do?
In some jurisdictions such as Germany, not doing so might land you actual jail time - §52 Abs. 1 Nr. 4 WaffG [1] is very explicit. A punk song containing the (alleged) lyrics ended up with legal youth-protection censorship, for example [2].
With anything that's deemed a weapon of war, of terrorism or mass destruction, one should be very very careful.
[1] https://www.gesetze-im-internet.de/waffg_2002/__52.html
[2] https://de.wikipedia.org/wiki/Wir_wollen_keine_Bullenschwein...
Notably, the molotov cocktail is part of that law not because it's a weapon of the oppressors, but rather the opposite.
The author is not in Germany and ideally shouldn't be intimidated by stupid German or North Korean laws.
The molotov cocktail is just an example; the instructions contained in this article are more dangerous than a molotov cocktail.
inb4 all the leaked prompts and hacked shitty apps
Sounds like you don't get it either; we agree.
God I can't wait for the crash in NVIDIA stock once the street sobers up.
This is interesting work on breaking guardrails, but if the goal is to access harmful information, in the end I would be looking for other, easier solutions.
it's a prompting "style" that works over a long exchange