Anything but this image (imgbb.com link below) requires a login. I get the same deal with Facebook. I am not Don Quixote and prefer not to march into hell for a heavenly cause, nor any other.
It's a term somewhat popularized by the LessWrong/rationalism community to refer to communication (self-communication/note-taking/state-tracking/reasoning, or model-to-model communication) via abstract latent space information rather than written human language. Vectors instead of words.
One of the implications driving its popularity on LessWrong is the worry that malicious AI agents might hide bad intent and actions by communicating in a dense, indecipherable way while presenting only normal intent and actions in their natural language output.
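In code terms, the distinction is roughly whether information passes between steps (or between models) as decoded tokens or as raw hidden vectors. A toy sketch to make that concrete (made-up modules, not any real system's internals):

```python
# Toy sketch only: two tiny "agents" exchanging a message either as decoded
# tokens (loggable, human-readable) or as raw latent vectors ("neuralese").
import torch
import torch.nn as nn

VOCAB, HIDDEN = 100, 32

class TinyAgent(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)    # tokens -> vectors
        self.core = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.unembed = nn.Linear(HIDDEN, VOCAB)     # vectors -> token logits

    def forward(self, token_ids):
        hidden, _ = self.core(self.embed(token_ids))
        return hidden                               # one latent per position

sender, receiver = TinyAgent(), TinyAgent()
message_ids = torch.randint(0, VOCAB, (1, 5))       # a 5-token "message"
latents = sender(message_ids)

# Token channel: collapse the latents to discrete tokens before handing them
# over. Whatever passes between the agents can be logged and read.
tokens_out = sender.unembed(latents).argmax(dim=-1)
_ = receiver(tokens_out)

# Latent channel ("neuralese"): hand the raw vectors over directly.
# Far more information gets through per step, but nothing maps onto words,
# so there is nothing legible to audit.
_ = receiver.core(latents)
```

The appeal is bandwidth (a vector carries far more than one token's worth of information); the worry is that the channel is opaque by construction.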
verisimi · 8m ago
> malicious AI agents might hide bad intent and actions by communicating in a dense, indecipherable way while presenting only normal intent and actions in their natural language output.
you could edit this slightly to extract a pretty decent rule for governance, like so:
> malicious agents might hide bad intent and actions by communicating in a dense, indecipherable way while presenting only normal intent and actions in a natural way
It applies to AI, but also to many other circumstances where the intention is that you are governed - e.g. medical, legal, financial.
Thanks!
CjHuber · 1h ago
I suppose it means LLM gibberish
EDIT: orbital decay explained it pretty well in this thread
fl1pper · 1h ago
neuralese is a term first used in neuroscience to describe the internal coding or communication system within neural systems.
it originally referred to the idea that neural signals might form an intrinsic "language" representing aspects of the world, though these signals gain meaning only through interpretation in context.
in artificial intelligence, the term now has a more concrete role, referring to the learned latent communication protocols used by multi-agent systems.
orbital-decay · 5h ago
>what you can't see from the map is many of the chains start in English but slowly descend into Neuralese
That's just natural reward hacking when you have no training signal or constraints for readability. IIRC R1 Zero is like that too; they retrained it with a bit of SFT to keep it readable and called the result R1. Hallucinating training examples if you break the format or prompt it with nothing is also pretty standard behavior.
ma2rten · 47m ago
Presumably the model is trained in post-training to produce a response to a prompt, but not to reproduce the prompt itself. So if you prompt it with an empty prompt it's going to be out of distribution.
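Easy to try with an open base model; a minimal sketch (using GPT-2 via Hugging Face transformers as an arbitrary example, any base causal LM will do):

```python
# Minimal sketch: sample from a base causal LM with nothing but the BOS token,
# so the continuation reflects the model's unconditional training distribution.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = torch.tensor([[tokenizer.bos_token_id]])   # the "empty" prompt
output = model.generate(
    input_ids,
    do_sample=True,
    max_new_tokens=60,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

A base model just continues from nothing and tends to surface text typical of its training data; a chat-tuned model given the same input is outside the prompt/response format it was post-trained on, which is the out-of-distribution point above.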
puttycat · 2h ago
> OpenAI has figured out RL. the models no longer speak english
What does this mean?
orbital-decay · 1h ago
The model learns to reason on its own. If you only reward correct results but not readable reasoning, it will find its own way to reason, and that way is not necessarily readable by a human. The chain may look like English, but the meaning of those words might be completely different (or even the opposite) for the model. Or it might look like a mix of languages, or just gibberish - to you, but not to the model. Many models write one thing in the reasoning chain and something completely different in the reply.
That's the nature of reinforcement learning and of any evolutionary process. It's also why the chain of thought in reasoning models is much less useful for debugging than it seems, even if the chain was guided by the reward model or by finetuning.
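To illustrate the incentive (a hedged sketch, not anyone's actual training code): if the reward only looks at the final answer, the chain is free to drift wherever optimization takes it.

```python
# Illustrative outcome-only reward: the chain of thought is never scored,
# so nothing in training pushes it toward readable English.
def outcome_only_reward(chain_of_thought: str, final_answer: str,
                        reference_answer: str) -> float:
    # chain_of_thought is deliberately ignored here.
    return 1.0 if final_answer.strip() == reference_answer.strip() else 0.0
```

Keeping the chain legible means adding some explicit pressure for it, e.g. a readability term in the reward or the extra SFT pass mentioned above for R1.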
Hard_Space · 2h ago
Interesting. This happens in Colossus: The Forbin Project (1970), where the rogue AI escapes the semantic drudgery of English and invents its own compressed language with which to talk to its Russian counterpart.
flabber · 11h ago
I don't know how to get an unwalled version. What's the best way to do that these days? xcancel seems unavailable.
Install the libredirect extension (https://github.com/libredirect/browser_extension/) and select a few working instances. Then you can use its programmable shortcut keys to cycle between instances if one ever goes down.
revskill · 5h ago
What does that mean?
pinoy420 · 2h ago
5 seems to do a better job with copyrighted content. I got it to spit out the entirety of ep IV (but you have to redact the character names).
https://i.ibb.co/Zz2VgY4C/Gx2-Vd6-DW4-AAogtn.jpg
For example, Libgen is out of commission, and the substitutes are hell to use.
Summary of what's up and not up:
https://open-slum.org/
Feeding an empty prompt to a model can be quite revealing about what data it was trained on.
What is Neuralese? I tried searching for a definition, but it just turns up a bunch of LessWrong and Medium articles that don't explain anything.
Is it a technical term?
https://en.wiktionary.org/wiki/mentalese