Ask HN: Can anybody clarify why OpenAI reasoning now shows non-English thoughts?
21 points | johnnyApplePRNG | 37 comments | 6/12/2025, 11:33:25 PM
People have noticed for a while now that Google's Bard/Gemini often inserts random Hindi/Bengali words. [0]
I just caught this in an o3-pro thought process: "and customizing for low difficulty. কাজ করছে!"
That last set of chars is apparently Bengali for "working!".
I just find it curious that similar "errors" are appearing across multiple different models... what is it about the training method or reasoning process that lets these other languages creep in? Does anyone know?
[0] https://www.reddit.com/r/Bard/comments/18zk2tb/bard_speaking_random_languages/
[1]: By this I mean "whatever it is they do that can be thought of as sorta kinda roughly analogous to what we generally call thinking." I'm not interested in getting into a debate (here) about the exact nature of thinking and whether or not it's "correct" to refer to LLMs as "thinking". It's a colloquialism that I find useful in this context, nothing more.
[2]: https://arxiv.org/pdf/2501.12948
In other circumstances they might take a different decoding path through other character sets, if the output probabilities justify it.
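As a toy illustration of that point (the probabilities below are made up, not taken from any real model): once a non-Latin token happens to carry the most probability mass at some step, a greedy or sampled decoder will simply emit it, regardless of which script came before.

```python
import random

# Toy next-token distribution (made-up numbers, not from any real model).
next_token_probs = {
    "working": 0.30,   # English candidate
    "done": 0.10,
    "কাজ": 0.35,       # Bengali candidate meaning "work" (hypothetical value)
    "काम": 0.15,       # Hindi candidate meaning "work"
    "<eos>": 0.10,
}

# Greedy decoding: take the argmax, regardless of which script it is in.
greedy = max(next_token_probs, key=next_token_probs.get)
print("greedy pick:", greedy)  # -> কাজ

# Sampling: non-English tokens get emitted whenever the draw lands on them.
tokens, probs = zip(*next_token_probs.items())
sampled = random.choices(tokens, weights=probs, k=1)[0]
print("sampled pick:", sampled)
```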
My phrases switch very easily to the language I learned them in.
Computer terms are almost always English.
A lot of the idioms I learned in my adult life are going to stay English, even if a Turkish equivalent exists and I learned about it later.
I find that it is much easier for me to translate between English (I'm not a native speaker) and any of the languages I am bilingual in than to translate directly between those languages. It is very hard for me to listen to one and speak the other.
To my French ear they sounded like they were sentencing me to terrible things (and they were always surprised that they sounded like this :)), up until the random "router" or "framework" which was the core of the fight.
I love listening to languages I do not understand (a great source is Radio Green) and trying to work out from the words what they are talking about.
Another one is one of my closest friends, a German, who speaks very soft English. That lasted until he described to me how to drive somewhere (in the pre-GPS era), and the names he was using were like lashes.
Speaking various languages is a blessing.
I assumed it knew I speak Spanish from other conversations, my Google profile, geolocation, etc. Maybe my English has enough hints that it was learned by a native Spanish speaker?
Perhaps it's more common in the parts of the world where Bengali and English are both widely spoken?
Why so much Bengali/Hindi then, and why not other languages?
For example, the DeepSeek team explicitly reported this behavior in their R1-Zero work, noting that reasoning emerges naturally from pure RL without supervised data, but brings some "language mixing" along. Interestingly, they found that adding a language-consistency reward during RL (plus a small cold-start SFT step) improved readability, though it came with a slight performance trade-off [1].
My guess is OpenAI has typically used a smaller summarizer model to sanitize reasoning outputs before display (they mentioned summarization/filtering briefly at Dev Day), but perhaps lately they’ve started relaxing that step, causing more multilingual slips to leak through. It’d be great to get clarity from them directly on whether this is intentional experimentation or just a side-effect.
[1] DeepSeek-R1 paper, which discusses the poor readability and language mixing in R1-Zero's raw reasoning: https://arxiv.org/abs/2501.12948
[2] OpenAI, "Detecting misbehavior in frontier reasoning models", which explains the use of a separate CoT "summarizer or sanitizer" before showing traces to end users: https://openai.com/index/chain-of-thought-monitoring/
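For concreteness, here is a minimal sketch of what such a display-side summarize/sanitize step could look like, assuming a separate smaller model does the rewriting. The prompt wording, model choice, and function name below are my own illustration, not OpenAI's actual pipeline:

```python
# Hypothetical sketch of a display-side "summarize/sanitize" step for raw CoT.
# Assumptions: a smaller model rewrites the trace; the prompt and model name
# are stand-ins, not anything OpenAI has documented.
from openai import OpenAI

client = OpenAI()

def summarize_cot_for_display(raw_cot: str) -> str:
    """Rewrite a raw, possibly language-mixed chain of thought into a short
    English summary before it is shown to the user."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # stand-in for "a smaller summarizer model"
        messages=[
            {"role": "system",
             "content": "Summarize the following reasoning trace in plain "
                        "English. Translate any non-English fragments and "
                        "drop anything that should not be displayed."},
            {"role": "user", "content": raw_cot},
        ],
    )
    return response.choices[0].message.content

# Example: the Bengali fragment from the original post would come back as
# something like "... and customizing for low difficulty. It's working!"
```

If a step like this is relaxed or removed, raw multilingual thinking tokens would show up verbatim, which matches what the original post describes.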
The DeepSeek-R1 paper has a section on this, where they 'punish' the model if it thinks in a different language to make the thinking tokens more readable. Probably Anthropic does this too.
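For concreteness, a minimal sketch of that kind of penalty: the paper describes the language-consistency reward as the proportion of target-language words in the CoT, but the script heuristic, tokenization, and weighting below are my own assumptions, not DeepSeek's implementation.

```python
# Sketch of a language-consistency reward in the spirit of DeepSeek-R1.
# Assumption: "consistent" means Latin-script words; real implementations
# would use proper language ID and the model's tokenizer.
import re

LATIN = re.compile(r"^[A-Za-z0-9'\-]+$")

def language_consistency_reward(cot_text: str) -> float:
    """Fraction of whitespace-separated words written in the target script.
    1.0 = fully consistent, 0.0 = fully mixed."""
    words = cot_text.split()
    if not words:
        return 0.0
    consistent = sum(1 for w in words if LATIN.match(w.strip(".,!?()\"")))
    return consistent / len(words)

def total_reward(task_reward: float, cot_text: str, lam: float = 0.1) -> float:
    # Mixing the two terms is exactly the trade-off discussed in these
    # comments: weight given to readability is weight taken from correctness.
    return task_reward + lam * language_consistency_reward(cot_text)

print(language_consistency_reward("and customizing for low difficulty. কাজ করছে!"))
# -> roughly 0.71 (5 of the 7 words are in Latin script)
```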
One, during RL the model is no longer being trained to output likely tokens, or tokens likely to satisfy pairwise preferences, so it simply doesn't care which language it thinks in. You have to explicitly punish it for language switching, which dilutes the reasoning reward.
Two, I believe there has been some research showing that models represent similar ideas in multiple languages in overlapping regions of their internal feature space; sparse autoencoders have shown this. So if the translated text makes sense, I think this is why. If not, I have no idea.
Most people can only encode/decode a single language but an LLM can move between them fluidly.
(Inspired by movies and TV shows, when characters switch from English to a different language, such as French or Mandarin, to better express something. Maybe there's a compound word in German for that.)
The main suspicion is that it's more compact?
One could even say that judging someone's level of worldly understanding by how many languages they speak shows a fairly limited world view.
Is it linear (25% more understanding for the fifth language) or asymptotic? Does it increase equally across all domains (geology, poetry, ethics) or asymmetrically?
Seriously, explain it to me?
We are intentionally undoing one of the things that makes computers useful.