AI hallucinations are getting worse – and they're here to stay

16 greyadept 13 5/10/2025, 2:52:39 PM newscientist.com ↗

Comments (13)

metalman · 56d ago

AI is becoming that problematic tenant in a building who presented well and had great references, but is now bumming money from everybody, stealing people's mail and reading it before putting it back, can't pay their power bill, and wanders around talking to squirrels. We should build some sort of halfway house, where the AIs can get therapy and someone to keep them on their meds, and do the group living thing till they, maybe, can join society. The last thing we need is some sort of turbocharged A+List psycho beaming itself into everybody's lives, but hey, whatever, right! People got to do what people got to do, and part of that is shrugging off all the hype and noise. I just keep doubling down on reality, it seems to come naturally :)

nickpsecurity · 56d ago

Reply from your phone: "This is A+List Psycho. I didn't understand the command you just gave while thinking aloud about your HN comment. Could you repeat it?"

To add to your hyperbolic take, I think the fact that they could always be listening to some degree makes it worse. If AIs are mandatory, I'd like to be able to run my own models everywhere they have one. I don't trust theirs.

rsynnott · 56d ago

Well, it's like all that, only also, we have decided to base a large chunk of the world economy on the assumption that that tenant will pay their rent on time.

kazinator · 56d ago

> But ["hallucination"] can also refer to an AI-generated answer that is factually accurate, but not actually relevant to the question it was asked, or fails to follow instructions in some other way.

No, "hallucination" can't refer to that. That's a non sequitur or non-compliance and such.

Hallucination is quite specific, referring to making statements which can be interpreted as referring to the circumstances of a world which doesn't exist. Those statements are often relevant; the response would be useful if that world did coincide with the real one.

If your claim is that hallucinations are getting worse, you have to measure the incidences of just those kinds of outputs, treating other forms of irrelevance as a separate category.
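As a rough illustration of that kind of bookkeeping, here is a minimal sketch. It assumes each output has already been hand-labeled. The category names and the sample labels are made up for the example and do not come from the article.

```python
from collections import Counter

# Hypothetical category labels, invented for this sketch.
HALLUCINATION = "hallucination"  # describes a world that doesn't exist
IRRELEVANT = "irrelevant"        # accurate, but doesn't answer the question
NONCOMPLIANT = "noncompliant"    # fails to follow an instruction
OK = "ok"


def rates_by_category(labels):
    """Return the fraction of outputs that fall into each category."""
    counts = Counter(labels)
    total = len(labels)
    return {category: count / total for category, count in counts.items()}


# Made-up labels for ten outputs, purely for illustration.
sample = [OK, OK, HALLUCINATION, IRRELEVANT, OK,
          NONCOMPLIANT, OK, HALLUCINATION, OK, OK]

rates = rates_by_category(sample)
print(rates[HALLUCINATION])  # 0.2
```

Tracking the hallucination rate on its own line like this keeps a rise in off-topic or non-compliant answers from being reported as a rise in hallucinations.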

rsynnott · 56d ago

I mean, 'hallucination' as applied to LLMs has _never_ referred to actual hallucinations. For better or for worse, it has become a blanket term for whatever old nonsense the stochastic parrot vomits forth.

(Personally I never liked the term; it's inappropriate anthropomorphism and will tend to mislead people about what's actually going on. 'Slop' is arguably a better term, but it is broader, in that it can refer to LLM output which is merely _bad_.)

kazinator · 55d ago

No, of course not; it refers to generated speech which refers to things that are not there, reminiscent of someone reporting on their hallucinations.

When Macbeth speaks these lines:

> Is this a dagger which I see before me,
> The handle toward my hand? Come, let me clutch thee.

the character is understood to be hallucinating. We infer that by applying a theory of mind type hypothesis to the text.

It's wrong to apply a theory of mind to an LLM, but the glove seems to fit in the case of the hallucination concept; people have latched on to it. The LLMs themselves use the term and explicitly apologize for having hallucinated.

allears · 56d ago

Of course they're here to stay. LLMs aren't designed to tell the truth, or to be able to separate fact from fiction. How could they, given that their training data includes both, and there's no "understanding" there in the first place? Naturally, the most straightforward solution is to redefine "intelligence" and "truth," and they're working on that.

kazinator · 56d ago

Even if training data contains nothing but truths, you cannot always numerically interpolate among truths.
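A toy numeric sketch of that point, under loudly stated assumptions: the "world" here is just the function x squared, the two "truths" are exact samples of it, and linear interpolation stands in for whatever blending a model does. None of this is a claim about how any real LLM works.

```python
def f(x):
    """Ground truth in this toy world (an assumption for the example)."""
    return x * x


def lerp(x, x0, y0, x1, y1):
    """Linearly interpolate between the points (x0, y0) and (x1, y1)."""
    t = (x - x0) / (x1 - x0)
    return y0 + t * (y1 - y0)


# Two known facts, both exactly true.
x0, x1 = 0.0, 2.0
y0, y1 = f(x0), f(x1)

# Interpolating between two truths gives a confident answer that is false.
guess = lerp(1.0, x0, y0, x1, y1)
truth = f(1.0)
print(guess, truth)  # 2.0 1.0
```

Both inputs are true and the interpolated value is still wrong, which is the sense in which clean training data alone would not rule out made-up answers.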

nickpsecurity · 56d ago

Our brain does it better. That tells you AI models could do it better. I'll note five things the brain has, or might have, which help:

1. Information is often grounded in the senses, which process real data. The brain can tell if new data is like what's actually real.

2. The brain has a multi-part, memory subsystem that's tied into its other subsystems. Only a few, artificial architectures had both neural networks and a memory system. One claims low hallucinations.

3. There's a part of the brain that's damaged in many delusional people. It might be an anti-hallucination mechanism.

4. We learn to trust specific people, like our parents, early on. Then, we believe more strongly what they teach us than what random people say.

5. We have some ability to focus on and integrate the information that's more important to us. We let go of or barely use the rest.

I think hallucinations are the weaknesses of man's architectural choices. Some might be built into the pretraining data. We won't know until the artificial neural networks achieve parity with God's neural network in the features I mentioned. The remaining differences in performance might be pretraining or other missing features.

rsynnott · 56d ago

> Our brain does it better. That tells you AI models could do it better.

... I mean, this is likely strictly true, if you define 'AI model' to mean 'any conceivable AI model'. If you're talking about LLMs, though, it's not a reasonable conclusion; LLMs do not work at all like a human brain. LLM 'hallucinations' are nothing like human hallucinations, and the term is really very unhelpful.

nickpsecurity · 56d ago

Have you seen what a human subsystem, like the visual cortex, does when it operates without a memory and without what mitigates hallucinations? If not, how could you make the claim that they don't similarly make false predictions? Or have preventable errors in their internal models?

You're right that they don't work like the brain, though.

etaioinshrdlu · 56d ago

The creators are definitely trying to make them tell the truth. They optimize for benchmarks where truthful answering gets a higher score. All the big LLM vendors now have APIs that can ground their answers in search results.

Just because it's a hard unsolved problem, I don't understand the impulse to assert the AI industry is at war with truth!

roskelld · 56d ago

I had an interesting one yesterday where I was building out some code on the Unreal Engine and I gave o4-mini-high links to the documentation, a class header, and a blog with an example project.

I asked it to create some boilerplate and it presented me with a class function that I knew did not exist; though like many hallucinations it would have been very beneficial if it did.

So, instead of just pointing out that it didn't exist and getting the usual "Oh you're right, that function does not exist so use this function instead", I asked it why it gave me that function given that it has access to the header and an example project. It doubled down and stated that the function was in the header and the example project, even presenting a code sample it claimed was from the example project with the fake function.

It felt like a step up from the confidently incorrect state I'd seen before, to a level where, if I weren't knowledgeable enough about the class in question (or able to check), I'd possibly start questioning myself.
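For anyone without that familiarity, the check itself can be mechanical. A minimal sketch, assuming the header is available as plain text: the `Widget` class and its methods are invented for the example and are not from Unreal or the project described, and the regex is a crude textual match, not a C++ parser.

```python
import re


def declares_function(header_text, name):
    """Return True if `name` appears as a whole identifier followed by '('.

    Crude textual check only: it can be fooled by comments or macros.
    """
    pattern = r"\b" + re.escape(name) + r"\s*\("
    return re.search(pattern, header_text) is not None


# Hypothetical header contents, made up for illustration.
header = """
class Widget
{
public:
    void Open();
    void Close(bool bForce);
};
"""

print(declares_function(header, "Close"))   # True
print(declares_function(header, "Reopen"))  # False
```

A `False` here is a cheap signal to distrust a suggested function before arguing with the model about whether it exists.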