Semantic drift: when AI gets the facts right but loses the meaning

realitydrift · 1 point · 8/26/2025, 1:05:37 PM
Most LLM benchmarks measure accuracy and coherence, but not whether the intended meaning survives. I’ve been calling this gap fidelity: the preservation of purpose and nuance. Has anyone else seen drift-like effects in recursive generations or eval setups?
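For concreteness, here’s a rough sketch of the kind of setup I mean, not a real benchmark. The `rewrite` callable is a stand-in for whatever model call you’re testing (e.g. “rewrite this more concisely”), and sentence-transformers embedding similarity is only a crude proxy for meaning preservation:

```python
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def recursive_drift(text: str, rewrite, generations: int = 5) -> list[float]:
    """Feed each rewrite back in and score every generation against the
    ORIGINAL text (not the previous generation), so cumulative drift shows up."""
    original_vec = embedder.encode(text, convert_to_tensor=True)
    scores, current = [], text
    for _ in range(generations):
        current = rewrite(current)  # your model call goes here
        current_vec = embedder.encode(current, convert_to_tensor=True)
        scores.append(util.cos_sim(original_vec, current_vec).item())
    return scores  # a falling curve is drift that per-step accuracy checks never see
```

Each individual rewrite can look accurate and coherent while the curve still falls; that falling curve is the fidelity gap I’m asking about.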

Comments (8)

Mallowram · 4h ago
What is intended meaning when words are arbitrary and meaning is always relative to arbitrariness?
realitydrift · 4h ago
That’s a fair point. Words themselves are arbitrary symbols, but meaning isn’t only in the symbols. It’s in the intent behind them and the use they’re put to.

For example, if I say “the meeting is at 3pm” and a model rewrites it as “planning sessions are important,” the words are fine, the grammar is fine, but the purpose (to coordinate time) has been lost. That’s the gap I’m calling fidelity: whether the output still serves the same function, even if the surface form changes.
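
A toy version of that check for this one example (it only asks whether a concrete time expression survives the rewrite, so it’s an illustration of testing function rather than wording, not a general metric):

```python
import re

TIME = re.compile(r"\b\d{1,2}(?::\d{2})?\s*(?:am|pm)\b", re.IGNORECASE)

def preserves_time(source: str, output: str) -> bool:
    """Does the rewrite still carry the time the source exists to coordinate?"""
    times = lambda s: {m.group(0).lower().replace(" ", "") for m in TIME.finditer(s)}
    return times(source) <= times(output)

preserves_time("The meeting is at 3pm", "See you at 3 PM sharp")            # True
preserves_time("The meeting is at 3pm", "Planning sessions are important")  # False
```

Both outputs are fluent; only the first still does the job.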

Mallowram · 4h ago
There is no intent in words in and of themselves. Intent always comes from something specific tied to neural syntax, which is lost in words. That's an illusion. There is intensionality, which is different, and which is what you're actually talking about. Intensionality is vague; it isn't meaningful without context. The problem with automating words is that AI can't solve the conduit metaphor, the idea that words alone encode meaning. They can't. This is the Achilles' heel of AI.
realitydrift · 4h ago
I agree. Words don’t carry intent by themselves. Intention is always embedded in use. That’s why I frame fidelity as about whether a system’s continuation still serves the same human purpose. The “conduit metaphor” you mention is exactly the trap: treating words as if they inherently encode meaning. Models fall into this because they optimize surface probabilities rather than checking whether the function of the exchange was preserved.
Mallowram · 3h ago
In 1973 Basil Bernstein studied how UK scores in math defied class boundaries while scores in reading comprehension and essays stayed tied to them. With Halliday he developed a theory that language embeds far more than we can easily decipher: dominance, status, control, land-centering, gender/mate-selection, and much more. They came to see language less as what Shannon and classical linguists took to be "communication" and more as a social system of primate-simian signaling. My guess is that LLMs are really unresolvable revelations of these hidden nuances. In a way, LLMs demand a specific language, one which doesn't exist, in order to function.
realitydrift · 3h ago
That’s a really interesting reference. Bernstein and Halliday were basically pointing out that language is never just propositional, it’s always smuggling social structure with it. That’s exactly why drift matters: when an LLM compresses or rewrites, it isn’t just shifting words, it’s rebalancing those embedded cues of power, context, and purpose. Humans keep the “extra baggage” because it carries meaning beyond the literal. Models optimize it away. That gap between statistical surface and lived function is what I’ve been calling semantic drift.
docsorb · 4h ago
You're touching on a very nuanced point: identifying the true "intent" behind the words. Do you think these models should be trained differently to correctly map the potential intent versus the literal meaning?

Like your example: "the meeting is at 3pm" followed by _we've got enough time_ intends one thing, while "the meeting is at 3pm" followed by _where the hell are you?_ intends something else. It's not at all obvious how to recover that intent without a lot of context (time, environment, emotion, etc.).

realitydrift · 4h ago
Exactly. That’s the hard part. Meaning is often carried less by the literal words and more by context (time, environment, emotion, shared knowledge). My point with fidelity is that current benchmarks don’t check whether outputs preserve that function in context. An AI can echo surface words but miss the intended role: coordination, reassurance, accountability. And that’s where drift shows up.
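
To make the gap concrete, here’s a toy contrast between a surface-overlap score (a crude stand-in for BLEU/ROUGE-style metrics, not the real thing) and the time-preservation check from my example upthread, condensed so it runs on its own:

```python
import re

def token_overlap(source: str, output: str) -> float:
    """Crude surface metric: fraction of source tokens that reappear in the output."""
    src, out = set(source.lower().split()), set(output.lower().split())
    return len(src & out) / len(src)

def keeps_the_time(source: str, output: str) -> bool:
    """Function check, condensed from upthread: does a concrete time survive?"""
    times = lambda s: set(re.findall(r"\d{1,2}(?::\d{2})?\s*(?:am|pm)", s.lower()))
    return times(source) <= times(output)

source = "Can you confirm the meeting is at 3pm?"
drifted = "Can you confirm the meeting is important?"

print(token_overlap(source, drifted))   # 0.75 -- a surface metric calls this close
print(keeps_the_time(source, drifted))  # False -- the coordinating function is gone
```

The drifted line keeps most of the tokens, so a surface metric calls it close; the one thing the message existed to do, pinning down a time, is gone.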