A.I. Is Getting More Powerful, but Its Hallucinations Are Getting Worse

22 points by dewarrn1 | 5/5/2025, 7:33:53 PM | nytimes.com

Comments (10)

silisili · 29m ago
I was playing with a toy program trying to hyperoptimize it and asked for suggestions. ChatGPT confidently gave me a few, with reasoning for each.

Great. Implement it, benchmark, slower. In some cases much slower. I tell ChatGPT it's slower, and it confidently tells me of course it's slower, here's why.

The duality of LLMs, I guess.

thechao · 24m ago
Me: What is the tallest tree in Texas?

CGT: The tallest tree in Texas is a 44 foot tall tree in ...

Me: No it's not! The tallest tree is a pine in East Texas!

CGT: You're right! The tallest tree in Texas is probably a Loblolly Pine in East Texas; they grow to a height of 100–150', but some have been recorded to be 180' or more.

Me: That's not right! In 1890 a group of Californians moved to Houston and planted a Sequoia, it's been growing there since then, and is nearly 300 feet tall.

CGT: Yes, correct. In the late 19th century, many Sequoia sempervirens were planted in and around Houston.

...

I mean, come on; I already spew enough bullshit, I don't need an automated friend to help out!

datadrivenangel · 3h ago
This may be an issue with default settings:

"Modern LLMs now use a default temperature of 1.0, and I theorize that higher value is accentuating LLM hallucination issues where the text outputs are internally consistent but factually wrong." [0]

0 - https://minimaxir.com/2025/05/llm-use/
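
For illustration, here's a minimal sketch (not from the linked post; the logits are made up) of what the temperature setting changes, assuming the usual softmax-with-temperature sampling:

    # Temperature rescales the logits before softmax: softmax(logits / T).
    # At T = 1.0 the raw distribution is used, so more probability mass
    # stays on lower-ranked (and possibly wrong) tokens; lower T sharpens
    # the distribution toward the model's top choice.
    import numpy as np

    def next_token_distribution(logits, temperature):
        scaled = np.asarray(logits, dtype=float) / temperature
        exps = np.exp(scaled - scaled.max())  # subtract max for stability
        return exps / exps.sum()

    logits = [4.0, 3.2, 1.5, 0.3]  # hypothetical next-token logits
    for t in (1.0, 0.7, 0.2):
        print(t, np.round(next_token_distribution(logits, t), 3))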

dewarrn1 · 4h ago
So, regarding the "reasoning" models the article mentions: is it possible that their increased error rate vs. non-reasoning models is simply a function of the reasoning process introducing more tokens into context, and that because each such token may itself introduce wrong information, the risk of error compounds? Or, put another way, that generating more tokens at a fixed per-token error rate must, on average, produce more errors?
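
The compounding intuition is easy to make concrete. Under the (strong) simplifying assumption of an independent, fixed per-token error rate p, the chance of at least one error in n generated tokens is 1 - (1 - p)^n, which climbs quickly as reasoning traces add tokens:

    # Rough sketch with assumed numbers, not measured error rates.
    p = 0.001  # hypothetical per-token error probability
    for n in (100, 1_000, 10_000):
        print(n, round(1 - (1 - p) ** n, 3))
    # 100 -> 0.095, 1000 -> 0.632, 10000 -> ~1.0
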
ActorNightly · 2h ago
It's a symptom of asking the models to provide answers that are not exactly in the training set, so the internal interpolation that the models do probably runs into edge cases where statistically it goes down the wrong path.

bdangubic · 2h ago
"self-driving cars are getting more and more powerful but the number of deaths they are causing is rising exponentially" :)
dimal · 2h ago
I wish we called hallucinations what they really are: bullshit. LLMs don’t perceive, so they can’t hallucinate. When a person bullshits, they’re not hallucinating or lying, they’re simply unconcerned with truth. They’re more interested in telling a good, coherent narrative, even if it’s not true.

I think this need to bullshit is probably inherent in LLMs. It’s essentially what they are built to do: take a text input and transform it into a coherent text output. Truth is irrelevant. The surprising thing is that they can ever get the right answer at all, not that they bullshit so much.

kelseyfrog · 26m ago
In the same sense that astrology readings, tarot readings, runes, augury, and tea-leaf readings are bullshit - they have an oracular epistemology. Meaning comes from the querent suspending disbelief, forgetting for a moment that the I Ching is merely sticks.

It's why AI output is meaningless for everyone except the querent. No one cares about your horoscope. AI shares every salient feature with divination except the aesthetics. The lack of candles, robes, and incense - the pageantry of divination - means a LOT of people are unable to see it for what it is.

We live in a culture so deprived of meaning we accidentally invented digital tea readings and people are asking it if they should break up with their girlfriend.

elpocko · 46m ago
Or maybe we could stop anthropomorphizing tech and call the "hallucinations" what they really are: artifacts introduced by lossy compression.

No one calls the crap that shows up in JPEGs "hallucinations" or "bullshit"; it's a commonly accepted side effect of a compression algorithm that makes up shit that isn't there in the original image. Now we're doing the same lossy compression with language, and suddenly it's "hallucinations" and "bullshit" because it's so uncanny.
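
The analogy is easy to demonstrate. A rough sketch (assuming Pillow and numpy are available): heavily compress a synthetic image with a sharp edge and count how many pixel values the codec reconstructs that were never in the original.

    import io
    import numpy as np
    from PIL import Image

    # Synthetic grayscale image with a hard edge, the kind of detail
    # JPEG's lossy transform tends to mangle into ringing artifacts.
    arr = np.zeros((64, 64), dtype=np.uint8)
    arr[:, 32:] = 255
    original = Image.fromarray(arr)

    buf = io.BytesIO()
    original.save(buf, format="JPEG", quality=5)  # very lossy
    buf.seek(0)
    decoded = np.asarray(Image.open(buf))

    # Pixels whose decoded value differs from the source: "made up" detail.
    invented = int(np.count_nonzero(decoded != arr))
    print(f"{invented} of {arr.size} pixels differ from the original")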