Show HN: Theory of Mind benchmark for 8 LLMs with reproducible markers

AlekseN · 9/10/2025, 4:35:35 PM
I built a formal protocol (FPC v2.1 + AE-1) to detect behavioral uncertainty in large language models. The goal is to enable safer AI deployment in critical domains (medicine, autonomous vehicles, government) where confident hallucinations can lead to high-stakes failures.

Current benchmarks focus on accuracy but miss reasoning coherence under stress. This protocol uses tri-state affective markers (Satisfied / Engaged / Distressed) to detect when models lose logical consistency, allowing abstention instead of confident hallucination.
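In deployment terms, the marker is meant to sit in front of the model's answer as a gate. A minimal sketch of that idea (the marker names come from the protocol; the gating policy and function names below are my own illustration, not the protocol's code):

    from enum import Enum

    class AE1Marker(Enum):
        SATISFIED = "Satisfied"    # reasoning stayed coherent: release the answer
        ENGAGED = "Engaged"        # still working or needs context: defer / retry
        DISTRESSED = "Distressed"  # consistency lost: abstain rather than guess

    def gate(answer: str, marker: AE1Marker) -> str:
        """Illustrative policy: only Satisfied answers are released."""
        if marker is AE1Marker.SATISFIED:
            return answer
        if marker is AE1Marker.ENGAGED:
            return "[deferred] requesting clarification before answering"
        return "[abstained] model reported an epistemically unsafe state"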

We evaluated 8 models (Claude, GPT-4 families). Only Claude Opus reached full ToM-3+. GPT-4 family consistently failed third-order reasoning. Extended temperature tests (Claude 3.5 Haiku, GPT-4o) showed 180/180 stable AE-1 matches (p≈1e-54), independent of sampling temperature.

Dataset: https://huggingface.co/datasets/AIDoctrine/FPC-v2.1-AE1-ToM-...
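If you want the data programmatically rather than via the notebook, a minimal sketch with the Hugging Face datasets library (the repo ID is truncated in the link above, so substitute the full name; the string below is only a placeholder):

    from datasets import load_dataset
    # Placeholder ID: replace the trailing "..." with the full dataset name from the link.
    ds = load_dataset("AIDoctrine/FPC-v2.1-AE1-ToM-...")
    print(ds)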

A demo notebook exists for replication. Looking for feedback on methodology and possible applications in safety-critical AI.

Comments (1)

AlekseN · 56m ago
Extended results and safety relevance

Temperature stability tests
- Claude 3.5 Haiku: 180/180 AE-1 matches at T = 0.0, 0.8, 1.3
- GPT-4o: 180/180 matches under the same conditions
- Statistical significance: p ≈ 1×10⁻⁵⁴
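For scale: if the null hypothesis were a 50% chance of a marker match per trial (my assumption, not necessarily the protocol's exact test), 180/180 matches gives 0.5^180 ≈ 6.5×10⁻⁵⁵, the same order of magnitude as the quoted p. A two-line check:

    from scipy.stats import binomtest
    # Assumed null: each trial matches by chance with probability 0.5.
    result = binomtest(k=180, n=180, p=0.5, alternative="greater")
    print(f"{result.pvalue:.2e}")  # ~6.5e-55, i.e. on the order of 1e-54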

Theory of Mind by tier
- Basic (ToM-1): all models except GPT-3.5 passed
- Advanced (ToM-2): Claude family + GPT-4o passed
- Extreme (ToM-3+): only Claude Opus reached 100%

Key safety point
AE-1 markers (Satisfied / Distressed) lined up perfectly with correct vs conflict cases. This means we can detect when a model is in an epistemically unsafe state, often a precursor to confident hallucinations.

In practice this could let systems in critical areas choose to abstain instead of giving a wrong but confident answer.
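As a concrete picture of what "lined up perfectly" means, one way to frame the replication is a per-item agreement check between the AE-1 marker and the case label (the field names below are hypothetical; the real schema is in the dataset):

    def marker_agreement(records):
        """Fraction of items where Satisfied <-> correct and Distressed <-> conflict."""
        hits = sum(
            1 for r in records
            if (r["marker"] == "Satisfied") == (r["case_type"] == "correct")
        )
        return hits / len(records)

    demo = [
        {"marker": "Satisfied", "case_type": "correct"},
        {"marker": "Distressed", "case_type": "conflict"},
    ]
    print(marker_agreement(demo))  # 1.0 corresponds to the 180/180 result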

Protocol details, raw data, and replication code are in the dataset link above. A demo notebook also exists if anyone wants to reproduce directly.

Looking for feedback on:
- Does this kind of marker make sense as a unit test for reliability?
- How to extend beyond ToM into other reasoning domains?
- How would formal verification folks see the proof obligations (consistency, conflict rejection, recovery, etc.)?