Detecting hallucinations in LLM function calling with entropy

2 points by honorable_coder · 8/17/2025, 12:46:34 PM · archgw.com

Comments (2)

honorable_coder · 7h ago
We use this technique heavily for function-calling scenarios in https://github.com/katanemo/archgw, which uses a 3B function-calling model to neatly map a user's ask to one of many tools. The model doesn't need to write an essay; it just needs to pick the right function immediately, and the response can be synthesized by one of many configured upstream LLMs.
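
To make the entropy check concrete, here's a minimal sketch. It assumes a local model served behind an OpenAI-compatible endpoint that returns token logprobs (e.g., vLLM or llama.cpp); the endpoint URL, model name, and threshold are placeholders, not archgw's actual configuration:

```python
import math
from openai import OpenAI

# Hypothetical setup: a local 3B function-calling model behind an
# OpenAI-compatible endpoint. URL and model name are illustrative.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")

ENTROPY_THRESHOLD = 1.0  # nats; tune on a validation set


def mean_token_entropy(logprob_content) -> float:
    """Approximate per-token Shannon entropy from the top-k alternatives
    the server returns, averaged over all generated tokens. Since only the
    top k probabilities are visible, this is an approximation."""
    entropies = []
    for token_info in logprob_content:
        probs = [math.exp(alt.logprob) for alt in token_info.top_logprobs]
        entropies.append(-sum(p * math.log(p) for p in probs if p > 0))
    return sum(entropies) / max(len(entropies), 1)


resp = client.chat.completions.create(
    model="function-calling-3b",  # placeholder model name
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    logprobs=True,
    top_logprobs=5,   # top-k alternatives per generated token
    max_tokens=50,    # a complete function call fits in under 50 tokens
)

choice = resp.choices[0]
entropy = mean_token_entropy(choice.logprobs.content)

if entropy > ENTROPY_THRESHOLD:
    # High entropy: the model was unsure which tool/arguments to emit.
    # Treat the call as a potential hallucination: ask the user to
    # clarify, or route to a larger model, instead of executing the tool.
    print(f"low confidence (H={entropy:.2f} nats), deferring")
else:
    print(f"confident call (H={entropy:.2f} nats): {choice.message.content}")
```

The intuition: when the model is confidently right, probability mass concentrates on one token at each step and entropy stays low; when it's guessing between tools or hallucinating arguments, the distribution flattens and mean entropy rises.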

Why we do this: latency. A 3B-parameter model, especially when quantized, can deliver sub-100ms time-to-first-token and generate a complete function call in under 50 tokens. That makes the LLM "disappear" as a bottleneck, so the only real waiting time is the external tool or API being called, plus the time it takes to synthesize a human-readable response.
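
The end-to-end flow, again as an illustrative sketch rather than archgw's actual implementation (the endpoint, model names, and weather tool are all made up):

```python
import json
import time
from openai import OpenAI

# Two-stage pipeline: a small local router picks the function, the
# external API does the real work, and an upstream LLM writes the answer.
router = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")
upstream = OpenAI()  # stands in for "one of many configured upstream LLMs"

WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

question = "What's the weather in Paris?"

t0 = time.perf_counter()
# Stage 1: the small router emits a complete function call in <50 tokens.
call = router.chat.completions.create(
    model="function-calling-3b",  # placeholder model name
    messages=[{"role": "user", "content": question}],
    tools=[WEATHER_TOOL],
    max_tokens=50,
)
tool_call = call.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)
print(f"routing took {(time.perf_counter() - t0) * 1000:.0f} ms")

# Stage 2: the external tool/API is now the dominant latency.
weather = {"city": args["city"], "temp_c": 21}  # stand-in for a real API

# Stage 3: an upstream LLM synthesizes the human-readable response.
answer = upstream.chat.completions.create(
    model="gpt-4o-mini",  # whichever upstream model is configured
    messages=[
        {"role": "user", "content": question},
        {"role": "assistant", "content": None,
         "tool_calls": [tool_call.model_dump()]},
        {"role": "tool", "tool_call_id": tool_call.id,
         "content": json.dumps(weather)},
    ],
)
print(answer.choices[0].message.content)
```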

sunscream89 · 6h ago
Your approach is cool, though it's a bit cringe to call it entropy. You've mitigated some response latency in exchange for an opportunity to refine the decision support upstream. It's a nice strategy!