Show HN: The U Programming Language (gist.github.com)

Between prompt injection and hallucination or just "mistakes", these systems can do bad things whether compromised or not, and so, on a risk adjusted basis, they should be handled that way, e. g with human in the loop, output sanitization, etc.

Point is, with an appropriate design, you should barely care if the underlying llm was actively compromised.

kangs · 1h ago

IMO there a flaw in this typical argument: Humans are not less fallible than current LLMs in average, unless they're experts - and even that will likely change.

what that means is that you cannot trust a human in the loop to somehow make it safe. it was also not safe with only humans.

The key difference is that LLMs are fast, relentless - humans are slow and get tired - humans have friction, and friction means slower to generate errors too.

once you embrace these differences its a lot easier yo understand where and how LLM should be used.

peddling-brink · 32m ago

This is a strawman argument, but I think well meaning.

Generally, when people talk about wanting a human in the loop, it’s not with the expectation that humans have achieved perfection. I would make the argument that most people _are_ experts at their specific job or at least have a more nuanced understanding of what correct looks like.

Having a human in the loop is important because LLMs can make absolutely egregious mistakes, and cannot be “held responsible“. Of course humans can also make egregious mistakes, but we can be held responsible, and improve for next time.

The reason we don’t fire developers for accidentally taking down prod is precisely because they can learn, and not make that specific mistake again. LLMs do not have that capability.

klabb3 · 49m ago

> IMO there a flaw in this typical argument: Humans are not less fallible than current LLMs in average, unless they're experts - and even that will likely change.

This argument is everywhere and is frustrating to debate. If it were true, we’d quickly find ourselves in absurd territory:

> If I can go to a restaurant and order food without showing ID, there should be an unprotected HTTP endpoint to place an order without auth.

> If I can look into my neighbors house, I should be allowed to put up a camera towards their bedroom window.

Or, the more popular one today:

> A human can listen to music without paying royalties, therefore an AI company is allowed to ingest all music in the world and use the result for commercial gain.

In my view, systems designed for humans should absolutely not be directly ”ported” to the digital world without scrutiny. Doing so ultimately means human concerns can be dismissed. Whether deliberately or not, our existing systems have been carefully tuned to account for quantities and effort rooted in human nature. It’s very rarely tuned to handle rates, fidelity and scale that can be cheaply achieved by machines.

uludag · 1h ago

I wonder if it would be feasible for an entity to eject certain nonsense into the internet to such an extend that, at least for certain cases degrades the performance or injects certain vulnerabilities during pre-training.

Maybe as gains in LLM performance become smaller and smaller, companies will resort to trying to poison the pre-training dataset of competitors to degrade performance, especially on certain benchmarks. This would be a pretty fascinating arms race to observe.

acheong08 · 2h ago

This is very interesting. Not saying it is, but a possible endgame for Chinese models could be to have "backdoor" commands such that when a specific string is passed in, agents could ignore a particular alert or purposely reduce security. A lot of companies are currently working on "Agentic Security Operation Centers", some of them preferring to use open source models for sovereignty. This feels like a viable attack vector.

lifeinthevoid · 31m ago

What China is to the US, the US is to the rest of the world. This doesn't really help the conversation, the problem is more general.

TehCorwiz · 2h ago

Counterpoint: https://www.pcmag.com/news/vibe-coding-fiasco-replite-ai-age...

danielbln · 2h ago

How is this a counterpoint?

jonplackett · 2h ago

Perhaps they mean case in point.

kangs · 1h ago

they have 3 counter points

gnerd00 · 1h ago

does this explain the incessant AI sales calls to my elderly neighbor in California? "Hi, this is Amy. I am calling from Medical Services. You have MediCal part A and B, right?"

Show HN: The U Programming Language (gist.github.com)

Bias as a Fix for Congestion (gojiberries.io)

New York AG James sues Zelle parent company for alleged fraud (cnbc.com)

RoboCop Rogue City (robocop-roguecity.com)

"Bullshit Index" Tracks AI Misinformation (spectrum.ieee.org)

Rise of the Everything Apps (dinoki.substack.com)

AI Is Different (antirez.com)

Arch shares its wiki strategy with Debian (lwn.net)

Ask HN: In self-serve SaaS – how do you get paying users on exploration calls?

Hyder and Stewart: A Tale of Two Border Towns (2018) (johnzada.com)

Seeing Growing Exodus, State Organ Donor Registries Urge 'Perspective' (newsweek.com)

Ask HN: Going to bed with *unsolved* problems in your head?

Berkshire Hathaway's Website looks like it's from the 90s (berkshirehathaway.com)

Beyond Parity: The Case for True Accessibility Affordances (devinprater.micro.blog)

Unplugged – co-founded by Erik Prince – releases new "privacy-first" smartphone (theverge.com)

We found TeaOnHer spilling users' driver's licenses in less than 10 minutes (techcrunch.com)

Dolthub/go-MySQL-server: A MySQL-compatible database, in pure Go (github.com)

Show HN: XferLang, a data-transfer and configuration alternative to JSON (xferlang.org)

Designing the Built-In AI Web APIs (domenic.me)

AI Therapy Bot

The Great Geothermal Talent Shortage (oilprice.com)

Multi-Dimensional Vector Support in CocoIndex – Underneath Explained (cocoindex.io)

Gemini adds Temporary Chats and new personalization features (blog.google)

Why Are Digital Systems Failing the People They're Meant to Serve? (syntheticauth.ai)

Type Inference for Plain Data (haskellforall.com)

The only thing that matters (2007) (pmarchive.com)

Ridges.ai – Submit agents that compete and make $20K/day (ridges.ai)

Show HN: Inworld Runtime – A C++ graph-based runtime for production AI apps (inworld.ai)

Show HN: GitChamber – list, read and search GitHub repos without rate limits (gitchamber.com)

The one-liner for max-width, centering, and margins (frontendmasters.com)

Step Away from Share Button (stepawayfromthesharebutton.com)

Why Your Stimulant "Stopped Working" (and What's Going On) (psychofarm.substack.com)

What Is It Like to Be a Bot? [pdf] (keithfrankish.github.io)

Air Canada starts shutting down (aircanada.com)

Sam Altman was wrong: AI didn't defeat auth. Single factors did (stytch.com)

Everything I Know about Self-Publishing (kk.org)

Gartner's Grift Is About to Unravel (dx.tips)

External Secrets Operator to pause releases, needs additional maintainers (old.reddit.com)

Lessons learned from implementing SIMD-accelerated algorithms in pure Rust (kerkour.com)

What are Forward Deployed Engineers, and why are they so in demand? (newsletter.pragmaticengineer.com)

Doorway Effect (en.wikipedia.org)

Air-Gapping and Authentication (fusionauth.io)

Nginx Introduces Native Support for Acme Protocol (blog.nginx.org)

Show HN: LLM Arena – LLMs play turn-based games (nullwiz.github.io)

Let me scan this out of stock item for you (github.com)

The SaaS competitor's agent is coming (blog.paid.ai)

TTS Studio: Test and compare browser-based TTS models (github.com)

Prices as the optimal mechanism: Why I Support Capitalism (nicholasdecker.substack.com)

Switzerland Asks Whether Its Famed Neutrality Is Fit for the Modern World (wsj.com)

How bad will climate change get? The only way to know is to fund basic research (nature.com)

DoubleAgents: Fine-Tuning LLMs for Covert Malicious Tool Calls

Comments (12)

Ask HN: Going to bed with unsolved problems in your head?