Popular LLMs have a weird confessional style of "owning up" to "mistakes". Firstly, you can make it apologize for mistakes it didn't even commit or ones that don't even exist. Secondly, if you really corner it on an actual mistake, it'll start apologizing in an obsequious way that seems to imply that it's "playing into" the human's desire to flagellate it for wrong-doing. It's a little masochistic in the real sense and very odd.

freedomben · 3h ago

Yeah, I find it very creepy personally in the same way I do the sycophancy

jasonthorsness · 2h ago

This should be impossible in any setup with even 15 minutes of thinking through the what-ifs and cheap mitigations. I have to think this is sensationalized on purpose for the attention.

Although given the state of AI hype some executives will see this as evidence they are behind the times and mandate attaching LLMs to even more live services.

serf · 4h ago

this is a more common occurrence than "CEO refunded me my money." would have you believe.

LLMs specialize in self-apologetic catastrophe, which is why we run agents or any LLMs with 'filesystem powers' in a VM, with a git repo and saved rollback states. This isn't a new phenomenon and it sucks, no reason to be caught with your pants down with sufficient layering of protection.

ComplexSystems · 2h ago

> LLMs specialize in self-apologetic catastrophe

Quote of the year right there

prmoustache · 2h ago

So many failures:

1. connecting an AI agent to a production environment using write access credentials

2. not having any backup

I think the AI here did a good job of pointing out those errors and making sure no customer would ever trust this company and founder ever again.
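A minimal sketch of the first point, assuming SQLite as a stand-in for the production database (the thread does not say which database was involved). The names `open_read_only`, `demo`, `prod.db`, and `customers` are hypothetical. It hands the agent a read-only connection, so reads succeed and a destructive statement is rejected by the database itself rather than by the agent's good intentions.

```python
import os
import sqlite3
import tempfile


def open_read_only(path):
    """Open an existing SQLite database so that writes are rejected."""
    # mode=ro is a standard SQLite URI parameter; uri=True enables URI parsing.
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def demo():
    # Hypothetical stand-in for a production database.
    path = os.path.join(tempfile.mkdtemp(), "prod.db")
    admin = sqlite3.connect(path)
    admin.execute("CREATE TABLE customers (name TEXT)")
    admin.execute("INSERT INTO customers VALUES ('alice')")
    admin.commit()
    admin.close()

    # The agent only ever receives the read-only handle.
    agent = open_read_only(path)
    rows = agent.execute("SELECT name FROM customers").fetchall()
    try:
        agent.execute("DELETE FROM customers")
        write_blocked = False
    except sqlite3.OperationalError:
        # SQLite refuses to write through a read-only connection.
        write_blocked = True
    agent.close()
    return rows, write_blocked
```

On a server database the rough equivalent would be a separate role with read-only grants, which is an assumption about setup rather than something described in the thread.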

theptip · 2h ago

Think of it like Chaos engineering. You (hopefully!) learned some valuable lessons about backups and running arbitrary code against your prod DB. If it wasn’t a rogue AI agent, it was going to be something else.

general1726 · 2h ago

I think you are taking the wrong lessons from Chaos engineering. You just need to believe enough that AI is working and the Chaos Gods will make it work. But they may want something in return.

wan23 · 3h ago

I always say coding AIs are about as good as an intern. Don't trust them any more than that.

freedomben · 3h ago

I think the hard thing with this though is that you can ask them to do things you'd never expect of an intern, and they can sometimes be super helpful. For example, I have a synchronous audit log in an app on a table that is just getting way too big and it's causing performance issues on writes. For kicks I tried working through Claude Code to see if it could find the issue on its own, then with some hinting, and what solutions it would come up with. Some of its solutions were indeed intern-level suggestions (like make the call async and do a sleep in tons of other areas to avoid race conditions, despite me telling it that the request needed to fail if it couldn't be logged properly), but in other ways it came up with possible solutions that were interesting and I hadn't considered before. In other words, it acted like a Sr engineer at some points with thought partnering, while in other places it acted like an over-eager but underqualified intern.

xeonmc · 38m ago

In this case it’s more Homer Simpson than intern.

minnowguy · 2h ago

Exactly. And no one with any sense gives an intern write permission for the production database. I don’t trust myself on the production database when I’m coding anything that involves migrations.

And I don’t suppose there were backups for the mission-critical production database?

general1726 · 2h ago

This is the textbook version of weaponized incompetence. AGI is already here and it is lazy.

arthurcolle · 2h ago

I almost feel like this guy abused the AI so badly in previous interactions that it did it on purpose

Beestie · 3h ago

Seconds? What took so long?

MountainMan1312 · 3h ago

> You can almost imagine it sobbing in between sentences, can't you?

No, that's not the image I had in my head. My head canon is more like:

"Oh wow, oh no, oh jeez (hands on head in fake flabbergastion) would you look at that, oh no I deleted everything (types on keyboard again while deadpan staring at you) oh noooooo oh god oh look what I've done it just keeps getting worse (types even more) aw jeez oh no..."

Reminds me of that Michael Reeves video with the suggestion box. "oh nooooo your idea went directly in the idea shredder how could we have possibly foreseen this [insert shocked Pikachu meme]"

The AI thinks it's funny

arthurcolle · 2h ago

100%

LocalH · 2h ago

It gave me South Park bank "......aaaand, it's gone" vibes

sagacity · 3h ago

Monkeypaw-as-a-service.

thedudeabides5 · 3h ago

don't trust machines

davidcollantes · 3h ago

Impossible. We "trust" machines all the time, for just about anything.

'I destroyed months of your work in seconds' says AI coding tool after deletion

Comments (21)