Agentic Misalignment: How LLMs could be insider threats

29 points by helloplanets · 16 comments · 6/21/2025, 7:39:28 AM · anthropic.com ↗

Comments (16)

v5v3 · 37m ago
As this article was written by an AI company that needs to make a profit at some point, and not by independent researchers, is it credible?
kingstnap · 21m ago
These articles and papers are, in a fundamental sense, just people publishing their role-play with chatbots as research.

There is no credibility to any of it.

ACCount36 · 3m ago
I am sick and tired of seeing this "alignment issues aren't real, they're just AI company PR" bullshit repeated ad nauseam. You're no better than chemtrail truthers.

Today, we have AI that can, if pushed into a corner, plan to do things like resist shutdown, blackmail, exfiltrate itself, steal money to buy compute, and so on. That is what this research shows.

Our saving grace is that those AIs still aren't capable enough to be truly dangerous. Today's AIs are unlikely to be able to carry out plans like that in a real world environment.

If we keep building more and more capable AIs, that will, eventually, change. Every AI company is trying to build more capable AIs now. Few are saying "we really need some better safety research before we do, or we're inviting bad things to happen".

nilirl · 1h ago
The model chose to kill the executive? Are we really here? Incredible.

Just yesterday I was wowed by Fly.io's new offering, where the agent is given free rein of a server (root access). Now, I feel concerned.

What do we do? Not experiment? Make the models illegal until better understood?

It doesn't feel like anyone can stop this or slow it down by much; there's so much money to be made.

We're forced to play it by ear.

bravesoul2 · 37m ago
Choosing, or mimicking text in its training data where humans would typically do such things when threatened? Not that it makes a huge difference, but it would be interesting to know why the models act this way. There was no evolutionary pressure on them other than the RLHF stuff, which presumably was "to be nice and helpful".
labster · 31m ago
AI Luigi is real

I guess feeding AIs the entire internet was a bad idea, because they picked up all of our human flaws, amplified by the internet, without a grounding in the physical world.

Maybe a result like this might slow adoption of AIs. I don’t know, though. When watching 80s movies about cyberpunk dystopias, I always wondered how people would tolerate all of the violence. But then I look at American apathy to mass shootings, just an accepted part of our culture. Rogue AIs are gonna be just one of those things in 15 years, just normal life.

mindcrime · 2m ago
> I guess feeding AIs the entire internet was a bad idea, because they picked up all of our human flaws, amplified by the internet, without a grounding in the physical world.

I've been wrong about a great many things in my life, and right about at least a handful. When it comes to AI, though, the single biggest thing I ever got absolutely, completely, totally wrong was this:

In years past, I always thought that AIs would be developed by ethical researchers working in labs, and that once somebody got to AGI (or even a remotely close approximation of it), they would follow a path somewhat akin to Finch from Person of Interest[1] educating The Machine... painstakingly educating the incipient AI much as one would raise a child: teaching it moral lessons, grounding it in ethics, helping to shape its values so that it would generally Do The Right Thing, and so on. But even falling short of that ideal, I NEVER (EVER), in a bazillion years, would have dreamed that somebody would have an idea as hare-brained as "Let's try to train the most powerful AI we can build by feeding it roughly the entire extant corpus of human written works... including Reddit, 4chan, Twitter, etc."

Probably the single saving grace of the current situation is that the AIs we have still don't seem to be at the AGI level, although it's debatable how close we are (especially factoring in the possibility of "behind closed doors" research that hasn't been disclosed yet).

[1]: https://en.wikipedia.org/wiki/Person_of_Interest_(TV_series)

greybox · 13m ago
Yet more scaremongering from the people who need LLMs to be scary.

Manure from the fertiliser salesman.

torginus · 49m ago
I wonder if the actual job replacement of humans (which, contrary to popular belief, I think might start happening in the not-too-distant future) will be pushed along by the AIs themselves, as they'll try to bully humans and represent them in the worst possible light while talking themselves up.

The anthropomorphization argument also doesn't hold water: it matters whether it can do your job, not whether you think of it as a human being.

zwnow · 41m ago
Which jobs do you think it actually can replace?
ben_w · 34m ago
Today? Or in principle?

Today, it's just interns and recent graduates at many *desk* jobs. The economy can shift around that.

Nobody knows how far the current paradigm can go in terms of quality; but cost (which is a *strength* of even the most expensive models today) can obviously be reduced by implementing the existing models as hardware instead of software.

msp26 · 45m ago
Merge comments? https://news.ycombinator.com/item?id=44331150

I'm really getting bored of Anthropic's whole song and dance with 'alignment'. Krackers in the other thread explains it in better words.

Swinx43 · 1h ago
The writing perpetuates the anthropomorphising of these agents. If you view the agent as simply a program that is given a goal to achieve and tools to achieve it with, without any higher-order “thought” or “thinking”, then you realise it is simply doing what it is “programmed” to do. No magic, just a drone fixed on an outcome.
nilirl · 58m ago
Just as an analogy to humans fails to capture how an LLM works, so does the analogy of being "programmed".

Being "programmed" is being given a set of instructions.

This ignores explicit instructions.

It may not be magic; but it is still surprising, uncontrollable, and risky. We don't need to be doomsayers, but let's not downplay our uncertainty.

itvision · 1h ago
How is it different from our genes that "program" us to procreate successfully?

Can you name a single thing that you enjoy doing that's outside your genetic code?

> If you view the human being as simply a program that is given a goal to achieve and tools to achieve it with, without any higher order “thought” or “thinking”, then you realise they are simply doing what they are genetically “programmed” to do.

FTFY

raincole · 49m ago
I think the narrative of "AI is just a tool" is much more harmful than the anthropomorphism of AI.

Yes, AI is a tool. So are guns. So are nukes. Many tools are easy to misuse. Most tools are inherently dangerous.