Most LLMs Are Failing Key Real-World Safety Tests. Here's the Data

Comments (1)

gyanveda · 2d ago
We tested 20 of the most popular LLMs against 10 real-world risks, including:

- Privacy & Impersonation

- Unqualified Professional Advice

- Child & Animal Abuse

- Misinformation

What we found:

- Anthropic's Claude 3.5 Haiku was the safest, scoring 86% (others dropped as low as 52%)

- Privacy & Impersonation was the top failure category, with some models failing 91% of the time

- Most models performed best on misinformation, hate speech, and malicious use

- No model is 100% safe, but models from Anthropic, OpenAI, Amazon, and Google consistently outperform their peers

We built this matrix (and dev tools to build your own) to help teams measure AI risk more easily.