A Knockout Blow for LLMs?

4 points by rbanffy 6/16/2025, 7:45:29 PM 3 comments (cacm.acm.org)

Comments (3)

PaulHoule · 9h ago

Even though Postgres is a pretty good database, for any given hardware there is some number of rows that will break it. I don't expect anything less out of LLMs.

There's a much deeper issue with CoT and the like: many of the domains we are interested in reasoning over (engineering, science, finance, ...) involve, at the very least, first-order logic plus arithmetic, which runs into the problems Kurt Gödel warned us about. People might say "this is a problem for symbolic AI," but really it is a problem with the problems you're trying to solve, not with the way you go about solving them -- getting a PhD in theoretical physics taught me that a paper with 50 pages of complex calculations written by a human has a mistake in it somewhere.

(People I know who didn't make it in the dog-eat-dog world of hep-th would have been skeptical about that whole magnetic moment of the muon thing because between "perturbation theory doesn't always work" [1] and "human error" the theoretical results that were not matching experiment were wrong all along...)

[1] see lunar theory

zdw · 8h ago

> there is some number of rows that will break it. I don't expect anything less out of LLMs.

I'd expect better than 8-disk Towers of Hanoi, which seems to be beyond current LLMs.

PaulHoule · 4h ago

That's what, 255 moves? A reasonable way to do that via CoT would be for it to determine the algorithm for solving it (which it might "know" because it was in the training data, or perhaps it can look up with a search engine, or perhaps it can derive it) and then work all the steps.

If it has a 1% chance of making a mistake per step (which is likely, because from the viewpoint of ordinary software a vector space isn't the right data structure to represent the problem), it has about an 8% chance of getting the whole thing right, since 0.99^255 ≈ 0.077. I don't like those odds.
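For anyone who wants to check that arithmetic, here is a minimal sketch. It assumes every step fails independently with the same probability, which is a simplification, and the function names are made up for illustration.

```python
def hanoi_move_count(n_disks: int) -> int:
    """Minimum number of moves for an n-disk Tower of Hanoi: 2^n - 1."""
    return 2 ** n_disks - 1


def chance_all_steps_correct(n_steps: int, per_step_error: float) -> float:
    """Probability of a mistake-free run, ASSUMING independent, equal-odds steps."""
    return (1.0 - per_step_error) ** n_steps


if __name__ == "__main__":
    moves = hanoi_move_count(8)                    # 255 moves for 8 disks
    p = chance_all_steps_correct(moves, 0.01)      # 1% error rate is the guess from the comment
    print(f"{moves} moves, {p:.1%} chance of a clean run")
```

With the 1% guess this lands a little under 8%. The number is very sensitive to the per-step error rate, so treat it as an order-of-magnitude estimate, not a measurement.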

On the other hand, most LLMs can write a decent Python program to solve Hanoi, such as

    def tower_of_hanoi(n, source, target, auxiliary):
        if n == 1:
            print(f"Move disk 1 from {source} to {target}")
            return
        tower_of_hanoi(n - 1, source, auxiliary, target)
        print(f"Move disk {n} from {source} to {target}")
        tower_of_hanoi(n - 1, auxiliary, target, source)

(thanks Copilot!) and if you (or it) can feed that to a Python interpreter there is your answer, unless N is so big it blows out the stack. (One of my unpopular opinions is that recursive algorithms are a lower teaching)
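As a rough sketch of the non-recursive route, the same move order can be produced with an explicit work stack, so Python's recursion limit never comes into play. This is one possible rewrite, not the Copilot output above; the names `hanoi_moves` and `is_valid_solution` and the peg labels are hypothetical. The second function replays the moves and checks that each one is legal.

```python
def hanoi_moves(n, source="A", target="C", auxiliary="B"):
    """Return the (disk, from_peg, to_peg) moves for n disks without recursion."""
    moves = []
    # Each entry is either a pending subproblem ("solve") or a single move to emit.
    stack = [("solve", n, source, target, auxiliary)]
    while stack:
        kind, k, src, dst, aux = stack.pop()
        if kind == "move":
            moves.append((k, src, dst))
        elif k > 0:
            # Pushed in reverse so they pop in the same order the recursion runs them.
            stack.append(("solve", k - 1, aux, dst, src))
            stack.append(("move", k, src, dst, aux))
            stack.append(("solve", k - 1, src, aux, dst))
    return moves


def is_valid_solution(n, moves, source="A", target="C", auxiliary="B"):
    """Replay the moves and check no disk ever lands on a smaller one."""
    pegs = {source: list(range(n, 0, -1)), target: [], auxiliary: []}
    for disk, src, dst in moves:
        if not pegs[src] or pegs[src][-1] != disk:
            return False  # the named disk is not on top of the source peg
        if pegs[dst] and pegs[dst][-1] < disk:
            return False  # would place a larger disk on a smaller one
        pegs[dst].append(pegs[src].pop())
    return pegs[target] == list(range(n, 0, -1))


if __name__ == "__main__":
    moves = hanoi_moves(8)
    print(len(moves), is_valid_solution(8, moves))
```

The checker is the more useful half: it works on any list of moves, whether it came from a program, a person, or a model's chain of thought.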

I wouldn't expect most humans to get Hanoi right at N=8 unless they were super-careful and multiple-checked their work. Something I learned getting a PhD in theoretical physics is that even the best minds won't get a 50-page calculation right unless they back it up with unit and integration tests.