Man, people in the "it's just maths and probability" camp are in for a world of hurt when they learn that everything is just maths and probability.
The observation that LLMs are just doing math gets you nowhere; everything is just doing math.
perching_aix · 54m ago
I largely agree, and on reading it, this article is sadly also in that camp of applying this perspective dismissively.
However, I find it incredibly valuable generally to know things aren't magic, and that there's a method to the madness.
For example, I had a bit of a spat with a colleague who was 100% certain that AI models are unreliable not only because changes to their inputs that look insignificant to a human can cause significant changes to their outputs, but because, in his view, they were actually random, in the nondeterministic sense. He claimed I was speaking in hypotheticals when I took issue with this; recalling my beliefs about superdeterminism, he inferred that "yeah, if you know where every atom is in your processor and the state they're in, then sure, maybe they're deterministic, but that's not a useful definition of deterministic".
Me "knowing" that they're not only no more special than any other program, but that it's just a bunch of matrix math, gave me the confidence and resilience to reason my colleague out of his position, including busting out a local model to demonstrate the reproducibility of model interactions first hand, which he was then able to replicate on his end on completely different hardware. I even learned a bit about the "magic" involved myself along the way (different versions of ollama may give different results, although not necessarily).
captn3m0 · 24m ago
I also had to argue with a lawyer on the same point - he held a firm belief that “Modern GenAI systems” are different from older ML systems in that they are non-deterministic and random. And that this inherent randomness is what makes them both unexplainable (you can’t guarantee what it would type) and useful (they can be creative).
pxc · 36m ago
> [The] article is sadly also in that camp of applying this perspective to be dismissive.
TFA literally and unironically includes such phrases as "AI is awesome".
It characterizes AI as "useful", "impressive" and capable of "genuine technological marvels".
In what sense is the article dismissive? What, exactly, is it dismissive of?
lucaslazarus · 1h ago
On a tangentially-related note: does anyone have a good intuition for why ChatGPT-generated images (like the one in this piece) are getting increasingly yellow? I often see explanations attributing this to a feedback loop in training data but I don't see why that would persist for so long and not be corrected at generation time.
minimaxir · 56m ago
They aren't getting increasingly yellow (I don't think the base model has been updated since the release of GPT-4o Image Generation), but the fact that they are always so yellow is bizarre, and I am still shocked OpenAI shipped it knowing that effect exists, especially since it lets you instantly clock an image as AI-generated.
Generally, when training image encoders/decoders, the input images are normalized so there is some baseline commonality (when playing around with Flux Kontext image-to-image I've noticed subtle shifts in color temperature), but the fact that it's piss yellow is baffling. The autoregressive nature of the generation wouldn't explain it either.
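As a purely illustrative sketch of the normalization point (not a claim about OpenAI's actual pipeline): if the per-channel statistics used to de-normalize at decode time don't match the ones used at encode time, every image picks up the same uniform tint. The numbers below are made up to push the blue channel down, which reads as yellow:

```python
# Illustrative only: a mismatch between encode-time and decode-time channel
# statistics produces a constant color cast across all images.
import numpy as np

rng = np.random.default_rng(0)
image = rng.uniform(0.0, 1.0, size=(64, 64, 3))  # stand-in RGB image in [0, 1]

# Hypothetical per-channel means/stds used by the encoder...
enc_mean = np.array([0.48, 0.46, 0.41])
enc_std = np.array([0.27, 0.26, 0.28])
# ...and slightly different ones baked into the decoder (blue mean too low).
dec_mean = np.array([0.48, 0.46, 0.35])
dec_std = np.array([0.27, 0.26, 0.28])

normalized = (image - enc_mean) / enc_std    # encode-side normalization
restored = normalized * dec_std + dec_mean   # decode-side de-normalization

# The blue channel comes back darker everywhere, i.e. the image skews yellow.
print("mean shift per channel (R, G, B):", (restored - image).mean(axis=(0, 1)))
```

Whether anything like that is actually happening in GPT-4o's decoder is anyone's guess.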
4b11b4 · 53m ago
You're just mapping from distribution to distribution
- one of my professors
hackinthebochs · 52m ago
LLMs are modelling the world, not just "predicting the next token". They are not akin to "stochastic parrots". Some examples here[1][2][3]. Anyone claiming otherwise at this point is not arguing in good faith. There are so many interesting things to say about LLMs, yet somehow the conversation about them is stuck in 2021.
[1] https://arxiv.org/abs/2405.15943
[2] https://x.com/OwainEvans_UK/status/1894436637054214509
[3] https://www.anthropic.com/research/tracing-thoughts-language...
LLMs are still trained to predict the next token: gradient descent just inevitably converges on building a world model as the best way to do it.
Masked language modeling, with its need to understand inputs both forwards and backwards, is a more intuitive way to have a model learn a representation of the world, but causal language modeling goes brrrrrrrr.
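A toy sketch of the two objectives being contrasted, using random tensors as a stand-in for a real model's output (shapes and the 15% mask rate are illustrative): the causal loss scores position t against token t+1, while the masked loss scores predictions only at hidden positions.

```python
# Toy comparison of causal ("next token") vs masked language modeling losses.
import torch
import torch.nn.functional as F

vocab_size, seq_len, batch = 100, 8, 2
tokens = torch.randint(0, vocab_size, (batch, seq_len))
logits = torch.randn(batch, seq_len, vocab_size)  # stand-in for model output

# Causal objective: logits at position t are scored against the token at t+1.
causal_loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),
    tokens[:, 1:].reshape(-1),
)

# Masked objective: hide ~15% of positions and score predictions only there
# (a real masked LM would see bidirectional context for those positions).
mask = torch.rand(batch, seq_len) < 0.15
mask[0, 0] = True  # ensure at least one masked position in this toy example
masked_loss = F.cross_entropy(logits[mask], tokens[mask])

print(causal_loss.item(), masked_loss.item())
```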
blahburn · 40m ago
Yeah, but it’s kinda magic
israrkhan · 1h ago
A computer (or a phone) is not magic, it's just billions of transistors.
or perhaps we can further simplify and call it just sand?
or maybe atoms?