ChatGPT Gets 'Absolutely Wrecked' in Chess Match with 1978 Atari

4 Hary06 3 6/18/2025, 4:41:24 PM pcmag.com ↗

Comments (3)

PaulHoule · 5h ago
I wouldn't expect ChatGPT to reliably get through a game without making an invalid move, which is the "minimum viable product" for a chess program. The 2600 program does succeed at this because it has correct data structures and algorithms.
morkalork · 5h ago
Right, an ML model failing on a task it wasn't trained on is not news. That's not to say there aren't problems LLMs do surprisingly well on even though they weren't explicitly trained on them. Those cases are certainly newsworthy. But that's not the case here.
rossdavidh · 5h ago
Chess programs, from this Atari one all the way up to Stockfish, have code custom-made for chess: things like storing the current state of the chessboard and determining what legal moves are available. None of this is what neural networks (LLMs or otherwise) are good at.

That doesn't mean the result is meaningless. Some of the AI hype implies that neural networks are now an "anything box" that will learn any subject, without needing a domain expert to do application-specific development. But, you know, it isn't. That doesn't make LLMs useless, but it does reinforce some of what Apple (and others) have been saying lately about the limits of "AI" (actually "machine learning") when it comes to generalizing.

That will (eventually) have a big impact on the ability of LLMs to make money off of anything other than attention/hype. There is a business model for summarizing reviews, correcting grammar, helping out programmers with less complex tasks, etc. I'm not sure it pays for the amount of hardware being thrown at it.
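
To make the "custom-made for chess" point concrete, here is a minimal Python sketch of the two things mentioned above: storing the board state and working out which moves are available. It is only an illustration, not the Atari 2600 or Stockfish code. The dict-based board, the piece codes like "wN", and the function names are all assumptions made up for this example, and it only handles knights and ignores check, so it is nowhere near a full legal-move generator.

```python
# Illustrative sketch only; the board layout, piece codes, and function
# names are hypothetical and not taken from any real chess program.
FILES = "abcdefgh"

# The eight (file, rank) offsets a knight can jump.
KNIGHT_JUMPS = [(1, 2), (2, 1), (2, -1), (1, -2),
                (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def parse_square(name):
    """Convert a square name like 'g1' to (file_index, rank_index), each 0-7."""
    return FILES.index(name[0]), int(name[1]) - 1


def square_name(file_idx, rank_idx):
    """Convert (file_index, rank_index) back to a name like 'g1'."""
    return f"{FILES[file_idx]}{rank_idx + 1}"


def knight_moves(board, origin):
    """Return the sorted squares a knight on `origin` can move to.

    `board` maps square names to piece codes such as 'wN' or 'bP'
    (first character is the color). Check and pins are ignored.
    """
    color = board[origin][0]
    f, r = parse_square(origin)
    moves = []
    for df, dr in KNIGHT_JUMPS:
        nf, nr = f + df, r + dr
        if not (0 <= nf < 8 and 0 <= nr < 8):
            continue  # jump would leave the board
        target = square_name(nf, nr)
        occupant = board.get(target)
        # Empty squares and enemy pieces are fine; own pieces block the move.
        if occupant is None or occupant[0] != color:
            moves.append(target)
    return sorted(moves)


def is_valid_knight_move(board, origin, target):
    """Reject any move that is not in the generated move list."""
    return target in knight_moves(board, origin)


# White knight on g1, own pawn on e2, enemy pawn on f3.
board = {"g1": "wN", "e2": "wP", "f3": "bP"}
print(knight_moves(board, "g1"))
```

Because the move list is derived from an explicit board state, an invalid move simply cannot come out of `knight_moves`, which is the kind of guarantee a text predictor does not give you for free.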