There Are No New Ideas in AI, Only New Datasets

122 points by bilsbie | 64 comments | 6/30/2025, 2:43:46 PM | blog.jxmo.io

Comments (64)

EternalFury · 1h ago
What John Carmack is exploring is pretty revealing. Train models to play 2D video games to a superhuman level, then ask them to play a level, or another 2D video game, that they have not seen before. The transfer function is negative. So, by my definition, no intelligence has been developed, only expertise in a narrow set of tasks.

It’s apparently much easier to scare the masses with visions of ASI than to build a general intelligence that can pick up a new 2D video game faster than a human being.

vladimirralev · 47m ago
He is not using appropriate models for this conclusion, nor is he using state-of-the-art models in this research; moreover, he doesn't have an expensive foundation model to build upon for 2D games. It's just a fun project.

A serious attempt at video/vision would involve some probabilistic latent space that can be noised in ways that make sense for games in general. I think Veo 3 proves that AI can generalize 2D and even 3D games: generating a video under prompt constraints is basically playing a game. I think you could prompt Veo 3 to play any game for a few seconds and it would generally make sense, even though it is not fine-tuned.

justanotherjoe · 5m ago
I don't get why people are so invested in framing it this way. I'm sure there are ways to achieve the stated objective. John Carmack isn't even an AI guy; why is he suddenly the standard?

YokoZar · 59m ago
I wonder if this is a case of overfitting from allowing the model to grow too large, and if you might cajole it into learning more generic heuristics by putting some constraints on it.

It sounds like the "best" AI without constraints would just be something like a replay of a record speedrun, rather than a smaller set of heuristics for getting through a game, though the latter is clearly much more important with unseen content.
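
For concreteness, the kind of constraint I mean, sketched in PyTorch (these are standard regularizers, nothing specific to Carmack's setup; the layer sizes are invented):

    import torch
    import torch.nn as nn

    # A deliberately small policy network plus dropout and weight decay:
    # three standard ways to limit capacity, nudging the model toward
    # generic heuristics instead of memorizing one game.
    policy = nn.Sequential(
        nn.Linear(128, 64),  # narrow hidden layer caps raw capacity
        nn.ReLU(),
        nn.Dropout(p=0.2),   # dropout discourages brittle co-adaptations
        nn.Linear(64, 8),    # say, 8 discrete joystick actions
    )
    optimizer = torch.optim.AdamW(policy.parameters(), lr=3e-4, weight_decay=1e-2)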

smokel · 44m ago
The subject you are referring to is most likely Meta-Reinforcement Learning [1]. It is great that John Carmack is looking into this, but it is not a new field of research.

[1] https://instadeep.com/2021/10/a-simple-introduction-to-meta-...

ferguess_k · 1h ago
Can you please explain "the transfer function is negative"?

I'm wondering whether anyone has tested the same model in two situations:

1) Bring it to a superhuman level in game A, then present game B, which is similar to A.

2) Present B to it without ever presenting A.

If 1) is not significantly better than 2), then maybe it is not carrying much "knowledge", or maybe we simply did not program it correctly.
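
For what it's worth, that comparison sketched in Python; fresh_model, train, and evaluate are toy stand-ins for a real RL harness (the shape of the experiment is the point, not these bodies):

    import random

    def fresh_model():
        return {"skill": 0.0}  # toy stand-in for a randomly initialized policy

    def train(model, game, steps):
        out = dict(model)  # a real harness would run RL on the actual game
        out["skill"] += steps * 1e-8 * random.random()
        return out

    def evaluate(model, game):
        return model["skill"]  # toy stand-in for a benchmark score

    # 1) superhuman on game A, then a small budget on unseen game B
    pretrained = train(fresh_model(), "game_A", steps=10_000_000)
    score_with_A = evaluate(train(pretrained, "game_B", steps=100_000), "game_B")

    # 2) the same small budget on game B, with no exposure to A
    score_without_A = evaluate(train(fresh_model(), "game_B", steps=100_000), "game_B")

    # "negative transfer" is precisely score_with_A < score_without_A
    # (this toy always shows positive transfer; the real question is
    # which inequality holds for actual models)
    print(score_with_A, score_without_A)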

tough · 1h ago
I think the problem is we train models to pattern match, not to learn or reason about world models

singron · 27m ago
I think this is clearly a case of overfitting and failure to generalize, which are really well-understood concepts. We don't have to philosophize about what pattern matching really means.

ferguess_k · 10m ago
I kinda think I'm more or less the same... OK, maybe we have different definitions of "pattern matching".

NBJack · 59m ago
In other words, they learn the game, not how to play games.

fsmv · 52m ago
They memorize the answers, not the process to arrive at them.

IshKebab · 43m ago
This has been disproven so many times... They clearly do both. You can trivially prove this yourself.

0xWTF · 10m ago
> You can trivially prove this yourself.

Given the long list of dead philosophers of mind, if you have a trivial proof, would you mind providing a link?

IshKebab · 44m ago
Well yeah... If you only ever played one game in your life you would probably be pretty shit at other games too. This does not seem very revealing to me.
antisthenes · 36m ago
Where do you draw the line between pattern matching and reasoning about world models?

A lot of intelligence is just pattern matching and being quick about it.

moralestapia · 52m ago
I wonder how much performance decreases if they just use slightly modified versions of the same game. Like a different color scheme, or a couple different sprites.

t55 · 20m ago
This is what DeepMind did 10 years ago lol

voxleone · 1h ago
I'd say with confidence: we're living in the early days. AI has made jaw-dropping progress in two major domains: language and vision. With large language models (LLMs) like GPT-4 and Claude, and vision models like CLIP and DALL·E, we've seen machines that can generate poetry, write code, describe photos, and even hold eerily humanlike conversations.

But as impressive as this is, it’s easy to lose sight of the bigger picture: we’ve only scratched the surface of what artificial intelligence could be — because we’ve only scaled two modalities: text and images.

That’s like saying we’ve modeled human intelligence by mastering reading and eyesight, while ignoring touch, taste, smell, motion, memory, emotion, and everything else that makes our cognition rich, embodied, and contextual.

Human intelligence is multimodal. We make sense of the world through:

- Touch (the texture of a surface, the feedback of pressure, the warmth of skin);
- Smell and taste (deeply tied to memory, danger, pleasure, and even creativity);
- Proprioception (the sense of where your body is in space — how you move and balance);
- Emotional and internal states (hunger, pain, comfort, fear, motivation).

None of these are captured by current LLMs or vision transformers. Not even close. And yet, our cognitive lives depend on them.

Language and vision are just the beginning — the parts we were able to digitize first - not necessarily the most central to intelligence.

The real frontier of AI lies in the messy, rich, sensory world where people live. We’ll need new hardware (sensors), new data representations (beyond tokens), and new ways to train models that grow understanding from experience, not just patterns.

dinfinity · 1h ago
> Language and vision are just the beginning — the parts we were able to digitize first - not necessarily the most central to intelligence.

I respectfully disagree. Touch gives pretty cool skills, but language, video and audio are all that are needed for all online interactions. We use touch for typing and pointing, but that is only because we don't have a more efficient and effective interface.

Now I'm not saying that all other senses are uninteresting. Integrating touch, extensive proprioception, and olfaction is going to unlock a lot of 'real world' behavior, but your comment was specifically about intelligence.

Compare humans to apes and other animals and the thing that sets us apart is definitely not in the 'remaining' senses, but firmly in the realm of audio, video and language.

voxleone · 35m ago
> Language and vision are just the beginning — the parts we were able to digitize first - not necessarily the most central to intelligence.

I probably made a mistake when I asserted that -- I should have thought it over. Vision is evolutionarily older and more “primitive”, while language is uniquely human [or maybe, more broadly, primate, cetacean, cephalopod, avian...], symbolic, and abstract — arguably a different order of cognition altogether. But I maintain that each and every sense is important as far as human cognition -- and its replication -- is concerned.

Swizec · 1h ago
> The real frontier of AI lies in the messy, rich, sensory world where people live. We’ll need new hardware (sensors), new data representations (beyond tokens), and new ways to train models that grow understanding from experience, not just patterns.

Like Dr. Who said: DALEKs aren't brains in a machine, they are the machine!

Same is true for humans. We really are the whole body, we're not just driving it around.

chasd00 · 43m ago
> Language and vision are just the beginning...

Based on the architectures we have, they may also be the ending. There’s been a lot of news in the past couple of years about LLMs, but have there been any breakthroughs making headlines anywhere else in AI?

dragonwriter · 33m ago
> There’s been a lot of news in the past couple of years about LLMs, but have there been any breakthroughs making headlines anywhere else in AI?

Yeah, lots of stuff tied to robotics, for instance; this overlaps with vision, but the advances go beyond vision.

Audio has seen quite a bit. And I imagine there is stuff happening in niche areas that just isn't as publicly interesting as language, vision/imagery, audio, and robotics.

skydhash · 1h ago
Yeah, but are there new ideas or only wishes?

jdgoesmarching · 52m ago
It’s pure magical thinking that would be correctly dismissed if it didn’t have AI attached to it. Imagine talking this way about anything else.

“We’ve barely scratched the surface with Rust, so far we’re only focused on code and haven’t even explored building mansions or ending world hunger”

LarsDu88 · 5m ago
If datasets are what we are talking about, I'd like to bring attention to the biological datasets out there that have yet to be fully harnessed.

The ability to collect gene expression data at a tissue specific level has only been invented and automated in the last 4-5 years (see 10X Genomics Xenium, MERFISH). We've only recently figured out how to collect this data at the scale of millions of cells. A breakthrough on this front may be the next big area of advancement.

Night_Thastus · 3m ago
Man I can't wait for this '''''AI''''' stuff to blow over. The back and forth gets a bit exhausting.
tippytippytango · 2h ago
Sometimes we get confused by the difference between technological and scientific progress. When science makes progress, it unlocks new S-curves that advance at an incredible pace until you hit the diminishing-returns region. People complain of slowing progress, but it was always slow; you just didn't notice that nothing new was happening during the exponential take-off of the S-curve, just furious optimization.
kogus · 2h ago
To be fair, if you imagine a system that successfully reproduced human intelligence, then 'changing datasets' would probably be a fair summary of what it would take to have different models. After all, our own memories, training, education, background, etc are a very large component of our own problem solving abilities.
seydor · 34m ago
There are new ideas: people are finding new ways to build vision models, which are then applied to language models and vice versa (like diffusion).

The original idea of connectionism is that neural networks can represent any function, which is a fundamental mathematical fact (the universal approximation theorem). So we should be optimistic: neural nets will be able to do anything. Which neural nets? So far people have stumbled on a few productive architectures, but it appears to be more alchemy than science. There is no reason to think there won't be both new ideas and new data. Biology did it; humans will do it too.
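
For reference, the fact alluded to above, stated precisely (this is Cybenko's version of the theorem, for one hidden layer and a sigmoid σ):

    For any continuous f : K → R on a compact K ⊂ R^n and any ε > 0,
    there exist a_i, w_i, b_i such that

        sup_{x ∈ K} | f(x) − Σ_i a_i · σ(w_i · x + b_i) | < ε

Note that it guarantees existence only; it says nothing about how to find the weights, which is where the alchemy comes in.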

> we’re engaged in a decentralized globalized exercise of Science, where findings are shared openly

Maybe the findings are shared, if they make the Company look good. But the methods are not shared anymore.

jschveibinz · 2h ago
I will respectfully disagree. All "new" ideas come from old ideas. AI is a tool to access old ideas with speed and with new perspectives that haven't been available until now.

Innovation is in the cracks: recognition of holes, intersections, tangents, etc. in old ideas. It has been said that innovation is done on the shoulders of giants.

So AI can be an express elevator up to an army of giants' shoulders? It all depends on how you use the tools.

alfalfasprout · 2h ago
Access old ideas? Yes. With new perspectives? Not necessarily. An LLM may be able to assist in interpreting data with new perspectives but in practice they're still fairly bad at greenfield work.

As with most things, the truth lies somewhere in the middle. LLMs can be helpful as a way of accelerating certain kinds and certain aspects of research but not others.

stevep98 · 45m ago
> Access old ideas? Yes. With new perspectives?

I wonder if we can mine patent databases for old ideas that never worked out in the past, but now are more useful. Perhaps due to modern machining or newer materials or just new applications of the idea.

bcrosby95 · 1h ago
The article is discussing working on AI innovation vs. focusing on getting more and better data. While there have been key breakthroughs in new ideas, one of the best ways to increase the performance of these systems is getting more and better data, and many people think data is the primary avenue to improvement.

It reminds me of an AI talk a few decades ago, about how the cycle goes: more data -> more layers -> repeat...

Anyways, I'm not sure how your comment relates to these two avenues of improvement.

jjtheblunt · 1h ago
> I will respectfully disagree. All "new" ideas come from old ideas.

The insight into the structure of the benzene ring famously came to Kekulé in a dream; it hadn't been seen before, but was imagined as a snake biting its own tail.

gametorch · 2h ago
Exactly!

Can you imagine if we applied the same gatekeeping logic to science?

Imagine you weren't allowed to use someone else's scientific work or any derivative of it.

We would make no progress.

The only legitimate defense I have ever seen here revolves around IP and copyright infringement, which I couldn't care less about.

piinbinary · 1h ago
AI training is currently a process of making the AI remember the dataset. It doesn't involve the AI thinking about the dataset and drawing (and remembering) conclusions.

It can probably remember more facts about a topic than a PhD in that topic, but the PhD will be better at thinking about that topic.

tantalor · 22m ago
Maybe that's why PhDs keep the textbooks they use at hand, so they don't have to remember everything.

Why should the model need to memorize facts we already have written down somewhere?
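
A toy sketch of that division of labor: retrieve the facts, let the model do the thinking. The llm function is a hypothetical stand-in for any chat model:

    def llm(prompt: str) -> str:
        return "(model output grounded in the retrieved facts)"  # hypothetical stand-in

    # the "textbook at hand": facts live outside the weights
    TEXTBOOK = {
        "boiling point of water": "100 °C at 1 atm",
        "speed of light": "299,792,458 m/s",
    }

    def answer(question: str) -> str:
        facts = [v for k, v in TEXTBOOK.items() if k in question.lower()]
        return llm(f"Facts: {facts}\nQuestion: {question}")

    print(answer("What is the boiling point of water?"))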

jayd16 · 1h ago
It's a bit more complex than that. It's more about baking the dataset down into heuristics that a machine can use to match a satisfying result to an input. Sometimes these heuristics are surprising to a human and can solve a problem in a novel way.

"Thinking" is too broad a term to apply usefully, but I would say it's pretty clear we are not close to AGI.

nkrisc · 1h ago
> It can probably remember more facts about a topic than a PhD in that topic

So can a notebook.

Kapura · 1h ago
Here's an idea: make the AIs consistent at doing things computers are good at. Here's an anecdote from a friend who's living in Japan:

> i used chatgpt for the first time today and have some lite rage if you wanna hear it. tldr it wasnt correct. i thought of one simple task that it should be good at and it couldnt do that.

> (The kangxi radicals are neatly in order in unicode so you can just ++ thru em. The cjks are not. I couldnt see any clear mapping so i asked gpt to do it. Big mess i had to untangle manually anyway it woulda been faster to look them up by hand (theres 214))

> The big kicker was like, it gave me 213. And i was like, "why is one missing?" Then i put it back in and said count how many numbers are here and it said 214, and there just werent. Like come on you SHOULD be able to count.

If you can make the language models actually interface with what we've been able to do with computers for decades, I imagine many paths open up.
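
For what it's worth, that particular task really is decades-old computing: the Kangxi radicals carry Unicode compatibility decompositions to their CJK counterparts, so a few lines of Python recover the mapping deterministically. A sketch:

    import unicodedata

    # The 214 Kangxi radicals are contiguous at U+2F00..U+2FD5, so you
    # can just ++ through them. Each carries a compatibility decomposition
    # to its CJK Unified Ideograph, which NFKC normalization applies.
    mapping = {
        chr(cp): unicodedata.normalize("NFKC", chr(cp))
        for cp in range(0x2F00, 0x2FD6)
    }
    print(len(mapping))        # 214, none missing
    print(mapping["\u2F00"])   # '一' (U+4E00)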

cheevly · 55m ago
Many of us have solved this with internal tooling that has not yet been shared or released to the public.

layer8 · 12m ago
This needs to be generalized, however. For example, if you present an AI with a drawing of some directed graph (a state diagram, for example), it should be able to answer questions based on the precise set of all possible paths in that graph, without someone having to write tooling for diagram or graph processing and traversal. Or, given a photo of a dropped box of matches, an AI should be able to precisely count the matches, as far as they are individually visible (which a human could do by keeping a tally while coloring the matches). There are probably better examples; these are off the cuff.

There’s an infinite repertoire of such tasks that combine AI capabilities with traditional computer algorithms, and I don’t think we have a generic way of having AI autonomously outsource whatever parts require precision in a reliable way.
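
And the precise part is usually tiny. A sketch of the kind of traditional algorithm the AI would need to outsource to for the state-diagram example, plain depth-first enumeration of simple paths:

    def all_simple_paths(graph, start, end, path=()):
        # graph: dict mapping each node to a list of successor nodes
        path = path + (start,)
        if start == end:
            return [path]
        paths = []
        for nxt in graph.get(start, []):
            if nxt not in path:  # simple paths only: no revisiting nodes
                paths.extend(all_simple_paths(graph, nxt, end, path))
        return paths

    # a small state diagram: A -> B, A -> C, B -> C
    g = {"A": ["B", "C"], "B": ["C"], "C": []}
    print(all_simple_paths(g, "A", "C"))  # [('A', 'B', 'C'), ('A', 'C')]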

ks2048 · 1h ago
The latest LLMs are simply multiplying and adding various numbers together... Babylonians were doing that 4000 years ago.

bobson381 · 1h ago
You are just a lot of interactions of waves. All meaning is assigned. I prefer to think of this like the Goedel generator that found new formal expressions for the Principia: because we have a way of indexing concept-space, there's no telling what we might find in the gaps.
nyrulez · 50m ago
Things haven't changed much in terms of truly new ideas since electricity was invented. Everything else is just applications on top of that. Make the electrons flow in a different way and you get a different outcome.
krunck · 1h ago
Until these "AI" systems become always-on, always-thinking, always-processing, progress is stuck. The current push-button AI, meaning it only processes when we prompt it, is not how the kind of AI everyone is dreaming of needs to function.
fwip · 1h ago
From a technical perspective, we can do that with a for loop.

The reason we don't do it isn't because it's hard, it's because it yields worse results for increased cost.
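
Literally a for loop. A caricature, with a stub standing in for whatever model you'd actually call:

    import time

    def llm(context: str) -> str:
        return "(model output)"  # hypothetical stand-in for any chat model

    context = "Observe, think, act."
    for step in range(10):          # in spirit: while True
        thought = llm(context)
        context += "\n" + thought   # feed the output back in as new input
        time.sleep(1)
    # The loop was never the hard part; the accumulating context degrading
    # quality (and costing money) is.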

ctoth · 2h ago
Reinforcement learning from self-play/AlphaWhatever? Nah, must just be datasets. :)

NitpickLawyer · 2h ago
And architecture stuff, like actually useful long context. Whatever they did with Gemini 2.5 is miles ahead in long-context results compared to the previous models. I'd be very surprised if Gemini 2.5 is "just" Gemini 1 with better data.

grumpopotamus · 2h ago

Y_Y · 1h ago
You raise a really interesting point. I'm sure it's just missed my notice, but I'm not familiar with any projects from antediluvian AI that have been resurrected to run on modern hardware to see where they'd really asymptote if they'd had the compute they deserved.

FeepingCreature · 1h ago
To be fair, usually those projects would need considerable work to be ported to modern multicore machines, let alone GPUs.

genewitch · 22m ago
Can you name a couple so I can see how much work is involved? Markov chains compile fast and respond fast, sure, and neural nets train pretty quickly too, so I'm wondering where the cutoff is; expert systems?
nyrikki · 1h ago
Big difference between a perfect-information, completely specified, zero-sum game and the real world.

As a simple analogy, read out the following sentence multiple times, stressing a different word each time.

"I never said she stole my money"

Note how the meaning changes and is often unique?

That is a lens into the frame problem and its inverse, the specification problem.

The above problem quickly becomes tower-complete, and recent studies suggest that RL is reinforcing or increasing the weight of existing patterns.

As the open domain frame problem and similar challenges are equivalent to HALT, finding new ways to extract useful information will be important for generalization IMHO.

Synthetic data is useful, but not a complete solution, especially for tower problems.

genewitch · 14m ago
The one we use is "I always pay my taxes"

and as far as synthetic vs. real data, there are a lot of gaps in LLM knowledge; and vision models suffer from "limited tags", which used to have workarounds with textual embeddings and the like, but those went by the wayside as LoRA, ControlNet, etc. appeared.

There are people who are fairly well known whom LLMs have no idea about. There are things in books I own that the AI confidently tells me are either wrong or don't exist.

That one page about compressing 1 GB of Wikipedia as small as possible (the Hutter Prize) implicitly and explicitly states that AI is "basically compression", and if the data isn't there, it's not in the compressed set (weights) either.

And I'll reply to another comment here, about "24/7 rolling / for-looped" AI: I thought of doing this when I first found out about LLMs, but context windows are the enemy here. I have a couple of ideas about how to have a continuous AI, but I don't have the capital to test them out.

lossolo · 10m ago
I wrote about it around a year ago here:

"There weren't really any advancements from around 2018. The majority of the 'advancements' were in the amount of parameters, training data, and its applications. What was the GPT-3 to ChatGPT transition? It involved fine-tuning, using specifically crafted training data. What changed from GPT-3 to GPT-4? It was the increase in the number of parameters, improved training data, and the addition of another modality. From GPT-4 to GPT-40? There was more optimization and the introduction of a new modality. The only thing left that could further improve models is to add one more modality, which could be video or other sensory inputs, along with some optimization and more parameters. We are approaching diminishing returns." [1]

10 months ago around o1 release:

"It's because there is nothing novel here from an architectural point of view. Again, the secret sauce is only in the training data. O1 seems like a variant of RLRF https://arxiv.org/abs/2403.14238

Soon you will see similar models from competitors." [2]

Winter is coming.

1. https://news.ycombinator.com/item?id=40624112

2. https://news.ycombinator.com/item?id=41526039

tolerance · 1m ago
And when winter does arrive, then what? The technology is slowing down while its popularity picks up. Can sparks fly out of snow?

tantalor · 1h ago
> If data is the only thing that matters, why are 95% of people working on new methods?

Because new methods unlock access to new datasets.

Edit: Oh I see this was a rhetorical question answered in the next paragraph. D'oh

b0a04gl · 1h ago
If datasets are the new codebases, then the real IP may be dataset version control: how you fork, diff, merge, and audit datasets like code. Every team says "we trained on 10B tokens", but what if we could answer "which 5M tokens made reasoning better" and "which 100k made it worse"? Then we could start applying targeted leverage.
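
A toy sketch of what diffing datasets like code could look like: content-address every example, then set-difference the fingerprints (the record format here is invented; real datasets would be sharded):

    import hashlib, json

    def fingerprints(examples):
        # content-addressed IDs let two dataset versions diff like code
        return {
            hashlib.sha256(json.dumps(ex, sort_keys=True).encode()).hexdigest(): ex
            for ex in examples
        }

    v1 = fingerprints([{"text": "the cat sat"}, {"text": "2+2=4"}])
    v2 = fingerprints([{"text": "the cat sat"}, {"text": "2+2=5"}])

    added = [v2[h] for h in v2.keys() - v1.keys()]    # new in v2
    removed = [v1[h] for h in v1.keys() - v2.keys()]  # dropped from v1
    print(added, removed)
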
rar00 · 1h ago
Disagree; there are a few organisations exploring novel paths. It's just that throwing new data at an "old" algorithm is much easier and has been a winning strategy. Also, there's no incentive for a private org to advertise a new idea that seems to be working (mine's a notable exception :D).

luppy47474 · 1h ago
Hmmm

anon291 · 1h ago
I mean, there are no new ideas for SaaS, just new applications, and that worked out pretty well.