Language Models in Plato's Cave

Posted by latentnumber · 6/29/2025, 2:31:31 PM · sergeylevine.substack.com

Comments (2)

latentnumber · 5h ago
> In science, we might suppose that the simpler, more elegant, and more powerful a theory is, the more likely it is to be the right one – there are many ways to write down an equation to describe the oscillation of a spring, but we take Hooke’s law to be the “right” theory because it provides both remarkable simplicity and high predictive power. Similarly, we might suppose that if we have an algorithm that is simple, elegant, and performs essentially the same functions as the human mind, then it is likely to be the right model of the mind’s computational processes. That is, if LLMs are trained with a simple algorithm and acquire functionality that resembles that of the mind, then their underlying algorithm should also resemble the algorithm by which the mind acquires its functionality. However, there is one very different alternative explanation: instead of acquiring their capabilities by observing the world in the way humans do, LLMs might acquire their capabilities by observing the human mind and copying its function. Instead of implementing a learning process that can learn how the world works, they implement an incredibly indirect process for scanning human brains to construct a crude copy of human cognitive processes.
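
For concreteness, the spring example can be written compactly in standard physics notation (the symbols below are the usual ones, not taken from the article): Hooke's law says the restoring force is proportional to displacement, and together with Newton's second law it predicts simple harmonic motion with a single frequency parameter.

```latex
% Hooke's law: restoring force proportional to displacement
F = -kx
% Combined with Newton's second law, m\ddot{x} = -kx, the motion is
% simple harmonic with angular frequency \omega = \sqrt{k/m}:
x(t) = A \cos\!\left(\omega t + \varphi\right), \qquad \omega = \sqrt{k/m}
```

Two parameters (amplitude and phase set by initial conditions, frequency set by $k/m$) suffice to predict the entire trajectory, which is the kind of simplicity-plus-predictive-power the quoted passage appeals to.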
latentnumber · 5h ago
> This explains why video prediction models that learn about the physical world have so far not yielded the same results as next-token prediction on language: while we might hope that models that learn from video would acquire representations of the physical world in the same way that humans learn through experience, LLMs have managed to skip this step and simply copy some aspects of human mental representations, without having to figure out the learning algorithm that allowed humans to acquire those representations in the first place.
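
To make the contrast in that quote concrete, here is a minimal sketch of the two training objectives being compared, assuming PyTorch; the function names, tensor shapes, and loss choices are illustrative, not taken from the post.

```python
# Sketch contrasting the two objectives: next-token prediction on text
# (what LLMs are trained on) vs. next-frame prediction on video.
import torch
import torch.nn.functional as F


def next_token_loss(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    """Cross-entropy on the next token.

    logits: (batch, seq_len, vocab_size) from some language model.
    tokens: (batch, seq_len) integer token ids; the target is the input
            shifted by one position.
    """
    pred = logits[:, :-1, :]   # predictions for positions 0..T-2
    target = tokens[:, 1:]     # the "next token" at each of those positions
    return F.cross_entropy(pred.reshape(-1, pred.size(-1)), target.reshape(-1))


def next_frame_loss(predicted_frames: torch.Tensor, frames: torch.Tensor) -> torch.Tensor:
    """Regression on the next video frame.

    frames: (batch, time, channels, height, width).
    predicted_frames: the model's guess for frames[:, 1:] given frames[:, :-1].
    """
    return F.mse_loss(predicted_frames, frames[:, 1:])


if __name__ == "__main__":
    logits = torch.randn(2, 16, 1000)           # (batch, seq_len, vocab)
    tokens = torch.randint(0, 1000, (2, 16))    # (batch, seq_len)
    frames = torch.randn(2, 8, 3, 32, 32)       # (batch, time, C, H, W)
    pred_frames = torch.randn(2, 7, 3, 32, 32)  # guess for frames[:, 1:]
    print(next_token_loss(logits, tokens).item())
    print(next_frame_loss(pred_frames, frames).item())
```

The point of the quoted argument is that the two objectives look structurally similar, yet the text objective is fit to data that already encodes the outputs of human cognition, while the video objective must recover structure in the raw physical world directly.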