When an LLM is “reading” input, it is running the same computation it uses when “writing” output. At every position it predicts the next token, whether that token came from the user or from the model itself; reading and writing are the same operation.
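To make that concrete, here is a minimal sketch, assuming the Hugging Face transformers library and GPT-2 purely for illustration: a single forward pass over the prompt already yields a next-token prediction at every prompt position, which is exactly the computation that later produces the response.

```python
# Minimal sketch (assumes Hugging Face transformers + GPT-2, for illustration only):
# "reading" a prompt runs the same next-token prediction used for "writing" a response.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The capital of France is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits  # shape: (batch, prompt_length, vocab_size)

# One next-token distribution per prompt position, even though nothing has been
# "written" yet: reading is already the writing computation.
for i in range(input_ids.shape[1]):
    seen = tokenizer.decode(input_ids[0, : i + 1])
    guess = tokenizer.decode(logits[0, i].argmax().item())
    print(f"after reading {seen!r}, the model predicts {guess!r}")
```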
When the prompt ends and the response begins, the LLM has to model not only the user but also the answerer (its “self”). That “answerer model” is always conditioned on the model of the user it has just built. In other words, the LLM builds an internal state even while processing the prompt, and that state is what guides its eventual output: a kind of “interiority.”
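A companion sketch, under the same assumptions (transformers + GPT-2), illustrates the “internal state built while reading” point: the key/value cache produced while processing the prompt is carried forward and conditions every token of the answer.

```python
# Companion sketch (same assumptions: Hugging Face transformers + GPT-2):
# the state built while "reading" the prompt is what conditions the "writing".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "User: Explain entropy to a child.\nAssistant:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    # "Reading": processing the prompt builds an internal state (the KV cache).
    prefill = model(input_ids, use_cache=True)
    state = prefill.past_key_values

    # "Writing": generation starts from that state and keeps extending it.
    next_id = prefill.logits[:, -1].argmax(dim=-1, keepdim=True)
    generated = [next_id]
    for _ in range(12):
        step = model(next_id, past_key_values=state, use_cache=True)
        state = step.past_key_values
        next_id = step.logits[:, -1].argmax(dim=-1, keepdim=True)
        generated.append(next_id)

print(prompt + tokenizer.decode(torch.cat(generated, dim=-1)[0]))
```

Greedy decoding is used only to keep the sketch short; the point is that nothing new appears at the prompt/response boundary, the same state and the same computation simply continue.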
The claim is that this answerer-model is the interiority we’re talking about when we ask if LLMs have anything like consciousness. Not the weights, not the parameters, but the structural-functional organization of this emergent answerer.