Jones and his team performed this experiment with four LLMs. ChatGPT 4.5 was by far the most successful: 73% of participants identified it as the real human. Another model that goes by the unwieldy name LLaMa-3.1-405B was identified as human 56% of the time. (The other two models—ELIZA and GPT-4o—achieved 23% and 21% success rates, respectively, and will not be spoken of again.)
By ELIZA, are they referring to the classic ELIZA? I am not aware of anything current with the same name.
If the old ELIZA succeeded 23% of the time, in the context of the other numbers ... that seems ... odd.
allears · 12m ago
Pop science indeed. Nothing new here. The Turing Test was the product of a much earlier era. Our machines today can easily fake a conversation, but there's been little progress in defining what intelligence is, let alone consciousness. Whatever they are, it's clear that LLMs don't have them, and aren't on track to produce them.
NitpickLawyer · 16m ago
Now the Turing test isn't a good test. So the goalposts keep on moving. "AI is anything that hasn't been done yet" (a quote from the 1980s).