> Related to this, I will not be commenting on any self-reported AI competition performance results for which the methodology was not disclosed in advance of the competition.
what a badass
amelius · 1h ago
Yes, I think it is disingenuous of OpenAI to make ill-supported claims about things that can affect us in important ways, shaping our worldview and our place in the world as an intelligent species. They should be corrected here, and TT is doing a good job.
roxolotl · 44m ago
This does a great job illustrating the challenges with arguing over these results. Those in the AGI camp will argue that the alterations are mostly what makes the AI so powerful.
Multiple days' worth of processing, cross-communication, picking only the best result? That's just the power of parallel processing and how they reason so well. Altering to a more standard prompt? Communicating in a stricter natural language helps reduce confusion. Calculator access and the vast knowledge of humanity built in? That's the whole point.
I tend to side with Tao on this one but the point is less who’s right and more why there’s so much arguing past each other. The basic fundamentals of how to judge these tools aren’t agreed upon.
svat · 1h ago
Great set of observations, and indeed it's worth remembering that the specific details of assistance and setup make a difference of several orders of magnitude. And ha, he edited the last post in the thread to add this comment:
> Related to this, I will not be commenting on any self-reported AI competition performance results for which the methodology was not disclosed in advance of the competition. (3/3)
(This wasn't there when I first read the thread yesterday, 18 hours ago; it was edited in 15 hours ago, i.e. 3 hours later.)
It's one of the things to admire about Terence Tao: he's insightful even when commenting on matters outside mathematics, while always having the mathematician's discipline of not drawing confident conclusions when data is missing.
I was reminded of this because of a recent thread where some HN commenter expected him to make predictions about the future (https://news.ycombinator.com/item?id=44356367). Also reminded of Sherlock Holmes (from A Scandal in Bohemia):
> “This is indeed a mystery,” I remarked. “What do you imagine that it means?”
> “I have no data yet. It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”
Edit: BTW, seeing some other commentary (here and elsewhere) about these posts is very disappointing — even when Tao explicitly says he's not commenting about any specific claim (like that of OpenAI), many people seem to be eager to interpret his comments as being about that claim: people's tendency for tribalism / taking “sides” is so great that they want to read this as Tao caring about the same things they care about, rather than him using the just-concluded IMO as an illustration for the point he's actually making (that results are sensitive to details). In fact his previous post (https://mathstodon.xyz/@tao/114877789298562646) was about “There was not an official controlled competition set up for AI models for this year’s IMO […] Hopefully by next year we will have a controlled environment to get some scientific comparisons and evaluations”.
largbae · 28m ago
I feel like everyone who treats AGI as "the goal" is wasting energy that could be applied towards real problems right now.
AI in general has given humans great leverage in processing information, more than we have ever had before. Do we need AGI to start applying this wonderful leverage toward our problems as a species?
johnecheck · 1h ago
My thoughts were similar. OpenAI, very cool result! Very exciting claim! Yet meaningless in the form of a Twitter thread with no real details.