This could be a very big paper if its claims are reproducible. Like, approaching "Attention Is All You Need" big.
They discovered 106 new state-of-the-art linear attention architectures through a fully autonomous AI research loop. The authors are making comparisons to AlphaGo’s move 37.
yorwba · 3h ago
The part that is in principle amenable to replication is where they throw a lot of stuff at the wall and see what sticks. The part where they hype their own work, on the other hand... As a rule of thumb, if this really were a breakthrough on the level of AlphaGo, they wouldn't have to make that comparison themselves; someone else would be impressed enough to do it for them.