While mapping vector embeddings of arXiv abstracts, I came across some surprising clusters that I hadn’t seen before. This post is an attempt to dig deeper into what they might represent.
yorwba · 12h ago
FYI, the link to the previous blog post is broken. You may also want to add a link to the blog's landing page somewhere near the top.
On the content itself, have you tried to find pairs of abstracts from the two clusters that differ as little as possible along the other dimensions, to see what's different about them?
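One way to try this, as a rough sketch rather than anything the post itself does: treat the difference between the two cluster centroids as the "cluster direction", project it out of every embedding, and then look for the closest cross-cluster pairs in what remains. The function name, the 0/1 label convention, and the choice of centroid difference as the separating direction are all assumptions made for illustration.

```python
import numpy as np

def closest_cross_cluster_pairs(emb, labels, k=5):
    """Hypothetical helper: return up to k (i, j, distance) triples where
    i is in cluster 0, j is in cluster 1, and distance is measured after
    removing the direction that separates the two clusters."""
    emb = np.asarray(emb, dtype=float)
    labels = np.asarray(labels)
    idx_a = np.flatnonzero(labels == 0)
    idx_b = np.flatnonzero(labels == 1)

    # Assumption: the clusters differ mainly along the centroid difference.
    direction = emb[idx_b].mean(axis=0) - emb[idx_a].mean(axis=0)
    direction /= np.linalg.norm(direction)

    # Project that direction out, keeping only the "other dimensions".
    residual = emb - np.outer(emb @ direction, direction)
    a, b = residual[idx_a], residual[idx_b]

    # Pairwise squared distances via |a|^2 + |b|^2 - 2 a.b, clipped at 0
    # to guard against small negative values from rounding.
    sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2 * a @ b.T
    dist = np.sqrt(np.maximum(sq, 0.0))

    # Pick the k smallest entries of the distance matrix.
    order = np.argsort(dist, axis=None)[:k]
    rows, cols = np.unravel_index(order, dist.shape)
    return [(int(idx_a[r]), int(idx_b[c]), float(dist[r, c]))
            for r, c in zip(rows, cols)]
```

The returned index pairs point back into the original abstract list, so the matched abstracts can be read side by side. The full distance matrix is only practical for samples of a few thousand abstracts per cluster; anything larger would need an approximate nearest-neighbor index instead.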
rbanffy · 23h ago
I love the 3D map where there is a big lobe for physics, and another for mostly everything else.
Quizzical4230 · 22h ago
Yes! However, CS is taking over: it accounted for nearly half of arXiv papers in 2024 and 2025 :)