Yambda-5B – Industrial-scale music recommendation dataset

3 tazjin 1 6/4/2025, 2:56:38 PM huggingface.co ↗

Comments (1)

tazjin · 1d ago
This is a new public dataset that can be used for training (music) recommender systems, based on anonymized user track choices (for things like "users who like $X also like $Y" and so on).
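To make the "users who like $X also like $Y" idea concrete, here is a minimal co-occurrence sketch. It runs on made-up (user, track) pairs rather than on the real dataset; the names `likes`, `build_cooccurrence`, and `also_liked` are hypothetical and say nothing about Yambda's actual schema or about how any production recommender works.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Toy (user_id, track_id) "like" events, invented for illustration only.
likes = [
    ("u1", "A"), ("u1", "B"), ("u1", "C"),
    ("u2", "A"), ("u2", "B"),
    ("u3", "B"), ("u3", "C"),
    ("u4", "A"), ("u4", "B"),
]

def build_cooccurrence(events):
    """Count, for every pair of tracks, how many users liked both."""
    tracks_by_user = defaultdict(set)
    for user, track in events:
        tracks_by_user[user].add(track)

    co = defaultdict(Counter)
    for tracks in tracks_by_user.values():
        for a, b in combinations(sorted(tracks), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co

def also_liked(co, track, k=3):
    """Return up to k tracks most often liked together with `track`."""
    return [t for t, _ in co.get(track, Counter()).most_common(k)]

co_counts = build_cooccurrence(likes)
print(also_liked(co_counts, "A"))  # ['B', 'C']
```

Raw counts like these favour popular tracks, so a real system would normally normalise them or learn embeddings instead; the sketch only shows the basic signal.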

There's more technical info in the associated paper: https://arxiv.org/abs/2505.22238

Events are timestamped, too, so this could potentially be used to train models that recommend based on the listener's current mood. I run into this all the time: I get a recommendation that would hit the spot in another mood, but it's not the right thing right now.
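One simple way timestamps could feed into this is to split each user's history into listening sessions and treat each session as a rough stand-in for "what they're in the mood for right now". The sketch below assumes toy (user, timestamp in seconds, track) tuples and an arbitrary 30-minute inactivity cutoff; both the field layout and the cutoff are my assumptions, not something taken from the dataset or the paper.

```python
from collections import defaultdict

# Assumed inactivity cutoff; a real value would need tuning on actual data.
SESSION_GAP_SECONDS = 30 * 60

# Toy (user_id, timestamp_seconds, track_id) events, invented for illustration.
events = [
    ("u1", 0, "A"), ("u1", 200, "B"), ("u1", 5000, "C"), ("u1", 5100, "D"),
    ("u2", 100, "A"),
]

def split_sessions(events, gap=SESSION_GAP_SECONDS):
    """Group each user's tracks into sessions separated by long pauses."""
    by_user = defaultdict(list)
    for user, ts, track in events:
        by_user[user].append((ts, track))

    sessions = []
    for user, items in by_user.items():
        items.sort()  # order by timestamp
        current, last_ts = [], None
        for ts, track in items:
            # A pause longer than `gap` closes the current session.
            if last_ts is not None and ts - last_ts > gap:
                sessions.append((user, current))
                current = []
            current.append(track)
            last_ts = ts
        if current:
            sessions.append((user, current))
    return sessions

for user, tracks in split_sessions(events):
    print(user, tracks)
```

Here u1 ends up with two sessions (A, B and then C, D), so a model could condition on the tracks in the current session instead of the user's whole history.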

(Disclaimer: I work at Yandex, but not on anything related to Yandex Music or this particular project. I'm an ex-Spotifier though and music recommendation tech is always interesting!)