Gideon Dror, Noam Koenigstein, Yehuda Koren, Markus Weimer


The theme of the KDD Cup 2011 challenge was to identify user tastes in music by leveraging the actual Yahoo! Music dataset. Two datasets were sampled from the raw data: a larger dataset containing 262,810,175 ratings of 624,961 music items by 1,000,990 users was created for Track1, and a smaller dataset with 62,551,438 ratings of 296,111 music items by 249,012 users was created for Track2. A distinctive feature of the datasets is that there are four types of musical items: tracks, albums, artists, and genres, forming a four-level hierarchy.

The challenge started on March 15, 2011 and ended on June 30, 2011, and attracted 2389 participants, 2100 of whom were active by the end of the competition. The popularity of the challenge is related to the fact that learning large-scale recommender systems is a generic problem, highly relevant to the industry. In addition, the competition drew interest by introducing a number of scientific and technical challenges, including dataset size, hierarchical structure of items, high-resolution timestamps of ratings, and a non-conventional ranking-based task.



	Author = {Gideon Dror and Noam Koenigstein and Yehuda Koren and Markus Weimer},
	Booktitle = {Proceedings of KDDCup 2011},
	Title = {The Yahoo! Music Dataset and KDD-Cup'11},
	Year = {2011}