Skip to content

MadKea/COS781_Project

Repository files navigation

====================
hetrec2011-lastfm-2k
====================

-------
Version
-------

Version 1.0 (May 2011)

-----------
Description
-----------

    This dataset contains social networking, tagging, and music artist listening information 
    from a set of 2K users from Last.fm online music system.
    http://www.last.fm 

    The dataset is released in the framework of the 2nd International Workshop on 
    Information Heterogeneity and Fusion in Recommender Systems (HetRec 2011) 
    http://ir.ii.uam.es/hetrec2011 
    at the 5th ACM Conference on Recommender Systems (RecSys 2011)
    http://recsys.acm.org/2011 

---------------
Data statistics
---------------

    1892 users
   17632 artists
      
   12717 bi-directional user friend relations, i.e. 25434 (user_i, user_j) pairs
         avg. 13.443 friend relations per user
         
   92834 user-listened artist relations, i.e. tuples [user, artist, listeningCount]
         avg. 49.067 artists most listened by each user
         avg. 5.265 users who listened each artist
            
   11946 tags  
   
  186479 tag assignments (tas), i.e. tuples [user, tag, artist]
         avg. 98.562 tas per user
         avg. 14.891 tas per artist
         avg. 18.930 distinct tags used by each user
         avg. 8.764 distinct tags used for each artist

-----
Files
-----
   	    
   * artists.dat
   
        This file contains information about music artists listened and tagged by the users.
   
   * tags.dat
   
   	This file contains the set of tags available in the dataset.

   * user_artists.dat
   
        This file contains the artists listened by each user.
        
        It also provides a listening count for each [user, artist] pair.

   * user_taggedartists.dat - user_taggedartists-timestamps.dat
   
        These files contain the tag assignments of artists provided by each particular user.
        
        They also contain the timestamps when the tag assignments were done.
   
   * user_friends.dat
   
   	These files contain the friend relations between users in the database.
     
-----------
Data format
-----------

   The data is formatted one entry per line as follows (tab separated, "\t"):

   * artists.dat
   
        id \t name \t url \t pictureURL

        Example:
        707	Metallica	http://www.last.fm/music/Metallica	http://userserve-ak.last.fm/serve/252/7560709.jpg

   * tags.dat
 
        tagID \t tagValue
        1	metal
 
   * user_artists.dat
   
        userID \t artistID \t weight
        2	51	13883
   
   * user_taggedartists.dat
  
        userID \t artistID \t tagID \t day \t month \t year
        2	52	13	1	4	2009  
  
   * user_taggedartists-timestamps.dat

        userID \t artistID \t tagID \t timestamp
        2	52	13	1238536800000

   * user_friends.dat

        userID \t friendID
        2	275

------- 
License
-------

   The users' names and other personal information in Last.fm are not provided in the dataset.

   The data contained in hetrec2011-lastfm-2k.zip is made available for non-commercial use.
   
   Those interested in using the data in a commercial context should contact Last.fm staff:
   http://www.lastfm.com/about/contact

----------------
Acknowledgements
----------------

   This work was supported by the Spanish Ministry of Science and Innovation (TIN2008-06566-C04-02), 
   and the Regional Government of Madrid (S2009TIC-1542).

----------
References
----------

   When using this dataset you should cite:
      - Last.fm website, http://www.lastfm.com

   You may also cite HetRec'11 workshop as follows:

   @inproceedings{Cantador:RecSys2011,
      author = {Cantador, Iv\'{a}n and Brusilovsky, Peter and Kuflik, Tsvi},
      title = {2nd Workshop on Information Heterogeneity and Fusion in Recommender Systems (HetRec 2011)},
      booktitle = {Proceedings of the 5th ACM conference on Recommender systems},
      series = {RecSys 2011},
      year = {2011},
      location = {Chicago, IL, USA},
      publisher = {ACM},
      address = {New York, NY, USA},
      keywords = {information heterogeneity, information integration, recommender systems},
   } 

-------
Credits
-------

   This dataset was built by Ignacio Fernández-Tobías with the collaboration of Iván Cantador and Alejandro Bellogín, 
   members of the Information Retrieval group at Universidad Autonoma de Madrid (http://ir.ii.uam.es)

-------   
Contact
-------

   Iván Cantador, ivan [dot] cantador [at] uam [dot] es

About

Semester 2

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors