Description
This issue has been created to discuss adding a new dataset. After searching the internet, and this website in particular, I found the Amazon Reviews dataset to be a better fit than the other existing datasets for the following reasons:
1- It comes from Amazon and covers different categories (good for our topic modeling step)
2- Amazon is one of the most widely used review platforms, and the dataset is well-known and trustworthy
3- It is fairly recent: reviews were collected up to 2018
4- Its reviews span 1996 to 2018, which lets us add more time-related contributions
5- It has a version called 5-core, a subset of the data in which every user and every item has at least 5 reviews, to avoid sparsity
6- It also includes metadata for all items in the reviews
7- The keys of each record include, but are not limited to: reviewerID, reviewText, summary, and reviewTime
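To make the 5-core idea in point 5 concrete, here is a minimal sketch of a k-core filter over review records. It assumes each record is a dict with a `reviewerID` key (listed above) and an item key named `asin`, which is not mentioned above and is an assumption here. The function name `k_core` is hypothetical, and whether the published 5-core subset was built exactly this way is also an assumption.

```python
from collections import Counter


def k_core(reviews, k=5, user_key="reviewerID", item_key="asin"):
    """Keep only reviews whose user and item each have at least k reviews.

    Dropping a review can push another user or item below k, so the
    filter is repeated until nothing more is removed.
    NOTE: item_key="asin" is an assumed field name, not taken from the issue.
    """
    kept = list(reviews)
    while True:
        user_counts = Counter(r[user_key] for r in kept)
        item_counts = Counter(r[item_key] for r in kept)
        filtered = [
            r for r in kept
            if user_counts[r[user_key]] >= k and item_counts[r[item_key]] >= k
        ]
        if len(filtered) == len(kept):
            # Stable: every remaining user and item has at least k reviews.
            return filtered
        kept = filtered


# Tiny illustrative example with k=2 (made-up records, not real data).
sample = [
    {"reviewerID": "u1", "asin": "i1"},
    {"reviewerID": "u1", "asin": "i2"},
    {"reviewerID": "u2", "asin": "i1"},
    {"reviewerID": "u2", "asin": "i2"},
    {"reviewerID": "u3", "asin": "i1"},  # u3 has only one review
]
dense = k_core(sample, k=2)
```

In this example the single review by `u3` is dropped, and the remaining four reviews already satisfy the condition, so the loop stops after one more pass.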
Other possible datasets are listed below: