New Dataset - Review domain #74

@soroush-ziaeinejad

This issue is for discussing the addition of a new dataset. After searching online, and on this website in particular, I found the Amazon Reviews dataset to be a better fit than the other existing datasets, for these reasons:
1- It comes from Amazon and covers many product categories (useful for our topic modeling step)
2- Amazon is one of the most widely used review platforms, and the dataset is well known and trustworthy
3- It is fairly recent: reviews were collected up to 2018
4- It spans reviews from 1996 to 2018, which lets us add more temporal contributions
5- It has a version called 5-core, a subset of the data in which every user and every item has at least 5 reviews, which avoids sparsity
6- It also includes metadata for all items that appear in the reviews
7- The keys of each record include, but are not limited to: reviewerID, reviewText, summary, and reviewTime
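
To make points 4, 5, and 7 concrete, here is a minimal Python sketch of how a 5-core style filter and a year lookup could work on records with these keys. It is only an illustration, not the procedure used to build the published 5-core files. Assumptions not stated above: the item identifier key is called `asin`, `reviewTime` is a string such as `"09 13, 2009"`, and the helper names `k_core` and `review_year` are hypothetical. The toy data uses `k=2` so the effect is visible on six records.

```python
from collections import Counter
from datetime import datetime


def k_core(records, k=5, user_key="reviewerID", item_key="asin"):
    """Repeatedly drop reviews whose user or item has fewer than k reviews.

    item_key="asin" is an assumption about the item identifier field.
    """
    kept = list(records)
    while True:
        user_counts = Counter(r[user_key] for r in kept)
        item_counts = Counter(r[item_key] for r in kept)
        filtered = [
            r for r in kept
            if user_counts[r[user_key]] >= k and item_counts[r[item_key]] >= k
        ]
        # Removing one review can push another user or item below k,
        # so keep filtering until nothing changes.
        if len(filtered) == len(kept):
            return filtered
        kept = filtered


def review_year(record, fmt="%m %d, %Y"):
    """Extract the year from reviewTime (the date format is an assumption)."""
    return datetime.strptime(record["reviewTime"], fmt).year


# Toy records (made up for illustration, not taken from the dataset).
toy = [
    {"reviewerID": "u1", "asin": "i1", "reviewTime": "09 13, 2009"},
    {"reviewerID": "u1", "asin": "i2", "reviewTime": "01 05, 2012"},
    {"reviewerID": "u2", "asin": "i1", "reviewTime": "03 21, 2015"},
    {"reviewerID": "u2", "asin": "i2", "reviewTime": "07 30, 2016"},
    {"reviewerID": "u3", "asin": "i1", "reviewTime": "11 02, 2017"},
    {"reviewerID": "u3", "asin": "i3", "reviewTime": "12 25, 2018"},
]

core = k_core(toy, k=2)
years = sorted({review_year(r) for r in core})
```

With `k=2`, item `i3` has a single review, so its review is dropped first; that leaves user `u3` with one review, which is dropped on the next pass. Only the four reviews by `u1` and `u2` remain, which is why the filter has to iterate rather than run once.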

Other possible datasets are listed below:
