GitHub - gabrielle-lau/sentiment-analysis-deep-learning: A Flask app for classifying an Amazon product review as positive or negative by a machine learning model trained from scratch

Sentiment Analysis

This project uses machine learning to predict whether the text of a product review is positive or negative.

Navigate to the web app: https://glau-sentiment-analysis.herokuapp.com/

Amazon Reviews Dataset

The raw Amazon reviews dataset consists of reviews from Amazon. The data span 18 years and include ~35 million reviews up to March 2013. Each record includes product and user information, a rating, and the plain-text review. The dataset can be downloaded from Stanford SNAP's website.

The dataset used in this project was constructed by Xiang Zhang (xiang.zhang@nyu.edu) from the above-mentioned Amazon reviews dataset and can be accessed from Xiang Zhang's Google Drive.

The Amazon reviews full score dataset is constructed by randomly taking 600,000 training samples and 130,000 testing samples for each review score from 1 to 5. In total there are 3,000,000 training samples and 650,000 testing samples.
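The per-score sampling described above can be sketched as follows. This is a minimal illustration, not the code used to build the dataset: the function name `sample_per_score`, the `(score, text)` record layout, and the fixed seed are all assumptions made for the example.

```python
import random
from collections import defaultdict


def sample_per_score(reviews, n_per_score, seed=0):
    """Randomly draw n_per_score reviews for each score from 1 to 5.

    `reviews` is an iterable of (score, text) pairs (hypothetical layout).
    """
    # Group review texts by their star rating.
    by_score = defaultdict(list)
    for score, text in reviews:
        by_score[score].append(text)

    rng = random.Random(seed)  # seeded so the draw is reproducible
    sampled = []
    for score in range(1, 6):
        pool = by_score[score]
        if len(pool) < n_per_score:
            raise ValueError(f"only {len(pool)} reviews with score {score}")
        # Sample without replacement within each score.
        sampled.extend((score, text) for text in rng.sample(pool, n_per_score))
    return sampled


# Totals stated for the full score dataset: 5 scores x samples per score.
assert 5 * 600_000 == 3_000_000  # training samples
assert 5 * 130_000 == 650_000    # testing samples
```

Sampling the same number of reviews for every score keeps the five classes balanced, so a classifier cannot reach a high accuracy simply by predicting the most frequent score.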

The WordCloud plots below highlight the most common words in the positive and negative classes of product reviews in the dataset.

Figure 1: WordCloud Generated on Top 100 words used in Amazon Product Reviews

(a) Positive Sentiment	(b) Negative Sentiment

Model Architecture

A CNN architecture was adopted: pre-processed text is passed to an embedding layer, followed by a fully-connected layer, with pooling and drop-out layers for regularisation. The model was trained on AWS EC2 GPUs.
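As an illustration of how pre-processed text can be turned into input for an embedding layer, here is a minimal sketch in plain Python. The tokenisation rule, the reserved ids `PAD` and `UNK`, the vocabulary size, and the sequence length are assumptions made for the example; the notebook may pre-process text differently.

```python
import re

PAD, UNK = 0, 1  # reserved ids for padding and unknown tokens (assumed convention)


def tokenise(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocab(texts, max_size=10_000):
    """Map the most frequent tokens to integer ids starting at 2."""
    counts = {}
    for text in texts:
        for tok in tokenise(text):
            counts[tok] = counts.get(tok, 0) + 1
    # Most frequent first; ties broken alphabetically for determinism.
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    return {tok: i + 2 for i, tok in enumerate(ordered[: max_size - 2])}


def encode(text, vocab, max_len=8):
    """Convert text to a fixed-length list of ids, truncating or padding."""
    ids = [vocab.get(tok, UNK) for tok in tokenise(text)][:max_len]
    return ids + [PAD] * (max_len - len(ids))


vocab = build_vocab(["Great product, works great", "Terrible product"])
print(encode("great product but terrible box", vocab))
# prints [2, 3, 1, 4, 1, 0, 0, 0]
```

Each integer id indexes a row of the embedding matrix, so every review becomes a fixed-length sequence of vectors that the later layers can process in batches.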

The Python code for training and testing the deep learning model is written in a Jupyter Notebook (amazon-review-rating-v2.ipynb).

Model Training and Evaluation

The accuracy on the test set is 83.1%. Convergence is observed at epoch 2.
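For reference, accuracy is the fraction of test reviews whose predicted class matches the true class. The sketch below uses toy labels chosen only to show the calculation; they are not results from this project.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy example: 1 = positive, 0 = negative (hypothetical labels).
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
print(f"{accuracy(y_true, y_pred):.1%}")  # 5 of 6 correct
```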

Figure 2: Evaluation of Model on Training and Validation Sets

(a) Accuracy	(b) Loss

