aryankapoorr/moviesentiment

Movie Sentiment Analysis Project

For a live demonstration of the sentiment analysis tool, visit the Movie Sentiment Analyzer

Explore the Project Notebook for an in-depth look at implementation

Project Description

As a self-proclaimed cinephile, I am always looking for ways to gauge the public's consensus on a movie before watching it. The major outlets (Rotten Tomatoes, IMDb, Metacritic, etc.) are decent options, but their ratings can be arbitrary: reviewers bring their own biases, and each person applies the scoring scale differently. Scoring the sentiment of the review text itself gives a standardized way to measure the consensus opinion of a movie.

After finding an already-cleaned dataset of movie reviews, I built a sentiment analysis model using a convolutional neural network (CNN). After experimenting with various parameters and numbers of training epochs, I fit the model and created a scoring system that weights the sentiments based on the inverse of their Gaussian distribution. Then, ~250 reviews were collected per movie and passed through the model, and each resulting score was uploaded to the website.
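The exact weighting formula lives in the project notebook and is not reproduced here. As a rough sketch of one way "inverse Gaussian" weighting could work, the example below assumes each review yields a sentiment value in [0, 1], fits a Gaussian to those values, and weights each review by the inverse of its density, so sentiments far from the mean count more. The names `movie_score`, `min_sigma`, and `max_z`, the 0 to 100 scale, and the clipping step are all hypothetical choices made for illustration.

```python
import math
from statistics import mean, pstdev


def movie_score(sentiments, min_sigma=1e-6, max_z=3.0):
    """Combine per-review sentiments (each in [0, 1]) into a 0-100 score.

    Assumption: each review is weighted by 1 / pdf(s), where pdf is a
    Gaussian fitted to the sentiments of this movie. The normalizing
    constant of the pdf cancels in the weighted mean, so only
    exp(+z^2 / 2) is needed.
    """
    if not sentiments:
        raise ValueError("at least one sentiment value is required")

    mu = mean(sentiments)
    # Guard against a zero standard deviation when all values are equal.
    sigma = max(pstdev(sentiments), min_sigma)

    weights = []
    for s in sentiments:
        # Clip the z-score so a single extreme review cannot dominate.
        z = min(abs(s - mu) / sigma, max_z)
        weights.append(math.exp(0.5 * z * z))

    weighted_sum = sum(w * s for w, s in zip(weights, sentiments))
    return 100.0 * weighted_sum / sum(weights)


if __name__ == "__main__":
    # Three enthusiastic reviews and one strongly negative outlier.
    print(round(movie_score([0.9, 0.9, 0.9, 0.1]), 2))
```

Under this assumption the outlier pulls the score well below the plain average of 70, which is the intended effect of inverse-density weighting. Whether the notebook clips, normalizes, or scales the result the same way is not stated in this README.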

Demo

Link to the project UI demo

Usage

To use the sentiment analysis tool locally, follow these steps:

  1. Clone the project notebook.
  2. Run every cell in the model building section, making sure there are no errors.
  3. Run every cell in the model testing and scoring section, making sure there are no errors.
  4. Connect your Google Drive in the appropriate cells of the database building section.
  5. Set the names variable to the movie names of your choice.
  6. Run the rest of the cells in the database building section.

Data Description

  • Model Training Data: Provided courtesy of Stanford NLP. The dataset contains 50,000 reviews with binary labels (positive/negative). To reduce model bias, only 25 reviews per movie were used.
  • Model Test Results: The results of testing the model after training are tracked on this sheet. 10,000 reviews were used for testing, and the model reached 83.75% accuracy.
  • Movie Score Database: The movie scores and poster URLs on the website come from this sheet, which is updated continually. Scores are generated from the top ~250 IMDb reviews for each movie.
  • Project Notebook: Any intermediate data and the entire process of building the model and gathering data can be found in the project notebook.
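
For reference, 83.75% accuracy on 10,000 test reviews corresponds to 8,375 correct classifications. Below is a minimal sketch of how such a figure can be computed, assuming the model outputs one probability per review and that 0.5 is the decision threshold. Both are assumptions, and the `accuracy` helper is hypothetical rather than taken from the notebook.

```python
def accuracy(probabilities, labels, threshold=0.5):
    """Return the percentage of reviews whose predicted class matches its label.

    probabilities: model outputs in [0, 1], one per review (assumed format)
    labels:        ground-truth classes, 1 = positive, 0 = negative
    threshold:     assumed cut-off for predicting "positive"
    """
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    if not labels:
        raise ValueError("at least one labeled review is required")

    correct = sum(
        int(p >= threshold) == y for p, y in zip(probabilities, labels)
    )
    return 100.0 * correct / len(labels)


if __name__ == "__main__":
    # Toy data: two of the four predictions match their labels.
    probs = [0.91, 0.12, 0.67, 0.40]
    truth = [1, 0, 0, 1]
    print(accuracy(probs, truth))
```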

License

This project is licensed under the MIT License.

Acknowledgments

  • Special thanks to Stanford NLP for allowing the use of their dataset for model training.
  • Thanks to Streamlit for providing a fantastic platform for building interactive web apps.
  • Thanks to Google Colab for providing a free and powerful environment for running Jupyter notebooks.

About

NLP Sentiment Analysis Project generating user sentiment scores for movies
