Resham011/Movie-Recommendation-System


🎬 Movie Recommender System

Streamlit App · Python · License: MIT · Live Demo

📖 Overview

This project is a content-based movie recommender system: given a selected title, it suggests films with similar metadata. It is built on the TMDB 5000 Movies dataset (roughly 5,000 films) and compares movies by their genres, keywords, cast, crew, and plot descriptions.

🧠 Machine Learning & NLP Workflow

To build this engine, I implemented a three-stage Natural Language Processing (NLP) pipeline:

1. Data Engineering

  • Feature Extraction: Combined overview, genres, keywords, cast (top 3 actors), and crew (director) into a single "tags" column.
  • Text Preprocessing: Lowercased all text and handled special characters so that the same term always maps to the same token.
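
The feature-extraction step above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes the genre, keyword, cast, and crew fields arrive as JSON-encoded lists of objects, the helper names `names`, `directors`, and `build_tags` are hypothetical, and the sample record is made up.

```python
import json


def names(raw, limit=None):
    """Return the "name" of each entry in a JSON-encoded list of objects."""
    items = [entry["name"] for entry in json.loads(raw)]
    return items if limit is None else items[:limit]


def directors(raw_crew):
    """Return the names of crew members whose job is "Director"."""
    return [m["name"] for m in json.loads(raw_crew) if m.get("job") == "Director"]


def build_tags(movie):
    """Merge overview, genres, keywords, top-3 cast and director into one lowercase string."""
    parts = movie["overview"].split()
    parts += names(movie["genres"])
    parts += names(movie["keywords"])
    parts += names(movie["cast"], limit=3)  # keep only the first three actors
    parts += directors(movie["crew"])
    return " ".join(parts).lower()


# Made-up record, shaped like one row of the dataset (assumption).
movie = {
    "overview": "A lone pilot explores a distant moon.",
    "genres": '[{"id": 1, "name": "Adventure"}]',
    "keywords": '[{"id": 2, "name": "space travel"}]',
    "cast": '[{"name": "Actor One"}, {"name": "Actor Two"}, '
            '{"name": "Actor Three"}, {"name": "Actor Four"}]',
    "crew": '[{"name": "Director One", "job": "Director"}, '
            '{"name": "Editor One", "job": "Editor"}]',
}
tags = build_tags(movie)
```

Here the fourth actor and the editor are dropped, so `tags` contains only the fields the pipeline is described as using.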

2. Vectorization (Bag of Words)

  • Technique: Used CountVectorizer from scikit-learn.
  • Strategy: Converted each movie's tags into a 5,000-dimensional vector of word counts, removing standard English stop words so that the vectors emphasize distinctive terms.
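
A minimal sketch of this vectorization step, using toy tag strings. The `max_features=5000` setting is an assumption inferred from the 5,000-dimensional vectors mentioned above; the notebook's exact parameters may differ.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Toy tag strings standing in for the real "tags" column (made up).
tags = [
    "space adventure on a distant moon",
    "a detective and a robot solve a crime in the city",
    "a robot explores a distant moon in space",
]

# Assumption: cap the vocabulary at 5,000 terms and drop English stop words.
vectorizer = CountVectorizer(max_features=5000, stop_words="english")
vectors = vectorizer.fit_transform(tags).toarray()

# Maps each kept term to its column index in `vectors`.
vocabulary = vectorizer.vocabulary_
```

With only three short strings the vocabulary is far smaller than 5,000 terms, so the cap has no effect here; on the full dataset it would keep only the most frequent terms.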

3. Similarity Measurement (Cosine Similarity)

  • Instead of Euclidean distance, I used Cosine Similarity to measure how similar two movie vectors are.
  • The Logic: For sparse, high-dimensional count vectors, the angle between two vectors reflects how much vocabulary they share regardless of how long each tag string is, whereas straight-line distance is dominated by vector magnitude.
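
To illustrate the point, here is a small NumPy sketch. It computes cosine similarity by hand to make the formula visible (scikit-learn's `cosine_similarity` would be the library equivalent); the function names `cosine_similarity_matrix` and `recommend` and the toy vectors are hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between the rows of a 2-D array."""
    vectors = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.clip(norms, 1e-12, None)  # guard against all-zero rows
    return unit @ unit.T


def recommend(index, similarity, k=5):
    """Indices of the k rows most similar to `index`, excluding the row itself."""
    order = np.argsort(similarity[index])[::-1]  # highest similarity first
    return [int(i) for i in order if i != index][:k]


# Toy count vectors: row 1 is row 0 with every count doubled (same mix of
# terms, longer text); row 2 uses mostly different terms.
vectors = np.array([
    [1, 1, 0, 0],
    [2, 2, 0, 0],
    [0, 1, 3, 1],
])
similarity = cosine_similarity_matrix(vectors)
euclid_01 = np.linalg.norm(vectors[0] - vectors[1])
```

Rows 0 and 1 point in the same direction, so their cosine similarity is 1 even though their Euclidean distance is non-zero; `recommend(0, similarity, k=2)` therefore ranks row 1 ahead of row 2.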

🏗️ Engineering & Deployment Challenges

  • Git LFS Integration: The similarity matrix (similarity.pkl) exceeded the file size limit for regular Git pushes. I set up Git LFS to track and version this large model artifact.
  • Optimization: Switched the app from downloading its assets at startup to loading pre-bundled local files, reducing the application boot time by 90%.
  • API Integration: Integrated the TMDB API to fetch each recommended movie's poster by its movie ID, enhancing the visual experience.
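
The poster lookup in the last bullet might look something like the sketch below. It assumes the TMDB v3 movie-details endpoint and the `w500` image size; the function names `poster_url` and `fetch_poster` are hypothetical, the API key is supplied by the caller, and the app's actual code may use a different HTTP client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoints: TMDB v3 movie details and the w500 poster size.
TMDB_MOVIE_URL = "https://api.themoviedb.org/3/movie/{movie_id}"
TMDB_IMAGE_BASE = "https://image.tmdb.org/t/p/w500"


def poster_url(movie_details):
    """Build a full poster URL from a movie-details payload, or None if no poster is listed."""
    path = movie_details.get("poster_path")
    return f"{TMDB_IMAGE_BASE}{path}" if path else None


def fetch_poster(movie_id, api_key, timeout=5):
    """Request movie details from TMDB and return the poster URL (performs a network call)."""
    query = urlencode({"api_key": api_key})
    url = TMDB_MOVIE_URL.format(movie_id=movie_id) + "?" + query
    with urlopen(url, timeout=timeout) as response:
        return poster_url(json.load(response))
```

Keeping the URL construction in `poster_url` separate from the network call makes it easy to handle movies without a poster and to test the logic offline.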

📂 Project Structure

├── app.py                # Main Streamlit UI & Logic
├── model.ipynb           # Data Analysis, Preprocessing & Model Training
├── movie_list.pkl        # Processed Movie DataFrame
├── similarity.pkl        # Pre-computed Similarity Matrix (via Git LFS)
├── requirements.txt      # Python Dependencies
└── README.md             # Project Documentation

🚀 How to Run Locally

  1. Clone the repository:
git clone https://github.com/Resham011/Movie-Recommendation-System.git
cd Movie-Recommendation-System
  2. Install dependencies:
pip install -r requirements.txt
  3. Run the application:
streamlit run app.py

🌐 Live Demos

Platform | Link
🤗 Hugging Face Spaces | Movie-Recommendation-System
☁️ Streamlit Cloud | Live App

🛠️ Tech Stack

  • Core: Python, Pandas, NumPy
  • Machine Learning: Scikit-Learn (CountVectorizer, Cosine Similarity)
  • Web Framework: Streamlit
  • Version Control: Git & Git LFS
  • Hosting: Hugging Face Spaces & Streamlit Cloud

👤 Author

Resham
