This is a Content-Based Movie Recommender System built in Python. The system recommends movies similar to a user-selected movie based on title and genre metadata using TF-IDF vectorization and cosine similarity.
This project was created to demonstrate skills in Python, data preprocessing, machine learning, and web app deployment.
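As a rough illustration of the approach described above, the sketch below vectorizes a few made-up title-plus-genre strings with scikit-learn's `TfidfVectorizer` and ranks movies by cosine similarity. The toy catalogue and the `recommend` helper are assumptions for demonstration only and are not taken from the project's `recommender.py`.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy catalogue (made up for illustration): title plus genres as one string.
titles = ["Toy Story", "Jumanji", "Heat", "Toy Story 2"]
texts = [
    "toy story adventure animation children comedy",
    "jumanji adventure children fantasy",
    "heat action crime thriller",
    "toy story 2 adventure animation children comedy",
]

# Turn each text into a TF-IDF vector, then compare every pair of movies.
tfidf = TfidfVectorizer().fit_transform(texts)
similarity = cosine_similarity(tfidf)  # similarity[i, j] compares movie i with movie j


def recommend(title, n=2):
    """Return the n titles most similar to `title`, excluding the title itself."""
    idx = titles.index(title)
    ranked = similarity[idx].argsort()[::-1]  # highest score first
    return [titles[i] for i in ranked if i != idx][:n]


print(recommend("Toy Story"))  # ['Toy Story 2', 'Jumanji']
```

Movies that share more title words and genres get higher scores, so the sequel ranks first and the unrelated crime film is left out.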
- Content-based recommendation using TF-IDF and cosine similarity.
- Preprocessing includes cleaning titles and genres for accurate matching.
- Interactive web interface using Streamlit:
  - Select a movie from a dropdown
  - Choose the number of recommendations
  - Optional: Display movie posters from the TMDB API
- Save/load model artifacts using joblib and numpy.
- Modular, well-structured project for easy extension.
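The exact cleaning rules are not spelled out here, so the following is only a sketch of what title and genre preprocessing might look like for MovieLens-style rows, where titles carry a release year and genres are pipe-separated. The function names `clean_title`, `clean_genres`, and `build_text` are hypothetical.

```python
import re


def clean_title(title):
    """Lowercase a title and drop a trailing release year such as ' (1995)'."""
    without_year = re.sub(r"\s*\(\d{4}\)\s*$", "", title)
    return without_year.strip().lower()


def clean_genres(genres):
    """Turn 'Adventure|Animation' into 'adventure animation'."""
    if genres == "(no genres listed)":  # placeholder value used by MovieLens
        return ""
    return genres.replace("|", " ").lower()


def build_text(title, genres):
    """Combine the cleaned title and genres into the text that gets vectorized."""
    return f"{clean_title(title)} {clean_genres(genres)}".strip()


print(build_text("Toy Story (1995)", "Adventure|Animation|Children|Comedy|Fantasy"))
# toy story adventure animation children comedy fantasy
```

Stripping the year keeps unrelated films released in the same year from looking similar just because they share a token.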
movie-recommender/
│
├── app/
│   └── streamlit_app.py      # Streamlit frontend
├── data/
│   ├── movies.csv            # MovieLens dataset
│   └── ratings.csv           # Movie ratings
├── models/                   # Saved model artifacts (after build)
├── notebooks/
│   └── exploration.ipynb     # Data exploration and testing
├── tests/
│   └── test_recommender.py   # Unit tests
├── recommender.py            # Core recommendation logic
├── requirements.txt          # Python dependencies
└── README.md
- Home Page / Movie Selection
- Recommendations
Follow these steps to run the project locally:
git clone https://github.com/YOUR_USERNAME/movie-recommender.git
cd movie-recommender
python -m venv venv
.\venv\Scripts\Activate.ps1
⚠️ If activation is blocked, run once:
Set-ExecutionPolicy RemoteSigned -Scope CurrentUser
pip install -r requirements.txt
Go to MovieLens Small Dataset
Download and extract movies.csv and ratings.csv into the data/ folder
python recommender.py
This builds the TF-IDF vectors, the similarity matrix, and the title index, and saves them in the models/ folder.
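How the artifacts are written is not shown here, so the sketch below only illustrates one plausible way to persist and reload a similarity matrix and a title index with numpy and joblib. The file names and the tiny hand-written matrix are assumptions, and a temporary directory stands in for models/.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np

# Stand-in artifacts (made up): a 3x3 similarity matrix and a title -> row lookup.
similarity = np.array([
    [1.0, 0.4, 0.0],
    [0.4, 1.0, 0.1],
    [0.0, 0.1, 1.0],
])
title_index = {"Toy Story": 0, "Jumanji": 1, "Heat": 2}

# A temporary directory stands in for models/ so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    models_dir = Path(tmp)

    # Save: numpy handles the dense array, joblib handles the Python object.
    np.save(models_dir / "similarity.npy", similarity)
    joblib.dump(title_index, models_dir / "title_index.joblib")

    # Load: this is what an app could do at startup instead of rebuilding.
    loaded_similarity = np.load(models_dir / "similarity.npy")
    loaded_index = joblib.load(models_dir / "title_index.joblib")

print(loaded_similarity.shape, loaded_index["Heat"])
```

Loading precomputed artifacts keeps app startup fast, since the TF-IDF and similarity computation only has to run once at build time.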
streamlit run app\streamlit_app.py
Open the URL printed in PowerShell (usually http://localhost:8501)
Select a movie and view recommendations
python -m unittest discover -v
Libraries Used

| Library | Purpose |
| --- | --- |
| pandas | Data manipulation, reading CSVs |
| numpy | Numerical operations, saving/loading arrays |
| scikit-learn | TF-IDF vectorization and cosine similarity |
| streamlit | Web interface for interactive recommendations |
| requests | Optional: Fetch movie posters from TMDB |
| joblib | Save and load model artifacts efficiently |
| unittest | Unit testing framework |
Select a movie from the dropdown
Adjust the number of recommendations
Optional: Input your TMDB API key to show posters
Click "Get Recommendations" to see the top similar movies
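For the optional poster step, a lookup along the following lines may be what happens behind the scenes. This is a sketch only: the helper names `poster_url` and `fetch_poster` are hypothetical, and the endpoint, parameters, and image base URL reflect the TMDB v3 API as commonly documented, so check them against the current TMDB documentation before relying on them.

```python
def poster_url(poster_path, size="w500"):
    """Build a full image URL from the poster_path field of a TMDB result."""
    if not poster_path:
        return None
    return f"https://image.tmdb.org/t/p/{size}{poster_path}"


def fetch_poster(title, api_key, timeout=5):
    """Search TMDB for `title` and return the first result's poster URL, or None."""
    import requests  # imported lazily so poster_url works without the dependency

    response = requests.get(
        "https://api.themoviedb.org/3/search/movie",
        params={"api_key": api_key, "query": title},
        timeout=timeout,
    )
    response.raise_for_status()
    results = response.json().get("results", [])
    return poster_url(results[0].get("poster_path")) if results else None


print(poster_url("/abc123.jpg"))
# https://image.tmdb.org/t/p/w500/abc123.jpg
```

Returning `None` when no result or poster exists lets the interface fall back to showing the title alone.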
Include plot summaries, cast, or keywords for better similarity
Deploy online via Streamlit Cloud for portfolio showcase
Scikit-learn: TF-IDF Vectorizer
Streamlit Documentation
Chidwan AD

