Inspiration: As a developer who binge-watches TV shows and movies, I eventually struggled to find something new to watch, and I realized people around me might be in the same boat. That is why I built this recommender system, which targets the “hidden-gem shows” on mainstream platforms.
Product: A content-based filtering engine that generates a list of show recommendations from user inputs such as a previously watched show, ages, genres, and production countries.
- Datasets: Amazon Prime, Netflix, and Apple TV+ Movies and TV Shows datasets on Kaggle
- All use a CC0 license, so they can be used freely
- Cleaning/Transforming Dataframes
- Drop all the duplicate and redundant columns
- Apply one-hot encoding and z-score normalization to the remaining columns (one-hot encoding also works well on list-based and categorical columns)
- Save the transformed data to disk with the joblib library
- Improves performance because the data does not have to be re-parsed on every run
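The encoding and normalization steps above can be sketched in plain Python. This is a minimal illustration rather than the project's actual pipeline: the rows, the column names (`genres`, `release_year`), and the helper names `one_hot` and `z_score` are all assumptions made for the example.

```python
from statistics import mean, pstdev

# Hypothetical rows; the column names are illustrative, not the real dataset schema.
shows = [
    {"title": "A", "genres": ["drama", "crime"], "release_year": 2010},
    {"title": "B", "genres": ["comedy"], "release_year": 2020},
    {"title": "C", "genres": ["drama"], "release_year": 2015},
]


def one_hot(values_per_row):
    """Encode a list-valued column as one 0/1 column per distinct value."""
    vocab = sorted({v for values in values_per_row for v in values})
    matrix = [[1.0 if v in values else 0.0 for v in vocab] for values in values_per_row]
    return vocab, matrix


def z_score(column):
    """Standardize a numeric column to zero mean and unit (population) variance."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


genre_vocab, genre_matrix = one_hot([s["genres"] for s in shows])
year_z = z_score([s["release_year"] for s in shows])

# One feature vector per show: the genre indicators followed by the scaled year.
features = [genres + [year] for genres, year in zip(genre_matrix, year_z)]
```

In a real run, the resulting `features` could then be cached with `joblib.dump(features, "features.joblib")` and restored with `joblib.load("features.joblib")`; the file name here is hypothetical.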
ML Model used: K-Nearest-Neighbors with Cosine Similarity
- Use Case: Suited to datasets with structured features (e.g., genre, artist, year for music recommendation), which matches our data
- How it Works: Finds the k most similar items based on a given title and/or user preferences
- Load data into the engine
- Feed the model the cosine similarity of the transformed dataframe
- Recommendation Generation:
- For a given title: retrieves the most similar ones from the dataset.
- Applies filters such as min_year, genre, and min_imdb to narrow the results
- Model Persistence: Save the backend model and data as .onnx and .json files
- Link to backend: Serve the .onnx file in Next.js. Render the recommended TV shows/movies back to our website as cards.
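The retrieval and filtering steps described above can be sketched as follows. This is a simplified, pure-Python illustration under stated assumptions: the `catalog` entries, the feature vectors, and the `recommend` function are hypothetical, and only the filter names `min_year`, `genre`, and `min_imdb` come from the write-up. The sketch applies the filters before taking the top k so that k results are still returned when enough candidates pass; the project may order these steps differently.

```python
import math

# Hypothetical catalogue: a feature vector plus the metadata used for filtering.
catalog = [
    {"title": "Show A", "year": 2012, "imdb": 8.1, "genres": {"drama"}, "vec": [1.0, 0.0, 1.0]},
    {"title": "Show B", "year": 2019, "imdb": 7.4, "genres": {"drama", "crime"}, "vec": [1.0, 0.2, 0.9]},
    {"title": "Show C", "year": 2005, "imdb": 6.0, "genres": {"comedy"}, "vec": [0.0, 1.0, 0.0]},
    {"title": "Show D", "year": 2021, "imdb": 8.6, "genres": {"drama"}, "vec": [0.9, 0.1, 1.0]},
]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(title, k=2, min_year=None, genre=None, min_imdb=None):
    """Return up to k titles most similar to `title` that pass the optional filters."""
    query = next(item for item in catalog if item["title"] == title)
    scored = []
    for item in catalog:
        if item is query:
            continue  # never recommend the query title itself
        if min_year is not None and item["year"] < min_year:
            continue
        if genre is not None and genre not in item["genres"]:
            continue
        if min_imdb is not None and item["imdb"] < min_imdb:
            continue
        scored.append((cosine_similarity(query["vec"], item["vec"]), item["title"]))
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


top_two = recommend("Show A", k=2)
recent_and_rated = recommend("Show A", k=2, min_year=2015, min_imdb=8.0)
```

With this toy data, "Show D" ranks ahead of "Show B" for "Show A" because its vector points in a slightly closer direction, and the IMDb filter removes "Show B" from the second call.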
- Consider using a movie/TV show API so we can apply this recommender system to every movie, instead of a limited dataset
- Add functionality to the recommender: make predictions on unseen data (not limited to the existing dataset)
- Evaluate the model using the MAP@K metric
- Checks that relevant, highly rated titles are ranked near the top, so users see them first
- Add the movie poster as the background image for each movie card (using a poster API)
- Deploy the website to a web hosting platform such as Vercel or Netlify (so far it only runs on localhost)
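For the planned MAP@K evaluation, a minimal sketch is shown below. It assumes each user has a ranked recommendation list without duplicates and a set of titles known to be relevant to them; the function names and the toy data are hypothetical. Dividing by min(number of relevant items, k) is one common normalization convention, and other variants exist.

```python
def average_precision_at_k(recommended, relevant, k):
    """AP@K for one user: sum of precision@i at each relevant hit, normalized."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    score = 0.0
    for i, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / i  # precision at rank i
    return score / min(len(relevant), k)


def map_at_k(all_recommended, all_relevant, k):
    """Mean of AP@K over all users; 0.0 if there are no users."""
    scores = [
        average_precision_at_k(recs, rel, k)
        for recs, rel in zip(all_recommended, all_relevant)
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Toy data for two hypothetical users.
recs = [["a", "b", "c"], ["x", "y", "z"]]
truth = [{"a", "c"}, {"y"}]
result = map_at_k(recs, truth, k=3)
```

A score closer to 1.0 means relevant titles tend to appear earlier in the recommended lists.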