A sophisticated product recommendation engine combining Collaborative Filtering, Content-Based Filtering, and Matrix Factorization (SVD) to deliver personalized recommendations.

Emart29/recommendation-system

Hybrid Recommendation System 🎯

Python Streamlit Scikit-learn

A sophisticated product recommendation engine combining Collaborative Filtering, Content-Based Filtering, and Matrix Factorization (SVD) to deliver personalized recommendations.

🎯 Project Overview

This hybrid recommendation system analyzes user behavior and product features to suggest relevant items, in the style of the systems used by Amazon, Netflix, and Spotify. It combines three approaches so that each one offsets the limitations of the others.

✨ Key Features

Three Recommendation Approaches

  1. Collaborative Filtering (User-Based) - 40% weight

    • Finds users with similar preferences
    • Recommends products liked by similar users
    • "Users who are like you also bought..."
  2. Content-Based Filtering - 30% weight

    • Analyzes product features (category, price, rating)
    • Recommends items similar to past purchases
    • "If you liked this, you'll like..."
  3. Matrix Factorization (SVD) - 30% weight

    • Discovers latent patterns through dimensionality reduction
    • Learns hidden user preferences and product characteristics
    • Machine learning approach with 20 latent factors
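As a rough illustration of the first approach, here is a minimal sketch of user-based collaborative filtering on a toy matrix. The function name `user_based_scores`, the dense array layout, and the convention that 0 means "not rated" are assumptions made for this example, not details taken from the project's notebook.

```python
import numpy as np

def user_based_scores(ratings: np.ndarray, user: int, k: int = 2) -> np.ndarray:
    """Score every item for `user` from the ratings of the k most similar users.

    `ratings` is a dense users x items array; 0 means "not rated" (assumption).
    """
    # Cosine similarity between the target user and every user.
    norms = np.linalg.norm(ratings, axis=1)
    norms[norms == 0] = 1.0  # avoid division by zero for users with no ratings
    sims = (ratings @ ratings[user]) / (norms * norms[user])
    sims[user] = -1.0  # never pick the target user as their own neighbour

    neighbours = np.argsort(sims)[::-1][:k]
    weights = np.clip(sims[neighbours], 0.0, None)

    # Similarity-weighted average of neighbour ratings, counting only rated items.
    rated = (ratings[neighbours] > 0).astype(float)
    numer = weights @ ratings[neighbours]
    denom = weights @ rated
    scores = np.divide(numer, denom, out=np.zeros_like(numer), where=denom > 0)

    scores[ratings[user] > 0] = 0.0  # skip items the user has already rated
    return scores

ratings = np.array([
    [5, 4, 0, 0],   # user 0: the target user
    [5, 5, 4, 0],   # user 1: similar taste, has also rated item 2
    [0, 0, 1, 5],   # user 2: different taste
], dtype=float)

scores = user_based_scores(ratings, user=0, k=1)
best_item = int(np.argmax(scores))
print(best_item)
```

With one neighbour, user 1 is selected and item 2 becomes the top suggestion for user 0, which is the "users who are like you also bought" behaviour described above.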

Interactive Dashboard

  • Real-time personalized recommendations
  • User purchase history visualization
  • System analytics and performance metrics
  • Product catalog exploration
  • Method comparison and insights

🛠️ Technologies Used

  • Python 3.8+
  • NumPy & Pandas: Data manipulation
  • Scikit-learn: TruncatedSVD for matrix factorization
  • SciPy: Sparse matrix operations, cosine similarity
  • Streamlit: Interactive web application
  • Plotly: Dynamic visualizations
  • Jupyter Notebook: Analysis and experimentation

📁 Project Structure


recommendation-system/
│
├── data/
│   ├── ratings.csv           # User-product ratings
│   ├── products.csv          # Product catalog
│   └── users.csv             # User information
│
├── notebooks/
│   └── 01_recommendation_system.ipynb
│
├── models/
│   └── recommendation_system.pkl
│
├── app.py
├── requirements.txt
├── README.md
└── .gitignore

🚀 Getting Started

Prerequisites

  • Python 3.8 or higher
  • pip package manager

Installation

  1. Clone the repository
     git clone https://github.com/Emart29/recommendation-system.git
     cd recommendation-system
  2. Install dependencies
     pip install -r requirements.txt
  3. Run the Jupyter notebook to generate models
     jupyter notebook notebooks/01_recommendation_system.ipynb
  4. Launch the application
     streamlit run app.py
  5. Open your browser and navigate to http://localhost:8501

📊 System Metrics

Dataset Statistics

  • Users: 3,000
  • Products: 500 (across 7 categories)
  • Ratings: 129,782
  • Sparsity: 91.35% (realistic for real-world systems)
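
The sparsity figure follows directly from the three counts above. A short check, using only the numbers listed in this section:

```python
n_users = 3_000
n_products = 500
n_ratings = 129_782

possible_pairs = n_users * n_products   # every user-product combination
density = n_ratings / possible_pairs    # share of pairs that have a rating
sparsity = 1.0 - density                # share of pairs with no rating

print(f"possible pairs: {possible_pairs:,}")
print(f"density:  {density:.2%}")
print(f"sparsity: {sparsity:.2%}")
```

This gives 1,500,000 possible pairs, a density of about 8.65%, and a sparsity of about 91.35%.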

Recommendation Quality

  • Catalog Coverage: High (can recommend diverse products)
  • Recommendation Diversity: Balanced across categories
  • Average Rating of Recommendations: 4.0+/5.0
  • Personalization: Category-aware based on user preferences
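
This README does not say how catalog coverage is measured. One common definition, shown here as a sketch, is the fraction of the catalog that appears in at least one user's recommendation list. The function name `catalog_coverage` and the sample data are hypothetical.

```python
def catalog_coverage(recommendations: dict, catalog_size: int) -> float:
    """Fraction of catalog items that appear in at least one recommendation list."""
    recommended = set()
    for items in recommendations.values():
        recommended.update(items)
    return len(recommended) / catalog_size

# Hypothetical top-3 lists for three users over a 10-item catalog.
recs = {
    "u1": [1, 2, 3],
    "u2": [2, 3, 4],
    "u3": [1, 5, 6],
}
coverage = catalog_coverage(recs, catalog_size=10)
print(f"coverage: {coverage:.0%}")
```

Six distinct items are recommended here, so coverage is 60%. A value near 100% would mean nearly every product is recommended to someone.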

🔍 How It Works

Data Pipeline

User Ratings → User-Item Matrix → Three Recommendation Engines → Hybrid Scoring → Top-N Recommendations
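
The first step of the pipeline, turning rating rows into a user-item matrix, can be sketched as follows. The `(user_id, product_id, rating)` triple layout is an assumption about `data/ratings.csv`, and the example uses a small dense NumPy array rather than the SciPy sparse matrices listed under Technologies Used, to keep it dependency-light.

```python
import numpy as np

# Hypothetical (user_id, product_id, rating) triples standing in for ratings.csv.
triples = [
    ("u1", "p1", 5.0),
    ("u1", "p3", 3.0),
    ("u2", "p2", 4.0),
    ("u3", "p1", 2.0),
    ("u3", "p2", 5.0),
]

# Map each distinct id to a row or column index.
users, user_idx = np.unique([t[0] for t in triples], return_inverse=True)
products, product_idx = np.unique([t[1] for t in triples], return_inverse=True)
values = np.array([t[2] for t in triples])

# Rows are users, columns are products, 0 means "no rating" (assumption).
matrix = np.zeros((len(users), len(products)))
matrix[user_idx, product_idx] = values

sparsity = 1.0 - np.count_nonzero(matrix) / matrix.size
print(matrix)
print(f"sparsity: {sparsity:.2%}")
```

At the scale reported in this README (3,000 users by 500 products) a dense array is still small, but a sparse format avoids storing the empty cells.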

Hybrid Scoring Algorithm

final_score = 0.4 × CF_score + 0.3 × Content_score + 0.3 × SVD_score

Each method contributes its strengths:

  • CF: Captures community preferences
  • Content: Ensures feature similarity
  • SVD: Discovers hidden patterns
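
A minimal sketch of the weighted sum above. The README gives only the weights, so the min-max normalization step, which puts the three score types on a common 0 to 1 scale before blending, is an assumption, as are the function names and sample scores.

```python
import numpy as np

# Weights taken from the formula above.
WEIGHTS = {"cf": 0.4, "content": 0.3, "svd": 0.3}

def min_max(scores):
    """Rescale scores to [0, 1]; a constant vector maps to all zeros."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        return np.zeros_like(scores)
    return (scores - scores.min()) / span

def hybrid_scores(cf, content, svd):
    """Weighted blend of the three per-item score vectors."""
    return (WEIGHTS["cf"] * min_max(cf)
            + WEIGHTS["content"] * min_max(content)
            + WEIGHTS["svd"] * min_max(svd))

def top_n(scores, n):
    """Indices of the n highest-scoring items, best first."""
    return np.argsort(scores)[::-1][:n].tolist()

# Hypothetical per-item scores from each engine for one user.
cf = [4.5, 2.0, 3.0, 1.0]
content = [0.2, 0.9, 0.4, 0.1]
svd = [3.9, 3.0, 1.0, 2.5]

final = hybrid_scores(cf, content, svd)
ranking = top_n(final, 2)
print(ranking)
```

Without some form of rescaling, an engine that emits larger raw numbers would dominate the blend regardless of its nominal weight.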

Cold Start Handling

  • New Users: Leverage content-based recommendations
  • New Products: Use average ratings and category information
  • Sparse Data: SVD helps fill gaps in the rating matrix
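
For a new user with only one known product of interest, a content-based fallback can rank the catalog by feature similarity. The sketch below assumes each product is described by a small numeric vector (one-hot category, scaled price, scaled average rating). The feature layout, values, and function name are hypothetical.

```python
import numpy as np

# Hypothetical rows: [is_electronics, is_books, scaled_price, scaled_avg_rating]
features = np.array([
    [1, 0, 0.9, 0.8],   # product 0: electronics, expensive
    [1, 0, 0.8, 0.9],   # product 1: electronics, expensive
    [0, 1, 0.1, 0.7],   # product 2: book, cheap
], dtype=float)

def similar_products(features: np.ndarray, seed: int, n: int = 2) -> list:
    """Return the n products most similar to `seed` by cosine similarity.

    Assumes no feature row is all zeros.
    """
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = unit @ unit[seed]
    sims[seed] = -np.inf  # exclude the seed product itself
    return np.argsort(sims)[::-1][:n].tolist()

nearest = similar_products(features, seed=0, n=2)
print(nearest)
```

Because this relies only on product features, it works for users with no rating history and for products that have not been rated yet.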

💡 Business Applications

  1. E-commerce: Product recommendations (Amazon-style)
  2. Streaming Services: Content suggestions (Netflix-style)
  3. Music Platforms: Song/artist recommendations (Spotify-style)
  4. News Aggregators: Article personalization
  5. Social Media: Friend/content suggestions

🎓 Learning Outcomes

  • Collaborative filtering algorithms
  • Content-based recommendation systems
  • Matrix factorization techniques (SVD)
  • Sparse matrix operations
  • Recommendation system evaluation
  • Hybrid system design
  • Interactive dashboard development
  • Real-world data sparsity handling

📈 Key Insights

Why Hybrid Approach?

Individual Methods Have Limitations:

  • CF alone: Cold start problem, popularity bias
  • Content alone: Limited serendipity, over-specialization
  • SVD alone: Interpretability issues, requires tuning

Hybrid System Advantages:

  • ✅ Combines strengths of all methods
  • ✅ Mitigates individual weaknesses
  • ✅ Better coverage and diversity
  • ✅ More robust to sparse data
  • ✅ Improved personalization

Sparsity Challenge

With 91.35% sparsity (only 8.65% of user-product pairs have ratings), the hybrid approach is essential:

  • CF fills gaps using similar users
  • Content-based leverages product features
  • SVD discovers latent patterns
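
To illustrate how a low-rank factorization produces scores for unrated cells, here is a sketch using `numpy.linalg.svd`. The project itself names scikit-learn's TruncatedSVD with 20 latent factors. NumPy is swapped in here only to keep the example self-contained, and the toy matrix and the choice of 2 factors are assumptions.

```python
import numpy as np

def low_rank_predictions(ratings: np.ndarray, n_factors: int) -> np.ndarray:
    """Rebuild the rating matrix from its top `n_factors` singular components."""
    U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
    # Keep only the strongest latent factors and multiply back together.
    return (U[:, :n_factors] * s[:n_factors]) @ Vt[:n_factors]

# Toy matrix: rows are users, columns are products, 0 means "no rating".
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [1, 0, 4, 5],
], dtype=float)

approx = low_rank_predictions(ratings, n_factors=2)
print(np.round(approx, 2))
```

The reconstructed matrix has a value in every cell, including those that were 0 in the input, and those values serve as predicted scores. Note that this simple form treats missing ratings as zeros, which pulls predictions downward.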

🔮 Future Enhancements

  • Deep learning models (Neural Collaborative Filtering)
  • Context-aware recommendations (time, location, device)
  • Real-time updates as users interact
  • A/B testing framework for method weights
  • Explainable recommendations ("Why this item?")
  • Multi-armed bandit for exploration/exploitation
  • Sequence-aware recommendations (session-based)
  • Cross-domain recommendations

🧪 Alternative Approaches Not Used (But Could Be Added)

  1. Deep Learning: Neural networks for embeddings
  2. Factorization Machines: Feature interactions
  3. Graph-Based: Network analysis
  4. Association Rules: Market basket analysis
  5. Reinforcement Learning: Bandit algorithms

📊 Comparison to Industry Systems

| Feature | This System | Netflix | Amazon | Spotify |
| --- | --- | --- | --- | --- |
| Collaborative Filtering | ✅ | ✅ | ✅ | ✅ |
| Content-Based | ✅ | ✅ | ✅ | ✅ |
| Matrix Factorization | ✅ (SVD) | ✅ (Advanced) | ✅ (Multiple) | ✅ (ALS) |
| Deep Learning | ❌ | ✅ | ✅ | ✅ |
| Real-time | ❌ | ✅ | ✅ | ✅ |

👤 Author

[Your Name]

📝 License

This project is licensed under the MIT License.

🙏 Acknowledgments

  • Inspired by real-world recommendation systems at major tech companies
  • Dataset generated to simulate realistic e-commerce patterns

⭐ If this helped you understand recommendation systems, please star the repo!

🤝 Open to collaboration and feedback!
