Hybrid music recommendation system that recommends Spotify tracks based on a playlist supplied by the user. Note: accepting a playlist URL as input would require users to supply their own Spotify API key, which is doable. For simplicity, user input is limited to the demo playlists and the tracks available in the lyrics data.

brandon-carlos/HybridMusicRecommender

🎵 Hybrid Music Recommendation System

A sophisticated music recommendation engine that combines collaborative filtering and content-based filtering to provide personalized song suggestions based on playlist analysis and lyrical content.

🌟 Features

  • Hybrid Recommendation Engine: Combines user behavior patterns with song content analysis
  • Memory Efficient: Stores similarities as sparse matrices, computes them once, caches them to disk, and loads the cached copies on later runs
  • Spotify Dataset Integration: Built on a sample of the Spotify Million Playlist Dataset
  • Lyrics Analysis: Uses TF-IDF vectorization to find songs with similar themes
  • Customizable Parameters: Adjustable alpha values to balance different recommendation approaches

🔬 How It Works

1. Collaborative Filtering (CF)

  • Analyzes patterns in the Spotify Million Playlist Dataset
  • Finds songs that frequently appear together in the same playlists
  • Uses cosine similarity between tracks based on playlist co-occurrence
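
As a rough illustration of the co-occurrence idea above, the sketch below treats each track as a binary vector over playlists and computes the cosine similarity between two tracks as shared playlists divided by the square root of the product of their playlist counts. It is a toy, pure-Python version written for this README, not the code in hybrid_recommender.py; the function name `cooccurrence_cosine` and the track IDs are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def cooccurrence_cosine(playlists):
    """Return {(track_a, track_b): cosine} for tracks sharing at least one playlist."""
    # Map each track to the set of playlist indices that contain it.
    track_playlists = defaultdict(set)
    for idx, tracks in enumerate(playlists):
        for track in tracks:
            track_playlists[track].add(idx)

    sims = {}
    ordered = sorted(track_playlists)
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            shared = len(track_playlists[a] & track_playlists[b])
            if shared:
                # Cosine of two binary playlist-membership vectors.
                denom = sqrt(len(track_playlists[a]) * len(track_playlists[b]))
                sims[(a, b)] = shared / denom
    return sims

# Hypothetical toy data: three playlists over four tracks.
playlists = [
    ["track_a", "track_b", "track_c"],
    ["track_a", "track_b"],
    ["track_c", "track_d"],
]
sims = cooccurrence_cosine(playlists)
print(sims[("track_a", "track_b")])  # tracks that always appear together
```

The pairwise loop is quadratic in the number of tracks, so it only suits small examples; at dataset scale a sparse-matrix or nearest-neighbors approach is the practical choice.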

2. Content-Based Filtering

  • Analyzes song lyrics using TF-IDF vectorization
  • Finds songs whose lyrics share distinctive vocabulary, used as a proxy for similar themes and emotional content
  • Uses cosine similarity between TF-IDF vectors
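
To make the content-based step concrete, here is a minimal pure-Python sketch of TF-IDF weighting followed by cosine similarity. It assumes lyrics are already tokenized and uses a smoothed IDF; the helper names `tfidf_vectors` and `cosine` and the sample lyrics are illustrative assumptions, and the project itself may rely on a library vectorizer with different weighting.

```python
from collections import Counter
from math import log, sqrt

def tfidf_vectors(docs):
    """Turn tokenized documents into {term: weight} TF-IDF dictionaries."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Term frequency times smoothed IDF: log((1 + N) / (1 + df)) + 1.
        vectors.append({
            term: (count / len(doc)) * (log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical, already-cleaned lyrics for three songs.
lyrics = [
    "love heart tonight love".split(),
    "heart love forever".split(),
    "road truck highway".split(),
]
vecs = tfidf_vectors(lyrics)
print(cosine(vecs[0], vecs[1]) > cosine(vecs[0], vecs[2]))
```

Songs with no words in common score exactly zero, which is one reason the hybrid approach also draws on playlist co-occurrence.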

3. Hybrid Approach

  • Combines both approaches using a weighted average
  • Alpha parameter controls the balance:
    • α = 1.0: Pure collaborative filtering
    • α = 0.0: Pure content-based filtering
    • α = 0.6: 60% CF + 40% content-based
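
The weighted average described above amounts to score = α × CF + (1 − α) × content. The following sketch shows that blend and a simple ranking on top of it. The function names `hybrid_score` and `rank_candidates`, the treatment of missing scores as 0.0, and the sample values are assumptions made for illustration, not the repository's actual implementation.

```python
def hybrid_score(cf_score, content_score, alpha=0.6):
    """Blend a collaborative score and a content score with weight alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * cf_score + (1.0 - alpha) * content_score

def rank_candidates(cf_scores, content_scores, alpha=0.6, n=3):
    """Rank tracks by hybrid score; a missing score counts as 0.0 (assumption)."""
    candidates = set(cf_scores) | set(content_scores)
    scored = {
        track: hybrid_score(cf_scores.get(track, 0.0),
                            content_scores.get(track, 0.0), alpha)
        for track in candidates
    }
    # Sort by descending score, breaking ties by track name for stable output.
    return sorted(scored.items(), key=lambda item: (-item[1], item[0]))[:n]

# Hypothetical per-track scores from each model.
cf_scores = {"track_x": 0.9, "track_y": 0.2}
content_scores = {"track_y": 0.8, "track_z": 0.5}

for track, score in rank_candidates(cf_scores, content_scores, alpha=0.6):
    print(f"{track}: {score:.2f}")
```

Lowering alpha toward 0.0 shifts the ranking toward lyrically similar tracks, while raising it toward 1.0 favors tracks that co-occur in playlists.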

🚀 Quick Start

Prerequisites

# Required Python packages
pip install -r requirements.txt

Basic Usage

# First run (computes and saves similarity matrices)
python hybrid_recommender.py

# Force recomputation if needed
python hybrid_recommender.py --recompute
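
The compute-once, load-afterwards behavior and the effect of a recompute switch can be sketched as below. This is a generic pickle-based cache written under assumptions: `load_or_compute`, `fake_compute`, and the file name `similarity_matrices_demo.pkl` are hypothetical, and the script's real caching logic may differ.

```python
import pickle
import tempfile
from pathlib import Path

def load_or_compute(cache_path, compute_fn, recompute=False):
    """Load a pickled result if it exists, otherwise compute and cache it."""
    cache_path = Path(cache_path)
    if cache_path.exists() and not recompute:
        with cache_path.open("rb") as fh:
            return pickle.load(fh)
    result = compute_fn()
    with cache_path.open("wb") as fh:
        pickle.dump(result, fh)
    return result

# Demo with a stand-in for the expensive similarity computation.
compute_calls = []

def fake_compute():
    compute_calls.append(1)
    return {"cf": [[1.0]], "content": [[1.0]]}

with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "similarity_matrices_demo.pkl"
    first = load_or_compute(cache_file, fake_compute)    # computes and saves
    second = load_or_compute(cache_file, fake_compute)   # loads from disk
    forced = load_or_compute(cache_file, fake_compute, recompute=True)  # recomputes

print(len(compute_calls))  # number of times the computation actually ran
```

Pickle files should only be loaded when they come from a trusted source, such as a previous run on the same machine.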

📁 Project Structure

spotify_million_playlist_dataset_challenge/
├── hybrid_recommender.py       # Main recommendation system (STANDALONE!)
├── geniusAPI.py                # Optional API integration (NOT NEEDED!)
├── similarity_matrices_*.pkl   # Cached similarity matrices (auto-generated)
├── requirements.txt            # Python dependencies
├── playlists.json              # Playlist data
└── README.md                   # This file

💻 Usage Examples

Example 1: Get Recommendations for a Playlist

from hybrid_recommender import main, get_recommendations_by_uris

# Load the system
cf_sim, content_sim, unique_tracks = main()

# Your playlist's track URIs (placeholders; replace with URIs of tracks in the dataset)
my_playlist_uris = [
    "spotify:track:1234567890abcdef",
    "spotify:track:abcdef1234567890",
]

# Get recommendations
recommendations = get_recommendations_by_uris(
    my_playlist_uris, cf_sim, content_sim, unique_tracks,
    n_recommendations=20, alpha=0.6
)

# Display results, ranked from 1 regardless of the DataFrame's index
for rank, (_, row) in enumerate(recommendations.iterrows(), start=1):
    print(f"{rank}. {row['track_name']} by {row['artist_name']} (Score: {row['score']:.4f})")

Example 2: Interactive Demo

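from hybrid_recommender import main, interactive_demo

# Load (or compute) the similarity matrices, then start the interactive demo
cf_sim, content_sim, unique_tracks = main()
interactive_demo(cf_sim, content_sim, unique_tracks)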

🎮 Interactive Demo Features

The system includes a built-in interactive demo that lets you:

  • Try sample playlists (rock, pop, mixed)
  • Create playlists from track names (e.g., "Bohemian Rhapsody|Queen")
  • Use custom Spotify track URIs
  • Adjust recommendation parameters (alpha, number of recommendations)
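
For the "Track Name|Artist" input format mentioned above, a parser could look roughly like this. It is a sketch under assumptions: `parse_track_entry`, `find_track`, the catalog layout, and the placeholder URI are all hypothetical, and the demo's real matching logic may be more forgiving.

```python
def parse_track_entry(entry):
    """Split 'Track Name|Artist' into a (title, artist) tuple."""
    title, sep, artist = entry.partition("|")
    title, artist = title.strip(), artist.strip()
    if not sep or not title or not artist:
        raise ValueError(f"expected 'Track Name|Artist', got {entry!r}")
    return title, artist

def find_track(entry, catalog):
    """Case-insensitive lookup in a {(title, artist): uri} catalog (hypothetical layout)."""
    title, artist = parse_track_entry(entry)
    return catalog.get((title.casefold(), artist.casefold()))

# Hypothetical catalog with a placeholder URI, keyed by case-folded title and artist.
catalog = {("bohemian rhapsody", "queen"): "spotify:track:EXAMPLE_ID"}

print(parse_track_entry("Bohemian Rhapsody|Queen"))
print(find_track(" bohemian rhapsody | QUEEN ", catalog))
```

A lookup that returns `None` signals a track that is not in the lyrics data, which the demo would need to report back to the user.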

🎯 Use Cases

Perfect For:

  • Music Discovery: Find new songs similar to your favorites
  • Playlist Creation: Generate themed playlists automatically
  • Mood Matching: Find songs with similar lyrical themes
  • Genre Exploration: Discover music in your preferred styles
  • Research: Analyze music recommendation algorithms

Example Scenarios:

  1. "I want sad breakup songs" → Find lyrically similar emotional tracks
  2. "I love this indie playlist" → Discover more indie artists
  3. "I want workout music" → Find high-energy tracks with similar themes

🚨 Troubleshooting

Common Issues

"ModuleNotFoundError: No module named 'pandas'"

pip install -r requirements.txt

Memory Issues

  • Reduce the n_nearest_neighbors parameter (default: 100) so that fewer similarities are stored per track
  • Use --recompute to rebuild the cached matrices after changing parameters
  • Ensure at least 4 GB of RAM is available

Slow Performance

  • The first run is always slow because it computes the similarity matrices
  • Subsequent runs should be fast because they load the matrices from disk
  • If every run is slow, check that the similarity_matrices_*.pkl files exist in the project directory
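
One quick way to check for cached matrices is to glob for the file pattern listed under Project Structure. The helper name `find_cached_matrices` and the demo file name are assumptions for illustration only.

```python
import tempfile
from pathlib import Path

def find_cached_matrices(directory="."):
    """Return cached similarity-matrix files found in a directory, sorted by name."""
    return sorted(Path(directory).glob("similarity_matrices_*.pkl"))

# Demo in a temporary directory with one hypothetical cache file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "similarity_matrices_demo.pkl").touch()
    (Path(tmp) / "playlists.json").touch()
    found = [path.name for path in find_cached_matrices(tmp)]

print(found)
```

An empty result when run against the project directory suggests the matrices were never saved, so the next run will recompute them.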

🔬 Technical Details

Algorithm Components

  • Nearest Neighbors: KNN with cosine similarity
  • TF-IDF Vectorization: Text processing for lyrics
  • Sparse Matrices: Memory-efficient similarity storage
  • Hybrid Scoring: Weighted combination of CF and content scores
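
Keeping only the nearest neighbors of each track is what makes sparse storage pay off: with roughly 66,000 tracks, a dense similarity matrix would hold about 4.4 billion entries, whereas 100 neighbors per track is about 6.6 million. The sketch below shows the truncation step on a single row. It is a simplified illustration, and `top_k_neighbors` is a hypothetical helper rather than a function from the project.

```python
import heapq

def top_k_neighbors(similarity_row, k=100):
    """Keep only the k largest non-zero similarities from a {track: score} row."""
    nonzero = ((track, score) for track, score in similarity_row.items() if score > 0)
    return dict(heapq.nlargest(k, nonzero, key=lambda item: item[1]))

# Hypothetical similarities of one track to four others.
row = {"track_a": 0.9, "track_b": 0.1, "track_c": 0.5, "track_d": 0.0}
print(top_k_neighbors(row, k=2))
```

A smaller k lowers memory use further, at the cost of discarding weaker similarities that might still have contributed to a recommendation.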

Data Processing

  • Lyrics Cleaning: Removes metadata, contributors, translations
  • Text Normalization: Converts to lowercase, removes stop words
  • Similarity Computation: Efficient sparse matrix operations
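
The cleaning and normalization steps above might look like the following sketch. The header pattern ("N Contributors ... Lyrics"), the bracketed section tags, and the tiny stop-word list are assumptions about what scraped lyrics pages contain; the project's actual rules and stop-word list are likely different and more complete.

```python
import re

# Deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "i", "you", "to", "of", "in", "it", "is", "me", "my"}

def clean_lyrics(raw):
    """Strip page metadata, lowercase the text, and drop stop words."""
    # Drop an assumed leading header such as "12 ContributorsTranslations...Title Lyrics".
    text = re.sub(r"^\d+\s+Contributors.*?Lyrics", " ", raw, count=1, flags=re.DOTALL)
    # Remove bracketed section tags such as [Chorus] or [Verse 1].
    text = re.sub(r"\[[^\]]*\]", " ", text)
    # Lowercase and keep runs of ASCII letters and apostrophes (a simplification
    # that discards accented characters).
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]

raw = "12 ContributorsTranslationsEspañolHello Lyrics[Verse 1]\nHello, it is me and the Night"
print(clean_lyrics(raw))
```

The returned token list is the kind of input a TF-IDF step would consume.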

📊 Dataset Information

  • Source: A sample of the Spotify Million Playlist Dataset
  • Tracks: 66,000+ unique songs
  • Coverage: Multiple genres, languages, and time periods

Happy recommending! 🎵

Built with ❤️ for music discovery and machine learning enthusiasts.
