Hybrid music recommender combining NMF collaborative filtering, two-tower content embeddings, audio feature synthesis, and meta-learning fusion for adaptive personalization.

connergroth/Timbrality

Timbrality

Timbrality - AI Powered Music Discovery

Timbrality is a machine learning-powered music recommendation engine that uses AI agents to create personalized music experiences.

Timbrality is a music recommendation platform that combines data from Spotify, Last.fm, and Album of the Year (AOTY) to provide personalized music suggestions through conversational AI agents. The platform features a hybrid recommendation system powered by the Timbral ML engine and a modern web interface built with React and Next.js.


Features

  • 🤖 AI-Powered Music Agent
    Conversational AI agent that understands music preferences and provides intelligent recommendations through natural language interactions.

  • 🎧 Personalized Recommendations
    Hybrid recommendation system combining collaborative filtering and content-based approaches using listening behavior and audio features.

  • 🔗 Multi-Platform Integration
    Seamlessly connects with Spotify, Last.fm, and Album of the Year to gather comprehensive music data and preferences.

  • 📱 Modern Web Interface
    Clean, responsive UI built with React/Next.js featuring chat interface, playlist management, and real-time music discovery.

  • 🎵 Smart Playlist Creation
    AI-generated playlists with Spotify integration for seamless music discovery and playlist management.

  • ⚡ High-Performance Backend
    FastAPI-powered backend with multi-tier caching (Redis + in-memory), rate limiting, and async processing.

  • 📊 Rich Music Metadata
    Enhanced with AOTY ratings, reviews, tags, and similar-album data, gathered by a CloudScraper-based pipeline that also extracts rating counts.


Architecture Overview

Backend (FastAPI)

  • Multi-tier caching: Redis primary + in-memory fallback
  • Rate limiting: 30 requests/minute via SlowAPI
  • Database: PostgreSQL with SQLAlchemy ORM and Alembic migrations
  • Web scraping: CloudScraper with async processing for comprehensive AOTY data extraction
  • AI Agent: NLP processor with tool registry for music recommendations
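
The "Redis primary + in-memory fallback" arrangement listed above can be sketched roughly as follows. This is a minimal illustration, not the backend's actual cache module: the `TieredCache` class and its parameters are hypothetical, and `primary` stands for any client exposing `get(key)` and `setex(key, ttl, value)`, which is the shape of a redis-py client.

```python
import time
from typing import Any, Dict, Optional, Tuple


class TieredCache:
    """Hypothetical two-tier cache: primary store first, process memory second."""

    def __init__(self, primary: Optional[Any] = None, default_ttl: float = 300.0):
        # `primary` is any object with get(key) and setex(key, ttl, value),
        # for example a redis.Redis client. None means "memory only".
        self.primary = primary
        self.default_ttl = default_ttl
        self._memory: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        if self.primary is not None:
            try:
                value = self.primary.get(key)
                if value is not None:
                    return value
            except Exception:
                pass  # primary unreachable: fall through to the memory tier
        entry = self._memory.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._memory[key]  # lazily evict expired entries
            return None
        return value

    def set(self, key: str, value: str, ttl: Optional[float] = None) -> None:
        ttl = self.default_ttl if ttl is None else ttl
        # Always keep a local copy so reads survive a primary outage.
        self._memory[key] = (time.monotonic() + ttl, value)
        if self.primary is not None:
            try:
                self.primary.setex(key, max(1, int(ttl)), value)
            except Exception:
                pass  # a failed write to the primary is not fatal
```

Writing to both tiers keeps the fallback warm, at the cost of some duplicated memory; a real deployment would also want a size bound on the in-memory dictionary.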

ML Service (Timbral Engine)

  • Hybrid recommendation engine: NMF collaborative + BERT content-based filtering
  • Dedicated FastAPI service: Port 8001 with ML-specific endpoints
  • Model serving: Redis-cached recommendations with explainability
  • HTTP integration: Proxied through main backend at /timbral/* routes

Frontend

  • Main site: Vite + React + shadcn/ui components
  • Auth app: Next.js application for OAuth flows
  • State management: React Context + Supabase auth

Key Components

  • /backend/agent/: AI agent core, tools, and NLP processing
  • /backend/routes/: API endpoints (agent, albums, playlists, users, timbral)
  • /backend/services/: Business logic (Spotify, Last.fm, ML, AOTY)
  • /backend/ingestion/: Data pipeline for music metadata
  • /ml/timbral/: Timbral ML engine (models, training, inference)
  • /frontend/app/: Next.js authentication and chat interface

Tech Stack

💻 Backend Technologies

  • FastAPI – Async Python web framework with automatic OpenAPI docs
  • PostgreSQL + SQLAlchemy – Relational database with async ORM
  • Redis – High-performance caching layer
  • CloudScraper – Advanced web scraping with anti-bot protection bypass for AOTY data
  • Pydantic – Data validation and serialization

📊 Data Sources & APIs

  • Spotify Web API – User listening data, playlists, and audio features
  • Last.fm API – Scrobbling data and music discovery
  • AOTY Custom Scraper – Album ratings, reviews, rating counts, and comprehensive metadata
  • Supabase – Authentication and user management

🤖 AI & Machine Learning

  • AI Agent Architecture – Tool-based agent for music recommendations
  • NLP Processing – Natural language understanding for music queries
  • Timbral Engine – Dedicated ML microservice with hybrid recommendation engine
  • NMF Collaborative Filtering – User-item matrix factorization for personalized suggestions
  • BERT Content-Based Filtering – Semantic understanding of music metadata and genres
  • Model Explainability – Built-in recommendation reasoning and explanations

Model Design

🔸 Collaborative Filtering (CF)

  • Built from play counts and listening behavior
  • Uses Non-negative Matrix Factorization (NMF)
  • Predicts latent user-track affinities
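
To make the factorization concrete, here is a dependency-free sketch of NMF using the classic multiplicative update rules on a toy play-count matrix. It is an illustration under stated assumptions, not the Timbral engine's training code: the `nmf` function, the rank, the iteration count, and the sample data are all hypothetical, and a production system would more likely use a numerical library.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def frobenius_error(v: Matrix, w: Matrix, h: Matrix) -> float:
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v: Matrix, rank: int, iterations: int = 200,
        seed: int = 0, eps: float = 1e-9) -> Tuple[Matrix, Matrix]:
    """Factor V (users x tracks) into non-negative W (users x rank), H (rank x tracks)."""
    rng = random.Random(seed)
    n_users, n_tracks = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_users)]
    h = [[rng.random() + 0.1 for _ in range(n_tracks)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_tracks)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_users)]
    return w, h


# Toy data: rows are users, columns are tracks, values are play counts.
plays = [
    [5, 3, 0, 0, 1],
    [4, 4, 0, 1, 0],
    [0, 0, 5, 4, 0],
    [0, 1, 4, 5, 0],
]
w, h = nmf(plays, rank=2)
scores = matmul(w, h)  # predicted affinity for every user-track pair
```

Entries of `scores` that correspond to zero play counts are the candidates for recommendation: a high predicted value for an unplayed track suggests the user's latent tastes match it.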

🔹 Content-Based Filtering (CBF)

  • Embeds mood, genre, and tags using Sentence-BERT
  • Computes track similarity as the cosine similarity between embeddings
  • Useful for cold-starts and fallback recs
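
The similarity step can be sketched as below. The three-dimensional vectors are hypothetical stand-ins for Sentence-BERT embeddings (which have hundreds of dimensions), and `most_similar` is an illustrative helper, not a function from the repository.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # treat a zero vector as "no similarity"


def most_similar(seed: str, embeddings: Dict[str, Sequence[float]],
                 k: int = 3) -> List[Tuple[str, float]]:
    """Rank every other track by similarity to the seed track."""
    seed_vec = embeddings[seed]
    scored = [(track_id, cosine_similarity(seed_vec, vec))
              for track_id, vec in embeddings.items() if track_id != seed]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical toy embeddings; real ones would come from the text encoder.
track_embeddings = {
    "dream_pop_a": [0.9, 0.1, 0.0],
    "dream_pop_b": [0.8, 0.2, 0.1],
    "thrash_metal": [0.0, 0.1, 0.9],
}
neighbours = most_similar("dream_pop_a", track_embeddings, k=2)
```

Because this needs only item metadata, it works for a track or user with no listening history, which is why it suits the cold-start case.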

🔶 Hybrid Fusion

  • Weighted blending of CF + CBF scores
  • Tunable or learnable fusion logic
  • Produces rich, explainable recs per user or seed
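
One plausible form of the weighted blend is a convex combination of normalized scores, sketched here. The `fuse` function, the `alpha` weight, and the min-max normalization are assumptions made for illustration; the README does not specify the actual fusion logic.

```python
from typing import Dict, Optional


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so CF and CBF outputs are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {track: 1.0 for track in scores}
    return {track: (value - lo) / (hi - lo) for track, value in scores.items()}


def fuse(cf_scores: Optional[Dict[str, float]],
         cbf_scores: Dict[str, float],
         alpha: float = 0.7) -> Dict[str, float]:
    """Blend scores: alpha weights CF, (1 - alpha) weights CBF.

    With no CF scores (a cold-start user), fall back to CBF alone.
    """
    cbf = min_max(cbf_scores)
    if not cf_scores:
        return cbf
    cf = min_max(cf_scores)
    tracks = set(cf) | set(cbf)
    return {t: alpha * cf.get(t, 0.0) + (1 - alpha) * cbf.get(t, 0.0)
            for t in tracks}


blended = fuse({"a": 10.0, "b": 0.0}, {"a": 0.0, "b": 1.0}, alpha=0.7)
cold_start = fuse(None, {"a": 0.2, "b": 0.8})
```

A fixed `alpha` is the "tunable" variant; a "learnable" variant would fit the weight (or a small model over both scores) against held-out listening data.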

AI Agent

Timbrality features an AI agent with a dual-store memory architecture: fast Redis working memory for recent conversation context, plus durable PostgreSQL long-term storage, used together for context-aware music recommendations.

Memory Architecture:

  • Redis Working Memory – Sub-millisecond access to recent conversations (last 50-200 turns per chat)
  • PostgreSQL + pgvector – Semantic search and long-term memory with embeddings
  • Context Assembly – Intelligent retrieval combining recent turns, relevant memories, and user preferences
  • Background Processing – Async summarization, fact extraction, and topic analysis
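
The working-memory tier can be pictured as a bounded, expiring list of turns per chat. The sketch below models that behaviour in process memory; with Redis the same effect is commonly obtained with a list that is trimmed after each push and given an expiry. The `WorkingMemory` class is hypothetical, and the defaults only echo the 50-turn and 24-hour figures quoted in this README.

```python
import time
from collections import deque
from typing import Deque, Dict, List, Optional, Tuple

Turn = Tuple[float, str, str]  # (timestamp, role, text)


class WorkingMemory:
    """Hypothetical in-process model of per-chat working memory."""

    def __init__(self, max_turns: int = 50, ttl_seconds: float = 24 * 3600):
        self.max_turns = max_turns
        self.ttl_seconds = ttl_seconds
        self._chats: Dict[str, Deque[Turn]] = {}

    def append(self, chat_id: str, role: str, text: str,
               now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        # deque(maxlen=...) silently drops the oldest turn when full.
        turns = self._chats.setdefault(chat_id, deque(maxlen=self.max_turns))
        turns.append((now, role, text))

    def recent(self, chat_id: str,
               now: Optional[float] = None) -> List[Tuple[str, str]]:
        """Return non-expired turns, oldest first."""
        now = time.time() if now is None else now
        cutoff = now - self.ttl_seconds
        return [(role, text)
                for ts, role, text in self._chats.get(chat_id, ())
                if ts >= cutoff]
```

Context assembly would then combine `recent(...)` with long-term memories retrieved by vector similarity. Note that this sketch expires turns individually, whereas a key-level expiry in Redis would drop the whole chat at once.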

Agent Capabilities:

  • Conversational Memory – Remembers user preferences, music tastes, and conversation context
  • Semantic Understanding – Uses Sentence-BERT embeddings for natural language processing
  • Tool Integration – Access to music databases, recommendation engines, and analysis tools
  • Streaming Responses – Real-time interaction with memory context updates
  • Automatic Learning – Extracts user facts, preferences, and music patterns over time

Memory Features:

  • Working Memory – Fast access to recent chat context with configurable TTL (24-72 hours)
  • Long-term Memory – Durable storage of important facts, preferences, and conversation summaries
  • Semantic Search – Vector similarity search for relevant context using pgvector
  • Importance Scoring – Memory prioritization based on user interaction patterns
  • Topic Tracking – Automatic extraction and trending of music-related topics

API Endpoints:

POST /api/agent/chat          # Enhanced chat with memory integration
POST /api/agent/chat/stream   # Streaming responses with context
GET  /api/agent/memory/stats  # User memory statistics
POST /api/agent/memory/process # Trigger background memory processing

AOTY Data Scraper

Timbrality includes a sophisticated web scraper that extracts rich music metadata from Album of the Year (AOTY), one of the most comprehensive music databases available. This custom scraper enhances the platform's recommendation capabilities with detailed album ratings, reviews, and metadata.

🎯 What It Scrapes

Albums:

  • User scores and rating counts (e.g., "Based on 37,040 ratings")
  • Critic reviews from major publications
  • Popular user reviews with like counts
  • Genre tags and metadata
  • Similar album recommendations
  • "Must Hear" designations

Artists:

  • Overall user ratings and rating counts
  • Biography and formation details
  • Geographic location data
  • Complete discography listings
  • Genre classifications

Tracks:

  • Individual track ratings and rating counts
  • Track-level metadata and features
  • Featured artist information
  • Track length and positioning data

🛠 Technical Implementation

Web Scraping Engine:

  • CloudScraper for bypassing anti-bot protection
  • BeautifulSoup for robust HTML parsing
  • Async/await processing for high performance
  • Custom retry logic with exponential backoff
  • Rate limiting to respect AOTY's servers
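
Retry with exponential backoff, as mentioned above, generally looks like the following. This is a generic asyncio sketch rather than the scraper's own code: `retry_with_backoff` and its parameters are hypothetical names, and the jitter factor is an added assumption.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_with_backoff(operation: Callable[[], Awaitable[T]],
                             max_attempts: int = 4,
                             base_delay: float = 0.5,
                             max_delay: float = 8.0) -> T:
    """Await `operation`, retrying failures with doubling, capped delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return await operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt: base, 2*base, 4*base, ... up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter keeps concurrent workers from retrying in lockstep.
            await asyncio.sleep(delay * random.uniform(0.5, 1.0))
    raise AssertionError("unreachable")
```

A scraper would wrap each page fetch in a small coroutine and pass it in, for example `await retry_with_backoff(lambda: fetch(url))` where `fetch` is whatever coroutine performs the request. In practice the `except` clause should be narrowed to transient errors such as timeouts.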

Data Models:

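from typing import List, Optional

from pydantic import BaseModel

# Track is declared first because Album refers to it.
class Track(BaseModel):
    title: str
    rating: Optional[int]
    num_ratings: int
    featured_artists: List[str]

# CriticReview and AlbumUserReview are further models, not shown here.
class Album(BaseModel):
    title: str
    artist: str
    user_score: Optional[float]
    num_ratings: int
    tracks: List[Track]
    critic_reviews: List[CriticReview]
    popular_reviews: List[AlbumUserReview]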

API Endpoints:

GET /scraper/album?artist=Radiohead&album=OK+Computer
GET /scraper/similar?artist=Radiohead&album=OK+Computer
GET /scraper/artist?name=Radiohead
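
Query strings like the ones above are best built with a URL encoder rather than by hand, so that spaces and accented characters are escaped correctly. In this sketch the base URL is an assumption (the backend's local port 8000 from the setup instructions) and `scraper_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumption: the backend is running locally on its default port.
BASE_URL = "http://localhost:8000"


def scraper_url(endpoint: str, **params: str) -> str:
    """Build a /scraper/<endpoint> URL with properly encoded parameters."""
    return f"{BASE_URL}/scraper/{endpoint}?{urlencode(params)}"


album_url = scraper_url("album", artist="Radiohead", album="OK Computer")
artist_url = scraper_url("artist", name="Sigur Rós")
```

`urlencode` turns the space in "OK Computer" into `+`, matching the example endpoints, and percent-encodes non-ASCII characters as UTF-8.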

🔄 Data Pipeline Integration

Automated Population:

# Add rating count columns to existing tables
psql $DATABASE_URL -f backend/add_aoty_rating_counts.sql

# Populate rating counts for all entities
python backend/populate_aoty_rating_counts.py --type all --batch-size 10

Database Enhancement:

  • Adds aoty_num_ratings columns to albums, artists, and tracks tables
  • Batch processing with configurable limits
  • Resume capability for interrupted runs
  • Error handling and logging for production use
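
Batch processing with resume capability can be approximated with a checkpoint of completed IDs, as in this sketch. It does not reproduce `populate_aoty_rating_counts.py`: the function names, the JSON checkpoint file, and the integer IDs are all assumptions chosen for illustration.

```python
import json
import logging
from pathlib import Path
from typing import Callable, Iterator, List, Sequence

logger = logging.getLogger(__name__)


def batches(items: Sequence[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def process_with_resume(ids: Sequence[int],
                        handle: Callable[[int], object],
                        checkpoint: Path,
                        batch_size: int = 10) -> int:
    """Process ids in batches, skipping any recorded in the checkpoint file."""
    done = set(json.loads(checkpoint.read_text())) if checkpoint.exists() else set()
    pending = [item_id for item_id in ids if item_id not in done]
    processed = 0
    for batch in batches(pending, batch_size):
        for item_id in batch:
            try:
                handle(item_id)
            except Exception as exc:
                # Failed items stay out of the checkpoint, so a rerun retries them.
                logger.warning("skipping %s: %s", item_id, exc)
                continue
            done.add(item_id)
            processed += 1
        # Persist progress after every batch so an interrupted run can resume.
        checkpoint.write_text(json.dumps(sorted(done)))
    return processed
```

Checkpointing once per batch rather than once per item trades a little repeated work after a crash for fewer writes.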

Caching Strategy:

  • Redis caching for scraped data with configurable TTL
  • In-memory fallback when Redis is unavailable
  • Smart cache keys based on artist/album combinations
  • Cache warming for popular albums and artists
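
Cache keys built from artist/album pairs work best when the inputs are normalized first, so that "Radiohead" and " radiohead " hit the same entry. The key layout and helper names below are hypothetical; only the idea of keying on the artist/album combination comes from the list above.

```python
import re
import unicodedata


def slug(text: str) -> str:
    """Lowercase, strip accents, and collapse punctuation/whitespace to '-'."""
    ascii_text = (unicodedata.normalize("NFKD", text)
                  .encode("ascii", "ignore")
                  .decode("ascii"))
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def album_cache_key(artist: str, album: str) -> str:
    # Hypothetical key layout: namespace, entity type, then normalized names.
    return f"aoty:album:{slug(artist)}:{slug(album)}"


key = album_cache_key("Radiohead", "OK Computer")
```

One caveat with this approach: names written entirely in non-Latin scripts reduce to an empty slug, so a real implementation would need a fallback such as hashing the original string.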

🎡 Use Cases

Recommendation Enhancement:

  • Weight recommendations by AOTY rating popularity
  • Surface critically acclaimed but undiscovered albums
  • Filter by minimum rating thresholds
  • Include review-based reasoning in AI responses
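
Weighting by rating popularity needs some way to stop an album with five enthusiastic ratings from outranking one with tens of thousands. A common technique for this, shown here as an assumption rather than as the project's method, is a Bayesian average that pulls low-count scores toward a prior. The function names, the prior of 70, and the 0-100 score scale are all illustrative.

```python
from typing import Optional


def weighted_rating(user_score: Optional[float], num_ratings: int,
                    prior_mean: float = 70.0, prior_weight: int = 100) -> float:
    """Shrink a score toward prior_mean; more ratings means less shrinkage."""
    if user_score is None or num_ratings <= 0:
        return prior_mean  # no evidence: fall back to the prior
    return ((num_ratings * user_score + prior_weight * prior_mean)
            / (num_ratings + prior_weight))


def passes_threshold(user_score: Optional[float], num_ratings: int,
                     min_score: float = 75.0) -> bool:
    """Filter on the shrunken score rather than the raw one."""
    return weighted_rating(user_score, num_ratings) >= min_score


well_known = weighted_rating(90.0, 37040)  # many ratings: stays close to 90
obscure = weighted_rating(95.0, 5)         # few ratings: pulled toward 70
```

Raising `prior_weight` demands more ratings before a score is trusted, which favours established albums; lowering it helps surface acclaimed but lesser-known ones.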

Music Discovery:

  • "Similar Albums" recommendations from AOTY's algorithm
  • Genre-based exploration using AOTY's tagging system
  • Critical consensus analysis for new releases
  • User review sentiment for recommendation explanations

Data Quality:

  • Cross-reference Spotify/Last.fm data with AOTY metadata
  • Resolve artist/album name discrepancies
  • Enrich sparse metadata with comprehensive AOTY details
  • Validate music catalog completeness

Getting Started

Prerequisites

  • Python 3.8+
  • Node.js 18+
  • PostgreSQL
  • Redis (optional, falls back to in-memory cache)

Full Stack Setup (Docker)

docker-compose up

Manual Setup

Backend

cd backend
pip install -r requirements.txt
uvicorn main:app --reload  # Port 8000

ML Service

cd ml
pip install -r requirements.txt
python main.py  # Port 8001

Frontend

# Main site
cd frontend && npm install && npm run dev  # Port 3001

# Auth app
cd frontend/app && npm install && npm run dev  # Port 3000

Environment Variables

Configure .env files in backend/, ml/, and frontend/app/ directories with your API keys for Spotify, Last.fm, Supabase, and OpenAI.


Current Status

✅ Completed:

  • AI agent architecture with conversational interface
  • Multi-platform data integration (Spotify, Last.fm, AOTY)
  • Modern React/Next.js frontend with chat interface
  • FastAPI backend with caching and rate limiting

🚧 In Progress:

  • Enhanced playlist management features
  • Performance optimizations and deployment preparation
  • Advanced ML model training and fine-tuning
