🎧 Audio Feature Comparison – Custom Algorithm vs Spotify
This project compares the output of a custom audio feature extraction algorithm with Spotify’s official features, using a dataset of music tracks. It helps evaluate how close the custom model is to Spotify’s ground truth and highlights areas of strength and areas for improvement.
The feature extraction algorithm lives in the extract_feature function in soundcloud_pipeline.py.
🔩 Infra
✅ soundcloud_pipeline.py
An automated pipeline that searches for songs on SoundCloud, downloads them with yt-dlp, extracts their audio features with librosa, and compares them to Spotify's official metadata. It tracks progress with a checkpoint system and saves results for later evaluation.
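For orientation, here is a minimal, numpy-only sketch of the kind of computation a feature extractor like extract_feature performs. This is illustrative only: the real implementation uses librosa, and the function and feature names below are assumptions, not the pipeline's actual API.

```python
import numpy as np

def extract_features_sketch(samples: np.ndarray, sr: int) -> dict:
    """Toy stand-in for extract_feature in soundcloud_pipeline.py.

    Computes two crude proxies for Spotify-style features; the real
    pipeline uses librosa and covers many more features.
    """
    # Energy proxy: root-mean-square amplitude, clipped to [0, 1].
    rms = float(np.sqrt(np.mean(samples ** 2)))
    # Zero-crossing rate: a rough brightness/noisiness proxy.
    zcr = float(np.mean(np.abs(np.diff(np.sign(samples))) > 0))
    return {"energy": min(rms, 1.0), "zero_crossing_rate": zcr}

# One second of a 440 Hz sine wave at a 22,050 Hz sample rate.
sr = 22050
t = np.arange(sr) / sr
features = extract_features_sketch(np.sin(2 * np.pi * 440 * t), sr)
```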
📁 Dataset Files
✅ music_info_cleaned.csv
A cleaned version of the original Music_Info.csv.
Only the first 1,000 songs (the ones used for testing and evaluation) were cleaned.
✅ music_info_cleaned_fixed.csv
A fully cleaned version of the entire Music_Info.csv.
Note: Only the first 1,000 songs were experimentally validated — the rest may still contain edge cases.
🧹 Cleaning Criteria
From the initial 1,000 songs downloaded:
- 165 songs had issues and were filtered out.
- Songs were excluded if their titles contained "remastered", "live", "mono", "album version", "slowed + reverb", etc.
- Of the 165:
  - 35 songs were recoverable after correcting filename quirks (e.g., replacing slashes like AC/DC → AC DC).
  - The remaining 130 were removed due to inconsistencies.
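The title-based exclusion can be mirrored by a small helper like this (hypothetical code; the actual filtering was done during dataset preparation, and naive substring matching can over-match, e.g. "live" inside "Alive"):

```python
# Markers taken from the cleaning criteria above; the list is not exhaustive.
BAD_MARKERS = ("remastered", "live", "mono", "album version", "slowed + reverb")

def is_clean_title(title: str) -> bool:
    """Return True if the title contains none of the excluded markers."""
    lowered = title.lower()
    return not any(marker in lowered for marker in BAD_MARKERS)

kept = is_clean_title("Back in Black")            # no marker present
dropped = is_clean_title("Thunderstruck (Live)")  # contains "live"
```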
✅ clean_dataset.py
This script replaces every '/' in the artist or track name with a space.
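A sketch of the transformation (the exact column names in Music_Info.csv are an assumption here):

```python
# '/' in an artist or track name breaks the downloaded file path,
# so clean_dataset.py replaces it with a space, e.g. "AC/DC" -> "AC DC".
def clean_field(value: str) -> str:
    return value.replace("/", " ")

row = {"artist": "AC/DC", "name": "Back in Black"}
cleaned = {key: clean_field(value) for key, value in row.items()}
```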
📊 Summary Files
✅ og_summary.csv
A summary of comparisons between the Spotify audio features and the custom algorithm for the initial 1,000 songs.
✅ og_audio_feature_cache.csv
Contains audio features generated by the custom algorithm for each track.
Used to avoid recomputation and enable quick lookup by file path.
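Loading the cache might look roughly like this (the file_path key column and the feature columns are assumptions; check the CSV header for the real schema):

```python
import csv
import io

# Stand-in for og_audio_feature_cache.csv contents (schema assumed).
sample_csv = "file_path,tempo,energy\nsongs/track1.mp3,120.0,0.8\n"

def load_cache(fp) -> dict:
    """Index cached rows by file path for quick lookup before re-analysis."""
    return {row["file_path"]: row for row in csv.DictReader(fp)}

cache = load_cache(io.StringIO(sample_csv))
already_done = "songs/track1.mp3" in cache  # skip recomputation if True
```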
🚀 How to Use
- Choose songs to analyze
Open app.py and set start_index and end_index to 1001 or higher (since the model was tuned on the first 1,000 songs).
Example:
from soundcloud_pipeline import SoundCloudPipeline, compare_result
pipeline = SoundCloudPipeline(start_index=1001, end_index=1030)
pipeline.download_songs()
- Download and clean audio
To avoid wasting time analyzing irrelevant songs:
1. Comment out the analyze_songs() call in app.py.
2. Run the script to download all candidate songs using pipeline.download_songs().
3. Manually inspect the downloaded songs and delete any that don't meet the bar (e.g., remastered, live versions, slowed + reverb).
4. Comment out download_songs() and uncomment analyze_songs() to extract features and run comparisons using pipeline.analyze_songs().
- Analyze the audio
This will:
- Run the custom feature extractor
- Compare results to Spotify’s ground truth
- Save a .csv per song to the comparisons/ folder
✅ You should ideally test with 30+ clean songs.
- Analyze performance
Run comparison_analysis.py to compute evaluation metrics; the output is written to a_summary.csv:
- Mean: the average error — tells how far off the predictions are on average.
- Median: the middle error — shows typical performance, less affected by outliers.
- Standard Deviation (std): how much the errors vary — low = consistent, high = unstable.
- Max / Min: the best and worst errors observed — shows the range of performance.
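The four statistics can be reproduced with the standard library; here they are applied to an invented list of per-song absolute errors for a single feature:

```python
import statistics

# Toy per-song absolute errors for one feature (values invented).
errors = [0.05, 0.10, 0.02, 0.40, 0.08]

summary = {
    "mean": statistics.mean(errors),      # average error
    "median": statistics.median(errors),  # typical error, outlier-resistant
    "std": statistics.stdev(errors),      # consistency of the errors
    "max": max(errors),                   # worst case
    "min": min(errors),                   # best case
}
```

Note how the single 0.40 outlier pulls the mean (≈0.13) well above the median (0.08), which is exactly why both are reported.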
🔬 Evaluation Goal
These metrics help you:
- Identify which features are well-estimated
- Spot features that are consistently biased
- Flag features that are unstable and need tuning
🧪 Encouragement to Collaborate
There are ~900 comparisons total, but only a subset is included in this repo.
Feel free to pick 20–30 random songs and run comparisons yourself.
Let me know if you notice anything significantly off — testing edge cases is encouraged.