Welcome to AlphaDEX (formerly SoundVault)! If you have a library of songs with mixed codecs, duplicates, messy metadata, and disorganized folders, AlphaDEX is built to fix all of it. It is designed for casual users and enthusiasts alike, and the tools inside are the result of years of frustration born from too many songs, too little space, and no desire to manually sort thousands of files. The current feature set represents 500+ hours of focused work over roughly six months, with the original vision first outlined more than five years ago when the music collection began. The advent of AI made it possible to extend those ideas and ship the program you see here.
This README mirrors the project documentation and provides:
- A friendly overview of the core programs.
- A quick path to get running.
- Supporting tools and configuration details for deeper work.
These are the main workflows described in the documentation:
- Music Indexer: preview-first workflows to organize, move, and rename your library with a full HTML report before committing changes.
- Duplicate Finder (Redesigned): review-first deduplication with fingerprinting, group-by-group decisions, and safety-focused execution reports.
- Similarity Inspector: a diagnostic tool to understand why two tracks match (or do not match) during duplicate detection.
- Library Sync Review: compare two libraries, build a plan, and preview or execute copy/move actions.
- Tag Fixer: repair metadata using AcoustID and other services.
- Genre Normalizer: batch-update genres for consistent tagging.
- Playlist Generator: build .m3u playlists and Auto‑DJ flows.
- Clustered Playlists: run K-Means/HDBSCAN clustering and visualize the results.
- Visual Music Graph: an interactive scatter plot for clustered playlists that lets you explore your library as a map.
- Python 3.11+ (use conda or venv)
- Git command line for cloning the repo
- FFmpeg installed and on your PATH (required for audio analysis)
- VLC / libVLC for in-app playback (recommended; required for audio preview features)
- Optional LLM helper: third_party/llama/llama-run.exe (Windows binaries included) plus a GGUF model (place it at models/your-model.gguf or update plugins/assistant_plugin.py).
Create and activate a virtual environment, then install the requirements:
python -m venv .venv
source .venv/bin/activate # or ".venv\Scripts\activate" on Windows
pip install -r requirements.txt
The indexer will exit with an error if the real mutagen package is missing, so ensure all dependencies are installed before running.
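A quick preflight check can confirm the prerequisites before launching the GUI. The sketch below is not part of AlphaDEX; `check_prerequisites` is a hypothetical helper that only uses the standard library. Note that it cannot tell the real mutagen package apart from a stub of the same name that happens to be importable.

```python
import importlib.util
import shutil


def check_prerequisites(modules=("mutagen",), tools=("ffmpeg",)):
    """Return a list of problems; an empty list means everything was found."""
    problems = []
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"Python package not importable: {name}")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"Executable not found on PATH: {tool}")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

Run it inside the activated virtual environment; no output means both mutagen and FFmpeg were located.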
The Duplicate Finder tab opens the redesigned shell for spotting duplicates.
If you run from source, keep this repo on your Python path (run python main_gui.py from the repo root or install an editable package) so the backend
modules remain importable.
Essentia can be used instead of librosa for tempo and feature extraction. It
is optional—stick with librosa if you don't need it—but enables faster C++
implementations when available.
- Prerequisites (Linux builds compile C++ code and can take several minutes):
  - Debian/Ubuntu: sudo apt-get install build-essential libfftw3-dev liblapack-dev libblas-dev libyaml-dev libtag1-dev libchromaprint-dev libsamplerate0-dev libavcodec-dev libavformat-dev libavutil-dev libavresample-dev
  - macOS: brew install essentia (installs prebuilt formula with dependencies)
  - Windows: no official wheel; use WSL/Linux if you need Essentia.
- Install (after prerequisites): pip install essentia==2.1b6
Expect longer build times on Linux the first time you install Essentia. If you
prefer the pure-Python stack, you can continue using librosa without this
extra dependency.
# New PySide6 GUI (recommended)
python alpha_dex_gui.py
# Legacy Tkinter GUI (still functional)
python main_gui.py
- Open your library folder.
- Run the Indexer tab in preview mode to review the HTML plan.
- Execute the Indexer to apply moves/renames after the preview looks right.
- Open Duplicate Finder to scan, preview, and execute dedupe groups.
- Use Tools → Similarity Inspector to understand tricky duplicate matches.
- Explore Playlists (folder playlists, Auto‑DJ, clustered playlists) once the library is cleaned.
- Use Library Sync to compare/merge two libraries when needed.
- File organization / rename: Indexer tab
- Duplicates: Duplicate Finder tab (Cross-Album Scan is a reserved toggle in the Indexer UI)
- Playlist tools: Tools ▸ Playlist Generator / Clustered Playlists
- Playback + diagnostics: Tools ▸ Similarity Inspector, Log tab
When you start a playlist job (tempo/energy buckets, Auto‑DJ, or auto‑creating clustered playlists), the app automatically switches to the Log tab. The tab shows timestamped messages from the playlist helpers (feature gathering, similarity calculations, and playlist writes) so you can see that background work is running without waiting for a popup.
Cluster generation writes progress into <method>_log.txt inside your library so you can review the steps later.
The indexer automatically prefixes file paths with \\?\ on Windows, allowing it to work with directories deeper than the classic 260-character limit.
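The prefixing step can be illustrated with plain string handling. This is a minimal sketch, not the indexer's actual code: `to_extended_length_path` is a hypothetical name, and the function assumes it receives an absolute path.

```python
def to_extended_length_path(path: str) -> str:
    """Add the Windows extended-length prefix to an absolute path string."""
    if path.startswith("\\\\?\\"):
        return path  # already prefixed, leave untouched
    # Extended-length paths must use backslashes only.
    normalized = path.replace("/", "\\")
    if normalized.startswith("\\\\"):
        # UNC share: \\server\share\... becomes \\?\UNC\server\share\...
        return "\\\\?\\UNC\\" + normalized[2:]
    return "\\\\?\\" + normalized
```

For example, `to_extended_length_path("C:/Music/a.flac")` yields `\\?\C:\Music\a.flac`, and calling it again on that result returns it unchanged.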
- Preview output: Every run writes Docs/MusicIndex.html under the selected library root; dry runs open the preview automatically.
- Decision log: A detailed Docs/indexer_log.txt captures routing decisions and metadata fallbacks.
- Missing metadata: Tracks missing core tags land in Manual Review/ so you can fix them before re-running.
- Excluded folders: Not Sorted/ and Playlists/ are skipped during scans, letting you stash exceptions safely.
- Leftover files: Non-audio leftovers are moved into Docs/ (.txt, .html, .db) or Trash/ for everything else.
- Playlists: Full runs generate playlists in Playlists/ by default; uncheck Create Playlists to skip.
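The excluded-folder and leftover rules above can be sketched as two small functions. This is an illustration only: the function names, the audio extension list, and the assumption that excluded folders sit directly under the library root are hypothetical, not taken from music_indexer_api.py.

```python
from pathlib import PurePosixPath

DOC_EXTENSIONS = {".txt", ".html", ".db"}  # leftovers routed to Docs/
EXCLUDED_FOLDERS = {"Not Sorted", "Playlists"}  # skipped during scans
AUDIO_EXTENSIONS = {".flac", ".mp3", ".m4a", ".aac"}  # illustrative subset


def is_excluded(relative_path: str) -> bool:
    """True when the path lives under an excluded top-level folder."""
    parts = PurePosixPath(relative_path).parts
    return len(parts) > 1 and parts[0] in EXCLUDED_FOLDERS


def leftover_destination(relative_path: str):
    """Return "Docs" or "Trash" for non-audio files, None for audio files."""
    ext = PurePosixPath(relative_path).suffix.lower()
    if ext in AUDIO_EXTENSIONS:
        return None  # audio is routed by its tags, not by this rule
    return "Docs" if ext in DOC_EXTENSIONS else "Trash"
```

Paths are given relative to the library root with forward slashes, so `is_excluded("Not Sorted/demo.mp3")` is true while a cover image such as `Artist/cover.jpg` maps to `"Trash"`.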
Long-running actions such as indexing, tag fixing, and library sync operations run in background QThread workers. GUI updates from these background tasks are delivered via Qt signals and scheduled on the main thread; worker threads never touch widgets directly.
User settings are stored in ~/.soundvault_config.json (legacy filename from SoundVault). To tweak the near-duplicate
fingerprint threshold used during deduplication, add a value like:
{
"near_duplicate_threshold": 0.1
}
Lower values require more similar fingerprints.
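Reading this setting takes only a few lines of Python. The sketch below assumes a fallback of 0.1 (borrowed from the example value) and assumes that a pair counts as a near-duplicate when its fingerprint distance is at or below the threshold; the helper names are hypothetical and config.py may behave differently.

```python
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".soundvault_config.json"
FALLBACK_THRESHOLD = 0.1  # assumed fallback, not confirmed by the project


def load_config(path: Path = CONFIG_PATH) -> dict:
    """Return the parsed config, or an empty dict when the file is absent."""
    if not path.exists():
        return {}
    with path.open("r", encoding="utf-8") as handle:
        return json.load(handle)


def near_duplicate_threshold(config: dict) -> float:
    """Read the threshold from the config, using the assumed fallback."""
    return float(config.get("near_duplicate_threshold", FALLBACK_THRESHOLD))


def is_near_duplicate(distance: float, threshold: float) -> bool:
    """Assumed rule: smaller distance means more similar fingerprints."""
    return distance <= threshold
```

With a threshold of 0.1, a pair at distance 0.04 would be flagged and a pair at distance 0.2 would not.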
Per-format fingerprint thresholds used by the sync matcher can also be
configured. Add a format_fp_thresholds section with extension keys:
{
"format_fp_thresholds": {
"default": 0.3,
".flac": 0.3,
".mp3": 0.35,
".m4a": 0.35,
".aac": 0.35
}
}
Values are floating-point distances; lower numbers require closer matches.
The same dictionary can be provided to library_sync.compare_libraries via
the optional thresholds parameter to control how strictly fingerprints are
matched when scanning two libraries.
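The lookup implied by this dictionary can be sketched as follows. `threshold_for` is a hypothetical helper, and the assumption that unknown extensions fall back to the "default" key is inferred from the example rather than confirmed by library_sync.py.

```python
from pathlib import PurePath

EXAMPLE_THRESHOLDS = {
    "default": 0.3,
    ".flac": 0.3,
    ".mp3": 0.35,
    ".m4a": 0.35,
    ".aac": 0.35,
}


def threshold_for(path: str, thresholds: dict, fallback: float = 0.3) -> float:
    """Pick the threshold for a file by extension, case-insensitively."""
    ext = PurePath(path).suffix.lower()
    if ext in thresholds:
        return thresholds[ext]
    # Unknown extension: use the "default" entry, then the hard fallback.
    return thresholds.get("default", fallback)
```

Under these assumptions an .mp3 file is matched at 0.35, while an extension missing from the dictionary, such as .ogg, uses the "default" value of 0.3.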
You can also store the path to your library for automatic scanning:
{
"library_root": "/path/to/your/Music"
}
The configuration file also stores your selected metadata service and API key. You can update these via Settings → Metadata Services in the GUI:
{
"metadata_service": "AcoustID",
"metadata_api_key": "YOUR_KEY"
}
These values are persisted for future runs whenever you test the connection or save the Metadata Services settings.
MusicBrainz requests require a valid User-Agent string containing your application name, version and contact email.
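A User-Agent in the form MusicBrainz asks for, `name/version ( contact )`, can be built and attached with the standard library. This is a sketch only: the version string and email address are placeholders, the helper names are hypothetical, and the request object is constructed without being sent.

```python
import urllib.request


def musicbrainz_user_agent(app_name: str, version: str, contact: str) -> str:
    """Format a User-Agent as 'name/version ( contact )'."""
    return f"{app_name}/{version} ( {contact} )"


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Create a request carrying the User-Agent header; nothing is sent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Placeholder values; substitute your own version and contact address.
agent = musicbrainz_user_agent("AlphaDEX", "1.0.0", "you@example.com")
request = build_request(
    "https://musicbrainz.org/ws/2/recording?query=test&fmt=json", agent
)
```

Passing `request` to `urllib.request.urlopen` would perform the lookup with the header attached.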
The Duplicate Finder has been rebuilt into a review-first workflow that makes it easy to preview and execute deduplication safely.
- Scan Library builds a fingerprint plan and summarizes duplicate groups.
- Preview writes Docs/duplicate_preview.json and Docs/duplicate_preview.html so you can review every group before changes.
- Execute applies the plan, writes a detailed HTML report under Docs/duplicate_execution_reports/, and updates playlists when enabled.
- Group dispositions let you retain, quarantine, or delete losers per group while keeping global defaults for everything else.
- Review-required groups block execution until resolved or overridden.
- Thresholds controls let you tune exact/near matching as well as fingerprint windowing and silence trimming for tough cases.
Duplicates are quarantined into Quarantine/ by default; you can switch to
retain-in-place or delete (with confirmation) from the main controls.
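The interaction between per-group dispositions, the global default, and review-required groups can be modelled roughly as below. All names here are hypothetical and the data model is an assumption; the real plan structure in duplicate_consolidation.py is not shown in this README.

```python
from dataclasses import dataclass, field
from typing import List, Optional

DISPOSITIONS = {"retain", "quarantine", "delete"}


@dataclass
class DuplicateGroup:
    group_id: str
    losers: List[str] = field(default_factory=list)
    disposition: Optional[str] = None  # None means "use the global default"
    review_required: bool = False
    resolved: bool = False


def effective_disposition(group: DuplicateGroup, default: str = "quarantine") -> str:
    """Per-group choice wins; otherwise the global default applies."""
    choice = group.disposition or default
    if choice not in DISPOSITIONS:
        raise ValueError(f"Unknown disposition: {choice}")
    return choice


def blocking_groups(groups: List[DuplicateGroup]) -> List[str]:
    """IDs of review-required groups that have not been resolved yet."""
    return [g.group_id for g in groups if g.review_required and not g.resolved]
```

In this model, execution would proceed only when `blocking_groups(...)` returns an empty list, and every remaining group is handled according to `effective_disposition`.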
The Similarity Inspector is a targeted tool for understanding why two tracks match (or do not match) during duplicate detection.
- Launch from Tools → Similarity Inspector….
- Select two files, optionally override fingerprint offsets, trimming, and thresholds, then run the inspection.
- The report shows codec, duration, raw fingerprint distance, the effective near-duplicate threshold (including mixed-codec adjustments), and the verdict.
- Every run writes a timestamped report to Docs/ inside the selected library.
The codebase is organized into backend modules plus a PySide6 GUI layer:
alpha_dex_gui.py - PySide6 entry point (current GUI)
main_gui.py - Legacy Tkinter entry point (still functional)
── Backend ──────────────────────────────────────────────────────────────
music_indexer_api.py - Core scanning and relocation logic
duplicate_consolidation.py - Duplicate plan builder (dry-run)
duplicate_consolidation_executor.py - Plan executor
library_sync.py - Library comparison and plan execution
fingerprint_generator.py - Build AcoustID fingerprint database
fingerprint_cache.py - Persistent SQLite fingerprint cache
near_duplicate_detector.py - Fuzzy near-duplicate detection helpers
tag_fixer.py - Tag fixing engine using plugin metadata
update_genres.py - Batch genre tag updater via MusicBrainz
playlist_generator.py - .m3u playlist creation helpers
playlist_engine.py - Tempo/energy/Auto-DJ logic
clustered_playlists.py - Feature extraction and K-Means/HDBSCAN clustering
cluster_graph_panel.py - Interactive scatter plot for clustered playlists
validator.py - Verify AlphaDEX folder layout
config.py - Read/write persistent configuration (~/.soundvault_config.json)
chromaprint_utils.py - fpcalc wrapper
audio_norm.py - Audio normalization helpers
── PySide6 GUI (gui/) ───────────────────────────────────────────────────
gui/main_window.py - AlphaDEXWindow: sidebar + stacked workspace + log drawer
gui/compat.py - PySide6/PyQt6 compatibility shim
gui/themes/ - Custom QPainter theme engine
  tokens.py - ThemeTokens dataclass + 14 named themes (8 dark, 6 light)
  style.py - AlphaDEXStyle(QProxyStyle): full QPainter rendering
  manager.py - ThemeManager singleton: apply/persist/auto OS day-night switch
  effects.py - card_shadow(), lerp_color(), build_palette(), radius constants
  animations.py - HoverMixin, AnimatedNavButton (badge), AnimatedButton
  picker.py - ThemePickerDialog (swatch grid) + AutoThemeDialog
gui/widgets/
  top_bar.py - Library path display, stats, Theme and Settings buttons
  sidebar.py - Animated navigation sidebar (5 sections, 13 items)
  log_drawer.py - Slide-up log panel with colour-coded levels
gui/workspaces/ - One QWidget per workflow (loaded into QStackedWidget)
  base.py - WorkspaceBase: scroll wrapper, card/title/button helpers
  indexer.py - Indexer: 3-phase progress, dry-run/execute, report
  duplicates.py - Duplicate Finder: fingerprint scan, groups + inspector
  library_sync.py - Library Sync: scan, plan, copy/move execution
  similarity.py - Similarity Inspector: two-file threshold breakdown
  tag_fixer.py - Tag Fixer: proposals table with checkboxes
  genres.py - Genre Normalizer: MusicBrainz/Last.fm batch update
  playlists.py - Playlist Generator: Folder / Tempo+Energy / Auto-DJ / Repair
  clustered.py - Clustered Playlists: K-Means + HDBSCAN + graph launcher
  graph.py - Visual Music Graph launcher
  player.py - Player: libVLC transport controls + metadata display
  compression.py - Library Compression: format targets, bitrate, archive
  tools.py - Export & Utilities: artist/title export, codec list, cleanup
  help.py - Help: doc links, keyboard shortcuts, About
── Other ─────────────────────────────────────────────────────────────────
controllers/ - Thin wrappers wiring backend to the legacy Tkinter GUI
plugins/
  base.py - Metadata plugin interface
  acoustid_plugin.py - Metadata lookup via AcoustID / MusicBrainz
  assistant_plugin.py - LLM helper integration (requires user-supplied GGUF model)
  discogs.py - Discogs metadata stub (not yet wired end-to-end)
  lastfm.py - Fetch genres from Last.fm
  spotify.py - Spotify metadata stub (not yet wired end-to-end)
mutagen_stub/ - Minimal mutagen fallback used by the test suite
bindings/ - C++/pybind11 wrapper for llama binaries
docs/ - Project documentation and design notes
third_party/ - Prebuilt llama executables
tests/ - pytest suite (42 modules)
These items are currently under development and not yet part of the stable release.
- Expanded metadata plugins beyond AcoustID/Last.fm (Discogs, Spotify)
See docs/project_documentation.html for technical details.
- Tidal-dl sync: tidal-dl is listed in requirements.txt, but there is no UI or workflow wired up yet.
- Metadata provider breadth: only AcoustID + Last.fm are fully wired end-to-end; Spotify/Gracenote listed in config but not implemented.
- Library Sync per-item flags: ✅ IMPLEMENTED — Users can now right-click incoming tracks to flag for copy/replace or add notes. Flags override auto-decisions during plan building.
- Library Sync Export Report: export helper functions exist but the Export Report button is not wired to a user-accessible control.