Greensand321/Music_Indexer

AlphaDEX

AlphaDEX (formerly SoundVault): a Music Indexer

Welcome to AlphaDEX (formerly SoundVault)! If you have a library of songs with mixed codecs, duplicates, messy metadata, and disorganized folders, AlphaDEX is built to fix all of it. It is designed for casual users and enthusiasts alike, and the tools inside are the result of years of frustration born from too many songs, too little space, and no desire to manually sort thousands of files. The current feature set represents 500+ hours of focused work over roughly six months, with the original vision first outlined more than five years ago when the music collection began. The advent of AI made it possible to extend those ideas and ship the program you see here.

This README mirrors the project documentation and provides:

  • A friendly overview of the core programs.
  • A quick path to get running.
  • Supporting tools and configuration details for deeper work.

Core programs (start here)

These are the main workflows described in the documentation:

  1. Music Indexer: preview-first workflows to organize, move, and rename your library with a full HTML report before committing changes.
  2. Duplicate Finder (Redesigned): review-first deduplication with fingerprinting, group-by-group decisions, and safety-focused execution reports.
  3. Similarity Inspector: a diagnostic tool to understand why two tracks match (or do not match) during duplicate detection.
  4. Library Sync Review: compare two libraries, build a plan, and preview or execute copy/move actions.

Supporting tools (roughly simple → advanced)

  1. Tag Fixer: repair metadata using AcoustID and other services.
  2. Genre Normalizer: batch-update genres for consistent tagging.
  3. Playlist Generator: build .m3u playlists and Auto‑DJ flows.
  4. Clustered Playlists: run K-Means/HDBSCAN clustering and visualize the results.
  5. Visual Music Graph: an interactive scatter plot for clustered playlists that lets you explore your library as a map.

Prerequisites

  • Python 3.11+ (use conda or venv)
  • Git command line for cloning the repo
  • FFmpeg installed and on your PATH (required for audio analysis)
  • VLC / libVLC for in-app playback (recommended; required for audio preview features)
  • Optional LLM helper: third_party/llama/llama-run.exe (Windows binaries included) plus a GGUF model (place it at models/your-model.gguf or update plugins/assistant_plugin.py).
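
One way to confirm the command-line prerequisites before launching is to look them up on PATH. The following is a minimal sketch and not part of AlphaDEX: the helper name missing_tools is made up for illustration, and the executable names (git, ffmpeg) are assumed to match a standard install.

```python
import shutil


def missing_tools(names):
    """Return the executable names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(["git", "ffmpeg"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("git and ffmpeg are available.")
```

shutil.which only reports whether an executable is discoverable; it does not check versions or whether libVLC is installed.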

Installation

Create and activate a virtual environment, then install the requirements:

python -m venv .venv
source .venv/bin/activate  # or ".venv\Scripts\activate" on Windows
pip install -r requirements.txt

The indexer will exit with an error if the real mutagen package is missing, so ensure all dependencies are installed before running.

The Duplicate Finder tab opens the redesigned shell for spotting duplicates. If you run from source, keep this repo on your Python path (run python main_gui.py from the repo root or install an editable package) so the backend modules remain importable.

Optional: Essentia audio engine

Essentia can be used instead of librosa for tempo and feature extraction. It is optional (stick with librosa if you don't need it), but it enables faster C++ implementations when available.

  • Prerequisites (Linux builds compile C++ code and can take several minutes):
    • Debian/Ubuntu: sudo apt-get install build-essential libfftw3-dev liblapack-dev libblas-dev libyaml-dev libtag1-dev libchromaprint-dev libsamplerate0-dev libavcodec-dev libavformat-dev libavutil-dev libavresample-dev
    • macOS: brew install essentia (installs prebuilt formula with dependencies)
    • Windows: no official wheel; use WSL/Linux if you need Essentia.
  • Install (after prerequisites):
    pip install essentia==2.1b6

Expect longer build times on Linux the first time you install Essentia. If you prefer the pure-Python stack, you can continue using librosa without this extra dependency.

Quickstart (basic flow)

# New PySide6 GUI (recommended)
python alpha_dex_gui.py

# Legacy Tkinter GUI (still functional)
python main_gui.py

  1. Open your library folder.
  2. Run the Indexer tab in preview mode to review the HTML plan.
  3. Execute the Indexer to apply moves/renames after the preview looks right.
  4. Open Duplicate Finder to scan, preview, and execute dedupe groups.
  5. Use Tools → Similarity Inspector to understand tricky duplicate matches.
  6. Explore Playlists (folder playlists, Auto‑DJ, clustered playlists) once the library is cleaned.
  7. Use Library Sync to compare/merge two libraries when needed.

Where to find things

  • File organization / rename: Indexer tab
  • Duplicates: Duplicate Finder tab (Cross-Album Scan is a reserved toggle in the Indexer UI)
  • Playlist tools: Tools ▸ Playlist Generator / Clustered Playlists
  • Playback + diagnostics: Tools ▸ Similarity Inspector, Log tab

Playlist generator feedback

When you start a playlist job (tempo/energy buckets, Auto‑DJ, or auto‑creating clustered playlists), the app automatically switches to the Log tab. The tab shows timestamped messages from the playlist helpers (feature gathering, similarity calculations, and playlist writes) so you can see that background work is running without waiting for a popup.

Cluster generation writes progress into <method>_log.txt inside your library so you can review the steps later.
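
To illustrate the log-file naming, here is a minimal sketch that appends a timestamped line to <method>_log.txt. The function name, the timestamp format, and the choice to write directly at the library root are assumptions; the application may format and place its logs differently.

```python
from datetime import datetime
from pathlib import Path


def append_cluster_log(library_root, method, message):
    """Append one timestamped line to <method>_log.txt and return the log path."""
    log_path = Path(library_root) / f"{method}_log.txt"
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")  # assumed format
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(f"[{stamp}] {message}\n")
    return log_path
```

Opening the file in append mode keeps earlier runs in the file, which fits the stated goal of reviewing the steps later.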

Windows Long Paths

The indexer automatically prefixes file paths with \\?\ on Windows, allowing it to work with directories deeper than the classic 260-character limit.
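
The prefixing idea can be sketched as follows. This is an illustration under stated assumptions, not the indexer's own code: the function name to_long_path is hypothetical, the input is assumed to be an absolute Windows path, and ntpath is used so the example behaves the same on any operating system.

```python
import ntpath

LONG_PREFIX = "\\\\?\\"  # the four characters \\?\


def to_long_path(path):
    """Return the extended-length form of an absolute Windows path."""
    if path.startswith(LONG_PREFIX):
        return path  # already prefixed, leave untouched
    # Prefixed paths are not normalised by Windows, so resolve "/" and ".." first.
    normalized = ntpath.normpath(path)
    if normalized.startswith("\\\\"):
        # UNC share: \\server\share\x becomes \\?\UNC\server\share\x
        return LONG_PREFIX + "UNC\\" + normalized[2:]
    return LONG_PREFIX + normalized


if __name__ == "__main__":
    print(to_long_path("C:/Music/Artist/../a.flac"))
```

Normalising before prefixing matters because Windows passes extended-length paths through largely as written, so forward slashes and ".." segments would otherwise be taken literally.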

Indexer outputs and behavior

  • Preview output: Every run writes Docs/MusicIndex.html under the selected library root; dry runs open the preview automatically.
  • Decision log: A detailed Docs/indexer_log.txt captures routing decisions and metadata fallbacks.
  • Missing metadata: Tracks missing core tags land in Manual Review/ so you can fix them before re-running.
  • Excluded folders: Not Sorted/ and Playlists/ are skipped during scans, letting you stash exceptions safely.
  • Leftover files: Non-audio leftovers are moved into Docs/ (.txt, .html, .db) or Trash/ for everything else.
  • Playlists: Full runs generate playlists in Playlists/ by default; uncheck Create Playlists to skip.
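
The exclusion and leftover rules above can be sketched as two small helpers. This is a simplified illustration, not the indexer's routing code: the function names are hypothetical, the audio extension set is an assumed sample rather than the real list, and treating only the top-level folder as excluded is also an assumption.

```python
from pathlib import PurePath

EXCLUDED_DIRS = {"Not Sorted", "Playlists"}           # skipped during scans
DOCS_EXTENSIONS = {".txt", ".html", ".db"}            # leftovers kept under Docs/
AUDIO_EXTENSIONS = {".mp3", ".flac", ".m4a", ".aac"}  # assumed, not exhaustive


def is_excluded(relative_path):
    """True if the top-level folder of a library-relative path is skipped."""
    parts = PurePath(relative_path).parts
    return bool(parts) and parts[0] in EXCLUDED_DIRS


def leftover_destination(relative_path):
    """Return 'Docs' or 'Trash' for a non-audio leftover, or None for audio."""
    ext = PurePath(relative_path).suffix.lower()
    if ext in AUDIO_EXTENSIONS:
        return None  # audio files are routed by the indexer, not as leftovers
    return "Docs" if ext in DOCS_EXTENSIONS else "Trash"
```

For example, leftover_destination("cover.jpg") returns "Trash", while leftover_destination("notes.txt") returns "Docs".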

Threading

Long-running actions such as indexing, tag fixing, and library sync operations are executed in QThread daemon threads. GUI updates from these background tasks are delivered via Qt signals and scheduled on the main thread; worker threads never touch widgets directly.
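
The same hand-off can be shown without Qt. The sketch below is a standard-library analogue, not the application's QThread code: a queue.Queue stands in for the signal/slot connection, and every name in it is illustrative.

```python
import queue
import threading


def run_in_background(task, messages):
    """Start task(report) on a daemon thread; report() only enqueues text."""
    worker = threading.Thread(target=task, args=(messages.put,), daemon=True)
    worker.start()
    return worker


def drain(messages, update_ui):
    """Call update_ui for each pending message; run this on the main thread."""
    handled = 0
    while True:
        try:
            text = messages.get_nowait()
        except queue.Empty:
            return handled
        update_ui(text)
        handled += 1


def fake_index(report):
    # Stand-in for a long-running job such as indexing.
    for step in ("scan", "plan", "move"):
        report(f"{step} done")


if __name__ == "__main__":
    inbox = queue.Queue()
    run_in_background(fake_index, inbox).join()
    drain(inbox, print)
```

The worker only ever calls messages.put, so all widget updates stay on the thread that calls drain. In a real GUI, drain would be invoked periodically by the event loop rather than after join.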

Configuration

User settings are stored in ~/.soundvault_config.json (legacy filename from SoundVault). To tweak the near-duplicate fingerprint threshold used during deduplication, add a value like:

{
  "near_duplicate_threshold": 0.1
}

Lower values require more similar fingerprints.
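
A minimal sketch of how such a setting might be read and applied is shown below. The function names, the fallback value, the empty-dict behavior for a missing file, and the "distance at or below threshold counts as a match" rule are all assumptions for illustration; the real logic lives in config.py and near_duplicate_detector.py and may differ.

```python
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".soundvault_config.json"


def load_config(path=CONFIG_PATH):
    """Return the parsed JSON config, or an empty dict if the file is missing."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        return {}


def is_near_duplicate(distance, config, fallback=0.1):
    """Compare a fingerprint distance with the configured threshold."""
    threshold = float(config.get("near_duplicate_threshold", fallback))
    return distance <= threshold  # assumed comparison rule
```

With a threshold of 0.1, a distance of 0.05 would count as a near-duplicate under this rule and 0.2 would not.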

Per-format fingerprint thresholds used by the sync matcher can also be configured. Add a format_fp_thresholds section with extension keys:

{
  "format_fp_thresholds": {
    "default": 0.3,
    ".flac": 0.3,
    ".mp3": 0.35,
    ".m4a": 0.35,
    ".aac": 0.35
  }
}

Values are floating-point distances; lower numbers require closer matches.

The same dictionary can be provided to library_sync.compare_libraries via the optional thresholds parameter to control how strictly fingerprints are matched when scanning two libraries.
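
The extension-keyed lookup with a "default" fallback can be sketched like this. The helper name threshold_for and the case-insensitive extension matching are assumptions; how the sync matcher combines thresholds for two files of different formats is not shown here.

```python
from pathlib import PurePath

FORMAT_FP_THRESHOLDS = {
    "default": 0.3,
    ".flac": 0.3,
    ".mp3": 0.35,
    ".m4a": 0.35,
    ".aac": 0.35,
}


def threshold_for(filename, thresholds=FORMAT_FP_THRESHOLDS):
    """Return the threshold for a file's extension, else the 'default' entry."""
    ext = PurePath(filename).suffix.lower()
    return thresholds.get(ext, thresholds["default"])


if __name__ == "__main__":
    for name in ("track.flac", "track.MP3", "track.ogg"):
        print(name, threshold_for(name))
```

An extension without its own key, such as .ogg in this example, falls back to the "default" value.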

You can also store the path to your library for automatic scanning:

{
  "library_root": "/path/to/your/Music"
}

The configuration file also stores your selected metadata service and API key. You can update these via Settings → Metadata Services in the GUI:

{
  "metadata_service": "AcoustID",
  "metadata_api_key": "YOUR_KEY"
}

Both testing the connection and saving the Metadata Services settings persist your selections for future runs.

MusicBrainz requests require a valid User-Agent string containing your application name, version and contact email.
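
A User-Agent of that shape can be assembled as in the sketch below. The helper name is hypothetical, and the version and email shown in the usage are placeholders rather than values taken from the project; the "name/version (contact)" layout follows the common convention for this header.

```python
def build_user_agent(app_name, version, contact):
    """Format a User-Agent string as 'name/version (contact)'."""
    fields = (("app_name", app_name), ("version", version), ("contact", contact))
    for label, value in fields:
        if not value or not value.strip():
            raise ValueError(f"{label} must not be empty")
    return f"{app_name.strip()}/{version.strip()} ({contact.strip()})"


if __name__ == "__main__":
    # Placeholder values; substitute your own version and contact address.
    headers = {"User-Agent": build_user_agent("AlphaDEX", "0.0.0", "you@example.com")}
    print(headers)
```

Rejecting empty fields early avoids sending a header that the service may treat as anonymous.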

Duplicate Finder (Redesigned)

The Duplicate Finder has been rebuilt into a review-first workflow that makes it easy to preview and execute deduplication safely.

  • Scan Library builds a fingerprint plan and summarizes duplicate groups.
  • Preview writes Docs/duplicate_preview.json and Docs/duplicate_preview.html so you can review every group before changes.
  • Execute applies the plan, writes a detailed HTML report under Docs/duplicate_execution_reports/, and updates playlists when enabled.
  • Group dispositions let you retain, quarantine, or delete losers per group while keeping global defaults for everything else.
  • Review-required groups block execution until resolved or overridden.
  • Thresholds controls let you tune exact/near matching as well as fingerprint windowing and silence trimming for tough cases.

Duplicates are quarantined into Quarantine/ by default; you can switch to retain-in-place or delete (with confirmation) from the main controls.
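
The interplay between per-group dispositions, the global default, and review-required groups can be sketched as follows. This is a simplified model with hypothetical names (DuplicateGroup, plan_actions); the real plan builder and executor in duplicate_consolidation.py and duplicate_consolidation_executor.py are not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional

DISPOSITIONS = {"retain", "quarantine", "delete"}


@dataclass
class DuplicateGroup:
    group_id: str
    losers: list
    disposition: Optional[str] = None  # None means "use the global default"
    review_required: bool = False
    review_overridden: bool = False


def plan_actions(groups, default="quarantine"):
    """Return (loser, disposition) pairs, or raise if a review is unresolved."""
    blocked = [g.group_id for g in groups
               if g.review_required and not g.review_overridden]
    if blocked:
        raise RuntimeError(f"Review required before execution: {blocked}")
    actions = []
    for group in groups:
        choice = group.disposition or default
        if choice not in DISPOSITIONS:
            raise ValueError(f"Unknown disposition: {choice}")
        actions.extend((loser, choice) for loser in group.losers)
    return actions
```

Checking for unresolved reviews before building any actions mirrors the "block execution until resolved or overridden" behavior: nothing is planned for any group while one group is still pending.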

Similarity Inspector

The Similarity Inspector is a targeted tool for understanding why two tracks match (or do not match) during duplicate detection.

  • Launch from Tools → Similarity Inspector….
  • Select two files, optionally override fingerprint offsets, trimming, and thresholds, then run the inspection.
  • The report shows codec, duration, raw fingerprint distance, the effective near-duplicate threshold (including mixed-codec adjustments), and the verdict.
  • Every run writes a timestamped report to Docs/ inside the selected library.

File Overview

The codebase is organized into backend modules plus a PySide6 GUI layer:

alpha_dex_gui.py          - PySide6 entry point (current GUI)
main_gui.py               - Legacy Tkinter entry point (still functional)

── Backend ──────────────────────────────────────────────────────────────
music_indexer_api.py      - Core scanning and relocation logic
duplicate_consolidation.py - Duplicate plan builder (dry-run)
duplicate_consolidation_executor.py - Plan executor
library_sync.py           - Library comparison and plan execution
fingerprint_generator.py  - Build AcoustID fingerprint database
fingerprint_cache.py      - Persistent SQLite fingerprint cache
near_duplicate_detector.py - Fuzzy near-duplicate detection helpers
tag_fixer.py              - Tag fixing engine using plugin metadata
update_genres.py          - Batch genre tag updater via MusicBrainz
playlist_generator.py     - .m3u playlist creation helpers
playlist_engine.py        - Tempo/energy/Auto-DJ logic
clustered_playlists.py    - Feature extraction and K-Means/HDBSCAN clustering
cluster_graph_panel.py    - Interactive scatter plot for clustered playlists
validator.py              - Verify AlphaDEX folder layout
config.py                 - Read/write persistent configuration (~/.soundvault_config.json)
chromaprint_utils.py      - fpcalc wrapper
audio_norm.py             - Audio normalization helpers

── PySide6 GUI (gui/) ───────────────────────────────────────────────────
gui/main_window.py        - AlphaDEXWindow: sidebar + stacked workspace + log drawer
gui/compat.py             - PySide6/PyQt6 compatibility shim

gui/themes/               - Custom QPainter theme engine
  tokens.py               - ThemeTokens dataclass + 14 named themes (8 dark, 6 light)
  style.py                - AlphaDEXStyle(QProxyStyle): full QPainter rendering
  manager.py              - ThemeManager singleton: apply/persist/auto OS day-night switch
  effects.py              - card_shadow(), lerp_color(), build_palette(), radius constants
  animations.py           - HoverMixin, AnimatedNavButton (badge), AnimatedButton
  picker.py               - ThemePickerDialog (swatch grid) + AutoThemeDialog

gui/widgets/
  top_bar.py              - Library path display, stats, Theme and Settings buttons
  sidebar.py              - Animated navigation sidebar (5 sections, 13 items)
  log_drawer.py           - Slide-up log panel with colour-coded levels

gui/workspaces/           - One QWidget per workflow (loaded into QStackedWidget)
  base.py                 - WorkspaceBase: scroll wrapper, card/title/button helpers
  indexer.py              - Indexer: 3-phase progress, dry-run/execute, report
  duplicates.py           - Duplicate Finder: fingerprint scan, groups + inspector
  library_sync.py         - Library Sync: scan, plan, copy/move execution
  similarity.py           - Similarity Inspector: two-file threshold breakdown
  tag_fixer.py            - Tag Fixer: proposals table with checkboxes
  genres.py               - Genre Normalizer: MusicBrainz/Last.fm batch update
  playlists.py            - Playlist Generator: Folder / Tempo+Energy / Auto-DJ / Repair
  clustered.py            - Clustered Playlists: K-Means + HDBSCAN + graph launcher
  graph.py                - Visual Music Graph launcher
  player.py               - Player: libVLC transport controls + metadata display
  compression.py          - Library Compression: format targets, bitrate, archive
  tools.py                - Export & Utilities: artist/title export, codec list, cleanup
  help.py                 - Help: doc links, keyboard shortcuts, About

── Other ─────────────────────────────────────────────────────────────────
controllers/              - Thin wrappers wiring backend to the legacy Tkinter GUI
plugins/
  base.py               - Metadata plugin interface
  acoustid_plugin.py    - Metadata lookup via AcoustID / MusicBrainz
  assistant_plugin.py   - LLM helper integration (requires user-supplied GGUF model)
  discogs.py            - Discogs metadata stub (not yet wired end-to-end)
  lastfm.py             - Fetch genres from Last.fm
  spotify.py            - Spotify metadata stub (not yet wired end-to-end)

mutagen_stub/             - Minimal mutagen fallback used by the test suite
bindings/                 - C++/pybind11 wrapper for llama binaries
docs/                     - Project documentation and design notes
third_party/              - Prebuilt llama executables
tests/                    - pytest suite (42 modules)

Roadmap (Upcoming Features)

These items are currently under development and not yet part of the stable release.

  • Expanded metadata plugins beyond AcoustID/Last.fm (Discogs, Spotify)

See docs/project_documentation.html for technical details.

Known gaps

  • Tidal-dl sync: tidal-dl is listed in requirements.txt, but there is no UI or workflow wired up yet.
  • Metadata provider breadth: only AcoustID + Last.fm are fully wired end-to-end; Spotify/Gracenote are listed in config but not implemented.
  • Library Sync per-item flags: ✅ IMPLEMENTED. Users can now right-click incoming tracks to flag them for copy/replace or to add notes. Flags override auto-decisions during plan building.
  • Library Sync Export Report: export helper functions exist but the Export Report button is not wired to a user-accessible control.

About

AlphaDEX is a full music catalog manager that automatically sorts and organizes your library, removes duplicates, and supports library‑sync workflows so your collections stay consistent across devices. It offers interactive playlist creation and discovery tools, while keeping the entire experience easy and efficient.
