Develop Release v0.9.3 – Canonical Authority & Scheduler Hardening #2

Merged
z3ro-2 merged 46 commits into main from develop
Feb 18, 2026

Conversation

z3ro-2 (Collaborator) commented Feb 18, 2026

v0.9.3 stabilizes Retreivr’s ingestion architecture.

  • MusicBrainz-first canonical metadata
  • Spotify optional (OAuth + Premium required)
  • Deterministic playlist snapshot hashing
  • Idempotent scheduler ticks
  • Active-job duplicate prevention
  • MKV default container
  • Full integration test coverage

z3ro-2 added 30 commits February 3, 2026 14:11
…scaffolding

add SQLite migration helpers for playlist_snapshots and playlist_snapshot_items with unique constraints and indexes
add snapshot persistence layer with fast-path skip when snapshot_id is unchanged
add Spotify playlist client method get_playlist_items(playlist_id) returning (snapshot_id, normalized ordered items)
add duplicate-aware diff_playlist(prev, curr) utility with added/removed/moved output
add scheduler job spotify_playlist_watch.py to compute snapshot+diff and enqueue only new tracks
add unit tests for diff behavior and snapshot persistence fast-path/ordering
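
A minimal sketch of the duplicate-aware diff described in this commit. It is not the project's implementation: the return shape and the matching rule (the k-th occurrence of a track ID in the previous snapshot pairs with its k-th occurrence in the current one) are assumptions made for illustration.

```python
from collections import defaultdict


def diff_playlist(prev, curr):
    """Diff two ordered lists of track IDs; duplicates are matched by occurrence."""

    def positions(items):
        # Map each track ID to the ordered list of indexes where it appears.
        pos = defaultdict(list)
        for index, track_id in enumerate(items):
            pos[track_id].append(index)
        return pos

    prev_pos, curr_pos = positions(prev), positions(curr)
    added, removed, moved = [], [], []

    for track_id, new_indexes in curr_pos.items():
        old_indexes = prev_pos.get(track_id, [])
        for k, new_index in enumerate(new_indexes):
            if k >= len(old_indexes):
                # More occurrences now than before: the surplus is new.
                added.append({"id": track_id, "index": new_index})
            elif old_indexes[k] != new_index:
                moved.append({"id": track_id, "from": old_indexes[k], "to": new_index})

    for track_id, old_indexes in prev_pos.items():
        # Occurrences that no longer have a counterpart were removed.
        surplus = old_indexes[len(curr_pos.get(track_id, [])):]
        removed.extend({"id": track_id, "index": i} for i in surplus)

    added.sort(key=lambda entry: entry["index"])
    removed.sort(key=lambda entry: entry["index"])
    moved.sort(key=lambda entry: entry["to"])
    return {"added": added, "removed": removed, "moved": moved}
```

Under this rule a watcher would enqueue only the `added` entries. Note that comparing raw indexes reports every track shifted by an insertion as moved; a production diff may want a stricter definition.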
… music pipeline

	•	Introduced modular, focused Codex prompt set for Spotify ingestion hardening
	•	Added prompt blocks for:
		•	Spotify playlist fetch + normalization
		•	Snapshot diff logic (duplicate + move aware)
		•	SQLite snapshot persistence + fast-path handling
		•	Scheduler watch job integration
		•	Canonical MusicMetadata object
		•	Deterministic metadata precedence (Spotify → MusicBrainz → yt-dlp)
		•	Music filename builder (industry-standard format)
		•	ID3/Vorbis tagging routine with artwork + lyrics
		•	Deterministic Spotify search query builder
	•	Designed prompts for incremental generation to keep Codex scoped and focused
	•	Enforces deterministic behavior, idempotency, and streaming-quality metadata goals
	•	Supports “Spotify as authority” model (metadata only, not media source)
	•	Lays foundation for production-grade Music Mode

No runtime behavior changed.
Development workflow enhancement only.
…n pipeline

	•	Introduced 10 incremental Codex prompt blocks to implement Spotify track resolution subsystem
	•	Covers:
		•	Resolver interface definition
		•	Deterministic candidate scoring heuristics
		•	Async search execution wrapper
		•	Full resolver implementation
		•	Unit tests for scoring + resolver behavior
		•	Enqueue integration with resolved media + attached metadata
		•	Download worker update to prefer attached Spotify metadata
		•	End-to-end integration test scaffold
		•	Improved Spotify search query builder
		•	Structured resolution logging
	•	Prompts are scoped one-task-per-block to keep Codex focused and avoid breaking existing modules
	•	Designed to preserve current architecture and incrementally integrate Spotify-driven ingestion
	•	No runtime behavior changes in this commit (development workflow enhancement only)
…ad registry

	•	Added downloaded_music_tracks table with unique constraint on (playlist_id, isrc)
	•	Introduced DB helpers:
		•	has_downloaded_isrc
		•	record_downloaded_track
	•	Updated Spotify enqueue flow to:
		•	Skip tracks already downloaded for playlist via ISRC
		•	Log duplicate skip events
		•	Allow fallback when ISRC missing
	•	Updated download worker to:
		•	Record successful downloads into registry
		•	Only record after successful download + tagging
	•	Added rollback support to migration
	•	Added unit tests:
		•	ISRC guard behavior
		•	Duplicate skip validation
		•	Registry helper correctness
		•	Full idempotency integration scenario

This change enforces persistent playlist-level idempotency and prevents duplicate Spotify track downloads across scheduler runs.
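
The registry above can be sketched with the standard-library `sqlite3` module. The table name, the unique constraint, and the helper names come from the commit; the remaining columns, the helper signatures, and `ensure_registry` are assumptions for illustration.

```python
import sqlite3

# Assumed schema: only the table name and UNIQUE (playlist_id, isrc) are from the commit.
TABLE_SQL = """
CREATE TABLE IF NOT EXISTS downloaded_music_tracks (
    id INTEGER PRIMARY KEY,
    playlist_id TEXT NOT NULL,
    isrc TEXT NOT NULL,
    file_path TEXT,
    UNIQUE (playlist_id, isrc)
)
"""


def ensure_registry(conn):
    """Hypothetical bootstrap helper: create the registry table if missing."""
    conn.execute(TABLE_SQL)


def has_downloaded_isrc(conn, playlist_id, isrc):
    """Return True when this ISRC was already recorded for the playlist."""
    if not isrc:
        # Missing ISRC: report "not downloaded" so the caller can fall back.
        return False
    row = conn.execute(
        "SELECT 1 FROM downloaded_music_tracks "
        "WHERE playlist_id = ? AND isrc = ? LIMIT 1",
        (playlist_id, isrc),
    ).fetchone()
    return row is not None


def record_downloaded_track(conn, playlist_id, isrc, file_path):
    """Record a finished download; the unique constraint makes repeats a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO downloaded_music_tracks "
        "(playlist_id, isrc, file_path) VALUES (?, ?, ?)",
        (playlist_id, isrc, file_path),
    )
    conn.commit()
```

Because the constraint is scoped to `(playlist_id, isrc)`, the same recording can still be downloaded once per playlist, which matches the playlist-level idempotency described above.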
…enforcement

	•	Added media/ffprobe.py wrapper to extract media duration via ffprobe
	•	Added media/validation.py with validate_duration helper enforcing duration tolerance
	•	Introduced configurable settings:
		•	ENABLE_DURATION_VALIDATION
		•	SPOTIFY_DURATION_TOLERANCE_SECONDS
	•	Integrated duration validation into download worker:
		•	Validates Spotify music jobs before tagging and recording
		•	Skips tagging and ISRC recording on mismatch
		•	Introduces new job status: validation_failed
	•	Added structured WARNING logs for validation failures (actual vs expected vs tolerance)
	•	Ensured failed validation does not persist idempotency records
	•	Added unit tests:
		•	Duration validation behavior
		•	Worker validation enforcement
		•	Configurable tolerance effects
		•	Full pipeline validation guard (no ISRC record on failure)

This change introduces post-download media verification to prevent incorrect track versions from being permanently recorded, strengthening Spotify ingestion reliability and data integrity.
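
A minimal sketch of the duration check, assuming the expected value arrives as Spotify's `duration_ms` and that the tolerance is expressed in seconds. The function signatures, `probe_duration_seconds`, and the default tolerance of 5 seconds are assumptions, not the project's actual values.

```python
import subprocess


def probe_duration_seconds(path):
    """Return the container duration reported by ffprobe, in seconds.

    Hypothetical wrapper; requires the ffprobe binary on PATH.
    """
    result = subprocess.run(
        [
            "ffprobe", "-v", "error",
            "-show_entries", "format=duration",
            "-of", "default=noprint_wrappers=1:nokey=1",
            str(path),
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return float(result.stdout.strip())


def validate_duration(actual_seconds, expected_ms, tolerance_seconds=5.0):
    """Return True when the file length is within tolerance of the expected length."""
    expected_seconds = expected_ms / 1000.0
    return abs(actual_seconds - expected_seconds) <= tolerance_seconds
```

A worker following the flow above would call `validate_duration` after download and before tagging, and would mark the job `validation_failed` without writing an ISRC record when it returns False.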
…tify ingestion

	•	Introduced metadata/normalize.py with normalize_music_metadata as canonical metadata sanitation layer
	•	Added deterministic title cleanup rules (removal of resolver artifacts like “Official Audio”, “[HD]”, etc.)
	•	Implemented featured artist normalization policy (move feat./ft. to title when appropriate)
	•	Enforced album_artist consistency for proper album grouping across media players
	•	Added date normalization logic (YYYY / YYYY-MM-DD handling with graceful fallback)
	•	Implemented genre deduplication and normalization
	•	Applied Unicode NFC normalization to prevent duplicate album grouping issues
	•	Integrated normalization into worker pipeline (download → validate → normalize → tag → record)
	•	Added comprehensive unit tests for normalization behavior
	•	Added album-level integration test to verify consistent grouping across multi-track downloads

This change hardens metadata integrity and ensures album downloads remain clean, consistent, and correctly grouped in Apple Music, Jellyfin, Plex, and other media players.
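
Two of the normalization rules above (title artifact cleanup with Unicode NFC, and genre deduplication) can be sketched as follows. The helper names and the exact artifact list are assumptions; the real `normalize_music_metadata` covers more fields.

```python
import re
import unicodedata

# Assumed artifact list: bracketed resolver noise such as "(Official Audio)" or "[HD]".
_ARTIFACTS = re.compile(
    r"\s*[\(\[]\s*(?:official\s+(?:audio|video|music\s+video|lyric\s+video)|hd|hq)\s*[\)\]]",
    re.IGNORECASE,
)


def clean_title(title):
    """Apply NFC normalization, strip known artifacts, and collapse whitespace."""
    title = unicodedata.normalize("NFC", title)
    title = _ARTIFACTS.sub("", title)
    return re.sub(r"\s+", " ", title).strip()


def dedupe_genres(genres):
    """Drop empty and case-insensitive duplicate genres, keeping first-seen order."""
    seen = set()
    result = []
    for genre in genres:
        cleaned = genre.strip()
        key = cleaned.casefold()
        if key and key not in seen:
            seen.add(key)
            result.append(cleaned)
    return result
```

NFC matters for grouping because a decomposed "e" plus combining accent and a precomposed "é" look identical but compare as different strings, which can split one album into two in a media player.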
Imported canonical path helpers in worker.py:
build_music_path, ensure_parent_dir
Marked the music-processing section with:
# === Canonical Path Enforcement Starts Here ===
Updated music job flow to:
derive extension from temp download path
build canonical path from normalized metadata
ensure parent directories exist
move temp file to canonical path
tag file at canonical path
return canonical path in process_job result
Added explicit failure handling:
move failure -> log + {"status": "failed", "file_path": None}
tagging failure -> log + {"status": "failed", "file_path": None}
Ensured idempotency persistence uses final canonical path:
record_downloaded_track(..., file_path=str(canonical_path))
only after successful move + tagging
never on validation_failed or move/tag failure
Added regression coverage:
test_worker_canonical_path.py
verifies returned canonical path, file moved to canonical location, temp file removed
…erwrite semantics

Added export.py with write_m3u(playlist_root, playlist_name, track_paths):
Creates/overwrites {playlist_name}.m3u under playlist root
Ensures playlist root exists
Writes UTF-8 output
Writes paths relative to configured music root via .relative_to()
Skips non-existing tracks and tracks outside music root
Performs atomic overwrite (temp file + replace)
Added explicit playlist name sanitizer:
sanitize_playlist_name(name: str) -> str
Removes invalid filesystem characters <>:"/\\|?*
Collapses whitespace
Strips trailing spaces/dots
Wired into write_m3u
Added test_playlist_export.py:
Verifies file creation
Verifies relative-path entries
Verifies skipping missing files
Verifies clean overwrite behavior after second write
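
The export behavior listed above can be sketched as below. The commit gives the signature `write_m3u(playlist_root, playlist_name, track_paths)` and says the music root comes from configuration; passing it as an explicit `music_root` argument, the return value, and the `"playlist"` fallback name are assumptions made to keep the example self-contained.

```python
import os
import re
import tempfile
from pathlib import Path

_INVALID_CHARS = re.compile(r'[<>:"/\\|?*]')


def sanitize_playlist_name(name: str) -> str:
    """Remove invalid filesystem characters, collapse whitespace, strip trailing spaces/dots."""
    cleaned = _INVALID_CHARS.sub("", name)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return cleaned.rstrip(" .")


def write_m3u(playlist_root, playlist_name, track_paths, music_root):
    """Write {playlist_name}.m3u with paths relative to music_root, atomically."""
    playlist_root = Path(playlist_root)
    music_root = Path(music_root).resolve()
    playlist_root.mkdir(parents=True, exist_ok=True)

    lines = []
    for track in track_paths:
        track = Path(track).resolve()
        if not track.is_file():
            continue  # skip tracks that do not exist
        try:
            lines.append(track.relative_to(music_root).as_posix())
        except ValueError:
            continue  # skip tracks outside the music root

    name = sanitize_playlist_name(playlist_name) or "playlist"  # assumed fallback
    target = playlist_root / f"{name}.m3u"

    # Write to a temp file in the same directory, then replace in one step.
    fd, tmp_name = tempfile.mkstemp(dir=playlist_root, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8", newline="\n") as handle:
            handle.write("".join(line + "\n" for line in lines))
        os.replace(tmp_name, target)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target
```

Creating the temp file inside `playlist_root` keeps it on the same filesystem as the target, which is what lets `os.replace` swap the file in a single step.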
…er Spotify sync

Added rebuild.py with reusable helper:
rebuild_playlist_from_tracks(playlist_name, playlist_root, music_root, track_file_paths)
Rebuilds playlist M3U from canonical absolute DB file paths via write_m3u
Added deterministic rebuild test:
test_playlist_rebuild.py
Verifies existing-only inclusion and relative path output
Integrated post-sync playlist export into watcher:
Updated spotify_playlist_watch.py
After successful snapshot store, loads downloaded canonical file paths from downloaded_music_tracks
Rebuilds M3U using configured playlist/music directories
Logs summary: Playlist M3U updated: {playlist_name} ({count} tracks)
Wrapped rebuild in best-effort error handling so scheduler/watcher flow never crashes on M3U failures
Added watcher integration test:
test_playlist_watcher_m3u.py
Confirms rebuild is called after successful sync and generated M3U contains expected paths
…ompatibility

Added Spotify Liked Songs virtual playlist scaffolding in spotify_playlist_watch.py:
SPOTIFY_LIKED_SONGS_PLAYLIST_ID = "__spotify_liked_songs__"
get_liked_songs_playlist_name() -> "Spotify - Liked Songs"
Added placeholder sync entrypoint:
run_liked_songs_sync()
Includes future OAuth flow docstring and logs:
"Liked Songs sync not enabled (OAuth required)"
Not wired into scheduler yet
Hardened rebuild.py for virtual playlist usage:
Normalizes/cleans playlist_name before calling write_m3u
No playlist-ID assumptions or special-case logic
Added regression test for virtual playlist M3U flow:
test_liked_songs_virtual_playlist.py
Verifies liked-songs name and canonical-path M3U generation work end-to-end
…atibility scaffolding

Added pure intent routing module intent_router.py:
IntentType enum and Intent dataclass
detect_intent(user_input) for deterministic detection of:
Spotify album/playlist/track/artist URLs
YouTube playlist URLs (list= query param)
fallback SEARCH
No network/UI/ingestion coupling
Added intent router tests test_intent_router.py:
Covers Spotify URL variants, YouTube playlist URL, plain search text, and malformed URL fallback
Wired API search handler to intent router in main.py:
POST /api/search/requests now runs detect_intent on raw query
For non-SEARCH intents, returns structured detection response:
{"detected_intent": "<intent>", "identifier": "<id>"}
For SEARCH, existing behavior remains unchanged
Hardened virtual playlist rebuild compatibility:
rebuild.py now normalizes/cleans playlist_name generically with no ID-format assumptions
Added virtual liked-songs playlist regression test:
test_liked_songs_virtual_playlist.py verifies canonical-path M3U generation for "Spotify - Liked Songs"
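
A minimal sketch of the pure intent router described in this commit. The `IntentType` and `Intent` names and the `detect_intent(user_input)` entry point are from the commit; the enum values, the accepted hosts, and the handling of `intl-xx` path prefixes are assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from urllib.parse import parse_qs, urlparse


class IntentType(Enum):
    SPOTIFY_ALBUM = "spotify_album"
    SPOTIFY_PLAYLIST = "spotify_playlist"
    SPOTIFY_TRACK = "spotify_track"
    SPOTIFY_ARTIST = "spotify_artist"
    YOUTUBE_PLAYLIST = "youtube_playlist"  # assumed value
    SEARCH = "search"  # assumed value


@dataclass(frozen=True)
class Intent:
    type: IntentType
    identifier: str


_SPOTIFY_KINDS = {
    "album": IntentType.SPOTIFY_ALBUM,
    "playlist": IntentType.SPOTIFY_PLAYLIST,
    "track": IntentType.SPOTIFY_TRACK,
    "artist": IntentType.SPOTIFY_ARTIST,
}
_YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "music.youtube.com"}


def detect_intent(user_input: str) -> Intent:
    """Classify raw input without any network, UI, or ingestion side effects."""
    text = (user_input or "").strip()
    try:
        parsed = urlparse(text)
        host = (parsed.hostname or "").lower()
    except ValueError:
        # Malformed URL: fall back to plain search.
        return Intent(IntentType.SEARCH, text)

    if parsed.scheme in ("http", "https"):
        if host == "open.spotify.com":
            parts = [p for p in parsed.path.split("/") if p]
            if parts and parts[0].startswith("intl-"):
                parts = parts[1:]  # drop a locale prefix such as /intl-de/
            if len(parts) >= 2 and parts[0] in _SPOTIFY_KINDS:
                return Intent(_SPOTIFY_KINDS[parts[0]], parts[1])
        elif host in _YOUTUBE_HOSTS:
            list_id = parse_qs(parsed.query).get("list", [""])[0]
            if list_id:
                return Intent(IntentType.YOUTUBE_PLAYLIST, list_id)
    return Intent(IntentType.SEARCH, text)
```

With this shape, the search handler can return `{"detected_intent": intent.type.value, "identifier": intent.identifier}` for any non-SEARCH result and leave the existing search path untouched.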
…-gated confirmation UI

Added homepage intent-routing integration end-to-end:
Search API now returns intent detection payload for non-search inputs
Frontend handles detected intents with dedicated confirmation state
Added backend execution plumbing:
New endpoint POST /api/intent/execute
Validates intent_type and identifier
Returns deterministic acceptance payload (no enqueue/ingestion yet)
Added deterministic endpoint tests scaffold:
test_intent_execute_endpoint.py
Valid intent returns accepted payload
Invalid intent_type returns 400
Added Spotify intent preview endpoint:
New POST /api/intent/preview
Supports spotify_album and spotify_playlist
Fetches preview metadata (title, artist, track_count) from Spotify API
No ingestion side effects
Upgraded homepage confirmation card flow:
For Spotify album/playlist intents:
Fetch metadata first
Render Title / Artist / Track count
Show Confirm Download only after successful preview
Show error state on preview failure
Cancel returns user to search state
Confirm Download still calls /api/intent/execute only
…h album/playlist parity

Added intent_dispatcher.py as thin execution router using existing ingestion patterns:
spotify_playlist routes to existing playlist_watch_job sync flow
spotify_album routes to new run_spotify_album_sync(...) orchestration
spotify_track reuses enqueue_spotify_track(...)
spotify_artist returns accepted response requiring user selection
Implemented run_spotify_album_sync(...):
Fetches ordered album tracks from Spotify public API
Enqueues via existing enqueue/metadata pipeline
Best-effort M3U rebuild for album:
name format: Spotify - Album - {Artist} - {Album}
uses canonical downloaded paths from DB + existing rebuild.py
Returns deterministic summary with enqueued_count
Updated POST /api/intent/execute in main.py:
Delegates to dispatcher instead of static accepted response
Passes dependencies from app state (config/db/queue/search_service/spotify client)
Keeps /api/intent/preview unchanged
Keeps /api/search/requests behavior unchanged
Added deterministic dispatcher routing tests:
test_intent_dispatcher.py
verifies artist/playlist/album/track branches with monkeypatched spies (no network calls)
Added API-level intent execute delegation tests:
test_api_intent_execute.py
verifies endpoint returns mocked dispatcher payload and invalid intent validation (400)
…le tests

Added oauth_store.py:
SpotifyOAuthToken dataclass (access_token, refresh_token, expires_at, scope)
SpotifyOAuthStore with single-row persistence (id=1)
table bootstrap via _ensure_table()
save() upsert semantics
load() returns token or None
clear() deletes stored token
Uses direct sqlite3; no encryption and no network/OAuth calls
Added deterministic unit test coverage:
test_spotify_oauth_store.py
Verifies save/load equality, overwrite behavior, and clear -> None behavior using tmp_path DB
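
The single-row store can be sketched like this. Class, field, and method names follow the commit; the table name, column types, and the choice of `INSERT OR REPLACE` for the upsert are assumptions. As the commit notes, nothing here is encrypted.

```python
import sqlite3
from contextlib import closing
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpotifyOAuthToken:
    access_token: str
    refresh_token: str
    expires_at: float  # assumed to be Unix epoch seconds
    scope: str


class SpotifyOAuthStore:
    """Persist at most one token row (id=1) in a SQLite file."""

    def __init__(self, db_path):
        self.db_path = str(db_path)
        self._ensure_table()

    def _ensure_table(self):
        with closing(sqlite3.connect(self.db_path)) as conn, conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS spotify_oauth_token ("
                "id INTEGER PRIMARY KEY CHECK (id = 1), "
                "access_token TEXT NOT NULL, refresh_token TEXT NOT NULL, "
                "expires_at REAL NOT NULL, scope TEXT NOT NULL)"
            )

    def save(self, token: SpotifyOAuthToken) -> None:
        # Replacing row id=1 gives upsert semantics for the single row.
        with closing(sqlite3.connect(self.db_path)) as conn, conn:
            conn.execute(
                "INSERT OR REPLACE INTO spotify_oauth_token "
                "(id, access_token, refresh_token, expires_at, scope) "
                "VALUES (1, ?, ?, ?, ?)",
                (token.access_token, token.refresh_token, token.expires_at, token.scope),
            )

    def load(self) -> Optional[SpotifyOAuthToken]:
        with closing(sqlite3.connect(self.db_path)) as conn:
            row = conn.execute(
                "SELECT access_token, refresh_token, expires_at, scope "
                "FROM spotify_oauth_token WHERE id = 1"
            ).fetchone()
        return SpotifyOAuthToken(*row) if row else None

    def clear(self) -> None:
        with closing(sqlite3.connect(self.db_path)) as conn, conn:
            conn.execute("DELETE FROM spotify_oauth_token WHERE id = 1")
```

The `CHECK (id = 1)` constraint is one way to make the single-row invariant hold at the database level rather than only by convention.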
…terministic refresh tests

Extended oauth_client.py:
Added refresh_access_token(client_id, client_secret, refresh_token) -> dict
Performs token refresh request to Spotify token endpoint
Raises on non-200 responses
Returns parsed JSON payload on success
Extended oauth_store.py:
Added get_valid_token(client_id, client_secret)
Loads stored token, checks expiration, refreshes when expired, saves updated token
Clears token and returns None if refresh fails
Added deterministic test coverage in test_spotify_oauth_refresh.py:
not-expired token returns unchanged token
expired token refreshes and persists updated values
refresh failure clears token and returns None
all refresh behavior mocked via monkeypatch (no Spotify network calls)
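
The refresh flow can be sketched as a free function with the network call injected. In the commit, `get_valid_token(client_id, client_secret)` is a method on the store; the `store`/`refresh` parameters, the `now` argument, and the 30-second `leeway` here are assumptions that keep the example testable without Spotify.

```python
import time
from dataclasses import dataclass, replace


@dataclass
class SpotifyOAuthToken:
    access_token: str
    refresh_token: str
    expires_at: float  # assumed to be Unix epoch seconds
    scope: str


def get_valid_token(store, refresh, now=None, leeway=30.0):
    """Return a usable token, refreshing and persisting it when expired.

    store   -- any object with load(), save(token), clear()
    refresh -- callable(refresh_token) -> dict shaped like a token-endpoint response
    """
    token = store.load()
    if token is None:
        return None

    now = time.time() if now is None else now
    if token.expires_at - leeway > now:
        return token  # still valid: return unchanged

    try:
        payload = refresh(token.refresh_token)
        refreshed = replace(
            token,
            access_token=payload["access_token"],
            # The response may omit refresh_token; keep the old one then.
            refresh_token=payload.get("refresh_token", token.refresh_token),
            expires_at=now + float(payload["expires_in"]),
            scope=payload.get("scope", token.scope),
        )
    except Exception:
        # Refresh failed: drop the stored token so callers fall back cleanly.
        store.clear()
        return None

    store.save(refreshed)
    return refreshed
```

Injecting `refresh` mirrors how the tests above mock the token exchange with monkeypatch instead of calling Spotify.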
…e alerting

Added refresh_access_token(...) in oauth_client.py for refresh-token exchange against Spotify token endpoint.
Extended SpotifyOAuthStore.get_valid_token(...) in oauth_store.py to refresh expired tokens, persist updates, clear invalid tokens, and send best-effort Telegram notification on refresh failure.
Updated Spotify client bootstrap path in main.py to prefer valid OAuth access tokens when available and gracefully fall back to public/client-credentials mode when unavailable.
Added OAuth injection coverage in test_spotify_oauth_injection.py to verify client construction with and without access_token via monkeypatched token store behavior.
…b, and deterministic tests

Added SpotifyPlaylistClient.get_saved_albums() in client.py:

OAuth-required /v1/me/albums fetch with pagination.
Per-album track expansion via album endpoint + track pagination.
Normalized album + ordered track payloads for album-sync compatibility.
Deterministic snapshot hash from ordered album IDs.
Added spotify_saved_albums_watch_job(...) in spotify_playlist_watch.py:

Validates OAuth token before sync.
Loads saved albums snapshot, diffs by album IDs, and triggers run_spotify_album_sync(...) only for newly added albums.
Persists snapshot under __spotify_saved_albums__.
Best-effort M3U rebuild as "Spotify - Saved Albums".
Non-destructive behavior on removals (no local file deletions).
Extended scheduler wiring in main.py:

Added periodic job spotify_saved_albums_watch.
Default interval 30 minutes (configurable).
Silent skip when no valid OAuth token.
Auto-activates immediately after successful OAuth callback.
Preserves existing playlist and liked-songs scheduling behavior.
Added network-free tests in test_spotify_saved_albums_sync.py:

Verifies only new albums trigger album sync when token is valid.
Verifies clean skip path when OAuth token is unavailable.
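
A minimal sketch of the two building blocks in this commit: a deterministic hash over ordered album IDs and a diff that yields only newly added IDs. The function names and the choice of SHA-256 are assumptions.

```python
import hashlib


def snapshot_hash(ordered_ids):
    """Hash an ordered sequence of IDs; same IDs in the same order give the same digest."""
    digest = hashlib.sha256()
    for item_id in ordered_ids:
        digest.update(item_id.encode("utf-8"))
        digest.update(b"\n")  # separator so ["ab", "c"] and ["a", "bc"] differ
    return digest.hexdigest()


def newly_added(previous_ids, current_ids):
    """Return IDs present now but not in the previous snapshot, in current order."""
    seen = set(previous_ids)
    return [item_id for item_id in current_ids if item_id not in seen]
```

Removed IDs are deliberately ignored by `newly_added`, which matches the non-destructive behavior on removals described above: nothing is deleted locally when an album leaves the library.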
…eduler + tests

Added SpotifyPlaylistClient.get_user_playlists() in client.py:

OAuth-required fetch from /v1/me/playlists with pagination.
Normalized output (id, name, track_count).
Deterministic snapshot hash from ordered playlist IDs.
Added spotify_user_playlists_watch_job(...) in spotify_playlist_watch.py:

Validates OAuth token.
Loads current user playlists and diffs against snapshot __spotify_user_playlists__.
Triggers existing playlist_watch_job only for newly added playlist IDs.
Stores updated snapshot.
Non-destructive on removals (no local file deletion).
Wired periodic scheduler support in main.py:

New job id spotify_user_playlists_watch.
Default interval 30 minutes (configurable).
Silent skip on missing/invalid OAuth token.
Auto-activation after successful OAuth callback.
Reapplies on startup/config/schedule updates.
Added network-free tests in test_spotify_user_playlists_sync.py:

Valid token path: only new playlists trigger sync.
Missing token path: clean skip with no sync calls.
… Config/Status pages

Added new Spotify Integration card to Config page with:
OAuth connection status display
Connect/Disconnect actions
Sync toggles for Liked Songs, Saved Albums, and My Playlists
Interval inputs for each Spotify sync stream
Wired app.js config logic:
Added refreshSpotifyConfig() to load config + OAuth status and update Spotify controls
Hooked refresh into config page navigation flow
Added connect/disconnect button handlers with user notices
Extended config save payload to persist Spotify sync flags and interval settings
Added new Spotify Sync Status card to Status page with fields for:
OAuth state
Last liked songs sync
Last saved albums sync
Last playlists sync
Extended refreshStatus() to best-effort fetch /api/spotify/status and update Spotify status fields without breaking existing status polling when endpoint is unavailable.
Updated homepage Music Mode label text to:
Music Mode (Spotify metadata, album structure, validation)
Updated intent confirmation button text to be contextual:
Album/Playlist/Track-specific labels with fallback to Confirm Download.
…settings handling

Added Spotify OAuth completion event flow in app.js:
Dispatches spotify-oauth-complete when OAuth transitions to connected.
Global listener shows home notice: “Spotify connected successfully. Initial sync has started.”
Extended Status page Spotify runtime rendering:
refreshStatus() now supports liked_sync_running, saved_sync_running, and playlists_sync_running.
Displays Running... with running class while active; otherwise shows timestamps.
Improved Config page Spotify state behavior:
In refreshSpotifyConfig(), disables sync toggles/intervals when disconnected and shows helper text.
Re-enables controls when connected.
Applies running class to #spotify-connection-status when connected.
Added home result metadata badge:
renderHomeResultItem() now shows Spotify Metadata tag for media_type === "music".
Hardened Spotify interval save logic in buildConfigFromForm():
Clamps values to minimum 1.
Defaults to 30 when input is NaN.
Prevents zero/negative interval persistence.
…uth lifecycle plumbing

Added Spotify OAuth callback redirect to UI config route:

/api/spotify/oauth/callback now returns 302 to /#config?spotify=connected on success.
Added Spotify OAuth API endpoints:

GET /api/spotify/oauth/status (always 200; returns connected + optional scopes/expires_at)
POST /api/spotify/oauth/disconnect (clears token store; returns disconnected)
Hardened OAuth connect flow server-side:

Clears stale app.state.spotify_oauth_state before generating fresh random state.
Extended Config UI for Spotify credentials:

Added fields for client_id, client_secret, and read-only redirect_uri.
Wired load/save to config.spotify.* in app.js.
Redirect URI now auto-populates from config or falls back to:
${window.location.protocol}//${window.location.host}/api/spotify/oauth/callback.
Improved Spotify Config UX state handling:

Disabled sync toggles/interval inputs when OAuth is disconnected; re-enabled when connected.
Added helper text when disconnected: “Connect Spotify to enable sync options.”
Added running-style class toggle for connection status.
Added post-OAuth one-time UI refresh/notice logic:

Detects #config?spotify=connected, clears hash flag once, refreshes Spotify state, and shows success notice.
Added custom event dispatch/listener for OAuth completion notice messaging.
Extended Status page Spotify runtime display:

Added sync running state handling (liked/saved/playlists *_running):
Shows Running... and applies running class while active.
Falls back to timestamps otherwise.
Improved Home page behavior and labels:

Updated music mode label to reflect metadata pipeline.
Added contextual intent confirm button text (Download Album/Playlist/Track).
Added “Spotify Metadata” source tag for music result headers.
Fixed home search control recovery on failure paths:

Re-enables controls after Spotify preview failures and direct URL rejection paths.
Reworked Home Spotify URL routing priority:

Added early intent detection (detect_intent) at start of home search flow.
Spotify URLs now route to intent preview flow before any direct/generic URL handling.
Preserved non-Spotify direct URL behavior.
Relaxed backend config save behavior:

Config save no longer blocks on blank Spotify client credentials.
OAuth credential validation remains enforced at connect/callback time only.
…arden sync UX/logging

Added Spotify playlist identifier normalization and validation across config save and scheduler apply paths.
Scheduler now reads config.spotify.watch_playlists, skips invalid entries, and logs warnings instead of failing.
Ensured scheduled sync iterates normalized IDs only and logs each playlist being fetched/synced.
Added explicit watcher log at job start: Fetching Spotify playlist {playlist_id}.
Preserved per-playlist error isolation so one failing playlist does not block others.
Wired playlist textarea persistence in UI (config-spotify-playlists) through load/save config flow.
Added pre-save frontend validation for playlist entries with early error notice on invalid IDs.
Updated post-save notice to confirm playlist mapping update: Spotify playlists updated.
Kept backward compatibility with legacy watch_playlists fallback while prioritizing spotify.watch_playlists.
No ingestion behavior changes; focused on mapping correctness, operator visibility, and resilience.
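
The normalize-and-skip behavior can be sketched as below. The accepted input forms (raw ID, `spotify:playlist:` URI, `open.spotify.com` URL), the 22-character base-62 ID pattern, and both function names are assumptions; the commit only states that invalid entries are skipped with a warning rather than failing the scheduler.

```python
import logging
import re
from urllib.parse import urlparse

logger = logging.getLogger(__name__)

# Assumption: Spotify playlist IDs are 22-character base-62 strings.
_PLAYLIST_ID = re.compile(r"^[A-Za-z0-9]{22}$")


def normalize_playlist_id(value):
    """Return a bare playlist ID, or None when the entry is not recognizable."""
    text = (value or "").strip()
    if text.startswith("spotify:playlist:"):
        text = text.split(":", 2)[2]
    elif text.startswith(("http://", "https://")):
        try:
            parsed = urlparse(text)
            host = parsed.hostname
        except ValueError:
            return None
        parts = [p for p in parsed.path.split("/") if p]
        if host != "open.spotify.com" or "playlist" not in parts:
            return None
        index = parts.index("playlist")
        if index + 1 >= len(parts):
            return None
        text = parts[index + 1]
    return text if _PLAYLIST_ID.match(text) else None


def normalize_watch_playlists(entries):
    """Normalize entries, skipping invalid ones with a warning and dropping duplicates."""
    valid = []
    for entry in entries:
        playlist_id = normalize_playlist_id(entry)
        if playlist_id is None:
            logger.warning("Skipping invalid Spotify playlist entry: %r", entry)
        elif playlist_id not in valid:
            valid.append(playlist_id)
    return valid
```

Running the same normalizer at config save and at scheduler apply time keeps the two paths from disagreeing about which entries are valid.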
…rer interval/job completion logs

Added new async wrapper spotify_playlists_watch_job(config, db, queue, spotify_client, search_service, ignore_downtime=False) to centralize configured Spotify playlist sync execution.
Implemented downtime-aware guard in Spotify playlist scheduler path with override support:
Default scheduled behavior respects downtime.
ignore_downtime=True bypasses downtime for manual-triggered flows.
Added explicit suppression logging before API work when downtime is active:
Spotify playlist sync waiting for downtime to end (downtime START->END).
Added interval tick visibility in scheduler driver:
Logs when Spotify interval tick is skipped due to downtime.
Logs when Spotify interval tick starts.
Added batch completion logging for playlist scheduler wrapper:
Spotify playlist sync completed: X/Y playlists processed.
Extended manual archive run flow to trigger Spotify playlist sync immediately after archive completion (when spotify.sync_user_playlists is enabled), with downtime override and final completion logs:
Manual run triggering Spotify playlist sync (override downtime)
Manual-run Spotify playlist sync completed
Manual run completed (archive + Spotify playlist sync)
Preserved existing archive pipeline and error isolation; Spotify sync failures are logged without breaking archive completion semantics.
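
The downtime guard with its manual override can be sketched as two small functions. Representing the window as `datetime.time` values and the function names are assumptions; only the `ignore_downtime` flag is taken from the commit.

```python
from datetime import time


def in_downtime(now, start, end):
    """Return True when `now` falls inside the [start, end) downtime window."""
    if start == end:
        return False  # assumed: an empty window means no downtime
    if start < end:
        return start <= now < end
    # Window crosses midnight, e.g. 22:00 -> 06:00.
    return now >= start or now < end


def should_run_sync(now, start, end, ignore_downtime=False):
    """Scheduled ticks respect downtime; manual runs pass ignore_downtime=True."""
    return ignore_downtime or not in_downtime(now, start, end)
```

For example, with a 22:00 to 06:00 window a scheduled tick at 23:30 is suppressed, while a manual archive run at the same moment passes `ignore_downtime=True` and proceeds.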
…/completion logs, and enable manual-run Spotify sync override

Centralized Spotify scheduler registration behind _apply_spotify_schedule(config) and replaced all legacy apply call sites so disabled flags no longer leave stale jobs scheduled.
Removed legacy per-job scheduler apply functions (_apply_liked_songs_schedule, _apply_saved_albums_schedule, _apply_user_playlists_schedule, _apply_spotify_playlists_schedule) to prevent accidental use.
Added consistent downtime suppression logging across Spotify sync jobs with explicit window message:
Spotify sync waiting for downtime to end (downtime START -> END).
Added successful completion logs so syncs don’t appear to hang:
liked songs, saved albums, user playlists, playlist polling, and polling batch wrapper.
Extended manual run flow to trigger Spotify playlist polling immediately after archive completion with downtime override (ignore_downtime=True) and added clear manual-run completion logs:
trigger started, sync completed, and final combined completion (archive + Spotify sync).
…ss UI + API

Move Home #home-music-mode toggle out of Advanced panel into a visible Home search row, preserving existing ID/behavior and label text.
Add Home Music Mode state in app.js:
persist in localStorage (retreivr.home.music_mode)
restore on init and sync checkbox UI
re-run Home search (search-only refresh) when toggled with a non-empty query
include music_mode from state in Home search payloads
show subtle Music Mode chip in Home results header when enabled
Harden backend search request plumbing in main.py:
add music_mode: bool = False to SearchRequestPayload
log one DEBUG line per Home search request with music_mode + query
accept/pass through music_mode safely and echo it in /api/search/requests responses
No ranking/source/download behavior changes introduced.
…, and music_track worker support

Home UI:

Moved #home-music-mode to visible Home search area (no ID change, no duplication).
Promoted Music Mode to first-class Home flag with persisted state:
added state.homeMusicMode
localStorage key: retreivr.home.music_mode
auto-refresh Home search (search-only) on toggle when query is present
added Home results “Music Mode” chip using existing chip styles
Added album action UI:
when Home search response includes music_mode=true and music_resolution.type==="album", render “Download Full Album” button
button posts to /api/music/album/download.
Backend API:

Added resolver.py:
resolve_album(query) via MusicBrainz release search
fetch_album_tracks(album_id) via MusicBrainz release recordings
/api/search/requests:
explicit music_mode: bool = False in SearchRequestPayload
DEBUG logging per request: Home search: music_mode=<bool> query=<...>
safe round-trip handling and response echo of music_mode
added additive music_resolution response field (resolved via resolve_album when music_mode enabled; None on miss/failure)
Added new endpoint:
POST /api/music/album/download
validates album_id, fetches tracks, enqueues each as media_intent: "music_track" using existing queue adapter, returns tracks_enqueued.
Worker pipeline:

Added music_track branch in DownloadWorkerEngine:
builds search query from metadata (artist + track + "audio")
resolves candidate via search_service.search_best_match when available, with adapter+scoring fallback
reuses existing adapter download/postprocessing/metadata pipeline (no bypass)
preserves existing retry semantics
Wired search_service into worker initialization.
Scope/safety:

No new DB tables.
No ranking/source-selection changes for existing search flows beyond additive music_track resolution branch.
No duplicate postprocessing logic introduced.
…d UI button robustness

API music resolution/album enqueue auditing:

Added [MUSIC] INFO log in /api/search/requests showing music_resolution + query.
Added [MUSIC] logs in /api/music/album/download:
fetched track count
per-track enqueue debug line
enqueue completion count.
Kept music_resolution explicitly present in response shape (dict | None) with music_mode.
Added explicit track_number int coercion in album enqueue payload while preserving existing behavior.
Added 404 guard for empty album-track results:
Album resolved but no tracks returned from MusicBrainz.
Home UI robustness:

Hardened album button render path:
strict header lookup with warning if missing
duplicate button removal before append
standardized button id/class (home-download-album-btn, btn primary small)
debug log before rendering.
Worker diagnostics and intent handling:

Added [WORKER] debug log of incoming job payload keys.
Hardened media intent lookup to support both top-level and nested payload:
intent = media_intent || payload.media_intent.
Added [WORKER] info log when processing music_track.
Added final output confirmation log:
[MUSIC] finalized file: <full_path> after atomic move.
Scope:

No business-logic or pipeline flow changes; additive observability and defensive checks only.
…sic_track worker hardening

MusicBrainz resolver upgrades:

Replaced free-text release search with structured release-group search.
Added search_albums(query) using Lucene fields:
artist:"..."
primarytype:"Album"
releasegroup:"..."
Added candidate parsing/normalization (album_id, title, artist, first_released, track_count) and top-5 cap.
Added stopword cleanup before structured query build (album, full, official, audio, music, etc.) to improve matching quality.
Added resolver logging for candidate count per query.
Music API exposure:

Added POST /api/music/album/candidates returning {status, album_candidates}.
Added GET /api/music/album/art/{album_id} proxy to Cover Art Archive with in-memory TTL caching and safe fallback (cover_url: null).
Home UI music candidates:

Removed old single auto album button logic.
Added candidate-section rendering under Home results header:
#home-album-candidates
per-card fields for title/artist/date
per-card Download Full Album button (.album-download-btn) posting to /api/music/album/download.
Added candidate lifecycle handling tied to terminal Home search state.
Ensured cleanup/reset to prevent duplicate candidate sections across searches.
Album art in UI:

Added cover-art fetch for each candidate via backend proxy endpoint.
Added per-album client cache and staggered fetch timing to reduce request bursts.
Graceful fallback to plain card when no art is available.
Worker music_track branch:

Hardened branch logging and observability:
logs incoming job payload keys
logs processing line with artist/track
logs finalized output file full path.
Improved intent lookup compatibility:
checks top-level and nested payload media intent.
Updated music-track query composition to:
"<artist> <track> album:<album> audio"
Added relaxed threshold usage attempt for search_service.search_best_match with compatibility fallback.
Kept existing adapter/download/postprocessing pipeline intact.
API album download audit enhancements:

Added [MUSIC] logs for fetched track count, per-track enqueue, and completion count.
Ensured enqueue payload consistently includes:
media_intent, artist, album, track, track_number.
Added explicit 404 when album resolves but returns zero tracks.
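
A minimal sketch of the structured release-group query described above. The three Lucene fields and the stopword idea are from the commit; the function names, the stopword set (the commit's list ends in "etc."), and the split of the user query into separate artist and album arguments are assumptions.

```python
import re

# Assumed subset of the stopword list named in the commit.
_STOPWORDS = {"album", "full", "official", "audio", "music"}


def strip_stopwords(text):
    """Drop noise words that hurt structured matching."""
    tokens = [t for t in re.split(r"\s+", text.strip()) if t]
    return " ".join(t for t in tokens if t.casefold() not in _STOPWORDS)


def _quote(value):
    # Inside a quoted Lucene phrase, backslashes and double quotes must be escaped.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_release_group_query(artist, album):
    """Build a Lucene query restricted to album-type release groups."""
    clauses = []
    artist = strip_stopwords(artist)
    album = strip_stopwords(album)
    if artist:
        clauses.append(f"artist:{_quote(artist)}")
    clauses.append('primarytype:"Album"')
    if album:
        clauses.append(f"releasegroup:{_quote(album)}")
    return " AND ".join(clauses)
```

The resulting string would be sent as the `query` parameter of a MusicBrainz release-group search, and the caller would keep at most the top five parsed candidates as described above.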
…g, and legacy cleanup

Move Home Music Mode toggle into visible Home search controls and persist state in localStorage.
Include music_mode in Home search payloads and plumb it through backend request/response safely.
Add MusicBrainz integration for album candidates, release-group resolution, release track fetching, and album enqueue endpoint.
Improve music_track worker resolution with music-specific query composition, source prioritization, scoring adjustments, threshold logging, and retryable no-match failures.
Add MusicBrainz client polish (rate limiting, retries, cache, docs) and remove legacy auto-album resolution/single-button paths; enforce explicit candidate selection for album downloads.
…hting, normalization, and debug visibility

Build richer music_track queries with quoted artist/track/album plus music-specific tokens.
Prioritize music sources (youtube_music, youtube, soundcloud, bandcamp) with explicit source-weight boosts and logging.
Add title-token match scoring boosts (exact + partial overlap) with candidate-level debug logs.
Add duration similarity bonuses (±3s/±8s/±15s) when duration metadata is available, without penalizing missing duration.
Add studio-track keyword penalties (live, cover, karaoke, remix, etc.) with detailed penalty logging.
Relax music_track match threshold override and emit top-5 candidate logs on no-match while preserving retryable failure behavior.
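
The scoring adjustments in this commit can be sketched as additive bonuses and penalties. The duration bands (±3s/±8s/±15s), the source order, and the penalty keywords come from the commit; every numeric weight, the candidate dict shape, and the function names are hypothetical.

```python
import re

# Hypothetical weights: only the ordering of sources is taken from the commit.
_SOURCE_WEIGHTS = {"youtube_music": 10, "youtube": 6, "soundcloud": 4, "bandcamp": 4}
_PENALTY_WORDS = ("live", "cover", "karaoke", "remix")


def _tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.casefold()))


def duration_bonus(candidate_seconds, expected_seconds):
    """Reward close durations; missing duration is neutral, never penalized."""
    if candidate_seconds is None or expected_seconds is None:
        return 0
    delta = abs(candidate_seconds - expected_seconds)
    if delta <= 3:
        return 15
    if delta <= 8:
        return 10
    if delta <= 15:
        return 5
    return 0


def score_candidate(candidate, wanted):
    """Score a search candidate against the wanted studio track."""
    wanted_tokens = _tokens(wanted["track"])
    title_tokens = _tokens(candidate["title"])

    score = _SOURCE_WEIGHTS.get(candidate.get("source"), 0)
    score += duration_bonus(candidate.get("duration"), wanted.get("duration"))

    # Title-token boost: full containment beats partial overlap.
    overlap = wanted_tokens & title_tokens
    if wanted_tokens and overlap == wanted_tokens:
        score += 20
    elif overlap:
        score += 10 * len(overlap) / len(wanted_tokens)

    # Penalize non-studio keywords unless the wanted title itself contains them.
    for word in _PENALTY_WORDS:
        if word in title_tokens and word not in wanted_tokens:
            score -= 20
    return score
```

Skipping the penalty when the keyword is part of the wanted title avoids punishing tracks that are legitimately named, for example, "Live Forever".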
New metrics fields added
get_metrics() now returns:

total_requests
cache_hits
cache_misses
retries
cover_art_requests
cover_art_failures
What was added (without changing return schemas)
Internal counters with thread-safe increments.
get_metrics() method returning a snapshot dict of counters.
Debug logging toggle via:
constructor arg: MusicBrainzService(debug=True/False)
env var: RETREIVR_MUSICBRAINZ_DEBUG=1|true|yes|on
Debug logs emitted on:
cache hit
cache miss
retry/backoff
rate-limit sleep
CAA fetch attempts
…nt ingestion, remove legacy run UI

Centralized MusicBrainz usage through musicbrainz_service.py and removed duplicate app-level MB client/service paths.
Switched canonical metadata resolution to MusicBrainz-first, with Spotify fallback only when OAuth token is present and premium validation succeeds.
Unified album candidate search behind one canonical implementation and kept both endpoints backward-compatible:
GET /api/music/albums/search (canonical)
POST /api/music/album/candidates (compat wrapper)
Reworked _IntentQueueAdapter to enqueue directly into DownloadJobStore and convert watch/intents into explicit jobs (including music_track query jobs).
Eliminated dead intent_dispatch_queue behavior; intent execution now routes into active ingestion pipeline.
Removed legacy-run UI section from index.html and deleted corresponding #run-* listeners/helpers in app.js.
Updated status/notice bindings to active Home UI elements to avoid removed DOM targets.
Updated docs (README.md, musicbrainz.md) to reflect MB-first authority, endpoint canonicalization, and compatibility behavior.
Added/updated tests for canonical resolver behavior, album endpoint compatibility, intent-to-download conversion, and webUI smoke checks.
…ke, and DB migration

add test_integration_full_music_flow.py for canonical search -> enqueue -> mock download -> metadata embed -> history persist
add test_integration_spotify_intent.py for Spotify intent ingestion with OAuth+Premium gating and MB-first resolution
add test_integration_youtube_download.py for direct YouTube flow, canonical naming, collision suffixes, and history fields
add test_canonical_resolver_behavior.py for MusicBrainz-first resolver behavior and Spotify fallback gating
replace/add test_webui_smoke.py Playwright smoke test for Home search/download interactions and no legacy-run JS errors
add test_db_migration_channel_id.py to validate download_history.channel_id migration, legacy row preservation, and new inserts
…hot correctness and idempotency, without touching metadata resolution or download adapters.
…otency path

skip snapshot persistence when normalized snapshot hash is unchanged (even if snapshot_id differs)
keep deterministic normalized snapshot rows and batch executemany inserts
add DB indexes for duplicate detection queries on download_jobs (canonical_id/url + destination + status + created_at)
eliminate MusicBrainz per-candidate release lookup N+1 by deferring track-number release lookup to best candidate only
add scheduler run-duration logs for playlist watch jobs
harden tests and add coverage for:
hash-based snapshot no-op writes
duplicate-detection index presence
MusicBrainz provider lookup efficiency
temp-path isolation for playlist watch tests to avoid repo-root Music/ side effects
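
The hash-based no-op write can be sketched as follows. The table names `playlist_snapshots` and `playlist_snapshot_items` appear earlier in this PR; the columns, the `content_hash` field, and the function names are assumptions.

```python
import hashlib
import sqlite3


def ensure_snapshot_tables(conn):
    """Hypothetical bootstrap; column layout is assumed for illustration."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS playlist_snapshots ("
        "id INTEGER PRIMARY KEY, playlist_id TEXT NOT NULL, "
        "snapshot_id TEXT NOT NULL, content_hash TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS playlist_snapshot_items ("
        "snapshot_pk INTEGER NOT NULL, position INTEGER NOT NULL, "
        "track_id TEXT NOT NULL, UNIQUE (snapshot_pk, position))"
    )


def normalized_hash(track_ids):
    """Hash the ordered, normalized rows rather than trusting snapshot_id."""
    digest = hashlib.sha256()
    for position, track_id in enumerate(track_ids):
        digest.update(f"{position}:{track_id}\n".encode("utf-8"))
    return digest.hexdigest()


def store_snapshot(conn, playlist_id, snapshot_id, track_ids):
    """Persist a snapshot; return False (no write) when content is unchanged."""
    new_hash = normalized_hash(track_ids)
    row = conn.execute(
        "SELECT content_hash FROM playlist_snapshots "
        "WHERE playlist_id = ? ORDER BY id DESC LIMIT 1",
        (playlist_id,),
    ).fetchone()
    if row and row[0] == new_hash:
        return False  # same content even if snapshot_id differs

    with conn:  # one transaction for the header row and the batch insert
        cursor = conn.execute(
            "INSERT INTO playlist_snapshots (playlist_id, snapshot_id, content_hash) "
            "VALUES (?, ?, ?)",
            (playlist_id, snapshot_id, new_hash),
        )
        conn.executemany(
            "INSERT INTO playlist_snapshot_items (snapshot_pk, position, track_id) "
            "VALUES (?, ?, ?)",
            [(cursor.lastrowid, pos, tid) for pos, tid in enumerate(track_ids)],
        )
    return True
```

Comparing content hashes instead of `snapshot_id` means a changed `snapshot_id` with identical tracks does not trigger a rewrite or a new round of enqueues.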
change runtime fallback final_format defaults from webm to mkv in API/worker paths
update direct URL self-test default override to mkv
update UI format labels/options to show mkv as default
refresh README default format documentation accordingly
@z3ro-2 z3ro-2 merged commit 84659f5 into main Feb 18, 2026