Context
PR #108 added early duplicate detection to /ingest_youtube (#107). When a video URL is re-submitted, the endpoint now returns status: "duplicate" immediately without calling yt-dlp.
Problem
The old behavior intentionally refreshed metadata (view counts, publish date, engagement rate) when re-ingesting an existing video (see kb_server.py around the update_metadata call). The new early-exit bypasses this path entirely.
A user who re-submits a URL to refresh stale metadata will now get status: "duplicate" and the stored metadata is left unchanged.
Fix Options
- Preserve early-exit + add refresh flag: Accept a refresh=true query param to force the full pipeline for known duplicates
- Split the concern: Keep fast duplicate detection as default, add a separate PATCH /source/{id}/refresh endpoint
- Remove early-exit: Go back through the full yt-dlp pipeline for duplicates (slower but preserves old behavior)
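A minimal sketch of the first option, to make the control flow concrete. It is not taken from kb_server.py: the in-memory store, the fetch_metadata callable (standing in for the yt-dlp call), the "ingested" status, and the "refreshed" response field are all assumptions made for illustration.

```python
from typing import Callable, Dict

# Hypothetical in-memory store mapping video URL -> metadata dict.
# The real service presumably uses a database; this is illustration only.
Store = Dict[str, dict]


def ingest_youtube(
    url: str,
    store: Store,
    fetch_metadata: Callable[[str], dict],
    refresh: bool = False,
) -> dict:
    """Keep the fast duplicate exit by default; run the full pipeline on refresh=True."""
    known = url in store

    if known and not refresh:
        # Fast path: known video, no refresh requested, so no metadata fetch.
        return {"status": "duplicate", "refreshed": False}

    # Slow path: new video, or a known video with refresh requested.
    # fetch_metadata stands in for the yt-dlp call (assumption).
    store[url] = fetch_metadata(url)

    if known:
        # Still reported as a duplicate, but the stored metadata is now fresh.
        return {"status": "duplicate", "refreshed": True}
    return {"status": "ingested", "refreshed": False}


# Usage: count how often the (fake) metadata fetch actually runs.
calls = []


def fake_fetch(url: str) -> dict:
    calls.append(url)
    return {"view_count": 100 * len(calls)}


store: Store = {}
url = "https://www.youtube.com/watch?v=example"
first = ingest_youtube(url, store, fake_fetch)                 # new video: fetch runs
second = ingest_youtube(url, store, fake_fetch)                # duplicate: no fetch
third = ingest_youtube(url, store, fake_fetch, refresh=True)   # duplicate: fetch runs
```

Under these assumptions the default path never reaches the fetch for a known URL, while refresh=True re-runs it and still reports status: "duplicate", which would satisfy all three acceptance criteria with a single endpoint.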
Acceptance Criteria
- Re-ingesting an existing YouTube URL can refresh metadata when desired
- Default duplicate detection remains fast (no yt-dlp call)
- status: "duplicate" still returned for known videos
Part of krisoye/admin-dashboard#72 (integration audit)