
Claude/speaker signature storage oqv7e #277

Open
phyrexia wants to merge 5 commits into kaixxx:main from phyrexia:claude/speaker-signature-storage-Oqv7e

Conversation

@phyrexia

No description provided.

claude and others added 5 commits February 25, 2026 08:31
Introduces a speaker signature storage system so that once a person is
identified in one recording their voice can be automatically recognised
in future sessions.

Changes
-------
speaker_db.py (new)
  - JSON-based database stored in the user config directory
    (~/.config/noScribe/speaker_signatures.json)
  - find_match(embedding) – cosine-similarity lookup with configurable
    threshold (default 0.75)
  - save_speaker(name, embedding) – add/update entry, blending the new
    embedding with any existing one for the same name so the model
    gradually adapts to voice variation
  - list_speakers() / delete_speaker() helpers for future management UI

pyannote_mp_worker.py
  - After diarization, extracts per-speaker L2-normalised embeddings by
    feeding the longest audio segments (≥ 1.5 s, up to 5 per speaker)
    through the pipeline's already-loaded embedding model
  - Returns embeddings alongside the segment list in the result message;
    the entire extraction is wrapped in try/except so any failure is
    logged quietly and never blocks the transcription
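The segment-selection rule above (≥ 1.5 s, longest five per speaker, L2-normalised mean) can be sketched like this. The helper name and `embed_fn` stub are hypothetical; the real worker feeds the segments through pyannote's already-loaded embedding model.

```python
# Sketch of the per-speaker embedding extraction described above.
# `speaker_embeddings` and `embed_fn` are illustrative names.
import math

MIN_LEN = 1.5   # seconds: ignore very short segments
MAX_SEGS = 5    # cap per speaker

def speaker_embeddings(segments, embed_fn):
    """segments: list of (speaker, start, end); returns {speaker: unit vector}."""
    by_speaker = {}
    for spk, start, end in segments:
        if end - start >= MIN_LEN:
            by_speaker.setdefault(spk, []).append((start, end))
    result = {}
    for spk, segs in by_speaker.items():
        # longest segments first, capped at MAX_SEGS
        segs.sort(key=lambda s: s[1] - s[0], reverse=True)
        vecs = [embed_fn(start, end) for start, end in segs[:MAX_SEGS]]
        mean = [sum(c) / len(vecs) for c in zip(*vecs)]
        norm = math.sqrt(sum(x * x for x in mean)) or 1.0
        result[spk] = [x / norm for x in mean]  # L2-normalise
    return result
```

Averaging a few long segments rather than embedding every utterance keeps the extraction cheap and makes the signature robust to one-off noisy snippets.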

noScribe.py
  - Imports speaker_db
  - SpeakerNamingDialog (CTkToplevel): modal dialog shown between
    diarization and transcription; lists each detected speaker with
    their matched name and confidence badge (green ≥ 75 %, orange
    > 55 %, grey = new speaker), an editable name field and a Save
    checkbox; OK applies names + saves checked signatures, Skip falls
    back to S01/S02 labels
  - _run_diarize_subprocess now returns (segments, embeddings)
  - _run_speaker_naming_dialog helper runs the dialog in the GUI thread
  - speaker_name_map (closure variable) carries label→name mappings
    into find_speaker so confirmed names appear directly in the
    transcript instead of S01/S02
  - Threading: dialog is scheduled via self.after(0, …) and the worker
    thread waits on a threading.Event, keeping Tkinter calls on the
    main thread
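The hand-off pattern in the last bullet can be shown headless by replacing the Tkinter main loop with a plain queue. This is a sketch of the pattern only, not the actual noScribe code; `tasks.get()()` stands in for the main loop servicing `self.after(0, …)` callbacks.

```python
# Minimal sketch: worker thread schedules GUI work on the "main thread"
# and blocks on a threading.Event until the dialog has run.
import queue
import threading

def worker(schedule, done, result):
    """Worker thread: schedule the dialog, then block until it finishes."""
    def show_dialog():
        result["names"] = {"S01": "Alice"}  # stand-in for the dialog outcome
        done.set()
    schedule(show_dialog)   # analogous to self.after(0, show_dialog)
    done.wait()             # worker blocks; Tkinter calls stay on main thread

tasks = queue.Queue()
done = threading.Event()
result = {}
t = threading.Thread(target=worker, args=(tasks.put, done, result))
t.start()
tasks.get()()   # "main thread" runs the scheduled callback
t.join()
```

The key property is that the worker never touches a widget: it only enqueues a callable and waits, so all Tk calls happen on the thread that owns the interpreter.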

trans/noScribe.*.yml
  - Added speaker_naming_title, speaker_naming_hint,
    speaker_name_placeholder, speaker_new_badge, speaker_save_checkbox,
    btn_ok and btn_skip keys for all 9 supported languages

https://claude.ai/code/session_016natySHkUNa6oDH7sEPf4F
Adds a languages.yml file in the noScribe config directory so users can
comment out the transcription languages they never use, shortening the
dropdown to just the ones that matter to them.

How it works
------------
- On first run noScribe creates ~/.config/noScribe/languages.yml (or
  the OS-equivalent path) listing all supported languages with an
  explanatory header in English.
- The file is a plain YAML list; to hide a language the user adds '#'
  at the start of that line.  Standard YAML comment syntax means the
  lines can be uncommented just as easily.
- noScribe reads the file at startup but NEVER writes to it, so
  comments and formatting are always preserved across sessions.
- If the file is missing, unreadable, or yields an empty list, noScribe
  silently falls back to the full built-in language list (no regression
  for existing users).
- 'Auto' is always kept in the active list even if accidentally
  commented out, to prevent the dropdown from breaking.

https://claude.ai/code/session_016natySHkUNa6oDH7sEPf4F
The language filter file now lives in the same folder as noScribe.py
so users can find and edit it directly without hunting through OS config
paths.  The shipped languages.yml has English, Spanish and Portuguese
active by default (Auto and Multilingual included); everything else is
commented out and can be re-enabled by removing the '#'.

https://claude.ai/code/session_016natySHkUNa6oDH7sEPf4F
Two issues fixed:

1. Dialog never appeared
   The dialog was gated on `if _embeddings:`, so it was silently skipped
   whenever embedding extraction failed. It now always shows after
   diarization so the user can assign names even without stored
   signatures.

2. Embedding extraction too fragile
   - Tries pipeline._embedding, embedding_, _embedding_model in sequence
   - Falls back to loading the embedding model directly from
     pyannote/embedding/pytorch_model.bin when none of the pipeline
     attributes exist (covers pyannote versions that changed internals)
   - Each step now logs to the debug log so failures are visible in
     the noScribe log file instead of being silently swallowed
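The attribute probe in point 2 can be sketched as below. The attribute names come from the commit message; the helper name and the direct-load fallback (returned as `None` here) are illustrative.

```python
# Sketch of the embedding-model probe described above. Each step logs,
# so a failed lookup is visible in the log instead of swallowed.
def find_embedding_model(pipeline, log=print):
    """Try known pyannote pipeline attributes for the embedding model."""
    for attr in ("_embedding", "embedding_", "_embedding_model"):
        model = getattr(pipeline, attr, None)
        if model is not None:
            log(f"embedding model found at pipeline.{attr}")
            return model
    log("no embedding attribute on pipeline; "
        "falling back to loading pytorch_model.bin directly")
    return None  # caller then loads pyannote/embedding weights itself
```

Probing several attribute names in order is what covers pyannote versions that renamed their internals; the log line per step is what makes a silent failure diagnosable.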

https://claude.ai/code/session_016natySHkUNa6oDH7sEPf4F
@kaixxx
Owner

kaixxx commented Feb 25, 2026

There are some interesting ideas in here. But it would be good to discuss such ideas before letting Claude Code loose and making such massive changes to the codebase. I have to check this all and decide what I want to keep and what not. As an example, I don't think that it is a good idea to show a modal dialog every time the speaker detection is finished. People are letting noScribe run for hours unattended. When they come back, they expect the transcript to be ready, or even a whole queue of jobs. A modal dialog interrupting this can be quite annoying.
An editable language list - I'm not sure if many people will use this. noScribe will already remember the last language setting. So most people set the language only once and that's it.

@phyrexia
Author

Hey kaixxx, my bad, this was meant for my own fork. I've been developing new improvements on your code base, bringing in some of my own previous work on the subject; agentic coding has been very useful.
I would love the discussion, and would love your insight on them going forward.
