Release v0.3.0 — Speaker Identification · roboalchemist/any2md

Speaker Identification via WeSpeaker ResNet293

New feature: named speaker identification across recordings using persistent voice profiles.

New: `any2md speaker` subcommand

any2md speaker add "Joe" --audio joe-sample.wav
any2md speaker list
any2md speaker remove "Joe"
any2md speaker merge "Speaker A" "Speaker B"
any2md speaker stats "Joe"
any2md speaker gallery "Joe"

New: `--identify` flag

any2md meeting.m4a --diarize --identify
# Output: **Joe** [00:24] instead of SPEAKER_0

How it works

WeSpeaker ResNet293 extracts 256-d speaker embeddings (PyTorch MPS on Apple Silicon)
Gallery model: stores multiple embeddings per speaker to handle voice variation across mics/conditions
sqlite-vec for fast k-nearest-neighbor (KNN) search
Adaptive thresholds: high-confidence matches are applied automatically (≥0.85); medium-confidence matches are reported with their score (0.70-0.85)
Auto-enrolls new embeddings for matched speakers (gallery grows over time)
Prompts for unknown speakers (or --auto-enroll / --no-enroll)
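
The matching steps above can be sketched roughly as follows. This is an illustrative sketch, not any2md's actual code: it assumes the scores are cosine similarities and that a speaker's score is the best similarity over their gallery, and the names `cosine`, `identify`, `HIGH`, and `MEDIUM` are hypothetical. It also uses a plain Python loop where the release uses sqlite-vec for the nearest-neighbor search.

```python
import math

# Thresholds taken from the release notes; treating them as cosine
# similarity cutoffs is an assumption of this sketch.
HIGH = 0.85    # at or above: high-confidence auto-match
MEDIUM = 0.70  # at or above (and below HIGH): medium-confidence match


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify(embedding, galleries):
    """Compare one embedding against every stored gallery.

    galleries maps a speaker name to a list of reference embeddings.
    Returns (name, score, tier), where tier is "auto", "medium", or "unknown".
    """
    best_name, best_score = None, -1.0
    for name, gallery in galleries.items():
        for reference in gallery:
            score = cosine(embedding, reference)
            if score > best_score:
                best_name, best_score = name, score

    if best_score >= HIGH:
        return best_name, best_score, "auto"
    if best_score >= MEDIUM:
        return best_name, best_score, "medium"
    # Below both thresholds: treat as an unknown speaker.
    return None, best_score, "unknown"
```

In this sketch a caller would label the segment directly on an "auto" result, show the name together with the score on a "medium" result, and fall back to prompting (or to whatever --auto-enroll / --no-enroll selects) on "unknown".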

Speaker catalog

Persistent at ~/.config/any2md/speakers.db:

Gallery maintenance: rolling window of 20 embeddings per speaker
Per-speaker distance statistics for drift detection
Merge support for duplicate profiles
Audit trail for profile merges
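
One way to picture the rolling window and the distance statistics is the sketch below. It is an assumption-laden illustration rather than the real schema or API: the `Gallery` class, its methods, and the mean-plus-k-standard-deviations drift rule are all hypothetical, and the actual catalog lives in SQLite rather than in memory.

```python
from collections import deque
from statistics import mean, pstdev

WINDOW = 20  # rolling window size stated in the release notes


class Gallery:
    """In-memory stand-in for one speaker's gallery (hypothetical)."""

    def __init__(self, window=WINDOW):
        # deque(maxlen=...) silently drops the oldest item once full,
        # which gives rolling-window behavior for free.
        self.embeddings = deque(maxlen=window)
        self.distances = deque(maxlen=window)

    def enroll(self, embedding, distance):
        """Store a new embedding and the distance at which it matched."""
        self.embeddings.append(embedding)
        self.distances.append(distance)

    def stats(self):
        """Summary of recent match distances, or None if nothing is enrolled."""
        if not self.distances:
            return None
        return {
            "count": len(self.distances),
            "mean": mean(self.distances),
            "stdev": pstdev(self.distances),
        }

    def is_drifting(self, distance, k=3.0):
        """Flag a distance more than k standard deviations above the mean.

        The rule and the default k are assumptions of this sketch.
        """
        if len(self.distances) < 2:
            return False
        return distance > mean(self.distances) + k * pstdev(self.distances)
```

Capping the window keeps per-speaker storage and search cost bounded while still letting the profile follow gradual changes in a voice or recording setup.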

Install

brew upgrade any2md
# Install speaker identification deps
uv pip install "any2md[speaker]"

Stats

201 new tests (119 speaker + 42 CLI + 40 yt)
7 tickets implemented (ANY2-11 through ANY2-17)
