@yzevm yzevm commented Nov 29, 2025

Description

Type of Change

  • Bug fix (non-breaking)
  • New feature (non-breaking)
  • Breaking change
  • Documentation update

Motivation

Keyless should successfully start audio capture on microphones that report the I8 sample format, which requires supporting that format.
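As a rough illustration of what supporting I8 involves, here is a minimal sketch of converting signed 8-bit samples to normalized `f32`. This is not the code from this PR: the function names `i8_to_f32` and `convert_i8_buffer` are hypothetical, and it assumes the rest of the pipeline works on `f32` samples in the range [-1.0, 1.0).

```rust
/// Convert one signed 8-bit PCM sample to an f32 in [-1.0, 1.0).
/// Hypothetical helper, not taken from the Keyless source.
fn i8_to_f32(sample: i8) -> f32 {
    // i8 spans -128..=127, so dividing by 128 maps -128 to exactly -1.0
    // and 127 to just under 1.0.
    f32::from(sample) / 128.0
}

/// Convert a whole buffer of I8 samples to f32.
/// Assumes downstream stages expect normalized f32 samples.
fn convert_i8_buffer(input: &[i8]) -> Vec<f32> {
    input.iter().copied().map(i8_to_f32).collect()
}
```

In a capture callback, the converted buffer would then be handed to the same path the other sample formats already use. Keep in mind that 8-bit audio has only 256 amplitude levels, so quantization noise is noticeably higher than with 16-bit input.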

Additional Notes

It all works now, but I still find it hard to use every day. Maybe my laptop isn’t powerful enough, my microphone isn’t good, the model isn’t the right one, or something else is wrong.

 INFO keyless_models::download::model: starting model download model=openai/whisper-tiny.en
 INFO keyless_models::download::model: model download completed successfully model=openai/whisper-tiny.en
 INFO keyless_whisper::model::loader: loading whisper model model=openai/whisper-tiny.en
 INFO keyless_whisper::model::loader: detected model format format=normal (.safetensors)
 INFO keyless_whisper::model::loader: reading config.json path=/home/yegor/.cache/keyless/models/openai--whisper-tiny.en/config.json
 INFO keyless_whisper::model::loader: parsed config
 INFO keyless_whisper::model::loader: mapping safetensors weights path=/home/yegor/.cache/keyless/models/openai--whisper-tiny.en/model.safetensors
 INFO keyless_whisper::model::loader: constructed model
 INFO keyless_whisper::model::loader: loading tokenizer path=/home/yegor/.cache/keyless/models/openai--whisper-tiny.en/tokenizer.json
 INFO keyless_whisper::model::loader: tokenizer loaded
 INFO keyless_whisper::model::loader: token IDs resolved
 INFO keyless_whisper::model::loader: generating mel filters
 INFO keyless_whisper::model::loader: mel filters generated
 INFO keyless_whisper::whisper::worker_thread: initializing rubato resampler source_hz=48000 target_hz=16000
 INFO keyless_audio::sfx::player: sfx output stream started sample_rate_hz=44100.0 channels=2
DEBUG keyless_audio::input::cpal: chosen_sample_rate_final_frame_samples chosen_sample_rate=48000 final_frame_samples=4800
DEBUG keyless_audio::input::cpal: Sample format: I8
 INFO keyless_audio::input::cpal: audio stream started sample_rate=48000 frame_size=4800
 INFO keyless_runtime::pipeline::startup: audio capture started device=USB PnP Audio Device
DEBUG keyless_runtime::ptt::handlers: PTT pressed, listening with VAD gating
DEBUG keyless_runtime::pipeline::callback: VAD gate state changed state="OPEN" level_db=-35.337242
 INFO keyless_whisper::whisper::worker_thread: rubato processed chunk in_len=1026 out_len=342
DEBUG keyless_whisper::inference::pipeline: prepared voiced unit for final decode unit_start=4800 unit_end=28728 samples=23928 rms=0.07882763907019147
DEBUG keyless_whisper::inference::pipeline: prepared mel spectrogram pcm_samples=23928 mel_frames_initial=3000
DEBUG keyless_whisper::inference::pipeline: inference quality metrics avg_logprob=-0.8769845902962867 compression_ratio=2.625 no_speech_prob=0.20353645086288452 temperature=1.0
DEBUG keyless_whisper::inference::pipeline: voiced unit decode metrics unit_start=4800 unit_end=28728 avg_logprob=-0.8769845902962867 compression_ratio=2.625 no_speech_prob=0.20353645086288452 temperature=1.0 preview= Osaka is a beautiful
 INFO keyless_whisper::inference::pipeline: unit_start=4800 unit_end=28728 appended=Osaka is a beautiful current_final=Osaka is a beautiful
 INFO keyless_whisper::inference::pipeline: voiced decode complete len=20
DEBUG keyless_whisper::whisper::inference_thread: preview units processed unit_seconds=10 units_total=1 units_reused=0 units_decoded=1 preview_len=20
DEBUG keyless_whisper::inference::pipeline: prepared voiced unit for final decode unit_start=4800 unit_end=47880 samples=43080 rms=0.06567584241441583
DEBUG keyless_whisper::inference::pipeline: prepared mel spectrogram pcm_samples=43080 mel_frames_initial=3000
DEBUG keyless_whisper::inference::pipeline: inference quality metrics avg_logprob=-0.3377138527596076 compression_ratio=2.888888888888889 no_speech_prob=0.045450158417224884 temperature=1.0
DEBUG keyless_whisper::inference::pipeline: voiced unit decode metrics unit_start=4800 unit_end=47880 avg_logprob=-0.3377138527596076 compression_ratio=2.888888888888889 no_speech_prob=0.045450158417224884 temperature=1.0 preview= Osaka is a beautiful city
 INFO keyless_whisper::inference::pipeline: unit_start=4800 unit_end=47880 appended=Osaka is a beautiful city current_final=Osaka is a beautiful city
 INFO keyless_whisper::inference::pipeline: voiced decode complete len=25
DEBUG keyless_whisper::whisper::inference_thread: preview units processed unit_seconds=10 units_total=1 units_reused=0 units_decoded=1 preview_len=25
DEBUG keyless_whisper::inference::pipeline: prepared voiced unit for final decode unit_start=4800 unit_end=49800 samples=45000 rms=0.06425948528451816
DEBUG keyless_whisper::inference::pipeline: prepared mel spectrogram pcm_samples=45000 mel_frames_initial=3000
DEBUG keyless_whisper::inference::pipeline: inference quality metrics avg_logprob=-0.3377138527596076 compression_ratio=2.888888888888889 no_speech_prob=0.045450158417224884 temperature=1.0
DEBUG keyless_whisper::inference::pipeline: voiced unit decode metrics unit_start=4800 unit_end=49800 avg_logprob=-0.3377138527596076 compression_ratio=2.888888888888889 no_speech_prob=0.045450158417224884 temperature=1.0 preview= Osaka is a beautiful city
 INFO keyless_whisper::inference::pipeline: unit_start=4800 unit_end=49800 appended=Osaka is a beautiful city current_final=Osaka is a beautiful city
 INFO keyless_whisper::inference::pipeline: voiced decode complete len=25
DEBUG keyless_whisper::whisper::inference_thread: final units processed unit_seconds=10 units_total=1 units_reused=0 units_decoded=1 preview_len=25
 INFO keyless_output::sinks::clipboard: copied text to clipboard len=25
 INFO keyless_runtime::pipeline::events: final transcription delivered text=Osaka is a beautiful city sink=Clipboard
DEBUG keyless_core::config::storage: config: loaded successfully path=/home/yegor/.config/keyless/config.json
DEBUG keyless_core::config::storage: config: saved successfully path=/home/yegor/.config/keyless/config.json bytes=484

Related Issues

Fixes #20

Development

Successfully merging this pull request may close these issues.

Audio capture fails on USB microphone with unsupported sample format (I8)
