Model test results - model 20240117 #23

@jwebmeister

Post test results and useful remarks here, ideally for both of the models below, using the same test data and the default Ready or Not grammar module:

  • the new model (20240117), and
  • the base model (kaldi_model_daanzu_20211030-mediumlm)

Useful remarks include:

  • specific words or phrases that were consistently misrecognised
  • rate of false positives / false negatives
  • subjective opinion, which model works better or worse, and in which areas

Important instructions:

  • It's important to manually review, clean and update retain.tsv with the correct rules + text, see example workflow near the end of these instructions
  • See this YouTube video
  • Please only include "normal commands" in the test data; exclude "Freeze", etc.
    • There's a script ./scripts/copy_retain_item_cmds_only.ps1 that can be used in PowerShell to copy only "normal commands" out of ./retain/ and into ./cleanaudio_cmds/
  • Please, if possible, only use the default _readyornot.py grammar module, or very minor modifications, i.e. no new words.
  • Example command to run the test: ./tacspeak.exe --test_model './cleanaudio_cmds/retain.tsv' './kaldi_model/' './kaldi_model/lexicon.txt' 4
  • There are a number of useful PowerShell scripts in the ./scripts/ folder related to cleaning up the retain.tsv and related .wav files.
  • A workflow I use for cleaning up the data after a play session:
    • Open retain.tsv and go through each line, reviewing the rule and text
    • At the same time, load every .wav file in the ./retain/ folder into a VLC media player playlist on single-file loop, pressing 'N' to move to the next .wav as I read through each line of retain.tsv
    • When there's a mismatch between the text vs the audio, but the rule is correct, I correct the text in retain.tsv to align with the audio.
    • When there's a mismatch between the recognised rule (and/or option) vs the audio, I either A) update both the rule + text manually, or B) delete the line in retain.tsv. If I deleted any lines, then once I'm done reviewing I run list_wav_missing_from_retain_tsv.ps1 first to make sure I'm deleting the right files, then run the delete_wav_missing_from_retain_tsv.ps1 script. (Option A is preferred, but hey, we're all busy and life is too short to spend cleaning all the data.)
    • If the audio is so stupidly vague or garbled that I can't understand with my own ears and brain what I'm saying, I delete the line in retain.tsv. Once I'm done reviewing, I run list_wav_missing_from_retain_tsv.ps1 first to make sure I'm deleting the right files, then run the delete_wav_missing_from_retain_tsv.ps1 script.
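
For anyone who wants to double-check which .wav files would be affected before running the delete script, here is a minimal Python sketch of the same idea. It is not the repo's PowerShell script, and it rests on two assumptions: retain.tsv is tab-separated, and each row mentions its audio file in some field ending in ".wav". The function name wavs_missing_from_tsv is made up for illustration.

```python
import csv
from pathlib import Path


def wavs_missing_from_tsv(tsv_path, wav_dir):
    """Return sorted names of .wav files in wav_dir that no row of the TSV mentions.

    Assumption: the TSV is tab-separated and references each audio file in a
    field ending with ".wav" (either a bare file name or a full path).
    """
    referenced = set()
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            for field in row:
                if field.lower().endswith(".wav"):
                    # Compare by file name only, so Windows-style and
                    # relative paths both match the files on disk.
                    name = Path(field.replace("\\", "/")).name
                    referenced.add(name.lower())

    on_disk = (path.name for path in Path(wav_dir).glob("*.wav"))
    return sorted(name for name in on_disk if name.lower() not in referenced)


if __name__ == "__main__":
    # Paths follow the folder layout mentioned above; adjust as needed.
    for name in wavs_missing_from_tsv("./retain/retain.tsv", "./retain/"):
        print(name)
```

This only lists files and never deletes anything, so it is safe to run as a cross-check against the output of list_wav_missing_from_retain_tsv.ps1.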

Example report:

  • 0 incorrect commands out of 4 cmds (1 mission played), same result both models
  • 5% WER, same result both models
  • new model more often picks up baby crying as "freeze", using "listen_key_toggle":-1, using USE_NOISE_SINK = True; also picked up in base model but not as often.
  • New model tended to pick up "red" as "gold" when wife was speaking
  • using default _readyornot.py without any modifications
  • './kaldi_model/' is new model
  • './kaldi_model_base/' is base model

('./kaldi_model/', './retain/retain.tsv', 'Command', 'WER', 'Overall -> 5.00 %+/- 9.55 %N=20 C=19 S=1 D=0 I=0')
('./kaldi_model/', './retain/retain.tsv', 'Command', 'CMDERR', {'cmd_not_correct_output': 0, 'cmd_not_correct_rule': 0, 'cmd_not_correct_options': 0, 'cmd_not_recog_output': 0, 'cmd_not_recog_input': 0, 'cmds': 4})
('./kaldi_model_base/', './retain/retain.tsv', 'Command', 'WER', 'Overall -> 5.00 %+/- 9.55 %N=20 C=19 S=0 D=1 I=0')
('./kaldi_model_base/', './retain/retain.tsv', 'Command', 'CMDERR', {'cmd_not_correct_output': 0, 'cmd_not_correct_rule': 0, 'cmd_not_correct_options': 0, 'cmd_not_recog_output': 0, 'cmd_not_recog_input': 0, 'cmds': 4})
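
To make the WER lines easier to read when comparing reports, the percentage can be recomputed from the counts. The sketch below assumes the conventional definition WER = (S + D + I) / N, with N reference words, S substitutions, D deletions and I insertions. The ± figure in the example also happens to match a 95% normal-approximation interval; that is an inference from these numbers, not something confirmed by the tool. The helper names are hypothetical.

```python
import math
import re


def parse_wer_counts(summary):
    """Extract the N, C, S, D, I counts from a WER summary string."""
    return {key: int(value) for key, value in re.findall(r"\b([NCSDI])=(\d+)", summary)}


def wer_with_interval(counts, z=1.96):
    """Return (wer, half_width) as fractions.

    Assumes WER = (S + D + I) / N and a normal-approximation interval
    z * sqrt(wer * (1 - wer) / N); both are assumptions about the tool.
    """
    n = counts["N"]
    wer = (counts["S"] + counts["D"] + counts["I"]) / n
    half_width = z * math.sqrt(wer * (1 - wer) / n)
    return wer, half_width


if __name__ == "__main__":
    new_model = "Overall -> 5.00 %+/- 9.55 %N=20 C=19 S=1 D=0 I=0"
    base_model = "Overall -> 5.00 %+/- 9.55 %N=20 C=19 S=0 D=1 I=0"
    for label, summary in (("new", new_model), ("base", base_model)):
        wer, half = wer_with_interval(parse_wer_counts(summary))
        print(f"{label}: WER {wer:.2%} +/- {half:.2%}")
```

With only 20 reference words the interval is wider than the WER itself, so a report like this one cannot separate the two models; more test data narrows it. The counts also show that the two models reach the same 5% differently: one substitution for the new model, one deletion for the base model.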
