kaldi-SENAN is the implementation of the speech-enhanced and noise-aware network (SENAN, see the following paper), built on the open-source Kaldi toolkit. Example scripts for the Aurora-4 task are located at egs/aurora4/proposed. Scripts for the AMI task are also provided.
Hung-Shin Lee, Pin-Yuan Chen, Yu Tsao, and Hsin-Min Wang, "Speech-enhanced and noise-aware networks for robust speech recognition," submitted to Interspeech 2022.
Follow the Kaldi installation steps to install this project.
- In stage 8 of run.sh, change the command to one of the following:

      # TDNN-F as AM + proposed model
      local/chain/tuning/run_tdnn-1a_mtae_mfcc-mfcc-cont_noise-stats.sh
      # TDNN-F as AM + SpecAugment + proposed model
      local/chain/tuning/run_tdnn-1a_mtae_mfcc-mfcc-cont_noise-stats_specaugment.sh
      # CNN-TDNN-F as AM + proposed model
      local/chain/tuning/run_cnn-tdnn-1c_mtae_mfcc-mfcc-cont_noise-stats.sh
      # CNN-TDNN-F as AM + SpecAugment + proposed model
      local/chain/tuning/run_cnn-tdnn-1c_mtae_mfcc-mfcc-cont_noise-stats_specaugment.sh

- The weights for the two output layers can be changed by modifying frame_weight_dae and frame_weight_dspae in run_{tdnn-1a,cnn-tdnn-1c}\_mtae\_*.sh.
- In stage 11 of run.sh, change the command to one of the following:

      # CNN-TDNN-F as AM + SpecAugment
      local/chain/tuning/run_cnn-tdnn-1c_specaugment.sh
      # CNN-TDNN-F as AM + SpecAugment + proposed model
      local/chain/tuning/run_cnn-tdnn-1c_mtae_fbank-mfcc-t_noise-t_specaugment.sh
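
The four stage-8 script names follow a regular pattern (acoustic model, then an optional `_specaugment` suffix), so the choice could be wrapped in a small helper. This is only a sketch: the function name `pick_mtae_script` and its arguments are hypothetical and not part of this repository, and it covers only the four stage-8 scripts named above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of kaldi-SENAN): build the path of one of the
# four stage-8 tuning scripts from the acoustic model name and a SpecAugment flag.
pick_mtae_script() {
  local am="$1"        # "tdnn-1a" (TDNN-F) or "cnn-tdnn-1c" (CNN-TDNN-F)
  local specaug="$2"   # "true" selects the SpecAugment variant

  # Reject anything other than the two acoustic models listed in this README.
  case "$am" in
    tdnn-1a|cnn-tdnn-1c) ;;
    *) echo "unknown acoustic model: $am" >&2; return 1 ;;
  esac

  local script="local/chain/tuning/run_${am}_mtae_mfcc-mfcc-cont_noise-stats"
  if [ "$specaug" = "true" ]; then
    script="${script}_specaugment"
  fi
  echo "${script}.sh"
}

# Example: print the path for CNN-TDNN-F as AM + SpecAugment + proposed model.
pick_mtae_script cnn-tdnn-1c true
```

The printed path can then be pasted into stage 8 of run.sh, or the helper's output can be executed directly from the recipe directory.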
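
The two output-layer weights, frame_weight_dae and frame_weight_dspae, could also be changed from the command line instead of in an editor. The sketch below assumes that the tuning script sets each weight as a plain shell assignment at the start of a line (for example `frame_weight_dae=...`); this should be checked against the actual script. The helper name `set_frame_weight` is hypothetical, and the values shown are placeholders, not recommended settings.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of kaldi-SENAN): replace the value of a
# "name=value" assignment that starts a line in the given script.
# Assumption: the weight is defined as a plain shell assignment.
# Note: anything after "=" on that line (including a trailing comment) is replaced.
set_frame_weight() {
  local file="$1" name="$2" value="$3"
  local tmp
  tmp="$(mktemp)" || return 1
  # Rewrite only lines beginning with "<name>="; all other lines pass through.
  # Writing back with cat keeps the original file's permissions.
  if sed "s|^${name}=.*|${name}=${value}|" "$file" > "$tmp"; then
    cat "$tmp" > "$file"
  fi
  rm -f "$tmp"
}

# Demonstration on a throwaway file; the weight values are placeholders.
demo="$(mktemp)"
printf 'frame_weight_dae=1.0\nframe_weight_dspae=1.0\n' > "$demo"
set_frame_weight "$demo" frame_weight_dae 0.5
set_frame_weight "$demo" frame_weight_dspae 0.2
cat "$demo"
rm -f "$demo"
```

In practice the first argument would be one of the run_{tdnn-1a,cnn-tdnn-1c}_mtae_*.sh scripts under local/chain/tuning.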