This work proposes a source-free unsupervised domain adaptation approach for speech foundation models.
Our conda environment can be reproduced from the file requirements.txt. Please run the command below to install the necessary packages:

```bash
pip install -r requirements.txt
```

Our code requires two Kaldi-format data files: `wav.scp` and `text`.
- `wav.scp` contains a list of audio files; each line includes a sample ID and an absolute audio path:

  ```
  utt_1 /your-data-path/1.wav
  utt_2 /your-data-path/2.wav
  ```

- `text` contains the ground-truth transcriptions; each line includes a sample ID and its transcription:

  ```
  utt_1 i feel good
  utt_2 he is coming back
  ```
NOTE: the lines in the above two files should be paired by sample ID.
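As a quick sanity check, the shell sketch below (our illustration, not part of the released scripts) verifies that the two files share the same sample IDs in the same order:

```bash
# Sanity check (illustrative, not part of the released scripts):
# compare the sample-ID columns of wav.scp and text.
diff <(cut -d' ' -f1 wav.scp) <(cut -d' ' -f1 text) \
  && echo "wav.scp and text are paired" \
  || echo "sample-ID mismatch between wav.scp and text"
```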
Please refer to our training script `finetune.sh` and specify the following settings:
- `dataset`: training data name;
- `model_size`: Whisper model size;
- `train_data`: training data directory that contains the files `wav.scp` and `text`;
- `dev_data`: development data directory that contains the files `wav.scp` and `text`.
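For instance, the corresponding variables in `finetune.sh` might be set as follows; the values here are illustrative placeholders, not defaults shipped with the repository:

```bash
# Illustrative settings for finetune.sh (placeholder values, not repo defaults)
dataset=my_corpus                   # training data name
model_size=large-v2                 # whisper model size
train_data=/your-data-path/train    # directory containing wav.scp and text
dev_data=/your-data-path/dev        # directory containing wav.scp and text
```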
Then, run `bash finetune.sh` to start training. The model weights will be saved to `runs/{dataset}_{model_size}`.
Please refer to our inference script `inference.sh` and specify the following settings:
- `dataset`: training data name;
- `model_size`: Whisper model size;
- `checkpoint`: path to the trained model checkpoint (`.pth` file);
- `test_data`: test data directory that contains the files `wav.scp` and `text`.
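As with training, these variables could look like the sketch below; the checkpoint filename is hypothetical, and only the `runs/{dataset}_{model_size}` directory follows from the training step above:

```bash
# Illustrative settings for inference.sh (placeholder values, not repo defaults)
dataset=my_corpus
model_size=large-v2
checkpoint=runs/my_corpus_large-v2/model.pth   # hypothetical checkpoint filename
test_data=/your-data-path/test                 # directory containing wav.scp and text
```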
Run `bash inference.sh` to start inference. The WER results will be printed in the log.
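If you want to keep the log and pull out the WER lines afterwards, one simple approach (assuming the script writes its results to stdout/stderr) is:

```bash
# Capture the inference log and filter the WER lines
# (assumes the results are written to stdout/stderr).
bash inference.sh 2>&1 | tee inference.log
grep -i "wer" inference.log
```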
If you use our research or code in your publication, please kindly cite our paper:
```
@article{hu2024self,
  title={Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models},
  author={Hu, Yuchen and Chen, Chen and Yang, Chao-Han Huck and Qin, Chengwei and Chen, Pin-Yu and Chng, Eng Siong and Zhang, Chao},
  journal={arXiv preprint arXiv:2405.14161},
  year={2024}
}
```