RNN-Transducer

A PyTorch implementation of the Transducer model for end-to-end speech recognition, and of Deep Speech.

Environment

pytorch >= 0.4
warp-transducer

Preparation

We use Kaldi for data preparation. Each of the training, development, and test sets must contain at least the files text and feats.scp. If you apply CMVN, utt2spk and cmvn.scp are also required. These files follow the standard Kaldi formats. The vocab file lists one token per line, followed by its integer index:

<blk> 0
<unk> 1
我 2
你 3
...
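
The vocab format above can be read with a few lines of Python. This is a minimal sketch, not the loader used by this repository: the function name load_vocab and its error handling are assumptions made for illustration, and it only relies on the "token index" layout shown above.

```python
def load_vocab(lines):
    """Parse 'token index' lines (as in the vocab example) into a dict.

    `lines` is any iterable of strings, e.g. an open text file.
    Hypothetical helper for illustration only.
    """
    vocab = {}
    for line_no, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = line.rsplit(maxsplit=1)  # split on the last run of whitespace
        if len(parts) != 2:
            raise ValueError(f"line {line_no}: expected 'token index', got {line!r}")
        token, index = parts[0], int(parts[1])
        if token in vocab:
            raise ValueError(f"line {line_no}: duplicate token {token!r}")
        vocab[token] = index
    return vocab


example = ["<blk> 0", "<unk> 1", "我 2", "你 3"]
vocab = load_vocab(example)
print(vocab["<blk>"], vocab["我"])
```

To read a vocab file from disk, pass an open file handle, for example open(path, encoding="utf-8"), so that non-ASCII tokens are decoded correctly.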

Train

python bin/train.py -config config/aishell.yaml

Eval

python bin/eval.py -config config/aishell.yaml

Experiments

The details of our RNN-Transducer are as follows.

model:
    enc:
        type: lstm
        hidden_size: 320
        n_layers: 4
        bidirectional: True
    dec:
        type: lstm
        hidden_size: 512
        n_layers: 1
    embedding_dim: 512
    vocab_size: 4232
    dropout: 0.2

Acknowledgements

Thanks to warp-transducer and ctc-decoder.

CTC decoder

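alpha is the weight given to the language model score (use about 0.2 when the language model corpus does not match the test domain, and 1 when it does). beta is the bonus added for each additional character; larger values produce longer outputs (a value of 2 usually gives a reasonable output length).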
