yeonju123/pair_ngram_model

Pair n-gram modeling toolkit

Requirements

Suggested workflow

readonly SEED="${RANDOM}"
echo "Using seed: ${SEED}"
./split.py \
    --seed="${SEED}" \
    --input_path=lexicon.tsv \
    --train_path=train.tsv \
    --dev_path=dev.tsv \
    --test_path=test.tsv
./train.py \
    --seed="${SEED}" \
    --input_path=train.tsv \
    --output_path=model.fst
# TODO: hyperparameter optimization using dev.tsv
cut -f1 test.tsv > test.g
cut -f2 test.tsv > gold.p
./rewrite.py \
    --word_path=test.g \
    --fst_path=model.fst > hypo.p
paste gold.p hypo.p > eval.tsv
./evaluate.py eval.tsv
rm -f test.g gold.p hypo.p
rm -f train.tsv dev.tsv test.tsv eval.tsv
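
The workflow above does not say what evaluate.py reports. As a rough sanity check, the share of mismatched lines in eval.tsv can be computed directly with awk. This is a minimal sketch: the function name word_error_rate is hypothetical and not part of the toolkit, and the result may differ from whatever metric evaluate.py uses. It assumes eval.tsv has the gold pronunciation in column 1 and the hypothesis in column 2, as produced by the paste step.

```shell
# Hypothetical helper, not part of the toolkit: prints the percentage of
# lines whose gold (column 1) and hypothesis (column 2) fields differ.
word_error_rate() {
    # LC_ALL=C keeps the decimal separator a period regardless of locale.
    LC_ALL=C awk -F'\t' '
        # Appending "" forces a string comparison even if both fields
        # happen to look numeric.
        ($1 "") != ($2 "") { errors++ }
        END {
            if (NR == 0) { print "no data"; exit 1 }
            printf "%.2f\n", 100 * errors / NR
        }
    ' "$1"
}
```

Call it as `word_error_rate eval.tsv` before the cleanup lines, since those delete eval.tsv.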
