This repository contains work related to pāliwiki, a platform for reading, learning, and translating the Tipiṭaka.
includes the materials for cook; they are basically converted from other sources.
From pāliwiki:
- `pali_text.tgz` is the full text of the Tipiṭaka.
- `dict_parent.db` is converted from `sys_regular.db` and `sys_irregular.db`.
  - `sys_regular.db` is a regular inflection dictionary inferred from all possible grammar rules. The parent of a word is neither the stem nor the root. For example, in "gam->gaccha(ti)->gacchanta->gacchato", the word that precedes each word is called its parent.
  - `sys_irregular.db` is the dictionary of irregular words, entered manually.
- `dict_compound.db` is converted from `pm.db` and `comp.db`; it contains the split of each compound word. `pm.db` is the pāli-myanmar dictionary. `comp.db` holds the results of splitting with an algorithm.
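The parent relation described above can be illustrated with a small sketch. It uses a plain Python dict as a hypothetical stand-in for `dict_parent.db` (the real database schema is not documented here), and it writes `gaccha(ti)` as `gacchati`. The function name `parent_chain` is made up for this example.

```python
# Hypothetical in-memory stand-in for dict_parent.db: word -> parent.
# The entries follow the example chain gam -> gacchati -> gacchanta -> gacchato.
PARENT = {
    "gacchato": "gacchanta",
    "gacchanta": "gacchati",
    "gacchati": "gam",
}


def parent_chain(word, parent=PARENT):
    """Return the word followed by its successive parents."""
    chain = [word]
    seen = {word}
    while chain[-1] in parent:
        nxt = parent[chain[-1]]
        if nxt in seen:  # guard against accidental cycles in the data
            break
        chain.append(nxt)
        seen.add(nxt)
    return chain


if __name__ == "__main__":
    print(" <- ".join(parent_chain("gacchato")))
```

A word without an entry, such as `gam` here, is treated as the end of the chain.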
includes preprocessing scripts:
- `parenting.py` converts each word of pali_text.txt to its parent using `dict_parent.db` and `dict_compound.db`.
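A rough sketch of what such a conversion could look like, not the actual `parenting.py`. It assumes that compounds are split first and that each part is then replaced by its direct parent. Both dictionaries are hypothetical in-memory stand-ins for `dict_parent.db` and `dict_compound.db`, and the function name `to_parents` and the sample entries are illustrative only.

```python
def to_parents(words, parent, compound):
    """Map each word to its parent, splitting compound words first.

    parent:   dict word -> parent (stand-in for dict_parent.db)
    compound: dict word -> list of parts (stand-in for dict_compound.db)
    """
    result = []
    for word in words:
        # Assumed order: split a compound before looking up parents.
        parts = compound.get(word, [word])
        for part in parts:
            # Words without a known parent pass through unchanged.
            result.append(parent.get(part, part))
    return result


if __name__ == "__main__":
    parent = {"gacchato": "gacchanta", "vacanaṃ": "vacana"}
    compound = {"buddhavacanaṃ": ["buddha", "vacanaṃ"]}
    print(to_parents(["gacchato", "buddhavacanaṃ"], parent, compound))
```

Passing unknown words through unchanged keeps the output aligned with the input text, at the cost of leaving some inflected forms unnormalized.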
includes code used for analyzing the sentences in the Tipiṭaka:
- Get similar sentences using Jaccard similarity.
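As a minimal illustration of Jaccard similarity between sentences, the sketch below treats each sentence as a set of whitespace-separated tokens and computes |A ∩ B| / |A ∪ B|. The tokenization and the names `jaccard` and `most_similar` are assumptions for this example; the repository's own code may tokenize differently, for instance on parent forms.

```python
def jaccard(a, b):
    """Jaccard similarity of two sentences viewed as sets of tokens."""
    set_a, set_b = set(a.split()), set(b.split())
    if not set_a and not set_b:
        return 0.0  # convention chosen here for two empty sentences
    return len(set_a & set_b) / len(set_a | set_b)


def most_similar(query, sentences, top_k=3):
    """Return up to top_k (score, sentence) pairs, highest score first."""
    scored = [(jaccard(query, s), s) for s in sentences]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    corpus = ["sabbe saṅkhārā dukkhā", "sabbe dhammā anattā"]
    for score, sentence in most_similar("sabbe saṅkhārā aniccā", corpus):
        print(f"{score:.2f}  {sentence}")
```

Comparing every pair of sentences this way is quadratic in the number of sentences, so a corpus the size of the Tipiṭaka would likely need an index or an approximation such as MinHash.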