`chardict.py` could be improved:

- percentages do not add up to 100% (especially for trigrams);
- once parsed, corpora cannot be merged, because information such as the total number of n-grams is missing.
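
One possible direction, as a minimal sketch: keep raw n-gram counts and convert to percentages only at the end. Merging then becomes a plain sum, and percentages are always derived from the full total. The function names below (`count_ngrams`, `merge_counts`, `to_percentages`) are hypothetical and do not reflect the current `chardict.py` code.

```python
from collections import Counter


def count_ngrams(text, n):
    """Return raw counts of all overlapping n-grams in `text` (hypothetical helper)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def merge_counts(a, b):
    """Merge two parsed corpora by summing raw counts, not percentages."""
    return a + b


def to_percentages(counts):
    """Convert raw counts to percentages of the total n-gram count."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: 100 * count / total for gram, count in counts.items()}


# Two tiny stand-in corpora; real corpora would be read from files.
first = count_ngrams("hello world", 3)
second = count_ngrams("hello there", 3)

merged = merge_counts(first, second)
percentages = to_percentages(merged)
```

If the output file stores the counts (or at least the total per n-gram size) alongside the percentages, two parsed corpora can be merged later without re-reading the source texts. Any rounding or filtering of rare n-grams should happen only when displaying, otherwise the stored percentages will again sum to less than 100%.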