The Belarusian language support (at least just spellchecking for it) #2195
+211,108
−2
Issues
#2155
Description
This is not intended to be merged. It's just the simplest test to see whether the Harper engine can shoulder the load of supporting one more language in addition to English without degrading performance or increasing the memory footprint too much.
Belarusian uses a different alphabet (Cyrillic), so it doesn't clash with the existing English language support. The Belarusian dictionary has been taken from https://github.com/375gnu/spell-be-tarask
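To illustrate why the two alphabets don't clash, here is a minimal sketch of routing a word to a dictionary by its script. The `Script` enum and `classify` function are hypothetical names made up for this example and are not part of Harper's API; the sketch assumes it is enough to look at the first alphabetic character and that all Belarusian letters fall in the basic Cyrillic Unicode block (U+0400 to U+04FF).

```rust
/// Which dictionary a word would be looked up in (hypothetical type, not Harper's).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Script {
    Latin,
    Cyrillic,
    Other,
}

/// Classify a word by the script of its first alphabetic character.
/// Assumption: words are not mixed-script, so one character is enough.
fn classify(word: &str) -> Script {
    match word.chars().find(|c| c.is_alphabetic()) {
        // ASCII letters are treated as English.
        Some(c) if c.is_ascii_alphabetic() => Script::Latin,
        // Basic Cyrillic block; includes Belarusian letters such as 'ў' and 'і'.
        Some(c) if ('\u{0400}'..='\u{04FF}').contains(&c) => Script::Cyrillic,
        // Digits, punctuation-only tokens, or any other script.
        _ => Script::Other,
    }
}

fn main() {
    // The same sample text used for manual testing in this PR.
    for word in "Ths is an test. Гта тэст.".split_whitespace() {
        println!("{}: {:?}", word, classify(word));
    }
}
```

Because each token maps to exactly one script, an English dictionary never needs to be consulted for a Cyrillic word and vice versa, which is why adding the second wordlist should not change the results for English text.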
I could have added a simple Belarusian linter rule as well, but would prefer to keep it simple until the seemingly unreasonable memory footprint issue is resolved. The Belarusian wordlist is ~20MB in plain text format. Using the `fst-bin` command line tool (which accompanies the `fst` crate used by Harper), the Belarusian spellchecker dictionary can be compressed to a binary of merely ~1.1MB. At least that's the theory.
Demo
How Has This Been Tested?
Used the `just lint` command to check short text files, such as "Ths is an test. Гта тэст." Also used the Firefox plugin to check the same short texts in https://textarea.online
Checklist