- A language classifier that predicts whether a given text is in English or Dutch, using a decision tree and AdaBoost trained on text from Wikipedia.
- The features used to predict the language are derived from pronouns, parts of speech, the frequency of the letters i, j and k (higher in Dutch than in English), the frequency of consecutive repeating letters within a word, and the average word length in a given sentence.
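
A minimal sketch of how the character-level and pronoun features listed above might be computed for one sentence. The function name `extract_features`, the feature keys, and both pronoun sets are hypothetical and chosen for illustration; they are not taken from the repository, and the part-of-speech feature is left out.

```python
import re

# Hypothetical pronoun lists for illustration only; the repository's
# actual word lists are not shown in this README.
ENGLISH_PRONOUNS = {"i", "you", "he", "she", "it", "we", "they"}
DUTCH_PRONOUNS = {"ik", "jij", "hij", "zij", "wij", "het", "jullie"}


def extract_features(sentence):
    """Return a dict of simple language-identification features."""
    # Lowercase and keep runs of letters only (digits and punctuation dropped).
    words = re.findall(r"[^\W\d_]+", sentence.lower())
    letters = "".join(words)
    n_letters = len(letters) or 1  # avoid division by zero on empty input
    n_words = len(words) or 1

    # Share of all letters that are i, j or k.
    ijk_frequency = sum(letters.count(c) for c in "ijk") / n_letters
    # Share of words containing the same letter twice in a row (e.g. "een").
    doubled_ratio = sum(1 for w in words if re.search(r"(.)\1", w)) / n_words
    # Mean number of letters per word.
    avg_word_length = len(letters) / n_words

    return {
        "ijk_frequency": ijk_frequency,
        "doubled_letter_word_ratio": doubled_ratio,
        "avg_word_length": avg_word_length,
        "has_english_pronoun": any(w in ENGLISH_PRONOUNS for w in words),
        "has_dutch_pronoun": any(w in DUTCH_PRONOUNS for w in words),
    }


if __name__ == "__main__":
    print(extract_features("Ik heb een boek"))
```

A decision tree or AdaBoost learner would typically need thresholds on the numeric features (for example, whether the average word length exceeds some cutoff); those cutoffs are a design choice and are not specified here.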
Samridhi16/wikipedia-language-classifier