Nikhil Pande

CS10 23W - Problem Set 5

Hidden Markov Models and the Viterbi Algorithm

In Problem Set 5, we build a machine-learning-based bot named Sudi that labels each word in a sentence with its part of speech, based on the word itself and its neighbors. The program reads large training sets of sentences and their tags, then uses a map of maps to build a hidden Markov model. The Viterbi algorithm scores each candidate tag from observation and transition probabilities to find the most likely path through the automaton, and the program outputs the predicted part of speech for every word along that path. The program can tag an entire test file of sentences and measure its accuracy against a corresponding file of correct tags. This model tagged parts of speech with 97% accuracy.
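The Viterbi step described above can be sketched roughly as follows. This is a minimal illustration, not the code in Sudi.java: the class name `ViterbiSketch`, the methods `tag` and `demo`, the `#` start state, the unseen-word penalty of -100, and the toy log-probabilities are all assumptions made for the example (it uses `Map.of`, so it needs Java 9 or later).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

public class ViterbiSketch {
    static final String START = "#";      // hypothetical label for the start state
    static final double UNSEEN = -100.0;  // assumed log-penalty for a word never observed with a tag

    /**
     * Returns the highest-scoring tag sequence for the given words.
     * Both models are maps of maps holding log probabilities:
     * transitions.get(tag).get(nextTag) and observations.get(tag).get(word).
     */
    public static List<String> tag(String[] words,
                                   Map<String, Map<String, Double>> transitions,
                                   Map<String, Map<String, Double>> observations) {
        Map<String, Double> currScores = new HashMap<>();
        currScores.put(START, 0.0);
        List<Map<String, String>> backPointers = new ArrayList<>();

        for (String word : words) {
            Map<String, Double> nextScores = new HashMap<>();
            Map<String, String> back = new HashMap<>();
            for (Map.Entry<String, Double> curr : currScores.entrySet()) {
                Map<String, Double> outgoing =
                        transitions.getOrDefault(curr.getKey(), Collections.emptyMap());
                for (Map.Entry<String, Double> t : outgoing.entrySet()) {
                    String next = t.getKey();
                    Map<String, Double> wordScores =
                            observations.getOrDefault(next, Collections.emptyMap());
                    double obs = wordScores.getOrDefault(word, UNSEEN);
                    // Log probabilities add: current score + transition + observation.
                    double score = curr.getValue() + t.getValue() + obs;
                    if (!nextScores.containsKey(next) || score > nextScores.get(next)) {
                        nextScores.put(next, score);
                        back.put(next, curr.getKey()); // remember the best predecessor
                    }
                }
            }
            backPointers.add(back);
            currScores = nextScores;
        }

        // Pick the best final state.
        String best = null;
        for (String state : currScores.keySet()) {
            if (best == null || currScores.get(state) > currScores.get(best)) best = state;
        }

        // Follow the back pointers from the last word to the first.
        LinkedList<String> path = new LinkedList<>();
        for (int i = words.length - 1; i >= 0 && best != null; i--) {
            path.addFirst(best);
            best = backPointers.get(i).get(best);
        }
        return path;
    }

    /** Runs the tagger on a toy two-tag model with made-up log probabilities. */
    public static List<String> demo() {
        Map<String, Map<String, Double>> transitions = Map.of(
                START, Map.of("N", -0.1, "V", -2.0),
                "N", Map.of("V", -0.2, "N", -1.5),
                "V", Map.of("N", -0.3, "V", -2.0));
        Map<String, Map<String, Double>> observations = Map.of(
                "N", Map.of("dog", -0.5, "bark", -3.0),
                "V", Map.of("bark", -0.4));
        return tag(new String[] {"dog", "bark"}, transitions, observations);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [N, V]
    }
}
```

Working in log space turns products of probabilities into sums, which avoids underflow on long sentences. The fixed penalty for unseen words keeps a tag reachable even when the training data never paired it with the current word.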

MarkovReader.java
README.md
Sudi.java
brown-test-sentences.txt
brown-test-tags.txt
brown-train-sentences.txt
brown-train-tags.txt
simple-test-sentences.txt
simple-test-tags.txt
simple-train-sentences.txt
simple-train-tags.txt
testtags.txt
testwords.txt

NPande25/PartOfSpeechTagger
