Searcher is a study project that uses the Vector Space Model (VSM) to search a set of documents and report how similar each document is to a query.
val documentTable = new DocumentTable()
documentTable.pushText("Hello World")
documentTable.pushText("Hello Guys")
documentTable.pushText("Hello Main")
documentTable.pushText("Hello Man Guys World")
documentTable.pushText("Hello hello")
documentTable.pushText("Hello guys guys hello hello guys hello")
documentTable.pushQuery("Hello Guys")
documentTable.result().show()

Id Similarity
5 0.8164965809277259
1 0.6666666666666667
6 0.8082903768654761
2 1.0000000000000002
3 0.6666666666666667
4 0.8660254037844387

In the first step, the input is normalized by removing accents and punctuation and converting all letters to lowercase. Immediately after that, the normalized sentences are indexed into an inverted index.
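The two steps described above can be sketched roughly as follows. This is a minimal illustration, not Searcher's actual implementation: the object `InvertedIndexSketch` and the functions `normalize` and `buildIndex` are hypothetical names, and storing a term frequency per document in each posting is an assumption. Document ids start at 1 to match the `Id` column in the output above.

```scala
import java.text.Normalizer
import java.util.Locale

// Hypothetical helper object; the names are illustrative, not Searcher's API.
object InvertedIndexSketch {

  // Strip accents, replace punctuation with spaces, lowercase, and split on whitespace.
  def normalize(text: String): Seq[String] = {
    Normalizer
      .normalize(text, Normalizer.Form.NFD)     // decompose accented letters
      .replaceAll("\\p{M}", "")                 // drop the combining accent marks
      .replaceAll("[^\\p{L}\\p{Nd}\\s]", " ")   // anything but letters, digits, whitespace
      .toLowerCase(Locale.ROOT)
      .split("\\s+")
      .filter(_.nonEmpty)
      .toSeq
  }

  // Map each term to (document id -> term frequency).
  // Assumption: ids are 1-based and follow insertion order.
  def buildIndex(docs: Seq[String]): Map[String, Map[Int, Int]] = {
    val postings = for {
      (text, i) <- docs.zipWithIndex
      term      <- normalize(text)
    } yield (term, i + 1)

    postings
      .groupBy { case (term, _) => term }
      .map { case (term, hits) =>
        term -> hits.groupBy(_._2).map { case (id, occ) => id -> occ.size }
      }
  }
}
```

For example, `InvertedIndexSketch.buildIndex(Seq("Hello World", "Hello Guys", "Hello hello"))` maps `"hello"` to `Map(1 -> 1, 2 -> 1, 3 -> 2)` and `"guys"` to `Map(2 -> 1)`. Keeping the term frequency next to each document id is one possible design: it lets a later scoring step read term weights straight from the index without rescanning the original text.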
