This project provides a dataset of social media comments that can be categorized by sentiment (positive, negative, or neutral) and by topic, across five topics: Education, Transport, Crime, Health, and Hygiene.
The comments were processed with count vectorization, and a Multinomial Naïve Bayes classifier trained on them reached an accuracy of 89%. The dataset is included along with the .py and .ipynb files.
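
As a rough illustration of the approach described above, the sketch below chains count vectorization and Multinomial Naïve Bayes using scikit-learn. It is not the code from the included .py or .ipynb files: the use of scikit-learn, the toy comments, the helper name `train`, and the choice to fit one model per label type (sentiment and topic) are all assumptions made for the example.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data invented for illustration (not taken from the project's dataset).
comments = [
    "The new school library is wonderful",
    "Teachers at this school are excellent",
    "The bus was late again and very crowded",
    "Terrible traffic and broken buses every day",
    "The hospital staff were kind and helpful",
    "Long waiting times at the clinic are awful",
]
sentiments = ["positive", "positive", "negative", "negative", "positive", "negative"]
topics = ["Education", "Education", "Transport", "Transport", "Health", "Health"]


def train(texts, labels):
    """Fit a pipeline that turns text into word counts, then applies Multinomial NB.

    `train` is a hypothetical helper name used only in this sketch.
    """
    model = make_pipeline(CountVectorizer(), MultinomialNB())
    model.fit(texts, labels)
    return model


# Assumption: one classifier per label type (sentiment and topic).
sentiment_model = train(comments, sentiments)
topic_model = train(comments, topics)

new_comments = ["The bus service is crowded", "Excellent teachers at the school"]
print(sentiment_model.predict(new_comments))
print(topic_model.predict(new_comments))
```

With only six training comments the predictions are not meaningful; the point is the shape of the pipeline, where the vectorizer learns the vocabulary during `fit` and reuses it when new comments are passed to `predict`.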
To classify new comments, enter them in comma-separated format; the output is saved to a CSV file named ‘result’.
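
The input and output handling described above might look roughly like the following sketch, which uses only the Python standard library. The function names, the `result.csv` file extension, the column headers, and the `dummy_predict` stand-in (used here in place of a trained model) are assumptions for illustration, not the project's actual code.

```python
import csv


def parse_comments(raw):
    """Split a comma-separated string into stripped, non-empty comments."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def save_predictions(comments, predict, path="result.csv"):
    """Write one row per comment, paired with its predicted label, to a CSV file."""
    labels = predict(comments)
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["comment", "prediction"])  # hypothetical column names
        writer.writerows(zip(comments, labels))
    return path


def dummy_predict(comments):
    """Stand-in for a trained model's predict method; always returns 'neutral'."""
    return ["neutral" for _ in comments]


raw_input_text = "The bus was late again, Great new school library"
save_predictions(parse_comments(raw_input_text), dummy_predict)
```

Because commas separate the comments, a comment that itself contains a comma would be split into two entries under this scheme.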