Minor Project
Group members:
- Nitesh Vishwakarma
- Sarthak Patel
- Neomi Sule
- Harshit
Online technologies have grown rapidly in recent years, and many everyday tasks have moved online. Reviews play an essential part in online decisions, from buying a product based on product reviews to deciding whether to watch a movie based on movie reviews. Labelling every such review as positive or negative by hand is not feasible, which has motivated research into automated sentiment analysis. Research in this area is still limited, so our project contributes to it and supports further classification tasks that can be added to the pipeline to help with sentiment analysis of reviews. For ease of writing, many people in India prefer to write their opinions in Hinglish rather than in only English or only Hindi. Our technique processes and analyzes such Hinglish datasets using Python, scikit-learn, and transformers (for future work), along with libraries such as Indic translate, Indic transliterate, NLTK, pandas, regex, and Enchant.
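As a rough illustration of the kind of preprocessing such a pipeline might start with, the sketch below cleans a Hinglish review with regular expressions and tags each token as English or not. It is not the project's actual code: the function names, the cleaning rules, and the tiny `ENGLISH_WORDS` set are all hypothetical stand-ins (a real setup would likely use a full dictionary such as Enchant and a transliteration library for the non-English tokens).

```python
import re

# Hypothetical stand-in for a real English dictionary (for example an Enchant dictionary).
ENGLISH_WORDS = {"the", "movie", "was", "very", "good", "but", "story", "is", "not"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")
REPEAT_RE = re.compile(r"(.)\1{2,}")  # three or more copies of the same character


def clean_text(text):
    """Lowercase, drop URLs and punctuation, and shorten elongated words."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = NON_ALNUM_RE.sub(" ", text)
    text = REPEAT_RE.sub(r"\1\1", text)  # "sooooo" becomes "soo"
    return " ".join(text.split())        # collapse runs of whitespace


def tag_tokens(text):
    """Tag each cleaned token as 'en' (found in the word list) or 'other'."""
    return [
        (token, "en" if token in ENGLISH_WORDS else "other")
        for token in clean_text(text).split()
    ]


if __name__ == "__main__":
    print(tag_tokens("Movie bahut good thi!!! http://example.com"))
```

Tokens tagged `other` are the candidates for later steps such as transliteration or translation, while the `en` tokens can pass through unchanged.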
- Live Link: https://huggingface.co/spaces/Tihsrah/Hinglish-Text-Normalizer
- Report: https://docs.google.com/document/d/1zcWExfY3fAOs7_fk4mGStY6Pk8YacExC8IGu-up2YXg/edit?usp=sharing
- SRS: https://docs.google.com/document/d/1oqa0pW-DE09NmIxdcSnAFK6rATl6CKLfNi7Z5qfr0QU/edit?usp=sharing
- PPT: https://docs.google.com/presentation/d/1nUFjB7_eIP9LM4FNkx6NwMtUk4pmbEId/edit?usp=sharing&ouid=108095548513010025067&rtpof=true&sd=true
Some important Colab files that contain experiments (not in order):
- https://colab.research.google.com/drive/16vS_HvgSyM-g4MJ6si2iKUObRuBMkOtr
- https://colab.research.google.com/drive/1D9RWkX4QZ0idughQw2E-0lqPUOwg4v4N
- https://colab.research.google.com/drive/1s969YqJY6o4qgwvnfFp9ms5CvNXVtDkr
- https://colab.research.google.com/drive/1SFv51Cq2J8hXgh_7caHjc8KashsldkZ8
- https://colab.research.google.com/drive/1vwVGzFVr0Ds9yA33TyUM_Sjgj5lXtHo2
Some important Kaggle notebooks:
- https://www.kaggle.com/code/tihsrahly/scrapped-test-000-100
- https://www.kaggle.com/code/tihsrahly/pppm-deberta-v3-large-baseline-w-w-b-tr-c9dbd3
- https://www.kaggle.com/code/tihsrahly/train-15000-end
(There are other files as well; see my Kaggle notebooks.)
Datasets :
https://www.kaggle.com/tihsrahly/datasets
Note: Some models are stored in the NNST_SENTIMENT_ANALYSIS folder in UPES_STUDENT_ONE_DRIVE.