Concept Level Sentiment Analysis of Bengali Text

Overview

This project explores concept-level sentiment analysis of Bengali text, focusing on restaurant reviews, comments, and blogs. Unlike traditional feature-based sentiment analysis, it uses custom-built dictionaries to extract concepts from the text and classifies sentiment on those concepts.

Key Features

Data Collection:
- Collected 2,700 Bengali text reviews, comments, and blogs from Facebook.
- Labeled data manually into positive, negative, and neutral sentiment categories.
Preprocessing Workflow:
- Punctuation Removal: Removed unnecessary symbols and spaces.
- Tokenization & Detokenization: Processed text using nltk for splitting and rejoining meaningful words.
- Stopword Removal: Eliminated uninformative words such as "অথচ" (yet), "অথবা" (or), and "এটি" (it).
- Concept Construction: Built custom dictionaries for nouns, adjectives, verbs, and adverbs to extract sentiment-bearing concepts.
Machine Learning Models:
- Evaluated algorithms: Naïve Bayes, Support Vector Machine (SVM), Random Forest, Decision Tree, and K-Nearest Neighbor.
- Compared model performance on feature-based and concept-level datasets.
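
The preprocessing workflow listed above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the project uses nltk for tokenization, while this sketch substitutes plain whitespace splitting to stay dependency-free. The function names and the punctuation set are assumptions, and the stopword list contains only the three words quoted above.

```python
import re
import string

# Only these three stopwords are quoted in this README; the full list is an assumption.
STOPWORDS = {"অথচ", "অথবা", "এটি"}

# Assumed punctuation set: ASCII punctuation plus the Bengali danda (।).
PUNCT_RE = re.compile("[" + re.escape(string.punctuation + "।") + "]")


def remove_punctuation(text):
    """Replace punctuation with spaces, then collapse repeated whitespace."""
    text = PUNCT_RE.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Split on whitespace (a stand-in for an nltk tokenizer)."""
    return text.split()


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword set."""
    return [token for token in tokens if token not in STOPWORDS]


def detokenize(tokens):
    """Rejoin the remaining tokens into a single string."""
    return " ".join(tokens)


def preprocess(text):
    """Run the four steps in order and return the cleaned text."""
    return detokenize(remove_stopwords(tokenize(remove_punctuation(text))))


cleaned = preprocess("খাবার খুব ভালো, অথচ দাম বেশি।")
```

Each step is a separate function so that any one of them, for example the tokenizer, can be swapped for the nltk equivalent without touching the rest.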

Methodology

Concept Extraction:
- Built custom dictionaries to identify sentiment-rich concepts.
- Mapped these concepts to BanglaSenticNet for sentiment matching.
Data Processing:
- Cleaned data and transformed text into concepts using custom algorithms.
- Generated two datasets: one for feature-based analysis and another for concept-level analysis.
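
As a rough illustration of turning tokens into concepts and matching them against a lexicon, the sketch below pairs each noun with the next adjective and looks the pair up in a small table. Everything here is hypothetical: the dictionary entries, the noun-adjective pairing rule, and the polarity values are invented for the example, and the lookup table is only a stand-in for BanglaSenticNet, whose real format is not described in this README.

```python
# Hypothetical part-of-speech dictionaries (the project's real ones are not shown here).
NOUNS = {"খাবার", "দাম"}
ADJECTIVES = {"ভালো", "বেশি"}

# Hypothetical stand-in for a BanglaSenticNet lookup: concept -> polarity.
CONCEPT_POLARITY = {
    "খাবার ভালো": 0.8,   # "food good"
    "দাম বেশি": -0.4,    # "price high"
}


def extract_concepts(tokens):
    """Pair each noun with the next adjective that follows it (illustrative rule)."""
    concepts = []
    current_noun = None
    for token in tokens:
        if token in NOUNS:
            current_noun = token
        elif token in ADJECTIVES and current_noun is not None:
            concepts.append(f"{current_noun} {token}")
            current_noun = None
    return concepts


def match_concepts(concepts, lexicon=CONCEPT_POLARITY):
    """Keep only the concepts present in the lexicon, with their polarity."""
    return {concept: lexicon[concept] for concept in concepts if concept in lexicon}


tokens = ["খাবার", "খুব", "ভালো", "দাম", "বেশি"]
concepts = extract_concepts(tokens)
matched = match_concepts(concepts)
```

In this example the two extracted concepts are "খাবার ভালো" and "দাম বেশি"; concepts missing from the lexicon are simply dropped by `match_concepts`.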
Evaluation Metrics:
- Used metrics such as accuracy, precision, recall, and F1-score for model performance evaluation.
- Used a confusion matrix to analyze predictions.
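
To make the metrics above concrete, here is a small plain-Python sketch that builds a confusion matrix for the three sentiment labels and derives accuracy, precision, recall, and F1-score from it. The function names and the toy labels are assumptions made for illustration; the README does not say how the project computed these numbers, and a library such as scikit-learn provides equivalent functions.

```python
from collections import Counter

LABELS = ["positive", "negative", "neutral"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Return nested counts: matrix[true_label][predicted_label]."""
    counts = Counter(zip(y_true, y_pred))
    return {t: {p: counts[(t, p)] for p in labels} for t in labels}


def accuracy(matrix):
    """Fraction of samples on the diagonal of the matrix."""
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(matrix[label][label] for label in matrix)
    return correct / total if total else 0.0


def per_class_metrics(matrix, label):
    """Return (precision, recall, f1) for one label, treated one-vs-rest."""
    tp = matrix[label][label]
    fp = sum(matrix[t][label] for t in matrix if t != label)
    fn = sum(matrix[label][p] for p in matrix[label] if p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy labels, invented for the example.
y_true = ["positive", "positive", "negative", "negative", "neutral", "neutral"]
y_pred = ["positive", "negative", "negative", "negative", "neutral", "positive"]

matrix = confusion_matrix(y_true, y_pred)
overall_accuracy = accuracy(matrix)
positive_scores = per_class_metrics(matrix, "positive")
```

Reading the metrics off the confusion matrix, rather than off the raw predictions, keeps the per-class errors (which label was confused with which) visible alongside the summary scores.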

Results

Feature-Based Approach:
- SVM achieved the highest accuracy: 84%.
- Other models' accuracy: Random Forest (73%), Decision Tree (72%), Naïve Bayes (66%).
Concept-Level Approach:
- SVM achieved the highest accuracy: 96%.
- Other models' accuracy: Random Forest (75%), Decision Tree (71%), Naïve Bayes (68%), K-Nearest Neighbor (68%).
Performance Comparison:
- On accuracy, the concept-level approach outperformed the feature-based approach for SVM (96% vs. 84%), Random Forest (75% vs. 73%), and Naïve Bayes (68% vs. 66%). Decision Tree was the exception, dropping slightly from 72% to 71%.
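
The accuracies reported above can be compared model by model with a short sketch. It uses only the percentages listed in this section; the function name is an assumption, and K-Nearest Neighbor is skipped in the comparison because no feature-based figure is reported for it.

```python
# Accuracies (in percent) as reported in the Results section.
FEATURE_BASED = {
    "SVM": 84,
    "Random Forest": 73,
    "Decision Tree": 72,
    "Naïve Bayes": 66,
}
CONCEPT_LEVEL = {
    "SVM": 96,
    "Random Forest": 75,
    "Decision Tree": 71,
    "Naïve Bayes": 68,
    "K-Nearest Neighbor": 68,
}


def accuracy_gains(baseline, improved):
    """Percentage-point change for every model present in both result sets."""
    return {model: improved[model] - baseline[model]
            for model in baseline if model in improved}


gains = accuracy_gains(FEATURE_BASED, CONCEPT_LEVEL)
for model, delta in gains.items():
    print(f"{model}: {delta:+d} percentage points")
```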

Conclusion

The proposed concept-level approach improves sentiment detection by incorporating domain-specific knowledge from dictionaries.
SVM consistently outperformed other models, validating its robustness in both feature-based and concept-level approaches.
The research highlights the challenges of extracting meaningful concepts from Bengali text and the effectiveness of custom-built dictionaries.

Future Directions

Advanced Concept Extraction:
- Implement dependency-based semantic parsing for improved results.
Expanding Sentiment Labels:
- Move beyond positive, negative, and neutral to include emotions like joy, anger, and sadness.
Larger Datasets:
- Work with a more extensive and diverse dataset to generalize findings.
Multi-language Support:
- Apply the methodology to other languages using R or Python.

References

Key references include:

- BanglaSenticNet for concept matching.
- SentiWordNet for sentiment classification.
- Various research papers on multilingual sentiment analysis and dependency parsing.

For more details, please refer to the research paper attached in this repository.
