This project aims to develop a paraphrasing tool for low-resource languages, focusing on Marathi. The tool will transform complex sentences into simpler ones while maintaining their meaning, helping with language learning, NLP applications, and text simplification.
- Sentence Validation: Ensures input sentences conform to the dataset.
- Paraphrasing Engine: Uses NLP techniques like tokenization, stemming, and synonym replacement.
- Evaluation System: Measures similarity between original and paraphrased sentences.
- Dataset Management: Supports dataset expansion for improved paraphrasing.
- Programming Language: Python
- Libraries: NLTK, IndoWordNet, mahaNLP
- Platform: Google Colab, Spyder
- Machine Learning Models: Transformer-based models, LSTM
- Clone the repository
git clone https://github.com/your-repo/paraphrasing-tool.git
cd paraphrasing-tool
- Install dependencies
pip install -r requirements.txt
- Run the tool
python main.py
- Hardware: Minimum 4GB RAM, Core i3 processor
- Software: Python 3.x, Google Colab (recommended)
- Implement lemmatization for better word replacement
- Expand dataset for improved paraphrasing
- Enhance semantic understanding using deep learning models
- Pranita Barbade
- Akshada Malpure
- Anushka Pawar
- Reena Prasad