An interactive tool for evaluating English-to-Bengali machine translation. It translates English text with Facebook's NLLB-200-1.3B (No Language Left Behind) model and scores the output against human references using the METEOR metric.
- Translate English text to Bengali using Facebook's NLLB-200-1.3B model
- Compare machine translations against multiple human reference translations
- Calculate METEOR evaluation scores to assess translation quality
- Interactive command-line interface for easy usage
- Python 3.7 or higher
- transformers
- nltk
- sacrebleu
1. Clone this repository:

   git clone https://github.com/AHMED-SAFA/Bengali-Translation-Evaluator.git
   cd Bengali-Translation-Evaluator

2. Install the required dependencies:

   pip install transformers nltk sacrebleu

3. Download the required NLTK data:

   import nltk
   nltk.download('wordnet')
   nltk.download('omw-1.4')
from translator import translate_user_input
# Start the interactive translation and evaluation session
translate_user_input()

--- User Input Translation with Multiple References ---
For each English input, two reference Bengali translations are needed
The function will calculate a METEOR score for each reference and their average
Enter English text to translate to Bengali (or type 'exit' to quit): I am going to the market.
Step 1: Translating input text...
English: I am going to the market.
Model Translation: আমি বাজারে যাচ্ছি।
Step 2: Collecting reference translations...
Enter 1st reference Bengali translation: আমি বাজারে যাচ্ছি।
Enter 2nd reference Bengali translation: আমি হাটে যাচ্ছি।
Step 3: Calculating METEOR score for 1st reference...
Reference 1: আমি বাজারে যাচ্ছি।
METEOR Score 1: 1.0000
Step 4: Calculating METEOR score for 2nd reference...
Reference 2: আমি হাটে যাচ্ছি।
METEOR Score 2: 0.7500
Step 5: Calculating average METEOR score...
Average METEOR Score: 0.8750
Calculation: (1.0000 + 0.7500) / 2 = 0.8750
---------------------------------------------------

- Translation: The tool uses Facebook's NLLB-200-1.3B model to translate English text to Bengali
- Reference Collection: Users provide two reference translations for comparison
- METEOR Calculation: The tool calculates METEOR scores between the machine translation and each reference
- Evaluation: An average METEOR score is computed to evaluate overall translation quality
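The multi-reference evaluation flow above can be sketched in plain Python. Note that `meteor` here is a trivial unigram-overlap stand-in (so the sketch runs without NLTK data), not the real METEOR metric, and `evaluate_translation` is an illustrative helper, not a function from this repository:

```python
def meteor(reference, hypothesis):
    """Toy unigram-overlap score in [0, 1]; stands in for the real METEOR metric."""
    ref_tokens, hyp_tokens = reference.split(), hypothesis.split()
    if not hyp_tokens:
        return 0.0
    matches = sum(1 for tok in hyp_tokens if tok in ref_tokens)
    return matches / len(hyp_tokens)

def evaluate_translation(hypothesis, references):
    """Score a machine translation against each reference, then average."""
    scores = [meteor(ref, hypothesis) for ref in references]
    return {"scores": scores, "average": sum(scores) / len(scores)}

result = evaluate_translation(
    "আমি বাজারে যাচ্ছি।",
    ["আমি বাজারে যাচ্ছি।", "আমি হাটে যাচ্ছি।"],
)
print(result["scores"])             # one score per reference
print(round(result["average"], 4))  # their mean, as in Step 5 above
```

Swapping the stand-in scorer for `nltk.translate.meteor_score.meteor_score` (on tokenized inputs) reproduces the tool's actual evaluation step.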
from nltk.translate.meteor_score import meteor_score

def calculate_meteor(reference, hypothesis):
    """Calculate METEOR score between reference and hypothesis."""
    try:
        # meteor_score expects tokenized input: a list of reference
        # token lists and a hypothesis token list
        reference_tokens = reference.split()
        hypothesis_tokens = hypothesis.split()
        score = meteor_score([reference_tokens], hypothesis_tokens)
        return score
    except Exception as e:
        print(f"Error calculating METEOR score: {e}")
        return None

The METEOR (Metric for Evaluation of Translation with Explicit ORdering) score ranges from 0 to 1:
- 1.0: Perfect match with the reference
- 0.7-0.9: High-quality translation
- 0.5-0.7: Moderate-quality translation
- Below 0.5: Low-quality translation that may need improvement
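The interpretation bands above can be captured in a small helper. The band names and cut-offs simply mirror this table; they are an interpretive convention, not part of the METEOR metric itself:

```python
def interpret_meteor(score):
    """Map a METEOR score in [0, 1] to the quality bands described above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"METEOR score must be in [0, 1], got {score}")
    if score == 1.0:
        return "perfect match"
    if score >= 0.7:
        return "high quality"
    if score >= 0.5:
        return "moderate quality"
    return "low quality; may need improvement"

print(interpret_meteor(0.8750))  # high quality
```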
You can modify the model parameters for different translation requirements:
tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-3.3B") # Larger model
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-3.3B")
translation = translator(user_input, max_length=512, num_beams=5)

The No Language Left Behind (NLLB) model by Meta AI is designed to provide high-quality translations across 200+ languages, including many low-resource languages. This project utilizes the NLLB-200-1.3B model, which offers a good balance between performance and resource requirements.
Contributions are welcome! Please feel free to submit a Pull Request.