A multilingual text summarizer built with Hugging Face Transformers that supports both Bengali and English.
This project demonstrates abstractive summarization with mT5-small, optimized with ONNX Runtime and 8-bit quantization for faster inference.
- ✅ Supports two languages: Bengali + English
- ✅ Transformer-based summarization using `MT5ForConditionalGeneration`
- ✅ Evaluated with ROUGE metrics (best ROUGE-L = 0.0654)
- ✅ Optimized with ONNX runtime + 8-bit quantization
- ✅ Post-processing filters for cleaner summaries
- ✅ Inference Speedup – Reduced runtime from 71s → 21s
- Bengali: Prothom Alo News (Kaggle)
- English: CNN/DailyMail (Kaggle)
Data columns:
- `description` → Input text
- `summary` → Target summary
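As a rough illustration of the column layout above, the sketch below reads `description`/`summary` pairs from a CSV export and skips rows where either field is empty. The helper name `load_pairs` and the CSV format are assumptions made for this example; the notebooks may load the Kaggle data differently.

```python
import csv
import io


def load_pairs(handle):
    """Return (description, summary) tuples from an open CSV file handle.

    Hypothetical helper: assumes the file has `description` and `summary`
    header columns, as listed above.
    """
    pairs = []
    for row in csv.DictReader(handle):
        text = (row.get("description") or "").strip()
        target = (row.get("summary") or "").strip()
        # Keep only rows that have both an input text and a target summary.
        if text and target:
            pairs.append((text, target))
    return pairs


# Tiny in-memory example; a real run would pass open("<dataset>.csv", encoding="utf-8").
sample = io.StringIO(
    "description,summary\n"
    "Long article text,Short summary\n"
    ",summary without input\n"
)
pairs = load_pairs(sample)
```

Opening the real files with `encoding="utf-8"` matters for the Bengali dataset, since the platform default encoding may not handle Bengali script.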
- Tokenizer: `MT5Tokenizer`
- Base Model: `MT5ForConditionalGeneration` (`google/mt5-small`)
- Training Args:
  - `learning_rate = 2e-5`
  - `num_train_epochs = 6`
  - `per_device_train_batch_size = 8`
  - `evaluation_strategy = "epoch"`
- Best Validation Loss: `0.8312` (after 6 epochs)
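The setup above could be wired together roughly as follows. This is a minimal sketch, not the notebook's actual code: the use of `Seq2SeqTrainingArguments`, the `output_dir` value, and both function names are assumptions. The heavy imports are deferred so the configuration can be inspected without downloading the model.

```python
# Hyperparameters exactly as listed above.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "num_train_epochs": 6,
    "per_device_train_batch_size": 8,
    # Note: recent Transformers releases rename this argument to `eval_strategy`.
    "evaluation_strategy": "epoch",
}


def load_model_and_tokenizer(checkpoint="google/mt5-small"):
    """Load the base model and tokenizer named in this README."""
    from transformers import MT5ForConditionalGeneration, MT5Tokenizer

    tokenizer = MT5Tokenizer.from_pretrained(checkpoint)
    model = MT5ForConditionalGeneration.from_pretrained(checkpoint)
    return model, tokenizer


def build_training_args(output_dir="mt5-small-quicksum"):
    """Build training arguments; `output_dir` is a hypothetical path."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

Evaluating once per epoch means validation loss is reported six times over the run, which is how a best value can be picked out at the end.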
| Metric | Value |
|---|---|
| Validation Loss | 0.8312 |
| ROUGE-L | 0.0654 |
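For readers unfamiliar with the metric, ROUGE-L scores a candidate summary by the longest common subsequence (LCS) of tokens it shares with the reference. The sketch below is a simplified pure-Python version that assumes whitespace tokenization and the F1 form of the score; it is not necessarily how the value in the table was computed, since ROUGE libraries apply their own tokenization and options.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 between two strings, using whitespace tokens (an assumption)."""
    ref, cand = reference.split(), candidate.split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


score = rouge_l_f1("the cat sat on the mat", "the cat lay on a mat")
```

In this example the shared subsequence is "the cat on mat" (4 of 6 tokens on each side), so precision, recall, and F1 all come out to 2/3.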
- ✅ ONNX Runtime – reduced from 71.4s → 58.5s
- ✅ 8-bit Quantization – reduced further to 21.4s
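One common way to get 8-bit weights for an ONNX model is ONNX Runtime's dynamic quantization. The sketch below assumes that approach and an already-exported `.onnx` file; the function name and paths are hypothetical, and the notebook may use a different quantization method. The `speedup` helper simply restates the timings above as ratios.

```python
def quantize_to_int8(onnx_in, onnx_out):
    """Write an 8-bit weight-quantized copy of an exported ONNX model.

    Assumes dynamic quantization; the import is deferred so the timing
    helper below works without onnxruntime installed.
    """
    from onnxruntime.quantization import QuantType, quantize_dynamic

    quantize_dynamic(
        model_input=onnx_in,
        model_output=onnx_out,
        weight_type=QuantType.QInt8,
    )


def speedup(before_s, after_s):
    """How many times faster the `after` run is than the `before` run."""
    return before_s / after_s


# Runtimes in seconds, as reported above.
BASELINE_S, ONNX_S, QUANTIZED_S = 71.4, 58.5, 21.4

onnx_speedup = speedup(BASELINE_S, ONNX_S)
total_speedup = speedup(BASELINE_S, QUANTIZED_S)
```

With the reported numbers, ONNX Runtime alone gives roughly a 1.22x speedup, and adding 8-bit quantization brings the total to roughly 3.34x over the baseline.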
- `filter_short_sentences()` → Removes sentences with fewer than 3 words
- Beam search decoding parameters:
  - `max_length = 500`
  - `num_beams = 4`
  - `early_stopping = True`
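The two post-processing pieces above might look like the following sketch. Only the function name `filter_short_sentences` and the three decoding parameters come from this README; the sentence-splitting rule (including the Bengali danda `।`), the `min_words` parameter, and the `summarize` helper are assumptions for illustration.

```python
import re

# Split after ., !, ? or the Bengali danda, when followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?।])\s+")

# Beam search decoding parameters as listed above.
GENERATION_KWARGS = {"max_length": 500, "num_beams": 4, "early_stopping": True}


def filter_short_sentences(text, min_words=3):
    """Drop sentences that have fewer than `min_words` whitespace-separated words."""
    sentences = [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]
    kept = [s for s in sentences if len(s.split()) >= min_words]
    return " ".join(kept)


def summarize(text, model, tokenizer):
    """Hypothetical helper: generate a summary with beam search, then clean it."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    summary = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return filter_short_sentences(summary)


cleaned = filter_short_sentences("Yes. The model summarizes news articles. OK then.")
```

Here the one-word and two-word sentences are dropped, leaving only "The model summarizes news articles."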
QuickSum/
├── training/
│ └── training.ipynb # Model training experiments, Evaluation & ROUGE scoring
│
├── optimization/
│ ├── onnx_optimize.ipynb # ONNX conversion + quantization
│ └── postprocess.ipynb # Post-processing filters
│
├── static/
│ ├── css/ # CSS stylesheets
│ └── js/ # JavaScript files
│
├── templates/
│ └── index.html # front end skeleton
│
├── .gitignore
├── LICENSE
├── README.md
└── requirements.txt
- Install Dependencies: Make sure you have Python installed on your system. Install the required Python libraries by running the following command:
  `pip install -r requirements.txt`
- Run the FastAPI Server: To start the server, use the following command:
  `uvicorn server:app --reload`

Once the server is running, you can access the API at:
- http://127.0.0.1:8000/ (Main endpoint)
Screenshots: Summarizer API, Bengali Text, English Text
- Python
- Hugging Face Transformers
- PyTorch
- ONNX Runtime
- FastAPI
This project highlights:
- Using Transformer-based architectures
- Handling multilingual NLP tasks
- Model training + evaluation + optimization
- Applying quantization for production-ready inference
If you'd like to discuss NLP, transformers, or optimization techniques, feel free to connect! 🚀