This project fine-tunes a T5 model to correct grammatical errors in English text, using the JFLEG dataset for training and evaluation.
The assignment uses the JFLEG (JHU FLuency-Extended GUG) dataset for grammatical error correction.
- Contains 1,511 sentences with 4 human-written fluency corrections each
- Split into a development set (754 sentences) and a test set (747 sentences)
- The development set is used for training; the test set is used for validation and testing
- Focuses on fluency edits beyond just grammatical corrections
- Citation: Napoles, C., Sakaguchi, K., & Tetreault, J. (2017). JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction
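To make the record structure above concrete, here is a minimal sketch of what one JFLEG example looks like. The field names ("sentence" for the source, "corrections" for the references) follow the Hugging Face copy of the dataset and are an assumption here; the example sentence is the one used in the JFLEG paper.

```python
# Toy record mimicking the JFLEG schema (assumed field names:
# "sentence" for the source, "corrections" for the references).
example = {
    "sentence": "She see Tom is catched by policeman in park at last night.",
    "corrections": [
        "She saw Tom caught by a policeman in the park last night.",
        "She saw Tom get caught by a policeman in the park last night.",
        "She saw that Tom was caught by a policeman in the park last night.",
        "Last night she saw Tom caught by a policeman in the park.",
    ],
}

# Each source sentence carries 4 human-written fluency corrections.
assert len(example["corrections"]) == 4
```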
- Base Model: T5-small (60 million parameters)
- Fine-tuning: Using LoRA (Low-Rank Adaptation)
Assignment2/Grammar_Correction
├── grammar_correction_t5_lora.ipynb # Main training and evaluation notebook (download to view; not rendered on GitHub)
├── README.md # This file
├── Images/ # Directory for saving output images
└── ../Checkpoints/ # Directory for saving training logs
The model is trained with the following configuration:
- Batch size: 4
- Number of epochs: 3
- Learning rate: 0.0001
- Warmup ratio: 0.1
- Weight Decay: 0.01
- LoRA Rank: 4
- LoRA Alpha: 8
- LoRA Dropout: 0.1
- Device: CUDA (if available) or CPU
- Random seed: 42
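The trainable-parameter count reported in the next section (147,456) can be reproduced from this configuration by simple arithmetic, assuming the LoRA adapters target the query and value projections of every attention block in T5-small (d_model = 512; 6 encoder layers with self-attention, 6 decoder layers with self- and cross-attention). This is a sketch of that calculation, not code from the notebook:

```python
# LoRA adds two low-rank factors A (r x d_in) and B (d_out x r) per
# adapted weight matrix, i.e. r * (d_in + d_out) extra parameters.
d_model = 512   # T5-small hidden size
r = 4           # LoRA rank from the configuration above

params_per_matrix = r * (d_model + d_model)   # 4 * 1024 = 4096

# Attention modules: 6 encoder self-attn + 6 decoder self-attn
# + 6 decoder cross-attn = 18; adapting q and v gives 36 matrices.
n_matrices = 18 * 2

trainable = n_matrices * params_per_matrix
print(trainable)  # 147456, matching the reported trainable params
```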
The implementation includes:
- Augmentation of the training set by mapping each source sentence to each of its multiple references.
- Setup of the LoRA T5 model (trainable params: 147,456 || all params: 60,654,080 || trainable%: 0.2431).
- Evaluation after every epoch on GLEU, BERTScore and METEOR.
- Visualisation of training and validation loss, metrics and learning rate.
- Testing on 10% of the validation data.
- Inference on a few example sentences.
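The augmentation step above can be sketched as a simple flattening: every (source, reference) pair becomes its own training example. This is a minimal illustration with toy data; the actual notebook applies the same idea to the JFLEG splits.

```python
def augment(records):
    """Flatten each source with multiple references into
    one (source, target) training pair per reference."""
    pairs = []
    for rec in records:
        for ref in rec["corrections"]:
            pairs.append({"source": rec["sentence"], "target": ref})
    return pairs

# Toy data shaped like JFLEG: one source, several references.
records = [
    {"sentence": "He go to school .",
     "corrections": ["He goes to school .", "He went to school ."]},
    {"sentence": "I likes it .",
     "corrections": ["I like it ."]},
]

pairs = augment(records)
print(len(pairs))  # 3 pairs from 2 sources
```

On JFLEG, where each sentence has 4 references, this grows the training set roughly fourfold.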
- datasets==3.6.0
- evaluate==0.4.4
- huggingface-hub==0.33.1
- ipykernel==6.29.5
- ipython==9.3.0
- jupyter_client==8.6.3
- jupyter_core==5.8.1
- matplotlib==3.10.3
- matplotlib-inline==0.1.7
- nltk==3.9.1
- numpy==2.3.1
- pandas==2.3.0
- pycocotools==2.0.10
- rouge_score==0.1.2
- scikit-learn==1.7.0
- seaborn==0.13.2
- textstat==0.7.7
- tokenizers==0.21.2
- torch==2.7.1
- torchaudio==2.7.1
- torchvision==0.22.1
- tqdm==4.67.1
- transformers==4.52.4
- wordcloud==1.9.4
- Clone the repository
- Set up the environment with required dependencies
- Run the grammar_correction_t5_lora.ipynb notebook for EDA, training and evaluation of the model
The model is evaluated on the validation set using the following metrics:
- GLEU (Generalized Language Evaluation Understanding): Specifically designed for grammatical error correction; recall-oriented
- Achieved 66.58% on test data.
- METEOR: Incorporates linguistic understanding.
- Achieved 86.94% on test data.
- BERTScore: Measures semantic similarity
- Precision: 92.32% on test data.
- Recall: 93.25% on test data.
- F1: 92.76% on test data.
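To make the GLEU intuition concrete: unlike BLEU, GLEU rewards n-grams the hypothesis shares with the reference and penalises n-grams it keeps from the source that the reference does not contain. The following is a simplified, single-reference toy version of that idea, not the official implementation used in the notebook:

```python
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_gleu(source, reference, hypothesis, max_n=2):
    """Toy, single-reference GLEU-style score: credit hypothesis
    n-grams found in the reference, subtract credit for n-grams
    kept from the source that the reference lacks."""
    src, ref, hyp = source.split(), reference.split(), hypothesis.split()
    num, den = 0, 0
    for n in range(1, max_n + 1):
        h, r, s = ngrams(hyp, n), ngrams(ref, n), ngrams(src, n)
        match = sum((h & r).values())
        # n-grams shared with the source but absent from the reference
        penalty = sum(((h & s) - r).values())
        num += max(match - penalty, 0)
        den += sum(h.values())
    return num / den if den else 0.0

src = "He go to school ."
ref = "He goes to school ."
print(simple_gleu(src, ref, "He goes to school ."))  # 1.0 (matches reference)
print(simple_gleu(src, ref, "He go to school ."))    # < 1.0 (uncorrected source)
```

Leaving the source unchanged scores lower than producing the reference, which is exactly the behaviour a grammar-correction metric needs.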
This project is part of the Research Methods in Data Science course assignment.
- JFLEG dataset creators
- Hugging Face
- Research Methods in Data Science course instructors