This project explores the capabilities and limitations of Large Language Models (LLMs) in multilingual contexts through three main tasks.

Task 1: Language Model Inference
- Evaluation of pre-trained LLMs (XGLM and GPT-2) on multilingual text
- Implementation using HuggingFace's transformers library
- Focus on model loading, tokenization, and inference
- Performance comparison across different languages
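The per-language comparison ultimately comes down to scoring each text with the model's token-level cross-entropy. A minimal sketch of that scoring step in PyTorch, assuming the model returns next-token logits (the function name here is illustrative, not from the project code):

```python
import math

import torch
import torch.nn.functional as F

def sequence_cross_entropy(logits: torch.Tensor, input_ids: torch.Tensor) -> float:
    """Mean next-token cross-entropy for a batch of token sequences.

    logits:    (batch, seq_len, vocab_size) from a causal LM
    input_ids: (batch, seq_len) token ids that produced those logits
    """
    # Shift so position t predicts token t+1, as causal LMs are trained.
    shift_logits = logits[:, :-1, :].reshape(-1, logits.size(-1))
    shift_labels = input_ids[:, 1:].reshape(-1)
    return F.cross_entropy(shift_logits, shift_labels).item()

# Uniform logits over a 10-token vocabulary give a loss of ln(10).
logits = torch.zeros(1, 5, 10)
ids = torch.randint(0, 10, (1, 5))
loss = sequence_cross_entropy(logits, ids)
print(loss, math.exp(loss))  # cross-entropy and the corresponding perplexity
```

With HuggingFace causal LMs the same quantity is available directly: `model(input_ids, labels=input_ids).loss` performs this shift-and-average internally, which makes per-language comparison a matter of averaging that loss over each language's texts.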
Task 2: Hidden Representation Analysis
- Visualization and analysis of the model's internal representations
- Implementation of dimensionality reduction techniques (PCA, t-SNE)
- Cross-lingual representation analysis
- Understanding the model's language embeddings
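The dimensionality-reduction step can be sketched with scikit-learn. This is an illustration only: the "hidden states" below are random vectors with a per-language offset standing in for the model's layer activations:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
languages = ["eng_Latn", "deu_Latn", "arb_Arab"]

# Stand-in hidden states: 100 vectors per language, each language shifted
# so that the clusters are separable (real activations come from the model).
hidden, labels = [], []
for i, lang in enumerate(languages):
    hidden.append(rng.normal(loc=3.0 * i, size=(100, 64)))
    labels += [lang] * 100
hidden = np.vstack(hidden)          # shape (300, 64)
labels = np.array(labels)

# Project to 2D for plotting; swap in sklearn.manifold.TSNE for t-SNE.
proj = PCA(n_components=2).fit_transform(hidden)
print(proj.shape)  # (300, 2)

# Per-language cluster centers along the first principal component.
centers = {lang: proj[labels == lang, 0].mean() for lang in languages}
```

Plotting `proj` colored by `labels` (e.g. with `matplotlib.pyplot.scatter`) then shows whether the model groups sentences by language, which is the question Task 2 investigates.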
Task 3: Model Adaptation
- Implementation of parameter-efficient finetuning strategies:
  - LoRA (Low-Rank Adaptation)
  - iA3 ((IA)³: Infused Adapter by Inhibiting and Amplifying Inner Activations, from "Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning")
- Comparative analysis of adaptation methods
- Performance evaluation across multiple languages
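The core of LoRA can be sketched in a few lines: freeze the base weight W and learn a low-rank update scaled by alpha/r, with B initialized to zero so the adapted layer starts out identical to the base. A hand-rolled illustration (in practice the `peft` library provides this; the class below is a sketch, not the project's implementation):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear with a trainable low-rank update B @ A."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False              # only A and B are trained
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Equivalent to using the merged weight W + scaling * (B @ A).
        return self.base(x) + self.scaling * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(16, 8))
x = torch.randn(4, 16)
# With B = 0 the adapted layer reproduces the frozen base exactly.
print(torch.allclose(layer(x), layer.base(x)))  # True
```

(IA)³ takes an even cheaper route: instead of low-rank weight updates it learns elementwise rescaling vectors for the attention keys, values, and feed-forward activations, so the trainable-parameter count is smaller still.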
Project Structure

```
llm-finetuning/
├── final_submission/
│   ├── Code/
│   │   ├── Task 1/   # Language model inference
│   │   ├── Task 2/   # Hidden representation analysis
│   │   └── Task 3/   # Model adaptation techniques
│   └── Report/       # Project documentation and results
├── notebooks/        # Development notebooks
├── scripts/          # Utility scripts and implementations
└── tasks/            # Task descriptions and requirements
```
Requirements

```
python>=3.8
torch>=1.10.0
transformers>=4.30.0
datasets>=2.12.0
numpy>=1.17
matplotlib>=3.3.0
scikit-learn>=0.24.0
```

Installation

- Clone the repository:

```
git clone https://github.com/yourusername/llm-finetuning.git
cd llm-finetuning
```

- Install dependencies:

```
pip install -r requirements.txt
```

Usage

Task 1: Language Model Inference

```
cd final_submission/Code/Task\ 1
jupyter notebook "Task 1.ipynb"
```

Task 2: Hidden Representation Analysis

```
cd final_submission/Code/Task\ 2
jupyter notebook task2_cheatVersion.ipynb
```

Task 3: Model Adaptation

- Model finetuning:

```
cd final_submission/Code/Task\ 3
python task3.py
```

- Finetuning results visualization:

```
cd final_submission/Code/Task\ 3
jupyter notebook task3-vis.ipynb
```

The project evaluates models on a diverse set of languages including:
- English (eng_Latn)
- Spanish (spa_Latn)
- Italian (ita_Latn)
- German (deu_Latn)
- Arabic (arb_Arab)
- Telugu (tel_Telu)
- Tamil (tam_Taml)
- Quechua (quy_Latn)
Models

- XGLM-564M: Primary multilingual model
- GPT-2: Comparison baseline
- Custom adapted versions using LoRA, iA3, and DoRA
Evaluation Metrics

- Cross-entropy loss
- Hidden representation analysis
- Adaptation efficiency metrics
This project is part of the Neural Network Theory and Implementation course at Saarland University. All rights reserved.
Acknowledgments

- UdS NNTI course instructors and teaching assistants
- HuggingFace team for the transformers library
- Facebook AI for the FLORES dataset and XGLM model