This repository hosts experimental scripts and configurations for a Retrieval-Augmented Generation (RAG) pipeline built on top of a LoRA fine-tuned Flan-T5 model for telecommunications research.
The project aims to explore how lightweight LoRA fine-tuning and retrieval-based augmentation can improve domain-specific language understanding in telecom-related datasets.
At this stage, only preliminary testing and environment setup scripts are included.
Future scripts will expand on retrieval, evaluation, and deployment components.
telecom-lora-rag/
├── test_lora_model.py # Example script to load and evaluate a LoRA adapter
├── pyproject.toml # Project dependencies for Python
├── environment.yml # Conda environment configuration
├── README.md # This file
└── (additional scripts to be added later)
- Create and activate the Conda environment:

      conda env create -f environment.yml
      conda activate telecom-lora-rag
- Edit configuration parameters. Each script contains a configuration block under:

      if __name__ == "__main__":
          # Modify these paths to match your local setup
          MODEL_BASE = "google/flan-t5-large"
          MODEL_DIR = "~/Downloads/telecom_lora_model"
          CSV_PATH = "./3gpp_rel18_qa_3000.csv"

  ⚠️ Paths and datasets are not tracked in Git. Each collaborator should update them manually in their local copy.
- Run the test:

      python test_lora_model.py
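
The example `MODEL_DIR` above starts with `~`, which Python does not expand in a path string on its own. Whether `test_lora_model.py` already handles this is not stated here, so the following is only a minimal sketch of how a configuration block could normalize its paths before use. The helper name `resolve_path` is hypothetical and not part of the repository.

```python
from pathlib import Path


def resolve_path(raw: str) -> Path:
    """Expand a leading '~' and return an absolute path (hypothetical helper)."""
    return Path(raw).expanduser().resolve()


if __name__ == "__main__":
    # Same example values as the configuration block; adjust to your local setup.
    MODEL_DIR = resolve_path("~/Downloads/telecom_lora_model")
    CSV_PATH = resolve_path("./3gpp_rel18_qa_3000.csv")

    # Report missing paths up front instead of failing later during model loading.
    for label, path in (("MODEL_DIR", MODEL_DIR), ("CSV_PATH", CSV_PATH)):
        status = "found" if path.exists() else "MISSING"
        print(f"{label}: {path} [{status}]")
```

Checking the paths first gives each collaborator a clear message about which local setting still needs updating.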
- The base model (google/flan-t5-large) is automatically downloaded from Hugging Face when first used.
- LoRA adapter weights must be provided locally by each collaborator.
- No datasets, checkpoints, or training outputs are stored in this repository.
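
Because the adapter weights are supplied locally, a quick sanity check of the adapter directory can save a failed run. The sketch below assumes the adapter was saved with the PEFT library's `save_pretrained()`, which normally writes `adapter_config.json` plus `adapter_model.safetensors` (or `adapter_model.bin` in older versions). This README does not say which tool produced the adapter, so treat the file names as an assumption. The function name `missing_adapter_files` is hypothetical.

```python
from pathlib import Path

# Assumption: the adapter directory follows PEFT's save_pretrained() layout.
CONFIG_FILE = "adapter_config.json"
WEIGHT_FILES = ("adapter_model.safetensors", "adapter_model.bin")


def missing_adapter_files(model_dir: str) -> list[str]:
    """Return descriptions of expected adapter files that are absent."""
    root = Path(model_dir).expanduser()
    missing = []
    if not (root / CONFIG_FILE).is_file():
        missing.append(CONFIG_FILE)
    # Either weight format is acceptable; only one needs to be present.
    if not any((root / name).is_file() for name in WEIGHT_FILES):
        missing.append(" or ".join(WEIGHT_FILES))
    return missing


if __name__ == "__main__":
    problems = missing_adapter_files("~/Downloads/telecom_lora_model")
    if problems:
        print("Adapter directory is incomplete, missing:", ", ".join(problems))
    else:
        print("Adapter directory looks complete.")
```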
This repository is intended for code only. The following assets must not be committed to Git:
*.pt
*.bin
*.safetensors
checkpoint-*
*.csv
*.zip
*.pdf
*_model/
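
As an illustration of how these patterns apply, the sketch below checks repository-relative file paths against the list above. It only approximates `.gitignore` matching (each path component is compared with the pattern, and a pattern ending in `/` is matched against directories only) and is not a replacement for Git's own rules. The names `BLOCKED_PATTERNS` and `is_blocked` are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns copied from the list above.
BLOCKED_PATTERNS = [
    "*.pt", "*.bin", "*.safetensors", "checkpoint-*",
    "*.csv", "*.zip", "*.pdf", "*_model/",
]


def is_blocked(path: str) -> bool:
    """Return True if a repository-relative file path matches a blocked pattern."""
    parts = PurePosixPath(path).parts
    for pattern in BLOCKED_PATTERNS:
        if pattern.endswith("/"):
            # Directory-only pattern: test the parent directories, not the file name.
            if any(fnmatchcase(part, pattern[:-1]) for part in parts[:-1]):
                return True
        elif any(fnmatchcase(part, pattern) for part in parts):
            return True
    return False


if __name__ == "__main__":
    for candidate in ("test_lora_model.py",
                      "telecom_lora_model/adapter_config.json",
                      "runs/checkpoint-500/optimizer.pt"):
        print(candidate, "->", "blocked" if is_blocked(candidate) else "ok")
```

A check like this could be run over the output of `git diff --cached --name-only` before committing, although keeping the same patterns in `.gitignore` remains the simpler safeguard.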
- Keep all scripts self-contained and well-documented.
- Use clear English docstrings and comments.
- Update configuration paths manually before running experiments.
- Future additions may include:
  - RAG prototype using FAISS/ChromaDB
  - Dataset parsing utilities
  - Evaluation metrics and benchmarking tools
Telecom LoRA RAG provides a minimal, extensible codebase for experimenting with LoRA fine-tuning and retrieval-augmented generation techniques in the telecom domain. All confidential data and model files remain local to each contributor’s environment.