This repository contains the implementation of Shielded RecRL, a method for adding chat-style explanations to recommender systems without affecting the underlying ranking model.
Shielded RecRL uses a two-tower architecture:
- A frozen ranking model (collaborative filtering)
- A trainable language model that generates explanations
The key innovation is a gradient-projection technique that prevents updates to the explanation model from degrading the ranking model's performance.
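A minimal sketch of the projection idea in NumPy. The function name and the single-direction shield are illustrative assumptions, not the repository's exact implementation (see `code/projection/` for that):

```python
import numpy as np

def shield_gradient(grad_expl: np.ndarray, grad_rank: np.ndarray) -> np.ndarray:
    """Remove from the explainer's gradient the component that lies along the
    ranking gradient, so an update step leaves ranking scores unchanged to
    first order. (Illustrative single-direction version of the shield.)"""
    u = grad_rank / (np.linalg.norm(grad_rank) + 1e-12)  # unit direction to protect
    return grad_expl - np.dot(grad_expl, u) * u          # orthogonal projection

# The shielded gradient is orthogonal to grad_rank, so stepping along it
# does not move the (frozen) ranker's protected direction.
shielded = shield_gradient(np.array([1.0, 2.0, 3.0]), np.array([0.0, 0.0, 1.0]))
```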
- Clone this repository:

  ```bash
  git clone https://github.com/your_username/shielded-recrl.git
  cd shielded-recrl
  ```

- Edit `setup_local.sh` to update your GitHub username, then run:

  ```bash
  bash setup_local.sh
  ```
- Launch a RunPod instance with:
  - Runtime: PyTorch 2.3 | Python 3.10 | CUDA 12.2
  - GPU: NVIDIA A100 80GB or 2× RTX 4090 24GB
  - Volume: ≥ 400GB
- SSH into your RunPod instance:

  ```bash
  ssh -p YOUR_PORT runpod@YOUR_POD_ID.connect.runpod.io
  ```
- Edit `setup_runpod.sh` to update your GitHub username, then run:

  ```bash
  bash setup_runpod.sh
  ```
- Verify the setup:

  ```bash
  python gpu_test.py
  ```
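If you want a quick sanity check of the environment before running the repository's script, a minimal stand-in like the following works. The function name and the specific checks are assumptions; the actual `gpu_test.py` shipped with the repo may test more:

```python
import importlib.util

def environment_report() -> dict:
    """Report whether PyTorch is installed and, if so, whether CUDA devices
    are visible. A hypothetical stand-in for the checks gpu_test.py performs."""
    report = {"torch_installed": importlib.util.find_spec("torch") is not None}
    if report["torch_installed"]:
        import torch
        report["cuda_available"] = torch.cuda.is_available()
        report["device_count"] = torch.cuda.device_count()
    return report

if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```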
```
.
├── code/
│   ├── dataset/      # Dataset preprocessing
│   ├── ranker/       # SASRec implementation
│   ├── explainer/    # LLM with LoRA
│   ├── projection/   # Gradient projection
│   ├── trainer/      # Shielded PPO
│   └── eval/         # Evaluation metrics
├── data/             # Datasets
├── checkpoints/      # Model checkpoints
├── logs/             # Training logs
├── experiments/      # Experiment configurations
├── docs/             # Documentation
└── docker/           # Docker configuration
```
- Edit code on your local machine
- Commit and push changes to GitHub
- Pull changes on RunPod and execute experiments
- Results are logged to W&B and saved to the persistent volume
[Add your license information here]