Welcome to Knowledge Update Playground (KUP), an automated framework for generating realistic knowledge update/conflict datasets and evaluating how well Large Language Models (LLMs) adapt to knowledge changes during continued pre-training.
KUP helps researchers and practitioners:
- Generate realistic knowledge update pairs to simulate real-world knowledge shifts and conflicts.
- Evaluate LLMs’ adaptability to knowledge updates during fine-tuning or continued pre-training.
- Train LLMs using both continued pre-training and supervised fine-tuning following the setup in Synthetic Continued Pre-training.
This playground is designed to benchmark how well LLMs handle incremental knowledge, especially in dynamic environments.
Note: The `main` branch is fully functional. However, we are actively improving code readability, structure, and usability in the `prod` branch to make the project more production-ready.
The KUP dataset contains 5,000 high-quality knowledge update/conflict pairs, automatically synthesized and verified to represent realistic knowledge shifts.
🔗 Hugging Face Dataset: https://huggingface.co/datasets/aochongoliverli/KUP
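As a quick way to inspect the data, the dataset can likely be pulled with the Hugging Face `datasets` library. The following is a minimal sketch, not part of the KUP codebase: the repo ID comes from the URL above, while `load_kup` and `describe` are hypothetical helper names introduced here. It assumes the dataset loads without a configuration name, and it makes no assumption about split or column names, since this section does not document them.

```python
"""Minimal sketch: load the KUP dataset from the Hugging Face Hub and list its splits."""

# Repo ID taken from the dataset URL above.
KUP_DATASET_ID = "aochongoliverli/KUP"


def load_kup():
    """Download the KUP dataset, or reuse the locally cached copy.

    Requires `pip install datasets`. Assumes no configuration name is needed.
    """
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(KUP_DATASET_ID)


def describe(dataset_dict):
    """Return {split_name: (num_rows, column_names)} for a DatasetDict-like mapping."""
    return {
        name: (len(split), list(split.column_names))
        for name, split in dataset_dict.items()
    }


# Example usage (needs network access on the first call):
#   kup = load_kup()
#   for split, (rows, columns) in describe(kup).items():
#       print(f"{split}: {rows} rows, columns={columns}")
```

Printing the split and column names first is a deliberate choice: it shows the actual schema before any downstream code relies on particular field names.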
git clone https://github.com/your-username/KUP.git
cd KUP
pip install -r requirements.txt