A modular, multi-source, multi-hop RAG framework that decomposes queries, routes subquestions, and fuses answers with reflexive reasoning.
arXiv: https://arxiv.org/abs/2507.22050
Project page: https://minghokwok.github.io/deepsieve
- [2025-07-29] Uploaded the full corpus and released the DeepSieve preprint on arXiv
- [Jan 2026] DeepSieve was accepted to EACL 2026 Findings 🎉
DeepSieve is a Retrieval-Augmented Generation (RAG) framework designed to handle:
- Structurally heterogeneous knowledge (e.g., SQL tables, JSON logs, Wikipedia)
- Compositional queries requiring multi-step reasoning
- Privacy-aware sources that cannot be merged
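As an illustration, such heterogeneous sources can be modeled as tool–corpus pairs with a privacy flag. The registry below is a hypothetical sketch, not DeepSieve's actual interface; all names (`Source`, `sql_lookup`, `wiki_lookup`) are illustrative.

```python
# Hypothetical sketch: heterogeneous sources as tool-corpus pairs.
# None of these names come from the DeepSieve codebase.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Source:
    name: str
    query_fn: Callable[[str], str]  # tool that retrieves from this corpus
    private: bool = False           # privacy-aware sources are never merged


def sql_lookup(q: str) -> str:
    return f"[SQL rows matching: {q}]"


def wiki_lookup(q: str) -> str:
    return f"[Wikipedia passages for: {q}]"


registry = [
    Source("employee_db", sql_lookup, private=True),
    Source("wikipedia", wiki_lookup),
]

# A router can pick a source per subquestion without merging corpora,
# keeping private sources isolated.
print([s.name for s in registry if not s.private])
```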
DeepSieve introduces a novel information sieving pipeline:
- Decompose complex queries into subquestions
- Route each subquestion to an appropriate tool–corpus pair
- Reflect and retry failed retrievals
- Fuse partial answers into a final response
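The four steps above can be sketched as a minimal loop. Everything here is a toy stand-in (keyword routing instead of LLM routing, string rephrasing instead of LLM reflection); none of the function names belong to DeepSieve's API.

```python
# Toy sketch of the decompose -> route -> reflect -> fuse pipeline.
def decompose(query):
    # Stand-in for an LLM call that splits a compositional question.
    return [q.strip() for q in query.split(" and ")]


def route_and_retrieve(subq, sources):
    # Stand-in for LLM-as-a-router: pick the first matching source.
    for name, fn in sources.items():
        if name in subq:
            return fn(subq)
    return None  # retrieval failed


def answer(query, sources, max_retries=2):
    partials = []
    for subq in decompose(query):
        result = route_and_retrieve(subq, sources)
        retries = 0
        while result is None and retries < max_retries:
            subq = subq + " (rephrased)"  # reflection: retry a reformulation
            result = route_and_retrieve(subq, sources)
            retries += 1
        partials.append(result or "unknown")
    return " | ".join(partials)  # fusion: combine partial answers


sources = {"wiki": lambda q: "wiki-hit", "sql": lambda q: "sql-hit"}
print(answer("wiki facts and sql records", sources))  # "wiki-hit | sql-hit"
```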
This pipeline implements a modular Retrieval-Augmented Generation (RAG) system with:
- ✅ Query decomposition (`--decompose`)
- ✅ Per-subquestion routing to local/global knowledge (`--use_routing`)
- ✅ Reflection for failed queries (`--use_reflection`)
- ✅ Lightweight RAG backends: `naive` or `graph`
- ✅ Detailed logging & performance tracking
Before running, install dependencies:
```bash
pip install -r requirements.txt
```

Then, export your LLM-related credentials:

```bash
export OPENAI_API_KEY=your_api_key
export OPENAI_MODEL=deepseek-chat
export OPENAI_API_BASE=https://api.deepseek.com/v1  # Optional
```

For `naive` or `graph` mode, also set:

```bash
export RAG_TYPE=naive  # or graph
```

Then run:

```bash
python runner/main_rag_only.py \
  --dataset hotpot_qa \
  --sample_size 100 \
  --decompose \
  --use_routing \
  --use_reflection \
  --max_reflexion_times 2
```

This runs the full pipeline with:
- Decomposition
- Routing to local/global sources
- Reflection (up to 2 retries)
- Default backend: Naive RAG
Naive RAG backend:

```bash
export RAG_TYPE=naive
python runner/main_rag_only.py \
  --dataset hotpot_qa \
  --sample_size 100 \
  --decompose \
  --use_routing \
  --use_reflection \
  --max_reflexion_times 2
```

Graph RAG backend:

```bash
export RAG_TYPE=graph
python runner/main_rag_only.py \
  --dataset hotpot_qa \
  --sample_size 100 \
  --decompose \
  --use_routing \
  --use_reflection \
  --max_reflexion_times 2
```

You can toggle modules by removing flags:
- Disable decomposition: remove `--decompose`
- Disable routing: remove `--use_routing`
- Disable reflection: remove `--use_reflection`
Each run saves:
- Individual results per query: `outputs/{rag_type}_{dataset}*/query_{i}_results.jsonl`
- Fusion prompts: `outputs/.../query_{i}_fusion_prompt.txt`
- Aggregated metrics: `overall_results.txt` and `overall_results.json`
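As a quick illustration, the per-query JSONL files can be aggregated with a few lines of Python. The `correct` field below is an assumed schema for demonstration, not the guaranteed output format, and the demo writes its own synthetic file rather than reading real run output.

```python
# Illustrative sketch of aggregating per-query JSONL results.
# The "correct" field is an assumption about the output schema.
import glob
import json
import os
import tempfile


def aggregate(results_dir):
    total = hits = 0
    for path in glob.glob(os.path.join(results_dir, "query_*_results.jsonl")):
        with open(path) as f:
            for line in f:
                rec = json.loads(line)
                total += 1
                hits += bool(rec.get("correct"))
    return {"total": total, "accuracy": hits / total if total else 0.0}


# Demo with a synthetic output directory:
d = tempfile.mkdtemp()
with open(os.path.join(d, "query_0_results.jsonl"), "w") as f:
    f.write(json.dumps({"question": "q1", "correct": True}) + "\n")
    f.write(json.dumps({"question": "q2", "correct": False}) + "\n")
print(aggregate(d))  # {'total': 2, 'accuracy': 0.5}
```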
If you find this work helpful, please consider citing our paper:
```bibtex
@inproceedings{guo2025deepsieve,
  title={DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router},
  author={Guo, Minghao and Zeng, Qingcheng and Zhao, Xujiang and Liu, Yanchi and Yu, Wenchao and Du, Mengnan and Chen, Haifeng and Cheng, Wei},
  booktitle={Findings of the Association for Computational Linguistics: EACL 2026},
  year={2026}
}
```

```bibtex
@article{guo2025deepsieve,
  title={DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router},
  author={Guo, Minghao and Zeng, Qingcheng and Zhao, Xujiang and Liu, Yanchi and Yu, Wenchao and Du, Mengnan and Chen, Haifeng and Cheng, Wei},
  journal={arXiv preprint arXiv:2507.22050},
  year={2025}
}
```

