SQuAI is a scalable and trustworthy multi-agent Retrieval-Augmented Generation (RAG) system for scientific question answering (QA). It is designed to address the challenges of answering complex, open-domain scientific queries with high relevance, verifiability, and transparency. This project is introduced in our CIKM 2025 demo paper:
Link to: Demo Video
- Python 3.8+
- PyTorch 2.0.0+
- CUDA-compatible GPU
- Load the module environment for SWIG:

```bash
ml release/24.04 GCC/12.3.0 OpenMPI/4.1.5 PyTorch/2.1.2
```

- Install libleveldb-dev:

```bash
sudo apt-get install libleveldb-dev
```

- Clone the repository:
```bash
git clone git@github.com:faerber-lab/SQuAI.git
cd SQuAI
```

- Create and activate a virtual environment:
```bash
python -m venv env
source env/bin/activate  # On Windows, use: env\Scripts\activate
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

SQuAI can be run on a single question or on a batch of questions from a JSON/JSONL file.
Single question:

```bash
python run_SQuAI.py --model tiiuae/Falcon3-10B-Instruct --n 0.5 --alpha 0.65 --top_k 20 --single_question "Your question here?"
```

Batch of questions:

```bash
python run_SQuAI.py --model tiiuae/Falcon3-10B-Instruct --n 0.5 --alpha 0.65 --top_k 20 --data_file your_questions.jsonl --output_format jsonl
```

- `--model`: Model name or path (default: "tiiuae/falcon-3-10b-instruct")
- `--n`: Adjustment factor for the adaptive judge bar (default: 0.5)
- `--alpha`: Weight for semantic search vs. keyword search (0-1, default: 0.65)
- `--top_k`: Number of documents to retrieve (default: 20)
- `--data_file`: File containing questions in JSON or JSONL format (see the example below)
- `--single_question`: Process a single question instead of a dataset
- `--output_format`: Output format - json, jsonl, or debug (default: jsonl)
- `--output_dir`: Directory to save results (default: "results")
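As an illustration, a batch file for `--data_file` can be generated as sketched below; the `question` field name is an assumption, so check the loader in `run_SQuAI.py` for the expected key.

```python
# Write a small JSONL batch file for --data_file.
# NOTE: the "question" key is an assumption, not confirmed by the SQuAI loader.
import json

questions = [
    "What are the main approaches to neural machine translation?",
    "How does retrieval-augmented generation reduce hallucinations?",
]

with open("your_questions.jsonl", "w") as f:
    for q in questions:
        f.write(json.dumps({"question": q}) + "\n")
```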
SQuAI consists of four key agents working collaboratively to deliver accurate, faithful, and verifiable answers (a minimal pipeline sketch follows the list below):
- **Agent 1: Decomposer**
  Decomposes complex user queries into simpler, semantically distinct sub-questions. This step ensures that each aspect of the question is treated with focused retrieval and generation, enabling precise evidence aggregation.
- **Agent 2: Generator**
  For each sub-question, this agent processes retrieved documents to generate structured Question–Answer–Evidence (Q-A-E) triplets. These triplets form the backbone of transparent and evidence-grounded answers.
- **Agent 3: Judge**
  Evaluates the relevance and quality of each Q-A-E triplet using a learned scoring mechanism. It filters out weak or irrelevant documents based on confidence thresholds, dynamically tuned to the difficulty of each query.
- **Agent 4: Answer Generator**
  Synthesizes a final, coherent answer from the filtered Q-A-E triplets. Critically, it includes fine-grained in-line citations and citation context to enhance trust and verifiability. Every factual statement is explicitly linked to one or more supporting documents.
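The sketch below summarizes how these four agents fit together. It is a simplified, self-contained illustration under stated assumptions: every function and the adaptive-threshold formula are stand-ins, not the actual SQuAI implementation.

```python
# Minimal, runnable sketch of the four-agent pipeline described above.
# All functions are simplified stand-ins, not the actual SQuAI code.
from dataclasses import dataclass
from typing import List

@dataclass
class Triplet:
    question: str
    answer: str
    evidence: str
    score: float

def decompose(query: str) -> List[str]:
    # Agent 1: Decomposer - in SQuAI an LLM splits the query into sub-questions;
    # here we simply pass the query through.
    return [query]

def hybrid_retrieve(sub_q: str, top_k: int) -> List[str]:
    # Placeholder for hybrid BM25 + E5 retrieval.
    return [f"doc-{i} for '{sub_q}'" for i in range(top_k)]

def generate_triplets(sub_q: str, docs: List[str]) -> List[Triplet]:
    # Agent 2: Generator - produces Question-Answer-Evidence triplets per document.
    return [Triplet(sub_q, f"answer drawn from {d}", d, score=0.5) for d in docs]

def judge(triplets: List[Triplet], n: float) -> List[Triplet]:
    # Agent 3: Judge - keeps triplets above an adaptive bar (formula is a placeholder).
    if not triplets:
        return []
    threshold = n * (sum(t.score for t in triplets) / len(triplets))
    return [t for t in triplets if t.score >= threshold]

def synthesize_answer(query: str, triplets: List[Triplet]) -> str:
    # Agent 4: Answer Generator - composes a final answer with in-line citations.
    citations = "; ".join(t.evidence for t in triplets[:3])
    return f"Answer to '{query}' [sources: {citations}]"

def answer_question(query: str, top_k: int = 20, n: float = 0.5) -> str:
    triplets: List[Triplet] = []
    for sub_q in decompose(query):
        triplets.extend(generate_triplets(sub_q, hybrid_retrieve(sub_q, top_k)))
    return synthesize_answer(query, judge(triplets, n=n))

print(answer_question("What is retrieval-augmented generation?"))
```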
The agents are supported by a hybrid retrieval system that combines:
- Sparse retrieval (BM25) for keyword overlap and exact matching.
- Dense retrieval (E5 embeddings) for semantic similarity.
The system interpolates scores from both methods to maximize both lexical precision and semantic coverage. The interpolation weight defaults to α = 0.65, based on empirical tuning; this slightly favors dense retrieval while retaining complementary signals from sparse methods, ensuring both semantic relevance and precision.
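As an illustration, the interpolation can be expressed as a weighted sum of normalized sparse and dense scores. This is a minimal sketch; the normalization and exact combination used in SQuAI may differ.

```python
# Illustrative hybrid-score interpolation: alpha weights the dense (E5) signal
# against the sparse (BM25) signal. A sketch, not the SQuAI implementation.
from typing import Dict

def hybrid_scores(bm25: Dict[str, float], dense: Dict[str, float], alpha: float = 0.65) -> Dict[str, float]:
    def normalize(scores: Dict[str, float]) -> Dict[str, float]:
        # Min-max normalization so both score ranges are comparable.
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {doc: (s - lo) / span for doc, s in scores.items()}

    bm25_n, dense_n = normalize(bm25), normalize(dense)
    docs = set(bm25_n) | set(dense_n)
    # alpha = 0.65 slightly favors the dense (semantic) signal.
    return {d: alpha * dense_n.get(d, 0.0) + (1 - alpha) * bm25_n.get(d, 0.0) for d in docs}

# Example: combine toy scores for three documents.
print(hybrid_scores({"d1": 12.0, "d2": 7.5}, {"d1": 0.82, "d3": 0.91}))
```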
SQuAI includes an interactive web-based UI built with Streamlit and backed by a FastAPI server. Key features include (a minimal frontend sketch follows this list):
- A simple input form for entering scientific questions.
- Visualization of decomposed sub-questions.
- Toggle between sparse, dense, and hybrid retrieval modes.
- Adjustable settings for document filtering thresholds and top-k retrieval.
- Display of generated answers with fine-grained in-line citations.
- Clickable references linking to original arXiv papers.
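The sketch below shows roughly what such a Streamlit frontend looks like. The FastAPI endpoint `http://localhost:8000/ask` and its request/response fields are assumptions for illustration, not the actual SQuAI API.

```python
# Minimal Streamlit sketch of the question form and settings described above.
# The backend URL and payload/response fields are assumptions for illustration only.
import requests
import streamlit as st

st.title("SQuAI - Scientific Question Answering")

question = st.text_input("Enter a scientific question")
mode = st.selectbox("Retrieval mode", ["hybrid", "sparse", "dense"])
top_k = st.slider("Top-k documents", 5, 50, 20)
alpha = st.slider("Dense vs. sparse weight (alpha)", 0.0, 1.0, 0.65)

if st.button("Ask") and question:
    resp = requests.post(
        "http://localhost:8000/ask",  # hypothetical FastAPI endpoint
        json={"question": question, "mode": mode, "top_k": top_k, "alpha": alpha},
        timeout=300,
    )
    result = resp.json()
    st.subheader("Sub-questions")
    st.write(result.get("sub_questions", []))
    st.subheader("Answer")
    st.markdown(result.get("answer", ""))  # in-line citations rendered as markdown links
```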
We evaluate SQuAI using three QA datasets designed to test performance across varying complexity levels:
- LitSearch: Real-world literature review queries from computer science.
- unarXive Simple: General questions with minimal complexity.
- unarXive Expert: Highly specific and technical questions requiring deep evidence grounding.
Evaluation metrics (via DeepEval; a minimal usage sketch follows the list) include:
- Answer Relevance – How well the answer semantically matches the question.
- Contextual Relevance – How well the answer integrates retrieved evidence.
- Faithfulness – Whether the answer is supported by cited sources.
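A minimal sketch of computing these three metrics with DeepEval is shown below, using placeholder inputs; the evaluation model, thresholds, and test cases used for the SQuAI evaluation are not reproduced here.

```python
# Minimal DeepEval sketch for the three metrics above (placeholder inputs).
# DeepEval's LLM-as-judge metrics need an evaluation model, e.g. an OpenAI API key.
from deepeval.metrics import (
    AnswerRelevancyMetric,
    ContextualRelevancyMetric,
    FaithfulnessMetric,
)
from deepeval.test_case import LLMTestCase

test_case = LLMTestCase(
    input="What is retrieval-augmented generation?",
    actual_output="RAG augments an LLM with retrieved documents to ground its answers [1].",
    retrieval_context=["Retrieval-augmented generation combines a retriever with a generator ..."],
)

for metric in (AnswerRelevancyMetric(), ContextualRelevancyMetric(), FaithfulnessMetric()):
    metric.measure(test_case)
    print(type(metric).__name__, metric.score)
```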
Compared to a standard RAG baseline, SQuAI improves the combined scores, including a gain of up to 12% in faithfulness.
- unarXive 2024: Full-text arXiv papers with structured metadata, section segmentation, and citation annotations (available as a Hugging Face Dataset).
- QA Triplet Benchmark: 1,000 synthetic question–answer–evidence triplets for reproducible evaluation.
If you want to change where the FAISS data is stored, create a file at $HOME/data_dir containing the path to your workspace. If no such file exists or it is empty, /data/horse/ws/inbe405h-unarxive is used by default.
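Illustratively, the lookup behaves like the following sketch (a restatement of the behavior described above, not the actual code):

```python
# Sketch of the FAISS data-directory resolution described above (illustrative only).
import os

def resolve_data_dir(default: str = "/data/horse/ws/inbe405h-unarxive") -> str:
    config = os.path.join(os.path.expanduser("~"), "data_dir")
    if os.path.isfile(config):
        with open(config) as f:
            path = f.read().strip()
        if path:
            return path
    return default

print(resolve_data_dir())
```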
Restart the frontend service with:

```bash
sudo systemctl restart squai-frontend.service
```
Check if /etc/dont_copy exists.