HELIOS: Harmonizing Early Fusion, Late Fusion, and LLM Reasoning for Multi-Granular Table-Text Retrieval

HELIOS outperforms state-of-the-art models with a significant improvement in both recall and nDCG on the OTT-QA benchmark.

About HELIOS

We introduce HELIOS, a table-text retrieval model that strengthens open-domain question answering systems by addressing the limitations of both early and late fusion methods. Its key features are:

Combining early and late fusion techniques, it bridges the gap between static pre-alignments and dynamic retrieval strategies, ensuring contextually relevant results for more complex queries.
Utilizing edge-based bipartite subgraph retrieval, HELIOS materializes finer-grained relationships between table segments and text passages, reducing the inclusion of irrelevant information while maintaining crucial query-dependent links.
Employing a query-relevant node expansion mechanism, it dynamically identifies and retrieves the most promising nodes for expansion, minimizing the risk of missing vital contexts.
Integrating a star-based LLM refinement step, it mitigates hallucinations by performing logical inference at the star graph level, enabling advanced reasoning tasks such as column-wise aggregation and multi-hop reasoning.

Getting Started

This page explains how to reproduce the results reported in the paper "HELIOS: Harmonizing Early Fusion, Late Fusion, and LLM Reasoning for Multi-Granular Table-Text Retrieval".

Follow the instructions below.

Prerequisites

You need Docker installed and must be able to pull our images from Docker Hub. See the Docker documentation for installation instructions.

Download Docker Image

We packaged our environment as Docker images hosted on Docker Hub.

Pull the images from Docker Hub

docker pull anonymous824/heliosworkspace:latest

docker pull anonymous824/heliosresources:latest

Create HELIOS Workspace

Create a HELIOS workspace container from the downloaded image.

Docker run

docker run -itd --name acl2025-heliosworkspace anonymous824/heliosworkspace /bin/bash

Docker start

docker start acl2025-heliosworkspace

Enter the container

docker exec -it acl2025-heliosworkspace /bin/bash
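
The run and start steps above fail or need adjusting when repeated, since `docker run --name` refuses to reuse an existing container name. A minimal sketch of a helper that picks the right command is shown below. The function name `ensure_container` is hypothetical and not part of the repository; it only wraps the `docker run` and `docker start` commands already listed.

```sh
# ensure_container NAME IMAGE  (hypothetical helper, not shipped with HELIOS)
# Creates the container on first use, otherwise starts the existing one.
ensure_container() {
    name="$1"
    image="$2"
    # List all container names (running or stopped) and look for an exact match.
    if docker ps -a --format '{{.Names}}' | grep -qxF "$name"; then
        docker start "$name" >/dev/null
    else
        docker run -itd --name "$name" "$image" /bin/bash >/dev/null
    fi
}

# Usage (not executed here):
#   ensure_container acl2025-heliosworkspace anonymous824/heliosworkspace
#   docker exec -it acl2025-heliosworkspace /bin/bash
```

Calling the helper a second time is harmless: starting a container that is already running leaves it untouched.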

Activate Conda Env

conda activate fm

Download Dataset and Model Checkpoints

Docker run

docker run -itd --name acl2025-heliosresources anonymous824/heliosresources /bin/bash

Docker start

docker start acl2025-heliosresources

Enter the container

docker exec -it acl2025-heliosresources /bin/bash

Download large language model

HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download meta-llama/Llama-3.1-8B-Instruct --local-dir-use-symlinks False --local-dir /mnt/sdd/OTT-QAMountSpace/ModelCheckpoints/Ours/llm/Meta-Llama-3.1-8B-Instruct --exclude "*.pth"
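
Because the download is large, it can help to confirm that it completed before loading the model. The sketch below is a hypothetical check, not a script from the repository. It assumes the checkpoint directory should contain a `config.json` and at least one `.safetensors` weight file, which is the usual layout for this model on the Hugging Face Hub.

```sh
# check_checkpoint DIR  (hypothetical helper)
# Assumption: a complete download has config.json plus *.safetensors shards.
check_checkpoint() {
    dir="$1"
    [ -d "$dir" ] || { echo "missing directory: $dir" >&2; return 1; }
    [ -f "$dir/config.json" ] || { echo "missing config.json in $dir" >&2; return 1; }
    # Expand the glob into the positional parameters; if nothing matched,
    # "$1" is the literal pattern and does not exist on disk.
    set -- "$dir"/*.safetensors
    [ -e "$1" ] || { echo "no .safetensors files in $dir" >&2; return 1; }
    echo "checkpoint looks complete: $dir"
}

# Usage (not executed here):
#   check_checkpoint /mnt/sdd/OTT-QAMountSpace/ModelCheckpoints/Ours/llm/Meta-Llama-3.1-8B-Instruct
```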

Build Index

Create edge index

sh Algorithms/Ours/scripts/build_edge_index.sh

Create table segment index

sh Algorithms/Ours/scripts/build_table_segment_index.sh

Create passage index

sh Algorithms/Ours/scripts/build_passage_index.sh
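
The three index builds can also be chained so that a failure stops the sequence instead of going unnoticed. The following is a sketch only: `build_all_indexes` is a hypothetical wrapper, and it assumes it is run from the repository root, where the `Algorithms/Ours/scripts` paths above resolve.

```sh
# build_all_indexes  (hypothetical wrapper around the three scripts above)
# Runs the index builds in the documented order and stops at the first failure.
build_all_indexes() {
    for step in edge table_segment passage; do
        script="Algorithms/Ours/scripts/build_${step}_index.sh"
        echo "==> $script"
        sh "$script" || { echo "failed: $script" >&2; return 1; }
    done
}

# Usage (not executed here):
#   build_all_indexes
```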

Run Edge-based Bipartite Subgraph Retrieval

If tmux is not installed, install it first

apt-get update && apt-get install -y tmux

Load edge retriever

tmux new -s edge_retriever
conda activate fm
cd HELIOS
sh Algorithms/Ours/scripts/load_edge_retriever.sh

Load edge reranker

tmux new -s edge_reranker
conda activate fm
cd HELIOS
sh Algorithms/Ours/scripts/load_edge_reranker.sh
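
Note that `tmux new -s` attaches to the new session, so you need to detach (Ctrl-b, then d) before opening the next one. As an alternative, the sketch below starts a session in the background and types the same three commands into it with `tmux send-keys`. The helper name `start_service` is hypothetical, and like the manual steps it assumes the shell inside tmux has conda initialized and that `HELIOS` is reachable from the current directory.

```sh
# start_service SESSION SCRIPT  (hypothetical helper)
# Opens a detached tmux session and sends the same commands as the manual steps.
start_service() {
    session="$1"
    script="$2"
    tmux new-session -d -s "$session"
    tmux send-keys -t "$session" "conda activate fm" Enter
    tmux send-keys -t "$session" "cd HELIOS" Enter
    tmux send-keys -t "$session" "sh $script" Enter
}

# Usage (not executed here):
#   start_service edge_retriever Algorithms/Ours/scripts/load_edge_retriever.sh
#   start_service edge_reranker  Algorithms/Ours/scripts/load_edge_reranker.sh
#   tmux attach -t edge_retriever   # inspect the loader output
```

The same helper would apply to the other loader scripts in later steps; only the session name and script path change.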

Run bipartite subgraph retrieval

sh Algorithms/Ours/scripts/run_edge_based_bipartite_subgraph_retrieval.sh

Run Query-relevant Node Expansion

Kill edge retriever session

tmux kill-session -t edge_retriever

Load seed node scorer

tmux new -s node_scorer
conda activate fm
cd HELIOS
sh Algorithms/Ours/scripts/load_seed_node_scorer.sh

Load table segment retriever

tmux new -s table_segment_retriever
conda activate fm
cd HELIOS
sh Algorithms/Ours/scripts/load_table_segment_retriever.sh

Load passage retriever

tmux new -s passage_retriever
conda activate fm
cd HELIOS
sh Algorithms/Ours/scripts/load_passage_retriever.sh

Run query-relevant node expansion

sh Algorithms/Ours/scripts/run_query_relevant_node_expansion.sh

Run Star-based LLM Refinement

Kill the edge reranker, node scorer, table segment retriever, and passage retriever sessions

tmux kill-session -t edge_reranker
tmux kill-session -t node_scorer
tmux kill-session -t table_segment_retriever
tmux kill-session -t passage_retriever
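
`tmux kill-session` reports an error when the named session does not exist, for example if a loader already exited. A small sketch of a tolerant cleanup is shown below; `kill_sessions` is a hypothetical helper that checks each name with `tmux has-session` before killing it.

```sh
# kill_sessions NAME...  (hypothetical helper)
# Kills each named tmux session if it exists and skips the ones that do not.
kill_sessions() {
    for session in "$@"; do
        if tmux has-session -t "$session" 2>/dev/null; then
            tmux kill-session -t "$session"
        else
            echo "no such session: $session (skipping)"
        fi
    done
}

# Usage (not executed here):
#   kill_sessions edge_reranker node_scorer table_segment_retriever passage_retriever
```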

Load large language model

tmux new -s llm
conda activate fm
cd HELIOS
sh Algorithms/Ours/scripts/load_llm.sh

Run star-based LLM refinement

sh Algorithms/Ours/scripts/run_star_based_llm_refinement.sh

Evaluate Retrieval Accuracy

Evaluate AnswerRecall@K

sh Algorithms/Ours/scripts/eval_answer_recall.sh

Evaluate nDCG@K

sh Algorithms/Ours/scripts/eval_ndcg.sh
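
For readers unfamiliar with the metric, the sketch below computes the textbook nDCG@K for a single ranked list with binary relevance labels. It is an illustration only and makes no claim about how `eval_ndcg.sh` is implemented: the function name `ndcg_at_k` and the input format (one 0/1 label per ranked item) are assumptions.

```sh
# ndcg_at_k K REL1 REL2 ...  (illustrative, binary relevance in ranked order)
# DCG@K  = sum over the first K ranks i of rel_i / log2(i + 1)
# IDCG@K = DCG@K of the same labels sorted with all relevant items first
ndcg_at_k() {
    k="$1"; shift
    printf '%s\n' "$@" | awk -v k="$k" '
        BEGIN { k += 0; total = 0; dcg = 0; idcg = 0 }
        { rel[NR] = $1 + 0; if (rel[NR] > 0) total++ }
        END {
            n = (NR < k) ? NR : k            # ranks that are actually scored
            for (i = 1; i <= n; i++) dcg += rel[i] / (log(i + 1) / log(2))
            m = (total < k) ? total : k      # ideal list: relevant items first
            for (i = 1; i <= m; i++) idcg += 1 / (log(i + 1) / log(2))
            if (idcg > 0) printf "%.4f\n", dcg / idcg; else print "0.0000"
        }'
}

# Usage:
#   ndcg_at_k 3 1 0 1    # prints 0.9197
#   ndcg_at_k 3 1 1 0    # prints 1.0000
```

In the first call the second relevant item sits at rank 3 instead of rank 2, so its contribution is discounted and the score drops below 1.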

Evaluate HITS@4K

sh Algorithms/Ours/scripts/eval_hits.sh

Evaluate Reading Accuracy

Convert retrieval results into reader input

sh Algorithms/Ours/scripts/get_reader_input.sh

Evaluate Exact Match & F1 Score

sh Algorithms/Ours/scripts/eval_reading_accuracy.sh
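
When comparing runs, it can be convenient to keep each evaluation's output on disk. The sketch below is one way to do that, under the assumption that the evaluation scripts print their scores to standard output. `run_logged` and the `logs` directory are hypothetical names, not part of the repository.

```sh
# run_logged SCRIPT [LOG_DIR]  (hypothetical helper)
# Runs a script, saves its combined output to LOG_DIR/<script name>.log,
# echoes that output, and returns the script's exit status.
run_logged() {
    script="$1"
    log_dir="${2:-logs}"
    mkdir -p "$log_dir"
    log_file="$log_dir/$(basename "$script" .sh).log"
    rc=0
    sh "$script" > "$log_file" 2>&1 || rc=$?
    cat "$log_file"
    return "$rc"
}

# Usage (not executed here):
#   run_logged Algorithms/Ours/scripts/eval_answer_recall.sh
#   run_logged Algorithms/Ours/scripts/eval_reading_accuracy.sh
```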

Repository: pshlego/HELIOS
