This repository contains the system developed by AILS-NTUA for SemEval 2025 Task 8: Question Answering Over Tabular Data.
The system ranked 1st in the proprietary models ranking in both subtasks of the competition.
The system description paper will be published at ACL 2025. The preprint is available on arXiv.
The system converts user queries to Python code (Text-to-Python) by prompting Large Language Models (LLMs). More details on the architecture can be found in the paper.
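As a rough illustration of the text-to-Python idea, the sketch below builds a schema-aware prompt and executes model-generated code against a pandas DataFrame. The helper names, prompt wording, and example data are illustrative assumptions, not taken from this repository, and the `generated` string stands in for a real LLM reply; the actual prompts and pipeline are described in the paper.

```python
import pandas as pd

def build_prompt(question: str, df: pd.DataFrame) -> str:
    """Compose a prompt asking an LLM to answer the question with Python over df.
    Illustrative wording only; the system's real prompts are in the paper."""
    schema = ", ".join(f"{col} ({df[col].dtype})" for col in df.columns)
    return (
        f"You are given a pandas DataFrame `df` with columns: {schema}.\n"
        f"Write Python code that stores the answer in a variable named `result`.\n"
        f"Question: {question}"
    )

def run_generated_code(code: str, df: pd.DataFrame):
    """Execute the model-generated code and return its `result` variable."""
    scope = {"df": df, "pd": pd}
    exec(code, scope)
    return scope["result"]

# Hand-written stand-in for an LLM reply, for demonstration only:
df = pd.DataFrame({"city": ["Athens", "Patras"], "population": [3_154_000, 215_922]})
generated = "result = df.loc[df['population'].idxmax(), 'city']"
print(run_generated_code(generated, df))  # Athens
```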
> **Note:** Tested on Python 3.12.
- Clone the repository.
- Install the required packages from `requirements.txt`:

  ```bash
  pip install -r requirements.txt
  ```
- Set up credentials or models based on the evaluation scenario:
  - For evaluating Claude 3.5 Sonnet or Llama 3.1 Instruct-405B: Create a `.env` file in the root directory and add AWS credentials:

    ```
    AWS_ACCESS_KEY_ID=your_access_key_id
    AWS_SECRET_ACCESS_KEY=your_secret_access_key
    ```
  - For evaluating Ollama models (`llama3.1:8b`, `llama3.3:70b`, `qwen2.5-coder:7b`): Download the models by following the instructions on the Ollama website, and ensure that Ollama is installed and running on port 11434.
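To sanity-check either setup before running the pipeline, one can verify that the `.env` credentials are readable or that a local Ollama server answers on port 11434. The two helpers below are an illustrative sketch, not part of this repository (the repo presumably loads `.env` with a dotenv-style library); `/api/tags` is a standard Ollama endpoint that lists installed models.

```python
import urllib.request
from pathlib import Path

def load_env(path: str = ".env") -> dict:
    """Parse simple KEY=value lines from a .env file (illustrative helper)."""
    env = {}
    p = Path(path)
    if not p.exists():
        return env
    for line in p.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def ollama_running(host: str = "localhost", port: int = 11434) -> bool:
    """Return True if an Ollama server answers on the given host and port."""
    try:
        with urllib.request.urlopen(f"http://{host}:{port}/api/tags", timeout=2) as r:
            return r.status == 200
    except OSError:
        return False

creds = load_env()
if "AWS_ACCESS_KEY_ID" in creds:
    print("AWS credentials found in .env")
elif ollama_running():
    print("Ollama is reachable on port 11434")
else:
    print("No AWS credentials and no local Ollama server detected")
```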
- Download `competition.zip` and extract it in the root directory to run the model on the DataBench Test Set (this is the default behavior; another split can be loaded instead, as shown on the Hugging Face page). The archive can be downloaded from the DataBench Competition Page or directly from here.

  ```bash
  unzip competition.zip
  ```
- Download the `answers.zip` file with the answers for the test set and extract it into the `competition/answers/` directory:

  ```bash
  wget https://raw.githubusercontent.com/jorses/databench_eval/main/examples/answers.zip
  mkdir -p competition/answers
  unzip answers.zip -d competition/answers
  ```
- Run the `main.py` script, passing the pipeline specification as input. All pipelines are found in the `config/` folder. Include the `--lite` flag to run on DataBench lite.

  ```bash
  python main.py --pipeline config/claude3.5-sonnet
  # or
  python main.py --pipeline config/claude3.5-sonnet --lite
  ```
- The results will be saved in a new `results/` directory.