# tabularqa

This repository contains the system developed by AILS-NTUA for SemEval 2025 Task 8: Question Answering over Tabular Data.

The system ranked 1st in the proprietary-models ranking in both subtasks of the competition.

The system description paper will be published at ACL 2025. The preprint is available on arXiv.

## Architecture

The system converts user queries into Python code (Text-to-Code) by prompting Large Language Models (LLMs), then executes the generated code over the table to produce an answer. More details on the architecture can be found in the paper.
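To make the idea concrete, here is a minimal sketch of the Text-to-Code loop described above. It is not the repository's implementation: the `ask_llm` callable and the prompt wording are hypothetical stand-ins, and the real system's prompting, model backends, and execution safeguards live in the repo and the paper.

```python
# Minimal sketch of Text-to-Python-Code QA over a table.
# `ask_llm` is a hypothetical stand-in for a real LLM call.
import pandas as pd

def build_prompt(question: str, df: pd.DataFrame) -> str:
    """Describe the table schema and ask the model for pandas code."""
    schema = ", ".join(f"{c} ({df[c].dtype})" for c in df.columns)
    return (
        f"Columns: {schema}\n"
        f"Question: {question}\n"
        "Write a single Python expression over the DataFrame `df` "
        "that answers the question."
    )

def answer(question: str, df: pd.DataFrame, ask_llm) -> object:
    """Prompt the model for code, then evaluate it against the table."""
    code = ask_llm(build_prompt(question, df))
    # The real system executes generated code with safeguards; eval() here
    # is only for illustration.
    return eval(code, {"df": df, "pd": pd})

# Usage with a stub in place of a real LLM call:
df = pd.DataFrame({"name": ["Ada", "Bob"], "age": [36, 41]})
stub = lambda prompt: "df['age'].max()"  # pretend the model returned this code
print(answer("What is the maximum age?", df, stub))  # 41
```

A stub is used in place of a model call so the sketch runs offline; in the actual pipeline the prompt is sent to one of the configured LLMs.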

## Usage Instructions

> [!NOTE]
> Tested on Python 3.12.

1. Clone the repository.
2. Install the required packages from `requirements.txt`:

   ```sh
   pip install -r requirements.txt
   ```

3. Set up credentials or models based on the evaluation scenario:

   - For evaluating Claude 3.5 Sonnet or Llama 3.1 Instruct-405B: create a `.env` file in the root directory and add your AWS credentials:

     ```
     AWS_ACCESS_KEY_ID=your_access_key_id
     AWS_SECRET_ACCESS_KEY=your_secret_access_key
     ```

   - For evaluating Ollama models (`llama3.1:8b`, `llama3.3:70b`, `qwen2.5-coder:7b`): download the models by following the instructions on the Ollama website, and ensure that Ollama is installed and running on port 11434.
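A quick way to sanity-check either setup is sketched below. This helper is not part of the repository: it simply reads `KEY=VALUE` lines from a `.env` file into the environment (the repo may instead use a library such as python-dotenv) and probes the default Ollama port using its `/api/tags` endpoint.

```python
# Setup sanity-check sketch (not part of the repo): load a .env file
# and probe the local Ollama server.
import os
import urllib.request

def load_env(path=".env"):
    """Read KEY=VALUE lines from a .env file into os.environ."""
    if not os.path.exists(path):
        return
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                os.environ.setdefault(key.strip(), value.strip())

def ollama_running(port=11434) -> bool:
    """Return True if an Ollama server answers on the given port."""
    try:
        with urllib.request.urlopen(
            f"http://localhost:{port}/api/tags", timeout=2
        ):
            return True
    except OSError:
        return False
```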

4. Download `competition.zip` and extract it in the root directory to run the model on the DataBench Test Set (this is the default behavior; another split can be loaded as shown on the Hugging Face page). The archive can be downloaded from the DataBench Competition Page or directly from here:

   ```sh
   unzip competition.zip
   ```

5. Download the `answers.zip` file with the answers for the test set and extract it into the `competition/answers/` directory:

   ```sh
   wget https://raw.githubusercontent.com/jorses/databench_eval/main/examples/answers.zip
   mkdir -p competition/answers
   unzip answers.zip -d competition/answers
   ```
6. Run the `main.py` script, passing the pipeline specification as input. All pipelines are found in the `config/` folder. Include the `--lite` flag to run on DataBench lite:

   ```sh
   python main.py --pipeline config/claude3.5-sonnet
   # or
   python main.py --pipeline config/claude3.5-sonnet --lite
   ```

7. The results will be saved in a new `results/` directory.
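Once results are produced, they can be scored against the extracted gold answers. The sketch below is a rough, hypothetical illustration of exact-match scoring; the official `databench_eval` package implements the competition's actual metric and should be used for reported numbers.

```python
# Rough, hypothetical sketch of exact-match scoring; the official
# `databench_eval` package implements the competition's actual metric.
def normalize(value) -> str:
    """Lower-case and strip an answer so trivially different strings match."""
    return str(value).strip().lower()

def accuracy(predictions, gold) -> float:
    """Fraction of predictions that exactly match the gold answers."""
    hits = sum(normalize(p) == normalize(g) for p, g in zip(predictions, gold))
    return hits / len(gold) if gold else 0.0

print(accuracy(["41", "Yes"], ["41", "no"]))  # 0.5
```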
