
🚀 LiveSQLBench-Base-Lite

A dynamic, contamination-free benchmark for evaluating LLMs on complex, real-world text-to-SQL tasks.

🌐 Website • 📄 Paper (coming soon) • 💻 GitHub

Maintained by the 🦜 BIRD Team @ HKU & ☁️ Google Cloud

Dataset Access

Check out our Hugging Face Page for easy access to the LiveSQLBench-Base-Lite-SQLite dataset.

πŸ” To avoid data leakage by auto-crawling, certain fields (e.g., sol_sql, test_cases, external_knowledge) are excluded from the public dataset livesqlbench_data_sqlite.jsonl. For the full dataset, please email: πŸ“§ bird.bench25@gmail.com with subject tag [livesqlbench-base-lite-SQLite GT&Test Cases], which will be sent automatically.

🎯 Current Release: LiveSQLBench-Base-Lite-SQLite

We are pleased to release a SQLite version of LiveSQLBench-Base-Lite, extending the benchmark from the PostgreSQL dialect to SQLite to improve accessibility: SQLite requires no server setup and runs locally. This release features 18 end-user-level databases with 270 tasks (180 SELECT-only tasks, 90 management tasks), including HKB-JSON and JSON operations in SQL to try out.

Beyond translating the SQL and test cases, we carefully adapted 20+ user queries to align with the characteristics of SQLite's database engine. For example, since SQLite doesn't support custom SQL functions, we modified those queries to either return specific scenario values or use views (e.g., CREATE VIEW ... AS ...) to preserve query complexity while ensuring compatibility.
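The view-based workaround can be illustrated with Python's built-in sqlite3 module. The table, columns, and view below are hypothetical, chosen only to show the pattern:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, tax_rate REAL);
    INSERT INTO orders VALUES (1, 100.0, 0.25), (2, 250.0, 0.5);

    -- Instead of a custom total_with_tax() SQL function (unsupported in
    -- SQLite's SQL dialect), encapsulate the computation in a view.
    CREATE VIEW order_totals AS
    SELECT id, amount * (1 + tax_rate) AS total FROM orders;
""")

rows = conn.execute("SELECT id, total FROM order_totals ORDER BY id").fetchall()
print(rows)  # [(1, 125.0), (2, 375.0)]
```

Downstream queries can then reference `order_totals` exactly as they would have called the function, keeping the original query complexity intact.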

📚 Project Structure

```
mini_dev/
├── live_sql_bench_sqlite/
│   ├── database/                # SQLite databases
│   ├── evaluation/              # Evaluation scripts
│   ├── utils/                   # Utility scripts for prompt generation and post-processing
│   ├── prompts/                 # Prompt templates
│   └── README.md                # This file
```

📦 Usage

Inference

Use the following script to generate the baseline prompt for the SQLite version of LiveSQLBench-Base-Lite:

```shell
cd utils
data_path="xxx/livesqlbench_data_sqlite.jsonl"
assistant_prompt_path="xxx/assistant_sqlite.jsonl" # Replace with your desired output path
data_path_base="xxx/database" # Replace with your database path
python xxx/prompt_generator.py \
    --data_path "$data_path" \
    --prompt_path "$assistant_prompt_path" \
    --prompt_type "assistant" \
    --data_path_base "$data_path_base"
```

The generated JSONL file will contain a prompt field. You can then use your favorite LLM to generate SQL queries from the provided prompts.
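The inference loop itself is up to you; a minimal sketch of reading the prompt field and writing back a response field (the `ask_model` callable is a placeholder for whatever LLM client you use):

```python
import json

def generate_responses(prompt_path, output_path, ask_model):
    """For each task, call the model on its `prompt` field and store the
    raw completion under `response`, one JSON object per output line."""
    with open(prompt_path, encoding="utf-8") as fin, \
         open(output_path, "w", encoding="utf-8") as fout:
        for line in fin:
            line = line.strip()
            if not line:
                continue
            task = json.loads(line)
            task["response"] = ask_model(task["prompt"])
            fout.write(json.dumps(task) + "\n")

# Usage with a hypothetical client:
# generate_responses("assistant_sqlite.jsonl", "generated_sql.jsonl",
#                    lambda p: my_llm_client.complete(p))
```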

πŸ› οΈ Process

After you generate SQL queries using the prompts, you can use the following script to post-process them (the input is assumed to be a JSONL file with a response field):

```shell
cd utils
input_path="xxx/generated_sql.jsonl" # Replace with your generated SQL file path
output_path="xxx/post_processed_sql.jsonl" # Replace with your desired output path
python xxx/post_process_baseline.py --input_path "$input_path" --output_path "$output_path"
```
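The internals of post_process_baseline.py are not reproduced here, but a common baseline post-processing step (an assumption, not necessarily what the script does) is extracting the SQL from a fenced code block in the model response, falling back to the raw text:

```python
import re

def extract_sql(response: str) -> str:
    """Pull the first ```sql ... ``` (or plain ```) block out of an LLM
    response; fall back to the whole response if no fence is found."""
    match = re.search(r"```(?:sql)?\s*(.*?)```", response,
                      re.DOTALL | re.IGNORECASE)
    return (match.group(1) if match else response).strip()

print(extract_sql("Here you go:\n```sql\nSELECT 1;\n```"))  # SELECT 1;
```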

📊 Evaluation

Now we can evaluate the generated SQL queries against the SQLite databases. Use the following script to evaluate the post-processed SQL queries:

```shell
cd evaluation
# You need to modify the JSONL location and the database path in the run_eval.sh file
jsonl_file="xxx/post_processed_sql.jsonl" # Replace with your post-processed SQL file path
db_path="xxx/database" # Replace with your database path
bash run_eval.sh
```
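The internals of run_eval.sh are not shown here; execution-based text-to-SQL evaluation typically runs the predicted and gold SQL against the same database and compares their result sets. A minimal sketch of that idea (the function name and order-insensitive comparison are assumptions, not the benchmark's exact metric):

```python
import sqlite3

def results_match(db_file, pred_sql, gold_sql):
    """Execute both queries against the same SQLite database and compare
    their result sets, ignoring row order."""
    conn = sqlite3.connect(db_file)
    try:
        pred = conn.execute(pred_sql).fetchall()
        gold = conn.execute(gold_sql).fetchall()
    except sqlite3.Error:
        return False  # predicted SQL failed to parse or execute
    finally:
        conn.close()
    # Compare as sorted row representations so row order doesn't matter.
    return sorted(map(repr, pred)) == sorted(map(repr, gold))
```

Note that the real benchmark also includes test cases and management (write) tasks, which need more than a simple result-set comparison.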