A dynamic, contamination-free benchmark for evaluating LLMs on complex, real-world text-to-SQL tasks.
Website • Paper (coming soon) • GitHub
Maintained by the BIRD Team @ HKU & Google Cloud
Check out our Hugging Face page for easy access to the LiveSQLBench-Base-Lite-SQLite dataset.
To avoid data leakage via auto-crawling, certain fields (e.g., `sol_sql`, `test_cases`, `external_knowledge`) are excluded from the public dataset `livesqlbench_data_sqlite.jsonl`. For the full dataset, please email bird.bench25@gmail.com with the subject tag [livesqlbench-base-lite-SQLite GT&Test Cases]; the full data will be sent automatically.
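As a quick sanity check, you can load the public JSON Lines dump and confirm that the withheld ground-truth fields are indeed absent. This is a minimal sketch; the record fields other than the three excluded ones are illustrative, not guaranteed by the dataset.

```python
import json

# Ground-truth fields withheld from the public release.
EXCLUDED = {"sol_sql", "test_cases", "external_knowledge"}

def load_public_records(path):
    """Load the public JSON Lines dump, one dict per line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

def check_no_leakage(records):
    """True if no record contains any withheld field."""
    return all(EXCLUDED.isdisjoint(rec) for rec in records)
```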
We are pleased to release a SQLite version of LiveSQLBench-Base-Lite, extending the benchmark from the PostgreSQL dialect to SQLite to improve accessibility: SQLite requires no server setup and runs locally. This release features 18 end-user-level databases with 270 tasks (180 SELECT-only, 90 management tasks), with HKB-JSON and JSON operations in SQL available for trial.
Beyond translating the SQL and test cases, we carefully adapted 20+ user queries to align with the characteristics of SQLite's database engine. For example, since SQLite doesn't support custom SQL functions, we modified queries to either return specific scenario values or use views (e.g., `CREATE VIEW ... AS ...`) to maintain query complexity while ensuring compatibility.
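The view-based workaround can be illustrated with Python's built-in `sqlite3` driver. The table, view, and column names below are hypothetical examples, not taken from the benchmark databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, tax_rate REAL);
    INSERT INTO orders (amount, tax_rate) VALUES (100.0, 0.5), (50.0, 0.25);

    -- Instead of a custom function like total_with_tax(amount, rate),
    -- precompute the expression once in a view and reuse it downstream.
    CREATE VIEW order_totals AS
    SELECT id, amount * (1 + tax_rate) AS total
    FROM orders;
""")

rows = conn.execute("SELECT total FROM order_totals ORDER BY id").fetchall()
print(rows)  # [(150.0,), (62.5,)]
```

Downstream queries can then reference `order_totals.total` wherever the original dialect would have called the custom function, keeping the query's structure intact.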
```
mini_dev/
└── live_sql_bench_sqlite/
    ├── database/      # SQLite databases
    ├── evaluation/    # Evaluation scripts
    ├── utils/         # Utility scripts for prompt generation and post-processing
    ├── prompts/       # Prompt templates
    └── README.md      # This file
```
Use the following script to generate the baseline prompt for the SQLite version of LiveSQLBench-Base-Lite:
```bash
cd utils
data_path="xxx/livesqlbench_data_sqlite.jsonl"
assistant_prompt_path="xxx/assistant_sqlite.jsonl"  # Replace with your desired output path
data_path_base="xxx/database"                       # Replace with your database path
python xxx/prompt_generator.py \
    --data_path $data_path \
    --prompt_path $assistant_prompt_path \
    --prompt_type "assistant" \
    --data_path_base $data_path_base
```

The generated JSONL file will contain a `prompt` field. You can then use your favorite LLM to generate SQL queries from the provided prompts.
After you generate SQL queries from the prompts, use the following script to post-process them (assuming the input is a JSONL file with a `response` field):
```bash
cd utils
input_path="xxx/generated_sql.jsonl"        # Replace with your generated SQL file path
output_path="xxx/post_processed_sql.jsonl"  # Replace with your desired output path
python xxx/post_process_baseline.py --input_path $input_path --output_path $output_path
```

Now we can evaluate the generated SQL queries against the SQLite databases. Use the following script to evaluate the post-processed SQL queries:
```bash
cd evaluation
# Modify the JSONL location and the database path in run_eval.sh
jsonl_file="xxx/post_processed_sql.jsonl"  # Replace with your post-processed SQL file path
db_path="xxx/database"                     # Replace with your database path
bash run_eval.sh
```