[Paper][EMNLP 2025] SKA-Bench: A Fine-Grained Benchmark for Evaluating Structured Knowledge Understanding of LLMs

SKA-Bench

📄 arXiv | 🤗 Huggingface

Environment

conda create -n skabench python=3.9.0
conda activate skabench
pip install openai
pip install asyncio
pip install uvloop

Testbed Construction

For the noise robustness, order insensitivity, and information integration testbeds, you can run:

python process_dataset.py --type KG --sequence random --scale 1k

NOTE:

Before running the code, set the data type with --type, the sequence order with --sequence, and the context size with --scale. The test set is then generated in the dataset folder.

For negative rejection, you can run:

python process_dataset.py --type Table --sequence original --scale 4k --negative_rejection negative_rejection
python process_dataset.py --type KG --sequence random --scale 4k --negative_rejection negative_rejection
python process_dataset.py --type Table+Text --sequence original --scale 12k --negative_rejection negative_rejection
python process_dataset.py --type KG+Text --sequence random --scale 12k --negative_rejection negative_rejection
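The four commands above differ only in their type, sequence, and scale values, so they can be generated from a small table. The following is a minimal sketch of such a driver; the names `NEGATIVE_REJECTION_RUNS` and `build_commands` are hypothetical, and the flags are exactly those shown above.

```python
import shlex

# (type, sequence, scale) triples copied from the commands above.
NEGATIVE_REJECTION_RUNS = [
    ("Table", "original", "4k"),
    ("KG", "random", "4k"),
    ("Table+Text", "original", "12k"),
    ("KG+Text", "random", "12k"),
]


def build_commands(runs):
    """Return one argv list per negative rejection testbed."""
    commands = []
    for data_type, sequence, scale in runs:
        commands.append([
            "python", "process_dataset.py",
            "--type", data_type,
            "--sequence", sequence,
            "--scale", scale,
            "--negative_rejection", "negative_rejection",
        ])
    return commands


for argv in build_commands(NEGATIVE_REJECTION_RUNS):
    # Only prints the command; pass argv to subprocess.run(argv, check=True)
    # from the repository root to actually execute it.
    print(shlex.join(argv))
```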

Evaluation Scripts

For the noise robustness, order insensitivity, and information integration testbeds, you can run:

python evaluate.py --type <type> --api_key <api_key> --api_url <api_url> --model <model> --dataset_dir ./dataset/Table_original_42_4k.json

NOTE:

Replace <type> with the data type, <api_key> with your API key, <api_url> with the API URL, and <model> with the model name, and replace ./dataset/Table_original_42_4k.json with the path to your dataset file.
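Because --type has to match the dataset file, one way to avoid a mismatch is to derive it from the file name. The sketch below assumes that the part of the file name before the first underscore is the data type, as in the example path above. The helper `evaluate_argv` is hypothetical, and the placeholder strings are left for you to fill in.

```python
from pathlib import Path


def evaluate_argv(dataset_dir, model, api_key, api_url):
    """Build the evaluate.py command, taking --type from the file name.

    Assumes the file name starts with "<type>_", e.g. Table_original_42_4k.json.
    """
    data_type = Path(dataset_dir).name.split("_", 1)[0]
    return [
        "python", "evaluate.py",
        "--type", data_type,
        "--api_key", api_key,
        "--api_url", api_url,
        "--model", model,
        "--dataset_dir", dataset_dir,
    ]


argv = evaluate_argv("./dataset/Table_original_42_4k.json",
                     model="<model>", api_key="<api_key>", api_url="<api_url>")
print(" ".join(argv))
```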

For negative rejection, you can run:

python evaluate_negative.py --type KG --api_key <api_key> --api_url <api_url> --model <model> --dataset_dir ./dataset/KG_random_42_4k_negative_rejection.json
python evaluate_negative.py --type Table --api_key <api_key> --api_url <api_url> --model <model> --dataset_dir ./dataset/Table_original_42_4k_negative_rejection.json
python evaluate_negative.py --type KG+Text --api_key <api_key> --api_url <api_url> --model <model> --dataset_dir ./dataset/KG+Text_random_42_12k_negative_rejection.json
python evaluate_negative.py --type Table+Text --api_key <api_key> --api_url <api_url> --model <model> --dataset_dir ./dataset/Table+Text_original_42_12k_negative_rejection.json
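Each command above expects its testbed file to already exist in the dataset folder. A small pre-flight check, sketched below, lists any files that still need to be generated with process_dataset.py. The function name `missing_testbeds` is hypothetical, and the paths are copied from the commands above.

```python
from pathlib import Path

# Dataset files referenced by the evaluate_negative.py commands above.
NEGATIVE_REJECTION_FILES = {
    "KG": "./dataset/KG_random_42_4k_negative_rejection.json",
    "Table": "./dataset/Table_original_42_4k_negative_rejection.json",
    "KG+Text": "./dataset/KG+Text_random_42_12k_negative_rejection.json",
    "Table+Text": "./dataset/Table+Text_original_42_12k_negative_rejection.json",
}


def missing_testbeds(files):
    """Return the paths in `files` that do not exist on disk yet."""
    return [path for path in files.values() if not Path(path).is_file()]


for path in missing_testbeds(NEGATIVE_REJECTION_FILES):
    print(f"missing testbed, generate it first: {path}")
```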

🤝 Cite:

Please consider citing this paper if you find our work useful.


@article{liu2025ska,
  title={SKA-Bench: A Fine-Grained Benchmark for Evaluating Structured Knowledge Understanding of LLMs},
  author={Liu, Zhiqiang and Niu, Enpei and Hua, Yin and Sun, Mengshu and Liang, Lei and Chen, Huajun and Zhang, Wen},
  journal={arXiv preprint arXiv:2507.17178},
  year={2025}
}
