The TREC-ToT dataset is integrated into ir_datasets (install it via `pip3 install ir-datasets`), and we provide baselines for Anserini, PyTerrier, and a Dense Retrieval approach that use this ir_datasets integration. The code and descriptions for all baselines are available in main/trec25.
The indices of our baselines are publicly available for faster experimentation with, and modification of, the baselines.
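As a minimal sketch of how the ir_datasets integration can be used, the snippet below loads one split and previews a few queries. The helper names `load_split` and `preview_queries` are hypothetical and not part of this repository; the dataset identifier `trec-tot/2025/train` is taken from the results table, and the fields of each query object are not assumed here.

```python
from itertools import islice


def load_split(dataset_id="trec-tot/2025/train"):
    """Load a TREC-ToT split through ir_datasets (hypothetical helper)."""
    # Imported lazily so that the preview helper below also works
    # without ir_datasets installed (pip3 install ir-datasets).
    import ir_datasets

    return ir_datasets.load(dataset_id)


def preview_queries(dataset, limit=3):
    """Return the first `limit` query objects of an ir_datasets-style dataset."""
    # queries_iter() is the standard ir_datasets iterator over queries.
    return list(islice(dataset.queries_iter(), limit))
```

Calling `preview_queries(load_split())` should return the first three queries of the training split; the first call may download the data.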
The following baselines and runs are available (more details available in trec25/evaluation/evaluation-of-baselines.ipynb):
| ir_dataset | Baseline | Runfiles | NDCG@10 | NDCG@1000 | R@1000 |
|---|---|---|---|---|---|
| trec-tot/2025/train | BM25 (Anserini) | runs | 0.022 | 0.055 | 0.280 |
| trec-tot/2025/train | BM25 (PyTerrier) | runs | 0.065 | 0.115 | 0.455 |
| trec-tot/2025/train | Dense Retrieval | runs | 0.318 | 0.373 | 0.755 |
| trec-tot/2025/dev1 | BM25 (Anserini) | runs | 0.031 | 0.058 | 0.218 |
| trec-tot/2025/dev1 | BM25 (PyTerrier) | runs | 0.084 | 0.134 | 0.451 |
| trec-tot/2025/dev1 | Dense Retrieval | runs | 0.324 | 0.381 | 0.761 |
| trec-tot/2025/dev2 | BM25 (Anserini) | runs | 0.043 | 0.072 | 0.252 |
| trec-tot/2025/dev2 | BM25 (PyTerrier) | runs | 0.099 | 0.143 | 0.455 |
| trec-tot/2025/dev2 | Dense Retrieval | runs | 0.020 | 0.050 | 0.245 |
| trec-tot/2025/dev3 | BM25 (Anserini) | runs | 0.092 | 0.143 | 0.470 |
| trec-tot/2025/dev3 | BM25 (PyTerrier) | runs | 0.337 | 0.392 | 0.771 |
| trec-tot/2025/dev3 | Dense Retrieval | runs | 0.014 | 0.035 | 0.174 |
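To make the reported measures concrete, here is a minimal sketch of NDCG@k and R@k for a single query. It assumes binary relevance and a ranking without duplicate document ids; it is not the evaluation code used in the notebook, and the function names are illustrative only.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k gains (ranks start at 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains[:k], start=1))


def ndcg_at_k(ranking, relevant, k):
    """NDCG@k with binary gains: 1.0 for relevant documents, 0.0 otherwise."""
    gains = [1.0 if doc_id in relevant else 0.0 for doc_id in ranking]
    # The ideal ranking places all relevant documents at the top.
    ideal_dcg = dcg_at_k([1.0] * len(relevant), k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def recall_at_k(ranking, relevant, k):
    """Fraction of relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(ranking[:k]) & set(relevant)) / len(relevant)
```

For example, with `ranking = ["d3", "d1", "d2"]` and `relevant = {"d1"}`, the relevant document sits at rank 2, so `ndcg_at_k(ranking, relevant, 10)` equals `1 / math.log2(3)` and `recall_at_k(ranking, relevant, 10)` equals `1.0`. The table values are averages of such per-query scores over all queries of a split.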
Note: This repository hosts the code for processing the 2025 edition of the TREC ToT corpus in the trec25 directory. For older editions, refer to the trec24 directory or the dedicated 2024 release, or to the dedicated 2023 release.