S-Tree is a Static pointer-free and SIMD-optimized B-Tree, inspired by this post on algorithmica.org
The S-Tree library offers efficient data structures optimized for modern hardware. It uses cache-sensitive layouts and SIMD intrinsics to enhance the search performance, particularly for computing the rank (i.e. lower_bound) of elements, making it ideal for read-heavy workloads.
Other useful references and similar projects:
- The Static Search Tree (post and repo) written in Rust by @RagnarGrootKoerkamp, with common ideas plus more complex layouts and batched queries.
- FAST: fast architecture sensitive tree, presented in this paper
- The CSS-Tree presented in the paper Cache conscious indexing for decision-support in main memory by Rao, J., & Ross, K. A. (1998), and its implementation by @gvinciguerra
The BTree is a pointer-free B-Tree (not a B+Tree) that organizes elements in a cache-sensitive layout similar to the CSS Tree. It generalizes the Eytzinger layout (commonly used for binary trees) to support a configurable fanout B (node size). Key features include:
- Efficient Traversal: Uses SIMD intrinsics for fast operations.
- Space Efficiency: The input data is permuted into the BTree layout, allowing the original sorted input to be discarded.
-
Performance Trade-Off: Accessing elements incurs a small overhead of
$\Theta(\log_B n)$ due to the layout transformation.
Additional optimizations inspired by the post include cache-line alignment and the use of huge pages to minimize TLB misses.
The Sampled-BTree is a variant of the BTree that operates on a subset of the input data, extracted via regular sampling with a fixed step. This structure proved to be more efficient on some input.
#include "b_tree.h"
std::vector<int32_t> data = {-3, 2, 4, 11, 35, 60};
SIMD_Btree::btree<int32_t> tree;
tree.build(data.begin(), data.end());
std::cout << tree.lower_bound_idx(11) << std::endl; // 3
std::cout << tree.lower_bound_idx(12) << std::endl; // 4
std::cout << tree[3] << std::endl; // 11For a complete working example, refer to src/example.cc.
When declaring a btree or sampled_btree, several template parameters can be specified (all have default values):
template <typename key_type,
size_t known_log_tree_size = 0,
SIMD_ext ext = AVX512,
size_t vectors_per_block = 1>
class btree;
template <typename key_type,
SIMD_ext ext = AVX512,
size_t vectors_per_block = 1,
size_t leaves_vectors_per_block = vectors_per_block>
class sampled_btree;
vectors_per_block: Specifies the number of vectors per tree node.ext: Defines the SIMD extension to use (AVX2orAVX512).leaves_vectors_per_block: Sets the sampling offset for thesampled_btree.known_log_tree_size: If the logarithm (baseB) of the data size is known at compile time (consider usingbtree::log_size(size_t n)) this allows additional optimizations. The default value is0(unknown).
This library was developed as part of the experimental setup for a research article that compares various indexing structures. For a detailed experimental evaluation and performance comparison have a look at the repo and the paper.
If you use this code in your research project, please cite us:
@InProceedings{bellomo_et_al:LIPIcs.SEA.2025.5,
author = {Bellomo, Lorenzo and Cianci, Giuseppe and de Rosa, Luca and Ferragina, Paolo and Odorisio, Mattia},
title = {{A Comparative Study of Compressed, Learned, and Traditional Indexing Methods for Integer Data}},
booktitle = {23rd International Symposium on Experimental Algorithms (SEA 2025)},
pages = {5:1--5:23},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-375-1},
ISSN = {1868-8969},
year = {2025},
volume = {338},
editor = {Mutzel, Petra and Prezza, Nicola},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2025.5},
URN = {urn:nbn:de:0030-drops-232439},
doi = {10.4230/LIPIcs.SEA.2025.5},
}