Skip to content

mattiaodorisio/S-Tree

Repository files navigation

S-Tree

S-Tree is a Static pointer-free and SIMD-optimized B-Tree, inspired by this post on algorithmica.org

Overview

The S-Tree library offers efficient data structures optimized for modern hardware. It uses cache-sensitive layouts and SIMD intrinsics to enhance the search performance, particularly for computing the rank (i.e. lower_bound) of elements, making it ideal for read-heavy workloads.

Other useful references and similar projects:

  1. The Static Search Tree (post and repo) written in Rust by @RagnarGrootKoerkamp, with common ideas plus more complex layouts and batched queries.
  2. FAST: fast architecture sensitive tree, presented in this paper
  3. The CSS-Tree presented in the paper Cache conscious indexing for decision-support in main memory by Rao, J., & Ross, K. A. (1998), and its implementation by @gvinciguerra

Data Structures

BTree

The BTree is a pointer-free B-Tree (not a B+Tree) that organizes elements in a cache-sensitive layout similar to the CSS Tree. It generalizes the Eytzinger layout (commonly used for binary trees) to support a configurable fanout B (node size). Key features include:

  • Efficient Traversal: Uses SIMD intrinsics for fast operations.
  • Space Efficiency: The input data is permuted into the BTree layout, allowing the original sorted input to be discarded.
  • Performance Trade-Off: Accessing elements incurs a small overhead of $\Theta(\log_B n)$ due to the layout transformation.

Additional optimizations inspired by the post include cache-line alignment and the use of huge pages to minimize TLB misses.

Sampled-BTree

The Sampled-BTree is a variant of the BTree that operates on a subset of the input data, extracted via regular sampling with a fixed step. This structure proved to be more efficient on some input.

Usage Example

#include "b_tree.h"

std::vector<int32_t> data = {-3, 2, 4, 11, 35, 60};
SIMD_Btree::btree<int32_t> tree;
tree.build(data.begin(), data.end());
std::cout << tree.lower_bound_idx(11) << std::endl;  // 3
std::cout << tree.lower_bound_idx(12) << std::endl;  // 4
std::cout << tree[3] << std::endl;                   // 11

For a complete working example, refer to src/example.cc.

Additional Details

When declaring a btree or sampled_btree, several template parameters can be specified (all have default values):

template <typename key_type,
          size_t known_log_tree_size = 0,
          SIMD_ext ext = AVX512,
          size_t vectors_per_block = 1>
class btree;

template <typename key_type,
          SIMD_ext ext = AVX512,
          size_t vectors_per_block = 1,
          size_t leaves_vectors_per_block = vectors_per_block>
class sampled_btree;

Template Parameters

  • vectors_per_block: Specifies the number of vectors per tree node.
  • ext: Defines the SIMD extension to use (AVX2 or AVX512).
  • leaves_vectors_per_block: Sets the sampling offset for the sampled_btree.
  • known_log_tree_size: If the logarithm (base B) of the data size is known at compile time (consider using btree::log_size(size_t n)) this allows additional optimizations. The default value is 0 (unknown).

Benchmarks

This library was developed as part of the experimental setup for a research article that compares various indexing structures. For a detailed experimental evaluation and performance comparison have a look at the repo and the paper.

Cite us

If you use this code in your research project, please cite us:

@InProceedings{bellomo_et_al:LIPIcs.SEA.2025.5,
  author =	{Bellomo, Lorenzo and Cianci, Giuseppe and de Rosa, Luca and Ferragina, Paolo and Odorisio, Mattia},
  title =	{{A Comparative Study of Compressed, Learned, and Traditional Indexing Methods for Integer Data}},
  booktitle =	{23rd International Symposium on Experimental Algorithms (SEA 2025)},
  pages =	{5:1--5:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-375-1},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{338},
  editor =	{Mutzel, Petra and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2025.5},
  URN =		{urn:nbn:de:0030-drops-232439},
  doi =		{10.4230/LIPIcs.SEA.2025.5},
}

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •