Doux is built on the open-source code of Bourbon. Bourbon is a version of WiscKey that uses learned indexes. Bourbon's source code is here. Since Bourbon already implements WiscKey, we extend it by integrating our method (Doux) along with other comparable approaches, such as RISE.
Our main implementations are located in the following directories:
- impl/: Core implementations, benchmarks, and utilities for our methods.
- mod/: Additional modules and extensions used by our implementations.
- db/: Storage engine internals and modifications (where many method-level changes are integrated).
TPCH data for experiments is generated using the bundled tpch-tool.tar.gz:
- Extract the tool:
tar -xzf tpch-tool.tar.gz
- Follow the instructions in tpch-tool/README.md to build dbgen and generate tables with the desired scale factor.
- A simple example (generate 1 GB-scale lineitem and rename the file):
cd tpch-tool/TPCH/dbgen && ./dbgen -T L -s 1 -f && mv -f lineitem.tbl lineitem.1
For more options (different tables, scale factors, etc.), please refer to the detailed documentation in tpch-tool/README.md.
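To generate several scale factors in one go, the single-command example can be wrapped in a small function. This is only a sketch: `gen_lineitem` and the `DBGEN` variable are hypothetical names, the `lineitem.<scale>` naming simply follows the example, and it assumes dbgen has already been built and is invoked from the directory that contains it.

```shell
#!/bin/sh
# Hypothetical helper: generate lineitem at scale factor $1 and rename the
# output to lineitem.<scale>, mirroring the single-command example.
# DBGEN is an illustrative override for the dbgen path (default: ./dbgen).
gen_lineitem() {
    sf="$1"
    "${DBGEN:-./dbgen}" -T L -s "$sf" -f || return 1
    mv -f lineitem.tbl "lineitem.$sf"
}

# Usage (from tpch-tool/TPCH/dbgen):
#   for sf in 1 10; do gen_lineitem "$sf" || break; done
```

Stopping at the first failure avoids renaming a stale lineitem.tbl left over from an earlier run.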
To build the project, follow these steps:
- Create build directories:
mkdir build
mkdir build_debug
- Configure and build for Release mode (from the project root):
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)
- Configure and build for Debug mode (from the project root):
cd build_debug
cmake .. -DCMAKE_BUILD_TYPE=Debug
make -j$(nproc)
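Note: $(nproc) expands to the number of available CPU cores. Replace it with a smaller number to limit parallel compilation. Avoid a bare make -j: without a number, GNU make places no limit on the number of parallel jobs, which can exhaust memory on large builds.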
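The same two configurations can also be built from the project root without changing directories. The sketch below assumes CMake 3.13 or newer (for the `-S`/`-B` options); `build_config`, `CMAKE`, and `JOBS` are hypothetical names introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: configure and build one configuration from the
# project root.
#   $1 = build directory (e.g. build), $2 = CMAKE_BUILD_TYPE (e.g. Release)
# CMAKE and JOBS are illustrative overrides, not project settings.
build_config() {
    dir="$1"
    btype="$2"
    cmake_bin="${CMAKE:-cmake}"
    # Default to the core count; fall back to 1 if nproc is unavailable.
    jobs="${JOBS:-$(nproc 2>/dev/null || echo 1)}"
    "$cmake_bin" -S . -B "$dir" -DCMAKE_BUILD_TYPE="$btype" &&
        "$cmake_bin" --build "$dir" -j "$jobs"
}

# Usage (from the project root):
#   build_config build Release && build_config build_debug Debug
```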
After building, the TPCH benchmark binaries are generated under the build directory (e.g., build/ or build_debug/):
- tpch_bench_load_data: load TPCH lineitem data into the DB
- tpch_bench_query_q6: run range queries (Q6-style) on the DB
Example (TPCH lineitem, Doux mode -m 10):
cd build
./tpch_bench_load_data \
-m 10 \
-f <PATH_TO_LINEITEM_FILE> \
-c <NUM_RECORDS_TO_LOAD> \
-d <DB_DIR> &
Typical usage:
- -m, --modification: method switch (e.g., 0 for vanilla LevelDB, 10 for our modified version)
- -f, --input_file: TPCH input file (e.g., lineitem.100 generated in the data generation step)
- -c, --count: number of tuples to load (should match the actual number of records in the file)
- -d, --directory: DB directory (e.g., ../test/doux_db_100)
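Because -c should match the number of records in the input file, the count can be derived from the file itself instead of typed by hand. The sketch below assumes the input holds one record per line with no header, as dbgen's .tbl output does; `count_records`, `load_lineitem`, and `LOAD_BIN` are hypothetical names, not part of the project.

```shell
#!/bin/sh
# Count records in a dbgen-style table file (assumes one record per line).
count_records() {
    wc -l < "$1" | tr -d '[:space:]'
}

# Hypothetical wrapper: run the loader with -c derived from the input file.
#   $1 = method (-m), $2 = input file (-f), $3 = DB directory (-d)
# LOAD_BIN is an illustrative override for the loader binary path.
load_lineitem() {
    [ -r "$2" ] || return 1          # refuse to run on a missing input file
    n="$(count_records "$2")"
    "${LOAD_BIN:-./tpch_bench_load_data}" -m "$1" -f "$2" -c "$n" -d "$3"
}

# Usage (from the build directory):
#   load_lineitem 10 <PATH_TO_LINEITEM_FILE> <DB_DIR>
```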
Query files for Q6 are provided under:
impl/benchmarks/tpch/tpch_query/
From the build directory, you can refer to them as:
../impl/benchmarks/tpch/tpch_query/query_q6
../impl/benchmarks/tpch/tpch_query/query_q6_001
...
After loading finishes, run queries, for example:
cd build
./tpch_bench_query_q6 \
-m 10 \
-f ../impl/benchmarks/tpch/tpch_query/query_q6 \
-q 5 \
-d <DB_DIR>
Key arguments:
-q, --query_size: number of queries to execute
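To run the benchmark against every provided query file in turn, a loop along the following lines may help. This is a sketch rather than part of the project: `run_q6_all` and `QUERY_BIN` are hypothetical names, and it assumes every file whose name starts with query_q6 in the given directory is a valid query file and that the same -q value suits each of them.

```shell
#!/bin/sh
# Hypothetical helper: run the Q6 benchmark once per query file in a directory.
#   $1 = method (-m), $2 = number of queries (-q), $3 = DB directory (-d),
#   $4 = directory holding the query files
# QUERY_BIN is an illustrative override for the benchmark binary path.
run_q6_all() {
    for qf in "$4"/query_q6*; do
        [ -f "$qf" ] || continue     # the glob matched nothing
        echo "== $qf"
        "${QUERY_BIN:-./tpch_bench_query_q6}" -m "$1" -f "$qf" -q "$2" -d "$3" || return 1
    done
}

# Usage (from the build directory):
#   run_q6_all 10 5 <DB_DIR> ../impl/benchmarks/tpch/tpch_query
```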