Doux: Decoupling Values from Keys for Real-Time Analytics

Doux is built on the open-source code of Bourbon, a variant of WiscKey that uses learned indexes. Bourbon's source code is here. Because Bourbon already implements WiscKey, we extend it by integrating our method (Doux) alongside comparable approaches such as RISE.

Code Layout

Our main implementations are located in the following directories:

  • impl/: Core implementations, benchmarks, and utilities for our methods.
  • mod/: Additional modules and extensions used by our implementations.
  • db/: Storage engine internals and modifications (where many method-level changes are integrated).

Data Generation

TPCH data for experiments is generated using the bundled tpch-tool.tar.gz:

  1. Extract the tool:
    • tar -xzf tpch-tool.tar.gz
  2. Follow the instructions in tpch-tool/README.md to build dbgen and generate tables with the desired scale factor.
  3. A simple example (generate the lineitem table at scale factor 1 and rename the file):
    • cd tpch-tool/TPCH/dbgen && ./dbgen -T L -s 1 -f && mv -f lineitem.tbl lineitem.1

For more options (different tables, scale factors, etc.), please refer to the detailed documentation in tpch-tool/README.md.
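To generate lineitem at several scale factors, the example command above can be wrapped in a small helper. This is a minimal sketch, assuming it is run from the dbgen directory and that the `lineitem.<SF>` naming from the example is the convention you want. `gen_lineitem` and the `DBGEN` override variable are hypothetical names introduced here, not part of the tool.

```shell
# Generate lineitem at scale factor $1 and rename it to lineitem.<SF>.
# Run from the directory that contains dbgen.
# DBGEN is a hypothetical override for the binary path (handy for dry runs).
gen_lineitem() {
  "${DBGEN:-./dbgen}" -T L -s "$1" -f || return 1   # -T L: lineitem only, -f: overwrite
  mv -f lineitem.tbl "lineitem.$1"
}
```

For example, `gen_lineitem 1` reproduces the command in step 3, and `gen_lineitem 100` would leave a file named `lineitem.100`.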

Build

To build the project, follow these steps:

  1. Create build directories:

    mkdir build
    mkdir build_debug
  2. Configure and build for Release mode:

    cd build
    cmake .. -DCMAKE_BUILD_TYPE=Release
    make -j$(nproc)
  3. Configure and build for Debug mode:

    cd ../build_debug
    cmake .. -DCMAKE_BUILD_TYPE=Debug
    make -j$(nproc)

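Note: $(nproc) expands to the number of available CPU cores. Replace it with a smaller number to limit parallel compilation. Avoid a bare make -j: without a number, make places no limit on the number of parallel jobs, which can exhaust memory on large builds.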
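On systems where nproc is not available, the job count can be chosen with a small fallback helper. This is a sketch only: `detect_jobs` and the `JOBS` environment variable are hypothetical names introduced for illustration, not something the build system reads.

```shell
# Pick a job count for make: honor JOBS if set, else detect cores, else 1.
detect_jobs() {
  if [ -n "${JOBS:-}" ]; then
    echo "$JOBS"
  elif command -v nproc >/dev/null 2>&1; then
    nproc
  elif command -v getconf >/dev/null 2>&1; then
    getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
  else
    echo 1
  fi
}
```

It can then be used as `make -j"$(detect_jobs)"` in either build directory.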

Run Benchmarks (TPCH Example)

After building, the TPCH benchmark binaries are generated under the build directory (e.g., build/ or build_debug/):

  • tpch_bench_load_data: load TPCH lineitem data into the DB
  • tpch_bench_query_q6: run range queries (Q6-style) on the DB

1) Load Data

Example (TPCH lineitem, Doux mode -m 10):

cd build
./tpch_bench_load_data \
  -m 10 \
  -f <PATH_TO_LINEITEM_FILE> \
  -c <NUM_RECORDS_TO_LOAD> \
  -d <DB_DIR> &

Key arguments:

  • -m, --modification: method switch (e.g., 0 for vanilla LevelDB, 10 for Doux)
  • -f, --input_file: TPCH input file (e.g., lineitem.100, generated and renamed as in the data generation step)
  • -c, --count: number of tuples to load (should match the actual number of records in the file)
  • -d, --directory: DB directory (e.g., ../test/doux_db_100)
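Because -c should match the number of records in the input file, it can be derived from the file instead of typed by hand. The sketch below assumes the file has one record per line, as in dbgen's .tbl output. `count_records` and `load_lineitem` are hypothetical helper names, and only the flags listed above are used.

```shell
# Count records in a dbgen output file (assumes one row per line).
count_records() {
  wc -l < "$1" | tr -d '[:space:]'
}

# Load a lineitem file, deriving -c from the file itself.
# Usage: load_lineitem <mode> <input_file> <db_dir>
# Runs in the foreground, so it returns only when loading has finished.
load_lineitem() {
  ./tpch_bench_load_data -m "$1" -f "$2" -c "$(count_records "$2")" -d "$3"
}
```

From the build directory, `load_lineitem 10 <PATH_TO_LINEITEM_FILE> <DB_DIR>` would then replace the explicit -c value in the example.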

2) Run Queries

Query files for Q6 are provided under:

  • impl/benchmarks/tpch/tpch_query/

From the build directory, you can refer to them as:

  • ../impl/benchmarks/tpch/tpch_query/query_q6
  • ../impl/benchmarks/tpch/tpch_query/query_q6_001
  • ...

After loading finishes, run the queries. For example:

cd build
./tpch_bench_query_q6 \
  -m 10 \
  -f ../impl/benchmarks/tpch/tpch_query/query_q6 \
  -q 5 \
  -d <DB_DIR>

Key arguments:

  • -q, --query_size: number of queries to execute
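To run every provided Q6 query file against the same DB, the command above can be placed in a loop. This is a minimal sketch, assuming the query files all match the `query_q6*` pattern shown earlier. `run_q6_files` and the `Q6_BIN` override variable are hypothetical names introduced here.

```shell
# Run each Q6 query file in a directory against one DB.
# Usage: run_q6_files <mode> <query_dir> <num_queries> <db_dir>
# Q6_BIN is a hypothetical override for the binary path (useful for dry runs).
run_q6_files() {
  for qf in "$2"/query_q6*; do
    [ -f "$qf" ] || continue            # skip if the glob matched nothing
    echo "== $qf"
    "${Q6_BIN:-./tpch_bench_query_q6}" -m "$1" -f "$qf" -q "$3" -d "$4" || return 1
  done
}
```

From the build directory, `run_q6_files 10 ../impl/benchmarks/tpch/tpch_query 5 <DB_DIR>` would run each file in turn and stop at the first failure.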