Skip to content

peacewalker122/cere-db

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

30 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

wasm-kv

Rust

An LSM-tree based key-value store built from scratch in Rust. Built as a learning project to deeply understand how storage engines work — from WAL to compaction — with a future path toward WASM compatibility.

Overview

wasm-kv is a single-node, single-writer embedded key-value store that implements the core components of a Log-Structured Merge Tree (LSM-tree):

  • Write path: Writes go through a Write-Ahead Log for durability, then into an in-memory SkipMap, which flushes to sorted on-disk SSTables when a size threshold is reached.
  • Read path: Reads check the in-memory MemTable first, then search through leveled SSTables using bloom filters and sparse indexes to minimize disk I/O.

Architecture

                          WRITE PATH
                          ─────────
    put(key, value)
          │
          ▼
    ┌───────────┐     Append record with
    │    WAL    │◄─── CRC32 checksum for
    │ (durability)│    crash recovery
    └─────┬─────┘
          │
          ▼
    ┌───────────┐     crossbeam SkipMap
    │ MemTable  │◄─── (sorted, concurrent)
    └─────┬─────┘
          │  threshold reached (400KB)
          ▼
    ┌───────────┐     Flush via crossbeam
    │  Flush    │◄─── channel to background
    │ (async)   │     watcher thread
    └─────┬─────┘
          │
          ▼
    ┌───────────────────────────────────┐
    │           SSTable File            │
    │  ┌───────┬───────┬───────┐       │
    │  │Block 0│Block 1│Block N│ 4KB   │
    │  └───────┴───────┴───────┘       │
    │  ┌─────────────────────┐         │
    │  │    Sparse Index     │         │
    │  └─────────────────────┘         │
    │  ┌─────────────────────┐         │
    │  │    Bloom Filter     │         │
    │  └─────────────────────┘         │
    │  ┌─────────────────────┐         │
    │  │      Footer         │         │
    │  └─────────────────────┘         │
    └───────────────────────────────────┘
          │
          ▼
    ┌───────────┐     Tracks SSTable files
    │ Manifest  │◄─── per level
    └───────────┘


                          READ PATH
                          ─────────
    get(key)
      │
      ▼
    MemTable ──found──▶ return value
      │ miss
      ▼
    Bloom Filter ──negative──▶ skip file
      │ maybe
      ▼
    Sparse Index ──locate──▶ target block
      │
      ▼
    Block scan ──found──▶ return value
      │ miss
      ▼
    Next SSTable... ──▶ repeat per level

Features

  • Write-Ahead Log (WAL) — Append-only log with CRC32 checksums, WAL rotation on flush, and archival for recovery
  • MemTable — Lock-free concurrent SkipMap (crossbeam-skiplist) with sorted key ordering
  • SSTable — Immutable on-disk sorted tables with:
    • Fixed-size 4KB blocks
    • Sparse index for block-level binary search
    • Bloom filters for fast negative lookups
    • Footer with offset metadata and checksums
  • Leveled Storage — Multi-level SSTable organization (L0, L1, L2...) with manifest tracking
  • Concurrent Flush — Channel-based background flush watcher thread decoupled from the write path
  • Tombstone Deletes — Logical deletes via RecordType::Delete markers, cleaned up during compaction
  • Crash Recovery — WAL replay on startup to rebuild MemTable state
  • CLI — Configurable via clap with log level and data directory options

Project Structure

src/
├── main.rs                 # CLI entry point
├── lib.rs                  # Public crate API
├── config.rs               # CLI config (clap)
├── error.rs                # Error types (thiserror)
├── api/
│   ├── mod.rs
│   └── api.rs              # KVEngine trait definition
└── storage/
    ├── mod.rs
    ├── kv.rs               # PersistentKV — core engine implementation
    ├── wal.rs               # WAL header, record encoding/decoding
    ├── log.rs               # Record format, WAL I/O, SSTable search
    ├── record.rs            # RecordType (Put/Delete)
    ├── block.rs             # BlockBuilder — 4KB block encoding
    ├── bloom.rs             # Bloom filter implementation
    ├── sstable.rs           # SSTable encode/decode, flush, k-way merge
    ├── manifest.rs          # Manifest file tracking (level → files)
    ├── levelstore.rs        # Level store data structure
    ├── constant.rs          # Thresholds and magic numbers
    ├── signal.rs            # FlushSignal struct
    ├── skiplist.rs          # SkipList utilities
    └── watcher.rs           # Background flush watcher thread

Getting Started

Prerequisites

  • Rust (edition 2024)

Build

cargo build

Run

# Default (info logging)
cargo run

# With debug logging
cargo run -- --verbose

# Custom log level and data directory
cargo run -- --log-level trace --data-dir ./mydata

Test

cargo test --verbose

Heap Profiling

cargo run --features dhat-heap

API

The core interface is the KVEngine trait:

pub trait KVEngine {
    fn get(&self, key: &[u8]) -> Result<Option<Cow<'_, Vec<u8>>>, DBError>;
    fn put(&mut self, key: Vec<u8>, value: Vec<u8>) -> Result<(), DBError>;
    fn delete(&mut self, key: &[u8]);
}

Usage

use wasm_kv::PersistentKV;
use wasm_kv::api::api::KVEngine;

let mut kv = PersistentKV::new();

// Put
kv.put(b"hello".to_vec(), b"world".to_vec()).unwrap();

// Get
let value = kv.get(b"hello").unwrap();
assert_eq!(value.unwrap().as_slice(), b"world");

// Delete
kv.delete(b"hello");
assert!(kv.get(b"hello").unwrap().is_none());

Storage Format

WAL Record

Each WAL entry is appended with the following binary layout:

┌──────────┬───────────┬──────┬──────┬─────────────┬──────────┐
│ key_len  │ value_len │ key  │value │ record_type │ checksum │
│ (4 bytes)│ (4 bytes) │ (var)│(var) │  (1 byte)   │(4 bytes) │
└──────────┴───────────┴──────┴──────┴─────────────┴──────────┘

The WAL file begins with a 32-byte header containing a magic number (WALMGIC\0), version, and checkpoint metadata.

SSTable Layout

Each SSTable file is structured as:

┌─────────────────────────────────────────┐
│  Data Blocks (N x 4KB blocks)           │
│  ┌─────────────────────────────────┐    │
│  │ Record: key_len|val_len|key|val │    │
│  │ Record: ...                     │    │
│  └─────────────────────────────────┘    │
├─────────────────────────────────────────┤
│  Sparse Index                           │
│  ┌─────────────────────────────────┐    │
│  │ first_key | block_offset | ...  │    │
│  └─────────────────────────────────┘    │
├─────────────────────────────────────────┤
│  Bloom Filter (serialized bit vector)   │
├─────────────────────────────────────────┤
│  Footer                                 │
│  ┌─────────────────────────────────┐    │
│  │ data_block_start/end            │    │
│  │ index_block_start/end           │    │
│  │ bloom_block_start/end           │    │
│  │ index_checksum | bloom_checksum │    │
│  └─────────────────────────────────┘    │
└─────────────────────────────────────────┘

Read flow: Footer is read first (fixed size at end of file) to locate the sparse index and bloom filter offsets. The bloom filter provides fast negative lookups, the sparse index locates the target block, and then the block is scanned linearly.

Learning Resources

This project follows a phased learning roadmap (see PLAN.md) covering:

  • Binary encoding and on-disk record formats
  • Write-Ahead Logging and crash recovery
  • LSM-tree architecture (MemTable, SSTable, compaction)
  • Bloom filters and sparse indexing
  • Concurrency patterns (channels, RwLock, atomic operations)

Further Reading

About

Simple implementation of LSM Tree database in Rust

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages