An LSM-tree based key-value store built from scratch in Rust. Built as a learning project to deeply understand how storage engines work — from WAL to compaction — with a future path toward WASM compatibility.
wasm-kv is a single-node, single-writer embedded key-value store that implements the core components of a Log-Structured Merge Tree (LSM-tree):
- Write path: Writes go through a Write-Ahead Log for durability, then into an in-memory SkipMap, which flushes to sorted on-disk SSTables when a size threshold is reached.
- Read path: Reads check the in-memory MemTable first, then search through leveled SSTables using bloom filters and sparse indexes to minimize disk I/O.
WRITE PATH
─────────
put(key, value)
│
▼
┌───────────┐ Append record with
│ WAL │◄─── CRC32 checksum for
│ (durability)│ crash recovery
└─────┬─────┘
│
▼
┌───────────┐ crossbeam SkipMap
│ MemTable │◄─── (sorted, concurrent)
└─────┬─────┘
│ threshold reached (400KB)
▼
┌───────────┐ Flush via crossbeam
│ Flush │◄─── channel to background
│ (async) │ watcher thread
└─────┬─────┘
│
▼
┌───────────────────────────────────┐
│ SSTable File │
│ ┌───────┬───────┬───────┐ │
│ │Block 0│Block 1│Block N│ 4KB │
│ └───────┴───────┴───────┘ │
│ ┌─────────────────────┐ │
│ │ Sparse Index │ │
│ └─────────────────────┘ │
│ ┌─────────────────────┐ │
│ │ Bloom Filter │ │
│ └─────────────────────┘ │
│ ┌─────────────────────┐ │
│ │ Footer │ │
│ └─────────────────────┘ │
└───────────────────────────────────┘
│
▼
┌───────────┐ Tracks SSTable files
│ Manifest │◄─── per level
└───────────┘
READ PATH
─────────
get(key)
│
▼
MemTable ──found──▶ return value
│ miss
▼
Bloom Filter ──negative──▶ skip file
│ maybe
▼
Sparse Index ──locate──▶ target block
│
▼
Block scan ──found──▶ return value
│ miss
▼
Next SSTable... ──▶ repeat per level
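The cascade above can be sketched in Rust. This is an illustrative standalone version only — the real lookup lives in `storage/kv.rs` and goes through bloom filters and sparse indexes on disk rather than an in-memory map:

```rust
// Illustrative read-path cascade: MemTable first, then each SSTable per
// level, newest first. A BTreeMap stands in for the sorted structures;
// `None` values model tombstones. Names are hypothetical, not wasm-kv's API.
use std::collections::BTreeMap;

struct SsTable {
    records: BTreeMap<Vec<u8>, Option<Vec<u8>>>, // None = tombstone
}

impl SsTable {
    fn get(&self, key: &[u8]) -> Option<Option<Vec<u8>>> {
        self.records.get(key).cloned()
    }
}

fn get(
    memtable: &BTreeMap<Vec<u8>, Option<Vec<u8>>>,
    levels: &[Vec<SsTable>],
    key: &[u8],
) -> Option<Vec<u8>> {
    // 1. MemTable hit: a tombstone here means "deleted", stop searching.
    if let Some(entry) = memtable.get(key) {
        return entry.clone();
    }
    // 2. Fall through the levels; the first table containing the key wins.
    for level in levels {
        for table in level {
            if let Some(entry) = table.get(key) {
                return entry;
            }
        }
    }
    None
}

fn main() {
    let mut mem = BTreeMap::new();
    mem.insert(b"hello".to_vec(), Some(b"world".to_vec()));
    let levels: Vec<Vec<SsTable>> = Vec::new();
    assert_eq!(get(&mem, &levels, b"hello"), Some(b"world".to_vec()));
    assert_eq!(get(&mem, &levels, b"missing"), None);
    println!("read path ok");
}
```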
- Write-Ahead Log (WAL) — Append-only log with CRC32 checksums, WAL rotation on flush, and archival for recovery
- MemTable — Lock-free concurrent SkipMap (crossbeam-skiplist) with sorted key ordering
- SSTable — Immutable on-disk sorted tables with:
  - Fixed-size 4KB blocks
  - Sparse index for block-level binary search
  - Bloom filters for fast negative lookups
  - Footer with offset metadata and checksums
- Leveled Storage — Multi-level SSTable organization (L0, L1, L2...) with manifest tracking
- Concurrent Flush — Channel-based background flush watcher thread decoupled from the write path
- Tombstone Deletes — Logical deletes via RecordType::Delete markers, cleaned up during compaction
- Crash Recovery — WAL replay on startup to rebuild MemTable state
- CLI — Configurable via clap with log level and data directory options
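As an illustration of the bloom-filter step, here is a minimal standalone sketch — not the `bloom.rs` implementation — using double hashing over two FNV-1a hashes:

```rust
// Minimal Bloom filter sketch: k bit probes per key, derived from two
// FNV-1a hashes via double hashing. `false` is definitive ("not present");
// `true` means "maybe present", so the read path only skips on `false`.
struct BloomFilter {
    bits: Vec<bool>,
    k: usize, // number of probes per key
}

fn fnv1a(data: &[u8], seed: u64) -> u64 {
    let mut h = 0xcbf29ce484222325u64 ^ seed;
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

impl BloomFilter {
    fn new(m: usize, k: usize) -> Self {
        BloomFilter { bits: vec![false; m], k }
    }

    fn probes(&self, key: &[u8]) -> impl Iterator<Item = usize> {
        let h1 = fnv1a(key, 0);
        let h2 = fnv1a(key, 0x9e3779b97f4a7c15);
        let m = self.bits.len() as u64;
        (0..self.k as u64).map(move |i| (h1.wrapping_add(i.wrapping_mul(h2)) % m) as usize)
    }

    fn insert(&mut self, key: &[u8]) {
        let idxs: Vec<usize> = self.probes(key).collect();
        for i in idxs {
            self.bits[i] = true;
        }
    }

    fn may_contain(&self, key: &[u8]) -> bool {
        self.probes(key).all(|i| self.bits[i])
    }
}

fn main() {
    let mut bf = BloomFilter::new(1024, 4);
    bf.insert(b"hello");
    assert!(bf.may_contain(b"hello")); // no false negatives, ever
    println!("absent key maybe-present: {}", bf.may_contain(b"nope"));
}
```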
src/
├── main.rs # CLI entry point
├── lib.rs # Public crate API
├── config.rs # CLI config (clap)
├── error.rs # Error types (thiserror)
├── api/
│ ├── mod.rs
│ └── api.rs # KVEngine trait definition
└── storage/
├── mod.rs
├── kv.rs # PersistentKV — core engine implementation
├── wal.rs # WAL header, record encoding/decoding
├── log.rs # Record format, WAL I/O, SSTable search
├── record.rs # RecordType (Put/Delete)
├── block.rs # BlockBuilder — 4KB block encoding
├── bloom.rs # Bloom filter implementation
├── sstable.rs # SSTable encode/decode, flush, k-way merge
├── manifest.rs # Manifest file tracking (level → files)
├── levelstore.rs # Level store data structure
├── constant.rs # Thresholds and magic numbers
├── signal.rs # FlushSignal struct
├── skiplist.rs # SkipList utilities
└── watcher.rs # Background flush watcher thread
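The k-way merge noted for `sstable.rs` can be sketched with a min-heap over sorted runs. This standalone version omits wasm-kv's tie-breaking between levels and tombstone cleanup:

```rust
// Sketch of a k-way merge over sorted runs: a min-heap of (key, value)
// heads pops the globally smallest key next. In a real compaction, when
// the same key appears in several runs the newest run's record wins and
// tombstones are dropped; both are omitted here for brevity.
use std::cmp::Reverse;
use std::collections::BinaryHeap;

fn merge_sorted(runs: Vec<Vec<(Vec<u8>, Vec<u8>)>>) -> Vec<(Vec<u8>, Vec<u8>)> {
    let mut heap = BinaryHeap::new();
    let mut iters: Vec<_> = runs.into_iter().map(|r| r.into_iter()).collect();
    // Seed the heap with the head of every run.
    for (i, it) in iters.iter_mut().enumerate() {
        if let Some(kv) = it.next() {
            heap.push(Reverse((kv, i)));
        }
    }
    let mut out = Vec::new();
    // Pop the smallest head, then refill from the run it came from.
    while let Some(Reverse((kv, i))) = heap.pop() {
        out.push(kv);
        if let Some(next) = iters[i].next() {
            heap.push(Reverse((next, i)));
        }
    }
    out
}

fn main() {
    let runs = vec![
        vec![(b"a".to_vec(), b"1".to_vec()), (b"c".to_vec(), b"3".to_vec())],
        vec![(b"b".to_vec(), b"2".to_vec())],
    ];
    let merged = merge_sorted(runs);
    let keys: Vec<_> = merged.iter().map(|(k, _)| k.clone()).collect();
    assert_eq!(keys, vec![b"a".to_vec(), b"b".to_vec(), b"c".to_vec()]);
    println!("merge ok");
}
```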
- Rust (edition 2024)
cargo build

# Default (info logging)
cargo run

# With debug logging
cargo run -- --verbose

# Custom log level and data directory
cargo run -- --log-level trace --data-dir ./mydata

# Run tests
cargo test --verbose

# Heap profiling (dhat)
cargo run --features dhat-heap

The core interface is the KVEngine trait:
pub trait KVEngine {
fn get(&self, key: &[u8]) -> Result<Option<Cow<'_, Vec<u8>>>, DBError>;
fn put(&mut self, key: Vec<u8>, value: Vec<u8>) -> Result<(), DBError>;
fn delete(&mut self, key: &[u8]);
}

use wasm_kv::PersistentKV;
use wasm_kv::api::api::KVEngine;
let mut kv = PersistentKV::new();
// Put
kv.put(b"hello".to_vec(), b"world".to_vec()).unwrap();
// Get
let value = kv.get(b"hello").unwrap();
assert_eq!(value.unwrap().as_slice(), b"world");
// Delete
kv.delete(b"hello");
assert!(kv.get(b"hello").unwrap().is_none());

Each WAL entry is appended with the following binary layout:
┌──────────┬───────────┬──────┬──────┬─────────────┬──────────┐
│ key_len │ value_len │ key │value │ record_type │ checksum │
│ (4 bytes)│ (4 bytes) │ (var)│(var) │ (1 byte) │(4 bytes) │
└──────────┴───────────┴──────┴──────┴─────────────┴──────────┘
The WAL file begins with a 32-byte header containing a magic number (WALMGIC\0), version, and checkpoint metadata.
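The record layout above can be sketched as follows. The field order follows the diagram; little-endian lengths, the checksum's exact coverage, and the `record_type` values are assumptions for illustration, not necessarily what wasm-kv writes on disk:

```rust
// Sketch of the WAL record layout: key_len | value_len | key | value |
// record_type | checksum. Endianness and checksum coverage are assumed.
fn crc32(data: &[u8]) -> u32 {
    // Bitwise CRC-32 (IEEE polynomial, reflected): slow but dependency-free.
    let mut crc = 0xFFFF_FFFFu32;
    for &b in data {
        crc ^= b as u32;
        for _ in 0..8 {
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

fn encode_record(key: &[u8], value: &[u8], record_type: u8) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.extend_from_slice(&(key.len() as u32).to_le_bytes());
    buf.extend_from_slice(&(value.len() as u32).to_le_bytes());
    buf.extend_from_slice(key);
    buf.extend_from_slice(value);
    buf.push(record_type);
    // Checksum over everything before it, so replay can detect torn writes.
    let sum = crc32(&buf);
    buf.extend_from_slice(&sum.to_le_bytes());
    buf
}

fn main() {
    let rec = encode_record(b"hello", b"world", 0); // 0 = Put (assumed)
    // 4 + 4 + 5 + 5 + 1 + 4 bytes
    assert_eq!(rec.len(), 23);
    println!("record len = {}", rec.len());
}
```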
Each SSTable file is structured as:
┌─────────────────────────────────────────┐
│ Data Blocks (N x 4KB blocks) │
│ ┌─────────────────────────────────┐ │
│ │ Record: key_len|val_len|key|val │ │
│ │ Record: ... │ │
│ └─────────────────────────────────┘ │
├─────────────────────────────────────────┤
│ Sparse Index │
│ ┌─────────────────────────────────┐ │
│ │ first_key | block_offset | ... │ │
│ └─────────────────────────────────┘ │
├─────────────────────────────────────────┤
│ Bloom Filter (serialized bit vector) │
├─────────────────────────────────────────┤
│ Footer │
│ ┌─────────────────────────────────┐ │
│ │ data_block_start/end │ │
│ │ index_block_start/end │ │
│ │ bloom_block_start/end │ │
│ │ index_checksum | bloom_checksum │ │
│ └─────────────────────────────────┘ │
└─────────────────────────────────────────┘
Read flow: the fixed-size footer at the end of the file is read first to locate the sparse index and bloom filter. The bloom filter provides fast negative lookups, the sparse index narrows the search to a single candidate block, and that block is then scanned linearly.
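The sparse-index step amounts to a binary search for the last index entry whose first key is less than or equal to the lookup key. A standalone sketch (illustrative names, not the on-disk format):

```rust
// Given the (first_key, block_offset) pairs from a decoded sparse index,
// find the single block that could contain `key`.
fn locate_block(index: &[(Vec<u8>, u64)], key: &[u8]) -> Option<u64> {
    // partition_point counts entries with first_key <= key (binary search);
    // the candidate block is the last of those.
    let n = index.partition_point(|(first_key, _)| first_key.as_slice() <= key);
    if n == 0 {
        None // key sorts before the first block's first key: not in this file
    } else {
        Some(index[n - 1].1)
    }
}

fn main() {
    let index = vec![
        (b"apple".to_vec(), 0u64),
        (b"mango".to_vec(), 4096),
        (b"pear".to_vec(), 8192),
    ];
    assert_eq!(locate_block(&index, b"banana"), Some(0)); // inside block 0
    assert_eq!(locate_block(&index, b"mango"), Some(4096));
    assert_eq!(locate_block(&index, b"zebra"), Some(8192));
    assert_eq!(locate_block(&index, b"aaa"), None);
    println!("sparse index ok");
}
```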
This project follows a phased learning roadmap (see PLAN.md) covering:
- Binary encoding and on-disk record formats
- Write-Ahead Logging and crash recovery
- LSM-tree architecture (MemTable, SSTable, compaction)
- Bloom filters and sparse indexing
- Concurrency patterns (channels, RwLock, atomic operations)
- LevelDB Implementation Notes — Google's original LSM-tree implementation
- RocksDB Wiki — Production-grade LSM with advanced compaction strategies
- WiscKey: Separating Keys from Values — Key-value separation to reduce write amplification
- Database Internals by Alex Petrov — Comprehensive guide to storage engine design