A robust, single-node key-value store built in Rust. It utilizes an append-only log for data persistence and an in-memory hash map for fast retrieval. This project serves as an educational implementation of storage engine basics, demonstrating crash safety and file handling.
- Persistent Storage: All data is saved to disk in
data/kv.log. - Fast Reads: An in-memory
HashMapindex maps keys directly to their file offsets, ensuring O(1) lookup time (excluding disk I/O). - Crash Recovery: On startup, the store replays the log to rebuild the index, ensuring data survives restarts.
- CLI Interface: Simple command-line tools to interact with the database.
- Append-Only Design: Writes are always appended to the end of the file, optimizing for write performance and simplifying data integrity.
- Rust & Cargo installed.
To compile the project:
cargo build --releaseRun the binary via cargo run (for development) or the built executable.
Stores a value associated with a key. If the key exists, it updates the index to point to the new entry (old entries remain in the log but are ignored).
cargo run -- set <KEY> <VALUE>
# Example
cargo run -- set user:101 "Alice Wonderland"Retrieves the value for a given key.
cargo run -- get <KEY>
# Example
cargo run -- get user:101
# Output: Alice WonderlandRemoves a key from the store. This appends a "tombstone" record to the log, effectively hiding the key from future lookups.
cargo run -- rm <KEY>
# Example
cargo run -- rm user:101src/main.rs: The entry point. It usesclapto parse command-line arguments and dispatches commands to theKvStore.src/store.rs: The core storage engine.- Manages the
KvStorestruct. - Handles file I/O (reading/writing to
data/kv.log). - Maintains the in-memory
index.
- Manages the
src/utils.rs: A module for utility functions and future extensions (currently a placeholder).
The storage file is a continuous sequence of binary records. Each record has the following format:
| Field | Type | Description |
|---|---|---|
| Key Length | u32 |
Length of the key in bytes (Little Endian). |
| Value Length | u32 |
Length of the value in bytes (Little Endian). |
| Key | [u8] |
The key data (UTF-8 string). |
| Value | [u8] |
The value data (UTF-8 string). |
set(key, value):- Seeks to the end of the file.
- Writes the header (
key_len,val_len) and then the bodies (key,value). - Updates the in-memory
indexwith thekeyand the start offset of this new record.
get(key):- Looks up the
keyin the in-memoryindexto find the file offset. - Seeks to that offset in the file.
- Reads the lengths and then reads the value directly (skipping the key for performance).
- Looks up the
load_index()(Startup):- Reads the entire log file from beginning to end.
- Parses every record's key and length.
- Populates the
indexmap. - This ensures the in-memory state matches the persistent log.
The project relies on a few key crates to ensure robustness and code clarity:
anyhow: Provides flexible and easy-to-use error handling. It allows us to attach context to errors (e.g., "Failed to open log file") for better debugging.clap: A powerful command-line argument parser. We use it with thederivefeature to define our CLI structure (Set,Get) directly from Rust structs and enums.byteorder: Helpers for reading/writing primitive numbers (likeu32) in a specific endianness (Little Endian), ensuring portability of the data file across different systems.
- Educational: Perfect for understanding how databases like Bitcask or log-structured merge-trees (LSM) start.
- Simple: Minimal codebase makes it easy to read and modify.
- Safe: Rust's type system and explicit error handling prevent common crashes and data races.