High-performance Python implementation of Sparse Merkle Trees with Poseidon hash and SQLite persistence. Features ZK-compatible cryptography for seamless integration with zero-knowledge proof systems.
pyseidon-smt provides a production-ready sparse Merkle tree implementation suitable for cryptographic applications, data integrity verification, and efficient set membership proofs with associated values. Uses the Poseidon hash function over BN254, ensuring compatibility with zero-knowledge proof systems like Noir and RISC Zero. Features both in-memory and persistent SQLite storage with LRU caching. Each key can store an associated 32-byte value, making it ideal for status tracking, revocation lists, and other applications requiring key-value storage with cryptographic proofs.
- Sparse Merkle tree with efficient storage and retrieval
- Key-value storage where each key maps to a 32-byte value
- Cryptographically secure using Poseidon hash function over BN254
- Dual storage backends: In-memory (fast) and SQLite (persistent)
- LRU caching for optimal SQLite performance
- Thread-safe operations with concurrent access support
- Merkle proof generation for membership verification (16-level and full proofs)
- PyR0 integration with canonical 514‑byte
MerklePath16andwrite_to()interface .hexconvenience view for0x‑prefixed keys and hex roots- ZK-compatible for seamless integration with zero-knowledge proof systems
- High-performance Rust core with Python bindings via PyO3
Tree depth. Keys are 32 bytes ⇒ a theoretical depth of 256 (1 bit per level).
Path16 window. For circuit efficiency we expose a fixed 16-level window closest to the leaf.
This "Path16" is a projection of the full path: it contains the 16 sibling hashes starting at the leaf.
Ordering & bits.
siblings[0]is adjacent to the leaf;siblings[15]is the sibling 16 levels above the leaf (leaf-first indexing).dir_bitspacks directions leaf-first: bit 0 (LSB) corresponds tosiblings[0], bit 15 tosiblings[15].
Convention: 0 = LEFT, 1 = RIGHT.
Thread Safety: All tree operations are thread-safe within a single process. SQLite mode supports concurrent reads; writes are serialized. Cross-process safety depends on SQLite's file locking. For better concurrent read performance, the library automatically enables WAL mode (PRAGMA journal_mode=WAL) on first open.
| Item | Type/Size | Order / Endianness |
|---|---|---|
MerklePath16.dir_bits |
u16 (Python int) |
Little‑endian in to_raw() |
MerklePath16.siblings |
16 × bytes32 |
Leaf‑first; siblings[0] next to leaf |
MerklePath16.to_raw() |
514 bytes | u16 (LE) + 16×bytes32 (leaf‑first) |
MerkleProof.bitmaps |
bytes (variable length) |
Bit 0 (LSB) = lowest proof level |
poseidon_hash([...]) |
bytes32 |
Digest of hex inputs |
poseidon_hash_bytes([...]) |
bytes32 |
Digest of byte inputs |
poseidon_hash_limbs(a,b) |
tuple[int,int,int,int] |
4 little‑endian limbs |
Once published to PyPI:
pip install pyseidon-smt- Rust toolchain (latest stable)
- Python 3.11+
- uv package manager
# 1. Clone and navigate to project
git clone https://github.com/garyrob/pyseidon-smt.git
cd pyseidon-smt
# 2. Sync project dependencies
uv sync
# 3. Build and install in development mode (faster iteration)
uv tool run maturin develop
# OR: Build release version and install locally
uv tool run maturin build --release
uv pip install target/wheels/pyseidon_smt-0.2.0-*.whl# Run all tests with verbose output
uv run pytest tests/ -v
# Run specific test categories
uv run pytest tests/test_merkle_tree.py::TestSQLiteSpecific -vThe SparseMerkleTree class provides a strict, bytes-only core API:
import pyseidon_smt
# Create tree (bytes-only core API)
tree = pyseidon_smt.SparseMerkleTree()
# Or create persistent SQLite tree (db_path accepts str | os.PathLike[str])
tree = pyseidon_smt.SparseMerkleTree(db_path="merkle.db", cache_capacity=1000)
# Insert keys with values (bytes only - exactly 32 bytes each)
key1 = b'\x00' * 31 + b'\x01'
key2 = b'\x00' * 31 + b'\x02'
value1 = b'\x00' * 31 + b'\x01' # e.g., status: active
value2 = b'\x00' * 31 + b'\x02' # e.g., status: revoked
tree.insert(key1, value1)
tree.insert(key2, value2)
# Or insert with default value (0x00...01) for backward compatibility
tree.insert(key1) # value defaults to b'\x00' * 31 + b'\x01'
# Check if keys exist (multiple ways)
assert key1 in tree
assert tree.exists(key1)
# Retrieve values
value = tree.get(key1) # Returns bytes or None if not found
assert value == value1
# Get current root as bytes
root_bytes = tree.root_bytes() # 32 bytes
print(f"Merkle root: {root_bytes.hex()}")
# Batch insert for efficiency
keys = [i.to_bytes(32, 'big') for i in range(1, 100)]
tree.batch_insert(keys)
# Context manager support
with pyseidon_smt.SparseMerkleTree(db_path="data.db") as tree:
tree.insert(b'\x00' * 31 + b'\x01')
# Automatic cleanup on exitFor ergonomic hex string operations, use the .hex view:
# Hex string operations via .hex view
key_hex = "0x0000000000000000000000000000000000000000000000000000000000000001"
value_hex = "0x0000000000000000000000000000000000000000000000000000000000000001"
tree.hex.insert(key_hex, value_hex)
# Or with default value for backward compatibility
tree.hex.insert(key_hex) # value defaults to 0x00...01
# Check if hex keys exist
assert key_hex in tree.hex
assert tree.hex.exists(key_hex)
# Retrieve values as hex
value = tree.hex.get(key_hex) # Returns hex string or None if not found
assert value == value_hex
# Get root as hex string
root_hex = tree.hex.root() # "0x..."
print(f"Merkle root: {root_hex}")
# Batch insert hex strings
hex_keys = [f"0x{i:064x}" for i in range(1, 100)]
tree.hex.batch_insert(hex_keys)Keys and Values: Both must be 32-byte values. Hex strings must be
0x-prefixed, 64 hex chars (case-insensitive). Invalid length/format raisesValueError.Values: If not provided, defaults to
0x00...01for backward compatibility. Values are stored and persisted (including in SQLite).Note: Deletion is not exposed (sparse Merkle trees are designed as append-only structures).
# Generate 16-level Merkle path (leaf-first window)
# Using bytes
key = b'\x00' * 31 + b'\x01'
path_bytes = tree.path16(key)
print(f"dir_bits (u16, LSB=leaf): {path_bytes.dir_bits}")
print(f"#siblings: {len(path_bytes.siblings)} # always 16")
# Or using hex strings via .hex view
hex_key = "0x" + "00"*31 + "01"
path_hex = tree.hex.path16(hex_key)
# Full Merkle proof (typed)
proof_bytes = tree.proof(key) # bytes API
proof_hex = tree.hex.proof(hex_key) # hex API
print(len(proof_bytes.bitmaps), len(proof_hex.bitmaps))
# Note: proof.bitmaps is a byte string; bit 0 (LSB) corresponds to the lowest level in the proof
# Serialization helpers
# Path16 serialization (fixed 514 bytes): u16 (LE) + 16×bytes32 (leaf-first)
raw_path = path_bytes.to_raw()
path2 = pyseidon_smt.MerklePath16.from_raw(raw_path)
# Full proof serialization (variable length; see docstring)
raw_proof = proof_bytes.to_raw()
proof2 = pyseidon_smt.MerkleProof.from_raw(raw_proof)# Create persistent tree and add data with values
tree1 = pyseidon_smt.SparseMerkleTree(db_path="data.db", cache_capacity=500)
key = "0x0000000000000000000000000000000000000000000000000000000000000001"
value = "0x0000000000000000000000000000000000000000000000000000000000000002"
tree1.hex.insert(key, value)
root1 = tree1.hex.root()
# Later: restore from same database
tree2 = pyseidon_smt.SparseMerkleTree(db_path="data.db", cache_capacity=500)
root2 = tree2.hex.root()
assert root1 == root2 # Data persisted automatically
assert tree2.hex.get(key) == value # Values also persistedThe core SparseMerkleTree API is designed for zero-knowledge proof integration:
from pyseidon_smt import SparseMerkleTree, MerklePath16, LookupResult16
# Create tree (bytes-only core API)
tree = SparseMerkleTree() # In-memory
# tree = SparseMerkleTree(db_path="data.db", cache_capacity=1000) # Persistent
# All keys and values must be exactly 32 bytes
key = b'\x00' * 31 + b'\x01'
value = b'\x00' * 31 + b'\x01' # e.g., account status
# Insert key-value pair (no return value)
tree.insert(key, value)
# Or insert with default value
tree.insert(key) # value defaults to b'\x00' * 31 + b'\x01'
# Check if key exists (multiple ways)
assert key in tree # __contains__ protocol
assert tree.exists(key) # explicit method
# Retrieve value
value = tree.get(key) # Returns bytes or None if not found
# Get Merkle path only
path: MerklePath16 = tree.path16(key)
print(f"Direction bits: {path.dir_bits}") # u16 integer
print(f"Siblings: {len(path.siblings)}") # 16 x 32-byte hashes
# Complete lookup with found flag
result: LookupResult16 = tree.lookup(key)
print(f"Found: {result.found}") # bool
print(f"Path: {result.path}") # MerklePath16 (always present)
# Get root as bytes
root = tree.root_bytes() # 32 bytes
# PyR0 integration example
try:
import pyr0
# Produce canonical input for guest
ib = pyr0.InputBuilder()
result.path.write_to(ib) # Writes u16 + 16×bytes32
inp = ib.build()
# Proceed with proof generation
image_id = ... # set to your compiled guest image ID
receipt = pyr0.prove(image_id, inp)
except ImportError:
pass # PyR0 not availableGuest reader (Rust) for consuming the merkle path in RISC Zero zkVM:
// guest/src/merkle.rs
#![allow(dead_code)]
use risc0_zkvm::guest::env;
pub const DEPTH: usize = 16;
pub const BYTES_PER_HASH: usize = 32;
pub const RAW_SIZE: usize = 2 + DEPTH * BYTES_PER_HASH; // 514 bytes
pub struct MerklePath16 {
pub dir_bits: u16,
pub siblings: [[u8; BYTES_PER_HASH]; DEPTH],
}
impl MerklePath16 {
pub fn read_from_env() -> Self {
let mut dir = [0u8; 2];
env::read_slice(&mut dir);
let dir_bits = u16::from_le_bytes(dir);
let mut siblings = [[0u8; BYTES_PER_HASH]; DEPTH];
for i in 0..DEPTH {
env::read_slice(&mut siblings[i]);
}
// Validation checks
assert_eq!(DEPTH, 16, "DEPTH changed unexpectedly");
for (i, s) in siblings.iter().enumerate() {
assert_eq!(s.len(), BYTES_PER_HASH, "sibling[{i}] wrong length");
}
Self { dir_bits, siblings }
}
}Guest main usage:
// guest/src/main.rs
mod merkle;
use merkle::MerklePath16;
fn main() {
let path = MerklePath16::read_from_env();
// Verify against provided root using your Poseidon hash implementation
// Direction bits: bit i corresponds to siblings[i] (leaf-first, i=0 at leaf)
// Sibling order: siblings[0] at leaf level; siblings[15] 16 levels above the leaf
// ... your merkle verification logic here ...
}Additional operations:
# Get raw wire format (exactly 514 bytes)
# Layout: dir_bits as little-endian u16, followed by 16×bytes32 in leaf-first order
raw_bytes = path_bytes.to_raw()
reconstructed = pyseidon_smt.MerklePath16.from_raw(raw_bytes)
# Get limbs for custom circuits
limbs = path_bytes.limbs_le() # List[tuple[int,int,int,int]] for 64-bit limbsimport pyseidon_smt
# From hex → bytes digest
digest = pyseidon_smt.poseidon_hash([
"0x0000000000000000000000000000000000000000000000000000000000000001",
"0x0000000000000000000000000000000000000000000000000000000000000002",
]) # -> 32 bytes
# Strict bytes-only API
digest = pyseidon_smt.poseidon_hash_bytes([
b'\x00'*31 + b'\x01',
b'\x00'*31 + b'\x02',
]) # -> 32 bytes
# Limb-based host Poseidon (pure Python, for parent hashing)
l1 = pyseidon_smt.hex_to_limbs("0x0000000000000000000000000000000000000000000000000000000000000001")
l2 = pyseidon_smt.hex_to_limbs("0x0000000000000000000000000000000000000000000000000000000000000002")
h = pyseidon_smt.poseidon_hash_limbs(l1, l2) # -> tuple[int, int, int, int] (4 limbs)
# Utility functions
hex_bytes = pyseidon_smt.hex_to_bytes("0x123...")
hex_string = pyseidon_smt.bytes_to_hex(b"\x01\x02...")pyseidon-smt provides several utility modules for zero-knowledge proof integration:
from pyseidon_smt import H256, BN254_FR
# Create hash values from different formats (preferred explicit constructors)
h1 = H256.from_hex("0x123...") # From hex string
h2 = H256.from_bytes(b"\x01" * 32) # From bytes
h3 = H256.from_int(12345) # From integer
# Smart constructor (accepts multiple types)
h4 = H256.new("0x123...") # Still available for convenience
# Check if value fits in BN254 field
is_valid = h1.is_canonical_mod_p(BN254_FR) # bool
is_valid_alt = pyseidon_smt.is_canonical_mod_p(h1) # Back-compat function
# Convert to different formats
as_bytes = bytes(h1) # 32 bytes
as_hex = h1.hex # H256.hex is a property returning a 0x-prefixed string (not a method)
as_int = int(h1) # Python intfrom pyseidon_smt import hex_to_limbs, limbs_to_hex, to_noir_limbs, to_risc0_limbs, to_circom_limbs
# Convert hex to 64-bit limbs (little-endian)
hex_val = "0x123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0"
limbs = hex_to_limbs(hex_val) # tuple[int, int, int, int]
# Convert back to hex
hex_back = limbs_to_hex(limbs) # "0x123..."
# Framework-specific limb formats
# These helpers accept 0x-prefixed hex strings and handle limb conversion internally
noir_limbs = to_noir_limbs(hex_val) # For Noir circuits
risc0_limbs = to_risc0_limbs(hex_val) # For RISC Zero
circom_limbs = to_circom_limbs(hex_val) # For Circomfrom pyseidon_smt import pack_dir_bits, unpack_dir_bits
# Pack boolean list into integer (leaf-first; True=RIGHT, False=LEFT)
direction_list = [True, False, True, False]
packed = pack_dir_bits(direction_list) # -> u16
unpacked = unpack_dir_bits(packed, len(direction_list)) # -> [True, False, True, False]from pyseidon_smt import (
merkle_path_16_limbs, merkle_proof_limbs,
noir_witness_for_path, risc0_witness_for_path, circom_witness_for_path
)
# Get tree and generate path
tree = pyseidon_smt.SparseMerkleTree()
tree.insert("0x0000000000000000000000000000000000000000000000000000000000000001")
path = tree.path16("0x0000000000000000000000000000000000000000000000000000000000000001")
# Convert to limb-based witnesses for different frameworks
key = "0x0000000000000000000000000000000000000000000000000000000000000001"
noir_witness, noir_dirs = noir_witness_for_path(tree, key)
risc0_witness, risc0_dirs = risc0_witness_for_path(tree, key)
circom_witness, circom_dirs = circom_witness_for_path(tree, key, bits_per_limb=15)
# Direct limb conversion from MerklePath16
limbs_16 = path.limbs_le() # Already in limb format
# Full proof limb conversion
key = "0x0000000000000000000000000000000000000000000000000000000000000001"
proof_limbs = merkle_proof_limbs(tree, key, limb_bits=64)
path_limbs = merkle_path_16_limbs(tree, key, limb_bits=64)- In-memory mode: ~100K insertions/sec
- SQLite mode: ~10K insertions/sec (with caching)
- Memory usage: ~10MB for LRU cache (configurable)
- Proof generation: <1ms for 16-level proofs
- Thread safety: Full concurrent read/write support
Numbers measured on M1 Mac, 16GB RAM; see tests/test_production_ready.py to reproduce.
pyseidon-smt is suitable for various cryptographic applications including:
General Applications:
- Data integrity verification and audit trails
- Efficient set membership proofs
- Cryptographic accumulators
- Blockchain and distributed systems
Zero-Knowledge Proof Integration:
- PyR0 (RISC Zero): Strict
SparseMerkleTreewithMerklePath16.write_to(builder) - Noir circuits: Use
merkle_path_16_limbs()orMerklePath16.limbs_le() - RISC Zero zkVM: Raw wire format via
MerklePath16.to_raw() - Circom/SnarkJS: Standard Poseidon hash over BN254
- Custom circuits: Flexible limb extraction and witness generation
# Development build (faster compilation, debug symbols)
uv tool run maturin develop
# Release build (optimized for performance)
uv tool run maturin build --release
uv pip install target/wheels/pyseidon_smt-0.2.0-*.whlType hints shipped (py.typed included for PEP 561 compliance).
The test suite includes:
- ✅ Basic tree operations (insert, contains, root)
- ✅ Both in-memory and SQLite storage modes
- ✅ SQLite persistence verification
- ✅ Merkle proof generation and validation
- ✅ Poseidon hash function testing
- ✅ Error handling and edge cases
- ✅ Concurrent access scenarios
- ✅ Production load testing
├── src/
│ ├── lib.rs # Python bindings
│ ├── merkle.rs # Core Merkle tree implementation
│ ├── sqlite_store.rs # SQLite persistence layer
│ └── pyseidon_smt/
│ ├── __init__.py # Main module exports
│ ├── api.py # Tree API (bytes core + hex view)
│ ├── bits.py # Direction bits utilities
│ ├── h256.py # 32-byte hash value type
│ ├── hashing.py # Poseidon hash utilities
│ ├── limbs.py # Field element limb conversion
│ ├── merkle_path.py # MerklePath16 & MerkleProof types
│ ├── witness.py # ZK witness generation
│ └── py.typed # Type annotations marker
├── tests/
│ ├── test_merkle_tree.py # Core tree functionality
│ ├── test_merkle_path_api.py # Path API tests
│ ├── test_rust_integration.py # Rust/Python integration
│ ├── test_production_ready.py # Performance tests
│ ├── test_bits.py # Direction bits tests
│ ├── test_h256.py # Hash value tests
│ ├── test_hashing_host.py # Poseidon hash tests
│ ├── test_limbs_helpers.py # Limb conversion tests
│ ├── test_witness_helpers.py # Witness generation tests
│ └── rust/ # Rust integration tests
├── Cargo.toml # Rust dependencies
└── pyproject.toml # Python project configuration
MIT License - see LICENSE file for details.
This project builds upon the following open source libraries:
- sparse-merkle-tree (v0.6.x) by Nervos Network - MIT License
- Core Merkle tree functionality and storage traits
Special thanks to Nervos Network for the excellent sparse-merkle-tree implementation that serves as the foundation for this project.