Distributed Audit Log with Merkle Trees, gRPC, and Independent Auditor
This project implements an append-only audit log system backed by Merkle Trees, using:
- A PerfectTree structure for efficient incremental hashing
- A gRPC Log Server
- A Writer Client that appends events
- An Auditor Client that verifies inclusion & consistency proofs
It is based on the requirements of II.3502 – Distributed Architecture and Programming – Lab 5.
MerkleTree/
├── auditor/
│ └── auditor.py # Auditor CLI – verify membership & consistency
├── merkle/
│ ├── merkle_tree.py # MerkleTree (RFC6962 hash rules)
│ ├── perfect_tree.py # PerfectTree (perfectly left-full tree)
│ ├── merkle_helper.py # Consistency proof range computation
│ ├── verifier.py # Inclusion & consistency verification
├── mk/
│ ├── merkle.proto # gRPC interface
│ └── merkle_pb2*.py # generated protobuf files
├── server/
│ └── log_server.py # gRPC server maintaining Merkle tree
├── writer/
│ └── writer.py # Writer CLI – append events
├── environment.yml
├── log_state.txt # Append-only log file
└── README.md
conda env create -f environment.yml
conda activate merkle-labpython -m grpc_tools.protoc \
--proto_path=./mk \
--python_out=./mk \
--grpc_python_out=./mk \
--pyi_out=./mk \
./mk/merkle.protoFix imports:
sed -i.bak 's/^import merkle_pb2/from . import merkle_pb2/' mk/merkle_pb2_grpc.py \
&& rm mk/merkle_pb2_grpc.py.bakpython -m server.log_serverpython -m writer.writerpython -m auditor.auditorLeaves:
H = SHA256(0x00 | data)
Internal nodes:
H = SHA256(0x01 | left_child_hash | right_child_hash)
The 0x00 and 0x01 prefixes enforce domain separation and prevent structural collisions.
The implementation uses a data structure called PerfectTree, which is not the same as a perfectly balanced tree. PerfectTree is:
- Also referred to as a left-full Merkle tree
- This is the canonical structure used in early Merkle tree implementations
- It is the basis of Certificate Transparency’s Merkle consistency proofs
- It is not necessarily balanced, but always filled from the left
- Internal nodes exist only when both children exist
- Right subtrees appear only when enough leaves have been added
A perfectly balanced tree requires that the number of leaves is a power of two. Real logs grow one entry at a time, so forcing exact power-of-two sizes is impossible.
In contrast:
Example with 5 leaves:
[1..5]
/ \
[1..4] [5..5]
/ \
[1..2] [3..4]
This layout enables:
- Efficient incremental updates
- Consistency proofs based on prefix subtrees
- O(log n) audit paths
- A clean representation of ranges (e.g., [1..3], [4..4], ...)
PerfectTree ensures:
- Each internal node covers a contiguous range of leaf indices
- The tree decomposition into power-of-two ranges is deterministic
- The consistency proof algorithm (
consistency_proof_ranges) can correctly compute subtrees shared between old and new logs
Without this structure, the verifier could not reconstruct roots purely from subtree hashes.
The server:
- Stores the append-only log in
log_state.txt - Updates the MerkleTree incrementally
- Exposes:
GetRoot()
GetSize()
Append(event)
GetAuditPath(index)
GetConsistencyProof(old_size)
The Writer is the append-only producer:
1. Append event
2. Show log size
3. Show current root
Each append updates:
- The log file
- The Merkle root
The Auditor is independent and does not trust the server.
User provides:
- index
- event
Auditor obtains the audit path and checks:
leaf_hash → ... → root
User provides:
- old_size
- trusted old root (not from the server!)
Auditor gets:
- new_size, new_root (from server)
- consistency proof blocks
Then verifies:
reconstructed_old_root == trusted_old_root
reconstructed_new_root == new_root
If both hold → the log is a valid append-only extension.
This matches the design of Google Certificate Transparency.
log_state.txtpersists tree state- Auditor never stores old roots — user provides trusted checkpoints
- PerfectTree guarantees deterministic Merkle structure
- Consistency proofs follow the same principles as RFC6962