feat: Raft Consensus Implementation #2
Merged
Implement Phase 1 (Common Types Foundation) of Raft consensus feature:
- Add type aliases: NodeId, Term, LogIndex with comprehensive docs
- Define Error enum with thiserror for ergonomic error handling
- Add 32 passing unit tests (100% Phase 1 test coverage)
- Update task tracking with executive summary and progress metrics
Phase 1 Status: 2/2 tasks complete (100%)
Overall Progress: 2/24 tasks (8%)
Test Coverage:
- crates/common/src/types.rs: 10 tests passing
- crates/common/src/errors.rs: 20 tests passing
- All doctests passing
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
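A minimal sketch of what the Phase 1 foundation could look like, using only the standard library. The `u64` width of the aliases, the two `Error` variants, and the `check_term` helper are assumptions made for illustration; the commit only names the aliases and states that the `Error` enum is defined with thiserror. The `Display` impl is written by hand here; deriving `thiserror::Error` with `#[error("...")]` attributes would generate an equivalent one.

```rust
use std::fmt;

// Aliases named in the commit; the u64 representation is an assumption.
pub type NodeId = u64;
pub type Term = u64;
pub type LogIndex = u64;

// Hypothetical variants, for illustration only.
#[derive(Debug, PartialEq)]
pub enum Error {
    NotLeader { leader_id: Option<NodeId> },
    StaleTerm { current: Term, received: Term },
}

// Hand-written Display so the sketch has no external dependencies.
impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::NotLeader { leader_id: Some(id) } => {
                write!(f, "not leader; current leader is node {id}")
            }
            Error::NotLeader { leader_id: None } => write!(f, "not leader; leader unknown"),
            Error::StaleTerm { current, received } => {
                write!(f, "stale term {received}, current term is {current}")
            }
        }
    }
}

impl std::error::Error for Error {}

// Hypothetical helper: reject messages that carry a term older than ours.
pub fn check_term(current: Term, received: Term) -> Result<(), Error> {
    if received < current {
        Err(Error::StaleTerm { current, received })
    } else {
        Ok(())
    }
}
```

Callers can propagate these errors with `?` because the enum implements `std::error::Error`.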
Implement Phase 4 Storage Layer task 1 (mem_storage_skeleton):
- Create MemStorage struct with thread-safe RwLock fields
- Add comprehensive test coverage (13 storage tests + 2 doctests)
- Switch raft-rs to prost-codec to avoid protobuf version conflicts
Implementation Details:
- MemStorage with HardState, ConfState, Vec<Entry>, Snapshot fields
- Thread-safe design (Send + Sync) using RwLock for concurrent access
- new() constructor with Default trait implementation
- Comprehensive documentation with usage examples
Dependencies:
- raft = { version = "0.7", default-features = false, features = ["prost-codec"] }
- tokio = { version = "1", features = ["full"] }
Fixes:
- Fix clippy warnings in common crate (inline format args, assign ops)
- Fix mise lint task (remove --all-features flag causing protobuf conflicts)
Test Results:
- 46 tests passing workspace-wide
- 14/14 raft crate tests passing
- 32/32 common crate tests passing
- No clippy warnings
Progress:
- Phase 4 (Storage Layer): 1/7 tasks complete (14%)
- Overall: 3/24 tasks complete (12.5%)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Implement Phase 4 Storage Layer task 2 (mem_storage_initial_state):
- Add initial_state() method returning RaftState with HardState and ConfState
- Add helper methods set_hard_state() and set_conf_state() for testing
- 11 comprehensive tests covering defaults, mutations, thread safety, and edge cases
Implementation Details:
- initial_state() acquires read locks for efficient concurrent access
- Returns cloned data to prevent mutation leaks
- Thread-safe with multiple concurrent readers
- Follows raft-rs API conventions (raft::Result<RaftState>)
Helper Methods:
- set_hard_state(hs: HardState) - Updates storage hard state
- set_conf_state(cs: ConfState) - Updates storage conf state
Test Coverage (11 new tests):
- test_initial_state_returns_defaults - Verifies term=0, vote=0, commit=0
- test_initial_state_reflects_hard_state_changes - State updates reflected
- test_initial_state_reflects_conf_state_changes - Config updates reflected
- test_initial_state_is_thread_safe - 10 concurrent threads
- test_initial_state_returns_cloned_data - Data isolation verified
- test_initial_state_multiple_calls_are_consistent - 100 consecutive calls
- test_set_hard_state_updates_storage - Direct storage verification
- test_set_conf_state_updates_storage - Direct storage verification
- test_initial_state_with_empty_conf_state - Partial state updates
- test_initial_state_with_complex_conf_state - Joint consensus scenarios
- Edge cases for configuration changes
Fixes:
- Use struct initialization syntax to satisfy clippy::field_reassign_with_default
- All 24 tests passing (13 original + 11 new)
- No clippy warnings
Progress:
- Phase 4 (Storage Layer): 2/7 tasks complete (29%)
- Overall: 4/24 tasks complete (16.7%)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
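The skeleton and `initial_state()` described in the two storage commits above might be sketched roughly as follows. The four plain structs are stand-ins for the raft-rs protobuf types so the example compiles without dependencies, and returning a tuple instead of `raft::Result<RaftState>` is a simplification; neither reflects the actual code in this PR.

```rust
use std::sync::RwLock;

// Stand-ins for the raft-rs protobuf types (assumed field sets).
#[derive(Clone, Debug, Default, PartialEq)]
pub struct HardState { pub term: u64, pub vote: u64, pub commit: u64 }
#[derive(Clone, Debug, Default, PartialEq)]
pub struct ConfState { pub voters: Vec<u64>, pub learners: Vec<u64> }
#[derive(Clone, Debug, Default, PartialEq)]
pub struct Entry { pub index: u64, pub term: u64, pub data: Vec<u8> }
#[derive(Clone, Debug, Default, PartialEq)]
pub struct Snapshot { pub index: u64, pub term: u64, pub data: Vec<u8> }

// Each field sits behind its own RwLock, so readers do not block each other.
#[derive(Debug, Default)]
pub struct MemStorage {
    hard_state: RwLock<HardState>,
    conf_state: RwLock<ConfState>,
    entries: RwLock<Vec<Entry>>,
    snapshot: RwLock<Snapshot>,
}

impl MemStorage {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn set_hard_state(&self, hs: HardState) {
        *self.hard_state.write().expect("hard_state lock poisoned") = hs;
    }

    pub fn set_conf_state(&self, cs: ConfState) {
        *self.conf_state.write().expect("conf_state lock poisoned") = cs;
    }

    // Returns clones, so callers cannot mutate the stored state through the result.
    pub fn initial_state(&self) -> (HardState, ConfState) {
        let hs = self.hard_state.read().expect("hard_state lock poisoned").clone();
        let cs = self.conf_state.read().expect("conf_state lock poisoned").clone();
        (hs, cs)
    }
}

// Compile-time check that the storage can be shared across threads.
fn assert_send_sync<T: Send + Sync>() {}
pub fn memstorage_is_thread_safe() {
    assert_send_sync::<MemStorage>();
}
```

Taking `&self` rather than `&mut self` in the setters is what lets the storage be wrapped in an `Arc` and shared between threads.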
Add MemStorage::entries() method with comprehensive range query support:
- Range queries [low, high) with proper bounds checking
- Size-limited queries using prost::Message::encoded_len()
- Error handling for compacted (StorageError::Compacted) and unavailable entries
- Helper methods: first_index(), last_index(), append()
- Guarantees at least one entry returned even if exceeds max_size
Test coverage (12 new tests):
- Empty and normal range queries
- Size limits and partial results
- Boundary conditions and error cases
- Thread safety with concurrent access
Dependencies: Added prost = "0.11" for message size calculation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
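The range query semantics listed above can be illustrated with a free function over a slice. This is a sketch under stated assumptions: `Entry::size()` is a made-up stand-in for `prost::Message::encoded_len()`, the local `StorageError` enum mirrors only the two variants the commit mentions, and treating `low >= high` as an empty result is a choice made for this example.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct Entry { pub index: u64, pub term: u64, pub data: Vec<u8> }

impl Entry {
    // Hypothetical size estimate standing in for prost's encoded_len().
    fn size(&self) -> u64 {
        16 + self.data.len() as u64
    }
}

#[derive(Debug, PartialEq)]
pub enum StorageError { Compacted, Unavailable }

/// Returns entries in [low, high), stopping once `max_size` is exceeded.
/// Assumes `log` is contiguous by index. At least one entry is returned
/// for a non-empty valid range, even if that entry alone exceeds `max_size`.
pub fn entries(
    log: &[Entry],
    low: u64,
    high: u64,
    max_size: Option<u64>,
) -> Result<Vec<Entry>, StorageError> {
    if low >= high {
        return Ok(Vec::new());
    }
    let Some(first_entry) = log.first() else {
        return Err(StorageError::Unavailable);
    };
    let first = first_entry.index;
    let last = first + log.len() as u64 - 1;
    if low < first {
        return Err(StorageError::Compacted);
    }
    if high > last + 1 {
        return Err(StorageError::Unavailable);
    }

    let start = (low - first) as usize;
    let end = (high - first) as usize;
    let mut out = Vec::new();
    let mut total = 0u64;
    for e in &log[start..end] {
        total += e.size();
        if let Some(limit) = max_size {
            // Only stop if we already have something to return.
            if !out.is_empty() && total > limit {
                break;
            }
        }
        out.push(e.clone());
    }
    Ok(out)
}
```

The "at least one entry" rule matters because a caller that asks for a range with a small byte budget would otherwise make no progress when a single entry is larger than the budget.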
Add MemStorage::term() method with comprehensive term lookup support:
- Special case: term(0) always returns 0 (Raft convention)
- Returns snapshot.metadata.term for snapshot index
- Proper bounds checking with first_index() and last_index()
- Error handling for compacted (StorageError::Compacted) entries
- Error handling for unavailable (StorageError::Unavailable) entries
- Thread-safe with RwLock read access
Test coverage (9 new tests):
- Index 0 returns 0
- Valid indices return correct terms
- Snapshot index returns snapshot term
- Compacted and unavailable error cases
- Empty storage and snapshot-only scenarios
- Thread safety with concurrent access
- Boundary conditions
Progress: 6/24 tasks (25%), Storage Layer 57% (4/7)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
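The lookup order described above (index 0, then the snapshot index, then bounds checks) could look like this in isolation. The `Log` struct is a hypothetical simplification that stores only entry terms and assumes entries are contiguous starting right after the snapshot; locking is omitted.

```rust
#[derive(Debug, PartialEq)]
pub enum StorageError { Compacted, Unavailable }

// Hypothetical simplified log: only terms are stored, and entry i of
// `entry_terms` is assumed to have index snapshot_index + 1 + i.
pub struct Log {
    pub snapshot_index: u64,
    pub snapshot_term: u64,
    pub entry_terms: Vec<u64>,
}

impl Log {
    // An empty log with no snapshot yields first_index = 1, last_index = 0.
    pub fn first_index(&self) -> u64 {
        self.snapshot_index + 1
    }

    pub fn last_index(&self) -> u64 {
        self.snapshot_index + self.entry_terms.len() as u64
    }

    pub fn term(&self, index: u64) -> Result<u64, StorageError> {
        if index == 0 {
            return Ok(0); // Raft convention: the term before any entry is 0
        }
        if index == self.snapshot_index {
            return Ok(self.snapshot_term);
        }
        if index < self.first_index() {
            return Err(StorageError::Compacted);
        }
        if index > self.last_index() {
            return Err(StorageError::Unavailable);
        }
        Ok(self.entry_terms[(index - self.first_index()) as usize])
    }
}
```

With these definitions the invariant `first_index <= last_index + 1` holds for every state the struct can represent, including the empty log.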
Add 18 comprehensive tests for existing first_index() and last_index() methods:
first_index() tests (6 tests):
- Empty log returns 1
- After append returns correct index
- With snapshot returns snapshot.index + 1
- Snapshot with entries scenario
- After compaction with sparse entries
- Entries not starting at index 1
last_index() tests (6 tests):
- Empty log returns 0
- After append returns last entry index
- Snapshot only returns snapshot.index
- Snapshot with entries returns last entry
- Multiple appends update correctly
- Single entry edge case
Invariant & safety tests (6 tests):
- Verify first_index <= last_index + 1 always holds
- Boundary conditions (empty, single, multiple)
- Thread safety with concurrent access
- Consistency across multiple calls
- Large snapshot indices handling
- Multiple scenario lifecycle testing
All methods already implemented and working - this formalizes them with comprehensive test coverage per acceptance criteria.
Progress: 7/24 tasks (29.2%), Storage Layer 71% (5/7)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Add MemStorage::snapshot() method with Phase 1 simplified implementation:
- Always returns current snapshot (ignores request_index in Phase 1)
- Thread-safe with RwLock read access
- Returns cloned snapshot to prevent mutation leaks
- Comprehensive documentation with Phase 1 simplification note
Test coverage (7 new tests):
- Default snapshot on new storage
- Stored snapshot retrieval
- Phase 1 behavior (ignores request_index)
- Complex metadata (ConfState with voters/learners)
- Large data payloads (10KB)
- Clone independence validation
- Thread safety (10 threads × 100 iterations)
Implementation notes:
- Phase 1: Simple read-lock-clone-return pattern
- Future phases may return SnapshotTemporarilyUnavailable
- Validates snapshot data integrity (metadata + data)
- 1000 total concurrent reads tested
Progress: 8/24 tasks (33.3%), Storage Layer 86% (6/7)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Update mise check task to:
- Format code (not just check formatting)
- Include build step
- Use cleaner depends pattern
Now runs: format → lint → build → test
Implement apply_snapshot() and wl_append_entries() to complete the Storage Layer implementation. Both methods use proper Raft semantics:
- apply_snapshot(): Replaces storage state with snapshot, clears covered entries, updates hard_state and conf_state
- wl_append_entries(): Appends entries with conflict resolution (compares terms, truncates on mismatch)
Adds 16 comprehensive tests covering:
- Snapshot installation with state updates
- Entry appending with conflict resolution
- Thread safety with concurrent operations
- Edge cases (empty log, conflicting terms)
All 86 tests passing with zero clippy warnings. Storage Layer (Phase 4) now 100% complete (7/7 tasks).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
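The conflict resolution rule mentioned for wl_append_entries() (compare terms, truncate on mismatch) can be sketched as a standalone function. The two-field `Entry` and the `append_entries` name are illustrative only, and the sketch assumes both the log and the incoming batch are contiguous by index with no gap between them.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct Entry { pub index: u64, pub term: u64 }

/// Appends `incoming` to `log`. If an incoming entry has the same index as an
/// existing one but a different term, the existing entry and everything after
/// it are discarded before the incoming entry is appended.
pub fn append_entries(log: &mut Vec<Entry>, incoming: &[Entry]) {
    for e in incoming {
        let first = log.first().map(|f| f.index);
        match first {
            // Empty log: accept the entry as the new start.
            None => log.push(e.clone()),
            // Entry lies before the retained log (already compacted): skip it.
            Some(first) if e.index < first => {}
            Some(first) => {
                let pos = (e.index - first) as usize;
                if pos < log.len() {
                    if log[pos].term != e.term {
                        // Term mismatch: drop the conflicting suffix.
                        log.truncate(pos);
                        log.push(e.clone());
                    }
                    // Same index and term: the entry is already present.
                } else {
                    // Assumption of this sketch: no gap after the last entry.
                    debug_assert_eq!(pos, log.len());
                    log.push(e.clone());
                }
            }
        }
    }
}
```

Entries that already match are left untouched, so replaying the same batch twice leaves the log unchanged.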
Corrected protobuf enum variant names (Normal, ConfChange, Noop) and updated all format strings to use inline variable syntax for clippy compliance.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Add Operation types and StateMachine implementation for Raft consensus:
Protocol Layer (operations.rs):
- Operation enum with Set/Del variants for key-value mutations
- Serialization/deserialization using bincode
- apply() method for executing operations on HashMap
- 17 tests covering all operation scenarios
State Machine (state_machine.rs):
- StateMachine struct with HashMap data and last_applied tracking
- Core methods: new(), get(), exists(), last_applied()
- apply() method with Operation deserialization and idempotency
- 19 tests covering all state machine operations
- Integration with protocol crate Operation types
Progress: 16/24 tasks complete (66.7%), Phase 5 at 67% (2/3)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
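A dependency-free sketch of the Operation and StateMachine pairing described above. The field names, the `Vec<u8>` key and value types, and the choice to signal an already-applied index with a `false` return value are assumptions; the bincode deserialization step that the real `apply()` performs is left out so the example needs no external crates.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
pub enum Operation {
    Set { key: Vec<u8>, value: Vec<u8> },
    Del { key: Vec<u8> },
}

impl Operation {
    // Executes the mutation against the key-value map.
    pub fn apply(&self, data: &mut HashMap<Vec<u8>, Vec<u8>>) {
        match self {
            Operation::Set { key, value } => {
                data.insert(key.clone(), value.clone());
            }
            Operation::Del { key } => {
                data.remove(key);
            }
        }
    }
}

#[derive(Debug, Default)]
pub struct StateMachine {
    data: HashMap<Vec<u8>, Vec<u8>>,
    last_applied: u64,
}

impl StateMachine {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn get(&self, key: &[u8]) -> Option<&Vec<u8>> {
        self.data.get(key)
    }

    pub fn exists(&self, key: &[u8]) -> bool {
        self.data.contains_key(key)
    }

    pub fn last_applied(&self) -> u64 {
        self.last_applied
    }

    /// Applies `op` at log `index`. Returns false (and changes nothing) if the
    /// index was already applied, which keeps replayed entries idempotent.
    pub fn apply(&mut self, index: u64, op: &Operation) -> bool {
        if index <= self.last_applied {
            return false;
        }
        op.apply(&mut self.data);
        self.last_applied = index;
        true
    }
}
```

Tracking `last_applied` is what makes the state machine deterministic under replay: two replicas that apply the same committed entries in order end up with the same map, even if some entries are delivered more than once.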
Implement snapshot/restore functionality for log compaction:
- Add snapshot() method to serialize state machine using bincode
- Add restore() method to deserialize and replace state
- Add Serialize/Deserialize derives to StateMachine struct
- Add bincode 1.3 dependency to raft crate
Tests: 9 new unit tests + 2 doc tests covering:
- Empty snapshot creation
- Snapshot with data
- Restore from snapshot
- Roundtrip serialization
- Error handling for invalid data
- Large state (100 keys) performance
- State overwrite verification
All 147 tests passing (123 unit + 24 doc tests)
Phase 5 (State Machine) now 100% complete (3/3 tasks)
Overall progress: 70.8% (17/24 tasks)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add RaftNode struct wrapping raft-rs RawNode
- Implement new() for node initialization
- Implement tick() for logical clock advancement
- Implement propose() for client command submission
- Implement handle_ready() for Raft state processing
- Add apply_committed_entries() helper method
- Add MemStorage::append() for entry persistence
- Add comprehensive test coverage (22 tests)
- Update progress: 83.3% complete (20/24 tasks)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Implement is_leader() to check if node is leader
- Implement leader_id() to get current leader ID
- Add 8 comprehensive tests for leader queries
- Complete Phase 6 (Raft Node) - 100% done
- Update progress: 87.5% complete (21/24 tasks)
- Ready for Phase 7 (Integration)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Address code review feedback:
- Replace .unwrap() with .expect() for descriptive error messages
- Fix TOCTOU races in entries() and term() by acquiring locks once
- Add defensive logging in apply_committed_entries()
- Document lock poisoning philosophy for Phase 1
All 199 tests passing, zero clippy warnings.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
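To illustrate the kind of time-of-check-to-time-of-use race the review feedback refers to, here is a hedged sketch with a hypothetical `TermLog` type (not the PR's actual code). A racy version would take the read lock once to check the length and a second time to index into the vector, leaving a window in which another thread could truncate the log. The version below holds a single read guard for both the bounds check and the read.

```rust
use std::sync::RwLock;

// Hypothetical log of terms; position i holds the term of index i + 1.
pub struct TermLog {
    terms: RwLock<Vec<u64>>,
}

impl TermLog {
    pub fn new(terms: Vec<u64>) -> Self {
        Self { terms: RwLock::new(terms) }
    }

    pub fn truncate(&self, len: usize) {
        // .expect() gives a descriptive message if the lock is poisoned.
        self.terms.write().expect("terms lock poisoned").truncate(len);
    }

    /// Looks up the term at `index` under one read guard, so the bounds
    /// check and the element read observe the same version of the log.
    pub fn term(&self, index: u64) -> Option<u64> {
        if index == 0 {
            return Some(0);
        }
        let terms = self.terms.read().expect("terms lock poisoned");
        terms.get((index - 1) as usize).copied()
    }
}
```

Using `Vec::get` instead of direct indexing means an out-of-range index yields `None` rather than a panic, which is a second line of defense on top of the single lock acquisition.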
Implement transport layer for Raft message communication:
- Add TransportServer/Client with gRPC (tonic 0.12, prost 0.13)
- Bridge prost 0.11 (raft-rs) ↔ 0.13 (transport) via conversion layer
- Extract KV operations to separate crate (seshat-kv)
- Rename protocol → protocol-resp as RESP placeholder
- Remove custom protobuf definitions (use raft-rs built-ins internally)
Benefits:
- Modern gRPC stack (2024/2025 versions) for transport
- No version lock on rest of service
- Clean isolation of old prost dependency
Tests: 203 passing (157 unit + 13 integration + 33 doctests)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Upgrade transport layer to latest versions:
- tonic 0.12 → 0.14
- prost 0.13 → 0.14
- Use tonic-prost-build instead of tonic-build (API change)
- Add tonic-prost runtime dependency for generated code
All 203 tests passing.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Remove placeholder add() function from lib.rs
- Add prost version bridging documentation in storage.rs
- Replace eprintln! with log::warn! for structured logging
- Document direct field access rationale in is_leader()
- Remove outdated #[allow(dead_code)] on MemStorage
- Add log dependency for proper logging infrastructure
All 156 library tests and 13 integration tests passing.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Integrate complete RESP2/3 parser and encoder from feat/resp branch:
- Full protocol support (14 data types, 487 tests passing)
- Zero-copy parsing with bytes::Bytes
- Tokio codec integration for async I/O
- Command parser for GET, SET, DEL, EXISTS, PING
- Buffer pooling for memory efficiency
Additional changes:
- Simplify CI workflow to use mise for local/CI parity
- Fix duplicate CI runs (removed push on feat/* branches)
- Remove optional raft dependency from common crate to avoid protobuf-build conflicts
- Add --all-features to mise lint task for comprehensive testing
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
The mise install task warns about protoc but doesn't install it. CI needs protoc installed before building the raft-proto dependency.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Add protoc to mise.toml tools for automatic installation. This eliminates manual protoc installation steps and ensures version consistency across local development and CI environments.
Changes:
- Add protoc = "28" to [tools] in mise.toml
- Remove manual apt-get protoc installation from CI workflow
- Mise action automatically installs all tools defined in mise.toml
Benefits:
- Single source of truth for tool versions
- Automatic protoc installation in CI via mise-action
- Consistent protoc version (28.3) across all environments
- Simpler CI workflow
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Feature Overview
Implements core Raft consensus components for Seshat. This PR adds the foundational pieces needed for distributed consensus but does not wire them into a working cluster yet.
What Was Built
Implements raft::Storage trait for in-memory Raft log management:
Orchestrates Raft consensus operations:
Deterministic state machine for KV operations:
Custom protobuf-based transport for Raft messages:
Shared foundation: