feat(procmond): implement WAL and EventBus connector integration #127
Caution: Review failed. Failed to post review comments.
Summary by CodeRabbit
Walkthrough
Adds a durable Write‑Ahead Log and WAL-backed EventBusConnector, migrates workspace serialization from bincode to postcard, expands procmond process‑collection surface (WAL, connector, macOS enhancements), introduces an agent health utility, and updates CI/tooling, manifests, and test/lint allowances.
Sequence Diagram(s)
sequenceDiagram
participant App as Producer
participant WAL as Write‑Ahead Log
participant Buf as In‑Memory Buffer
participant Connector as EventBusConnector
participant Broker as Broker
App->>WAL: write(event)
WAL-->>App: seq_id
alt Connector connected
App->>Connector: publish(event, seq_id)
Connector->>Broker: send_event
Broker-->>Connector: ack
Connector->>WAL: mark_published(seq_id)
WAL-->>WAL: remove_published_files()
else Connector disconnected
App->>Buf: buffer(event, seq_id)
Buf->>Buf: check_size()
Buf-->>App: backpressure_signal if threshold crossed
end
loop reconnect & recovery
Connector->>Broker: reconnect_attempt (with backoff)
Broker-->>Connector: connected
Connector->>WAL: replay_unpublished()
loop replay entries
Connector->>Broker: publish(entry)
Broker-->>Connector: ack
Connector->>WAL: mark_published(entry.seq)
end
Connector->>Buf: drain_buffer()
Connector->>WAL: mark_published(drained.seq)
end
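The flow in the sequence diagram can be sketched in a few dozen lines of Rust. This is a minimal in-memory illustration under stated assumptions, not the PR's implementation: `Broker`, `RecordingBroker`, `Wal`, and `Connector` are hypothetical names, the WAL is modeled as a vector rather than files on disk, and buffered events are assumed to also be present in the WAL, so replaying the WAL on reconnect covers them and the buffer only needs to drop what has been acknowledged.

```rust
use std::collections::VecDeque;

/// Hypothetical broker interface (assumption; not the daemoneye-eventbus API).
trait Broker {
    fn send(&mut self, payload: &[u8]) -> Result<(), String>;
}

/// Stand-in broker that accepts and records every payload.
#[derive(Default)]
struct RecordingBroker {
    sent: Vec<Vec<u8>>,
}

impl Broker for RecordingBroker {
    fn send(&mut self, payload: &[u8]) -> Result<(), String> {
        self.sent.push(payload.to_vec());
        Ok(())
    }
}

/// In-memory model of the WAL: entries stay pending until marked published.
#[derive(Default)]
struct Wal {
    next_seq: u64,
    pending: Vec<(u64, Vec<u8>)>,
}

impl Wal {
    /// Append an entry and return its sequence number.
    fn write(&mut self, payload: &[u8]) -> u64 {
        self.next_seq += 1;
        self.pending.push((self.next_seq, payload.to_vec()));
        self.next_seq
    }

    /// Drop an entry once the broker has acknowledged it.
    fn mark_published(&mut self, seq: u64) {
        self.pending.retain(|(s, _)| *s != seq);
    }
}

struct Connector<B: Broker> {
    wal: Wal,
    broker: Option<B>, // None models the "Connector disconnected" branch
    buffer: VecDeque<(u64, Vec<u8>)>,
}

impl<B: Broker> Connector<B> {
    fn new() -> Self {
        Self { wal: Wal::default(), broker: None, buffer: VecDeque::new() }
    }

    /// Write to the WAL first, then publish if connected, otherwise buffer.
    fn publish(&mut self, payload: &[u8]) -> u64 {
        let seq = self.wal.write(payload);
        let acked = match self.broker.as_mut() {
            Some(broker) => broker.send(payload).is_ok(),
            None => false,
        };
        if acked {
            self.wal.mark_published(seq);
        } else {
            self.buffer.push_back((seq, payload.to_vec()));
        }
        seq
    }

    /// Replay unpublished WAL entries in sequence order, then discard
    /// buffered events that the replay already delivered.
    fn on_reconnect(&mut self, mut broker: B) -> usize {
        let mut replayed = 0;
        for (seq, payload) in self.wal.pending.clone() {
            if broker.send(&payload).is_ok() {
                self.wal.mark_published(seq);
                replayed += 1;
            }
        }
        let still_pending: Vec<u64> = self.wal.pending.iter().map(|(s, _)| *s).collect();
        self.buffer.retain(|(seq, _)| still_pending.contains(seq));
        self.broker = Some(broker);
        replayed
    }
}
```

Publishing while disconnected leaves entries both in the WAL and in the buffer; calling `on_reconnect` with a fresh broker delivers them once and leaves both empty.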
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks: ✅ 3 passed
❌ 1 blocking issue (82 total)

    "WAL replay completed"
    );

    Ok(replayed.saturating_add(buffer_flushed))

    [...]

    assert_eq!(events.len(), 5);
    }
    }
    }

Pull request overview
Implements procmond’s crash-recoverable event delivery by integrating a Write-Ahead Log (WAL) and an EventBus connector, along with supporting infrastructure updates across the workspace.
Changes:
- Added/extended procmond event durability + delivery mechanics (WAL + broker connector, buffering/backpressure, replay).
- Migrated EventBus/RPC/task/message serialization across daemoneye-eventbus + collector-core from `bincode` to `postcard`.
- Updated specs/ticket docs and adjusted tests/benchmarks/lints to align with the new architecture and stricter workspace linting.
Reviewed changes
Copilot reviewed 73 out of 74 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| spec/procmond/tickets/Validate_Performance_and_Optimize.md | Adds perf validation/optimization ticket details for the epic. |
| spec/procmond/tickets/Validate_FreeBSD_Platform_Support.md | Adds FreeBSD validation ticket details and expectations. |
| spec/procmond/tickets/Implement_Write-Ahead_Log_and_Event_Bus_Connector.md | Documents WAL + connector requirements/acceptance criteria. |
| spec/procmond/tickets/Implement_Security_Hardening_and_Data_Sanitization.md | Adds security hardening/sanitization ticket details. |
| spec/procmond/tickets/Implement_RPC_Service_and_Registration_Manager_(procmond).md | Adds RPC/registration/heartbeat ticket details. |
| spec/procmond/tickets/Implement_Comprehensive_Test_Suite.md | Adds comprehensive testing strategy ticket details. |
| spec/procmond/tickets/Implement_Agent_Loading_State_and_Heartbeat_Detection.md | Adds agent loading/heartbeat detection ticket details. |
| spec/procmond/tickets/Implement_Actor_Pattern_and_Startup_Coordination.md | Adds actor/startup coordination ticket details. |
| spec/procmond/specs/Epic_Brief__Complete_Procmond_Implementation.md | Adds/updates epic brief content for overall procmond plan. |
| spec/procmond/specs/Core_Flows__Procmond_Process_Monitoring.md | Adds/updates detailed operational flows and failure handling. |
| spec/procmond/index.md | Updates ticket index/status to reflect current epic progress. |
| procmond/tests/property_based_process_tests.rs | Updates test docs + adds lint overrides for property-based tests. |
| procmond/tests/process_enumeration_edge_cases.rs | Updates test docs + adds lint overrides for edge-case tests. |
| procmond/tests/privilege_management_tests.rs | Updates test docs + adds lint overrides; adjusts privilege detection helper. |
| procmond/tests/os_compatibility_tests.rs | Updates docs + lint overrides for OS compatibility tests. |
| procmond/tests/os_compatibility_comprehensive_tests.rs | Updates docs + lint overrides for comprehensive OS tests. |
| procmond/tests/macos_integration_tests.rs | Adds lint overrides and doc cleanup for macOS integration tests. |
| procmond/tests/macos_enhanced_integration_tests.rs | Adds lint overrides and doc cleanup for enhanced macOS tests. |
| procmond/tests/linux_integration_tests.rs | Doc tweak + lint overrides for Linux integration tests. |
| procmond/tests/lifecycle_integration_tests.rs | Doc tweaks + lint overrides for lifecycle integration tests. |
| procmond/tests/integration_tests.rs | Doc tweaks + lint overrides for integration tests. |
| procmond/tests/cross_platform_integration_tests.rs | Doc tweaks + lint overrides for cross-platform integration tests. |
| procmond/src/process_collector.rs | Makes error types non-exhaustive; hardens conversions/overflow handling; refactors iteration. |
| procmond/src/monitor_collector.rs | Refactors timer strings, backpressure/circuit breaker bookkeeping, and constructor signature. |
| procmond/src/main.rs | Improves error formatting and replaces stdout prints with structured logging in places. |
| procmond/src/macos_collector.rs | Makes error enum non-exhaustive; refactors conversions; adds/adjusts lint attributes. |
| procmond/src/lifecycle.rs | Makes enums non-exhaustive; refactors stats math/overflow handling; minor API tweaks. |
| procmond/src/lib.rs | Exposes new modules/exports; refines task handling error formatting and timestamp conversions. |
| procmond/src/event_source.rs | Refactors batching/backpressure logic, stats updates, and logging; adds defensive checks. |
| procmond/examples/process_collector_usage.rs | Doc tweaks + lint overrides in the example. |
| procmond/benches/process_collector_benchmarks.rs | Doc tweaks + lint overrides for benchmarks. |
| procmond/Cargo.toml | Adds eventbus dependency, updates lints to workspace config, adjusts deps. |
| mise.toml | Adds mise toolchain configuration (rust/protoc/etc). |
| daemoneye-lib/src/models/rule.rs | Simplifies statement validation logic and future-proofs enum matching. |
| daemoneye-eventbus/tests/rpc_integration_tests.rs | Switches RPC serialization test cases to postcard. |
| daemoneye-eventbus/src/task_distribution.rs | Switches task serialization to postcard. |
| daemoneye-eventbus/src/rpc.rs | Switches RPC request/response serialization to postcard. |
| daemoneye-eventbus/src/message.rs | Switches message serialization/deserialization to postcard. |
| daemoneye-eventbus/src/client.rs | Switches client-side event serialization to postcard. |
| daemoneye-eventbus/src/broker.rs | Switches broker-side decode/encode paths to postcard. |
| daemoneye-eventbus/benches/throughput.rs | Updates serialization benchmarks to postcard. |
| daemoneye-eventbus/Cargo.toml | Replaces bincode dependency with postcard. |
| daemoneye-cli/tests/cli.rs | Minor formatting update in test error output. |
| daemoneye-cli/src/main.rs | Adds lint override for intentional stdout printing; minor doc tweak. |
| daemoneye-cli/Cargo.toml | Cleans up dependencies; adopts workspace lints. |
| daemoneye-agent/tests/rpc_lifecycle_integration.rs | Adds lint overrides for integration tests. |
| daemoneye-agent/tests/rpc_collector_management_integration.rs | Adds lint overrides for integration tests. |
| daemoneye-agent/tests/dual_protocol_integration.rs | Adds lint overrides for integration tests. |
| daemoneye-agent/tests/cli.rs | Adds lint overrides for CLI test. |
| daemoneye-agent/tests/broker_integration.rs | Adds lint overrides for broker integration tests. |
| daemoneye-agent/src/main.rs | Refactors logging/printing, minor robustness improvements in loop counters and formatting. |
| daemoneye-agent/src/lib.rs | Docstring backtick/wording tweaks. |
| daemoneye-agent/src/ipc_server.rs | Makes health enum non-exhaustive; refactors state updates and error strings. |
| daemoneye-agent/src/collector_registry.rs | Makes error enum non-exhaustive; small refactors/ownership cleanups. |
| daemoneye-agent/src/broker_manager.rs | Makes health enum non-exhaustive; refactors state transitions and string handling. |
| daemoneye-agent/examples/dual_protocol_demo.rs | Adds lint overrides in example. |
| daemoneye-agent/Cargo.toml | Cleans up dependencies; adopts workspace lints. |
| collector-core/tests/rpc_server_integration.rs | Switches RPC serialization to postcard in tests. |
| collector-core/tests/daemoneye_eventbus_ipc_integration.rs | Switches event serialization to postcard in tests. |
| collector-core/src/task_distributor.rs | Switches distribution payload serialization to postcard. |
| collector-core/src/rpc_services.rs | Switches RPC service serialization to postcard. |
| collector-core/Cargo.toml | Replaces bincode with postcard; removes now-unused deps. |
| Cargo.toml | Updates workspace dependencies/versions and centralizes workspace lints. |
| .vscode/settings.json | Adds ruff/python interpreter configuration to workspace settings. |
| .serena/project.yml | Updates supported language list/comments and project_name. |
| .claude/commands/review-tests.md | Adds/updates Claude command docs for reviewing tests. |
| .claude/commands/review-simplicity.md | Normalizes formatting/content for simplification review command. |
| .claude/commands/review-performance.md | Normalizes formatting/content for performance review command. |
| .claude/commands/review-dependencies.md | Normalizes formatting/content for dependency review command. |
| .claude/commands/review-architecture.md | Normalizes formatting/content for architecture review command. |

    /// # Backoff Strategy
    ///
    /// - Initial delay: 100ms
    /// - Maximum delay: 30 seconds
    /// - Multiplier: 2x per attempt
    /// - Jitter: ±10%
    async fn try_reconnect(&mut self) -> EventBusConnectorResult<bool> {

Copilot AI, Jan 29, 2026
The backoff docs claim a jitter of ±10%, but the implementation only computes delay_ms as min(MIN_BACKOFF_MS * 2^attempt, MAX_BACKOFF_MS) with no jitter applied. Either implement jitter (to avoid thundering herd reconnects) or update the doc comment to match the actual behavior.
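One way to make the code match the documented ±10% jitter is sketched below. It is an illustration only: the constant names follow the review comment, but `base_delay_ms`, `delay_with_jitter`, and the `sample` parameter are hypothetical, and the caller is assumed to supply `sample` from whatever random source the project already uses. Integer per-mille arithmetic keeps the result exact and avoids float casts.

```rust
use std::time::Duration;

const MIN_BACKOFF_MS: u64 = 100; // documented initial delay
const MAX_BACKOFF_MS: u64 = 30_000; // documented maximum delay

/// Deterministic part: min(MIN_BACKOFF_MS * 2^attempt, MAX_BACKOFF_MS),
/// computed with saturating arithmetic so large attempt counts cannot overflow.
fn base_delay_ms(attempt: u32) -> u64 {
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    MIN_BACKOFF_MS.saturating_mul(factor).min(MAX_BACKOFF_MS)
}

/// Applies a jitter of -10% to +10% on top of the capped base delay.
/// `sample` is any random number; it is reduced to 0..=200 per-mille and
/// shifted so that 0 maps to 90% and 200 maps to 110% of the base delay.
fn delay_with_jitter(attempt: u32, sample: u64) -> Duration {
    let permille = 900 + sample % 201;
    Duration::from_millis(base_delay_ms(attempt) * permille / 1000)
}
```

Because the jitter is applied after the cap, the largest possible delay in this sketch is 33 seconds rather than 30; clamping after the jitter is an equally reasonable choice if the 30-second ceiling must be strict.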
Add enhanced process enumeration with platform-specific collectors, crash-safe WAL, and migrate IPC serialization to postcard for smaller payloads; update dependencies and tighten lints to improve reliability and performance.
Updated the review command markdown files to clarify instructions, add steps for making refactoring changes, and emphasize running `just ci-check` after changes. Added a new review-tests.md file to standardize test coverage review. Improved formatting and detail for clarity and consistency across all review command files. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Expanded the list of supported language servers in the project configuration, adding languages such as fsharp, groovy, pascal, powershell, and toml. Updated the project name from 'DaemonEye' to 'daemoneye' for consistency. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Introduces detailed specification, epic brief, and technical planning documents for the complete implementation of procmond. These documents cover user flows, architectural decisions, platform support, security, performance, testing, and integration with daemoneye-agent. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Implements Ticket 1 of the procmond epic - Write-Ahead Log and EventBus Connector integration for reliable event delivery.

Key changes:
- Add EventBusConnector with daemoneye-eventbus client integration
- Add event type persistence in WAL for correct topic routing on replay
- Implement automatic reconnection with exponential backoff
- Add backpressure signaling (70% activation, 50% release thresholds)
- Add replay_entries() for recovering full WAL entries with metadata
- Implement 10MB in-memory event buffer with overflow protection

The WAL integration ensures crash-recoverable event delivery:
1. Events are written to WAL with sequence numbers and event types
2. If connected, events are published to the broker
3. On disconnect, events are buffered in memory
4. On reconnect, WAL is replayed with proper sequence tracking

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
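The 70% activation and 50% release thresholds form a hysteresis band, which keeps the signal from flapping when buffer usage hovers near a single cutoff. A minimal sketch follows, assuming activation at or above 70% and release strictly below 50%; `BackpressureMonitor`, `Signal`, and `observe` are hypothetical names, not the connector's actual API.

```rust
/// Buffer size from the commit message: 10MB.
const BUFFER_CAPACITY_BYTES: usize = 10 * 1024 * 1024;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Signal {
    Activated,
    Released,
}

/// Tracks whether backpressure is active and emits a signal only on transitions.
struct BackpressureMonitor {
    capacity: usize,
    active: bool,
}

impl BackpressureMonitor {
    fn new(capacity: usize) -> Self {
        Self { capacity, active: false }
    }

    /// Call after every buffer size change with the bytes currently buffered.
    fn observe(&mut self, used_bytes: usize) -> Option<Signal> {
        // Integer percentage; max(1) guards against a zero-capacity buffer.
        let percent = used_bytes.saturating_mul(100) / self.capacity.max(1);
        if !self.active && percent >= 70 {
            self.active = true;
            Some(Signal::Activated)
        } else if self.active && percent < 50 {
            self.active = false;
            Some(Signal::Released)
        } else {
            None
        }
    }
}
```

Between 50% and 70% the monitor returns `None` in both directions, so a buffer that climbs to 70%, drains to 60%, and climbs again produces exactly one `Activated` signal.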
8df8a29 to 6cca7b4
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
…flows Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
procmond/src/linux_collector.rs (Outdated)

    } else {
        None
    };
    // TODO: Implement start time parsing from /proc/[pid]/stat jiffies

    [...]

    .await
    .map_err(|e| ProcessCollectionError::SystemEnumerationFailed {
    -    message: format!("Process enumeration task failed: {}", e),
    +    message: format!("Process enumeration task failed: {e}"),

    [...]

    .await
    .map_err(|e| ProcessCollectionError::SystemEnumerationFailed {
    -    message: format!("Process lookup task failed: {}", e),
    +    message: format!("Process lookup task failed: {e}"),

    [...]

    - **Implementation**: Events persisted to disk before buffering, replayed on restart if procmond crashes
    - **Risk Mitigation**: Bounded buffer size, WAL rotation to prevent disk exhaustion, backpressure when buffer full

    **Trade-off 3: Privilege Model**

    [...]

    - **Rationale**: procmond needs persistent elevated access; agent has larger attack surface (network connectivity)
    - **Risk Mitigation**: procmond has no network access, minimal attack surface, runs as child process (isolated)

    **Trade-off 4: FreeBSD Support Level**

    [...]

    ### 4. Technical Constraints

    **Platform Constraints**

    [...]

    - Must respect platform security boundaries (SELinux, AppArmor, SIP, UAC)
    - Must use platform-native APIs for process enumeration

    **Performance Constraints**

    [...]

    - Process enumeration \<100ms for 1,000 processes (average)
    - Event publishing must handle backpressure gracefully

    **Security Constraints**

Codecov Report: ❌ Patch coverage is
- Refactor IpcServerManager::wait_for_healthy to use shared health::wait_for_healthy helper, eliminating code duplication with BrokerManager
- Fix markdown formatting in Core_Flows spec: proper indentation for nested lists (MD007), blank lines around lists (MD032)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
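The shape of a shared wait helper like the one this commit describes might look roughly as follows. This is a synchronous sketch under assumptions: the trait name `HealthCheck`, the `is_healthy` method, the helper's signature, and the `FlakyBroker` test double are all hypothetical, and the real helper in the PR is not shown on this page (it is likely async). Only the associated `service_name()` function mirrors the excerpts quoted in the review threads.

```rust
use std::cell::Cell;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical trait that both managers could implement.
trait HealthCheck {
    /// Human-readable name used in error messages, e.g. "Broker".
    fn service_name() -> &'static str;
    fn is_healthy(&self) -> bool;
}

/// Polls `svc` until it reports healthy or `timeout` elapses.
fn wait_for_healthy<T: HealthCheck>(
    svc: &T,
    timeout: Duration,
    poll_interval: Duration,
) -> Result<(), String> {
    let deadline = Instant::now() + timeout;
    loop {
        if svc.is_healthy() {
            return Ok(());
        }
        if Instant::now() >= deadline {
            return Err(format!(
                "{} did not become healthy within {:?}",
                T::service_name(),
                timeout
            ));
        }
        thread::sleep(poll_interval);
    }
}

/// Test double that becomes healthy after a fixed number of polls.
struct FlakyBroker {
    polls_until_healthy: Cell<u32>,
}

impl HealthCheck for FlakyBroker {
    fn service_name() -> &'static str {
        "Broker"
    }

    fn is_healthy(&self) -> bool {
        let left = self.polls_until_healthy.get();
        if left == 0 {
            true
        } else {
            self.polls_until_healthy.set(left - 1);
            false
        }
    }
}
```

Putting the polling loop in one generic function means each manager only supplies its name and its health probe, which is the duplication the commit message says was removed.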

    fn service_name() -> &'static str {
        "Broker"
    }

    [...]

    fn service_name() -> &'static str {
        "IPC server"
    }

    [...]

    assert_eq!(metadata.max_sequence, 5, "Max sequence should be 5");
    assert_eq!(metadata.entry_count, 5, "Should have 5 entries");
    }
    }

    [...]

    }
    ```

    **Heartbeat Message**

    [...]

    }
    ```

    **Process Event Message**

    [...]

    ### 1. New Components

    **WriteAheadLog (New)**

    [...]

    - Handle WAL corruption (skip corrupted entries with CRC32 validation, log warning, continue)
    - Track which events have been published (mark for deletion)

    **EventBusConnector (New)**

    [...]

    - Calculate new interval: current_interval * 1.5 (50% increase)
    - Release backpressure when buffer drops below 50% (send AdjustInterval with original interval)

    **RpcServiceHandler (New)**

- Add start_time calculation for Linux collector by parsing starttime jiffies from /proc/[pid]/stat and converting using boot time from /proc/stat
- Fix /proc/[pid]/stat parsing to handle comm field with spaces by finding the last ')' before parsing subsequent fields
- Replace unsafe `as_millis() as u64` casts with `u64::try_from().unwrap_or(u64::MAX)` across all collectors for safer overflow handling
- Improve lifecycle tracker cleanup documentation and warning messages
- Remove unused `_platform_name` variable in FallbackProcessCollector

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
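The parsing approach this commit describes can be illustrated with a short sketch. The function names below are hypothetical and the code is not taken from the PR; it relies on the documented layout of `/proc/[pid]/stat`, where `starttime` is field 22, which makes it the 20th whitespace-separated token after the closing parenthesis of the comm field. The clock tick rate is passed in as a parameter rather than assumed.

```rust
use std::time::Duration;

/// Extracts `starttime` (clock ticks since boot) from a /proc/[pid]/stat line.
/// Splitting at the LAST ')' keeps comm values with spaces or parentheses
/// from shifting the field positions.
fn parse_starttime_ticks(stat: &str) -> Option<u64> {
    let close = stat.rfind(')')?;
    let rest = stat.get(close + 1..)?;
    // Field 3 (state) is token 0 after the comm field, so field 22 is token 19.
    rest.split_whitespace().nth(19)?.parse().ok()
}

/// Reads the boot time (seconds since the Unix epoch) from /proc/stat contents.
fn parse_btime(proc_stat: &str) -> Option<u64> {
    proc_stat
        .lines()
        .find_map(|line| line.strip_prefix("btime ")?.trim().parse().ok())
}

/// Converts ticks since boot into an absolute start time in Unix seconds.
fn start_time_unix_secs(stat: &str, boot_time_secs: u64, ticks_per_sec: u64) -> Option<u64> {
    if ticks_per_sec == 0 {
        return None;
    }
    let ticks = parse_starttime_ticks(stat)?;
    boot_time_secs.checked_add(ticks / ticks_per_sec)
}

/// Overflow-safe replacement for `as_millis() as u64`, as the commit describes.
fn millis_u64(d: Duration) -> u64 {
    u64::try_from(d.as_millis()).unwrap_or(u64::MAX)
}
```

A process named `tmux: server`, or one whose name itself contains `)`, parses correctly here because everything up to the final `)` is treated as the comm field.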
Critical fixes:
- Add missing boot_time_secs and clock_ticks_per_sec fields to LinuxProcessCollector Clone impl
- Change WalError sequence types from u32 to u64 for consistency with WalEntry

Important fixes:
- Log reconnection errors at debug level instead of dropping silently
- Log backpressure signal send failures at debug level
- Log WAL file scanning failures during initialization
- Add #[non_exhaustive] to WindowsCollectionError for consistency
- Document shutdown() as best-effort (errors logged but not propagated)

Suggestions implemented:
- Add WAL replay failure count summary with warning when files fail
- Add invalid_start_events counter to LifecycleTrackingStats
- Track and log invalid lifecycle start events

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

    pub invalid_start_events: u64,

    /// Average number of processes tracked per update
    pub avg_processes_tracked: f64,


    - **MonitorCollector trait**: Provides statistics and health check interface
    - **ProcessEvent**: Standard event format for process data

    **AgentCollectorConfig (New)**

    [...]

    ### 6. daemoneye-agent Enhancements Required

    **Collector Configuration Loading (New)**

    [...]

    - Spawn collectors in order defined in configuration file
    - Pass collector-specific configuration via environment variables or config files

    **Loading State Management (New)**

    [...]

    - Transition command: Broadcast "begin monitoring" to `control.collector.lifecycle` when entering steady state
    - Timeout: If collectors don't report ready within timeout (60s default), fail startup with error

    **Heartbeat Failure Detection (Enhanced)**

    [...]

    - Log all recovery actions for operator visibility
    - Emit alerts for repeated collector failures (e.g., 3+ restarts in 10 minutes)

    **Configuration Push (Enhanced)**


    clippy::clone_on_ref_ptr,
    clippy::as_conversions,
    clippy::redundant_clone,
    clippy::str_to_string

    [...]

    clippy::single_match_else,
    clippy::clone_on_ref_ptr,
    clippy::let_underscore_must_use,
    clippy::ignored_unit_patterns

The rustdoc was interpreting [pid] in `/proc/[pid]/stat` as a link. Escaped brackets with backslashes to prevent this. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Change get_clock_ticks_per_sec to const fn returning u64 directly instead of Option<u64> (missing_const_for_fn, unnecessary_wraps)
- Use safe string slicing with .get() to avoid potential panic on UTF-8 character boundaries (string_slice)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
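The difference between indexing and `.get()` is easy to demonstrate. In the sketch below, the helper names are hypothetical and only standard-library methods are used: `s[..n]` panics when `n` falls inside a multi-byte character, while `s.get(..n)` returns `None` and lets the caller decide what to do.

```rust
/// Returns the first `n` bytes of `s`, or None if `n` is out of range
/// or does not fall on a UTF-8 character boundary.
fn prefix_bytes(s: &str, n: usize) -> Option<&str> {
    s.get(..n)
}

/// Truncates to at most `max_bytes`, backing up to the nearest character
/// boundary instead of panicking or failing.
fn truncate_to_boundary(s: &str, max_bytes: usize) -> &str {
    let mut end = max_bytes.min(s.len());
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    s.get(..end).unwrap_or("")
}
```

For the string `héllo`, where `é` occupies bytes 1 and 2, a cut at byte 2 lands inside the character: `prefix_bytes` reports `None` and `truncate_to_boundary` falls back to `h`.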

    @@ -0,0 +1,1774 @@
    //! Write-Ahead Log (WAL) for crash recovery and event persistence.

Summary
- `replay_entries()` for recovering full WAL entries with metadata

Design
The WAL integration ensures crash-recoverable event delivery:
Test plan
🤖 Generated with Claude Code