Cryptographic Integrity Verification (Binary Hashing)
Overview
Implement a comprehensive cryptographic integrity verification system that computes and validates cryptographic hashes of executable files and other critical system binaries to detect tampering, verify authenticity, and support forensic analysis.
🔄 Collector Type: This is implemented as a Triggered Collector, activated on demand when Monitor Collectors (such as procmond) detect suspicious processes or file changes that require hash analysis.
Context & Motivation
Binary integrity verification is a cornerstone of endpoint security. By computing cryptographic hashes of executables, DaemonEye can:
- Detect Malware: Identify known malicious binaries through hash comparison
- Verify Integrity: Ensure system binaries haven't been tampered with
- Support Forensics: Provide cryptographic proof of file state for incident response
- Enable Correlation: Cross-reference hashes with threat intelligence databases
Architecture Integration
Triggered Collector Behavior
- Event-Driven Execution: Triggered by Monitor Collectors (procmond, filemond) when suspicious activity is detected
- On-Demand Analysis: Runs only when needed, minimizing system resource usage
- Scalable Processing: Supports concurrent hash operations with configurable limits
- Result Correlation: Returns enriched data to triggering collectors and event bus
Trigger Scenarios
- Process-Based Triggering (from procmond):
  - New process execution detected
  - Suspicious process behavior identified
  - Unknown or unsigned binary execution
- File-Based Triggering (from filemond):
  - System binary modification detected
  - New executable files created
  - Critical directory changes
- Manual Triggering:
  - Administrative hash verification requests
  - Incident response investigations
  - Scheduled integrity checks
Technical Requirements
Multi-Algorithm Hash Computing
- Primary Algorithms
  - SHA-256: Primary hash for integrity verification
  - SHA-1: Legacy compatibility and threat intelligence correlation
  - MD5: Legacy support for older threat intelligence sources
- Advanced Algorithms (Future Enhancement)
  - SHA-3: Next-generation cryptographic hashing
  - BLAKE3: High-performance cryptographic hashing
  - Fuzzy Hashing (ssdeep): Detect similar/modified binaries
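The code later in this issue refers to a `HashAlgorithm` type without defining it. A minimal sketch of what it could look like for the three primary algorithms follows. The variant names and helper methods are assumptions for illustration, not the project's actual API. The digest lengths are fixed by the algorithms themselves.

```rust
/// Hypothetical definition of the `HashAlgorithm` type referenced in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum HashAlgorithm {
    Sha256,
    Sha1,
    Md5,
}

impl HashAlgorithm {
    /// Digest length in bytes, as fixed by each algorithm.
    pub fn digest_len(self) -> usize {
        match self {
            HashAlgorithm::Sha256 => 32,
            HashAlgorithm::Sha1 => 20,
            HashAlgorithm::Md5 => 16,
        }
    }

    /// SHA-1 and MD5 have practical collision attacks, so this sketch treats
    /// them as lookup keys only, never as evidence of integrity.
    pub fn is_collision_resistant(self) -> bool {
        matches!(self, HashAlgorithm::Sha256)
    }

    /// Returns true if `hex` has the right length for this algorithm and
    /// contains only hexadecimal digits.
    pub fn is_well_formed_hex(self, hex: &str) -> bool {
        hex.len() == self.digest_len() * 2 && hex.bytes().all(|b| b.is_ascii_hexdigit())
    }
}
```

Checking the shape of an expected hash before comparing lets `verify_hash` reject a digest supplied for the wrong algorithm without reading the file.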
Cross-Platform Implementation
- File Access Handling
  - Handle locked/in-use files gracefully
  - Respect file permissions and access controls
  - Support symbolic links and junction points
  - Handle large files efficiently with streaming
- Performance Optimization
  - Concurrent hash computation with worker pools
  - Memory-efficient streaming for large files
  - Caching frequently accessed hashes
  - Configurable resource limits
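The streaming requirement above can be sketched as a chunked read loop that never holds more than one buffer in memory. This is an illustration only. The function name `stream_chunks` and the callback shape are assumptions, and the hashing itself is left to the caller so the sketch needs nothing beyond the standard library.

```rust
use std::io::{self, Read};

/// Reads `reader` in fixed-size chunks and passes each chunk to `update`.
/// Returns the total number of bytes read. Memory use is bounded by
/// `chunk_size` regardless of the size of the input.
pub fn stream_chunks<R: Read>(
    mut reader: R,
    chunk_size: usize,
    mut update: impl FnMut(&[u8]),
) -> io::Result<u64> {
    // Guard against a zero-length buffer, which would look like end of input.
    let mut buf = vec![0u8; chunk_size.max(1)];
    let mut total = 0u64;
    loop {
        match reader.read(&mut buf) {
            Ok(0) => return Ok(total), // end of input
            Ok(n) => {
                update(&buf[..n]);
                total += n as u64;
            }
            // A read interrupted by a signal is retried rather than reported.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```

With the RustCrypto crates imported later in this issue, the callback would presumably feed the same chunk to the SHA-256, SHA-1, and MD5 hashers through `Digest::update`, so each file is read once for all three digests.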
Integration Architecture
- Triggered Collector Interface
  - Implement TriggerableCollector trait from collector-core
  - Handle CollectionEvent::TriggerRequest events
  - Support trigger priority levels (Critical, High, Normal, Low)
  - Emit CollectionEvent::TriggerResponse with results
- Event Processing
  - Parse trigger events from Monitor Collectors
  - Validate file paths and access permissions
  - Queue hash operations with priority handling
  - Return enriched events with hash metadata
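As an illustration of the path validation step, the sketch below classifies a path against prefix lists like the `exclude_paths` and `priority_paths` fields of `HasherConfig`. The `PathDecision` enum and `classify_path` function are hypothetical names, and the rule that exclusions win over priorities is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical outcome of validating a triggered path.
#[derive(Debug, PartialEq, Eq)]
pub enum PathDecision {
    Skip,
    HighPriority,
    Normal,
}

/// Classifies `path` against prefix lists. Exclusions win over priorities.
/// `Path::starts_with` compares whole components, so "/usr/binx" does not
/// match the prefix "/usr/bin".
pub fn classify_path(
    path: &Path,
    exclude_paths: &[PathBuf],
    priority_paths: &[PathBuf],
) -> PathDecision {
    if exclude_paths.iter().any(|p| path.starts_with(p)) {
        PathDecision::Skip
    } else if priority_paths.iter().any(|p| path.starts_with(p)) {
        PathDecision::HighPriority
    } else {
        PathDecision::Normal
    }
}
```

The comparison is purely lexical. A symbolic link can point from an allowed directory into an excluded one, so a real implementation would likely resolve the path (for example with `std::fs::canonicalize`) before classifying it.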
Implementation Architecture
Core Hash Computing Engine
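use std::path::{Path, PathBuf};
use std::time::{Duration, SystemTime};
use async_trait::async_trait;
use md5::Md5; // `Md5` comes from the RustCrypto `md-5` crate, whose library name is `md5`
use serde::{Deserialize, Serialize};
use sha1::Sha1;
use sha2::{Digest, Sha256};
// `HashAlgorithm` and `HashError` are referenced below but not defined in this issue.
/// Hashes and file metadata captured for a single file.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct HashResult {
    pub file_path: PathBuf,
    pub file_size: u64,
    pub modified_time: SystemTime,
    pub sha256: String,
    pub sha1: String,
    pub md5: String,
    pub computation_time: Duration,
}
/// Abstraction over hash computation so collectors can swap implementations.
#[async_trait]
pub trait HashComputer: Send + Sync {
    async fn compute_hashes(&self, file_path: &Path) -> Result<HashResult, HashError>;
    async fn verify_hash(&self, file_path: &Path, expected_hash: &str, algorithm: HashAlgorithm) -> Result<bool, HashError>;
    fn supported_algorithms(&self) -> Vec<HashAlgorithm>;
}
pub struct MultiAlgorithmHasher {
    worker_pool_size: usize,
    max_file_size: u64,
    timeout: Duration,
}
Triggered Collector Implementation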
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use tokio::sync::mpsc; // assumed: the awaited `send` below implies tokio's channel
pub struct BinaryHasherCollector {
    hasher: Box<dyn HashComputer>,
    config: HasherConfig,
    active_operations: Arc<AtomicUsize>,
}
#[async_trait]
impl TriggerableCollector for BinaryHasherCollector {
    async fn handle_trigger(
        &self,
        trigger_event: CollectionEvent,
        response_tx: mpsc::Sender<CollectionEvent>,
    ) -> Result<()> {
        // Parse the trigger event first so a malformed event never holds a slot
        let file_path = self.extract_file_path(&trigger_event)?;
        let trigger_id = self.extract_trigger_id(&trigger_event)?;
        // Reserve a slot atomically; a separate load-then-add lets concurrent triggers exceed the limit
        if self.active_operations.fetch_add(1, Ordering::AcqRel) >= self.config.max_concurrent {
            self.active_operations.fetch_sub(1, Ordering::AcqRel);
            // Assumes the error type implements From<HashError>, as the `?` on compute_hashes already requires
            return Err(HashError::ResourceLimitExceeded.into());
        }
        // Compute hashes, releasing the slot before any error is propagated
        let hash_result = self.hasher.compute_hashes(&file_path).await;
        self.active_operations.fetch_sub(1, Ordering::AcqRel);
        let hash_result = hash_result?;
        // Create enriched response event
        let response = CollectionEvent::TriggerResponse {
            source_collector: "binary_hasher".to_string(),
            trigger_id,
            result: TriggerResult::Success,
            analysis_data: Some(serde_json::to_value(hash_result)?),
        };
        // Send response
        response_tx.send(response).await?;
        Ok(())
    }
    fn should_trigger(&self, event: &CollectionEvent) -> bool {
        match event {
            CollectionEvent::Process(proc_event) => proc_event.executable_path.is_some(),
            CollectionEvent::Filesystem(fs_event) => fs_event.is_executable(),
            _ => false,
        }
    }
}
Configuration System
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct HasherConfig {
    pub algorithms: Vec<HashAlgorithm>,
    pub max_concurrent: usize,
    pub max_file_size: u64,         // Skip files larger than this
    pub timeout_per_file: Duration, // Timeout for individual file hashing
    pub worker_pool_size: usize,
    pub cache_results: bool,
    pub cache_ttl: Duration,
    // Trigger configuration
    pub trigger_on_new_process: bool,
    pub trigger_on_file_change: bool,
    pub priority_paths: Vec<PathBuf>, // High priority paths
    pub exclude_paths: Vec<PathBuf>,  // Paths to skip
}
Security Considerations
File Access Security
- Privilege Management: Run with minimum required privileges for file access
- Sandboxing: Isolate hash computation in secure context
- Access Logging: Log all file access attempts for audit
- Permission Validation: Verify file access permissions before hashing
Hash Integrity
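- Secure Algorithms: Base integrity decisions on SHA-256; SHA-1 and MD5 have practical collision attacks and serve only as lookup keys for legacy threat intelligence
- Timing Attack Prevention: Use constant-time comparison when checking a computed hash against an expected value, where possible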
- Hash Verification: Implement self-verification of hash computations
- Result Authenticity: Cryptographically sign hash results if required
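One place where constant-time operations apply is the comparison inside `verify_hash`. The sketch below compares two hex digests without exiting at the first mismatching byte. The function name `digests_match` is an assumption. An optimizing compiler gives no formal constant-time guarantee for code like this, so production code would more likely rely on a vetted crate such as `subtle`.

```rust
/// Compares two hex digests without an early exit on the first mismatch.
/// ASCII case is ignored because hex digests circulate in both cases.
pub fn digests_match(computed: &str, expected: &str) -> bool {
    let (a, b) = (computed.as_bytes(), expected.as_bytes());
    if a.len() != b.len() {
        // Digest length is public information, so returning early here
        // does not reveal anything about the digest contents.
        return false;
    }
    // Accumulate differences across every byte instead of stopping early.
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x.to_ascii_lowercase() ^ y.to_ascii_lowercase();
    }
    diff == 0
}
```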
Performance Requirements
Computation Performance
- Single File: Complete hash computation for typical executables (<50MB) in under 500ms
- Concurrent Operations: Support up to 10 concurrent hash operations
- Memory Usage: Limit memory usage to <100MB during peak operations
- CPU Impact: Maintain CPU usage below 20% during active hashing
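The concurrency limit depends on the `active_operations` counter being decremented on every exit path, including errors. One way to make that automatic is an RAII guard, sketched below under the assumption that the counter is an `Arc<AtomicUsize>` as in `BinaryHasherCollector`. The `OperationGuard` type is hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Holds one slot in the active-operations counter and releases it on drop.
pub struct OperationGuard {
    counter: Arc<AtomicUsize>,
}

impl OperationGuard {
    /// Tries to reserve a slot. Returns `None` when `max_concurrent` slots
    /// are already taken. The check and the increment happen in one atomic
    /// update, so concurrent callers cannot overshoot the limit.
    pub fn try_acquire(counter: &Arc<AtomicUsize>, max_concurrent: usize) -> Option<Self> {
        counter
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| {
                if n < max_concurrent {
                    Some(n + 1)
                } else {
                    None
                }
            })
            .ok()
            .map(|_| OperationGuard {
                counter: Arc::clone(counter),
            })
    }
}

impl Drop for OperationGuard {
    fn drop(&mut self) {
        // Runs on normal return, on early return via `?`, and when the
        // owning future is dropped, so the slot is always released.
        self.counter.fetch_sub(1, Ordering::AcqRel);
    }
}
```

Holding the guard in a local variable for the duration of the hash computation is enough; no explicit decrement is needed afterwards.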
Resource Management
- File Size Limits: Configurable maximum file size for hashing
- Timeout Handling: Abort operations that exceed timeout limits
- Queue Management: Priority-based operation queuing
- Backpressure: Handle high trigger volumes without overwhelming system
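Priority-based queuing and backpressure can be combined in a bounded priority queue. The sketch below is one possible shape, not the project's design. `TriggerPriority` and `TriggerQueue` are hypothetical names, and rejecting new requests when the queue is full is only one of several reasonable policies.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::path::PathBuf;

/// Priority levels from this issue; later variants compare greater.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TriggerPriority {
    Low,
    Normal,
    High,
    Critical,
}

/// Bounded queue: highest priority first, FIFO within a priority level.
pub struct TriggerQueue {
    heap: BinaryHeap<(TriggerPriority, Reverse<u64>, PathBuf)>,
    next_seq: u64,
    capacity: usize,
}

impl TriggerQueue {
    pub fn new(capacity: usize) -> Self {
        Self {
            heap: BinaryHeap::new(),
            next_seq: 0,
            capacity,
        }
    }

    /// Enqueues a request, or hands the path back when the queue is full so
    /// the caller can apply backpressure (drop, retry later, or report upstream).
    pub fn push(&mut self, priority: TriggerPriority, path: PathBuf) -> Result<(), PathBuf> {
        if self.heap.len() >= self.capacity {
            return Err(path);
        }
        // `Reverse(seq)` makes older entries win ties within a priority level.
        self.heap.push((priority, Reverse(self.next_seq), path));
        self.next_seq += 1;
        Ok(())
    }

    /// Removes and returns the highest-priority, oldest request.
    pub fn pop(&mut self) -> Option<(TriggerPriority, PathBuf)> {
        self.heap.pop().map(|(priority, _, path)| (priority, path))
    }
}
```

In this sketch a full queue rejects even Critical requests. A variant that evicts the lowest-priority entry instead would favor critical triggers at the cost of silently dropping queued work.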
Integration Points
Monitor Collector Integration
- procmond Triggers: Hash analysis for new/suspicious processes
- filemond Triggers: Hash verification for modified system files
- netmond Triggers: Hash analysis for downloaded executables
Threat Intelligence Integration (Future)
- Hash Lookups: Query threat intelligence databases
- Reputation Scoring: Assign reputation scores to binaries
- IOC Matching: Match against known indicators of compromise
- Alert Generation: Generate alerts for known malicious hashes
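Since this integration is marked as future work, the following is only a rough sketch of what a hash lookup could reduce to: membership in a set of known-malicious SHA-256 digests. `KnownBadHashes` is a hypothetical in-memory stand-in for a threat intelligence source, not an existing API.

```rust
use std::collections::HashSet;

/// In-memory set of known-malicious SHA-256 digests (hypothetical stand-in
/// for a threat intelligence database).
pub struct KnownBadHashes {
    digests: HashSet<String>,
}

impl KnownBadHashes {
    /// Stores digests trimmed and lowercased so lookups are case-insensitive.
    pub fn new<I, S>(digests: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: AsRef<str>,
    {
        Self {
            digests: digests
                .into_iter()
                .map(|d| d.as_ref().trim().to_ascii_lowercase())
                .collect(),
        }
    }

    /// Returns true if the given hex digest is in the known-bad set.
    pub fn contains(&self, sha256_hex: &str) -> bool {
        self.digests.contains(&sha256_hex.trim().to_ascii_lowercase())
    }
}
```

Normalizing on both insert and lookup matters because feeds publish digests in mixed case and sometimes with surrounding whitespace.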
Testing Strategy
Unit Testing
- Hash computation accuracy across all algorithms
- Error handling for inaccessible files
- Performance benchmarks for various file sizes
- Concurrent operation handling
Integration Testing
- Trigger event handling from procmond
- Response event generation and delivery
- Resource limit enforcement
- Cross-platform file access handling
Security Testing
- Privilege boundary verification
- File access permission enforcement
- Hash computation integrity validation
- Resource exhaustion resistance
Acceptance Criteria
Functional Requirements
- Multi-Algorithm Support
  - Compute SHA-256, SHA-1, and MD5 hashes accurately
  - Support configurable algorithm selection
  - Validate hash computation accuracy against known test vectors
- Triggered Collector Behavior
  - Successfully receive and process trigger events from Monitor Collectors
  - Handle trigger priorities appropriately (Critical > High > Normal > Low)
  - Generate proper trigger response events with hash results
  - Support concurrent trigger processing with resource limits
- Cross-Platform Compatibility
  - Handle file access on Linux, Windows, and macOS
  - Respect platform-specific file permissions
  - Handle symbolic links and junction points correctly
  - Process locked/in-use files gracefully
- Performance Requirements
  - Hash computation for <50MB files completes within 500ms
  - Support up to 10 concurrent hash operations
  - Memory usage remains below 100MB during peak operation
  - CPU usage stays below 20% during active hashing
Integration Requirements
- Event System Integration
  - Successfully integrate with collector-core event bus
  - Handle CollectionEvent::TriggerRequest events correctly
  - Generate CollectionEvent::TriggerResponse events with results
  - Support event correlation and trigger ID tracking
- Monitor Collector Coordination
  - Respond to triggers from procmond for process analysis
  - Respond to triggers from filemond for file verification
  - Handle high-volume trigger scenarios without degradation
  - Provide timely responses to critical priority triggers
Security Requirements
- Access Control
  - Run with minimum required file access privileges
  - Validate file permissions before access attempts
  - Log all file access attempts for security audit
  - Handle permission denied errors gracefully
- Hash Integrity
  - Use cryptographically secure hash algorithms
  - Verify hash computation accuracy
  - Protect against hash collision attacks
  - Ensure reproducible hash results
Dependencies
- collector-core: Enhanced framework with TriggerableCollector trait
- Event Bus System: Inter-collector communication infrastructure
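- sha2, sha1, md-5 crates: Cryptographic hash implementations (the md5::Md5 import in the code above is provided by the md-5 crate)
- tokio: Async runtime for concurrent operations
- async-trait: Async methods in the HashComputer and TriggerableCollector traits
- serde, serde_json: Serialization for configuration and results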
Timeline
Target completion aligned with v0.2.0 milestone (Due: September 22, 2025)
Related Issues
- Two-Tier Collector Framework: Core framework enhancement (to be created)
- Event Bus System: Inter-collector communication system (to be created)
- Issue #89, "Implement Core Process Monitoring Daemon (procmond)": procmond implementation (Monitor Collector that will trigger this collector)
- Issue #103, "Epic: Cross-platform daemon/service mode for daemoneye-agent (Unix/macOS/Windows) with process supervision": daemoneye-agent service architecture (service context)
This issue implements binary hashing as a Triggered Collector that provides on-demand cryptographic analysis of executable files when triggered by Monitor Collectors, enabling efficient and scalable integrity verification within the DaemonEye security platform.