
Implement cryptographic integrity verification for executables using SHA-256 hashing #40

@unclesp1d3r

Cryptographic Integrity Verification (Binary Hashing)

Overview

Implement a comprehensive cryptographic integrity verification system that computes and validates cryptographic hashes of executable files and other critical system binaries to detect tampering, verify authenticity, and support forensic analysis.

🔄 Collector Type: This is implemented as a Triggered Collector, activated on demand when Monitor Collectors (such as procmond) detect suspicious processes or file changes that require hash analysis.

Context & Motivation

Binary integrity verification is a cornerstone of endpoint security. By computing cryptographic hashes of executables, DaemonEye can:

  • Detect Malware: Identify known malicious binaries through hash comparison
  • Verify Integrity: Ensure system binaries haven't been tampered with
  • Support Forensics: Provide cryptographic proof of file state for incident response
  • Enable Correlation: Cross-reference hashes with threat intelligence databases

Architecture Integration

Triggered Collector Behavior

  • Event-Driven Execution: Triggered by Monitor Collectors (procmond, filemond) when suspicious activity is detected
  • On-Demand Analysis: Runs only when needed, minimizing system resource usage
  • Scalable Processing: Supports concurrent hash operations with configurable limits
  • Result Correlation: Returns enriched data to triggering collectors and event bus

Trigger Scenarios

  1. Process-Based Triggering (from procmond):

    • New process execution detected
    • Suspicious process behavior identified
    • Unknown or unsigned binary execution
  2. File-Based Triggering (from filemond):

    • System binary modification detected
    • New executable files created
    • Critical directory changes
  3. Manual Triggering:

    • Administrative hash verification requests
    • Incident response investigations
    • Scheduled integrity checks

Technical Requirements

Multi-Algorithm Hash Computing

  • Primary Algorithms

    • SHA-256: Primary hash for integrity verification
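    • SHA-1: Legacy compatibility and threat intelligence correlation only; practical collision attacks make it unsuitable for integrity decisions
    • MD5: Legacy support for older threat intelligence sources; also collision-prone, so used for lookups only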
  • Advanced Algorithms (Future Enhancement)

    • SHA-3: Next-generation cryptographic hashing
    • BLAKE3: High-performance cryptographic hashing
    • Fuzzy Hashing (ssdeep): Detect similar/modified binaries

Cross-Platform Implementation

  • File Access Handling

    • Handle locked/in-use files gracefully
    • Respect file permissions and access controls
    • Support symbolic links and junction points
    • Handle large files efficiently with streaming
  • Performance Optimization

    • Concurrent hash computation with worker pools
    • Memory-efficient streaming for large files
    • Caching frequently accessed hashes
    • Configurable resource limits
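
The streaming requirement above could be met with a fixed-size read loop. The sketch below is a minimal illustration using only the standard library; `stream_chunks` and `CHUNK_SIZE` are hypothetical names, and the 64 KiB buffer is an assumed value rather than a project setting.

```rust
use std::io::{self, Read};

/// Chunk size for streaming reads (assumed value, not a DaemonEye setting).
const CHUNK_SIZE: usize = 64 * 1024;

/// Reads `reader` to the end in fixed-size chunks and passes each chunk to
/// `sink`. Returns the total number of bytes read. Memory use is bounded by
/// `CHUNK_SIZE` no matter how large the input is.
pub fn stream_chunks<R: Read>(mut reader: R, mut sink: impl FnMut(&[u8])) -> io::Result<u64> {
    let mut buffer = vec![0u8; CHUNK_SIZE];
    let mut total: u64 = 0;
    loop {
        match reader.read(&mut buffer) {
            // A zero-length read signals end of input.
            Ok(0) => return Ok(total),
            Ok(n) => {
                sink(&buffer[..n]);
                total += n as u64;
            }
            // A read interrupted by a signal is safe to retry.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```

In the hasher itself, the `sink` closure would presumably call `update` on each configured digest, so SHA-256, SHA-1, and MD5 are all computed in a single pass over the file.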

Integration Architecture

  • Triggered Collector Interface

    • Implement TriggerableCollector trait from collector-core
    • Handle CollectionEvent::TriggerRequest events
    • Support trigger priority levels (Critical, High, Normal, Low)
    • Emit CollectionEvent::TriggerResponse with results
  • Event Processing

    • Parse trigger events from Monitor Collectors
    • Validate file paths and access permissions
    • Queue hash operations with priority handling
    • Return enriched events with hash metadata
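
Priority handling for queued hash operations might look like the following sketch. `TriggerPriority`, `HashRequest`, and `HashQueue` are hypothetical names introduced for illustration; the real types would come from collector-core. A sequence number keeps requests of equal priority in arrival order.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;
use std::path::PathBuf;

/// Priority levels from this issue; declaration order gives
/// Low < Normal < High < Critical.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TriggerPriority {
    Low,
    Normal,
    High,
    Critical,
}

/// A queued hash request (hypothetical shape).
#[derive(Debug, PartialEq, Eq)]
pub struct HashRequest {
    pub priority: TriggerPriority,
    pub sequence: u64,
    pub path: PathBuf,
}

impl Ord for HashRequest {
    fn cmp(&self, other: &Self) -> Ordering {
        // Higher priority first; within a priority, the lower (older)
        // sequence number wins, so the sequence comparison is reversed.
        self.priority
            .cmp(&other.priority)
            .then_with(|| other.sequence.cmp(&self.sequence))
            .then_with(|| self.path.cmp(&other.path))
    }
}

impl PartialOrd for HashRequest {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Max-heap of pending requests ordered by priority, then arrival order.
#[derive(Default)]
pub struct HashQueue {
    heap: BinaryHeap<HashRequest>,
    next_sequence: u64,
}

impl HashQueue {
    pub fn push(&mut self, priority: TriggerPriority, path: impl Into<PathBuf>) {
        let sequence = self.next_sequence;
        self.next_sequence += 1;
        self.heap.push(HashRequest { priority, sequence, path: path.into() });
    }

    /// Removes and returns the most urgent request, if any.
    pub fn pop(&mut self) -> Option<HashRequest> {
        self.heap.pop()
    }
}
```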

Implementation Architecture

Core Hash Computing Engine

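use std::path::{Path, PathBuf};
use std::time::{Duration, SystemTime};

use async_trait::async_trait;
use md5::Md5; // provided by the `md-5` crate
use serde::{Deserialize, Serialize};
use sha1::Sha1;
use sha2::{Digest, Sha256};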

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct HashResult {
    pub file_path: PathBuf,
    pub file_size: u64,
    pub modified_time: SystemTime,
    pub sha256: String,
    pub sha1: String,
    pub md5: String,
    pub computation_time: Duration,
}

#[async_trait]
pub trait HashComputer: Send + Sync {
    async fn compute_hashes(&self, file_path: &Path) -> Result<HashResult, HashError>;
    async fn verify_hash(&self, file_path: &Path, expected_hash: &str, algorithm: HashAlgorithm) -> Result<bool, HashError>;
    fn supported_algorithms(&self) -> Vec<HashAlgorithm>;
}

pub struct MultiAlgorithmHasher {
    worker_pool_size: usize,
    max_file_size: u64,
    timeout: Duration,
}
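
The snippet above refers to `HashAlgorithm` and `HashError` without defining them. One possible shape is sketched below, as an assumption rather than the project's actual definitions: only `ResourceLimitExceeded` appears in this issue, while `UnsupportedAlgorithm`, `hex_len`, and `is_well_formed` are hypothetical additions. Serde derives are omitted to keep the sketch dependency-free.

```rust
use std::fmt;
use std::str::FromStr;

/// The three primary algorithms named in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum HashAlgorithm {
    Sha256,
    Sha1,
    Md5,
}

impl HashAlgorithm {
    /// Length of the hex-encoded digest (two hex characters per byte).
    pub fn hex_len(self) -> usize {
        match self {
            Self::Sha256 => 64,
            Self::Sha1 => 40,
            Self::Md5 => 32,
        }
    }

    /// Cheap sanity check on an expected hash before any file I/O happens.
    pub fn is_well_formed(self, candidate: &str) -> bool {
        candidate.len() == self.hex_len() && candidate.bytes().all(|b| b.is_ascii_hexdigit())
    }
}

impl FromStr for HashAlgorithm {
    type Err = HashError;

    /// Accepts names such as "sha256" or "SHA-256", case-insensitively.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().replace('-', "").as_str() {
            "sha256" => Ok(Self::Sha256),
            "sha1" => Ok(Self::Sha1),
            "md5" => Ok(Self::Md5),
            _ => Err(HashError::UnsupportedAlgorithm(s.to_string())),
        }
    }
}

/// Error type sketch; only `ResourceLimitExceeded` is named in this issue.
#[derive(Debug, PartialEq, Eq)]
pub enum HashError {
    ResourceLimitExceeded,
    UnsupportedAlgorithm(String),
}

impl fmt::Display for HashError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::ResourceLimitExceeded => write!(f, "concurrent hash operation limit reached"),
            Self::UnsupportedAlgorithm(name) => write!(f, "unsupported hash algorithm: {name}"),
        }
    }
}

impl std::error::Error for HashError {}
```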

Triggered Collector Implementation

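use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

use tokio::sync::mpsc; // assumed: the async `send().await` below implies tokio's channel

pub struct BinaryHasherCollector {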
    hasher: Box<dyn HashComputer>,
    config: HasherConfig,
    active_operations: Arc<AtomicUsize>,
}

#[async_trait]
impl TriggerableCollector for BinaryHasherCollector {
    async fn handle_trigger(
        &self,
        trigger_event: CollectionEvent,
        response_tx: mpsc::Sender<CollectionEvent>
    ) -> Result<()> {
        // Parse trigger event for file path
        let file_path = self.extract_file_path(&trigger_event)?;

        // Reserve a slot atomically: fetch_add returns the previous count, so
        // two concurrent callers cannot both pass a separate load-then-increment check
        let previous = self.active_operations.fetch_add(1, Ordering::AcqRel);
        if previous >= self.config.max_concurrent {
            self.active_operations.fetch_sub(1, Ordering::AcqRel);
            return Err(HashError::ResourceLimitExceeded.into());
        }

        // Compute hashes, releasing the slot before propagating any error so a
        // failed hash does not leak an active operation
        let hash_result = self.hasher.compute_hashes(&file_path).await;
        self.active_operations.fetch_sub(1, Ordering::AcqRel);
        let hash_result = hash_result?;

        // Create enriched response event
        let response = CollectionEvent::TriggerResponse {
            source_collector: "binary_hasher".to_string(),
            trigger_id: self.extract_trigger_id(&trigger_event)?,
            result: TriggerResult::Success,
            analysis_data: Some(serde_json::to_value(hash_result)?),
        };

        // Send response
        response_tx.send(response).await?;

        Ok(())
    }
    
    fn should_trigger(&self, event: &CollectionEvent) -> bool {
        match event {
            CollectionEvent::Process(proc_event) => proc_event.executable_path.is_some(),
            CollectionEvent::Filesystem(fs_event) => fs_event.is_executable(),
            _ => false,
        }
    }
}
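
One way to keep `active_operations` accurate on every exit path, including early returns and task cancellation, is an RAII guard that releases its slot on drop. The sketch below assumes a hypothetical `OperationSlot` type; it is not part of collector-core.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Holds one slot of the concurrency budget; the slot is released on drop.
pub struct OperationSlot {
    counter: Arc<AtomicUsize>,
}

impl OperationSlot {
    /// Tries to reserve a slot. The check and the increment happen in a
    /// single atomic update, so the limit cannot be exceeded under contention.
    pub fn try_acquire(counter: &Arc<AtomicUsize>, max_concurrent: usize) -> Option<Self> {
        counter
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |current| {
                (current < max_concurrent).then_some(current + 1)
            })
            .ok()
            .map(|_| Self { counter: Arc::clone(counter) })
    }
}

impl Drop for OperationSlot {
    fn drop(&mut self) {
        // Runs on normal return, on `?` early return, and on cancellation.
        self.counter.fetch_sub(1, Ordering::AcqRel);
    }
}
```

With this in place, `handle_trigger` would acquire a slot at the top, return `ResourceLimitExceeded` when `try_acquire` yields `None`, and need no explicit decrement.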

Configuration System

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct HasherConfig {
    pub algorithms: Vec<HashAlgorithm>,
    pub max_concurrent: usize,
    pub max_file_size: u64,           // Skip files larger than this
    pub timeout_per_file: Duration,   // Timeout for individual file hashing
    pub worker_pool_size: usize,
    pub cache_results: bool,
    pub cache_ttl: Duration,
    
    // Trigger configuration
    pub trigger_on_new_process: bool,
    pub trigger_on_file_change: bool,
    pub priority_paths: Vec<PathBuf>,  // High priority paths
    pub exclude_paths: Vec<PathBuf>,   // Paths to skip
}
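
The `priority_paths` and `exclude_paths` fields could be applied with a small classification step before a request is queued. In this sketch, `PathDecision` and `classify_path` are hypothetical names, and letting exclusions win over priorities is an assumed policy.

```rust
use std::path::{Path, PathBuf};

/// Outcome of checking a path against the configured lists.
#[derive(Debug, PartialEq, Eq)]
pub enum PathDecision {
    Skip,
    HighPriority,
    Normal,
}

/// Classifies `path` against the configured prefixes. `Path::starts_with`
/// compares whole components, so "/usr/bin" does not match "/usr/binaries".
pub fn classify_path(
    path: &Path,
    priority_paths: &[PathBuf],
    exclude_paths: &[PathBuf],
) -> PathDecision {
    // Assumed policy: an exclusion always overrides a priority entry.
    if exclude_paths.iter().any(|prefix| path.starts_with(prefix)) {
        return PathDecision::Skip;
    }
    if priority_paths.iter().any(|prefix| path.starts_with(prefix)) {
        PathDecision::HighPriority
    } else {
        PathDecision::Normal
    }
}
```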

Security Considerations

File Access Security

  • Privilege Management: Run with minimum required privileges for file access
  • Sandboxing: Isolate hash computation in secure context
  • Access Logging: Log all file access attempts for audit
  • Permission Validation: Verify file access permissions before hashing

Hash Integrity

  • Secure Algorithms: Use cryptographically secure hash functions
  • Timing Attack Prevention: Use constant-time comparison when checking a computed hash against an expected value
  • Hash Verification: Implement self-verification of hash computations
  • Result Authenticity: Cryptographically sign hash results if required
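
For the timing-attack item, a comparison that inspects every byte regardless of where the first mismatch occurs might look like this. `digests_match` is a hypothetical helper; this is a sketch only, since an optimizing compiler gives no hard constant-time guarantee, and a vetted crate would likely be preferable in production.

```rust
/// Compares two hex digests without returning early on the first mismatch.
/// The comparison is case-insensitive because hex digests are commonly
/// written in either case.
pub fn digests_match(computed_hex: &str, expected_hex: &str) -> bool {
    let a = computed_hex.as_bytes();
    let b = expected_hex.as_bytes();
    // Digest length is public information, so an early return here is fine.
    if a.len() != b.len() {
        return false;
    }
    // Accumulate differences across all bytes instead of short-circuiting.
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x.to_ascii_lowercase() ^ y.to_ascii_lowercase();
    }
    diff == 0
}
```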

Performance Requirements

Computation Performance

  • Single File: Complete hash computation for typical executables (<50MB) in under 500ms
  • Concurrent Operations: Support up to 10 concurrent hash operations
  • Memory Usage: Limit memory usage to <100MB during peak operations
  • CPU Impact: Maintain CPU usage below 20% during active hashing

Resource Management

  • File Size Limits: Configurable maximum file size for hashing
  • Timeout Handling: Abort operations that exceed timeout limits
  • Queue Management: Priority-based operation queuing
  • Backpressure: Handle high trigger volumes without overwhelming the system
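
Backpressure could be enforced with a bounded queue that rejects new triggers when full rather than buffering without limit. The sketch below uses the standard library's `sync_channel` to stay self-contained; `TriggerIntake` and `SubmitOutcome` are hypothetical names, and an async implementation would presumably use a bounded tokio channel in the same way.

```rust
use std::path::PathBuf;
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Result of offering a trigger to the bounded queue.
#[derive(Debug, PartialEq, Eq)]
pub enum SubmitOutcome {
    Queued,
    /// Queue full: the caller can drop, retry later, or report the overload.
    Rejected,
    /// The consuming worker has shut down.
    WorkerGone,
}

/// Front door for hash requests with a fixed-capacity queue behind it.
pub struct TriggerIntake {
    sender: SyncSender<PathBuf>,
}

impl TriggerIntake {
    /// Creates the intake plus the receiving end for the worker side.
    pub fn new(capacity: usize) -> (Self, Receiver<PathBuf>) {
        let (sender, receiver) = sync_channel(capacity);
        (Self { sender }, receiver)
    }

    /// Never blocks: a full queue yields `Rejected` immediately.
    pub fn submit(&self, path: PathBuf) -> SubmitOutcome {
        match self.sender.try_send(path) {
            Ok(()) => SubmitOutcome::Queued,
            Err(TrySendError::Full(_)) => SubmitOutcome::Rejected,
            Err(TrySendError::Disconnected(_)) => SubmitOutcome::WorkerGone,
        }
    }
}
```

Rejecting at the intake keeps memory bounded during a trigger storm; whether rejected low-priority requests are dropped or retried is a policy choice this issue leaves open.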

Integration Points

Monitor Collector Integration

  • procmond Triggers: Hash analysis for new/suspicious processes
  • filemond Triggers: Hash verification for modified system files
  • netmond Triggers: Hash analysis for downloaded executables

Threat Intelligence Integration (Future)

  • Hash Lookups: Query threat intelligence databases
  • Reputation Scoring: Assign reputation scores to binaries
  • IOC Matching: Match against known indicators of compromise
  • Alert Generation: Generate alerts for known malicious hashes

Testing Strategy

Unit Testing

  • Hash computation accuracy across all algorithms
  • Error handling for inaccessible files
  • Performance benchmarks for various file sizes
  • Concurrent operation handling

Integration Testing

  • Trigger event handling from procmond
  • Response event generation and delivery
  • Resource limit enforcement
  • Cross-platform file access handling

Security Testing

  • Privilege boundary verification
  • File access permission enforcement
  • Hash computation integrity validation
  • Resource exhaustion resistance

Acceptance Criteria

Functional Requirements

  • Multi-Algorithm Support

    • Compute SHA-256, SHA-1, and MD5 hashes accurately
    • Support configurable algorithm selection
    • Validate hash computation accuracy against known test vectors
  • Triggered Collector Behavior

    • Successfully receive and process trigger events from Monitor Collectors
    • Handle trigger priorities appropriately (Critical > High > Normal > Low)
    • Generate proper trigger response events with hash results
    • Support concurrent trigger processing with resource limits
  • Cross-Platform Compatibility

    • Handle file access on Linux, Windows, and macOS
    • Respect platform-specific file permissions
    • Handle symbolic links and junction points correctly
    • Process locked/in-use files gracefully
  • Performance Requirements

    • Hash computation for <50MB files completes within 500ms
    • Support up to 10 concurrent hash operations
    • Memory usage remains below 100MB during peak operation
    • CPU usage stays below 20% during active hashing

Integration Requirements

  • Event System Integration

    • Successfully integrate with collector-core event bus
    • Handle CollectionEvent::TriggerRequest events correctly
    • Generate CollectionEvent::TriggerResponse events with results
    • Support event correlation and trigger ID tracking
  • Monitor Collector Coordination

    • Respond to triggers from procmond for process analysis
    • Respond to triggers from filemond for file verification
    • Handle high-volume trigger scenarios without degradation
    • Provide timely responses to critical priority triggers

Security Requirements

  • Access Control

    • Run with minimum required file access privileges
    • Validate file permissions before access attempts
    • Log all file access attempts for security audit
    • Handle permission denied errors gracefully
  • Hash Integrity

    • Use cryptographically secure hash algorithms
    • Verify hash computation accuracy
    • Protect against hash collision attacks
    • Ensure reproducible hash results

Dependencies

  • collector-core: Enhanced framework with TriggerableCollector trait
  • Event Bus System: Inter-collector communication infrastructure
  • SHA-2/SHA-1/MD5 crates: Cryptographic hash implementations
  • tokio: Async runtime for concurrent operations
  • serde: Serialization for configuration and results

Timeline

Target completion aligned with v0.2.0 milestone (Due: September 22, 2025)

Related Issues


This issue implements binary hashing as a Triggered Collector that provides on-demand cryptographic analysis of executable files when triggered by Monitor Collectors, enabling efficient and scalable integrity verification within the DaemonEye security platform.

Labels

core-feature, crypto, enhancement, priority:high, process-monitoring, rust, security
