Implement cryptographic integrity verification for executables using SHA-256 hashing

# Cryptographic Integrity Verification (Binary Hashing)

## Overview
Implement an integrity verification system that computes and validates cryptographic hashes of executables and other critical system binaries in order to detect tampering, verify authenticity against known-good hashes, and support forensic analysis.

**🔄 Collector Type**: This is implemented as a **Triggered Collector**: it is activated on demand when Monitor Collectors (such as procmond) detect suspicious processes or file changes that require hash analysis.

## Context & Motivation
Binary integrity verification is a cornerstone of endpoint security. By computing cryptographic hashes of executables, DaemonEye can:
- **Detect Malware**: Identify known malicious binaries through hash comparison
- **Verify Integrity**: Ensure system binaries haven't been tampered with
- **Support Forensics**: Provide cryptographic proof of file state for incident response
- **Enable Correlation**: Cross-reference hashes with threat intelligence databases

## Architecture Integration
### Triggered Collector Behavior
- **Event-Driven Execution**: Triggered by Monitor Collectors (procmond, filemond) when suspicious activity is detected
- **On-Demand Analysis**: Runs only when needed, minimizing system resource usage
- **Scalable Processing**: Supports concurrent hash operations with configurable limits
- **Result Correlation**: Returns enriched data to triggering collectors and event bus

### Trigger Scenarios
1. **Process-Based Triggering** (from procmond):
   - New process execution detected
   - Suspicious process behavior identified
   - Unknown or unsigned binary execution

2. **File-Based Triggering** (from filemond):
   - System binary modification detected
   - New executable files created
   - Critical directory changes

3. **Manual Triggering**:
   - Administrative hash verification requests
   - Incident response investigations
   - Scheduled integrity checks

## Technical Requirements

### Multi-Algorithm Hash Computing
- [ ] **Primary Algorithms**
  - **SHA-256**: Primary hash for integrity verification
  - **SHA-1**: Legacy compatibility and threat intelligence correlation
  - **MD5**: Legacy support for older threat intelligence sources

- [ ] **Advanced Algorithms** (Future Enhancement)
  - **SHA-3**: Next-generation cryptographic hashing
  - **BLAKE3**: High-performance cryptographic hashing
  - **Fuzzy Hashing** (ssdeep): Detect similar/modified binaries
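
As a rough illustration of how the primary algorithms might be modeled, the sketch below defines a `HashAlgorithm` selector with the hex digest length of each algorithm. The enum name matches the type referenced later in this issue, but its variants and the helper methods (`hex_len`, `is_collision_resistant`, `is_valid_hex`) are assumptions made for this example, not an existing collector-core API.

```rust
use std::str::FromStr;

/// Hypothetical algorithm selector; the variants mirror the primary algorithms above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum HashAlgorithm {
    Sha256,
    Sha1,
    Md5,
}

impl HashAlgorithm {
    /// Length of the hex-encoded digest in characters (two per digest byte).
    pub fn hex_len(self) -> usize {
        match self {
            HashAlgorithm::Sha256 => 64, // 32-byte digest
            HashAlgorithm::Sha1 => 40,   // 20-byte digest
            HashAlgorithm::Md5 => 32,    // 16-byte digest
        }
    }

    /// SHA-1 and MD5 have practical collision attacks, so only SHA-256
    /// is treated as suitable for integrity decisions in this sketch.
    pub fn is_collision_resistant(self) -> bool {
        matches!(self, HashAlgorithm::Sha256)
    }

    /// Returns true if `s` is a well-formed hex digest for this algorithm.
    pub fn is_valid_hex(self, s: &str) -> bool {
        s.len() == self.hex_len() && s.bytes().all(|b| b.is_ascii_hexdigit())
    }
}

impl FromStr for HashAlgorithm {
    type Err = String;

    /// Parses names such as "sha256", "SHA-256", "sha1", or "md5".
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "sha256" | "sha-256" => Ok(HashAlgorithm::Sha256),
            "sha1" | "sha-1" => Ok(HashAlgorithm::Sha1),
            "md5" => Ok(HashAlgorithm::Md5),
            other => Err(format!("unsupported hash algorithm: {other}")),
        }
    }
}
```

Validating the length and character set of an expected hash before any file I/O lets malformed verification requests be rejected cheaply.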

### Cross-Platform Implementation
- [ ] **File Access Handling**
  - Handle locked/in-use files gracefully
  - Respect file permissions and access controls
  - Support symbolic links and junction points
  - Handle large files efficiently with streaming

- [ ] **Performance Optimization**
  - Concurrent hash computation with worker pools
  - Memory-efficient streaming for large files
  - Caching frequently accessed hashes
  - Configurable resource limits
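
One possible shape for the hash cache is sketched below. It assumes entries are keyed by path, file size, and modification time, so a changed file simply misses the cache, and that each entry expires after a time-to-live. `HashCache`, `CacheKey`, and `CacheEntry` are hypothetical names used only for this illustration.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::time::{Duration, Instant, SystemTime};

/// Identifies one observed version of a file. A change in size or
/// modification time yields a different key, so stale entries never match.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct CacheKey {
    path: PathBuf,
    size: u64,
    modified: SystemTime,
}

struct CacheEntry {
    sha256: String,
    inserted_at: Instant,
}

/// Hypothetical in-memory cache with a per-entry time-to-live.
pub struct HashCache {
    ttl: Duration,
    entries: HashMap<CacheKey, CacheEntry>,
}

impl HashCache {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Returns the cached SHA-256 if an unexpired entry exists for this file version.
    pub fn get(&self, path: &Path, size: u64, modified: SystemTime) -> Option<&str> {
        let key = CacheKey { path: path.to_path_buf(), size, modified };
        self.entries
            .get(&key)
            .filter(|entry| entry.inserted_at.elapsed() < self.ttl)
            .map(|entry| entry.sha256.as_str())
    }

    /// Stores a freshly computed SHA-256 for this file version.
    pub fn insert(&mut self, path: &Path, size: u64, modified: SystemTime, sha256: String) {
        let key = CacheKey { path: path.to_path_buf(), size, modified };
        self.entries.insert(key, CacheEntry { sha256, inserted_at: Instant::now() });
    }

    /// Drops expired entries so the map does not grow without bound.
    pub fn purge_expired(&mut self) {
        let ttl = self.ttl;
        self.entries.retain(|_, entry| entry.inserted_at.elapsed() < ttl);
    }
}
```

Size and modification time can be reset by anyone with write access to the file, so a cache hit is a performance shortcut rather than proof of integrity. Critical-priority triggers may warrant bypassing the cache.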

### Integration Architecture
- [ ] **Triggered Collector Interface**
  - Implement TriggerableCollector trait from collector-core
  - Handle CollectionEvent::TriggerRequest events
  - Support trigger priority levels (Critical, High, Normal, Low)
  - Emit CollectionEvent::TriggerResponse with results

- [ ] **Event Processing**
  - Parse trigger events from Monitor Collectors
  - Validate file paths and access permissions
  - Queue hash operations with priority handling
  - Return enriched events with hash metadata
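
The priority handling described above could be sketched with a binary heap. The example assumes the four priority levels named in this issue and adds a sequence number so that jobs of equal priority are served in arrival order. `TriggerPriority`, `HashJob`, and `HashQueue` are illustrative names, not types from collector-core.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;
use std::path::PathBuf;

/// Priority levels from the requirements. Declaration order defines the
/// derived ordering, so `Critical` compares greatest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TriggerPriority {
    Low,
    Normal,
    High,
    Critical,
}

#[derive(Debug, PartialEq, Eq)]
pub struct HashJob {
    pub priority: TriggerPriority,
    pub sequence: u64,
    pub path: PathBuf,
}

impl Ord for HashJob {
    fn cmp(&self, other: &Self) -> Ordering {
        self.priority
            .cmp(&other.priority)
            // Reversed comparison: the older job (lower sequence) wins a tie.
            .then_with(|| other.sequence.cmp(&self.sequence))
            // Final tie-breaker keeps `Ord` consistent with the derived `Eq`.
            .then_with(|| self.path.cmp(&other.path))
    }
}

impl PartialOrd for HashJob {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Max-heap of pending hash jobs: highest priority first, FIFO within a priority.
#[derive(Default)]
pub struct HashQueue {
    heap: BinaryHeap<HashJob>,
    next_sequence: u64,
}

impl HashQueue {
    pub fn push(&mut self, priority: TriggerPriority, path: PathBuf) {
        let sequence = self.next_sequence;
        self.next_sequence += 1;
        self.heap.push(HashJob { priority, sequence, path });
    }

    pub fn pop(&mut self) -> Option<HashJob> {
        self.heap.pop()
    }
}
```

Without the sequence number, a binary heap gives no ordering guarantee among equal-priority jobs, so older Normal-priority requests could be served after newer ones.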

## Implementation Architecture

### Core Hash Computing Engine
```rust
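use std::path::{Path, PathBuf};
use std::time::{Duration, SystemTime};

use async_trait::async_trait;
use md5::Md5;
use serde::{Deserialize, Serialize};
use sha1::Sha1;
use sha2::{Digest, Sha256};

// `HashAlgorithm` and `HashError` are referenced below but not shown in this sketch.

/// Hashes and metadata captured for a single file.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct HashResult {
    pub file_path: PathBuf,
    pub file_size: u64,
    pub modified_time: SystemTime,
    pub sha256: String,
    pub sha1: String,
    pub md5: String,
    /// Wall-clock time spent hashing this file.
    pub computation_time: Duration,
}

#[async_trait]
pub trait HashComputer: Send + Sync {
    /// Computes all configured hashes for the file at `file_path`.
    async fn compute_hashes(&self, file_path: &Path) -> Result<HashResult, HashError>;
    /// Hashes the file with `algorithm` and compares the result to `expected_hash`.
    async fn verify_hash(&self, file_path: &Path, expected_hash: &str, algorithm: HashAlgorithm) -> Result<bool, HashError>;
    fn supported_algorithms(&self) -> Vec<HashAlgorithm>;
}

pub struct MultiAlgorithmHasher {
    worker_pool_size: usize,
    max_file_size: u64,
    timeout: Duration,
}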
```

### Triggered Collector Implementation
```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

use async_trait::async_trait;
use tokio::sync::mpsc;

pub struct BinaryHasherCollector {
    hasher: Box<dyn HashComputer>,
    config: HasherConfig,
    active_operations: Arc<AtomicUsize>,
}

/// Releases one concurrency slot when dropped, so the counter is decremented
/// on every exit path, including early returns through `?`.
struct OperationGuard(Arc<AtomicUsize>);

impl Drop for OperationGuard {
    fn drop(&mut self) {
        self.0.fetch_sub(1, Ordering::AcqRel);
    }
}

#[async_trait]
impl TriggerableCollector for BinaryHasherCollector {
    async fn handle_trigger(
        &self,
        trigger_event: CollectionEvent,
        response_tx: mpsc::Sender<CollectionEvent>,
    ) -> Result<()> {
        // Parse trigger event for file path
        let file_path = self.extract_file_path(&trigger_event)?;

        // Reserve a slot in one atomic step. A separate load followed by
        // fetch_add would let two tasks pass the limit check at the same time.
        let previous = self.active_operations.fetch_add(1, Ordering::AcqRel);
        let _guard = OperationGuard(Arc::clone(&self.active_operations));
        if previous >= self.config.max_concurrent {
            // `.into()` assumes the `Result` alias uses an error type that
            // implements `From<HashError>` (for example `anyhow::Error`).
            return Err(HashError::ResourceLimitExceeded.into());
        }

        // Compute hashes
        let hash_result = self.hasher.compute_hashes(&file_path).await?;

        // Create enriched response event
        let response = CollectionEvent::TriggerResponse {
            source_collector: "binary_hasher".to_string(),
            trigger_id: self.extract_trigger_id(&trigger_event)?,
            result: TriggerResult::Success,
            analysis_data: Some(serde_json::to_value(hash_result)?),
        };

        // Send response; `_guard` releases the slot when this function returns
        response_tx.send(response).await?;

        Ok(())
    }

    /// Only events that reference an executable are worth hashing.
    fn should_trigger(&self, event: &CollectionEvent) -> bool {
        match event {
            CollectionEvent::Process(proc_event) => proc_event.executable_path.is_some(),
            CollectionEvent::Filesystem(fs_event) => fs_event.is_executable(),
            _ => false,
        }
    }
}
```

### Configuration System
```rust
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct HasherConfig {
    pub algorithms: Vec<HashAlgorithm>,
    pub max_concurrent: usize,
    pub max_file_size: u64,           // Skip files larger than this
    pub timeout_per_file: Duration,   // Timeout for individual file hashing
    pub worker_pool_size: usize,
    pub cache_results: bool,
    pub cache_ttl: Duration,
    
    // Trigger configuration
    pub trigger_on_new_process: bool,
    pub trigger_on_file_change: bool,
    pub priority_paths: Vec<PathBuf>,  // High priority paths
    pub exclude_paths: Vec<PathBuf>,   // Paths to skip
}
```
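
To illustrate how `priority_paths` and `exclude_paths` might be applied, the sketch below classifies a candidate path before it is queued. It assumes exclusions take precedence over priority paths, which this issue does not specify. `PathDecision` and `classify_path` are hypothetical names.

```rust
use std::path::{Path, PathBuf};

/// Outcome of matching a path against the trigger configuration.
#[derive(Debug, PartialEq, Eq)]
pub enum PathDecision {
    Skip,
    HighPriority,
    Normal,
}

/// Classifies `path` using prefix lists shaped like `HasherConfig::exclude_paths`
/// and `HasherConfig::priority_paths`. Exclusions are checked first (an assumption).
pub fn classify_path(
    path: &Path,
    priority_paths: &[PathBuf],
    exclude_paths: &[PathBuf],
) -> PathDecision {
    // `Path::starts_with` compares whole components, so "/usr/binx"
    // does not match the prefix "/usr/bin".
    if exclude_paths.iter().any(|prefix| path.starts_with(prefix)) {
        return PathDecision::Skip;
    }
    if priority_paths.iter().any(|prefix| path.starts_with(prefix)) {
        return PathDecision::HighPriority;
    }
    PathDecision::Normal
}
```

`Path::starts_with` does not resolve `..` components or symbolic links, so paths should be canonicalized before classification to prevent an excluded or prioritized prefix from being bypassed.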

## Security Considerations

### File Access Security
- **Privilege Management**: Run with minimum required privileges for file access
- **Sandboxing**: Isolate hash computation in secure context
- **Access Logging**: Log all file access attempts for audit
- **Permission Validation**: Verify file access permissions before hashing
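
A minimal sketch of a pre-hash check follows, assuming validation is done on the opened file handle rather than on the path. Checking the handle means the file that was validated is the same file that gets hashed, even if the path is replaced in between. `open_for_hashing` and `PreflightError` are hypothetical names.

```rust
use std::fs::File;
use std::io;
use std::path::Path;

#[derive(Debug)]
pub enum PreflightError {
    /// The file could not be opened or inspected (missing, permission denied, locked).
    Io(io::Error),
    /// The path refers to a directory, device, or other non-regular file.
    NotRegularFile,
    /// The file exceeds the configured size limit.
    TooLarge { size: u64, limit: u64 },
}

/// Opens `path` read-only and validates it using metadata from the open handle.
/// Returns the handle and file size so the caller hashes exactly what was checked.
pub fn open_for_hashing(path: &Path, max_file_size: u64) -> Result<(File, u64), PreflightError> {
    // `File::open` follows symbolic links; permission errors surface here
    // as `PreflightError::Io` instead of being pre-checked on the path.
    let file = File::open(path).map_err(PreflightError::Io)?;
    let metadata = file.metadata().map_err(PreflightError::Io)?;

    if !metadata.is_file() {
        return Err(PreflightError::NotRegularFile);
    }
    let size = metadata.len();
    if size > max_file_size {
        return Err(PreflightError::TooLarge { size, limit: max_file_size });
    }
    Ok((file, size))
}
```

Attempting the open and handling the error is generally more reliable than testing permissions first, because the permission state can change between the test and the open.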

### Hash Integrity
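- **Secure Algorithms**: Base integrity decisions on SHA-256; SHA-1 and MD5 are vulnerable to collision attacks and are computed only for legacy threat intelligence correlation
- **Timing Attack Prevention**: Use constant-time comparison when checking a computed hash against an expected value
- **Hash Verification**: Self-check hash computations against known test vectors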
- **Result Authenticity**: Cryptographically sign hash results if required
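
As an illustration of the constant-time comparison idea, the sketch below compares two hex digests without exiting at the first differing byte. `constant_time_eq` and `hashes_match` are hypothetical helpers. An optimizing compiler is free to transform this loop, so production code would more likely rely on a vetted crate such as `subtle`.

```rust
/// Compares two byte strings by accumulating XOR differences over every byte,
/// so the loop does not stop at the first mismatch.
pub fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    // Digest lengths are fixed per algorithm and not secret,
    // so an early return on a length mismatch leaks nothing useful.
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

/// Case-insensitive comparison of a computed hex digest against an expected one.
/// Surrounding whitespace on the expected value is ignored.
pub fn hashes_match(computed_hex: &str, expected_hex: &str) -> bool {
    let computed = computed_hex.to_ascii_lowercase();
    let expected = expected_hex.trim().to_ascii_lowercase();
    constant_time_eq(computed.as_bytes(), expected.as_bytes())
}
```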

## Performance Requirements

### Computation Performance
- **Single File**: Complete hash computation for typical executables (<50MB) in under 500ms
- **Concurrent Operations**: Support up to 10 concurrent hash operations
- **Memory Usage**: Limit memory usage to <100MB during peak operations
- **CPU Impact**: Maintain CPU usage below 20% during active hashing
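
The memory target above depends on never loading a whole file into memory. The following sketch shows one way to read any source in fixed-size chunks. The `sink` closure stands in for the hash update step: in the collector it would presumably feed each chunk to every enabled hasher (for example through `Digest::update` from the `sha2` crate), so one pass over the file serves all algorithms. `stream_chunks` is a hypothetical helper name.

```rust
use std::io::{self, Read};

/// Reads `reader` to the end in chunks of at most `chunk_size` bytes and passes
/// each chunk to `sink`. Returns the total number of bytes read.
/// Peak memory is one buffer of `chunk_size` bytes, regardless of input size.
pub fn stream_chunks<R: Read>(
    mut reader: R,
    chunk_size: usize,
    mut sink: impl FnMut(&[u8]),
) -> io::Result<u64> {
    // Guard against a zero-length buffer, which would look like end of input.
    let mut buffer = vec![0u8; chunk_size.max(1)];
    let mut total = 0u64;
    loop {
        match reader.read(&mut buffer) {
            // A read of zero bytes signals end of input.
            Ok(0) => return Ok(total),
            Ok(n) => {
                sink(&buffer[..n]);
                total += n as u64;
            }
            // Retry reads interrupted by a signal instead of failing.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```

With a buffer of, say, 64 KiB per operation, ten concurrent operations need well under 1 MB of read buffers, which leaves most of the 100MB budget for queues and caches.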

### Resource Management
- **File Size Limits**: Configurable maximum file size for hashing
- **Timeout Handling**: Abort operations that exceed timeout limits
- **Queue Management**: Priority-based operation queuing
- **Backpressure**: Handle high trigger volumes without overwhelming system
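
One simple form of backpressure is to refuse new work once a fixed number of operations are in flight, leaving the caller to queue or drop the trigger. The sketch below assumes an atomic counter with a guard object that frees the slot on drop. `SlotLimiter` and `Slot` are hypothetical names, and an async implementation might use `tokio::sync::Semaphore` instead.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Limits the number of hash operations that may run at once.
#[derive(Clone)]
pub struct SlotLimiter {
    active: Arc<AtomicUsize>,
    max: usize,
}

/// Represents one reserved slot; dropping it frees the slot.
pub struct Slot {
    active: Arc<AtomicUsize>,
}

impl Drop for Slot {
    fn drop(&mut self) {
        self.active.fetch_sub(1, Ordering::AcqRel);
    }
}

impl SlotLimiter {
    pub fn new(max: usize) -> Self {
        Self { active: Arc::new(AtomicUsize::new(0)), max }
    }

    /// Reserves a slot if one is free. The check and the increment happen in a
    /// single atomic update, so concurrent callers cannot exceed `max`.
    pub fn try_acquire(&self) -> Option<Slot> {
        self.active
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |current| {
                if current < self.max {
                    Some(current + 1)
                } else {
                    None
                }
            })
            .ok()
            .map(|_| Slot { active: Arc::clone(&self.active) })
    }

    /// Number of slots currently held.
    pub fn active(&self) -> usize {
        self.active.load(Ordering::Acquire)
    }
}
```

A caller that receives `None` can leave the job in the priority queue and retry later, which keeps a burst of triggers from starting more work than the configured limit allows.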

## Integration Points

### Monitor Collector Integration
- **procmond Triggers**: Hash analysis for new/suspicious processes
- **filemond Triggers**: Hash verification for modified system files
- **netmond Triggers**: Hash analysis for downloaded executables

### Threat Intelligence Integration (Future)
- **Hash Lookups**: Query threat intelligence databases
- **Reputation Scoring**: Assign reputation scores to binaries
- **IOC Matching**: Match against known indicators of compromise
- **Alert Generation**: Generate alerts for known malicious hashes

## Testing Strategy

### Unit Testing
- Hash computation accuracy across all algorithms
- Error handling for inaccessible files
- Performance benchmarks for various file sizes
- Concurrent operation handling

### Integration Testing
- Trigger event handling from procmond
- Response event generation and delivery
- Resource limit enforcement
- Cross-platform file access handling

### Security Testing
- Privilege boundary verification
- File access permission enforcement
- Hash computation integrity validation
- Resource exhaustion resistance

## Acceptance Criteria

### Functional Requirements
- [ ] **Multi-Algorithm Support**
  - Compute SHA-256, SHA-1, and MD5 hashes accurately
  - Support configurable algorithm selection
  - Validate hash computation accuracy against known test vectors

- [ ] **Triggered Collector Behavior**
  - Successfully receive and process trigger events from Monitor Collectors
  - Handle trigger priorities appropriately (Critical > High > Normal > Low)
  - Generate proper trigger response events with hash results
  - Support concurrent trigger processing with resource limits

- [ ] **Cross-Platform Compatibility**
  - Handle file access on Linux, Windows, and macOS
  - Respect platform-specific file permissions
  - Handle symbolic links and junction points correctly
  - Process locked/in-use files gracefully

- [ ] **Performance Requirements**
  - Hash computation for <50MB files completes within 500ms
  - Support up to 10 concurrent hash operations
  - Memory usage remains below 100MB during peak operation
  - CPU usage stays below 20% during active hashing

### Integration Requirements
- [ ] **Event System Integration**
  - Successfully integrate with collector-core event bus
  - Handle CollectionEvent::TriggerRequest events correctly
  - Generate CollectionEvent::TriggerResponse events with results
  - Support event correlation and trigger ID tracking

- [ ] **Monitor Collector Coordination**
  - Respond to triggers from procmond for process analysis
  - Respond to triggers from filemond for file verification
  - Handle high-volume trigger scenarios without degradation
  - Provide timely responses to critical priority triggers

### Security Requirements
- [ ] **Access Control**
  - Run with minimum required file access privileges
  - Validate file permissions before access attempts
  - Log all file access attempts for security audit
  - Handle permission denied errors gracefully

- [ ] **Hash Integrity**
  - Use cryptographically secure hash algorithms
  - Verify hash computation accuracy
  - Protect against hash collision attacks by basing integrity decisions on SHA-256 rather than SHA-1 or MD5
  - Ensure reproducible hash results

## Dependencies
- **collector-core**: Enhanced framework with TriggerableCollector trait
- **Event Bus System**: Inter-collector communication infrastructure
- **SHA-2/SHA-1/MD5 crates**: Cryptographic hash implementations
- **tokio**: Async runtime for concurrent operations
- **serde**: Serialization for configuration and results

## Timeline
Target completion aligned with **v0.2.0 milestone** (Due: September 22, 2025)

## Related Issues
- **Two-Tier Collector Framework**: Core framework enhancement (to be created)
- **Event Bus System**: Inter-collector communication system (to be created)  
- **Issue #89**: procmond implementation (Monitor Collector - will trigger this collector)
- **Issue #103**: daemoneye-agent service architecture (service context)

---
*This issue implements binary hashing as a Triggered Collector that provides on-demand cryptographic analysis of executable files when triggered by Monitor Collectors, enabling efficient and scalable integrity verification within the DaemonEye security platform.*

Implement cryptographic integrity verification for executables using SHA-256 hashing #40
