feat(procmond): Implement cross-platform process enumeration with enhanced metadata collection #39

@unclesp1d3r

Cross-Platform Process Enumeration for ProcMond

Overview

Implement a robust ProcessCollector system that provides cross-platform process enumeration capabilities for the DaemonEye monitoring system. This component will serve as a critical foundation for process monitoring and security analysis.

🏗️ Architecture Context: This functionality will be integrated into procmond, which runs as a managed collector under daemoneye-agent supervision (see Issue #89 for full procmond architecture and Issue #103 for service management details).

Technical Context

Platform-Specific Challenges

Process enumeration presents unique challenges across different operating systems:

Windows

  • API: Uses Windows API (CreateToolhelp32Snapshot, Process32First/Next, NtQuerySystemInformation)
  • Privileges: Different privilege requirements for accessing process details
  • UAC: User Account Control affects process visibility
  • Sessions: Must handle different user sessions and service processes
  • Performance: Snapshot-based enumeration can be resource-intensive

Linux

  • Filesystem: Relies on /proc filesystem parsing with varying permission models
  • Namespaces: Container and namespace isolation affects process visibility
  • Permissions: SELinux and AppArmor may restrict /proc access
  • Variations: Different kernel versions expose different /proc entries
  • Performance: Direct filesystem reads can be efficient but require parsing
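
A minimal sketch of the parsing step named in the last bullet, assuming the collector reads /proc/[pid]/stat directly. StatSummary and parse_stat_line are hypothetical names, not part of procmond or the procfs crate. The comm field can contain spaces and parentheses, so the sketch splits on the first '(' and the last ')' instead of on whitespace.

```rust
/// Fields taken from the start of a /proc/[pid]/stat line.
#[derive(Debug, PartialEq)]
struct StatSummary {
    pid: u32,
    comm: String,
    state: char,
    ppid: u32,
}

/// Parse the leading fields of a stat line: `pid (comm) state ppid ...`.
/// Returns None when the line does not have that shape.
fn parse_stat_line(line: &str) -> Option<StatSummary> {
    // comm is wrapped in parentheses and may itself contain ')' or spaces,
    // so locate the outermost pair.
    let open = line.find('(')?;
    let close = line.rfind(')')?;
    if close < open {
        return None;
    }

    let pid = line[..open].trim().parse().ok()?;
    let comm = line[open + 1..close].to_string();

    // Everything after the closing parenthesis is whitespace separated.
    let mut rest = line[close + 1..].split_whitespace();
    let state = rest.next()?.chars().next()?;
    let ppid = rest.next()?.parse().ok()?;

    Some(StatSummary { pid, comm, state, ppid })
}
```

A production parser would read many more fields; the procfs crate listed under Recommended Third-Party Crates already covers that, so this is only meant to show why naive whitespace splitting is unsafe.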

macOS

  • API: Uses BSD-style process APIs (sysctl, proc_listpids, proc_pidinfo)
  • SIP: System Integrity Protection restricts access to certain processes
  • Sandboxing: App sandboxing affects process enumeration capabilities
  • Architecture: Must support both Intel and Apple Silicon
  • Performance: Mach kernel interfaces provide efficient access

FreeBSD (Secondary Support)

  • Filesystem: procfs is deprecated and not mounted by default, so process data comes from sysctl (kern.proc) or libkvm/libprocstat rather than /proc
  • Permissions: Different permission models than Linux
  • Testing: Limited testing resources for secondary platform
  • Best Effort: Features may lag behind primary platforms

Proposed Solution

Platform Support Matrix

| Feature               | Windows    | Linux   | macOS      | FreeBSD    |
|-----------------------|------------|---------|------------|------------|
| Support Level         | Primary    | Primary | Primary    | Secondary  |
| Basic Enumeration     | ✅ Full    | ✅ Full | ✅ Full    | ✅ Basic   |
| Process Metadata      | ✅ Full    | ✅ Full | ✅ Full    | ⚠️ Limited |
| Parent Relationships  | ✅ Full    | ✅ Full | ✅ Full    | ✅ Basic   |
| Memory Usage          | ✅ Full    | ✅ Full | ✅ Full    | ⚠️ Limited |
| CPU Metrics           | ✅ Full    | ✅ Full | ✅ Full    | ⚠️ Limited |
| Network Connections   | ✅ Full    | ✅ Full | ⚠️ Limited | ❌ None    |
| File Descriptors      | ✅ Handles | ✅ Full | ⚠️ Limited | ❌ None    |
| Environment Variables | ✅ Full    | ✅ Full | ⚠️ Limited | ❌ None    |
| Security Context      | ✅ Full    | ✅ Full | ⚠️ Limited | ❌ None    |
| Performance Target    | <100ms     | <100ms  | <100ms     | <200ms     |
| Testing Coverage      | High       | High    | High       | Basic      |

Recommended Third-Party Crates

Primary Process Enumeration

  • sysinfo (^0.31) - Main cross-platform system information library
    • ✅ Excellent cross-platform support (Windows, Linux, macOS, FreeBSD)
    • ✅ Active maintenance with regular updates (last updated 2024)
    • ✅ Clean API with consistent behavior across platforms
    • ✅ Good performance characteristics for system monitoring
    • ✅ Handles platform-specific quirks automatically
    • ✅ Memory-safe implementation
    • 📊 ~10M downloads, well-tested in production

Platform-Specific Enhanced Metadata

Linux Enhancement
  • procfs (^0.16) - Linux-specific detailed process information
    • ✅ Provides detailed Linux process metadata beyond sysinfo
    • ✅ Direct /proc filesystem parsing for maximum detail
    • ✅ Network connections, file descriptors, memory maps
    • ✅ Namespace and cgroup information
    • ⚠️ Linux-only, requires conditional compilation
Windows Enhancement
  • windows (^0.58) - Official Microsoft Windows API bindings
    • ✅ Modern, safe Windows API bindings (replaces winapi)
    • ✅ Access to CreateToolhelp32Snapshot and Process APIs
    • ✅ Process tokens, security descriptors, handles
    • ✅ Performance counters and detailed metrics
    • ✅ Actively maintained by Microsoft
    • ⚠️ Windows-only, requires conditional compilation
macOS/BSD Enhancement
  • libc (^0.2) - Unix system calls
    • ✅ BSD sysctl interface access
    • ✅ proc_listpids and proc_pidinfo for macOS
    • ✅ Cross-platform Unix support
    • ⚠️ Unsafe API requires careful wrapping

Supporting Crates

  • tokio (^1.0) - Async runtime for non-blocking operations
  • thiserror (^1.0) - Structured error handling
  • tracing (^0.1) - Structured logging and diagnostics

Core Architecture

Data Structures

use std::path::PathBuf;
use std::time::{SystemTime, Duration};
use std::collections::HashMap;

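/// Configuration for the process collector
#[derive(Debug, Clone)]
pub struct CollectorConfig {
    /// Enable enhanced metadata collection (may require elevated privileges)
    pub enable_enhanced_metadata: bool,
    /// Maximum time to spend on a single collection cycle
    pub collection_timeout: Duration,
    /// Whether to collect network connection information
    pub collect_network_info: bool,
    /// Whether to collect file descriptor information
    pub collect_fd_info: bool,
    /// Whether to collect environment variables
    pub collect_env_vars: bool,
    /// Minimum privilege level required for collection
    pub required_privilege: PrivilegeLevel,
    /// Redaction and timeout policy (see Security Controls)
    pub security_policy: SecurityPolicy,
}

impl Default for CollectorConfig {
    /// Conservative defaults used by the tests below. The timeout value is a
    /// placeholder, not a decided requirement.
    fn default() -> Self {
        Self {
            enable_enhanced_metadata: false,
            collection_timeout: Duration::from_secs(5),
            collect_network_info: false,
            collect_fd_info: false,
            collect_env_vars: false,
            required_privilege: PrivilegeLevel::User,
            security_policy: SecurityPolicy::default(),
        }
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PrivilegeLevel {
    User,       // Normal user privileges
    Elevated,   // Elevated but not root/SYSTEM
    Admin,      // Full administrative privileges
}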

/// Primary process collector interface
pub struct ProcessCollector {
    system: sysinfo::System,
    config: CollectorConfig,
    last_refresh: Option<SystemTime>,
    
    #[cfg(target_os = "linux")]
    procfs_collector: Option<LinuxProcfsCollector>,
    
    #[cfg(target_os = "windows")]
    windows_collector: Option<WindowsProcessCollector>,
    
    #[cfg(target_os = "macos")]
    macos_collector: Option<MacOsProcessCollector>,
}

/// Complete process information
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct ProcessRecord {
    /// Process identifier
    pub pid: u32,
    /// Parent process identifier
    pub ppid: Option<u32>,
    /// Process name (executable name)
    pub name: String,
    /// Full executable path
    pub exe_path: Option<PathBuf>,
    /// Command line arguments
    pub cmd_line: Vec<String>,
    /// Process owner (username)
    pub owner: Option<String>,
    /// Process creation time
    pub start_time: Option<SystemTime>,
    /// Current process state
    pub state: ProcessState,
    /// Resource usage information
    pub resources: ResourceUsage,
    /// Enhanced metadata (when available)
    pub enhanced: Option<EnhancedMetadata>,
    /// Platform-specific metadata
    pub platform_specific: PlatformMetadata,
}

#[derive(Debug, Clone, Copy, serde::Serialize, serde::Deserialize)]
pub enum ProcessState {
    Running,
    Sleeping,
    Stopped,
    Zombie,
    Unknown,
}

#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct ResourceUsage {
    /// Virtual memory size in bytes
    pub virtual_memory: u64,
    /// Resident set size in bytes
    pub resident_memory: u64,
    /// CPU usage percentage (sysinfo reports this per core, so it can exceed 100 on multi-core systems)
    pub cpu_percent: f32,
    /// Number of threads
    pub thread_count: u32,
    /// Total CPU time used
    pub cpu_time: Duration,
}

#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct EnhancedMetadata {
    /// Network connections (Linux/Windows)
    pub network_connections: Vec<NetworkConnection>,
    /// Open file descriptors (Linux/Unix)
    pub open_files: Vec<FileDescriptor>,
    /// Environment variables (when accessible)
    pub environment: HashMap<String, String>,
    /// Security context information
    pub security_context: Option<SecurityContext>,
}

#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct NetworkConnection {
    pub protocol: Protocol,
    pub local_addr: std::net::SocketAddr,
    pub remote_addr: Option<std::net::SocketAddr>,
    pub state: ConnectionState,
}

#[derive(Debug, Clone, Copy, serde::Serialize, serde::Deserialize)]
pub enum Protocol {
    Tcp,
    Udp,
    Unix,
}

#[derive(Debug, Clone, Copy, serde::Serialize, serde::Deserialize)]
pub enum ConnectionState {
    Listen,
    Established,
    TimeWait,
    CloseWait,
    Closed,
}

#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct FileDescriptor {
    pub fd: i32,
    pub path: PathBuf,
    pub mode: FileMode,
}

#[derive(Debug, Clone, Copy, serde::Serialize, serde::Deserialize)]
pub enum FileMode {
    Read,
    Write,
    ReadWrite,
}

#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub enum PlatformMetadata {
    Windows(WindowsMetadata),
    Linux(LinuxMetadata),
    MacOs(MacOsMetadata),
    FreeBsd(FreeBsdMetadata),
}

#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct WindowsMetadata {
    pub session_id: u32,
    pub handle_count: u32,
    pub is_wow64: bool,
    pub integrity_level: Option<IntegrityLevel>,
}

#[derive(Debug, Clone, Copy, serde::Serialize, serde::Deserialize)]
pub enum IntegrityLevel {
    Low,
    Medium,
    High,
    System,
}

#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct LinuxMetadata {
    pub cgroup: Option<String>,
    pub namespace_pid: Option<u32>,
    pub oom_score: Option<i32>,
}

#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct MacOsMetadata {
    pub code_signing: Option<CodeSigningInfo>,
    pub sandbox_profile: Option<String>,
}

#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct CodeSigningInfo {
    pub is_signed: bool,
    pub team_id: Option<String>,
    pub signing_id: Option<String>,
}

#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct FreeBsdMetadata {
    pub jail_id: Option<u32>,
}

#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct SecurityContext {
    pub selinux_context: Option<String>,
    pub apparmor_profile: Option<String>,
    pub capabilities: Vec<String>,
}

/// Process tree structure
#[derive(Debug, Clone)]
pub struct ProcessTree {
    pub root: ProcessRecord,
    pub children: Vec<ProcessTree>,
}
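
The implementation below calls a map_process_state helper that this issue does not define. As a rough illustration of what collapsing platform states into ProcessState could look like, here is a sketch for the single-letter state codes Linux reports in /proc/[pid]/stat. The function name and the choice to fold D (uninterruptible sleep) and I (idle) into Sleeping are assumptions, and the enum is repeated locally only to keep the sketch self-contained.

```rust
/// Local copy of the ProcessState variants, without the serde derives.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ProcessState {
    Running,
    Sleeping,
    Stopped,
    Zombie,
    Unknown,
}

/// Map a Linux state code from /proc/[pid]/stat to the shared enum.
fn state_from_linux_code(code: char) -> ProcessState {
    match code {
        'R' => ProcessState::Running,
        // Interruptible sleep, uninterruptible disk sleep, idle kernel thread.
        'S' | 'D' | 'I' => ProcessState::Sleeping,
        // Stopped by a signal, or stopped by a tracer.
        'T' | 't' => ProcessState::Stopped,
        'Z' => ProcessState::Zombie,
        // 'X' (dead) and anything unrecognized.
        _ => ProcessState::Unknown,
    }
}
```

Windows and macOS would need their own mappings; whether uninterruptible sleep deserves a separate variant is a design question this sketch does not settle.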

Core Implementation

impl ProcessCollector {
    /// Create a new process collector with the given configuration
    pub fn new(config: CollectorConfig) -> Result<Self, ProcessError> {
        let system = sysinfo::System::new_all();
        // Read the flag before `config` is moved into the struct below.
        let enhanced = config.enable_enhanced_metadata;
        
        Ok(Self {
            system,
            config,
            last_refresh: None,
            #[cfg(target_os = "linux")]
            procfs_collector: if enhanced {
                Some(LinuxProcfsCollector::new()?)
            } else {
                None
            },
            #[cfg(target_os = "windows")]
            windows_collector: if enhanced {
                Some(WindowsProcessCollector::new()?)
            } else {
                None
            },
            #[cfg(target_os = "macos")]
            macos_collector: if enhanced {
                Some(MacOsProcessCollector::new()?)
            } else {
                None
            },
        })
    }
    
    /// Collect all processes on the system
    pub async fn collect_processes(&mut self) -> Result<Vec<ProcessRecord>, ProcessError> {
        // Refresh system information
        self.system.refresh_all();
        self.last_refresh = Some(SystemTime::now());
        
        let mut processes = Vec::new();
        
        for (pid, process) in self.system.processes() {
            match self.collect_single_process(*pid, process).await {
                Ok(record) => processes.push(record),
                Err(e) => {
                    tracing::warn!(pid = pid.as_u32(), error = %e, "Failed to collect process");
                    // Continue collecting other processes
                }
            }
        }
        
        Ok(processes)
    }
    
    /// Collect information about a specific process
    async fn collect_single_process(
        &self,
        pid: sysinfo::Pid,
        process: &sysinfo::Process,
    ) -> Result<ProcessRecord, ProcessError> {
        // Enhanced metadata is best effort: a failure here must not drop the record.
        let enhanced_info = if self.config.enable_enhanced_metadata {
            self.collect_enhanced_info(pid).await.ok()
        } else {
            None
        };
        
        Ok(ProcessRecord {
            pid: pid.as_u32(),
            ppid: process.parent().map(|p| p.as_u32()),
            // As of sysinfo 0.31, names and arguments are OsStr/OsString values.
            name: process.name().to_string_lossy().into_owned(),
            exe_path: process.exe().map(|p| p.to_path_buf()),
            cmd_line: process
                .cmd()
                .iter()
                .map(|arg| arg.to_string_lossy().into_owned())
                .collect(),
            owner: self.get_process_owner(process),
            start_time: Some(SystemTime::UNIX_EPOCH + Duration::from_secs(process.start_time())),
            state: self.map_process_state(process.status()),
            resources: ResourceUsage {
                virtual_memory: process.virtual_memory(),
                resident_memory: process.memory(),
                // Can exceed 100 on multi-core hosts.
                cpu_percent: process.cpu_usage(),
                // sysinfo only populates tasks() on Linux; other platforms report 0 here.
                thread_count: process.tasks().map_or(0, |tasks| tasks.len() as u32),
                // run_time() is wall-clock seconds since start, not CPU time.
                // Placeholder until a per-platform CPU time source is wired in.
                cpu_time: Duration::from_secs(process.run_time()),
            },
            enhanced: enhanced_info,
            platform_specific: self.collect_platform_specific(pid).await?,
        })
    }
    
    /// Build a process tree starting from a root process
    pub async fn collect_process_tree(&mut self, root_pid: u32) -> Result<ProcessTree, ProcessError> {
        // collect_processes() already refreshes the sysinfo snapshot.
        let processes = self.collect_processes().await?;
        self.build_tree(&processes, root_pid)
    }
    
    /// Monitor process changes in real-time (returns a stream)
    pub async fn monitor_process_changes(&mut self) -> Result<ProcessStream, ProcessError> {
        // Implementation would use tokio channels and spawn a monitoring task
        unimplemented!("Real-time monitoring to be implemented")
    }
    
    fn build_tree(&self, processes: &[ProcessRecord], root_pid: u32) -> Result<ProcessTree, ProcessError> {
        let root = processes.iter()
            .find(|p| p.pid == root_pid)
            .ok_or(ProcessError::ProcessNotFound { pid: root_pid })?
            .clone();
        
        let mut children = Vec::new();
        for process in processes {
            // Skip self-parented entries (for example PID 0 on some platforms),
            // which would otherwise recurse forever.
            if process.ppid == Some(root_pid) && process.pid != root_pid {
                children.push(self.build_tree(processes, process.pid)?);
            }
        }
        
        Ok(ProcessTree { root, children })
    }
}
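
build_tree above rescans the full process list at every level, which grows quadratically with the number of processes. One possible alternative, sketched here with hypothetical names (Node, build_tree_indexed) and bare (pid, ppid) pairs instead of ProcessRecord, is to index children by parent once and then walk the index. The visited set is an extra guard against parent cycles in inconsistent snapshots.

```rust
use std::collections::{HashMap, HashSet};

/// Simplified tree node; the real ProcessTree would hold a ProcessRecord.
#[derive(Debug, Clone, PartialEq)]
struct Node {
    pid: u32,
    children: Vec<Node>,
}

/// Build the subtree rooted at `root_pid` from (pid, ppid) pairs.
/// Returns None when `root_pid` is not in the snapshot.
fn build_tree_indexed(procs: &[(u32, Option<u32>)], root_pid: u32) -> Option<Node> {
    if !procs.iter().any(|(pid, _)| *pid == root_pid) {
        return None;
    }

    // One pass to group child PIDs under their parent.
    let mut by_parent: HashMap<u32, Vec<u32>> = HashMap::new();
    for &(pid, ppid) in procs {
        match ppid {
            // Ignore self-parented entries.
            Some(parent) if parent != pid => by_parent.entry(parent).or_default().push(pid),
            _ => {}
        }
    }

    let mut visited = HashSet::new();
    Some(build_node(root_pid, &by_parent, &mut visited))
}

fn build_node(pid: u32, by_parent: &HashMap<u32, Vec<u32>>, visited: &mut HashSet<u32>) -> Node {
    visited.insert(pid);
    let mut children = Vec::new();
    if let Some(kids) = by_parent.get(&pid) {
        for &kid in kids {
            // Skip PIDs already placed in the tree to break cycles.
            if !visited.contains(&kid) {
                children.push(build_node(kid, by_parent, visited));
            }
        }
    }
    Node { pid, children }
}
```

The recursion depth still equals the depth of the process tree, which is normally small; an explicit stack would remove even that limit.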

Error Handling Strategy

use thiserror::Error;

#[derive(Debug, Error)]
pub enum ProcessError {
    #[error("Permission denied accessing process {pid}: {source}")]
    PermissionDenied { 
        pid: u32, 
        #[source] source: std::io::Error 
    },
    
    #[error("Process {pid} not found or terminated")]
    ProcessNotFound { pid: u32 },
    
    #[error("Insufficient privileges: {required:?} required, have {current:?}")]
    InsufficientPrivileges {
        required: PrivilegeLevel,
        current: PrivilegeLevel,
    },
    
    #[error("Collection timeout after {timeout_ms}ms")]
    Timeout { timeout_ms: u64 },
    
    #[error("Platform-specific error: {message}")]
    PlatformError { message: String },
    
    #[error("Invalid process data: {details}")]
    InvalidData { details: String },
    
    #[error("System call failed: {syscall}")]
    SystemCallFailed {
        syscall: String,
        #[source] source: std::io::Error,
    },
}

impl ProcessError {
    /// Check if this error is recoverable
    pub fn is_recoverable(&self) -> bool {
        matches!(self, 
            ProcessError::Timeout { .. } | 
            ProcessError::ProcessNotFound { .. }
        )
    }
    
    /// Check if this error indicates a permission issue
    pub fn is_permission_error(&self) -> bool {
        matches!(self,
            ProcessError::PermissionDenied { .. } |
            ProcessError::InsufficientPrivileges { .. }
        )
    }
}
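
The platform code has to decide which ProcessError variant a failed system call becomes. A small sketch of that decision, assuming the failures surface as std::io::Error values; the Classified enum and classify_io_error are hypothetical stand-ins for the real variants, used so the example needs only the standard library.

```rust
use std::io;

/// Stand-in for the ProcessError variants an I/O failure can map to.
#[derive(Debug, PartialEq)]
enum Classified {
    PermissionDenied,
    ProcessNotFound,
    SystemCallFailed,
}

/// ESRCH ("no such process") is errno 3 on Linux, macOS, and FreeBSD.
fn is_esrch(err: &io::Error) -> bool {
    cfg!(unix) && err.raw_os_error() == Some(3)
}

/// Pick an error category for a failed read or system call.
fn classify_io_error(err: &io::Error) -> Classified {
    // A process that exits mid-collection often shows up as ESRCH,
    // which std does not map to ErrorKind::NotFound.
    if is_esrch(err) {
        return Classified::ProcessNotFound;
    }
    match err.kind() {
        io::ErrorKind::PermissionDenied => Classified::PermissionDenied,
        io::ErrorKind::NotFound => Classified::ProcessNotFound,
        _ => Classified::SystemCallFailed,
    }
}
```

Treating a vanished process as ProcessNotFound keeps it in the recoverable set defined by is_recoverable, so a collection cycle can skip it and continue.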

Platform-Specific Implementation Details

Linux Implementation

#[cfg(target_os = "linux")]
struct LinuxProcfsCollector {
    // Fields for caching and optimization
}

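#[cfg(target_os = "linux")]
impl LinuxProcfsCollector {
    /// Convert procfs errors into the collector's error type.
    fn procfs_error(err: procfs::ProcError) -> ProcessError {
        ProcessError::PlatformError { message: err.to_string() }
    }

    fn collect_network_connections(&self, pid: u32) -> Result<Vec<NetworkConnection>, ProcessError> {
        use procfs::process::{FDTarget, Process};
        use std::collections::HashSet;

        let process = Process::new(pid as i32).map_err(Self::procfs_error)?;

        // /proc/[pid]/net/tcp and /proc/[pid]/net/udp list every socket in the
        // process's network namespace, so keep only entries whose inode matches
        // a socket descriptor this process actually holds.
        let socket_inodes: HashSet<u64> = process
            .fd()
            .map_err(Self::procfs_error)?
            .flatten()
            .filter_map(|fd| match fd.target {
                FDTarget::Socket(inode) => Some(inode),
                _ => None,
            })
            .collect();

        let mut connections = Vec::new();

        // TCP connections (IPv4 only; tcp6() would be handled the same way)
        if let Ok(tcp) = process.tcp() {
            for entry in tcp.into_iter().filter(|e| socket_inodes.contains(&e.inode)) {
                connections.push(NetworkConnection {
                    protocol: Protocol::Tcp,
                    local_addr: entry.local_address,
                    remote_addr: Some(entry.remote_address),
                    state: map_tcp_state(entry.state), // helper still to be defined
                });
            }
        }

        // UDP is connectionless, so Established is only a placeholder state
        if let Ok(udp) = process.udp() {
            for entry in udp.into_iter().filter(|e| socket_inodes.contains(&e.inode)) {
                connections.push(NetworkConnection {
                    protocol: Protocol::Udp,
                    local_addr: entry.local_address,
                    remote_addr: None,
                    state: ConnectionState::Established,
                });
            }
        }

        Ok(connections)
    }

    fn collect_file_descriptors(&self, pid: u32) -> Result<Vec<FileDescriptor>, ProcessError> {
        use procfs::process::{FDTarget, Process};
        let process = Process::new(pid as i32).map_err(Self::procfs_error)?;

        let mut fds = Vec::new();
        if let Ok(fd_iter) = process.fd() {
            for fd_entry in fd_iter.flatten() {
                // Only path-backed descriptors fit FileDescriptor; sockets,
                // pipes, and anonymous inodes are skipped.
                if let FDTarget::Path(path) = fd_entry.target {
                    fds.push(FileDescriptor {
                        fd: fd_entry.fd,
                        path,
                        mode: FileMode::ReadWrite, // Simplified
                    });
                }
            }
        }

        Ok(fds)
    }
}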

Windows Implementation

#[cfg(target_os = "windows")]
struct WindowsProcessCollector {
    // Fields for Windows API handles
}

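#[cfg(target_os = "windows")]
impl WindowsProcessCollector {
    fn collect_windows_metadata(&self, pid: u32) -> Result<WindowsMetadata, ProcessError> {
        use windows::Win32::Foundation::{CloseHandle, BOOL};
        use windows::Win32::System::RemoteDesktop::ProcessIdToSessionId;
        use windows::Win32::System::Threading::{
            GetProcessHandleCount, IsWow64Process, OpenProcess, PROCESS_QUERY_LIMITED_INFORMATION,
        };

        let to_error = |e: windows::core::Error| ProcessError::PlatformError {
            message: e.to_string(),
        };

        unsafe {
            // The limited query right is enough for the calls below and is
            // granted for more processes than PROCESS_QUERY_INFORMATION.
            let handle = OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION, false, pid)
                .map_err(to_error)?;

            let mut session_id = 0u32;
            let mut handle_count = 0u32;
            let mut is_wow64 = BOOL(0);

            // Run every query before closing so the handle is released on all paths.
            let queries = ProcessIdToSessionId(pid, &mut session_id)
                .and_then(|_| GetProcessHandleCount(handle, &mut handle_count))
                .and_then(|_| IsWow64Process(handle, &mut is_wow64));
            let _ = CloseHandle(handle);
            queries.map_err(to_error)?;

            Ok(WindowsMetadata {
                session_id,
                handle_count,
                is_wow64: is_wow64.as_bool(),
                integrity_level: None, // Would require additional API calls
            })
        }
    }
}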

macOS Implementation

#[cfg(target_os = "macos")]
struct MacOsProcessCollector {
    // Fields for BSD sysctl
}

#[cfg(target_os = "macos")]
impl MacOsProcessCollector {
    fn collect_macos_metadata(&self, pid: u32) -> Result<MacOsMetadata, ProcessError> {
        // Use sysctl to get process information
        let code_signing = self.check_code_signing(pid)?;
        let sandbox_profile = self.get_sandbox_profile(pid).ok();
        
        Ok(MacOsMetadata {
            code_signing: Some(code_signing),
            sandbox_profile,
        })
    }
    
    fn check_code_signing(&self, pid: u32) -> Result<CodeSigningInfo, ProcessError> {
        // Implementation would use Security framework APIs
        unimplemented!("Code signing check to be implemented")
    }
}

Integration with DaemonEye Architecture

Communication Protocol

The ProcessCollector integrates with daemoneye-agent through IPC:

/// Message types for process collector IPC
#[derive(Debug, serde::Serialize, serde::Deserialize)]
pub enum ProcessCollectorMessage {
    /// Request to enumerate all processes
    EnumerateRequest {
        include_enhanced: bool,
    },
    /// Response with process list
    EnumerateResponse {
        processes: Vec<ProcessRecord>,
        collection_time_ms: u64,
    },
    /// Request for specific process details
    ProcessDetailsRequest {
        pid: u32,
    },
    /// Response with detailed process information
    ProcessDetailsResponse {
        process: ProcessRecord,
    },
    /// Request to start monitoring process changes
    StartMonitoring {
        filter: ProcessFilter,
    },
    /// Notification of process change
    ProcessChangeNotification {
        change_type: ProcessChangeType,
        process: ProcessRecord,
    },
    /// Error occurred during collection
    Error {
        error: String,
    },
}

#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub enum ProcessChangeType {
    Created,
    Terminated,
    Modified,
}

#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct ProcessFilter {
    pub name_pattern: Option<String>,
    pub min_cpu_percent: Option<f32>,
    pub min_memory_mb: Option<u64>,
}
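
The issue does not say how ProcessFilter is evaluated. One plausible reading, sketched below, is that every populated field must match and an empty filter matches everything. Treating name_pattern as a plain substring (rather than a glob or regex) is an assumption, and ProcessSample and filter_matches are hypothetical names standing in for the relevant ProcessRecord fields.

```rust
/// Local copy of ProcessFilter, without the serde derives.
#[derive(Debug, Clone, Default)]
struct ProcessFilter {
    name_pattern: Option<String>,
    min_cpu_percent: Option<f32>,
    min_memory_mb: Option<u64>,
}

/// The subset of a process record that the filter inspects.
struct ProcessSample<'a> {
    name: &'a str,
    cpu_percent: f32,
    resident_memory: u64, // bytes
}

/// True when the sample satisfies every criterion set on the filter.
fn filter_matches(filter: &ProcessFilter, sample: &ProcessSample) -> bool {
    let name_ok = filter
        .name_pattern
        .as_deref()
        .map_or(true, |pattern| sample.name.contains(pattern));
    let cpu_ok = filter
        .min_cpu_percent
        .map_or(true, |min| sample.cpu_percent >= min);
    // The filter is expressed in MiB while records carry bytes.
    let mem_ok = filter
        .min_memory_mb
        .map_or(true, |min| sample.resident_memory >= min.saturating_mul(1024 * 1024));
    name_ok && cpu_ok && mem_ok
}
```

Applying the filter inside procmond before sending ProcessChangeNotification messages would keep unmatched records off the IPC channel entirely.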

Service Integration

/// Integration with daemoneye-agent service manager
pub struct ProcmondService {
    collector: ProcessCollector,
    ipc_channel: IpcChannel,
    config: ProcmondConfig,
}

impl ProcmondService {
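    pub async fn run(&mut self) -> Result<(), ServiceError> {
        // A persistent interval keeps the schedule steady. A fresh `sleep` inside
        // `select!` would restart on every IPC message and could starve the
        // scheduled collection under steady traffic. The first tick fires immediately.
        let mut ticker = tokio::time::interval(self.config.collection_interval);
        loop {
            tokio::select! {
                msg = self.ipc_channel.recv() => {
                    self.handle_message(msg?).await?;
                }
                _ = ticker.tick() => {
                    self.perform_scheduled_collection().await?;
                }
            }
        }
    }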
    
    async fn handle_message(&mut self, msg: ProcessCollectorMessage) -> Result<(), ServiceError> {
        match msg {
            ProcessCollectorMessage::EnumerateRequest { include_enhanced } => {
                let start = std::time::Instant::now();
                let processes = self.collector.collect_processes().await?;
                let elapsed = start.elapsed().as_millis() as u64;
                
                self.ipc_channel.send(ProcessCollectorMessage::EnumerateResponse {
                    processes,
                    collection_time_ms: elapsed,
                }).await?;
            }
            // Handle other message types...
            _ => {}
        }
        Ok(())
    }
}

Performance Requirements

Detailed Performance Targets

| Metric                   | Primary Platforms | FreeBSD | Measurement Method   |
|--------------------------|-------------------|---------|----------------------|
| Initial Full Enumeration |                   |         |                      |
| - 100 processes          | <20ms             | <40ms   | Criterion benchmark  |
| - 1,000 processes        | <100ms            | <200ms  | Criterion benchmark  |
| - 10,000 processes       | <500ms            | <1000ms | Criterion benchmark  |
| Incremental Updates      |                   |         |                      |
| - Changed processes      | <50ms             | <100ms  | Real-world testing   |
| - Single process query   | <5ms              | <10ms   | Unit test timing     |
| Memory Usage             |                   |         |                      |
| - Base collector         | <10MB             | <15MB   | Valgrind/Instruments |
| - Per 1000 processes     | <5MB              | <7MB    | Memory profiling     |
| - Enhanced metadata      | +10MB             | +15MB   | Memory profiling     |
| CPU Overhead             |                   |         |                      |
| - Idle monitoring        | <1%               | <2%     | System monitoring    |
| - Active collection      | <5%               | <10%    | System monitoring    |
| - Peak during enum       | <20%              | <30%    | System monitoring    |

Performance Optimization Strategies

  1. Caching

    • Cache process metadata that changes infrequently (executable path, start time)
    • Implement smart refresh based on process state
    • Use bloom filters for quick process existence checks
  2. Batch Operations

    • Windows: Use CreateToolhelp32Snapshot for bulk enumeration
    • Linux: Read multiple /proc entries in parallel
    • macOS: Use batch sysctl queries where possible
  3. Selective Collection

    • Support filtering by process name, owner, or resource usage
    • Lazy loading of enhanced metadata
    • Configurable collection depth
  4. Async Processing

    • Non-blocking I/O for file system operations
    • Parallel collection of platform-specific metadata
    • Streaming results to avoid memory spikes
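
As an illustration of the caching strategy, the sketch below caches metadata that does not change over a process's lifetime. The names (MetadataCache, StaticMetadata, get_or_load, retain_live) are hypothetical. Entries are keyed by PID and checked against the start time, on the assumption that a different start time for the same PID means the PID was reused by a new process.

```rust
use std::collections::{HashMap, HashSet};
use std::path::PathBuf;

/// Metadata that stays fixed for the lifetime of a process.
#[derive(Debug, Clone, PartialEq)]
struct StaticMetadata {
    exe_path: Option<PathBuf>,
    start_time_secs: u64,
}

#[derive(Default)]
struct MetadataCache {
    entries: HashMap<u32, StaticMetadata>,
}

impl MetadataCache {
    /// Return cached metadata, calling `load_exe` only when the PID is new
    /// or its start time differs from the cached one (PID reuse).
    fn get_or_load<F>(&mut self, pid: u32, start_time_secs: u64, load_exe: F) -> &StaticMetadata
    where
        F: FnOnce() -> Option<PathBuf>,
    {
        let fresh = self
            .entries
            .get(&pid)
            .map_or(false, |meta| meta.start_time_secs == start_time_secs);
        if !fresh {
            let meta = StaticMetadata { exe_path: load_exe(), start_time_secs };
            self.entries.insert(pid, meta);
        }
        &self.entries[&pid]
    }

    /// Drop entries for processes missing from the latest enumeration.
    fn retain_live(&mut self, live_pids: &[u32]) {
        let live: HashSet<u32> = live_pids.iter().copied().collect();
        self.entries.retain(|pid, _| live.contains(pid));
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Calling retain_live after each full enumeration bounds the cache to the number of live processes, which matters for the per-1000-processes memory target above.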

Security Considerations

Threat Model

  1. Information Disclosure

    • Process enumeration exposes running applications
    • Command-line arguments may contain sensitive data
    • Environment variables may contain secrets
  2. Privilege Escalation

    • Collector must not allow unauthorized privilege gain
    • Must validate all inputs from IPC
    • Must sanitize data before transmission
  3. Denial of Service

    • Malicious processes with large metadata
    • Rapid process creation/termination
    • Resource exhaustion attacks
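
One way to blunt the "large metadata" case is to cap what the collector keeps per process. The sketch below bounds both the number of command-line arguments and the size of each one. The function name and the idea of passing the limits as parameters are assumptions; the issue does not define concrete limits.

```rust
/// Keep at most `max_args` arguments, each truncated to at most
/// `max_arg_bytes` bytes of original text plus a "..." marker.
fn clamp_cmdline(args: &[String], max_args: usize, max_arg_bytes: usize) -> Vec<String> {
    args.iter()
        .take(max_args)
        .map(|arg| {
            if arg.len() <= max_arg_bytes {
                return arg.clone();
            }
            // Back up to a UTF-8 character boundary so slicing cannot panic.
            let mut end = max_arg_bytes;
            while !arg.is_char_boundary(end) {
                end -= 1;
            }
            format!("{}...", &arg[..end])
        })
        .collect()
}
```

The same bound could apply to environment variables and file descriptor lists, so that a single hostile process cannot inflate an EnumerateResponse.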

Security Controls

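/// Security controls for process collection
/// (the derived Default leaves both timeouts at zero; real defaults are still to be chosen)
#[derive(Debug, Clone, Default)]
pub struct SecurityPolicy {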
    /// Sanitize command-line arguments
    pub sanitize_cmdline: bool,
    /// Redact environment variables matching patterns
    pub env_var_redaction_patterns: Vec<regex::Regex>,
    /// Maximum time to spend collecting single process
    pub per_process_timeout: Duration,
    /// Maximum total collection time
    pub total_collection_timeout: Duration,
    /// Require privilege verification
    pub verify_privileges: bool,
}

impl ProcessCollector {
    fn sanitize_command_line(&self, cmd: &[String]) -> Vec<String> {
        if !self.config.security_policy.sanitize_cmdline {
            return cmd.to_vec();
        }
        
        cmd.iter().map(|arg| {
            // Redact potential passwords, tokens, keys
            if arg.contains("password=") || arg.contains("token=") || arg.contains("key=") {
                String::from("[REDACTED]")
            } else {
                arg.clone()
            }
        }).collect()
    }
    
    fn sanitize_environment(&self, env: &HashMap<String, String>) -> HashMap<String, String> {
        let patterns = &self.config.security_policy.env_var_redaction_patterns;
        
        env.iter().map(|(k, v)| {
            let should_redact = patterns.iter().any(|re| re.is_match(k));
            let value = if should_redact {
                String::from("[REDACTED]")
            } else {
                v.clone()
            };
            (k.clone(), value)
        }).collect()
    }
}
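
sanitize_command_line above matches substrings, so it also fires on names such as "monkey=", misses the two-argument form ("--password secret"), and discards the flag name. A possible refinement is sketched below under stated assumptions: the key list, is_sensitive, and redact_args are hypothetical, and the output format differs from the "[REDACTED]" whole-argument replacement that the unit test further down expects.

```rust
/// Flag names treated as secrets (illustrative list, not from the issue).
const SENSITIVE_KEYS: [&str; 4] = ["password", "passwd", "token", "secret"];

/// True for flags such as `--password`, `--api-token`, or `db_secret`.
fn is_sensitive(name: &str) -> bool {
    let name = name.trim_start_matches('-').to_ascii_lowercase();
    SENSITIVE_KEYS.iter().any(|key| {
        name == *key
            || name.ends_with(&format!("-{}", key))
            || name.ends_with(&format!("_{}", key))
    })
}

/// Redact secret values while keeping flag names visible.
fn redact_args(args: &[String]) -> Vec<String> {
    let mut out = Vec::with_capacity(args.len());
    let mut redact_next = false;
    for arg in args {
        if redact_next {
            // Value belonging to a preceding sensitive flag.
            out.push("[REDACTED]".to_string());
            redact_next = false;
            continue;
        }
        match arg.split_once('=') {
            Some((name, _)) if is_sensitive(name) => out.push(format!("{}=[REDACTED]", name)),
            Some(_) => out.push(arg.clone()),
            None => {
                redact_next = arg.starts_with('-') && is_sensitive(arg);
                out.push(arg.clone());
            }
        }
    }
    out
}
```

A known limitation: a sensitive flag that takes no value will cause the following argument to be redacted. Over-redaction is the safer failure mode here, but it should be documented if this approach is adopted.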

Privilege Management

/// Detect current privilege level
pub fn detect_privilege_level() -> PrivilegeLevel {
    #[cfg(target_os = "windows")]
    {
        if is_elevated_windows() {
            PrivilegeLevel::Admin
        } else {
            PrivilegeLevel::User
        }
    }
    
    #[cfg(unix)]
    {
        use nix::unistd::geteuid;
        
        // The effective UID decides what the process may access; a root real
        // UID with a dropped effective UID should not count as Admin.
        if geteuid().is_root() {
            PrivilegeLevel::Admin
        } else if has_cap_sys_ptrace() {
            PrivilegeLevel::Elevated
        } else {
            PrivilegeLevel::User
        }
    }
}

#[cfg(target_os = "windows")]
fn is_elevated_windows() -> bool {
    // Check if process has administrator token
    use windows::Win32::Security::{GetTokenInformation, TokenElevation};
    // Implementation details...
    false
}

#[cfg(unix)]
fn has_cap_sys_ptrace() -> bool {
    // Check for CAP_SYS_PTRACE capability on Linux
    // Implementation details...
    false
}
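
The has_cap_sys_ptrace stub could be filled in on Linux by reading the effective capability mask from /proc/self/status. A minimal sketch of the parsing half follows, taking the file contents as a string so it can be tested without touching /proc. The function name is hypothetical; the CapEff line format and the value 19 for CAP_SYS_PTRACE come from the Linux capability definitions.

```rust
/// Bit index of CAP_SYS_PTRACE in the Linux capability sets.
const CAP_SYS_PTRACE: u32 = 19;

/// Check one capability bit in the `CapEff:` line of a /proc/[pid]/status
/// dump. Returns None when the line is missing or not valid hexadecimal.
fn has_effective_capability(status: &str, cap: u32) -> Option<bool> {
    let hex = status
        .lines()
        .find_map(|line| line.strip_prefix("CapEff:"))?
        .trim();
    let mask = u64::from_str_radix(hex, 16).ok()?;
    Some(mask & (1u64 << cap) != 0)
}
```

A caller would pass the result of std::fs::read_to_string("/proc/self/status") and treat None as "not elevated". Other Unix platforms have no equivalent file, so the stub would keep returning false there.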

Testing Strategy

Unit Testing

#[cfg(test)]
mod tests {
    use super::*;
    
    #[tokio::test]
    async fn test_enumerate_processes() {
        let config = CollectorConfig::default();
        let mut collector = ProcessCollector::new(config).unwrap();
        
        let processes = collector.collect_processes().await.unwrap();
        assert!(!processes.is_empty());
        
        // Verify current process is in the list
        let current_pid = std::process::id();
        assert!(processes.iter().any(|p| p.pid == current_pid));
    }
    
    #[tokio::test]
    async fn test_process_tree_building() {
        let config = CollectorConfig::default();
        let mut collector = ProcessCollector::new(config).unwrap();
        
        // Build tree starting from current process
        let current_pid = std::process::id();
        let tree = collector.collect_process_tree(current_pid).await.unwrap();
        
        assert_eq!(tree.root.pid, current_pid);
    }
    
    #[test]
    fn test_error_handling() {
        let error = ProcessError::ProcessNotFound { pid: 99999 };
        assert!(error.is_recoverable());
        assert!(!error.is_permission_error());
        
        let error = ProcessError::PermissionDenied { 
            pid: 1, 
            source: std::io::Error::from(std::io::ErrorKind::PermissionDenied) 
        };
        assert!(!error.is_recoverable());
        assert!(error.is_permission_error());
    }
    
    #[test]
    fn test_command_line_sanitization() {
        let config = CollectorConfig {
            security_policy: SecurityPolicy {
                sanitize_cmdline: true,
                ..Default::default()
            },
            ..Default::default()
        };
        let collector = ProcessCollector::new(config).unwrap();
        
        let cmd = vec![
            "myapp".to_string(),
            "--password=secret123".to_string(),
            "--user=admin".to_string(),
        ];
        
        let sanitized = collector.sanitize_command_line(&cmd);
        assert_eq!(sanitized[0], "myapp");
        assert_eq!(sanitized[1], "[REDACTED]");
        assert_eq!(sanitized[2], "--user=admin");
    }
}

Integration Testing

#[cfg(test)]
mod integration_tests {
    use super::*;
    use std::process::Command;
    
    #[tokio::test]
    #[cfg_attr(not(target_os = "linux"), ignore)]
    async fn test_linux_procfs_integration() {
        let config = CollectorConfig {
            enable_enhanced_metadata: true,
            collect_network_info: true,
            collect_fd_info: true,
            ..Default::default()
        };
        
        let mut collector = ProcessCollector::new(config).unwrap();
        let processes = collector.collect_processes().await.unwrap();
        
        // Find a process with network connections
        let with_network = processes.iter()
            .find(|p| p.enhanced.as_ref()
                .map(|e| !e.network_connections.is_empty())
                .unwrap_or(false));
        
        assert!(with_network.is_some(), "Should find process with network connections");
    }
    
    #[tokio::test]
    #[cfg(unix)] // relies on the `sleep` binary
    async fn test_process_lifecycle_monitoring() {
        let config = CollectorConfig::default();
        let mut collector = ProcessCollector::new(config).unwrap();
        
        // Spawn a child process
        let mut child = Command::new("sleep")
            .arg("1")
            .spawn()
            .unwrap();
        let child_pid = child.id();
        
        // Collect processes
        let processes = collector.collect_processes().await.unwrap();
        assert!(processes.iter().any(|p| p.pid == child_pid));
        
        // Wait for process to exit
        child.wait().unwrap();
        tokio::time::sleep(Duration::from_millis(100)).await;
        
        // Verify process is gone
        let processes = collector.collect_processes().await.unwrap();
        assert!(!processes.iter().any(|p| p.pid == child_pid));
    }
}

Performance Benchmarking

// benches/process_enumeration.rs
//
// Criterion benchmarks live in their own bench target, not in a `#[cfg(test)]`
// module: `criterion_main!` generates `fn main`, so the target needs
// `harness = false` in Cargo.toml, and `to_async` needs criterion's
// `async_tokio` feature.
use criterion::{black_box, criterion_group, criterion_main, Criterion};
// The crate path below is an assumption about how procmond exports these types.
use procmond::{CollectorConfig, ProcessCollector};

fn bench_enumerate_processes(c: &mut Criterion) {
    let config = CollectorConfig::default();
    let runtime = tokio::runtime::Runtime::new().unwrap();
    
    c.bench_function("enumerate_processes", |b| {
        b.to_async(&runtime).iter(|| async {
            let mut collector = ProcessCollector::new(config.clone()).unwrap();
            let processes = collector.collect_processes().await.unwrap();
            black_box(processes);
        });
    });
}

fn bench_process_tree(c: &mut Criterion) {
    let config = CollectorConfig::default();
    let runtime = tokio::runtime::Runtime::new().unwrap();
    let current_pid = std::process::id();
    
    c.bench_function("build_process_tree", |b| {
        b.to_async(&runtime).iter(|| async {
            let mut collector = ProcessCollector::new(config.clone()).unwrap();
            let tree = collector.collect_process_tree(current_pid).await.unwrap();
            black_box(tree);
        });
    });
}

criterion_group!(benches, bench_enumerate_processes, bench_process_tree);
criterion_main!(benches);

Acceptance Criteria

Functional Requirements

Core Functionality

  • Cross-Platform Process Enumeration

    • Successfully enumerate all accessible processes on Windows 10+, Windows Server 2019+
    • Successfully enumerate all accessible processes on Ubuntu 20.04+, RHEL 8+, Debian 11+
    • Successfully enumerate all accessible processes on macOS 12+
    • Basic enumeration works on FreeBSD 13+ (secondary support)
    • Handle process creation/termination during enumeration gracefully
  • Comprehensive Metadata Collection

    • Process ID (PID) and Parent Process ID (PPID)
    • Process name and full executable path
    • Command-line arguments (sanitized when configured)
    • Process owner/user information
    • Resource usage (CPU percentage, memory usage, thread count)
    • Process start time and cumulative runtime
    • Process state (running, sleeping, stopped, zombie)
  • Enhanced Metadata (Platform-Specific)

    • Linux: Network connections, open file descriptors, cgroup/namespace info
    • Windows: Session ID, handle count, WOW64 status, integrity level
    • macOS: Code signing information, sandbox profile
    • Graceful degradation when enhanced metadata is unavailable
  • Process Tree Functionality

    • Build complete process tree from any root PID
    • Correctly identify parent-child relationships
    • Handle orphaned processes appropriately
    • Support querying process ancestry
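The process tree requirements above can be illustrated with a small, std-only sketch. It assumes a hypothetical `ProcInfo` struct holding only a PID and parent PID (a stand-in, not the issue's `ProcessRecord`), and the helper names `children_index`, `subtree`, and `ancestry` are illustrative rather than part of the planned API.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a collected process: only the fields the tree needs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ProcInfo {
    pid: u32,
    ppid: u32,
}

/// Index children by parent PID so subtree lookups avoid rescanning the list.
fn children_index(procs: &[ProcInfo]) -> HashMap<u32, Vec<u32>> {
    let mut index: HashMap<u32, Vec<u32>> = HashMap::new();
    for p in procs {
        // Skip self-parented entries so they cannot form a trivial loop.
        if p.pid != p.ppid {
            index.entry(p.ppid).or_default().push(p.pid);
        }
    }
    index
}

/// Collect the root PID and all of its descendants, depth-first.
/// The visited set guards against cycles caused by PID reuse between samples.
fn subtree(root: u32, index: &HashMap<u32, Vec<u32>>) -> Vec<u32> {
    let mut out = Vec::new();
    let mut seen = HashSet::new();
    let mut stack = vec![root];
    while let Some(pid) = stack.pop() {
        if !seen.insert(pid) {
            continue;
        }
        out.push(pid);
        if let Some(kids) = index.get(&pid) {
            stack.extend(kids.iter().copied());
        }
    }
    out
}

/// Walk parent links upward from `pid`. The chain ends at the last recorded
/// parent PID, even if that parent is no longer in the snapshot (an orphan),
/// and stops early if a cycle is detected.
fn ancestry(pid: u32, procs: &[ProcInfo]) -> Vec<u32> {
    let parents: HashMap<u32, u32> = procs.iter().map(|p| (p.pid, p.ppid)).collect();
    let mut chain = Vec::new();
    let mut seen = HashSet::from([pid]);
    let mut current = pid;
    while let Some(&parent) = parents.get(&current) {
        if !seen.insert(parent) {
            break;
        }
        chain.push(parent);
        current = parent;
    }
    chain
}
```

Building the index once per snapshot keeps each subtree query proportional to the size of the subtree rather than the whole process list. An orphaned process whose parent has already exited still appears as its own single-node subtree, which is one possible reading of "handle orphaned processes appropriately".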

API Requirements

  • Async Interface

    • All collection methods are async and run on the Tokio runtime
    • Non-blocking I/O for all file system operations
    • Support for cancellation via cancellation tokens (CancellationToken from the tokio-util crate)
  • Error Handling

    • Structured error types with rich context
    • Distinguish between recoverable and fatal errors
    • Permission errors reported separately with actionable guidance
    • Partial results on collection failures (collect what's accessible)
  • Configuration

    • Configurable collection depth (basic vs enhanced)
    • Timeouts for collection operations
    • Security policy configuration (sanitization, redaction)
    • Platform-specific feature toggles
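As a rough illustration of the error-handling requirements above (recoverable vs fatal errors, partial results), the following std-only sketch uses a hypothetical `CollectError` enum and `collect_partial` helper. Neither name comes from the issue, and the planned `ProcessError` type may look quite different.

```rust
use std::fmt;

/// Hypothetical error type; the variant names are illustrative only.
#[derive(Debug, Clone, PartialEq, Eq)]
enum CollectError {
    /// The process exited between listing and inspection (recoverable).
    ProcessGone { pid: u32 },
    /// The collector lacks permission to inspect this process (recoverable).
    PermissionDenied { pid: u32 },
    /// The platform interface itself is unavailable (fatal).
    BackendUnavailable(String),
}

impl CollectError {
    /// Recoverable errors affect one process; fatal errors abort the pass.
    fn is_recoverable(&self) -> bool {
        !matches!(self, CollectError::BackendUnavailable(_))
    }
}

impl fmt::Display for CollectError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CollectError::ProcessGone { pid } => {
                write!(f, "process {} exited during collection", pid)
            }
            CollectError::PermissionDenied { pid } => write!(
                f,
                "permission denied for process {}; run with higher privileges or disable enhanced metadata",
                pid
            ),
            CollectError::BackendUnavailable(why) => {
                write!(f, "process backend unavailable: {}", why)
            }
        }
    }
}

impl std::error::Error for CollectError {}

/// Outcome of one pass: whatever was readable plus the per-process failures.
#[derive(Debug)]
struct PartialCollection<T> {
    records: Vec<T>,
    skipped: Vec<CollectError>,
}

/// Keep successful records, remember recoverable failures, stop on a fatal one.
fn collect_partial<T, I>(results: I) -> Result<PartialCollection<T>, CollectError>
where
    I: IntoIterator<Item = Result<T, CollectError>>,
{
    let mut out = PartialCollection { records: Vec::new(), skipped: Vec::new() };
    for result in results {
        match result {
            Ok(record) => out.records.push(record),
            Err(e) if e.is_recoverable() => out.skipped.push(e),
            Err(e) => return Err(e),
        }
    }
    Ok(out)
}
```

Returning the skipped errors alongside the records lets a caller report permission problems separately without discarding the processes that were readable.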

Performance Requirements

  • Enumeration Performance

    • Primary Platforms: Enumerate 1,000 processes in <100ms (average)
    • FreeBSD: Enumerate 1,000 processes in <200ms (best effort)
    • Large Systems: Support enumeration of 10,000+ processes
    • Latency: Single process query completes in <5ms
  • Memory Efficiency

    • Base collector memory usage <10MB
    • Memory scales linearly: <5MB per 1,000 processes
    • No memory leaks under continuous operation (24h+ test)
    • Support streaming results for large process lists
  • CPU Overhead

    • Idle monitoring: <1% CPU usage
    • Active collection: <5% CPU usage average
    • Peak during enumeration: <20% CPU usage
    • No sustained high CPU usage
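The memory targets above imply a simple upper bound that a test could check against. The sketch below is one reading of those numbers (10 MB base plus 5 MB per 1,000 processes); the function names `memory_budget_mb` and `timed` are hypothetical helpers, not part of the planned API.

```rust
use std::time::{Duration, Instant};

/// Upper bound implied by the memory targets, in megabytes:
/// 10 MB base plus 5 MB per 1,000 processes.
fn memory_budget_mb(process_count: u64) -> f64 {
    const BASE_MB: f64 = 10.0;
    const PER_THOUSAND_MB: f64 = 5.0;
    BASE_MB + PER_THOUSAND_MB * (process_count as f64 / 1_000.0)
}

/// Run a closure and report how long it took, for comparison with a latency
/// target such as the 100 ms enumeration budget.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let value = f();
    (value, start.elapsed())
}
```

Under this reading, a system with 10,000 processes would have a budget of 10 + 5 × 10 = 60 MB. Wall-clock checks built on `timed` are noisy on shared CI runners, so the Criterion benchmarks remain the better tool for regression detection.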

Security Requirements

  • Privilege Management

    • Detect current privilege level correctly
    • Request minimum required privileges
    • Document privilege requirements clearly
    • Handle privilege escalation failures gracefully
  • Data Sanitization

    • Redact sensitive command-line arguments (passwords, tokens, keys)
    • Filter environment variables based on configurable patterns
    • Sanitize file paths containing usernames or sensitive data
    • Provide clear documentation on what data is collected
  • Access Control

    • Respect platform security boundaries (SELinux, AppArmor, SIP)
    • Handle permission denied errors without crashing
    • Log all access attempts for security auditing
    • Never attempt privilege escalation without explicit configuration
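One way the command-line redaction requirement above might work is sketched below, using only the standard library. The key list `SENSITIVE_KEYS`, the `[REDACTED]` placeholder, and the function names are assumptions for illustration. A real implementation would presumably take its patterns from the security policy configuration.

```rust
/// Hypothetical default list of sensitive key fragments (lowercase).
const SENSITIVE_KEYS: &[&str] = &["password", "passwd", "token", "secret", "api-key", "apikey"];

/// Placeholder written in place of a redacted value.
const REDACTED: &str = "[REDACTED]";

/// A flag is treated as sensitive if its name contains any known key fragment.
fn is_sensitive_flag(flag: &str) -> bool {
    let lower = flag.to_ascii_lowercase();
    SENSITIVE_KEYS.iter().any(|key| lower.contains(key))
}

/// Redact values of sensitive flags in both `--key=value` and `--key value` forms.
/// The argument after a bare sensitive flag is always redacted, so a boolean
/// flag that follows it is over-redacted rather than leaked.
fn sanitize_args(args: &[String]) -> Vec<String> {
    let mut out = Vec::with_capacity(args.len());
    let mut redact_next = false;
    for arg in args {
        if redact_next {
            out.push(REDACTED.to_string());
            redact_next = false;
            continue;
        }
        if !arg.starts_with('-') {
            out.push(arg.clone());
            continue;
        }
        match arg.split_once('=') {
            // `--password=hunter2` keeps the flag name and drops the value.
            Some((flag, _value)) if is_sensitive_flag(flag) => {
                out.push(format!("{}={}", flag, REDACTED));
            }
            // `--token abc123` keeps the flag and redacts the next argument.
            None if is_sensitive_flag(arg) => {
                out.push(arg.clone());
                redact_next = true;
            }
            _ => out.push(arg.clone()),
        }
    }
    out
}
```

This approach only catches secrets passed through recognizable flag names. Positional secrets (for example a password given as a bare argument) would need additional, tool-specific rules.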

Quality Requirements

  • Testing Coverage

    • Unit tests: >80% code coverage
    • Integration tests for each supported platform
    • Performance benchmarks with regression detection
    • Security tests for privilege boundary handling
  • Documentation

    • Comprehensive API documentation with examples
    • Platform-specific behavior documented
    • Security considerations clearly explained
    • Performance characteristics documented
  • Reliability

    • Handle rapid process creation/termination
    • Recover from transient system errors
    • Support long-running operation (24h+ without restart)
    • Consistent behavior across platforms

Platform-Specific Requirements

Linux

  • Parse /proc filesystem efficiently
  • Handle Linux kernel versions 4.9+
  • Support major distributions (Ubuntu, RHEL, Debian, Fedora)
  • Handle SELinux and AppArmor restrictions
  • Support containerized environments (Docker, Podman)
  • Read namespace and cgroup information
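To make the /proc parsing requirement concrete, here is a std-only sketch that extracts the first few fields of a `/proc/<pid>/stat` line. The `StatLine` struct and `parse_stat_line` function are hypothetical names. The dependency list suggests the `procfs` crate would handle this in the actual implementation, so treat the sketch as an illustration of the parsing pitfall rather than the planned code.

```rust
/// Subset of fields from a `/proc/<pid>/stat` line.
#[derive(Debug, PartialEq, Eq)]
struct StatLine {
    pid: u32,
    comm: String,
    state: char,
    ppid: u32,
}

/// Parse the leading fields of a stat line.
/// `comm` is wrapped in parentheses and may itself contain spaces or `)`,
/// so the split point is the *last* `)` rather than the first whitespace.
fn parse_stat_line(line: &str) -> Option<StatLine> {
    let open = line.find('(')?;
    let close = line.rfind(')')?;
    if close < open {
        return None;
    }
    let pid = line[..open].trim().parse().ok()?;
    let comm = line[open + 1..close].to_string();
    // After the closing parenthesis: state, ppid, then the remaining fields.
    let mut rest = line[close + 1..].split_whitespace();
    let state = rest.next()?.chars().next()?;
    let ppid = rest.next()?.parse().ok()?;
    Some(StatLine { pid, comm, state, ppid })
}
```

Splitting the whole line on whitespace would misparse process names such as `tmux: server`, which is why the sketch anchors on the parentheses first. Returning `Option` also lets the caller skip a process whose stat file vanished or was truncated mid-read.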

Windows

  • Support Windows 10 (version 1809+) and Windows Server 2019+
  • Handle UAC elevation appropriately
  • Support both 32-bit and 64-bit processes
  • Detect WOW64 processes correctly
  • Handle service and system processes
  • Support different user sessions

macOS

  • Support macOS 12 (Monterey) and later
  • Handle System Integrity Protection restrictions
  • Support both Intel and Apple Silicon
  • Read code signing information
  • Detect sandboxed applications
  • Handle macOS-specific process attributes

FreeBSD (Secondary)

  • Basic process enumeration works
  • Handle FreeBSD /proc differences
  • Best-effort performance (documented limitations)
  • Clear documentation of unsupported features

Dependencies

Rust Crates

[dependencies]
# Core cross-platform process enumeration
sysinfo = "0.31"

# Async runtime and utilities
tokio = { version = "1.0", features = ["full"] }
tokio-stream = "0.1"

# Error handling
thiserror = "1.0"
anyhow = "1.0"

# Serialization
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"

# Logging
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }

# Platform-specific dependencies
[target.'cfg(target_os = "linux")'.dependencies]
procfs = "0.16"

[target.'cfg(target_os = "windows")'.dependencies]
windows = { version = "0.58", features = [
    "Win32_Foundation",
    "Win32_System_Threading",
    "Win32_System_ProcessStatus",
    "Win32_System_Diagnostics_ToolHelp",
    "Win32_Security",
] }

[target.'cfg(any(target_os = "macos", target_os = "freebsd"))'.dependencies]
libc = "0.2"

[dev-dependencies]
criterion = { version = "0.5", features = ["async_tokio"] }
tempfile = "3.0"
tokio-test = "0.4"

System Dependencies

Linux

  • Kernel 4.9+ with /proc filesystem support
  • Optional: libcap for capability management

Windows

  • Windows 10 version 1809+ or Windows Server 2019+
  • Optional: WMI for extended process information

macOS

  • macOS 12 (Monterey) or later
  • Xcode Command Line Tools

FreeBSD

  • FreeBSD 13.0+
  • /proc filesystem mounted (optional, for enhanced metadata)

Implementation Plan

Phase 1: Core Infrastructure (Weeks 1-2)

Goal: Basic cross-platform process enumeration

  • Week 1: Foundation

    • Set up project structure and dependencies
    • Define core data structures (ProcessRecord, ResourceUsage, etc.)
    • Implement ProcessCollector with sysinfo integration
    • Basic error handling with ProcessError types
    • Unit tests for data structures
  • Week 2: Core Functionality

    • Implement basic process enumeration
    • Add process tree building logic
    • Implement configuration system
    • Integration tests on primary platforms
    • Documentation for core API

Phase 2: Platform-Specific Enhancements (Weeks 3-5)

Goal: Enhanced metadata collection for each platform

  • Week 3: Linux Enhancement

    • Implement LinuxProcfsCollector
    • Add network connection enumeration
    • Add file descriptor collection
    • Add cgroup/namespace information
    • Linux-specific integration tests
  • Week 4: Windows Enhancement

    • Implement WindowsProcessCollector
    • Add handle count and session information
    • Add WOW64 detection
    • Add integrity level detection
    • Windows-specific integration tests
  • Week 5: macOS/FreeBSD Enhancement

    • Implement MacOsProcessCollector
    • Add code signing information
    • Add sandbox detection
    • Basic FreeBSD support
    • macOS/FreeBSD integration tests

Phase 3: Security & Performance (Weeks 6-7)

Goal: Production-ready security and performance

  • Week 6: Security Hardening

    • Implement privilege detection and management
    • Add data sanitization (command-line, environment)
    • Add security policy configuration
    • Security-focused testing
    • Security documentation
  • Week 7: Performance Optimization

    • Implement caching strategies
    • Add streaming for large process lists
    • Optimize platform-specific collectors
    • Performance benchmarking
    • Load testing with 10,000+ processes

Phase 4: Integration & Testing (Weeks 8-9)

Goal: Integration with DaemonEye and comprehensive testing

  • Week 8: DaemonEye Integration

    • Implement IPC message protocol
    • Integrate with daemoneye-agent service manager
    • Add real-time monitoring support
    • Integration tests with daemoneye-agent
    • End-to-end testing
  • Week 9: Final Testing & Polish

    • Cross-platform testing on all supported platforms
    • Performance regression testing
    • Security audit and penetration testing
    • Documentation review and completion
    • Release preparation

Timeline & Milestones

Total Duration: 9 weeks (fits within v0.2.0 milestone)

| Milestone | Deadline | Deliverables |
| --- | --- | --- |
| M1: Core Infrastructure | End of Week 2 | Basic process enumeration working |
| M2: Platform Enhancements | End of Week 5 | Enhanced metadata on all platforms |
| M3: Production Ready | End of Week 7 | Security hardened, performance optimized |
| M4: Integration Complete | End of Week 9 | Fully integrated with DaemonEye |

v0.2.0 Milestone: Due September 22, 2025

Related Issues

Direct Dependencies

Related Features

Architecture Context

This issue provides the foundational process enumeration capability that enables:


Definition of Done

  • All acceptance criteria met for primary platforms (Windows, Linux, macOS)
  • Basic functionality verified on FreeBSD
  • Code review completed with focus on security and performance
  • All tests passing (unit, integration, performance benchmarks)
  • Performance benchmarks meet or exceed targets
  • Security audit completed with no critical findings
  • Documentation complete and reviewed
  • Integration tests pass with daemoneye-agent
  • Cross-platform compatibility verified on CI/CD
  • PR approved and merged to main branch

This issue provides the foundational process enumeration capability that enables procmond to monitor system processes effectively within the DaemonEye architecture. The implementation will serve as a critical component for security monitoring, process integrity verification, and system state analysis.
