Deep Dive: Rust + Headless Chrome #8

@chhot2u

Rust + Headless Chrome — Deep Dive Analysis

Overview

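A pure Rust solution that uses the headless_chrome or chromiumoxide crate to control the browser over the Chrome DevTools Protocol (CDP). It targets maximum performance and minimal resource usage, and ships without a GUI: it runs as a CLI or server process, optionally paired with a web dashboard.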


Architecture

┌──────────────────────────────────────┐
│         Rust Binary                  │
│  ┌────────────────────────────────┐  │
│  │   Tokio Async Runtime          │  │
│  │   - Task spawner               │  │
│  │   - Channel-based queue        │  │
│  │   - Semaphore (100 limit)      │  │
│  ├────────────────────────────────┤  │
│  │   chromiumoxide/headless_chrome│  │
│  │   - CDP WebSocket connections  │  │
│  │   - Per-task proxy config      │  │
│  │   - Page automation            │  │
│  ├────────────────────────────────┤  │
│  │   Optional: Axum Web Server    │  │
│  │   - REST API                   │  │
│  │   - WebSocket for live updates │  │
│  │   - Serve SPA dashboard        │  │
│  ├────────────────────────────────┤  │
│  │   Data Layer                   │  │
│  │   - SQLx + PostgreSQL/SQLite   │  │
│  │   - tokio::fs for artifacts    │  │
│  └────────────────────────────────┘  │
└──────────────────────────────────────┘
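
The queue-plus-limit layer in the diagram can be sketched with the standard library alone. The version below swaps Tokio tasks and the semaphore for OS threads and a fixed worker count, purely to keep the example dependency-free. `run_queue`, the `u32` job type, and the doubling "work" are hypothetical placeholders, not part of the proposed design.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Runs `jobs` through a shared channel with at most `workers` jobs in flight.
/// Returns (job, result) pairs sorted by job so the output is deterministic.
fn run_queue(jobs: Vec<u32>, workers: usize) -> Vec<(u32, u32)> {
    let (job_tx, job_rx) = mpsc::channel::<u32>();
    let (res_tx, res_rx) = mpsc::channel::<(u32, u32)>();
    // std's Receiver is single-consumer, so the workers share it behind a mutex.
    let job_rx = Arc::new(Mutex::new(job_rx));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let job_rx = Arc::clone(&job_rx);
            let res_tx = res_tx.clone();
            thread::spawn(move || loop {
                // The lock is held only while taking the next job off the queue.
                let next = job_rx.lock().unwrap().recv();
                match next {
                    // Placeholder work: a real worker would drive a browser here.
                    Ok(job) => res_tx.send((job, job * 2)).unwrap(),
                    // The queue is closed and drained, so the worker exits.
                    Err(_) => break,
                }
            })
        })
        .collect();

    for job in jobs {
        job_tx.send(job).unwrap();
    }
    drop(job_tx); // closing the queue lets idle workers exit
    drop(res_tx); // only the workers' clones keep the result channel open

    for handle in handles {
        handle.join().unwrap();
    }
    let mut results: Vec<_> = res_rx.iter().collect();
    results.sort();
    results
}
```

Calling `run_queue(vec![3, 1, 2], 2)` processes three jobs with two workers. In the Tokio design the worker count would be replaced by the semaphore limit shown under Concurrency Model.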

Key Dependencies

# Cargo.toml
[dependencies]
chromiumoxide = { version = "0.7", features = ["tokio-runtime"] }
tokio = { version = "1", features = ["full"] }
axum = "0.8"               # optional web server
tower-http = "0.6"         # CORS, compression
sqlx = { version = "0.8", features = ["runtime-tokio", "sqlite"] }
serde = { version = "1", features = ["derive"] }
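reqwest = { version = "0.12", features = ["json"] }
anyhow = "1"               # error type used in the snippets below
futures = "0.3"            # StreamExt for the CDP handler, join_all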

Proxy per Task Implementation

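use anyhow::Result;
use chromiumoxide::{Browser, BrowserConfig};
use futures::StreamExt; // provides `handler.next()`

// Launches one Chrome process routed through the given proxy.
// `--proxy-server` applies to the whole browser process, not to a single page.
async fn create_browser_with_proxy(proxy: &ProxyConfig) -> Result<Browser> {
    let config = BrowserConfig::builder()
        .arg(format!("--proxy-server={}", proxy.server))
        .arg("--headless=new")
        .build()
        .map_err(|e| anyhow::anyhow!(e))?;

    let (browser, mut handler) = Browser::launch(config).await?;
    // The handler must be polled continuously, otherwise CDP calls never complete.
    tokio::spawn(async move { while handler.next().await.is_some() {} });

    Ok(browser)
}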

// For 100 tasks: pool of browsers or use browser contexts
// Returns a Result so that `?` compiles and the caller can map errors to a failed TaskResult.
async fn run_task(browser: &Browser, task: TaskConfig) -> Result<TaskResult> {
    let page = browser.new_page("about:blank").await?;

    for step in &task.steps {
        execute_step(&page, step).await?;
    }

    let screenshot = page.screenshot(
        chromiumoxide::page::ScreenshotParams::builder().build()
    ).await?;

    page.close().await?;
    Ok(TaskResult { success: true, screenshot: Some(screenshot), ..Default::default() })
}
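
The snippet above refers to a `ProxyConfig` that the issue does not define. A minimal sketch of one possible shape follows, together with a helper that validates the value before it is formatted into `--proxy-server`. The struct layout, the `proxy_arg` name, and the accepted scheme list are assumptions for illustration. The check for `user:pass@` reflects the assumption that Chrome does not honour credentials embedded in this switch, so authenticated proxies would need separate handling.

```rust
/// Hypothetical shape for the `ProxyConfig` used in the snippet above.
#[derive(Debug, Clone, PartialEq)]
struct ProxyConfig {
    /// For example "http://10.0.0.1:8080" or "socks5://proxy.example:1080".
    server: String,
}

/// Builds the `--proxy-server` switch, rejecting values that would be
/// silently misinterpreted once handed to the browser.
fn proxy_arg(proxy: &ProxyConfig) -> Result<String, String> {
    let server = proxy.server.trim();
    if server.is_empty() {
        return Err("proxy server is empty".to_string());
    }
    if server.contains(char::is_whitespace) {
        return Err("proxy server must not contain whitespace".to_string());
    }

    // Split off an optional scheme; a bare host:port is passed through as-is.
    let (scheme, rest) = match server.split_once("://") {
        Some((scheme, rest)) => (Some(scheme), rest),
        None => (None, server),
    };
    if let Some(s) = scheme {
        // Assumed allow-list of proxy schemes.
        if !["http", "https", "socks4", "socks5"].contains(&s) {
            return Err(format!("unsupported proxy scheme: {s}"));
        }
    }
    if rest.contains('@') {
        return Err("credentials in the proxy URL are not supported here".to_string());
    }

    Ok(format!("--proxy-server={server}"))
}
```

With this helper, `create_browser_with_proxy` could pass `proxy_arg(proxy)?` to `.arg(...)` instead of formatting the string inline, so that a malformed proxy fails before Chrome is launched.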

Concurrency Model

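use futures::future::join_all;
use std::sync::Arc;
use tokio::sync::Semaphore;

async fn run_batch(tasks: Vec<TaskConfig>) -> Vec<TaskResult> {
    // At most 100 tasks, and therefore 100 browsers, run at once.
    let sem = Arc::new(Semaphore::new(100));
    let mut handles = Vec::new();

    for task in tasks {
        // Waiting here applies backpressure: nothing is spawned until a permit is free.
        let permit = sem.clone().acquire_owned().await.expect("semaphore is never closed");
        handles.push(tokio::spawn(async move {
            let browser = create_browser_with_proxy(&task.proxy).await;
            let result = match browser {
                Ok(b) => run_task(&b, task).await.unwrap_or_else(|e| {
                    TaskResult { success: false, error: Some(e.to_string()), ..Default::default() }
                }),
                Err(e) => TaskResult { success: false, error: Some(e.to_string()), ..Default::default() },
            };
            drop(permit); // release the slot for the next task
            result
        }));
    }

    join_all(handles)
        .await
        .into_iter()
        // A JoinError means the task panicked or was cancelled; report it instead of panicking here.
        .map(|r| r.unwrap_or_else(|e| {
            TaskResult { success: false, error: Some(e.to_string()), ..Default::default() }
        }))
        .collect()
}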
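
The batch code relies on a `TaskResult` that implements `Default` so that `..Default::default()` can fill the unset fields. The issue does not show that type, so the sketch below is one hypothetical definition that matches the fields used above, plus a small `summarize` helper (also an assumption) for reporting on a finished batch.

```rust
/// Hypothetical result type matching the fields used in the snippets above.
#[derive(Debug, Default, Clone, PartialEq)]
struct TaskResult {
    success: bool,
    error: Option<String>,
    screenshot: Option<Vec<u8>>,
}

impl TaskResult {
    /// Shorthand for the failure value that is built inline in `run_batch`.
    fn failed(err: impl ToString) -> Self {
        TaskResult {
            success: false,
            error: Some(err.to_string()),
            ..Default::default()
        }
    }
}

/// Aggregate view of a finished batch.
#[derive(Debug, Default, PartialEq)]
struct BatchSummary {
    succeeded: usize,
    failed: usize,
    errors: Vec<String>,
}

/// Counts successes and failures and collects the error messages in order.
fn summarize(results: &[TaskResult]) -> BatchSummary {
    let mut summary = BatchSummary::default();
    for result in results {
        if result.success {
            summary.succeeded += 1;
        } else {
            summary.failed += 1;
            if let Some(err) = &result.error {
                summary.errors.push(err.clone());
            }
        }
    }
    summary
}
```

Passing the output of `run_batch` to `summarize` gives a quick success/failure count without inspecting every result by hand.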

Strengths

  • Best performance: Zero-cost abstractions, no GC
  • Lowest RAM: ~1 GB estimated for 100 tasks
  • Tiny binary: ~5 MB compiled
  • Memory safety: Safe Rust rules out null-pointer dereferences and buffer overflows in the controller code
  • Tokio async: Handles thousands of concurrent tasks efficiently
  • Single binary: No runtime dependencies (except Chrome)
  • Predictable latency: No GC pauses

Weaknesses

  • Steep learning curve: Rust ownership model, lifetimes, async
  • Chromium only: No multi-browser support
  • Limited CDP crate ecosystem: chromiumoxide is less mature than Playwright
  • No auto-wait: Must implement retry/wait logic manually
  • No codegen/trace: No equivalent of Playwright tooling
  • No built-in GUI: Need to add Axum + SPA for dashboard
  • Longer development time: Rust takes longer to write
  • Smaller community: Fewer examples and Stack Overflow answers
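
The "no auto-wait" point means that polling for a condition has to be written by hand. The following is a minimal sketch of such a helper, assuming a caller-supplied `check` closure that returns `Some(value)` once the condition holds. `wait_until` is a hypothetical name, and the blocking `thread::sleep` is used only to keep the example dependency-free; inside Tokio code the same loop would await `tokio::time::sleep` instead.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Polls `check` until it yields `Some(value)` or `timeout` elapses.
/// `check` always runs at least once, even with a zero timeout.
fn wait_until<T>(
    timeout: Duration,
    interval: Duration,
    mut check: impl FnMut() -> Option<T>,
) -> Option<T> {
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(value) = check() {
            return Some(value);
        }
        if Instant::now() >= deadline {
            return None; // condition never held within the timeout
        }
        thread::sleep(interval);
    }
}
```

A step such as "wait for a selector" would wrap its lookup in `check` and treat `None` as a timeout error for that task.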

Resource Estimates (100 tasks)

Resource   Estimate
RAM        ~1 GB
CPU        2-4 cores sufficient
Disk       ~5 MB binary + system Chrome
Startup    <0.5 s
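
To make the RAM figure easier to sanity-check, the arithmetic behind it can be written out. The sketch below only restates the estimate above (about 1 GB for 100 tasks) as a per-task budget and projects it to other task counts. Both function names are hypothetical, and linear scaling is an assumption: real usage depends on page weight and on how many Chrome processes are shared between tasks.

```rust
/// Per-task memory budget in MB implied by a total RAM figure.
/// `tasks` is expected to be non-zero.
fn per_task_budget_mb(total_ram_gb: f64, tasks: u32) -> f64 {
    total_ram_gb * 1024.0 / tasks as f64
}

/// Total RAM in GB if the same per-task budget held at another scale
/// (assumes memory grows linearly with the number of tasks).
fn projected_ram_gb(per_task_mb: f64, tasks: u32) -> f64 {
    per_task_mb * tasks as f64 / 1024.0
}
```

For the figures above, `per_task_budget_mb(1.0, 100)` works out to roughly 10 MB per task, which is a useful number to compare against measured per-browser memory before committing to the 100-task limit.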

When to Choose This Stack

✅ Maximum performance is the goal
✅ Team is proficient in Rust
✅ Running on resource-constrained hardware
✅ Need 1000+ concurrent tasks (scale ceiling is highest)
✅ Building infrastructure/platform (long-term investment)

❌ Avoid if you need rapid prototyping, multi-browser support, or a rich automation API, or if the team doesn't know Rust


Verdict: 7.35/10

Unmatched performance but high development cost. Best for teams already invested in Rust.

See issue #1 for the full comparison.
