P1: Add concurrency governor for streaming and agent spawning #65

@galic1987

Description

Source

ChatGPT architecture review feedback

Problem

There is no global backpressure mechanism for streaming responses or multi-agent spawning. Unbounded task spawning combined with concurrent streaming can cause:

  • Resource exhaustion under heavy swarm workloads
  • Flaky behavior when many agents stream simultaneously
  • No structured cancellation on session end

Proposal

  1. Global semaphore (MAX_INFLIGHT_STREAMS) to cap concurrent LLM streaming responses
  2. Per-agent semaphore to limit tool execution concurrency within each agent
  3. Structured cancellation — when a session ends or ESC is pressed, all spawned tasks are cancelled cleanly via CancellationToken (tokio-util) rather than just setting an AtomicBool
  4. Backpressure on tool results — if an agent's result queue is full, slow down rather than drop
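
Items 1, 2, and 4 all come down to bounded-concurrency primitives — in the async codebase that would be `tokio::sync::Semaphore` plus a bounded `mpsc` channel. Below is a minimal, synchronous std-only sketch of the same ideas (the cap of 2 and queue size of 4 are illustrative, not proposed values; `run_governor` is a hypothetical name):

```rust
use std::sync::mpsc::sync_channel;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

// Counting semaphore: a synchronous stand-in for tokio::sync::Semaphore.
struct Semaphore {
    permits: Mutex<usize>,
    cv: Condvar,
}

impl Semaphore {
    fn new(n: usize) -> Self {
        Semaphore { permits: Mutex::new(n), cv: Condvar::new() }
    }
    fn acquire(&self) {
        let mut p = self.permits.lock().unwrap();
        while *p == 0 {
            p = self.cv.wait(p).unwrap();
        }
        *p -= 1;
    }
    fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.cv.notify_one();
    }
}

// Spawn `tasks` workers, but let at most `max_inflight` enter their
// "streaming" section at once; results flow through a bounded queue
// that blocks (backpressure) instead of dropping when full.
fn run_governor(tasks: usize, max_inflight: usize) -> usize {
    let sem = Arc::new(Semaphore::new(max_inflight));
    let (tx, rx) = sync_channel::<usize>(4); // bounded result queue

    let handles: Vec<_> = (0..tasks)
        .map(|id| {
            let sem = Arc::clone(&sem);
            let tx = tx.clone();
            thread::spawn(move || {
                sem.acquire();        // blocks once `max_inflight` are streaming
                tx.send(id).unwrap(); // blocks while the result queue is full
                sem.release();
            })
        })
        .collect();
    drop(tx); // main's sender is closed; workers hold their own clones

    // Drain results while workers run; the iterator ends once every
    // worker has finished and dropped its sender.
    let received = rx.iter().count();
    for h in handles {
        h.join().unwrap();
    }
    received
}

fn main() {
    // Every result is delivered despite both caps; nothing is dropped.
    println!("{}", run_governor(8, 2));
}
```

In the real async code the semaphore permit would be acquired before starting an LLM stream and released when the stream completes, and the per-agent tool semaphore would wrap tool spawns the same way.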

Relevant Code

  • src/agent/mod.rs — cancelled: Arc<AtomicBool> (current cancellation)
  • src/agent/execution.rs — tool execution spawning
  • src/orchestration/parallel.rs — parallel orchestration
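
For item 3, the key difference from the current `Arc<AtomicBool>` in `src/agent/mod.rs` is that `tokio_util::sync::CancellationToken` lets tasks *wait* on cancellation rather than poll a flag. A simplified std-only sketch of that shape (real child tokens are hierarchical and independently cancellable; here they just share the parent's state, and `run_cancel_demo` is a hypothetical name):

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

// Std-only stand-in for tokio_util::sync::CancellationToken: unlike a
// bare Arc<AtomicBool>, tasks can block on it instead of polling.
#[derive(Clone)]
struct CancelToken {
    inner: Arc<(Mutex<bool>, Condvar)>,
}

impl CancelToken {
    fn new() -> Self {
        CancelToken { inner: Arc::new((Mutex::new(false), Condvar::new())) }
    }
    // Simplification: child tokens share the parent's state, so
    // cancelling the session token cancels every task spawned under it.
    fn child(&self) -> Self {
        self.clone()
    }
    fn cancel(&self) {
        *self.inner.0.lock().unwrap() = true;
        self.inner.1.notify_all();
    }
    // Synchronous analogue of `token.cancelled().await`.
    fn wait(&self) {
        let mut cancelled = self.inner.0.lock().unwrap();
        while !*cancelled {
            cancelled = self.inner.1.wait(cancelled).unwrap();
        }
    }
}

fn run_cancel_demo() -> &'static str {
    let session = CancelToken::new(); // owned by the session
    let agent = session.child();      // handed to each spawned task
    let worker = thread::spawn(move || {
        agent.wait(); // real code would select! this against its work
        "stopped"
    });
    session.cancel(); // ESC pressed / session ended
    worker.join().unwrap()
}

fn main() {
    println!("{}", run_cancel_demo());
}
```

In the tokio version, each spawned task would `tokio::select!` between its actual work and `token.cancelled()`, so session teardown is a single `cancel()` call rather than hoping every loop checks the flag.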

Priority

P1 — stability under load

Metadata


Labels: enhancement (New feature or request)
