P1: Add concurrency governor for streaming and agent spawning #65

@galic1987

Description

Source

ChatGPT architecture review feedback

Problem

There is no global backpressure mechanism for streaming responses or multi-agent spawning. Unbounded task spawning combined with concurrent streaming can cause:

  • Resource exhaustion under heavy swarm workloads
  • Flaky behavior when many agents stream simultaneously
  • No structured cancellation on session end

Proposal

  1. Global semaphore (MAX_INFLIGHT_STREAMS) to cap concurrent LLM streaming responses
  2. Per-agent semaphore to limit tool execution concurrency within each agent
  3. Structured cancellation — when a session ends or ESC is pressed, all spawned tasks are cancelled cleanly via CancellationToken (tokio-util) rather than just setting an AtomicBool
  4. Backpressure on tool results — if an agent's result queue is full, slow down rather than drop
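
Items 1, 2, and 4 all come down to bounded-concurrency primitives — in the async codebase that would be `tokio::sync::Semaphore` plus a bounded `mpsc` channel. Below is a minimal, synchronous std-only sketch of the same ideas (the cap of 2 and queue size of 4 are illustrative, not proposed values; `run_governor` is a hypothetical name):

```rust
use std::sync::mpsc::sync_channel;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

// Counting semaphore: a synchronous stand-in for tokio::sync::Semaphore.
struct Semaphore {
    permits: Mutex<usize>,
    cv: Condvar,
}

impl Semaphore {
    fn new(n: usize) -> Self {
        Semaphore { permits: Mutex::new(n), cv: Condvar::new() }
    }
    fn acquire(&self) {
        let mut p = self.permits.lock().unwrap();
        while *p == 0 {
            p = self.cv.wait(p).unwrap();
        }
        *p -= 1;
    }
    fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.cv.notify_one();
    }
}

// Spawn `tasks` workers, but let at most `max_inflight` enter their
// "streaming" section at once; results flow through a bounded queue
// that blocks (backpressure) instead of dropping when full.
fn run_governor(tasks: usize, max_inflight: usize) -> usize {
    let sem = Arc::new(Semaphore::new(max_inflight));
    let (tx, rx) = sync_channel::<usize>(4); // bounded result queue

    let handles: Vec<_> = (0..tasks)
        .map(|id| {
            let sem = Arc::clone(&sem);
            let tx = tx.clone();
            thread::spawn(move || {
                sem.acquire();        // blocks once `max_inflight` are streaming
                tx.send(id).unwrap(); // blocks while the result queue is full
                sem.release();
            })
        })
        .collect();
    drop(tx); // main's sender is closed; workers hold their own clones

    // Drain results while workers run; the iterator ends once every
    // worker has finished and dropped its sender.
    let received = rx.iter().count();
    for h in handles {
        h.join().unwrap();
    }
    received
}

fn main() {
    // Every result is delivered despite both caps; nothing is dropped.
    println!("{}", run_governor(8, 2));
}
```

In the real async code the semaphore permit would be acquired before starting an LLM stream and released when the stream completes, and the per-agent tool semaphore would wrap tool spawns the same way.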

Relevant Code

  • src/agent/mod.rs — cancelled: Arc<AtomicBool> (current cancellation)
  • src/agent/execution.rs — tool execution spawning
  • src/orchestration/parallel.rs — parallel orchestration
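
For item 3, the key difference from the current `Arc<AtomicBool>` in `src/agent/mod.rs` is that `tokio_util::sync::CancellationToken` lets tasks *wait* on cancellation rather than poll a flag. A simplified std-only sketch of that shape (real child tokens are hierarchical and independently cancellable; here they just share the parent's state, and `run_cancel_demo` is a hypothetical name):

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

// Std-only stand-in for tokio_util::sync::CancellationToken: unlike a
// bare Arc<AtomicBool>, tasks can block on it instead of polling.
#[derive(Clone)]
struct CancelToken {
    inner: Arc<(Mutex<bool>, Condvar)>,
}

impl CancelToken {
    fn new() -> Self {
        CancelToken { inner: Arc::new((Mutex::new(false), Condvar::new())) }
    }
    // Simplification: child tokens share the parent's state, so
    // cancelling the session token cancels every task spawned under it.
    fn child(&self) -> Self {
        self.clone()
    }
    fn cancel(&self) {
        *self.inner.0.lock().unwrap() = true;
        self.inner.1.notify_all();
    }
    // Synchronous analogue of `token.cancelled().await`.
    fn wait(&self) {
        let mut cancelled = self.inner.0.lock().unwrap();
        while !*cancelled {
            cancelled = self.inner.1.wait(cancelled).unwrap();
        }
    }
}

fn run_cancel_demo() -> &'static str {
    let session = CancelToken::new(); // owned by the session
    let agent = session.child();      // handed to each spawned task
    let worker = thread::spawn(move || {
        agent.wait(); // real code would select! this against its work
        "stopped"
    });
    session.cancel(); // ESC pressed / session ended
    worker.join().unwrap()
}

fn main() {
    println!("{}", run_cancel_demo());
}
```

In the tokio version, each spawned task would `tokio::select!` between its actual work and `token.cancelled()`, so session teardown is a single `cancel()` call rather than hoping every loop checks the flag.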

Priority

P1 — stability under load

Metadata


Labels: enhancement (New feature or request)
