feat: model fallback chain + multi-agent routing #131

Open
Leeaandrob wants to merge 6 commits into sipeed:main from Leeaandrob:feat/multi-agent-routing

Leeaandrob commented Feb 13, 2026

Overview

This PR delivers two major features for PicoClaw, closing the gap with OpenClaw's production capabilities:

  1. Model Fallback Chain — Automatic failover between LLM models with intelligent error classification
  2. Multi-Agent Routing — Declarative agent bindings with 7-level priority resolution

Both features are backward compatible — existing single-agent configs continue to work unchanged.


Feature 1: Model Fallback Chain

When a primary model fails (rate limit, billing, auth error, etc.), the system automatically rotates to the next available candidate.

What's included

  • Error classifier (pkg/providers/error_classifier.go) — ~40 regex patterns across 6 categories: rate_limit, billing, auth, context_length, content_filter, server_error
  • Cooldown tracker (pkg/providers/cooldown.go) — Exponential backoff per model, separate billing cooldown (5h base), automatic recovery
  • Fallback manager (pkg/providers/fallback.go) — Candidate rotation, availability checks, transparent retry
  • Model reference parser (pkg/providers/model_ref.go) — Handles provider/model format and plain model names

Config

{
  "agents": {
    "defaults": {
      "model": "google/gemini-2.5-pro",
      "model_fallbacks": ["openai/gpt-4o", "anthropic/claude-sonnet"]
    }
  }
}
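
A sketch of how the strings in this config might be turned into an ordered candidate list. `ModelRef`, `parseModelRef`, and `candidates` are hypothetical names, and splitting on the first slash and de-duplicating repeated entries are assumptions, not confirmed behavior of pkg/providers/model_ref.go.

```go
package main

import (
	"fmt"
	"strings"
)

// ModelRef is a parsed model string. Provider is empty for plain model names.
type ModelRef struct {
	Provider string
	Model    string
}

// parseModelRef accepts "provider/model" or a plain model name.
// Assumption: only the first slash separates provider from model.
func parseModelRef(s string) (ModelRef, error) {
	s = strings.TrimSpace(s)
	if s == "" {
		return ModelRef{}, fmt.Errorf("empty model reference")
	}
	provider, model, found := strings.Cut(s, "/")
	if !found {
		return ModelRef{Model: s}, nil
	}
	if provider == "" || model == "" {
		return ModelRef{}, fmt.Errorf("malformed model reference %q", s)
	}
	return ModelRef{Provider: strings.ToLower(provider), Model: model}, nil
}

// candidates returns the primary followed by its fallbacks, skipping duplicates.
func candidates(primary string, fallbacks []string) ([]ModelRef, error) {
	seen := map[ModelRef]bool{}
	var out []ModelRef
	for _, s := range append([]string{primary}, fallbacks...) {
		ref, err := parseModelRef(s)
		if err != nil {
			return nil, err
		}
		if !seen[ref] {
			seen[ref] = true
			out = append(out, ref)
		}
	}
	return out, nil
}

func main() {
	refs, err := candidates("google/gemini-2.5-pro",
		[]string{"openai/gpt-4o", "anthropic/claude-sonnet"})
	fmt.Println(refs, err)
}
```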

Tests

  • 24 unit tests covering error classification, cooldown math, fallback rotation, and edge cases

Feature 2: Multi-Agent Routing

Multiple agents with independent models, workspaces, sessions, and tools — routed by declarative bindings.

What's included

  • Route resolver (pkg/routing/route.go) — 7-level priority cascade: peer > parent_peer > guild > team > account > channel_wildcard > default
  • Agent registry (pkg/agent/registry.go) — Lifecycle management, per-agent state isolation, subagent ACL
  • Agent instance (pkg/agent/instance.go) — Per-agent container with workspace, model, session manager, tools
  • Session key builder (pkg/routing/session_key.go) — 4 DM scopes: per-peer, per-channel, per-account, global
  • Channel adapters — Telegram, Discord, Slack emit peer metadata (Kind, ID, ParentID, GuildID)
  • Subagent allowlist — Per-agent spawn control via subagents.allow_agents
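
One way to picture the priority cascade is to rank every binding against the inbound message and keep the most specific match. The sketch below is an assumption-laden illustration, not pkg/routing/route.go: the `GuildID`, `TeamID`, and `AccountID` match fields and all function names are hypothetical, and ties go to the binding listed first.

```go
package main

import "fmt"

type Peer struct{ Kind, ID string }

// Match mirrors the "match" object in a binding; nil or empty means "not constrained".
type Match struct {
	Channel   string
	Peer      *Peer
	GuildID   string // hypothetical field
	TeamID    string // hypothetical field
	AccountID string // hypothetical field
}

type Binding struct {
	AgentID string
	Match   Match
}

type Inbound struct {
	Channel    string
	Peer       Peer
	ParentPeer *Peer // e.g. the channel a thread belongs to
	GuildID    string
	TeamID     string
	AccountID  string
}

// Lower value means higher priority; default is the implicit seventh level.
const (
	levelPeer = iota
	levelParentPeer
	levelGuild
	levelTeam
	levelAccount
	levelChannelWildcard
	levelNoMatch
)

func scoped(want, got string, lvl int) int {
	if want == got {
		return lvl
	}
	return levelNoMatch
}

// matchLevel reports how specifically a binding matches the message.
func matchLevel(m Match, in Inbound) int {
	if m.Channel != in.Channel {
		return levelNoMatch
	}
	switch {
	case m.Peer != nil:
		if *m.Peer == in.Peer {
			return levelPeer
		}
		if in.ParentPeer != nil && *m.Peer == *in.ParentPeer {
			return levelParentPeer
		}
		return levelNoMatch
	case m.GuildID != "":
		return scoped(m.GuildID, in.GuildID, levelGuild)
	case m.TeamID != "":
		return scoped(m.TeamID, in.TeamID, levelTeam)
	case m.AccountID != "":
		return scoped(m.AccountID, in.AccountID, levelAccount)
	}
	return levelChannelWildcard
}

// resolve picks the highest-priority binding, or the default agent if none match.
func resolve(bindings []Binding, in Inbound, defaultAgent string) string {
	best, agent := levelNoMatch, defaultAgent
	for _, b := range bindings {
		if lvl := matchLevel(b.Match, in); lvl < best {
			best, agent = lvl, b.AgentID
		}
	}
	return agent
}

func main() {
	bindings := []Binding{
		{AgentID: "ops", Match: Match{Channel: "discord"}},
		{AgentID: "cto", Match: Match{Channel: "discord", Peer: &Peer{Kind: "channel", ID: "123456"}}},
	}
	in := Inbound{Channel: "discord", Peer: Peer{Kind: "channel", ID: "123456"}}
	fmt.Println(resolve(bindings, in, "main"))
}
```

Because each binding is ranked independently, the order of entries in the config only matters for ties at the same level.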

Architecture

InboundMessage → RouteResolver (binding match) → AgentInstance (isolated workspace/session)
                                                    ↓
                                              runAgentLoop → LLM (agent model + fallbacks)

Config

{
  "agents": {
    "defaults": { "model": "gpt-4", "model_fallbacks": ["gpt-4o-mini"] },
    "list": [
      { "id": "main", "default": true },
      {
        "id": "cto",
        "model": { "primary": "claude-opus", "fallbacks": ["haiku"] },
        "subagents": { "allow_agents": ["main"] }
      }
    ]
  },
  "bindings": [
    {
      "agent_id": "cto",
      "match": { "channel": "discord", "peer": { "kind": "channel", "id": "123456" } }
    }
  ],
  "session": {
    "dm_scope": "per-peer",
    "identity_links": { "john": ["telegram:123", "discord:456"] }
  }
}

Tests

  • 20 unit tests across routing, registry, config, and channel adapters

Files changed (27 files)

New packages

| Package | Files | Purpose |
| --- | --- | --- |
| pkg/routing/ | route.go, session_key.go, agent_id.go + tests | Route resolution, session scoping, ID normalization |
| pkg/providers/ | fallback.go, cooldown.go, error_classifier.go, model_ref.go + tests | Fallback chain engine |

Modified

| File | Change |
| --- | --- |
| pkg/agent/loop.go | Rewritten for multi-agent: registry-based dispatch, per-agent tool loop |
| pkg/config/config.go | AgentsConfig.List, AgentBinding, SessionConfig, AgentModelConfig |
| pkg/channels/telegram.go | Emit peer metadata (user/group/supergroup) |
| pkg/channels/discord.go | Emit peer metadata (DM/channel/thread + guild) |
| pkg/channels/slack.go | Emit peer metadata (DM/channel/thread) |
| pkg/channels/base.go | PeerInfo struct with Kind, ID, ParentID, GuildID |
| pkg/tools/spawn.go | Allowlist check + AsyncCallback integration |
| pkg/tools/subagent.go | Agent ID param for cross-agent spawning |

New

| File | Purpose |
| --- | --- |
| pkg/agent/instance.go | Per-agent state container |
| pkg/agent/registry.go | Agent lifecycle + route resolution |
| pkg/agent/registry_test.go | Registry unit tests |

Validation

  • go build ./... — clean
  • go vet ./... — clean
  • go test ./... — 44 tests passing (all packages)
  • gofmt — formatted
  • CI checks — fmt-check, vet, test all green
  • Backward compatible — empty agents.list creates implicit "main" agent
  • Live tested with Discord: binding routing (channel → cto agent) + default fallback (unbound → main agent)
  • Synced with upstream/main (merge commit resolving 5 conflict files)
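
The backward-compatibility rule in the list above (an empty agents.list yields an implicit "main" agent) could look roughly like this. The struct shapes and `effectiveAgents` are hypothetical simplifications, and letting listed agents without a model inherit the defaults is an assumption suggested by the `{ "id": "main", "default": true }` entry in the sample config.

```go
package main

import "fmt"

// Simplified, hypothetical config shapes for illustration only.
type AgentDefaults struct {
	Model          string
	ModelFallbacks []string
}

type AgentConfig struct {
	ID        string
	Default   bool
	Model     string
	Fallbacks []string
}

type AgentsConfig struct {
	Defaults AgentDefaults
	List     []AgentConfig
}

// effectiveAgents returns the agents to register at startup.
func effectiveAgents(cfg AgentsConfig) []AgentConfig {
	if len(cfg.List) == 0 {
		// Legacy single-agent config: synthesize "main" from the defaults.
		return []AgentConfig{{
			ID:        "main",
			Default:   true,
			Model:     cfg.Defaults.Model,
			Fallbacks: cfg.Defaults.ModelFallbacks,
		}}
	}
	out := make([]AgentConfig, len(cfg.List))
	for i, a := range cfg.List {
		// Assumption: agents without their own model inherit the defaults.
		if a.Model == "" {
			a.Model = cfg.Defaults.Model
			a.Fallbacks = cfg.Defaults.ModelFallbacks
		}
		out[i] = a
	}
	return out
}

func main() {
	legacy := AgentsConfig{Defaults: AgentDefaults{Model: "gpt-4", ModelFallbacks: []string{"gpt-4o-mini"}}}
	fmt.Printf("%+v\n", effectiveAgents(legacy))
}
```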

Closes

Add 2-layer fallback system (text + image) with automatic candidate
resolution. Includes error classifier (~40 patterns), per-provider
cooldown (exponential backoff), and model reference parsing.

- FailoverError/FailoverReason types for structured error handling
- ErrorClassifier with rate_limit, billing, auth, timeout patterns
- FallbackChain with cooldown management and candidate rotation
- ModelRef parser for provider/model string format
- 128 tests, 95%+ coverage

Implement per-agent workspace/model/session isolation with 7-level
priority routing cascade (peer > parent_peer > guild > team > account >
channel > default). Backward compatible - empty agents.list creates
implicit "main" agent from defaults.

Core components:
- routing/agent_id.go: ID normalization with pre-compiled regex
- routing/session_key.go: 4 DM scope modes with identity links
- routing/route.go: RouteResolver with priority-based binding matcher
- agent/instance.go: Per-agent state (workspace, sessions, tools, model)
- agent/registry.go: Agent lifecycle, route resolution, subagent ACL

Integration:
- config.go: AgentModelConfig (flexible JSON), bindings, session config
- loop.go: Complete rewrite for multi-agent dispatch
- Channel adapters: peer_kind/peer_id metadata (telegram, discord, slack)
- spawn.go: Subagent allowlist enforcement per agent

Validated end-to-end with Discord channel-based bindings, default
fallback routing, and per-agent session persistence.

Resolve conflicts in loop.go, config.go, config_test.go,
spawn.go, and subagent.go. Integrate upstream ToolResult/AsyncTool
pattern with multi-agent routing features. Rename mockProvider
to mockRegistryProvider in registry_test.go to avoid redeclaration
with upstream's loop_test.go.

Remove extra spaces in comment alignment to pass fmt-check CI.

Leeaandrob changed the title from "feat: multi-agent routing with declarative bindings" to "feat: model fallback chain + multi-agent routing" on Feb 13, 2026.

Update registerSharedTools to use new WebSearchToolOptions API and
add hardware tools (I2C, SPI) from upstream. Accept upstream's
new web tools config test.

Resolve conflicts:
- pkg/agent/loop.go: integrate context compression, command handling,
  utf8 token estimation, and summarization notification into
  multi-agent routing architecture
- pkg/config/config_test.go: merge imports from both branches
- pkg/agent/loop_test.go: update test to use registry-based sessions