
feat: add model fallback chain with error classification and cooldown #92

Closed · Leeaandrob wants to merge 1 commit into sipeed:main from Leeaandrob:feat/model-fallback-chain

Conversation

@Leeaandrob (Collaborator)

Summary

  • Implement a 2-layer model fallback system (text + image/multimodal) that automatically retries failed LLM requests across multiple providers/models
  • Error classification engine with ~40 patterns matching 7 failure categories (auth, rate_limit, billing, timeout, format, overloaded, unknown); a sketch of this classification follows this list
  • Per-provider cooldown tracker with standard exponential backoff (1m→5m→25m→1h) and billing-specific backoff (5h→10h→20h→24h)
  • 24h failure window reset: error counts reset to 0 after 24h of no failures
  • Model reference parsing supporting provider/model format with provider normalization
  • Candidate deduplication prevents trying the same provider/model twice
  • Fully backward compatible: without model_fallbacks configured, behavior is unchanged
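
A minimal sketch of what this substring-based classification could look like; the pattern set shown here and the classifyError / Reason* names are illustrative stand-ins for the PR's real ~40-pattern table in error_classifier.go:

package fallback

import (
	"context"
	"errors"
	"strings"
)

// FailoverReason is an illustrative stand-in; the PR defines the real enum in types.go.
type FailoverReason string

const (
	ReasonAuth       FailoverReason = "auth"
	ReasonRateLimit  FailoverReason = "rate_limit"
	ReasonBilling    FailoverReason = "billing"
	ReasonTimeout    FailoverReason = "timeout"
	ReasonFormat     FailoverReason = "format"
	ReasonOverloaded FailoverReason = "overloaded"
	ReasonUnknown    FailoverReason = "unknown"
)

// classifyError maps an error onto a failure category by substring matching.
// Only a handful of example patterns are shown; the real classifier has ~40.
func classifyError(err error) FailoverReason {
	if err == nil {
		return ReasonUnknown
	}
	msg := strings.ToLower(err.Error())
	switch {
	case errors.Is(err, context.DeadlineExceeded), strings.Contains(msg, "timeout"):
		return ReasonTimeout
	case strings.Contains(msg, "invalid api key"), strings.Contains(msg, "unauthorized"):
		return ReasonAuth
	case strings.Contains(msg, "rate limit"), strings.Contains(msg, "429"):
		return ReasonRateLimit
	case strings.Contains(msg, "insufficient credit"), strings.Contains(msg, "quota exceeded"):
		return ReasonBilling
	case strings.Contains(msg, "overloaded"), strings.Contains(msg, "503"):
		return ReasonOverloaded
	case strings.Contains(msg, "invalid image"), strings.Contains(msg, "unsupported format"):
		return ReasonFormat
	default:
		return ReasonUnknown
	}
}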

New Files (8)

File                      Lines  Purpose
model_ref.go              64     Parse provider/model references with normalization
error_classifier.go       253    Classify ~40 error patterns into FailoverReason
cooldown.go               207    Per-provider cooldown with standard + billing formulas
fallback.go               283    FallbackChain orchestrator: Execute() + ExecuteImage()
model_ref_test.go         125    10 test cases
error_classifier_test.go  337    20 test cases
cooldown_test.go          269    13 test cases
fallback_test.go          473    21 test cases
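
For reference, a rough sketch of the parsing model_ref.go is described as doing; the ModelRef shape, parseModelRef, and key() are assumptions used for illustration, not the PR's exact code:

package fallback

import "strings"

// ModelRef is an illustrative parsed "provider/model" reference.
type ModelRef struct {
	Provider string
	Model    string
}

// parseModelRef splits "provider/model" and lower-cases the provider.
// A bare "model" (no slash) keeps an empty Provider so a default can be applied.
func parseModelRef(s string) ModelRef {
	provider, model, found := strings.Cut(strings.TrimSpace(s), "/")
	if !found {
		return ModelRef{Model: provider} // no slash: Cut left the whole string in "provider"
	}
	return ModelRef{
		Provider: strings.ToLower(strings.TrimSpace(provider)),
		Model:    strings.TrimSpace(model),
	}
}

// key gives a dedup key so the same provider/model pair is only tried once.
func (r ModelRef) key() string {
	return r.Provider + "/" + r.Model
}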

Modified Files (3)

  • types.go: Add FailoverError, FailoverReason enum, ModelConfig
  • config.go: Add model_fallbacks, image_model, image_model_fallbacks + helper methods
  • loop.go: Integrate FallbackChain into runLLMIteration() when candidates > 1 (a sketch of the candidate loop follows this list)
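
A hedged sketch of the kind of candidate loop Execute() might run. The callLLM callback, the function-field stand-ins for the cooldown tracker, and this exact FallbackChain shape are assumptions (the real struct lives in fallback.go); it reuses ModelRef, classifyError, and the Reason* constants from the sketches above:

package fallback

import (
	"context"
	"errors"
	"fmt"
)

// FallbackChain here is a stand-in for the real struct in fallback.go: only the
// pieces the candidate loop needs are represented, as function fields.
type FallbackChain struct {
	coolingDown   func(provider string) bool                   // stands in for the cooldown tracker query
	recordFailure func(provider string, reason FailoverReason) // stands in for the tracker's bookkeeping
}

// Execute tries candidates in order, deduplicating provider/model pairs, skipping
// providers in cooldown, and falling through only on retriable failures.
func (c *FallbackChain) Execute(
	ctx context.Context,
	candidates []ModelRef,
	callLLM func(ctx context.Context, ref ModelRef) (string, error),
) (string, error) {
	seen := map[string]bool{}
	var lastErr error
	for _, ref := range candidates {
		if seen[ref.key()] {
			continue // never try the same provider/model twice
		}
		seen[ref.key()] = true
		if c.coolingDown(ref.Provider) {
			continue // provider is inside a cooldown window
		}
		out, err := callLLM(ctx, ref)
		if err == nil {
			return out, nil
		}
		if errors.Is(err, context.Canceled) {
			return "", err // user abort: never triggers fallback
		}
		reason := classifyError(err)
		c.recordFailure(ref.Provider, reason)
		if reason == ReasonFormat {
			return "", err // non-retriable: abort immediately
		}
		lastErr = err
	}
	return "", fmt.Errorf("all fallback candidates failed: %w", lastErr)
}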

Config Example

{
  "agents": {
    "defaults": {
      "model": "gpt-4",
      "model_fallbacks": ["anthropic/claude-opus", "groq/llama-3"],
      "image_model": "openai/gpt-4o",
      "image_model_fallbacks": ["anthropic/claude-sonnet"]
    }
  }
}

Cooldown Behavior

Type      Errors           Cooldown
Standard  1 / 2 / 3 / 4+   1m / 5m / 25m / 1h
Billing   1 / 2 / 3 / 4+   5h / 10h / 20h / 24h
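
A minimal sketch of a backoff formula consistent with the table: multiply by 5 capped at 1h for standard errors, multiply by 2 capped at 24h for billing errors. The function name and cap handling are assumptions, not cooldown.go's actual code:

package fallback

import "time"

// cooldownDuration reproduces the table above: failure counts 1..4+ map to
// 1m/5m/25m/1h for standard errors and 5h/10h/20h/24h for billing errors.
// (The 24h failure-window reset would zero the failure count before this is called.)
func cooldownDuration(failures int, billing bool) time.Duration {
	if failures < 1 {
		return 0
	}
	base, factor, limit := time.Minute, time.Duration(5), time.Hour
	if billing {
		base, factor, limit = 5*time.Hour, 2, 24*time.Hour
	}
	d := base
	for i := 1; i < failures; i++ {
		d *= factor
		if d >= limit {
			return limit
		}
	}
	return d
}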

Test plan

  • go build ./... passes
  • go vet ./... passes
  • All 128 provider tests pass (64 new + 64 existing)
  • 95%+ coverage on new code
  • Backward compatible (no fallbacks = unchanged behavior)
  • context.Canceled never triggers fallback (user abort)
  • Non-retriable errors (format, image dimension/size) abort immediately
  • Concurrent access safe (sync.RWMutex in cooldown tracker; see the sketch below)
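
Sketched below, under the same caveats as the earlier blocks, is one way a sync.RWMutex-guarded tracker could expose cooldown state to the fallback loop; it reuses FailoverReason, ReasonBilling, and cooldownDuration from the sketches above, and all field and method names are assumptions:

package fallback

import (
	"sync"
	"time"
)

// CooldownTracker is one possible shape for cooldown.go's per-provider state.
// The sync.RWMutex lets the fallback loop read cooldown status concurrently
// while failures are being recorded.
type CooldownTracker struct {
	mu    sync.RWMutex
	state map[string]providerState
}

type providerState struct {
	failures    int
	lastFailure time.Time
	until       time.Time
}

func NewCooldownTracker() *CooldownTracker {
	return &CooldownTracker{state: make(map[string]providerState)}
}

// InCooldown reports whether a provider should be skipped right now.
func (t *CooldownTracker) InCooldown(provider string) bool {
	t.mu.RLock()
	defer t.mu.RUnlock()
	return time.Now().Before(t.state[provider].until)
}

// RecordFailure bumps the failure count (resetting it after 24h without failures)
// and schedules the next cooldown window using cooldownDuration from the sketch above.
func (t *CooldownTracker) RecordFailure(provider string, reason FailoverReason) {
	t.mu.Lock()
	defer t.mu.Unlock()
	s := t.state[provider]
	if !s.lastFailure.IsZero() && time.Since(s.lastFailure) > 24*time.Hour {
		s.failures = 0
	}
	s.failures++
	s.lastFailure = time.Now()
	s.until = time.Now().Add(cooldownDuration(s.failures, reason == ReasonBilling))
	t.state[provider] = s
}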

Implement a 2-layer model fallback system that automatically retries
failed LLM requests across multiple providers/models:

- model_ref.go: Parse "provider/model" references with normalization
- error_classifier.go: Classify ~40 error patterns into 7 categories
  (auth, rate_limit, billing, timeout, format, overloaded, unknown)
- cooldown.go: Per-provider cooldown with standard backoff (1m→5m→25m→1h)
  and billing-specific backoff (5h→10h→20h→24h), 24h failure window reset
- fallback.go: FallbackChain orchestrator with Execute (text) and
  ExecuteImage (multimodal), candidate deduplication, context cancellation
- types.go: FailoverError, FailoverReason enum, ModelConfig
- config.go: model_fallbacks, image_model, image_model_fallbacks fields
- loop.go: Integration with AgentLoop when fallback candidates configured

Config example:
  "model": "gpt-4",
  "model_fallbacks": ["anthropic/claude-opus", "groq/llama-3"],
  "image_model": "openai/gpt-4o",
  "image_model_fallbacks": ["anthropic/claude-sonnet"]

Backward compatible: without fallbacks configured, behavior is unchanged.
@Leeaandrob (Collaborator, Author)

Superseded by PR #131 which includes the model-fallback-chain as part of the multi-agent-routing feature branch.
