Smart Model Routing

Automatic complexity-based model routing for OpenClaw. Routes requests to the right model for the job — fast, cheap models for simple tasks and capable models for complex ones.

Why Smart Routing?

Most AI workloads follow a power-law distribution: the majority of requests are simple (greetings, quick lookups, short Q&A), while only a fraction require deep reasoning or multi-step orchestration. Without smart routing, every request hits the same model regardless of complexity.

Smart routing is a quality optimization that shifts spend to where it matters. The cost impact depends on your setup:

If your default model is expensive (e.g., Opus): Smart routing is a clear cost saver. Simple tasks drop to Haiku (-80%), standard tasks drop to Sonnet (-40%), and only truly complex work stays on Opus. In our benchmarks, this saves 29.5% with no quality loss.

If your default model is mid-tier (e.g., Sonnet): Smart routing is a quality investment. Simple tasks still drop to Haiku (-67% each), but complex tasks get upgraded to Opus (+67% each) for better output on the work that matters most. Net cost increases ~17.6%, but you get Opus-quality refactoring, architecture design, and multi-step reasoning where the quality difference is most visible.

If you want pure savings from any baseline: Configure only the fast and standard tiers (no heavy tier). This routes simple tasks to Haiku and keeps everything else on your default model — ~30% savings with zero quality regression.

Faster responses. Smaller models have lower latency. Simple questions get answered faster when routed to a lightweight model, giving users snappier interactions for the easy stuff.

How It Works

The plugin registers a before_model_resolve hook that classifies each incoming prompt into one of three complexity tiers:

fast — Simple greetings, factual lookups, status checks, single-turn Q&A
standard — Code questions, debugging, reviews, tool usage, file operations, single-step implementation
heavy — Multi-step refactoring, architecture design, large-scale migrations, complex multi-signal reasoning

Based on the tier, the plugin overrides the model selection to route to the configured model for that tier.

Installation

Install the plugin and configure it in your openclaw.json (at ~/.openclaw/openclaw.json):

openclaw plugins install @openclaw/smart-routing

Then add the plugin config under plugins.entries.smart-routing.config:

{
  "plugins": {
    "entries": {
      "smart-routing": {
        "enabled": true,
        "config": {
          "enabled": true,
          "classifier": "heuristic",
          "tiers": {
            "fast": { "model": "anthropic/claude-haiku-4-5" },
            "standard": { "model": "anthropic/claude-sonnet-4-6" },
            "heavy": { "model": "anthropic/claude-opus-4-6" }
          }
        }
      }
    }
  }
}

Configuration

Field	Type	Default	Description
`enabled`	boolean	`false`	Enable/disable smart routing
`classifier`	string	`"heuristic"`	Classifier mode: `"heuristic"`, `"llm"`, or `"hybrid"`
`tiers`	object	—	Per-tier model mappings (see below)
`hybridConfidenceThreshold`	number	`0.8`	Min heuristic confidence to skip LLM in hybrid mode

Tier Configuration

Each tier (fast, standard, heavy) accepts:

Field	Type	Description
`model`	string	Required. Model to use (`provider/model` format)
`patterns`	string[]	Keywords that boost this tier's score (case-insensitive)
`triggers`	string[]	Additional trigger words (same as patterns)

Classifier Modes

heuristic (default) — Zero-cost, zero-latency classification using message characteristics:

Message length, line count, presence of code blocks, file paths, URLs
Greeting and simple question detection
Standard verb detection ("debug", "review", "implement", "optimize") → standard tier
Heavy verb detection ("refactor", "architect", "restructure", "migrate") → scores in both standard and heavy, creating ambiguity for hybrid mode
Multi-step language detection ("first...then", numbered steps)
Heavy tier requires multiple stacked signals (verb + multi-step + length) for high confidence
Custom patterns and triggers from config

Best for: keeping routing overhead at absolute zero. No API calls, no added latency.

llm — Uses the fast-tier model to classify every request via a structured API call. Adds ~200ms latency but is more accurate for ambiguous prompts. Requires an API key in the environment (e.g., ANTHROPIC_API_KEY). Falls back to heuristic on any error.

Best for: maximum classification accuracy when the latency cost is acceptable.

hybrid — Runs heuristic first. If confidence is below hybridConfidenceThreshold, falls back to the LLM classifier (Haiku) for the final call. Clear-cut cases (greetings, obvious complex tasks) resolve instantly via heuristic at zero cost. Ambiguous cases — like a single "refactor" verb without other complexity signals — get a Haiku classification call (~$0.0004) to make the nuanced Sonnet-vs-Opus decision.

Best for: the optimal balance of speed and accuracy in production.

Examples

Minimal setup (heuristic only)

Two tiers are enough. Simple messages route to Haiku; everything else goes to Sonnet. No heavy tier means complex tasks also use Sonnet.

{
  "enabled": true,
  "classifier": "heuristic",
  "tiers": {
    "fast": { "model": "anthropic/claude-haiku-4-5" },
    "standard": { "model": "anthropic/claude-sonnet-4-6" }
  }
}

Hybrid with custom patterns

Custom patterns and triggers let you tune classification for your specific workload. Heavy triggers should be system-scope verbs that indicate large-scale work — the heuristic already handles standard verbs like "debug" and "review" without config.

{
  "enabled": true,
  "classifier": "hybrid",
  "hybridConfidenceThreshold": 0.75,
  "tiers": {
    "fast": {
      "model": "anthropic/claude-haiku-4-5",
      "patterns": ["greeting", "status", "weather", "time", "date"]
    },
    "standard": { "model": "anthropic/claude-sonnet-4-6" },
    "heavy": {
      "model": "anthropic/claude-opus-4-6",
      "triggers": ["refactor", "architect", "restructure", "rewrite", "migrate", "redesign"]
    }
  }
}

Multi-provider setup

Mix providers across tiers. Use whichever model gives the best cost/performance ratio at each complexity level.

{
  "enabled": true,
  "classifier": "heuristic",
  "tiers": {
    "fast": { "model": "openai/gpt-4o-mini" },
    "standard": { "model": "anthropic/claude-sonnet-4-6" },
    "heavy": { "model": "anthropic/claude-opus-4-6" }
  }
}

Benchmarking

Run the included benchmark script to see how smart routing would classify a set of sample prompts and estimate cost savings vs a single-model baseline:

npx tsx benchmark.ts

The benchmark runs a representative prompt set through the classifier, shows the tier and model each prompt routes to, and compares costs against two baselines (Opus and Sonnet). It shows both savings and cost increases so you can make an informed decision about your tier configuration.

Observability

The plugin logs every routing decision at info level:

[smart-routing] tier=fast model=anthropic/claude-haiku-4-5 confidence=0.85 classifier=heuristic reason="scores: fast=11; selected fast"

Each log entry includes the selected tier, routed model, classifier confidence, which classifier made the decision, and the raw scoring reason. Use these logs to monitor routing distribution and tune your tier configuration.

Behavior Notes

Heartbeat and cron runs are skipped — these have their own model configs.
Missing tier fallback — if the classified tier has no model configured, the plugin falls back to the standard tier. If standard is also missing, routing is skipped and the default model is used.
Plugin hook precedence — other before_model_resolve plugins with higher priority can override the smart routing decision.
LLM classifier auth — the LLM classifier reads API keys from environment variables (ANTHROPIC_API_KEY for Anthropic, OPENAI_API_KEY for OpenAI). It does not use OpenClaw's auth profile system.
No breaking changes — disabling the plugin or removing its config returns OpenClaw to default single-model behavior. There are no migrations or side effects.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
benchmark.ts		benchmark.ts
index.ts		index.ts
openclaw.plugin.json		openclaw.plugin.json
package-lock.json		package-lock.json
package.json		package.json
vitest.config.ts		vitest.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Smart Model Routing

Why Smart Routing?

How It Works

Installation

Configuration

Tier Configuration

Classifier Modes

Examples

Minimal setup (heuristic only)

Hybrid with custom patterns

Multi-provider setup

Benchmarking

Observability

Behavior Notes

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Smart Model Routing

Why Smart Routing?

How It Works

Installation

Configuration

Tier Configuration

Classifier Modes

Examples

Minimal setup (heuristic only)

Hybrid with custom patterns

Multi-provider setup

Benchmarking

Observability

Behavior Notes

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages