Problem
Every user prompt in the main conversation thread runs on the same model (e.g., claude-opus-4-6/max) regardless of complexity. A simple "fix this typo" burns the same tokens as "architect a distributed system." This wastes significant cost on trivial interactions.
Current State
OhMyOpenAgent already has the infrastructure for model routing:
- Category → model mapping exists (`quick` → Haiku, `deep` → Opus, `unspecified-low` → Sonnet) but is only used for subagent delegation via `delegate_task`, not the main thread.
- The `ultrawork` keyword allows user-initiated escalation but requires manual intervention.
- Model resolution pipeline (`model-resolution-pipeline.ts`) has a clear priority chain that could accommodate a new `provenance: "auto-classified"` step.
- `small_model` exists in OpenCode core but is only wired to title generation.
Proposed Solution
Add an optional complexity classifier that runs before the main model, categorizes the prompt, and selects the model via the existing category → model mapping.
Architecture
User prompt
│
▼
┌──────────────────────┐
│ Complexity Classifier│ ← Lightweight (regex/heuristic OR small_model call)
│ Returns: category │
└──────────┬───────────┘
│
▼
┌──────────────────────┐
│ Model Resolution │ ← Existing pipeline, new provenance: "auto-classified"
│ Pipeline │ Sits between "override" and "category-default"
└──────────┬───────────┘
│
▼
Selected model
Classifier Options (in order of preference)
Option A: Heuristic (zero-cost, instant)
- Prompt < 50 tokens + no code blocks + no "architect/design/debug/refactor" keywords → `quick` (Haiku)
- Prompt mentions files but is simple ("rename X to Y", "add import") → `unspecified-low` (Sonnet)
- Everything else → configured default (Opus)
- User can always override with the `ultrawork` keyword
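A minimal sketch of the Option A heuristic, assuming a rough ~4-characters-per-token estimate; the function and constant names are illustrative, not existing project code:

```typescript
// Hypothetical Option A sketch — not the project's actual API.
type Category = "quick" | "unspecified-low" | "unspecified-high";

const COMPLEX_KEYWORDS = /\b(architect|design|debug|refactor)\b/i;
const SIMPLE_EDIT = /\b(rename|add import|fix typo)\b/i;

// Crude token estimate: ~4 characters per token.
const estimateTokens = (s: string): number => Math.ceil(s.length / 4);

function classifyHeuristic(prompt: string): Category {
  const hasCodeBlock = /`{3}/.test(prompt); // triple-backtick fence present?
  if (
    estimateTokens(prompt) < 50 &&
    !hasCodeBlock &&
    !COMPLEX_KEYWORDS.test(prompt)
  ) {
    return "quick"; // → Haiku
  }
  if (SIMPLE_EDIT.test(prompt)) {
    return "unspecified-low"; // → Sonnet
  }
  return "unspecified-high"; // → configured default (Opus)
}
```

Zero latency and zero tokens, at the cost of some misclassification at the margins.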
Option B: small_model classifier (low-cost, ~100 tokens)
- Send the prompt to `small_model` with a fixed system prompt: "Classify this developer request as: quick, low, high, ultrabrain. Return one word."
- Map the result to category → model
- Adds ~0.1s latency + ~100 input tokens per message
Option C: Hybrid
- Heuristic first (catches obvious simple/complex cases)
- Falls back to `small_model` for ambiguous prompts
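The hybrid glue itself is tiny. A self-contained sketch with both classifiers injected (all names hypothetical), where the heuristic returns `null` when it is unsure:

```typescript
// Hypothetical Option C glue — both classifiers are injected so the
// sketch stays self-contained.
type Category = "quick" | "low" | "high";

async function classifyHybrid(
  prompt: string,
  heuristic: (p: string) => Category | null, // null = ambiguous
  smallModel: (p: string) => Promise<Category>,
): Promise<Category> {
  // Heuristic catches the obvious cases for free; only ambiguous
  // prompts pay the ~100-token small_model cost.
  return heuristic(prompt) ?? (await smallModel(prompt));
}
```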
Config Schema Addition
```json
{
  "auto_routing": {
    "enabled": false,
    "strategy": "heuristic",
    "default_category": "unspecified-high",
    "escalation_keywords": ["ultrawork", "ulw", "think hard", "deep dive"],
    "simple_keywords": ["fix typo", "rename", "add import", "what does", "explain"],
    "token_threshold_simple": 50,
    "token_threshold_complex": 500
  }
}
```

Integration Point
In `model-resolution-pipeline.ts`, insert the new step between step 2 (config override) and step 3 (category-default):
1. UI-selected model → provenance: "override"
2. Config model field → provenance: "override"
2.5 NEW: Auto-classifier → provenance: "auto-classified" ← INSERT HERE
3. Category default → provenance: "category-default"
4. User fallback_models → provenance: "provider-fallback"
5. Hardcoded fallback chain → provenance: "provider-fallback"
6. System default → provenance: "system-default"
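One possible shape for step 2.5, assuming the pipeline can be modeled as a chain of steps where `null` means "defer to the next step"; the types and names here are illustrative, not the actual `model-resolution-pipeline.ts` API:

```typescript
// Hypothetical pipeline sketch — the real step/provenance types may differ.
type Provenance =
  | "override"
  | "auto-classified"
  | "category-default"
  | "provider-fallback"
  | "system-default";

interface Resolution {
  model: string;
  provenance: Provenance;
}

type Step = (prompt: string) => Resolution | null;

// Walk the chain; the first step that returns a resolution wins.
function resolveModel(prompt: string, steps: Step[]): Resolution {
  for (const step of steps) {
    const r = step(prompt);
    if (r) return r;
  }
  return { model: "system-default-model", provenance: "system-default" };
}

// Step 2.5: wraps a classifier and the existing category → model map,
// and short-circuits to null when auto_routing is disabled.
const autoClassifyStep =
  (
    enabled: boolean,
    classify: (p: string) => string,
    categoryToModel: Record<string, string>,
  ): Step =>
  (prompt) => {
    if (!enabled) return null;
    const model = categoryToModel[classify(prompt)];
    return model ? { model, provenance: "auto-classified" } : null;
  };
```

Because the step returns `null` when disabled or unmapped, the existing category-default and fallback behavior is untouched.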
User Override
- The `ultrawork`/`ulw` keyword already exists and should take priority over auto-classification
- User can also set `"auto_routing": { "enabled": false }` to disable entirely
- Per-agent override: some agents (oracle, prometheus) should always use their configured model regardless of classification
Expected Impact
Assuming typical usage distribution:
- ~40% of prompts are trivial (questions, typos, simple edits) → routed to Haiku (20x cheaper than Opus)
- ~30% are moderate (multi-file changes, feature work) → routed to Sonnet (5x cheaper)
- ~30% are complex (architecture, debugging, planning) → stays on Opus
Relative to all-Opus, that distribution gives a weighted cost of roughly 0.4 × (1/20) + 0.3 × (1/5) + 0.3 × 1 ≈ 0.38, i.e. an estimated 50-60% token cost reduction, with no perceived quality loss on simple tasks.
References
- `src/shared/model-requirements.ts` — existing category → model mapping
- `src/shared/model-resolution-pipeline.ts` — resolution priority chain
- `src/plugin/ultrawork-model-override.ts` — keyword-triggered escalation pattern
- `src/config/schema/runtime-fallback.ts` — existing runtime model switching