feat: refactor into a three-layer routing architecture: FastRules + LLM classifier + config-driven setup

admin · admin · commit f58b040cec24 · 2026-02-15T03:14:04.000+08:00
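- Add config.yaml: models, tiers, classifier, and fast rules are all configurable
- Rewrite router.py: Layer 1 fast rules (<1ms) → Layer 2 LLM classifier (cached) → Layer 3 model selection
- Update server.py: route is now async and accepts an http_session
- The classifier uses the free MiniMax-M2.5 model and supports parsing reasoning_content
- LRU cache for classification results (configurable size and TTL)
- Different tiers now actually route to different models (code→GLM-5, reasoning→DeepSeek, creative→GPT-5.2)
- docker-compose mounts config.yaml, so config changes need no image rebuild
- Dockerfile adds the pyyaml dependency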
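
The LRU cache with TTL for classification results could be sketched roughly as follows. This is a minimal illustration, not the code in router.py (its diff body is not shown here). The `TTLCache` class, its method names, and the injectable `clock` parameter are assumptions made for the example. Only the size and TTL semantics (500 entries, 3600 seconds, 0 meaning no expiry) come from config.yaml.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional


class TTLCache:
    """Hypothetical LRU cache with per-entry expiry (not the router.py code)."""

    def __init__(self, max_size: int = 500, ttl: float = 3600,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_size = max_size
        self.ttl = ttl            # seconds; 0 means entries never expire
        self._clock = clock       # injectable so expiry can be tested
        self._data = OrderedDict()  # key -> (stored_at, tier)

    def get(self, key: str) -> Optional[str]:
        item = self._data.get(key)
        if item is None:
            return None
        stored_at, tier = item
        if self.ttl > 0 and self._clock() - stored_at > self.ttl:
            del self._data[key]        # expired: drop it and report a miss
            return None
        self._data.move_to_end(key)    # mark as most recently used
        return tier

    def put(self, key: str, tier: str) -> None:
        self._data[key] = (self._clock(), tier)
        self._data.move_to_end(key)
        while len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict the least recently used
```

How the cache key is derived from a request (for example, a hash of the truncated user message) is left open here, since the commit does not show it.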
diff --git a/.env.example b/.env.example
@@ -1,7 +1,8 @@
 # SmartRouter configuration
 
 # Upstream OpenAI-compatible API URL
-NEW_API_URL=http://127.0.0.1:30080/v1/chat/completions
+# In Docker, use host.docker.internal to reach the host machine
+NEW_API_URL=http://host.docker.internal:30080/v1/chat/completions
 
 # Upstream API key (required)
 NEW_API_KEY=sk-your-api-key-here
diff --git a/Dockerfile b/Dockerfile
@@ -3,10 +3,10 @@ FROM python:3.11-slim
 WORKDIR /app
 
 # Copy source files
-COPY router.py server.py ./
+COPY router.py server.py config.yaml ./
 
 # Install dependencies
-RUN pip install --no-cache-dir -U pip aiohttp
+RUN pip install --no-cache-dir -U pip aiohttp pyyaml
 
 # Expose port
 EXPOSE 30081
diff --git a/README.md b/README.md
@@ -1,52 +1,72 @@
 # SmartRouter
 
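-Smart LLM routing proxy: automatically picks the most suitable (cheapest) model based on request content, with full support for the OpenAI API protocol.
+Smart LLM routing proxy: a three-layer architecture automatically picks the most suitable model, with full support for the OpenAI API protocol.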
 
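 ## Features
 
+- **Three-layer smart routing**: fast rules → LLM classifier (cached) → model selection
 - **Transparent proxy**: passes the OpenAI `/v1/chat/completions` protocol through unchanged, replacing only the model field
 - **Tool Calling**: full passthrough of tools / tool_choice / tool_calls
 - **Streaming responses**: SSE (Server-Sent Events) passed through chunk by chunk
 - **Multi-turn conversations**: the messages array is forwarded in full, so no context is lost
-- **Smart routing**: picks a tier automatically via keyword analysis (simple / medium / complex / reasoning)
+- **Fully config-driven**: models, rules, and the classifier are all set in `config.yaml`; change the config, not the code
 - **Async, high concurrency**: built on aiohttp, handles concurrent requests
 - **Non-invasive**: the request body (except model) and the response body are not modified; whatever upstream returns is returned as-is
 
-## How it works
+## Routing architecture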
 
 ```
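-Client request → SmartRouter analyzes content → picks the cheapest model → forwards to the upstream API → returns the response unchanged
+Request arrives
+    │
+    ▼
+┌─ Layer 1: fast rules (<1ms) ─────────────────────────────────┐
+│  • has tools parameter → code tier                           │
+│  • message contains a code block → code tier                 │
+│  • system prompt has code/reasoning keywords → matching tier │
+│  • short message (<10 chars) → simple tier                   │
+│  hit → route directly (intercepts 60-70% of requests)        │
+└──────────────────────────────────────────────────────────────┘
+    │ miss
+    ▼
+┌─ Layer 2: LLM classifier (cached) ───────────────────────────┐
+│  • one task classification using a cheap model               │
+│  • results are cached; similar requests hit the cache        │
+│  • timeout/error falls back to general automatically         │
+└──────────────────────────────────────────────────────────────┘
+    │
+    ▼
+┌─ Layer 3: model selection ───────────────────────────────────┐
+│  picks the preferred model for the tier from config.yaml     │
+└──────────────────────────────────────────────────────────────┘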
 ```
 
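-### Tier routing rules
+### Default tier configuration
 
-| Tier | Trigger | Default model |
-|------|----------|----------|
-| simple | greetings, simple Q&A | MiniMax-M2.5 |
-| medium | ordinary conversation, multi-step tasks | MiniMax-M2.5 |
-| complex | code, creative writing, long text | MiniMax-M2.5 |
+| Tier | Use case | Default model |
+|------|------|----------|
+| simple | greetings, small talk | MiniMax-M2.5 |
+| code | code, debugging, technical implementation | GLM-5 |
 | reasoning | mathematical reasoning, logical proofs | SSY-DeepSeek-V3.2 |
+| creative | creative writing, copywriting, translation | SSY-GPT-5.2 |
+| general | general conversation, knowledge Q&A | MiniMax-M2.5 |
 | premium | manually selected via smart-premium | claude-opus-4-6 |
 
-> The model list can be customized in the `MODELS` dict in `router.py`.
+> All models and tiers can be customized in `config.yaml`.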
 
 ## Quick start
 
 ### Docker (recommended)
 
 ```bash
-# 1. Clone the project
 git clone https://github.com/qingchencloud/smart-router.git
 cd smart-router
 
-# 2. Configure environment variables
 cp .env.example .env
-# Edit .env and fill in your upstream API URL and key
+# Edit .env and fill in the upstream API key
+# Edit config.yaml to customize models and routing rules
 
-# 3. Start
 docker compose up -d
 
-# 4. Test
 curl http://localhost:30081/v1/chat/completions \
   -H "Content-Type: application/json" \
   -d '{"model":"auto","messages":[{"role":"user","content":"Hello"}]}'
@@ -55,83 +75,26 @@ curl http://localhost:30081/v1/chat/completions \
 ### Run locally
 
 ```bash
-# Install dependencies
-pip install aiohttp
+pip install aiohttp pyyaml
 
-# Configure environment variables
 export NEW_API_URL="http://127.0.0.1:30080/v1/chat/completions"
 export NEW_API_KEY="sk-your-api-key"
 
-# Start
 python server.py
 ```
 
 ## API
 
 ### POST /v1/chat/completions
 
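-Fully compatible with the OpenAI Chat Completions API. All parameters are passed through unchanged.
-
-**Special model values:**
+Fully compatible with the OpenAI Chat Completions API; all parameters are passed through unchanged.
 
 | model | Behavior |
 |-------|------|
-| `auto` / `smart-auto` | Automatically picks the cheapest model based on content |
-| `smart-premium` | Uses claude-opus-4-6 |
+| `auto` / `smart-auto` | Three-layer routing picks the model automatically |
+| `smart-premium` | Uses the premium tier model |
 | Any other known model name | Uses that model directly |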
 
-**Example (automatic routing):**
-
-```bash
-curl -X POST http://localhost:30081/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "auto",
-    "messages": [{"role": "user", "content": "Write a quicksort algorithm"}],
-    "max_tokens": 2048
-  }'
-```
-
-**Example (Tool Calling):**
-
-```bash
-curl -X POST http://localhost:30081/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "auto",
-    "messages": [{"role": "user", "content": "What time is it in Beijing right now?"}],
-    "tools": [{
-      "type": "function",
-      "function": {
-        "name": "get_time",
-        "description": "Get the current time in a city",
-        "parameters": {
-          "type": "object",
-          "properties": {"city": {"type": "string"}},
-          "required": ["city"]
-        }
-      }
-    }],
-    "tool_choice": "auto"
-  }'
-```
-
-**Example (streaming response):**
-
-```bash
-curl -X POST http://localhost:30081/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "auto",
-    "messages": [{"role": "user", "content": "Hello"}],
-    "stream": true
-  }'
-```
-
-### GET /v1/models
-
-Returns the list of available models (OpenAI format).
-
 ### POST /route
 
 Inspect the routing decision (for debugging):
@@ -142,50 +105,42 @@ curl -X POST http://localhost:30081/route \
   -d '{"messages": [{"role": "user", "content": "Prove the Pythagorean theorem"}]}'
 ```
 
-### GET /
+### GET /v1/models
 
-Service status information.
+Returns the list of available models.
 
 ## Configuration
 
-### Environment variables
+### Environment variables (.env)
 
 | Variable | Description | Default |
 |------|------|--------|
-| `NEW_API_URL` | Upstream OpenAI-compatible API URL | `http://127.0.0.1:30080/v1/chat/completions` |
-| `NEW_API_KEY` | Upstream API key | (empty, required) |
-| `SMART_ROUTER_PORT` | Service port (used by docker-compose) | `30081` |
-
-### Custom models
-
-Edit the `MODELS` dict in `router.py`:
-
-```python
-MODELS = {
-    "simple": ["your-cheap-model"],
-    "medium": ["your-medium-model"],
-    "complex": ["your-strong-model"],
-    "reasoning": ["your-reasoning-model"],
-    "premium": ["your-premium-model"],
-}
-```
+| `NEW_API_URL` | Upstream API URL | `http://127.0.0.1:30080/v1/chat/completions` |
+| `NEW_API_KEY` | Upstream API key | (required) |
+| `SMART_ROUTER_PORT` | Service port | `30081` |
 
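-The first model in each tier is the default choice.
+### config.yaml
 
-## Response headers
+All routing logic is configured in `config.yaml`:
+
+- **tiers**: the model list for each tier; the first entry is preferred
+- **fast_rules**: Layer 1 fast rules, matched in order
+- **classifier**: Layer 2 LLM classifier (model, cache, timeout, prompt)
+- **upstream**: upstream API (supports `${ENV_VAR}` syntax)
 
-SmartRouter adds an `X-SmartRouter-Model` header to the response, identifying the model actually used:
+## Response headers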
 
 ```
-X-SmartRouter-Model: MiniMax-M2.5
+X-SmartRouter-Model: GLM-5
 ```
 
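 ## Architecture
 
 ```
 SmartRouter/
 ├── server.py          # aiohttp async transparent proxy server
-├── router.py          # routing decision module (pure logic, no network)
+├── router.py          # three-layer routing decision module
+├── config.yaml        # full configuration (models, rules, classifier)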
 ├── Dockerfile
 ├── docker-compose.yml
 └── .env.example
diff --git a/config.yaml b/config.yaml
@@ -0,0 +1,109 @@
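+# SmartRouter configuration file
+# All model names, routing rules, and classifier settings are defined here
+# Restart the service after editing for changes to take effect
+
+# ===== Upstream API =====
+upstream:
+  url: "${NEW_API_URL}"
+  key: "${NEW_API_KEY}"
+
+# ===== Classifier settings (Layer 2) =====
+classifier:
+  # Model used for task classification (a free or cheap one is recommended)
+  model: "MiniMax-M2.5"
+  # Number of LRU cache entries
+  cache_size: 500
+  # Cache expiry in seconds; 0 = never expires
+  cache_ttl: 3600
+  # Classification timeout in seconds; on timeout, fall back to general
+  timeout: 10
+  # Classification prompt template; {user_message} is replaced with the first 500 characters of the user message
+  prompt: |
+    You are a request classifier. Determine the task type from the user message and reply with the type name only, no explanation.
+
+    Available types:
+    - simple: simple greetings, small talk, saying hello
+    - code: code generation, debugging, programming, technical implementation, algorithms
+    - reasoning: mathematical reasoning, logical proofs, complex analysis, multi-step derivation
+    - creative: creative writing, copywriting, translation, poetry, fiction
+    - general: general knowledge Q&A, information lookup, everyday conversation
+
+    User message:
+    {user_message}
+
+    Task type: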
+
+# ===== Tier-to-model mapping =====
+# Each tier has a models list; the first entry is preferred
+tiers:
+  simple:
+    models: ["MiniMax-M2.5"]
+    description: "Simple Q&A, small talk"
+
+  code:
+    models: ["GLM-5", "SSY-DeepSeek-V3.2", "MiniMax-M2.5"]
+    description: "Code generation, debugging, technical implementation"
+
+  reasoning:
+    models: ["SSY-DeepSeek-V3.2", "DeepSeek-V3.2", "GLM-5"]
+    description: "Mathematical reasoning, logical proofs, complex analysis"
+
+  creative:
+    models: ["SSY-GPT-5.2", "SSY-Kimi-K2.5", "MiniMax-M2.5"]
+    description: "Creative writing, copywriting, translation"
+
+  general:
+    models: ["MiniMax-M2.5", "GLM-5"]
+    description: "General conversation, knowledge Q&A"
+
+  premium:
+    models: ["claude-opus-4-6", "claude-opus-4-6-thinking"]
+    description: "High-end models, selected manually"
+
+# ===== Fast rules (Layer 1) =====
+# Matched in order; the first hit stops evaluation and skips the LLM classifier
+fast_rules:
+  # Has a tools parameter → must use a model that supports function calling
+  - name: "has_tools"
+    condition: "has_tools"
+    tier: "code"
+    reason: "Request includes a tools parameter; function calling support is required"
+
+  # Message contains a code block
+  - name: "has_code_block"
+    condition: "has_code_block"
+    tier: "code"
+    reason: "Message contains a code block"
+
+  # System prompt contains code-related instructions
+  - name: "system_code_hint"
+    condition: "system_contains"
+    keywords: ["代码", "编程", "code", "programming", "developer", "engineer", "coder", "debug"]
+    tier: "code"
+    reason: "System prompt indicates a coding task"
+
+  # System prompt contains reasoning-related instructions
+  - name: "system_reasoning_hint"
+    condition: "system_contains"
+    keywords: ["数学", "推理", "math", "reasoning", "logic", "prove", "theorem"]
+    tier: "reasoning"
+    reason: "System prompt indicates a reasoning task"
+
+  # Short greeting (< 10 characters, no special content)
+  - name: "short_greeting"
+    condition: "short_message"
+    max_length: 10
+    tier: "simple"
+    reason: "Short message"
+
+# ===== Special model name mapping =====
+special_models:
+  # These model names trigger special behavior
+  auto: "smart-auto"           # auto is equivalent to smart-auto
+  smart-auto: "smart-auto"     # automatic routing
+  smart-premium: "premium"     # routes to the premium tier
+
+# ===== Server settings =====
+server:
+  port: 30081
+  log_routes: true             # whether to log routing decisions
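
To illustrate how the `fast_rules` section above might be evaluated, here is a minimal sketch of first-match rule evaluation. It is not the router.py implementation, whose diff is not shown. The function `match_fast_rule`, the helper `_text`, and the exact matching semantics (checking the last user message, case-insensitive keyword search) are assumptions; only the condition names, `keywords`, and `max_length` fields come from config.yaml.

```python
from typing import Optional

CODE_FENCE = "`" * 3  # three backticks, the usual Markdown code-block marker

# Rules shaped like the fast_rules section of config.yaml (trimmed for brevity).
FAST_RULES = [
    {"name": "has_tools", "condition": "has_tools", "tier": "code"},
    {"name": "has_code_block", "condition": "has_code_block", "tier": "code"},
    {"name": "system_reasoning_hint", "condition": "system_contains",
     "keywords": ["math", "reasoning", "prove"], "tier": "reasoning"},
    {"name": "short_greeting", "condition": "short_message",
     "max_length": 10, "tier": "simple"},
]


def _text(message: dict) -> str:
    """Return message content as text; non-string content is treated as empty."""
    content = message.get("content")
    return content if isinstance(content, str) else ""


def match_fast_rule(body: dict, rules: list) -> Optional[dict]:
    """Return the first rule that matches the request body, or None."""
    messages = body.get("messages", [])
    system = " ".join(_text(m) for m in messages if m.get("role") == "system").lower()
    user_msgs = [_text(m) for m in messages if m.get("role") == "user"]
    last_user = user_msgs[-1] if user_msgs else ""

    for rule in rules:  # order matters: the first hit wins
        cond = rule["condition"]
        if cond == "has_tools" and body.get("tools"):
            return rule
        if cond == "has_code_block" and CODE_FENCE in last_user:
            return rule
        if cond == "system_contains" and any(
            kw.lower() in system for kw in rule.get("keywords", [])
        ):
            return rule
        if cond == "short_message" and 0 < len(last_user.strip()) < rule.get("max_length", 10):
            return rule
    return None  # no hit: the request would go on to the Layer 2 classifier
```

Because rules are checked in list order, a short message that also carries `tools` is routed to the code tier rather than simple, which matches the "first hit stops" comment in the config.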
diff --git a/docker-compose.yml b/docker-compose.yml
@@ -9,6 +9,8 @@ services:
       - .env
     environment:
       - NEW_API_URL=${NEW_API_URL:-http://host.docker.internal:30080/v1/chat/completions}
+    volumes:
+      - ./config.yaml:/app/config.yaml:ro
     extra_hosts:
       - "host.docker.internal:host-gateway"
 
diff --git a/router.py b/router.py
diff --git a/server.py b/server.py
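
The router.py and server.py diffs are empty in this view. As a rough illustration of the Layer 2 behavior described in the commit message (parsing `reasoning_content` and falling back to general), the sketch below extracts a tier from a chat-completion style response. The function names `parse_tier` and `_pick_tier`, and the "last tier name mentioned" heuristic, are assumptions for illustration and may differ from the actual code.

```python
from typing import Optional

# Tier names the classifier prompt in config.yaml asks the model to choose from.
VALID_TIERS = ("simple", "code", "reasoning", "creative", "general")


def _pick_tier(text: str) -> Optional[str]:
    """Find a tier name in the text; an exact match wins, else the last one mentioned."""
    text = text.strip().lower()
    if text in VALID_TIERS:
        return text
    best, best_pos = None, -1
    for tier in VALID_TIERS:
        pos = text.rfind(tier)
        if pos > best_pos:
            best, best_pos = tier, pos
    return best


def parse_tier(response: dict, default: str = "general") -> str:
    """Extract a tier from a response dict, falling back to the default tier."""
    try:
        message = response["choices"][0]["message"]
    except (KeyError, IndexError, TypeError):
        return default
    if not isinstance(message, dict):
        return default
    # Prefer the visible answer; some models put their output in reasoning_content.
    for field in ("content", "reasoning_content"):
        tier = _pick_tier(message.get(field) or "")
        if tier is not None:
            return tier
    return default
```

A timeout or network error around the classifier call would be handled by the caller returning the same default tier.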