Merged
Reviewer's Guide

Adds robust token usage extraction for omni model responses and ensures multi-agent cluster token accounting includes master routing and merge overhead, while tightening HTTP status handling for MCP clients and enabling streaming usage reporting.

Sequence diagram for multi-agent streaming token accounting with sub_usage events:

sequenceDiagram
actor User
participant AppChatService
participant MultiAgentOrchestrator
participant MasterAgentRouter
participant SubAgent
User->>AppChatService: multi_agent_chat_stream()
AppChatService->>MultiAgentOrchestrator: _execute_supervisor_stream(...)
MultiAgentOrchestrator->>MasterAgentRouter: _analyze_task() / _call_master_agent_llm()
MasterAgentRouter-->>MultiAgentOrchestrator: routing_decision + _last_routing_tokens
MultiAgentOrchestrator->>MultiAgentOrchestrator: task_analysis["routing_tokens"] = _last_routing_tokens
loop supervisor_stream
MultiAgentOrchestrator-->>AppChatService: event: sub_usage (routing_tokens)
AppChatService->>AppChatService: detect "event: sub_usage" and parse data.total_tokens
AppChatService->>AppChatService: total_tokens += data.get("total_tokens", 0)
AppChatService-->>User: (no forward for sub_usage)
MultiAgentOrchestrator->>SubAgent: _execute_sub_agent_stream()
loop sub_agent_events
SubAgent-->>MultiAgentOrchestrator: SSE event (may include sub_usage)
MultiAgentOrchestrator-->>AppChatService: passthrough event
alt event is sub_usage
AppChatService->>AppChatService: accumulate total_tokens
else other event
AppChatService-->>User: forward event
end
end
end
MultiAgentOrchestrator->>MultiAgentOrchestrator: _master_merge_results()
MultiAgentOrchestrator->>MultiAgentOrchestrator: _last_merge_tokens = merge_tokens
MultiAgentOrchestrator-->>AppChatService: final events
AppChatService-->>User: final response stream
Sequence diagram for omni model streaming token extraction in LangChainAgent:

sequenceDiagram
participant Caller
participant LangChainAgent
participant LLMProvider
participant AIMessage
Caller->>LangChainAgent: chat_stream(..., files)
LangChainAgent->>LLMProvider: send streaming request
loop streaming chunks
LLMProvider-->>LangChainAgent: AIMessage chunk
LangChainAgent->>LangChainAgent: build content from chunk
end
LangChainAgent->>LangChainAgent: locate final AIMessage
LangChainAgent->>LangChainAgent: _extract_tokens_from_message(msg)
alt response_metadata.token_usage.total_tokens
LangChainAgent->>LangChainAgent: total = response_metadata["token_usage"]["total_tokens"]
else response_metadata.usage.total_tokens
LangChainAgent->>LangChainAgent: total = response_metadata["usage"]["total_tokens"]
else usage_metadata.total_tokens
LangChainAgent->>LangChainAgent: total = usage_metadata.total_tokens
else no tokens found
LangChainAgent->>LangChainAgent: total = 0
end
LangChainAgent-->>Caller: yield total_tokens as int in stream
LangChainAgent-->>Caller: yield content chunks as str (earlier in stream)
Class diagram for updated token usage and streaming handling:

classDiagram
class LangChainAgent {
+chat()
+chat_stream(end_user_id, message_chat, storage_type, user_rag_memory_id, memory_flag, files) AsyncGenerator~str|int~
-_prepare_messages()
-_build_multimodal_content(text, files) List~Dict~
-_extract_tokens_from_message(msg) int$
}
class MultiAgentOrchestrator {
+execute()
-_analyze_task(message, variables) Dict
-_execute_sequential()
-_execute_supervisor_stream(agent_data, message, end_user_id, storage_type, user_rag_memory_id, memory_flag) AsyncGenerator~str~
-_execute_sub_agent_stream()
-_master_merge_results(responses, api_key_config)
-_last_merge_tokens int
-router MasterAgentRouter
-config MultiAgentConfig
-db Session
}
class MasterAgentRouter {
-_call_master_agent_llm(prompt) str
-_last_routing_tokens int
-db Session
}
class BaseModel {
+get_model_params(config) Dict~str, Any~
}
class MCPClient {
-_initialize_sse_session()
-_send_sse_request(request) Dict~str, Any~
-_send_sse_notification(notification)
-_initialize_modelscope_session()
-_session ClientSession
-server_url str
-_endpoint_url str
}
class AppChatService {
+multi_agent_chat_stream()
}
class ModelApiKeyService {
+record_api_key_usage(db, api_key_id)
}
class RedBearModelConfig {
+model_name str
+base_url str
+api_key str
+temperature float
+max_retries int
+extra_params Dict~str, Any~
+provider ModelProvider
}
class ModelProvider {
<<enumeration>>
OPENAI
XINFERENCE
GPUSTACK
OLLAMA
VOLCANO
REDBEAR
DASHSCOPE
}
LangChainAgent ..> BaseModel : uses
MultiAgentOrchestrator --> MasterAgentRouter : has
MultiAgentOrchestrator ..> ModelApiKeyService : uses
MasterAgentRouter ..> ModelApiKeyService : uses
BaseModel --> RedBearModelConfig : takes
RedBearModelConfig --> ModelProvider : uses
AppChatService ..> MultiAgentOrchestrator : uses
MCPClient ..> MCPConnectionError : raises
Hey - I've left some high level feedback:
- The new `_extract_tokens_from_message` helper centralizes token extraction, but similar logic is now duplicated in `_master_merge_results` and `_call_master_agent_llm`; consider reusing the helper there for consistency and easier future maintenance.
- In `chat_stream` the generator type is changed to `AsyncGenerator[str | int, None]` and may now yield a bare `int` alongside string events; this mixed-type stream could be brittle for callers and might be safer if you always wrap token counts in a consistent SSE/event format as done with `sub_usage`.
- The `logger.info` call for `stream_total_tokens` inside the tight `chat_stream` loop could be very noisy under load; consider downgrading this to `debug` or adding rate limiting if you only need it for troubleshooting.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new `_extract_tokens_from_message` helper centralizes token extraction, but similar logic is now duplicated in `_master_merge_results` and `_call_master_agent_llm`; consider reusing the helper there for consistency and easier future maintenance.
- In `chat_stream` the generator type is changed to `AsyncGenerator[str | int, None]` and may now yield a bare `int` alongside string events; this mixed-type stream could be brittle for callers and might be safer if you always wrap token counts in a consistent SSE/event format as done with `sub_usage`.
- The `logger.info` call for `stream_total_tokens` inside the tight `chat_stream` loop could be very noisy under load; consider downgrading this to `debug` or adding rate limiting if you only need it for troubleshooting.
# Conflicts:
#   api/app/core/agent/langchain_agent.py
#   api/app/core/tools/mcp/client.py
zhuwh
approved these changes
Apr 1, 2026
Summary by Sourcery
Improve token usage tracking for omni models and multi-agent clusters, and harden HTTP/MCP handling.
New Features:
Bug Fixes:
Enhancements: