feat: add optional reasoning_content to LLMResultChunkDelta #227

@ultramancode

Summary

Add optional reasoning_content field to LLMResultChunkDelta to support structured streaming of LLM reasoning/thinking processes, separate from the final response text.

This change is part of a three-repository update (SDK, plugin, and Main),
with a safe, phased rollout plan detailed below to ensure backward compatibility.

Follow-up to #23313

Motivation

Current inefficient flow (error-prone):

  1. Some vendor APIs return reasoning already separated (e.g., delta.thinking)
  2. Some plugins artificially merge it into message.content using <think> tags
  3. The Dify Main backend re-parses those <think> tags via _split_reasoning() to separate them again
  4. Each round trip risks parsing errors, encoding issues, and mishandled nested tags

Problems:

  • Unnecessary serialization/deserialization overhead
  • Parsing bugs (malformed tags, edge cases)
  • The Dify Main backend already supports the reasoning_content structure (#23313),
    but currently relies on fragile parsing through _split_reasoning() because the SDK lacks this field and plugin support.
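To make the parsing risk concrete, here is a minimal sketch of tag-based splitting. The helper split_think_tags is hypothetical and is not Dify's actual _split_reasoning(); it only illustrates how a regex over <think> tags behaves on malformed or nested input.

```python
import re

# Hypothetical stand-in for tag-based splitting; NOT Dify's actual _split_reasoning().
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_think_tags(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) by extracting <think>...</think> spans."""
    reasoning = "".join(THINK_RE.findall(text))
    answer = THINK_RE.sub("", text)
    return reasoning, answer


# Well-formed input splits as intended.
print(split_think_tags("<think>plan</think>Hello"))
# -> ('plan', 'Hello')

# An unclosed tag (for example a truncated stream) leaks reasoning into the answer.
print(split_think_tags("<think>plan Hello"))
# -> ('', '<think>plan Hello')

# Nested tags match up to the first closing tag and leave a stray </think> behind.
print(split_think_tags("<think>a<think>b</think>c</think>Hi"))
# -> ('a<think>b', 'c</think>Hi')
```

A dedicated reasoning_content field avoids all three cases because nothing has to be recovered from the text.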

Proposed flow:

  1. Vendor API returns separated data → Plugin preserves separation
  2. Plugin sends via reasoning_content field (no merging)
  3. Main reads directly (no parsing)
  4. Result: an efficient, type-safe, maintainable separation that is less error-prone

This aligns the SDK with how vendor APIs actually work.

Proposed Changes

Phase 1: SDK (this issue)

class LLMResultChunkDelta(BaseModel):
    index: int
    message: AssistantPromptMessage
    reasoning_content: str | None = None  # ← NEW
    usage: LLMUsage | None = None
    finish_reason: str | None = None

Compatibility: Backward-compatible (optional field, default None)
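A quick way to see why an optional field with a None default is backward-compatible. This sketch uses a standard-library dataclass named ChunkDelta as a simplified stand-in (plain string message, no usage field); the real LLMResultChunkDelta is a Pydantic BaseModel, but the defaulting behaviour shown here is the same idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChunkDelta:
    """Simplified stand-in for LLMResultChunkDelta (assumption: str message, no usage)."""
    index: int
    message: str
    reasoning_content: Optional[str] = None  # new optional field
    finish_reason: Optional[str] = None


# Existing call sites that never mention the new field keep working.
legacy = ChunkDelta(index=0, message="Hello")

# Updated producers can attach reasoning without touching the message text.
updated = ChunkDelta(index=0, message="Hello", reasoning_content="User greeted me.")

print(legacy.reasoning_content)   # None
print(updated.reasoning_content)  # User greeted me.
```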

Phase 2: Plugins (defensive)

New plugin versions will require an SDK version that includes this change (pinned via requirements.txt).
They also add a defensive pop() of the capability flag, released before the Dify Main backend starts sending it (Phase 3).
This prevents vendor API errors when a plugin that does not yet use reasoning_content receives the new parameter from the Main backend.

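def _chat_generate(self, model_parameters, ...):
    # Copy first so the caller's dict is not mutated
    model_parameters = dict(model_parameters)
    # Drop the Dify-internal flag so it is never forwarded to the vendor API
    model_parameters.pop("_dify_supports_reasoning_content", False)  # defensive
    # ... call vendor API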
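The same idea as a standalone, testable helper. The function name strip_capability_flag and the sample parameters are illustrative assumptions; only the flag name comes from the snippet above.

```python
from typing import Any, Mapping

# Flag name taken from the Phase 2 snippet above.
CAPABILITY_FLAG = "_dify_supports_reasoning_content"


def strip_capability_flag(
    model_parameters: Mapping[str, Any],
) -> tuple[dict[str, Any], bool]:
    """Return (vendor-safe parameters, whether the flag was set)."""
    params = dict(model_parameters)  # copy; the caller's mapping is untouched
    supported = bool(params.pop(CAPABILITY_FLAG, False))
    return params, supported


incoming = {"temperature": 0.2, CAPABILITY_FLAG: True}
vendor_params, supported = strip_capability_flag(incoming)

print(vendor_params)  # {'temperature': 0.2}
print(supported)      # True
```

Returning the flag value alongside the cleaned parameters lets a later opt-in implementation branch on it without a second lookup.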

Affected plugins: All LLM plugins

Phase 3: Dify Main Backend (read & capability signaling)

Main reads reasoning_content from plugin responses:

  • Use delta.reasoning_content directly if present

Main sends a capability flag (_dify_supports_reasoning_content) via model_parameters:

  • Plugins detect this and optimize (single-channel vs dual-channel streaming)
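A rough sketch of the Main-side behaviour, under stated assumptions: Delta is a hypothetical minimal type (the real one is LLMResultChunkDelta), and with_capability_flag and collect are illustrative names, not existing Dify functions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Delta:
    """Hypothetical minimal delta; the real type is LLMResultChunkDelta."""
    content: str
    reasoning_content: Optional[str] = None


def with_capability_flag(model_parameters: dict) -> dict:
    """Return a copy of the parameters announcing reasoning_content support."""
    return {**model_parameters, "_dify_supports_reasoning_content": True}


def collect(deltas: list[Delta]) -> tuple[str, str]:
    """Accumulate (reasoning, answer) from a stream of deltas."""
    reasoning, answer = [], []
    for d in deltas:
        if d.reasoning_content is not None:
            reasoning.append(d.reasoning_content)  # structured field, used as-is
        answer.append(d.content)
    return "".join(reasoning), "".join(answer)


stream = [
    Delta(content="", reasoning_content="think "),
    Delta(content="", reasoning_content="more"),
    Delta(content="Hi"),
    Delta(content="!"),
]
print(collect(stream))  # ('think more', 'Hi!')
```

Plugins that still embed <think> tags in the content would presumably continue through the existing parsing path; that fallback is not shown here.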

Phase 4: Plugins (opt-in implementation)

Implemented provider by provider for reasoning-capable models.
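One possible shape for an opt-in plugin, as a sketch only. It assumes the vendor stream yields (thinking, text) pairs and that chunks are plain dicts; both are simplifications, and to_chunks is a hypothetical name. When the flag is present the reasoning goes into reasoning_content; otherwise it falls back to the current behaviour of wrapping it in <think> tags. How this maps onto the single-channel and dual-channel terms used in this issue is an assumption.

```python
from typing import Any, Iterable, Iterator, Mapping

FLAG = "_dify_supports_reasoning_content"


def to_chunks(
    events: Iterable[tuple[str, str]],
    model_parameters: Mapping[str, Any],
) -> Iterator[dict]:
    """Convert (thinking, text) vendor events into chunk dicts."""
    params = dict(model_parameters)
    structured = bool(params.pop(FLAG, False))
    in_think = False
    for thinking, text in events:
        if structured:
            # Main understands the field: keep reasoning and text separate.
            yield {"content": text, "reasoning_content": thinking or None}
            continue
        # Legacy path: merge reasoning into content using <think> tags.
        content = ""
        if thinking:
            if not in_think:
                content += "<think>"
                in_think = True
            content += thinking
        if text:
            if in_think:
                content += "</think>"
                in_think = False
            content += text
        yield {"content": content, "reasoning_content": None}
    if in_think:
        # Stream ended while still reasoning: close the tag explicitly.
        yield {"content": "</think>", "reasoning_content": None}


events = [("plan", ""), ("", "Hi")]
new_style = list(to_chunks(events, {FLAG: True}))
old_style = list(to_chunks(events, {}))

print(new_style[0])
print("".join(c["content"] for c in old_style))  # <think>plan</think>Hi
```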

My Plan

  1. SDK PR
  2. Plugin PR (Phase 2 defensive)
  3. Main PR (Phase 3)
  4. Plugin PRs (Phase 4 opt-in) → Gradual rollout per provider

Backward Compatibility

No breaking changes at any phase

  • Old plugins work with new SDK (field ignored)
  • New plugins work with old Main (dual-channel fallback)
  • Old Main works with new plugins (defensive pop)

Migration Safety

This phased approach ensures zero-risk migration despite independent release cycles:
no coordination required between repositories.

Why this is safe:

  • No LLM node disruption: All phases use optional fields/parameters
  • SDK-independent: Plugins can upgrade SDK without waiting for Main
  • Main-independent: SDK can release without breaking existing deployments
  • Plugin-independent: Each provider can opt in at its own pace

Version compatibility matrix:

Scenario        SDK    Main   Plugin   Result
Old stack       Old    Old    Old      Works (current flow)
SDK updated     New    Old    Old      Works (field ignored)
SDK + Plugin    New    Old    New      Works (defensive pop)
SDK + Main      New    New    Old      Works (Main reads field if present)
All updated     New    New    New      Optimal (direct reasoning flow)

Labels: enhancement