
[Deployment Feedback] Six-board workflow pilot paused due to reproducibility & flow stability gaps #150

@Charpup


Hi maintainers, thanks for the project.

We ran a multi-agent workflow pilot ("Three Departments and Six Ministries", 三省六部) and decided to pause deployment for now. We are sharing concrete observations to help with triage:

Environment / context

  • Repo: cft0808/edict
  • Scenario: multi-agent flow with taizi -> zhongshu -> menxia -> shangshu -> departments
  • Goal: stable end-to-end workflow under repeated smoke tests

What we observed

  1. E2E flow instability (P0)

    • Repeated smoke-test reruns failed with a timeout at an early hop (the zhongshu hop entry path in our orchestration).
    • Contract checks can pass while the E2E run still fails.
  2. Timeout behavior inconsistency across paths

    • We observed what looked like mixed timeout paths (business path vs. no-ACK/gateway path), which made gate decisions noisy.
    • We had to split timeout metrics into business_timeout and infra-related categories to avoid false gate failures.
  3. Workspace/path consistency remains critical

    • Prior issue/PR threads suggest that a data-path/workspace split can desync state and the dashboard view.
    • This was a major risk factor in our go/no-go.
  4. Mitigation attempts (non-code) were insufficient for E2E

    • ACK-first prompt posture improved ACK timing.
    • Single-flight serialization did not improve E2E pass rate in our A/B.
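
To make the timeout split from item 2 concrete, here is a minimal sketch of how such a classification could look. It is illustrative only: `HopResult`, `classify`, `summarize`, the `infra_timeout` label, and the deadline values are hypothetical names and numbers, not part of edict and not our exact tooling. It assumes each hop records the time to ACK and the time to completion.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class HopResult:
    """One hop attempt in a smoke run (hypothetical record shape)."""
    hop: str                  # e.g. "zhongshu"
    ack_s: Optional[float]    # seconds until ACK, None if no ACK arrived
    done_s: Optional[float]   # seconds until completion, None if it never completed


def classify(r: HopResult, ack_deadline_s: float, business_deadline_s: float) -> str:
    """Label a hop as "ok", "infra_timeout" (no-ACK/gateway path) or "business_timeout"."""
    # A missing or late ACK is attributed to the infra path, whatever happened afterwards.
    if r.ack_s is None or r.ack_s > ack_deadline_s:
        return "infra_timeout"
    # The ACK arrived in time, so a missing or late completion is a business-path timeout.
    if r.done_s is None or r.done_s > business_deadline_s:
        return "business_timeout"
    return "ok"


def summarize(
    results: Iterable[HopResult],
    ack_deadline_s: float = 10.0,         # assumed value, tune per deployment
    business_deadline_s: float = 120.0,   # assumed value, tune per deployment
) -> Counter:
    """Count hops per category so gates can look at each timeout source separately."""
    return Counter(classify(r, ack_deadline_s, business_deadline_s) for r in results)
```

Checking the ACK deadline first is a deliberate choice in this sketch: a hop that never acknowledged is never counted as a business timeout, which is what keeps the two metrics from contaminating each other.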

Repro signal from our side

  • We could reproduce instability under repeated rounds in our pilot setup.
  • We paused rollout because reproducibility confidence remained low for production use.

Why we paused

  • For production orchestration, we need:
    1. deterministic E2E stability,
    2. clear timeout source-of-truth across paths,
    3. stronger issue/PR convergence on core flow blockers.

Suggestion

If helpful, we can share a sanitized smoke matrix and gate checklist format (E2E + contract + timeout split + rollback gates) that we used in evaluation.
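
As a rough idea of the shape, a gate checklist along these lines could be evaluated as sketched below. This is a simplified illustration, not our sanitized matrix: `RoundResult`, `evaluate_gates`, the thresholds, and the `rollback_verified` flag are hypothetical, and the sketch assumes infra-related timeouts are reported but do not fail the gate on their own.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RoundResult:
    """Outcome of one smoke round (hypothetical record shape)."""
    e2e_passed: bool
    contract_passed: bool
    business_timeouts: int
    infra_timeouts: int


def evaluate_gates(
    rounds: List[RoundResult],
    rollback_verified: bool,
    min_e2e_pass_rate: float = 1.0,    # assumed threshold
    max_business_timeouts: int = 0,    # assumed threshold
) -> Dict[str, object]:
    """Combine E2E, contract, timeout-split and rollback checks into a go/no-go."""
    if not rounds:
        raise ValueError("need at least one smoke round")
    e2e_rate = sum(r.e2e_passed for r in rounds) / len(rounds)
    checks = {
        "e2e": e2e_rate >= min_e2e_pass_rate,
        "contract": all(r.contract_passed for r in rounds),
        # Only business-path timeouts count against the gate in this sketch.
        "timeout_split": sum(r.business_timeouts for r in rounds) <= max_business_timeouts,
        "rollback": rollback_verified,
    }
    return {
        "go": all(checks.values()),
        "e2e_pass_rate": e2e_rate,
        "infra_timeouts": sum(r.infra_timeouts for r in rounds),  # reported, not gated
        "checks": checks,
    }
```

Keeping the contract and E2E checks as separate entries makes the case from item 1 visible in the report: the contract gate can be green while the E2E gate is red.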

Thanks again for the project and ongoing work.
